I upgraded our 2-node cluster last week, and the upgrade completed and everything worked as expected. Later, however, I noticed that our monitoring system (Nagios) couldn't connect to either node to query the CPU load variable via SNMP. I verified that the SNMP connection was timing out, so I re-entered the community string and the SNMP process restarted. I was then able to walk the SNMP tree, but only up to a certain point, where it failed:
[...]
IF-MIB::ifOperStatus.1 = INTEGER: up(1)
IF-MIB::ifOperStatus.2 = INTEGER: up(1)
IF-MIB::ifOperStatus.3 = INTEGER: up(1)
IF-MIB::ifOperStatus.4 = INTEGER: down(2)
Timeout: No Response from clearpass
This failure happens on both nodes. If I re-enter the community string, SNMP works a single time, then fails again. There are no errors in the Event Viewer, and I can't find any indication elsewhere that the SNMP daemon is down.
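
In case it helps anyone compare where the walk stops on each node, here is a minimal Python sketch that reads saved walk output and reports the last OID returned before the timeout. The function name and the sample text are my own illustration (the sample simply repeats the output above). It assumes the output uses the "OID = TYPE: value" line format shown in this post and that the timeout line starts with "Timeout:".

```python
import re

# Matches varbind lines such as "IF-MIB::ifOperStatus.4 = INTEGER: down(2)".
VARBIND = re.compile(r"^(?P<oid>\S+)\s+=\s+(?P<value>.*)$")


def last_oid_before_timeout(walk_output):
    """Return (last_oid, timed_out) for the text output of an SNMP walk.

    last_oid is the final OID seen before a "Timeout:" line, or the final
    OID overall if the walk finished without timing out.
    """
    last_oid = None
    for line in walk_output.splitlines():
        line = line.strip()
        if line.startswith("Timeout:"):
            return last_oid, True
        match = VARBIND.match(line)
        if match:
            last_oid = match.group("oid")
    return last_oid, False


# Hypothetical sample: the walk output quoted in this post.
sample = """\
[...]
IF-MIB::ifOperStatus.1 = INTEGER: up(1)
IF-MIB::ifOperStatus.2 = INTEGER: up(1)
IF-MIB::ifOperStatus.3 = INTEGER: up(1)
IF-MIB::ifOperStatus.4 = INTEGER: down(2)
Timeout: No Response from clearpass
"""

oid, timed_out = last_oid_before_timeout(sample)
print(oid, timed_out)
```

Running this against the output from both nodes should show whether they stop at the same OID, which may help narrow down whether one specific object is triggering the timeout.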