Good morning, Airheads.
One of our customers is having issues with a cluster of Aruba IAP-305s: the APs reboot at random. Here is an example from the debug logs we collected:
2019-07-04 14:44:42.000 192.168.220.232
2019 192.168.220.232 cli[3175]: <341014> <INFO> <192.168.220.232 24:F2:7F:C8:BF:78> AP rebooting System cmd at uptime 8D 0H 52M 37S: Gateway unreachable.
2019-07-04 14:44:42.000 192.168.220.232
2019 192.168.220.232 nanny[3099]: <303086> <ERRS> <192.168.220.232 24:F2:7F:C8:BF:78> Process Manager (nanny) shutting down - AP will reboot!
2019-07-04 14:44:42.000 192.168.220.232
2019 192.168.220.232 <192.168.220.232 24:F2:7F:C8:BF:78> sapd[3194]: SAPD received SIGTERM; exiting
2019-07-04 14:44:42.000 192.168.220.232
2019 192.168.220.232 cli[3175]: <341089> <ERRS> <192.168.220.232 24:F2:7F:C8:BF:78> cli_data_post_to_airwave: 10210: AWC login error.
2019-07-04 14:44:42.000 192.168.220.232
2019 192.168.220.232 cli[3175]: <341014> <INFO> <192.168.220.232 24:F2:7F:C8:BF:78> AP rebooting Gateway unreachable.
2019-07-04 14:44:42.000 192.168.220.232
2019 192.168.220.232 <192.168.220.232 24:F2:7F:C8:BF:78> KERNEL(AP-INJU-04@192.168.220.232): [694351.890735] NOHZ: local_softirq_pending 08
2019-07-04 14:44:41.000 192.168.220.232
2019 192.168.220.232 <192.168.220.232 24:F2:7F:C8:BF:78>
Another example:
2019 192.168.220.232 cli[3175]: <341014> <INFO> <192.168.220.232 24:F2:7F:C8:BF:78> AP rebooting System cmd at uptime 0D 20H 26M 40S: Gateway unreachable.
2019-07-05 11:16:09.000 192.168.220.232
2019 192.168.220.232 nanny[3099]: <303086> <ERRS> <192.168.220.232 24:F2:7F:C8:BF:78> Process Manager (nanny) shutting down - AP will reboot!
2019-07-05 11:16:09.000 192.168.220.232
2019 192.168.220.232 stm[3205]: <304008> <DBUG> <192.168.220.232 24:F2:7F:C8:BF:78> |ap| DEL: bridge user doesn't exist ip:fe80::1022:812b:23e1:6b9 mac:00:00:00:00:00:00
2019-07-05 11:16:09.000 192.168.220.232
2019 192.168.220.232 stm[3205]: <304008> <DBUG> <192.168.220.232 24:F2:7F:C8:BF:78> |ap| DEL: bridge user doesn't exist ip:fe80::a650:46ff:fefb:96d6 mac:00:00:00:00:00:00
2019-07-05 11:16:09.000 192.168.220.232
2019 192.168.220.232 stm[3205]: <304008> <DBUG> <192.168.220.232 24:F2:7F:C8:BF:78> |ap| user ip:fe80::b4ce:e657:5b88:1a31 mac:00:00:00:00:00:00
2019-07-05 11:16:09.000 192.168.220.232
2019 192.168.220.232 stm[3205]: <304008> <DBUG> <192.168.220.232 24:F2:7F:C8:BF:78> |ap| user ip:fe80::960e:6bff:fe49:a4d3 mac:00:00:00:00:00:00
2019-07-05 11:16:09.000 192.168.220.6
2019 192.168.220.6 stm[3149]: <400166> <DBUG> <192.168.220.6 24:F2:7F:C8:BF:7C> wifi_ap_down_rap processing started for 24:f2:7f:0b:f7:d0
2019-07-05 11:16:09.000 192.168.220.6
2019 192.168.220.6 stm[3149]: <400110> <DBUG> <192.168.220.6 24:F2:7F:C8:BF:7C> AP AP-INJU-01: Delete AP state for bssid 24:f2:7f:0b:f7:d0; Deauth 1 Clear 1
2019-07-05 11:16:09.000 192.168.220.232
2019 192.168.220.232 <192.168.220.232 24:F2:7F:C8:BF:78> claritylive[3881]: clarity: Number of dns records:0
2019-07-05 11:16:09.000 192.168.220.230
2019 192.168.220.230 stm[3203]: <400166> <DBUG> <192.168.220.230 24:F2:7F:C8:BF:74> wifi_ap_down_rap processing started for 24:f2:7f:0b:f7:50
2019-07-05 11:16:09.000 192.168.220.230
2019 192.168.220.230 <192.168.220.230 24:F2:7F:C8:BF:74> stm[3203]: stm_send_sta_offline: Sending sta offline msg to CLI0, mac='a8:8e:24:1c:90:55'
2019-07-05 11:16:09.000 192.168.220.232
2019 192.168.220.232 <192.168.220.232 24:F2:7F:C8:BF:78> KERNEL(AP-INJU-04@192.168.220.232): [73600.329189] NOHZ: local_softirq_pending 08
2019-07-05 11:16:09.000 192.168.220.232
2019 192.168.220.232 cli[3175]: <341014> <INFO> <192.168.220.232 24:F2:7F:C8:BF:78> AP rebooting Gateway unreachable.
2019-07-05 11:16:09.000 192.168.220.232
2019 192.168.220.232 stm[3205]: <304008> <DBUG> <192.168.220.232 24:F2:7F:C8:BF:78> |ap| user ip:fe80::1022:812b:23e1:6b9 mac:00:00:00:00:00:00
2019-07-05 11:16:09.000 192.168.220.232
2019 192.168.220.232 stm[3205]: <304008> <DBUG> <192.168.220.232 24:F2:7F:C8:BF:78> |ap| user ip:fe80::a650:46ff:fefb:96d6 mac:00:00:00:00:00:00
2019-07-05 11:16:09.000 192.168.220.232
2019 192.168.220.232 stm[3205]: <304008> <DBUG> <192.168.220.232 24:F2:7F:C8:BF:78> |ap| user ip:169.254.13.42 mac:00:00:00:00:00:00
2019-07-05 11:16:09.000 192.168.220.232
2019 192.168.220.232 <192.168.220.232 24:F2:7F:C8:BF:78> stm[3205]: rap_bridge_user_handler: 14775: user entry deleted for '169.254.13.42' '50:3e:aa:77:c7:19'
2019-07-05 11:16:09.000 192.168.220.232
2019 192.168.220.232 stm[3205]: <304008> <DBUG> <192.168.220.232 24:F2:7F:C8:BF:78> |ap| DEL: bridge user doesn't exist ip:fe80::960e:6bff:fe49:a4d3 mac:00:00:00:00:00:00
2019-07-05 11:16:09.000 192.168.220.232
2019 192.168.220.232 stm[3205]: <304008> <DBUG> <192.168.220.232 24:F2:7F:C8:BF:78> |ap| rap_bridge_user_handler : failed
2019-07-05 11:16:09.000 192.168.220.6
2019 192.168.220.6 stm[3149]: <304008> <DBUG> <192.168.220.6 24:F2:7F:C8:BF:7C> |ap| wifi_vap_down_from_sapd ip=127.0.0.1
2019-07-05 11:16:09.000 192.168.220.6
2019 192.168.220.6 <192.168.220.6 24:F2:7F:C8:BF:7C> stm[3149]: sap_sta_mac_ppsk_timer_start: 18393: mac ppsk timer start
2019-07-05 11:16:09.000 192.168.220.6
2019 192.168.220.6 <192.168.220.6 24:F2:7F:C8:BF:7C> stm[3149]: stm_send_sta_offline: Sending sta offline msg to CLI0, mac='ac:2b:6e:94:69:ed'
2019-07-05 11:16:09.000 192.168.220.230
2019 192.168.220.230 stm[3203]: <304008> <DBUG> <192.168.220.230 24:F2:7F:C8:BF:74> |ap| wifi_vap_down_from_sapd ip=127.0.0.1
2019-07-05 11:16:09.000 192.168.220.230
2019 192.168.220.230 <192.168.220.230 24:F2:7F:C8:BF:74> stm[3203]: stm_send_sta_offline: Sending sta offline msg to CLI0, mac='ac:2b:6e:92:cb:65'
2019-07-05 11:16:09.000 192.168.220.230
2019 192.168.220.230 stm[3203]: <400110> <DBUG> <192.168.220.230 24:F2:7F:C8:BF:74> AP AP-INJU-02: Delete AP state for bssid 24:f2:7f:0b:f7:50; Deauth 1 Clear 1
2019-07-05 11:16:09.000 192.168.220.230
2019 192.168.220.230 <192.168.220.230 24:F2:7F:C8:BF:74> stm[3203]: stm_send_sta_offline: Sending sta offline msg to CLI0, mac='a8:8e:24:1c:90:55'
2019-07-05 11:16:08.000 192.168.220.6
2019 192.168.220.6 <192.168.220.6 24:F2:7F:C8:BF:7C> sapd[3138]: SAPD received SIGTERM; exiting
2019-07-05 11:16:08.000 192.168.220.230
2019 192.168.220.230 cli[3173]: <341014> <INFO> <192.168.220.230 24:F2:7F:C8:BF:74> AP rebooting Gateway unreachable.
2019-07-05 11:16:08.000 192.168.220.6
2019 192.168.220.6 stm[3149]: <304008> <DBUG> <192.168.220.6 24:F2:7F:C8:BF:7C> |ap| DEL: bridge user doesn't exist ip:fe80::7078:add4:4831:2826 mac:00:00:00:00:00:00
2019-07-05 11:16:08.000 192.168.220.6
2019 192.168.220.6 stm[3149]: <304008> <DBUG> <192.168.220.6 24:F2:7F:C8:BF:7C> |ap| user ip:fe80::746b:302a:58c9:da47 mac:00:00:00:00:00:00
2019-07-05 11:16:08.000 192.168.220.6
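To see at a glance which APs rebooted, after how long, and for what stated reason, the <341014> lines can be pulled out of a syslog export with a short script. This is only a rough sketch: it assumes the lines look exactly like the ones pasted above, and the names REBOOT_RE and parse_reboots are made up for illustration, not part of any Aruba tool.

```python
import re
from collections import defaultdict

# Hypothetical pattern, written against the <341014> lines pasted above.
# The "System cmd at uptime ..." part is optional because some reboot
# lines in the export carry only the reason.
REBOOT_RE = re.compile(
    r"<341014>\s+<INFO>\s+"
    r"<(?P<ip>\d+\.\d+\.\d+\.\d+)\s+(?P<mac>[0-9A-Fa-f:]{17})>\s+"
    r"AP rebooting\s+"
    r"(?:System cmd at uptime\s+"
    r"(?P<d>\d+)D\s+(?P<h>\d+)H\s+(?P<m>\d+)M\s+(?P<s>\d+)S:\s*)?"
    r"(?P<reason>.+?)\s*$"
)


def parse_reboots(lines):
    """Return {ap_ip: [(uptime_seconds_or_None, reason), ...]}."""
    events = defaultdict(list)
    for line in lines:
        match = REBOOT_RE.search(line)
        if match is None:
            continue  # not a reboot message
        uptime = None
        if match.group("d") is not None:
            # Convert "8D 0H 52M 37S" style uptime into seconds.
            uptime = (
                int(match.group("d")) * 86400
                + int(match.group("h")) * 3600
                + int(match.group("m")) * 60
                + int(match.group("s"))
            )
        events[match.group("ip")].append((uptime, match.group("reason")))
    return dict(events)


if __name__ == "__main__":
    # Two lines copied from the export above.
    sample = [
        "2019 192.168.220.232 cli[3175]: <341014> <INFO> "
        "<192.168.220.232 24:F2:7F:C8:BF:78> AP rebooting System cmd "
        "at uptime 8D 0H 52M 37S: Gateway unreachable.",
        "2019 192.168.220.230 cli[3173]: <341014> <INFO> "
        "<192.168.220.230 24:F2:7F:C8:BF:74> AP rebooting Gateway unreachable.",
    ]
    for ip, reboots in parse_reboots(sample).items():
        for uptime, reason in reboots:
            print(ip, uptime, reason)
```

Run against the full export, this groups every reboot by AP, so it is easy to see whether several APs go down in the same second (as .6, .230 and .232 do in the second example) and whether the reason is always "Gateway unreachable."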
What we have already done or checked:
1) Updated the OS version to 8.3.0.6
2) Selected a different preferred master
3) Checked the power budget of the switch (it is sufficient)
4) Confirmed the default gateway: it is a Cisco MPLS router.
Thanks in advance and sorry for the long post!