Environment : Any Aruba Controller
Legacy Access Points - AP-61/65/70/8x
Aruba OS 5.0+
AP-70s are in a reboot cycle on the Controller.
This article assumes that CPSec (Control-Plane Security) is enabled on the Controller. The main cause is loss of AP-to-Controller connectivity during a power shutdown. When the Controller and APs come back up, the APs need to be re-approved. If connectivity remains intermittent, the APs may get stuck in the "certified-hold-switch-cert" state. In such cases, the APs either recover automatically or, if connectivity takes too long to be restored, remain in this state until they are manually approved from the Controller.
To troubleshoot APs in a reboot cycle, work through the following steps:
1) Check the AP database for the actual status of the APs (Down, Rebooting, etc.) and their Flags for a first clue.
# show ap database
For this case, we noticed:
AP Database
-----------
Name     Group        AP Type  IP Address    Status     Flags  Switch IP
----     -----        -------  ----------    ------     -----  ---------
Aruba-1  Aruba-group  65       192.168.1.24  Down              192.168.1.2
Aruba-2  Aruba-group  65       192.168.1.32  Down              192.168.1.2
Aruba-3  Aruba-group  65       192.168.1.23  Down              192.168.1.2
Aruba-4  Aruba-group  65       192.168.1.56  Rebooting  I      192.168.1.2
Aruba-5  Aruba-group  65       192.168.1.55  Rebooting  I      192.168.1.2
Aruba-6  Aruba-group  65       192.168.1.69  Rebooting  I      192.168.1.2
Aruba-7  Aruba-group  65       192.168.1.59  Rebooting  I      192.168.1.2
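When many APs are affected, this status check can be scripted. The following is a minimal Python sketch, not an Aruba tool. It assumes the `show ap database` output has been saved as text and that Status is a single word, as in the output above. The name `parse_ap_database` is hypothetical.

```python
import re

IP = r"\d{1,3}(?:\.\d{1,3}){3}"
# name, group, AP type, AP IP, then status (plus optional flags), then switch IP
ROW = re.compile(rf"^(\S+)\s+(\S+)\s+(\S+)\s+({IP})\s+(.+?)\s+({IP})\s*$")


def parse_ap_database(text):
    """Return one dict per AP row; header and separator lines are skipped."""
    aps = []
    for line in text.splitlines():
        m = ROW.match(line.strip())
        if not m:
            continue  # not an AP row (title, header, dashes, blank)
        name, group, ap_type, ip, middle, switch_ip = m.groups()
        # Assumption: the first word is the status, anything after it is flags.
        status, _, flags = middle.partition(" ")
        aps.append({
            "name": name,
            "group": group,
            "type": ap_type,
            "ip": ip,
            "status": status,
            "flags": flags.strip(),
            "switch_ip": switch_ip,
        })
    return aps


def names_with_status(aps, status):
    """List the names of APs whose status equals the given value."""
    return [ap["name"] for ap in aps if ap["status"] == status]
```

Calling `names_with_status(parse_ap_database(output), "Rebooting")` on the saved output gives the list of APs to follow up on in the next steps.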
2) Verify whether the APs can build PAPI (UDP port 8211) and GRE (IP protocol 47) sessions with the Controller.
# show datapath session table <AP IP>
In this case, the output for one AP was:
Source IP     Destination IP  Prot  SPort  DPort  Cntr  Prio  ToS  Age  Destination  TAge  UsrIdx  UsrVer  Flags
---------     --------------  ----  -----  -----  ----  ----  ---  ---  -----------  ----  ------  ------  -----
192.168.1.24  192.168.1.2     47    0      0      0/0   0     0    0    1/2          3a16  1       1       F
192.168.1.2   192.168.1.24    47    0      0      0/0   0     0    0    1/2          31b8  1       1       F
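As a rough aid for reading this output, the sketch below checks captured session rows for the two protocols named in this step. It assumes GRE appears as IP protocol 47 and PAPI as UDP (protocol 17) on port 8211, and that the first five columns are source, destination, protocol, source port and destination port, as in the header above. `tunnel_summary` is a hypothetical helper name.

```python
def tunnel_summary(session_text, ap_ip):
    """Report whether GRE and PAPI sessions involving ap_ip appear in the text."""
    seen = {"gre": False, "papi": False}
    for line in session_text.splitlines():
        fields = line.split()
        # Need at least: source, destination, protocol, sport, dport
        if len(fields) < 5 or ap_ip not in fields[:2]:
            continue
        prot, sport, dport = fields[2], fields[3], fields[4]
        if not prot.isdigit():
            continue  # header or malformed row
        if prot == "47":
            seen["gre"] = True
        elif prot == "17" and "8211" in (sport, dport):
            seen["papi"] = True
    return seen
```

A result with `"gre": True` and `"papi": False` only means no PAPI row was present in the captured text. Sessions age out, so it is worth re-running the command before drawing conclusions.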
3) Check the system logs for the reason the APs are rebooting. Sample output from this case:
(Aruba) #show log system 30
Jan 31 02:01:38 :311002: <WARN> |AP Aruba-4@192.168.1.56 sapd| Rebooting: SAPD: Unable to install cert. Need to re-approve AP
Jan 31 02:01:38 :303086: <ERRS> |AP Aruba-4@192.168.1.56 nanny| Process Manager (nanny) shutting down - AP will reboot!
Jan 31 02:01:38 :311002: <WARN> |AP Aruba-7@192.168.1.59 sapd| Rebooting: SAPD: Unable to install cert. Need to re-approve AP
Jan 31 02:01:38 :311002: <WARN> |AP Aruba-6@192.168.1.69 sapd| Rebooting: SAPD: Unable to install cert. Need to re-approve AP
Jan 31 02:01:39 :303086: <ERRS> |AP Aruba-6@192.168.1.69 nanny| Process Manager (nanny) shutting down - AP will reboot!
Jan 31 02:01:39 :303086: <ERRS> |AP Aruba-7@192.168.1.59 nanny| Process Manager (nanny) shutting down - AP will reboot!
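The log lines above can be scanned for the re-approval message to get the list of affected APs. This is a minimal sketch that assumes the log format shown in this article (`|AP <name>@<ip> <process>| <message>`). The function name `aps_needing_reapproval` is made up for illustration.

```python
import re

# Matches "|AP <name>@<ip> <process>| <message>" as seen in the sample log
LOG_RE = re.compile(r"\|AP (\S+)@(\S+) (\S+)\|\s*(.*)$")


def aps_needing_reapproval(log_text):
    """Return the sorted names of APs that logged 'Need to re-approve AP'."""
    names = set()
    for line in log_text.splitlines():
        m = LOG_RE.search(line)
        if m and "Need to re-approve AP" in m.group(4):
            names.add(m.group(1))
    return sorted(names)
```

The nanny "shutting down" lines are ignored on purpose. They confirm the reboot but do not give the reason for it.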
4) Check the Campus AP (CAP) whitelist for the CPSec state of the APs:
(Aruba) #show whitelist-db cpsec
Control-Plane Security Whitelist-entry Details
----------------------------------------------
MAC-Address        Enable   State                       Cert-Type    Description  Revoke Text  Last Updated
-----------        ------   -----                       ---------    -----------  -----------  ------------
00:0b:86:c7:6e:32  Enabled  certified-hold-switch-cert  switch-cert                            Thu Jan 31 02:03:01 2013
00:0b:86:c7:6d:56  Enabled  certified-hold-switch-cert  switch-cert                            Thu Jan 31 02:03:01 2013
00:0b:86:c7:6f:26  Enabled  certified-hold-switch-cert  switch-cert                            Thu Jan 31 02:03:05 2013
00:0b:86:c7:6d:3e  Enabled  certified-hold-switch-cert  switch-cert                            Thu Jan 31 02:03:05 2013
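On a large whitelist, the entries held in this state can be picked out with a short script. The sketch below assumes the column order shown above (MAC address, Enable, State, ...) and saved command output. `macs_in_state` is a hypothetical name.

```python
import re

MAC_RE = re.compile(r"^(?:[0-9a-f]{2}:){5}[0-9a-f]{2}$", re.IGNORECASE)


def macs_in_state(whitelist_text, state="certified-hold-switch-cert"):
    """Return MAC addresses of whitelist rows whose State column equals state."""
    macs = []
    for line in whitelist_text.splitlines():
        fields = line.split()
        # Data rows start with a MAC address; the third column is the state.
        if len(fields) >= 3 and MAC_RE.match(fields[0]) and fields[2] == state:
            macs.append(fields[0])
    return macs
```

Comparing this list with the AP MAC addresses from `show ap database long` ties each held entry back to an AP name.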
The resolution is to manually approve the APs: go to Configuration > Wireless > AP Installation > Campus Whitelist, select the AP MAC address, click Modify, and set the state to "approved-ready-for-cert". After manual approval, the AP reboots and comes up normally:
(Aruba) #show ap database long | include 31.45
Aruba-4 Aruba-group 65 192.168.1.56 Generating CSR I 192.168.1.2 00:0b:86:c7:6e:32 A62203877 1/10 57\.2\.2417.Floor 1.Aruba-group.Main Campus N/A
(Aruba) #show ap database long | include 31.45
Aruba-4 Aruba-group 65 192.168.1.56 Installing cert I 192.168.1.2 00:0b:86:c7:6e:32 A62203877 1/10 57\.2\.2417.Floor 1.Aruba-group.Main Campus N/A
(Aruba) #show ap database long | include 31.45
Aruba-4 Aruba-group 65 192.168.1.56 Rebooting I 192.168.1.2 00:0b:86:c7:6e:32 A62203877 1/10 57\.2\.2417.Floor 1.Aruba-group.Main Campus N/A
(Aruba) #show ap database long | include 31.45
Aruba-4 Aruba-group 65 192.168.1.56 Up 1m:1s 2 192.168.1.2 00:0b:86:c7:6e:32 A62203877 1/10 57\.2\.2417.Floor 1.Aruba-group.Main Campus N/A
(Aruba) #show ap active
Active AP Table
---------------
Name     Group        IP Address    11g Clients  11g Ch/EIRP/MaxEIRP  11a Clients  11a Ch/EIRP/MaxEIRP  AP Type  Flags  Uptime  Outer IP
----     -----        ----------    -----------  -------------------  -----------  -------------------  -------  -----  ------  --------
Aruba-4  Aruba-group  192.168.1.56  0            AP:6/9/20.5          0            AP:165/15/21         65       A2a    1m:8s   N/A
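The approval described above is done in the WebUI. Where many APs are held, a CLI equivalent may be more practical. The sketch below only builds command strings for review and does not connect to a Controller. It assumes your ArubaOS release accepts `whitelist-db cpsec modify mac-address <mac> state approved-ready-for-cert`; confirm this against the CLI reference for that release before use. `build_approve_commands` is a hypothetical helper.

```python
import re

MAC_RE = re.compile(r"^(?:[0-9a-f]{2}:){5}[0-9a-f]{2}$", re.IGNORECASE)

# Assumed CLI form; verify it on your ArubaOS release before pasting.
COMMAND_TEMPLATE = "whitelist-db cpsec modify mac-address {mac} state approved-ready-for-cert"


def build_approve_commands(macs):
    """Return one approval command string per MAC address, rejecting bad input."""
    commands = []
    for mac in macs:
        if not MAC_RE.match(mac):
            raise ValueError(f"not a MAC address: {mac!r}")
        commands.append(COMMAND_TEMPLATE.format(mac=mac.lower()))
    return commands
```

Printing the result and reviewing it before pasting keeps a person in the loop, which matters because approving an entry makes the Controller issue that AP a certificate.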