Contributor I
jeffersonc
Posts: 39
Registered: ‎01-16-2010

APs Rebooting (SAPD: Unable to contact switch)

Yesterday I turned on triggers in Airwave to alert me when an AP went down/up. Since then I've discovered that a number of my APs keep rebooting. When I look at the debug log for the APs I see the following:

Feb 2 16:39:00 nanny: <303022> |AP CUC-HON-2-ADMIN@172.27.8.230 nanny| ^

Any clue what would cause that? I'm seeing it intermittently on various APs. It feels like it might be load related. (When I compare the APs that have rebooted with those that haven't, the ones in more lightly used areas seem to be the ones that haven't. I also think I've seen fewer reboots overnight, when there are fewer users online, than during the day/evening hours.) I don't think it's wired network related, as I've got APs that haven't rebooted on the same physical switch as those that have. Our controller is running 3.4.2 - any ideas?
Moderator
cjoseph
Posts: 12,036
Registered: ‎03-29-2007

Topology and configuration

How are those access points connected to the controller? Are many access points connected to the controller through a possibly congested link? Is there an LMS-IP on an AP that redirects an access point to a different location, and that destination is unreachable at times?

This is part of the troubleshooting methodology.
Colin Joseph
Aruba Customer Engineering
Contributor I
jeffersonc
Posts: 39
Registered: ‎01-16-2010

Re: APs Rebooting (SAPD: Unable to contact switch)


We have two controllers in a master/local configuration. The first is an M3 controller. It has a 10 Gb link to a Cisco 6500 chassis. The second is an SC1. It has a Gb link to that same 6500 chassis. The APs I'm having problems with are in an AP system group that points to the SC1 as primary with the M3 as backup. The APs themselves are a few L2 hops away. Most of those links are Gb, but a couple of APs connect to switches that only use 100 Mb uplinks. In any event, my monitoring shows these links are virtually always below 5% utilization. (I'll occasionally see spikes up to the 25% range.) Further, the only link shared by the APs I've seen reboot is shared by all of my APs. As a result, I don't think this is an L2 link congestion issue.
Moderator
cjoseph
Posts: 12,036
Registered: ‎03-29-2007

show ap debug counters

Execute the "show ap debug counters" command to see how many APs are rebootstrapping. Normally the line after the "nanny" line tells why the AP rebooted (show log system x), and that is usually the best way to find out. You can also use "show ap debug system-status ap-name <ap-name>", which tells you why an AP rebooted:

(M3.arubanetworks.com) # show ap debug system-status ap-name testap-rap5

Reboot Information
------------------
AP rebooted Fri Dec 31 16:02:25 PST 1999; 'reboot' command executed with no reason given (called from init,rcS).
----------------------------------------------------------------------------------------------------------------

Rebootstrap Information
-----------------------
Date Time Reason (Latest 10)
--------------------------------------
2010-02-02 07:40:29 Keepalive request failed: UNKNOWN_AP
2010-02-02 08:29:10 Keepalive request failed: UNKNOWN_AP
2010-02-02 09:28:24 Keepalive request failed: UNKNOWN_AP
2010-02-02 17:11:44 Missed heartbeats on radio 0 VAP 0
2010-02-02 17:12:38 Changing to LMS #0 (216.31.249.247)
2010-02-03 07:05:07 Keepalive request failed: UNKNOWN_AP
2010-02-03 08:16:30 Bootstrap requested by STM
2010-02-03 08:16:31 Bootstrap requested by STM
2010-02-03 08:16:31 Bootstrap requested by STM
2010-02-03 10:26:45 Keepalive request failed: UNKNOWN_AP


Rebootstrap LMS
---------------
2010-02-02 17:12:38 Changing to LMS #0 (116.31.249.247)
-------------------------------------------------------


Also, look at the heartbeat statistics to see if any are being dropped. It is typical for heartbeats to be missing for remote AP deployments, and a couple here and there for Campus APs:

(M3.arubanetworks.com) # show ap debug system-status ap-name test-rap5 | begin Tunnel 
Tunnel Heartbeat Stats
----------------------
Interface Hearbeats Sent Hearbeats Received
--------- -------------- ------------------
wifi000 1 1
wifi001 1 1
wifi002 0 0
eth1 2069323 2047283
eth2 2069342 2046883
eth3 0 0
eth4 0 0



For Campus APs, heartbeats sent should match heartbeats received; if they don't, you are having packet loss. Heartbeats are normally sent once per second.
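To make the sent/received comparison concrete, here is a minimal Python sketch that parses a pasted "Tunnel Heartbeat Stats" table and reports the gap per interface. The function name heartbeat_loss and the parsing approach are illustrative assumptions, not an Aruba tool; the sample rows are copied from the output above.

```python
import re

# Sample rows copied from the "Tunnel Heartbeat Stats" output above.
SAMPLE = """\
Interface Hearbeats Sent Hearbeats Received
--------- -------------- ------------------
wifi000 1 1
wifi001 1 1
wifi002 0 0
eth1 2069323 2047283
eth2 2069342 2046883
eth3 0 0
eth4 0 0
"""

# A data row is: interface name, heartbeats sent, heartbeats received.
ROW = re.compile(r"^(\S+)\s+(\d+)\s+(\d+)$")


def heartbeat_loss(text):
    """Return {interface: (sent, received, missing, loss_percent)}.

    Hypothetical helper: header, ruler, and blank lines are skipped
    because they do not match the three-column numeric pattern.
    """
    stats = {}
    for line in text.splitlines():
        m = ROW.match(line.strip())
        if not m:
            continue
        name, sent, received = m.group(1), int(m.group(2)), int(m.group(3))
        missing = sent - received
        pct = 100.0 * missing / sent if sent else 0.0
        stats[name] = (sent, received, missing, pct)
    return stats


if __name__ == "__main__":
    for name, (sent, received, missing, pct) in heartbeat_loss(SAMPLE).items():
        print(f"{name:8s} sent={sent} received={received} "
              f"missing={missing} loss={pct:.2f}%")
```

On the sample above, eth1 and eth2 each show roughly 1% of heartbeats unanswered, which matches the remark that some loss is typical for a remote AP.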
Colin Joseph
Aruba Customer Engineering
Contributor I
jeffersonc
Posts: 39
Registered: ‎01-16-2010

Re: APs Rebooting (SAPD: Unable to contact switch)


# show ap debug counters group CUC-MAIN

AP Counters
-----------
Name Group IP Address Configs Sent Configs Acked AP Boots Sent AP Boots Acked Bootstraps Reboots
---- ----- ---------- ------------ ------------- ------------- -------------- ---------- -------
CUC-CFS-1-FMCS CUC-Main 172.27.8.250 2060 2060 0 0 3 2
CUC-CFS-1-HR CUC-Main 172.27.8.232 13 13 0 0 10 9
CUC-CFS-1-NE CUC-Main 172.27.8.251 6 6 0 0 6 4
CUC-HON-1-ELEVATOR CUC-Main 172.27.8.202 1986 1986 0 0 7 6
CUC-HON-1-NORTH CUC-Main 172.27.8.201 1227 1227 0 0 6 5
CUC-HON-1-PRES CUC-Main 172.27.8.199 2332 2332 0 0 5 4
CUC-HON-1-WEST CUC-Main 172.27.8.200 1726 1726 0 0 7 5
CUC-HON-2-ADMIN CUC-Main 172.27.8.230 51 51 0 0 8 6
CUC-HON-2-ADMIN-CONF CUC-Main 172.27.8.226 60 59 0 0 9 7
CUC-HON-2-NORTH CUC-Main 172.27.8.203 1453 1453 0 0 5 4
CUC-HON-2-SOUTH CUC-Main 172.27.8.204 1119 1119 0 0 6 5
CUC-HON-3-ASIAN-STUDIES CUC-Main 172.27.8.228 274 274 0 0 30 29
CUC-HON-3-ASIAN-STUDIES-NORTH CUC-Main 172.27.8.234 320 320 0 0 21 20
CUC-HON-3-EAST CUC-Main 172.27.8.223 99 99 0 0 16 16
CUC-HON-3-LIT CUC-Main 172.27.8.227 130 130 0 0 8 7
CUC-HON-4-EAST CUC-Main 172.27.8.216 79 79 0 0 6 4
CUC-HON-4-ELEVATOR CUC-Main 172.27.8.219 83 83 0 0 6 5
CUC-HON-4-IRVINE CUC-Main 172.27.8.236 293 293 0 0 10 9
CUC-HON-4-NORTH CUC-Main 172.27.8.246 35 35 0 0 8 7
CUC-HON-4-SOUTH CUC-Main 172.27.8.214 86 86 0 0 5 4
CUC-HON-4-WEST CUC-Main 172.27.8.231 964 964 0 0 4 3
CUC-HUNT-1-N CUC-Main 172.27.8.205 20 20 0 0 9 5
CUC-HUNT-1-S CUC-Main 172.27.8.206 22 22 0 0 34 11
CUC-MCA-0-DISABILITY CUC-Main 172.27.8.252 112 112 0 0 4 3
CUC-MCA-1-MEETING CUC-Main 172.27.8.229 87 86 0 0 6 5
CUC-MCA-1-RECEPTION CUC-Main 172.27.8.249 5636 5635 0 0 3 2
CUC-MUDD-1-MATERIALS-HANDLING CUC-Main 172.27.8.241 63 63 0 0 4 2
CUC-MUDD-1-MULTIMEDIA CUC-Main 172.27.8.240 216 216 0 0 5 4
CUC-MUDD-1-WEST-CLOSET CUC-Main 172.27.8.242 618 618 0 0 8 7
CUC-MUDD-2-CIRC CUC-Main 172.27.8.225 97 96 0 0 6 5
CUC-MUDD-2-KLR-AV CUC-Main 172.27.8.254 478 478 0 0 5 4
CUC-MUDD-2-NORTH-EAST CUC-Main 172.27.8.224 126 126 0 0 7 6


AP Counters
-----------
Name Group IP Address Configs Sent Configs Acked AP Boots Sent AP Boots Acked Bootstraps Reboots
---- ----- ---------- ------------ ------------- ------------- -------------- ---------- -------
CUC-MUDD-2-SOUTH-BELVEDERE CUC-Main 172.27.8.245 74 73 0 0 12 10
CUC-MUDD-3-IRIS-N CUC-Main 172.27.8.235 114 113 0 0 27 23
CUC-MUDD-3-KECK-2 CUC-Main 172.27.8.222 58 57 0 0 6 5
CUC-MUDD-3-NEW-LIBRARY CUC-Main 172.27.8.238 62 62 0 0 11 10
CUC-MUDD-3-NL-NORTH CUC-Main 172.27.8.221 3 3 0 0 4 2
CUC-MUDD-3-NL-SOUTH CUC-Main 172.27.8.248 62 61 0 0 9 8
CUC-OBSA-2-N CUC-Main 172.27.8.207 82 82 0 0 17 15
CUC-PEN-1-BENEFITS CUC-Main 172.27.8.217 198 198 0 0 9 8
CUC-PEN-1-FS-EAST CUC-Main 172.27.8.247 29 29 0 0 4 2
CUC-PEN-1-FS-WEST CUC-Main 172.27.8.220 3731 3731 0 0 4 2
CUC-REC-1-ENTRY CUC-Main 172.27.8.208 11 11 0 0 5 3
CUC-REC-1-MAIN-ROOM CUC-Main 172.27.8.212 21 21 0 0 4 2
CUC-REC-2-CONF CUC-Main 172.27.8.211 5 5 0 0 13 6
CUC-REC-2-NE CUC-Main 172.27.8.213 46 45 0 0 4 3
CUC-REC-2-SE CUC-Main 172.27.8.209 27 26 0 0 4 2
CUC-REC-2-SW CUC-Main 172.27.8.210 9 9 0 0 7 5
CUC-TEL-1-OFFICE CUC-Main 172.27.8.218 3 3 0 0 8 5
CUC-TRA-1-MONSOUR-NORTH CUC-Main 172.27.8.253 15 14 0 0 4 2
CUC-TRA-1-MONSOUR-SOUTH CUC-Main 172.27.8.244 7 7 0 0 4 2
CUC-TRA-1-N CUC-Main 172.27.8.243 6 6 0 0 9 8
CUC-TRA-1-SHS CUC-Main 172.27.8.239 1292 1292 0 0 4 3
CUC-TRA-2-CONFERENCE CUC-Main 172.27.8.233 19 19 0 0 7 5
Total APs :54

# show ap debug counters group CUC-IT

AP Counters
-----------
Name Group IP Address Configs Sent Configs Acked AP Boots Sent AP Boots Acked Bootstraps Reboots
---- ----- ---------- ------------ ------------- ------------- -------------- ---------- -------
CUC-PEN-0-DATACENTER CUC-IT 172.27.8.237 3 3 0 0 14 8
CUC-PEN-0-IT-FRONT CUC-IT 172.27.8.215 12 12 0 0 22 19
Total APs :2
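
With this many rows it is easier to sort the table than to read it. The following is a rough Python sketch, assuming the column layout shown above (name, group, IP address, then six integer counters); parse_counters and worst_offenders are made-up helper names, and the sample holds only a few rows copied from this output.

```python
import re
from typing import List, NamedTuple

# A few rows copied from the output above; paste the full capture in practice.
SAMPLE = """\
Name Group IP Address Configs Sent Configs Acked AP Boots Sent AP Boots Acked Bootstraps Reboots
---- ----- ---------- ------------ ------------- ------------- -------------- ---------- -------
CUC-CFS-1-FMCS CUC-Main 172.27.8.250 2060 2060 0 0 3 2
CUC-HON-3-ASIAN-STUDIES CUC-Main 172.27.8.228 274 274 0 0 30 29
CUC-HUNT-1-S CUC-Main 172.27.8.206 22 22 0 0 34 11
CUC-MUDD-3-IRIS-N CUC-Main 172.27.8.235 114 113 0 0 27 23
CUC-PEN-0-IT-FRONT CUC-IT 172.27.8.215 12 12 0 0 22 19
"""


class ApCounters(NamedTuple):
    name: str
    group: str
    ip: str
    configs_sent: int
    configs_acked: int
    bootstraps: int
    reboots: int


# Data rows: name, group, dotted-quad IP, then six integer counters.
ROW = re.compile(
    r"^(\S+)\s+(\S+)\s+(\d{1,3}(?:\.\d{1,3}){3})"
    r"\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)$"
)


def parse_counters(text: str) -> List[ApCounters]:
    """Parse pasted counter rows; headers and rulers are ignored."""
    rows = []
    for line in text.splitlines():
        m = ROW.match(line.strip())
        if not m:
            continue
        name, group, ip = m.group(1, 2, 3)
        sent, acked, _boots_sent, _boots_acked, bootstraps, reboots = map(
            int, m.group(4, 5, 6, 7, 8, 9)
        )
        rows.append(ApCounters(name, group, ip, sent, acked, bootstraps, reboots))
    return rows


def worst_offenders(rows: List[ApCounters], top: int = 5) -> List[ApCounters]:
    """Sort by reboots, then bootstraps, highest first."""
    return sorted(rows, key=lambda r: (r.reboots, r.bootstraps), reverse=True)[:top]


if __name__ == "__main__":
    for ap in worst_offenders(parse_counters(SAMPLE), top=3):
        print(f"{ap.name:28s} reboots={ap.reboots:3d} bootstraps={ap.bootstraps:3d}")
```

Sorting this way puts APs such as CUC-HON-3-ASIAN-STUDIES and CUC-MUDD-3-IRIS-N at the top, which gives a short list to compare against location and client load.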



#show ap debug system-status ap-name CUC-PEN-0-DATACENTER

Reboot Information
------------------
(none found)
------------

Rebootstrap Information
-----------------------
(none found)
------------

#show ap debug system-status ap-name CUC-PEN-0-IT-FRONT

Reboot Information
------------------
AP rebooted Tue Feb 2 22:16:25 PST 2010: SAPD: Unable to contact switch. Called by sapd_hello_cb:4
--------------------------------------------------------------------------------------------------

Rebootstrap Information
-----------------------
2010-02-03 09:38:17 Switching to primary LMS 134.173.64.7
---------------------------------------------------------


When I look at the web interface to see uptime for all my APs, I see some APs that have been up for 20 days (we had a controller reboot at that point) and others that have rebooted within the last hour.
Contributor I
jeffersonc
Posts: 39
Registered: ‎01-16-2010

Re: APs Rebooting (SAPD: Unable to contact switch)


While a couple show missed heartbeats, most don't:

# show log system 50 | include "CUC-"
Feb 2 14:03:36 :303022: |AP CUC-REC-2-NE@172.27.8.213 nanny| Reboot Reason: AP rebooted Tue Feb 2 14:02:15 PST 2010: SAPD: Unable to contact switch. Called by sapd_hello_cb:4
Feb 2 14:03:49 :304002: |stm| AP CUC-REC-2-NE: No response from authmgr for BSSID 00:1a:1e:e8:1f:88
Feb 2 14:37:26 :303022: |AP CUC-HON-1-NORTH@172.27.8.201 nanny| Reboot Reason: No reboot message found.
Feb 2 16:01:35 :303022: |AP CUC-HON-2-ADMIN@172.27.8.230 nanny| Reboot Reason: AP rebooted Tue Feb 2 16:00:18 PST 2010: SAPD: Unable to contact switch. Called by sapd_hello_cb:4
Feb 2 16:14:08 :303022: |AP CUC-HON-2-ADMIN@172.27.8.230 nanny| Reboot Reason: AP rebooted Tue Feb 2 16:12:51 PST 2010: SAPD: Unable to contact switch. Called by sapd_hello_cb:4
Feb 2 16:26:07 :303022: |AP CUC-HON-2-ADMIN@172.27.8.230 nanny| Reboot Reason: AP rebooted Tue Feb 2 16:24:52 PST 2010: SAPD: Unable to contact switch. Called by sapd_hello_cb:4
Feb 2 16:27:48 :303022: |AP CUC-MUDD-3-KECK-2@172.27.8.222 nanny| Reboot Reason: AP rebooted Tue Feb 2 16:26:32 PST 2010: SAPD: Unable to contact switch. Called by sapd_hello_cb:4
Feb 2 16:29:54 :311004: |AP CUC-TRA-1-N@172.27.8.243 sapd| Missed 8 heartbeats on radio 0 VAP 0; rebootstrapping
Feb 2 16:39:00 :303022: |AP CUC-HON-2-ADMIN@172.27.8.230 nanny| Reboot Reason: AP rebooted Tue Feb 2 16:37:45 PST 2010: SAPD: Unable to contact switch. Called by sapd_hello_cb:4
Feb 2 16:41:34 :303022: |AP CUC-TRA-1-N@172.27.8.243 nanny| Reboot Reason: AP rebooted Tue Feb 2 16:40:11 PST 2010: SAPD: Unable to contact switch. Called by sapd_hello_cb:4
Feb 2 16:46:40 :303022: |AP CUC-TEL-1-OFFICE@172.27.8.218 nanny| Reboot Reason: AP rebooted Tue Feb 2 16:45:18 PST 2010: SAPD: Unable to contact switch. Called by sapd_hello_cb:4
Feb 2 17:00:15 :303022: |AP CUC-TRA-1-N@172.27.8.243 nanny| Reboot Reason: AP rebooted Tue Feb 2 16:58:53 PST 2010: SAPD: Unable to contact switch. Called by sapd_hello_cb:4
Feb 2 17:03:44 :303022: |AP CUC-MUDD-3-KECK-2@172.27.8.222 nanny| Reboot Reason: AP rebooted Tue Feb 2 17:02:26 PST 2010: SAPD: Unable to contact switch. Called by sapd_hello_cb:4
Feb 2 17:18:48 :303022: |AP CUC-TRA-1-N@172.27.8.243 nanny| Reboot Reason: AP rebooted Tue Feb 2 17:17:28 PST 2010: SAPD: Unable to contact switch. Called by sapd_hello_cb:4
Feb 2 17:34:03 :303022: |AP CUC-TRA-1-N@172.27.8.243 nanny| Reboot Reason: AP rebooted Tue Feb 2 17:32:43 PST 2010: SAPD: Unable to contact switch. Called by sapd_hello_cb:4
Feb 2 17:37:35 :303022: |AP CUC-MCA-1-MEETING@172.27.8.229 nanny| Reboot Reason: AP rebooted Tue Feb 2 17:36:12 PST 2010: SAPD: Unable to contact switch. Called by sapd_hello_cb:4
Feb 2 17:42:00 :303022: |AP CUC-HON-4-NORTH@172.27.8.246 nanny| Reboot Reason: AP rebooted Tue Feb 2 17:40:45 PST 2010: SAPD: Unable to contact switch. Called by sapd_hello_cb:4
Feb 2 18:07:36 :303022: |AP CUC-HON-4-NORTH@172.27.8.246 nanny| Reboot Reason: AP rebooted Tue Feb 2 18:06:19 PST 2010: SAPD: Unable to contact switch. Called by sapd_hello_cb:4
Feb 2 18:49:24 :303022: |AP CUC-TEL-1-OFFICE@172.27.8.218 nanny| Reboot Reason: AP rebooted Tue Feb 2 18:48:02 PST 2010: SAPD: Unable to contact switch. Called by sapd_hello_cb:4
Feb 2 19:08:49 :311004: |AP CUC-PEN-0-IT-FRONT@172.27.8.215 sapd| Missed 8 heartbeats on radio 0 VAP 1; rebootstrapping
Feb 2 19:08:49 :304002: |stm| AP CUC-PEN-0-IT-FRONT: No response from authmgr for BSSID 00:1a:1e:e7:bc:e9
Feb 2 19:14:40 :303022: |AP CUC-MUDD-2-CIRC@172.27.8.225 nanny| Reboot Reason: AP rebooted Tue Feb 2 19:13:22 PST 2010: SAPD: Unable to contact switch. Called by sapd_hello_cb:4
Feb 2 19:19:23 :303022: |AP CUC-PEN-0-IT-FRONT@172.27.8.215 nanny| Reboot Reason: AP rebooted Tue Feb 2 19:18:04 PST 2010: SAPD: Unable to contact switch. Called by sapd_hello_cb:4
Feb 2 19:38:29 :303022: |AP CUC-TRA-1-N@172.27.8.243 nanny| Reboot Reason: AP rebooted Tue Feb 2 19:37:06 PST 2010: SAPD: Unable to contact switch. Called by sapd_hello_cb:4
Feb 2 20:09:18 :303022: |AP CUC-TEL-1-OFFICE@172.27.8.218 nanny| Reboot Reason: AP rebooted Tue Feb 2 20:07:59 PST 2010: SAPD: Unable to contact switch. Called by sapd_hello_cb:4
Feb 2 20:09:31 :304002: |stm| AP CUC-TEL-1-OFFICE: No response from authmgr for BSSID 00:1a:1e:e7:bc:88
Feb 2 20:42:38 :303022: |AP CUC-HON-4-NORTH@172.27.8.246 nanny| Reboot Reason: AP rebooted Tue Feb 2 20:41:23 PST 2010: SAPD: Unable to contact switch. Called by sapd_hello_cb:4
Feb 2 20:57:06 :311004: |AP CUC-PEN-0-DATACENTER@172.27.8.237 sapd| Missed 8 heartbeats on radio 0 VAP 1; rebootstrapping
Feb 2 20:57:08 :304002: |stm| AP CUC-PEN-0-DATACENTER: No response from authmgr for BSSID 00:1a:1e:e8:0b:49
Feb 2 20:57:22 :311004: |AP CUC-PEN-0-IT-FRONT@172.27.8.215 sapd| Missed 8 heartbeats on radio 0 VAP 1; rebootstrapping
Feb 2 20:57:23 :304002: |stm| AP CUC-PEN-0-IT-FRONT: No response from authmgr for BSSID 00:1a:1e:e7:bc:e9
Feb 2 20:57:38 :304002: |stm| AP CUC-PEN-0-DATACENTER: No response from authmgr for BSSID 00:1a:1e:e8:0b:49
Feb 2 21:08:33 :303022: |AP CUC-PEN-0-IT-FRONT@172.27.8.215 nanny| Reboot Reason: AP rebooted Tue Feb 2 21:07:11 PST 2010: SAPD: Unable to contact switch. Called by sapd_hello_cb:4
Feb 2 21:09:45 :303022: |AP CUC-PEN-0-DATACENTER@172.27.8.237 nanny| Reboot Reason: AP rebooted Tue Feb 2 21:08:23 PST 2010: SAPD: Unable to contact switch. Called by sapd_hello_cb:4
Feb 2 21:32:11 :303022: |AP CUC-PEN-0-IT-FRONT@172.27.8.215 nanny| Reboot Reason: AP rebooted Tue Feb 2 21:30:49 PST 2010: SAPD: Unable to contact switch. Called by sapd_hello_cb:4
Feb 2 21:32:25 :304002: |stm| AP CUC-PEN-0-IT-FRONT: No response from authmgr for BSSID 00:1a:1e:e7:bc:e8
Feb 2 21:37:10 :311004: |AP CUC-PEN-0-IT-FRONT@172.27.8.215 sapd| Missed 8 heartbeats on radio 0 VAP 1; rebootstrapping
Feb 2 21:37:10 :304002: |stm| AP CUC-PEN-0-IT-FRONT: No response from authmgr for BSSID 00:1a:1e:e7:bc:e9
Feb 2 21:37:20 :303022: |AP CUC-MUDD-3-NL-SOUTH@172.27.8.248 nanny| Reboot Reason: AP rebooted Tue Feb 2 21:36:03 PST 2010: SAPD: Unable to contact switch. Called by sapd_hello_cb:4
Feb 2 21:37:57 :304002: |stm| AP CUC-PEN-0-IT-FRONT: No response from authmgr for BSSID 00:1a:1e:e7:bc:e9
Feb 2 21:41:31 :303022: |AP CUC-HUNT-1-S@172.27.8.206 nanny| Reboot Reason: AP rebooted Tue Feb 2 21:40:01 PST 2010: SAPD: Unable to contact switch. Called by sapd_hello_cb:4
Feb 2 21:43:50 :303022: |AP CUC-PEN-0-DATACENTER@172.27.8.237 nanny| Reboot Reason: AP rebooted Tue Feb 2 21:42:27 PST 2010: SAPD: Unable to contact switch. Called by sapd_hello_cb:4
Feb 2 21:47:09 :303022: |AP CUC-PEN-0-IT-FRONT@172.27.8.215 nanny| Reboot Reason: AP rebooted Tue Feb 2 21:45:49 PST 2010: SAPD: Unable to contact switch. Called by sapd_hello_cb:4
Feb 2 21:53:52 :303022: |AP CUC-HUNT-1-S@172.27.8.206 nanny| Reboot Reason: AP rebooted Tue Feb 2 21:52:24 PST 2010: SAPD: Unable to contact switch. Called by sapd_hello_cb:4
Feb 2 22:04:04 :303022: |AP CUC-PEN-0-IT-FRONT@172.27.8.215 nanny| Reboot Reason: AP rebooted Tue Feb 2 22:02:43 PST 2010: SAPD: Unable to contact switch. Called by sapd_hello_cb:4
Feb 2 22:15:55 :303022: |AP CUC-PEN-0-DATACENTER@172.27.8.237 nanny| Reboot Reason: AP rebooted Tue Feb 2 22:14:34 PST 2010: SAPD: Unable to contact switch. Called by sapd_hello_cb:4
Feb 2 22:17:45 :303022: |AP CUC-PEN-0-IT-FRONT@172.27.8.215 nanny| Reboot Reason: AP rebooted Tue Feb 2 22:16:25 PST 2010: SAPD: Unable to contact switch. Called by sapd_hello_cb:4
Feb 3 10:42:45 :303022: |AP CUC-MCA-1-MEETING@172.27.8.229 nanny| Reboot Reason: No reboot message found.
Feb 3 12:02:25 :303022: |AP CUC-PEN-0-DATACENTER@172.27.8.237 nanny| Reboot Reason: No reboot message found.
Feb 3 12:14:27 :303022: |AP CUC-PEN-0-DATACENTER@172.27.8.237 nanny| Reboot Reason: No reboot message found.
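
To check the load theory against these logs, the nanny lines can be tallied per AP, per hour of day, and per reason. This is a small Python sketch under the assumption that the lines keep the format shown above (message ID 303022, "|AP name@ip nanny|", then "Reboot Reason:"); summarize is a hypothetical helper and the sample is a handful of lines copied from this log.

```python
import re
from collections import Counter

# A handful of lines copied from the log above.
SAMPLE = """\
Feb 2 16:01:35 :303022: |AP CUC-HON-2-ADMIN@172.27.8.230 nanny| Reboot Reason: AP rebooted Tue Feb 2 16:00:18 PST 2010: SAPD: Unable to contact switch. Called by sapd_hello_cb:4
Feb 2 16:14:08 :303022: |AP CUC-HON-2-ADMIN@172.27.8.230 nanny| Reboot Reason: AP rebooted Tue Feb 2 16:12:51 PST 2010: SAPD: Unable to contact switch. Called by sapd_hello_cb:4
Feb 2 16:29:54 :311004: |AP CUC-TRA-1-N@172.27.8.243 sapd| Missed 8 heartbeats on radio 0 VAP 0; rebootstrapping
Feb 2 16:41:34 :303022: |AP CUC-TRA-1-N@172.27.8.243 nanny| Reboot Reason: AP rebooted Tue Feb 2 16:40:11 PST 2010: SAPD: Unable to contact switch. Called by sapd_hello_cb:4
Feb 2 21:08:33 :303022: |AP CUC-PEN-0-IT-FRONT@172.27.8.215 nanny| Reboot Reason: AP rebooted Tue Feb 2 21:07:11 PST 2010: SAPD: Unable to contact switch. Called by sapd_hello_cb:4
Feb 3 10:42:45 :303022: |AP CUC-MCA-1-MEETING@172.27.8.229 nanny| Reboot Reason: No reboot message found.
"""

# Captures: hour the controller logged the line, AP name, reason text.
NANNY = re.compile(
    r"^\w{3}\s+\d+\s+(\d{2}):\d{2}:\d{2}\s+:303022:\s+"
    r"\|AP (\S+)@\S+ nanny\|\s+Reboot Reason:\s+(.*)$"
)

# Strips the "AP rebooted <timestamp>: " prefix so identical reasons group together.
PREFIX = re.compile(r"^AP rebooted .*?\d{4}:\s*")


def summarize(text):
    """Return (per_ap, per_hour, per_reason) Counters for nanny reboot lines."""
    per_ap, per_hour, per_reason = Counter(), Counter(), Counter()
    for line in text.splitlines():
        m = NANNY.match(line.strip())
        if not m:
            continue  # sapd/stm lines and anything else are ignored here
        hour, ap, reason = int(m.group(1)), m.group(2), m.group(3)
        per_ap[ap] += 1
        # Hour of the log entry, i.e. when the AP reported back, not the reboot itself.
        per_hour[hour] += 1
        per_reason[PREFIX.sub("", reason)] += 1
    return per_ap, per_hour, per_reason


if __name__ == "__main__":
    per_ap, per_hour, per_reason = summarize(SAMPLE)
    print("By AP:    ", per_ap.most_common())
    print("By hour:  ", sorted(per_hour.items()))
    print("By reason:", per_reason.most_common())
```

Run against the full log, the per-hour tally shows whether reboots cluster in busy hours, and the per-AP tally shows whether a few APs account for most of them.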
Moderator
cjoseph
Posts: 12,036
Registered: ‎03-29-2007

Network

Are those access points connecting to a VRRP address (134.173.64.7)? Is it possible that VRRP flapped at Tue Feb 2 22:16:25 PST 2010? Run "show log network (x)" to see if your interfaces went up or down at that time. Depending on how busy your network is, the logs might have rolled over, so logging to an external syslog server might be a good bet. The output of "show ap debug system-status ap-name" also shows what the port that the AP is on negotiated to. If it is anything different from what you expect, the resulting packet loss could cause rebootstraps.
Colin Joseph
Aruba Customer Engineering
Contributor I
jeffersonc
Posts: 39
Registered: ‎01-16-2010

Re: APs Rebooting (SAPD: Unable to contact switch)



This AP has rebooted a number of times but doesn't report having missed a heartbeat.

# show ap debug system-status ap-name CUC-PEN-0-DATACENTER | begin Tunnel
Tunnel Heartbeat Stats
----------------------
Interface Heartbeats Sent Heartbeats Received
--------- --------------- -------------------
aruba000 1483 1483
aruba001 1509 1509
aruba002 1509 1509
aruba003 1509 1509
aruba100 1508 1508
aruba101 1508 1508
aruba102 1508 1508
aruba103 1508 1508

LMS Information
---------------
Item Value
---- -----
Primary LMS 134.173.64.7
Backup LMS 134.173.64.23
Using Primary
Preemption Enabled
Hold-down period 60
VRRP No


#show ap debug counters ap-name CUC-PEN-0-DATACENTER

AP Counters
-----------
Name Group IP Address Configs Sent Configs Acked AP Boots Sent AP Boots Acked Bootstraps Reboots
---- ----- ---------- ------------ ------------- ------------- -------------- ---------- -------
CUC-PEN-0-DATACENTER CUC-IT 172.27.8.237 5 5 0 0 14 8


(Sorry for the fact this was 3 replies - I hit character limits for vBulletin.)
Contributor I
jeffersonc
Posts: 39
Registered: ‎01-16-2010

Re: APs Rebooting (SAPD: Unable to contact switch)




"show log network all" is empty for me at the moment. Do the Aruba controllers use separate logs for network and system? i.e. Does the fact that "show log system" shows me far enough back that I can see APs rebooting and there's nothing in "show log network" imply there wasn't a network issue? ("show log all" goes back to Feb 1.)

We aren't using VRRP; we don't have any redundancy for the master controller.

The speed/duplex settings are what I'd expect.

Moderator
cjoseph
Posts: 12,036
Registered: ‎03-29-2007

show log network

The "show log network" command would show if a link on the controller went up or down.

I'm going to ask you to open a case, because there are plenty more logs to look at, and with them this can be narrowed down more quickly than by me guessing here. It would seem that you have some sort of connectivity issue between the access points and the Aruba controller that comes and goes periodically. Looking at all the logs would give a better indication of what is happening.
Colin Joseph
Aruba Customer Engineering