We have been monitoring an Aruba 135 that we set up in the cafeteria of one of our high schools. All of the APs in this building are set up as remote APs rather than campus APs. We wanted to monitor AP and client health during the AP's peak usage time, the students' lunch hour, when we have seen upwards of 100 clients on the AP. We were unsure how much load the AP could handle, so we were using the students as a test. While monitoring the AP, I noticed all of the clients drop for roughly 30 seconds. Most of the clients re-associated fine, but a few seemed to have trouble re-associating. We also noticed on our controller that the AP's tunnel IP changed during this brief outage.
Here is the info I pulled from the debug log for the AP on the controller. The outage occurred right at 12:00, and you can see the AP's tunnel IP change from 192.168.74.122 to 192.168.74.35. I believe the bolded text could be significant, but I am unsure how to interpret it.
Nov 6 11:58:44 stm[1413]: <132094> <WARN> |AP RHS_CAFE_330041@192.168.74.122 stm| ^[msg WPA2 Key Message 2] [mac c8:d1:5e:33:95:c6] [bssid 24:de:c6:f1:f5:40] [apname RHS_CAFE_330041]
Nov 6 11:58:46 stm[1413]: <132094> <WARN> |AP RHS_CAFE_330041@192.168.74.122 stm| ^[msg WPA2 Key Message 2] [mac c8:d1:5e:33:95:c6] [bssid 24:de:c6:f1:f5:40] [apname RHS_CAFE_330041]
Nov 6 11:58:47 stm[1413]: <132094> <WARN> |AP RHS_CAFE_330041@192.168.74.122 stm| ^[msg WPA2 Key Message 2] [mac c8:d1:5e:33:95:c6] [bssid 24:de:c6:f1:f5:40] [apname RHS_CAFE_330041]
Nov 6 11:59:44 stm[1413]: <132093> <ERRS> |AP RHS_CAFE_330041@192.168.74.122 stm| ^[msg WPA2 Key message 2] [mac 04:f7:e4:b1:8f:55] [bssid 24:de:c6:f1:f5:50] [apname RHS_CAFE_330041] [stcnt1 0] [stcnt2 1] [apcnt1 0] [apcnt2 2]
Nov 6 11:59:59 sapd[1392]: sapd_redun_config_dnsmasq, rewrite dnsmasq config file
Nov 6 12:00:25 sapd[1392]: PAPI_Send: sendto RAPPER_PORT2 failed: No such file or directory Message Code 9 Sequence Num is 27853
Nov 6 12:00:27 sapd[1392]: PAPI_Send: To: 7f000001:8424 Type:0x3 Timed out.
Nov 6 12:00:30 KERNEL(RHS_CAFE_330041@172.16.12.220): asap_station_add: WARNING: !
Nov 6 12:00:41 stm[1413]: <132094> <WARN> |AP RHS_CAFE_330041@192.168.74.35 stm| ^[msg WPA2 Key Message 2] [mac b0:9f:ba:35:97:b0] [bssid 24:de:c6:f1:f5:40] [apname RHS_CAFE_330041]
Nov 6 12:00:42 stm[1413]: <132094> <WARN> |AP RHS_CAFE_330041@192.168.74.35 stm| ^[msg WPA2 Key Message 2] [mac b0:9f:ba:35:97:b0] [bssid 24:de:c6:f1:f5:40] [apname RHS_CAFE_330041]
Nov 6 12:00:44 stm[1413]: <132094> <WARN> |AP RHS_CAFE_330041@192.168.74.35 stm| ^[msg WPA2 Key Message 2] [mac b0:9f:ba:35:97:b0] [bssid 24:de:c6:f1:f5:40] [apname RHS_CAFE_330041]
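For anyone who wants to locate the tunnel IP change in a longer export of this log, here is a minimal Python sketch. It assumes the stm lines keep the "|AP <name>@<ip> stm|" layout shown above. The names find_ip_changes, AP_RE, and sample are my own for illustration and are not part of any Aruba tooling.

```python
import re

# Matches the "|AP <name>@<ip> " portion of the stm lines shown in the log.
# Assumption: the AP name contains no whitespace and the address is IPv4.
AP_RE = re.compile(r"\|AP (?P<name>\S+)@(?P<ip>\d+\.\d+\.\d+\.\d+) ")


def find_ip_changes(lines):
    """Return (timestamp, ap_name, old_ip, new_ip) each time an AP's IP changes."""
    last_ip = {}
    changes = []
    for line in lines:
        m = AP_RE.search(line)
        if not m:
            continue  # sapd/KERNEL lines do not carry the "|AP name@ip" field
        name, ip = m.group("name"), m.group("ip")
        timestamp = " ".join(line.split()[:3])  # e.g. "Nov 6 12:00:41"
        if name in last_ip and last_ip[name] != ip:
            changes.append((timestamp, name, last_ip[name], ip))
        last_ip[name] = ip
    return changes


# Three lines copied from the log above: one before, two around the outage.
sample = [
    "Nov 6 11:58:47 stm[1413]: <132094> <WARN> |AP RHS_CAFE_330041@192.168.74.122 stm| "
    "^[msg WPA2 Key Message 2] [mac c8:d1:5e:33:95:c6] [bssid 24:de:c6:f1:f5:40] [apname RHS_CAFE_330041]",
    "Nov 6 11:59:59 sapd[1392]: sapd_redun_config_dnsmasq, rewrite dnsmasq config file",
    "Nov 6 12:00:41 stm[1413]: <132094> <WARN> |AP RHS_CAFE_330041@192.168.74.35 stm| "
    "^[msg WPA2 Key Message 2] [mac b0:9f:ba:35:97:b0] [bssid 24:de:c6:f1:f5:40] [apname RHS_CAFE_330041]",
]

changes = find_ip_changes(sample)
for change in changes:
    print(change)
```

The reported timestamp is that of the first line logged with the new address, so the actual change happened somewhere between the previous stm line and that one.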
I ran show ap remote debug association-failure ap-name against the AP right after the outage, and it returned a few clients that did not seem to re-associate properly:
b0:9f:ba:35:97:b0 RHS_CAFE_330041 24:de:c6:f1:f5:50 OpenVVSDWiFi auth 802.11a 1m:54s Did not attempt to associate
b0:9f:ba:35:97:b0 RHS_CAFE_330041 24:de:c6:f1:f5:40 OpenVVSDWiFi 802.11g 1m:54s Sapcp Ageout (internal ageout)
84:38:35:eb:92:a7 RHS_CAFE_330041 24:de:c6:f1:f5:50 OpenVVSDWiFi auth 802.11a 2m:14s Did not attempt to associate
c0:63:94:4e:b5:6e RHS_CAFE_330041 24:de:c6:f1:f5:50 OpenVVSDWiFi auth 802.11a 2m:14s Did not attempt to associate
c0:63:94:4e:b5:6e RHS_CAFE_330041 24:de:c6:f1:f5:40 OpenVVSDWiFi auth 802.11g 2m:14s Did not attempt to associate
8c:58:77:1c:39:c6 RHS_CAFE_330041 24:de:c6:f1:f5:40 OpenVVSDWiFi auth 802.11g 2m:14s Did not attempt to associate
Num Association Failures:6
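To see how many distinct clients are behind those six entries, the rows can be grouped by MAC address. This is only a sketch: it assumes the first column is the client MAC, the third is the BSSID, and the failure reason is everything after the elapsed-time column (for example 1m:54s). The names failures_by_client and rows are hypothetical.

```python
import re
from collections import defaultdict

# Elapsed-time column as it appears in the output above, e.g. "1m:54s".
TIME_RE = re.compile(r"^\d+m:\d+s$")


def failures_by_client(rows):
    """Group association-failure rows by client MAC as (bssid, reason) pairs."""
    by_mac = defaultdict(list)
    for row in rows:
        tokens = row.split()
        mac, bssid = tokens[0], tokens[2]
        # The reason is free text, so take everything after the time column.
        time_idx = next(i for i, t in enumerate(tokens) if TIME_RE.match(t))
        reason = " ".join(tokens[time_idx + 1:])
        by_mac[mac].append((bssid, reason))
    return dict(by_mac)


# Rows copied from the command output above.
rows = [
    "b0:9f:ba:35:97:b0 RHS_CAFE_330041 24:de:c6:f1:f5:50 OpenVVSDWiFi auth 802.11a 1m:54s Did not attempt to associate",
    "b0:9f:ba:35:97:b0 RHS_CAFE_330041 24:de:c6:f1:f5:40 OpenVVSDWiFi 802.11g 1m:54s Sapcp Ageout (internal ageout)",
    "84:38:35:eb:92:a7 RHS_CAFE_330041 24:de:c6:f1:f5:50 OpenVVSDWiFi auth 802.11a 2m:14s Did not attempt to associate",
    "c0:63:94:4e:b5:6e RHS_CAFE_330041 24:de:c6:f1:f5:50 OpenVVSDWiFi auth 802.11a 2m:14s Did not attempt to associate",
    "c0:63:94:4e:b5:6e RHS_CAFE_330041 24:de:c6:f1:f5:40 OpenVVSDWiFi auth 802.11g 2m:14s Did not attempt to associate",
    "8c:58:77:1c:39:c6 RHS_CAFE_330041 24:de:c6:f1:f5:40 OpenVVSDWiFi auth 802.11g 2m:14s Did not attempt to associate",
]

grouped = failures_by_client(rows)
for mac, entries in grouped.items():
    print(mac, len(entries), entries)
```

Grouped this way, the six failures come from four distinct clients, two of which show an entry on each of the two BSSIDs.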
Finally, I ran show ap debug system-status ap-name against the AP and saw that CPU and memory utilization peaked at the same time as the outage:
Peak CPU Util in the last one hour
----------------------------------
Timestamp            CPU Util(%)  Memory Util(%)
---------            -----------  --------------
2013-11-06 12:00:31  48           28
My original thought was that the AP had too many clients on it, as I believe it had 102 clients right before the outage. If anyone has any insight or suggestions, please share. Thanks.
#AP135