As I mentioned before, I have another access point on a different switch, and it gets disconnected too.
I'm currently monitoring the access points and controller on PRTG and I'll see if there's any disconnection on the network.
When an AP gets disconnected, I get logs similar to the ones below:
Sep 30 14:02:51 stm[3461]: <399838> <3461> <WARN> |stm| Resource 'Total APs' has dropped below 80% threshold (actual:0%).
(S-10) *[mynode] #show log system 50
Sep 30 17:00:52 :399838: <3369> <WARN> |fpapps| handleMasterIpMsg: CFGM Msg: Uplink Master IP 192.168.150.111 Role 2 peer_ip 0.0.0.0 sec_master_ip 0.0.0.0 vpn_ip 0.0.0.0 sec_vpn_ip 0.0.0.0
Sep 30 17:00:54 :309811: <3760> <WARN> |extifmgr| ifmap_current_state(): Broadcast IF-MAP Status: CPPM:Inactive.
Sep 30 17:00:54 :330104: <3847> <NOTI> |cert_dwnld| SAPI sync done with service 8212 at level 3
Sep 30 17:00:54 :355002: <3847> <DBUG> |cert_dwnld| cert_downld_send_papi_message: Sending msg 5001 to 127.0.0.1:8226
Sep 30 17:00:54 :355002: <3847> <DBUG> |cert_dwnld| cert_downld_send_papi_message: Sent message 5001 to 127.0.0.1:8226
Sep 30 17:00:54 :355002: <3847> <DBUG> |cert_dwnld| cert_downld_send_master_ip_req: Cert Req for masterip sent successfully
Sep 30 17:00:54 :355002: <3847> <DBUG> |cert_dwnld| cert_downld_mgr_get_master_ip: Master ip request sent
Sep 30 17:00:54 :355002: <3847> <DBUG> |cert_dwnld| cert_downld_send_papi_message: Sending msg 7003 to 127.0.0.1:8212
Sep 30 17:00:54 :355002: <3847> <DBUG> |cert_dwnld| cert_downld_send_papi_message: Sent message 7003 to 127.0.0.1:8212
Sep 30 17:00:54 :355002: <3847> <DBUG> |cert_dwnld| cert_downld_mgr_get_switch_ip: Switch ip request sent
Sep 30 17:00:54 :355002: <3847> <DBUG> |cert_dwnld| cert_downld_mgr_get_switch_ip: Starting cert_downld_switchip_timer
Sep 30 17:00:54 :355002: <3847> <DBUG> |cert_dwnld| get_all_config: Getting current configuration for the app
Sep 30 17:00:54 :306602: <3847> <INFO> |cert_dwnld| Changing the logging level for 6 facilities
Sep 30 17:00:55 :399838: <3369> <WARN> |fpapps| procRtTableSingleMsg: ACTION_FLUSH protocol 5
Sep 30 17:00:56 :330104: <3470> <NOTI> |amon_sender_proc| SAPI sync done with service 8226 at level 4
Sep 30 17:00:56 :384006: <3470> <DBUG> |amon_sender_proc| process_init, CFG Manager is UP
Sep 30 17:00:56 :330103: <3470> <NOTI> |amon_sender_proc| SAPI sync (blocking) with service 8345 at level 4
Sep 30 17:00:57 :399838: <3672> <WARN> |fpapps| getMasterIp: Received MasterIp (192.168.150.111) and Role(2). Cancelling retry
Sep 30 17:00:58 :399803: <4035> <ERRS> |policymgr| An internal system error has occurred at file policymgr_config.c function switch_ip_resp_hdlr line 165 error Received response for Switch IP.
Sep 30 17:00:58 :399816: <4067> <ERRS> |vrrp| gsm_object_lookup failed for pot no: 0
Sep 30 17:01:00 :309811: <3760> <WARN> |extifmgr| ifmap_current_state(): Broadcast IF-MAP Status: CPPM:Inactive.
Sep 30 17:01:01 :300800: <4022> <ERRS> |aruba-central| Athena server configuration not present
Sep 30 17:01:01 :300800: <4022> <ERRS> |aruba-central| Central Agent Trace disabled by configuration
Sep 30 17:01:01 :399816: <4013> <ERRS> |upgrademgr| Received NCFG_PROFMGR_EVENT_ALL_CONFIG_RCVD...
Sep 30 17:01:03 KERNEL: [ 301.434000] 1:<4>process `trapd' is using obsolete setsockopt SO_BSDCOMPAT
Sep 30 17:01:04 KERNEL: [ 301.982000] 0:Nae: configuring port/hw 3/3 iftype 1 for speed=2, duplex=1
Sep 30 17:01:04 :306510: <3330> <WARN> |publisher| Dropping message from 8212 for service '76 (service not found)'
Sep 30 17:01:20 :399816: <3755> <ERRS> |mdns| ncfg_init: No name set in ncfg_init context.Nothing will be logged
Sep 30 17:01:20 :309811: <3760> <WARN> |extifmgr| ifmap_current_state(): Broadcast IF-MAP Status: CPPM:Inactive.
Sep 30 17:01:21 :316004: <3421> <WARN> |wms| WMS Ready: AP Load Time (secs): 0, STA Load Time (secs): 2, Probe Load Time (secs): 0, Total Load Time (secs): 0
Sep 30 17:01:22 :330104: <3470> <NOTI> |amon_sender_proc| SAPI sync done with service 8345 at level 4
Sep 30 17:01:22 :384006: <3470> <DBUG> |amon_sender_proc| process_init, STM is UP
Sep 30 17:01:22 :384006: <3470> <DBUG> |amon_sender_proc| get_all_config, Getting current configuration for the app
Sep 30 17:01:22 :306602: <3470> <INFO> |amon_sender_proc| Changing the logging level for 6 facilities
Sep 30 17:01:23 KERNEL: [ 320.732000] 1:alloc_vis_map: Allocated 2432008 bytes for ip_flow_export(3900)
Sep 30 17:01:23 KERNEL: [ 320.732000] 1:apsd_porf_getdmamem: physical address 0x30000000, virtual address 0xc000000030000000, size 2432008
Sep 30 17:01:23 :384002: <3470> <ERRS> |amon_sender_proc| is_airwave_reconnect_msg, Wrong code received DTLS registration message
Sep 30 17:01:24 KERNEL: [ 322.387000] 2:alloc_vis_map: Allocated 9437184 bytes for fw_visibility(3662)
Sep 30 17:01:24 KERNEL: [ 322.387000] 2:apsd_porf_getdmamem: physical address 0x31000000, virtual address 0xc000000031000000, size 9437184
Sep 30 17:01:26 KERNEL: [ 324.399000] 0:alloc_vis_map: Allocated 88064 bytes for ctamon(3665)
Sep 30 17:01:26 KERNEL: [ 324.399000] 0:apsd_porf_getdmamem: physical address 0xBE620000, virtual address 0xc0000000be620000, size 88064
Sep 30 17:01:41 KERNEL(7c:57:3c:cf:bd:3a@192.168.150.124): [194505.947741] VAP device aruba100 created osifp: (dda49540) os_if: (dc0c8000)
Sep 30 17:01:53 KERNEL: [ 350.750000] 0:<4>hrtimer: interrupt took 2677067 ns
Sep 30 17:02:19 :399838: <3459> <WARN> |stm| Resource 'Total APs' has exceeded 80% threshold (actual:93%).
Sep 30 23:03:51 KERNEL(c8:b5:ad:c3:6d:d8@192.168.150.100): [219499.062601] anul_stale_sta_check: sta:94:fb:29:25:9a:d5 maybe stale sta detected
Sep 30 23:04:51 KERNEL(c8:b5:ad:c3:6d:d8@192.168.150.100): [219559.122878] anul_stale_sta_check: sta:94:fb:29:25:9a:d5 stale sta aged out
Oct 1 07:15:12 :399838: <3250> <WARN> |nanny| Resource 'Controlpath CPU' has exceeded 45% threshold (actual:46%).
Oct 1 07:16:12 :399838: <3250> <WARN> |nanny| Resource 'Controlpath CPU' has dropped below 45% threshold (actual:5%).
Oct 1 09:23:00 :399838: <3250> <WARN> |nanny| Resource 'Controlpath CPU' has exceeded 45% threshold (actual:63%).
Oct 1 09:24:00 :399838: <3250> <WARN> |nanny| Resource 'Controlpath CPU' has dropped below 45% threshold (actual:7%).
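In case it helps anyone reading along, the "Resource ... threshold" warnings can be pulled out of a saved copy of this output with a few lines of Python. This is only a rough sketch: the helper name `threshold_events` and the default year are my own assumptions (the log timestamps carry no year), and the pattern is based solely on the two message shapes visible in the logs above.

```python
import re
from datetime import datetime

# Pattern derived only from the 'Resource ... threshold' messages in this
# thread; other message shapes are not handled.
EVENT_RE = re.compile(
    r"^(?P<ts>[A-Z][a-z]{2}\s+\d{1,2}\s+\d{2}:\d{2}:\d{2})\s.*"
    r"Resource '(?P<resource>[^']+)' has (?P<kind>dropped below|exceeded) "
    r"(?P<threshold>\d+)% threshold \(actual:(?P<actual>\d+)%\)"
)


def threshold_events(lines, year=2021):
    """Return one dict per threshold warning found in `lines`.

    `year` is an assumption supplied by the caller, since the controller
    log lines do not include it.
    """
    events = []
    for line in lines:
        match = EVENT_RE.search(line)
        if match is None:
            continue
        stamp = " ".join(match["ts"].split())  # collapse padded day fields
        events.append({
            "time": datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S"),
            "resource": match["resource"],
            "kind": match["kind"],
            "threshold": int(match["threshold"]),
            "actual": int(match["actual"]),
        })
    return events


if __name__ == "__main__":
    sample = [
        "Sep 30 14:02:51 stm[3461]: <399838> <3461> <WARN> |stm| Resource 'Total APs' has dropped below 80% threshold (actual:0%).",
        "Sep 30 17:02:19 :399838: <3459> <WARN> |stm| Resource 'Total APs' has exceeded 80% threshold (actual:93%).",
    ]
    for event in threshold_events(sample):
        print(event["time"], event["resource"], event["kind"], event["actual"])
```

Filtering the result on `resource == "Total APs"` gives a timeline of when the controller lost and regained its APs, which can then be compared against the PRTG history.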
------------------------------
Keshav Boodhun
------------------------------
Original Message:
Sent: Oct 01, 2021 04:12 AM
From: Colin Joseph
Subject: AP goes up and down on controller
The number in parentheses is high for all your APs. I would type "show log system 50" to see if something controller-wide is happening. Barring that, I would check whether something is happening in your network (a switch going down somewhere) that affects connectivity with all access points.
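To line the APs up by that number, the "Bootstraps (Total)" column of "show ap debug counters" can be extracted with a short script. This is a minimal sketch, assuming the row layout pasted further down this thread (name, group, IP address, counters, then "current (total)" followed by reboots). The helper name `bootstrap_totals` is made up for illustration, and what counts as "high" is left to the reader.

```python
import re

# One AP row: name, group, IPv4 address, some counters, then the
# "current (total)" bootstrap pair followed by the reboot count.
ROW_RE = re.compile(
    r"^(?P<name>\S+)\s+(?P<group>\S+)\s+(?P<ip>\d{1,3}(?:\.\d{1,3}){3})\s.*?"
    r"(?P<current>\d+)\s*\(\s*(?P<total>\d+)\s*\)\s+(?P<reboots>\d+)"
)


def bootstrap_totals(lines):
    """Parse AP rows and return them sorted by total bootstraps, highest first.

    Header, separator, and legend lines do not match the pattern and are skipped.
    """
    rows = []
    for line in lines:
        match = ROW_RE.match(line.strip())
        if match is None:
            continue
        rows.append({
            "name": match["name"],
            "ip": match["ip"],
            "current": int(match["current"]),
            "total": int(match["total"]),
            "reboots": int(match["reboots"]),
        })
    return sorted(rows, key=lambda row: row["total"], reverse=True)


if __name__ == "__main__":
    sample = [
        "7c:57:3c:cf:bd:3a default 192.168.150.124 2 2 0 0 9 (10 ) 2 N 1/0/1/0/0/0/0 2/1/2/1/0/0/0 40/4/0/0",
        "aruba11 default 192.168.150.101 2 2 0 0 11 (189 ) 18 N 1/0/1/0/0/0/0 2/1/2/1/0/0/0 40/4/0/0",
        "aruba12 default 192.168.150.113 2 2 0 0 45 (209 ) 53 N 1/0/1/0/0/0/0 2/1/2/1/0/0/0 40/4/0/0",
    ]
    for row in bootstrap_totals(sample):
        print(row["name"], row["ip"], row["total"], row["reboots"])
```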
------------------------------
Any opinions expressed here are solely my own and not necessarily that of Hewlett Packard Enterprise or Aruba Networks.
Original Message:
Sent: Oct 01, 2021 03:30 AM
From: Keshav Boodhun
Subject: AP goes up and down on controller
Hi Herman/Marcel,
192.168.150.111 is the controller IP.
What is the discovery method for the APs to detect the controller?
Actually we connect the AP to the network on the same vlan, and then we provision it on the Aruba Controller.
What are the LMS settings in your AP group?
LMS Settings were left blank.
Is this a single 7005 controller ?
Yes, a single controller.
What is the controller management IP ?
192.168.150.111
Is VRRP configured ?
No VRRP.
Are the access points and controller in the same vlan ?
Yes, both are on the same vlan.
Can you ping from the controller to the Access points ?
I cannot ping them from the controller.
(S-10) *[mynode] #show ap debug counters
AP Counters
-----------
Name Group IP Address Configs Sent Configs Acked AP Boo ts Sent AP Boots Acked Bootstraps (Total) Reboots Crash Current License cou nter Global License counter GSM Info for AP
---- ----- ---------- ------------ ------------- ------ ------- -------------- ------------------ ------- ----- ------------------- ---- ---------------------- ---------------
7c:57:3c:cf:bd:3a default 192.168.150.124 2 2 0 0 9 (10 ) 2 N 1/0/1/0/0/0/0 2/1/2/1/0/0/0 40/4/0/0
90:4c:81:cf:b4:62 default 192.168.150.109 2 2 0 0 10 (78 ) 4 N 1/0/1/0/0/0/0 2/1/2/1/0/0/0 40/4/0/0
aruba1 default 192.168.150.106 2 2 0 0 28 (159 ) 35 N 1/0/1/0/0/0/0 2/1/2/1/0/0/0 40/4/0/0
aruba10 default 192.168.150.105 2 2 0 0 69 (101 ) 12 N 1/0/1/0/0/0/0 2/1/2/1/0/0/0 40/4/0/0
aruba11 default 192.168.150.101 2 2 0 0 11 (189 ) 18 N 1/0/1/0/0/0/0 2/1/2/1/0/0/0 40/4/0/0
aruba12 default 192.168.150.113 2 2 0 0 45 (209 ) 53 N 1/0/1/0/0/0/0 2/1/2/1/0/0/0 40/4/0/0
aruba13 default 192.168.150.119 2 2 0 0 37 (132 ) 24 N 1/0/1/0/0/0/0 2/1/2/1/0/0/0 40/4/0/0
aruba14 default 192.168.150.107 2 2 0 0 80 (188 ) 17 N 1/0/1/0/0/0/0 2/1/2/1/0/0/0 40/4/0/0
aruba15 default 192.168.150.108 2 2 0 0 75 (98 ) 11 N 1/0/1/0/0/0/0 2/1/2/1/0/0/0 40/4/0/0
aruba4 default 192.168.150.104 2 2 0 0 12 (160 ) 39 N 1/0/1/0/0/0/0 2/1/2/1/0/0/0 40/4/0/0
aruba5 default 192.168.150.114 2 2 0 0 11 (100 ) 11 N 1/0/1/0/0/0/0 2/1/2/1/0/0/0 40/4/0/0
Aruba6_inno default 192.168.150.120 2 2 0 0 37 (130 ) 28 N 1/0/1/0/0/0/0 2/1/2/1/0/0/0 40/4/0/0
aruba7_inno default 192.168.150.121 2 2 0 0 26 (132 ) 22 N 1/0/1/0/0/0/0 2/1/2/1/0/0/0 40/4/0/0
c8:b5:ad:c3:6d:76 default 192.168.150.110 2 2 0 0 10 (116 ) 21 N 1/0/1/0/0/0/0 2/1/2/1/0/0/0 40/4/0/0
c8:b5:ad:c3:6d:d8 default 192.168.150.100 2 2 0 0 12 (100 ) 13 N 1/0/1/0/0/0/0 2/1/2/1/0/0/0 40/4/0/0
Current License Counter : Increment/Decrement/Active-Increment/Active-Decrement/Standby-Increment/Standby-Decrement
Global License counter : G-Increment/G-Decrement/G-Active-Increment/G-Active-Decrement/G-Standby-Increment/G-Standby-Decrement
GSM Info for AP : AP-Flags/HA-Flags/AP-Flag-Standby/HA-Flag-Standby
Total APs :15
------------------------------
Keshav Boodhun
Original Message:
Sent: Sep 30, 2021 11:01 AM
From: marcel koedijk
Subject: AP goes up and down on controller
Hi Keshav,
Based on your logs, I think it's a network or firmware issue. I have some more questions:
- Is this a single 7005 controller ?
- What is the controller management IP ?
- Is VRRP configured ?
- What is the LMS IP in the AP-Group profile?
- Are the access points and controller in the same vlan ?
- Are the access points in a dedicated management vlan ?
- Can you ping from the controller to the Access points ?
You could also check the next command:
(MC02) [MDC] *#show ap debug counters
------------------------------
Marcel Koedijk | MVP Guru 2021 | ACEP | ACMP | ACCP | ACDP | Ekahau ECSE | Not an HPE Employee | Opinions are my own
Original Message:
Sent: Sep 30, 2021 06:24 AM
From: Keshav Boodhun
Subject: AP goes up and down on controller
Hi,
Access point models: 218 & 318.
We're actually planning to upgrade the firmware this weekend.
The access points are distributed across 3 different switches. Switch model: HP 2530-24G-PoEP Switch (J9773A).
I don't think this is a PoE issue, as I provisioned an access point on a different switch and it was disconnected at the same time as the others.
When the access points are disconnected from the controller, I can still ping them from the internal network.
Regards
------------------------------
Keshav Boodhun
Original Message:
Sent: Sep 30, 2021 03:17 AM
From: marcel koedijk
Subject: AP goes up and down on controller
Hi Keshav,
What model of Access Points do you have?
Running the first release of a major firmware version (8.5.0.0, without any patches) is never a good starting point. Also note that 8.5.x.x reaches end of support by the end of 2021.
As a starting point for further troubleshooting, upgrade to firmware 8.6.0.13 first and see whether that already solves your issue.
From the mobility controller, you can run the following command to see whether the AP is rebootstrapping.
(MC01) [MDC] #show log all | include strap
Aug 3 11:04:00 sapd[4149]: <311004> <WARN> |AP AP01-AP01@172.16.10.50 sapd| Missed 8 heartbeats; rebootstrapping
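If that command returns a lot of lines, counting them per AP makes it easier to see whether every AP is affected or only a few. Below is a rough Python sketch that works on a saved copy of the output. It assumes every relevant line has the same shape as the example line above, and `rebootstraps_per_ap` is just an illustrative name, not an Aruba tool.

```python
import re
from collections import Counter

# Matches the AP name, AP address, and missed-heartbeat count in lines like:
#   ... |AP AP01-AP01@172.16.10.50 sapd| Missed 8 heartbeats; rebootstrapping
REBOOTSTRAP_RE = re.compile(
    r"\|AP (?P<name>\S+)@(?P<ip>\d{1,3}(?:\.\d{1,3}){3}) sapd\|\s+"
    r"Missed (?P<missed>\d+) heartbeats; rebootstrapping"
)


def rebootstraps_per_ap(lines):
    """Count rebootstrap messages per (AP name, AP IP) pair."""
    counts = Counter()
    for line in lines:
        match = REBOOTSTRAP_RE.search(line)
        if match is not None:
            counts[(match["name"], match["ip"])] += 1
    return counts


if __name__ == "__main__":
    sample = [
        "Aug 3 11:04:00 sapd[4149]: <311004> <WARN> |AP AP01-AP01@172.16.10.50 sapd| Missed 8 heartbeats; rebootstrapping",
    ]
    for (name, ip), count in rebootstraps_per_ap(sample).most_common():
        print(name, ip, count)
```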
You could focus on IP connectivity between the AP and the controller: are both in the same VLAN, or does the traffic go through a router?
Another point of attention could be a power issue. Be sure the switch has enough PoE budget and that the cabling (mostly Cat6 these days) can carry the power over the required distance. What model of switch and what cabling do you use?
But my first recommendation is to start by updating your firmware ;).
For urgent problems always contact Aruba TAC support.
------------------------------
Marcel Koedijk | MVP Guru 2021 | ACEP | ACMP | ACCP | ACDP | Ekahau ECSE | Not an HPE Employee | Opinions are my own
Original Message:
Sent: Sep 30, 2021 02:46 AM
From: Keshav Boodhun
Subject: AP goes up and down on controller
Hello Guys,
I'm having an issue with all my access points, which are managed by an Aruba controller.
Controller/Firmware version: Aruba 7005 8.5.0.0
During the day, we have frequent disconnections between the access points and the controller. That is, I cannot see the SSID being broadcast, and the AP shows as down on the controller. However, I can still ping the IP addresses of the access points.
I also provisioned another AP on a different switch, and I can see that all access points go down together and come back up after 5-10 minutes.
This happens 3 to 4 times a day.
I tried rebooting the controller and the access points, but the problem persists.
Can someone help ? What logs can I check on the controller for this type of issue ?
Thanks
------------------------------
Keshav Boodhun
------------------------------