Wireless Access


Access network design for branch, remote, outdoor and campus locations with Aruba access points, and mobility controllers.

AP goes up and down on controller

  • 1.  AP goes up and down on controller

    Posted Sep 30, 2021 02:47 AM
    Hello Guys,

    I'm having an issue with all of my access points, which are managed by an Aruba controller.

    Controller/Firmware version: Aruba 7005 8.5.0.0

    During the day, the access points frequently lose their connection to the controller: the SSID stops broadcasting and the APs show as down on the controller. However, I can still ping the IP addresses of the access points.

    I also provisioned another AP on a different switch. All access points go down together, and after 5-10 minutes they come back up.

    This happens 3-4 times a day.

    I tried rebooting the controller and the access points, but the issue persists.

    Can someone help? Which logs can I check on the controller for this type of issue?

    Thanks

    ------------------------------
    Keshav Boodhun
    ------------------------------


  • 2.  RE: AP goes up and down on controller

    Posted Sep 30, 2021 03:17 AM
    Hi Keshav,

    What model of Access Points do you have?

    Running the initial release of a major firmware version (8.5.0.0, without any patches) is never a good starting point. Also note that 8.5.x.x reaches end of support by the end of 2021.
    As a starting point for further troubleshooting, upgrade to firmware 8.6.0.13 first and see whether that already solves your issue.

    From the mobility controller you can run the following command to see whether the AP is rebootstrapping.

    (MC01) [MDC] #show log all | include strap
    Aug  3 11:04:00  sapd[4149]: <311004> <WARN> |AP AP01-AP01@172.16.10.50 sapd|  Missed 8 heartbeats; rebootstrapping
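    If the log output is saved to a text file, a small script can tally how often each AP rebootstraps. The sketch below is only an illustration: it assumes lines in the same format as the example above, and the second sample line is a made-up repeat of the first, not real output.

```python
import re
from collections import Counter

# Matches sapd lines of the form shown above, e.g.
# ... |AP AP01-AP01@172.16.10.50 sapd|  Missed 8 heartbeats; rebootstrapping
REBOOTSTRAP = re.compile(
    r"\|AP (?P<name>\S+)@(?P<ip>\d+\.\d+\.\d+\.\d+) sapd\|\s+"
    r"Missed (?P<missed>\d+) heartbeats; rebootstrapping"
)

def count_rebootstraps(lines):
    """Return a Counter mapping (ap_name, ap_ip) to the number of rebootstrap lines."""
    counts = Counter()
    for line in lines:
        match = REBOOTSTRAP.search(line)
        if match:
            counts[(match.group("name"), match.group("ip"))] += 1
    return counts

# First line is from the post above; the second is a hypothetical repeat.
sample = [
    "Aug  3 11:04:00  sapd[4149]: <311004> <WARN> |AP AP01-AP01@172.16.10.50 sapd|  Missed 8 heartbeats; rebootstrapping",
    "Aug  3 11:09:30  sapd[4149]: <311004> <WARN> |AP AP01-AP01@172.16.10.50 sapd|  Missed 8 heartbeats; rebootstrapping",
    "Aug  3 11:10:00  some unrelated log line",
]

rebootstrap_counts = count_rebootstraps(sample)
for (name, ip), n in rebootstrap_counts.most_common():
    print(f"{name} ({ip}): {n} rebootstrap event(s)")
```

    If every AP shows up with a similar count at similar times, that points more toward the controller or the network path than toward a single AP.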

    You could focus on IP connectivity between the APs and the controller: are both in the same VLAN, or does the traffic go through a router?

    Another point of attention could be power: make sure the switch has enough PoE budget and that the cabling (mostly Cat6 these days) can carry the power over the required distance. What model of switch and what cabling do you use?

    But my first recommendation is to start by updating your firmware ;).

    For urgent problems always contact Aruba TAC support.

    ------------------------------
    Marcel Koedijk | MVP Guru 2021 | ACEP | ACMP | ACCP | ACDP | Ekahau ECSE | Not an HPE Employee | Opionions are my own
    ------------------------------



  • 3.  RE: AP goes up and down on controller

    Posted Sep 30, 2021 06:15 AM
    Hi Marcel,

    I just found some logs, see if you can help.

    Sep 30 14:02:51 stm[3461]: <399838> <3461> <WARN> |stm| Resource 'Total APs' has dropped below 80% threshold (actual:0%).

    2021-09-30 14:01:45 Switching to LMS 192.168.150.111: Broken heartbeat tunnel. Last Ctrl message: KEEPALIVE len=77 dest=192.168.150.111 tries=1 seq=128
    2021-09-30 14:05:34 Switching to LMS 192.168.150.111: HELLO-TIMEOUT. Last Ctrl message: HELLO len=409 dest=192.168.150.111 tries=10 seq=0
    2021-09-30 14:08:56 Switching to LMS 192.168.150.111: HELLO-TIMEOUT. Last Ctrl message: HELLO len=409 dest=192.168.150.111 tries=10 seq=0
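    For anyone wanting to line these AP-side messages up against other monitoring data, here is a rough sketch that pulls the timestamp, LMS IP and reason out of each "Switching to LMS" line. It assumes the exact line format shown above; the function and variable names are my own.

```python
import re
from datetime import datetime

# Matches lines such as:
# 2021-09-30 14:05:34 Switching to LMS 192.168.150.111: HELLO-TIMEOUT. Last Ctrl message: ...
SWITCH = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) Switching to LMS "
    r"(?P<lms>\d+\.\d+\.\d+\.\d+): (?P<reason>[^.]+)\."
)

def parse_lms_events(lines):
    """Return a list of (timestamp, lms_ip, reason) tuples for matching lines."""
    events = []
    for line in lines:
        match = SWITCH.match(line.strip())
        if match:
            ts = datetime.strptime(match.group("ts"), "%Y-%m-%d %H:%M:%S")
            events.append((ts, match.group("lms"), match.group("reason")))
    return events

lms_log = [
    "2021-09-30 14:01:45 Switching to LMS 192.168.150.111: Broken heartbeat tunnel. Last Ctrl message: KEEPALIVE len=77 dest=192.168.150.111 tries=1 seq=128",
    "2021-09-30 14:05:34 Switching to LMS 192.168.150.111: HELLO-TIMEOUT. Last Ctrl message: HELLO len=409 dest=192.168.150.111 tries=10 seq=0",
    "2021-09-30 14:08:56 Switching to LMS 192.168.150.111: HELLO-TIMEOUT. Last Ctrl message: HELLO len=409 dest=192.168.150.111 tries=10 seq=0",
]

lms_events = parse_lms_events(lms_log)
for ts, lms, reason in lms_events:
    print(ts.isoformat(sep=" "), lms, reason)

# Time between the first and last message in this burst.
burst_seconds = (lms_events[-1][0] - lms_events[0][0]).total_seconds()
print(f"burst length: {burst_seconds:.0f} s")
```

    For the three lines above the burst spans a bit over seven minutes, which is in the same range as the 5-10 minute outages I described.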

    Best Regards

    ------------------------------
    Keshav Boodhun
    ------------------------------



  • 4.  RE: AP goes up and down on controller

    Posted Sep 30, 2021 10:01 AM
    What is the IP 192.168.150.111? This 'switching to LMS' message may originate from the AP and seems to indicate that the AP cannot reach the configured/detected controller at that IP address.

    What is the discovery method for the APs to detect the controller?
    What are the LMS settings in your AP group?

    ------------------------------
    Herman Robers
    ------------------------
    If you have urgent issues, always contact your Aruba partner, distributor, or Aruba TAC Support. Check https://www.arubanetworks.com/support-services/contact-support/ for how to contact Aruba TAC. Any opinions expressed here are solely my own and not necessarily that of Hewlett Packard Enterprise or Aruba Networks.

    In case your problem is solved, please invest the time to post a follow-up with the information on how you solved it. Others can benefit from that.
    ------------------------------



  • 5.  RE: AP goes up and down on controller

    Posted Sep 30, 2021 06:25 AM
    Hi,

    Access point models: 218 and 318.

    We're actually planning to upgrade the firmware this weekend.

    The access points are distributed across 3 switches. Switch model: HP 2530-24G-PoEP Switch (J9773A).
    In my opinion this is not a PoE issue: I provisioned an access point on a different switch, and it was disconnected at the same time as the others.

    While the access points show as disconnected on the controller, I can still ping them from the internal network.

    Regards


    ------------------------------
    Keshav Boodhun
    ------------------------------



  • 6.  RE: AP goes up and down on controller

    Posted Sep 30, 2021 11:02 AM
    Hi Keshav,

    Based on your logging I think it's a network or firmware issue. I have some more questions:

    • Is this a single 7005 controller?
    • What is the controller management IP?
    • Is VRRP configured?
    • What is the LMS IP in the AP-Group profile?
    • Are the access points and controller in the same VLAN?
    • Are the access points in a dedicated management VLAN?
    • Can you ping from the controller to the access points?

    You could also check the output of the following command:

    (MC02) [MDC] *#show ap debug counters

    ------------------------------
    Marcel Koedijk | MVP Guru 2021 | ACEP | ACMP | ACCP | ACDP | Ekahau ECSE | Not an HPE Employee | Opionions are my own
    ------------------------------



  • 7.  RE: AP goes up and down on controller

    Posted Oct 01, 2021 03:30 AM
    Hi Herman/Marcel,

    192.168.150.111 is the controller IP.

    What is the discovery method for the APs to detect the controller?
    Actually we connect the AP to the network on the same vlan, and then we provision it on the Aruba Controller.

    What are the LMS settings in your AP group?
    LMS Settings were left blank.

    Is this a single 7005 controller ?
    Yes single controller.

    What is the controller management IP ?
    192.168.150.111

    Is VRRP configured ?
    No VRRP.

    Are the access points and controller in the same vlan ?
    Yes both are on the same vlan.

    Can you ping from the controller to the Access points ?
    I cannot ping it from the controller.



    (S-10) *[mynode] #show ap debug counters

    AP Counters
    -----------
    Name Group IP Address Configs Sent Configs Acked AP Boo ts Sent AP Boots Acked Bootstraps (Total) Reboots Crash Current License cou nter Global License counter GSM Info for AP
    ---- ----- ---------- ------------ ------------- ------ ------- -------------- ------------------ ------- ----- ------------------- ---- ---------------------- ---------------
    7c:57:3c:cf:bd:3a default 192.168.150.124 2 2 0 0 9 (10 ) 2 N 1/0/1/0/0/0/0 2/1/2/1/0/0/0 40/4/0/0
    90:4c:81:cf:b4:62 default 192.168.150.109 2 2 0 0 10 (78 ) 4 N 1/0/1/0/0/0/0 2/1/2/1/0/0/0 40/4/0/0
    aruba1 default 192.168.150.106 2 2 0 0 28 (159 ) 35 N 1/0/1/0/0/0/0 2/1/2/1/0/0/0 40/4/0/0
    aruba10 default 192.168.150.105 2 2 0 0 69 (101 ) 12 N 1/0/1/0/0/0/0 2/1/2/1/0/0/0 40/4/0/0
    aruba11 default 192.168.150.101 2 2 0 0 11 (189 ) 18 N 1/0/1/0/0/0/0 2/1/2/1/0/0/0 40/4/0/0
    aruba12 default 192.168.150.113 2 2 0 0 45 (209 ) 53 N 1/0/1/0/0/0/0 2/1/2/1/0/0/0 40/4/0/0
    aruba13 default 192.168.150.119 2 2 0 0 37 (132 ) 24 N 1/0/1/0/0/0/0 2/1/2/1/0/0/0 40/4/0/0
    aruba14 default 192.168.150.107 2 2 0 0 80 (188 ) 17 N 1/0/1/0/0/0/0 2/1/2/1/0/0/0 40/4/0/0
    aruba15 default 192.168.150.108 2 2 0 0 75 (98 ) 11 N 1/0/1/0/0/0/0 2/1/2/1/0/0/0 40/4/0/0
    aruba4 default 192.168.150.104 2 2 0 0 12 (160 ) 39 N 1/0/1/0/0/0/0 2/1/2/1/0/0/0 40/4/0/0
    aruba5 default 192.168.150.114 2 2 0 0 11 (100 ) 11 N 1/0/1/0/0/0/0 2/1/2/1/0/0/0 40/4/0/0
    Aruba6_inno default 192.168.150.120 2 2 0 0 37 (130 ) 28 N 1/0/1/0/0/0/0 2/1/2/1/0/0/0 40/4/0/0
    aruba7_inno default 192.168.150.121 2 2 0 0 26 (132 ) 22 N 1/0/1/0/0/0/0 2/1/2/1/0/0/0 40/4/0/0
    c8:b5:ad:c3:6d:76 default 192.168.150.110 2 2 0 0 10 (116 ) 21 N 1/0/1/0/0/0/0 2/1/2/1/0/0/0 40/4/0/0
    c8:b5:ad:c3:6d:d8 default 192.168.150.100 2 2 0 0 12 (100 ) 13 N 1/0/1/0/0/0/0 2/1/2/1/0/0/0 40/4/0/0
    Current License Counter : Increment/Decrement/Active-Increment/Active-Decrement/Standby-Increment/Standby-Decrement
    Global License counter : G-Increment/G-Decrement/G-Active-Increment/G-Active-Decrement/G-Standby-Increment/G-Standby-Decrement
    GSM Info for AP : AP-Flags/HA-Flags/AP-Flag-Standby/HA-Flag-Standby
    Total APs :15
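    To compare the counters across APs without reading the table by eye, the rows can be parsed and sorted. This is a minimal sketch that assumes whitespace-separated rows like the ones pasted above; the regular expression and field names are my own and only the name, IP, Bootstraps (Total) and Reboots columns are extracted.

```python
import re

# name, group, IP, four integer counters, then "bootstraps (total)", reboots, crash flag.
ROW = re.compile(
    r"^(?P<name>\S+)\s+(?P<group>\S+)\s+(?P<ip>\d+\.\d+\.\d+\.\d+)\s+"
    r"\d+\s+\d+\s+\d+\s+\d+\s+"
    r"(?P<bootstraps>\d+)\s*\(\s*(?P<total>\d+)\s*\)\s+"
    r"(?P<reboots>\d+)\s+(?P<crash>\S+)"
)

def parse_counters(lines):
    """Parse 'show ap debug counters' rows; header and legend lines are skipped."""
    rows = []
    for line in lines:
        match = ROW.match(line.strip())
        if match:
            rows.append({
                "name": match.group("name"),
                "ip": match.group("ip"),
                "bootstraps": int(match.group("bootstraps")),
                "total": int(match.group("total")),
                "reboots": int(match.group("reboots")),
            })
    return rows

# Three rows copied from the output above, plus the legend line to show it is ignored.
counters_sample = [
    "7c:57:3c:cf:bd:3a default 192.168.150.124 2 2 0 0 9 (10 ) 2 N 1/0/1/0/0/0/0 2/1/2/1/0/0/0 40/4/0/0",
    "aruba12 default 192.168.150.113 2 2 0 0 45 (209 ) 53 N 1/0/1/0/0/0/0 2/1/2/1/0/0/0 40/4/0/0",
    "aruba14 default 192.168.150.107 2 2 0 0 80 (188 ) 17 N 1/0/1/0/0/0/0 2/1/2/1/0/0/0 40/4/0/0",
    "Total APs :15",
]

ranked = sorted(parse_counters(counters_sample), key=lambda r: r["total"], reverse=True)
for row in ranked:
    print(f'{row["name"]:<18} {row["ip"]:<16} bootstraps={row["bootstraps"]} '
          f'total={row["total"]} reboots={row["reboots"]}')
```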


    ------------------------------
    Keshav Boodhun
    ------------------------------



  • 8.  RE: AP goes up and down on controller

    Posted Oct 01, 2021 04:12 AM
    The number in parentheses (the total bootstrap count) is high for all of your APs.  I would run "show log system 50" to see whether something controller-wide is happening.  Barring that, I would check whether something is happening in your network (a switch going down somewhere) that affects connectivity to all access points.

    ------------------------------
    Any opinions expressed here are solely my own and not necessarily that of Hewlett Packard Enterprise or Aruba Networks.
    ------------------------------



  • 9.  RE: AP goes up and down on controller

    Posted Oct 01, 2021 04:24 AM
    As I mentioned before, I have another access point on a different switch, and it gets disconnected too.
    I'm currently monitoring the access points and the controller with PRTG to see whether there is any disconnection on the network.

    When the APs get disconnected, I get logs like the one below:
    Sep 30 14:02:51 stm[3461]: <399838> <3461> <WARN> |stm| Resource 'Total APs' has dropped below 80% threshold (actual:0%).


    (S-10) *[mynode] #show log system 50


    Sep 30 17:00:52 :399838: <3369> <WARN> |fpapps| handleMasterIpMsg: CFGM Msg: Uplink Master IP 192.168.150.111 Role 2 peer_ip 0.0.0.0 sec_master_ip 0.0.0.0 vpn_ip 0.0.0.0 sec_vpn_ip 0.0.0.0
    Sep 30 17:00:54 :309811: <3760> <WARN> |extifmgr| ifmap_current_state(): Broadcast IF-MAP Status: CPPM:Inactive.
    Sep 30 17:00:54 :330104: <3847> <NOTI> |cert_dwnld| SAPI sync done with service 8212 at level 3
    Sep 30 17:00:54 :355002: <3847> <DBUG> |cert_dwnld| cert_downld_send_papi_message: Sending msg 5001 to 127.0.0.1:8226
    Sep 30 17:00:54 :355002: <3847> <DBUG> |cert_dwnld| cert_downld_send_papi_message: Sent message 5001 to 127.0.0.1:8226
    Sep 30 17:00:54 :355002: <3847> <DBUG> |cert_dwnld| cert_downld_send_master_ip_req: Cert Req for masterip sent successfully
    Sep 30 17:00:54 :355002: <3847> <DBUG> |cert_dwnld| cert_downld_mgr_get_master_ip: Master ip request sent
    Sep 30 17:00:54 :355002: <3847> <DBUG> |cert_dwnld| cert_downld_send_papi_message: Sending msg 7003 to 127.0.0.1:8212
    Sep 30 17:00:54 :355002: <3847> <DBUG> |cert_dwnld| cert_downld_send_papi_message: Sent message 7003 to 127.0.0.1:8212
    Sep 30 17:00:54 :355002: <3847> <DBUG> |cert_dwnld| cert_downld_mgr_get_switch_ip: Switch ip request sent
    Sep 30 17:00:54 :355002: <3847> <DBUG> |cert_dwnld| cert_downld_mgr_get_switch_ip: Starting cert_downld_switchip_timer
    Sep 30 17:00:54 :355002: <3847> <DBUG> |cert_dwnld| get_all_config: Getting current configuration for the app
    Sep 30 17:00:54 :306602: <3847> <INFO> |cert_dwnld| Changing the logging level for 6 facilities
    Sep 30 17:00:55 :399838: <3369> <WARN> |fpapps| procRtTableSingleMsg: ACTION_FLUSH protocol 5
    Sep 30 17:00:56 :330104: <3470> <NOTI> |amon_sender_proc| SAPI sync done with service 8226 at level 4
    Sep 30 17:00:56 :384006: <3470> <DBUG> |amon_sender_proc| process_init, CFG Manager is UP
    Sep 30 17:00:56 :330103: <3470> <NOTI> |amon_sender_proc| SAPI sync (blocking) with service 8345 at level 4
    Sep 30 17:00:57 :399838: <3672> <WARN> |fpapps| getMasterIp: Received MasterIp (192.168.150.111) and Role(2). Cancelling retry
    Sep 30 17:00:58 :399803: <4035> <ERRS> |policymgr| An internal system error has occurred at file policymgr_config.c function switch_ip_resp_hdlr line 165 error Received response for Switch IP.
    Sep 30 17:00:58 :399816: <4067> <ERRS> |vrrp| gsm_object_lookup failed for pot no: 0
    Sep 30 17:01:00 :309811: <3760> <WARN> |extifmgr| ifmap_current_state(): Broadcast IF-MAP Status: CPPM:Inactive.
    Sep 30 17:01:01 :300800: <4022> <ERRS> |aruba-central| Athena server configuration not present
    Sep 30 17:01:01 :300800: <4022> <ERRS> |aruba-central| Central Agent Trace disabled by configuration
    Sep 30 17:01:01 :399816: <4013> <ERRS> |upgrademgr| Received NCFG_PROFMGR_EVENT_ALL_CONFIG_RCVD...
    Sep 30 17:01:03 KERNEL: [ 301.434000] 1:<4>process `trapd' is using obsolete setsockopt SO_BSDCOMPAT
    Sep 30 17:01:04 KERNEL: [ 301.982000] 0:Nae: configuring port/hw 3/3 iftype 1 for speed=2, duplex=1
    Sep 30 17:01:04 :306510: <3330> <WARN> |publisher| Dropping message from 8212 for service '76 (service not found)'
    Sep 30 17:01:20 :399816: <3755> <ERRS> |mdns| ncfg_init: No name set in ncfg_init context.Nothing will be logged
    Sep 30 17:01:20 :309811: <3760> <WARN> |extifmgr| ifmap_current_state(): Broadcast IF-MAP Status: CPPM:Inactive.
    Sep 30 17:01:21 :316004: <3421> <WARN> |wms| WMS Ready: AP Load Time (secs): 0, STA Load Time (secs): 2, Probe Load Time (secs): 0, Total Load Time (secs): 0
    Sep 30 17:01:22 :330104: <3470> <NOTI> |amon_sender_proc| SAPI sync done with service 8345 at level 4
    Sep 30 17:01:22 :384006: <3470> <DBUG> |amon_sender_proc| process_init, STM is UP
    Sep 30 17:01:22 :384006: <3470> <DBUG> |amon_sender_proc| get_all_config, Getting current configuration for the app
    Sep 30 17:01:22 :306602: <3470> <INFO> |amon_sender_proc| Changing the logging level for 6 facilities
    Sep 30 17:01:23 KERNEL: [ 320.732000] 1:alloc_vis_map: Allocated 2432008 bytes for ip_flow_export(3900)
    Sep 30 17:01:23 KERNEL: [ 320.732000] 1:apsd_porf_getdmamem: physical address 0x30000000, virtual address 0xc000000030000000, size 2432008
    Sep 30 17:01:23 :384002: <3470> <ERRS> |amon_sender_proc| is_airwave_reconnect_msg, Wrong code received DTLS registration message
    Sep 30 17:01:24 KERNEL: [ 322.387000] 2:alloc_vis_map: Allocated 9437184 bytes for fw_visibility(3662)
    Sep 30 17:01:24 KERNEL: [ 322.387000] 2:apsd_porf_getdmamem: physical address 0x31000000, virtual address 0xc000000031000000, size 9437184
    Sep 30 17:01:26 KERNEL: [ 324.399000] 0:alloc_vis_map: Allocated 88064 bytes for ctamon(3665)
    Sep 30 17:01:26 KERNEL: [ 324.399000] 0:apsd_porf_getdmamem: physical address 0xBE620000, virtual address 0xc0000000be620000, size 88064
    Sep 30 17:01:41 KERNEL(7c:57:3c:cf:bd:3a@192.168.150.124): [194505.947741] VAP device aruba100 created osifp: (dda49540) os_if: (dc0c8000)
    Sep 30 17:01:53 KERNEL: [ 350.750000] 0:<4>hrtimer: interrupt took 2677067 ns
    Sep 30 17:02:19 :399838: <3459> <WARN> |stm| Resource 'Total APs' has exceeded 80% threshold (actual:93%).
    Sep 30 23:03:51 KERNEL(c8:b5:ad:c3:6d:d8@192.168.150.100): [219499.062601] anul_stale_sta_check: sta:94:fb:29:25:9a:d5 maybe stale sta detected
    Sep 30 23:04:51 KERNEL(c8:b5:ad:c3:6d:d8@192.168.150.100): [219559.122878] anul_stale_sta_check: sta:94:fb:29:25:9a:d5 stale sta aged out
    Oct 1 07:15:12 :399838: <3250> <WARN> |nanny| Resource 'Controlpath CPU' has exceeded 45% threshold (actual:46%).
    Oct 1 07:16:12 :399838: <3250> <WARN> |nanny| Resource 'Controlpath CPU' has dropped below 45% threshold (actual:5%).
    Oct 1 09:23:00 :399838: <3250> <WARN> |nanny| Resource 'Controlpath CPU' has exceeded 45% threshold (actual:63%).
    Oct 1 09:24:00 :399838: <3250> <WARN> |nanny| Resource 'Controlpath CPU' has dropped below 45% threshold (actual:7%).
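    Most of this output is startup noise, so it can help to filter out just the "Resource ... threshold" warnings and put them on a timeline. The sketch below assumes the log lines are unwrapped (no terminal line-wrap spaces inside words) and that the year is supplied by hand, since the controller timestamps do not include one. Names are my own.

```python
import re
from datetime import datetime

# Matches both "stm[3461]: <399838> ..." and ":399838: <3459> ..." style prefixes.
EVENT = re.compile(
    r"^(?P<ts>[A-Z][a-z]{2}\s+\d{1,2}\s+\d{2}:\d{2}:\d{2})\s.*?"
    r"\|(?P<proc>[^|]+)\| Resource '(?P<resource>[^']+)' has "
    r"(?P<direction>dropped below|exceeded) (?P<threshold>\d+)% threshold "
    r"\(actual:(?P<actual>\d+)%\)"
)

def threshold_events(lines, year):
    """Return resource threshold warnings as dicts, in the order they appear."""
    events = []
    for line in lines:
        match = EVENT.match(line.strip())
        if not match:
            continue
        stamp = " ".join(match.group("ts").split())  # normalise "Oct  1" style padding
        events.append({
            "time": datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S"),
            "resource": match.group("resource"),
            "direction": match.group("direction"),
            "threshold": int(match.group("threshold")),
            "actual": int(match.group("actual")),
        })
    return events

system_log = [
    "Sep 30 14:02:51 stm[3461]: <399838> <3461> <WARN> |stm| Resource 'Total APs' has dropped below 80% threshold (actual:0%).",
    "Sep 30 17:02:19 :399838: <3459> <WARN> |stm| Resource 'Total APs' has exceeded 80% threshold (actual:93%).",
    "Oct 1 07:15:12 :399838: <3250> <WARN> |nanny| Resource 'Controlpath CPU' has exceeded 45% threshold (actual:46%).",
    "Oct 1 07:16:12 :399838: <3250> <WARN> |nanny| Resource 'Controlpath CPU' has dropped below 45% threshold (actual:5%).",
]

events = threshold_events(system_log, year=2021)
for e in events:
    print(e["time"], e["resource"], e["direction"], f'{e["actual"]}%')
```

    Filtering on `resource == "Total APs"` gives the moments when the APs dropped off and came back, which can then be compared with switch or firewall logs from the same minutes.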

    ------------------------------
    Keshav Boodhun
    ------------------------------



  • 10.  RE: AP goes up and down on controller

    Posted Oct 01, 2021 04:35 AM
    There is an "alert" when the number of APs on the platform exceeds 80% and then recedes below that.  It is an indicator of what you are seeing, but it doesn't say why.   If the output of "show version" shows that there was a crash, I would consider upgrading.

    I would open a technical support case with Aruba/HPE so that they can look at all of your logs...  It is not practical/possible to analyze all of your logs and come up with a definitive recommendation on this forum, unfortunately.

    EDIT:  The simplest thing to check is to make sure that you are not giving out duplicate ip addresses in your DHCP server.  That has happened in the past.
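    As a rough illustration of that duplicate-address check: if you can export the DHCP leases (or an ARP table) as MAC/IP pairs, a few lines of script will flag any IP claimed by more than one MAC. The input format here is an assumption, and the third entry is a made-up conflicting device, not something from this thread's network.

```python
from collections import defaultdict

def find_duplicate_ips(leases):
    """Given (mac, ip) pairs, return {ip: [macs]} for every IP held by more than one MAC."""
    macs_by_ip = defaultdict(set)
    for mac, ip in leases:
        macs_by_ip[ip].add(mac.lower())
    return {ip: sorted(macs) for ip, macs in macs_by_ip.items() if len(macs) > 1}

# First two pairs are APs from the counters output earlier in the thread;
# the third is a hypothetical device holding the same address as one of them.
leases = [
    ("7c:57:3c:cf:bd:3a", "192.168.150.124"),
    ("90:4c:81:cf:b4:62", "192.168.150.109"),
    ("00:11:22:33:44:55", "192.168.150.109"),
]

duplicates = find_duplicate_ips(leases)
for ip, macs in duplicates.items():
    print(f"{ip} is claimed by: {', '.join(macs)}")
```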

    ------------------------------
    Any opinions expressed here are solely my own and not necessarily that of Hewlett Packard Enterprise or Aruba Networks.
    ------------------------------



  • 11.  RE: AP goes up and down on controller

    Posted Oct 07, 2021 08:29 AM
    Hi Guys,

    I have been monitoring the system for a week and the issue has not recurred.

    I have performed an upgrade on the controller.

    On my network, the access point VLAN was configured on 2 different firewalls. I corrected that.

    I also noticed that Darktrace (Antigena) was blocking FTP connections on the controller and access points, so I disabled that.

    Hopefully this issue is solved for good and this post may help someone else.

    Thank you guys 👍


    ------------------------------
    Keshav Boodhun
    ------------------------------



  • 12.  RE: AP goes up and down on controller

    Posted Oct 07, 2021 11:28 AM
    Hi Keshav,

    Glad to hear the issue is solved. And thanks for posting your feedback to the community to help others.

    ------------------------------
    Marcel Koedijk | MVP Guru 2021 | ACEP | ACMP | ACCP | ACDP | Ekahau ECSE | Not an HPE Employee | Opionions are my own
    ------------------------------