Wireless Access

Access network design for branch, remote, outdoor, and campus locations with HPE Aruba Networking access points and mobility controllers.
APs not connecting to primary LMS after enabling HA Fast Failover

  • 1.  APs not connecting to primary LMS after enabling HA Fast Failover

    Posted May 22, 2020 09:22 AM

    We have a master/local environment and recently enabled HA Fast Failover on the two locals, which are in separate data centers.  The feature seems to work fairly well, with failover times of around 20 seconds before service is fully restored.  One thing we've noticed, though, is that some APs appear to be "stuck" on their standby controller instead of connecting back to their primary controller.  In one example, about half the APs on one floor are on their primary and half are on their standby.  I know that we can move the APs manually, but we have a couple of thousand APs and likely around 50-100 of them would need to be moved.  Is this expected behaviour, or are we encountering some kind of bug?  We also upgraded to 6.5.4.16 as part of enabling Fast Failover.

    ET-02F-AP01 <ap-group> 325 10.200.220.234 Up 11h:35m:45s 2 10.204.65.27 10.204.1.27
    ET-02F-AP02 <ap-group> 325 10.200.220.89 Up 11h:40m:5s 2 10.204.65.27 10.204.1.27
    ET-02F-AP03 <ap-group> 325 10.200.220.66 Up 11h:42m:15s 2S 10.204.1.27 10.204.65.27
    ET-02F-AP04 <ap-group> 325 10.200.220.210 Up 11h:43m:18s 2S 10.204.1.27 10.204.65.27
    ET-02F-AP05 <ap-group> 325 10.200.220.247 Up 11h:42m:13s 2S 10.204.1.27 10.204.65.27
    ET-02F-AP06 <ap-group> 325 10.200.221.210 Up 11h:42m:57s 2S 10.204.1.27 10.204.65.27
    ET-02F-AP07 <ap-group> 325 10.200.221.50 Up 11h:36m:43s 2 10.204.65.27 10.204.1.27
    ET-02F-AP08 <ap-group> 325 10.200.221.215 Up 11h:42m:13s 2S 10.204.1.27 10.204.65.27
    ET-02F-AP09 <ap-group> 325 10.200.221.140 Up 11h:40m:22s 2 10.204.65.27 10.204.1.27
    ET-02F-AP10 <ap-group> 325 10.200.221.75 Up 11h:36m:34s 2 10.204.65.27 10.204.1.27
    ET-02F-AP11 <ap-group> 325 10.200.221.240 Up 11h:35m:3s 2 10.204.65.27 10.204.1.27

    In the example above, all of these APs should have 10.204.65.27 as their primary controller, yet five of them (AP03, AP04, AP05, AP06, and AP08) are connected to 10.204.1.27, their standby, as if it were their primary.
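
    With a couple of thousand APs, spotting the misplaced ones by eye does not scale. The following is a minimal sketch of how a saved copy of the listing could be checked. It assumes the last two columns of each row are the controller the AP is currently on and its standby, in that order, as the rows above suggest. The names `parse_rows` and `not_on_primary` are hypothetical helpers, not part of any Aruba tooling.

```python
from typing import List, NamedTuple


class ApRow(NamedTuple):
    name: str
    active: str   # assumption: second-to-last column is the controller in use
    standby: str  # assumption: last column is the standby controller


def parse_rows(text: str) -> List[ApRow]:
    """Parse whitespace-separated AP rows, skipping blank or short lines."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue
        rows.append(ApRow(fields[0], fields[-2], fields[-1]))
    return rows


def not_on_primary(rows: List[ApRow], expected_primary: str) -> List[str]:
    """Return the names of APs whose active controller is not the expected one."""
    return [r.name for r in rows if r.active != expected_primary]


# A few rows copied from the listing in this post.
SAMPLE = """
ET-02F-AP01 <ap-group> 325 10.200.220.234 Up 11h:35m:45s 2 10.204.65.27 10.204.1.27
ET-02F-AP02 <ap-group> 325 10.200.220.89 Up 11h:40m:5s 2 10.204.65.27 10.204.1.27
ET-02F-AP03 <ap-group> 325 10.200.220.66 Up 11h:42m:15s 2S 10.204.1.27 10.204.65.27
ET-02F-AP08 <ap-group> 325 10.200.221.215 Up 11h:42m:13s 2S 10.204.1.27 10.204.65.27
"""

if __name__ == "__main__":
    print(not_on_primary(parse_rows(SAMPLE), "10.204.65.27"))
    # prints ['ET-02F-AP03', 'ET-02F-AP08']
```

    Taking the last two fields rather than fixed positions keeps the sketch tolerant of AP group names that contain no spaces but vary in length; it would still need adjusting if the real output has different trailing columns.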

    Hoping someone has an idea on how to resolve this.

    Thanks.

  • 2.  RE: APs not connecting to primary LMS after enabling HA Fast Failover

    Posted May 22, 2020 10:30 AM
    This is not expected behavior. Do you have other APs in that same location that do not behave this way?

    What type of APs?
    What’s the return traffic between the standby controller and that location?


    Thank you

    Victor Fabian

    Pardon typos sent from Mobile


  • 3.  RE: APs not connecting to primary LMS after enabling HA Fast Failover

    Posted May 22, 2020 10:43 AM

    Yes, as per the output above, about half are pointing to their correct/primary LMS while the others are pointing to their standby.

    These are all AP-325s, and the two controllers (primary/standby) are each in their own data center with about 15ms of latency between them.  Controllers are 7240XMs.



  • 4.  RE: APs not connecting to primary LMS after enabling HA Fast Failover

    Posted May 26, 2020 08:22 AM

    I've opened a TAC case to see if they could identify anything, and so far they've come back to say that having "ha-on-bkup-lms" enabled in the HA profile configuration could be the reason the APs are not swinging back to their primary LMS.  Based on the description of the command, it seems plausible.  I'm going to test this out in my prod environment and report back.

    HA on Backup-LMS
    Starting from AOS-W 6.4.4.15, a new parameter, ha-on-bkup-lms, is added in the HA profile to enable or disable the HA on Backup-LMS.  When this parameter is enabled, an AP can set up a standby tunnel after the AP rebootstraps to Backup-LMS. However, in this case, LMS preemption will be ignored. When this parameter is disabled, the AP cannot set up a standby tunnel after the AP rebootstraps to Backup-LMS; the AP will rebootstrap to LMS if LMS is back and LMS preemption is enabled.
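
    The description above boils down to a small decision rule. The sketch below only restates that quoted text as a toy Python function; it is not controller code, and the function and parameter names are hypothetical.

```python
def returns_to_primary(ha_on_bkup_lms: bool,
                       lms_preemption: bool,
                       primary_lms_up: bool) -> bool:
    """Whether an AP sitting on its Backup-LMS is expected to rebootstrap to the LMS.

    Toy model of the quoted documentation only; names are illustrative.
    """
    if ha_on_bkup_lms:
        # Enabled: the AP may build a standby tunnel from the Backup-LMS,
        # but LMS preemption is ignored, so it stays where it is.
        return False
    # Disabled: the AP goes back only if the LMS is up and preemption is on.
    return primary_lms_up and lms_preemption


if __name__ == "__main__":
    # The situation in this thread: parameter enabled, preemption configured.
    print(returns_to_primary(True, True, True))
    # Same setup with the parameter disabled.
    print(returns_to_primary(False, True, True))
```

    Read this way, an AP that landed on its Backup-LMS with the parameter enabled has no trigger to move back, which would match the "stuck" APs in the listing.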



  • 5.  RE: APs not connecting to primary LMS after enabling HA Fast Failover
    Best Answer

    Posted Jun 03, 2020 01:58 PM

    Just thought I'd comment that entering the "no ha-on-bkup-lms" command in the HA profile has fixed my issue of APs being stuck on their standby controller.  Once the LMS preemption period has expired, the APs flip back to their normal setup.