Wireless Access


Access network design for branch, remote, outdoor, and campus locations with HPE Aruba Networking access points and mobility controllers.

Tweaking RAP LMS Failover

  • 1.  Tweaking RAP LMS Failover

    Posted Jun 04, 2014 12:07 AM

    Having some issues with RAPs failing over to backup LMS due to missed heartbeats.  It's a RAP here and a RAP there, not all at the same time that they fail over, so I suspect it's random packet loss at the branch offices.  I see two options that will configure LMS failover:

     

    Tunnel Heartbeat Interval = 1 sec

    Bootstrap threshold = 8

     

    Is there a science to configuring these values or just trial and error to find what works best?  Quick failover is not a requirement as I really only want the RAPs to failover if there's a serious outage.  That being said I wouldn't think it matters much how I set these values, but I'm still curious to know how others would approach this.

     

    Thanks.



  • 2.  RE: Tweaking RAP LMS Failover

    EMPLOYEE
    Posted Jun 04, 2014 01:56 AM

    I would consider raising the bootstrap threshold to 12 to start with.



  • 3.  RE: Tweaking RAP LMS Failover

    EMPLOYEE
    Posted Jun 04, 2014 03:12 AM

    A RAP's bootstrap threshold is 30 minimum, even if the configuration says 8.  You should configure it at 40 or above:

    https://arubanetworkskb.secure.force.com/pkb/articles/FAQ/What-is-the-default-bootstrap-threshold-for-Remote-and-Campus-APs-on-an-Aruba-Controller
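
    A rough way to see why a higher threshold rides out random loss: if we assume each heartbeat is lost independently with the same probability (an assumption; real WAN loss is often bursty, so this is an optimistic estimate), the chance of missing a full run of consecutive heartbeats shrinks geometrically with the threshold. The function name below is purely illustrative and not part of any Aruba tool.

```python
def p_consecutive_misses(loss_rate: float, threshold: int) -> float:
    """Probability that `threshold` heartbeats in a row are all lost.

    Assumes independent, identically distributed loss per heartbeat,
    which is a simplification of real WAN behaviour.
    """
    if not 0.0 <= loss_rate <= 1.0:
        raise ValueError("loss_rate must be between 0 and 1")
    if threshold < 1:
        raise ValueError("threshold must be at least 1")
    return loss_rate ** threshold


if __name__ == "__main__":
    # Compare a few thresholds at a hypothetical 20% loss rate.
    for threshold in (8, 30, 40):
        print(threshold, p_consecutive_misses(0.20, threshold))
```

    Under this model, purely random loss almost never reaches a run of 30 or 40 misses, so failovers at that threshold would point to sustained outages or bursts rather than scattered drops.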

     

     



  • 4.  RE: Tweaking RAP LMS Failover

    Posted Jun 04, 2014 07:24 AM
    Ahh, interesting little caveat. Would've never known that was the case. Thanks for pointing that out. I'll bump it to 40 and see what happens.

    Also, will the RAP prioritize heartbeat traffic over user traffic if there's contention on enet0? Some locations have very small pipes to the internet, so I think it's possible that heartbeat traffic is getting dropped at the source.


  • 5.  RE: Tweaking RAP LMS Failover

    EMPLOYEE
    Posted Jun 04, 2014 07:51 AM

    If there is QoS enabled on those WAN links, you could try changing the heartbeat DSCP.  From the CLI guide:

     

    Define the DSCP value of AP heartbeats.
    Use this feature to prioritize AP heartbeats
    and prevent the AP from losing connectivity
    with the controller over high-latency or
    low-bandwidth WAN connections.



  • 6.  RE: Tweaking RAP LMS Failover

    EMPLOYEE
    Posted Jun 04, 2014 08:02 AM

    Michael_Clarke,

     

    Have you ever used that feature?  It will tag the outer heartbeat packets with whatever DSCP you want, but the entire infrastructure from the RAP to the controller must support prioritizing those packets, otherwise it will not work as intended.



  • 7.  RE: Tweaking RAP LMS Failover

    EMPLOYEE
    Posted Jun 04, 2014 08:05 AM

    I've never used that feature.  Just thought it might be what thecompnerd is looking for, but as you say, if it's not set up end to end, it won't work as intended.



  • 8.  RE: Tweaking RAP LMS Failover

    Posted Jun 04, 2014 08:45 AM
    Unfortunately, the remote locations are not set up for QoS. I was hoping that,
    by default, the RAP would protect heartbeat traffic via a high-priority
    queue.


  • 9.  RE: Tweaking RAP LMS Failover

    Posted Jun 04, 2014 08:59 AM

    The RAP WILL protect the traffic as it sends it, but it's up to the next device in the chain to do the same. It's like the RAP saying to the next hop "make this a priority - it's really important" and the next hop saying "yeah, whatever, not interested" and sending it with no specific priority. QoS has to be end-to-end, with all devices in the chain between source and destination honoring the markings or translating them to an equivalent priority in a different system.
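
    To make the end-to-end point concrete, here is a minimal sketch that models a path as a list of hops and checks whether a marking survives all of them. The `Hop` class, the hop names, and both helper functions are hypothetical illustrations, not anything from ArubaOS or a real network tool.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Hop:
    """One device on the path; honors_dscp is an illustrative flag."""
    name: str
    honors_dscp: bool


def priority_preserved(path: List[Hop]) -> bool:
    """True only if every hop on the path honors the DSCP marking."""
    return all(hop.honors_dscp for hop in path)


def first_hop_ignoring(path: List[Hop]) -> Optional[str]:
    """Name of the first hop that drops the priority, or None."""
    for hop in path:
        if not hop.honors_dscp:
            return hop.name
    return None


if __name__ == "__main__":
    # Hypothetical branch path: the RAP marks traffic, the ISP router ignores it.
    path = [
        Hop("rap", True),
        Hop("branch-isp-router", False),
        Hop("controller-edge", True),
    ]
    print(priority_preserved(path), first_hop_ignoring(path))
```

    A single hop that ignores the marking is enough to lose the priority for the whole path, which is the situation at sites with no QoS on the WAN link.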



  • 10.  RE: Tweaking RAP LMS Failover

    Posted Aug 06, 2014 08:46 AM

    Thanks for the input all.  I'm still having a lot of issues with this.

     

    Also, I want to understand what the max request retries setting is for.  Based on the user guide, this is my understanding: Once the bootstrap threshold is met, the AP turns off the radios and attempts to connect to the primary LMS.  If the primary LMS is not reachable (based on the max request retries count), the AP will attempt to connect to the backup LMS or reboot.  Is that right?



  • 11.  RE: Tweaking RAP LMS Failover

    EMPLOYEE
    Posted Aug 06, 2014 08:48 AM

    thecompnerd,

     

    What issues are you having?  Please explain.



  • 12.  RE: Tweaking RAP LMS Failover

    Posted Aug 06, 2014 09:23 AM

    Almost every RAP is failing over between primary and backup LMS every week.  Every time I look at the debug for a RAP when it moves between controllers, the reason is "controller aged out".  I've been suspecting ISP issues, but I have many RAPs deployed with different ISPs and bandwidth plans, so I'm not ready to rule out configuration/RAP/controllers yet.

     

    As suggested, I set the bootstrap threshold to 40.  I've confirmed via testing that a RAP will rebootstrap after 40 seconds (bootstrap threshold of 40 x 1 sec heartbeat interval).
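
    The arithmetic above can be sketched as a small helper. It assumes the 30-heartbeat floor for RAPs mentioned earlier in this thread applies; the constant and function names are made up for illustration and are not controller settings.

```python
# Floor for RAPs as reported earlier in the thread (treated as an assumption).
RAP_MIN_BOOTSTRAP_THRESHOLD = 30


def rebootstrap_seconds(bootstrap_threshold: int,
                        heartbeat_interval_s: float = 1.0,
                        is_rap: bool = True) -> float:
    """Seconds of continuously missed heartbeats before a rebootstrap.

    For RAPs the configured threshold is raised to the assumed floor of 30.
    """
    effective = bootstrap_threshold
    if is_rap:
        effective = max(bootstrap_threshold, RAP_MIN_BOOTSTRAP_THRESHOLD)
    return effective * heartbeat_interval_s


if __name__ == "__main__":
    print(rebootstrap_seconds(40))   # threshold 40 at a 1 sec interval
    print(rebootstrap_seconds(8))    # configured 8, floor applies for a RAP
```

    With a threshold of 40 and a 1 sec interval this gives 40 seconds, matching the test result above.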


