Wireless Access


Access network design for branch, remote, outdoor, and campus locations with HPE Aruba Networking access points and mobility controllers.

APs failed over to backup LMS

  • 1.  APs failed over to backup LMS

    Posted Mar 08, 2013 06:01 PM

    The majority of our campus APs failed over to our backup LMS IP (master controller) and I'm having a difficult time understanding how this happened.  It resulted in a large wireless outage which is why I'm trying to find the root cause.  I see the following message logged for APs that failed over:

     

    Rebootstrap Information
    -----------------------
    Date       Time     Reason (Latest 10)
    --------------------------------------
    2013-03-08 10:22:47 Switching to LMS 10.X.X.9. Send failed in function sapd_check_hbt.  Last Ctrl message: BW_REPORT len=150 dest=10.X.X.10 tries=1 seq=14549

     

    TAC said this message indicates the AP heartbeats to the controller were missed, resulting in a failover.  I can't find any indication that we had network problems, either in our core infrastructure or with the controller, or links flapping.  All systems have been up, no topology changes, no interface errors.  I don't see how the heartbeats could've been missed after confirming all this.
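When many APs log this reason, it can help to pull the fields out of each line so the timestamps and destination controllers can be compared across APs. The following is a minimal sketch, assuming the reason string always follows the shape of the single sample above; the pattern and the `parse_rebootstrap` helper are illustrative, not an Aruba-provided format or tool.

```python
import re

# Sample line copied from the "Rebootstrap Information" output in this post.
LINE = (
    "2013-03-08 10:22:47 Switching to LMS 10.X.X.9. "
    "Send failed in function sapd_check_hbt.  "
    "Last Ctrl message: BW_REPORT len=150 dest=10.X.X.10 tries=1 seq=14549"
)

# Pattern inferred from one sample only; other reason strings may differ.
PATTERN = re.compile(
    r"(?P<date>\d{4}-\d{2}-\d{2}) (?P<time>\d{2}:\d{2}:\d{2}) "
    r"Switching to LMS (?P<new_lms>\S+?)\. "
    r"Send failed in function (?P<function>\w+)\.\s+"
    r"Last Ctrl message: (?P<msg>\w+) len=(?P<len>\d+) "
    r"dest=(?P<dest>\S+) tries=(?P<tries>\d+) seq=(?P<seq>\d+)"
)


def parse_rebootstrap(line):
    """Return the named fields as a dict, or None if the line does not match."""
    match = PATTERN.search(line)
    return match.groupdict() if match else None


if __name__ == "__main__":
    for key, value in parse_rebootstrap(LINE).items():
        print(f"{key}: {value}")
```

Grouping the parsed entries by `time` and `dest` would show whether the APs all lost the same controller at the same moment, which points toward the controller or the path to it rather than individual AP links.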

     

    Anyone have thoughts on this?



  • 2.  RE: APs failed over to backup LMS

    MVP EXPERT
    Posted Mar 09, 2013 11:28 AM

    What do you see in the controller logs? Generally there are missed heartbeat messages in there too. Do you have any Heartbeat DSCP configured on the AP? With that set, you can prioritize AP heartbeats if the link becomes saturated.



  • 3.  RE: APs failed over to backup LMS

    Posted Mar 09, 2013 03:17 PM
    Is there a specific log file or log command that you would use to look for heartbeat issues?

    No, I haven't configured any heartbeat DSCP. I know that was configurable.


  • 4.  RE: APs failed over to backup LMS

    EMPLOYEE
    Posted Mar 09, 2013 05:04 PM

    @thecompnerd wrote:
    Is there a specific log file or log command that you would use to look for heartbeat issues?

    No, I haven't configured any heartbeat DSCP. I know that was configurable.

    Very few people use Heartbeat DSCP.  Unless your sustained wired utilization is over a certain percentage, it does not come into play.

     

    If you run show ap debug counters, it will show you which APs have more bootstraps than others.  Those are the ones whose connectivity you should look at.

     



  • 5.  RE: APs failed over to backup LMS

    Posted Apr 09, 2013 11:02 AM

    We've had the same occurrences recently.  No change in the network topology and no apparent network issues - yet hundreds of APs bootstrapped (and regardless of how saturated their links were). We opened a case and the end result was to change the heartbeat DSCP value.  We've not had any issues in the two weeks since.

     

    One thing I'm curious about is why access points that were not missing their heartbeats also bootstrapped along with all the others.  Did you see similar behavior?



  • 6.  RE: APs failed over to backup LMS

    Posted Apr 09, 2013 11:13 AM

    After some additional checking of the AP debug logs, it turns out the access points that appeared not to be missing heartbeats did bootstrap due to missed heartbeats.  They simply haven't missed any since their last known reboot, hence my misplaced confusion.

     

    Changing the DSCP value seems to be the way to go.



  • 7.  RE: APs failed over to backup LMS

    Posted Apr 10, 2013 08:15 AM

    What did you change the DSCP value to?



  • 8.  RE: APs failed over to backup LMS

    Posted Apr 10, 2013 09:51 AM

    We changed the value to 46, per the support engineer's guidance.
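For anyone matching this on the wired side: DSCP 46 is the Expedited Forwarding (EF) code point, and DSCP occupies the upper six bits of the IP ToS / Traffic Class byte, so packet captures and some switch QoS policies show it as ToS 184 (0xB8). A small sketch of that conversion follows; the helper names are just for illustration.

```python
def dscp_to_tos(dscp: int) -> int:
    """Shift a 6-bit DSCP value into the upper six bits of the ToS byte."""
    if not 0 <= dscp <= 63:
        raise ValueError("DSCP must be in the range 0-63")
    return dscp << 2


def tos_to_dscp(tos: int) -> int:
    """Extract the DSCP value from a ToS byte, discarding the two ECN bits."""
    if not 0 <= tos <= 255:
        raise ValueError("ToS must be in the range 0-255")
    return tos >> 2


if __name__ == "__main__":
    tos = dscp_to_tos(46)
    print(tos, hex(tos))  # 184 0xb8
```

Marking the heartbeats only helps if the switches and routers between the APs and the controller trust and queue on that marking, so the QoS policy along the path is worth checking as well.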



  • 9.  RE: APs failed over to backup LMS

    Posted Apr 11, 2013 09:48 PM

     

     

    Did this happen at different times of the day or around the same time?

     

    Also check whether other APs are experiencing this issue. You can see whether APs are bootstrapping or rebooting by running show ap debug counters:

     

    (controller) #show ap debug counters

    AP Counters
    -----------
    Name  Group  IP Address  Configs Sent  Configs Acked  AP Boots Sent  AP Boots Acked  Bootstraps (Total)  Reboots  Crash

    If it is only certain APs, then check for layer 1 issues: the wire to the AP, the port, or the trunk to that switch.
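To separate "only certain APs" from "everything bootstrapped together", one rough approach is to compare each AP's bootstrap count with the median across all APs. This is only a sketch: the AP names and counts below are made up for illustration, and the `factor` and `floor` thresholds are arbitrary assumptions, not values from Aruba.

```python
from statistics import median


def flag_outliers(bootstraps, factor=3, floor=5):
    """Return AP names whose bootstrap count exceeds max(floor, factor * median).

    The result is ordered from the highest count to the lowest.
    """
    cutoff = max(floor, factor * median(bootstraps.values()))
    flagged = [name for name, count in bootstraps.items() if count > cutoff]
    return sorted(flagged, key=lambda name: bootstraps[name], reverse=True)


if __name__ == "__main__":
    # Hypothetical per-AP bootstrap totals taken from show ap debug counters.
    counts = {"ap-a": 1, "ap-b": 2, "ap-c": 1, "ap-d": 40, "ap-e": 2}
    print(flag_outliers(counts))  # ['ap-d']
```

If nothing stands out and nearly every AP has a similar count, the cause is more likely shared (controller or core path) than a per-AP cabling problem.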

     

     

     

                                          

     



  • 10.  RE: APs failed over to backup LMS

    Posted Apr 21, 2013 11:31 PM

    Very good info!  Thanks for the reply.



  • 11.  RE: APs failed over to backup LMS

    Posted Feb 01, 2017 11:15 AM

    @Victor Fabian wrote:

    Did this happen at different times of the day or around the same time?

    Also check whether other APs are experiencing this issue. You can see whether APs are bootstrapping or rebooting by running show ap debug counters:

    (controller) #show ap debug counters

    AP Counters
    -----------
    Name  Group  IP Address  Configs Sent  Configs Acked  AP Boots Sent  AP Boots Acked  Bootstraps (Total)  Reboots  Crash

    If it is only certain APs, then check for layer 1 issues: the wire to the AP, the port, or the trunk to that switch.


    Please tell me how to interpret the headers. Also, it would be a good idea to add a "sort-by" option (by header) to the command, because I currently have to export the output to Excel and then sort the data.

     

    By the way, is there any other command (other than running "sh ap tech-support" on each AP, one by one) to see the real uptime? The real uptime differs from the value shown in the "Status" column of the "sh ap database local long sort-by uptime" output.

    Example (cut from 1000 records to just the top 30) where I see far too many reboots, sorted by the "Reboots" header:

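    Configs Sent  Configs Acked  AP Boots Sent  AP Boots Acked  Bootstraps  Total  Reboots  Crash
    32001551551N
    44002307306N
    4             4              0              0               2           3      1        N
    4             4              0              0               2           3      1        N
    4             4              0              0               2           3      1        N
    4             4              0              0               2           3      1        N
    4             4              0              0               2           3      1        N
    4             4              0              0               2           3      1        N
    22            22             0              0               2           2      1        N
    5             5              0              0               2           2      1        N
    4             4              0              0               2           2      1        N
    4             4              0              0               2           2      1        N
    4             4              0              0               2           2      1        N
    4             4              0              0               2           2      0        N
    2             2              0              0               2           2      0        N
    32            32             0              0               1           1      0        N
    13            13             0              0               1           1      0        N
    11            11             0              0               1           1      0        N
    11            11             0              0               1           1      0        N
    10            10             0              0               1           1      0        N
    8             8              0              0               1           1      0        N
    7             7              0              0               1           1      0        N
    6             6              0              0               1           1      0        N
    6             6              0              0               1           1      0        N
    6             6              0              0               1           1      0        N
    6             6              0              0               1           1      0        N
    6             6              0              0               1           1      0        N
    6             6              0              0               1           1      0        N
    6             6              0              0               1           1      0        N
    6             6              0              0               1           1      0        N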

    Best regards.
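Until the command offers a sort option, the sorting step can be scripted instead of done in Excel. This is a minimal sketch, assuming the counters have already been exported to CSV with one column per header; the AP names and values in `SAMPLE` are hypothetical, and `sort_by` is an illustrative helper, not part of any Aruba tooling.

```python
import csv
import io

# Hypothetical CSV export; the real export may use different header names.
SAMPLE = """Name,Configs Sent,Configs Acked,Bootstraps,Total,Reboots,Crash
ap-1,4,4,2,3,1,N
ap-2,22,22,2,2,1,N
ap-3,4,4,2,2,0,N
ap-4,32,32,1,1,0,N
"""


def sort_by(csv_text, column, descending=True):
    """Parse CSV text and return its rows sorted numerically by one column."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    return sorted(rows, key=lambda row: int(row[column]), reverse=descending)


if __name__ == "__main__":
    for row in sort_by(SAMPLE, "Reboots"):
        print(row["Name"], row["Reboots"], row["Total"])
```

Converting the column to `int` before sorting matters: sorting the raw strings would place "9" above "10".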