Wireless Access


Access network design for branch, remote, outdoor, and campus locations with HPE Aruba Networking access points and mobility controllers.

Issues with AP-103 and HA Issues

  • 1.  Issues with AP-103 and HA Issues

    Posted Sep 25, 2014 03:26 PM

    Hi All,

     

    Has anybody deployed the AP-103?

     

    We are facing performance issues with it. Also, are there any known HA issues on the 6.4.2.1 code version? Our APs randomly move over to the local controller.

     

     

     



  • 2.  RE: Issues with AP-103 and HA Issues

    Posted Sep 25, 2014 05:18 PM

     

    Do all the APs move at the same time, or do they move one-by-one?

     

     



  • 3.  RE: Issues with AP-103 and HA Issues

    Posted Sep 25, 2014 09:56 PM
    They all move at the same time. We have checked latency on the network and there seems to be no issue.


  • 4.  RE: Issues with AP-103 and HA Issues

    Posted Sep 25, 2014 10:24 PM

     

    You may be hitting an issue I currently have open with TAC/engineering, which they tried to fix in 6.4.2.1, but the fix didn't quite do the trick in production.

     

    I'll let TAC know someone else may be seeing this issue.  In the meantime, we are running with inter-controller heartbeats disabled.  This means HA is not as fast, but the AP-based HA is still working (they still build preemptive tunnels) -- we have found that the APs themselves only fail over when there is a real problem, but the inter-controller heartbeat is tripping for some reason.

     

    To turn off heartbeats, go to the redundancy menu, and in the groups containing the affected controllers uncheck the "heartbeat" checkbox, then apply and save the configuration.  Or from the CLI go into the "ha group-profile" and execute "no heartbeat", then do a "wr mem".

     

    I'd be interested to know what gear you have between your controllers.

     



  • 5.  RE: Issues with AP-103 and HA Issues

    Posted Sep 26, 2014 02:13 AM

    Good to know. We have temporarily disabled HA and enabled LMS and BLMS redundancy, which is the traditional way of failing over the APs.

     

    We have two controllers connected over an MPLS link, and they are connected to the core network on a port channel.

     

    We have checked the latency and found no problem, and the ports show no error frames.

     

     



  • 6.  RE: Issues with AP-103 and HA Issues

    Posted Sep 26, 2014 08:03 AM

     

    OK let's do a bit of verification that you have the same issue.

     

    Please go to each controller and check "show ha heartbeat counters" and see if any were missed while you had the feature turned on.  The stats should still be there as long as you have not reloaded.

     

    On one of the APs, do a 'show ap debug system-status ap-name XXX | begin "HA Failover Information"'

     

    See if you have a line like this there matching the same time as the APs moved:

    2014-09-24 15:10:03 Failover request from standby: fail-over to 10.5.5.81

    Do you have entries in the syslog like this from about that time, and also at other times?

     

    sbHeartbeat: PAPI RxPacketFromSibyte: ACK to invalid packet SN = 0x0000a36e opcode=0x6

     

    We also have to rule out that you had actual packet loss.  I know you checked the latency, but did you also check for queueing drops?

     



  • 7.  RE: Issues with AP-103 and HA Issues

    Posted Sep 26, 2014 09:16 AM

    Unfortunately, I am not able to see the heartbeat counters between the controllers.

     

     

     

    HA Failover Information
    -----------------------
    Date       Time     Reason (Latest 10)
    --------------------------------------
    2014-09-25 16:45:13 Failover request from standby: fail-over to 10.224.32.30
    2014-09-25 16:50:16 Pre-emptive failover back to LMS 10.223.32.30
    2014-09-25 17:26:16 Failover request from standby: fail-over to 10.224.32.30
    2014-09-25 17:31:20 Pre-emptive failover back to LMS 10.223.32.30

     

    Please find the logs taken from one of the APs.
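
A minimal Python sketch (not an Aruba tool) of one way to read the failover history above: it pairs each "Failover request from standby" entry with the next "Pre-emptive failover back" entry and reports how long the AP stayed on the standby. The log lines are copied from this post; the function names `parse_events` and `standby_intervals` are hypothetical, and the sketch assumes each entry starts with a `YYYY-MM-DD HH:MM:SS` timestamp as shown.

```python
from datetime import datetime

# Log lines copied from the "HA Failover Information" output in this post.
LOG = """\
2014-09-25 16:45:13 Failover request from standby: fail-over to 10.224.32.30
2014-09-25 16:50:16 Pre-emptive failover back to LMS 10.223.32.30
2014-09-25 17:26:16 Failover request from standby: fail-over to 10.224.32.30
2014-09-25 17:31:20 Pre-emptive failover back to LMS 10.223.32.30
"""


def parse_events(text):
    """Return a list of (timestamp, reason) tuples, one per non-empty line."""
    events = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        date, time, reason = line.split(" ", 2)
        stamp = datetime.strptime(f"{date} {time}", "%Y-%m-%d %H:%M:%S")
        events.append((stamp, reason))
    return events


def standby_intervals(events):
    """Pair each standby-requested failover with the next pre-emptive failback."""
    intervals = []
    started = None
    for stamp, reason in events:
        if reason.startswith("Failover request from standby"):
            started = stamp
        elif reason.startswith("Pre-emptive failover back") and started is not None:
            intervals.append((started, stamp, (stamp - started).total_seconds()))
            started = None
    return intervals


events = parse_events(LOG)
for start, end, seconds in standby_intervals(events):
    print(f"{start} -> {end}: {seconds:.0f} s on standby")
```

For the four entries above this reports two round trips of 303 s and 304 s, so each event moved the AP twice: once to the standby and once back.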



  • 8.  RE: Issues with AP-103 and HA Issues

    Posted Sep 26, 2014 09:34 AM

     

    OK, if at some point in the future you turn HA back on to test, it might be best

    if you disabled the "preemption" checkbox so you don't get two failovers for

    every actual failover.  This will result in the APs remaining on the standby

    unless/until there is another event.

     

     



  • 9.  RE: Issues with AP-103 and HA Issues

    Posted Sep 26, 2014 09:57 AM

    Is it the same bug that you are facing?

     

    Also, are you using the AP-103 on your network?



  • 10.  RE: Issues with AP-103 and HA Issues

    Posted Sep 26, 2014 10:00 AM

     

    It is hard to tell for sure because it could also be caused by real packet loss, but I don't see anything that would rule it out so far based on what I've seen.

     

    We have a lot of 103H's.  The HA bug is not model-specific, it affects all models.

     

     



  • 11.  RE: Issues with AP-103 and HA Issues

    Posted Sep 26, 2014 10:09 AM

    We run lots of applications that rely on the DC and the backup DC, so if there were any issues there would be a real problem on the network. This is the group network, with a 100 Mbps MPLS link between the sites, and it has never been utilized beyond 30 or 40 Mbps.

     

    The only thing we have noticed is the failover happening 3 to 4 times a day, out of the blue.

     

    That aside, my actual questions about the AP-103 were: how many clients can be connected at the same time, and what bandwidth should we expect for a file transfer between two clients connected on a-HT in an ideal deployment scenario?



  • 12.  RE: Issues with AP-103 and HA Issues

    Posted Sep 26, 2014 10:35 AM

    show ha heartbeat counters  

     

    Local Controller

    Heartbeat stats
    ---------------
    Controller IP  Active Reference Count  Total Heartbeat Sent  Total Heartbeat Received  Last Missed Heartbeat (Count) Time
    -------------  ----------------------  --------------------  ------------------------  ----------------------------------
    10.223.32.30   5                       843312                843310                    (2) Thu Sep 25 17:42:59 2014

     

    show ha heartbeat counters  

     

    Master Controller

    Heartbeat stats
    ---------------
    Controller IP  Active Reference Count  Total Heartbeat Sent  Total Heartbeat Received  Last Missed Heartbeat (Count) Time
    -------------  ----------------------  --------------------  ------------------------  ----------------------------------
    10.224.32.30   0                       38474                 38474                     0
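
As a rough cross-check of the two outputs above, here is a small Python sketch that splits each counter row into fields and compares heartbeats sent with heartbeats received. The rows are copied from this post; `parse_row` is a hypothetical helper, and it assumes the column layout is exactly as pasted (five whitespace-separated fields, with the last one either `0` or `(count) timestamp`). The sent-minus-received difference is only an approximation, not an official miss count.

```python
import re

# Data rows copied from the two "show ha heartbeat counters" outputs above.
ROWS = """\
10.223.32.30   5                       843312                843310                    (2) Thu Sep 25 17:42:59 2014
10.224.32.30   0                       38474                 38474                     0
"""


def parse_row(line):
    """Split one counter row into a dict; the last column may be '0' or '(N) <time>'."""
    peer, refs, sent, received, last_missed = line.strip().split(None, 4)
    match = re.match(r"\((\d+)\)\s+(.*)", last_missed)
    return {
        "peer": peer,
        "active_refs": int(refs),
        "sent": int(sent),
        "received": int(received),
        "missed_count": int(match.group(1)) if match else 0,
        "last_missed": match.group(2).strip() if match else None,
    }


rows = [parse_row(line) for line in ROWS.splitlines() if line.strip()]
for row in rows:
    diff = row["sent"] - row["received"]
    print(f'{row["peer"]}: sent-received={diff}, missed={row["missed_count"]}, last={row["last_missed"]}')
```

Here the local controller shows a difference of 2 against 843312 heartbeats sent, and the master shows none.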



  • 13.  RE: Issues with AP-103 and HA Issues

    Posted Sep 26, 2014 10:44 AM

     

    That's not enough missed heartbeats for the problem to be real packet loss. You only missed two and had two failovers, and the time of the last missed one shows that at least one was not during a failover. So you are indeed likely to be experiencing the HA issue I have open with TAC.
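
The timing argument can be checked with a few lines of Python. This is only a sketch: the failover times come from the AP log in post 7 and the last-missed time from the counters in post 12, the 60-second `WINDOW` is a hypothetical tolerance, and it assumes the AP and controller clocks are reasonably in sync.

```python
from datetime import datetime, timedelta

# "Failover request from standby" times from the AP log in post 7.
FAILOVERS = [
    datetime(2014, 9, 25, 16, 45, 13),
    datetime(2014, 9, 25, 17, 26, 16),
]
# "Last Missed Heartbeat" time from the local controller's counters in post 12.
LAST_MISSED = datetime(2014, 9, 25, 17, 42, 59)
# Hypothetical tolerance for calling two events "the same moment".
WINDOW = timedelta(seconds=60)


def nearest_gap(stamp, events):
    """Return the smallest absolute time difference between stamp and any event."""
    return min(abs(stamp - event) for event in events)


gap = nearest_gap(LAST_MISSED, FAILOVERS)
print(f"nearest failover is {gap.total_seconds():.0f} s away; within window: {gap <= WINDOW}")
```

The last missed heartbeat is 1003 s (about 17 minutes) after the nearest failover request, so it does not line up with either failover.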

     

    As far as the 103Hs go I really can't answer that question because ours are so densely packed they never get more than a few users on each one.  I do know they were not meant for very busy environments like classrooms like the 225s are.  Here we have them using 11an-HT but only on the conventional channels as those devices are not certified for DFS yet.  We have had very few complaints.

     

     



  • 14.  RE: Issues with AP-103 and HA Issues

    EMPLOYEE
    Posted Sep 26, 2014 11:25 AM

    sripathy, Are you working with Aruba TAC?



  • 15.  RE: Issues with AP-103 and HA Issues

    Posted Sep 26, 2014 04:24 PM

    The same issue is happening with Alcatel as well. We are using AOS-W 6.4.2.1 and have a TAC ticket open for it.

     

    If the bug id can be shared we can expedite the fix.



  • 16.  RE: Issues with AP-103 and HA Issues

    Posted Sep 30, 2014 10:53 AM

    Hi bjulin,

     

    Could you please share the bug ID so that I can forward my request to ALU, referring to it?



  • 17.  RE: Issues with AP-103 and HA Issues

    Posted Sep 30, 2014 11:05 AM

     

    #1552028

     

    Note that the fix for bug ID #105535 in 6.4.2.1 was supposed to address this, but did not.

     

     



  • 18.  RE: Issues with AP-103 and HA Issues

    Posted Sep 30, 2014 03:27 PM

    Hi,

     

    I just got a mail from ALU TAC, and they have recommended that I change the settings below:

     

    • We should not deploy HA in active and standby mode; instead, both controllers should be in dual mode.
    • Since it is HA, we should not configure a backup LMS in the AP system profile. Instead, only the LMS should be set, to the master IP, and HA will take effect for failover.

    Are you running Master-Master or Master-Local?



  • 19.  RE: Issues with AP-103 and HA Issues

    Posted Sep 30, 2014 03:32 PM

     

    Yes, we are running both in dual mode, and yes, we only configure the master IP.

     



  • 20.  RE: Issues with AP-103 and HA Issues

    Posted Sep 30, 2014 03:42 PM

    Are you hitting the same bug even after these changes?



  • 21.  RE: Issues with AP-103 and HA Issues

    Posted Sep 30, 2014 03:44 PM

     

    We never made those as "changes"; that's the way we initially configured it.

     



  • 22.  RE: Issues with AP-103 and HA Issues

    Posted Sep 30, 2014 03:52 PM

    OK. In my case I had to remove these IP addresses from the AP system profile.

     

    So do we still miss heartbeats between the controllers?



  • 23.  RE: Issues with AP-103 and HA Issues

    Posted Sep 30, 2014 04:01 PM

     

    Some part of the controller seems to think there are missed heartbeats, because it tells the APs to move.  However we have seen it happen even when the controllers know (via the show ha heartbeat counters command) that no heartbeats were ever missed.

     

     

     

     



  • 24.  RE: Issues with AP-103 and HA Issues

    Posted Oct 06, 2014 03:14 PM

    I was told that there was a serious bug in the 6.4.2.1 release and that it has been taken off the site?

     

     



  • 25.  RE: Issues with AP-103 and HA Issues

    Posted Oct 06, 2014 03:21 PM

     

    Yes, 6.4.2.2 is the same code with, as far as I know, nothing but the fix for this bug (EDIT: the "serious" bug, not the HA bug)

     

    I'm running 6.4.2.2 fine in production now.