Wireless Access

Access network design for branch, remote, outdoor, and campus locations with HPE Aruba Networking access points and mobility controllers.

Missed heartbeat on 6.4.2.6, APs rebootstrap

  • 1.  Missed heartbeat on 6.4.2.6, APs rebootstrap

    Posted May 19, 2015 06:50 AM

    Wondering if anyone else is seeing this problem.

     

    We've experienced problems after updating our controllers to 6.4.2.6.

    All APs on a controller will rebootstrap after a period of time, usually less than 48 hours. AP debug info shows missed heartbeats and gives the reason for disconnect as: controller aged out
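
    For anyone unfamiliar with the symptom, here is a toy sketch of how a heartbeat peer typically gets "aged out" after enough consecutive misses. It is only an illustration and not ArubaOS code: the class name, the 1-second interval and the threshold of 8 misses are all assumptions made up for the example, and the real timers are not stated anywhere in this thread.

```python
from dataclasses import dataclass


@dataclass
class HeartbeatMonitor:
    """Toy model of one side of a heartbeat exchange (hypothetical, not ArubaOS)."""

    interval_s: float = 1.0   # assumed heartbeat interval, illustration only
    max_missed: int = 8       # assumed miss threshold, illustration only
    last_seen_s: float = 0.0  # time the last heartbeat was received

    def on_heartbeat(self, now_s: float) -> None:
        # Receiving a heartbeat resets the miss count by moving last_seen forward.
        self.last_seen_s = now_s

    def missed(self, now_s: float) -> int:
        # Number of whole intervals that have passed without a heartbeat.
        elapsed = max(0.0, now_s - self.last_seen_s)
        return int(elapsed // self.interval_s)

    def aged_out(self, now_s: float) -> bool:
        # The peer is declared dead once the miss count reaches the threshold.
        return self.missed(now_s) >= self.max_missed


monitor = HeartbeatMonitor()
monitor.on_heartbeat(10.0)
print(monitor.missed(12.5), monitor.aged_out(12.5))
print(monitor.missed(18.0), monitor.aged_out(18.0))
```

    In a model like this, anything that silently drops or delays the heartbeats long enough takes every AP on the controller down at the same moment, which is consistent with losing all APs at once rather than a few at a time.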

     

    We've rolled our live controllers back to the previous version (with the ARM memory leak) for now and have a couple of controllers on test. The problem occurred on one of the test controllers over the weekend, while the rest of our 1900 APs were just fine on the older code.

     

    We believe there's a bug in 6.4.2.6 and we're in conversation with support about this. Currently we're being told nobody else has reported this problem, so I thought I'd ask here... Shout up if you've seen it.



  • 2.  RE: Missed heartbeat on 6.4.2.6, APs rebootstrap

    EMPLOYEE
    Posted May 19, 2015 06:54 AM

    Ultimate_Fish,

     

    How many controllers do you have serving those 1900 access points, and how are they connected to your switched network?

     



  • 3.  RE: Missed heartbeat on 6.4.2.6, APs rebootstrap

    Posted May 19, 2015 08:43 AM

    Up until recently we've had a pair of 7200 controllers with a VRRP address. As we're nearing the limit of one controller's ability to sustain our AP estate, we've added another two controllers.

     

    As of now, we're back to two live controllers, rolled back to 6.4.2.4, and the second pair is running 6.4.2.6 with a small number of APs.

     

    The controllers each have two 10GbE ports trunked to our ProCurve switches.

     

    All was working just fine on 6.4.2.4, and indeed the original pair of controllers are now fine on that code with about 1897 APs. 

     

    The APs we moved over to the test controllers running 6.4.2.6 all rebootstrapped over the weekend, so it isn't load dependent. When the problem occurs we lose all APs on the affected controller.

     

     



  • 4.  RE: Missed heartbeat on 6.4.2.6, APs rebootstrap

    Posted May 19, 2015 09:19 AM

     

    Anyway, per my post on the other thread, you may be seeing what we were seeing.

     

    This time we were told the fixed-in version would be 6.4.2.4. We're currently testing on 6.4.2.5.

    If you have TAC working on this, please tell them to review case #1552028.

     



  • 5.  RE: Missed heartbeat on 6.4.2.6, APs rebootstrap

    Posted May 19, 2015 09:24 AM

    ...and we can now confirm we are seeing this behavior on 6.4.2.5 as well.

     



  • 6.  RE: Missed heartbeat on 6.4.2.6, APs rebootstrap

    Posted May 19, 2015 09:26 AM

    OK, well that's sort of good to know... Which version were you running when you first had the problem?



  • 7.  RE: Missed heartbeat on 6.4.2.6, APs rebootstrap

    Posted May 19, 2015 09:35 AM

    Ever since 6.4.1 but not before that.

     

    Note our controllers are connected directly with nothing but a MAS between them, on the same VLAN, and often show no missed heartbeats when the problem happens.

     

    We have other reasons to be going to EA again so we'll be trying this out on 6.4.3.1 soon.

     

     



  • 8.  RE: Missed heartbeat on 6.4.2.6, APs rebootstrap

    Posted May 26, 2015 12:15 PM

    So, almost a week now with HA intercontroller heartbeats enabled on 6.4.3.1 and no mass failover events.

     

    Apart from the software upgrade (from 6.4.2.5), the only change we made was raising the port MTUs on the MAS directly connected to the controller from 9000 to 9216.