05-19-2015 03:50 AM
Wondering if anyone else is seeing this problem.
We've experienced problems after updating controllers to 184.108.40.206.
All APs on a controller will rebootstrap after a period of time, usually less than 48 hours. AP debug info shows missed heartbeats and gives the reason for the disconnect as: controller aged out
We've rolled our live controllers back to the previous version (with the ARM memory leak) for now and have a couple of controllers on test. The problem occurred on one of the test controllers over the weekend, while the rest of our 1900 APs were just fine on the older code.
We believe there's a bug in 220.127.116.11 and we're in conversation with support about this. Currently we're being told nobody else has reported this problem, so thought I'd ask here... Shout up if you've seen it.
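For anyone unfamiliar with what "aged out" means here, this is a minimal Python sketch of generic heartbeat aging: a peer is declared dead once enough consecutive heartbeats have been missed. The class name, the 1-second interval and the 8-miss threshold are hypothetical illustration values, not ArubaOS defaults or internals.

```python
from dataclasses import dataclass


@dataclass
class HeartbeatMonitor:
    """Tracks heartbeats from one peer (hypothetical model, not ArubaOS code)."""

    interval_s: float = 1.0   # assumed heartbeat interval, for illustration only
    max_missed: int = 8       # assumed miss threshold, for illustration only
    last_seen_s: float = 0.0  # time the last heartbeat arrived

    def record(self, now_s: float) -> None:
        # A heartbeat arrived: reset the ageing clock.
        self.last_seen_s = now_s

    def missed(self, now_s: float) -> int:
        # Number of whole heartbeat intervals elapsed since the last heartbeat.
        return max(0, int((now_s - self.last_seen_s) // self.interval_s))

    def aged_out(self, now_s: float) -> bool:
        # The peer is considered gone once the miss threshold is reached.
        return self.missed(now_s) >= self.max_missed


mon = HeartbeatMonitor()
mon.record(10.0)
print(mon.missed(12.5), mon.aged_out(12.5))
```

In a model like this, every AP whose heartbeats stop being answered at the same moment crosses the threshold at roughly the same time, which is consistent with losing all APs on one controller at once.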
05-19-2015 03:53 AM
How many controllers do you have serving those 1900 access points, and how are they connected to your switched network?
Aruba Customer Engineering
05-19-2015 05:42 AM
Up until recently we've had a pair of 7200 controllers with a VRRP address. As we're nearing the limit of one controller's ability to sustain our AP estate we've added another two controllers.
As of now, we're back to two live controllers, rolled back to 18.104.22.168, and the second pair is running 22.214.171.124 with a small number of APs.
The controllers each have two 10GbE ports trunked to our ProCurve switches.
All was working just fine on 126.96.36.199, and indeed the original pair of controllers are now fine on that code with about 1897 APs.
The APs we moved over to the test controllers running 188.8.131.52 all rebootstrapped over the weekend, so it isn't load dependent. When the problem occurs we lose all APs on the affected controller.
05-19-2015 06:18 AM
Anyway per my post on the other thread, you may be seeing what we were seeing.
This time we were told the fix is in 184.108.40.206. We're currently testing on 220.127.116.11.
If you have TAC working on this, please tell them to review case #1552028.
05-19-2015 06:34 AM
We've seen it ever since 6.4.1, but not before that.
Note our controllers are connected directly with nothing but a MAS between them, on the same VLAN, and often show no missed heartbeats when the problem happens.
We have other reasons to be going to EA again so we'll be trying this out on 18.104.22.168 soon.
05-26-2015 09:14 AM
So, almost a week now with HA intercontroller heartbeats enabled on 22.214.171.124 and no mass failover events.
Other than the software upgrade (from 126.96.36.199), the only other change we made was raising the port MTUs on the MAS directly connected to the controller to 9216 from 9000.