01-19-2015 07:57 AM - edited 03-18-2015 03:59 AM
We have two 7240 controllers in dual mode with an HA-Lite setup, running 220.127.116.11.
When the APs miss a few heartbeats (~10 sec) they try to fail over, but when the connection is restored the APs end up in a state where they are associated with their original master while user sessions are tunneled to the controller that was their standby, and clients are stuck in the logon role.
We have removed Backup LMS configuration and also tried with and without controller state-sync with no success.
Aruba TAC has been contacted but hasn't provided a solution yet, so I'm wondering whether anyone has had a similar problem and can suggest something we could try.
I can see some errors in:
#show ap remote debug mgmt-frames ap-name "test-AP"
Timestamp stype SA DA BSS signal Misc
--------- ----- -- -- --- ------ ----
Jan 19 14:34:32 deauth xxx yyy zzz 15 Ptk Challenge Failed (seq num 0)
#show ap remote debug sta-msg-stats ap-name "test-ap"
Current context-IP : shows the standby controller's IP
#show datapath tunnel table
shows the IP of the correct/active controller
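The mismatch between those two outputs is the symptom to look for. A minimal sketch of that comparison, assuming the two addresses are copied by hand from the command outputs above; the function name `ap_is_split` and the example addresses (taken from the documentation range) are hypothetical and not part of ArubaOS:

```python
import ipaddress


def ap_is_split(context_ip: str, tunnel_ip: str) -> bool:
    """Return True when the AP's context-IP and the datapath tunnel
    endpoint point at different controllers (the state described above)."""
    context = ipaddress.ip_address(context_ip.strip())
    tunnel = ipaddress.ip_address(tunnel_ip.strip())
    return context != tunnel


# Placeholder addresses, not real controller IPs.
print(ap_is_split("192.0.2.2", "192.0.2.1"))  # True: context on standby, tunnel on active
print(ap_is_split("192.0.2.1", "192.0.2.1"))  # False: both point at the same controller
```

This only compares values already read from the CLI; it does not query the controllers itself.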
01-19-2015 09:08 AM - edited 01-19-2015 09:09 AM
The backup LMS is unnecessary in an HA configuration.
How many controllers do you have?
Which of those controllers are part of the failover group?
What role is each controller set to? Active/dual/standby.
Do you have IP Mobility configured?
01-19-2015 11:00 AM
We are focusing on the problem between two controllers in dual mode.
Both are part of the failover group.
IP Mobility is not enabled.
Backup LMS was initially configured, but we removed it because of an HA-Lite bug listed in the latest release notes (and at TAC's suggestion). The problem persists either way.
Apart from countless user reports on a couple of occasions when a building link went down for a few seconds, we have tested with a test ap-group containing a single AP and a separate SSID, so that the client cannot roam to another AP.
Disconnecting the uplink of the switch to which the test AP is connected triggers the event.
01-27-2015 09:09 AM
This appears to be bug #105294, which is marked as fixed in AOS 18.104.22.168.
Unfortunately, the bug description in the release notes does not closely match how we were experiencing the problem.