09-23-2014 07:04 PM
According to a slide in Module 22 of http://cloud.arubanetworks.com/instant-training ("Mini-Features"), Master election is supposed to use the following criteria:
- IAP with an alternative uplink (3G/4G)
- IAP with the most capable hardware
- IAP with the longest uptime
I've got a cluster consisting of mostly IAP-105s with a couple of IAP-115s. None of the APs have alternative uplinks, and none of them are configured as a "Preferred Master".
Based on the info in this slide, I would expect that the cluster would reliably elect one of the IAP-115s, but that's not what I see happening -- when I reboot the cluster, it's electing one of the 105s.
Is this expected behavior? I'm running firmware 126.96.36.199-188.8.131.52.
09-25-2014 07:49 AM
The election would only occur if there were a tie -- no VC and two APs contending for VC at the same time. If the IAP-105 happened to boot just slightly faster and became the VC, then it stays the VC.
09-25-2014 06:33 PM
So, once I've got the cluster running (and it's selected one of the 105s as the master), I've tried a couple of different things that I would expect to make it select a new master:
- telling the AP acting as master to reboot through Maintenance -> Reboot
- removing power from the AP acting as master by turning off PoE for that switch port
In both cases, it seems to reboot the whole cluster and select a new master... which is still not one of the 115s.
In short, if (as documented in the training slides) hardware capability is supposed to be a more important factor than uptime in selecting a master, I'm not seeing it happen here. It appears that whichever AP gets to a certain point in the boot process first gets to be master, regardless of hardware spec.
Am I missing something? Is there another way to trigger a master re-election that doesn't reboot the entire cluster?
09-25-2014 06:41 PM
I first noticed this when I upgraded the cluster's firmware after adding the 115s, which of course rebooted all APs at roughly the same time. The uptime differences between the APs at that point would have been negligible (seconds at most).
09-26-2014 01:29 PM
I just dug into the spec, and the IAP-13x and RAP-155 have a higher priority than other models. All others are equal in HW priority.
As such, in your deployment the deciding factor would most likely be longest uptime. The full list is below.
- IP scope: default-ip will lose against normal ip;
- 3G modem: IAP with 3G dongle detected takes higher priority;
- AP class: AP-13x or RAP-155 take higher priority;
- Uptime: IAP with longer uptime takes higher priority;
- MAC address: IAP with bigger MAC address takes higher priority;
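To make the list above concrete, here is a rough Python sketch of how such a tie-break could be evaluated. It is only an illustration, not Aruba's implementation: it assumes the criteria are applied strictly in the order listed, and the `Candidate` class, its field names, the model-prefix check, and the MAC addresses are all made up for the example. It also only models the case where several APs contend at once; an AP that is already VC is not displaced by it.

```python
from dataclasses import dataclass

# Assumption: these prefixes stand in for the "AP-13x or RAP-155" class.
HIGH_PRIORITY_PREFIXES = ("IAP-13", "RAP-155")


@dataclass(frozen=True)
class Candidate:
    """Hypothetical view of one AP contending for master."""
    model: str
    mac: str                      # e.g. "00:0b:86:00:00:10"
    uptime_seconds: int
    has_default_ip: bool = False  # True if the AP only has a default IP
    has_3g_modem: bool = False    # True if a 3G dongle was detected


def election_key(ap: Candidate) -> tuple:
    """Build a sort key; tuples compare left to right, bigger wins."""
    return (
        not ap.has_default_ip,                        # normal IP beats default IP
        ap.has_3g_modem,                              # 3G modem takes priority
        ap.model.startswith(HIGH_PRIORITY_PREFIXES),  # AP class
        ap.uptime_seconds,                            # longer uptime
        int(ap.mac.replace(":", ""), 16),             # bigger MAC address
    )


def elect_master(candidates):
    """Return the candidate with the highest election key."""
    return max(candidates, key=election_key)


# A cluster like the one in this thread: 105s and 115s, no alternative uplinks.
cluster = [
    Candidate("IAP-105", "00:0b:86:00:00:10", uptime_seconds=125),
    Candidate("IAP-115", "00:0b:86:00:00:20", uptime_seconds=120),
    Candidate("IAP-105", "00:0b:86:00:00:30", uptime_seconds=118),
]
winner = elect_master(cluster)
print(winner.model, winner.mac)
```

With these made-up values the 105 with the longest uptime wins, because the 105 and 115 fall into the same AP class and the comparison drops through to uptime. How coarsely uptime is actually compared is not stated in this thread, so near-simultaneous boots may behave differently in practice.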
If you are seeing all IAPs reboot when the active VC fails, that is not expected behavior and you should open a case with TAC. You should not see any interruption in service, and one of the other IAPs should simply take over the VC responsibility.
09-26-2014 01:49 PM
Thanks! That answers my question. It surprises me that the 115 is not higher priority than the 105 (I can see that it's at least got a lot more free RAM), but I'm willing to accept that it's working as intended. You might want to clarify this a bit in the training slides, to avoid confusion.
Looking at the uptime, it does appear that the other APs aren't actually _rebooting_ when the master fails, but all wireless clients are being kicked off the network and have to reconnect. Again, not exactly the behavior I was expecting, but I can understand why it would work that way (resetting the virtual controller is probably a lot more robust than trying to replicate all of the ephemeral data and make that transition seamless for the clients).