03-10-2014 06:28 AM - edited 03-10-2014 08:47 AM
Assuming I've got this right, [I don't have enough spare controllers to lab this up, so maybe someone can help me out]:
How does an AP connected to a controller configured for Fast Failover choose its failover controller in the case there are multiple standbys?
-Three Local Controllers (LC1,LC2 and LC3), all three configured with a single HA Group listing all three controllers in the Dual Role.
-Three AP groups (APG1,APG2,APG3), with each group configured with a respective LMS-IP (i.e. APG1 has LMSIP of LC1, APG2 has LMSIP of LC2, etc.)
In the event of a Failure of LC1, how do the APs in APG1 choose whether to connect to LC2 or LC3?
03-10-2014 08:50 AM
When an AP boots up, it behaves as in previous versions of AOS: the AP terminates against the lms-ip. Once the AP has terminated on the primary controller, the controller provides a standby controller (as determined by the HA Group). If the AP is able to establish a standby tunnel, the process is done. If the AP is unable to establish a standby tunnel, it repeats the process with the next controller in the list.
At any given time, the AP only knows about the two controllers with which its tunnels (primary and standby) are established.
Looking forward to hearing your experiences with AP Fast Failover.
03-10-2014 08:53 AM
Yup, I get all that.
My question is - if an HA Group has more than two controllers in it - how does the actual primary controller (LMSIP) determine which of the other backup controllers the AP should then create a secondary tunnel to?
03-10-2014 09:03 AM
The standby tunnel will be determined by the ordered list of controllers defined in the HA Group. In your example, if the lms-ip is different for each AP, then each AP will establish a primary tunnel with the controller defined in the lms-ip. The AP will establish a standby tunnel to the next available controller (based on the order of the HA Group).
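A minimal Python sketch of the ordered-list selection described above. This is only a toy model, not controller code: the function name `pick_standby`, the `accepts_standby` callback, and the wrap-around from the end of the list back to the start are all assumptions made for illustration.

```python
from typing import Callable, Optional, Sequence


def pick_standby(
    ha_group: Sequence[str],
    primary: str,
    accepts_standby: Callable[[str], bool],
) -> Optional[str]:
    """Return the first controller after `primary`, in HA-group order,
    that accepts a standby tunnel (hypothetical model)."""
    start = ha_group.index(primary)
    # ASSUMPTION: the search starts after the primary and wraps around.
    for offset in range(1, len(ha_group)):
        candidate = ha_group[(start + offset) % len(ha_group)]
        if accepts_standby(candidate):
            return candidate
    return None  # no controller in the group could take the standby tunnel


ha_group = ["LC1", "LC2", "LC3"]
print(pick_standby(ha_group, "LC1", lambda c: True))         # LC2
print(pick_standby(ha_group, "LC1", lambda c: c != "LC2"))   # LC3
print(pick_standby(ha_group, "LC3", lambda c: True))         # LC1
```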
03-10-2014 10:12 AM
Ok. I guess that makes sense.
To confirm: since a controller only belongs to one HA group, it would not be possible to have APs take two deterministically different fast failover paths to separate secondary controllers in the case of a primary failure?
I understand that with the floating licenses, I'm not in license trouble; but I am concerned about the CAP max AP counts (in the case of the M3s, 512).
If I had 3 M3s each actively hosting 300 APs (for a total of 900 APs, which evenly distributed would be supportable on two M3s), I could not use Fast Failover to deterministically force half to one secondary and the other half to a different secondary in the case of a primary failure while using HA Groups/Fast Failover?
In this scenario I must use multiple VRRP pairs and/or LMSIP/BackupLMSIP?
03-13-2014 08:17 PM
This can be solved using AP fast-failover, but not in a deterministic way. Let me take your example of LC1, LC2 and LC3, all in dual role. The standby controller selection for an AP (when it connects to its LMS) is round-robin. 300 APs are connected to each of LC1, LC2 and LC3 when you enable AP fast-failover. Let's assume LC2 is selected as standby for all 300 APs connected to LC1. These APs will attempt to connect to LC2 in standby mode. Since LC2 has only 212 AP capacity left (512 - 300), 212 will connect to LC2 in standby mode. The remaining 88 will be denied a standby connection by LC2. These APs will report this back to LC1. Then, LC1 will check whether there's another standby available in that HA group. In this case, there is: LC3. LC1 will then assign LC3 as standby to those 88 APs. These APs will then attempt a standby-mode connection to LC3, which should succeed.
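The walk-through above can be reproduced with a small Python sketch. It models only LC1's 300 APs and ignores the standby tunnels that LC2's and LC3's own APs would also need; `place_standbys` and the variable names are hypothetical, and the 512 limit is the M3 figure quoted earlier in the thread.

```python
from typing import Dict, List, Tuple

PLATFORM_LIMIT = 512  # M3 AP limit quoted in this thread


def place_standbys(
    ap_count: int, candidates: List[str], free_slots: Dict[str, int]
) -> Tuple[Dict[str, int], int]:
    """Offer standby tunnels to each candidate in turn; APs that are denied
    fall through to the next candidate (toy model of the described retry)."""
    placed: Dict[str, int] = {}
    remaining = ap_count
    for ctrl in candidates:
        accepted = min(remaining, free_slots[ctrl])
        if accepted:
            placed[ctrl] = accepted
        remaining -= accepted
        if remaining == 0:
            break
    return placed, remaining


active = {"LC1": 300, "LC2": 300, "LC3": 300}
free = {ctrl: PLATFORM_LIMIT - count for ctrl, count in active.items()}

# LC1's 300 APs are first pointed at LC2, then at LC3 if LC2 denies them.
placed, unplaced = place_standbys(300, ["LC2", "LC3"], free)
print(placed, unplaced)  # {'LC2': 212, 'LC3': 88} 0
```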
Having said that, your requirement is not just that. You need all 900 APs to have a standby connection. Since with AP fast-failover each AP establishes a standby-mode connection to its standby controller, you effectively need a platform capacity of 900 (active) + 900 (standby) APs = 1800 APs. Between the 3 M3s, you have a platform capacity of 3 x 512 = 1536 APs, not 1800. Hence, you cannot have a standby connection for all 900 APs.
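The same arithmetic as a quick Python check, using only the figures from this thread (variable names are illustrative):

```python
controllers = 3
platform_limit = 512   # APs per M3
active_aps = 900

needed = active_aps * 2                   # one active + one standby connection per AP
available = controllers * platform_limit  # combined platform capacity
shortfall = needed - available            # APs left without a standby connection

print(needed, available, shortfall)  # 1800 1536 264
```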
In this case, you'd need to use one of the legacy failover mechanisms (VRRP or LMS/Backup-LMS) to achieve this.
Support for your requirement is available in the AOS 6.4 release through the standby capacity extension feature for AP fast-failover, which allows (with certain constraints) standby-mode connections to go beyond the controller's rated platform AP capacity.
03-14-2014 12:56 PM
Ok, thank you for the thoughtful and detailed response. This is great stuff!
If I am to understand you correctly, I believe you say (relative to v6.3):
- Each AP will create a connection to a primary local controller and *only one* standby controller, regardless of the number of controllers in a fast failover group
- The primary local controller will *round robin/nondeterministically* choose a standby controller for an AP if it is configured with more than two controllers in its fast-failover group at the time the AP establishes its session with the primary local controller
- While license count will not be decremented for a standby AP session, CAP capacity *is* impacted. That is to say, a controller with a 512 CAP limit like the M3 could handle 512 Primary AP connections OR 512 Standby Connections OR 300 Active and 212 Standby Connections but NOT 300 Active connections and 300 Standby connections (88 standby connections would thusly be denied)
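The capacity rule in the last bullet can be written as a small Python check. This is a sketch under the assumption that active and standby connections share one pool of 512 on a 6.3 M3; `standby_denied` is a made-up helper name.

```python
def standby_denied(active: int, standby_requests: int, platform_limit: int = 512) -> int:
    """Number of standby requests refused when active and standby
    connections share a single platform limit (assumed 6.3 behaviour)."""
    free = max(platform_limit - active, 0)
    return max(standby_requests - free, 0)


print(standby_denied(512, 0))    # 0
print(standby_denied(0, 512))    # 0
print(standby_denied(300, 212))  # 0
print(standby_denied(300, 300))  # 88
```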
Do I have that right?
My followup questions:
- In v6.4 you mention standby-mode connection capacity to exceed the rated controller CAP limit. Will this be limited to specific hardware like the 7200 or newer platforms?
- In v6.4, are deterministic pathways supported? For example, on a college campus, I would typically terminate all APs for a building to a specific controller. With non-deterministic fast-failover, I could end up with a building whose APs are terminated across different local controllers.
03-14-2014 05:10 PM
>> "Each AP will create a connection to a primary local controller and *only one* standby controller, regardless of the number of
>> controllers in a fast failover group"
[mw] Yes, that is correct
>> While license count will not be decremented for a standby AP session; CAP capacity *is* impacted. That is to say a controller
>> with a 512 CAP limit like the M3 could handle 512 Primary AP connections OR 512 Standby Connections OR 300 Active and 212
>> Standby Connections but NOT 300 Active connections and 300 Standby connections (88 standby connections would thusly be
>> denied)
[mw] This is correct
>> In v6.4 you mention standby-mode connection capacity to exceed the rated controller CAP limit. Will this be limited to specific
>> hardware like the 7200 or newer platforms?
[mw] Yes, there is some dependency on the hardware platform. For 7200 series controllers, it will allow 4x the platform limit for standby tunnels. The controller will only allow active tunnels up to the platform limit. For M3/3600, the ratio is 2x.
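A rough Python sketch of how those ratios change the earlier M3 example. The ratios are the ones quoted in this reply; treating active and standby tunnels as two separate budgets is an assumption based on the wording above, and `STANDBY_RATIO`, `standby_limit` and `fits` are hypothetical names.

```python
# Ratios quoted in this thread for AOS 6.4; other platforms are not covered here.
STANDBY_RATIO = {"7200": 4, "M3": 2, "3600": 2}


def standby_limit(series: str, platform_limit: int) -> int:
    """Maximum standby tunnels for a controller series under the quoted ratio."""
    return STANDBY_RATIO[series] * platform_limit


def fits(series: str, platform_limit: int, active: int, standby: int) -> bool:
    # ASSUMPTION: active tunnels are capped at the platform limit and standby
    # tunnels are capped separately at ratio * platform limit.
    return active <= platform_limit and standby <= standby_limit(series, platform_limit)


print(standby_limit("M3", 512))   # 1024
# Worst case for one M3 in the 3 x 300 AP example: both peers pick it as standby.
print(fits("M3", 512, 300, 600))  # True
print(fits("M3", 512, 600, 0))    # False
```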
>> In v6.4 is support for deterministic pathways supported? For example, on a college campus, I would typically terminate all APs
>> for a building to a specific controller. In a non-deterministic fast-failover; I could have a building where APs were now terminated
>> across different local controllers.
[mw] You can build predictability by making sure that the controllers have enough capacity, so that APs build active and standby tunnels to the same set of controllers.
12-17-2015 07:27 AM
Just circling back on this topic. In a scenario where you have (3) locals...let's say 7240s (Local-1, Local-2, and Local-3). What I would like to do is to specify the controller on which the AP will terminate. I would do this by specifying the LMS IP. For AP redundancy, I will create (1) HA Group and include the (3) locals configured as dual.
I didn't mention it, but I have 2100 APs. I would like to terminate 700 APs on each of the (3) locals and specify a specific local for their standby GRE tunnels. Is this possible? I could potentially configure (2) locals in an active role and (1) as standby (which I do not want to do).
If I understand it correctly, the standby tunnel will be established in a round-robin fashion, so the load will be split...but I cannot specify the local to which the standby GRE tunnel will be established.
If that is the case, this link is rather misleading:
This makes me assume that I can specify the local to which the standby tunnels will be established. Unless this means to specify the BKLMS (which I will do as well in the event of an AP reboot).
Am I missing something?
12-17-2015 07:42 AM