Courtesy of: @Plane, @amishshah, @Shiv, @mmoussa
The Live Upgrade feature is available only with Mobility Controllers running in a cluster managed by a Mobility Master running AOS 8.1 or higher. Clusters can be upgraded from AOS 8.1 to higher ArubaOS versions.
To fully benefit from Live Upgrade, with minimal RF impact and client disruption, the following AOS 8.1 features should be in place:
- Stateful failover achieved through an L2-Connected state cluster with redundancy enabled (Ref - Controller Clustering chapter in AOS 8.1 user guide)
- Centralized Image Upgrade
- Airmatch (schedule enabled)
- Aruba best practices applied to AP deployment and RF coverage
How does it work?
Let us take an example of a 3-node cluster A (MC1, MC2 and MC3) and go through the steps taken to upgrade this cluster from AOS 8.1 to a higher ArubaOS version (the target version), assuming that the Mobility Master (MM) has already been upgraded to the target version.
The Live Upgrade is launched from the MM GUI through the ‘Upgrade Cluster’ Configuration Task.
The Mobility Master (MM) leverages Airmatch and the Cluster Upgrade Manager to partition the APs into logical groups (partitions) based on their base channels, and assigns a Target controller to each partition, leaving one controller unassigned by design. The Target controller becomes the AP Anchor Controller (AAC) that the AP connects to after the code upgrade. Hence, in the example diagram, partition 1 (AP1, AP4) and partition 2 (AP5) have MC2 as their target controller, while partitions 3 and 4 have MC3 as their target.
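The partitioning step described above can be illustrated with a minimal Python sketch. It is not the Cluster Upgrade Manager's actual algorithm, which this post does not detail: the function name `plan_partitions`, the channel numbers, and the rule of handing contiguous blocks of partitions to each target controller are all assumptions chosen so that the result matches the example (partitions 1 and 2 on MC2, partitions 3 and 4 on MC3, MC1 left without a target).

```python
from collections import defaultdict


def plan_partitions(ap_channels, controllers):
    """Group APs by base channel and give each group a target controller.

    ap_channels: dict of AP name -> base channel (hypothetical values).
    controllers: list of controller names. The first one is deliberately
    left without any partition, mirroring the target-less controller.
    """
    if len(controllers) < 2:
        raise ValueError("need at least two controllers in the cluster")

    # One partition per base channel.
    by_channel = defaultdict(list)
    for ap, channel in sorted(ap_channels.items()):
        by_channel[channel].append(ap)

    targets = controllers[1:]  # controllers[0] stays target-less
    channels = sorted(by_channel)
    plan = []
    for index, channel in enumerate(channels):
        # Assumption: contiguous blocks of partitions share a target.
        target = targets[index * len(targets) // len(channels)]
        plan.append({
            "partition": index + 1,
            "channel": channel,
            "aps": by_channel[channel],
            "target": target,
        })
    return plan


# Hypothetical channel plan for the seven APs in the example.
example_channels = {
    "AP1": 36, "AP4": 36,
    "AP5": 40,
    "AP2": 44, "AP6": 44,
    "AP3": 48, "AP7": 48,
}
for entry in plan_partitions(example_channels, ["MC1", "MC2", "MC3"]):
    print(entry)
```

With these assumed channels, the sketch yields four partitions and never assigns MC1 as a target.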
Then the MM uses the Centralized Image Upgrade feature to push the target AOS image to each cluster member, one mobility controller at a time. Once the image copy is successful on all controllers, one target controller (MC3) is chosen and rebooted. Upon successful reboot, MC3 comes up on the target AOS version and forms a cluster by itself, since it runs a different code version than MC1 and MC2. The APs and clients that were on MC3 fail over seamlessly to their standby controllers, thanks to the APs' pre-existing standby tunnels and client state duplication.
Once the Mobility Master gets confirmation that the upgraded MC3 controller is UP, it initiates the pre-load of the target firmware to all the APs that have MC3 as their target, one partition at a time.
After each partition's image pre-load, the APs in that partition are rebooted into the new firmware and connect to their upgraded target controller. In our example, AP2 and AP6 get pre-loaded first and are then rebooted to come up on controller MC3 as their AAC. Next, AP3 and AP7 are pre-loaded and rebooted to come back up and terminate their tunnels on MC3.
Such action should have minimal impact on associated clients since only a single channel becomes unavailable and clients would naturally roam to a nearby AP.
This sequence of controller reboot, then AP image pre-load and AP reboot on a per-partition basis, is repeated for controller MC2 and partitions 1 and 2. This moves the rest of the APs to the upgraded controller MC2, which rejoins the upgraded cluster.
Only the cluster member MC1, which was intentionally left target-less, remains. A reboot of this controller brings it to the target AOS version and allows it to re-join the upgraded cluster.
Once the entire Live Upgrade process is complete and all cluster controllers and their APs are upgraded to the target AOS version, the cluster leader load balances APs and/or clients in a stateful manner as needed.
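To summarize the ordering walked through above, here is a small Python sketch that lists the steps for a given cluster. It only models the sequence described in this post. The function name `upgrade_steps` and the step wording are made up for illustration and do not correspond to any ArubaOS command or log output.

```python
def upgrade_steps(target_order, untargeted, partitions):
    """Return the Live Upgrade steps in the order described in this post.

    target_order: target controllers in the order they are rebooted.
    untargeted:   the controller left without any partition.
    partitions:   list of (partition_id, target_controller) tuples.
    """
    # The image is first copied to every cluster member, one at a time.
    steps = [f"copy image to {mc}" for mc in [*target_order, untargeted]]

    for mc in target_order:
        steps.append(f"reboot controller {mc}")
        # APs follow their target controller, one partition at a time.
        for partition_id, target in partitions:
            if target == mc:
                steps.append(f"preload APs in partition {partition_id}")
                steps.append(f"reboot APs in partition {partition_id} onto {mc}")

    # The target-less controller goes last, when no AP depends on it.
    steps.append(f"reboot controller {untargeted}")
    steps.append("rebalance APs and clients")
    return steps


# The 3-node example: MC3 first, then MC2, with MC1 left target-less.
example = upgrade_steps(
    ["MC3", "MC2"], "MC1",
    [(1, "MC2"), (2, "MC2"), (3, "MC3"), (4, "MC3")],
)
for number, step in enumerate(example, start=1):
    print(number, step)
```

The point of the ordering is that a controller is only rebooted when no AP depends on it as its sole upgraded anchor, which is why MC1 is handled last.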
Q1. Could Live Upgrade be used to upgrade clusters running 8.0.1?
A1. No. Live Upgrade is an AOS 8.1 feature. Therefore, it can only be used to upgrade clusters running AOS 8.1 or higher.
Q2. How can I check Live Upgrade progress?
A2. The progress of Live Upgrade can be followed through the Mobility Master GUI ‘Show upgrade status’ Configuration Task or from the CLI using the command:
‘show lc-cluster <CLUSTER_NAME> upgrade status verbose’
Q3. Do I need to enable Airmatch for Live Upgrade to work?
A3. Yes, Airmatch with its schedule enabled is required for Live Upgrade to work. Airmatch interacts with the Cluster Upgrade Manager to build the AP partitions on a per-channel basis.
Q4. Is AP preload required?
A4. Yes, it is part of the In-Service Live Upgrade process. AP preload consists of pre-loading the APs with the new firmware while the AP is still operational. Its purpose is to accelerate the AP upgrade process.
Q5. What about AP deployment and Aruba RF best practices, are they part of the Live Upgrade process?
A5. Aruba best practices are not part of the Live Upgrade process itself. However, they are strongly recommended to minimize RF disruptions during access point reboots.
Q6. What is the basis used for the AP partitioning?
A6. APs are placed in a partition based on their a-band channel, so that an AP partition contains APs on the same base channel irrespective of the channel bandwidth. This design minimizes RF disruptions when the access points in a partition are rebooted to upgrade.
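As a rough illustration of "same base channel irrespective of the bandwidth", the Python sketch below reduces a channel string to its base channel number. It assumes channels are written as a number followed by an optional bandwidth suffix (for example `36`, `36+` or `36E`). That notation and the helper name `base_channel` are assumptions made for this example, not something taken from the Cluster Upgrade Manager.

```python
import re


def base_channel(channel):
    """Return the leading channel number, ignoring any bandwidth suffix."""
    match = re.match(r"\d+", channel.strip())
    if match is None:
        raise ValueError(f"no channel number in {channel!r}")
    return int(match.group())


# Hypothetical APs: the same base channel at different widths
# lands in the same partition key.
for ap, channel in [("AP1", "36"), ("AP4", "36E"), ("AP5", "40+")]:
    print(ap, channel, "->", base_channel(channel))
```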
Q7. Is Client-Match being used to move WiFi clients over prior to AP reboots?
A7. No. Client-Match is not utilized. Since no adjacent APs should be on the same channel, the reboot of an AP looks like an ordinary roaming event to the client: the client WiFi driver naturally roams to an adjacent AP on a different channel. That is why overlapping AP coverage is important to minimize RF disruptions.
Q8. Why does Live Upgrade keep the last upgraded controller with no target assignment?
A8. Leaving one controller in the cluster target-less is intentional. The purpose is to ensure that no access points are still attached to that last controller when it is rebooted. If any APs were still attached to the last controller when it is rebooted for its upgrade, those APs would incur a long outage until their target came back up. In return, the rest of the cluster must be able to sustain the total load while the last controller is rebooting.
Q9. Can I still use Live Upgrade with a Cluster of two controllers?
A9. Yes, as long as a single controller is able to handle the full load of APs and clients.
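The load condition in the answers above can be checked with a back-of-the-envelope calculation. The Python sketch below tests whether the remaining controllers can host every AP when any single controller is down. The function name and the AP counts and capacities are hypothetical, and a real sizing exercise would also need to cover clients and platform limits.

```python
def survives_single_reboot(ap_load, ap_capacity):
    """Check that all APs still fit when any one controller is rebooting.

    ap_load:     dict of controller -> APs it currently hosts.
    ap_capacity: dict of controller -> maximum APs it can host.
    """
    total_aps = sum(ap_load.values())
    for rebooting in ap_capacity:
        remaining = sum(
            capacity
            for controller, capacity in ap_capacity.items()
            if controller != rebooting
        )
        if total_aps > remaining:
            return False
    return True


# Hypothetical two-controller cluster: 200 APs fit on one 256-AP controller.
print(survives_single_reboot({"MC1": 100, "MC2": 100}, {"MC1": 256, "MC2": 256}))
# Hypothetical overloaded cluster: 300 APs do not fit on one 256-AP controller.
print(survives_single_reboot({"MC1": 200, "MC2": 100}, {"MC1": 256, "MC2": 256}))
```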
Q10. What happens if I don’t have overlapping coverage on my entire campus?
A10. Areas where only one AP can be heard will see a brief outage as the AP reboots to come up on the upgraded controller. This outage should only last a minute or two, but it is something to keep in mind for mission-critical networks. Networks designed to Aruba’s recommended RF density should not have any locations that only hear a single AP.
Q11. Is there a need to manually download new firmware on controllers before starting the Live Upgrade?
A11. No. There is no need to do that, as long as an upgrade profile is configured for the controllers to pull the new firmware when instructed by the Cluster Upgrade Manager.