Hi @parnassus,
Happy to help out. I am always looking for real-world examples to understand the possible impact on production networks as well.
OK, this network is as follows:
1. Core pair of 5412Rzl2 chassis switches with a single MM in each. They are interconnected with 2 x 10Gbps VSF links (about 2km apart as it is a big site).
2. The switch being upgraded was a single 5406Rzl2 with dual MMs in another building on site. It is connected with a 2 x 10Gbps LAG trunk to BOTH of the core switches above via long-haul SM fibre.
3. The core VSF runs L3 routing and switching.
4. The upgraded switch purely runs L2 VLANs back to the core switches.
My workstation was on a subnet on the other side of the core switches, so traffic was routed through the core to reach the VLAN of the device I was pinging, which was connected to a port on the switch being upgraded. Admittedly, the device I was pinging was a printer, so possibly its NIC isn't the fastest to recover, but I doubt that would be significant.
I can 100% confirm that "nonstop switching" was configured on the upgraded switch. It was upgraded from KB.16.07.003 to KB.16.10.0012.
PS: Firmware 16.10.0012 looks stable so far. It is also running on a number of 2930F VSF stacks at the same site. I have yet to try it on a switch running routing, though.
------------------------------
Aaron Wheeler
------------------------------
Original Message:
Sent: Mar 10, 2021 11:08 AM
From: Davide Poletto
Subject: 5406Rzl2 upgrades - minimal downtime
Hello @Azz, your feedback is really appreciated! Thanks!
Since 15 seconds seems a long time to me, I have just one more question. You wrote the ping was done "... from another subnet across another 5400 VSF stack which is running as a L3 core with dual 1Gbps fibre uplinks to the 5400 being upgraded". That immediately made me suspect that your upgraded 5400 wasn't actually performing routing for/between the hosts involved in the ping test you did. Am I wrong? Is the 5400 connected as a Layer 2 extension (via a single or an aggregated uplink) to the VSF acting as Layer 3, or is it routed to the VSF core?
My aim is to understand what downtime to expect (in seconds) on an Aruba 5400R zl2 configured with "NonStop Switching" redundancy mode and acting as an IP router for directly connected subnets when it is upgraded with the technique described in this thread (Standby MM first, then the ex-Active MM after a redundancy switchover).
If your Aruba 5400R zl2 configured with "NonStop Switching" redundancy wasn't performing IP routing (my assumption could be wrong) but was acting as a pure Layer 2 switch, I would have expected no packet loss at all (no loss for Layer 2 switched traffic, with disruption only at Layer 3 for routed traffic traversing the 5400).
How is the 5400 uplinked to your VSF? LACP I suppose...
Kind regards, Davide.
------------------------------
Davide Poletto
Original Message:
Sent: Mar 09, 2021 04:57 PM
From: Aaron Wheeler
Subject: 5406Rzl2 upgrades - minimal downtime
Hi Davide,
The ping was from another subnet across another 5400 VSF stack which is running as a L3 core with dual 1Gbps fibre uplinks to the 5400 being upgraded. The ping was just a standard Windows ping command, which I think has a default 2 sec delay between pings, so 7 lost pings means a drop of around 10-15 secs for me.
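As a rough sanity check on that estimate, the outage window implied by a run of lost pings can be bounded from the loss count and the send interval. The following is a minimal Python sketch, not a measurement tool: the function name `outage_bounds` is hypothetical, and the 2 second interval is the figure assumed in the post above rather than a verified default.

```python
def outage_bounds(lost_pings: int, interval_s: float) -> tuple[float, float]:
    """Bound the outage implied by a run of consecutive lost pings.

    Assumes probes are sent at a fixed interval (an assumption, see above).
    The outage must span every lost probe (lower bound) and must fit between
    the last reply before the run and the first reply after it (upper bound).
    """
    if lost_pings < 1:
        return (0.0, 0.0)
    lower = (lost_pings - 1) * interval_s
    upper = (lost_pings + 1) * interval_s
    return (lower, upper)


if __name__ == "__main__":
    low, high = outage_bounds(lost_pings=7, interval_s=2.0)
    print(f"Outage between {low:.0f} and {high:.0f} seconds")
```

With 7 lost pings at a 2 second interval this gives a window of 12 to 16 seconds, which is roughly in line with the 10-15 secs quoted. A shorter probe interval would narrow the bounds.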
I have two more 5400 VSF stacks to upgrade on the site when I can get an operational outage (24x7 operations). I will post my results of the VSF "Fast Software Upgrade" impact when I get it done. It could be a few weeks before I get the opportunity, though.
------------------------------
Aaron Wheeler
Original Message:
Sent: Mar 09, 2021 03:52 PM
From: Davide Poletto
Subject: 5406Rzl2 upgrades - minimal downtime
Hi @Azz, really interesting! Just one question: when you wrote "The upgrade was fully successful and I only lost 7 pings to a device on that switch after issuing the "redundancy switchover" command to reboot the commander management module. My switch was in "non-stop switching" state prior to starting the upgrade." did you mean that the ping test was done between two hosts that were members of different routed subnets on the Aruba 5400R zl2? That would help me understand whether the impact you saw relates to the "routing" rather than the "switching" capabilities.
------------------------------
Davide Poletto
Original Message:
Sent: Mar 08, 2021 03:46 PM
From: Aaron Wheeler
Subject: 5406Rzl2 upgrades - minimal downtime
I just performed these steps on a production switch:
#copy tftp flash <TFTP SERVER IP> KB_16_10_0012.swi primary
#boot set-default flash primary
#write memory
#boot standby
#show redundancy (wait for sync)
#redundancy switchover
The upgrade was fully successful and I only lost 7 pings to a device on that switch after issuing the "redundancy switchover" command to reboot the commander management module. My switch was in "non-stop switching" state prior to starting the upgrade.
From what I read, the redundancy switchover forces the modules to restart to load the new software image, which causes the short outage. A redundancy switchover without a software upgrade would cause no outage.
NOTE: My lightweight Aruba APs connected to the switch had to reboot so there was a longer outage for wireless connected devices.
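For anyone repeating this on several switches, the sequence above could be generated per switch, for example to paste into a change plan. This is only a sketch: `build_upgrade_steps` and its parameters are hypothetical names, the commands are copied from the steps in this post, the `secondary` flash option is an assumption beyond what was tested here, and nothing in it connects to or runs commands on a switch.

```python
def build_upgrade_steps(tftp_server: str, image_file: str,
                        flash: str = "primary") -> list[str]:
    """Return the CLI steps from this post, in order, for one switch.

    The post only used the primary flash; accepting "secondary" here is an
    assumption, not something verified in this thread.
    """
    if flash not in ("primary", "secondary"):
        raise ValueError(f"unexpected flash location: {flash!r}")
    return [
        f"copy tftp flash {tftp_server} {image_file} {flash}",
        f"boot set-default flash {flash}",
        "write memory",
        "boot standby",
        "show redundancy",  # repeat until the standby MM has synced
        "redundancy switchover",
    ]


if __name__ == "__main__":
    # 192.0.2.10 is a placeholder documentation address, not a real server.
    for step in build_upgrade_steps("192.0.2.10", "KB_16_10_0012.swi"):
        print(step)
```

The wait for sync after `show redundancy` is deliberately left as a manual check, since the switchover should not be issued until the standby module is ready.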
------------------------------
Aaron Wheeler
Original Message:
Sent: Jul 24, 2019 03:37 PM
From: Michael Naylor
Subject: 5406Rzl2 upgrades - minimal downtime
Hello airheads,
Our core switches are 5406Rzl2 each with dual management modules. I was under the impression that I could upgrade the code (using some mix of redundancy and other features) with minimal downtime. Around the Internet, I've seen various ramblings of things tried but never any definite yes or no. Is there a way to accomplish this?
#5400