Alright, I believe I solved it. A spanning-tree issue after all, if I understand this correctly.
Configuration with bad networking symptoms:
- MSTP instance 1 with mapped Vlans 1,20-30
- VLAN 1 linked to MX router from Switch#A
- VLAN 20-30 linked to MX router from Switch#B (root switch for mstp instance 1)
The ICMP-to-SwitchInterface-loss stopped as soon as I added VLAN 1 to the same link as VLANs 20-30, uplinked from Switch#B
Preliminary conclusion:
- Though there was no actual loop in any VLan, there was an awkward topology for MSTP instance 1
- Since Switch#B is the root switch for instance 1, all VLANS mapped to that instance should have been uplinked from that Switch, including VLAN 1
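The conclusion above can be written as a small consistency check. This is only a sketch: the function name and the way the uplink VLANs are listed are hypothetical, and the VLAN sets are taken from the description in this post rather than read from a switch.

```python
def missing_on_root_uplink(instance_vlans, root_uplink_vlans):
    """Return the instance VLANs that the root switch's router uplink does not carry."""
    return sorted(set(instance_vlans) - set(root_uplink_vlans))

# MSTP instance 1 as described above: VLAN 1 plus VLANs 20-30.
instance_1 = {1} | set(range(20, 31))

# Router uplink on Switch#B (root for instance 1), before and after the change.
uplink_before = set(range(20, 31))
uplink_after = {1} | set(range(20, 31))

print(missing_on_root_uplink(instance_1, uplink_before))  # [1]
print(missing_on_root_uplink(instance_1, uplink_after))   # []
```

An empty result means every VLAN mapped to the instance leaves through the root switch's uplink, which is the layout that stopped the loss here.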
Still pondering this:
- If the issue was with VLAN 1, why are the symptoms only evident when pinging the switch interfaces in VLAN 22?
- Why were there no Spanning-Tree errors or warnings? Because there was no real loop, but just an awkward setup?
- I have a similar setup at another site, Router uplinks of VLANS in the same MSTP instance from multiple switches. It seems to work there without issues. (Although I see that it should be changed)
@parnassus @thomasbnc thank you so much for your input and patience. I would appreciate any comments on my conclusion before it is closed.
------------------------------
Ronald Ratzlaff
------------------------------
Original Message:
Sent: May 25, 2022 11:26 AM
From: Ronald Ratzlaff
Subject: Inconsistent ping/access to Aruba Switch management interfaces
Excellent question. I configured MSTP partially because the tutorial I followed did so, but also because I was trying to think ahead to a scenario in which it would be beneficial to have a different topology for some VLANS.
The stacked switch is configured as root as per request from my supervisor. Eventually it may take over the inter-VLAN routing role from the Meraki MX.
Does this answer your question somewhat?
I'm considering disabling redundant link ports and dismantling all LACP links to see if a simpler setup does away with the problem. Is that something you would consider worthwhile?
------------------------------
Ronald Ratzlaff
Original Message:
Sent: May 25, 2022 05:42 AM
From: Davide Poletto
Subject: Inconsistent ping/access to Aruba Switch management interfaces
I'm not totally sure how to evaluate the Spanning Tree configuration (with MSTP and just one instance):
spanning-tree config-name "hallert"
spanning-tree config-revision 2
spanning-tree instance 1 vlan 1 20-30
spanning-tree instance 1 priority 0
of backplane stacked Aruba 2920 switches on the top and:
spanning-tree config-name "hallert"
spanning-tree config-revision 2
spanning-tree instance 1 vlan 1 20-30
of Serverrack switch on the bottom, given the topology you posted.
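For two switches to sit in the same MST region, the config name, the revision, and the VLAN-to-instance mapping all have to match; the instance priority is not part of that identity. A rough way to compare the two snippets above offline is sketched below. The parser is hypothetical and only understands the four line shapes quoted in this post, not the full switch configuration syntax.

```python
def parse_vlan_list(tokens):
    """Expand tokens such as ['1', '20-30'] into a set of VLAN IDs."""
    vlans = set()
    for tok in tokens:
        if "-" in tok:
            lo, hi = tok.split("-")
            vlans.update(range(int(lo), int(hi) + 1))
        else:
            vlans.add(int(tok))
    return vlans


def parse_mst_region(config_text):
    """Collect only the settings that define MST region membership."""
    region = {"name": None, "revision": None, "instances": {}}
    for line in config_text.splitlines():
        parts = line.split()
        if len(parts) < 3 or parts[0] != "spanning-tree":
            continue
        if parts[1] == "config-name":
            region["name"] = parts[2].strip('"')
        elif parts[1] == "config-revision":
            region["revision"] = int(parts[2])
        elif parts[1] == "instance" and len(parts) >= 5 and parts[3] == "vlan":
            region["instances"][int(parts[2])] = parse_vlan_list(parts[4:])
        # Other lines (for example "instance 1 priority 0") are skipped on purpose.
    return region


stacked = """spanning-tree config-name "hallert"
spanning-tree config-revision 2
spanning-tree instance 1 vlan 1 20-30
spanning-tree instance 1 priority 0"""

serverrack = """spanning-tree config-name "hallert"
spanning-tree config-revision 2
spanning-tree instance 1 vlan 1 20-30"""

print(parse_mst_region(stacked) == parse_mst_region(serverrack))  # True
```

With the two configs quoted here the region settings compare equal, so a region mismatch does not look like the explanation; the only difference is the root priority on the stack.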
------------------------------
Davide Poletto
Original Message:
Sent: May 24, 2022 03:10 PM
From: Ronald Ratzlaff
Subject: Inconsistent ping/access to Aruba Switch management interfaces
Adding some explanation: port 2/31, which has the BPDU filter enabled, is an uplink to a different site via Ubiquiti AirFiber. I am filtering BPDUs on both ends of that link to prevent the two topologies from interfering with one another. We've had issues with that in the past when the AirFiber would get flaky, which caused spanning tree to re-adjust the topology constantly. Since then I've implemented the filters and specific MST instances for each site. Just clarifying what that's for.
------------------------------
Ronald Ratzlaff
Original Message:
Sent: May 24, 2022 03:00 PM
From: Ronald Ratzlaff
Subject: Inconsistent ping/access to Aruba Switch management interfaces
Yeah, not surprised you spot some questionable items. Work in progress.
In response to your specific observation:
- I have (now again) tried making VLan 22 a standard VLan. It does not change the symptoms.
- I have switches in management VLans routed through MX devices on 2 more sites. No issues there, only on this site.
Here is a screenshot of the ping process: (from host in VLan 28)
To my untrained eye it would seem like spanning tree is re-routing the traffic every 10-30 seconds, but I find no evidence of spanning tree changing the topology.
------------------------------
Ronald Ratzlaff
Original Message:
Sent: May 24, 2022 02:31 PM
From: Davide Poletto
Subject: Inconsistent ping/access to Aruba Switch management interfaces
To me, that "management-vlan 22" in both configuration files (apart from other more or less questionable items) looks strange: that way VLAN 22 becomes non-routable.
Original Message:
Sent: 5/24/2022 1:46:00 PM
From: ronald.ratzlaff@vanbelle.com
Subject: RE: Inconsistent ping/access to Aruba Switch management interfaces
And here is a network map. The first config I shared is of the 'Stacked Switch' and the second is of "Serverrack-Bottom"
All switches have GVRP enabled. However, VLAN 22 is manually configured on some uplink/downlink ports
- Serverrack bottom: tagged only on TRK2 -> link to Stacked switch
- Stacked switch: GVRP tagged on Trk1-3 (LACP links), manually tagged on 1/46 (to MX) and all other downlink ports
Spanning-tree is enabled and configured as per the shared configs.
------------------------------
Ronald Ratzlaff
Original Message:
Sent: May 24, 2022 07:02 AM
From: Davide Poletto
Subject: Inconsistent ping/access to Aruba Switch management interfaces
Also...a basic network topology would be of help (with the VLAN membership status of all involved uplink/downlink ports too)...since the whole network path involves more than just two switches.
------------------------------
Davide Poletto
Original Message:
Sent: May 24, 2022 01:24 AM
From: Thomas Siegenthaler
Subject: Inconsistent ping/access to Aruba Switch management interfaces
Hi Ronald
would it be possible to share the entire config of one of the switches in question?
And can you please let us know which software version you are running?
Best regards,
Thomas
------------------------------
Thomas Siegenthaler
Original Message:
Sent: May 23, 2022 07:16 PM
From: Ronald Ratzlaff
Subject: Inconsistent ping/access to Aruba Switch management interfaces
Thanks for the clarifying question. Here is another way of explaining the phenomenon:
(Computer#1 in VLAN 28)-------------ICMP-------------(Computer#2 in VLAN 22) no problem, 100% success
This holds true regardless of having one or multiple switches between computers.
(Computer#1 in VLAN 28)-------------ICMP-------------(most switch interfaces in VLAN 22) some time outs, 80% success
This is the case from any computer in VLAN 28, connected to any of the switches with native VLAN 28 on the port to the computer.
Any other hosts in VLAN 22 respond to ping request from VLAN 28 perfectly. It is only a group of switches (5 out of 7 on this site) which act up. All switch interfaces in VLAN 22 have the same Subnet, Mask, and Gateway configured as Computer#2. But every half minute or so they won't respond to ICMP for about 5 seconds. This is very frustrating in terms of configuring the switches - SSH sessions get reset every few moments. Production and office traffic in other VLANS is not affected.
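To put numbers on a pattern like "every half minute, about 5 seconds of silence", a ping log can be reduced to a loss percentage plus the length of each outage burst. The sketch below assumes one reply flag per second; the sample data is made up to mimic the description above and is not taken from an actual capture.

```python
from itertools import groupby


def summarize(replies):
    """Return (loss percentage, list of consecutive-timeout run lengths)."""
    lost = replies.count(False)
    loss_pct = 100.0 * lost / len(replies)
    bursts = [len(list(run)) for ok, run in groupby(replies) if not ok]
    return loss_pct, bursts


# Hypothetical two-minute sample: 25 replies, then 5 timeouts, repeated.
sample = ([True] * 25 + [False] * 5) * 4

loss_pct, bursts = summarize(sample)
print(f"{loss_pct:.1f}% loss, outage bursts: {bursts}")
# 16.7% loss, outage bursts: [5, 5, 5, 5]
```

Evenly sized, evenly spaced bursts point at something periodic (a timer, a relearn cycle) rather than random packet loss, which is useful to know when comparing the affected and unaffected switches.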
I hope this explains the issue a bit better. Please keep asking questions if I still don't make sense.
------------------------------
Ronald Ratzlaff
Original Message:
Sent: May 23, 2022 06:10 PM
From: Davide Poletto
Subject: Inconsistent ping/access to Aruba Switch management interfaces
Hi Ronald,
You wrote "The issue seems to only appear with the switch interfaces in vlan 22." and that's the part I don't understand.
If Host 1 <-- ICMP --> Host 2 traffic is correctly routed in both directions (1->2 and 2->1) and the RTT is consistent without any packet loss, where exactly is the issue you reported?
Switch 1 is acting as a Layer 2 extension of the Meraki MX router's LAN port (tagged with VLAN x and y). For its management traffic, Switch 1 should therefore have an IP address on one of the two VLANs (say VLAN x) and a default gateway on that very VLAN (the DG should be the Meraki LAN port IP on VLAN x at this point). You could also set an IP address on the other VLAN y, but without routing enabled (I believe the routing feature should be disabled, and stay disabled, on both Switch 1 and Switch 2 where they support it), and I believe this would not be useful at all from the Switch 1 perspective.
------------------------------
Davide Poletto
Original Message:
Sent: May 23, 2022 01:14 PM
From: Ronald Ratzlaff
Subject: Inconsistent ping/access to Aruba Switch management interfaces
I can confirm, assuming vlan x = vlan 28 (dev) and vlan y = vlan 22 (manage). Tested using two Windows hosts, and ping is 100% success both ways. The issue seems to only appear with the switch interfaces in vlan 22. Everything else communicates without any hiccups.
------------------------------
Ronald Ratzlaff
Original Message:
Sent: May 21, 2022 01:41 PM
From: Davide Poletto
Subject: Inconsistent ping/access to Aruba Switch management interfaces
Perfect...so in a scenario where a Host 1 (IP+Mask+DG of VLAN x) connected to Switch 1 Port n, untagged member of VLAN x (and correctly set as an access port), performs a ping test to a Host 2 (IP+Mask+DG of VLAN y) connected to Switch 1 Port m, untagged member of VLAN y (and correctly set as an access port), I expect to see no ping loss at all (maybe some RTT spikes due to latency variance related to the Meraki processing ICMP as low-priority traffic), and vice versa in the opposite direction (from Host 2 to Host 1).
Can you confirm?
Original Message:
Sent: 5/21/2022 12:41:00 PM
From: ronald.ratzlaff@vanbelle.com
Subject: RE: Inconsistent ping/access to Aruba Switch management interfaces
Thanks for the clarification questions! Using your terminology, one leg between switch#1 and the MX, VLANs 22 and 28 tagged on both ends of the link.
------------------------------
Ronald Ratzlaff
Original Message:
Sent: May 21, 2022 05:03 AM
From: Davide Poletto
Subject: Inconsistent ping/access to Aruba Switch management interfaces
Hi Ronald, looking at this part of your representation:
Switch#1------MerakiMX(Router)------Switch#1(Vlan22)
Could you better explain to us how the Meraki MX (router of SVI VLAN 28 and 22) is physically connected to Switch 1 (which is, as I understand it, just a Layer 2 extension of the Meraki LAN ports)?
I guess you would have chosen to use one LAN port of the Meraki MX per VLAN (say LAN 1 for VLAN 22, untagged or tagged member of VLAN 22, downlinked to Switch 1 Port X, untagged or tagged member of the very same VLAN...and...LAN 2 for VLAN 28, untagged or tagged member of VLAN 28, downlinked to Switch 1 Port Y, untagged or tagged member of the very same VLAN). Or, as another example (an implementation which I tend to prefer), just one port, say Meraki MX LAN 1, carrying both VLAN 22 and VLAN 28 (and managing their SVIs) as a tagged member of both VLANs, downlinked to Switch 1 Port Z, with Port Z necessarily a tagged member of both VLAN 22 and 28?
Basically, is it two physical legs (with each leg carrying its specific VLAN) or one physical leg (carrying all VLANs) from the Meraki MX to Switch 1?
Original Message:
Sent: 5/20/2022 12:55:00 PM
From: ronald.ratzlaff@vanbelle.com
Subject: RE: Inconsistent ping/access to Aruba Switch management interfaces
Thanks for these pointers, Thomas!
There are no redundant routed paths, but there are multiple LACP links in the topology. And I notice something here. Given this network path:
Test-host(Vlan28)------Switch#1------MerakiMX(Router)------Switch#1(Vlan22)====[LACP]====Switch#2(Vlan22)
Pinging from Test-host in Vlan28 to the switch interfaces in Vlan22, here are the results:
Test-host to Switch#1 = 1-2ms with occasional spikes (100ms+), 0% loss
Test-host to Switch#2 = 1-2ms with frequent spikes and blocks of timeouts, 14% loss
Switch#1 to Switch#2 = 2-3ms with some mild spikes (6-10ms), 0% loss
Switch#2 to Switch#1 = 2-3ms with some mild spikes (6-10ms), 0% loss
Switch#1 to Test-host = 100% loss
Switch#2 to Test-host = 100% loss
Perhaps many odd things here, but what sticks out to me is that intermittent loss starts to occur as soon as I go past Switch#1. At first I thought that could be an LACP issue, but I tried with 3 other switches uplinked to Switch#1 via non-LACP links and get the same interruptions on 2 out of 3. Besides, I checked all LACP interfaces from the perspective of each, and everything lines up. No split LACP, links down, etc. I'm getting more confused by the minute.
I double-checked. There are 15 hosts total in Vlan22, all with static IPs and no conflicts. The vlan22 gateway is properly configured on all switches.
------------------------------
Ronald Ratzlaff
Original Message:
Sent: May 20, 2022 01:37 AM
From: Thomas Siegenthaler
Subject: Inconsistent ping/access to Aruba Switch management interfaces
Hi Ronald
After what you did / found out previously, this seems to be more of a forwarding issue than a misconfiguration on the switch itself. Did you check the following?
- What is the result if you pinged from one switch to the other (e.g. within the same VLAN with only L2 components in between)?
- Do you have any kind of redundant paths on the way between your test workstation and the switch management interface (LACP / redundant routed paths)?
- Does the router in VLAN22 show any signs of a possible IP conflict and therefore flapping ARP entries?
- Since routing is involved here, did you also check if the return path is properly routed? So does the switch know the proper default gateway address and does this gateway then have a proper path back to your workstation where you started the ping from?
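One part of the return-path question can be checked on paper: a default gateway is only usable if it lies inside the subnet of the interface that points at it. A minimal sketch with Python's standard `ipaddress` module follows; the addresses are hypothetical placeholders, not the ones used on this network.

```python
import ipaddress


def gateway_on_link(interface_cidr, gateway):
    """True if the gateway address is inside the interface's own subnet."""
    iface = ipaddress.ip_interface(interface_cidr)
    return ipaddress.ip_address(gateway) in iface.network


# Hypothetical switch management interface in a /24 used for VLAN 22.
print(gateway_on_link("10.0.22.11/24", "10.0.22.1"))  # True
print(gateway_on_link("10.0.22.11/24", "10.0.28.1"))  # False
```

This only rules out an addressing mistake on the switch side. Whether the router has a working path back to the workstation still has to be verified on the router itself.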
Best regards,
Thomas
------------------------------
Thomas Siegenthaler
Original Message:
Sent: May 19, 2022 06:49 PM
From: Ronald Ratzlaff
Subject: Inconsistent ping/access to Aruba Switch management interfaces
That makes sense and we should head down that route. However, it may not solve the issue at hand. I tested disabling the 'management' function to make it just an ordinary VLAN, but I get the same ping symptoms.
I might add, all Aruba switches operate on L2 only. The Meraki is the only router in the topology. I appear to have this issue only at one of 3 sites, so now I'm starting to think the router is problematic.
What made me assume it was the switches is that any other kind of host (Access Point, Windows PC) in that VLAN responds to ping with no glitches. Only the switches time out periodically. And only on this site. The behaviour vaguely reminds me of a broadcast storm, but I triple-checked the spanning-tree setup and alert messages, also looked at the traffic using Wireshark, and can't find any further evidence.
Any other/further thoughts or suggestions will be much appreciated!
------------------------------
Ronald Ratzlaff
Original Message:
Sent: May 19, 2022 05:50 PM
From: Davide Poletto
Subject: Inconsistent ping/access to Aruba Switch management interfaces
Well, my answer to your follow-up question should be...a yes...even if transporting (tagged) that Management VLAN id (so a VLAN which is not routable by the Switch itself when the Switch operates in Layer 3 mode thus with IP Routing enabled) up to an external router should not invalidate the main purpose of having such not routable (Management) VLAN exactly because it will continue to be simply not routable (so, no matter the presence of the Meraki MX on that VLAN, only directly connected - IIRC tagged - peers should be able to reach involved hosts/switches belonging to that VLAN).
The reason of using a Management VLAN (and not simply an ordinary VLAN for Management purposes) is to have an isolated segment to securely manage Switches from a management workstation.
------------------------------
Davide Poletto
Original Message:
Sent: May 19, 2022 04:43 PM
From: Ronald Ratzlaff
Subject: Inconsistent ping/access to Aruba Switch management interfaces
Thank you for your response! The management VLAN is configured as such on the switches. The router is a Meraki MX, so the router does not consider VLAN 22 non-routable. This would explain why the Access Points in the same VLAN are unaffected by the described phenomenon. It still confuses me though, because we have identical setups on two other sites, and we can ping switches on those management VLANs (also configured to be the management VLAN on the switches) without any hiccups, even over a VPN.
Follow-up question:
If the router deals with the management VLAN as with any other VLAN, have we at that point defeated the purpose of a management VLAN and should just make it a standard VLAN? I'm personally inclined to think so but I would appreciate educated opinions on that.
------------------------------
Ronald Ratzlaff
Original Message:
Sent: May 19, 2022 03:47 PM
From: Davide Poletto
Subject: Inconsistent ping/access to Aruba Switch management interfaces
Hi Ronald, when a VLAN is configured to be the Management VLAN it becomes not routable so, if this is your case, Ping tests should be done between hosts or between switches or any other combination of them if they are all placed WITHIN that VLAN.
If instead the VLAN you call Management VLAN is not configured to be the Management VLAN (not routed) but is just a normal VLAN that you use for Management purposes it could be routed (provided that there is a router or a routing Switch doing so for your VLANs).
Considering both scenarios and using correctly placed hosts, I don't expect any of the issues you're experiencing (RTT for directly connected switches should generally be below 300 microseconds, i.e. below 0.3 ms), but this is true only if the switches are not under heavy load (heavy CPU usage) and IF the network is stable from the point of view of Spanning Tree topology.
Did you check that?
------------------------------
Davide Poletto
Original Message:
Sent: May 19, 2022 12:57 PM
From: Ronald Ratzlaff
Subject: Inconsistent ping/access to Aruba Switch management interfaces
I'm having issues properly setting up the management vlan. The management vlan is 22, and the dev vlan is 28 (in which my computer is located). Besides the switches there are Meraki wireless access points in the management vlan 22. I can ping the Meraki devices non-stop and never get a glitch. However, when I ping the management interface of any switch in management vlan 22, I get about 50% loss. Usually there is a chunk that responds well, then a bunch of time-outs, then more success, then more time-outs. Consistently inconsistent in a way. What am I missing here?
The switches are mostly 2530s with a few 2920s thrown in.
- I checked spanning-tree settings and all switches have the same config hash.
- I checked the messages log in the switches and get no errors or alerts (other than ntp server not available, but that should be an unrelated problem, I think)
- I put a Windows machine into the management vlan 22 and tried pinging it, but in this case everything gets lost
My assumption is that I am somehow not understanding the Aruba concept of the management vlan well. I would appreciate any pointers or clarifications concerning what the management vlan does and does not do.
------------------------------
Ronald Ratzlaff
------------------------------