I've built the network below using OSPF, BGP EVPN and, in the case of the 6200, static VXLAN.
All BGP sessions are up, VTEPs look great and L2 traffic successfully crosses over the L3 links by way of VXLAN.
However, there is one particular issue that I cannot solve - SSH and HTTPS connectivity from a PC to any of the 8360 switches in a VSX pair doesn't work.
All hosts can ping each other on their VLAN 79 SVI addresses. The PC can ping all SVI interfaces and can SSH into the 6200F. Without enabling SSH on the default VRF, SSHing to the VSX switches results in a connection rejected message - exactly as it should be. However, once I enable SSH on the default VRF, I get no response from the switches when SSHing from the PC. Nothing, not a blip. The logs say SSH enabled on default VRF but nothing else. The same goes for HTTPS access.
If I SSH/HTTPS to a VSX switch on its SVI 79 IP address from the 6200, it works. I can confirm that the source and destination of the SSH session are both on VLAN 79.
I am completely stumped - no idea what's going on. Is there some caveat about SSHing to an SVI over VXLAN? Something simple I've missed? I've worked with lots of VSX pairs where the mgmt address has been in the default VRF and never had any issue accessing the devices with SSH, nor would you expect to. This is the first time I've built this type of topology with VXLAN/EVPN/BGP, so it's possible that there are some caveats I'm missing. Having said that, my network connectivity testing hasn't shown any issues. Everyone can ping everyone. Pull out some links, ping keeps flowing. Tear down the entire connection between the 6200F and a VSX pair and everything keeps working.
Hi, I run a similar network in production and can SSH to both the VRF in the overlay and to the underlay address (loopback).
In the PoC I was confused about connecting from an address in the overlay (VRF that is transported around using VXLAN/EVPN) to individual addresses on VSX pairs. Similar to you trying to SSH to 10.1.1.1 and 10.1.1.2.
When I asked Aruba, they described this as normal because of the way routing happens within VXLAN and the rules around inter-VXLAN traffic. If traffic arrives over the VSX inter-switch link, or the reply is routed over it, it is dropped in certain cases. So typically connecting to one member was OK but the other less so. Power off the member that was contactable and the other becomes contactable.
I think rather than spending time looking into this (although it is quite interesting to see using the very good mirror-session and diag utils tcpdump commands) you might want to consider what the end result will be. Should you be connecting to devices for management using the underlay, so that you can still troubleshoot VXLAN issues etc.? In the classic model each VSX member will have a loopback0 (with a shared loopback1). Using this enables us to turn off SSH/HTTPS on all user-facing subnets. Right now we have to route leak into the underlay, which I feel a bit dirty about, but that is only until the infrastructure to properly separate the underlay (also management) from the overlay is in place.
Also, in our overlay, each SVI has the same IP address on each member (interface and active-gateway are the same on both VSX members). So I would have to engineer something like loopbacks in the overlay to be able to achieve your vlan79 example.
So in short, depending on the detail of the config you could be seeing a normal thing. Rebooting a member and gaining access to the other would demonstrate that (assuming you have enabled SSH/HTTPS for all VRFs). But worth considering at this stage how you will manage the network long term.
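If you want to see which member answers before rebooting anything, a small script run from the PC can tell "refused" apart from "silently dropped" on each member. This is only a rough sketch in Python: it assumes the two member addresses from the example above (10.1.1.1 and 10.1.1.2) and the standard SSH/HTTPS ports, and `probe` and `survey` are names I made up for illustration. A "timeout" result only tells you that the SYN or its reply went missing somewhere, not where.

```python
import socket


def probe(host, port, timeout=3.0):
    """Classify one TCP connection attempt as 'open', 'refused' or 'timeout'."""
    try:
        # create_connection performs the full TCP handshake, unlike ping.
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # The switch answered with a reset: the service is not listening here.
        return "refused"
    except (socket.timeout, TimeoutError):
        # No answer at all: the SYN or the SYN-ACK was dropped on the way.
        return "timeout"
    except OSError as exc:
        # Anything else, e.g. no route to host.
        return f"error: {exc}"


def survey(hosts, ports=(22, 443), timeout=3.0):
    """Probe every host/port pair and return {(host, port): status}."""
    return {(h, p): probe(h, p, timeout) for h in hosts for p in ports}


# Example (addresses taken from the thread, adjust to your own):
# for target, status in survey(["10.1.1.1", "10.1.1.2"]).items():
#     print(target, status)
```

If one member comes back "open" and the other "timeout", and the results swap when the first member is powered off, that would match the behaviour described above.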
Apologies for the delay in my response - I got whacked by a case of covid. Interesting what you say about the VSX link. I recall some similar behaviour from many years ago whilst working on Cisco Nexus. There were 101 caveats for traffic that passed over the peer link, and for layer 2 traffic and SVIs in general, but Cisco has great Nexus caveats documentation and I would very much like to see something similar from Aruba for VSX.
Ok, so I tried accessing the switches via their loopbacks. Here is what happened:
I'd love to get to the bottom of this and get a thorough understanding of it. I'll get in touch with my local Aruba resources and see what they can give me.
With regard to how we will manage the network once in production, we will access the loopbacks from a management network. The customer isn't highly competent when it comes to networking, so keeping it simple is the name of the game. We use VXLAN because it is warranted to limit L2 failure domains, but building separate VRFs and route-leaking is a bit above the customer's pay grade. The network is a production factory where uptime and resilience are key.
Thank you for your reply and tips - that saved the day!