I have two VSX pairs of 8325 switches running in two datacenters on OS version 10.11.0001. BGP EVPN is running on them, with several VLANs stretched between the DCs, some servers (including ESXi hosts) and a bunch of external connections to WAN routers. Recently I tried to upgrade to 10.13.1040 and it failed in some interesting ways.
After the upgrade, seemingly random things lose communication. All ARP entries, MACs and required routes appear to be present, both in the l2vpn evpn address family in the underlay and in the overlay ipv4, but there is no communication between some random parts of the network. In one case I could not even ping the switch SVI from a VM, despite the MAC and ARP entries being present on the switch. Rebooting the switches back to 10.11 restores everything.
Could you suggest some troubleshooting steps and ideas on what to try to fix the config for 10.13? I only have a couple of hours late at night every week or two to try things out.
Below is a simplified diagram with the switch connections and most of the external connection cases.
------------------------------
-- tommyd
------------------------------