08-08-2019 12:41 PM
I've been meaning to post this since Atmosphere '19 - finally getting around to it:
We have built a fairly atypical iAP Cluster/Controller hybrid network with some locations split into multiple smaller clusters.
At one location, a warehouse, we just converted from 105 controller APs to two iAP clusters of 48 and 57 access points. We built Aruba GRE tunnels back to a controller at our Internet edge from each cluster, then ran into a surprising issue.
When either cluster's tunnel was up on its own, everything worked as expected. When both clusters' tunnels were up, neither tunnel's clients could connect to the Internet. Worse, when both were up, none of our tunnelled clients in any location could connect to the Internet either.
We opened a case with TAC and went through several troubleshooting sessions, including a six-hour marathon call from a table in the Airheads' Lounge at Atmosphere '19 last week.
At the end of that call we were no closer to a solution. I drew up a new picture of the systems layout and sent it to TAC with some configuration files to chew on and went to dinner.
At dinner with:
I mentioned my lengthy call and briefly outlined my issue.
In just a few minutes of rapid-fire questions the Airheads worked out the cause and the solution.
In a nutshell this is exactly why I go to Atmosphere, and why I use and contribute to the Airheads Community. Talking through my issues and helping others with theirs gets us all to our solutions faster.
Oh, my problem? It turns out one of cluster A's member iAPs had a bad patch cable and was connecting to cluster B as a mesh node. Back at the datacenter, the Internet gateway would send a broadcast ARP telling the world where 192.168.0.1 was located (port 4 on my DMZ switch). That broadcast was picked up by my controller and sent up the tunnel to cluster A, where the meshed iAP would transfer it to cluster B, which in turn put it into its own tunnel and sent it back to the controller. The DMZ switch then updated its MAC address table to send gateway-bound traffic to the controller (port 11 on my DMZ switch).
The fix? Disable meshing, disconnect the bad cable and get it fixed, purge each cluster's member AP lists of the other cluster's APs, and re-enable meshing (in case of more bad cables).
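For anyone who wants to see why the loop was so destructive, here is a rough sketch of the MAC learning behaviour described above. It is a toy model, not how the Aruba gear works internally: the `LearningSwitch` class, the `simulate` function and the gateway MAC address are all made up for illustration. Only the port numbers (4 and 11) come from my setup.

```python
# Toy model of the loop. All names and the MAC address are hypothetical;
# only the DMZ switch port numbers (4 and 11) come from the post.
GATEWAY_MAC = "aa:aa:aa:aa:aa:01"  # made-up MAC for the gateway at 192.168.0.1
GATEWAY_PORT = 4                   # DMZ switch port facing the Internet gateway
CONTROLLER_PORT = 11               # DMZ switch port facing the controller


class LearningSwitch:
    """Minimal MAC-learning switch: remembers the last port a source MAC arrived on."""

    def __init__(self):
        self.mac_table = {}

    def receive(self, src_mac, in_port):
        # A learning switch overwrites its entry every time it sees the source MAC.
        self.mac_table[src_mac] = in_port

    def port_for(self, dst_mac):
        return self.mac_table.get(dst_mac)


def simulate(mesh_link_between_clusters):
    """Return (port the switch uses for the gateway, path the broadcast took)."""
    switch = LearningSwitch()

    # 1. The gateway's broadcast ARP enters the switch on port 4.
    switch.receive(GATEWAY_MAC, GATEWAY_PORT)

    # 2. The broadcast is flooded to the controller and sent up the tunnel to cluster A.
    path = ["controller", "tunnel-A", "cluster-A"]

    if mesh_link_between_clusters:
        # 3. The meshed iAP bridges it into cluster B, which tunnels it back.
        path += ["mesh-link", "cluster-B", "tunnel-B", "controller"]
        # 4. The frame re-enters the switch on port 11, still carrying the
        #    gateway's source MAC, so the switch relearns the gateway there.
        switch.receive(GATEWAY_MAC, CONTROLLER_PORT)

    return switch.port_for(GATEWAY_MAC), path


if __name__ == "__main__":
    print(simulate(mesh_link_between_clusters=False)[0])  # 4
    print(simulate(mesh_link_between_clusters=True)[0])   # 11
```

With the mesh link in place, the switch ends up pointing gateway-bound traffic at port 11 instead of port 4, which matches what we saw: every tunnelled client lost its path to the Internet.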
Next time you see Airheads MVP @mrtwentytwo, give him kudos for me.
if I've helped, please give kudos
if I've provided a solution, please mark the solution so others can find it
08-08-2019 12:47 PM
Happy I could help!!
Very nice meeting you!
AirHeads MVP Expert |AMFX#22| ACCX#613| ACMX#733| ACDX#744
If you like my posts, kudo's are welcome. If it solves your problem, please click 'Accept as Solution'