Riddle

Hi All,

 

I received this email from a colleague the other day and deciphered what was occurring. Can you work it out?

 

----------------

 

Batman and I just came across a seriously troublesome bug with the Aruba Instant setup at <Customer Name>. Because this was a nightmare to troubleshoot, I figured it would be worth sharing!

 

After a power outage the AP that was elected the master AP holding the Virtual controller role and VC IP of 10.10.10.250 went down.

When the AP came back up it did not replicate with the other APs and reconfigure so we had 2 Master APs with only 1 holding the VC IP.

We took the rogue master down and factory reset it so it would pull its config from the new master AP.

After this all APs were rebooted and came back up but no clients could connect using RADIUS. No errors in event viewer, no errors on the VC. All RADIUS responses were successful from NPS according to the server but the VC did not receive the response.

 

The problem was that the RADIUS response was going to 10.10.10.250, but none of the APs held this IP, so nothing would respond to a query on it. This meant RADIUS success responses were dropped, which gave no error at either end.

 

We solved this by configuring a new RADIUS client on the NPS server under .249 and changed the shared secret on the VC for the NPS server. Once this was changed over the VC responded to requests on .249 and was allowed to send RADIUS requests to the NPS server. This then solved the issue with RADIUS clients and all is working as expected again.

 

I don’t know if you’ve seen it before but it was a nightmare to find the cause.

 

---------------------

I'll provide a hint tomorrow morning.

 

Cheers

James

P.S. Names and IPs have been changed to protect the innocent.


-------------------------------------------------------
-------------------@whereisjrw-------------------
------------------------blog-------------------------
ACCX #540 | ACMX #353 | ACDX #216 | AMFX #11
---------------------
-------------------------------------------------------

If a reply adequately addresses your issue, please click on the "Accept as Solution" and "Give Kudos" button so this information can benefit other users via search.

Re: Riddle

My guess is timing when the IAPs came back online. What is supposed to happen is that if another AP assumes the master role, it sends three ARP messages with the VC IP address and its MAC address to update the network ARP cache.

 

My guess is that the wired network may have gotten a "split-brain" on the VC IP address?
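
For anyone who wants to see what such an ARP announcement looks like on the wire, here is a minimal sketch in Python that only builds the frame bytes and sends nothing. It assumes the announcement takes the common gratuitous ARP request form (sender IP equal to target IP, broadcast destination); the thread does not say which form Instant actually uses. The MAC address is made up, and the VC IP is the one from the email.

```python
import socket
import struct


def build_gratuitous_arp(ip: str, mac: str) -> bytes:
    """Build an Ethernet frame carrying a gratuitous ARP request for `ip`.

    Assumption: request form (opcode 1) with sender IP == target IP.
    """
    mac_bytes = bytes.fromhex(mac.replace(":", ""))
    ip_bytes = socket.inet_aton(ip)

    # Ethernet header: broadcast destination, our MAC, EtherType 0x0806 (ARP)
    ethernet = b"\xff" * 6 + mac_bytes + struct.pack("!H", 0x0806)

    # ARP fixed header: Ethernet (1), IPv4 (0x0800), MAC length 6,
    # IP length 4, opcode 1 (request)
    arp = struct.pack("!HHBBH", 1, 0x0800, 6, 4, 1)

    # Sender MAC/IP, then zeroed target MAC and target IP == sender IP
    arp += mac_bytes + ip_bytes + b"\x00" * 6 + ip_bytes

    return ethernet + arp


# Hypothetical MAC for the new master AP; VC IP taken from the email
frame = build_gratuitous_arp("10.10.10.250", "00:0b:86:00:00:01")
```

Switches and hosts that see this frame update their ARP entry for the VC IP to the new MAC. If the announcement is missed, or two APs both claim the address, neighbours can keep sending VC traffic to the wrong device.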

Seth R. Fiermonti
Consulting Systems Engineer - ACCX, ACDX, ACMX
Email: seth@hpe.com
-----
If you found my post helpful, please give kudos
Moderator

Re: Riddle

There is a configurable option in the IAP called "Dynamic RADIUS Proxy". With this enabled, any IAP that is running as the VC will use the statically configured VC IP address (which is also the IP you configure in NPS) as the source of its RADIUS requests. If you do not enable this, the IAP uses whatever IP address it received via DHCP. So, in the case of VC failover, NPS no longer sees the request coming from the expected IP and authentication fails.
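
To make the source-IP point concrete, here is a rough Python model of the behaviour described above. It is only a sketch of the logic as stated in this reply, not Aruba's or NPS's implementation. The function names and the DHCP address 10.10.10.23 are made up for illustration; 10.10.10.250 is the VC IP from the email.

```python
def radius_source_ip(vc_ip: str, ap_dhcp_ip: str, dynamic_radius_proxy: bool) -> str:
    """Pick the source IP for RADIUS requests, per the description above."""
    # With the option enabled the VC IP is used, otherwise the AP's own address
    return vc_ip if dynamic_radius_proxy else ap_dhcp_ip


def server_accepts(source_ip: str, configured_clients: set) -> bool:
    """A RADIUS server only answers requests from IPs it has as clients."""
    return source_ip in configured_clients


VC_IP = "10.10.10.250"        # from the email
AP_DHCP_IP = "10.10.10.23"    # hypothetical DHCP lease of the new master AP
NPS_CLIENTS = {VC_IP}         # NPS only knows the VC IP

with_proxy = server_accepts(radius_source_ip(VC_IP, AP_DHCP_IP, True), NPS_CLIENTS)
without_proxy = server_accepts(radius_source_ip(VC_IP, AP_DHCP_IP, False), NPS_CLIENTS)
print(with_proxy, without_proxy)
```

In this model the request from the DHCP address is rejected because NPS has no matching client entry, which is the failover failure described in this reply.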

 

You can find this setting under "System" (or settings) in the upper right-corner of the WebUI.

 

This, however, is separate from the problem of having two VCs on the same segment, which I have never encountered myself.

Re: Riddle

Hint: The "rogue master" had an APIPA address.
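
For reference, APIPA addresses are the IPv4 link-local range 169.254.0.0/16 that a device assigns itself when it gets no DHCP lease. A quick way to check an address, sketched in Python with the standard `ipaddress` module (the sample AP address below is made up):

```python
import ipaddress

# IPv4 link-local (APIPA) range
APIPA_NETWORK = ipaddress.ip_network("169.254.0.0/16")


def is_apipa(address: str) -> bool:
    """Return True if `address` falls inside 169.254.0.0/16."""
    return ipaddress.ip_address(address) in APIPA_NETWORK


print(is_apipa("169.254.12.34"))  # hypothetical self-assigned AP address
print(is_apipa("10.10.10.250"))   # VC IP from the email
```

An AP sitting on a 169.254.x.x address is not in the 10.10.10.0 subnet the email implies for the VC, which is worth keeping in mind when working out the riddle.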

Cheers
James
