ArubaOS and Controllers

Occasional Contributor I
Posts: 8
Registered: ‎09-14-2009

Gremlins & ARP tables

Help! Gremlins are stealing our ARP table entries!

Not really, but we are experiencing a strange "bug" that I can only attribute to Gremlins at this point, and I am hoping that others have seen it or have suggestions. I had a ticket open for this issue once before, but it went nowhere because we could not reproduce the problem. It is a rare event, or rarely reported, and we usually apply the current "fix" to get clients back online rather than take the time to troubleshoot with Aruba TAC.


The situation:

Client:
-is associated & authenticated (WPA/AES, .1x, radius)
-has an IP, default gateway, etc. (usually a /24 subnet, some smaller; the default gateway is the vlan IP on the controller)
-has an ARP entry for the default gateway (controller)
-can NOT go anywhere on the network, pings to default gateway fail

Controller:
-config is mostly vanilla (no arp proxy, no broadcast filter, etc.)
-has an entry for the client in the user table
-does NOT have an entry for the client in the ARP table
-pings to client fail

Things we have tried that have had no effect:
-reboot laptop
-repair wireless connection
-reinstall wireless driver

"Fixes":
-reboot the AP it is currently associated to (maybe because client moves to another AP)
-OR physically walk the laptop to an area with a different AP
-OR add a static ARP entry for the client. This is not a solution, just a quick way to confirm that the missing entry is the problem; the entry is then removed and one of the two fixes above is used

The key to the fix seems to be to move from the current AP, and then the ARP table is updated. I've had this happen to my own laptop on AP 105 or 61 and a SC1 controller. Has also recently happened at a separate site with AP 61 and 800 controller.

Like I stated earlier, it is a rare event and we haven't had a chance to do a lot of troubleshooting beyond the simple stuff. Anyone know of specific processes that I should capture debug logs on the next time this happens? (something like ARP request/response logs perhaps?)

Any suggestions or comments about Gremlins are welcome. Thanks.
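
One way to spot the symptom described above (client present in the user table but missing from the ARP table) is to save the controller's user-table and ARP-table output to text and compare the IP addresses in each. The following is a minimal Python sketch, not a tested tool. It assumes only that both outputs contain client IPs as dotted-quad text; it does not assume any particular column layout, and the function names and sample rows are hypothetical.

```python
import re

# Matches anything shaped like an IPv4 address. It will also match other
# dotted numbers (for example a version string such as 3.4.2.5), so trim
# banners and headers from the saved output first.
IPV4 = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def ips_in(text):
    """Return the set of IPv4-looking strings found anywhere in text."""
    return set(IPV4.findall(text))


def users_missing_arp(user_table_text, arp_table_text):
    """IPs that appear in the user-table text but not in the ARP-table text."""
    return sorted(ips_in(user_table_text) - ips_in(arp_table_text))


# Hypothetical sample rows, only to show the comparison.
user_rows = "10.1.1.5 00:11:22:33:44:55 alice\n10.1.1.9 00:11:22:33:44:66 bob"
arp_rows = "10.1.1.5 00:11:22:33:44:55 vlan250"
print(users_missing_arp(user_rows, arp_rows))
```

Any address it reports is a candidate for the problem state and is worth checking by hand before rebooting the AP, since that clears the evidence.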
Frequent Contributor I
Posts: 108
Registered: ‎09-26-2008

testing

Perhaps get one laptop connected to the wireless and run a continuous ping to the network elements for 24 hours, turn on debugging in the system, and prep a tech-support file...
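
A long-running ping is easier to correlate with controller debug logs if it records timestamps only when reachability changes. Here is a rough Python sketch under a few assumptions: it shells out to the operating system's `ping` command, the `-n`/`-w` flags are the Windows ones and `-c`/`-W` the Linux ones (macOS treats `-W` differently), and the function names are made up for illustration. On Windows, `ping` can exit with 0 on a "destination host unreachable" reply, so treat the result as a hint rather than proof.

```python
import subprocess
import sys
import time
from datetime import datetime


def ping_once(host, timeout_s=2):
    """Send one echo request with the OS ping command; True if it exited 0."""
    if sys.platform.startswith("win"):
        cmd = ["ping", "-n", "1", "-w", str(timeout_s * 1000), host]
    else:
        cmd = ["ping", "-c", "1", "-W", str(timeout_s), host]
    try:
        done = subprocess.run(cmd, stdout=subprocess.DEVNULL,
                              stderr=subprocess.DEVNULL, timeout=timeout_s + 2)
    except subprocess.TimeoutExpired:
        return False
    return done.returncode == 0


def monitor(host, count, interval_s=1.0, probe=ping_once):
    """Probe host count times; print and return each reachability flip."""
    changes = []
    last = None
    for _ in range(count):
        ok = probe(host)
        if ok != last:
            stamp = datetime.now().isoformat(timespec="seconds")
            print(stamp, host, "reachable" if ok else "UNREACHABLE")
            changes.append((stamp, ok))
            last = ok
        time.sleep(interval_s)
    return changes

# Example: roughly 24 hours at one probe per second against the gateway.
# monitor("<default gateway IP>", count=86400)
```

The `probe` parameter exists only so the loop can be exercised without touching the network.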

Michael
Guru Elite
Posts: 21,587
Registered: ‎03-29-2007

What version of code is this?


Questions:

- What version of ArubaOS is this?
- On what clients does this happen, and what are their operating system, driver version, and driver date (Windows, Mac, etc.)?
- Do the clients have a power save setting, and what is it?
- What is your routing/switching infrastructure?
- Is the Aruba controller the default gateway for clients, or your routing infrastructure?
- Is it all the clients on that access point or just a single client?
- Have you tried to do an "aaa user delete" for the client to force reauthentication?
- Have you turned on debug for that client to see if the client is trying to re-attach?
- Have you done a wireless packet capture to monitor the frames between the client and the access point?


Colin Joseph
Aruba Customer Engineering


Occasional Contributor I
Posts: 8
Registered: ‎09-14-2009

Re: Gremlins & ARP tables

Hi Colin, thanks for your time. I considered adding some of this information before, but my post had already grown rather long. Also, I was kinda hoping for someone to reply with a quick "Yes, I've seen that, and here's what you do...". Oh well, can't always be easy. Answered your questions below, as best I can anyways.


- What version of ArubaOS is this?
3.4.2.5 (on the SC1) & 3.4.2.2 (on the 800)

- On what clients does this happen, and what are their operating system, driver version, and driver date (Windows, Mac, etc.)?
Windows XP Pro SP3. Wireless adapters, driver versions, and dates vary.
Unfortunately, we have a large variety of laptops from a mixture of manufacturers.
Sorry, I don't have details here, but I will collect more information the next time it happens.

- Do the clients have a power save setting, and what is it?
Yes, usually default settings. Will get more details next time.

- What is your routing/switching infrastructure?
Cisco

- Is the Aruba controller the default gateway for clients, or your routing infrastructure?
Controller (the IP interface of the vlan the clients are in)

- Is it all the clients on that access point or just a single client?
Not all clients, but one or more clients on the same AP.

- Have you tried to do an "aaa user delete" for the client to force reauthentication?
I have not tried this. Would the effect be different than a client reboot?

- Have you turned on debug for that client to see if the client is trying to re-attach?
I'm sure we did, but I do not recall what we saw. The client does not disconnect & reconnect; everything appears normal from the client side except that it can't go anywhere. It's been a while since we did any real troubleshooting; usually we just reboot the AP. I'll get debug logs next time.

- Have you done a wireless packet capture to monitor the frames between the client and the access point?
Yes, using the controller & the custom Aruba Wireshark. Saw the client and AP talking to each other. Is there something specific I should be looking for here?

Thanks again.
Occasional Contributor I
Posts: 8
Registered: ‎09-14-2009

Re: Gremlins & ARP tables

Quick update. Had this problem happen to me today. Called in a ticket and worked on it for a couple hours. After troubleshooting the client side for a while, end result was to reboot the AP and escalate the ticket.

Perhaps it is a coincidence, but I changed my power management settings to default earlier in the day to simulate typical end user settings. Before, it was set to 'Highest' (max performance) and the default is 'Lowest' (max battery). I find it peculiar though, if this caused the problem, that all troubleshooting on the client side failed.
Aruba Employee
Posts: 119
Registered: ‎05-16-2007

Re: Gremlins & ARP tables

#1 - do you have any validuser acl settings in your controller? (doesn't sound like you do, but I wanted to ask.)

#2 - Can you narrow this behaviour down to a certain type of client? Meaning, "I only see this with Intel 3945 and driver 11.5" etc...that sort of thing. What NIC type and driver versions are seeing the issue? Next time this happens can you note that down and share?

#3 - Is WMM enabled on your SSID?
Occasional Contributor I
Posts: 8
Registered: ‎09-14-2009

Re: Gremlins & ARP tables

Hi Brian, answers below

#1. validuser is default (any/any/permit)

#2. From last three occurrences:

1. Intel 5300 AGN, 13.1.1.1
2. Intel 5300 AGN, 13.0.0.107
3. Gemtek WPEB-103AG, (don't have driver version at this time, but it is a new laptop, <4 months)

#3. WMM is disabled

Thanks.
New Contributor
Posts: 1
Registered: ‎06-15-2010

Re: Gremlins & ARP tables

Hi,

Where does the vlan originate, in the controller or your Cisco router? Is the vlan on a trunk port to your router?

If the vlan originates and is trunked on the Cisco router/switch, then I would think the Cisco router would be the default gateway for the vlan. The standby IP address of the vlan should be the default gateway. The controller needs an IP address on that vlan, but the default gateway would be the router's vlan interface. Your clients should be getting the router's standby address as the default gateway for that vlan.

Example Cisco router config:

interface Vlan250
 description ***ARUBA-WLAN0***
 ip address 10.10.10.2 255.255.255.0
 ip helper-address 10.10.11.4
 ip helper-address 10.10.11.40
 no ip redirects
 no ip proxy-arp
 standby ip 10.10.10.1
 standby priority 150
 standby preempt

Sample Aruba vlan config:

interface vlan 250
 ip address 10.10.10.4 255.255.255.0
 ip helper-address 10.10.11.4
 ip helper-address 10.10.11.40

Terry
Occasional Contributor I
Posts: 8
Registered: ‎09-14-2009

Re: Gremlins & ARP tables

Hi Terry,

The vlan only exists on the controller. It is not trunked over the uplink. We have a static route for the wireless subnet on the router that points to the uplink IP of the controller (controller side of a /30 subnet).
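
To make the routed layout above concrete, here is a small Python sketch using the standard `ipaddress` module. Every address in it is a hypothetical placeholder from the documentation ranges, not taken from the actual network, and which /30 host belongs to the router versus the controller is an arbitrary choice. It shows that a /30 uplink has exactly two usable hosts and that a static route for the wireless subnet resolves to the controller side of that link.

```python
import ipaddress

# Hypothetical addressing, for illustration only.
uplink = ipaddress.ip_network("192.0.2.0/30")       # router <-> controller link
router_ip, controller_ip = uplink.hosts()           # a /30 has two usable hosts
wireless = ipaddress.ip_network("198.51.100.0/24")  # vlan that lives only on the controller

# The router's static route: wireless subnet via the controller's uplink IP.
static_routes = {wireless: controller_ip}


def next_hop(dest, routes):
    """Longest-prefix match over routes; None if no route covers dest."""
    addr = ipaddress.ip_address(dest)
    matches = [net for net in routes if addr in net]
    if not matches:
        return None
    return routes[max(matches, key=lambda net: net.prefixlen)]


print(next_hop("198.51.100.37", static_routes))
```

In this design the router never ARPs for wireless clients; only the controller does, which fits the observation that the missing entry is in the controller's ARP table.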

We have been talking about changing our configuration to something like what you have described, but only because of this issue and those pesky gremlins. We've used this setup for quite a while, and everything else seems to be working just fine.