Occasional Contributor II
Posts: 43
Registered: ‎02-14-2008

PEAP clients occasionally unable to logon...

We've had a recurring issue where some laptops fail to log on using PEAP. The user-side symptom is the message 'The domain xxx is unavailable'.
In the security log on the controller we see the message 'Dropping the radius packet for station xxxx doing 802.1x'. The RADIUS server event logs have no entries for the associated laptop or user. I've looked through a lot of different logs and cannot fathom why this client is having issues. Everything checks out. Any ideas on how to troubleshoot or where to look would be most appreciated.
Occasional Contributor II
Posts: 14
Registered: ‎04-23-2009

Re: PEAP clients occasionally unable to logon...


Having exactly the same issue. At first I suspected some sort of certificate was expiring if they weren't logged on within so many days but it seems to be totally random. The fix for us has been to hook it up on a wire and run "gpupdate /force". When it reboots it seems to be able to authenticate again. If anyone has any insight to this issue we would really appreciate it!

My latest change is to disable "Validate server certificate" in the PEAP settings under the 802.1x tab on the wireless network group policy. It's a shot in the dark, but I suspect that when I originally set this up, I did not generate the certificate properly and did not set up the CA correctly. It's where I had the most trouble. How did you do your certificate?
Aruba Employee
Posts: 6
Registered: ‎07-11-2007

Re: PEAP clients occasionally unable to logon...

The first step is to gather a little more info. Up the logging level on the security log to debug ("logging level debugging security" in config term), and turn on user debugging for a workstation having the problem ("logging level debugging user-debug aa:bb:cc:dd:ee:ff" in config term). You should also gather the logs from the RADIUS server. If it's IAS, the event viewer system log should have a message indicating what happened when the user attempted authentication. All this is critical information because you want to see whether the rejection happened on the user side (server cert rejected) or the server side (user rejected due to credentials or RAS policy), or whether authentication was even attempted.

Another way to check for a client-side problem is to uncheck the "validate server cert" box. If that fixes it, there is likely a problem with the root CA cert from the CA that issued the RADIUS server's cert. The fact that forcing a gpupdate fixes it might indicate a trusted CA cert problem on the client side. Examples of the commands to turn on debugging are here:

Password:**********
(Orchard Green1) #show station

Station Entry
-------------
MAC Name Role Age(d:h:m) Auth AP name Essid Phy Associations Remote Profile
------------ ------ ---- ---------- ---- ------- ----- --- ------------ ------ -------
00:21:43:06:cd:bb wired-qos 00:21:47 No N/A wired 1 No wired-qos
00:17:ee:cf:4a:6d wired-qos 00:21:47 No N/A wired 1 No wired-qos
00:03:2a:02:74:cc wired-qos 00:01:33 No mesh-point2 TCG_WORK g 1 No default-dot1x-psk
00:0c:29:a6:1a:7b wired-qos 00:21:46 No N/A wired 1 No wired-qos
00:0c:29:20:92:fb wired-qos 00:21:45 No N/A wired 1 No wired-qos
00:0c:29:8c:f3:69 wired-qos 00:21:47 No N/A wired 1 No wired-qos
00:15:e9:64:20:9a wired-qos 00:00:49 No N/A wired 1 No wired-qos

Station Entries: 7

(Orchard Green1) #configure t
Enter Configuration commands, one per line. End with CNTL/Z

(Orchard Green1) (config) #logging level debugging security
(Orchard Green1) (config) #logging level debugging user-debug 00:03:2a:02:74:cc
(Orchard Green1) (config) #show log user-debug 10

Apr 29 08:06:47 :501065: |stm| Get Next/Get Request mac is 00:03:2a:02:74:cc
Apr 29 08:07:58 :501065: |stm| Get Next/Get Request mac is 00:03:2a:02:74:cc
Apr 29 08:09:09 :501065: |stm| Get Next/Get Request mac is 00:03:2a:02:74:cc
Apr 29 08:10:20 :501065: |stm| Get Next/Get Request mac is 00:03:2a:02:74:cc
Apr 29 08:11:32 :501065: |stm| Get Next/Get Request mac is 00:03:2a:02:74:cc
Apr 29 08:12:43 :501065: |stm| Get Next/Get Request mac is 00:03:2a:02:74:cc
Apr 29 08:13:54 :501065: |stm| Get Next/Get Request mac is 00:03:2a:02:74:cc
Apr 29 08:15:06 :501065: |stm| Get Next/Get Request mac is 00:03:2a:02:74:cc
Apr 29 08:20:24 :501065: |stm| Get Next/Get Request mac is 00:03:2a:02:74:cc
Apr 29 08:21:36 :501065: |stm| Get Next/Get Request mac is 00:03:2a:02:74:cc

(Orchard Green1) (config) #show log security 10

Apr 29 08:20:51 :125025: |aaa| Radius Authentication is disabled
Apr 29 08:20:51 :125020: |aaa| Server Authentication Failed, Checking mgmt-user config-db. State=9
Apr 29 08:20:51 :125024: |aaa| Authentication Succeeded for User admin, Logged in from 192.168.0.163 port 3485, Connecting to 192.168.0.254 port 22 connection type SSH
Apr 29 08:20:55 :124004: |authmgr| Rx message 8001/1, length 111 from 127.0.0.1:8236
Apr 29 08:20:58 :124004: |authmgr| Rx message 14001/5221, length 144 from 127.0.0.1:8220
Apr 29 08:20:58 :124004: |authmgr| Rx message 8001/1, length 111 from 127.0.0.1:8236
Apr 29 08:21:09 :124004: |authmgr| Rx message 1003/5, length 600 from 127.0.0.1:8407
Apr 29 08:21:30 :124004: |authmgr| Rx message 8001/1, length 111 from 127.0.0.1:8236
Apr 29 08:21:33 :124004: |authmgr| Rx message 1005/67108864, length 203 from 127.0.0.1:8407
Apr 29 08:21:33 :124004: |authmgr| Rx message 8001/1, length 111 from 127.0.0.1:8236
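
Once debug logging is on, one quick way to see which stations keep hitting the drop message is to tally it per MAC from a saved copy of the security log. The following is a minimal Python sketch, not an Aruba tool. It assumes the lines follow the `timestamp :id: |process| text` layout shown in the output above, and that the drop message carries the station MAC in colon-separated form. The function names are made up for illustration.

```python
import re
from collections import Counter

# Layout assumed from the log excerpts above, e.g.
# "Apr 29 08:20:51 :125025: |aaa| Radius Authentication is disabled"
LINE_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d{1,2} \d{2}:\d{2}:\d{2}) "
    r":(?P<msg_id>\d+): \|(?P<proc>[\w-]+)\|\s+(?P<text>.*)$"
)
# Assumption: the station is logged as a colon-separated MAC address.
MAC_RE = re.compile(r"(?:[0-9a-f]{2}:){5}[0-9a-f]{2}", re.IGNORECASE)
# Wording taken from the message quoted at the start of this thread.
DROP_PHRASE = "Dropping the radius packet for station"


def parse_line(line):
    """Split one log line into timestamp, message id, process and text.

    Returns None for lines that do not match the assumed layout.
    """
    match = LINE_RE.match(line.strip())
    return match.groupdict() if match else None


def count_drops(lines):
    """Count drop messages per station MAC (lower-cased)."""
    counts = Counter()
    for line in lines:
        record = parse_line(line)
        if record is None or DROP_PHRASE not in record["text"]:
            continue
        mac = MAC_RE.search(record["text"])
        if mac:
            counts[mac.group(0).lower()] += 1
    return counts
```

Feeding it the lines of a saved log, for example `count_drops(open("security.log"))` with a hypothetical file name, gives a per-station count you can compare against the laptops users report as failing.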

Dennis
Aruba Employee
Posts: 119
Registered: ‎05-16-2007

Re: PEAP clients occasionally unable to logon...




I'd be curious as to the end result of this change. I am helping two customers with this problem and here is what I've found out so far.

* Laptops are multi-user laptops, no cached credentials.
* Laptops typically complete Machine-Authentication upon startup
* Then the user logs in which is authed against the AD Domain and everything is fine.

Over time, some laptops (again, random...doesn't matter if XP SP2 or SP3) will "fail" and give the error "cannot contact domain XXX"

The reason for this failure is that the Machine Authentication in the background has failed for some reason. This failure (in my scenarios at least) is hard to track down because there are no RADIUS logs on the IAS server saying accept, reject or discard. When looking at the "show auth-tracebuf" command output, it looks like the authentication is just stopping mid-stream....after the 1st EAP Identity Response. The client then tries over and over...all failures. After a few tries, the controller stops sending these auth requests to the authentication server and that is why you're seeing the "Dropping EAPOL for MAC ADDRESS XX:XX, etc" in the logs.

This log message is the RESULT of the issue and not the cause of the issue.
If you connect the machine into a wired port and issue a GPUPDATE, it temporarily fixes the problem. Also, if you just leave the computer plugged into a wired port overnight, something updates in the background and this also temporarily fixes the problem. In my experience, you can also log the user in via a wired port and during the logon something is updated that again, temporarily fixes the problem.

In my experience this "fix" lasts for about 2 weeks...usually on the dot.
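
If you want to test the two-week pattern against your own laptops, one option is to note the date each machine was fixed and the date it next failed, then look at the gaps. A rough Python sketch follows, assuming you record those dates by hand. The function names are hypothetical and nothing here comes from Windows or the controller.

```python
from datetime import date
from statistics import mean


def days_between(fixed_on, failed_on):
    """Whole days from the gpupdate fix to the next failed logon."""
    return (failed_on - fixed_on).days


def summarize(records):
    """Summarize fix-to-failure gaps for (fixed_on, failed_on) date pairs."""
    intervals = [days_between(fixed, failed) for fixed, failed in records]
    return {
        "intervals": intervals,
        "min": min(intervals),
        "max": max(intervals),
        "mean": mean(intervals),
    }
```

Gaps that cluster tightly around 14 days would point at something on a fixed timer rather than a random fault. A wide spread would argue against that.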

What's happening is that after the fix, the machine can again complete the Machine Authentication phase during startup (before the user tries to log in). When the user then does try to log in, the login succeeds because the machine already has an active network connection (a link light, if you will).

What's unknown to me at this point is what is actually broken when Machine Authentication fails, and I don't know what exactly is fixed after the gpupdate either. I have packet captures from a non-working machine, started at the point a gpupdate command was sent to it. Lots of stuff buried in there that I'm still looking at.

I have a high degree of certainty that this is a Windows authentication/domain issue between the client machine and the AD backend....really has nothing to do with the WLAN system in between. My theory is that somehow, somewhere the machine thinks that the RADIUS server responding is presenting a certificate that it either cannot verify or does not trust and does not continue with the authentication. I think this is why a log never appears in the IAS server......the authentication never gets that far. I suspect that during the GPupdate something gets updated that fixes this. Could also be Machine Account Password mismatches (unlikely per my research), could also be Secure Channel issues between machine and domain (not sure about this one.)

I will update the thread when I find out, but I've been hot on the trail of this for a few weeks now. If anyone else has any insight, please let me know.

Occasional Contributor II
Posts: 43
Registered: ‎02-14-2008

Re: PEAP clients occasionally unable to logon...

Really interesting replies and I'm glad we're not alone. I have an open support call with our partner who in turn have asked Aruba tech to investigate. BJWhite seems to hit the nail on the head describing the issue.

If I find anything else out I'll let people know.

For interest our initial radius server and Certification Authority was on a Windows 2003 Standard edition. This has been upgraded to Enterprise but I haven't rebuilt the CA and root certs.
Occasional Contributor II
Posts: 14
Registered: ‎04-23-2009

Re: PEAP clients occasionally unable to logon...

Unchecking the "Validate server certificate" seems to be working so far but it might be too soon to tell for sure.

Seems we all have EXACTLY the same problem. My thought is that when the GPUPDATE is run, it's "resyncing" the server certificate. My CA was set up as a root CA; however, it's not on a domain controller, which might be the problem. I've even added the cert into group policy to be distributed to the computers (which makes me think that's why gpupdate fixes the problem).
Moderator
Posts: 243
Registered: ‎09-12-2007

Re: PEAP clients occasionally unable to logon...

Turning off server certificate validation will solve a lot of problems - but it also destroys the security of WPA. :eek: If you don't validate the certificate, then I can come in with my own AP and my own RADIUS server and pretend to be your network. I can get your clients to connect to my AP, let them think they are authenticated, obtain their passwords, and generally do bad things. So it's not a good option to disable.
---
Jon Green, ACMX, CISSP
Security Guy
Occasional Contributor II
Posts: 14
Registered: ‎04-23-2009

Re: PEAP clients occasionally unable to logon...

Yes but until I can find an actual solution, it beats hooking 50+ laptops to a wire every few weeks and running GPUPDATE :)
Moderator
Posts: 243
Registered: ‎09-12-2007

Re: PEAP clients occasionally unable to logon...

OK, just wanted to make sure you were aware. :)

If turning off that option does help with the problem, it seems like it might give a clue as to what's going on - it would tell me that authentication is halting because the client doesn't trust the server certificate.

Is there anything in Windows which says "Don't trust root CAs anymore unless the CA list has been updated in the past 2 weeks?" I am grasping at straws here - I could see why such a feature might be good security policy, but I don't know that it actually exists. :)
---
Jon Green, ACMX, CISSP
Security Guy
Occasional Contributor II
Posts: 14
Registered: ‎04-23-2009

Re: PEAP clients occasionally unable to logon...

It's got to be the issue but I don't know of any expiration other than the actual expiration of the certificate. I suspect it's because my CA isn't on the domain controller or the CA root certificate wasn't generated properly when I started this setup. Does that make sense?