Frequent Contributor I
Posts: 70
Registered: ‎04-06-2007

some raps down after master controller rebooted

Background:
I have 12 RAPs (60s, 61s, 65s, 70s) spread around the country. I'm using a 5000 SC-1 (128) with (some) eval licenses.

Problem:
Eval licenses expired and controller rebooted.

Fix:
No problem, it's happened before. I extended my eval licenses, did NOT save the config, and rebooted.

New Problem:
4 of my 12 RAPs (all 70s) came back online with no problems. Where are the other 8? All 12 point to the same DNS name for the public IP of the SC, use the same IKE PSK, and use the same internal database username and password. The 8 that are down (another 70, plus the 60s/61s and 65s) are all broadcasting the WPA2-PSK backup SSID, though none of my users can connect to it even with the right PSK.
I reprovisioned one of MY APs to be a RAP, and it came up fine.
So I finally got hold of one of my downed RAPs. Plugged in SoE (and a Cisco rollover repin cable with DC power); the console port is dead (older AP-60, meh, needs an RMA), so I ran Wireshark on it to see what it's doing. This is what I see in Wireshark...
It boots up and gets DHCP OK.
Does a DNS lookup through Google DNS (8.8.8.8) for the same master it’s been set to for months.
Sends initial syslog messages to the master: “no reboot reason found.”
I pinged it for fun. No problem. No telnet or SSH, as expected. The backup SSID is on, even though the WPA2 PSK won't work for me either. I even decrypted the config on the controller and pasted the key into my supplicant.
So, then I’d expect the AP to do IKE and build IPsec tunnels. Instead it continually does DHCP discovers, gets offers, and does more discovers. Weird.
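For anyone wanting to check a capture for the same pattern, here is a minimal sketch of the test described above: repeated DHCP discovers with no IKE traffic. The packet records are hypothetical, simplified tuples (not the real capture), and `summarize` is an illustrative helper name. The ports are the standard ones: 67 for DHCP, 53 for DNS, 514 for syslog, 500 for IKE, and 4500 for IPsec NAT traversal.

```python
# Hypothetical, simplified packet records:
# (seconds since boot, UDP destination port, label).
IKE_PORTS = {500, 4500}  # standard IKE and NAT-T ports


def summarize(packets):
    """Count DHCP discovers and flag a discover loop with no IKE attempt."""
    discovers = sum(
        1 for _, port, label in packets if port == 67 and label == "discover"
    )
    saw_ike = any(port in IKE_PORTS for _, port, _ in packets)
    return {
        "dhcp_discovers": discovers,
        "saw_ike": saw_ike,
        # More than one discover and never any IKE: the AP is stuck pre-tunnel.
        "looping": discovers > 1 and not saw_ike,
    }


# Made-up sample that mirrors the sequence described in the post.
sample = [
    (1.0, 67, "discover"),
    (1.2, 67, "request"),
    (2.0, 53, "query"),
    (2.5, 514, "syslog"),
    (10.0, 67, "discover"),
    (20.0, 67, "discover"),
]
result = summarize(sample)
print(result)
```

A real capture would first need to be exported to something like these tuples; the sketch only shows the check itself.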

What I know:
1) This all started with a controller reboot when the master ran out of eval license time.
2) That has happened before and i recovered with no problem.
3) I am not a rap newbie.
4) My controller config WAS good, and it is now. I even dumped the entire controller config and started over using the same IKE PSK and un/pw. The same 4 RAPs come back with no problem.

As of right now I'm getting ready to provision new APs and ship them to users on Monday. Pretty terrible solution for something that should never have needed it.

I included my Wireshark capture, filtered by the AP's MAC.

1) Any secrets (that can be shared) for looking at the environment variables after the AP has booted up (if telnet is disabled)?
2) Any way to set anything from the web GUI of an offline AP-70? Purge and repaste configs? Or is the GUI strictly a one-way street for viewing?
3) Where is the reset button that I submitted a feature request for years ago? (Kidding. Kinda. Not really.) Man, at least you could purge the AP with a button and get into a web GUI to tell it where to go for the master.

Anyone have any suggestions?
Guru Elite
Posts: 20,761
Registered: ‎03-29-2007

Support Case


James.Vaught,

There are a number of variables at play here, and the answers needed will frankly reveal personally identifiable information, so this forum is not a good avenue for troubleshooting your problem. Please open a support case to get to the bottom of this. I can tell you in general that resetting a controller whose RAP license has expired is a very error-prone process, so we avoid license expiry at all costs, especially in a production environment. That said, if you got it to work and 4 APs came up, the rest should all be able to connect, provided that you have the correct number of licenses.

Without the logs.tar from the controller and logs from the console of the AP, it will still be difficult to determine exactly what is wrong. The PCAP shows a symptom, not a cause.

The RAP2WG, RAP5, and RAP5WN were designed with the RAP console, which has the advantage over the AP70 of allowing a user with a browser to see and extract detailed diagnostic information about a RAP. In addition, they have reset buttons so that they can be purged in the field. But you probably already know that.


Colin Joseph
Aruba Customer Engineering

Looking for an Answer? Search the Community Knowledge Base Here: Community Knowledge Base

New Contributor
Posts: 1
Registered: ‎02-17-2010

Re: some raps down after master controller rebooted

We have a similar issue. However, all of the remote APs on our controller are no longer reachable. This started last night, when they slowly started disappearing off the controller. The syslog message you mentioned is on our controller, as well as the following.

When a RAP first boots and connects to the controller:

System encountered an internal communication error. Error occurred when message is being sent from source application rfd destination application SAPM Client at file rfd_msg.c function rfd_papi_snd_cb line 132 error err Connection timed out arg 0 dest IP 127.0.0.1 msgcode 16116.


Afterwards the following messages repeat over and over. I have validated the account.

PPP/VPN Authentication failed aruba2009 10.120.25.125 PAP. Please check authentication server radius/ldap/tacacs logs

User aruba2009 Failed Authentication
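To see how often each RAP is failing authentication, the repeating messages can be tallied with a short script. This is only a sketch: the regular expression assumes the exact wording quoted above (other software versions may phrase it differently), and `count_failures` is an illustrative helper, not part of any Aruba tooling.

```python
import re
from collections import Counter

# Pattern assumes the message layout quoted in this post.
AUTH_FAIL = re.compile(
    r"PPP/VPN Authentication failed (?P<user>\S+) "
    r"(?P<ip>\d+\.\d+\.\d+\.\d+) (?P<method>\S+?)\."
)


def count_failures(lines):
    """Return a Counter of (user, ip) pairs seen in authentication failures."""
    hits = Counter()
    for line in lines:
        match = AUTH_FAIL.search(line)
        if match:
            hits[(match.group("user"), match.group("ip"))] += 1
    return hits


# Sample lines copied from the messages above.
log = [
    "PPP/VPN Authentication failed aruba2009 10.120.25.125 PAP. "
    "Please check authentication server radius/ldap/tacacs logs",
    "User aruba2009 Failed Authentication",
    "PPP/VPN Authentication failed aruba2009 10.120.25.125 PAP. "
    "Please check authentication server radius/ldap/tacacs logs",
]
failures = count_failures(log)
print(failures)
```

Grouping by user and address makes it easier to tell whether one account is failing everywhere or only from particular RAPs.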


This was a working controller and nothing changed. I have an open case on this, but I am stuck at the moment. The only way we can get the APs to a usable state is to purge and reset them. Just curious what you or anyone else found on this.
Guru Elite
Posts: 20,761
Registered: ‎03-29-2007

User

Do you have a user aruba2009 in your local database? Do a "show audit-trail" to see what might have changed.


Colin Joseph
Aruba Customer Engineering
