Wireless Access

Occasional Contributor II
Posts: 16
Registered: ‎05-31-2016

AP fails to rejoin controller after AP reboot

I came across an issue today at one of our remote sites that I've been struggling to resolve.

 

The site in question has 2 x local controllers, which manage 25 access points. We also have a main site with 2 x controllers that act as the master devices. If I reboot a previously connected AP at the remote site, it connects to the switch and picks up a DHCP address, but never connects to a controller. I connected to an AP via the console today and could see that it picked up the correct DHCP information (IP, subnet and gateway) along with the IP address configured as the master controller. From the console output, the AP then stays connected to the network (a ping to its IP address succeeds) but does not connect to a controller.

 

If I reset the AP and run through the conversion process again, entering the master IP address, I can see the AP join the master controller. I'm then able to provision it onto the local controllers; it reboots and connects. Any subsequent reboot then causes it to disconnect from the controllers.

 

I rebooted another AP at the same site before leaving this evening and it is showing the same behaviour, so I'm fairly convinced it's a site issue.

 

2 x Master Controllers - 150 APs at Main site - APs reside in 'MainSite' AP Group

2 x Local Controllers - 25 APs at local site - APs reside in 'localgroup' AP Group

 

APs are 325, and we're running 6.4.4.11 on the controllers.

 

The DHCP scope for the access points is configured with Option 60 set to ArubaAP and Option 43 set to the IP address of the master controller.

Both sites are using different IP ranges for the AP connections and also the user subnets.

 

The controllers also reside in their own separate Layer 3 networks. Routing between all is working as expected.

 

Is anyone able to give any pointers on any particular troubleshooting steps which may help pinpoint the cause of the issue, or any settings I need to check within the controller config that tend to cause issues like those described above?

 

TIA

Dan

MVP
Posts: 4,307
Registered: ‎07-20-2011

Re: AP fails to rejoin controller after AP reboot

During the boot process, do you see the AP discover the controller? This happens right after it obtains an IP address.

Can you ping the controller from the AP console?

Try running "show log system all | include <AP MAC address or name>" and see if anything shows up on the controller side.

Are you experiencing any network delays between the remote site and where the controller is located?



Thank you

Victor Fabian
Lead Mobility Engineer @ Integration Partners
AMFX | ACMX | ACDX | ACCX | CWAP | CWDP | CWNA
Occasional Contributor II
Posts: 16
Registered: ‎05-31-2016

Re: AP fails to rejoin controller after AP reboot

Thanks for the reply.

 

Regarding connectivity between the 2 sites, I'm happy that it is running without issues. We have services running between the two and there are no other problems. I'm able to ping from the SVI that hosts the APs to the controller IP address with no drops and relatively low latency (15 ms; the sites are about 150 miles apart).

 

I've run the show log system all | include AP command on the master controller, and I can see logs from the AP that is trying to connect, but those logs contain timeout messages.

 

Does the Hello Timeout indicate that the AP is not communicating correctly with the controller, or that the controller cannot reach the AP on its IP address? Also, the logs mention a packet length of 1504; could this be caused by an MTU-related issue?
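
As a rough sanity check on the MTU question, the arithmetic can be sketched in Python. This is only an illustration: the header sizes are the standard IPv4, UDP and ICMP values, but whether the "len" in the AP log counts the whole IP packet or only the UDP payload is an assumption, and the function names are made up for this example. Either way, 1504 bytes does not fit in a single frame on a path with a 1500-byte IP MTU, so the HELLO would need to be fragmented somewhere along the way.

```python
# Standard header sizes in bytes (IPv4 without options, UDP, ICMP echo).
IPV4_HEADER = 20
UDP_HEADER = 8
ICMP_HEADER = 8


def exceeds_mtu(length, mtu=1500, length_is_udp_payload=False):
    """Return True if a packet of this size cannot be sent unfragmented."""
    ip_length = length
    if length_is_udp_payload:
        # Assumption: "len" counts only the UDP payload, so add both headers.
        ip_length += IPV4_HEADER + UDP_HEADER
    return ip_length > mtu


def df_ping_payload(mtu):
    """Largest ICMP echo payload that fits in the MTU with Don't Fragment set."""
    return mtu - IPV4_HEADER - ICMP_HEADER


# The 1504-byte HELLO against a 1500-byte path, under both readings of "len".
print(exceeds_mtu(1504))                              # True
print(exceeds_mtu(1504, length_is_udp_payload=True))  # True
# Payload size to use for a Don't Fragment ping test on a 1500-byte path.
print(df_ping_payload(1500))                          # 1472
```

A ping with the Don't Fragment bit set and a 1472-byte payload from the AP subnet to the controller is one way to check whether full-size packets cross the WAN link; if that fails while smaller sizes succeed, the path MTU is below 1500.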

Occasional Contributor II
Posts: 16
Registered: ‎05-31-2016

Re: AP fails to rejoin controller after AP reboot

Attached is a copy of the console output when the AP boots. I can see that, 20 seconds into the boot, the AP picks up its IP address and that of the master controller, but then there doesn't appear to be any further connectivity.

Guru Elite
Posts: 21,526
Registered: ‎03-29-2007

Re: AP fails to rejoin controller after AP reboot

I see that you have two ethernet ports connected.

 

Try to reboot the AP, and when the console reaches 

Hit <Enter> to stop autoboot

 Press Enter and then type "printenv" to see what variables are configured and paste it into your reply.



Colin Joseph
Aruba Customer Engineering

Looking for an Answer? Search the Community Knowledge Base Here: Community Knowledge Base

Occasional Contributor II
Posts: 16
Registered: ‎05-31-2016

Re: AP fails to rejoin controller after AP reboot

printenv information as requested

 

Hit <Enter> to stop autoboot: 2  0
apboot> printenv
bootdelay=2
baudrate=9600
autoload=n
boardname=Octomore
bootcmd=boot ap
autostart=yes
bootfile=ipq806x.ari
mtdids=nand0=nand0
ethaddr=a8:bd:27:ca:72:7a
os_partition=0
NEW_SBL2=1
backup_vap_init_master=10.100.100.101
backup_vap_password=6745C6236998734069D9DAD9AFE6BD8691CF40D94CEC6BB90292A296277BCBE9
num_ipsec_retry=85
previous_lms=0
backup_vap_opmode=0
backup_vap_band=2
name=LDNAP016
group=GROUP2
syslocation=
master=10.100.100.100
ip6prefix=64
serverip=10.100.100.100
a_antenna=0
g_antenna=0
usb_type=0
uplink_vlan=0
auto_prov_id=0
is_rmp_enable=0
priority_ethernet=0
priority_cellular=0
cellular_nw_preference=1
usb_power_mode=0
ap_power_mode=0
cert_cap=0
mesh_role=0
installation=1
mesh_sae=0
start_type=warm_start
stdin=serial
stdout=serial
stderr=serial
machid=1260
mtdparts=mtdparts=nand0:0x2000000@0x0(aos0),0x2000000@0x2000000(aos1),0x4000000@0x4000000(ubifs)
partition=nand0,0
mtddevnum=0
mtddevname=aos0
ethact=eth0

Environment size: 991/65532 bytes
apboot>

 

.100 is the VRRP address for the master controllers; .101 is the primary device within the VRRP pair.
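
For anyone comparing several APs, the printenv dump can be turned into a dictionary and checked for statically set discovery values. The following is a minimal sketch, not an Aruba tool: the function names are hypothetical, the sample contains only a few lines copied from the output above, and treating "master" and "serverip" as the keys of interest is an assumption based on this thread.

```python
def parse_printenv(text):
    """Parse apboot 'printenv' output into a dict of key -> value."""
    env = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip prompts, blank lines and summary lines that have no key=value.
        if not line or line.startswith("apboot>") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        env[key] = value
    return env


def static_discovery_settings(env, keys=("master", "serverip")):
    """Return the discovery-related variables that have a non-empty value."""
    return {k: env[k] for k in keys if env.get(k)}


SAMPLE = """\
apboot> printenv
bootdelay=2
backup_vap_init_master=10.100.100.101
name=LDNAP016
group=GROUP2
master=10.100.100.100
serverip=10.100.100.100
mtdparts=mtdparts=nand0:0x2000000@0x0(aos0),0x2000000@0x2000000(aos1),0x4000000@0x4000000(ubifs)

Environment size: 991/65532 bytes
apboot>
"""

env = parse_printenv(SAMPLE)
print(env["name"], env["group"])
print(static_discovery_settings(env))
```

Only the first "=" on each line is treated as the separator, so values that themselves contain "=" (such as mtdparts) are kept whole.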

Guru Elite
Posts: 21,526
Registered: ‎03-29-2007

Re: AP fails to rejoin controller after AP reboot

Question:

 

Why do you have the master and serverip statically configured, instead of using DNS or DHCP discovery? (Doing this could mask an issue with discovery.)

 

Does the group "GROUP2" exist?

 

You should boot the AP and type "show datapath session table <ip address of ap>" repeatedly on the master controller to see if the AP is sending any traffic.

 

Did this AP ever work?

 

What was the last change you made before the AP stopped working?

 

After you boot the ap, you should type "show ap database" on the master controller to see if that AP has any flags that could explain your problem.

 

You should also try a single ethernet port at a time to eliminate any configuration issues with bonding.



Colin Joseph
Aruba Customer Engineering


Occasional Contributor II
Posts: 16
Registered: ‎05-31-2016

Re: AP fails to rejoin controller after AP reboot

Thanks for the reply. Regarding the manually configured master and serverip addresses, I assume the AP has them in place from when we ran the conversion process and initially registered it to the master controller? Is it normal that the master and serverip addresses are the same? I'm confident the APs have been rebooted in the past without issue, and there have been no significant changes to the site setup of late. I am trying a few different things in an attempt to resolve this.

 

Below is a copy of the "show datapath session table <AP-IP>" output. Only the bottom line appears to increment, and only in small volumes.

 

(MASTER) #show datapath session table <AP-IP-ADDRESS>

Datapath Session Table Entries
------------------------------

Flags: F - fast age, S - src NAT, N - dest NAT
D - deny, R - redirect, Y - no syn
H - high prio, P - set prio, T - set ToS
C - client, M - mirror, V - VOIP
Q - Real-Time Quality analysis
I - Deep inspect, U - Locally destined
E - Media Deep Inspect, G - media signal
r - Route Nexthop
A - Application Firewall Inspect

Source IP       Destination IP  Prot SPort DPort Cntr     Prio ToS Age Destination TAge Packets   Bytes     Flags
--------------- --------------- ---- ----- ----- -------- ---- --- --- ----------- ---- --------- --------- ---------------
MASTER-IP       AP-IP-ADDRESS   17   8211  8211  0/0      0    0   2   0/0/2       28   0         0         FYI
MASTER-IP       AP-IP-ADDRESS   17   8222  8211  0/0      0    0   0   0/0/2       2    0         0         FYI
AP-IP-ADDRESS   MASTER-IP       17   8211  8222  0/0      0    0   0   0/0/2       2    0         0         FYCI
AP-IP-ADDRESS   MASTER-IP       17   8211  8211  0/0      0    0   0   0/0/2       28   12        9128      FCI

 

"show ap database" just lists the AP as down; no flags are shown.
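
To make the direction of traffic easier to see, the session table rows above can be summed per source/destination pair. This is a rough sketch under a few assumptions: the rows are split on whitespace into the 14 columns shown in the header, the Packets and Bytes columns are read as per-session counters, and the helper names are invented for this example. With the rows pasted above, only the AP-to-master session on UDP 8211 (the port Aruba's PAPI control protocol uses) shows any packets; the master-to-AP entries show none.

```python
from collections import defaultdict


def parse_sessions(text):
    """Parse data rows of 'show datapath session table' output into dicts."""
    sessions = []
    for line in text.splitlines():
        fields = line.split()
        # Data rows have exactly 14 columns and a numeric protocol field;
        # this skips the header, the dashed separator and blank lines.
        if len(fields) != 14 or not fields[2].isdigit():
            continue
        sessions.append({
            "src": fields[0],
            "dst": fields[1],
            "proto": int(fields[2]),
            "sport": int(fields[3]),
            "dport": int(fields[4]),
            "packets": int(fields[11]),
            "bytes": int(fields[12]),
            "flags": fields[13],
        })
    return sessions


def packets_by_direction(sessions):
    """Sum the Packets column for each (source, destination) pair."""
    totals = defaultdict(int)
    for s in sessions:
        totals[(s["src"], s["dst"])] += s["packets"]
    return dict(totals)


SAMPLE = """\
Source IP Destination IP Prot SPort DPort Cntr Prio ToS Age Destination TAge Packets Bytes Flags
--------------- --------------- ---- ----- ----- -------- ---- --- --- ----------- ---- --------- --------- ---------------
MASTER-IP AP-IP-ADDRESS 17 8211 8211 0/0 0 0 2 0/0/2 28 0 0 FYI
MASTER-IP AP-IP-ADDRESS 17 8222 8211 0/0 0 0 0 0/0/2 2 0 0 FYI
AP-IP-ADDRESS MASTER-IP 17 8211 8222 0/0 0 0 0 0/0/2 2 0 0 FYCI
AP-IP-ADDRESS MASTER-IP 17 8211 8211 0/0 0 0 0 0/0/2 28 12 9128 FCI
"""

sessions = parse_sessions(SAMPLE)
totals = packets_by_direction(sessions)
for (src, dst), packets in totals.items():
    print(f"{src} -> {dst}: {packets} packets")
```

A zero count in one direction is not proof by itself that replies are being dropped, but it is consistent with the AP sending HELLOs and never completing the exchange.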

Guru Elite
Posts: 21,526
Registered: ‎03-29-2007

Re: AP fails to rejoin controller after AP reboot

How many access points are in this situation?  Is it only one?

I would type "show log system 50" on the master controller to see if there is an issue



Colin Joseph
Aruba Customer Engineering


Occasional Contributor II
Posts: 16
Registered: ‎05-31-2016

Re: AP fails to rejoin controller after AP reboot

We have 24 APs at this site. I've rebooted 2 and both have shown the same issue. I've left one offline for now in order to troubleshoot. If I perform an AP reset and point it at the master, it converts, connects to the controller and allows me to assign it to the local site group. It's only on subsequent reboots that the APs don't come back online, which makes me think there's an issue with the AP talking to the master or local controllers.

 

Below is the sh log output:

 

Jul 6 06:11:55 :303086: <ERRS> |AP LDNAP016@<AP-IP-ADDRESS> nanny| Process Manager (nanny) shutting down - AP will reboot!
Jul 6 06:13:14 :303022: <WARN> |AP LDNAP016@<AP-IP-ADDRESS> nanny| Reboot Reason: AP rebooted Wed Dec 31 16:44:45 PST 1969; SAPD: Unable to contact switch: HELLO-TIMEOUT. Last rebootstrap reason: HELLO-TIMEOUT, 228 sec before: Last Ctrl msg: HELLO len=1504 dest=10.100.100.100 tries=10 seq=0
Jul 6 06:57:11 :311002: <WARN> |AP LDNAP016@<AP-IP-ADDRESS> sapd| Rebooting: SAPD: Unable to contact switch: HELLO-TIMEOUT. Last rebootstrap reason: HELLO-TIMEOUT, 228 sec before: Last Ctrl msg: HELLO len=1504 dest=10.100.100.100 tries=10 seq=0
Jul 6 06:57:11 :303086: <ERRS> |AP LDNAP016@<AP-IP-ADDRESS> nanny| Process Manager (nanny) shutting down - AP will reboot!
Jul 6 06:58:29 :303022: <WARN> |AP LDNAP016@<AP-IP-ADDRESS> nanny| Reboot Reason: AP rebooted Wed Dec 31 16:44:45 PST 1969; SAPD: Unable to contact switch: HELLO-TIMEOUT. Last rebootstrap reason: HELLO-TIMEOUT, 228 sec before: Last Ctrl msg: HELLO len=1504 dest=10.100.100.100 tries=10 seq=0
Jul 6 07:42:26 :311002: <WARN> |AP LDNAP016@<AP-IP-ADDRESS> sapd| Rebooting: SAPD: Unable to contact switch: HELLO-TIMEOUT. Last rebootstrap reason: HELLO-TIMEOUT, 228 sec before: Last Ctrl msg: HELLO len=1504 dest=10.100.100.100 tries=10 seq=0
Jul 6 07:42:26 :303086: <ERRS> |AP LDNAP016@<AP-IP-ADDRESS> nanny| Process Manager (nanny) shutting down - AP will reboot!
Jul 6 07:43:44 :303022: <WARN> |AP LDNAP016@<AP-IP-ADDRESS> nanny| Reboot Reason: AP rebooted Wed Dec 31 16:44:45 PST 1969; SAPD: Unable to contact switch: HELLO-TIMEOUT. Last rebootstrap reason: HELLO-TIMEOUT, 228 sec before: Last Ctrl msg: HELLO len=1504 dest=10.100.100.100 tries=10 seq=0
Jul 6 08:27:41 :311002: <WARN> |AP LDNAP016@<AP-IP-ADDRESS> sapd| Rebooting: SAPD: Unable to contact switch: HELLO-TIMEOUT. Last rebootstrap reason: HELLO-TIMEOUT, 228 sec before: Last Ctrl msg: HELLO len=1504 dest=10.100.100.100 tries=10 seq=0
Jul 6 08:27:41 :303086: <ERRS> |AP LDNAP016@<AP-IP-ADDRESS> nanny| Process Manager (nanny) shutting down - AP will reboot!
Jul 6 08:29:00 :303022: <WARN> |AP LDNAP016@<AP-IP-ADDRESS> nanny| Reboot Reason: AP rebooted Wed Dec 31 16:44:45 PST 1969; SAPD: Unable to contact switch: HELLO-TIMEOUT. Last rebootstrap reason: HELLO-TIMEOUT, 228 sec before: Last Ctrl msg: HELLO len=1504 dest=10.100.100.100 tries=10 seq=0
Jul 6 08:32:43 :303022: <WARN> |AP LDNAP016@<AP-IP-ADDRESS> nanny| Reboot Reason: AP rebooted caused by cold HW reset(power loss)
Jul 6 09:16:39 :311002: <WARN> |AP LDNAP016@<AP-IP-ADDRESS> sapd| Rebooting: SAPD: Unable to contact switch: HELLO-TIMEOUT. Last rebootstrap reason: HELLO-TIMEOUT, 228 sec before: Last Ctrl msg: HELLO len=1504 dest=10.100.100.100 tries=10 seq=0
Jul 6 09:16:39 :303086: <ERRS> |AP LDNAP016@<AP-IP-ADDRESS> nanny| Process Manager (nanny) shutting down - AP will reboot!
Jul 6 09:17:58 :303022: <WARN> |AP LDNAP016@<AP-IP-ADDRESS> nanny| Reboot Reason: AP rebooted Wed Dec 31 16:44:44 PST 1969; SAPD: Unable to contact switch: HELLO-TIMEOUT. Last rebootstrap reason: HELLO-TIMEOUT, 228 sec before: Last Ctrl msg: HELLO len=1504 dest=10.100.100.100 tries=10 seq=0
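
When there are many of these entries, it can help to pull the repeating fields out of the log lines above rather than read them by eye. The sketch below is only an illustration: the regular expressions were written against the lines pasted in this thread, not against any documented log format, and the function and field names are hypothetical.

```python
import re

# Matches the line layout seen in this thread: timestamp, message ID,
# severity, AP name@address, process name, then free text.
LINE_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+:(?P<msg_id>\d+):\s+"
    r"<(?P<level>\w+)>\s+\|AP\s+(?P<ap>[^@]+)@(?P<ip>\S+)\s+"
    r"(?P<proc>\w+)\|\s+(?P<text>.*)$"
)
# Matches the "Last Ctrl msg" details inside the HELLO-TIMEOUT entries.
HELLO_RE = re.compile(
    r"Last Ctrl msg: (?P<msg>\w+) len=(?P<len>\d+) "
    r"dest=(?P<dest>[\d.]+) tries=(?P<tries>\d+) seq=(?P<seq>\d+)"
)


def parse_ap_log(text):
    """Return one dict per recognised log line, with HELLO details if present."""
    events = []
    for line in text.splitlines():
        m = LINE_RE.match(line.strip())
        if not m:
            continue
        event = m.groupdict()
        hello = HELLO_RE.search(event["text"])
        event["hello"] = (
            {"msg": hello["msg"], "len": int(hello["len"]),
             "dest": hello["dest"], "tries": int(hello["tries"])}
            if hello else None
        )
        events.append(event)
    return events


SAMPLE = (
    "Jul 6 06:11:55 :303086: <ERRS> |AP LDNAP016@<AP-IP-ADDRESS> nanny| "
    "Process Manager (nanny) shutting down - AP will reboot!\n"
    "Jul 6 06:13:14 :303022: <WARN> |AP LDNAP016@<AP-IP-ADDRESS> nanny| "
    "Reboot Reason: AP rebooted Wed Dec 31 16:44:45 PST 1969; SAPD: Unable to "
    "contact switch: HELLO-TIMEOUT. Last rebootstrap reason: HELLO-TIMEOUT, "
    "228 sec before: Last Ctrl msg: HELLO len=1504 dest=10.100.100.100 "
    "tries=10 seq=0\n"
    "Jul 6 08:32:43 :303022: <WARN> |AP LDNAP016@<AP-IP-ADDRESS> nanny| "
    "Reboot Reason: AP rebooted caused by cold HW reset(power loss)\n"
)

events = parse_ap_log(SAMPLE)
for e in events:
    print(e["ts"], e["msg_id"], e["proc"], e["hello"])
```

Grouping the parsed entries by the dest and len values makes it quick to confirm that every timeout involves the same 1504-byte HELLO to 10.100.100.100.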

 

 
