Contributor II
Posts: 64
Registered: ‎09-17-2011

VLAN going up and down

OK - got a strange one here that we are getting nowhere fast with - even with TAC.

 

We have 2 x 3400 Controllers - Master and Local - with APs deployed to Local using Primary LMS with the Master Controller as the backup LMS.

 

Once, maybe twice a day (no consistency) we see all the APs swing to the Master controller, stay there for the hold-down period of 10 minutes, and then swing back.

 

We started debugging and found the following:

 

Using "show ap debug system-status ap-name <AP Name>" we found the following from this morning:

Rebootstrap Information
-----------------------
Date       Time     Reason (Latest 10)
--------------------------------------
2012-08-03 16:49:38 Switching to primary LMS 10.1.80.5
2012-08-06 10:25:01 Switching to LMS 10.1.80.3 (sapd_check_hbt)
2012-08-06 10:25:09 Broken tunnel
2012-08-06 10:25:14 Broken tunnel
2012-08-06 10:25:29 Broken tunnel
2012-08-06 10:25:39 Broken tunnel
2012-08-06 10:25:51 Broken tunnel
2012-08-06 10:36:58 Switching to primary LMS 10.1.80.5
2012-08-07 08:44:53 Switching to LMS 10.1.80.3 (sapd_check_hbt)
2012-08-07 08:56:09 Switching to primary LMS 10.1.80.5

Rebootstrap LMS
---------------
(none found)
------------

Crash Information
-----------------
(none found)
------------

Heartbeat Stats
---------------
Heartbeats Sent  Heartbeats Received
---------------  -------------------
910467           902577

 

It looks like heartbeats failed for 8 consecutive tries and the AP then swapped to the Master.
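For anyone wanting to check the same figures on their own APs, here is a minimal Python sketch (not an Aruba tool; the function and variable names are made up for illustration). It takes the rebootstrap entries pasted above, works out how long the AP sat on the backup LMS each time, and counts the heartbeats that were sent but never answered.

```python
from datetime import datetime

# Rebootstrap entries copied from the system-status output in this post.
REBOOTSTRAP_LOG = """\
2012-08-03 16:49:38 Switching to primary LMS 10.1.80.5
2012-08-06 10:25:01 Switching to LMS 10.1.80.3 (sapd_check_hbt)
2012-08-06 10:25:09 Broken tunnel
2012-08-06 10:25:14 Broken tunnel
2012-08-06 10:25:29 Broken tunnel
2012-08-06 10:25:39 Broken tunnel
2012-08-06 10:25:51 Broken tunnel
2012-08-06 10:36:58 Switching to primary LMS 10.1.80.5
2012-08-07 08:44:53 Switching to LMS 10.1.80.3 (sapd_check_hbt)
2012-08-07 08:56:09 Switching to primary LMS 10.1.80.5
"""

# Heartbeat counters copied from the same output.
HEARTBEATS_SENT = 910467
HEARTBEATS_RECEIVED = 902577


def parse_entries(text):
    """Return (timestamp, reason) tuples, one per non-empty log line."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        date, time, reason = line.split(None, 2)
        stamp = datetime.strptime(f"{date} {time}", "%Y-%m-%d %H:%M:%S")
        entries.append((stamp, reason))
    return entries


def backup_windows(entries):
    """Return (left_primary, back_on_primary, duration) for each failover."""
    windows = []
    left_primary_at = None
    for stamp, reason in entries:
        if reason.startswith("Switching to primary LMS"):
            if left_primary_at is not None:
                windows.append((left_primary_at, stamp, stamp - left_primary_at))
                left_primary_at = None
        elif reason.startswith("Switching to LMS") and left_primary_at is None:
            left_primary_at = stamp
    return windows


windows = backup_windows(parse_entries(REBOOTSTRAP_LOG))
missed = HEARTBEATS_SENT - HEARTBEATS_RECEIVED

for start, end, duration in windows:
    print(f"{start} -> {end}: {int(duration.total_seconds())} s on backup LMS")
print(f"missed heartbeats: {missed} "
      f"({100 * missed / HEARTBEATS_SENT:.2f}% of those sent)")
```

On the data above the two failovers last 717 s and 676 s, which fits a 10 minute hold-down plus the time to rebootstrap. The heartbeat counters are cumulative since the AP came up, so the missed total covers more than these two events.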

 

 

Did a show log network on the local controller and found:

 

Aug 7 08:44:51 :208006:  <INFO> |fpapps|  Changing the vlan 20 state to UP from DOWN
Aug 7 08:44:51 :208045:  <DBUG> |fpapps|  Received event 3 for Interface 320
Aug 7 08:44:51 :208043:  <DBUG> |fpapps|  Nim received event L7_UP for interface 320 linkState 3
Aug 7 08:44:51 :208004:  <DBUG> |fpapps|  Dot1q Change Call back is called 320 event L7_UP (3)
Aug 7 08:44:51 :208044:  <DBUG> |fpapps|  Nim Interface 320 state change notification, new state L7_FORWARDING
Aug 7 08:44:51 :208045:  <DBUG> |fpapps|  Received event 6 for Interface 320
Aug 7 08:44:51 :208043:  <DBUG> |fpapps|  Nim received event L7_FORWARDING for interface 320 linkState 3
Aug 7 08:44:51 :208004:  <DBUG> |fpapps|  Dot1q Change Call back is called 320 event L7_FORWARDING (6)
Aug 7 08:44:52 :204229:  <DBUG> |pim|  Received IP multicast interface VLAN VLAN Up message for VLAN 20
Aug 7 08:44:52 :208045:  <DBUG> |fpapps|  Received event 6 for Interface 320
Aug 7 08:44:52 :208043:  <DBUG> |fpapps|  Nim received event L7_FORWARDING for interface 320 linkState 3
Aug 7 08:44:52 :208004:  <DBUG> |fpapps|  Dot1q Change Call back is called 320 event L7_FORWARDING (6)
Aug 7 08:44:52 :204229:  <DBUG> |pim|  Received IP multicast interface VLAN VLAN Up message for VLAN 20
Aug 7 08:45:43 :208008:  <INFO> |fpapps|  No change in the Vlan Interface 200 state UP Vlan Interface has tunnels configured
Aug 7 08:45:46 :208007:  <INFO> |fpapps|  Vlan interface 20 state is DOWN
Aug 7 08:45:46 :208008:  <INFO> |fpapps|  No change in the Vlan Interface 200 state UP Vlan Interface has tunnels configured
Aug 7 08:47:50 :202085:  <DBUG> |dhcpdwrap|  No arp entry for ip address 192.168.207.160 eth1.200

 

 

And we find similar entries at the other times that the AP's have swung.
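To pull only the VLAN state changes out of a long network log, something like the following Python sketch could help. It is an illustration only: the two regular expressions are written against the exact message wording in the excerpt above, and other AOS versions may word these messages differently.

```python
import re

# The INFO lines from the log excerpt in this post, plus one DBUG line
# to show that unrelated messages are skipped.
LOG_TEXT = """\
Aug 7 08:44:51 :208006:  <INFO> |fpapps|  Changing the vlan 20 state to UP from DOWN
Aug 7 08:44:51 :208045:  <DBUG> |fpapps|  Received event 3 for Interface 320
Aug 7 08:45:43 :208008:  <INFO> |fpapps|  No change in the Vlan Interface 200 state UP Vlan Interface has tunnels configured
Aug 7 08:45:46 :208007:  <INFO> |fpapps|  Vlan interface 20 state is DOWN
Aug 7 08:45:46 :208008:  <INFO> |fpapps|  No change in the Vlan Interface 200 state UP Vlan Interface has tunnels configured
"""

# Patterns assumed from the message text above, not from any Aruba spec.
STAMP = re.compile(r"^(\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})")
VLAN_CHANGE = re.compile(r"Changing the vlan (\d+) state to (UP|DOWN) from (?:UP|DOWN)")
VLAN_STATE = re.compile(r"Vlan interface (\d+) state is (UP|DOWN)")


def vlan_events(text):
    """Return (timestamp, vlan_id, new_state) for each VLAN state message."""
    events = []
    for raw in text.splitlines():
        line = raw.strip()
        stamp = STAMP.match(line)
        if not stamp:
            continue
        match = VLAN_CHANGE.search(line) or VLAN_STATE.search(line)
        if match:
            events.append((stamp.group(1), int(match.group(1)), match.group(2)))
    return events


events = vlan_events(LOG_TEXT)
for when, vlan, state in events:
    print(f"{when}  vlan {vlan} -> {state}")
```

The "No change in the Vlan Interface 200" lines are deliberately not matched, since they report no transition. Running the same filter over the logs from the other failover times would show whether the UP-then-DOWN order repeats.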

 

On the surface it appears that VLAN 20 on the local controller with IP 10.1.80.5 (the primary LMS IP) is going down and the APs switch to the backup... but some things don't make sense:

 

1. Firstly - why is the VLAN going down - this is a VLAN assigned to a Port and should not go down even with no clients connected - correct?

2. At 8:44am - when the APs rebootstrap - the network log shows the VLAN going from DOWN to UP - not UP to DOWN as would be expected (I think) if the VLAN went down. At 8:45 the VLAN is then reported as DOWN - this seems back to front to me - can anyone shed any light on this?

3. If VLAN interface 20 did in fact go down for 8 seconds - would we not expect to see a timeout on the IP interface of 10.1.80.5? We pinged it constantly during this period (several hours before and several after) and did not lose a single packet.
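The timing puzzle in points 2 and 3 is easier to see with the offsets worked out. Below is a rough Python sketch using only the timestamps quoted in this post. Two things in it are assumptions: the 1 second heartbeat interval (check the AP system profile on your own controller) and that the AP and controller clocks agree. The log lines carry no year, so 2012 is taken from the rebootstrap entries.

```python
from datetime import datetime, timedelta

FMT = "%Y-%m-%d %H:%M:%S"

# AP side: "Switching to LMS 10.1.80.3 (sapd_check_hbt)" from system-status.
ap_failover = datetime.strptime("2012-08-07 08:44:53", FMT)

# Controller side: VLAN 20 messages from "show log network".
vlan_log_events = {
    "vlan 20 UP from DOWN": datetime.strptime("2012-08-07 08:44:51", FMT),
    "vlan 20 DOWN": datetime.strptime("2012-08-07 08:45:46", FMT),
}

MISSED_HEARTBEATS = 8      # consecutive misses, as stated in the post
HEARTBEAT_INTERVAL_S = 1   # ASSUMPTION: not confirmed anywhere in this thread


def offsets_from(reference, events):
    """Seconds from the reference time to each event (negative = earlier)."""
    return {
        name: int((stamp - reference).total_seconds())
        for name, stamp in events.items()
    }


offsets = offsets_from(ap_failover, vlan_log_events)
estimated_loss_start = ap_failover - timedelta(
    seconds=MISSED_HEARTBEATS * HEARTBEAT_INTERVAL_S
)

for name, seconds in offsets.items():
    print(f"{name}: {seconds:+d} s relative to AP failover")
print(f"estimated start of heartbeat loss: {estimated_loss_start.time()}")
```

Under those assumptions the heartbeat loss would have begun around 08:44:45, a few seconds before the controller logged VLAN 20 coming UP, and the DOWN message follows the failover by almost a minute. That ordering is worth showing to TAC, but it is only as reliable as the clock sync between AP and controller.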

 

Initially TAC suggested we have a congested network - but we are on term break - there are roughly 100 people on campus as opposed to 3000 - and very little traffic. We have also recently upgraded the core switch which the Controllers are connected to - and NOTE - the issue was occurring both before and after the switch upgrade.

 

With ALL the APs switching at once and the network log - to me this points to the controller having an issue... but not sure where to dig next.

 

Anyone with any ideas?

 

Cheers

Wally

Contributor II
Posts: 64
Registered: ‎09-17-2011

Re: VLAN going up and down

Here is something else that is a bit confusing:

 

Going through the debug cheat sheet commands - I decided to run debug counters on one of the APs that was swinging:

 

#show ap debug counters ap-name jnr_l201

 

and got back:

 

AP Counters
-----------
Name      Group     IP Address   Configs Sent  Configs Acked  AP Boots Sent  AP Boots Acked  Bootstraps (Total)  Reboots
----      -----     ----------   ------------  -------------  -------------  --------------  ------------------  -------
JNR_L201  Students  10.1.121.29  7             7              0              0               1          (1    )  0

 

It only shows one bootstrap, and I know from the logs that this AP has bootstrapped at least 5 times over the past week (moved from Primary to Backup LMS). You can see at least 2 in the debug system-status output posted earlier.

 

Why is the bootstrap not incrementing?

Not my major concern of course - but still puzzling....

Wally

Guru Elite
Posts: 21,280
Registered: ‎03-29-2007

Re: VLAN going up and down


Wally,

 

Is your connection from the Master Controller to your switch a 802.1x trunk?

Are you running a VRRP between the Master Controller and the local?

Is the master controller dual-connected to the switch?

 



Colin Joseph
Aruba Customer Engineering

Contributor II
Posts: 64
Registered: ‎09-17-2011

Re: VLAN going up and down

Hi Colin

 

  • No it is not a 802.1x trunk
  • There is a VRRP running between Master and Local - but on a different profile - that we are using for testing. The majority of the APs are using LMS. Yes, we could use VRRP instead and we may avoid the issue - but I would like to understand why we have the issue in the first place.
  • The master controller (in fact every controller) is only connected via a single link to the core switch - all controllers are located in the same central server room on the same subnet.

Wally

Contributor II
Posts: 64
Registered: ‎09-17-2011

Re: VLAN going up and down - SOLVED

Long overdue feedback - we were finally advised by TAC of the following:

Bug 63843 in AOS 6.1.3.1 was probably the cause and we should update to a later AOS version.

At the time of the update AOS 6.1.3.4 was current, so we updated to that version. Since then we have not seen the VLAN going down nor the APs swinging between the primary and backup LMS - this has been for over 2 weeks now - so hopefully the issue is fixed.

Wally

 

                       

 

 

 

                       
