Occasional Contributor II
Posts: 42
Registered: ‎06-25-2013

AP 105s and 65s keep failing back and forth when both 7210 Master/Local are up

Hello,

 

I have a Master/Local 7210 controller setup with 45 access points, currently a mixture of AP 65 and AP 105 devices. They will be working normally and then, out of the blue, fail over from the Local to the Master, go into a D flag state, and sit that way for 5 minutes or so. Once they are all up on the Master, they fail back to the Local, sit in a D flag state again, and take about 5 minutes to clear, so wireless is knocked out for at least 10 minutes. Sometimes it is fine for a few days, and other times it goes back and forth. Previously the APs were managed by a single 2400 without issue. Both controllers are currently running 6.2.1.2. I am not certain whether this is something on the network impacting the controllers and causing loss of communication, or something in the controller code or configuration causing a service to recycle and trigger the failover. I opened a support case, but I never know when it will happen, and sometimes I only catch the tail end, so support has been unable to see the issue. They stated that the configuration is set up correctly.

 

   Thanks,

      Evan Cardanha

MVP
Posts: 4,301
Registered: ‎07-20-2011

Re: AP 105s and 65s keep failing back and forth when both 7210 Master/Local are up

How is your redundancy configured: LMS or VRRP?

Make sure your master controller has the same VLANs as your local.

You may be having some network issues on your local controller:
- check your connection back to the uplink (maybe layer 1 issues: cable, GBIC, etc.)
- are you using port channels or trunks?
- do a show port stats and look for errors

Thank you

Victor Fabian
Lead Mobility Engineer @ Integration Partners
AMFX | ACMX | ACDX | ACCX | CWAP | CWDP | CWNA
MVP
Posts: 4,301
Registered: ‎07-20-2011

Re: AP 105s and 65s keep failing back and forth when both 7210 Master/Local are up

Does the D flag disappear after a certain time on the master controller, meaning are the APs able to come up normally on the master?
Thank you

Victor Fabian
Lead Mobility Engineer @ Integration Partners
AMFX | ACMX | ACDX | ACCX | CWAP | CWDP | CWNA
MVP
Posts: 4,301
Registered: ‎07-20-2011

Re: AP 105s and 65s keep failing back and forth when both 7210 Master/Local are up

 

Run the following commands too; they may give some information:

 

show log system all

show log error-log all

show log network all

show ap debug system-status ap-name <apname>

 

Thank you

Victor Fabian
Lead Mobility Engineer @ Integration Partners
AMFX | ACMX | ACDX | ACCX | CWAP | CWDP | CWNA
MVP
Posts: 289
Registered: ‎11-04-2008

Re: AP 105s and 65s keep failing back and forth when both 7210 Master/Local are up

I am curious to see what the resolution to this problem is. I have the same problem with my 3400 controller, which is the backup for two other 3400 controllers in N+1 VRRP. My controllers are on AOS 6.1.3.2, and I have already scheduled an upgrade to 7220 controllers on AOS 6.2.

 

The log shows that once or twice a day the backup transitions itself to MASTER (active), but the active controllers then announce their presence with a higher priority, so the backup backs out. APs move back and forth and drop clients. All controllers are on the same VLAN.

 

(BACKUP) #show log system 10

Jul 2 09:16:51 :313328:  <WARN> |fpapps|  vrrp: vrid "35" - VRRP state transitioned from MASTER to BACKUP
Jul 2 09:16:51 :313332:  <WARN> |fpapps|  VRRP: vrid "35"(Master) -  Received VRRP Advertisement with HIGHER PRIORITY (150) from x.x.x.x
Jul 2 09:22:37 :313331:  <WARN> |fpapps|  VRRP: vrid "25" - Missed 3 Hello Advertisements from VRRP Master 172.17.254.22
Jul 2 09:22:37 :313328:  <WARN> |fpapps|  vrrp: vrid "25" - VRRP state transitioned from BACKUP to MASTER
Jul 2 09:22:37 :313328:  <WARN> |fpapps|  vrrp: vrid "25" - VRRP state transitioned from MASTER to BACKUP
Jul 2 09:22:37 :313332:  <WARN> |fpapps|  VRRP: vrid "25"(Master) -  Received VRRP Advertisement with HIGHER PRIORITY (150) from x.x.x.x
Jul 2 09:26:28 :313331:  <WARN> |fpapps|  VRRP: vrid "25" - Missed 3 Hello Advertisements from VRRP Master 172.17.254.22
Jul 2 09:26:28 :313328:  <WARN> |fpapps|  vrrp: vrid "25" - VRRP state transitioned from BACKUP to MASTER
Jul 2 09:26:28 :313328:  <WARN> |fpapps|  vrrp: vrid "25" - VRRP state transitioned from MASTER to BACKUP
Jul 2 09:26:28 :313332:  <WARN> |fpapps|  VRRP: vrid "25"(Master) -  Received VRRP Advertisement with HIGHER PRIORITY (150) from x.x.x.x
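
One rough way to quantify flapping like this is to count the state transitions per VRID in a saved copy of the log. The following is only a sketch in Python, not an Aruba tool: the function name `count_vrrp_transitions` and the regular expression are assumptions based on the line format shown above, and other AOS releases may format these messages differently.

```python
import re
from collections import Counter

# Pattern assumed from the sample log lines in this thread.
TRANSITION_RE = re.compile(
    r'vrid "(?P<vrid>\d+)" - VRRP state transitioned '
    r'from (?P<old>\w+) to (?P<new>\w+)'
)


def count_vrrp_transitions(log_text):
    """Return a Counter keyed by (vrid, old_state, new_state)."""
    counts = Counter()
    for line in log_text.splitlines():
        match = TRANSITION_RE.search(line)
        if match:
            counts[(match["vrid"], match["old"], match["new"])] += 1
    return counts


# Three transition lines copied from the log above.
SAMPLE = """\
Jul 2 09:16:51 :313328:  <WARN> |fpapps|  vrrp: vrid "35" - VRRP state transitioned from MASTER to BACKUP
Jul 2 09:22:37 :313328:  <WARN> |fpapps|  vrrp: vrid "25" - VRRP state transitioned from BACKUP to MASTER
Jul 2 09:22:37 :313328:  <WARN> |fpapps|  vrrp: vrid "25" - VRRP state transitioned from MASTER to BACKUP
"""

if __name__ == "__main__":
    for (vrid, old, new), n in sorted(count_vrrp_transitions(SAMPLE).items()):
        print(f"vrid {vrid}: {old} -> {new} x{n}")
```

Running this against a full day of `show log system` output would show whether one VRID flaps far more often than the others, which may help narrow down where the missed advertisements occur.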

 

~Trinh Nguyen~
Boys Town
Occasional Contributor II
Posts: 42
Registered: ‎06-25-2013

Re: AP 105s and 65s keep failing back and forth when both 7210 Master/Local are up

Currently it is set up with LMS. We had a single controller before, so I reused the port it had been using without issue and mirrored that configuration to the port the Local controller is connected to. We are using a trunk port. It certainly looks more like a controller connectivity issue: when it happens, all APs drop off the Local, go to the Master, and sit in a D flag state. They all eventually clear up, and not long after that they all fail back to the Local, go into a D flag state again, then slowly clear up and run normally for days. We are currently using only the 1 Gig copper connections, so none of the GBIC slots are populated. The Master and Local are patched to the same Cisco 6509 switch but on different blade slots.

MVP
Posts: 4,301
Registered: ‎07-20-2011

Re: AP 105s and 65s keep failing back and forth when both 7210 Master/Local are up

Do you experience the same issues if you use an LMS primary/backup setup?

Have you tried disabling preemption?

Are you sharing the VRRP segment/VLAN with anything else in your network?

Thank you

Victor Fabian
Lead Mobility Engineer @ Integration Partners
AMFX | ACMX | ACDX | ACCX | CWAP | CWDP | CWNA
MVP
Posts: 4,301
Registered: ‎07-20-2011

Re: AP 105s and 65s keep failing back and forth when both 7210 Master/Local are up


Are you using Aruba-supported GBICs?

Have you taken a look at one of the APs through the console while this is occurring?
Thank you

Victor Fabian
Lead Mobility Engineer @ Integration Partners
AMFX | ACMX | ACDX | ACCX | CWAP | CWDP | CWNA
Occasional Contributor II
Posts: 42
Registered: ‎06-25-2013

Re: AP 105s and 65s keep failing back and forth when both 7210 Master/Local are up

Currently the AP configuration is set up with an LMS IP, which is the Local controller, and a backup LMS IP, which is the Master. I haven't tried disabling preemption. Our controllers are on the production VLAN. I have looked at an AP through the console, though not recently. All I saw was that it loses communication and then reconnects, going back and forth between the two controllers. The uptime never resets, whereas if I manually pull an AP offline that number clears right away. None of the GBIC slots are populated; we are just using one of the 1 Gig interfaces on the controller, 0/0/0.

Occasional Contributor II
Posts: 42
Registered: ‎06-25-2013

Re: AP 105s and 65s keep failing back and forth when both 7210 Master/Local are up

Just had a moment to run a show log system all and saw this issue in the logs.

 

Aug 6 08:55:19 :304001:  <ERRS> |stm|  Unexpected stm (Station management) runtime error at arm_update, 323, Invalid length AP 00:24:6c:1a:6f:e0 got 23423 expect 1388
Aug 6 08:55:26 :303073:  <ERRS> |nanny|  Process /mswitch/bin/stm [pid 13935] died: got signal SIGSEGV
Aug 6 08:55:33 :303029:  <ERRS> |nanny|  Process /mswitch/bin/stm [pid 13935]: crash data saved in dir /flash/crash/process/8-6-2013@08-55-26/stm
Aug 6 08:55:38 :303079:  <ERRS> |nanny|  Restarted process /mswitch/bin/stm, new pid 27675
Aug 6 08:55:38 :303025:  <ERRS> |nanny|  Found core file /tmp/core.13935.stm.A72xx_38532, 65339392 bytes, compressing...
Aug  6 08:55:42  KERNEL:   0:<7>UDP: short packet: From 255.255.255.255:8211 1621/1517 to 129.2.139.140:8419
Aug 6 08:55:49 :304001:  <ERRS> |stm|  Unexpected stm (Station management) runtime error at handle_ap_statistics, 1019, Length mismatch expected 1527 received 1387 from             AP with eth_mac d8:c7:c8:c6:96:d3, and phy_type is 1
Aug 6 08:55:53 :304001:  <ERRS> |stm|  Unexpected stm (Station management) runtime error at handle_ap_statistics, 1019, Length mismatch expected 1527 received 1387 from             AP with eth_mac 00:24:6c:c9:96:32, and phy_type is 1
Aug 6 08:55:54 :304001:  <ERRS> |stm|  Unexpected stm (Station management) runtime error at handle_ap_statistics, 1019, Length mismatch expected 1527 received 1387 from             AP with eth_mac d8:c7:c8:c6:96:bb, and phy_type is 1
Aug 6 08:55:55 :304001:  <ERRS> |stm|  Unexpected stm (Station management) runtime error at handle_ap_statistics, 1019, Length mismatch expected 1527 received 1387 from             AP with eth_mac 00:24:6c:c9:a7:08, and phy_type is 1
Aug 6 08:55:59 :304001:  <ERRS> |stm|  Unexpected stm (Station management) runtime error at handle_ap_statistics, 1019, Length mismatch expected 1527 received 1387 from             AP with eth_mac 00:24:6c:c9:a6:e4, and phy_type is 1
Aug 6 08:55:59 :304001:  <ERRS> |stm|  Unexpected stm (Station management) runtime error at handle_ap_statistics, 1019, Length mismatch expected 1527 received 1387 from             AP with eth_mac 00:1a:1e:c7:c0:4e, and phy_type is 1
Aug 6 08:56:04 :304001:  <ERRS> |stm|  Unexpected stm (Station management) runtime error at handle_ap_statistics, 1019, Length mismatch expected 1527 received 1387 from             AP with eth_mac 00:24:6c:c9:96:34, and phy_type is 1
Aug 6 08:56:04 :304001:  <ERRS> |stm|  Unexpected stm (Station management) runtime error at handle_ap_statistics, 1019, Length mismatch expected 1527 received 1387 from             AP with eth_mac 00:24:6c:c9:a6:f2, and phy_type is 1
Aug 6 08:56:07 :303080:  <ERRS> |nanny|  Please tar and email the file crash.tar to support@arubanetworks.com
Aug 6 08:56:07 :303081:  <ERRS> |nanny| To tar type the following commands at the Command Line Interface: (1) tar crash (2) copy flash: crash.tar tftp: [serverip] [destn filename]
Aug 6 08:56:08 :304001:  <ERRS> |stm|  Unexpected stm (Station management) runtime error at handle_ap_statistics, 1019, Length mismatch expected 1527 received 1387 from             AP with eth_mac 00:1a:1e:c7:c2:28, and phy_type is 1
Aug 6 08:56:08 :304001:  <ERRS> |stm|  Unexpected stm (Station management) runtime error at handle_ap_statistics, 1019, Length mismatch expected 1527 received 1387 from             AP with eth_mac 00:24:6c:c9:96:20, and phy_type is 1
Aug 6 08:56:10 :304001:  <ERRS> |stm|  Unexpected stm (Station management) runtime error at handle_ap_statistics, 1019, Length mismatch expected 1527 received 1387 from             AP with eth_mac 00:24:6c:c9:a7:56, and phy_type is 1
Aug 6 08:56:13 :311004:  <WARN> |AP RIEOC_AP105.10@10.200.200.10 sapd|  Missed 25 heartbeats; rebootstrapping
Aug 6 08:56:13 :311004:  <WARN> |AP RIHPHC_AP105.1@10.203.89.2 sapd|  Missed 25 heartbeats; rebootstrapping
Aug 6 08:56:13 :311004:  <WARN> |AP RIETH_AP105.2@10.230.40.4 sapd|  Missed 25 heartbeats; rebootstrapping
Aug 6 08:56:13 :311004:  <WARN> |AP RIDOTMT_AP105.1@10.203.36.11 sapd|  Missed 25 heartbeats; rebootstrapping
Aug 6 08:56:13 :311004:  <WARN> |AP RIETH_AP105.1@10.230.40.3 sapd|  Missed 25 heartbeats; rebootstrapping
Aug 6 08:56:14 :311004:  <WARN> |AP RIEOC_AP105.8@10.200.200.12 sapd|  Missed 25 heartbeats; rebootstrapping
Aug 6 08:56:14 :311004:  <WARN> |AP RIEOC_AP105.5@10.200.200.9 sapd|  Missed 25 heartbeats; rebootstrapping
Aug 6 08:56:14 :311004:  <WARN> |AP RIEOC_AP105.3@10.200.200.8 sapd|  Missed 25 heartbeats; rebootstrapping
Aug 6 08:56:14 :311004:  <WARN> |AP RISH_AP65.7@10.230.4.3 sapd|  Missed 25 heartbeats; rebootstrapping
Aug 6 08:56:14 :311004:  <WARN> |AP RISH_AP65.3@10.230.4.7 sapd|  Missed 25 heartbeats; rebootstrapping
Aug 6 08:56:14 :311004:  <WARN> |AP RISH_AP65.1@10.230.4.6 sapd|  Missed 25 heartbeats; rebootstrapping
Aug 6 08:56:14 :311004:  <WARN> |AP RIDOA_AP105.8@158.123.114.202 sapd|  Missed 25 heartbeats; rebootstrapping
Aug 6 08:56:14 :311004:  <WARN> |AP RISH_AP65.2@10.230.4.5 sapd|  Missed 25 heartbeats; rebootstrapping
Aug 6 08:56:14 :311004:  <WARN> |AP RIEOC_AP105.12@10.200.200.6 sapd|  Missed 25 heartbeats; rebootstrapping
Aug 6 08:56:14 :311004:  <WARN> |AP RIEOC_AP105.7@10.200.200.2 sapd|  Missed 25 heartbeats; rebootstrapping
Aug 6 08:56:14 :311004:  <WARN> |AP RIDOA_AP65.1@158.123.114.207 sapd|  Missed 25 heartbeats; rebootstrapping
Aug 6 08:56:14 :311004:  <WARN> |AP RISH_AP65.8@10.230.4.4 sapd|  Missed 25 heartbeats; rebootstrapping
Aug 6 08:56:15 :311004:  <WARN> |AP RIDOA_AP65.5@158.123.114.151 sapd|  Missed 25 heartbeats; rebootstrapping
Aug 6 08:56:15 :311004:  <WARN> |AP RISH_AP65.9@10.230.4.10 sapd|  Missed 25 heartbeats; rebootstrapping
Aug 6 08:56:15 :311004:  <WARN> |AP RISH_AP65.6@10.230.4.2 sapd|  Missed 25 heartbeats; rebootstrapping
Aug 6 08:56:15 :311004:  <WARN> |AP RIDOA_AP65.6@158.123.114.150 sapd|  Missed 25 heartbeats; rebootstrapping
Aug 6 08:56:15 :311004:  <WARN> |AP RIPUC_AP65.1@10.203.1.3 sapd|  Missed 25 heartbeats; rebootstrapping
Aug 6 08:56:15 :311004:  <WARN> |AP RIDOA_AP65.12@158.123.114.160 sapd|  Missed 25 heartbeats; rebootstrapping
Aug 6 08:56:16 :311004:  <WARN> |AP RIDOA_AP65.11@158.123.114.148 sapd|  Missed 25 heartbeats; rebootstrapping
Aug 6 09:06:28 :304001:  <ERRS> |stm|  Unexpected stm (Station management) runtime error at handle_ap_statistics, 1019, Length mismatch expected 1527 received 1387 from             AP with eth_mac 00:24:6c:c9:a6:f6, and phy_type is 1
Aug 6 09:06:46 :304001:  <ERRS> |stm|  Unexpected stm (Station management) runtime error at handle_ap_statistics, 1019, Length mismatch expected 1527 received 1387 from             AP with eth_mac 00:24:6c:c9:a6:fe, and phy_type is 1
Aug 6 09:06:50 :304001:  <ERRS> |stm|  Unexpected stm (Station management) runtime error at handle_ap_statistics, 1019, Length mismatch expected 1527 received 1387 from             AP with eth_mac d8:c7:c8:c6:96:d3, and phy_type is 1
Aug  6 09:06:50  KERNEL:   0:<7>UDP: short packet: From 255.255.255.255:8211 1621/1517 to 129.2.139.140:8419
Aug 6 09:06:50 :304001:  <ERRS> |stm|  Unexpected stm (Station management) runtime error at handle_ap_statistics, 1019, Length mismatch expected 1527 received 1387 from             AP with eth_mac 6c:f3:7f:c5:bd:ec, and phy_type is 1
Aug  6 09:07:10  KERNEL:   0:<7>UDP: short packet: From 255.255.255.255:8211 1621/1517 to 193.0.12.160:8419
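
In the log above, the stm process dies with SIGSEGV at 08:55:26 and the first "Missed 25 heartbeats" messages appear at 08:56:13, about 47 seconds later. To check whether that pattern repeats on other days, a saved log can be scanned for process crashes and the AP rebootstraps that follow them. This is only a sketch in Python: the function names, the regular expressions, and the 120-second window are assumptions based on the lines shown here, and the year is passed in because the log timestamps do not carry one (2013 is taken from the crash directory name above).

```python
import re
from datetime import datetime, timedelta

# Patterns assumed from the sample log lines in this thread.
STAMP_RE = re.compile(r"^(\w{3}\s+\d{1,2}\s+\d{2}:\d{2}:\d{2})")
CRASH_RE = re.compile(
    r"\|nanny\|\s+Process (?P<proc>\S+) \[pid \d+\] died: got signal (?P<sig>\w+)"
)
REBOOT_RE = re.compile(
    r"\|AP (?P<ap>\S+)@(?P<ip>\S+) sapd\|\s+Missed \d+ heartbeats; rebootstrapping"
)


def parse_stamp(line, year=2013):
    """Parse the leading 'Aug 6 08:55:26' style timestamp, or return None."""
    match = STAMP_RE.match(line)
    if not match:
        return None
    text = " ".join(match.group(1).split())  # collapse runs of spaces
    return datetime.strptime(f"{year} {text}", "%Y %b %d %H:%M:%S")


def rebootstraps_after_crashes(log_text, window_seconds=120, year=2013):
    """For each process crash, list APs that rebootstrapped within the window."""
    crashes, reboots = [], []
    for line in log_text.splitlines():
        when = parse_stamp(line, year)
        if when is None:
            continue
        crash = CRASH_RE.search(line)
        if crash:
            crashes.append((when, crash["proc"], crash["sig"]))
            continue
        reboot = REBOOT_RE.search(line)
        if reboot:
            reboots.append((when, reboot["ap"]))
    window = timedelta(seconds=window_seconds)
    return [
        (when, proc, sig, [ap for t, ap in reboots if when <= t <= when + window])
        for when, proc, sig in crashes
    ]


# A few lines copied from the log above.
SAMPLE = """\
Aug 6 08:55:26 :303073:  <ERRS> |nanny|  Process /mswitch/bin/stm [pid 13935] died: got signal SIGSEGV
Aug  6 08:55:42  KERNEL:   0:<7>UDP: short packet: From 255.255.255.255:8211 1621/1517 to 129.2.139.140:8419
Aug 6 08:56:13 :311004:  <WARN> |AP RIEOC_AP105.10@10.200.200.10 sapd|  Missed 25 heartbeats; rebootstrapping
Aug 6 08:56:14 :311004:  <WARN> |AP RISH_AP65.7@10.230.4.3 sapd|  Missed 25 heartbeats; rebootstrapping
"""

if __name__ == "__main__":
    for when, proc, sig, aps in rebootstraps_after_crashes(SAMPLE):
        print(f"{when:%b %d %H:%M:%S} {proc} died ({sig}); {len(aps)} APs rebootstrapped")
```

If every failover in the saved logs lines up with an stm crash shortly before it, that would point toward the controller process rather than cabling or the switch, and the crash.tar file mentioned in the log would be the thing to send to support.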
