Wired Intelligent Edge (Campus Switching and Routing)

2920 Stack dropping packets

  • 1.  2920 Stack dropping packets

    Posted Sep 07, 2017 05:12 PM

    Hi,

    I have a network with a 5412R zl2 as the core switch and several stacks of four 2920s, each stack almost identically configured (same VLANs, with small differences in which ports are assigned to which VLAN). These are "real" stacks with stacking modules and cables, configured as a ring. There is a single Aruba 7200 wireless controller connected to the 5412, with a number of APs connected to each stack. Each stack has two fibre links back to the core, and spanning tree is enabled.

    After a period of some days, one specific stack (always the same one) will stop passing traffic to newly connected edge devices. Traffic via the Aruba APs connected to the same stack (tunnelled back to the controller) is unaffected, so wireless becomes the only practical way to access the switch.

    When the fault arises, approximately 80% of pings from wired devices are dropped. SSHing in to the stack via wireless and trying to ping a device connected to the stack nearly always fails. The MAC address of the device I can't ping still shows in the output of show mac-address.
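
    For anyone wanting to put a number on the loss while the fault is present, here is a minimal Python sketch. It assumes the summary-line format printed by the Linux or macOS ping command; the names packet_loss and measure are illustrative and not part of any Aruba tooling.

```python
import re
import subprocess

# Matches the summary line printed by ping on Linux ("2 received")
# and macOS ("2 packets received"). Other platforms are not handled.
_SUMMARY = re.compile(r"(\d+) packets transmitted, (\d+) (?:packets )?received")


def packet_loss(ping_output: str) -> float:
    """Return the fraction of lost packets (0.0 to 1.0) from ping's summary line."""
    match = _SUMMARY.search(ping_output)
    if match is None:
        raise ValueError("no ping summary line found")
    sent, received = int(match.group(1)), int(match.group(2))
    if sent == 0:
        raise ValueError("no packets were transmitted")
    return (sent - received) / sent


def measure(host: str, count: int = 10) -> float:
    """Ping `host` `count` times and return the observed loss fraction."""
    result = subprocess.run(
        ["ping", "-c", str(count), host],
        capture_output=True,
        text=True,
    )
    return packet_loss(result.stdout)
```

    Calling measure() against a wired device on the affected stack at regular intervals and logging the result should show roughly when the loss begins.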

    It appears that this connectivity issue is limited to VLAN 1 (which is untagged across all uplinks and most edge ports), the VLAN that contains the switches, servers and PCs.

    I appreciate that having all that equipment in a single VLAN, using VLAN 1, and having VLAN 1 untagged across all switches isn't best practice, but I don't "own" this network and I'm not in a position to get any of this changed.

    If I remove one stacking cable between members 2 and 3, then members 1 and 2 start to work perfectly. Members 3 and 4 have no connectivity, even though the stack is then a ring with one link removed, not a chain split in two.

    A restart of the whole stack resolves the issue.

    As I say, this only occurs on one stack. The other 8 stacks of 2920s all work exactly as I'd expect.

    Any ideas welcome!



  • 2.  RE: 2920 Stack dropping packets

    EMPLOYEE
    Posted Sep 08, 2017 12:56 AM

    Did you open a TAC case?  What version of Code are you using?



  • 3.  RE: 2920 Stack dropping packets

    Posted Sep 08, 2017 12:21 PM

    Hi Colin,

    We'll be opening a support case on Monday. The issue reoccurred just as the site was closing today, so we didn't have time to attend site and retrieve the info that would likely be required.

    All switches in all stacks are running WB.16.03.0005



  • 4.  RE: 2920 Stack dropping packets

    Posted Sep 15, 2017 02:49 PM

    Hi Colin,

    Just to close this off: when we got to site we saw that stack member four was missing. We took a "show tech all" to open a support case, rebooted the switch and left site. The next morning the switch was dead (no activity lights) and was missing from the stack. A reboot didn't bring it fully back online, as it reported a faulty stacking card.

    We replaced the switch and the stack has been fine for 3+ days now.

    One oddity is that none of the APs that were connected to the dead switch would reconnect to the 7205 controller - even when connected to a different stack. The core switch could ping the APs, but the 7205 could not. A new AP, in the new switch, worked perfectly.

    The 7205 required a reboot before it could communicate with the APs. I'm at a loss to explain why. Can the 7205 blacklist APs? It didn't appear to be an issue with the switches, as all switches could ping the APs. It was just AP<->controller traffic that was affected.

    I wasn't able to open a support case as the client needed everything up ASAP and the problem isn't reproducible.