Wired Intelligent Edge

 View Only
last person joined: 22 hours ago 

Bring performance and reliability to your network with the HPE Aruba Networking Core, Aggregation, and Access layer switches. Discuss the latest features and functionality of your switching devices, and find ways to improve security across your network to bring together a mobile-first solution
Expand all | Collapse all

AOS-CX switches transition to STP forwarding state too quickly on boot

This thread has been viewed 42 times
  • 1.  AOS-CX switches transition to STP forwarding state too quickly on boot

    Posted Oct 27, 2023 02:17 AM

    Hi all,

    When my 6100 switches running firmware 10.10.1040 boot up, it seems their uplinks go into 'forwarding' states too fast. It's almost as if they are ignoring the 'spanning-tree port-type admin-network' command on the uplinks to the distribution switches.

    Sanitised logs (apologies for the reverse order):

    2023-10-26T02:40:01.693563+00:00 6100 hpe-mstpd[2370]: Event|2011|LOG_INFO|AMM|1/1|Topology Change received on port 1/1/16 for CIST from source: 68:87:c6:21:28:d6
    2023-10-26T02:40:01.684650+00:00 6100 hpe-mstpd[2370]: Event|2011|LOG_INFO|AMM|1/1|Topology Change received on port 1/1/15 for CIST from source: 80:24:8f:4f:73:d6
    2023-10-26T02:39:59.895664+00:00 6100 lldpd[2375]: Event|104|LOG_INFO|AMM|1/1|LLDP neighbor 20:cf:ae:75:39:40 added on 1/1/16
    2023-10-26T02:39:59.694396+00:00 6100 hpe-mstpd[2370]: Event|2011|LOG_INFO|AMM|1/1|Topology Change received on port 1/1/16 for CIST from source: 68:87:c6:21:28:d6
    2023-10-26T02:39:59.674385+00:00 6100 hpe-mstpd[2370]: Event|2011|LOG_INFO|AMM|1/1|Topology Change received on port 1/1/15 for CIST from source: 80:24:8f:4f:73:d6
    2023-10-26T02:39:59.067362+00:00 6100 hpe-udldd[2394]: Event|4103|LOG_INFO|AMM|1/1|UDLD interface 1/1/16 is unblocked
    2023-10-26T02:39:59.066786+00:00 6100 hpe-udldd[2394]: Event|4107|LOG_INFO|AMM|1/1|UDLD link is enabled on interface: 1/1/16
    2023-10-26T02:39:59.012758+00:00 6100 hpe-udldd[2394]: Event|4103|LOG_INFO|AMM|1/1|UDLD interface 1/1/15 is unblocked
    2023-10-26T02:39:59.011410+00:00 6100 hpe-udldd[2394]: Event|4107|LOG_INFO|AMM|1/1|UDLD link is enabled on interface: 1/1/15
    2023-10-26T02:39:57.883672+00:00 6100 hpe-mstpd[2370]: Event|2011|LOG_INFO|AMM|1/1|Topology Change received on port 1/1/16 for CIST from source: 68:87:c6:21:28:d6
    2023-10-26T02:39:57.882129+00:00 6100 hpe-mstpd[2370]: Event|2011|LOG_INFO|AMM|1/1|Topology Change received on port 1/1/15 for CIST from source: 80:24:8f:4f:73:d6
    2023-10-26T02:39:57.722429+00:00 6100 hpe-mstpd[2370]: Event|2011|LOG_INFO|AMM|1/1|Topology Change received on port 1/1/15 for CIST from source: 80:24:8f:4f:73:d6
    2023-10-26T02:39:57.293482+00:00 6100 cdpd[2352]: Event|8903|LOG_INFO|AMM|1/1|CDP neighbor 68:87:c6:21:28:d6 is added on 1/1/16
    2023-10-26T02:39:57.292687+00:00 6100 cdpd[2352]: Event|8903|LOG_INFO|AMM|1/1|CDP neighbor 80:24:8f:4f:73:d6 is added on 1/1/15
    2023-10-26T02:39:57.291859+00:00 6100 hpe-mstpd[2370]: Event|2014|LOG_INFO|AMM|1/1|Port 1/1/16 blocked on CIST
    2023-10-26T02:39:57.290973+00:00 6100 hpe-mstpd[2370]: Event|2011|LOG_INFO|AMM|1/1|Topology Change received on port 1/1/15 for CIST from source: 80:24:8f:4f:73:d6
    2023-10-26T02:39:56.821326+00:00 6100 hpe-mstpd[2370]: Event|2006|LOG_INFO|AMM|1/1|CST  - Root changed from 61440: 94:60:d5:4e:aa:e0 to 8192: cc:ed:4d:c6:c7:e4
    2023-10-26T02:39:56.400919+00:00 6100 lldpd[2375]: Event|109|LOG_INFO|AMM|1/1|Configured LLDP tx-delay to 2
    2023-10-26T02:39:55.505926+00:00 6100 hpe-mstpd[2370]: Event|2012|LOG_INFO|AMM|1/1|CIST - Topology Change generated on port 1/1/16 going in to forwarding
    2023-10-26T02:39:55.505060+00:00 6100 ops_appsdnsclient[2353]: Event|11901|LOG_INFO|||Domain server create event for VRF default
    2023-10-26T02:39:54.863670+00:00 6100 intfd[581]: Event|403|LOG_INFO|AMM|1/1|Link status for interface 1/1/16 is up
    2023-10-26T02:39:54.846044+00:00 6100 intfd[581]: Event|403|LOG_INFO|AMM|1/1|Link status for interface 1/1/15 is up
    2023-10-26T02:39:53.084641+00:00 6100 ndmd[2361]: Event|6131|LOG_INFO|AMM|1/1|Neighbor Discovery daemon started
    2023-10-26T02:39:49.385150+00:00 6100 hpe-mstpd[2370]: Event|2001|LOG_INFO|AMM|1/1|MSTP Enabled
    2023-10-26T02:39:48.603506+00:00 6100 lacpd[2368]: Event|1319|LOG_INFO|AMM|1/1|LAG global load balancing mode is set to l3-src-dst-hash
    2023-10-26T02:39:48.569600+00:00 6100 lacpd[2368]: Event|1307|LOG_INFO|AMM|1/1|LACP system ID set to 94:60:d5:4e:aa:e0
    2023-10-26T02:39:43.647907+00:00 6100 hpe-cardd[519]: Event|3204|LOG_INFO|AMM|1/1|Line module 1/1 is ready
    2023-10-26T02:39:43.516813+00:00 6100 hpe-cardd[519]: Event|3225|LOG_DEBUG|AMM|1/1|Line module 1/1 ASIC initialization completed
    2023-10-26T02:39:20.653892+00:00 6100 pmd[591]: Event|3808|LOG_INFO|AMM|1/1|Transceiver 1G-LX inserted in 1/1/16 is supported. An HPE pluggable module that is supported in this interface
    2023-10-26T02:39:20.559194+00:00 6100 pmd[591]: Event|3808|LOG_INFO|AMM|1/1|Transceiver 1G-LX inserted in 1/1/15 is supported. An HPE pluggable module that is supported in this interface
    2023-10-26T02:39:20.132182+00:00 6100 pmd[591]: Event|3804|LOG_INFO|AMM|1/1|Transceiver hot-swap insert for interface 1/1/16
    2023-10-26T02:39:20.131186+00:00 6100 pmd[591]: Event|3804|LOG_INFO|AMM|1/1|Transceiver hot-swap insert for interface 1/1/15
    2023-10-26T02:39:18.633080+00:00 6100 hpe-cardd[519]: Event|3224|LOG_INFO|AMM|1/1|Line module 1/1 ASIC initialization started
    2023-10-26T02:39:12.079474+00:00 6100 hpe-sysmond[582]: Event|6303|LOG_INFO|AMM|1/1|Current system memory usage for module 1/1 is 8%
    2023-10-26T02:39:10.141753+00:00 6100 crash-tools[2151]: Event|1206|LOG_CRIT|||Module rebooted. Reason : Reboot requested by user, Version: PL.10.10.1040#012, Boot-ID : de795b4fc0c34f5fac7969ca7fe6e829
    2023-10-26T02:39:07.772767+00:00 6100 hpe-cardd[519]: Event|3233|LOG_INFO|AMM|1/1|Line module 1/1 configured with product number JL679A
    2023-10-26T02:39:05.641167+00:00 6100 hpe-config-boot[564]: Event|6803|LOG_INFO|AMM|-|Config-Migration:JL679A found in Subsystem data
    2023-10-26T02:38:48.606740+00:00 6100 powerd[590]: Event|301|LOG_INFO|AMM|1/1|PSU 1/1 changed state to OK

    STP and interface onfig:

    spanning-tree
    spanning-tree trap new-root
    spanning-tree trap loop-guard-inconsistency
    spanning-tree priority 15
    !
    interface 1/1/15
        no shutdown
        vlan trunk native xx
        vlan trunk allowed xx,xx,xx
        spanning-tree loop-guard
        loop-protect
        udld
        udld mode rfc5171 normal
    !
    interface 1/1/16
        no shutdown
        vlan trunk native xx
        vlan trunk allowed xx,xx,xx
        spanning-tree loop-guard
        loop-protect
        udld
        udld mode rfc5171 normal

    According to the default STP port type of 'admin-network' (no port type set), the port should wait 3 seconds before transitioning to the forwarding state. This doesn't seem to be happening if you look at the logs.

    If I am reading the logs correctly, the interfaces come up at 2:39:54, and port 1/1/16 transitions to forwarding at 2:39:55. A BPDU is received 2 seconds later, and port 1/1/6 is shut down as it should always have been. I don't know what is happening to port 1/1/15. It's not clear.

    I just opened a TAC case, but would love to discuss this openly for the sake of urgency. A 100+ of these reboot at the exact same time (power outage) and caused chaos upstream. 100+ switches looping for a couple of seconds simultaneously isn't pretty.

    PS - LACP isn't an option in this network segment. That is a long story.



    ------------------------------
    Regards,

    Brett V
    ------------------------------



  • 2.  RE: AOS-CX switches transition to STP forwarding state too quickly on boot

    Posted Nov 01, 2023 02:39 AM

    An update:

    This happens not just at boot time, but whenever we disconnect, and reconnect interfaces.

    In fact - it's much faster in this scenario. The switch puts both uplinks in forwarding states less than half a second after the interfaces are reconnected. It is not honouring any STP wait times.

    Waiting on TAC's official response.



    ------------------------------
    Regards,

    Brett V
    ------------------------------



  • 3.  RE: AOS-CX switches transition to STP forwarding state too quickly on boot

    Posted Dec 06, 2023 07:50 AM

    TAC have been able to reproduce the issue - looks like a bug. Waiting to see the scope of the issue (which firmware, which devices etc)



    ------------------------------
    Regards,

    Brett V
    ------------------------------



  • 4.  RE: AOS-CX switches transition to STP forwarding state too quickly on boot

    Posted Dec 11, 2023 07:11 AM

    Hello Brett.

    did TAC provide any additional info yet, that you could share? 
    I've been also working on STP-related troubleshooting on AOS-S and AOS-CX over the last few week, mostly regarding combinations of Aruba 8325 and 2930F.

    Regards, Michael




  • 5.  RE: AOS-CX switches transition to STP forwarding state too quickly on boot

    Posted Dec 13, 2023 01:22 AM

    Hi Michael,

    TAC originally came back and said there was an issue - but have since backpedalled and said this is normal operations. We have escalated through our reps because we are 100% confident that our admin-network ports aren't behaving as per the described documentation.

    I want them to show me that my admin-network ports are waiting 3 seconds before forwarding (they aren't), or tell me that documentation is wrong (they said it is correct).

    I'll keep you posted.



    ------------------------------
    Brett V
    ------------------------------



  • 6.  RE: AOS-CX switches transition to STP forwarding state too quickly on boot

    MVP GURU
    Posted Dec 13, 2023 09:08 AM

    Hi!

    "TAC originally came back and said there was an issue - but have since backpedalled and said this is normal operations."

    And so does it mean that there wasn't an (unknown before) Bug in the ArubaOS-CX?

    Just curious if the Aruba TAC asked you to first update your Aruba CX 6100 from running ArubaOS-CX 10.10.1040 (02/2023) to latest available ArubaOS-CX 10.10.1100 (12/2023) before trying to reproduce your issue or not (given that it is really an unknown bug affecting ArubaOS-CX 10.10 since the 10.10.1040 at best/worst...maybe with relationship with particular Aruba CX switches series...just guessing).




  • 7.  RE: AOS-CX switches transition to STP forwarding state too quickly on boot

    Posted Mar 06, 2024 07:58 AM

    Hello guys, i was just wondering, has there been any update on this issue in the last weeks ?
    I am just asking because I've also been investigating issues and unexpected STP behavior on CX devices in recent weeks.




  • 8.  RE: AOS-CX switches transition to STP forwarding state too quickly on boot

    Posted Mar 07, 2024 09:32 PM

    Hey Michael,

    So this one is still going.

    It turns out that some of the STP daemon messages saying that the interface is going in to forwarding, has nothing to do with STP. It's just the interface being initialised. They are very misleading. TAC proved that STP is not putting interfaces in to forwarding incorrectly. So we closed THAT case.

    However, I am still having issues so logged another TAC case recenlty... I did a packet capture on interfaces in the blocking state, and they are forwarding a single type of frame! It happens to be 'Loop Protect' frames generated by other switches in the network. Comes in one interface, and out of the other. They have god mode access and bypass loop prevention protocols. But it only happens in the first few seconds of an interface being enabled in STP (basically the learning state forwards this one frame type). Interestingly when we disable STP and use Smart Link instead, Smart Link forwards these Loop Protect frames 100% of the time, regardless of the secondary Smart Link port being blocked.

    On the upstream distribution switch, I get 1 MAC flap for each switch that sends a loop protect frame, on every port connecting another Aruba switch. Thats because the loop protect frames have a source MAC of the switches MGMT interface, when it comes back out of the other Aruba switches, the distribution switch updates the MAC table. It has no idea where the switch sending the Loop Protect frame lives. I have 500 of these switches currently, so I get 2500 MAC flaps when this occurs. With smart link enabled its every 5 seconds - it completely kills the rest of the network. STP enabled - if there is a global power outage in this area for example, when they all boot I still get 3 seconds of chaos - then it settles down.

    I'll keep you posted with the TAC case.



    ------------------------------
    Brett V
    ------------------------------