Wired Intelligent Edge


Help with possible physical failure Aruba 2930F stack

  • 1.  Help with possible physical failure Aruba 2930F stack

    Posted Apr 18, 2022 10:43 AM
    Hi, I have a stack of four JL259A and two JL260A switches, all running software version WC.16.10.0009 and ROM version 16.01.0008.
    Members 1 to 4 are JL259A and members 5 and 6 are JL260A. The stack uses a ring topology.

    Four days ago I noticed that member 5 logs a burst of errors once or twice a day, each time briefly breaking the ring topology into a chain. I'm not sure, but it may have a power supply failure that makes it restart every so often.
    What I want to know is whether these errors are physical, i.e. hardware faults that would justify sending the unit in under the manufacturer's warranty.

    These are the related logs:

    M 04/18/22 04:59:30 00064 system: ST5-MMBR: Software exception at lava_chassis_slot_sm.c:3626 -- in 'eChassMgr', task ID = 0x37b07bc0

    M 04/18/22 04:58:21 02796 chassis: ST5-UKWN: Internal power supply 1 inserted. Total fault count: 0.

    M 04/18/22 04:58:21 02797 chassis: ST5-UKWN: Internal power supply 1 is OK. Total fault count: 0.

    M 04/18/22 04:59:22 04702 chassis: ST1-CMDR: Slot 5/1-52 Software exception in

    04/18/22 04:59:16 04992 vsf: ST6-MMBR: VSF link 1 is up

    W 04/18/22 04:58:02 03258 stacking: ST1-CMDR: Member switch with Member ID 5 removed due to loss of communication

    W 04/18/22 04:58:02 03270 stacking: ST1-CMDR: Topology is a Chain

    M 04/18/22 01:50:42 00064 system: ST5-MMBR: Software exception at lava_chassis_slot_sm.c:3626 -- in 'eChassMgr', task ID = 0x37b07bc0

    M 04/18/22 01:49:31 02796 chassis: ST5-UKWN: Internal power supply 1 inserted. Total fault count: 0.

    M 04/18/22 01:49:31 02797 chassis: ST5-UKWN: Internal power supply 1 is OK. Total fault count: 0.

    M 04/18/22 01:50:32 04702 chassis: ST1-CMDR: Slot 5/1-52 Software exception in

    04/18/22 01:50:32 03125 mgr: ST1-CMDR: Startup configuration changed by SNMP. New seq. number 155

    W 04/18/22 01:49:15 03270 stacking: ST1-CMDR: Topology is a Chain

    W 04/18/22 01:49:15 03258 stacking: ST1-CMDR: Member switch with Member ID 5 removed due to loss of communication

    M 04/18/22 01:03:25 00064 system: ST5-MMBR: Software exception at lava_chassis_slot_sm.c:3626 -- in 'eChassMgr', task ID = 0x37b07bc0

    M 04/18/22 01:02:17 02796 chassis: ST5-UKWN: Internal power supply 1 inserted. Total fault count: 0.

    M 04/18/22 01:02:17 02797 chassis: ST5-UKWN: Internal power supply 1 is OK. Total fault count: 0.

    M 04/18/22 01:03:17 04702 chassis: ST1-CMDR: Slot 5/1-52 Software exception in

    04/18/22 00:53:22 03125 mgr: ST2-STBY: Startup configuration changed by SNMP. New seq. number 224

    W 04/18/22 01:02:00 03258 stacking: ST1-CMDR: Member switch with Member ID 5 removed due to loss of communication.

    M 04/18/22 00:51:07 00064 system: ST5-MMBR: Software exception at lava_chassis_slot_sm.c:3626 -- in 'eChassMgr', task ID = 0x37b07bc0

    M 04/18/22 00:49:53 02796 chassis: ST5-UKWN: Internal power supply 1 inserted. Total fault count: 0.

    M 04/18/22 00:49:53 02797 chassis: ST5-UKWN: Internal power supply 1 is OK. Total fault count: 0.

    M 04/18/22 00:50:58 04702 chassis: ST1-CMDR: Slot 5/1-52 Software exception in

    W 04/18/22 00:49:34 03258 stacking: ST1-CMDR: Member switch with Member ID 5 removed due to loss of communication

    W 04/18/22 00:49:34 03270 stacking: ST1-CMDR: Topology is a Chain

    M 04/17/22 21:11:22 00064 system: ST5-MMBR: Software exception at lava_chassis_slot_sm.c:3626 -- in 'eChassMgr', task ID = 0x37b07bc0

    M 04/17/22 21:10:12 02796 chassis: ST5-UKWN: Internal power supply 1 inserted. Total fault count: 0.

    M 04/17/22 21:10:12 02797 chassis: ST5-UKWN: Internal power supply 1 is OK. Total fault count: 0.

    M 04/17/22 21:11:12 04702 chassis: ST1-CMDR: Slot 5/1-52 Software exception in

    04/17/22 21:01:17 03125 mgr: ST2-STBY: Startup configuration changed by SNMP. New seq. number 217

    W 04/17/22 21:09:56 03258 stacking: ST1-CMDR: Member switch with Member ID 5 removed due to loss of communication

    W 04/17/22 21:09:56 03270 stacking: ST1-CMDR: Topology is a Chain

    M 04/17/22 21:09:03 00064 system: ST5-MMBR: Software exception at lava_chassis_slot_sm.c:3626 -- in 'eChassMgr', task ID = 0x37b07bc0

    M 04/17/22 21:07:07 02796 chassis: ST5-UKWN: Internal power supply 1 inserted. Total fault count: 0.

    M 04/17/22 21:07:07 02797 chassis: ST5-UKWN: Internal power supply 1 is OK. Total fault count: 0.

    M 04/17/22 21:08:54 04702 chassis: ST1-CMDR: Slot 5/1-52 Software exception in

    W 04/17/22 21:06:08 03258 stacking: ST1-CMDR: Member switch with Member ID 5 removed due to loss of communication

    M 04/17/22 21:02:52 00064 system: ST5-MMBR: Software exception at lava_chassis_slot_sm.c:3626 -- in 'eChassMgr', task ID = 0x37b07bc0

    M 04/17/22 21:01:41 02796 chassis: ST5-UKWN: Internal power supply 1 inserted. Total fault count: 0.

    M 04/17/22 21:01:41 02797 chassis: ST5-UKWN: Internal power supply 1 is OK. Total fault count: 0.

    M 04/17/22 21:02:42 04702 chassis: ST1-CMDR: Slot 5/1-52 Software exception in

    04/17/22 20:52:47 03125 mgr: ST2-STBY: Startup configuration changed by SNMP. New seq. number 211

    W 04/17/22 21:01:27 03258 stacking: ST1-CMDR: Member switch with Member ID 5 removed due to loss of communication

    M 04/17/22 21:00:10 00064 system: ST5-MMBR: Software exception at lava_chassis_slot_sm.c:3626 -- in 'eChassMgr', task ID = 0x37b07bc0

    M 04/17/22 20:59:01 02796 chassis: ST5-UKWN: Internal power supply 1 inserted. Total fault count: 0.

    M 04/17/22 20:59:01 02797 chassis: ST5-UKWN: Internal power supply 1 is OK. Total fault count: 0.

    M 04/17/22 21:00:01 04702 chassis: ST1-CMDR: Slot 5/1-52 Software exception in

    04/17/22 20:50:06 03125 mgr: ST2-STBY: Startup configuration changed by SNMP. New seq. number 207

    Any help in identifying the source of these faults would be greatly appreciated.
    Best regards

    Gabriel


    ------------------------------
    Gabriel Mancuse
    ------------------------------
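
With this many entries, counting events per member can make the pattern easier to see than reading the dump by eye. The following is a minimal sketch, not an official tool: it assumes the single-line entry layout shown in the post above (severity letter, date, time, event ID, subsystem, `STn-ROLE` tag, message), and it assumes that event 02796 ("Internal power supply 1 inserted") logged by a member marks that member starting up, which is what the sequence in the post suggests. The function names and `SAMPLE` are illustrative only.

```python
import re
from collections import Counter

# Layout assumed from the entries quoted in the thread:
# <sev> <MM/DD/YY> <HH:MM:SS> <event id> <subsystem>: ST<member>-<role>: <message>
ENTRY = re.compile(
    r"^(?P<sev>[A-Z]) (?P<date>\d{2}/\d{2}/\d{2}) (?P<time>\d{2}:\d{2}:\d{2}) "
    r"(?P<event>\d{5}) (?P<subsys>\w+): ST(?P<member>\d+)-(?P<role>\w+): (?P<msg>.*)$"
)
REMOVED = re.compile(r"Member ID (\d+) removed")


def parse(line):
    """Return the fields of one log entry as a dict, or None if it does not match."""
    match = ENTRY.match(line.strip())
    return match.groupdict() if match else None


def removals_per_member(lines):
    """Count event 03258 (member removed, loss of communication) per removed member."""
    counts = Counter()
    for line in lines:
        entry = parse(line)
        if entry and entry["event"] == "03258":
            hit = REMOVED.search(entry["msg"])
            if hit:
                counts[int(hit.group(1))] += 1
    return counts


def startups_per_member(lines):
    """Count event 02796 per reporting member (assumed here to indicate a startup)."""
    counts = Counter()
    for line in lines:
        entry = parse(line)
        if entry and entry["event"] == "02796":
            counts[int(entry["member"])] += 1
    return counts


# A few entries copied from the post above.
SAMPLE = """\
M 04/18/22 04:59:30 00064 system: ST5-MMBR: Software exception at lava_chassis_slot_sm.c:3626 -- in 'eChassMgr', task ID = 0x37b07bc0
M 04/18/22 04:58:21 02796 chassis: ST5-UKWN: Internal power supply 1 inserted. Total fault count: 0.
W 04/18/22 04:58:02 03258 stacking: ST1-CMDR: Member switch with Member ID 5 removed due to loss of communication
W 04/18/22 04:58:02 03270 stacking: ST1-CMDR: Topology is a Chain
""".splitlines()

if __name__ == "__main__":
    print(removals_per_member(SAMPLE))
    print(startups_per_member(SAMPLE))
```

If every removal and every startup lands on the same member ID, that points at one unit rather than at the stack as a whole. It does not by itself distinguish a hardware fault from a software one.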


  • 2.  RE: Help with possible physical failure Aruba 2930F stack

    MVP GURU
    Posted Apr 19, 2022 02:17 AM
    Hello Gabriel, I strongly suggest you update your Aruba 2930F (WC family) VSF stack to the latest ArubaOS-Switch software release, WC.16.10.0020 (released in March 2022). Among other things, that build fixes Bug ID 256274:


    which is the issue your stack seems to be suffering from (note that your current ArubaOS-Switch software version dates back to June 2020).

    Please read the relevant Release Notes here and be prepared to perform a full VSF stack reboot, as required by the VSF stack update procedure.

    ------------------------------
    Davide Poletto
    ------------------------------



  • 3.  RE: Help with possible physical failure Aruba 2930F stack

    Posted May 05, 2022 09:50 AM
    Hi Davide, about 5 days ago I updated the firmware to version WC_16.10.0020. Unfortunately the problems seem to have worsened. Today the whole stack seems to have rebooted.
    The stack worked fine while it consisted of only the four JL259A members, but things seem to have deteriorated since the two JL260A switches were added about 40 days ago.
    I am finding it necessary to remove them. I cannot tolerate these failures.
    I am sending you the relevant logs in case you see anything I can do to fix these problems.
    Any recommendations you can give me would be more than appreciated.


    M 05/05/22 05:55:33 02796 chassis: ST1-UKWN: Internal power supply 1 inserted.
    Total fault count: 0.
    M 05/05/22 05:55:33 02797 chassis: ST1-UKWN: Internal power supply 1 is OK.
    Total fault count: 0.
    M 05/05/22 05:56:12 00064 system: ST1-CMDR: Software exception at
    lava_chassis_slot_sm.c:3626 -- in 'eChassMgr', task ID = 0x37b07bc0
    M 05/05/22 05:56:17 04702 chassis: ST1-CMDR: Ports 1/1-28 Software exception in
    ISR at pvDmaV1Rx.c:3042
    -> Internal Msg Problem

    M 05/05/22 05:56:17 04702 chassis: ST1-CMDR: Ports 2/1-28 Software exception in
    ISR at pvDmaV1Rx.c:3042
    -> Internal Msg Problem

    M 05/05/22 05:56:17 04702 chassis: ST1-CMDR: Ports 3/1-28 Software exception in
    ISR at pvDmaV1Rx.c:3042
    -> Internal Msg Problem

    M 05/05/22 05:56:17 04702 chassis: ST1-CMDR: Ports 4/1-28 Software exception in
    ISR at pvDmaV1Rx.c:3042
    M 05/05/22 05:56:17 04702 chassis: ST1-CMDR: Ports 6/1-52 Software exception in
    ISR at pvDmaV1Rx.c:3042
    -> Internal Msg Problem

    M 05/05/22 05:55:47 02796 chassis: ST5-UKWN: Internal power supply 1 inserted.
    Total fault count: 0.
    M 05/05/22 05:55:47 02797 chassis: ST5-UKWN: Internal power supply 1 is OK.
    Total fault count: 0.
    M 05/05/22 05:56:20 00064 system: ST5-MMBR: Reboot of Member ID 5, Lost
    commander and standby
    M 05/05/22 05:55:24 02796 chassis: ST3-UKWN: Internal power supply 1 inserted.
    Total fault count: 0.
    M 05/05/22 05:55:24 02797 chassis: ST3-UKWN: Internal power supply 1 is OK.
    Total fault count: 0.
    M 05/05/22 05:55:24 02796 chassis: ST4-UKWN: Internal power supply 1 inserted.
    Total fault count: 0.
    M 05/05/22 05:55:24 02797 chassis: ST4-UKWN: Internal power supply 1 is OK.
    Total fault count: 0.
    M 05/05/22 05:56:20 00064 system: ST3-MMBR: Software exception at
    lava_chassis_slot_sm.c:3626 -- in 'eChassMgr', task ID = 0x37b07bc0
    M 05/05/22 05:56:19 00064 system: ST4-MMBR: Software exception at
    M 05/05/22 05:56:28 00064 system: ST2-STBY: Software exception at
    lava_chassis_slot_sm.c:3626 -- in 'eChassMgr', task ID = 0x37b07bc0
    M 05/05/22 05:55:25 02796 chassis: ST6-UKWN: Internal power supply 1 inserted.
    Total fault count: 0.
    M 05/05/22 05:55:25 02797 chassis: ST6-UKWN: Internal power supply 1 is OK.
    Total fault count: 0.
    M 05/05/22 05:57:30 00064 system: ST6-MMBR: Software exception at
    lava_chassis_slot_sm.c:3626 -- in 'eChassMgr', task ID = 0x37b07bc0
    M 05/05/22 07:15:50 04702 chassis: ST1-CMDR: Ports 5/1-52 Software exception in
    ISR at interrupts_clue.c:126
    -> CLUE Poll Watchdog 0x9b580000
    0xadb6adb6 0x62316231 0x00000000 0x00000000 0x5f365f36 0x00000008

    M 05/05/22 07:14:46 02796 chassis: ST5-UKWN: Internal power supply 1 inserted.
    Total fault count: 0.
    M 05/05/22 07:14:46 02797 chassis: ST5-UKWN: Internal power supply 1 is OK.
    Total fault count: 0.
    M 05/05/22 07:15:56 00064 system: ST5-MMBR: Software exception at
    lava_chassis_slot_sm.c:3626 -- in 'eChassMgr', task ID = 0x37b07bc0
    M 05/05/22 07:22:50 04702 chassis: ST1-CMDR: Ports 5/1-52 Software exception in
    ISR at interrupts_clue.c:126
    -> CLUE Poll Watchdog 0x9b580000
    M 05/05/22 07:21:46 02796 chassis: ST5-UKWN: Internal power supply 1 inserted.
    Total fault count: 0.
    M 05/05/22 07:21:46 02797 chassis: ST5-UKWN: Internal power supply 1 is OK.
    Total fault count: 0.
    M 05/05/22 07:22:56 00064 system: ST5-MMBR: Software exception at
    lava_chassis_slot_sm.c:3626 -- in 'eChassMgr', task ID = 0x37b07bc0
    M 05/05/22 08:03:14 04702 chassis: ST1-CMDR: Ports 5/1-52 Software exception in
    ISR at interrupts_clue.c:126
    -> CLUE Poll Watchdog 0x9b580000
    0xcc43cc43 0x04690469 0x00000000 0x00000000 0xf242f242 0x00000008

    M 05/05/22 08:02:10 02796 chassis: ST5-UKWN: Internal power supply 1 inserted.
    Total fault count: 0.
    M 05/05/22 08:02:10 02797 chassis: ST5-UKWN: Internal power supply 1 is OK.
    Total fault count: 0.
    M 05/05/22 08:03:20 00064 system: ST5-MMBR: Software exception at
    lava_chassis_slot_sm.c:3626 -- in 'eChassMgr', task ID = 0x37b07bc0
    M 05/05/22 09:06:34 04702 chassis: ST1-CMDR: Ports 5/1-52 Software exception in
    ISR at interrupts_clue.c:126
    -> CLUE Poll Watchdog 0x9b580000
    0x8db58db5 0xbc19bc19 0x00000000 0x00000000 0xe5a3e5a3 0x00000008
    M 05/05/22 09:05:30 02796 chassis: ST5-UKWN: Internal power supply 1 inserted.
    Total fault count: 0.
    M 05/05/22 09:05:30 02797 chassis: ST5-UKWN: Internal power supply 1 is OK.
    Total fault count: 0.
    M 05/05/22 09:06:40 00064 system: ST5-MMBR: Software exception at
    lava_chassis_slot_sm.c:3626 -- in 'eChassMgr', task ID = 0x37b07bc0
    M 05/05/22 09:08:53 04702 chassis: ST1-CMDR: Ports 5/1-52 Software exception in
    ISR at interrupts_clue.c:126
    -> CLUE Poll Watchdog 0x9b580000
    0x37fc37fc 0x4f784f78 0x00000000 0x00000000 0x336b336b 0x00000008

    M 05/05/22 09:07:49 02796 chassis: ST5-UKWN: Internal power supply 1 inserted.
    Total fault count: 0.
    M 05/05/22 09:07:49 02797 chassis: ST5-UKWN: Internal power supply 1 is OK.
    Total fault count: 0.
    M 05/05/22 09:08:59 00064 system: ST5-MMBR: Software exception at
    lava_chassis_slot_sm.c:3626 -- in 'eChassMgr', task ID = 0x37b07bc0
    M 05/05/22 09:23:19 04702 chassis: ST1-CMDR: Ports 5/1-52 Software exception in
    ISR at interrupts_clue.c:126
    -> CLUE Poll Watchdog 0x9b580000
    0x6a5c6a5c 0xba77ba77 0x00000000 0x00000000 0xb96fb96f 0x00000008
    M 05/05/22 09:22:15 02796 chassis: ST5-UKWN: Internal power supply 1 inserted.
    Total fault count: 0.
    M 05/05/22 09:22:15 02797 chassis: ST5-UKWN: Internal power supply 1 is OK.
    Total fault count: 0.
    M 05/05/22 09:23:25 00064 system: ST5-MMBR: Software exception at
    lava_chassis_slot_sm.c:3626 -- in 'eChassMgr', task ID = 0x37b07bc0




    ------------------------------
    Gabriel Mancuse
    ------------------------------
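
The log in this post is wrapped across several lines per entry and mixes different exception sites (`lava_chassis_slot_sm.c:3626`, `pvDmaV1Rx.c:3042`, `interrupts_clue.c:126`). A small sketch like the one below can rejoin the wrapped lines and tally which member each exception site is attributed to. It is only an illustration under stated assumptions: a new entry is assumed to start with a severity letter, date, time and five-digit event ID, and the member is taken from `Ports N/` or `Slot N/` when present, otherwise from the `STn-` tag. The helper names and `SAMPLE` are hypothetical.

```python
import re
from collections import Counter

# An entry is assumed to begin with: <sev> <MM/DD/YY> <HH:MM:SS> <5-digit event id>
START = re.compile(r"^[A-Z] \d{2}/\d{2}/\d{2} \d{2}:\d{2}:\d{2} \d{5} ")
SITE = re.compile(r"Software exception (?:at|in ISR at) (\S+\.c:\d+)")
PORTS = re.compile(r"(?:Ports|Slot) (\d+)/")
TAG = re.compile(r"ST(\d+)-")


def join_wrapped(lines):
    """Merge continuation lines into the entry that precedes them."""
    entries = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if START.match(line) or not entries:
            entries.append(line)
        else:
            entries[-1] += " " + line
    return entries


def member_of(entry):
    """Prefer the member named in 'Ports N/' or 'Slot N/', else the STn- tag."""
    match = PORTS.search(entry) or TAG.search(entry)
    return int(match.group(1)) if match else None


def exception_sites(lines):
    """Tally (member, 'file.c:line') for each entry with a complete exception site."""
    sites = Counter()
    for entry in join_wrapped(lines):
        match = SITE.search(entry)
        if match:
            sites[(member_of(entry), match.group(1))] += 1
    return sites


# Entries copied from the post above, including the terminal line wrapping.
SAMPLE = """\
M 05/05/22 05:56:17 04702 chassis: ST1-CMDR: Ports 1/1-28 Software exception in
ISR at pvDmaV1Rx.c:3042
-> Internal Msg Problem

M 05/05/22 07:15:50 04702 chassis: ST1-CMDR: Ports 5/1-52 Software exception in
ISR at interrupts_clue.c:126
-> CLUE Poll Watchdog 0x9b580000
M 05/05/22 07:15:56 00064 system: ST5-MMBR: Software exception at
lava_chassis_slot_sm.c:3626 -- in 'eChassMgr', task ID = 0x37b07bc0
""".splitlines()

if __name__ == "__main__":
    for (member, site), count in sorted(exception_sites(SAMPLE).items()):
        print(member, site, count)
```

Entries whose continuation line is missing from the paste (a "Software exception at" with nothing after it) are simply not counted, since the site cannot be recovered from them.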



  • 4.  RE: Help with possible physical failure Aruba 2930F stack

    Posted May 05, 2022 11:30 AM
    I am replying to myself to upload a log file of my Stack from today.

    ------------------------------
    Gabriel Mancuse
    ------------------------------

    Attachment(s)

    txt
    putty_2.txt   634 KB 1 version


  • 5.  RE: Help with possible physical failure Aruba 2930F stack
    Best Answer

    EMPLOYEE
    Posted May 05, 2022 11:55 AM
    Hi Gabriel,

    I think you should contact Aruba support for further help and analysis.
    To me it looks like the problem is with member 5 only (it reboots a few times a day and probably caused the whole stack to reboot). Have you tried forming the ring topology without member 5, or replacing it if you have a spare unit in stock? My idea is to isolate member 5.

    ------------------------------
    Stanislav Naydenov
    ------------------------------



  • 6.  RE: Help with possible physical failure Aruba 2930F stack

    EMPLOYEE
    Posted May 06, 2022 11:11 AM
    This looks like a possible hardware issue with the stack member in question. This could be tested by taking the following actions:

    1. Back up the stack configuration
    2. Remove member 5 from the stack configuration:  switch(config)# no vsf member 5
    3. After the member is removed and has been shut down, physically disconnect it from the rest of the stack
    4. Power-cycle the removed switch and connect to its console port
    5. Log in and zeroize the switch:  switch# erase all zeroize

    Allow the switch to run for a while as a standalone device and monitor for crashes/reboots. If they continue to occur, open a TAC case. If the switch operates normally, try re-adding it to the stack and continue to monitor for crashes or reboots; if they start occurring again, open a TAC case.

    ------------------------------
    Matt Fern
    Sr. Technical Marketing Engineer, Aruba Switching
    Aruba, a Hewlett Packard Enterprise company
    ------------------------------
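
For the "run it standalone and monitor for crashes/reboots" step above, one simple way to count reboots is to sample the switch's uptime periodically and look for the value going backwards. The sketch below covers only that counting logic. How the samples are collected (for example from the CLI or over SNMP) is deliberately left out, and the `count_reboots` name and the sample values are hypothetical.

```python
from typing import List, Tuple


def count_reboots(samples: List[Tuple[float, float]]) -> int:
    """Count inferred reboots in time-ordered (wall_clock_s, uptime_s) samples.

    A reboot is inferred whenever uptime is lower than in the previous sample.
    Several reboots between two consecutive samples are counted as one, so the
    result is a lower bound that improves with a shorter sampling interval.
    """
    reboots = 0
    for (_, prev_uptime), (_, cur_uptime) in zip(samples, samples[1:]):
        if cur_uptime < prev_uptime:
            reboots += 1
    return reboots


# Hypothetical samples taken every 10 minutes (values in seconds).
SAMPLES = [(0, 100), (600, 700), (1200, 45), (1800, 645), (2400, 30)]

if __name__ == "__main__":
    print(count_reboots(SAMPLES))
```

Comparing the count for the isolated switch with the count after it rejoins the stack gives a rough basis for the TAC case either way.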



  • 7.  RE: Help with possible physical failure Aruba 2930F stack

    Posted May 13, 2022 08:56 AM
    Thank you for your help. I isolated member 5 and have had no further failures on my stack so far. I left the isolated switch powered on and monitored it: it resets itself about 5 or 6 times per day. I opened a TAC case to request a replacement.

    Best regards.

    ------------------------------
    Gabriel Mancuse
    ------------------------------



  • 8.  RE: Help with possible physical failure Aruba 2930F stack

    Posted May 21, 2022 05:37 AM
    Clean all your fiber cables and transceivers...
    monitor transceiver Tx/Rx power [e.g. sh interfaces transceiver 1/51 detail]
    and status/errors [e.g. sh interface 1/52 hc].

    ------------------------------
    Steinar Grande
    ------------------------------