Wired Intelligent Edge

Help with possible physical failure Aruba 2930F stack

  • 1.  Help with possible physical failure Aruba 2930F stack

    Posted Apr 18, 2022 10:43 AM
    Hi, I have a stack formed by 4 JL259A and 2 JL260A switches, all running software version WC.16.10.0009 and ROM version 16.01.0008.
    Members 1 to 4 are JL259A and members 5 and 6 are JL260A. The stack uses a ring topology.

    Four days ago I noticed that member 5 logs many errors once or twice a day, briefly breaking the ring topology into a chain. I'm not sure, but maybe it has a power supply failure and restarts every so often.
    What I want to know is whether these errors are physical, i.e. hardware errors, so that I can send the equipment in under the manufacturer's warranty.

    These are the related logs:

    M 04/18/22 04:59:30 00064 system: ST5-MMBR: Software exception at lava_chassis_slot_sm.c:3626 -- in 'eChassMgr', task ID = 0x37b07bc0

    M 04/18/22 04:58:21 02796 chassis: ST5-UKWN: Internal power supply 1 inserted. Total fault count: 0.

    M 04/18/22 04:58:21 02797 chassis: ST5-UKWN: Internal power supply 1 is OK. Total fault count: 0.

    M 04/18/22 04:59:22 04702 chassis: ST1-CMDR: Slot 5/1-52 Software exception in 04/18/22 04:59:16 04992 vsf: ST6-MMBR: VSF link 1 is up

    W 04/18/22 04:58:02 03258 stacking: ST1-CMDR: Member switch with Member ID 5 removed due to loss of communication

    W 04/18/22 04:58:02 03270 stacking: ST1-CMDR: Topology is a Chain

    M 04/18/22 01:50:42 00064 system: ST5-MMBR: Software exception at lava_chassis_slot_sm.c:3626 -- in 'eChassMgr', task ID = 0x37b07bc0

    M 04/18/22 01:49:31 02796 chassis: ST5-UKWN: Internal power supply 1 inserted. Total fault count: 0.

    M 04/18/22 01:49:31 02797 chassis: ST5-UKWN: Internal power supply 1 is OK. Total fault count: 0.

    M 04/18/22 01:50:32 04702 chassis: ST1-CMDR: Slot 5/1-52 Software exception in 04/18/22 01:50:32 03125 mgr: ST1-CMDR: Startup configuration changed by SNMP. New seq. number 155

    W 04/18/22 01:49:15 03270 stacking: ST1-CMDR: Topology is a Chain

    W 04/18/22 01:49:15 03258 stacking: ST1-CMDR: Member switch with Member ID 5 removed due to loss of communication

    M 04/18/22 01:03:25 00064 system: ST5-MMBR: Software exception at lava_chassis_slot_sm.c:3626 -- in 'eChassMgr', task ID = 0x37b07bc0

    M 04/18/22 01:02:17 02796 chassis: ST5-UKWN: Internal power supply 1 inserted. Total fault count: 0.

    M 04/18/22 01:02:17 02797 chassis: ST5-UKWN: Internal power supply 1 is OK. Total fault count: 0.

    M 04/18/22 01:03:17 04702 chassis: ST1-CMDR: Slot 5/1-52 Software exception in 04/18/22 00:53:22 03125 mgr: ST2-STBY: Startup configuration changed by SNMP. New seq. number 224

    W 04/18/22 01:02:00 03258 stacking: ST1-CMDR: Member switch with Member ID 5 removed due to loss of communication.

    M 04/18/22 00:51:07 00064 system: ST5-MMBR: Software exception at lava_chassis_slot_sm.c:3626 -- in 'eChassMgr', task ID = 0x37b07bc0

    M 04/18/22 00:49:53 02796 chassis: ST5-UKWN: Internal power supply 1 inserted. Total fault count: 0.

    M 04/18/22 00:49:53 02797 chassis: ST5-UKWN: Internal power supply 1 is OK. Total fault count: 0.

    M 04/18/22 00:50:58 04702 chassis: ST1-CMDR: Slot 5/1-52 Software exception in

    W 04/18/22 00:49:34 03258 stacking: ST1-CMDR: Member switch with Member ID 5 removed due to loss of communication

    W 04/18/22 00:49:34 03270 stacking: ST1-CMDR: Topology is a Chain

    M 04/17/22 21:11:22 00064 system: ST5-MMBR: Software exception at lava_chassis_slot_sm.c:3626 -- in 'eChassMgr', task ID = 0x37b07bc0

    M 04/17/22 21:10:12 02796 chassis: ST5-UKWN: Internal power supply 1 inserted. Total fault count: 0.

    M 04/17/22 21:10:12 02797 chassis: ST5-UKWN: Internal power supply 1 is OK. Total fault count: 0.

    M 04/17/22 21:11:12 04702 chassis: ST1-CMDR: Slot 5/1-52 Software exception in 04/17/22 21:01:17 03125 mgr: ST2-STBY: Startup configuration changed by SNMP. New seq. number 217

    W 04/17/22 21:09:56 03258 stacking: ST1-CMDR: Member switch with Member ID 5 removed due to loss of communication

    W 04/17/22 21:09:56 03270 stacking: ST1-CMDR: Topology is a Chain

    M 04/17/22 21:09:03 00064 system: ST5-MMBR: Software exception at lava_chassis_slot_sm.c:3626 -- in 'eChassMgr', task ID = 0x37b07bc0

    M 04/17/22 21:07:07 02796 chassis: ST5-UKWN: Internal power supply 1 inserted. Total fault count: 0.

    M 04/17/22 21:07:07 02797 chassis: ST5-UKWN: Internal power supply 1 is OK. Total fault count: 0.

    M 04/17/22 21:08:54 04702 chassis: ST1-CMDR: Slot 5/1-52 Software exception in

    W 04/17/22 21:06:08 03258 stacking: ST1-CMDR: Member switch with Member ID 5 removed due to loss of communication

    M 04/17/22 21:02:52 00064 system: ST5-MMBR: Software exception at lava_chassis_slot_sm.c:3626 -- in 'eChassMgr', task ID = 0x37b07bc0

    M 04/17/22 21:01:41 02796 chassis: ST5-UKWN: Internal power supply 1 inserted. Total fault count: 0.

    M 04/17/22 21:01:41 02797 chassis: ST5-UKWN: Internal power supply 1 is OK. Total fault count: 0.

    M 04/17/22 21:02:42 04702 chassis: ST1-CMDR: Slot 5/1-52 Software exception in 04/17/22 20:52:47 03125 mgr: ST2-STBY: Startup configuration changed by SNMP. New seq. number 211

    W 04/17/22 21:01:27 03258 stacking: ST1-CMDR: Member switch with Member ID 5 removed due to loss of communication

    M 04/17/22 21:00:10 00064 system: ST5-MMBR: Software exception at lava_chassis_slot_sm.c:3626 -- in 'eChassMgr', task ID = 0x37b07bc0

    M 04/17/22 20:59:01 02796 chassis: ST5-UKWN: Internal power supply 1 inserted. Total fault count: 0.

    M 04/17/22 20:59:01 02797 chassis: ST5-UKWN: Internal power supply 1 is OK. Total fault count: 0.

    M 04/17/22 21:00:01 04702 chassis: ST1-CMDR: Slot 5/1-52 Software exception in 04/17/22 20:50:06 03125 mgr: ST2-STBY: Startup configuration changed by SNMP. New seq. number 207

    Any help in identifying the source of these faults would be greatly appreciated.
    Best regards

    Gabriel


    ------------------------------
    Gabriel Mancuse
    ------------------------------
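
The log excerpt in the post above can be tallied per stack member to see whether the failures concentrate on one unit. The following is a minimal Python sketch, assuming each event sits on a single line in the format shown in the post (severity, date, time, event ID, subsystem, `STn-ROLE` tag, message). The function name `tally` and both regular expressions are illustrative and not part of any Aruba tooling.

```python
import re
from collections import Counter

# Assumed layout: "M 04/18/22 04:59:30 00064 system: ST5-MMBR: <message>"
LINE_RE = re.compile(
    r"^(?P<sev>[A-Z]) (?P<date>\d{2}/\d{2}/\d{2}) (?P<time>\d{2}:\d{2}:\d{2}) "
    r"(?P<event_id>\d{5}) (?P<subsystem>\w+): "
    r"ST(?P<member>\d+)-(?P<role>\w+): (?P<msg>.*)$"
)
REMOVED_RE = re.compile(r"Member ID (\d+) removed due to loss of communication")


def tally(lines):
    """Count stack removals per removed member and software
    exceptions per reporting member."""
    removed = Counter()
    exceptions = Counter()
    for raw in lines:
        m = LINE_RE.match(raw.strip())
        if not m:
            continue  # skip lines that do not fit the assumed layout
        msg = m.group("msg")
        hit = REMOVED_RE.search(msg)
        if hit:
            removed[int(hit.group(1))] += 1
        elif m.group("subsystem") == "system" and msg.startswith("Software exception"):
            exceptions[int(m.group("member"))] += 1
    return removed, exceptions


# Sample lines copied from the post above.
sample = [
    "W 04/18/22 04:58:02 03258 stacking: ST1-CMDR: Member switch with Member ID 5 removed due to loss of communication",
    "W 04/18/22 04:58:02 03270 stacking: ST1-CMDR: Topology is a Chain",
    "M 04/18/22 04:59:30 00064 system: ST5-MMBR: Software exception at lava_chassis_slot_sm.c:3626 -- in 'eChassMgr', task ID = 0x37b07bc0",
    "W 04/18/22 01:49:15 03258 stacking: ST1-CMDR: Member switch with Member ID 5 removed due to loss of communication",
    "M 04/18/22 01:50:42 00064 system: ST5-MMBR: Software exception at lava_chassis_slot_sm.c:3626 -- in 'eChassMgr', task ID = 0x37b07bc0",
]

removed, exceptions = tally(sample)
print("removals per member:", dict(removed))
print("software exceptions per member:", dict(exceptions))
```

If every removal and every `system` exception points at the same member ID, as in this excerpt, that supports isolating that one unit before suspecting the rest of the stack.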


  • 2.  RE: Help with possible physical failure Aruba 2930F stack

    MVP GURU
    Posted Apr 19, 2022 02:17 AM
    Hello Gabriel, I strongly suggest updating your Aruba 2930F (WC family) VSF stack to the latest ArubaOS-Switch software release, WC.16.10.0020 (released in March 2022). Among other things, that build fixes Bug ID 256274, which appears to be the issue your stack is suffering from (note that your current ArubaOS-Switch software version dates back to June 2020).

    Please read the relevant Release Notes and be prepared to perform a full VSF stack reboot, as per the VSF stack update procedure.

    ------------------------------
    Davide Poletto
    ------------------------------



  • 3.  RE: Help with possible physical failure Aruba 2930F stack

    Posted May 05, 2022 09:50 AM
    Hi Davide, about 5 days ago I updated the firmware to version WC_16.10.0020. Unfortunately the problems seem to have worsened. Today the whole stack seems to have rebooted.
    The stack was working fine as long as it consisted of only the 4 JL259A members, but everything seems to have gotten worse since 2 JL260A switches were added to the stack about 40 days ago.
    I am finding it necessary to remove them. I cannot tolerate these failures.
    I am sending you the relevant logs in case you see anything I can do to fix these problems.
    Any recommendations you can give me will be more than appreciated.


    M 05/05/22 05:55:33 02796 chassis: ST1-UKWN: Internal power supply 1 inserted. Total fault count: 0.
    M 05/05/22 05:55:33 02797 chassis: ST1-UKWN: Internal power supply 1 is OK. Total fault count: 0.
    M 05/05/22 05:56:12 00064 system: ST1-CMDR: Software exception at lava_chassis_slot_sm.c:3626 -- in 'eChassMgr', task ID = 0x37b07bc0
    M 05/05/22 05:56:17 04702 chassis: ST1-CMDR: Ports 1/1-28 Software exception in ISR at pvDmaV1Rx.c:3042
    -> Internal Msg Problem

    M 05/05/22 05:56:17 04702 chassis: ST1-CMDR: Ports 2/1-28 Software exception in ISR at pvDmaV1Rx.c:3042
    -> Internal Msg Problem

    M 05/05/22 05:56:17 04702 chassis: ST1-CMDR: Ports 3/1-28 Software exception in ISR at pvDmaV1Rx.c:3042
    -> Internal Msg Problem

    M 05/05/22 05:56:17 04702 chassis: ST1-CMDR: Ports 4/1-28 Software exception in ISR at pvDmaV1Rx.c:3042
    M 05/05/22 05:56:17 04702 chassis: ST1-CMDR: Ports 6/1-52 Software exception in ISR at pvDmaV1Rx.c:3042
    -> Internal Msg Problem

    M 05/05/22 05:55:47 02796 chassis: ST5-UKWN: Internal power supply 1 inserted. Total fault count: 0.
    M 05/05/22 05:55:47 02797 chassis: ST5-UKWN: Internal power supply 1 is OK. Total fault count: 0.
    M 05/05/22 05:56:20 00064 system: ST5-MMBR: Reboot of Member ID 5, Lost commander and standby
    M 05/05/22 05:55:24 02796 chassis: ST3-UKWN: Internal power supply 1 inserted. Total fault count: 0.
    M 05/05/22 05:55:24 02797 chassis: ST3-UKWN: Internal power supply 1 is OK. Total fault count: 0.
    M 05/05/22 05:55:24 02796 chassis: ST4-UKWN: Internal power supply 1 inserted. Total fault count: 0.
    M 05/05/22 05:55:24 02797 chassis: ST4-UKWN: Internal power supply 1 is OK. Total fault count: 0.
    M 05/05/22 05:56:20 00064 system: ST3-MMBR: Software exception at lava_chassis_slot_sm.c:3626 -- in 'eChassMgr', task ID = 0x37b07bc0
    M 05/05/22 05:56:19 00064 system: ST4-MMBR: Software exception at
    M 05/05/22 05:56:28 00064 system: ST2-STBY: Software exception at lava_chassis_slot_sm.c:3626 -- in 'eChassMgr', task ID = 0x37b07bc0
    M 05/05/22 05:55:25 02796 chassis: ST6-UKWN: Internal power supply 1 inserted. Total fault count: 0.
    M 05/05/22 05:55:25 02797 chassis: ST6-UKWN: Internal power supply 1 is OK. Total fault count: 0.
    M 05/05/22 05:57:30 00064 system: ST6-MMBR: Software exception at lava_chassis_slot_sm.c:3626 -- in 'eChassMgr', task ID = 0x37b07bc0
    M 05/05/22 07:15:50 04702 chassis: ST1-CMDR: Ports 5/1-52 Software exception in ISR at interrupts_clue.c:126
    -> CLUE Poll Watchdog 0x9b580000
    0xadb6adb6 0x62316231 0x00000000 0x00000000 0x5f365f36 0x00000008

    M 05/05/22 07:14:46 02796 chassis: ST5-UKWN: Internal power supply 1 inserted. Total fault count: 0.
    M 05/05/22 07:14:46 02797 chassis: ST5-UKWN: Internal power supply 1 is OK. Total fault count: 0.
    M 05/05/22 07:15:56 00064 system: ST5-MMBR: Software exception at lava_chassis_slot_sm.c:3626 -- in 'eChassMgr', task ID = 0x37b07bc0
    M 05/05/22 07:22:50 04702 chassis: ST1-CMDR: Ports 5/1-52 Software exception in ISR at interrupts_clue.c:126
    -> CLUE Poll Watchdog 0x9b580000
    M 05/05/22 07:21:46 02796 chassis: ST5-UKWN: Internal power supply 1 inserted. Total fault count: 0.
    M 05/05/22 07:21:46 02797 chassis: ST5-UKWN: Internal power supply 1 is OK. Total fault count: 0.
    M 05/05/22 07:22:56 00064 system: ST5-MMBR: Software exception at lava_chassis_slot_sm.c:3626 -- in 'eChassMgr', task ID = 0x37b07bc0
    M 05/05/22 08:03:14 04702 chassis: ST1-CMDR: Ports 5/1-52 Software exception in ISR at interrupts_clue.c:126
    -> CLUE Poll Watchdog 0x9b580000
    0xcc43cc43 0x04690469 0x00000000 0x00000000 0xf242f242 0x00000008

    M 05/05/22 08:02:10 02796 chassis: ST5-UKWN: Internal power supply 1 inserted. Total fault count: 0.
    M 05/05/22 08:02:10 02797 chassis: ST5-UKWN: Internal power supply 1 is OK. Total fault count: 0.
    M 05/05/22 08:03:20 00064 system: ST5-MMBR: Software exception at lava_chassis_slot_sm.c:3626 -- in 'eChassMgr', task ID = 0x37b07bc0
    M 05/05/22 09:06:34 04702 chassis: ST1-CMDR: Ports 5/1-52 Software exception in ISR at interrupts_clue.c:126
    -> CLUE Poll Watchdog 0x9b580000
    0x8db58db5 0xbc19bc19 0x00000000 0x00000000 0xe5a3e5a3 0x00000008
    M 05/05/22 09:05:30 02796 chassis: ST5-UKWN: Internal power supply 1 inserted. Total fault count: 0.
    M 05/05/22 09:05:30 02797 chassis: ST5-UKWN: Internal power supply 1 is OK. Total fault count: 0.
    M 05/05/22 09:06:40 00064 system: ST5-MMBR: Software exception at lava_chassis_slot_sm.c:3626 -- in 'eChassMgr', task ID = 0x37b07bc0
    M 05/05/22 09:08:53 04702 chassis: ST1-CMDR: Ports 5/1-52 Software exception in ISR at interrupts_clue.c:126
    -> CLUE Poll Watchdog 0x9b580000
    0x37fc37fc 0x4f784f78 0x00000000 0x00000000 0x336b336b 0x00000008

    M 05/05/22 09:07:49 02796 chassis: ST5-UKWN: Internal power supply 1 inserted. Total fault count: 0.
    M 05/05/22 09:07:49 02797 chassis: ST5-UKWN: Internal power supply 1 is OK. Total fault count: 0.
    M 05/05/22 09:08:59 00064 system: ST5-MMBR: Software exception at lava_chassis_slot_sm.c:3626 -- in 'eChassMgr', task ID = 0x37b07bc0
    M 05/05/22 09:23:19 04702 chassis: ST1-CMDR: Ports 5/1-52 Software exception in ISR at interrupts_clue.c:126
    -> CLUE Poll Watchdog 0x9b580000
    0x6a5c6a5c 0xba77ba77 0x00000000 0x00000000 0xb96fb96f 0x00000008
    M 05/05/22 09:22:15 02796 chassis: ST5-UKWN: Internal power supply 1 inserted. Total fault count: 0.
    M 05/05/22 09:22:15 02797 chassis: ST5-UKWN: Internal power supply 1 is OK. Total fault count: 0.
    M 05/05/22 09:23:25 00064 system: ST5-MMBR: Software exception at lava_chassis_slot_sm.c:3626 -- in 'eChassMgr', task ID = 0x37b07bc0




    ------------------------------
    Gabriel Mancuse
    ------------------------------
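
To quantify how often member 5 restarts in the log above, the time between its boots can be computed. The sketch below is hedged: it assumes that the "Internal power supply 1 inserted" message is logged once per boot of a member, that each entry is on a single line, and that dates are in MM/DD/YY form (consistent with the post dates). The helper names `boot_times` and `gaps_seconds` are hypothetical.

```python
from datetime import datetime

BOOT_MARKER = "Internal power supply 1 inserted"  # assumed once-per-boot message


def boot_times(lines, member):
    """Return sorted timestamps of the assumed boot marker for one member."""
    tag = f": ST{member}-"
    times = []
    for line in lines:
        if tag in line and BOOT_MARKER in line:
            parts = line.split()
            # parts[1] is the date (MM/DD/YY), parts[2] the time (HH:MM:SS)
            stamp = datetime.strptime(f"{parts[1]} {parts[2]}", "%m/%d/%y %H:%M:%S")
            times.append(stamp)
    return sorted(times)


def gaps_seconds(times):
    """Seconds elapsed between consecutive boots."""
    return [int((b - a).total_seconds()) for a, b in zip(times, times[1:])]


# Sample entries taken from the log above, one entry per line.
sample = [
    "M 05/05/22 05:55:24 02796 chassis: ST3-UKWN: Internal power supply 1 inserted. Total fault count: 0.",
    "M 05/05/22 05:55:47 02796 chassis: ST5-UKWN: Internal power supply 1 inserted. Total fault count: 0.",
    "M 05/05/22 07:14:46 02796 chassis: ST5-UKWN: Internal power supply 1 inserted. Total fault count: 0.",
    "M 05/05/22 07:21:46 02796 chassis: ST5-UKWN: Internal power supply 1 inserted. Total fault count: 0.",
]

member5 = boot_times(sample, 5)
print("member 5 boots:", len(member5))
print("seconds between boots:", gaps_seconds(member5))
```

Irregular gaps on a single member, with no matching pattern on the other members, point toward that unit rather than a stack-wide condition, although only a test in isolation can confirm that.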



  • 4.  RE: Help with possible physical failure Aruba 2930F stack

    Posted May 05, 2022 11:30 AM
    I am replying to myself to upload a log file of my Stack from today.

    ------------------------------
    Gabriel Mancuse
    ------------------------------

    Attachment(s)

    putty_2.txt (634 KB)


  • 5.  RE: Help with possible physical failure Aruba 2930F stack
    Best Answer

    EMPLOYEE
    Posted May 05, 2022 11:55 AM
    Hi Gabriel,

    I think you should contact Aruba support for further help and analysis.
    To me it looks like the problem is with member 5 only (it is rebooting a few times a day and probably brought the whole stack down for a reboot). Did you try to form the ring topology without member 5, or even to replace it if you have a spare device in stock? My idea is to isolate member 5.

    ------------------------------
    Stanislav Naydenov
    ------------------------------



  • 6.  RE: Help with possible physical failure Aruba 2930F stack

    EMPLOYEE
    Posted May 06, 2022 11:11 AM
    This looks like a possible hardware issue with the stack member in question. This could be tested by taking the following actions:

    1. Back up the stack configuration
    2. Remove member 5 from the stack configuration:  switch(config)# no vsf member 5
    3. After the member is removed and has been shut down, physically disconnect it from the rest of the stack
    4. Power-cycle the removed switch and connect to its console port
    5. Log in and zeroize the switch:  switch# erase all zeroize

    Allow the switch to run for a while as a standalone device and monitor for crashes/reboots. If they continue to occur, open a TAC case. If the switch operates normally, try re-adding it to the stack and continue to monitor for crashes or reboots; if they start occurring again, open a TAC case.

    ------------------------------
    Matt Fern
    Sr. Technical Marketing Engineer, Aruba Switching
    Aruba, a Hewlett Packard Enterprise company
    ------------------------------
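
The decision path described in the post above can be written down as a small function. This is only a sketch of that reasoning, not an official Aruba procedure; the function name `next_step` and its crash-count parameters are assumptions made for illustration.

```python
from typing import Optional


def next_step(standalone_crashes: int, restacked_crashes: Optional[int] = None) -> str:
    """Suggest the next action for a suspect stack member.

    standalone_crashes: crashes/reboots seen while the zeroized switch
        ran as a standalone device.
    restacked_crashes: crashes/reboots seen after re-adding it to the
        stack, or None if it has not been re-added yet.
    """
    if standalone_crashes > 0:
        # Still failing outside the stack: treat as a unit fault.
        return "open TAC case"
    if restacked_crashes is None:
        # Clean standalone run: try it in the stack again.
        return "re-add to stack and monitor"
    if restacked_crashes > 0:
        # Failures return only inside the stack: escalate as well.
        return "open TAC case"
    return "keep monitoring"


print(next_step(standalone_crashes=5))
print(next_step(standalone_crashes=0))
print(next_step(standalone_crashes=0, restacked_crashes=2))
```

A switch that keeps resetting while isolated falls into the first branch, which matches the outcome reported later in the thread.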



  • 7.  RE: Help with possible physical failure Aruba 2930F stack

    Posted May 13, 2022 08:56 AM
    Thank you for your help. I isolated member 5 and have had no further failures on my stack so far. I left it on and started monitoring it. The switch resets itself about 5 or 6 times per day. I opened a TAC case to request replacement.

    Best regards.

    ------------------------------
    Gabriel Mancuse
    ------------------------------



  • 8.  RE: Help with possible physical failure Aruba 2930F stack

    Posted May 21, 2022 05:37 AM
    Clean all your fiber cables and transceivers,
    monitor the transceivers' Tx/Rx power [e.g. sh interfaces transceiver 1/51 detail],
    and check port status/errors [e.g. sh interface 1/52 hc].

    ------------------------------
    Steinar Grande
    ------------------------------