Controllerless Networks


Instant Mode - the controllerless Wi-Fi solution that's easy to set up, is loaded with security and smarts, and won't break your budget

WAPs rebooting with watchdog timer or kernel panic causing chaos

  • 1.  WAPs rebooting with watchdog timer or kernel panic causing chaos

    Posted Mar 21, 2017 06:42 PM

    Hi,

     

    Hoping someone might have come across this before. We've recently deployed a fleet of IAP315 WAPs to replace our aging 125s and controller. The WAPs are operating in Instant mode.

     

    We're having major issues with WAPs rebooting as soon as they have a decent number of client connections. They reboot with one of these reasons listed in the output of the show version CLI command:

    Kernel panic - not syncing: softlockup: hung tasks (this is the main one)

    OR

    watchdog timer caused reboot.

     

     

    The APs work fine with up to 20-25 clients or so, but anything past that and they just spit the dummy, and it's causing chaos here. For example, I've got 4 WAPs in a building that currently has 120 users, 30 in each room. One WAP reboots, those 30 users jump onto the next closest WAP, and that causes that one to reboot too. This basically keeps taking down the entire building's Wi-Fi for hours on end.
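
    To make the chain reaction described above concrete, here is a rough Python sketch. It is a toy model, not a description of how Instant actually behaves: the crash threshold of 25 clients, the rule that all displaced clients move to the next AP in the list, and the function name simulate_cascade are all assumptions made for illustration, and it ignores APs coming back up after a reboot.

```python
def simulate_cascade(clients_per_ap, crash_threshold, first_failure):
    """Return the order in which APs go down once one AP fails.

    Assumptions (illustrative only): an AP crashes as soon as its client
    count exceeds crash_threshold, and every client of a failed AP moves
    to the next AP in the list that is still up.
    """
    load = list(clients_per_ap)
    up = [True] * len(load)
    down_order = []
    failing = first_failure
    while failing is not None:
        up[failing] = False
        down_order.append(failing)
        displaced, load[failing] = load[failing], 0
        # "Next closest" AP is modelled as the next index, wrapping around.
        target = None
        for step in range(1, len(load)):
            candidate = (failing + step) % len(load)
            if up[candidate]:
                target = candidate
                break
        if target is None:
            break  # every AP in the building is down
        load[target] += displaced
        failing = target if load[target] > crash_threshold else None
    return down_order


# Four APs with 30 clients each, as in the building described above.
print(simulate_cascade([30, 30, 30, 30], crash_threshold=25, first_failure=0))
```

    With these assumed numbers a single failure takes out every AP in turn, which matches what the building is seeing.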

     

    I have a case open with HPE at the moment, but I'm really hoping someone might have a stopgap solution, or something I can do to make it at least a little more reliable, before I get lynched here.

     

    All WAPs are configured by DHCP, VLANs are assigned to clients in round-robin fashion, and both 2.4 GHz and 5 GHz wireless networks are available. Fairness is set to default, and authentication uses a remote RADIUS server running Windows Server 2012. All clients also connect with a role that restricts bandwidth to 2 Mbps down and 768 kbps up.

     

    Thanks for the help everyone.

     

     

     



  • 2.  RE: WAPs rebooting with watchdog timer or kernel panic causing chaos

    EMPLOYEE
    Posted Mar 21, 2017 06:44 PM
    You didn't mention which version of Instant you're running. An open case is the best thing at this point.


  • 3.  RE: WAPs rebooting with watchdog timer or kernel panic causing chaos

    Posted Mar 21, 2017 06:57 PM

    Apologies, the version is 6.5.0.0-4.3.0.0_56428, which seems to be the latest I can get. I realise the open case is probably my best option; I was just hoping someone might have some advice to stop me getting killed. $60k on new Wi-Fi and it's less reliable than the 8-year-old setup :(



  • 4.  RE: WAPs rebooting with watchdog timer or kernel panic causing chaos

    Posted Mar 21, 2017 07:00 PM

    This is my morning so far:

     

    2017-03-22 09:44:57  135  System  a8:bd:27:c1:61:52  Access point a8:bd:27:c1:5f:26 is down
    2017-03-22 09:42:50  134  System  a8:bd:27:c1:61:52  Access point a8:bd:27:c1:61:d0 is down
    2017-03-22 09:42:08  133  System  a8:bd:27:c1:61:52  Access point a8:bd:27:c1:60:f8 is down
    2017-03-22 09:41:59  131  System  a8:bd:27:c1:61:52  Access point a8:bd:27:c1:61:c8 is down
    2017-03-22 09:41:27  130  System  a8:bd:27:c1:61:52  Access point a8:bd:27:c1:5f:26 is down
    2017-03-22 09:38:59  129  System  a8:bd:27:c1:61:52  Access point a8:bd:27:c1:60:f8 is down
    2017-03-22 09:36:55  128  System  a8:bd:27:c1:61:52  Access point a8:bd:27:c1:61:c8 is down
    2017-03-22 09:36:52  127  System  a8:bd:27:c1:61:52  Access point a8:bd:27:c1:61:d0 is down
    2017-03-22 09:36:42  126  System  a8:bd:27:c1:61:52  Access point a8:bd:27:c1:5f:26 is down
    2017-03-22 09:34:41  125  System  a8:bd:27:c1:61:52  Access point a8:bd:27:c1:60:f8 is down
    2017-03-22 09:32:41  124  System  a8:bd:27:c1:61:52  Access point a8:bd:27:c1:61:c8 is down
    2017-03-22 09:30:41  123  System  a8:bd:27:c1:61:52  Access point a8:bd:27:c1:5f:26 is down
    2017-03-22 09:29:41  122  System  a8:bd:27:c1:61:52  Access point a8:bd:27:c1:60:f8 is down
    2017-03-22 09:29:24  121  System  a8:bd:27:c1:61:52  Access point a8:bd:27:c1:61:d0 is down
    2017-03-22 09:28:34  120  System  a8:bd:27:c1:61:52  Access point a8:bd:27:c1:61:c8 is down
    2017-03-22 09:26:31  119  System  a8:bd:27:c1:61:52  Access point a8:bd:27:c1:5f:26 is down
    2017-03-22 09:26:20  118  System  a8:bd:27:c1:61:52  Access point a8:bd:27:c1:60:f8 is down
    2017-03-22 09:24:25  116  System  a8:bd:27:c1:61:52  Access point a8:bd:27:c1:61:d0 is down
    2017-03-22 09:24:25  117  System  a8:bd:27:c1:61:52  Access point a8:bd:27:c1:61:c8 is down
    2017-03-22 09:22:19  113  System  a8:bd:27:c1:61:52  Access point a8:bd:27:c1:61:b4 is down
    2017-03-22 09:22:04  114  System  a8:bd:27:c1:61:52  Access point a8:bd:27:c1:60:f8 is down
    2017-03-22 09:22:04  115  System  a8:bd:27:c1:61:52  Access point a8:bd:27:c1:5f:26 is down
    2017-03-22 09:20:14  111  System  a8:bd:27:c1:61:52  Access point a8:bd:27:c1:61:d0 is down
    2017-03-22 09:20:14  112  System  a8:bd:27:c1:61:52  Access point a8:bd:27:c1:61:c8 is down
    2017-03-22 09:18:09  110  System  a8:bd:27:c1:61:52  Access point a8:bd:27:c1:5f:26 is down
    2017-03-22 09:17:50  109  System  a8:bd:27:c1:61:52  Access point a8:bd:27:c1:60:f8 is down
    2017-03-22 09:16:05  108  System  a8:bd:27:c1:61:52  Access point a8:bd:27:c1:61:c8 is down
    2017-03-22 09:15:57  107  System  a8:bd:27:c1:61:52  Access point a8:bd:27:c1:61:d0 is down
    2017-03-22 09:14:18  105  System  a8:bd:27:c1:61:52  Access point a8:bd:27:c1:61:b4 is down
    2017-03-22 09:13:59  104  System  a8:bd:27:c1:61:52  Access point a8:bd:27:c1:60:f8 is down
    2017-03-22 09:13:59  106  System  a8:bd:27:c1:61:52  Access point a8:bd:27:c1:5f:26 is down
    2017-03-22 09:11:54  102  System  a8:bd:27:c1:61:52  Access point a8:bd:27:c1:61:c8 is down
    2017-03-22 09:11:49  101  System  a8:bd:27:c1:61:52  Access point a8:bd:27:c1:61:d0 is down
    2017-03-22 09:09:48  100  System  a8:bd:27:c1:61:52  Access point a8:bd:27:c1:60:f8 is down
    2017-03-22 09:09:43  99   System  a8:bd:27:c1:61:52  Access point a8:bd:27:c1:5f:26 is down
    2017-03-22 09:07:44  98   System  a8:bd:27:c1:61:52  Access point a8:bd:27:c1:61:b4 is down
    2017-03-22 09:06:39  95   System  a8:bd:27:c1:61:52  Access point a8:bd:27:c1:61:c8 is down
    2017-03-22 09:06:34  96   System  a8:bd:27:c1:61:52  Access point a8:bd:27:c1:61:d0 is down
    2017-03-22 09:05:31  94   System  a8:bd:27:c1:61:52  Access point a8:bd:27:c1:60:f8 is down
    2017-03-22 09:05:30  93   System  a8:bd:27:c1:61:52  Access point a8:bd:27:c1:5f:26 is down
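
    A list of alerts like this is easier to hand to TAC once it is tallied per AP. Below is a minimal Python sketch that counts the "is down" alerts per access point MAC address from pasted alert text. The function name down_events_per_ap and the regular expression are assumptions made for illustration; the sample text is three of the alerts above, and the sketch only relies on a timestamp appearing somewhere before each "Access point ... is down" message.

```python
import re
from collections import Counter

EVENT = re.compile(
    r"(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})"   # alert timestamp
    r"[\s\S]*?"                                  # ID, type and reporting MAC (skipped)
    r"Access point ([0-9a-f:]{17}) is down"      # MAC of the AP that went down
)


def down_events_per_ap(log_text):
    """Count 'is down' alerts per access point MAC address."""
    return Counter(mac for _timestamp, mac in EVENT.findall(log_text))


sample = """
2017-03-22 09:44:57
135
System
a8:bd:27:c1:61:52
Access point a8:bd:27:c1:5f:26 is down

2017-03-22 09:42:50
134
System
a8:bd:27:c1:61:52
Access point a8:bd:27:c1:61:d0 is down

2017-03-22 09:41:27
130
System
a8:bd:27:c1:61:52
Access point a8:bd:27:c1:5f:26 is down
"""

# Most frequently failing APs first.
for mac, count in down_events_per_ap(sample).most_common():
    print(mac, count)
```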


  • 5.  RE: WAPs rebooting with watchdog timer or kernel panic causing chaos
    Best Answer

    EMPLOYEE
    Posted Mar 21, 2017 09:54 PM

    @lorby89 wrote:

    Apologies, the version is 6.5.0.0-4.3.0.0_56428, which seems to be the latest I can get. I realise the open case is probably my best option; I was just hoping someone might have some advice to stop me getting killed. $60k on new Wi-Fi and it's less reliable than the 8-year-old setup :(


    Even long-deployed products can have bugs.  You can upgrade to later versions of Early Release software based on the lifetime warranty software agreement here:  http://support.arubanetworks.com/LifetimeWarrantySoftware/tabid/121/DMXModule/661/Default.aspx?EntryId=20388

     

    4.3.0.0 is the first release and is guaranteed to have bugs.



  • 6.  RE: WAPs rebooting with watchdog timer or kernel panic causing chaos

    MVP EXPERT
    Posted Mar 22, 2017 04:25 AM

    Hey, just one extra thing you can try (TAC might have mentioned this already): run the show version command. One section of the output tells you why the AP rebooted. If it was a crash, there will sometimes be a core dump that TAC can analyse.

     

    IAP-Lab# show version 
    Aruba Operating System Software.
    ArubaOS (MODEL: 105), Version 6.4.4.4-4.2.3.1
    Website: http://www.arubanetworks.com
    Copyright (c) 2002-2016, Aruba Networks, an HP company.
    Compiled on 2016-04-15 at 06:40:57 PDT (build 54637) by p4build
    FIPS Mode :disabled
    
    AP uptime is 2 weeks 22 hours 18 minutes 16 seconds
    Reboot Time and Cause: XXXXXXXXXX

    You can check whether there is a core dump by following the article below:

     

    https://community.arubanetworks.com/t5/Controller-less-WLANs/How-to-retrieve-the-core-file-from-the-IAP/ta-p/253039
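
    If you are collecting show version output from several APs, the reboot line can be pulled out with a few lines of Python. This is only a sketch: the function names reboot_cause and looks_like_crash are made up for illustration, the field label is taken from the output above, and the crash keywords come from the two reboot reasons quoted in the first post. The exact wording on a real AP may differ.

```python
import re


def reboot_cause(show_version_output):
    """Return the text after 'Reboot Time and Cause:', or None if the line is absent."""
    match = re.search(
        r"^[ \t]*Reboot Time and Cause:[ \t]*(.*)$",
        show_version_output,
        re.MULTILINE,
    )
    return match.group(1).strip() if match else None


def looks_like_crash(cause):
    """Flag causes that mention a kernel panic or the watchdog timer."""
    lowered = cause.lower()
    return "kernel panic" in lowered or "watchdog" in lowered


# Excerpt of the output above; the cause is masked in the original post.
sample = """
    AP uptime is 2 weeks 22 hours 18 minutes 16 seconds
    Reboot Time and Cause: XXXXXXXXXX
"""

cause = reboot_cause(sample)
print(cause, looks_like_crash(cause))
```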



  • 7.  RE: WAPs rebooting with watchdog timer or kernel panic causing chaos

    Posted Mar 22, 2017 06:48 AM

    Hi guys, thank you for the responses. 

     

    Thank you Cjoseph, that was exactly the kind of stuff I was hoping for. I didn't realise there was an early release section inside the lifetime warranty; I assumed the general release was the only option available. I have upgraded the firmware to the latest I can find. I figure it can't be much worse, haha. At least I am able to try something now :).

     

    zalion0 - I have supplied TAC with the show tech support dumps and also the crash dump, which may contain that info as well. That's all they've asked for at the moment. If it happens again I will try to save a copy of the show version output as well, just in case. Thanks.

     

    On a positive note, I also just received a response from TAC saying this seems similar to an issue they may have seen already, and I should get confirmation by COB tomorrow. I'm crossing my fingers that it may even have been fixed in an early deployment firmware release newer than the major release version I was running.



  • 8.  RE: WAPs rebooting with watchdog timer or kernel panic causing chaos

    Posted Mar 22, 2017 11:37 PM

    Guess what!

     

    Upgraded to 6.5.1.0-4.3.1.2_58595 last night. So far today, a total of zero reboots!! (Unless the fault history has stopped working in this version, haha.) I had 49-50 clients on a single WAP for over 20 minutes with no crash at all!!! Normally it would have lasted about 30 seconds.

     

    Thanks again for letting me know about those early release/lifetime warranty firmware options available :)