Wireless Access


Access network design for branch, remote, outdoor, and campus locations with HPE Aruba Networking access points and mobility controllers.

Controller crashing, all clients disconnect

  • 1.  Controller crashing, all clients disconnect

    Posted May 24, 2017 04:47 AM

    Hi all,

    I ran into an issue this morning. We have 5 sites, each with two 7210 controllers. The issue occurs at one of the sites, which has two local 7210s.

    The issue started when users across the whole area served by APs adopted to those controllers could not connect to the SSIDs. The controller's web GUI could not be accessed (very slow, always loading). SSH access worked, but sometimes showed "Module STM is busy" when I tried to run a command.

    I then tried the second 7210, which opened fine, unlike the first controller. I removed the first controller from the network to force all APs to fail over to the second controller, after which all clients could work normally again.

    I checked the CPU load, and it showed the stm process using 100% of the CPU.

    The network now works well on the second controller. When we connect the first controller back, the problem recurs, even after a reboot.

    I need guidance on what to do next.

     

    PS: I have attached the output of "show cpuload" and "show cpuload current".



  • 2.  RE: Controller crashing, all clients disconnect

    EMPLOYEE
    Posted May 24, 2017 07:13 AM

    That could be a symptom, but you need to open a TAC case so that they can collect and analyze your logs and a possible crash.tar. It is hard to know what is wrong from the limited information that you can post here.



  • 3.  RE: Controller crashing, all clients disconnect

    Posted May 24, 2017 01:20 PM

    We had a similar issue and had a TAC case open for many months while things were being analyzed. In a nutshell, our 7240 controllers weren't cleaning up the data sessions and our controllers would crash, very similar to what you're reporting. We would hard boot the controllers and things would work okay for a while, but then they would lock up again. They first wrote a customer-specific AOS build for us addressing the issue, then included the fix starting in code 6.5.1.3.

    Not sure if it's exactly the same thing you are reporting, but I agree, a TAC case would be best.

    Good luck!
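
    For anyone comparing their own build against the version mentioned above, here is a minimal Python sketch. It assumes AOS versions are plain dotted numeric strings such as 6.5.1.3. The names `parse_aos_version` and `has_session_cleanup_fix` are hypothetical helpers, not part of any Aruba tool, and the check only reflects the "fix starting in 6.5.1.3" statement in this post; it says nothing about other release branches.

```python
# Hypothetical helper: compare a running AOS version string against the
# release that, per the post above, first included the session cleanup fix.
FIX_VERSION = "6.5.1.3"  # taken from the post above


def parse_aos_version(text):
    """Turn a dotted numeric string like '6.5.1.3' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def has_session_cleanup_fix(running, fixed_in=FIX_VERSION):
    """Return True if `running` is at or after `fixed_in` (assumed same branch)."""
    return parse_aos_version(running) >= parse_aos_version(fixed_in)


if __name__ == "__main__":
    for version in ("6.5.1.2", "6.5.1.3", "6.5.1.5"):
        print(version, has_session_cleanup_fix(version))
```

    Comparing tuples of integers avoids the string-comparison trap where "6.5.1.10" would sort before "6.5.1.3".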



  • 4.  RE: Controller crashing, all clients disconnect

    Posted May 29, 2017 05:41 AM

    I opened a TAC case. There is a bug in 6.5.x in the communication between the controller and AirWave using AMON. We detected packets looping between them, which causes the stm process on the controller to overload.

    If you are experiencing this issue, the quick fix is to disable AMON in the controller's configuration for AirWave. There is currently no fix for this issue; the latest AOS at the time I tried this was 6.5.1.5.