Wireless Access

Access network design for branch, remote, outdoor and campus locations with Aruba access points, and mobility controllers.

Issues with 215s going down

  • 1.  Issues with 215s going down

    Posted Oct 09, 2019 07:17 AM

    Good day,

     

    We are having an issue where the 215s in the cluster go down while the 115s continue to run. Restarting the controller brings the 215s back up, but they go down again after some time (the interval varies, from as little as a few hours to as much as a few days). While the APs are listed as down on the controller, they still respond to ping. Also, after the reboot, the listed uptime is not consistent with the time they were rebooted.

     

    The site is using a 7005 controller on ArubaOS 8.3.0.3.

     

    Does anyone know what could be causing this?

    Attachment(s)

    txt
    Show_ap_database.txt   4K 1 version


  • 2.  RE: Issues with 215s going down

    Posted Oct 09, 2019 07:29 AM

    Don't reboot the controller, because that will make all of the APs reboot. On the controller command line, type "show log system all". When APs go down and come back up, the log will tell you why.



  • 3.  RE: Issues with 215s going down

    Posted Oct 09, 2019 09:58 AM

    Thank you, I will try it.



  • 4.  RE: Issues with 215s going down

    Posted Oct 09, 2019 07:51 AM

    What is the output of the following command?

     

    " show ap debug system-status ap-name NMB-EXCEL-BDLE | include Reboot "

     

    --Give Kudos: found something helpful, important, or cool? Click Kudos Star in a post.
    --Problem Solved? Click "Accepted Solution" in a post.




  • 5.  RE: Issues with 215s going down

    Posted Oct 09, 2019 09:59 AM

    Thank you, I'm going onsite a bit later, so I'll run it then and log the output.



  • 6.  RE: Issues with 215s going down

    Posted Oct 10, 2019 02:28 AM

     


    @Mr.RFC wrote:

    what is the output of the following command?

     

    " show ap debug system-status ap-name NMB-EXCEL-BDLE | include Reboot "

     




    Good morning, here is the output of the command:

     

    Reboot Information

    ------------------

    AP rebooted Fri Dec 31 16:45:49 PST 1999; SAPD: Unable to contact switch: HELLO-TIMEOUT. Last rebootstrap reason: HELLO-TIMEOUT, 229 sec before: Last Ctrl msg: HELLO len=1447 dest=192.168.254.24 tries=10 seq=0

    -----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

     

    Rebootstrap Information

    -----------------------

    Date       Time     Reason (Latest 10)

    --------------------------------------

    2000-01-03 02:49:51 Switching to LMS 192.168.254.24: Missed heartbeats: Last Sequence Generated=9 Sent=9 Rcvd=0; eth Sent=29230 Drop=0; gre Sent=29236 Drop=0 First=1; ipsec Sent=0 Drop=0. Last Ctrl message: STATUS_REPORT len=77 dest=192.168.254.24 tries=1 seq=2

    2000-01-03 02:50:02 Switching to LMS 192.168.254.24: Missed heartbeats: Last Sequence Generated=9 Sent=9 Rcvd=0; eth Sent=29239 Drop=0; gre Sent=29245 Drop=0 First=1; ipsec Sent=0 Drop=0. Last Ctrl message: STATUS_REPORT len=77 dest=192.168.254.24 tries=1 seq=2

    2000-01-03 02:50:13 Switching to LMS 192.168.254.24: Missed heartbeats: Last Sequence Generated=9 Sent=9 Rcvd=0; eth Sent=29248 Drop=0; gre Sent=29254 Drop=0 First=1; ipsec Sent=0 Drop=0. Last Ctrl message: STATUS_REPORT len=77 dest=192.168.254.24 tries=1 seq=2

    2000-01-03 02:50:24 Switching to LMS 192.168.254.24: Missed heartbeats: Last Sequence Generated=9 Sent=9 Rcvd=0; eth Sent=29257 Drop=0; gre Sent=29263 Drop=0 First=1; ipsec Sent=0 Drop=0. Last Ctrl message: STATUS_REPORT len=77 dest=192.168.254.24 tries=1 seq=2



  • 7.  RE: Issues with 215s going down

    Posted Oct 10, 2019 03:25 AM

    The APs rebooted due to being unable to reach the controller.

     

    AP rebooted Fri Dec 31 16:45:49 PST 1999; SAPD: Unable to contact switch: HELLO-TIMEOUT. Last rebootstrap reason: HELLO-TIMEOUT, 229 sec before: Last Ctrl msg: HELLO len=1447 dest=192.168.254.24 tries=10 seq=0

    Do you have any packet loss between the APs and the controller? I ask because the log shows missed heartbeats between the controller and the AP (heartbeats sent, none received):

     

    2000-01-03 02:50:13 Switching to LMS 192.168.254.24: Missed heartbeats: Last Sequence Generated=9 Sent=9 Rcvd=0; eth Sent=29248 Drop=0; gre Sent=29254 Drop=0 First=1; ipsec Sent=0 Drop=0. Last Ctrl message:

    Also, do you have NTP configured?
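
To make the two observations above easier to check across many APs, here is a minimal Python sketch that pulls the heartbeat counters out of a "Rebootstrap Information" line and flags timestamps that suggest the clock was never set. The function names and the 2010 cutoff year are assumptions made for illustration, and the regular expression only fits the line format pasted in this thread; other ArubaOS versions may format these lines differently.

```python
import re
from datetime import datetime

# Matches the start of a rebootstrap line as pasted in this thread.
REBOOTSTRAP_RE = re.compile(
    r"(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) "
    r"Switching to LMS (?P<lms>[\d.]+): Missed heartbeats: "
    r"Last Sequence Generated=(?P<generated>\d+) "
    r"Sent=(?P<sent>\d+) Rcvd=(?P<rcvd>\d+)"
)

# One line copied from the output posted earlier in the thread.
SAMPLE = (
    "2000-01-03 02:50:13 Switching to LMS 192.168.254.24: Missed heartbeats: "
    "Last Sequence Generated=9 Sent=9 Rcvd=0; eth Sent=29248 Drop=0; "
    "gre Sent=29254 Drop=0 First=1; ipsec Sent=0 Drop=0. Last Ctrl message:"
)


def parse_rebootstrap_line(line):
    """Return the heartbeat counters from one log line, or None if it does not match."""
    match = REBOOTSTRAP_RE.search(line)
    if match is None:
        return None
    sent = int(match.group("sent"))
    received = int(match.group("rcvd"))
    return {
        "time": datetime.strptime(match.group("ts"), "%Y-%m-%d %H:%M:%S"),
        "lms": match.group("lms"),
        "generated": int(match.group("generated")),
        "sent": sent,
        "received": received,
        # Heartbeats went out but nothing came back from the controller.
        "no_replies": sent > 0 and received == 0,
    }


def clock_looks_unset(timestamp, cutoff_year=2010):
    """Heuristic (assumed cutoff): a date this old suggests no time sync."""
    return timestamp.year < cutoff_year


if __name__ == "__main__":
    record = parse_rebootstrap_line(SAMPLE)
    print(record)
    print("clock looks unset:", clock_looks_unset(record["time"]))
```

Running it against each line of the pasted output would show whether every rebootstrap has Rcvd=0, which points at lost replies from the controller rather than drops on the AP side.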



  • 8.  RE: Issues with 215s going down

    Posted Oct 10, 2019 08:54 AM
    What is the topology? Do you have a standalone setup or a redundant setup?

    Do you have redundancy set up (VRRP)?

    APs exchange keepalives (heartbeats) with the controller every second.

    If they miss 8 heartbeats (8 seconds in total), they rebootstrap (basically a reboot with some nuances).
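
As a rough illustration of the heartbeat rule described above, here is a small Python sketch of a consecutive-miss counter. It is not ArubaOS code: the class name and method are hypothetical, and the threshold of 8 and the one-second interval are simply the figures quoted in this post. It also assumes the misses must be consecutive, which is one reasonable reading of the description.

```python
class HeartbeatMonitor:
    """Counts consecutive missed heartbeats (illustrative model only)."""

    def __init__(self, max_missed=8):
        # Threshold taken from the figure quoted in this thread.
        self.max_missed = max_missed
        self.missed = 0

    def tick(self, reply_received):
        """Process one heartbeat interval.

        Returns True when the consecutive-miss threshold is reached,
        i.e. the point at which the AP would rebootstrap.
        """
        if reply_received:
            # Any reply resets the counter.
            self.missed = 0
            return False
        self.missed += 1
        return self.missed >= self.max_missed


if __name__ == "__main__":
    monitor = HeartbeatMonitor()
    # Simulate a link that stops answering: one tick per second.
    for second in range(1, 11):
        if monitor.tick(reply_received=False):
            print(f"rebootstrap would trigger after {second} seconds")
            break
```

With this model, brief loss (a few dropped heartbeats followed by a reply) never triggers a rebootstrap, while a sustained outage of 8 intervals does, which matches the Sent=9 Rcvd=0 pattern in the posted log.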

    I am assuming that you have sufficient licenses; otherwise the APs would have come up with an IL flag.

    Do you see any errors in the configuration profiles?

    show profile-errors

    Finally, how are the APs set up to discover their master? Is the master's IP hardcoded, or are you using DHCP/DNS?

    What is the configuration of the AP system profile?

    show ap system-profile (name of the profile used)

    Could you post the boot log of any one AP?



  • 9.  RE: Issues with 215s going down
    Best Answer

    Posted Oct 17, 2019 03:51 AM

    Good morning all,

     

    My apologies for not getting back on this sooner. 

     

    We were able to fix the issue with a controller upgrade. We have also since configured an NTP server.

     

    I appreciate all the responses that we got on the post. Thank you for your assistance.