Wireless Access

  • 1.  AOS 8.7 - MDs refusing communication when MM offline?

    Posted Aug 03, 2021 11:48 AM
    Good afternoon,

    One of our remote sites has two 7210 controllers, set up as MDs in a cluster under a single locally hosted MM (all running 8.7). The setup has been running well for months with no issues.

    Recently the ESXi host that the MM runs on had a power failure and shut down. The controllers remained powered up, but once the MM lost power, the MDs stopped responding to ping, SSH, HTTPS, etc. The local router's ARP table still listed them, however. No one at the site complained of the wireless being down, so perhaps the APs and controllers were still communicating.

    Once I powered the ESXi host back on and the MM came back up, the MDs immediately became reachable again and everything appears to be working as normal.

    I've never run into this situation before - is it normal for the MDs to refuse communications when the MM is unreachable? Is there something I can change in the configuration to prevent this from happening, so we can at least log into the MDs to check status? We are running a fairly vanilla setup.

    Thanks in advance for any advice.

    ------------------------------
    Matthew Waite
    ------------------------------


  • 2.  RE: AOS 8.7 - MDs refusing communication when MM offline?

    Posted Aug 03, 2021 12:12 PM
    Type "show log system all" on the MD to get a clue why

    ------------------------------
    Any opinions expressed here are solely my own and not necessarily that of Hewlett Packard Enterprise or Aruba Networks.
    ------------------------------



  • 3.  RE: AOS 8.7 - MDs refusing communication when MM offline?

    Posted Aug 03, 2021 12:33 PM
    Unfortunately, by the time I got to the site, the log didn't go back far enough to show those events. I've set up an off-site syslog target to see what is output if it happens again. I'm thinking I may replicate the issue during the next maintenance window by turning off the MM's interface in ESXi.

    ------------------------------
    Matthew Waite
    Senior Microcomputer Technical Support Specialist
    Erie 1 BOCES
    West Seneca NY
    7168217621
    ------------------------------



  • 4.  RE: AOS 8.7 - MDs refusing communication when MM offline?

    Posted Aug 03, 2021 02:08 PM
    I was able to pull the tail end of the log, of course, from when power was restored to the host. Prior to that, the log is basically just the same cfgm, snmp and ntpwrap warnings and errors on repeat.

    Aug 2 08:36:17 cfgm[3531]: <399838> <3531> <WARN> |cfgm| LmsHeartBeatResultAction: State(READY:LAST SNAPSHOT:CFGID-45:PEND-0:INITCFGID:0) FD=33:Cannot heartbeat with the master.
    Aug 2 08:36:17 cfgm[3531]: <399816> <3531> <ERRS> |cfgm| handle_read: State(READY:LAST SNAPSHOT:CFGID-45:PEND-0:INITCFGID:0) FD=33:Failure receiving heartbeat response header information Result=-1 Err=Connection timed out
    Aug 2 08:36:19 ntpwrap[3806]: <399816> <3806> <ERRS> |ntpwrap| ntpdPollingTimer:926 Listen address unavailable,restart Ntpd Daemon.
    Aug 2 08:36:26 cfgm[3531]: <399838> <3531> <WARN> |cfgm| LmsHeartBeatResultAction: State(CONNECTINPROGRESS:LAST SNAPSHOT:CFGID-45:PEND-0:INITCFGID:0) FD=33:Cannot heartbeat with the master.
    Aug 2 08:36:33 snmp[3803]: <301250> <3803> <ERRS> |snmp| Sendto failed, unable to send trap to manager 10.85.18.75:162.
    Aug 2 08:36:34 ntpwrap[3806]: <399816> <3806> <ERRS> |ntpwrap| getNtpSrvRouteAddr:447:connect() failed
    Aug 2 08:36:35 cfgm[3531]: <399838> <3531> <WARN> |cfgm| LmsHeartBeatResultAction: State(READY:LAST SNAPSHOT:CFGID-45:PEND-0:INITCFGID:0) FD=33:Cannot heartbeat with the master.
    Aug 2 08:36:35 fpapps[3564]: <399838> <4287> <WARN> |fpapps| updateUplinkReachState: Wired uplink vlan 4 (bkp NO) reachability (to 10.85.18.150) changed from 2(DOWN) to 1(UP). mode 1. lb-state: ENABLED
    Aug 2 08:36:35 fpapps[3564]: <399838> <4287> <WARN> |fpapps| ipMapAddUplinkDefaultGateway: Adding static vlan 4 gateway 10.85.4.1
    Aug 2 08:36:39 cluster_mgr[4237]: <352302> <5592> <ERRS> |cluster_mgr| cluster_proc_dds_peer_channel_add_event, peer_index for peer 10.85.18.150 is 0!
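
    Since the log is mostly the same few messages on repeat, a quick tally by process and severity can make the odd ones stand out. Below is a minimal Python sketch, assuming the lines keep the shape shown in the excerpt above; the `summarize` helper and the regular expression are illustrative and not part of any Aruba tooling.

```python
import re
from collections import Counter

# Matches lines shaped like the excerpt above, e.g.
# "Aug 2 08:36:17 cfgm[3531]: <399838> <3531> <WARN> |cfgm| message text"
# The layout is assumed from the excerpt only.
LINE_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d+\s+[\d:]{8})\s+"
    r"(?P<proc>[\w-]+)\[\d+\]:\s+"
    r"<\d+>\s+<\d+>\s+<(?P<sev>\w+)>\s+"
    r"\|[^|]+\|\s+(?P<msg>.*)$"
)


def summarize(lines):
    """Count log lines per (process, severity); unparseable lines are skipped."""
    counts = Counter()
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m:
            counts[(m.group("proc"), m.group("sev"))] += 1
    return counts


# A few lines copied from the excerpt above.
SAMPLE = [
    "Aug 2 08:36:17 cfgm[3531]: <399838> <3531> <WARN> |cfgm| LmsHeartBeatResultAction: State(READY:LAST SNAPSHOT:CFGID-45:PEND-0:INITCFGID:0) FD=33:Cannot heartbeat with the master.",
    "Aug 2 08:36:17 cfgm[3531]: <399816> <3531> <ERRS> |cfgm| handle_read: State(READY:LAST SNAPSHOT:CFGID-45:PEND-0:INITCFGID:0) FD=33:Failure receiving heartbeat response header information Result=-1 Err=Connection timed out",
    "Aug 2 08:36:26 cfgm[3531]: <399838> <3531> <WARN> |cfgm| LmsHeartBeatResultAction: State(CONNECTINPROGRESS:LAST SNAPSHOT:CFGID-45:PEND-0:INITCFGID:0) FD=33:Cannot heartbeat with the master.",
    "Aug 2 08:36:33 snmp[3803]: <301250> <3803> <ERRS> |snmp| Sendto failed, unable to send trap to manager 10.85.18.75:162.",
]

if __name__ == "__main__":
    for (proc, sev), n in summarize(SAMPLE).most_common():
        print(f"{proc:12} {sev:5} {n}")
```

    Feeding it the full output saved from the syslog target would show at a glance whether anything other than the cfgm, snmp and ntpwrap messages appeared during the outage.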

    ------------------------------
    Matthew Waite
    ------------------------------



  • 5.  RE: AOS 8.7 - MDs refusing communication when MM offline?

    Posted Aug 03, 2021 09:20 PM
    Edited by cjoseph Aug 04, 2021 05:45 PM
    Nothing in that log is rejecting access points...
    EDIT:  I read that post wrong, unfortunately.

    ------------------------------
    Any opinions expressed here are solely my own and not necessarily that of Hewlett Packard Enterprise or Aruba Networks.
    ------------------------------



  • 6.  RE: AOS 8.7 - MDs refusing communication when MM offline?

    Posted Aug 04, 2021 01:50 AM
    It sounds like a routing issue.  Are there routes for your subnet that point to the MCR or static routes on the MD that could be causing this?

    When you are able to reproduce it, you could capture the SSH traffic on the MD.

    packet-capture destination local
    packet-capture controlpath tcp 22

    Then, after trying to SSH to it:

    show packet-capture controlpath

    You'll be able to clearly see if the traffic is getting there or not.
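
    While the capture runs on the MD, it can help to log from the client side exactly when the management ports stop answering. The following is a rough Python sketch, assuming a monitoring host on the same network; the `tcp_open` helper is hypothetical and the MD address is passed on the command line, since the thread does not give one.

```python
import socket
import sys
import time


def tcp_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


if __name__ == "__main__" and len(sys.argv) > 1:
    md_address = sys.argv[1]  # address of the MD under test (supplied by you)
    # Probe SSH and HTTPS every 10 seconds until interrupted with Ctrl-C.
    while True:
        stamp = time.strftime("%H:%M:%S")
        for port in (22, 443):
            state = "open" if tcp_open(md_address, port) else "NO RESPONSE"
            print(f"{stamp} {md_address}:{port} {state}")
        time.sleep(10)
```

    Comparing these timestamps with the controlpath capture would show whether the SYNs reach the MD and go unanswered, or never arrive at all.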



    ------------------------------
    Michael Clarke (Aruba)
    ------------------------------



  • 7.  RE: AOS 8.7 - MDs refusing communication when MM offline?

    Posted Aug 04, 2021 02:42 PM
    cjoseph: Sorry, perhaps I didn't make my concern clear enough. I don't believe the APs were being rejected. What wasn't working was SSH, SNMP, HTTPS, ping, etc. to each of the 7210 MDs. I couldn't do any monitoring as a result. When I did restore power to the MM, I was immediately able to SSH/web into a 7210 and nothing looked amiss, including APs and clients connected as normal.

    Michael_Clarke: I don't believe it was a routing issue - both subnets are local to the L3 switch they are connected to. The L3 switch did not lose power and if I recall correctly, I could ping other devices on that subnet. Thank you for the command suggestions, I will look at these as well when I try to reproduce the issue.

    ------------------------------
    Matthew Waite
    ------------------------------