Controllerless Networks



Preempted by provisioned conductor - consistent reboots of AP's in cluster

  • 1.  Preempted by provisioned conductor - consistent reboots of AP's in cluster

    Posted 24 days ago

    We have a cluster of APs managed by Aruba Central, currently running 8.10.0.11.  About two months ago we began to see instability in this cluster and found that the checksum values of various APs didn't match the rest of the cluster.  We opened a support case to track this down, and it appears to be related to conductor changes (possibly due to missed transmit/receive heartbeats from the conductor AP), but so far we have not been able to find a root cause or fix.  The reboot reason the APs report is (most recent one below):

    Reboot Time and Cause: AP rebooted Wed May 22 13:25:28 UTC 2024; System cmd at uptime 0D 22H 34M 59S: Preempted by provisioned conductor (8c:79:09:c6:e7:42 172.16.237.213) uptime from boot: 22 hours 34 minutes 56 seconds; uptime from being conductor: 25 seconds
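    When collecting this reboot reason from several APs, it can help to pull out the fields that matter (reboot time, uptime, and the MAC/IP of the preempting conductor) so they can be compared side by side. Below is a minimal sketch, assuming the line always follows the layout shown above; the function name `parse_preemption` and the regular expression are illustrative and not part of any Aruba tooling.

```python
import re

# Pattern written against the single reboot-cause line quoted above.
# Other reboot causes or firmware versions may format this differently.
REBOOT_CAUSE = re.compile(
    r"AP rebooted (?P<when>.+?); System cmd at uptime (?P<uptime>[^:]+): "
    r"Preempted by provisioned conductor "
    r"\((?P<mac>[0-9a-fA-F:]{17}) (?P<ip>\d+\.\d+\.\d+\.\d+)\)"
)


def parse_preemption(line):
    """Return a dict of fields from a 'Preempted by provisioned conductor'
    reboot line, or None if the line does not match."""
    match = REBOOT_CAUSE.search(line)
    return match.groupdict() if match else None
```

    Tallying the `ip` field across all APs would show whether every preemption points back at the same provisioned conductor.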

    I see in our syslog server numerous entries like:

    172.16.237.204 cli[5958]: <341135> <WARN> AP:BOS-FL2-AP-3d:10 <172.16.237.204 8C:79:09:C7:3D:10>  Conductor Changed - new 172.16.237.207 old 172.16.237.213 current swarm state 4.

    We have done the following thus far in terms of troubleshooting:

    1. Factory reset the APs
    2. Upgraded from 8.10.0.9 to 8.10.0.11
    3. Re-terminated all physical cabling and received certification of no errors from low voltage vendor
    4. Moved AP ports to different ports on switch - issue followed
    5. Plugged in spare AP into switch directly with patch cable, bypassing patch panel/cabling, and it also sees the same issue but not as frequently
    6. Reviewed switch side config and standardized it across the board to include portfast
    7. Verified all APs are receiving the correct amount of power via LLDP
    8. Changed the conductor AP to different APs (problem persists, doesn't matter which one is conductor)

    I'm running out of ideas on what to check at this point.  This was working fine until we expanded the 2-AP cluster by adding 9 more, bringing it up to 11 total.  Graphing the conductor change logs clearly shows that the issue began on April 10th.
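    The graphing mentioned above boils down to counting "Conductor Changed" events per day. A rough sketch of that count follows. It assumes each stored syslog line starts with an ISO 8601 timestamp added by the syslog server (the excerpt above does not show one, so this prefix is an assumption), and `conductor_changes_per_day` is a hypothetical helper name.

```python
import re
from collections import Counter
from datetime import datetime

# Matches the message body shown in the syslog excerpt above.
CONDUCTOR_CHANGE = re.compile(
    r"Conductor Changed - new (?P<new>\d+\.\d+\.\d+\.\d+) "
    r"old (?P<old>\d+\.\d+\.\d+\.\d+)"
)


def conductor_changes_per_day(lines):
    """Count 'Conductor Changed' events per calendar day.

    ASSUMPTION: every line begins with an ISO 8601 timestamp followed by a
    space, e.g. '2024-04-10T09:15:02 172.16.237.204 cli[5958]: ...'.
    Adjust the timestamp parsing to whatever the syslog server writes.
    """
    counts = Counter()
    for line in lines:
        if not CONDUCTOR_CHANGE.search(line):
            continue
        stamp = line.split(" ", 1)[0]
        try:
            day = datetime.fromisoformat(stamp).date()
        except ValueError:
            continue  # no parseable timestamp on this line; skip it
        counts[day.isoformat()] += 1
    return dict(sorted(counts.items()))
```

    Feeding the resulting dictionary to any plotting tool gives the per-day view; a sharp rise starting on one date is what pins down when the problem began.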

    The only thing that appears to have changed around that time is an attempted upgrade from 8.10.0.9 to 8.10.0.10, which shows as failed in the Aruba Central audit trail.
    Any ideas on what else could be at fault here, or something to check?


  • 2.  RE: Preempted by provisioned conductor - consistent reboots of AP's in cluster

    Posted 24 days ago

    I'm curious if maybe the uplink settings on the APs need to have enet1 set to default, rather than uplink? Maybe the AP is trying to send heartbeats on enet1, which is not plugged in?




  • 3.  RE: Preempted by provisioned conductor - consistent reboots of AP's in cluster

    EMPLOYEE
    Posted 23 days ago

    The status of E1 should have zero impact.

    What model of APs are involved?  Can you upgrade to 8.12 and see if the issue persists?



    ------------------------------
    Carson Hulcher, ACEX#110
    ------------------------------



  • 4.  RE: Preempted by provisioned conductor - consistent reboots of AP's in cluster

    Posted 23 days ago

    I can confirm that setting enet1 to default and rebooting the swarm did not fix the issue.  We're still seeing the conductor changed events written to our syslog server, and the APs still reboot.

    This is a cluster of 11x AP-535s and 1x AP-335 (pending a new mount kit we will swap this out for a 535).

    Is there a specific reason to go with 8.12.x.x code? I see in the new features there are improvements to the audit trail that may be relevant to our situation, but I'd have to power down the 335 to upgrade the cluster (the release notes seem to indicate that the 330 series is not supported: https://www.arubanetworks.com/techdocs/Aruba-Instant-8.x-Books/Release-Notes/812/Aruba-Instant-8.12.0.0-Release-Notes.pdf).




  • 5.  RE: Preempted by provisioned conductor - consistent reboots of AP's in cluster

    EMPLOYEE
    Posted 23 days ago

    AP-330 series is parked in AOS 8.10.

    Bug fixes are made in the newer version (SSR) and then backported to the older version (LSR), and based on your initial description there wasn't any reason not to be on 8.12.  Since you mention the issue started post-upgrade, I was wondering if a different version would still have the problem.

    Is one of the APs configured as the preferred conductor?



    ------------------------------
    Carson Hulcher, ACEX#110
    ------------------------------



  • 6.  RE: Preempted by provisioned conductor - consistent reboots of AP's in cluster

    Posted 23 days ago

    Yes, one of them is the preferred conductor.  I was reviewing the syslog messages from it and noticed a minute-long gap during one of the events where it just stopped sending messages.  Other APs in the cluster continued to send messages just fine. Normally each sends multiple per second, so a gap of a minute is pretty big.  The VC didn't reboot or lose power during this time, but on our switch side I see that the interface flapped for 5 seconds.

    No other AP I've spot checked has had this issue, so I'm going to shut the VC off after hours and see if everything remains stable overnight.
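    The spot check described above (looking for an AP that goes quiet in syslog while the others keep talking) can be run across every AP at once. This is a minimal sketch, assuming the syslog entries have already been reduced to `(timestamp, host)` pairs; the function name `find_silent_gaps` and the 30-second default threshold are illustrative choices, not values from the thread.

```python
from collections import defaultdict
from datetime import timedelta


def find_silent_gaps(events, threshold=timedelta(seconds=30)):
    """Return (host, last_seen, next_seen) for every gap longer than threshold.

    events: iterable of (timestamp, host) pairs, in any order, where
    timestamp is a datetime and host identifies the sending AP.
    """
    by_host = defaultdict(list)
    for stamp, host in events:
        by_host[host].append(stamp)

    gaps = []
    for host, stamps in by_host.items():
        stamps.sort()
        # Compare each message with the next one from the same AP.
        for earlier, later in zip(stamps, stamps[1:]):
            if later - earlier > threshold:
                gaps.append((host, earlier, later))
    # Oldest gap first, so gaps line up with conductor-change events.
    return sorted(gaps, key=lambda gap: gap[1])
```

    An AP that normally logs several messages per second but shows up here with a gap near a conductor-change event is a candidate for the kind of isolation test described in this post.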




  • 7.  RE: Preempted by provisioned conductor - consistent reboots of AP's in cluster

    Posted 17 days ago

    For what it's worth, shutting down that one AP resolved the issue we were seeing.  The AP itself did not reboot (uptime remained) and it did not lose PoE, so Aruba Central showed it as not having any issues.  Our cluster has been stable since doing this and an RMA has been issued.