I would recommend going ahead with the TAC case; that behavior isn't what we expect.
Original Message:
Sent: Nov 14, 2024 07:55 AM
From: p@rick
Subject: AOS8 Clustering AAC/UAC Traffic Flow
They are heartbeating regularly and the cluster is L2-connected. In the meantime, the controller was rebooted and the balancing parameters were adapted to accommodate the AP/client count actually in use.
Looks ok again so far.
I am trying to understand how the controllers are determining the split ratio per ESSID now. I checked the ESSIDs and they differ from each other.
49:51
50:50
46:54
47:53
35:65
Out of curiosity, I connected to another cluster that never had those issues. There I can also see unequal split ratios (different platform, same software version). Is there any documentation on how those are derived? I was under the impression that the ratio is based on the number of controllers in a cluster and not some seemingly arbitrary number. I also cannot find specifics in the book "Understanding AOS8" by @westcott.
Regarding TAC, I will consider it once I cannot find any further information that is publicly available or can be shared publicly.
Original Message:
Sent: Nov 14, 2024 05:09 AM
From: Herman Robers
Subject: AOS8 Clustering AAC/UAC Traffic Flow
That is weird... with equal platforms the distribution is expected to be roughly 50-50 (for 2 controllers); it's quite offset.
Does the cluster health otherwise look good? VLAN probes, heartbeat counters, etc.? If you are not comfortable troubleshooting this yourself, I would work with TAC to find what is causing this imbalance.
------------------------------
Herman Robers
------------------------
If you have urgent issues, always contact your Aruba partner, distributor, or Aruba TAC Support. Check https://www.arubanetworks.com/support-services/contact-support/ for how to contact Aruba TAC. Any opinions expressed here are solely my own and not necessarily that of Hewlett Packard Enterprise or Aruba Networks.
In case your problem is solved, please invest the time to post a follow-up with the information on how you solved it. Others can benefit from that.
Original Message:
Sent: Nov 14, 2024 02:19 AM
From: p@rick
Subject: AOS8 Clustering AAC/UAC Traffic Flow
It is a cluster comprised of two 7210 controllers running 8.10.0.12 LSR.
Original Message:
Sent: Nov 13, 2024 12:33 PM
From: chulcher
Subject: AOS8 Clustering AAC/UAC Traffic Flow
What model controllers are you using? What version of AOS 8?
------------------------------
Carson Hulcher, ACEX#110
Original Message:
Sent: Nov 13, 2024 07:01 AM
From: p@rick
Subject: AOS8 Clustering AAC/UAC Traffic Flow
Hi,
I am currently trying to understand the general traffic flow within a cluster. I could see that APs build the typical 4 tunnels (AAC/S-AAC/UAC/S-UAC) and that the UAC tunnel is built per BSSID, as well as only one tunnel for each client connected to said SSID.
However, I was just troubleshooting a situation where a controller uplink was saturated. I checked the client/AP distribution in the cluster and saw this:
Cluster Load Distribution for APs
---------------------------------
Type IPv4 Address    Active APs     Standby APs
---- --------------- -------------- ---------------
self x.x.x.x         30             69
peer y.y.y.y         69             30
Total: Active APs 99 Standby APs 99

Cluster Load Distribution for Clients
-------------------------------------
Type IPv4 Address    Active Clients Standby Clients
---- --------------- -------------- ---------------
self x.x.x.x         20             1219
peer y.y.y.y         1220           20
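To put the imbalance in numbers, here is a minimal sketch that turns the active counts from the output above into per-member shares. The helper name `shares` is made up for illustration and is not part of any Aruba tooling; the counts are copied from the output.

```python
def shares(active_by_member: dict[str, int]) -> dict[str, float]:
    """Return each cluster member's fraction of the total active count."""
    total = sum(active_by_member.values())
    return {member: count / total for member, count in active_by_member.items()}


# Active counts taken from the load distribution output above.
ap_shares = shares({"self": 30, "peer": 69})
client_shares = shares({"self": 20, "peer": 1220})

for label, result in (("APs", ap_shares), ("Clients", client_shares)):
    for member, fraction in result.items():
        print(f"{label:8s}{member}: {fraction:.1%}")
```

So "self" carries about 30% of the active APs but under 2% of the active clients.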
The cluster balancing settings were left at their defaults, which at least explains why it made no effort to rebalance them. I don't understand why it did not do the initial 1% balancing.
This is the bucket map for the ESSID. It is an 80:20 split for a reason I have not yet found out.
Active Map[0-31] 01 01 01 01 01 00 01 01 00 01 01 01 01 00 01 01 00 01 01 01 01 01 01 01 00 00 01 00 01 01 01 01
Active Map[32-63] 01 00 01 01 01 01 01 01 01 01 01 01 00 00 01 01 01 01 00 01 01 01 01 00 01 01 01 01 01 01 00 01
Active Map[64-95] 01 00 01 00 01 01 01 01 01 01 01 01 01 01 01 01 01 01 00 00 01 00 00 01 01 01 01 01 00 01 01 01
Active Map[96-127] 01 00 01 01 00 01 01 00 00 01 01 00 01 01 01 01 01 01 01 01 01 01 01 00 01 00 01 01 01 00 01 01
Active Map[128-159] 01 01 01 01 01 01 01 00 01 01 00 01 01 01 01 00 01 01 01 01 01 01 01 00 01 01 01 00 01 01 01 01
Active Map[160-191] 01 00 01 01 01 01 01 00 01 01 01 01 00 01 01 01 00 01 01 01 00 01 01 01 00 01 01 01 01 01 00 01
Active Map[192-223] 01 01 01 01 01 01 00 01 00 01 01 01 01 01 01 00 01 01 01 01 01 01 01 01 00 00 01 01 01 01 01 01
Active Map[224-255] 00 01 01 01 01 01 01 01 01 00 01 00 01 01 01 01 01 00 01 01 01 01 01 01 01 00 01 01 01 00 00 01
Standby Map[0-31] 00 00 00 00 00 01 00 00 01 00 00 00 00 01 00 00 01 00 00 00 00 00 00 00 01 01 00 01 00 00 00 00
Standby Map[32-63] 00 01 00 00 00 00 00 00 00 00 00 00 01 01 00 00 00 00 01 00 00 00 00 01 00 00 00 00 00 00 01 00
Standby Map[64-95] 00 01 00 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 01 01 00 01 01 00 00 00 00 00 01 00 00 00
Standby Map[96-127] 00 01 00 00 01 00 00 01 01 00 00 01 00 00 00 00 00 00 00 00 00 00 00 01 00 01 00 00 00 01 00 00
Standby Map[128-159] 00 00 00 00 00 00 00 01 00 00 01 00 00 00 00 01 00 00 00 00 00 00 00 01 00 00 00 01 00 00 00 00
Standby Map[160-191] 00 01 00 00 00 00 00 01 00 00 00 00 01 00 00 00 01 00 00 00 01 00 00 00 01 00 00 00 00 00 01 00
Standby Map[192-223] 00 00 00 00 00 00 01 00 01 00 00 00 00 00 00 01 00 00 00 00 00 00 00 00 01 01 00 00 00 00 00 00
Standby Map[224-255] 01 00 00 00 00 00 00 00 00 01 00 01 00 00 00 00 00 01 00 00 00 00 00 00 00 01 00 00 00 01 00 00
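For anyone who wants to check the split ratio of a bucket map without counting by hand, here is a rough sketch that tallies the Active Map entries from pasted output. It assumes the two-digit values are cluster node indices and that every bucket carries roughly the same share of clients; neither assumption is confirmed in this thread. The function name `active_bucket_counts` and the `sample` text (only the first Active and Standby rows from above) are just for illustration.

```python
import re
from collections import Counter

# One "Active Map[a-b]" header followed by its two-digit entries.
# The trailing \b keeps the match from running into the next header word.
ACTIVE_ROW = re.compile(r"Active Map\[\d+-\d+\]((?:\s+[0-9]{2}\b)+)")


def active_bucket_counts(show_output: str) -> Counter:
    """Count how many Active Map buckets point at each node index."""
    counts = Counter()
    for match in ACTIVE_ROW.finditer(show_output):
        counts.update(int(value) for value in match.group(1).split())
    return counts


# First Active and Standby rows from the bucket map above (not the full map).
sample = (
    "Active Map[0-31] 01 01 01 01 01 00 01 01 00 01 01 01 01 00 01 01 "
    "00 01 01 01 01 01 01 01 00 00 01 00 01 01 01 01 "
    "Standby Map[0-31] 00 00 00 00 00 01 00 00 01 00 00 00 00 01 00 00 "
    "01 00 00 00 00 00 00 00 01 01 00 01 00 00 00 00"
)

counts = active_bucket_counts(sample)
total = sum(counts.values())
for node, buckets in sorted(counts.items()):
    print(f"node {node:02d}: {buckets} of {total} buckets ({buckets / total:.1%})")
```

For the first row alone, node 01 holds 25 of the 32 buckets, which is in line with the roughly 80:20 split. Pasting the full map into `sample` gives the ratio across all 256 buckets, and the function handles the output whether it is on one line or many.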
None of this, however, explains how it is possible that the seemingly less utilized controller had a saturated uplink. That is why I want to understand the traffic flow between cluster members.
Does anyone know more details?
Thanks in advance