Took a long time to reach a close here with TAC because the problem moved between Airwave TAC and AOS TAC.
Everything is now in a 'Good' state (MC and MD) and Glass aggregates client counts correctly.
For the controllers, there were 3 places on AW that held the relevant SSH username and SNMP config for AW contacting a controller/conductor:
- Device Setup > Discover > Communication (only relevant if you're using scans for discovery)
- Devices > Select Folder > Select Device > Manage
- Groups > Select Group > Basic (needed to set the SNMP version in use)
I was going through the Manage screen to set the telnet/SSH username and password, same creds for all devices. But one cluster refused to work, leaving me in an error state with a message stating "Telnet/SSH password timeout".
The problem turned out to be that on the Basic page the SNMP version was set to v3, when I was expecting to be using v2, and I didn't have a v3 user configured. My error, and I didn't know where to look to resolve it - the error messaging didn't seem to indicate a problem with a specific SNMP version. Once v2 was selected and the password was set again, things came good.
The Conductors were different. They were getting an SNMP timeout message, but pinging them from AW was fine - they're on the same subnet and, so far as I was concerned, there was no firewall between the AW and MCR virtual machines. I was wrong. It was the AW TAC fella who suspected a firewall problem, when he got onto the AW CLI and tried to SSH from AW to MCR using the user creds specified in the Manage config screen. The connection just dropped. The firewall was configured on the MCR under Services > Firewall > ACL Allowlist, but he couldn't guide me to that because he didn't know about it. So AOS TAC got involved.
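For anyone hitting the same "ping works, SSH drops" situation, here is a minimal sketch of a TCP reachability check that could be run from any host with Python. It is not an AirWave tool; the function name and the example address are my own placeholders. It only tests whether a TCP handshake completes, so it says nothing about SNMP (UDP) or about sessions that are dropped after the connection is accepted.

```python
import socket


def tcp_port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be established.

    A successful ping (ICMP) says nothing about this: a host firewall or ACL
    can allow ICMP while blocking tcp/22.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


# Example call (192.0.2.10 is a placeholder, not a real Conductor address):
#   tcp_port_open("192.0.2.10", 22)
```

If this returns False while ping succeeds, a filter between the two hosts, or on the target itself, is a reasonable next thing to look for.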
I had looked in that Allowlist screen in the past but saw only check boxes and some number string boxes for rate limiting and so on. I hadn't realised that at the bottom of that screen there are extra menus, one of which is the actual ACL. It would have helped if those menus were at the top of the screen instead of several scrolls down on my monitor. I simply hadn't scrolled far enough! The ACL indicated that there was a deny rule for port 22. So I removed it, added allow rules for the AW servers, and put the deny back. The Primary MCR was now talking to the AW servers, but the secondary wasn't. I hadn't realised I'd put the allow rules directly on the Primary, instead of higher up the config hierarchy. This took a good while to realise. Once the ACL was updated at the correct level, folder level, the Secondary MCR still wouldn't talk to AW. The folder level config of the ACL wasn't pushing to the Secondary correctly and the deny rule was always going in above the host permit rules. I had no idea you could config the Secondary directly through its GUI. I thought the nature of the cluster meant config was only permitted via the Primary. That appears not to be the case for ACLs, and a direct edit sorted the problem out.
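To illustrate why the position of the deny rule mattered, here is a small sketch of first-match ACL evaluation. It assumes the Allowlist is evaluated top-down with the first matching rule winning, which is consistent with the behaviour I saw but is my assumption rather than documented fact. The `Rule` class, the `evaluate` function and the addresses are all made up for illustration.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network


@dataclass(frozen=True)
class Rule:
    action: str  # "permit" or "deny"
    source: str  # source network in CIDR form; "0.0.0.0/0" means any
    port: int    # destination TCP port


def evaluate(rules, src, port, default="deny"):
    """Return the action of the first rule matching src/port (first match wins)."""
    for rule in rules:
        if rule.port == port and ip_address(src) in ip_network(rule.source):
            return rule.action
    return default


AW_SERVER = "192.0.2.21"  # hypothetical AirWave server address

# Deny sits above the host permit: the permit is never reached.
deny_first = [
    Rule("deny", "0.0.0.0/0", 22),
    Rule("permit", "192.0.2.21/32", 22),
]

# Host permit sits above the deny: AW gets in, everyone else is still blocked.
permit_first = [
    Rule("permit", "192.0.2.21/32", 22),
    Rule("deny", "0.0.0.0/0", 22),
]

print(evaluate(deny_first, AW_SERVER, 22))         # deny
print(evaluate(permit_first, AW_SERVER, 22))       # permit
print(evaluate(permit_first, "198.51.100.7", 22))  # deny
```

The same two rules give opposite results depending on order, which matches what I saw on the Secondary when the deny rule kept landing above the host permits.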
There was an awful lot of messaging going on between me and TAC, but eventually this was all figured out. We're all good now: each AW server monitors its own cluster of controllers, and the Conductors report to all 3 AW servers, all preferring AMON.
Thanks for the guidance from those involved.
Original Message:
Sent: Oct 25, 2024 12:31 PM
From: Gowri Amujuri
Subject: Conductor client counts around 55% higher than Airwave/Glass client counts
Nathan,
Thanks! Please update on the case progress and unicast me the Case# so that I can check internally.
w.r.t. client counts, each AirWave server's client count should match its cluster's client count and not the MCR client count, as the MCR has the consolidated client count from all clusters.
w.r.t. WMS, the guide is about MCR to controller and it should be present by default, and one of AMON is configured on MCR to all conductors. Similar to the AMON config on the conductors, set AirWave as an AMON receiver on the MCR too; that should send AMON data from the MCR to AirWave with detailed info on AP and MON stats.
The audit to get the monitoring schema is needed too, at least once after each upgrade of the MCR and controllers. By default it is set to 1 day.
I suggest checking the AMON profiles on the MCR/controllers, as some may be disabled, and enabling all of the AMON profiles on all the controllers and the MCR except for 'Tags' and a couple of others. The TAC engineer can review the settings too.
------------------------------
Regards
Gowri Amujuri
Original Message:
Sent: Oct 25, 2024 10:40 AM
From: n.millward
Subject: Conductor client counts around 55% higher than Airwave/Glass client counts
Hi Gowri, thanks for the longer reply with explanations.
Validated versions - noted, excellent. So that's been laid to rest.
For the changes I need to make I've held off on adding the MCR to each of the AW instances at the moment, so I can see the effect of re-enabling AMON again first. AMON is now preferred on all AW instances and the results for client count for cluster 2 and 3 are much closer to the MCR count.
The cluster 1 AW server is still very short of clients, despite being configured the same as the other 2 AW servers.
I have a call with Aruba TAC on Monday afternoon for an engineer to investigate, and I'll update after that. I do note what both Carson and you have commented wrt having the MCR on all Airwaves and ignoring devices.
Client count change after reverting to AMON again.
Cluster 2 AW2

Cluster 3 AW3

You mentioned WMS so I looked into that here
https://www.arubanetworks.com/techdocs/ArubaOS_8.12.0_Web_Help/Content/arubaos-solutions/wireless-intrus-prev/conf-wlan-mana-sys.htm
Because we have no config in the Management Server field I think we are in a position of using the MCR for WMS Termination, so that should be fine. I'm going to change the default 60000ms IDS AP poll interval to 30000ms to see what effect that has on reporting.
Good weekend.
------------------------------
Nathan
Original Message:
Sent: Oct 23, 2024 12:19 PM
From: Gowri Amujuri
Subject: Conductor client counts around 55% higher than Airwave/Glass client counts
Nathan,
The validated versions are the first versions tested for the AP models. After that, the tests are run with the new OS versions and new APs at the time of each AirWave release. It doesn't mean AirWave doesn't support older APs on newer AOS versions (unless EOL). I have pointed this out to TAC management so it can be corrected for newer folks in TAC.
w.r.t. the MCR and clusters, the split is always hard and generally we see issues with data not being consolidated. Currently you have 3 clusters on 3 AirWave servers. The MCR must be added to all 3 AirWave servers, and not just because it's the MCR: the MCR has WMS data from all clusters and improves monitoring of APs, clients, rogues etc. w.r.t. the device split and the problem with adding the MCR, where it discovers all devices: devices that are not meant to be on the respective AirWave server need to be *ignored*. For example, if Cluster 1 is on AW1, the Cluster 2 and Cluster 3 devices (including controllers) need to be in the *ignored* state so they don't get rediscovered. Select the Cluster 2 and Cluster 3 devices and ignore them from the modify list. Ignored devices will not take up licenses or resources, as they will not be monitored. Do the same on the other AirWave servers to ignore devices apart from the cluster and MCR that each AirWave server needs to monitor.
Now w.r.t. SNMP vs AMON, please enable AMON on all the controllers, including the MCR, to send to AirWave. AMON has more and better data than SNMP. Also, AMON sends data every minute vs every 5 minutes with SNMP. This will help to find fast-roaming clients, especially clients with 1 or 2 minutes of association time. Enable Audit for the devices in monitor-only mode. For AirWave to audit the devices, add the admin SSH creds of the controllers on the controller Manage page. Audit must be done at least once (for every controller upgrade) to get the monitoring schema of the controller. This will ensure the data from AMON is shown as per the schema.
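As a rough illustration of why the reporting interval matters for short-lived associations, here is a toy sketch. It treats both mechanisms as simple sampling at fixed instants, which is a simplification and not how AMON or SNMP polling actually work internally. The session times are invented, and the function name is hypothetical.

```python
def seen_by_sampling(sessions, interval_s, horizon_s):
    """Count sessions that are active at one or more sample instants.

    sessions   : list of (start_s, end_s) association windows, in seconds
    interval_s : seconds between samples (60 ~ AMON, 300 ~ SNMP in this toy model)
    horizon_s  : last instant at which a sample may be taken
    """
    sample_times = range(0, horizon_s + 1, interval_s)
    return sum(
        1
        for start, end in sessions
        if any(start <= t < end for t in sample_times)
    )


# Invented sessions: four clients associated for 80-100 seconds each,
# plus one client that stays associated for the whole 10 minutes.
sessions = [(30, 110), (70, 170), (200, 290), (310, 400), (0, 600)]

print(seen_by_sampling(sessions, 60, 600))   # 5
print(seen_by_sampling(sessions, 300, 600))  # 1
```

In this toy model the 1-minute interval catches all five clients, while the 5-minute interval catches only the long-lived one. That is the kind of undercount the 1- or 2-minute association case describes.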
P.S. Please unicast or email me the TAC case#.
------------------------------
Regards
Gowri Amujuri
Original Message:
Sent: Oct 23, 2024 06:56 AM
From: n.millward
Subject: Conductor client counts around 55% higher than Airwave/Glass client counts
I'd love to find a configurational problem, but a TAC Airwave engineer has been on our servers and not found anything wrong. His focus is purely Airwave, with no knowledge of controllers. I'm not sure that's particularly smart on Aruba's part. In the mean time, my response to TAC:
'I think it is appropriate to comment here that having bought our estate of 515 and 535 APs around Q1 and Q2 2021, and that they are still shipping in channel today, customers are well within their rights to expect, and in fact demand that the Aruba monitoring platform that goes along with their substantial investment in an Aruba wireless infrastructure should be able to correctly and accurately monitor their environment.
Early on in our change from Cisco to Aruba (2021 onwards) we used a single instance of Airwave to plot in Visual RF where we had put APs. We soon out-grew that single instance (4000 AP limit in Airwave) and spent considerable time in 2023 reconfiguring the controller and AP group config to break the campus up into 3 regions. Each region was controlled by one controller cluster, and the controllers and APs in the region reported to a specific new instance of AW. We then used Glass to aggregate those 3 Airwave instances. Right from the beginning we could never understand why the Mobility Conductor had upwards of double the number of clients the Glass instance showed. When we looked at each Airwave we couldn't understand why the Airwave instances had so few clients compared to the controller cluster that was reporting to them. It was only recently, having finished the re-engineering, when asked to provide some data about clients that I was unable to provide that this started to become an apparent problem.
I can't imagine that we are the only customer with this as a problem either, given the TAC response about AP version and Airwave compatibility in the linked Airwave 8.3.0.2 document. The APs in question are current models, they are not deprecated, they are still on sale. I hardly think those are signs of old APs.
I think this case needs escalating because it is perfectly acceptable for a customer to expect that the investment they have made can be monitored appropriately for the service life of that investment. In this case the service life is right up until the AOS version deprecates the AP model.
Please pass this on.'
------------------------------
Nathan
Original Message:
Sent: Oct 23, 2024 04:11 AM
From: n.millward
Subject: Conductor client counts around 55% higher than Airwave/Glass client counts
Latest message from Aruba TAC - I think this is pretty poor form on Aruba's part; these APs (515 and 535) only went in during 2021. Our 503H were installed from August 2021 to September 2022. They are all supported by the controller OS, so to say that the Airwave platform requires a controller OS that ceased being supported in July 2021 in order to work with our APs is dismal service on the Airwave team's part.
'I have checked internally on the matter why you are not getting the correct number of clients on your AirWave. It seems that while you are using OS version for controller and APs which are not shown to be supported on your current version of AirWave, we can not guarantee that AirWave will work properly which includes the incorrect number of clients.
Unfortunately, as your APs are old and do not support a higher version of their firmware, there is not much that can be done. If possible, trying to roll back to a version which is shown to be supported in the supported device matrix page should be of help with the matter.
Please, check the supported device matrix page here: https://www.arubanetworks.com/techdocs/AirWave/8302/AirWave_8.3.0.2_Supported_Devices_Matrix.pdf.'
Hello Viou, our system event logs don't have the messages you're seeing. Ours is totally filled with
'Aruba 7240XM net-mc05 Telnet/SSH Error: pattern match timed-out'
and
'Aruba AP 503H co-wap-d440 Configuration status changed to 'Telnet/SSH Error: (pattern match timed-out) in password failure: Permission denied, please try again.'
The password has been updated at the device level (as well as at the discovery level) in an attempt to remedy, but hasn't helped.
------------------------------
Nathan
Original Message:
Sent: Oct 22, 2024 10:35 AM
From: viou
Subject: Conductor client counts around 55% higher than Airwave/Glass client counts
We are seeing the same thing on 8.10.0.12 and 8.3.0.3 with 3 controllers also.
Are you also seeing logs in System, Event log showing AW deleting controllers from the cluster and the controllers then coming back into the cluster? That is when we see the client totals drop.
We have an open ticket and they are saying resolved in 8.3.0.4 but haven't said what is occurring. Still waiting to hear back if this is what is fixed.
Original Message:
Sent: Oct 18, 2024 07:49 AM
From: n.millward
Subject: Conductor client counts around 55% higher than Airwave/Glass client counts
Hello, there seemed little point putting this in the Airwave channel as there is so little engagement in there.
I've been over other posts about client count mismatches on AW and have just had a call with Aruba TAC, but didn't get very far.
The lay of the land:
3 controller clusters running 8.10.0.12
1 AW server per cluster (APs allocated by LMS) running 8.3.0.3
1 Glass server polling from the AW servers running 1.3.3
The view right now from the Conductor is showing 21,001 clients

But the Airwaves tell only a fraction of that client story
Cluster01

Cluster02

Cluster03

Glass

Within AMP Setup we have tried both prefer AMON and prefer SNMP (currently set to prefer SNMP) - AMON gave a marginally better count.
Each controller has the correct AW configured per its cluster membership, and as a MON Receiver. e.g.


The clusters are using SNMPv2
We had set our AMP config to disable TLS v1.0 and 1.1, but the TAC call had us re-enable those old versions. The radio button is now set to 'No'

This TLS change made a small difference to the client count, but it is still nowhere near correct.
TAC showed us an AW 8.3.0.2 doc that detailed that our AP-515 and AP-535 are not tested to be compatible with AW on anything above AOS 8.7.0.0. This just seems too crazy to be true, that Aruba's monitoring platform is not compatible with its management platform.
https://www.arubanetworks.com/techdocs/AirWave/8302/AirWave_8.3.0.2_Supported_Devices_Matrix.pdf
It is evident that some of the clients on our estate of 503H, 514, 515, and 535 devices are visible to AW, but not all of them for some reason.
Anyone got any thoughts on where we go from here?
Thanks.
------------------------------
Nathan