Just had a minor blip with Airwave in that it stopped showing connected clients for a short period. After a minor panic and logging into each local controller I can see we had no break in service.
Around the same time we got this message:
Mon Jul 10 10:20:08 2017 System System Restarting service Client Monitor Worker
Looking through the event log I can see that we get a lot of these messages, sometimes one after another. Any ideas why these crop up please?
Same here. We're getting them every 15 seconds.
I'm seeing these regularly, with intervals varying between a few seconds and about 15 minutes. I can't see any other events or actions that might be related to it. I've seen this both in my clean 8.2.4 installation and my previous 8.2.3. The latter has been updated several times over the years but I couldn't say which version it started on.
- 238 switches (Cisco 3560, Aruba S1500 and Aruba 2930F)
- 715 access points (AP-105, AP-115, AP-134, AP-135, AP-205, AP-225 and RAP3)
- Aruba 7220 running ArubaOS 188.8.131.52.
I got the Client Monitor Worker messages while I was running ArubaOS 6.x and minimal SNMP traps on the 7220 as well. As I'm unsure when the problem started, I can't say whether adding new types of switches and access points might have triggered it.
As far as I know it doesn't cause any trouble, but it's filling up my event log, and I'm not too comfortable with ignoring seemingly inconsequential errors.
I am having this problem as well and have a case open with Aruba TAC. We will be working on this more tomorrow and I will post any progress we make. Anyone else find a solution to this? I am not sure if it is related but my client counts are fluctuating every 5 minutes by 500-1000 clients and that is not normal behavior.
Tue Oct 10 20:28:32 2017 System System Restarting service Client Monitor Worker
Tue Oct 10 20:28:15 2017 System System Restarting service Client Monitor Worker
Tue Oct 10 20:28:00 2017 System System Restarting service Client Monitor Worker
Tue Oct 10 20:27:32 2017 System System Restarting service Client Monitor Worker
Tue Oct 10 20:27:15 2017 System System Restarting service Client Monitor Worker
Tue Oct 10 20:26:15 2017 System System Restarting service Client Monitor Worker
Tue Oct 10 20:26:00 2017 System System Restarting service Client Monitor Worker
Tue Oct 10 20:25:45 2017 System System Restarting service Client Monitor Worker
Tue Oct 10 20:25:16 2017 System System Restarting service Client Monitor Worker
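For anyone who wants to quantify how often these restarts occur, here is a minimal Python sketch that reads event log lines in the format pasted above and computes the gaps between consecutive restarts. The function names and the `SAMPLE` text are illustrative only and are not part of AirWave. It assumes the timestamp is the first five space-separated fields and that the system locale uses English day and month names.

```python
from datetime import datetime

# Text that identifies the restart events in the exported event log.
MARKER = "Restarting service Client Monitor Worker"

# Sample lines copied from the event log excerpt in this thread.
SAMPLE = """\
Tue Oct 10 20:28:32 2017 System System Restarting service Client Monitor Worker
Tue Oct 10 20:28:15 2017 System System Restarting service Client Monitor Worker
Tue Oct 10 20:28:00 2017 System System Restarting service Client Monitor Worker
Tue Oct 10 20:27:32 2017 System System Restarting service Client Monitor Worker
Tue Oct 10 20:27:15 2017 System System Restarting service Client Monitor Worker
Tue Oct 10 20:26:15 2017 System System Restarting service Client Monitor Worker
Tue Oct 10 20:26:00 2017 System System Restarting service Client Monitor Worker
Tue Oct 10 20:25:45 2017 System System Restarting service Client Monitor Worker
Tue Oct 10 20:25:16 2017 System System Restarting service Client Monitor Worker
"""


def restart_times(text):
    """Return the restart timestamps found in the text, oldest first."""
    times = []
    for line in text.splitlines():
        if MARKER not in line:
            continue
        # Assumption: the first five fields form the timestamp,
        # e.g. "Tue Oct 10 20:28:32 2017".
        stamp = " ".join(line.split()[:5])
        times.append(datetime.strptime(stamp, "%a %b %d %H:%M:%S %Y"))
    return sorted(times)


def intervals_seconds(times):
    """Return the gap in seconds between each pair of consecutive restarts."""
    return [int((b - a).total_seconds()) for a, b in zip(times, times[1:])]


if __name__ == "__main__":
    gaps = intervals_seconds(restart_times(SAMPLE))
    print("restarts:", len(gaps) + 1)
    print("shortest gap (s):", min(gaps))
    print("longest gap (s):", max(gaps))
```

Running the same functions over a full day of exported events should show whether the restarts cluster at particular times or are spread evenly.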
I'd be really interested to hear what you find since, as mentioned, we are getting bucket loads of these messages.
The other reason it is not great is that it obscures the genuine messages that I'm checking the audit log for.
The restart messages show up in the service_watcher log. There's probably more info about the specific process if you look in the async_logger_client log (should be an option on the System -> Status page).
The async_logger_client log certainly helps in troubleshooting this. Thanks! I'm attaching a snippet of mine with MAC addresses semi-anonymized.
The messages about uninitialized value in DiscoveredViaProxy.pm always come in groups of four, with two errors in line 61 followed by two errors in line 80. There seem to always be 572 lines, or 143 groups, before they stop for a few minutes. Although the recurring number of lines seemed promising, I can't make 572 or 143 fit with AP model or anything else in my system.
After the first pause the <$__ANONIO__> line reference is appended, and the line number updates after each pause. When the service restarts the ANONIO reference is gone again. I haven't been able to find anything relating to this in any other logs.
The uninit values are negligible since they're coming from device discovery. The more concerning lines are the ones with 'illegal attempt to update', since the timings are identical.
I would probably follow a path thinking that it may be a performance issue on the system. What are the system specs, and the number of devices monitored/managed? Also, this is probably the right point to engage with TAC so that they can investigate further, add additional debug logging, and fine-tune performance or file a defect as needed.
I have about 1750 access points and have it installed on a pretty beefy server.
CPU:
Intel(R) Xeon(R) CPU E5-2640 v3 @ 2.60GHz Hyper-Threaded 8 Cores 20480 KB cache (2600.224 MHz actual)
Intel(R) Xeon(R) CPU E5-2640 v3 @ 2.60GHz Hyper-Threaded 8 Cores 20480 KB cache (2600.224 MHz actual)
Memory:
Installed Physical RAM: 62.80 GB
Configured Swap Space: 4.00 GB
I'm running a VM with 4 VCPUs and 24 GB RAM. AMP is using a steady 13 GB, while Kernel Cache eats up whatever's left. CPU load is pretty stable at 40% with 60% peaks about every half hour. The peaks don't coincide with Client Monitor Worker restarts.
I've still got my old 8.2.3 running, with only the 240-ish switches. It still restarts CMW every few minutes a few times a day, without much detail as to why:
Mon Oct 16 15:20:54 2017: Started (PID: 16366)
Mon Oct 16 15:20:54 2017: Postgres PID: 16367
Mon Oct 16 15:36:38 2017: pid 9069 exiting: reached max payload count.
Mon Oct 16 15:36:39 2017: Started (PID: 17486)
Mon Oct 16 15:36:39 2017: Postgres PID: 17487
Mon Oct 16 15:39:11 2017: pid 9653 exiting: reached max payload count.
Mon Oct 16 15:39:25 2017: Started (PID: 17706)
Mon Oct 16 15:39:25 2017: Postgres PID: 17707
Mon Oct 16 15:40:30 2017: pid 9656 exiting: reached max payload count.
Mon Oct 16 15:40:39 2017: Started (PID: 17799)
Mon Oct 16 15:40:39 2017: Postgres PID: 17806
Mon Oct 16 15:40:59 2017: pid 9280 exiting: reached max payload count.
Mon Oct 16 15:41:09 2017: Started (PID: 17850)
Mon Oct 16 15:41:09 2017: Postgres PID: 17852
It also has some illegal update attempts, albeit fewer, and seemingly not coinciding with the CMW restarts.
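To see how many worker exits in a log like the one above are attributed to each stated reason, a short Python sketch such as the following could tally them. The regular expression is written only against the line format shown in this thread ("pid N exiting: reason."); the function name and sample text are illustrative, and no other exit messages are assumed to exist.

```python
import re
from collections import Counter

# Matches lines such as:
#   Mon Oct 16 15:36:38 2017: pid 9069 exiting: reached max payload count.
# The pattern is an assumption based only on the excerpt in this thread.
EXIT_RE = re.compile(
    r"^(?P<stamp>.+? \d{4}): pid (?P<pid>\d+) exiting: (?P<reason>.+?)\.?$"
)

# Sample lines copied from the log excerpt above.
SAMPLE = """\
Mon Oct 16 15:20:54 2017: Started (PID: 16366)
Mon Oct 16 15:20:54 2017: Postgres PID: 16367
Mon Oct 16 15:36:38 2017: pid 9069 exiting: reached max payload count.
Mon Oct 16 15:36:39 2017: Started (PID: 17486)
Mon Oct 16 15:36:39 2017: Postgres PID: 17487
Mon Oct 16 15:39:11 2017: pid 9653 exiting: reached max payload count.
Mon Oct 16 15:39:25 2017: Started (PID: 17706)
Mon Oct 16 15:39:25 2017: Postgres PID: 17707
Mon Oct 16 15:40:30 2017: pid 9656 exiting: reached max payload count.
Mon Oct 16 15:40:39 2017: Started (PID: 17799)
Mon Oct 16 15:40:39 2017: Postgres PID: 17806
Mon Oct 16 15:40:59 2017: pid 9280 exiting: reached max payload count.
Mon Oct 16 15:41:09 2017: Started (PID: 17850)
Mon Oct 16 15:41:09 2017: Postgres PID: 17852
"""


def exit_reasons(text):
    """Count worker exits per stated reason."""
    counts = Counter()
    for line in text.splitlines():
        match = EXIT_RE.match(line.strip())
        if match:
            counts[match.group("reason")] += 1
    return counts


if __name__ == "__main__":
    for reason, count in exit_reasons(SAMPLE).most_common():
        print(f"{count:4d}  {reason}")
```

If every exit carries the same reason, as in this excerpt, that points at a single limit being hit repeatedly rather than several unrelated failures.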
It is normal in 8.2.x for Async Logger Client (ALC) or monitoring workers to restart after reaching a certain payload or memory limit. Nothing to worry about. We made a change in 8.2.x so that ALC recovers by itself: the workers restart if they reach a certain payload threshold or take too much memory, which avoids memory overflow from this service. The limits are hardcoded to 4G of memory and a 65K payload.
So the servers are not affected by this. We will see if we can log these elsewhere instead of in the event logs to avoid confusion.
Ah, that makes sense. It's a bit worrisome that it reaches those limits every 2-10 minutes throughout the day, though. Are there any known settings or functions that can cause that?
It would be great if you could log it somewhere else, so it doesn't fill the event log and push out more important information.
Agreed. Is this physical RAM or swap space? I've got much more than 4GB, so making it configurable would be useful.
@novec - This depends on the amount of AMON data that the controller is sending. We would need to edit the code to increase the limits, but for the majority of environments that's not needed, as this depends entirely on the data received per minute from the network. It also depends on the server specs: faster CPUs, faster disk I/O, etc. However, as stated, there is no need to worry about these restarts. I have started a thread internally to move these out of the event logs to avoid panicky assumptions.
@wifialexander - This is physical RAM. But mostly the restarts are due to payload size rather than RAM. If they are due to RAM, then the CPUs are probably a bit on the slower side. The number of ALCs is 2/3 of the number of cores you have on the server, so it's not one process that takes up to 4G; each ALC can take up to 4G. As before, there was no need to increase this, as it works for the majority of environments. If for some reason this is needed on your server, please open a TAC case and let me know, and I will check what we can do on your server to make it better.
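As a rough illustration of the sizing described above, the following Python sketch estimates the worker count and the worst-case memory if every ALC reached the 4G limit at once. It reads "2/3" as two-thirds of the core count; rounding down and the minimum of one worker are my assumptions, not something confirmed in this thread, and the function name is hypothetical.

```python
def alc_worker_estimate(cores, per_worker_limit_gb=4):
    """Estimate the ALC worker count and worst-case memory in GB.

    Assumptions (not confirmed by Aruba): workers = two-thirds of the
    core count, rounded down, with at least one worker.
    """
    workers = max(1, (2 * cores) // 3)
    return workers, workers * per_worker_limit_gb


if __name__ == "__main__":
    for cores in (4, 16):
        workers, worst_case_gb = alc_worker_estimate(cores)
        print(f"{cores} cores: about {workers} workers, "
              f"up to {worst_case_gb} GB if all hit the limit")
```

This is only an upper bound for comparison against installed RAM; as noted above, most restarts are reportedly triggered by the payload limit rather than by memory.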
Nice. I'm looking forward to having an event log that isn't filled up with useless information :-)
I am not sure what to make of that log at all on my system. Attached.
You're seeing far more issues against the association table which is a main table for data gathering. You should definitely open a support case so that support can look into your setup further.