02-05-2015 02:45 AM - edited 02-05-2015 02:50 AM
We have an AW-2500 188.8.131.52 install monitoring just under 1,900 APs. We're running it on a nice big bare-metal server with 2 x 8-core E5-2660 CPUs, 94GB RAM and an array of about 16 disks which (I believe) are 15k SAS.
In VisualRF we currently have 22 buildings, most with three floors, and 255 APs added so far.
As far as I can see we're well within the hardware sizing guide for our AP deployment. However, AMP is struggling. It's using all the available RAM most of the time and then hitting swap, so there are times the system starts to thrash.
When that happens it's essentially unusable. The problem looks RAM/disk bound, as the CPUs never break a sweat.
Before I raise this with TAC, does anyone have any tips for obviously daft things we may have done with the config?
02-05-2015 08:55 AM
I'm not an Airwave expert, but curious to know what service(s) are taking up all your RAM. You can see this under System > Performance.
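If you have shell access to the server, a rough cross-check of the Performance page is to total resident memory per process name. This is only a sketch: it assumes a Linux host with the standard procps `ps`, and `rss_by_command` is a made-up helper name, not an AMP tool. RSS counts shared pages once per process, so the totals overstate real usage.

```shell
# Sum resident memory (RSS, in kB) per command name from "rss comm" lines
# on stdin and print the totals in MB, largest first.
# Note: shared pages are counted once per process, so totals are an upper bound.
rss_by_command() {
    awk '{ total[$2] += $1 }
         END { for (c in total) printf "%d %s\n", total[c] / 1024, c }' |
        sort -rn
}

# Show the ten heaviest command names (only if ps is available).
if command -v ps >/dev/null 2>&1; then
    ps -eo rss=,comm= | rss_by_command | head -n 10
fi
```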
02-05-2015 10:25 AM
Dipping into swap used to be a panic point, but several performance enhancements have happened since. What you may be seeing is light caching that's not yet reclaimed by the system.
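One rough way to tell reclaimable cache from memory actually held by processes is to read /proc/meminfo. A minimal sketch, assuming shell access to a Linux host; `mem_summary` is a hypothetical helper name. If "used excluding cache" is close to total RAM and swap use is large, caching is unlikely to be the explanation.

```shell
# Summarise memory from a meminfo-style file (default: /proc/meminfo).
# All values in /proc/meminfo are in kB.
mem_summary() {
    awk '
        /^MemTotal:/  { total = $2 }
        /^MemFree:/   { free = $2 }
        /^Buffers:/   { buffers = $2 }
        /^Cached:/    { cached = $2 }      # anchored, so SwapCached is not matched
        /^SwapTotal:/ { swap_total = $2 }
        /^SwapFree:/  { swap_free = $2 }
        END {
            printf "used_excluding_cache_kB=%d\n", total - free - buffers - cached
            printf "cache_and_buffers_kB=%d\n", buffers + cached
            printf "swap_used_kB=%d\n", swap_total - swap_free
        }
    ' "${1:-/proc/meminfo}"
}

# Report on the live system when /proc/meminfo is readable.
if [ -r /proc/meminfo ]; then
    mem_summary
fi
```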
In addition to the already suggested system performance values, how much RAM is allocated specifically to VisualRF (see VisualRF -> Setup -> Memory Allocation)? This is RAM set aside for VisualRF use only.
Also, how many threads are allocated to data processing? This is in the AMP Setup -> General -> Performance box.
Do you have any memory-hungry features in use? AppRF, UCC, IGC (Instant GUI Config), or large report runs (monthly/yearly)? If you do run such reports, it's best to run them when the network is not at peak.
Another thing to check is the timing for when nightly backups complete:
# ls -la /var/airwave-backup
Ideally you want the backups to complete prior to peak network hours. You can adjust timing of backups in AMP Setup -> General.
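To make the completion times easier to read, the listing can be trimmed to date, time and file name, newest first. This is a sketch assuming GNU `ls` (for `--time-style`) and file names without spaces; `backup_times` is a made-up helper name. The modification time is when the file was last written, so it approximates when each backup finished.

```shell
# Print "date time name" for each entry in a backup directory, newest first.
backup_times() {
    dir="${1:-/var/airwave-backup}"
    # -t sorts by modification time; long-iso prints "YYYY-MM-DD HH:MM".
    # NR > 1 skips the leading "total N" line of ls -l.
    ls -lt --time-style=long-iso "$dir" | awk 'NR > 1 { print $6, $7, $8 }'
}

# List the backup directory if it exists on this host.
if [ -d /var/airwave-backup ]; then
    backup_times /var/airwave-backup
fi
```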
Can you also expand on the sluggishness? Is it pages with graphs that are loading slowly? Pages with list tables? As a rule of thumb, does it take longer than 10-15 seconds for these items to load?
Senior QA Engineer - Network Services
Aruba Networks, a Hewlett Packard Enterprise Company
02-05-2015 12:41 PM
Interestingly, we didn't see this problem until January. There have been some updates to Airwave, but the main change is that we've added more floor plans to VisualRF. However, this RAM usage doesn't seem at all right to me:
02-05-2015 12:59 PM
To answer each of your questions....
I'm seeing swap and all RAM being maxed out for extended periods, so I don't think it's caching.
VisualRF currently has 10GB allocated.
Monitoring processes - 24
Config processes - 40
Audit processes - 40
SNMP fetcher - 4
AppRF is turned on; I can check whether this is something we feel we need at this time.
UCC is on; I'm not sure about IGC.
I don't think we have any reports running.
When things get sluggish we see slow response from a previously fast UI. When it's really bad the server returns a blank white page and the entire UI is inaccessible. So we do see total loss of the GUI.
02-05-2015 01:48 PM
Just curious, but are you seeing the same behaviour across different browsers?
Have you rebooted the airwave server?
ACCP, ACMP, ACMX #294
02-05-2015 01:56 PM
02-06-2015 02:30 AM
We've taken a look at our server this morning. What we've noticed is a ramp-up of RAM use in January. This would fit with upgrading our controllers from 6.3 to 6.4.
In January, as our users returned (we're a university campus), we saw RAM go up very quickly from the peaks of 50GB through December to maxing out everything available.
Today we've disabled UCC and AppRF to see what happens. Currently it's looking OK: after a forced restart of httpd, RAM use is at 20GB and slowly climbing. I'll give it a couple of hours and see what happens.
We're not routinely doing any UCC comms across our wireless network, and for most users the firewall doesn't currently allow this.
Our current presumption is that the switch to 6.4 on the controllers has thrown a lot more data at Airwave. Of course this could still be a bug, as we wouldn't expect everything to completely max out, especially as we appear to be within the sizing guide.
02-06-2015 07:14 AM
Yes, I'm told that we did...
Also, I've clarified some things. Apparently we went to 6.4 on the controllers in early December. Airwave was updated in late December and again last week.
So far the only change we can think of that was made between healthy RAM usage in December and the problems starting in January was the adding of more VisualRF floor plans. But we don't have all that many to be honest, and VisualRF has 10GB assigned to it so it shouldn't be using more than 40GB.
Just checked the server and our RAM use is still steadily climbing upwards, so we're past what was being used in December.
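To put numbers on a steady climb like this, memory use can be sampled at intervals and logged for comparison later (or for attaching to a TAC case). A minimal sketch, assuming shell access to a Linux host; `mem_sample` and the log path in the comment are hypothetical, not part of AMP.

```shell
# Print one line: "<epoch seconds> <used kB excluding cache> <swap used kB>",
# read from a meminfo-style file (default: /proc/meminfo).
mem_sample() {
    awk -v ts="$(date +%s)" '
        /^MemTotal:/  { t = $2 }
        /^MemFree:/   { f = $2 }
        /^Buffers:/   { b = $2 }
        /^Cached:/    { c = $2 }
        /^SwapTotal:/ { st = $2 }
        /^SwapFree:/  { sf = $2 }
        END { print ts, t - f - b - c, st - sf }
    ' "${1:-/proc/meminfo}"
}

# Example (hypothetical log path): take a sample every five minutes.
# while true; do mem_sample >> /tmp/mem-trend.log; sleep 300; done
if [ -r /proc/meminfo ]; then
    mem_sample
fi
```

Comparing the second column between two log lines a few hours apart gives the growth in kB over that period.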