Wireless Access

Access network design for branch, remote, outdoor, and campus locations with HPE Aruba Networking access points and mobility controllers.

Two Controllers using 99% memory

  • 1.  Two Controllers using 99% memory

    Posted May 30, 2019 09:52 AM

    I have two 7210 controllers in a cluster with 338 access points.


    Both controllers are reporting 99% memory usage. 

    Version is: 8.3.0.1

     

    I am calling TAC now.

     

    Any ideas?



  • 2.  RE: Two Controllers using 99% memory

    EMPLOYEE
    Posted May 30, 2019 10:41 AM

    You are doing the right thing.  TAC will sort that out.



  • 3.  RE: Two Controllers using 99% memory

    Posted May 30, 2019 02:05 PM

    They will get back to me by EOB tomorrow. :(

     

    Here is the dump of the processes.

     

    show processes sort-by memory


    %CPU S PID PPID VSZ RSS F NI START TIME EIP CMD
    7.4 R 4235 4017 546752 256384 4 0 2018 21-06:45:48 004ad214 /mswitch/bin/stm
    2.9 S 4521 4017 462784 265920 0 0 2018 8-12:35:48 2b2d38a4 /mswitch/bin/dds
    0.2 S 4087 4017 437568 320576 0 0 2018 15:10:01 2b289d2c /mswitch/bin/gsmmgr
    1.0 S 4215 4017 416768 217792 4 -10 2018 2-23:49:12 2b9c9d2c /mswitch/bin/auth
    1.4 S 4136 4017 357952 157440 4 0 2018 4-02:14:51 2b26bd8c /mswitch/bin/fpapps
    0.1 S 4439 4017 316608 150080 4 0 2018 12:08:52 2b5d9d2c /mswitch/bin/fw_visibility
    0.6 S 4717 4017 293376 174848 4 - 2018 1-17:46:40 2b2d9d2c /mswitch/bin/cluster_mgr
    0.9 S 4490 4017 288896 76288 4 0 2018 2-20:00:46 2b5f9d2c /mswitch/bin/mdns
    1.7 R 4276 4017 277056 177472 0 0 2018 5-02:13:16 2b100d84 /mswitch/bin/amon_sender
    1.1 S 4518 4017 243584 52416 4 0 2018 3-09:41:09 2b349d2c /mswitch/bin/arm
    0.0 S 4502 4017 231296 76928 0 0 2018 05:49:55 2b5b9d2c /mswitch/bin/extifmgr -w 10 -c 10
    0.1 R 4154 4017 229056 47296 4 0 2018 11:40:30 2b1e9d2c /mswitch/bin/pim
    0.1 S 4553 4017 228096 52416 4 0 2018 12:25:02 2b3c9d2c /mswitch/bin/ucm
    0.1 S 4170 4017 200704 126656 0 0 2018 11:55:05 2b589d2c /mswitch/bin/licensemgr
    0.1 S 4807 4017 192128 65536 0 0 2018 10:53:11 2b079d2c /mswitch/bin/sc_rep_mgr
    3.8 S 4579 4017 188096 60480 4 0 2018 11-03:17:21 2b539d2c /mswitch/bin/ofa
    0.1 S 4102 4017 166656 116416 4 0 2018 07:11:43 2b4b9d2c /mswitch/bin/cfgm
    0.0 S 4310 4106 157120 19904 1 0 2018 02:30:01 2ad14478 postgres: root userdbv3 [local] idle
    0.0 S 20816 4106 155968 15424 1 0 09:48 00:00:01 2ad14478 postgres: root cfgmdb [local] idle
    0.0 S 20814 4106 155968 15360 1 0 09:48 00:00:01 2ad14478 postgres: root cfgmdb [local] idle
    0.0 S 20813 4106 155968 15232 1 0 09:48 00:00:01 2ad14478 postgres: root cfgmdb [local] idle
    0.0 S 20818 4106 155968 15232 1 0 09:48 00:00:01 2ad14478 postgres: root cfgmdb [local] idle
    0.0 S 20817 4106 155968 15104 1 0 09:48 00:00:01 2ad14478 postgres: root cfgmdb [local] idle
    0.0 S 28622 4106 155968 15040 1 0 10:11 00:00:01 2ad14478 postgres: root cfgmdb [local] idle
    0.0 S 20815 4106 155968 14912 1 0 09:48 00:00:01 2ad14478 postgres: root cfgmdb [local] idle
    0.0 S 787 4106 155968 13568 1 0 13:34 00:00:00 2ad14478 postgres: root cfgmdb [local] idle
    0.0 S 814 4106 155968 13568 1 0 13:34 00:00:00 2ad14478 postgres: root cfgmdb [local] idle
    0.0 S 905 4106 155968 13504 1 0 13:34 00:00:00 2ad14478 postgres: root cfgmdb [local] idle
    0.0 S 5704 4106 155968 11904 1 0 2018 00:00:00 2ad14478 postgres: root wms [local] idle
    0.0 S 4176 4106 155968 11456 1 0 2018 00:39:20 2ad14478 postgres: root licensedb [local] idle
    0.0 S 4494 4106 155968 11328 1 0 2018 00:00:00 2ad14478 postgres: root agrp_db [local] idle
    0.0 S 4489 4106 155968 11200 1 0 2018 00:00:00 2ad14478 postgres: root iapdb [local] idle
    0.0 S 4212 4106 155968 10944 1 0 2018 00:00:00 2ad14478 postgres: root userdbv3 [local] idle
    0.0 S 5107 4106 155904 20288 1 0 2018 01:20:49 2ad14478 postgres: root userdbv3 [local] idle
    0.0 S 4218 4106 155904 14016 1 0 2018 00:28:04 2ad14478 postgres: root cfgmdb [local] idle
    0.0 S 17117 4106 155904 13120 1 0 2018 00:00:00 2ad14478 postgres: root userdbv3 [local] idle
    0.0 S 5132 4106 155904 12352 1 0 2018 00:02:00 2ad14478 postgres: root userdbv3 [local] idle
    0.0 S 12058 4106 155904 11648 1 0 Apr11 00:00:00 2ad14478 postgres: root cfgmdb [local] idle
    0.0 S 5295 4106 155904 11264 1 0 2018 00:00:00 2ad14478 postgres: root bocmgrdb [local] idle
    0.0 S 4291 4106 155904 10944 1 0 2018 00:00:00 2ad14478 postgres: root userdbv3 [local] idle
    0.0 S 5068 4106 155776 8832 1 0 2018 00:00:00 2ad14478 postgres: root bocmgrdb [local] idle
    0.0 S 5070 4106 155776 8640 1 0 2018 00:00:00 2ad14478 postgres: root bocmgrdb [local] idle
    0.0 S 5072 4106 155776 8640 1 0 2018 00:00:00 2ad14478 postgres: root bocmgrdb [local] idle
    0.0 S 5074 4106 155776 8640 1 0 2018 00:00:00 2ad14478 postgres: root bocmgrdb [local] idle
    0.0 S 4219 4106 155776 8512 1 0 2018 00:00:00 2ad14478 postgres: root userdbv3 [local] idle
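
One way to read a dump like the one above is to total the RSS column per command and see which processes account for the resident memory. The following is a minimal sketch, not an Aruba tool: the function name `rss_by_command` is hypothetical, the sample is a few rows copied from the output above, and it assumes the RSS column is in kilobytes (as with standard `ps`), which this thread does not confirm. Summed RSS also counts shared pages more than once, so the total is an upper bound rather than an exact figure.

```python
from collections import defaultdict

# A few rows copied from the "show processes sort-by memory" output above.
SAMPLE = """\
%CPU S PID PPID VSZ RSS F NI START TIME EIP CMD
7.4 R 4235 4017 546752 256384 4 0 2018 21-06:45:48 004ad214 /mswitch/bin/stm
2.9 S 4521 4017 462784 265920 0 0 2018 8-12:35:48 2b2d38a4 /mswitch/bin/dds
0.2 S 4087 4017 437568 320576 0 0 2018 15:10:01 2b289d2c /mswitch/bin/gsmmgr
0.0 S 4310 4106 157120 19904 1 0 2018 02:30:01 2ad14478 postgres: root userdbv3 [local] idle
0.0 S 20816 4106 155968 15424 1 0 09:48 00:00:01 2ad14478 postgres: root cfgmdb [local] idle
"""


def rss_by_command(text):
    """Sum the RSS column per command name (hypothetical helper)."""
    totals = defaultdict(int)
    for line in text.splitlines():
        # 12 columns; the last one (CMD) may itself contain spaces.
        fields = line.split(None, 11)
        if len(fields) < 12 or not fields[5].isdigit():
            continue  # skip the header row and blank or malformed lines
        command = fields[11].split()[0]  # group by the first word of CMD
        totals[command] += int(fields[5])
    return dict(totals)


totals = rss_by_command(SAMPLE)
for command, rss in sorted(totals.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{rss:>10} {command}")
print("total RSS:", sum(totals.values()))
```

Running the same function on dumps taken hours or days apart would show whether any single process keeps growing, which is the usual signature of a leak.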



  • 4.  RE: Two Controllers using 99% memory

    Posted May 31, 2019 12:54 PM

    I had the same thing happen about a year ago with a customer on AOS 8.2 with an MM, two L2-clustered 7220s, and a few hundred APs.  TAC ultimately had me upgrade the code to a slightly more recent version, which made the memory issue go away, and I haven't heard of the problem resurfacing since.

    While this fixed the issue for me, I was unclear as to what may have caused it and if there was anything I could have done with the configuration or by restarting processes in order to fix it.

    If you hear back from TAC as to how to troubleshoot and fix this outside of code upgrades - or what the root cause is - I'd love to hear.



  • 5.  RE: Two Controllers using 99% memory

    Posted Jun 04, 2019 02:03 PM

    They said it was a cosmetic bug. SMH. Linux doesn't randomly report the wrong memory usage. Every CLI command showed memory at 99%.

     

    It was obviously a memory leak, since the controllers had been up for a while. They suggested upgrading to newer code. A reboot brought memory back to normal.



  • 6.  RE: Two Controllers using 99% memory

    EMPLOYEE
    Posted Jun 04, 2019 11:49 PM

    It sounds like you need to do your own investigation and get to the bottom of this.



  • 7.  RE: Two Controllers using 99% memory
    Best Answer

    Posted Jun 05, 2019 02:51 PM

    I rebooted the controllers that night. They didn't even tell me to try rebooting until the next day, even after I told them it was impacting service.

     

    It's a memory leak. I will upgrade the code.



  • 8.  RE: Two Controllers using 99% memory

    EMPLOYEE
    Posted Jun 05, 2019 03:20 PM

    What service was it impacting?  I didn't see that in your post.



  • 9.  RE: Two Controllers using 99% memory

    Posted Jun 05, 2019 03:22 PM

    Users would be connected but could not pass traffic at all. It would stop working for about 5-10 minutes and then start working again on its own.

    This never happened before and was resolved once I rebooted.



  • 10.  RE: Two Controllers using 99% memory

    EMPLOYEE
    Posted Jun 05, 2019 05:03 PM

    I didn't see that in your original post.  Did you tell TAC that?  The bug itself is a display error for the memory.  Is it possible that you had another issue?



  • 11.  RE: Two Controllers using 99% memory

    Posted Jun 06, 2019 08:56 AM

    Well I disagree with TAC.

     

    We ran all the commands from the CLI, which also reported 99% used memory. Are you going to tell me the CLI binaries also have a display issue?

     

    It's a memory leak.

     



  • 12.  RE: Two Controllers using 99% memory

    EMPLOYEE
    Posted Jun 06, 2019 10:44 AM

    I don't know which bug you could be hitting but 8.3.0.1 is old and has quite a few bugs.  One of them might be below:

    Screenshot 2019-06-06 at 07.39.04.png

    I don't have time to run through those specific bugs and work out exactly what you could be experiencing, but at minimum you should upgrade, because 8.3.0.1, being a .1 release, has been found to have quite a few bugs.

     

    I really don't understand why you would also ask for help and then reject answers that you get.  It doesn't make sense...

     



  • 13.  RE: Two Controllers using 99% memory

    EMPLOYEE
    Posted Jun 06, 2019 01:00 PM

    Agree with Colin. There's no telling if what Colin is referencing is *exactly* what you hit, but early in 8.x there were 'controls' (for lack of a better term) to reserve X amount of CPU resources for certain processes, which led to the display of higher CPU utilization than actual (think 'dataplane reserve 25%' even though it may have only been using 5%, that kind of thing).

     

    So specific to OP, we won't know what the actual issue was, but that WAS one thing that was in some early code that has since been addressed.



  • 14.  RE: Two Controllers using 99% memory

    Posted Jun 06, 2019 04:04 PM

    Memory on the two controllers was at 99%.

     

    I rebooted both controllers.

    Memory immediately dropped to about 49-50% and is still there.

     

    Are you saying that this is cosmetic according to the bug you posted?

     

    That bug specifically relates to the "mem available" column being omitted from the CLI. That has nothing to do with the controllers exhausting all memory. I reject answers that don't make any sense.
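
For context on the "mem available" point: on a generic Linux system, a "used" percentage computed as total minus free also counts buffers and page cache, which the kernel can reclaim. The sketch below shows that arithmetic with the standard `/proc/meminfo` field names (`MemTotal`, `MemFree`, `Buffers`, `Cached`). The numbers are hypothetical, chosen only for illustration, and nothing here is a claim about how ArubaOS computes the figure its CLI shows or about which explanation applies to this case.

```python
def parse_meminfo(text):
    """Parse '/proc/meminfo'-style 'Key:  value kB' lines into a dict of ints (kB)."""
    info = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if sep and rest.split():
            info[key.strip()] = int(rest.split()[0])
    return info


def used_percentages(info):
    """Return (naive used %, used % excluding buffers and page cache)."""
    total = info["MemTotal"]
    naive = total - info["MemFree"]
    excluding_cache = naive - info.get("Buffers", 0) - info.get("Cached", 0)
    return round(100 * naive / total, 1), round(100 * excluding_cache / total, 1)


# Hypothetical values, NOT taken from the controllers in this thread.
HYPOTHETICAL = """\
MemTotal:       8000000 kB
MemFree:          80000 kB
Buffers:         400000 kB
Cached:         3520000 kB
"""

naive_pct, adjusted_pct = used_percentages(parse_meminfo(HYPOTHETICAL))
print(f"used (total - free): {naive_pct}%")
print(f"used excluding buffers/cache: {adjusted_pct}%")
```

If the second figure also sits near 99% and keeps climbing with uptime, the leak explanation becomes much harder to dismiss as a display issue.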



  • 15.  RE: Two Controllers using 99% memory

    EMPLOYEE
    Posted Jun 06, 2019 04:21 PM

    I only showed you the bug(s)  to demonstrate to you that yes, counters that relate to memory and CPU can be wrong.

     

    You are free to think what you like.


