Network Management

MVP
Posts: 1,408
Registered: ‎10-25-2011

Airwave receiving AP Down trap from controller but AP is not down

Good day,

 

Seems like I have been posting a lot, but I keep getting great answers!

 

I have an Airwave that is monitoring close to 2000 RAPs between 21 controllers.

 

I noticed that 954 APs suddenly went offline. Clicking on a few of them shows "AP is down (SNMP Trap)" under Detailed Status.

 

Looking at the traps received for a specific AP, I do not see an AP Is Down trap arriving at the time it went offline.

 

Which logs on Airwave or the controller can I look at to determine what is going on?

 

Shouldn't Airwave polling detect that the AP is up? They have been "offline" for 6 hours now, since 6am this morning...

These APs are not offline on the controller....

 

Pasquale Monardo | Senior Network Solutions Consultant
ACDX #420 | ACMP
[If you found my post helpful, please give kudos!]
Guru Elite
Posts: 20,570
Registered: ‎03-29-2007

Re: Airwave receiving AP Down trap from controller but AP is not down

Please open a support case so that this can be troubleshot in parallel with this post.  We could certainly guess, but with 954 APs down, you need answers, not guesses.

 



Colin Joseph
Aruba Customer Engineering


Moderator
Posts: 1,251
Registered: ‎10-16-2008

Re: Airwave receiving AP Down trap from controller but AP is not down

First check the controller - make sure that it's not also reported as down in AMP.

 

For troubleshooting: on the CLI, go to /var/log/ap. This directory contains a folder for each device, named by id. Navigate to the id that matches the one in the URL for a down AP; there should be 2-3 logs: an event log, an audit log, and possibly a telnet commands log if it's a device we have ssh/telnet access to.

 

Additional data can be gathered from the use of qlogs.

# qlog enable snmp_traps

# cd /var/log/amp_diag

Check the snmp_traps log file. This log tracks all SNMP traps that the AMP receives, from the time the qlog is enabled until it is disabled. If the log fills up, it follows normal log rotation so that it does not fill the hard disk.

 

* remember to disable the qlog later on

# qlog disable snmp_traps
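Once the trap log has collected some data, it can help to pull out only the lines that mention one AP's IP. The sketch below is a minimal Python filter, assuming the log is plain text with one entry per line; the sample lines are made up for illustration and do not reflect the real snmp_traps log format. It matches the IP as a whole token, so 192.168.1.24 does not also match 192.168.1.240.

```python
import re

def lines_for_ip(lines, ip):
    """Return the lines that mention `ip` as a complete address.

    The lookarounds stop a shorter address (192.168.1.24) from
    matching inside a longer one (192.168.1.240).
    """
    pattern = re.compile(r"(?<![\d.])" + re.escape(ip) + r"(?!\.?\d)")
    return [line.rstrip("\n") for line in lines if pattern.search(line)]

# Hypothetical sample lines; the real log format is not shown in this thread.
sample = [
    "made-up line: trap from 192.168.1.240 via controller-1\n",
    "made-up line: trap from 192.168.1.24 via controller-2\n",
    "made-up line: trap from 10.0.0.5 via controller-1\n",
]

matches = lines_for_ip(sample, "192.168.1.24")
for line in matches:
    print(line)
```

To run it against a real file, pass an open file handle instead of the sample list, for example `lines_for_ip(open(path), ip)`, where `path` is wherever the trap log lives on your AMP.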

 

Since a large number of your devices are down, opening a support case is the best way to get additional eyes on the issue.  Support can help find out if there's a larger issue.

 

If you don't have time for troubleshooting, polling the controller that the RAPs are connected to would be the quick fix.


Rob Gin
Senior QA Engineer - Network Services
Aruba Networks, a Hewlett Packard Enterprise Company
New Contributor
Posts: 2
Registered: ‎01-24-2012

Re: Airwave receiving AP Down trap from controller but AP is not down

It sounds like the APs aren't really down, right? Airwave is just showing them as down when they aren't.

 

Are all of the "down" APs on certain controllers? We had this happen where I work and it ended up being due to a master-local sync up issue. All of the APs on the local controller showed up as down on the master and in Airwave even though they were still up. We had to reload the master controller to resolve.

MVP
Posts: 1,408
Registered: ‎10-25-2011

Re: Airwave receiving AP Down trap from controller but AP is not down

These are all "master" controllers, there are no locals in this case.

 

I re-polled everything to quickly fix the issue, and I will still be calling support. However, after logging back in I am now seeing this error.

 

error.JPG

Pasquale Monardo | Senior Network Solutions Consultant
ACDX #420 | ACMP
Moderator
Posts: 1,251
Registered: ‎10-16-2008

Re: Airwave receiving AP Down trap from controller but AP is not down

For that issue, this may be quicker:

http://www.airwave.com/support/knowledge-base/?sid=50140000000MtYw


Rob Gin
Senior QA Engineer - Network Services
Aruba Networks, a Hewlett Packard Enterprise Company
MVP
Posts: 1,408
Registered: ‎10-25-2011

Re: Airwave receiving AP Down trap from controller but AP is not down


Hi Rob,

 

That KB fixed that issue and I was on the phone with TAC at the same time for the other issue in the OP.

 

TAC took a look and said that the traps received were most likely genuine. My polling period was set to 10 min for the AP AMP group, and each individual controller had its own group, also with polling set to 10 min. As a result, even 6 hours after the 954 APs went offline, Airwave had not finished polling all of the APs.

 

Because Airwave is monitoring so many devices, it could take up to 3-4 min to poll each device, so with 900+ APs offline we may see that behavior.
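To get a feel for why a full polling pass could take hours, here is a rough back-of-the-envelope sketch in Python. It assumes each device takes a fixed number of minutes to poll and that the work is split evenly across some number of parallel workers; the `workers` parameter is hypothetical, since the thread does not say how many devices AMP polls at once.

```python
import math

def sweep_minutes(devices, minutes_per_device, workers):
    """Rough time to poll every device once.

    Assumes each worker polls one device at a time and the
    devices are divided evenly between workers.
    """
    rounds = math.ceil(devices / workers)
    return rounds * minutes_per_device

# 954 offline APs at 3 minutes each, for a few hypothetical worker counts.
for workers in (1, 2, 6):
    print(workers, "worker(s):", sweep_minutes(954, 3, workers), "minutes")
```

Under these assumptions, even six parallel workers need 477 minutes, or roughly 8 hours, to get through 954 APs at 3 minutes each.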

 

I don't like the answer, but every AP we subsequently checked that had received an AP down trap was actually down on the controller.

 

We also found that the Device Events section for a particular device can show inconsistencies: it may display traps from different APs and different controllers. This is because the particular RAP I selected had an IP of 192.168.1.240, and so did APs on the other controllers.

So traps are matched by IP address, and this is normal behavior.

 

I will keep monitoring and see if I notice anything different today.

 

EDIT: Forgot to mention I am using AMP 7.5.5

Pasquale Monardo | Senior Network Solutions Consultant
ACDX #420 | ACMP
MVP
Posts: 1,408
Registered: ‎10-25-2011

Re: Airwave receiving AP Down trap from controller but AP is not down

A little follow-up on this case.

 

Working with TAC has been great. We have been able to properly set Airwave to receive traps and have them processed relatively quickly.

 

What they found was the following:

The async_logger_client_debug file showed that the payload_timestamp and the current_timestamp were something like 2 hours apart.

 

We had to do the following:

AMP Setup - General -> Monitoring Processes is now set to 6

Each controller's group now has a 10 min polling period, whereas the AP group has 5 min.

They also applied a script so that Airwave ignores 3 particular traps coming from the controllers. This was done so that I did not have to disable the 3 traps on 28 controllers. I can post the script if needed.

 

Currently the server is handling around 20 million traps.

 

What we also noticed is the following:

 

Each controller (28 of them: 21 active, 7 backups) has the same subnet range for the RAPs (e.g. 192.168.0.1 to 254).

Even though these RAPs will never bounce between active controllers, Airwave is getting confused when processing traps for a particular RAP.

 

RAPA for Location A will have IP 192.168.1.5 on Controller 1

RAPB for Location B will have IP 192.168.1.5 on Controller 2

 

In the Device Events for RAPA, you will sometimes see events for RAPB because it shares the same IP. TAC advised me that this is working as designed, because the traps for APs are matched on the IP address and not the MAC address.
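To make the effect concrete, here is a small Python sketch of what happens when events are matched to devices by IP alone. This is only an illustration of the behavior TAC described, not how AMP is actually implemented; the data, the function names, and the second lookup keyed on (controller, IP) are all hypothetical.

```python
from collections import defaultdict

# Hypothetical inventory: (controller, AP name, AP IP).
devices = [
    ("Controller 1", "RAPA", "192.168.1.5"),
    ("Controller 2", "RAPB", "192.168.1.5"),
]

# Hypothetical traps: (sending controller, AP IP in the trap, message).
traps = [
    ("Controller 1", "192.168.1.5", "AP down"),
    ("Controller 2", "192.168.1.5", "AP up"),
]

def events_by_ip(devices, traps):
    """Attach a trap to every device whose IP matches."""
    events = defaultdict(list)
    for _controller, name, ip in devices:
        for _source, trap_ip, message in traps:
            if trap_ip == ip:
                events[name].append(message)
    return dict(events)

def events_by_controller_and_ip(devices, traps):
    """Attach a trap only when both the sending controller and the IP match."""
    events = defaultdict(list)
    for controller, name, ip in devices:
        for source, trap_ip, message in traps:
            if (source, trap_ip) == (controller, ip):
                events[name].append(message)
    return dict(events)

print(events_by_ip(devices, traps))
print(events_by_controller_and_ip(devices, traps))
```

With IP-only matching, both RAPA and RAPB end up with both events. Matching on the controller as well keeps them apart, which is also what non-overlapping subnets would achieve.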

 

Interesting, so we may be looking at changing the subnets on all of the controllers for the RAPs so that they do not overlap.
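If we do renumber, a quick way to sanity-check a proposed plan is to test every pair of RAP pools for overlap. The sketch below uses Python's standard `ipaddress` module; the controller names and the /24 prefixes are assumptions for illustration, since the real pools are only described as a range ending in .1 to 254.

```python
import ipaddress
from itertools import combinations

def overlapping_pools(pools):
    """Return pairs of controller names whose RAP pools overlap.

    `pools` maps a controller name to its pool in CIDR notation.
    """
    nets = {name: ipaddress.ip_network(cidr) for name, cidr in pools.items()}
    return [
        (a, b)
        for a, b in combinations(sorted(nets), 2)
        if nets[a].overlaps(nets[b])
    ]

# Hypothetical plans: today every controller reuses the same pool.
current = {"Controller 1": "192.168.1.0/24", "Controller 2": "192.168.1.0/24"}
proposed = {"Controller 1": "192.168.1.0/24", "Controller 2": "192.168.2.0/24"}

print(overlapping_pools(current))
print(overlapping_pools(proposed))
```

An empty list for the proposed plan means no two controllers would hand out the same RAP address.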

 

Pasquale Monardo | Senior Network Solutions Consultant
ACDX #420 | ACMP