Wireless Access


Access network design for branch, remote, outdoor, and campus locations with HPE Aruba Networking access points and mobility controllers.

Timeout before AP will reboot

  • 1.  Timeout before AP will reboot

    Posted Jan 10, 2018 10:46 AM

    Hi,

     

    We have a scenario where I will be moving a customer's APs from their existing controller onto our controller. I plan to do this by adding a provisioning profile to each AP group that gives the APs our master IP address. I will have previously whitelisted the APs on our master. The customer will then reconfigure the switchports that the APs are on so that they are on our AP management VLAN.

     

    In testing this works fine, but there is a long delay before the AP appears on our controller after the above has been done. I assume what is happening is that the AP is trying to talk to the new master address, but has not yet re-DHCP'd so still has its old IP address. I guess eventually it gives up and reboots - at which point it gets an address on our AP vlan and then appears on our controller. My question is whether there is any way of shortening the time it takes between it losing contact with the controller and rebooting? I'm assuming the LMS hold-down period is not going to help here?

     



  • 2.  RE: Timeout before AP will reboot

    EMPLOYEE
    Posted Jan 10, 2018 10:49 AM

    If you have CPSEC enabled, the AP has to re-establish the CPSEC keys for the new controller and that would take some time (8 minutes?).  Subsequent reboots should be shorter.



  • 3.  RE: Timeout before AP will reboot

    Posted Jan 15, 2018 12:02 PM

    Ok thanks.

    Out of interest (stolen from one of your previous posts) you describe the rebootstrapping process like this:

     

    "

    During a Rebootstrap:

    - The AP turns off its radios except those with bridge mode SSIDs
    - It tries to establish communication with the current LMS using PAPI
    - If unable to contact the current LMS (lms-ip) and there is a backup LMS, it will try to establish communication with the bkup-lms-ip as configured in the AP system-profile
    - If the AP is unable to contact an LMS, it will then initiate a full reboot

    "
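    For anyone following along, the quoted sequence can be sketched roughly as below. This is only an illustration of the steps as described in the quote, not Aruba's implementation: the function name, the `can_reach` callback, and the returned strings are all made up for the example, and the radio shutdown step is not modelled.

```python
from typing import Callable, Optional


def rebootstrap(
    lms_ip: str,
    backup_lms_ip: Optional[str],
    can_reach: Callable[[str], bool],
) -> str:
    """Hypothetical model of the quoted rebootstrap steps.

    can_reach is an assumed callback that reports whether the AP can
    establish communication with the given controller address.
    """
    # Step 1 (radios off except bridge-mode SSIDs) is not modelled here.

    # Step 2: try the current LMS first.
    if can_reach(lms_ip):
        return f"connected:{lms_ip}"

    # Step 3: fall back to the backup LMS, if one is configured.
    if backup_lms_ip is not None and can_reach(backup_lms_ip):
        return f"connected:{backup_lms_ip}"

    # Step 4: no LMS reachable, so the AP initiates a full reboot.
    return "reboot"
```

    In our case there is no backup LMS and the old LMS is unreachable after the VLAN change, so the sketch would go straight to the "reboot" outcome; what it cannot tell you is how long the real AP waits before getting there.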

     

    We're going to be disconnecting the APs from their current local by remotely reprovisioning a new master address (to move them onto our central master controller, away from the customer's own Aruba controller). The VLAN will then be changed under the APs' feet so that the new master address is contactable. As mentioned, we've tried this and it takes our test AP a long time to find the new controller address and appear on our central master. At the stage where an AP is renegotiating keys, would I see any sign of it on our controller, e.g. would it appear in the database yet? Would there be records of its MAC in the logs?

     

    At the point where you say 'If the AP is unable to contact an LMS, it will then initiate a full reboot' do you know how long it waits before it initiates the reboot?

     

    Thanks again



  • 4.  RE: Timeout before AP will reboot

    EMPLOYEE
    Posted Jan 15, 2018 01:54 PM

    "show ap database" would best show you what is going on with that AP.

     

    You should cut the power and drop the interface, so that the AP can find a new controller from scratch immediately.

     

    On the other hand, if you used a provisioning profile, the master-ip would be burned into the AP's flash and it will always attempt to look for that controller.  You might want to get a console cable to understand what is happening, because I am sure I am not accounting for every situation.



  • 5.  RE: Timeout before AP will reboot

    Posted Jan 15, 2018 02:03 PM

    Yes cutting the power would be handy - unfortunately the customer has power injectors for a lot of these APs and we'll be working remotely so it's going to be tricky. Well, we can try bumping the interface to see if that speeds things up, sounds worth a try.

     

    Just so I am clear, is this right: once an AP misses a certain number of heartbeats it rebootstraps and goes through the process of trying to talk to a backup LMS (if one is configured). If it rebootstraps 8 times without managing to talk to a controller then it hits the threshold and reboots completely. Does that sound correct? If so I guess we could bring that threshold down to speed the process up a little.

     

    I'll try to get a console on the test AP tomorrow.

     

    Thanks for the advice.



  • 6.  RE: Timeout before AP will reboot

    EMPLOYEE
    Posted Jan 15, 2018 02:46 PM
    There are a lot of ways this can go wrong. Let's start from scratch. How do the APs find the current controller that they are on? DHCP, DNS, static?


  • 7.  RE: Timeout before AP will reboot

    Posted Jan 15, 2018 03:23 PM

    I'll ask the customer for details of how the APs are addressed and provisioned...

     

    I noticed that after we moved a couple of their APs to our system (a trial run with 2 APs which worked, but took a long time) when I look at the AP provisioning parameters (with the AP terminated on our local) the customer's master IP address is in the 'Server IP' field. I'm not sure how or when that field gets populated. It doesn't appear to be populated on other APs I've looked at that are still on their own system. None of our own APs have it set. Hmmmmm! I'll ask the customer whether Server IP was manually set for some reason on these particular APs, unfortunately I didn't notice whether it was set prior to the transition to our controller, but I assume it must have been.



  • 8.  RE: Timeout before AP will reboot

    Posted Jan 16, 2018 09:19 AM

    Hello again,

     

    So I chatted with our customer and it seems there is nothing particularly odd about the way they set their APs up. They use DHCP reservations for the addresses (we just use dynamic) and there aren't any odd settings.

     

    We've been practising some test migrations of a single AP again today with varying results. The AP always eventually appears on our system, but it can take nearly an hour to do so sometimes. Unfortunately I haven't been able to get onsite to console in to the AP to get any more detail, I might see if that's a possibility, but I have quite a lot of preparation I want to make sure I have done ready for the migration tomorrow(!)

     

    I think, as you said, the key to speeding it up is being able to make the AP reboot ASAP when it has been pushed onto our AP management VLAN. At this stage it is still using its old LMS address, but it has our master address provisioned to it. Is there a way of decreasing the watchdog timeout? I have set the bootstrap threshold down to 3 but that hasn't had much effect as far as I can tell.

     

    Unfortunately the customer uses a lot of power injectors so we don't have a way of powering off via the switchports. 



  • 9.  RE: Timeout before AP will reboot

    EMPLOYEE
    Posted Jan 16, 2018 10:04 AM

    What is the big picture?  Does the customer own all of the controllers, and are you doing a migration from one set of controllers to another?  Do the controllers have the same version of ArubaOS?  Did you look at "show ap provisioning ap-name <name of ap>" on all of the customer's APs to see if the customer has any fixed parameters configured on their APs?  Do they have a backup LMS IP configured in their AP system profile?

     

    It is better to send a reboot command to that AP and when you can't ping it anymore, switch the VLAN, rather than rely on the bootstrap threshold to do anything.
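    The "wait until you can't ping it anymore" step could be scripted along these lines. This is a minimal sketch, not a tested migration tool: `wait_until_unreachable` and its parameters are hypothetical, and `is_reachable` is an assumed callback that you would supply yourself (for example a wrapper around your platform's ping).

```python
import time
from typing import Callable


def wait_until_unreachable(
    is_reachable: Callable[[], bool],
    timeout_s: float = 300.0,
    poll_interval_s: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll until the AP stops answering, or give up after timeout_s.

    Returns True as soon as is_reachable() reports False (the point at
    which the switchport VLAN could be changed), and False if the AP was
    still answering when the timeout expired.
    """
    deadline = clock() + timeout_s
    while clock() < deadline:
        if not is_reachable():
            return True
        sleep(poll_interval_s)
    return False
```

    The idea would be to issue the reboot from the original controller, call this with a reachability check for the AP's current address, and only switch the VLAN once it returns True. The `sleep` and `clock` arguments exist only so the loop can be exercised without real delays.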



  • 10.  RE: Timeout before AP will reboot

    Posted Jan 16, 2018 10:28 AM

    They are migrating all of their APs to our controllers. They have 3 x 3600 series controllers (1 master, 2 locals) all running the same firmware; we're running 7-series controllers with different firmware. Despite the different firmware, once the APs have managed to contact our controller they seem to upgrade OK, and once they are up and running it's all fine (we already have 2 APs of theirs on our system which we bumped across in this way). I can't say that I have checked every single AP for params, but I have asked the customer and he has no knowledge of anything special applied anywhere, and certainly the test AP has nothing obviously wonky about it. There is no backup LMS.

     

    One thing I have been experimenting with is the whitelisting command.  I have tried adding the ap-group and (in my last test only) ap-name as well, eg:

     

    whitelist-db cpsec add mac-address 00:0b:86:80:ed:64 ap-name 00:0b:86:80:ed:64 ap-group fitz-main_h-aps

     

    This is because we have created a different set of AP groups to those that they use, and also they have named their APs with descriptive names whereas we just use MAC address. Maybe this is confusing things, but wouldn't I see the AP appear on the controller at least before the whitelist command has any effect? At the moment I just see 'AP with MAC address <mac> not found'. And having the ap-group specified in the whitelist has worked up until now (albeit slowly).
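    Since there are a lot of APs to whitelist, the command above can be generated per AP rather than typed by hand. A small sketch, assuming the same command form shown in this post; the helper name `whitelist_add_command` is made up, and defaulting the AP name to the MAC address simply mirrors our own naming convention.

```python
import re
from typing import Optional

# Colon-separated MAC address, lower case, e.g. 00:0b:86:80:ed:64
_MAC_RE = re.compile(r"[0-9a-f]{2}(:[0-9a-f]{2}){5}")


def whitelist_add_command(mac: str, ap_group: str, ap_name: Optional[str] = None) -> str:
    """Build one 'whitelist-db cpsec add' line in the form used above.

    If ap_name is omitted, the (normalised) MAC address is used as the name.
    """
    mac = mac.strip().lower()
    if not _MAC_RE.fullmatch(mac):
        raise ValueError(f"not a colon-separated MAC address: {mac!r}")
    name = ap_name if ap_name is not None else mac
    return f"whitelist-db cpsec add mac-address {mac} ap-name {name} ap-group {ap_group}"


if __name__ == "__main__":
    # Example input taken from the command shown in this post.
    print(whitelist_add_command("00:0B:86:80:ED:64", "fitz-main_h-aps"))
```

    Looping this over a list of (MAC, group) pairs gives a block of commands that can be pasted in shortly before the migration window.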



  • 11.  RE: Timeout before AP will reboot

    EMPLOYEE
    Posted Jan 16, 2018 10:40 AM

    If the delay is with the AP finding your controller, that is what needs to be troubleshot.  If the customer has failover configured incorrectly, that could be what is causing the holdup.  You are also not going to get a smooth transition with APs that have power injectors, unless you reboot them from the first controller and switch the port after the AP is not pingable anymore.

     

    You are saying that "the customer is saying", but you need to go onsite, and plug in a console cable to understand what is really going on, because frequently customers are misinformed, or things may have changed.



  • 12.  RE: Timeout before AP will reboot

    Posted Jan 16, 2018 12:58 PM

    Ok thanks, I'll get a console cable on it.



  • 13.  RE: Timeout before AP will reboot

    Posted Jan 22, 2018 12:15 PM

    Ok, so by way of wrapping up loose ends I just wanted to update on how the migration went and highlight a couple of gotchas:

     

    A procedure that I tested but which did not work was to simply change the VLAN that the AP was sitting on to our AP management VLAN. The idea was that eventually the AP would realise it couldn't talk to its current master and would reboot, and that as part of the reboot process it would *not* reuse the same master address it had been using, but would pick up the option 43 address that we supply to it via DHCP. But although the AP did pick up a new address for itself on the new VLAN, it just kept trying to use the old master address and so this method failed. I don't know if this is expected behaviour?

     

    The method we ended up using instead was to add a provisioning profile to each group of APs which included our master address, then once the APs went down the customer reconfigured his switchports onto our AP vlan. The APs were then able to find our master once they rebooted.

     

    I whitelisted all the APs that were being migrated onto our system in advance, including the ap-name and ap-group parameters. I did this at about 8am on the morning of the migration. We began shifting APs at about 11 and the first (thankfully small) batch came up as 'Denied' - had the entries timed out? They still showed when I ran 'show whitelist cpsec mac <mac>'. I had to delete them and re-add the entries, which itself was entertaining: sometimes I would delete the entry and re-add it and all would be fine, but other times I would delete it and try to re-add the entry but would receive notification that the entry already existed! This would happen whether or not I did a 'wr mem' after deleting the entry and waited for the locals to update. Once the entries were back in, however, the APs came up with the right AP groups and right AP names, so as long as I added them to the whitelist very shortly before migration it worked well.

     

    Anyway, ultimately the migration was completed in pretty good time and without much drama once the initial whitelist issue was ironed out.

     

    Thanks