Wired Intelligent Edge



MAS S1500 hangs SSH session within a few commands

  • 1.  MAS S1500 hangs SSH session within a few commands

    Posted Apr 22, 2015 03:17 PM

    Well, bummed to discover I don't have a support contract on my switches. I wasn't ever offered one, and am now trying to price it out.

     

    So I turn to the community!

     

    I am having some STP issues in my predominantly HP switch stack, to which I have a few Arubas connected. The web interface is pretty useless for analysis or config, so I am SSH'ing in. But after I log in and issue one or two commands, the SSH session drops.

     

    So my only choice is a direct console cable, which isn't ideal. And I wonder if the SSH problem is a symptom of a deeper problem.

     

    Switch is 7.4.0.2 OS and controls access points. It is performing fine, except for this SSH-CLI problem. I would like to solve that, so I can do a better job of log analysis and configuration for SNMP.

     

    Thanks!



  • 2.  RE: MAS S1500 hangs SSH session within a few commands

    Posted Apr 23, 2015 12:08 AM

    Kevets,

     

    1. What is the IP connectivity interface

         out-of-band management interface (interface mgmt)   or

         in-band-mgmt (interface vlan <x>) 

    2. Can you run a continuous ping to the switch's connectivity IP address and see if there are any drops?

        At the same time, run the ssh session to test the experience

    3. If you have console access to switch, run the below commands

        show clock

        show spanning-tree | include Last     (repeat this command several times, a few seconds apart)

        show log all 50 | include Flushing    (repeat this command several times, a few seconds apart)

     

     What I am trying to find out is whether too many STP topology changes are happening, causing the switch IP to become intermittently unreachable.

     If that is not the case, then we need to check whether anything related to an SSH timeout is causing this.
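
    A minimal sketch of step 2, assuming Python 3 is available on the admin workstation. It substitutes a TCP connect to port 22 for an ICMP ping (raw ICMP needs elevated privileges), so it measures reachability of the SSH service rather than plain IP reachability. The names probe, find_outages, and monitor are illustrative, not part of any Aruba tooling.

```python
import socket
import time


def probe(host, port=22, timeout=1.0):
    """Return True if a TCP connection to host:port opens within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts, and timeouts.
        return False


def find_outages(samples):
    """Collapse (timestamp, ok) samples into (first_fail, last_fail) windows."""
    outages = []
    start = last_fail = None
    for ts, ok in samples:
        if not ok:
            if start is None:
                start = ts
            last_fail = ts
        elif start is not None:
            outages.append((start, last_fail))
            start = None
    if start is not None:
        # The run of failures was still open when sampling stopped.
        outages.append((start, last_fail))
    return outages


def monitor(host, duration=60.0, interval=1.0):
    """Probe host once per interval for duration seconds; return outages."""
    samples = []
    deadline = time.monotonic() + duration
    while time.monotonic() < deadline:
        samples.append((time.time(), probe(host)))
        time.sleep(interval)
    return find_outages(samples)
```

    Calling monitor("<switch IP>", duration=600) while working in a separate SSH session would show whether the drops line up with moments when the switch stops accepting connections. Each probe opens and closes a connection, which the switch may log.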

     

    Thanks,

    -Vinay



  • 3.  RE: MAS S1500 hangs SSH session within a few commands

    Posted Apr 23, 2015 12:23 AM

    Since you mentioned it is an S1500, you can ignore question 1, as the S1500 doesn't have an OOB interface (interface mgmt).

       Just look at points 2 and 3 above.

     



  • 4.  RE: MAS S1500 hangs SSH session within a few commands

    Posted Apr 23, 2015 10:02 AM

    Vinay -

     

    Thanks so much.

    I am definitely having MSTP issues. Whether the Aruba gear is the victim or the perpetrator is what I’m trying to deduce, and the Aruba is something I am only slightly up to speed on.

    • Setup:
       HP Core switch
       5 HP Edge switches connected
       1 Aruba S1500 connected as edge (it is both for AP control as well as my Internet Layer 3 connection)
       1 Aruba 7210 connected

    All of the edge switches recognize the Core as CST root

     

    The problem:
      It’s a weird one! Every morning (literally every morning 7 days a week) between 9 and 9:40 in the morning, I have a broadcast storm on the switch stack and HP errors indicating CST election conversations going on. I keep on trying to find a single culprit. We are a zoo, open 7 days a week, and systems do come on at this time. But to date, we have not been able to find a smoking gun here.

     

    The temporary fix:
      If I reboot one or two of the edge switches, the storm goes away. Also, if I temporarily disconnect the Aruba controller, the storm clears. I don’t have STP on the controller and I did have TAC look at that pretty extensively, so I don’t think it’s the culprit. Plus, the problem is once a day at the same time, and the controller doesn’t do anything on that kind of schedule.

     

    So, to answer your questions:

    > What is the IP connectivity interface
    Hmm, I’m not sure. I’m attaching the config file.

    > Can you run a continuous ping to switch connectivity IP address & see if any drops
    Continuous ping no problem

    > If you have console access to switch, run the below commands
    >    show clock
    hmm, I am off by an hour

    > show spanning-tree | include Last     (repeat this command multiple times after few seconds)


    Last TC received on intf GE0/0/23, on 2015-04-21 20:35:45 (EST)
    and that doesn't change


     > show log all 50 | include Flushing    (repeat this command multiple times after few seconds)

    Apr 22 13:36:58 :340004:  <WARN> |l2m|  Flushing mac-addresses on GE0/0/23 vlan-id 1 due to STP topology change


    Apr 22 13:36:58 :340004:  <WARN> |l2m|  Flushing mac-addresses on GE0/0/23 vlan-id 99 due to STP topology change


    Apr 22 13:36:58 :340004:  <WARN> |l2m|  Flushing mac-addresses on GE0/0/23 vlan-id 110 due to STP topology change


    Apr 22 13:36:58 :340004:  <WARN> |l2m|  Flushing mac-addresses on GE0/0/23 vlan-id 114 due to STP topology change


    Apr 22 13:36:58 :340004:  <WARN> |l2m|  Flushing mac-addresses on GE0/0/23 vlan-id 999 due to STP topology change


    Apr 22 13:36:58 :340004:  <WARN> |l2m|  Flushing mac-addresses on GE0/0/8 vlan-id 114 due to STP topology change


    Apr 22 13:36:58 :340004:  <WARN> |l2m|  Flushing mac-addresses on GE0/0/9 vlan-id 114 due to STP topology change


    Apr 22 13:36:58 :340004:  <WARN> |l2m|  Flushing mac-addresses on GE0/0/18 vlan-id 99 due to STP topology change


    Apr 22 13:36:58 :340004:  <WARN> |l2m|  Flushing mac-addresses on GE0/0/20 vlan-id 999 due to STP topology change


    Apr 22 13:47:35 :340004:  <WARN> |l2m|  Flushing mac-addresses on GE0/0/22 vlan-id 114 due to STP topology change


    and nothing today.
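
    For anyone wanting to summarize output like this, here is a minimal sketch that tallies the flush messages by port and by timestamp. It assumes the log format is exactly as quoted above; FLUSH_RE and tally_flushes are illustrative names.

```python
import re
from collections import Counter

# Matches the MAS log lines quoted in this post.
FLUSH_RE = re.compile(
    r"^\s*(?P<stamp>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2}).*"
    r"Flushing mac-addresses on (?P<port>\S+) vlan-id (?P<vlan>\d+)"
)


def tally_flushes(lines):
    """Count flush events per port and per timestamp."""
    by_port = Counter()
    by_time = Counter()
    for line in lines:
        m = FLUSH_RE.search(line)
        if m is None:
            continue  # Skip blank lines and unrelated log entries.
        by_port[m.group("port")] += 1
        by_time[m.group("stamp")] += 1
    return by_port, by_time


# The ten log lines from this post, copied verbatim.
SAMPLE = """\
Apr 22 13:36:58 :340004:  <WARN> |l2m|  Flushing mac-addresses on GE0/0/23 vlan-id 1 due to STP topology change
Apr 22 13:36:58 :340004:  <WARN> |l2m|  Flushing mac-addresses on GE0/0/23 vlan-id 99 due to STP topology change
Apr 22 13:36:58 :340004:  <WARN> |l2m|  Flushing mac-addresses on GE0/0/23 vlan-id 110 due to STP topology change
Apr 22 13:36:58 :340004:  <WARN> |l2m|  Flushing mac-addresses on GE0/0/23 vlan-id 114 due to STP topology change
Apr 22 13:36:58 :340004:  <WARN> |l2m|  Flushing mac-addresses on GE0/0/23 vlan-id 999 due to STP topology change
Apr 22 13:36:58 :340004:  <WARN> |l2m|  Flushing mac-addresses on GE0/0/8 vlan-id 114 due to STP topology change
Apr 22 13:36:58 :340004:  <WARN> |l2m|  Flushing mac-addresses on GE0/0/9 vlan-id 114 due to STP topology change
Apr 22 13:36:58 :340004:  <WARN> |l2m|  Flushing mac-addresses on GE0/0/18 vlan-id 99 due to STP topology change
Apr 22 13:36:58 :340004:  <WARN> |l2m|  Flushing mac-addresses on GE0/0/20 vlan-id 999 due to STP topology change
Apr 22 13:47:35 :340004:  <WARN> |l2m|  Flushing mac-addresses on GE0/0/22 vlan-id 114 due to STP topology change
"""

by_port, by_time = tally_flushes(SAMPLE.splitlines())
for port, count in by_port.most_common():
    print(port, count)
```

    On these ten lines, GE0/0/23 accounts for five of the flushes (one per VLAN), and nine of the ten share the 13:36:58 timestamp, which suggests a single topology change fanning out across ports and VLANs.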


    #7210

    Attachment(s)

    txt
    switchcfg.txt   6 KB 1 version


  • 5.  RE: MAS S1500 hangs SSH session within a few commands

    Posted Apr 23, 2015 02:34 PM

    It is always best practice to configure all the edge ports with 'portfast'.  That prevents a device connect/disconnect or an AP reboot from inducing an STP state change and the STP recalculation that follows.

     

    And, from your configs, I see that you have already taken care of it (for ports where APs will be connected)

    ---------------------------------------------------

    interface-profile mstp-profile "AP"
       portfast
    !

    interface-group gigabitethernet "AP"
       apply-to 0/0/1-0/0/6,0/0/10-0/0/15
       mstp-profile "AP"
       poe-profile "AP"
       mtu 9216
       switching-profile "AP"
    !

    ---------------------------------------------------

     

    And from the logs: the STP change notifications are happening on ports 8, 9, 18, 20, 22, 23

     - which are "outside" the current set of portfast ports (0/0/1-0/0/6,0/0/10-0/0/15).
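
     The same comparison can be sketched in a few lines, assuming the apply-to range syntax shown in the config above. expand_ports is an illustrative helper, and the flushing set is simply the ports read off the log excerpt.

```python
def expand_ports(spec):
    """Expand a range spec like '0/0/1-0/0/6,0/0/10-0/0/15' into port names."""
    ports = set()
    for part in spec.split(","):
        start, _, end = part.strip().partition("-")
        end = end or start  # A single port has no '-' and is its own range.
        prefix, _, first = start.rpartition("/")
        end_prefix, _, last = end.rpartition("/")
        if prefix != end_prefix:
            raise ValueError(f"range crosses slots/modules: {part!r}")
        for n in range(int(first), int(last) + 1):
            ports.add(f"{prefix}/{n}")
    return ports


def port_number(port):
    """Sort key: the trailing port index of a 'slot/module/port' name."""
    return int(port.rsplit("/", 1)[1])


# Ports covered by the "AP" interface-group, which carries portfast.
portfast = expand_ports("0/0/1-0/0/6,0/0/10-0/0/15")

# Ports that logged "Flushing mac-addresses" in the excerpt.
flushing = {"0/0/8", "0/0/9", "0/0/18", "0/0/20", "0/0/22", "0/0/23"}

uncovered = sorted(flushing - portfast, key=port_number)
print(uncovered)
```

     Every port in the flushing set ends up in uncovered, since none of them falls inside the two portfast ranges.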

     Double-check whether the troublesome ports above are also candidates for portfast (that is, ports where end devices such as PCs, printers, etc. connect) and, if so, configure them with portfast as well.

        Basically, repeat this exercise to configure all possible edge ports with portfast, even on the HP switches (if they have any such ports).

      Then configure a higher bridge-priority value on the edge switches and lower values as you move toward the core, with the root switch having the lowest value.

     

    (MAS) (config) #mstp
    (MAS) (Global MSTP) #instance 0 bridge-priority ?
    <bridge-priority>       Bridge-priority [0-61440 in steps of 4096]. Default:
                            32768

     

    With that, you should see better stability across network.
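
    To make the numbering concrete, here is a small sketch that enumerates the values the CLI help above allows and assigns them by distance from the core. The switch names, the tier numbers, and the choice of 4096 for the root are assumptions for illustration only, not values taken from your config.

```python
STEP = 4096
MAX_PRIORITY = 61440
DEFAULT_PRIORITY = 32768

# Every value the CLI accepts: 0, 4096, 8192, ..., 61440.
VALID_PRIORITIES = list(range(0, MAX_PRIORITY + 1, STEP))


def plan_priorities(tiers, root_priority=4096):
    """Map each switch to a bridge priority from its hop distance to the core.

    tiers: dict of switch name -> tier (0 = intended root, 1 = next layer, ...).
    Lower values win the root election, so the core gets the lowest value.
    """
    plan = {}
    for name, tier in tiers.items():
        priority = root_priority + tier * STEP
        if priority not in VALID_PRIORITIES:
            raise ValueError(f"{name}: {priority} is not a valid bridge priority")
        plan[name] = priority
    return plan


# Hypothetical names standing in for the core, one edge switch, and the MAS.
plan = plan_priorities({"core": 0, "edge-1": 1, "mas": 1})
for name, priority in plan.items():
    print(name, priority)
```

    Any switch left at the default of 32768 still loses the election to the core under this plan; the point of setting the values explicitly is that the outcome no longer depends on MAC addresses breaking a tie.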

     

    Note: If you have any edge ports configured as trunk ports, then configure them as 'portfast trunk'

     

    Alternatively,

    Run PVST on all switches instead of MSTP

    spanning-tree mode pvst

    !

    interface-profile pvst-port-profile <name>

      portfast

    !

     Then apply to interface-group config.

     

    If issue still persists, then it needs to be investigated further...

     

    Thanks,

    -Vinay

     

     

     



  • 6.  RE: MAS S1500 hangs SSH session within a few commands

    Posted Apr 23, 2015 02:40 PM

     

    We've found the HP kit to be more finicky when it comes to keeping STP stable during congestion or during excess traffic to the management plane.  What model are the HP edge switches?  In particular, are you doing any SNMP write operations to them?

     

    As for the "Flushing mac-addresses" messages, we get these on all our MAS.  I don't know why we get them even when the last TC received does not update, but it hasn't caused any noticeable issues.

     



  • 7.  RE: MAS S1500 hangs SSH session within a few commands

    Posted Apr 23, 2015 03:04 PM

    Thanks. The HP Core is a ProCurve 3500. The HP Edges are 2810s.



  • 8.  RE: MAS S1500 hangs SSH session within a few commands

    Posted Apr 23, 2015 03:05 PM

    And no, we are just SNMP reading from the HPs. I actually need to set up SNMP on the Aruba - just have to figure out how.



  • 9.  RE: MAS S1500 hangs SSH session within a few commands

    Posted Apr 23, 2015 03:19 PM

    Ah, well, not my models.  I have only one 2810 that is sitting unused on a shelf for lack of AAA port access.

     

    The first thing I'd look for is one of the HP switches complaining in the switch logs about being "starved" for a BPDU.  That switch is often close to the source of the problem, if not the source itself.  The error may not reach external syslog servers, so go in and check with "log -r".  Luckily, HPs tend to have pretty deep log buffers.  Another one is "Out of pkt buffers", which means something is slamming the management plane.

     

    The previous advice about changing your bridge prios on the core is good; I would definitely do that.



  • 10.  RE: MAS S1500 hangs SSH session within a few commands

    Posted Apr 24, 2015 10:09 AM

    Thanks. BPDU starved for a receive is definitely in our stew of problems. Whether this precedes or follows the ensuing broadcast storm with high collisions is something I haven't been able to figure out.

     

    On one hand, this problem seems to coincide with our expansion into multiple VLANs, so it's easy to think it's a loop somewhere or a config issue.

     

    On the other hand - why is this only once a day, at the same time of day (within 40 minutes)?

     

    It's been the devil to debug. I'm cleaning all sorts of things up, so there's the positive in it, but it's gotten stale, having to deal with this 7 days a week.