Protecting the Switch Control Plane
05-26-2018 12:02 AM - edited 05-26-2018 12:09 AM
Real World Example
I had an interesting discussion with a customer just recently. They are an educational institution with a decent-sized network of mostly ProCurve switches. The core switches are 8200zl chassis, the precursor to the current 5400R.
They had been having problems with the network, characterised by slowdowns, sluggish performance and slow or unresponsive CLI. It was intermittent in incidence and duration. They had logged a call, but there was nothing of consequence in the logs and monitoring was difficult because of the inconsistent nature of the issue.
As we discussed what was going on, it became clear that these symptoms only occurred during teaching hours. That pointed to an external problem source, possibly on a student device. In the past, I have come across all of these on client devices:
- mismatched network chipset/driver
- faulty network interface
- misconfigured network settings
I talked about some of the lesser-known features in the switch, such as virus throttling and control plane protection, which may alleviate the symptoms enough to enable better on-switch diagnostics.
About a week later, the customer reported that with better access to the CLI, they were able to run diagnostics (such as mirroring a port to a Wireshark PC) that enabled them to determine the cause. It turned out that “several” miscreants were running ping-flood denial of service (DoS) programs. It wasn’t clear whether this was a coordinated DoS or multiple independent actors, nor was it clear what the motivation was. The story ends at “disciplinary action”...
Control Plane Protection
Control Plane Protection has been around since about 15.04, although it doesn't appear to be referenced in the manuals. It works on many of the older ProCurve switches, and is recommended as part of switch hardening. Where it is available, CoPP (Control Plane Policing, described next) is the better option. CoPP is mutually exclusive with control-plane-protection.
Control Plane Policing (CoPP)
Control Plane Policing sets rate limits on control protocols to protect the CPU from overload caused by DoS attacks.
This supersedes Control Plane Protection, and is only available in code from 16.04 onwards (which excludes some of the older switches such as the 8200zl or 5400zlV1). CoPP is much more configurable, and you can also write custom classes. The following command applies the default limit to all traffic classes:
copp traffic-class all limit default
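To make the idea of a per-class rate limit more concrete, here is a minimal Python sketch of a generic token-bucket policer. It is only an illustration of the principle: how the switch implements CoPP internally is not described here, and the names `Policer`, `rate_pps`, `burst` and `simulate_flood` are hypothetical, as are the numbers used.

```python
from dataclasses import dataclass


@dataclass
class Policer:
    """Generic token-bucket policer (illustrative; not the switch's actual algorithm)."""
    rate_pps: float       # hypothetical limit: packets per second admitted to the CPU
    burst: float          # hypothetical bucket depth, in packets
    tokens: float = 0.0
    last: float = 0.0
    violations: int = 0

    def __post_init__(self):
        self.tokens = self.burst  # start with a full bucket

    def allow(self, now: float) -> bool:
        # Refill tokens for the time elapsed since the previous packet.
        elapsed = now - self.last
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate_pps)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        self.violations += 1  # packet exceeds the limit and is dropped
        return False


def simulate_flood(policer: Policer, packets: int, interval: float) -> int:
    """Offer `packets` packets spaced `interval` seconds apart; return how many pass."""
    return sum(policer.allow(i * interval) for i in range(packets))


if __name__ == "__main__":
    icmp = Policer(rate_pps=100, burst=10)
    passed = simulate_flood(icmp, packets=1000, interval=0.001)
    print(f"passed={passed} dropped={icmp.violations}")
```

In this sketch a flood offered at ten times the configured rate has most of its packets dropped before they reach the CPU, while traffic below the limit passes untouched. The dropped packets are what a violation counter would record.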
I was able to replicate the ping-flood DoS well enough to illustrate the effect. The switch in use is a 5406R.
Before CoPP is enabled
Core(config)# show cpu
1 sec ave: 42 percent busy
5 sec ave: 42 percent busy
1 min ave: 23 percent busy
After CoPP is enabled
Core(config)# show cpu
5 percent busy, from 4 sec ago
1 sec ave: 7 percent busy
5 sec ave: 5 percent busy
1 min ave: 7 percent busy

Core(config)# show copp status

Traffic-Class                  CoPP Status Threshold Ex Rx Violate Pkts
------------------------------ ----------- ------------ ----------------------
station-arp                    Enabled     No
station-icmp                   Enabled     Yes          98041
station-ip                     Enabled     No
ip-gateway-control             Enabled     No
ospf                           Enabled     No
bgp                            Enabled     No
rip                            Enabled     No
multicast-route-control        Enabled     No
loop-ctrl-mstp                 Enabled     No
loop-ctrl-pvst                 Enabled     No
loop-ctrl-loop-protect         Enabled     No
loop-ctrl-smart-links          Enabled     No
layer2-control-others          Enabled     No
udld-control                   Enabled     No
sampling                       Enabled     No
icmp-redirect                  Enabled     Yes
unicast-sw-forward             Enabled     No
multicast-sw-forward           Enabled     No
mac-notification               Enabled     No
exception-notification         Enabled     No
broadcast                      Enabled     No
unclassified                   Enabled     No
Rather than a continuous stream of pings, the flow is now slowed down and broken up.
Syslog also shows CoPP entries now.
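If you collect `show copp status` output from several switches, a short script can pick out the traffic classes that have exceeded their threshold. The Python sketch below assumes the four-column layout shown earlier in this post. The function name `exceeded_classes` is hypothetical, and treating `Disabled` as a possible status value is an assumption, since only `Enabled` appears in the output above.

```python
# A few rows copied from the "show copp status" output earlier in this post.
SAMPLE = """\
Traffic-Class                  CoPP Status Threshold Ex Rx Violate Pkts
------------------------------ ----------- ------------ ----------------------
station-arp                    Enabled     No
station-icmp                   Enabled     Yes          98041
station-ip                     Enabled     No
icmp-redirect                  Enabled     Yes
broadcast                      Enabled     No
"""


def exceeded_classes(output: str) -> dict:
    """Map each traffic class whose threshold was exceeded to its violate count.

    The count is None when the row shows "Yes" but no packet count.
    """
    result = {}
    for line in output.splitlines():
        fields = line.split()
        # Data rows look like: <class> <status> <Yes|No> [count].
        # "Disabled" is an assumed status value; header and separator rows are skipped.
        if len(fields) < 3 or fields[1] not in ("Enabled", "Disabled"):
            continue
        if fields[2] == "Yes":
            has_count = len(fields) > 3 and fields[3].isdigit()
            result[fields[0]] = int(fields[3]) if has_count else None
    return result


if __name__ == "__main__":
    for name, count in exceeded_classes(SAMPLE).items():
        print(name, count)
```

Splitting on whitespace works here because none of the traffic-class names contain spaces. If a future code release changes the column layout, the parsing would need to be adjusted.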
Richard Litchfield, HPE Aruba
Network Solution Architect