Real-World Example
I had an interesting discussion with a customer just recently. They are an educational institution with a decent-sized network of mostly ProCurve switches. The core switches are 8200zl chassis, the precursor to the current 5400R.
They had been having problems with the network, characterised by slowdowns, sluggish performance and slow or unresponsive CLI. It was intermittent in incidence and duration. They had logged a call, but there was nothing of consequence in the logs and monitoring was difficult because of the inconsistent nature of the issue.
As we discussed what was going on, it became clear that these symptoms only occurred during teaching hours. That pointed to an external problem source, possibly on a student device. In the past, I have come across all of these on client devices:
- mismatched network chipset/driver
- faulty network interface
- misconfigured network settings
- virus/malware
I talked about some of the lesser-known features in the switch such as virus throttling or control plane protection that may help alleviate the symptoms enough to enable better on-switch diagnostics.
About a week later, the customer reported that with better access to the CLI, they were able to run diagnostics (such as mirroring a port to a Wireshark PC) that enabled them to determine the cause. It turned out that “several” miscreants were running ping-flood denial of service (DoS) programs. It wasn’t clear whether this was a coordinated DoS or multiple independent actors, nor was it clear what the motivation was. The story ends at “disciplinary action”...
Control Plane Protection
Control Plane Protection has been around since about 15.04, although it doesn't appear to be referenced in the manuals. It works on many of the older ProCurve switches, and is recommended as part of switch hardening. Control Plane Policing (CoPP), covered in the next section, is the better option where it is available. The two are mutually exclusive: CoPP cannot be used together with control-plane-protection.
control-plane-protection enable
Control Plane Policing (CoPP)
Control Plane Policing sets rate limits on control protocol traffic to protect the CPU from being overloaded by DoS attacks.
This supersedes Control Plane Protection, and is only available in code from 16.04 onwards (which excludes some of the older switches such as the 8200zl or 5400zlV1). CoPP is much more configurable, and you can also write custom classes.
copp traffic-class all limit default
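To make the idea of per-class rate limiting concrete, here is a minimal Python sketch. It is purely illustrative and is not how the switch implements CoPP: the `PerClassPolicer` class, the fixed one-second window and the limit of 3 packets per second are all assumptions chosen for the example. Only the traffic-class names are borrowed from the switch.

```python
from collections import defaultdict


class PerClassPolicer:
    """Fixed-window packet policer keyed by traffic class (illustrative only)."""

    def __init__(self, limits_pps):
        self.limits_pps = limits_pps        # hypothetical limits, e.g. {"station-icmp": 3}
        self.window = {}                    # class -> second currently being counted
        self.count = defaultdict(int)       # class -> packets accepted in that second
        self.violations = defaultdict(int)  # class -> packets dropped so far

    def allow(self, traffic_class, now):
        """Return True if a packet of this class may reach the CPU at time `now` (seconds)."""
        second = int(now)
        if self.window.get(traffic_class) != second:
            # A new one-second window has started for this class.
            self.window[traffic_class] = second
            self.count[traffic_class] = 0
        limit = self.limits_pps.get(traffic_class)
        if limit is not None and self.count[traffic_class] >= limit:
            self.violations[traffic_class] += 1
            return False
        self.count[traffic_class] += 1
        return True


policer = PerClassPolicer({"station-icmp": 3})
# Ten ICMP packets arriving within the same second: only the first three pass.
results = [policer.allow("station-icmp", 0.1 * i) for i in range(10)]
# OSPF has no limit in this toy setup, so the ICMP flood does not affect it.
ospf_ok = policer.allow("ospf", 0.5)
```

The point of keeping a separate counter per class is that a flood in one class (ICMP here) uses up only its own budget, so routing and loop-protection traffic can still reach the CPU.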
CoPP Example
I was able to replicate the ping-flood DoS well enough to illustrate the example. The switch in use is a 5406R.
Before CoPP is enabled
Core(config)# show cpu
1 sec ave: 42 percent busy
5 sec ave: 42 percent busy
1 min ave: 23 percent busy
After CoPP is enabled
Core(config)# show cpu
5 percent busy, from 4 sec ago
1 sec ave: 7 percent busy
5 sec ave: 5 percent busy
1 min ave: 7 percent busy
Core(config)# show copp status
Traffic-Class                  CoPP Status Threshold Ex Rx Violate Pkts
------------------------------ ----------- ------------ ----------------------
station-arp                    Enabled     No
station-icmp                   Enabled     Yes          98041
station-ip                     Enabled     No
ip-gateway-control             Enabled     No
ospf                           Enabled     No
bgp                            Enabled     No
rip                            Enabled     No
multicast-route-control        Enabled     No
loop-ctrl-mstp                 Enabled     No
loop-ctrl-pvst                 Enabled     No
loop-ctrl-loop-protect         Enabled     No
loop-ctrl-smart-links          Enabled     No
layer2-control-others          Enabled     No
udld-control                   Enabled     No
sampling                       Enabled     No
icmp-redirect                  Enabled     Yes
unicast-sw-forward             Enabled     No
multicast-sw-forward           Enabled     No
mac-notification               Enabled     No
exception-notification         Enabled     No
broadcast                      Enabled     No
unclassified                   Enabled     No
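If you capture this output regularly, a few lines of Python can pick out the traffic classes that have exceeded their threshold. This is a rough sketch that assumes the column layout shown above (class name, status, Yes/No, optional packet count); `parse_copp_status` is a hypothetical helper of my own, not part of any vendor tool, and the sample text is a subset of the rows from the output above.

```python
def parse_copp_status(text):
    """Parse `show copp status` rows into dicts; header and separator lines are skipped."""
    rows = []
    for line in text.splitlines():
        parts = line.split()
        # Data rows look like: <class> <status> Yes|No [violating-packet-count]
        if len(parts) < 3 or parts[2] not in ("Yes", "No"):
            continue
        has_count = len(parts) > 3 and parts[3].isdigit()
        rows.append({
            "traffic_class": parts[0],
            "status": parts[1],
            "exceeded": parts[2] == "Yes",
            "violate_pkts": int(parts[3]) if has_count else None,
        })
    return rows


sample = """\
Traffic-Class CoPP Status Threshold Ex Rx Violate Pkts
------------------------------ ----------- ------------ ----------------------
station-arp Enabled No
station-icmp Enabled Yes 98041
icmp-redirect Enabled Yes
"""

rows = parse_copp_status(sample)
exceeded = [r["traffic_class"] for r in rows if r["exceeded"]]
print(exceeded)
```

Splitting on whitespace means the parser works on both the aligned and the flattened form of the table, as long as class names contain no spaces.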
With CoPP enabled, the pings no longer arrive as a continuous stream; the flow is slowed down and broken up.
Syslog also shows CoPP entries now.