Please open a TAC case to troubleshoot this.
From my interpretation of the logs, I would suspect the (Kerberos) connections between ClearPass and your AD servers. It seems like more authentication requests are coming in than your AD can handle; please also check the load on the AD servers during these issues. ClearPass limits the number of concurrent authentications to backends like AD to prevent AD from locking up.
What you will probably see is that transaction times increase rapidly during these problems. You can check that on the ClearPass Dashboard in the Request Processing Time widget, and in more detail in Monitoring -> Live Monitor -> System Monitor -> ClearPass (tab). There you can see the transaction times for each part of the authentication, so you can find where the delay is.
It might also be a connection problem between ClearPass and the AD servers (spanning tree? flapping switch ports?), or a Denial of Service caused by a poorly configured client (in that case you would see a high rate of failed authentications from a single source or username). 14,000 failures in 15 minutes might be overloading your AD if you haven't scaled it for that load. You can also manually set the login servers for MSCHAPv2 to steer ClearPass to specific high-capacity or nearby servers, which prevents it from using a remote AD server for authentication:
Configure an (optional) restricted list of domain controllers to be used for MSCHAPv2 authentication. If not specified, all available domain controllers obtained from DNS will be used for authentications.
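To check for the poorly configured client mentioned above, you could export the failed requests and count them per username or source. The following is only a rough sketch in Python: the record layout (`username`, `source`, `status`) and the `REJECT`/`ACCEPT` values are assumptions for illustration, not a documented ClearPass export format, and `top_failure_sources` is a hypothetical helper name. It also works out the average failure rate for the 14,000 failures in 15 minutes.

```python
from collections import Counter


def top_failure_sources(records, key, limit=3):
    """Count rejected requests per value of `key`, most frequent first."""
    counts = Counter(r[key] for r in records if r["status"] == "REJECT")
    return counts.most_common(limit)


# Hypothetical sample data; real records would come from your own export.
records = [
    {"username": "printer01", "source": "10.0.0.5", "status": "REJECT"},
    {"username": "printer01", "source": "10.0.0.5", "status": "REJECT"},
    {"username": "printer01", "source": "10.0.0.5", "status": "REJECT"},
    {"username": "alice", "source": "10.0.0.7", "status": "ACCEPT"},
    {"username": "bob", "source": "10.0.0.8", "status": "REJECT"},
]

print(top_failure_sources(records, "username"))
print(top_failure_sources(records, "source"))

# Average failure rate for 14,000 failures in 15 minutes.
failures, minutes = 14000, 15
rate_per_second = round(failures / (minutes * 60), 1)
print(rate_per_second)  # 15.6
```

If a single username or source accounts for most of the failures, that points to one misbehaving client rather than a general capacity problem on the AD side.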
TAC can help you with these troubleshooting tasks.
Herman