Environment: Site-to-site IPsec VPN tunnel
In short:
Dead Peer Detection (DPD) is a method of detecting a dead Internet Key Exchange (IKE) peer. The method uses IPsec traffic patterns to minimize the number of messages required to confirm the availability of a peer. DPD is used to reclaim the lost resources in case a peer is found dead and it is also used to perform IKE peer failover.
Explanation:
When two peers communicate with IKE and IPSec, the situation may arise in which connectivity between the two goes down unexpectedly. This situation can arise because of routing problems, one host rebooting, etc., and in such cases, there is often no way for IKE and IPSec to identify the loss of peer connectivity. As such, the SAs can remain until their lifetimes naturally expire resulting in a "black hole" situation where packets are tunneled to oblivion. It is often desirable to recognize black holes as soon as possible so that an entity can failover to a different peer quickly. Likewise, it is sometimes necessary to detect black holes to recover lost resources.
This problem of detecting a dead IKE peer has been addressed by proposals that require sending periodic HELLO/ACK messages to prove liveliness. These schemes tend to be unidirectional (a HELLO only) or bidirectional (a HELLO/ACK pair). For the purposes of this article, the term "heartbeat" refers to a unidirectional message to prove liveliness. Likewise, the term "keepalive" refers to a bidirectional message.
The problem with current heartbeat and keepalive proposals is their reliance upon their messages to be sent at regular intervals. In the implementation, this translates into managing some timer to service these message intervals. Similarly, because rapid detection of the dead peer is often desired, these messages must be sent with some frequency, again translating into considerable overhead for message processing. In implementations and installations where managing large numbers of simultaneous IKE sessions is of concern, these regular heartbeats/keepalives prove to be infeasible.
To this end, a number of vendors have implemented their own approach to detect peer liveliness without needing to send messages at regular intervals. This informational document describes the current practice of those implementations. This scheme, called Dead Peer Detection (DPD), relies on IKE Notify messages to query the liveliness of an IKE peer.
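The idea of probing only when the tunnel has gone quiet can be sketched as a small state machine. This is a minimal illustration, not Aruba's or any vendor's implementation: the class `DpdPeer`, its method names, and the returned strings are hypothetical, and it assumes that any inbound traffic counts as proof of liveliness and that `retry_attempts` counts re-sends after the first request. R-U-THERE is the Notify message name used in RFC 3706.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DpdPeer:
    """Hypothetical model of traffic-based dead peer detection."""
    idle_timeout: float = 22.0    # seconds of silence before probing
    retry_timeout: float = 2.0    # seconds to wait for a reply
    retry_attempts: int = 3       # re-sends after the first request (assumption)
    last_inbound: float = 0.0     # time of the last packet seen from the peer
    retries_sent: int = 0
    next_probe_at: Optional[float] = None
    dead: bool = False

    def on_inbound_traffic(self, now: float) -> None:
        # Any traffic from the peer (data or a reply) proves liveliness.
        self.last_inbound = now
        self.retries_sent = 0
        self.next_probe_at = None

    def tick(self, now: float) -> str:
        if self.dead:
            return "dead"
        if self.next_probe_at is None:
            if now - self.last_inbound < self.idle_timeout:
                return "no-probe-needed"   # recent traffic, send nothing
            self.next_probe_at = now + self.retry_timeout
            return "send R-U-THERE"
        if now < self.next_probe_at:
            return "waiting"
        if self.retries_sent >= self.retry_attempts:
            self.dead = True               # caller would tear down the SAs here
            return "dead"
        self.retries_sent += 1
        self.next_probe_at = now + self.retry_timeout
        return "resend R-U-THERE"


peer = DpdPeer()
for t in (10, 22, 23, 24, 26, 28, 30):
    print(t, peer.tick(t))
```

Unlike a heartbeat, nothing is sent while the tunnel carries traffic, so a busy peer costs no extra messages.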
For more information about DPD, refer to RFC 3706: http://www.ietf.org/rfc/rfc3706.txt
On Aruba:
On an Aruba controller, DPD is enabled by default for site-to-site VPN.
#show crypto-local isakmp dpd
DPD is Enabled: Idle-timeout = 22 seconds, Retry-timeout = 2 seconds, Retry-attempts = 3
idle-timeout: Idle timeout, in seconds. Range 10-3600 (default: 22 seconds)
retry-timeout: Retry interval, in seconds. Range 2-60 (default: 2 seconds)
retry-attempts: Number of retry attempts. Range 3-10 (default: 3)
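To see how these three values combine, the status line can be parsed and checked against the ranges listed above. The following is a rough sketch: the function names are hypothetical, and the detection-time estimate assumes one initial request followed by retry-attempts re-sends, each waiting retry-timeout, which may not match the controller's exact behavior.

```python
import re

# Allowed ranges as listed in this article.
RANGES = {
    "idle-timeout": (10, 3600),
    "retry-timeout": (2, 60),
    "retry-attempts": (3, 10),
}


def parse_dpd_config(line):
    """Extract the three DPD values from the 'DPD is Enabled: ...' line."""
    found = re.findall(r"(Idle-timeout|Retry-timeout|Retry-attempts) = (\d+)", line)
    return {name.lower(): int(value) for name, value in found}


def out_of_range(cfg):
    """Return the names of parsed values that fall outside the listed ranges."""
    return [k for k, (lo, hi) in RANGES.items() if k in cfg and not lo <= cfg[k] <= hi]


def estimated_detection_seconds(cfg):
    # Assumption: silence for idle-timeout, then the first request plus
    # retry-attempts re-sends, each waiting retry-timeout for a reply.
    return cfg["idle-timeout"] + cfg["retry-timeout"] * (cfg["retry-attempts"] + 1)


line = "DPD is Enabled: Idle-timeout = 22 seconds, Retry-timeout = 2 seconds, Retry-attempts = 3"
cfg = parse_dpd_config(line)
print(cfg, out_of_range(cfg), estimated_detection_seconds(cfg))
```

Under that assumption the defaults give roughly 22 + 2 x 4 = 30 seconds from the last packet to the peer being declared dead.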
Troubleshooting:
If the IPsec tunnel keeps dropping or flapping, the command below shows whether DPD messages are being missed. A large number of dropped or re-sent DPD requests usually points to a network or bandwidth problem between the IPsec peers.
On the Aruba controller, use the following command to check how many DPD messages have been sent, received, and dropped:
#show crypto isakmp stats | include DPD
Datapath To Control DPD Triggers Received = 0
DPD Initiate Reqs-Sent/Re-Sent/Replies-Rcvd/Dropped = 0/0/0/0
DPD Responder Reqs-Rcvd/Reqs-Dropped/Replies-Sent = 0/0/0
DPD peers detected as Dead/P1_SA/P2_SA = 0/0/0
IKEv2 - DPD Initiate Reqs-Sent/Re-Sent/Replies-Rcvd/Dropped = 0/0/0/0
IKEv2 - DPD Responder Reqs-Rcvd/Reqs-Dropped/Replies-Sent = 0/0/0
IKEv2 - DPD peers detected as Dead = 0
Sample command output:
DPD Initiate Reqs-Sent/Re-Sent/Replies-Rcvd/Dropped = 464428/63366/458454/0
DPD Responder Reqs-Rcvd/Reqs-Dropped/Replies-Sent = 548836/0/548836 --------> Acting as responder, this controller replied to every request it received (none dropped).
DPD peers detected as Dead/P1_SA/P2_SA = 5940/8126/8138
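The counters above can be turned into ratios to make the comparison easier. This is a minimal sketch that parses text pasted from the command output; `parse_dpd_stats` and `initiator_ratios` are hypothetical helper names, and reading a high re-send ratio as a sign of loss between the peers is an interpretation, not a documented threshold.

```python
import re

SAMPLE = """\
DPD Initiate Reqs-Sent/Re-Sent/Replies-Rcvd/Dropped = 464428/63366/458454/0
DPD Responder Reqs-Rcvd/Reqs-Dropped/Replies-Sent = 548836/0/548836
DPD peers detected as Dead/P1_SA/P2_SA = 5940/8126/8138
"""

LINE_RE = re.compile(r"\s*(?:(IKEv2) - )?DPD (Initiate|Responder) (\S+) = ([\d/]+)")


def parse_dpd_stats(text):
    """Map each Initiate/Responder line to a dict of counter name -> value."""
    stats = {}
    for line in text.splitlines():
        m = LINE_RE.match(line)
        if not m:
            continue  # skip lines this sketch does not handle
        prefix = "ikev2_" if m.group(1) else ""
        names = m.group(3).split("/")
        values = [int(v) for v in m.group(4).split("/")]
        stats[prefix + m.group(2).lower()] = dict(zip(names, values))
    return stats


def initiator_ratios(stats, key="initiate"):
    """Return (re-sent / sent, replies received / sent) for the initiator side."""
    row = stats[key]
    sent = row["Reqs-Sent"]
    if sent == 0:
        return 0.0, 0.0
    return row["Re-Sent"] / sent, row["Replies-Rcvd"] / sent


stats = parse_dpd_stats(SAMPLE)
resend_ratio, reply_ratio = initiator_ratios(stats)
print(f"re-sent: {resend_ratio:.1%}, replied: {reply_ratio:.1%}")
```

In the sample, the initiator side re-sent a noticeable share of its requests while the responder side dropped none, which would be consistent with replies from the remote peer being lost or delayed on the path.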
Sample Security log output: