Problem:
Wireless users over a GRE tunnel are unable to reach their default gateway (Cisco GLBP) unless the client's gateway VLAN has an L3 interface configured on the Anchor controller and a valid ARP entry exists for the gateway.
Topology
- AOS: 6.5.1.7-FIPS
- Topology: Master – Local (Anchor – Remote controller)
- Client default gateway: virtual IP served by GLBP on a Cisco router
- Clients connect to a RAP on the Remote controller
- Clients' default gateway is 138.254.22.254 (GLBP virtual IP)
- GLBP virtual MACs: 00:07:b4:00:16:01 and 00:07:b4:00:16:02
Network Setup
- Remote site is deployed with APs configured with a Guest SSID
- Guest VLAN is L2 on both the Remote and Anchor controllers
- GRE tunnels are configured across 100 sites and terminate on different Anchor controllers
- The Remote site controller communicates with the Anchor controller via a GRE tunnel over a WAN link
- Firewalls are deployed at both sites
- Per the Cisco documentation, the Active Virtual Gateway (AVG) creates the virtual MACs (these are not necessarily multicast MAC addresses)
- The AVG answers ARP requests with one of the virtual MACs (ending in 01 or 02 in our case)
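The two virtual MACs listed above match the layout Cisco documents for GLBP: the prefix 0007.b4, followed by six zero bits, a 10-bit group number, and an 8-bit virtual forwarder number. The sketch below decodes that layout in Python. The function name decode_glbp_mac is made up for illustration and is not part of any Aruba or Cisco tool.

```python
from typing import Optional, Tuple

GLBP_PREFIX = 0x0007B4  # first 24 bits of a GLBP virtual MAC


def decode_glbp_mac(mac: str) -> Optional[Tuple[int, int]]:
    """Return (group, forwarder) for a GLBP virtual MAC, else None.

    Hypothetical helper: accepts colon, dash, or dot separated notation.
    """
    digits = mac.replace(":", "").replace("-", "").replace(".", "").lower()
    if len(digits) != 12:
        raise ValueError(f"not a 48-bit MAC address: {mac!r}")
    value = int(digits, 16)

    # The upper 24 bits must be the GLBP prefix.
    if value >> 24 != GLBP_PREFIX:
        return None

    low24 = value & 0xFFFFFF
    # The top six of the lower 24 bits are expected to be zero.
    if low24 >> 18 != 0:
        return None

    group = (low24 >> 8) & 0x3FF   # 10-bit GLBP group number
    forwarder = low24 & 0xFF       # 8-bit virtual forwarder number
    return group, forwarder


if __name__ == "__main__":
    for mac in ("00:07:b4:00:16:01", "00:07:b4:00:16:02"):
        print(mac, decode_glbp_mac(mac))
```

For the MACs in this case the decode gives group 22 with forwarders 1 and 2, which is consistent with two forwarders answering for the same gateway IP.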
Diagnostics:
- L2 GRE is configured between two Aruba controllers (Anchor and Remote)
- Clients associate to APs that terminate on the Remote controller
- Client VLAN 3013 is L2 on both the Anchor and Remote controllers
- L2 GRE source/destination is the VRRP IP of each controller
- Clients' default gateway is 138.254.22.254 (GLBP virtual IP)
- At the time of the issue, client VLAN 3013 is L2 on both controllers
- Clients are unable to reach the default gateway 138.254.22.254 (GLBP)
- However, clients can reach other IPs in the same VLAN 3013: 138.254.22.252 and 138.254.22.253 (both reside on the Cisco routers)
- Ran a continuous ping from the client to the default gateway. The datapath session table on the Remote controller showed the ICMP request entries with flags FCI, while the return entries showed flags FYI (datapath session output is included below)
- The datapath session table on the Anchor controller doesn't show any of this traffic
- However, the tunnel encap and decap counters keep increasing
- While the client runs a continuous ping to 138.254.22.253 or 138.254.22.252, we do see the datapath traffic on the Anchor controller
- No matching entries are seen in the 'datapath bridge table' and 'datapath route-cache' outputs on the Anchor controller
- The client's default gateway is connected to Anchor controller port 0/0/1 (access)
- Enabled datapath packet-capture on the Anchor controller and noticed that ARP responses for the client's default gateway 138.254.22.254 come from two different MACs (00:07:b4:00:16:01 and 00:07:b4:00:16:02), since 138.254.22.254 is a GLBP virtual IP served by two different devices
- We suspected that, because the ARP responses come from two different MACs, the controller is unable to settle on an ARP table entry for 138.254.22.254 (the client's default gateway)
Remote Controller Logs:
Hostname is WLANlcl-HQ-CP01
System Time:Wed Aug 9 22:51:11 UTC 2017
Crash information available.
No kernel crash information available.
Reboot Cause: User reboot (Intent:cause:register 78:86:50:60)
ArubaOS (MODEL: Aruba7030-US), Version 6.5.1.7-FIPS
IP Address    Name             Location          Type   Model      Version             Status  Configuration State  Config Sync Time (sec)  Config ID
----------    ----             --------          ----   -----      -------             ------  -------------------  ----------------------  ---------
199.254.9.37  WLANlcl-HQ-CP01  Building1.floor1  local  Aruba7030  6.5.1.7-FIPS_60507  up      UPDATE SUCCESSFUL    0                       13
(WLANlcl-HQ-CP01) #show ip interface brief
Interface  IP Address / IP Netmask         Admin  Protocol  VRRP-IP (VRRP-Id)
vlan 1     199.254.9.37 / 255.255.255.248  up     up        199.254.9.36 (141 )
vlan 2999  172.17.141.1 / 255.255.255.0    up     up        none (none)
vlan 3013  unassigned / unassigned         up     up        none (none)
(WLANlcl-HQ-CP01) #show interface tunnel 141113
Tunnel 141113 is up line protocol is up
Description: Tunnel To AS-MID
Source 199.254.9.36
Destination 199.254.8.132
Tunnel mtu is set to 1100
Tunnel is a Layer2 GRE TUNNEL
Tunnel is Trusted
Inter Tunnel Flooding is disabled
Tunnel keepalive is enabled
Keepalive type is Default
Tunnel keepalive interval is 10 seconds, retries 3
Heartbeats sent 24142, Heartbeats lost 90
Tunnel is down 1 times
tunnel vlan 3013,3041
(WLANlcl-AS-MID-CP01) #show datapath bridge table | include 00:07:b4:00:16:01
(WLANlcl-AS-MID-CP01) #show datapath bridge table | include 00:07:b4:00:16:02
(WLANlcl-AS-MID-CP01) #show datapath route-cache table | include 138.254.22.254
(WLANlcl-AS-MID-CP01) #show datapath route-cache table verbose | include 138.254.22.254
(WLANlcl-HQ-CP01) #show datapath route-cache verbose | include 138.254.22.254,IP,---
-------------------
Flags: L - Local, P - Permanent, T - Tunnel, I - IPsec,
S - Striping IP addr, r - Router
IP               MAC                VLAN         RCI       RCV      PRTI     PRTV     Flags
---------------  -----------------  -----------  --------  -------  -------  -------  -----
138.254.22.254   00:07:B4:00:16:02  3013         100       33fa3    0        28       tA
----------------------------
Flags: L - Local, P - Permanent, T - Tunnel, I - IPsec, M - Mobile,
IP MAC VLAN RCI RCV PRTI PRTV Flags
--------------------------------------- ----------------- ----------- -------- ------- ------- ------- -----
(WLANlcl-HQ-CP01) #show user-table | include 88:5c
138.254.22.56 9c:2a:70:7c:88:5c host/SL072082.csw.l-3com.com L_3_DR_User_Role_AS-3324-2 00:00:02 802.1x RAP-WAN-Test-51:9a Wireless L3_Network/84:d4:7e:4e:a8:22/g TestAAA tunnel
With the client VLAN at L2 and no ARP, datapath bridge, or route-cache entry on the Anchor controller:
(WLANlcl-HQ-CP01) #show datapath session table 138.254.22.56 | include 138.254.22.254
138.254.22.254 138.254.22.56 1 14877 0 0/0 0 0 1 tunnel 376 19 0 0 FYI
138.254.22.56 138.254.22.254 1 14877 2048 0/0 0 0 1 tunnel 376 19 0 0 FCI
138.254.22.56 138.254.22.254 1 14880 2048 0/0 0 0 0 tunnel 376 9 0 0 FCI
138.254.22.56 138.254.22.254 1 14878 2048 0/0 0 0 1 tunnel 376 13 0 0 FCI
138.254.22.254 138.254.22.56 1 14880 0 0/0 0 0 0 tunnel 376 9 0 0 FYI
138.254.22.254 138.254.22.56 1 14878 0 0/0 0 0 1 tunnel 376 13 0 0 FYI
138.254.22.254 138.254.22.56 1 14881 0 0/0 0 0 0 tunnel 376 4 0 0 FYI
138.254.22.254 138.254.22.56 1 14879 0 0/0 0 0 1 tunnel 376 e 0 0 FYI
138.254.22.56 138.254.22.254 1 14881 2048 0/0 0 0 0 tunnel 376 4 1 60 FCI
138.254.22.56 138.254.22.254 1 14879 2048 0/0 0 0 1 tunnel 376 e 0 0 FCI
Enabled datapath pcap on the Anchor controller for the client MAC and noticed ARP responses from two different MAC addresses:
22:01:56.784696 ARP, Request who-has 138.254.22.254 tell 138.254.22.56, length 42
22:01:56.784936 ARP, Reply 138.254.22.254 is-at 00:07:b4:00:16:01, length 46
22:02:36.939554 ARP, Request who-has 138.254.22.254 tell 138.254.22.56, length 42
22:02:36.939828 ARP, Reply 138.254.22.254 is-at 00:07:b4:00:16:02, length 46
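One way to confirm this pattern from a text capture is to group the ARP replies by IP address and list the distinct MACs seen for each. The following is a minimal Python sketch, assuming tcpdump-style lines like the ones above. The function name macs_per_ip is hypothetical.

```python
import re
from collections import defaultdict
from typing import Dict, Iterable, List

# Matches lines such as:
#   22:01:56.784936 ARP, Reply 138.254.22.254 is-at 00:07:b4:00:16:01, length 46
ARP_REPLY = re.compile(
    r"ARP, Reply (\d+\.\d+\.\d+\.\d+) is-at ([0-9a-f]{2}(?::[0-9a-f]{2}){5})",
    re.IGNORECASE,
)


def macs_per_ip(lines: Iterable[str]) -> Dict[str, List[str]]:
    """Map each IP to the distinct MACs that answered for it, in first-seen order."""
    seen: Dict[str, List[str]] = defaultdict(list)
    for line in lines:
        match = ARP_REPLY.search(line)
        if not match:
            continue  # ARP requests and unrelated lines are ignored
        ip, mac = match.group(1), match.group(2).lower()
        if mac not in seen[ip]:
            seen[ip].append(mac)
    return dict(seen)


CAPTURE = """\
22:01:56.784696 ARP, Request who-has 138.254.22.254 tell 138.254.22.56, length 42
22:01:56.784936 ARP, Reply 138.254.22.254 is-at 00:07:b4:00:16:01, length 46
22:02:36.939554 ARP, Request who-has 138.254.22.254 tell 138.254.22.56, length 42
22:02:36.939828 ARP, Reply 138.254.22.254 is-at 00:07:b4:00:16:02, length 46
"""

if __name__ == "__main__":
    for ip, macs in macs_per_ip(CAPTURE.splitlines()).items():
        note = "multiple MACs" if len(macs) > 1 else "single MAC"
        print(ip, macs, note)
```

Any IP that maps to more than one MAC is a candidate for a load-sharing first-hop protocol such as GLBP, or for a duplicate-IP condition, so the result still needs to be checked against the router configuration.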
With client vlan L3 and ARP entry present in anchor controller, clients are able to reach the default gateway:
Solution/Workaround:
- With BCMC optimization disabled on the client VLAN on the Anchor controller and the Cisco router still running GLBP, clients are able to reach the default gateway
- With BCMC optimization enabled and the Cisco router moved from GLBP to HSRP, clients are able to reach the default gateway
The Aruba controller updates its bridge table only from the source MAC in the Ethernet header, not from the sender MAC inside the ARP response. In our case, the ARP response advertises a virtual MAC that differs from the source MAC in the Ethernet header, so the virtual MAC is never learned in the bridge table. Frames that clients send to the virtual MAC are therefore unknown unicast on the Anchor controller. With BCMC optimization enabled, unknown unicast packets are not forwarded, which causes the issue. With BCMC optimization disabled, unknown unicast packets are forwarded and the gateway becomes reachable.
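The behaviour described above can be illustrated with a toy learning bridge in Python. This is a simplified model written for this note, not ArubaOS datapath code. The router's physical MAC and the port name "tunnel" are invented for the example, and the model assumes broadcast ARP requests are always flooded.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

BROADCAST = "ff:ff:ff:ff:ff:ff"


@dataclass
class Frame:
    src_mac: str                          # Ethernet header source
    dst_mac: str                          # Ethernet header destination
    arp_sender_mac: Optional[str] = None  # ARP payload field; never used for learning


class ToyBridge:
    """Learning bridge that learns only from the Ethernet source MAC."""

    def __init__(self, ports: List[str], flood_unknown_unicast: bool) -> None:
        self.ports = list(ports)
        self.flood_unknown_unicast = flood_unknown_unicast
        self.table: Dict[str, str] = {}

    def _all_but(self, in_port: str) -> List[str]:
        return sorted(p for p in self.ports if p != in_port)

    def receive(self, in_port: str, frame: Frame) -> List[str]:
        """Learn the source MAC and return the list of egress ports."""
        self.table[frame.src_mac] = in_port
        if frame.dst_mac == BROADCAST:
            return self._all_but(in_port)  # assumption: ARP broadcast is flooded
        out_port = self.table.get(frame.dst_mac)
        if out_port is not None:
            return [out_port] if out_port != in_port else []
        # Unknown unicast: flooded or dropped depending on the setting.
        return self._all_but(in_port) if self.flood_unknown_unicast else []


def ping_gateway(flood_unknown_unicast: bool) -> List[str]:
    """Return the egress ports for the client's ICMP echo to the virtual MAC."""
    client = "9c:2a:70:7c:88:5c"
    router_physical = "00:11:22:33:44:01"  # hypothetical physical MAC
    virtual = "00:07:b4:00:16:01"
    bridge = ToyBridge(["tunnel", "0/0/1"], flood_unknown_unicast)

    bridge.receive("tunnel", Frame(client, BROADCAST))  # ARP request
    bridge.receive("0/0/1", Frame(router_physical, client, arp_sender_mac=virtual))  # ARP reply
    return bridge.receive("tunnel", Frame(client, virtual))  # ICMP echo request


if __name__ == "__main__":
    print("unknown unicast dropped:", ping_gateway(False))
    print("unknown unicast flooded:", ping_gateway(True))
```

In this model the echo request has no egress port when unknown unicast is dropped, and reaches port 0/0/1 when it is flooded, which mirrors the difference seen with BCMC optimization enabled and disabled.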