Wired Intelligent Edge

Aruba CX 8360 Hardware Route, create failed for prefix

  • 1.  Aruba CX 8360 Hardware Route, create failed for prefix

    Posted Jul 05, 2024 08:40 AM

    I'm evaluating the Aruba CX 8360 for routing with a large number of BGP routes. According to the specs it should handle about 600k routes, but that seems to be unachievable.

    The switch has learned fewer than 300k routes (IPv4 and IPv6 combined):

    router# show ip route summary 

     IPv4 Route Table Summary 

     VRF name :  default
      Protocol      Active Routes
      ------------- -------------
      connected      6            
      local          7            
      ospfv2         324          
      bgp            3166         


    router# show ipv6 route summary 

     IPv6 Route Table Summary 

     VRF name :  default
      Protocol      Active Routes
      ------------- -------------
      connected      6            
      local          7            
      ospfv3         73           
      bgp            125421       

    And the capacities seem to be fine:

    rtr-c1-dcg1# show capacities-status l3-resources 

    System Capacities Status: Filter L3 Resources
    Capacities Status Name                                                                                       Value Maximum
    -----------------------------------------------------------------------------------------
    Number of IP neighbor (IPv4+IPv6) entries                                                                      162   65536
    Number of IP Directed Broadcast neighbor entries                                                                 0    1024
    Number of IPv4 neighbor(ARP) entries                                                                           141   65536
    Number of IPv6 neighbor(ND) entries                                                                             21   65536
    Number of L3 Groups for IP Tunnels and ECMP Groups currently configured                                         38    2000
    Number of L3 Destinations for Routes, Nexthops in ECMP groups and Tunnels currently configured                  33    4093
    Number of routes (IPv4+IPv6) currently configured                                                            128880  631290
    Number of IPv4 routes currently configured                                                                    3404  630780
    Number of IPv6 routes currently configured with prefix 0-64                                                  125473  598014
    Number of IPv6 routes currently configured with prefix 65-127                                                    3     510

    After observing traffic being black-holed by the router, it seems that not all routes make it into the FIB. I was finally able to find evidence in the syslog (/var/log/messages): a huge number of messages like the following, for a seemingly random selection of prefixes:

    2024-07-05T11:45:32.923337+00:00 router switchd_agent[3748]: debug|LOG_ERR|AMM|-|L3|L3_ASIC|Hardware Route, create failed for prefix: 2001:1A40:15FE::/47 vrf: 1 dest_id: 3 dest_fwd_type: route_ecmp_member dp_state: SINGLE due to OUT_OF_ROUTE. Total err_count=293514
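    To see which prefix lengths are affected, the failures can be tallied from the syslog. The following is a minimal Python sketch, not an official tool: the regular expression is modelled only on the single log line quoted above, so other firmware versions may format the message differently, and the function name `failed_prefix_lengths` is made up for illustration.

```python
import re
from collections import Counter
from ipaddress import ip_network

# Pattern modelled on the one log line quoted in this thread (assumption:
# other releases may word or order the fields differently).
FAIL_RE = re.compile(
    r"Hardware Route, create failed for prefix: (?P<prefix>\S+) "
    r"vrf: (?P<vrf>\d+) .*? due to (?P<reason>\w+)\."
)

def failed_prefix_lengths(lines):
    """Count failed hardware route creations per (IP version, prefix length, reason)."""
    counts = Counter()
    for line in lines:
        m = FAIL_RE.search(line)
        if not m:
            continue  # not a "create failed" message
        net = ip_network(m.group("prefix"), strict=False)
        counts[(net.version, net.prefixlen, m.group("reason"))] += 1
    return counts

sample = [
    "2024-07-05T11:45:32.923337+00:00 router switchd_agent[3748]: "
    "debug|LOG_ERR|AMM|-|L3|L3_ASIC|Hardware Route, create failed for prefix: "
    "2001:1A40:15FE::/47 vrf: 1 dest_id: 3 dest_fwd_type: route_ecmp_member "
    "dp_state: SINGLE due to OUT_OF_ROUTE. Total err_count=293514",
]

result = failed_prefix_lengths(sample)
for (version, plen, reason), n in sorted(result.items()):
    print(f"IPv{version} /{plen} {reason}: {n}")
```

    In practice the `sample` list would be replaced by the lines of /var/log/messages. A histogram that is dominated by a handful of prefix lengths would hint at which lengths run out of space first.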

    Restarting the BGP sessions triggers those messages again. It looks as if the FIB does not receive all valid routes from the RIB, which breaks routing.

    The hpe-routing daemon runs inside the swns network namespace, and I assume it is not aware that a route is missing from the FIB. Other routers might use the switch as a next hop based on a routing protocol, but the switch will not be able to forward the packet correctly (according to the RIB) if the route is missing from the FIB 🤯🤯🤯

    Any ideas?



  • 2.  RE: Aruba CX 8360 Hardware Route, create failed for prefix

    EMPLOYEE
    Posted Jul 05, 2024 08:43 PM

    What is the output of "show profile current"?



    ------------------------------
    If my post was useful accept solution and/or give kudos.
    Any opinions expressed here are solely my own and not necessarily that of HPE or Aruba.
    ------------------------------



  • 3.  RE: Aruba CX 8360 Hardware Route, create failed for prefix

    Posted Jul 07, 2024 04:00 PM

    It's Core-Spine, since that profile should support the maximum number of routes:

    router# show profiles current 

    Current Profile
    --------------
    Core-Spine




  • 4.  RE: Aruba CX 8360 Hardware Route, create failed for prefix

    EMPLOYEE
    Posted Jul 07, 2024 07:18 PM

    OK, then perhaps it is best to reach out to TAC.






  • 5.  RE: Aruba CX 8360 Hardware Route, create failed for prefix

    Posted Aug 12, 2024 08:07 AM

    Here are the results of my TAC case: the switches have a partitioned TCAM, and the output of show capacities-status l3-resources does not show the usage or limits of the individual partitions. There are 6 partitions, 5 of which can be assigned to prefix lengths using the commands:

    Any route with a prefix length other than the 5 named ones will use a "fallback" partition.
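    To illustrate the idea, here is a rough Python sketch of how routes would be distributed under this scheme. It is a simplified model of the description above, not the actual ASIC behaviour: the assigned lengths are example values, the bucket name "fallback" is only a label, and ECMP (which can make one prefix consume more than one entry) is ignored.

```python
from collections import Counter
from ipaddress import ip_network

def partition_usage(prefixes, assigned_lengths):
    """Count routes per partition: one bucket per assigned prefix length,
    everything else goes into a single 'fallback' bucket."""
    usage = Counter()
    for p in prefixes:
        plen = ip_network(p, strict=False).prefixlen
        key = f"/{plen}" if plen in assigned_lengths else "fallback"
        usage[key] += 1
    return usage

# Example values only; the real assignment depends on profile and configuration.
assigned = {24, 23, 22, 21, 20}
routes = [
    "192.0.2.0/24",
    "198.51.100.0/24",
    "203.0.113.0/25",   # length not assigned -> fallback
    "10.0.0.0/8",       # length not assigned -> fallback
    "172.16.0.0/20",
]

usage = partition_usage(routes, assigned)
print(dict(usage))
```

    Running the same count over a full BGP table gives a first estimate of whether the fallback partition or one of the assigned partitions would be the first to fill up.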

    Sadly, the documentation does not seem to discuss this TCAM design, nor does the CLI command give any hint on how to use it.

    ERT and the lab have confirmed that the TCAM partition usage can only be checked with these diagnostic commands:

    • diagnostic
    • diag-dump l3 basic

    WARNING: The dump command creates very long output (mine was about 35 MB), so consider saving it to a file and uploading it afterwards.

    At the very end of the dump the important details are listed:

    HW ROUTE:

    Entries: 52800
    Table-Name                Max-Capacity  Available  Filled 
    -----------------------------------------------------------
    IPv4 Prefix Table 1  /24   393216        381887     11329  
    IPv4 Prefix Table 2  /23   65536         56028      9508   
    IPv4 Prefix Table 3  /22   16384         12153      4231   
    IPv4 Prefix Table 4  /21   24576         12574      12002  
    IPv4 Prefix Table 5  /20   65536         62459      3077   
    IPv6 Prefix Table 1  /44   393216        381887     11329  
    IPv6 Prefix Table 2  /40   65536         56028      9508   
    IPv6 Prefix Table 3  /36   16384         12153      4231   
    IPv6 Prefix Table 4  /32   24576         12574      12002  
    IPv6 Prefix Table 5  /29   65536         62459      3077   
    IPv4 BMP TCAM             65532         26000      39532  
    IPv6 BMP TCAM             32766         14760      18006  

    The sizes of the partitions (tables 1 to 5 plus BMP) are defined by the profile, and you can reassign the prefix lengths to different partitions using the prefix-priority statements.

    So it seems that some micromanagement is required if you have more than ~10k routes (routes, not prefixes; keep the ECMP factor in mind). There is no notification when a partition is overloaded, nor is it possible to monitor this easily :-(
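    Since there is no built-in alert, one workaround is to parse the HW ROUTE section of a saved dump and flag partitions that are getting full. The Python sketch below assumes the table layout shown above; the function names and the 50% threshold are arbitrary choices for illustration, and collecting the dump from the switch is left out.

```python
import re

# One table row: a name starting with "IPv4"/"IPv6", followed by three integer columns
# (Max-Capacity, Available, Filled), as in the dump excerpt above.
ROW_RE = re.compile(
    r"^\s*(?P<name>IPv[46] .+?)\s+(?P<max>\d+)\s+(?P<avail>\d+)\s+(?P<filled>\d+)\s*$"
)

def parse_hw_route(text):
    """Return a list of (table name, max capacity, filled, fill percentage)."""
    rows = []
    for line in text.splitlines():
        m = ROW_RE.match(line)
        if m:
            cap, filled = int(m["max"]), int(m["filled"])
            rows.append((m["name"].strip(), cap, filled, 100.0 * filled / cap))
    return rows

def over_threshold(rows, percent=50.0):
    """Names of partitions whose fill level is at or above the threshold."""
    return [name for name, _cap, _filled, pct in rows if pct >= percent]

dump_tail = """\
Table-Name                Max-Capacity  Available  Filled
-----------------------------------------------------------
IPv4 Prefix Table 1  /24   393216        381887     11329
IPv4 Prefix Table 4  /21   24576         12574      12002
IPv4 BMP TCAM             65532         26000      39532
IPv6 BMP TCAM             32766         14760      18006
"""

rows = parse_hw_route(dump_tail)
for name, cap, filled, pct in rows:
    print(f"{name:<28} {filled:>7}/{cap:<7} {pct:5.1f}%")
print("at or above threshold:", over_threshold(rows))
```

    With the rows above, only the two BMP tables exceed 50%; the /21 table sits just below it at roughly 49%.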