Wired Intelligent Edge (Campus Switching and Routing)

MVP Expert

ArubaOS-CX 10.01 VSX: LAG LACP Layer 4 hashing algorithm

ArubaOS-CX 10.01

 

Will a Layer 4 LACP hashing algorithm (L4-src-dst) become available as an additional hashing mechanism, alongside the currently supported L3-src-dst and L2-src-dst?

Re: ArubaOS-CX 10.01 VSX: LAG LACP Layer 4 hashing algorithm

Not at this time. Could you please elaborate on the conditions under which you see L4 hashing being required (for example, web-proxy usage or PAT)?

MVP Expert

Re: ArubaOS-CX 10.01 VSX: LAG LACP Layer 4 hashing algorithm

Hello Giles,

 

Actually, our VSX is running with an initial 13 VSX LAGs (this number is going to increase). Our LACP peers are IBM systems running PowerVM (VIOS), and those peers use src/dst port as the hashing algorithm for their outgoing traffic to the VSX. We're asking whether a VSX that provides a Layer 4 hashing algorithm would be beneficial, in our specific case where traffic remains southbound of the VSX, compared with having only Layer 3 to select the outgoing links on the involved VSX LAGs.

Re: ArubaOS-CX 10.01 VSX: LAG LACP Layer 4 hashing algorithm

OK. Thanks for this clarification regarding your usage.

 

When you have 2 links in a VSX LAG, one physical link per switch, the hashing algorithm is not used to determine which link is being used.

Instead, we have an internal mechanism that optimizes the traffic to stay local to the switch (to avoid sending the traffic over the ISL, which would add an unnecessary hop to the traffic path).

The hashing algo would have an impact if you would have 4 links in VSX LAG, 2 per switch, downstream to the server.

So the decision criterion in your case is actually where the packet is received, on CX-SW1 or on CX-SW2: it is your hypervisor that decides, based on its L4 hashing algo, whether to send the packet to CX-SW1 or CX-SW2.

Each switch will locally process the packet if the destination is reachable behind a dual-attached VSX-LAG (which is your case).

 

I hope this clarifies. In a nutshell, in your case, tuning the hash for L2, L3 or L4 (if we had it) has no effect.
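The local-preference behaviour described above can be sketched in a few lines of Python. This is only an illustrative model, not the switch's implementation: the function name `pick_egress`, the CRC32 stand-in hash, the flow-key string format and the example addresses are all assumptions made for the sketch.

```python
from zlib import crc32

def pick_egress(lag_members, local_switch, flow_key):
    """Return the (switch, port) member chosen for one flow.

    lag_members: list of (switch, port) tuples for links that are up.
    """
    local = [m for m in lag_members if m[0] == local_switch]
    # Local members win; peer members (reached over the ISL) are used only
    # when the receiving switch has no local member left.
    candidates = local if local else lag_members
    # Stand-in hash: CRC32 of the flow key, NOT the switch's real function.
    return candidates[crc32(flow_key.encode()) % len(candidates)]

two_links = [("CX-SW1", "1/1/1"), ("CX-SW2", "1/1/1")]
four_links = [("CX-SW1", "1/1/1"), ("CX-SW1", "1/1/2"),
              ("CX-SW2", "1/1/1"), ("CX-SW2", "1/1/2")]
# Hypothetical flows: ten sources talking to one destination.
flows = [f"192.0.2.{i}->198.51.100.10" for i in range(10, 20)]

# One link per switch: every flow received on CX-SW1 leaves on the same port.
ports_2 = {pick_egress(two_links, "CX-SW1", f) for f in flows}
# Two links per switch: the hash now chooses between the two local ports.
ports_4 = {pick_egress(four_links, "CX-SW1", f) for f in flows}
print(ports_2)
print(ports_4)
```

With one member per switch the candidate list has a single entry, so the hash result is irrelevant; it only starts to matter once a switch has two or more local members.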

 

MVP Expert

Re: ArubaOS-CX 10.01 VSX: LAG LACP Layer 4 hashing algorithm


@vincent.giles wrote:

...

The hashing algo would have an impact if you would have 4 links in VSX LAG, 2 per switch, downstream to the server.

 

 

 


Hi Giles, many thanks for this explanation; it's really useful.

 

Our actual setup has 13 VSX LAGs. About half of them have multiple member links (2 or 3) on each Aruba 8320 node.

 

The table below summarizes our production VSX LAG scenario quite well:

 

Aruba_8320_VSX_VSX_LAGs_distribution_19092018.png

 

The assignments are not static, especially for the lag8-lag13 group compared with the lag1-lag7 group. In the very near future we plan to refactor lag8-lag13 (single-leg VSX LAGs on each node) into multi-member VSX LAGs (as is already the case for lag1-lag7), because we're going to host a lot of new servers, each with multiple 10 Gbps Ethernet links.

 

Consider that the systems talk to each other and also through TSM (lag1); lag128, used for the VSX ISL, is not listed ((2+2) x 40 Gbps).

 

That's to say that the LACP hashing algorithm (L2 vs L3 vs L4) would have (or actually has) an impact.

MVP Expert

Re: ArubaOS-CX 10.01 VSX: LAG LACP Layer 4 hashing algorithm

Hello Giles,

 

Sorry for coming back to this thread, but we're now trying hard to verify some traffic performance/patterns strictly related to our VSX LAG implementation.

 

I think my last post on this thread was clear enough to show that we are indeed relying on the hashing algorithm:

 


@vincent.giles wrote: The hashing algo would have an impact if you would have 4 links in VSX LAG, 2 per switch, downstream to the server.

Indeed, compared with the scenario shown months ago, we now have strictly 2 or 3 ports per VSX member, all part of multi-chassis LAGs (so only VSX LAGs with 4 or 6 ports). Basically, our lag2-lag7 and lag14-lag15 (lag8-lag13 are going to be decommissioned) now connect our source servers (Backup clients), which transmit large amounts of data daily to one destination server (the Backup server), connected via lag1.

 

Aruba_8320_VSX_VSX_LAGs_distribution_13022019.png

 

Given the scenario above, we noticed that, as you described, VSX keeps the traffic local to each node and tries to minimize ISL usage: traffic entering on the above VSX LAGs flows to lag1 and egresses through interfaces on both VSX nodes rather than through interfaces belonging to just one VSX member. That is good.

 

We do have an issue, though: ArubaOS-CX doesn't currently provide a command [*] (as ArubaOS-Switch or Comware do) to show how egress traffic leaving the VSX will be distributed across the member interfaces of lag1 when the destination is the Backup server host. We need this to understand whether, with the IP addressing we have on the source/destination hosts, the concurrent egress traffic leaving the VSX on lag1 will be well balanced across all four of its interfaces (1/1/1 and 1/1/2 on VSX 1, 1/1/1 and 1/1/2 on VSX 2).

 

The question arose because we're seeing (we're still investigating) that egress traffic leaving lag1, i.e. traffic for the Backup server, follows a particular pattern: 1/1/2 of VSX 1 and 1/1/2 of VSX 2 are heavily used, while 1/1/1 of VSX 1 and 1/1/1 of VSX 2 are lightly used. So outgoing traffic looks somewhat unbalanced instead of being spread equally across 1/1/1 and 1/1/2 of VSX 1 and 1/1/1 and 1/1/2 of VSX 2.

 

This is happening with at least 10 Backup clients concurrently sending data using fixed TCP ports; we expected the variability of the SRC IP addresses to give us a good distribution.

 

It's entirely possible we've hit an undesired corner case with some hash polarization (Layer 3 hashing on our SRC/DST IP addresses produces an interface utilization pattern where 1/1/2 of VSX 1 and 1/1/2 of VSX 2 are used more on VSX lag1), but we can't prove it without the missing command cited above.

 

So, long story short: how can we simulate/calculate how egress traffic is distributed across the interfaces of VSX lag1, given that we know exactly the SRC/DST addresses of all the data streams involved?
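Lacking such a command, the what-if can at least be framed offline. The sketch below is a rough model under loud assumptions: the real hash function used by the 8320 is not known here, so CRC32 over the packed SRC/DST IP addresses stands in for it, and the client and server addresses are made-up documentation-range values. It shows the shape of the calculation, not the distribution the switch will actually produce.

```python
from collections import Counter
from ipaddress import ip_address
from zlib import crc32

def l3_hash_index(src_ip, dst_ip, n_links):
    """Map one SRC/DST IP pair to a link index (stand-in hash, an assumption)."""
    key = ip_address(src_ip).packed + ip_address(dst_ip).packed
    return crc32(key) % n_links

def distribution(flows, links):
    """Count how many flows land on each link of one VSX node."""
    counts = Counter({link: 0 for link in links})
    for src, dst in flows:
        counts[links[l3_hash_index(src, dst, len(links))]] += 1
    return counts

# Hypothetical addressing: ten Backup clients, one Backup server.
clients = [f"192.0.2.{i}" for i in range(10, 20)]
server = "198.51.100.10"
# Only the local members matter, since VSX keeps traffic on the receiving node.
local_links = ["1/1/1", "1/1/2"]

result = distribution([(c, server) for c in clients], local_links)
for link in local_links:
    print(link, result[link])
```

Because the destination is fixed, only the source address varies in the key, so a handful of clients can easily map unevenly onto two links. Replacing `l3_hash_index` with the documented algorithm, if it ever becomes available, would turn this into a usable predictor.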

 

[*] like the show trunks load-balance interface <TRUNK-ID> mac <SRC-MAC-ADDR> <DEST-MAC-ADDR> [ ip <SRC-IP-ADDR> <DEST-IP-ADDR> [<SRC-TCP/UDP-PORT> <DEST-TCP/UDP-PORT>] ] inbound-port <PORT-NUM> ether-type <ETHER-TYPE> inbound-vlan <VLAN-ID> CLI Command available on ArubaOS-Switch

MVP Expert

Re: ArubaOS-CX 10.01 VSX: LAG LACP Layer 4 hashing algorithm

Is anybody able to help?

Re: ArubaOS-CX 10.01 VSX: LAG LACP Layer 4 hashing algorithm

Did you try changing the hashing setting from l3-src-dst to l2-src-dst?

I assume in your case this might not give much improvement, but I'm just checking whether you tried and what the outcome was.

MVP Expert

Re: ArubaOS-CX 10.01 VSX: LAG LACP Layer 4 hashing algorithm

Hi Giles!

 


@vincent.giles wrote: Did you try to change hashing setting from l3-src-dst to l2-src-dst ?

No, I didn't, since l3-src-dst is also what is currently set on all connected servers.


@vincent.giles wrote: I assume in your case this might not give much improvement but just checking if you did try and what was the outcome ?

Yes, it's what I think too.

 

Regarding the idea of also implementing a Layer 4 hashing algorithm for LACP on VSX LAGs: we thought that using at least Layer 3 would help us avoid the polarization discussed above (the more data the algorithm uses, the more likely it is to avoid it).
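The "more data in the key" argument can be illustrated by counting how many distinct hash inputs each mode sees. This is a sketch with invented numbers: ten clients, four sessions each, hypothetical source ports and a hypothetical fixed destination port; `hash_keys` is a helper written for this example, not a switch feature.

```python
def hash_keys(flows, mode):
    """Return the set of distinct hash inputs for a list of flows.

    flows: tuples of (src_ip, src_port, dst_ip, dst_port).
    mode:  "l3" keys on the IP pair only, "l4" also includes the ports.
    """
    if mode == "l3":
        return {(src_ip, dst_ip) for src_ip, _, dst_ip, _ in flows}
    if mode == "l4":
        return set(flows)
    raise ValueError(f"unknown mode: {mode}")

# Hypothetical traffic: 10 clients x 4 sessions to one server port.
clients = [f"192.0.2.{i}" for i in range(10, 20)]
flows = [(c, 50000 + s, "198.51.100.10", 1500)
         for c in clients for s in range(4)]

print(len(hash_keys(flows, "l3")))  # 10 distinct keys
print(len(hash_keys(flows, "l4")))  # 40 distinct keys
```

More distinct keys give the hash more chances to spread load, but only if the ports actually vary: if both source and destination ports are fixed for every session, an L4 key carries no more information than the L3 key.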

 

What we really miss most now is, as noted, an ArubaOS-CX command (or a set of commands) to easily explore "what-if" scenarios while troubleshooting LACP on VSX LAGs (I mean: how is traffic flowing through each VSX LAG? Which interfaces of a VSX LAG are loaded more or less? Is the egress traffic well or badly balanced?).

 

Also, as written, NAE is not helping on this front.
