To follow up on the recommendations, here are a few things that might help:
-1- The member links of the ISL LAG must be sized to sustain at least one uplink failure scenario. Another scenario to size for is a high volume of traffic between single-attached nodes connected to the primary and single-attached nodes connected to the secondary.
-2- It is suggested to use a dedicated VRF for the keepalive UDP communication. This is not mandatory at all, but it minimizes the risk of issues caused by routing changes.
-3- It is recommended to use a dedicated direct link for the keepalive between the primary and secondary VSX nodes. Again, this is not required, but it avoids depending on the stability of the upstream L3 domain. A 1G transceiver can be used for this dedicated interconnection, as keepalive traffic needs very little bandwidth. If there is no port available for this usage, or no fiber path, then use a stable IP path between the VSX nodes (e.g. through upstream network nodes).
-4- To avoid sub-optimal traffic paths (traffic going unnecessarily through the ISL), please understand and use both features:
a) active-gateway (first-hop virtual IP), instead of VRRP
b) active-forwarding (in case you have ECMP routes on the upstream nodes pointing to each VSX node). Active forwarding is set on the transit VLAN used for upstream routing.
-5- If you don't allow all VLANs on the ISL, it is highly recommended to use vsx-sync vlans on the ISL LAG on the primary to guarantee that the same VLANs are trunked on both nodes.
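As a rough illustration of points 2 to 5, here is a minimal configuration sketch for the primary node. Every value in it is hypothetical (the VRF name KA, port 1/1/48, LAG 256, VLANs 10 and 20, and all IP and MAC addresses), and the exact syntax can differ between AOS-CX releases, so please check it against the CLI guide for your version before using it.

```
! Sketch only, primary node shown. All names, ports and addresses are made up.
! Points 2 and 3: dedicated VRF and dedicated direct link for the keepalive
vrf KA
interface 1/1/48
    no shutdown
    vrf attach KA
    ip address 192.168.0.0/31
!
vlan 10,20
!
! Point 5: ISL LAG that does not carry all VLANs, with VLAN sync enabled
interface lag 256
    no shutdown
    no routing
    vlan trunk native 1
    vlan trunk allowed 10,20
    lacp mode active
    vsx-sync vlans
!
vsx
    inter-switch-link lag 256
    role primary
    keepalive peer 192.168.0.1 source 192.168.0.0 vrf KA
!
! Point 4a: active-gateway as the first-hop virtual IP instead of VRRP
interface vlan 10
    ip address 10.1.10.2/24
    active-gateway ip mac 12:01:00:00:01:00
    active-gateway ip 10.1.10.1
!
! Point 4b: active-forwarding on the transit VLAN towards the upstream routers
interface vlan 20
    ip address 10.1.20.2/24
    active-forwarding
```

The secondary would mirror this with role secondary, its own SVI addresses, and the keepalive peer and source addresses swapped.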
If you have more points or need clarification on best practices, don't hesitate to use the forum.
Finally, if you are a partner, there is a very technical presentation on VSX that you can access on Arubapedia for partners (a 2-hour presentation and the associated slide set).