Reactive Path Rebalancing
Overview
Dynamic load balancing (DLB) is an important tool for handling the large data flows (also known as elephant flows) inherent in AI-ML data center fabrics. Reactive path rebalancing is an enhancement to existing DLB features.
In the flowlet mode of DLB, you (the network administrator) configure an inactivity interval. Traffic uses the assigned outgoing (egress) interface until the flow pauses for longer than the inactivity interval. If the quality of the outgoing link deteriorates gradually, the pauses within the flow might never exceed the configured inactivity interval. In this case, classic flowlet mode does not reassign the traffic to a different link, so the traffic cannot use a better-quality link. Reactive path rebalancing addresses this limitation by allowing the traffic to move to a better-quality link even when flowlet mode is enabled.
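The limitation of classic flowlet mode can be illustrated with a minimal Python sketch. This is a conceptual model, not the device implementation: the names FlowState, flowlet_select, and pick_best_link, as well as the link names and timer values, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class FlowState:
    egress_link: str          # link currently assigned to the flow
    last_packet_time: float   # arrival time of the previous packet, in seconds


def flowlet_select(flow: FlowState, now: float, inactivity_interval: float,
                   pick_best_link: Callable[[], str]) -> str:
    """Return the egress link for a packet that arrives at time `now`."""
    gap = now - flow.last_packet_time
    if gap > inactivity_interval:
        # The pause exceeded the inactivity interval, so a new flowlet
        # starts and the flow may be assigned to a different link.
        flow.egress_link = pick_best_link()
    # Otherwise the flow stays on its current link, even if that link
    # has degraded since the flow was assigned to it.
    flow.last_packet_time = now
    return flow.egress_link


# Hypothetical example: a steady flow never pauses long enough to move.
flow = FlowState(egress_link="link-a", last_packet_time=0.0)
first = flowlet_select(flow, now=0.00005, inactivity_interval=0.0001,
                       pick_best_link=lambda: "link-b")
second = flowlet_select(flow, now=0.0003, inactivity_interval=0.0001,
                        pick_best_link=lambda: "link-b")
```

In this sketch the first packet arrives within the inactivity interval, so the flow stays on link-a. Only the second packet, which follows a longer pause, allows a move to link-b. A long-lived flow with no such pauses would remain on link-a indefinitely, which is the case that reactive path rebalancing is meant to handle.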
The device assigns a quality band to each equal-cost multipath (ECMP) egress member link that is based on the traffic flowing through the link. The quality band depends on the port load and the queue buffer. The port load is the number of egress bytes transmitted. The queue buffer is the number of bytes waiting to be transmitted from the egress port. You can customize these attributes based on the traffic pattern flowing through the ECMP.
Benefits
- Scalable solution to link degradation
- Optimal use of bandwidth for large data flows
- Avoidance of load balancing inefficiencies due to long-lived flows
Configuration
Configuration Overview
Quality bands are numbered from 0 through 7, where 0 is the lowest quality and 7 is the highest quality. Based on the member port load and queue size, DLB assigns a quality band value to the member port. The port-to-quality band mapping changes based on instantaneous port load and queue size.
When both of the following conditions are met, reactive path rebalancing reassigns a flow to a higher-quality member link:
- A better-quality member link is available whose quality band is equal to or greater than the current member's quality band plus the configured reassignment quality delta value. The quality delta is the difference between the two quality bands. Configure the quality delta value using the quality-delta statement.
- The packet random value that the system generates is lower than the reassignment probability threshold value. Configure the probability threshold value using the prob-threshold statement.
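The two conditions can be expressed as a short Python sketch. It is a conceptual model only: the function name should_reassign and its parameters are hypothetical, and the sketch assumes that the packet random value and the probability threshold are expressed on the same scale, which the text above does not specify.

```python
def should_reassign(current_band: int, candidate_band: int,
                    quality_delta: int, packet_random: int,
                    prob_threshold: int) -> bool:
    """Return True when both reassignment conditions are met."""
    # Condition 1: the candidate link's quality band is at least the
    # current link's band plus the configured quality delta.
    better_link_available = candidate_band >= current_band + quality_delta
    # Condition 2: the random value generated for the packet is lower
    # than the configured probability threshold (same scale assumed).
    passes_probability = packet_random < prob_threshold
    return better_link_available and passes_probability


# Hypothetical values: bands 4 and 6 with a quality delta of 2.
print(should_reassign(4, 6, quality_delta=2, packet_random=1, prob_threshold=3))  # True
print(should_reassign(5, 6, quality_delta=2, packet_random=1, prob_threshold=3))  # False
```

Under these assumptions, a larger quality delta makes reassignment less likely because fewer candidate links qualify, and a lower probability threshold reduces how often a qualifying flow actually moves.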
Be aware of the following when using this feature:
- Reactive path rebalancing is a global configuration and applies to all ECMP DLB configurations in the system.
- You can configure egress quantization in addition to reactive path rebalancing to control the flow reassignment.
- Packet reordering can occur when a flow moves from one port to another, so reactive path rebalancing can cause momentary out-of-order delivery when a flow is reassigned to a new link.
Topology
In this topology, the device has three ingress ports and two egress ports. Two of the ingress streams are Layer 2 (L2) traffic and one is Layer 3 (L3) traffic. The figure shows the table entries forwarding the traffic to each of the egress ports. All the ingress and egress ports are of the same speed.
In this topology, reactive path rebalancing works as follows:
- A quality delta of 2 is configured.
- L2 stream 1 (mac 0x123) enters ingress port et-0/0/0 with a rate of 10 percent. It exits through et-0/0/10. The egress link utilization of et-0/0/10 is 10 percent and the quality band value is 6.
- The L3 stream enters port et-0/0/1 with a rate of 50 percent. It selects the optimal link from the ECMP member list and exits through et-0/0/11. The egress link utilization of et-0/0/11 is 50 percent with a quality band value of 5.
- L2 stream 2 (mac 0x223) enters port et-0/0/2 with a rate of 40 percent. It also exits through et-0/0/11. This further degrades the et-0/0/11 link quality band value to 4. Now the difference in the quality band values of the two ECMP member links is 2.
- The reactive path rebalancing algorithm now becomes operational because the difference in quality band values for ports et-0/0/10 and et-0/0/11 is equal to or higher than the configured quality delta of 2. The algorithm moves the L3 stream from et-0/0/11 to a better-quality member link, which in this case is et-0/0/10.
- After the L3 stream moves to et-0/0/10, the et-0/0/10 link utilization increases to 60 percent and its quality band value decreases to 5. L2 stream 2 continues to exit through et-0/0/11. The et-0/0/11 link utilization drops to 40 percent and its quality band value increases to 5.
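The quality band comparison in this topology can be traced with a small Python sketch. It models only the quality delta check and ignores the probability threshold. The function find_better_link is hypothetical, and its choice of the highest-band qualifying member is an assumption made for illustration. The band values are taken from the walkthrough above.

```python
from typing import Dict, Optional


def find_better_link(bands: Dict[str, int], current_link: str,
                     quality_delta: int) -> Optional[str]:
    """Return a member link that satisfies the quality delta, or None."""
    required = bands[current_link] + quality_delta
    candidates = {link: band for link, band in bands.items()
                  if link != current_link and band >= required}
    if not candidates:
        return None
    # Assumption: prefer the qualifying member with the highest band.
    return max(candidates, key=candidates.get)


QUALITY_DELTA = 2

# L2 stream 1 on et-0/0/10 and the L3 stream on et-0/0/11.
before = find_better_link({"et-0/0/10": 6, "et-0/0/11": 5},
                          "et-0/0/11", QUALITY_DELTA)

# L2 stream 2 joins et-0/0/11, whose band degrades to 4.
degraded = find_better_link({"et-0/0/10": 6, "et-0/0/11": 4},
                            "et-0/0/11", QUALITY_DELTA)

# After the L3 stream moves, both links are in band 5.
after = find_better_link({"et-0/0/10": 5, "et-0/0/11": 5},
                         "et-0/0/10", QUALITY_DELTA)

print(before, degraded, after)  # None et-0/0/10 None
```

With bands of 6 and 5 the difference is only 1, so no reassignment takes place. When et-0/0/11 drops to band 4, the difference reaches the configured delta of 2 and et-0/0/10 qualifies. Once both links settle at band 5, no further move is triggered.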
Configure Reactive Path Rebalancing
Platform Support
See Feature Explorer for platform and release support.