Load Balancing for Aggregated Ethernet Interfaces
Load balancing distributes Layer 2 traffic across the member links of an aggregated Ethernet bundle, which relieves congestion while maintaining redundancy. The topics below give an overview of load balancing, explain how to configure load balancing based on MAC addresses and on a LAG link, and describe how resilient hashing keeps flow assignment consistent.
Load Balancing and Ethernet Link Aggregation Overview
You can create a link aggregation group (LAG) for a group of Ethernet ports. Layer 2 bridging traffic is load balanced across the member links of this group, making the configuration attractive for congestion concerns as well as for redundancy. Each LAG bundle contains up to 16 links. (Platform support depends on the Junos OS release in your installation.)
For LAG bundles, the hashing algorithm determines how traffic entering a LAG bundle is placed
onto the bundle’s member links. The hashing algorithm tries to manage bandwidth by
evenly load-balancing all incoming traffic across the member links in the bundle. The
hash-mode of the hashing algorithm is set to Layer 2 payload by default. When the
hash-mode is set to Layer 2 payload, the hashing algorithm uses the IPv4 and IPv6
payload fields for hashing. You can also configure the
load balancing hash key for Layer 2 traffic to use fields in the Layer 3 and Layer 4
headers using the payload
statement.
However, note that the load-balancing behavior
is platform-specific and based on appropriate hash-key configurations.
For more information, see Configuring Load Balancing on a LAG Link. In a Layer 2 switch, one link can be overutilized while other links are underutilized.
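The way a hash places flows onto member links can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `zlib.crc32` stands in for the platform-specific hardware hash, and the interface names and flow keys are hypothetical.

```python
import zlib
from collections import Counter

def pick_member(flow_key: bytes, member_links: list[str]) -> str:
    """Map a flow key to one member link.

    zlib.crc32 is a stand-in; the real hash function is platform-specific.
    """
    return member_links[zlib.crc32(flow_key) % len(member_links)]

# Hypothetical four-link bundle.
links = ["xe-0/1/0", "xe-1/1/0", "xe-2/1/0", "xe-3/1/0"]

# Hypothetical flows, identified here by "source>destination" strings.
flows = [f"10.0.0.{i}>10.0.1.{i % 7}".encode() for i in range(200)]

# Every packet of a flow produces the same key, so it always lands on the
# same link and packet order within the flow is preserved.
first_choice = pick_member(flows[0], links)
second_choice = pick_member(flows[0], links)

# Count flows per link. The spread is often close to even, but nothing
# guarantees it, which is how one link can end up overutilized.
distribution = Counter(pick_member(flow, links) for flow in flows)
print(distribution)
```

Because the link is chosen from packet contents alone, a few heavy flows that happen to hash to the same member can still overload it.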
Configuring Load Balancing Based on MAC Addresses
The hash key mechanism for load-balancing uses Layer 2 media access control (MAC)
information such as frame source and destination address. To load-balance traffic based
on Layer 2 MAC information, include the multiservice
statement at the
[edit forwarding-options hash-key]
or [edit chassis fpc
slot number pic PIC number
hash-key]
hierarchy level:
multiservice {
    source-mac;
    destination-mac;
    payload {
        ip {
            layer3-only;
            layer-3 (source-ip-only | destination-ip-only);
            layer-4;
            inner-vlan-id;
            outer-vlan-id;
        }
    }
}
Use Feature Explorer to confirm platform and release support for specific features.
Review the Platform-Specific MAC Address Based Loadbalancing Behavior section for notes related to your platform.
To include the destination-address MAC information in the hash key, include the
destination-mac
option. To include the source-address MAC
information in the hash key, include the source-mac
option.
- Any packets that have the same source and destination address are sent over the same path.
- You can configure per-packet load balancing to optimize EVPN traffic flows across multiple paths.
- Aggregated Ethernet member links use the physical MAC address as the source MAC address in 802.3ah OAM packets.
Platform-Specific MAC Address Based Loadbalancing Behavior
Platform | Difference |
---|---|
ACX Series | ACX7000 Series Routers support symmetric hashing. For example, you need to configure both Note the following about hashing on ACX7000 Series Routers: |
Configuring Load Balancing on a LAG Link
You can configure the load balancing hash key for Layer 2
traffic to use fields in the Layer 3 and Layer 4 headers
inside the frame payload for load-balancing purposes using the payload
statement. You can configure the statement to look
at layer-3 (and source-ip-only or destination-ip-only packet header fields) or layer-4 fields. You configure
this statement at the [edit forwarding-options hash-key family
multiservice]
hierarchy level.
You can configure Layer 3 or Layer 4 options, or both.
The source-ip-only or destination-ip-only options
are mutually exclusive. The layer-3-only
statement is not
available on MX Series routers.
By default, the Junos implementation of 802.3ad balances traffic across the member links within an aggregated Ethernet bundle based on the Layer 3 information carried in the packet.
For more information about link aggregation group (LAG) configuration, see the Junos OS Network Interfaces Library for Routing Devices.
Example: Configuring Load Balancing on a LAG Link
This example configures the load-balancing hash key to use the source Layer 3 IP address option and Layer 4 header fields as well as the source and destination MAC addresses for load balancing on a link aggregation group (LAG) link:
[edit]
forwarding-options {
    hash-key {
        family multiservice {
            source-mac;
            destination-mac;
            payload {
                ip {
                    layer-3 {
                        source-ip-only;
                    }
                    layer-4;
                }
            }
        }
    }
}
Any change in the hash key configuration requires a reboot of the FPC for the changes to take effect.
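To see what the field selection in this example implies, the following Python sketch builds a hash key from the same fields (source MAC, destination MAC, source IP only, and Layer 4 ports). It is a hedged illustration, not the device's algorithm: the `Frame` type, the field names, and the use of `zlib.crc32` are all assumptions made for the sketch.

```python
import zlib
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class Frame:
    """Hypothetical parsed frame; field names are illustrative only."""
    src_mac: str
    dst_mac: str
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int

# Mirrors the example: source-mac, destination-mac,
# layer-3 source-ip-only, and layer-4.
SELECTED_FIELDS = ("src_mac", "dst_mac", "src_ip", "src_port", "dst_port")

def hash_key(frame: Frame, fields=SELECTED_FIELDS) -> bytes:
    """Concatenate only the selected header fields into a key."""
    return "|".join(str(getattr(frame, name)) for name in fields).encode()

def member_index(frame: Frame, link_count: int) -> int:
    """Pick a member link index; crc32 is a stand-in for the real hash."""
    return zlib.crc32(hash_key(frame)) % link_count

a = Frame("00:00:5e:00:53:01", "00:00:5e:00:53:02",
          "192.0.2.1", "198.51.100.1", 40000, 443)

# Same frame except for the destination IP, which source-ip-only excludes.
b = replace(a, dst_ip="198.51.100.99")

# Same frame except for the source port, which layer-4 includes.
c = replace(a, src_port=40001)

print(hash_key(a) == hash_key(b))  # destination IP does not affect the key
print(hash_key(a) == hash_key(c))  # source port does affect the key
```

Frames `a` and `b` always map to the same member link because the fields that differ between them are not part of the key.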
Understanding Multicast Load Balancing on Aggregated 10-Gigabit Links for Routed Multicast Traffic on EX8200 Switches
Streaming video technology was introduced in 1997. Multicast protocols were subsequently developed to reduce data replication and network overloads. With multicasting, servers can send a single stream to a group of recipients instead of sending multiple unicast streams. While the use of streaming video technology was previously limited to occasional company presentations, multicasting has provided a boost to the technology resulting in a constant stream of movies, real-time data, news clips, and amateur videos flowing nonstop to computers, TVs, tablets, and phones. However, all of these streams quickly overwhelmed the capacity of network hardware and increased bandwidth demands leading to unacceptable blips and stutters in transmission.
To satisfy the growing bandwidth demands, multiple links were virtually aggregated to form bigger logical point-to-point link channels for the flow of data. These virtual link combinations are called multicast interfaces, also known as link aggregation groups (LAGs).
Multicast load balancing involves managing the individual links in each LAG to ensure that each link is used efficiently. Hashing algorithms continually evaluate the data stream, adjusting stream distribution over the links in the LAG, so that no link is underutilized or overutilized. Multicast load balancing is enabled by default on Juniper Networks EX8200 Ethernet Switches.
This topic includes:
- Create LAGs for Multicasting in Increments of 10 Gigabits
- When Should I Use Multicast Load Balancing?
- How Does Multicast Load Balancing Work?
- How Do I Implement Multicast Load Balancing on an EX8200 Switch?
Create LAGs for Multicasting in Increments of 10 Gigabits
The maximum link size on an EX8200 switch is 10 gigabits. If you need a larger link on an EX8200 switch, you can combine up to twelve 10-gigabit links. In the sample topology shown in Figure 1, four 10-gigabit links have been aggregated to form each 40-gigabit link.
When Should I Use Multicast Load Balancing?
Use a LAG with multicast load balancing when you need a downstream link greater than 10 gigabits. This need frequently arises when you act as a service provider or when you multicast video to a large audience.
To use multicast load balancing, you need the following:
- An EX8200 switch: Standalone switches support multicast load balancing, while Virtual Chassis does not.
- A Layer 3 routed multicast setup: For information about configuring multicasting, see Junos OS Routing Protocols Configuration Guide.
- Aggregated 10-gigabit links in a LAG: For information about configuring LAGs with multicast load balancing, see Configuring Multicast Load Balancing for Use with Aggregated 10-Gigabit Ethernet Links on EX8200 Switches (CLI Procedure).
How Does Multicast Load Balancing Work?
When traffic can use multiple member links, traffic that is part of the same stream must always be on the same link.
Multicast load balancing uses one of seven available hashing algorithms and a technique called queue shuffling (alternating between two queues) to distribute and balance the data, directing streams over all available aggregated links. You can select one of the seven algorithms when you configure multicast load balancing, or you can use the default algorithm, crc-sgip, which uses a cyclic redundancy check (CRC) algorithm on the multicast packets’ source and group IP addresses. We recommend that you start with the crc-sgip default and try other options if this algorithm does not evenly distribute the Layer 3 routed multicast traffic. Six of the algorithms are based on the hashed value of IP addresses (IPv4 or IPv6) and produce the same result each time they are used. Only the balanced mode option produces results that vary depending on the order in which streams are added. See Table 1 for more information.
Hashing Algorithms | Based On | Best Use |
---|---|---|
crc-sgip | Cyclic redundancy check of multicast packets’ source and group IP address | Default: high-performance management of IP traffic on 10-Gigabit Ethernet network. Predictable assignment to the same link each time. This mode is complex but yields a good distributed hash. |
crc-gip | Cyclic redundancy check of multicast packets’ group IP address | Predictable assignment to the same link each time. Try this mode when crc-sgip does not evenly distribute the Layer 3 routed multicast traffic and the group IP addresses vary. |
crc-sip | Cyclic redundancy check of multicast packets’ source IP address | Predictable assignment to the same link each time. Try this mode when crc-sgip does not evenly distribute the Layer 3 routed multicast traffic and the stream sources vary. |
simple-sgip | XOR calculation of multicast packets’ source and group IP address | Predictable assignment to the same link each time. This is a simple hashing method that might not yield as even a distribution as crc-sgip yields. Try this mode when crc-sgip does not evenly distribute the Layer 3 routed multicast traffic. |
simple-gip | XOR calculation of multicast packets’ group IP address | Predictable assignment to the same link each time. This is a simple hashing method that might not yield as even a distribution as crc-gip yields. Try this when crc-gip does not evenly distribute the Layer 3 routed multicast traffic and the group IP addresses vary. |
simple-sip | XOR calculation of multicast packets’ source IP address | Predictable assignment to the same link each time. This is a simple hashing method that might not yield as even a distribution as crc-sip yields. Try this mode when crc-sip does not evenly distribute the Layer 3 routed multicast traffic and stream sources vary. |
balanced | Round-robin calculation method used to identify multicast links with the least amount of traffic | Best balance is achieved, but you cannot predict which link will be consistently used because that depends on the order in which streams come online. Use when consistent assignment is not needed after every reboot. |
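The difference between the address-based modes and balanced mode can be sketched in Python. This is a rough model under stated assumptions, not the switch's implementation: `zlib.crc32` stands in for the hardware CRC, the XOR folding is a guess at what "simple" means, and the `Balanced` class uses stream count as a proxy for traffic load. All function and class names are hypothetical.

```python
import ipaddress
import zlib

def _packed(addr: str) -> bytes:
    """Return the raw bytes of an IPv4 or IPv6 address."""
    return ipaddress.ip_address(addr).packed

def crc_sgip(source: str, group: str, n_links: int) -> int:
    """CRC over source and group addresses (stand-in for crc-sgip)."""
    return zlib.crc32(_packed(source) + _packed(group)) % n_links

def simple_sgip(source: str, group: str, n_links: int) -> int:
    """XOR of source and group addresses (stand-in for simple-sgip)."""
    mixed = int(ipaddress.ip_address(source)) ^ int(ipaddress.ip_address(group))
    return mixed % n_links

class Balanced:
    """Assign each new stream to the link carrying the fewest streams.

    Assumption: stream count approximates traffic load in this sketch.
    """
    def __init__(self, n_links: int):
        self.load = [0] * n_links
        self.assigned = {}

    def link_for(self, source: str, group: str) -> int:
        key = (source, group)
        if key not in self.assigned:
            link = self.load.index(min(self.load))
            self.load[link] += 1
            self.assigned[key] = link
        return self.assigned[key]

s1 = ("10.1.1.1", "232.1.1.1")
s2 = ("10.1.1.2", "232.1.1.2")

# Address-based modes give the same answer no matter when a stream starts.
print(crc_sgip(*s1, 4), simple_sgip(*s1, 4))

# Balanced mode depends on arrival order: swap the order, swap the links.
first = Balanced(4)
order_a = (first.link_for(*s1), first.link_for(*s2))
second = Balanced(4)
order_b = (second.link_for(*s2), second.link_for(*s1))
print(order_a, order_b)
```

The last two lines show why balanced mode cannot promise the same link after a reboot: the assignment follows the order in which streams come online.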
How Do I Implement Multicast Load Balancing on an EX8200 Switch?
To implement multicast load balancing with an optimized level of throughput on an EX8200 switch, follow these recommendations:
Allow 25 percent unused bandwidth in the aggregated link to accommodate any dynamic imbalances due to link changes caused by sharing multicast interfaces.
For downstream links, use multicast interfaces of the same size whenever possible. Also, for downstream aggregated links, throughput is optimized when members of the aggregated link belong to the same devices.
For upstream aggregated links, use a Layer 3 link whenever possible. Also, for upstream aggregated links, throughput is optimized when the members of the aggregated link belong to different devices.
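As a quick planning aid, the 25 percent headroom recommendation can be turned into arithmetic. The helper below is a hypothetical sketch (its name and defaults are assumptions); it uses the 10-gigabit member speed and the twelve-link maximum stated above.

```python
def planned_capacity_gbps(member_links: int,
                          link_speed_gbps: float = 10.0,
                          headroom: float = 0.25) -> float:
    """Return the bandwidth to plan for after reserving headroom.

    headroom=0.25 reflects the recommendation to leave 25 percent unused.
    """
    if not 1 <= member_links <= 12:
        raise ValueError("an EX8200 LAG combines 1 to 12 links")
    return member_links * link_speed_gbps * (1 - headroom)

# A four-link, 40-gigabit LAG leaves 30 Gbps of planned multicast capacity.
print(planned_capacity_gbps(4))  # 30.0
```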
Example: Configuring Multicast Load Balancing for Use with Aggregated 10-Gigabit Ethernet Interfaces on EX8200 Switches
EX8200 switches support multicast load balancing on link aggregation groups (LAGs). Multicast load balancing evenly distributes Layer 3 routed multicast traffic over the LAGs. You can aggregate up to twelve 10-gigabit Ethernet links to form a 120-gigabit virtual link or LAG. The MAC client can treat this virtual link as if it were a single link to increase bandwidth, provide graceful degradation as link failures occur, and increase availability. On EX8200 switches, multicast load balancing is enabled by default. However, if it is explicitly disabled, you can reenable it.
An interface with an already configured IP address cannot form part of the LAG.
Only EX8200 standalone switches with 10-gigabit links support multicast load balancing. Virtual Chassis does not support multicast load balancing.
This example shows how to configure a LAG and reenable multicast load balancing:
Requirements
This example uses the following hardware and software components:
Two EX8200 switches, one used as the access switch and one used as the distribution switch
Junos OS Release 12.2 or later for EX Series switches
Before you begin:
Configure four 10-gigabit interfaces on the EX8200 distribution switch: xe-0/1/0, xe-1/1/0, xe-2/1/0, and xe-3/1/0. See Configuring Gigabit Ethernet Interfaces (CLI Procedure).
Overview and Topology
Multicast load balancing uses one of seven hashing algorithms to balance traffic between the individual 10-gigabit links in the LAG. For a description of the hashing algorithms, see multicast-loadbalance. The default hashing algorithm is crc-sgip. You can experiment with the different hashing algorithms until you determine the one that best balances your Layer 3 routed multicast traffic.
When a link larger than 10 gigabits is needed on an EX8200 switch, you can combine up to twelve 10-gigabit links to create more bandwidth. This example uses the link aggregation feature to combine four 10-gigabit links into a 40-gigabit link on the distribution switch. In addition, multicast load balancing is enabled to ensure even distribution of Layer 3 routed multicast traffic on the 40-gigabit link. In the sample topology illustrated in Figure 2, an EX8200 switch in the distribution layer is connected to an EX8200 switch in the access layer.
Link speed is automatically determined based on the size of the LAG configured. For example, if a LAG is composed of four 10-gigabit links, the link speed is 40 gigabits per second.
The default hashing algorithm, crc-sgip, involves a cyclic redundancy check of both the multicast packet source and group IP addresses.
You will configure a LAG on each switch and reenable multicast load balancing. When reenabled, multicast load balancing will automatically take effect on the LAG, and the speed is set to 10 gigabits per second for each link in the LAG. Link speed for the 40-gigabit LAG is automatically set to 40 gigabits per second.
Configuration
Procedure
CLI Quick Configuration
To quickly configure this example, copy the
following commands, paste them into a text file, remove any line breaks,
change any details necessary to match your network configuration,
and then copy and paste the commands into the CLI at the [edit]
hierarchy level.
set chassis aggregated-devices ethernet device-count 1
set interfaces ae0 aggregated-ether-options minimum-links 1
set interfaces xe-0/1/0 ether-options 802.3ad ae0
set interfaces xe-1/1/0 ether-options 802.3ad ae0
set interfaces xe-2/1/0 ether-options 802.3ad ae0
set interfaces xe-3/1/0 ether-options 802.3ad ae0
set chassis multicast-loadbalance hash-mode crc-gip
Step-by-Step Procedure
To configure a LAG and reenable multicast load balancing:
Specify the number of aggregated Ethernet interfaces to be created:
[edit chassis]
user@switch# set aggregated-devices ethernet device-count 1
Specify the minimum number of links for the aggregated Ethernet interface (aex), that is, the LAG, to be labeled up:
Note: By default, only one link needs to be up for the LAG to be labeled up.
[edit interfaces]
user@switch# set ae0 aggregated-ether-options minimum-links 1
Specify the four members to be included within the LAG:
[edit interfaces]
user@switch# set xe-0/1/0 ether-options 802.3ad ae0
user@switch# set xe-1/1/0 ether-options 802.3ad ae0
user@switch# set xe-2/1/0 ether-options 802.3ad ae0
user@switch# set xe-3/1/0 ether-options 802.3ad ae0
Reenable multicast load balancing:
[edit chassis]
user@switch# set multicast-loadbalance
Note: You do not need to set link speed the way you do for LAGs that do not use multicast load balancing. Link speed is automatically set to 40 gigabits per second on a 40-gigabit LAG.
You can optionally change the value of the hash-mode option in the multicast-loadbalance statement to try different algorithms until you find the one that best distributes your Layer 3 routed multicast traffic. If you change the hashing algorithm when multicast load balancing is disabled, the new algorithm takes effect after you reenable multicast load balancing.
Results
Check the results of the configuration:
user@switch> show configuration
chassis {
    aggregated-devices {
        ethernet {
            device-count 1;
        }
    }
    multicast-loadbalance {
        hash-mode crc-gip;
    }
}
interfaces {
    xe-0/1/0 {
        ether-options {
            802.3ad ae0;
        }
    }
    xe-1/1/0 {
        ether-options {
            802.3ad ae0;
        }
    }
    xe-2/1/0 {
        ether-options {
            802.3ad ae0;
        }
    }
    xe-3/1/0 {
        ether-options {
            802.3ad ae0;
        }
    }
    ae0 {
        aggregated-ether-options {
            minimum-links 1;
        }
    }
}
Verification
To confirm that the configuration is working properly, perform these tasks:
Verifying the Status of a LAG Interface
Purpose
Verify that a link aggregation group (LAG) (ae0) has been created on the switch.
Action
Verify that the ae0 LAG has been created:
user@switch> show interfaces ae0 terse
Interface               Admin Link Proto    Local                 Remote
ae0                     up    up
ae0.0                   up    up   inet     10.10.10.2/24
Meaning
The interface name aex indicates that this is a LAG. A stands for aggregated, and E stands for Ethernet. The number differentiates the various LAGs.
Verifying Multicast Load Balancing
Purpose
Check that traffic is load-balanced equally across paths.
Action
Verify load balancing across the four interfaces:
user@switch> monitor interface traffic
Bytes=b, Clear=c, Delta=d, Packets=p, Quit=q or ESC, Rate=r, Up=^U, Down=^D
ibmoem02-re1                  Seconds: 3                  Time: 16:06:14
Interface    Link    Input packets    (pps)    Output packets    (pps)
xe-0/1/0     Up            2058834     (10)           7345862     (19)
xe-1/1/0     Up            2509289      (9)           6740592     (21)
xe-2/1/0     Up            8625688     (90)          10558315     (20)
xe-3/1/0     Up            2374154     (23)          71494375      (9)
Meaning
The interfaces should be carrying approximately the same amount of traffic.
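One way to judge "approximately the same" is to compare each link's share of the total rate against an even split. The Python sketch below does this for the packet-per-second figures in the sample output above. The function names and the tolerance values are assumptions chosen for illustration, not thresholds from Junos OS.

```python
def link_shares(rates: dict[str, int]) -> dict[str, float]:
    """Return each link's fraction of the total rate."""
    total = sum(rates.values())
    if total == 0:
        return {name: 0.0 for name in rates}
    return {name: rate / total for name, rate in rates.items()}

def is_balanced(rates: dict[str, int], tolerance: float = 0.10) -> bool:
    """True when every link's share is within `tolerance` of an even split.

    The default tolerance is a hypothetical choice for this sketch.
    """
    even = 1 / len(rates)
    return all(abs(share - even) <= tolerance
               for share in link_shares(rates).values())

# Rates (pps) copied from the sample monitor output.
output_pps = {"xe-0/1/0": 19, "xe-1/1/0": 21, "xe-2/1/0": 20, "xe-3/1/0": 9}
input_pps = {"xe-0/1/0": 10, "xe-1/1/0": 9, "xe-2/1/0": 90, "xe-3/1/0": 23}

for name, share in link_shares(output_pps).items():
    print(f"{name}: {share:.1%} of output pps")

print("output balanced:", is_balanced(output_pps, tolerance=0.15))
print("input balanced:", is_balanced(input_pps))
```

Instantaneous pps values fluctuate, so a check like this is more meaningful when applied to rates averaged over several sampling intervals.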
Dynamic Load Balancing
Load balancing ensures that network traffic is distributed as evenly as possible across members in a given equal-cost multipath (ECMP) group or link aggregation group (LAG). In general, load balancing is classified as either static or dynamic. Static load balancing (SLB) computes hashing solely based on the packet contents (for example, source IP and destination IP). The biggest advantage of SLB is that packet ordering is guaranteed because all packets of a given flow take the same path. However, because the SLB mechanism does not consider the path or link load, the network often experiences the following problems:
- Poor link bandwidth utilization
- An elephant flow on a single link that causes the mice flows on that link to be dropped
Dynamic load balancing (DLB) is an improvement on top of SLB.
For ECMP, you can configure DLB globally, whereas for LAG, you configure it for each aggregated Ethernet interface. You can apply DLB to selected EtherTypes (IPv4, IPv6, and MPLS) by configuring ether-type. If you don't configure any ether-type, then DLB is applied to all EtherTypes. Note that you must explicitly configure the DLB mode because there is no default mode.
- Starting in Junos OS Release 22.3R1-EVO, QFX5130-32CD switches support dynamic load balancing for both ECMP and LAG.
- Starting in Junos OS Release 19.4R1, QFX5120-32C and QFX5120-48Y switches support dynamic load balancing for both ECMP and LAG. For LAG, DLB must be configured on a per-aggregated-Ethernet-interface basis.
- Starting in Junos OS Evolved Release 19.4R2, QFX5220 switches support dynamic load balancing (DLB) for ECMP. For ECMP, DLB must be configured globally.
- You cannot configure both DLB and resilient hashing at the same time. Doing so results in a commit error.
- DLB is applicable only for unicast traffic.
- DLB is not supported when the LAG is one of the egress ECMP members.
- DLB is not supported for remote LAG members.
- DLB is not supported on Virtual Chassis and Virtual Chassis Fabric (VCF).
- DLB on LAG and HiGig-trunk are not supported at the same time.
- QFX5220, QFX5230-64CD, and QFX5240 switches do not support DLB on LAG.
Platform | DLB Support for ECMP | DLB Support for LAG |
---|---|---|
QFX5120-32C | Yes | Yes |
QFX5120-48Y | Yes | Yes |
QFX5220 | Yes | No |
QFX5230-64CD | Yes | No |
QFX5240 | Yes | No |
You can use the following DLB modes to load-balance traffic:
- Per packet mode
  In this mode, DLB is initiated for each packet in the flow. This mode makes sure that the packet always gets assigned to the best-quality member port. However, in this mode, DLB may experience packet reordering problems that can arise due to latency skews.
- Flowlet mode
  This mode relies on assigning links based on flowlets instead of flows. Real-world application traffic relies on flow control mechanisms of upper-layer transport protocols such as TCP, which throttle the transmission rate. As a result, flowlets are created. You can consider flowlets as multiple bursts of the same flow separated by a period of inactivity between these bursts. This period of inactivity is referred to as the inactivity interval. The inactivity interval serves as the demarcation criterion for identifying new flowlets and is offered as a user-configurable statement under the DLB configuration. In this mode, DLB is initiated per flowlet, that is, for a new flow as well as for an existing flow that has been inactive for a sufficiently long period of time (the configured inactivity-interval). The reordering problem of per packet mode is addressed in this mode because all the packets in a flowlet take the same link. If the inactivity-interval value is configured to be higher than the maximum latency skew across all ECMP paths, then you can avoid packet reordering across flowlets while increasing link utilization of all available ECMP links.
- Assigned flow mode
  You can use assigned flow mode to selectively disable rebalancing for a period of time to isolate problem sources. You cannot use this mode for real-time DLB or predict the egress ports that will be selected using this mode because assigned flow mode does not consider port load and queue size.
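The flowlet idea can be illustrated by splitting one flow's packet arrival times wherever the gap exceeds the inactivity interval. The Python sketch below is a conceptual model only. The function name, the sample timestamps, and the interval values are hypothetical and do not reflect Junos OS defaults.

```python
def split_into_flowlets(timestamps_us: list[int],
                        inactivity_interval_us: int) -> list[list[int]]:
    """Group packet arrival times (microseconds) of one flow into flowlets.

    A new flowlet starts whenever the gap since the previous packet
    exceeds the inactivity interval.
    """
    flowlets: list[list[int]] = []
    for ts in sorted(timestamps_us):
        if flowlets and ts - flowlets[-1][-1] <= inactivity_interval_us:
            flowlets[-1].append(ts)   # still inside the current burst
        else:
            flowlets.append([ts])     # idle gap exceeded: new flowlet
    return flowlets

# Hypothetical arrival times for a single flow.
arrivals = [0, 10, 20, 500, 510, 2000]

print(split_into_flowlets(arrivals, 256))
# [[0, 10, 20], [500, 510], [2000]]

print(split_into_flowlets(arrivals, 1000))
# [[0, 10, 20, 500, 510], [2000]]
```

Each inner list is a unit that DLB could place on a different link. A larger interval produces fewer, longer flowlets, which lowers the reordering risk but gives DLB fewer chances to rebalance.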
Here are some of the important behaviors of DLB:
- DLB is applicable for incoming EtherTypes only.
- From a DLB perspective, both Layer 2 and Layer 3 link aggregation group (LAG) bundles are considered the same.
- Link utilization is not optimal if you use dynamic load balancing in asymmetric bundles, that is, on ECMP links with different member capacities.
- With DLB, no reassignment of flow happens when a new link is added in per packet and assigned flow modes. This can cause suboptimal usage in link flap scenarios, where a previously utilized link might remain unused after it flaps if no new flows or flowlets are seen after the flap.
Benefits
- DLB considers member bandwidth utilization along with packet content for member selection. As a result, link utilization improves because member selection reflects real-time link loads.
- DLB ensures that links hogged by elephant flows are not used by mice flows. Thus, DLB avoids the hash collision drops that occur with SLB: flows are spread across the links, so the collisions and the consequent packet drops are avoided.
Configuring Dynamic Load Balancing
This topic describes how to configure dynamic load balancing (DLB) in flowlet mode.
Starting in Junos OS Release 19.4R1, QFX5120-32C and QFX5120-48Y switches support dynamic load balancing for both ECMP and LAG. For LAG, DLB must be configured on a per-aggregated-Ethernet-interface basis.
Starting in Junos OS Evolved Release 19.4R2, QFX5220 switches support dynamic load balancing (DLB) for ECMP. For ECMP, DLB must be configured globally.
Configuring DLB for ECMP (Flowlet mode)
To configure dynamic load balancing for ECMP with flowlet mode (QFX5120-32C, QFX5120-48Y, and QFX5220 switches):
Similarly, you can configure DLB for ECMP with Per packet or Assigned flow mode.
Configuring DLB for LAG (Flowlet mode)
Before you begin, create an aggregated ethernet (AE) bundle by configuring a set of router interfaces as aggregated Ethernet and with a specific aggregated ethernet (AE) group identifier.
To configure dynamic load balancing for LAG with flowlet mode (QFX5120-32C and QFX5120-48Y):
Enable dynamic load balancing with flowlet mode:
[edit interfaces ae-x aggregated-ether-options]
user@router# set dlb flowlet
(Optional) Configure the inactivity-interval value, which is the minimum inactivity interval (in microseconds) for link reassignment:
[edit interfaces ae-x aggregated-ether-options]
user@router# set dlb flowlet inactivity-interval (micro seconds)
(Optional) Configure dynamic load balancing with ether-type:
[edit forwarding-options enhanced-hash-key]
user@router# set lag-dlb ether-type mpls
(Optional) You can view the options configured for dynamic load balancing on LAG using the show forwarding-options enhanced-hash-key command.
Similarly, you can configure DLB for LAG with Per packet or Assigned flow mode.
Example: Configure Dynamic Load Balancing
This example shows how to configure dynamic load balancing.
Requirements
This example uses the following hardware and software components:
Two QFX5120-32C or QFX5120-48Y switches
Junos OS Release 19.4R1 or later running on all devices
Overview
Dynamic load balancing (DLB) is an improvement on top of SLB.
For ECMP, you can configure DLB globally, whereas for LAG, you configure it for each aggregated Ethernet interface. You can apply DLB to selected EtherTypes such as IPv4, IPv6, and MPLS by configuring ether-type. If you don't configure any ether-type, then DLB is applied to all EtherTypes. Note that you must explicitly configure the DLB mode because there is no default mode.
Starting in Junos OS Release 19.4R1, QFX5120-32C and QFX5120-48Y switches support dynamic load balancing on both ECMP and LAG.
You cannot configure both DLB and resilient hashing at the same time. Doing so results in a commit error.
Topology
In this topology, both R0 and R1 are connected.
This example shows static configuration. You can also add configuration with dynamic protocols.
Configuration
- CLI Quick Configuration
- Configure Dynamic Load Balancing for LAG (QFX5120-32C and QFX5120-48Y)
- Configure Dynamic Load Balancing for ECMP (QFX5120-32C, QFX5120-48Y, and QFX5220 switches)
CLI Quick Configuration
To quickly configure this example, copy the
following commands, paste them into a text file, remove any line breaks,
change any details necessary to match your network configuration,
and then copy and paste the commands into the CLI at the [edit]
hierarchy level.
R0
set interfaces xe-0/0/0 unit 0 family inet address 10.1.0.2/24
set interfaces xe-0/0/10 unit 0 family inet address 10.1.1.2/24
set interfaces xe-0/0/54:0 unit 0 family inet address 10.10.10.2/24
set forwarding-options enhanced-hash-key ecmp-dlb per-packet
set policy-options policy-statement loadbal then load-balance per-packet
set routing-options static route 20.0.1.0/24 next-hop 10.1.0.3
set routing-options static route 20.0.1.0/24 next-hop 10.1.1.3
set routing-options forwarding-table export loadbal
R1
set interfaces xe-0/0/0 unit 0 family inet address 10.1.0.3/24
set interfaces xe-0/0/10 unit 0 family inet address 10.1.1.3/24
set interfaces xe-0/0/52:0 unit 0 family inet address 20.0.0.2/16
Configure Dynamic Load Balancing for LAG (QFX5120-32C and QFX5120-48Y)
Step-by-Step Procedure
The following example requires you to navigate various levels in the configuration hierarchy. For information about navigating the CLI, see Using the CLI Editor in Configuration Mode.
To configure the R0 router:
Repeat this procedure for the other routers, after modifying the appropriate interface names, addresses, and any other parameters for each router.
Configure the link aggregation group (LAG).
[edit]
user@R0# set interfaces xe-0/0/0 ether-options 802.3ad ae0
user@R0# set interfaces xe-0/0/10 ether-options 802.3ad ae0
user@R0# set interfaces ae0 aggregated-ether-options lacp active
user@R0# set interfaces ae0 unit 0 family inet address 10.1.0.2/24
user@R0# set routing-options static route 20.0.1.0/24 next-hop 10.1.0.3
After configuring the LAG, in the verification section, execute the steps in the Verify Traffic Load Before Configuring Dynamic Load Balancing Feature on LAG section to check the configuration or the traffic load before configuring DLB.
Configure dynamic load balancing with per-packet mode for the LAG.
[edit]
user@R0# set interfaces ae0 aggregated-ether-options dlb per-packet
After configuring DLB, in the verification section, execute the steps in the Verify Traffic Load After Configuring Dynamic Load Balancing Feature on LAG section to check the configuration or the traffic load after configuring DLB.
Configure Dynamic Load Balancing for ECMP (QFX5120-32C, QFX5120-48Y, and QFX5220 switches)
Step-by-Step Procedure
The following example requires you to navigate various levels in the configuration hierarchy. For information about navigating the CLI, see Using the CLI Editor in Configuration Mode.
To configure the R0 router:
Repeat this procedure for the other routers, after modifying the appropriate interface names, addresses, and any other parameters for each router.
Configure the Gigabit Ethernet interface links connecting R0 to R1.
[edit]
user@R0# set interfaces xe-0/0/0 unit 0 family inet address 10.1.0.2/24
user@R0# set interfaces xe-0/0/10 unit 0 family inet address 10.1.1.2/24
user@R0# set interfaces xe-0/0/54:0 unit 0 family inet address 10.10.10.2/24
Create the static routes:
[edit]
user@R0# set routing-options static route 20.0.1.0/24 next-hop 10.1.0.3
user@R0# set routing-options static route 20.0.1.0/24 next-hop 10.1.1.3
Apply the load-balancing policy. The dynamic load balancing feature requires multiple ECMP next hops to be present in the forwarding table.
[edit]
user@R0# set policy-options policy-statement loadbal then load-balance per-packet
user@R0# set routing-options forwarding-table export loadbal
Configure dynamic load balancing with per-packet mode for ECMP.
[edit]
user@R0# set forwarding-options enhanced-hash-key ecmp-dlb per-packet
On R1, configure the Gigabit Ethernet interface links.
[edit]
user@R1# set interfaces xe-0/0/0 unit 0 family inet address 10.1.0.3/24
user@R1# set interfaces xe-0/0/10 unit 0 family inet address 10.1.1.3/24
user@R1# set interfaces xe-0/0/52:0 unit 0 family inet address 20.0.0.2/16
Verification
Confirm that the configuration is working properly.
- Verify Traffic Load Before Configuring Dynamic Load Balancing Feature on LAG
- Verify Traffic Load After Configuring Dynamic Load Balancing Feature on LAG
Verify Traffic Load Before Configuring Dynamic Load Balancing Feature on LAG
Purpose
Verify the traffic load before the DLB feature is configured on the link aggregation group.
Action
From operational mode, run the show interfaces
interface-name | match pps
command.
user@R0> show interfaces xe-0/0/0 | match pps
  Input rate     : 1240 bps (1 pps)
  Output rate    : 1024616 bps (1000 pps)    ## all traffic in one link.
user@R0> show interfaces xe-0/0/10 | match pps
  Input rate     : 616 bps (0 pps)
  Output rate    : 1240 bps (1 pps)    ## no traffic
Verify Traffic Load After Configuring Dynamic Load Balancing Feature on LAG
Purpose
Verify that packets received on the R0 are load-balanced.
Action
From operational mode, run the show interfaces interface-name | match pps command.
user@R0> show interfaces xe-0/0/0 | match pps
  Input rate     : 616 bps (0 pps)
  Output rate    : 519096 bps (506 pps)    ## load equally shared
user@R0> show interfaces xe-0/0/10 | match pps
  Input rate     : 1232 bps (1 pps)
  Output rate    : 512616 bps (500 pps)    ## load equally shared
Meaning
Dynamic load balancing with per-packet mode is working successfully. After you apply the dynamic load balancing feature on the LAG, the load is shared equally in the network.
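The before and after comparison can also be expressed as each link's share of the total output packet rate. The following is a minimal Python sketch, not a Juniper tool: the helper names output_pps and link_shares are hypothetical, and the script assumes the command output has already been collected as text in the form shown above.

```python
import re

def output_pps(show_output: str) -> int:
    """Extract the packets-per-second value from the 'Output rate' line."""
    match = re.search(r"Output rate\s*:\s*\d+\s*bps\s*\((\d+)\s*pps\)", show_output)
    if match is None:
        raise ValueError("no 'Output rate' line found")
    return int(match.group(1))

def link_shares(outputs: dict[str, str]) -> dict[str, float]:
    """Return each link's share of the total output pps, in percent."""
    pps = {name: output_pps(text) for name, text in outputs.items()}
    total = sum(pps.values())
    if total == 0:
        return {name: 0.0 for name in pps}
    return {name: round(100.0 * value / total, 1) for name, value in pps.items()}

# Sample text copied from the verification output in this example.
before = {
    "xe-0/0/0": "Input rate : 1240 bps (1 pps)\nOutput rate : 1024616 bps (1000 pps)",
    "xe-0/0/10": "Input rate : 616 bps (0 pps)\nOutput rate : 1240 bps (1 pps)",
}
after = {
    "xe-0/0/0": "Input rate : 616 bps (0 pps)\nOutput rate : 519096 bps (506 pps)",
    "xe-0/0/10": "Input rate : 1232 bps (1 pps)\nOutput rate : 512616 bps (500 pps)",
}

print("before:", link_shares(before))
print("after: ", link_shares(after))
```

With the sample values, almost all output traffic uses xe-0/0/0 before DLB is configured, and the two links each carry close to half of the traffic afterward.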
Verification
Confirm that the configuration is working properly at R0.
Verify Dynamic Load Balancing on R0
Purpose
Verify that packets received on R0 are load-balanced.
Action
From operational mode, run the show route forwarding-table destination destination-address command.
user@R0> show route forwarding-table destination 20.0.1.0/24
inet.0: 178 destinations, 178 routes (178 active, 0 holddown, 0 hidden)
+ = Active Route, - = Last Active, * = Both

20.0.1.0/24        *[Static/5] 1d 03:35:12
                    >  to 10.1.0.3 via xe-0/0/0.0
                       to 10.1.1.3 via xe-0/0/10.0

user@R0> show route 20.0.1.0/24
inet.0: 178 destinations, 178 routes (178 active, 0 holddown, 0 hidden)
+ = Active Route, - = Last Active, * = Both

20.0.1.0/24        *[Static/5] 1d 03:35:12
                    >  to 10.1.0.3 via xe-0/0/0.0
                       to 10.1.1.3 via xe-0/0/10.0
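Meaning
The static route to 20.0.1.0/24 has two next hops, 10.1.0.3 through xe-0/0/0.0 and 10.1.1.3 through xe-0/0/10.0, so both ECMP paths are available for load balancing.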
Verify Load Balancing on R1
Purpose
Confirm that the configuration is working properly at R1.
Action
From operational mode, run the show route command.
user@R1> show route 20.0.1.25
inet.0: 146 destinations, 146 routes (146 active, 0 holddown, 0 hidden)
+ = Active Route, - = Last Active, * = Both

20.0.0.0/16        *[Direct/0] 1d 03:37:11
                    >  via xe-0/0/52:0.0
Meaning
Dynamic load balancing with per-packet mode is working successfully. After you apply the dynamic load balancing feature on ECMP, the load is shared equally in the network.
Configure Flowset Table Size in DLB Flowlet Mode
Overview
Dynamic load balancing (DLB) is a load balancing technique that selects an optimal egress link based on link quality so that traffic flows are evenly distributed. You (the network administrator) can configure DLB in flowlet mode.
In flowlet mode, DLB tracks the flows by recording the last seen timestamp and the egress interface that DLB selected based on the optimal link quality. DLB records this information in the flowset table allocated to each ECMP group. The DLB algorithm maintains a given flow on a particular link until the last seen timestamp exceeds the inactivity timer. When the inactivity timer expires for a particular flow, DLB rechecks whether that link is still optimal for that flow. If the link is no longer optimal, DLB selects a new egress link and updates the flowset table with the new link and the last known timestamp of the flow. If the link continues to be optimal, the flowset table continues to use the same egress link.
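The flowlet behavior described above can be illustrated with a small model. This is a minimal Python sketch under stated assumptions, not the switch implementation: the FlowEntry and FlowsetTable names are hypothetical, the table is keyed directly by a flow identifier rather than by a hardware hash, and link quality is reduced to a single integer per link where a higher value means a better link.

```python
from dataclasses import dataclass

@dataclass
class FlowEntry:
    egress: str        # egress link that was selected for this flow
    last_seen: float   # timestamp of the most recent packet of the flow

class FlowsetTable:
    """Toy model of the flowset table for one ECMP group in flowlet mode."""

    def __init__(self, inactivity_interval: float):
        self.inactivity_interval = inactivity_interval
        self.entries: dict[int, FlowEntry] = {}

    def forward(self, flow_id: int, now: float, link_quality: dict[str, int]) -> str:
        """Return the egress link for a packet of flow_id that arrives at time now."""
        best = max(link_quality, key=link_quality.get)
        entry = self.entries.get(flow_id)
        if entry is None:
            # New flow: select the best link and record it in the table.
            self.entries[flow_id] = FlowEntry(best, now)
            return best
        if now - entry.last_seen > self.inactivity_interval:
            # The flow paused longer than the inactivity timer, so recheck
            # whether the recorded link is still the optimal one.
            if link_quality[entry.egress] < link_quality[best]:
                entry.egress = best
        entry.last_seen = now
        return entry.egress

table = FlowsetTable(inactivity_interval=1.0)
print(table.forward(1, 0.0, {"link-a": 7, "link-b": 3}))  # new flow
print(table.forward(1, 0.5, {"link-a": 2, "link-b": 7}))  # gap shorter than the timer
print(table.forward(1, 2.0, {"link-a": 2, "link-b": 7}))  # gap longer than the timer
```

In this model the flow stays on its original link while packets keep arriving within the inactivity interval, even after that link's quality drops, and moves only after a pause longer than the interval.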
You (the network administrator) can increase the flowset table size to change the distribution of the flowset table entries among the ECMP groups. The more entries an ECMP group has in the flowset table, the more flows the ECMP group can accommodate. In environments such as AI-ML data centers that must handle large numbers of flows, it is particularly useful for DLB to use a larger flowset table size. When each ECMP group can accommodate a large number of flows, DLB achieves better flow distribution across the ECMP member links.
The flowset table holds 32,768 total entries, and these entries are divided equally among the DLB ECMP groups. The flowset table size for each ECMP group ranges from 256 through 32,768. Use the following formula to calculate the number of ECMP groups:
32,768/(flowset size) = Number of ECMP groups
By default, the flowset size is 256 entries, so by default there are 128 ECMP groups.
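The formula can be checked with a few lines of Python. This is a minimal sketch: the function name ecmp_group_count is hypothetical, and the integer division is an assumption for sizes that do not divide 32,768 evenly, because the text does not say which intermediate sizes a device accepts.

```python
TOTAL_FLOWSET_ENTRIES = 32_768   # total entries in the flowset table
MIN_FLOWSET_SIZE = 256           # smallest per-group size (the default)
MAX_FLOWSET_SIZE = 32_768        # largest per-group size

def ecmp_group_count(flowset_size: int) -> int:
    """Return the number of DLB ECMP groups for a given per-group flowset size."""
    if not MIN_FLOWSET_SIZE <= flowset_size <= MAX_FLOWSET_SIZE:
        raise ValueError("flowset size must be from 256 through 32,768")
    # Assumption: whole groups only, so any remainder is discarded.
    return TOTAL_FLOWSET_ENTRIES // flowset_size

for size in (256, 512, 2048, 32_768):
    print(f"flowset size {size:>6}: {ecmp_group_count(size):>3} ECMP groups")
```

The trade-off is visible in the output: each doubling of the flowset size halves the number of DLB-capable ECMP groups, down to a single group at 32,768 entries.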
Benefits
- Improve load distribution over egress links.
- Group flows to minimize how many calculations DLB has to make for each flow.
- Customize flowset table entry allocation for maximum efficiency.
- Increase the efficiency of flowlet mode.
Configuration
Be aware of the following when configuring the flowset table size:
- When you change the flowset size, the scale of ECMP DLB groups also changes. Allocating a flowset table size greater than 256 reduces the number of DLB-capable ECMP groups.
- When you commit this configuration, traffic can drop during the configuration change.
- DLB is not supported when a link aggregation group (LAG) is one of the egress members of ECMP.
- Only underlay fabrics support DLB.
- QFX5240 switch ports with a speed less than 50 Gbps do not support DLB.
Platform Support
See Feature Explorer for platform and release support.
Reactive Path Rebalancing
Overview
Dynamic load balancing (DLB) is an important tool for handling the large data flows (also known as elephant flows) inherent in AI-ML data center fabrics. Reactive path rebalancing is an enhancement to existing DLB features.
In the flowlet mode of DLB, you (the network administrator) configure an inactivity interval. The traffic uses the assigned outgoing (egress) interface until the flow pauses for longer than the inactivity timer. If the outgoing link quality deteriorates gradually, the pause within the flow might not exceed the configured inactivity timer. In this case, classic flowlet mode does not reassign the traffic to a different link, so the traffic cannot utilize a better-quality link. Reactive path rebalancing addresses this limitation by enabling the user to move the traffic to a better-quality link even when flowlet mode is enabled.
The device assigns a quality band to each equal-cost multipath (ECMP) egress member link that is based on the traffic flowing through the link. The quality band depends on the port load and the queue buffer. The port load is the number of egress bytes transmitted. The queue buffer is the number of bytes waiting to be transmitted from the egress port. You can customize these attributes based on the traffic pattern flowing through the ECMP.
Benefits
- Scalable solution to link degradation
- Optimal use of bandwidth for large data flows
- Avoidance of load balancing inefficiencies due to long-lived flows
Configuration
Configuration Overview
Quality bands are numbered from 0 through 7, where 0 is the lowest quality and 7 is the highest quality. Based on the member port load and queue size, DLB assigns a quality band value to the member port. The port-to-quality band mapping changes based on instantaneous port load and queue size.
When both of the following conditions are met, reactive path rebalancing reassigns a flow to a higher-quality member link:
- A better-quality member link is available whose quality band is equal to or greater than the current member's quality band plus the configured reassignment quality delta value. The quality delta is the difference between the two quality bands. Configure the quality delta value using the quality-delta statement.
- The packet random value that the system generates is lower than the reassignment probability threshold value. Configure the probability threshold value using the prob-threshold statement.
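The two conditions can be written as a single decision function. This is a minimal Python sketch, not the device algorithm: the function name should_reassign is hypothetical, and the value range of the packet random value and of the probability threshold is an assumption, because the text does not specify it.

```python
def should_reassign(current_band: int, candidate_band: int,
                    quality_delta: int, prob_threshold: int,
                    packet_random: int) -> bool:
    """Return True when both reassignment conditions from the text hold."""
    # Condition 1: the candidate link is better than the current link
    # by at least the configured quality delta.
    delta_met = candidate_band >= current_band + quality_delta
    # Condition 2: the per-packet random value is lower than the
    # configured probability threshold.
    probability_met = packet_random < prob_threshold
    return delta_met and probability_met

# The random value 37 and threshold 100 are made-up illustration values.
print(should_reassign(4, 6, quality_delta=2, prob_threshold=100, packet_random=37))
print(should_reassign(5, 6, quality_delta=2, prob_threshold=100, packet_random=37))
```

The first call returns True because band 6 is at least band 4 plus a delta of 2. The second returns False because band 6 is only one band above band 5. A higher probability threshold makes the second condition easier to satisfy, so flows are reassigned more readily.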
Be aware of the following when using this feature:
- Reactive path rebalancing is a global configuration and applies to all ECMP DLB configurations in the system.
- You can configure egress quantization in addition to reactive path rebalancing to control the flow reassignment.
- Packet reordering can occur when the flow moves from one port to another. Configuring reactive path rebalancing can cause momentary out-of-order issues when the flow is reassigned to the new link.
Topology
In this topology, the device has three ingress ports and two egress ports. Two of the ingress streams are Layer 2 (L2) traffic and one is Layer 3 (L3) traffic. The figure shows the table entries forwarding the traffic to each of the egress ports. All the ingress and egress ports are of the same speed.
In this topology, reactive path rebalancing works as follows:
- A quality delta of 2 is configured.
- L2 stream 1 (mac 0x123) enters ingress port et-0/0/0 with a rate of 10 percent. It exits through et-0/0/10. The egress link utilization of et-0/0/10 is 10 percent and the quality band value is 6.
- The L3 stream enters port et-0/0/1 with a rate of 50 percent. It exits through et-0/0/11 and selects the optimal link from the ECMP member list. The egress link utilization of et-0/0/11 is 50 percent with a quality band value of 5.
- L2 stream 2 (mac 0x223) enters port et-0/0/2 with a rate of 40 percent. It also exits through et-0/0/11. This further degrades the et-0/0/11 link quality band value to 4. Now the difference in the quality band values of both ECMP member links is 2.
- The reactive path rebalancing algorithm now becomes operational because the difference in quality band values for ports et-0/0/10 and et-0/0/11 is equal to or higher than the configured quality delta of 2. The algorithm moves the L3 stream from et-0/0/11 to a better-quality member link, which in this case is et-0/0/10.
- After the L3 stream moves to et-0/0/10, the et-0/0/10 link utilization increases to 60 percent with a decrease in quality band value to 5. L2 stream 2 continues to exit through et-0/0/11. The et-0/0/11 link utilization is now 40 percent with an increase in quality band value to 5.
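The link utilization figures in this walkthrough follow from summing the stream rates on each egress port. The following minimal Python sketch reproduces that arithmetic only. The utilization helper is hypothetical, and the sketch does not model quality bands, because the text does not give the mapping from port load and queue size to a band value.

```python
def utilization(placement: dict[str, str], rates: dict[str, int]) -> dict[str, int]:
    """Sum the stream rates (percent of link speed) carried by each egress port."""
    totals: dict[str, int] = {}
    for stream, port in placement.items():
        totals[port] = totals.get(port, 0) + rates[stream]
    return totals

# Stream rates from the topology description, in percent of link speed.
rates = {"L2 stream 1": 10, "L3 stream": 50, "L2 stream 2": 40}

# Placement before rebalancing: both the L3 stream and L2 stream 2 use et-0/0/11.
before = {
    "L2 stream 1": "et-0/0/10",
    "L3 stream": "et-0/0/11",
    "L2 stream 2": "et-0/0/11",
}
# Placement after reactive path rebalancing moves the L3 stream.
after = {**before, "L3 stream": "et-0/0/10"}

print("before:", utilization(before, rates))
print("after: ", utilization(after, rates))
```

Before the move, et-0/0/11 carries 90 percent and et-0/0/10 carries 10 percent. After the move, the loads are 60 percent on et-0/0/10 and 40 percent on et-0/0/11, which matches the values in the description.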
Configure Reactive Path Rebalancing
Platform Support
See Feature Explorer for platform and release support.
Change History Table
Feature support is determined by the platform and release you are using. Use Feature Explorer to determine if a feature is supported on your platform.