PFC Using DSCP at Layer 3 for Untagged Traffic

12-Nov-24

Overview

AI and ML applications are rapidly expanding in data centers. A critical challenge with AI and ML workloads is the sheer size of the data sets they process. Offloading the computation to graphics processing units (GPUs) can significantly speed up processing. However, the data and the model, especially with large language models (LLMs), often exceed the memory capacity of a single GPU. As a result, you commonly need multiple GPUs to achieve reasonable job completion times, especially for training.

The performance of an AI data center depends on the number of GPUs in use and on the efficiency of the network that connects them. Slowdowns in the network can lead to underutilized GPUs and longer job completion times. Ethernet-based networks are becoming more popular as an alternative to InfiniBand for AI data center networking. One such approach is Remote Direct Memory Access (RDMA) over Converged Ethernet version 2 (RoCEv2).

RoCEv2 encapsulates RDMA protocol packets within UDP packets for transport over Ethernet networks. The RoCEv2 protocol uses priority-based flow control (PFC) to establish a drop-free network, while data center quantized congestion notification (DCQCN) provides end-to-end congestion control for RoCEv2. Junos OS Evolved supports DCQCN by combining explicit congestion notification (ECN) and PFC to enable end-to-end lossless AI Ethernet networking.

To support lossless IPv6 traffic across Layer 3 (L3) connections to Layer 2 (L2) subnetworks, you can configure PFC to operate using 6-bit Differentiated Services code point (DSCP) values from L3 headers of untagged VLAN traffic. You can use PFC with DSCP as an alternative to IEEE 802.1p priority values in L2 VLAN-tagged packet headers. You need DSCP-based PFC to support RoCEv2.

Benefits

  • Utilize Ethernet-based networks for AI-ML data center networking.

  • Improve network efficiency for large data sets.

  • Enable end-to-end lossless AI-ML Ethernet networking.

Configuration

Enable DSCP-Based PFC

  1. Map a forwarding class (FC) to a PFC priority using the pfc-priority statement.
    set class-of-service forwarding-classes class class-name pfc-priority pfc-priority
    set class-of-service forwarding-classes class class-name queue-num queue-num
    set class-of-service forwarding-classes class class-name no-loss
  2. Define a congestion notification profile to enable PFC on traffic specified by a 6-bit DSCP value. Map the code-point configuration to no-loss queues.
    set class-of-service congestion-notification-profile cnp input dscp code-point dscp-value pfc
  3. Configure a DSCP classifier that maps the DSCP value to the PFC-mapped FC.
    set class-of-service classifiers dscp classifier-name forwarding-class class-name loss-priority <low/medium-high/high> code-points dscp-value
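
The three steps above can be sketched with concrete values. This is an illustrative fragment only, not a recommended configuration: the forwarding class name rocev2, the classifier name dscp-roce, PFC priority 3, queue 3, and the DSCP code point 011010 (decimal 26) are all hypothetical values chosen for the example. The profile name cnp is the one used in step 2. Lines beginning with # are annotations for the reader, not commands.

```
# Step 1: map the (hypothetical) forwarding class rocev2 to PFC priority 3
# and to no-loss queue 3.
set class-of-service forwarding-classes class rocev2 pfc-priority 3
set class-of-service forwarding-classes class rocev2 queue-num 3
set class-of-service forwarding-classes class rocev2 no-loss

# Step 2: enable PFC for traffic that carries DSCP 011010 (assumed value).
set class-of-service congestion-notification-profile cnp input dscp code-point 011010 pfc

# Step 3: classify the same DSCP value into the PFC-mapped forwarding class.
set class-of-service classifiers dscp dscp-roce forwarding-class rocev2 loss-priority low code-points 011010
```

The same DSCP value appears in both the congestion notification profile and the classifier, so traffic classified into the no-loss forwarding class is also the traffic on which PFC is enabled.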

Verify the Configuration

  1. Check the PFC priority information on the ingress port.
    show interfaces interface-name extensive | match Priority
  2. Check the queue statistics for the interface.
    show interfaces queue interface-name
  3. Display the DSCP-based input CNP.
    show class-of-service congestion-notification-profile cnp name
  4. Display which FCs are mapped to each PFC priority.
    show class-of-service forwarding-classes

Platform Support

See Feature Explorer for platform and release support.
