Virtual Output Queues on PTX Series Packet Transport Routers

This section describes the virtual output queue (VOQ) architecture on PTX Series Packet Transport Routers and includes the following topics:

Introduction to Virtual Output Queues on PTX Series Packet Transport Routers

This topic introduces the virtual output queue (VOQ) architecture on PTX Series Packet Transport routers and how it operates with the configurable class-of-service (CoS) components on PTX Series routers.

Junos and PTX Series hardware CoS features use virtual output queues on the ingress to buffer and queue traffic for each egress output queue. The PTX Series router supports up to eight egress output queues per output port (physical interface).

The traditional method of forwarding traffic through a router is to buffer ingress traffic in input queues on ingress interfaces, forward the traffic across the fabric to output queues on egress interfaces, and then buffer the traffic again in the output queues before transmitting it to the next hop. Traditionally, an ingress port stores traffic destined for different egress ports in the same input queue (buffer).

During periods of congestion, the router might drop packets at the egress port, so the router might spend resources transporting traffic across the switch fabric to an egress port, only to drop that traffic instead of forwarding it. And because input queues store traffic destined for different egress ports, congestion on one egress port could affect traffic on a different egress port, a condition called head-of-line blocking (HOLB).

Virtual output queue (VOQ) architecture takes a different approach:

  • Instead of separate physical buffers for input and output queues, the PTX Series router uses the physical buffers on the ingress pipeline of each Packet Forwarding Engine to store traffic for every egress port. Every output queue on an egress port has buffer storage space on every ingress pipeline on all of the Packet Forwarding Engines on the router. The mapping of ingress pipeline storage space to output queues is 1-to-1, so each output queue receives buffer space on each ingress pipeline.

  • Instead of one input queue containing traffic destined for multiple different output queues (a one-to-many mapping), each output queue has a dedicated VOQ composed of the input buffers on each Packet Forwarding Engine that are dedicated to that output queue (a one-to-one mapping). This architecture prevents communication between any two ports from affecting another port.

  • Instead of storing traffic on a physical output queue until it can be forwarded, a VOQ does not transmit traffic from the ingress port across the fabric to the egress port until the egress port has the resources to forward the traffic.

A VOQ is a collection of input queues (buffers) that receive and store traffic destined for one output queue on one egress port. Each output queue on each egress port has its own dedicated VOQ, which consists of all of the input queues that are sending traffic to that output queue.

VOQ Architecture

A VOQ represents the ingress buffering for a particular output queue. Each of the Packet Forwarding Engines in the PTX Series router dedicates ingress buffer space to that output queue. The traffic stored on those Packet Forwarding Engines, destined for one particular output queue on one port, is the VOQ for that output queue.

A VOQ is distributed across all of the Packet Forwarding Engines in the router that are actively sending traffic to that output queue. Each VOQ is the sum of the buffers assigned to that output queue across all of the Packet Forwarding Engines in the router. So the output queue itself is virtual, not physical, although it is composed of physical input queues.

Round-Trip Time Buffering

Although there is no output queue buffering during periods of congestion (no long-term storage), there is a small physical output queue buffer on egress line cards to accommodate the round-trip time for traffic to traverse the fabric from ingress to egress. The round-trip time consists of the time it takes the ingress port to request egress port resources, receive a grant from the egress port for resources, and transmit the data across the fabric.

That means that if a packet is not dropped at the router ingress and the router forwards the packet across the fabric to the egress port, the packet is not dropped at the egress; it is forwarded to the next hop. All packet drops take place in the ingress pipeline.

VOQ Advantages

VOQ architecture provides two major advantages:

Eliminate Head-of-Line Blocking

VOQ architecture eliminates head-of-line blocking (HOLB) issues. On non-VOQ switches, HOLB occurs when congestion at an egress port affects a different egress port that is not congested. HOLB occurs when the congested port and the non-congested port share the same input queue on an ingress interface.

VOQ architecture avoids HOLB by creating a different dedicated virtual queue for each output queue on each interface.

Because different egress queues do not share the same input queue, a congested egress queue on one port cannot affect an egress queue on a different port. For the same reason, a congested egress queue on one port cannot affect another egress queue on the same port—each output queue has its own dedicated virtual output queue composed of ingress interface input queues.

Performing queue buffering at the ingress interface ensures that the router only sends traffic across the fabric to an egress queue if that egress queue is ready to receive that traffic. If the egress queue is not ready to receive traffic, the traffic remains buffered at the ingress interface.

Increase Fabric Efficiency and Utilization

Traditional output queue architecture has some inherent inefficiencies that VOQ architecture addresses.

  • Packet buffering—Traditional queueing architecture buffers each packet twice in long-term DRAM storage, once at the ingress interface and once at the egress interface. VOQ architecture buffers each packet only once in long-term DRAM storage, at the ingress interface. The fabric is fast enough to be transparent to egress CoS policies, so instead of buffering packets a second time at the egress interface, the router can forward traffic at a rate that does not require deep egress buffers, without affecting the configured egress CoS policies (scheduling).

  • Consumption of resources—Traditional queueing architecture sends packets from the ingress interface input queue (buffer), across the fabric, to the egress interface output queue (buffer). At the egress interface, packets might be dropped, even though the router has expended resources transporting the packets across the fabric and storing them in the egress queue. VOQ architecture does not send packets across the fabric to the egress interface until the egress interface is ready to transmit the traffic. This increases system utilization because no resources are wasted transporting and storing packets that are dropped later.

Does VOQ Change How I Configure CoS?

There are no changes to the way you configure the CoS features. Figure 1 shows the Junos OS and PTX Series hardware CoS components and VOQ selection, illustrating the sequence in which they interact.

Figure 1: Packet Flow Through CoS Components on PTX Series Routers

The VOQ selection process is performed by ASICs that use either the behavior aggregate (BA) classifier or the multifield classifier, depending on your configuration, to select one of the eight possible virtual output queues for an egress port. The virtual output queues on the ingress buffer data for the egress port based on your CoS configuration.
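For example, a minimal behavior aggregate (BA) classification sketch might look like the following. The classifier name (voq-ba-classifier), interface name (et-0/0/0), forwarding-class names, and queue numbers are illustrative placeholders, not values from this topic; the queue-num mapping is what determines which of the eight virtual output queues buffers the traffic on the ingress:

    set class-of-service forwarding-classes class best-effort queue-num 0
    set class-of-service forwarding-classes class expedited-forwarding queue-num 5
    set class-of-service classifiers dscp voq-ba-classifier forwarding-class expedited-forwarding loss-priority low code-points ef
    set class-of-service classifiers dscp voq-ba-classifier forwarding-class best-effort loss-priority low code-points be
    set class-of-service interfaces et-0/0/0 unit 0 classifiers dscp voq-ba-classifier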

Although the CoS features do not change, there are some operational differences with VOQ:

  • Random early detection (RED) occurs on the ingress Packet Forwarding Engines. With routers that support only egress output queuing, RED and associated congestion drops occur on the egress. Performing RED on the ingress saves valuable resources and increases router performance.

    Although RED occurs on the ingress with VOQ, there is no change to how you configure the drop profiles.

  • Fabric scheduling is controlled through request and grant control messages. Packets are buffered in ingress virtual output queues until the egress Packet Forwarding Engine sends a grant message to the ingress Packet Forwarding Engine indicating it is ready to receive them. For details on fabric scheduling, see Fabric Scheduling and Virtual Output Queues on PTX Series Routers.

Understanding How VOQ Works on PTX Series Routers

This topic describes how the VOQ process works on PTX Series routers.

Understanding the Components of the VOQ Process

Figure 2 shows the hardware components of the PTX Series routers involved in the VOQ process.

Figure 2: VOQ Components on PTX Series Routers

These components perform the following functions:

  • Physical Interface Card (PIC)—Provides the physical connection to various network media types, receiving incoming packets from the network and transmitting outgoing packets to the network.

  • Flexible PIC Concentrator (FPC)—Connects the PICs installed in it to the other packet transport router components. You can have up to eight FPCs per chassis.

  • Packet Forwarding Engine—Provides Layer 2 and Layer 3 packet switching and encapsulation and de-encapsulation, forwarding and route lookup functions, and manages packet buffering and the queuing of notifications. The Packet Forwarding Engine receives incoming packets from the PICs installed on the FPC and forwards them through the switch planes to the appropriate destination port.

  • Output queues—(Not shown) PTX Series routers support up to eight output queues per output port (physical interface). These output queues are controlled by the CoS scheduler configuration, which establishes how to handle the traffic within the output queues for transmission onto the switch fabric. In addition, these egress output queues control when packets are sent from the virtual output queues on the ingress to the egress output queues.

Understanding the VOQ Process

PTX Series routers support up to eight output queues per output port (physical interface). These output queues are controlled by the CoS scheduler configuration, which establishes how to handle the traffic within the output queues for transmission onto the fabric. In addition, these egress output queues control when packets are sent from the virtual output queues on the ingress to the egress output queues.
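As an illustration of how the scheduler configuration shapes these output queues, the following sketch assigns transmit rates and buffer sizes to two queues and applies the scheduler map to an egress port. The scheduler, scheduler-map, forwarding-class, and interface names are placeholders:

    set class-of-service schedulers be-sched transmit-rate percent 70
    set class-of-service schedulers be-sched buffer-size percent 70
    set class-of-service schedulers ef-sched transmit-rate percent 30
    set class-of-service schedulers ef-sched buffer-size percent 30
    set class-of-service scheduler-maps voq-map forwarding-class best-effort scheduler be-sched
    set class-of-service scheduler-maps voq-map forwarding-class expedited-forwarding scheduler ef-sched
    set class-of-service interfaces et-0/0/0 scheduler-map voq-map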

For every egress output queue, the VOQ architecture provides virtual queues on each and every ingress Packet Forwarding Engine. These queues are referred to as virtual because the queues physically exist on the ingress Packet Forwarding Engine only when the line card actually has packets enqueued for that output queue.

Figure 3 shows three ingress Packet Forwarding Engines—PFE0, PFE1, and PFE2. Each ingress Packet Forwarding Engine provides up to eight virtual output queues (PFEn.e0.q0 through PFEn.e0.q7) for the single egress port 0. The egress Packet Forwarding Engine (PFE N) distributes its bandwidth to each ingress VOQ in round-robin fashion.

For example, suppose egress PFE N's output queue e0.q0 has 10 Gbps of bandwidth available to it, PFE0 has an offered load of 10 Gbps to e0.q0, and PFE1 and PFE2 each have an offered load of 1 Gbps to e0.q0. Round-robin servicing gives each ingress Packet Forwarding Engine an equal opportunity to send; because PFE1 and PFE2 use only 1 Gbps each, the remaining 8 Gbps is available to PFE0. The result is that PFE1 and PFE2 get 100 percent of their traffic through, while PFE0 gets only 80 percent of its traffic through.

Figure 3: Virtual Output Queues on PTX Series Routers

Figure 4 illustrates an example of the correlation between the egress output queues and the ingress virtual output queues. On the egress side, PFE-X has a 100 Gbps port, which is configured with four different forwarding classes. As a result, the 100 Gbps egress output port on PFE-X uses four out of eight available egress output queues (as denoted by the four queues highlighted with dashed-orange lines on PFE-X), and the VOQ architecture provides four corresponding virtual output queues on each ingress Packet Forwarding Engine (as denoted by the four virtual queues on PFE-A and PFE-B highlighted with dashed-orange lines). The virtual queues on PFE-A and PFE-B exist only when there is traffic to be sent.

Figure 4: Example of VOQ

Fabric Scheduling and Virtual Output Queues on PTX Series Routers

This topic describes the fabric scheduling process on PTX Series routers that use VOQ.

VOQ uses request and grant messages to control fabric scheduling on PTX Series routers. The egress Packet Forwarding Engines control data delivery from the ingress virtual output queues by using request and grant messages. The virtual queues buffer packets on the ingress until the egress Packet Forwarding Engine confirms that it is ready to receive them by sending a grant message to the ingress Packet Forwarding Engine.

Figure 5: Fabric Scheduling and Virtual Output Queues Process

Figure 5 illustrates the fabric scheduling process used by PTX Series routers with VOQ. When a packet arrives at an ingress port, the ingress pipeline stores the packet in the ingress queue associated with the destination output queue. The router makes the buffering decision after performing the packet lookup. If the packet belongs to a forwarding class for which the maximum traffic threshold has been exceeded, the packet may not be buffered and might be dropped. The scheduling process works as follows:

  1. An ingress Packet Forwarding Engine receives a packet and buffers it in virtual queues, then groups the packet with other packets destined for the same egress interface and data output queue.

  2. The ingress line card Packet Forwarding Engine sends a request, which contains a reference to the packet group, over the fabric to the egress Packet Forwarding Engine.

  3. When there is available egress bandwidth, the egress line card grant scheduler responds by sending a bandwidth grant to the ingress line card Packet Forwarding Engine.

  4. When the ingress line card Packet Forwarding Engine receives the grant from the egress line card Packet Forwarding Engine, the ingress Packet Forwarding Engine segments the packet group and sends all of the pieces over the fabric to the egress Packet Forwarding Engine.

  5. The egress Packet Forwarding Engine receives the pieces, reassembles them into the packet group, and enqueues individual packets to a data output queue corresponding to the virtual output queue.

Ingress packets remain in the VOQ on the ingress port input queues until the output queue is ready to accept and forward more traffic.

Under most conditions, the fabric is fast enough to be transparent to egress class-of-service (CoS) policies, so the process of forwarding traffic from the ingress pipeline, across the fabric, to egress ports, does not affect the configured CoS policies for the traffic. The fabric only affects CoS policy if there is a fabric failure or if there is an issue of port fairness.

When a packet ingresses and egresses the same Packet Forwarding Engine (local switching), the packet does not traverse the fabric. However, locally switched packets use the same request and grant mechanism to receive egress bandwidth as packets that cross the fabric, so locally switched packets and packets that arrive at a Packet Forwarding Engine after crossing the fabric are treated fairly when they vie for the same output queue.

Understanding the Packet Forwarding Engine Fairness and Virtual Output Queue Process

This topic describes the Packet Forwarding Engine fairness scheme used with VOQ on PTX Series routers.

Packet Forwarding Engine fairness means that all Packet Forwarding Engines are treated equally from an egress perspective. If multiple ingress Packet Forwarding Engines need to transmit data to the same output queue, the egress Packet Forwarding Engine services their virtual output queues in round-robin fashion. Servicing of virtual output queues is not dependent upon the load that is present at each of the source Packet Forwarding Engines.

Figure 6 illustrates the Packet Forwarding Engine fairness scheme used with VOQ in a simple example with three Packet Forwarding Engines. Ingress PFE-A has a single stream of 10 Gbps data destined for VOQx on PFE-C. PFE-B has a single stream of 100 Gbps data also destined for VOQx on PFE-C. On PFE-C, VOQx is serviced by a 100 Gbps interface and that is the only active virtual output queue on that interface.

Figure 6: Packet Forwarding Engine Fairness with Virtual Output Queue Process

In Figure 6, we have a total of 110 Gbps of source data destined for a 100 Gbps output interface. As a result, we need to drop 10 Gbps of data. Where does the drop occur and how does this drop affect traffic from PFE-A versus PFE-B?

Because PFE-A and PFE-B are serviced in round-robin fashion by egress PFE-C, all 10 Gbps of traffic from PFE-A makes it through to the egress output port. However, 10 Gbps of data is dropped on PFE-B, allowing only 90 Gbps of data from PFE-B to be sent to PFE-C. So, the 10 Gbps stream has a 0% drop and the 100 Gbps stream has only a 10% drop.

However, if PFE-A and PFE-B were each sourcing 100 Gbps of data, they would each drop 50 Gbps of data. This is because the egress PFE-C controls the servicing and drain rate of the ingress virtual queues using the round-robin algorithm. With round-robin servicing, higher-bandwidth sources are always penalized when multiple sources are present: the algorithm attempts to equalize the bandwidth of the sources, and because it cannot raise the bandwidth of the slower source, it drops traffic from the faster source until the sources have equal egress bandwidth.

Each ingress Packet Forwarding Engine provides up to eight virtual output queues for a single egress port. The egress Packet Forwarding Engine distributes its bandwidth equally to each ingress virtual output queue, so the ingress virtual output queues receive equal treatment regardless of their presented load. The drain rate of a queue is the rate at which traffic leaves the queue. The egress Packet Forwarding Engine divides its bandwidth for each output queue equally across the ingress Packet Forwarding Engines, so the drain rate of each ingress Packet Forwarding Engine = (drain rate of the output queue) / (number of ingress Packet Forwarding Engines).

Handling Congestion

There are two main types of congestion that can occur:

  • Ingress congestion — Occurs when the ingress Packet Forwarding Engine has more offered load than the egress can handle. The ingress congestion case is very similar to a traditional router in that the queues build up and, once they cross their configured threshold, packets are dropped.

  • Egress congestion — Occurs when the sum of the offered loads from all of the ingress Packet Forwarding Engines exceeds the capacity of the egress. All drops are performed on the ingress Packet Forwarding Engines. However, the size of the ingress queue is attenuated by the queue's drain rate (how fast the egress Packet Forwarding Engine is requesting packets). This rate is essentially determined by the rate at which requests are converted into grants by the egress Packet Forwarding Engine. The egress Packet Forwarding Engine services the request-to-grant conversion in round-robin fashion; it is not dependent on the offered load of the ingress Packet Forwarding Engines. For instance, if an ingress Packet Forwarding Engine's drain rate is half of what it expects (as is the case when two ingress Packet Forwarding Engines present an oversubscribed load for the target output queue), then the ingress Packet Forwarding Engine reduces the size of that queue to half of its original size (its size when it was getting its full drain rate).

VOQ Queue-depth Monitoring

VOQ queue-depth monitoring, or latency monitoring, measures peak queue occupancy of a VOQ. This feature enables the reporting of peak queue length for a given physical interface for each individual Packet Forwarding Engine (PFE).

Note:

In addition to the peak queue-length data, each queue also maintains drop statistics and time-averaged queue length on the ingress data path. Also, each queue maintains queue transmission statistics on the egress data path.

In a typical deployment scenario that uses strict-priority scheduling, a high-priority queue can starve low-priority queues, so packets in the low-priority queues can remain queued longer than desired. You can use this VOQ queue-depth monitoring feature, along with queue transmission statistics, to detect such stalled conditions.
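As a brief illustration, a scheduler configuration such as the following (the scheduler names are placeholders), applied through a scheduler map, gives one queue strict-high priority. As long as that queue has traffic, it is serviced ahead of the low-priority queue, which is the kind of condition that queue-depth monitoring, together with queue transmission statistics, helps you detect:

    set class-of-service schedulers voice-sched priority strict-high
    set class-of-service schedulers data-sched priority low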

Note:

You can only enable VOQ queue-depth monitoring on transit WAN interfaces.

To enable VOQ queue-depth monitoring on an interface, you first create a monitoring profile, and then attach that profile to the interface. If you attach a monitoring profile to an aggregated Ethernet (ae-) interface, each member interface has its own dedicated hardware VOQ monitor, unless you also apply the shared option to the monitoring profile attached to the ae- interface.

By default, a monitoring profile that you assign to an ae- interface replicates across all members of the ae- interface. The monitoring profile also reports VOQ depth individually on each interface. On large systems, this process can quickly consume the maximum supported hardware monitoring profile IDs. To conserve monitoring profile IDs, include the shared option at the [set class-of-service interfaces ae-interface monitoring-profile profile-name] hierarchy level. The configured shared option creates only one monitoring profile ID to share across all member interfaces. The option also reports the largest peak on a member interface as the common peak for the ae- interface.
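For example, assuming an aggregated Ethernet interface named ae0 and a monitoring profile named mp-1 (both placeholder names; the profile definition is shown later in this topic), the shared option is added at the hierarchy level described above:

    set class-of-service interfaces ae0 monitoring-profile mp-1 shared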

Note:

You cannot enable the shared option on mixed mode ae- interfaces.

Each monitoring profile consists of one or more export filters. An export filter defines a peak queue-length percentage threshold for one or more queues on the physical interface. Once the defined peak queue-length percentage threshold is met for any queue in the export filter, Junos exports the VOQ telemetry data for all queues in the export filter.

Note:

The queue-depth monitoring data goes out only through a telemetry channel. In addition to configuring a monitoring profile (as shown below), you must initiate a regular sensor subscription for the data to be exported. There is no CLI display option.

Configure VOQ Queue-depth Monitoring

Configure VOQ queue-depth monitoring to export queue utilization data. You can use this data to monitor micro-bursts and also assist in identifying stalled transit output queues. To configure VOQ queue-depth monitoring:

  1. Configure the monitoring profile.
  2. Attach the monitoring profile to an interface.

To configure the monitoring profile (a consolidated example follows these steps):

  1. Name the monitoring profile.
  2. Name an export filter for the monitoring profile.
  3. Define which queues (0 through 7) belong to the export filter.
  4. (Optional) Define the peak queue-length percentage threshold at which to export VOQ telemetry data. The default percentage is 0.
  5. (Optional) Define one or more additional export filters for the monitoring profile.
  6. Commit your changes.
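The following set-style commands are a consolidated sketch of steps 1 through 5. The profile name mp-1, the export-filter names ef1 and ef2, the queue lists, and the 80 percent threshold are placeholder values, and the statement hierarchy is an assumption based on the monitoring-profile, export-filter, queues, and peak-queue-length percent terminology used in this topic; verify the exact statement names against the CLI on your Junos release. Export filter ef2 intentionally omits peak-queue-length, so it uses the default threshold of 0 percent:

    set class-of-service monitoring-profiles mp-1 export-filters ef1 queues [ 0 1 2 3 ]
    set class-of-service monitoring-profiles mp-1 export-filters ef1 peak-queue-length percent 80
    set class-of-service monitoring-profiles mp-1 export-filters ef2 queues [ 4 5 ]
    commit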

To attach the monitoring profile to an interface (an example follows these steps):

  1. Attach the monitoring profile to the interface.
  2. Commit your changes.
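For example, assuming a transit WAN interface named et-0/0/1 and the profile mp-1 from the sketch above (placeholder names), the attachment follows the hierarchy shown earlier in this topic:

    set class-of-service interfaces et-0/0/1 monitoring-profile mp-1
    commit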

Check your configuration by running show commands, as in the example below.
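For example, you can display the relevant parts of the committed configuration (the interface name is the placeholder used above). Remember that the queue-depth measurements themselves are exported only through telemetry; there is no show command that displays them:

    show configuration class-of-service
    show configuration class-of-service interfaces et-0/0/1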

Note:

As the configuration sketch above illustrates, not setting a peak-queue-length percent for an export filter leaves the threshold at the default of 0 percent, as export filter ef2 shows. The sketch also shows different queues on the same physical interface having different peak queue-length thresholds for exporting VOQ telemetry data.