PFC Watchdog
PFC Watchdog Overview
Priority-based flow control (PFC) pause frames are used in lossless Ethernet to pause the link partner from sending packets. These PFC pause frames can propagate through the whole network and can cause the traffic on the PFC streams to halt. Use the PFC watchdog to detect and resolve PFC pause storms.
The PFC watchdog monitors PFC-enabled ports for PFC pause storms. When a PFC-enabled port receives PFC pause frames for an extended period of time and PFC watchdog does not detect flow control frames on that port, PFC watchdog mitigates the situation. It does this by disabling the queue where the PFC pause storm was detected for a configurable length of time called the recovery time. After the recovery time passes, PFC watchdog re-enables the affected queue.
Use Feature Explorer to confirm platform and release support for specific features.
Understanding PFC Watchdog
The PFC watchdog has three functions: detection, mitigation, and restoration.
The PFC watchdog checks the status of PFC queues at regular intervals called polling intervals. If the PFC watchdog finds a PFC queue with a non-zero pause timer, it compares the queue's current transmit counter register to the last recorded value. If the PFC queue has not transmitted any packets since the last polling interval, the PFC watchdog checks if there are any packets in the queue. If there are packets on the queue that are not being transmitted and there are no flow control frames on that port, the PFC watchdog detects a stall condition.
After the PFC watchdog detects a stall condition, it disables the queue where it detected the PFC pause storm for a period of time called the recovery time. During that time, it flushes all packets in the queue and prevents new packets from being added to the queue. The system monitors all packet drops on the PFC queue during the recovery time.
When the recovery time ends, the PFC watchdog collects the ingress drop counters and any other drop counters associated with disabling the PFC queue. The PFC watchdog maintains a count of the packets lost during the last recovery and the total number of lost packets due to PFC mitigation since the device was started. The PFC watchdog then restores the queue and re-enables PFC.
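The three functions described above can be pictured as a small state machine. The following Python sketch is only an illustrative model, not the device implementation: the class and field names (QueueSample, WatchdogModel, pause_timer, tx_counter, depth) are hypothetical, the recovery timer is advanced in whole polling intervals for simplicity, and the check for flow control frames on the port is omitted.

```python
from dataclasses import dataclass


@dataclass
class QueueSample:
    """One polling-interval snapshot of a PFC queue (hypothetical fields)."""
    pause_timer: int   # non-zero while the link partner is pausing the queue
    tx_counter: int    # cumulative packets transmitted by the queue
    depth: int         # packets currently waiting in the queue


class WatchdogModel:
    """Toy model of detection, mitigation, and restoration for one queue."""

    def __init__(self, poll_ms=100, detection=2, recovery_ms=200):
        self.poll_ms = poll_ms
        self.detection = detection
        self.recovery_ms = recovery_ms
        self.state = "monitoring"
        self.stalled_polls = 0
        self.recovery_left_ms = 0
        self.last_tx = None
        self.last_drop_count = 0
        self.total_drop_count = 0
        self._drops_this_recovery = 0

    def poll(self, sample, arrivals=0):
        """Advance the model by one polling interval and return the state."""
        if self.state == "mitigating":
            # The queue is disabled, so newly arriving packets are dropped.
            self._drops_this_recovery += arrivals
            self.recovery_left_ms -= self.poll_ms
            if self.recovery_left_ms <= 0:
                # Restoration: record the drop counters, re-enable the queue.
                self.last_drop_count = self._drops_this_recovery
                self.total_drop_count += self._drops_this_recovery
                self.state = "monitoring"
                self.stalled_polls = 0
                self.last_tx = None
            return self.state

        # Detection: paused, nothing transmitted since last poll, queue not empty.
        stalled = (
            sample.pause_timer != 0
            and self.last_tx is not None
            and sample.tx_counter == self.last_tx
            and sample.depth > 0
        )
        self.last_tx = sample.tx_counter
        self.stalled_polls = self.stalled_polls + 1 if stalled else 0
        if self.stalled_polls >= self.detection:
            # Mitigation: flush the queued packets and start the recovery timer.
            self.state = "mitigating"
            self._drops_this_recovery = sample.depth
            self.recovery_left_ms = self.recovery_ms
        return self.state
```

In this model the first poll only records a baseline transmit counter, so a stall is declared on the poll that completes the configured number of consecutive stalled intervals.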
How to Configure PFC Watchdog
Enable PFC Watchdog
PFC watchdog only works for PFC queues. To designate a queue as a PFC queue, use the flow-control-queue statement with the queue number:
set class-of-service congestion-notification-profile cnp output ieee-802.1 code-point 011 flow-control-queue 3
set class-of-service congestion-notification-profile cnp output ieee-802.1 code-point 100 flow-control-queue 4
Enable PFC watchdog using the pfc-watchdog statement at the [edit class-of-service congestion-notification-profile profile-name] hierarchy level:
set class-of-service congestion-notification-profile profile-name pfc-watchdog
Enabling PFC watchdog on the congestion notification profile without configuring other options enables the PFC watchdog with the default values. By default, the polling interval is 100 ms, the detection period is set to 2 (that is, two polling intervals, or 200 ms), and the recovery time is 200 ms. To learn how to configure non-default values, read the following sections.
Detection
The PFC watchdog periodically monitors the PFC-enabled queues for continuous PFC pause assertion by the downstream device while packets remain in the queue without being transmitted. If this occurs, PFC watchdog detects a stall condition. The system must detect this stall condition within a specified amount of time. This length of time is determined by how you configure two statements: poll-interval and detection.
The PFC watchdog checks the status of the PFC queues once per polling interval. Configure this interval in milliseconds using the poll-interval statement. The default interval is 100 ms, which is also the minimum. The maximum is 1000 ms.
set class-of-service congestion-notification-profile profile-name pfc-watchdog poll-interval time
The PFC watchdog must detect stall conditions for at least two consecutive polling intervals before it determines that a PFC queue has stalled. Configure the detection statement to control how many polling intervals the PFC watchdog waits before it mitigates the stalled traffic. The default is two polling intervals. The maximum number is 10 polling intervals.
set class-of-service congestion-notification-profile profile-name pfc-watchdog detection number of polling intervals
The total detection time is the length of the polling interval multiplied by the number of polling intervals.
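As a worked example of this multiplication, the short Python sketch below computes the total detection time and rejects values outside the limits stated in this section. The function name total_detection_ms is hypothetical, and the lower bound of 2 for detection is an assumption drawn from the "at least two consecutive polling intervals" wording above.

```python
def total_detection_ms(poll_interval_ms=100, detection=2):
    """Return poll interval x detection count, checking the documented limits."""
    if not 100 <= poll_interval_ms <= 1000:
        raise ValueError("poll-interval must be between 100 and 1000 ms")
    if not 2 <= detection <= 10:
        raise ValueError("detection must be between 2 and 10 polling intervals")
    return poll_interval_ms * detection


# Defaults: 100 ms x 2 polling intervals.
print(total_detection_ms())          # 200
# A slower poll with a longer detection period: 250 ms x 4.
print(total_detection_ms(250, 4))    # 1000
```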
Mitigation
When the PFC watchdog detects that a PFC queue has stalled, it moves the queue to the mitigation state. Configure the pfc-watchdog-action statement to specify the action that the PFC watchdog takes to mitigate the traffic congestion. The only option is the drop action, which drops all queued packets and all newly arriving packets for the stalled PFC queue.
set class-of-service congestion-notification-profile profile-name pfc-watchdog watchdog-action drop
Restoration
Use the recovery statement to configure how long the PFC watchdog keeps the affected queue disabled before it restores PFC. The minimum recovery period is 200 ms and the maximum is 10,000 ms.
set class-of-service congestion-notification-profile profile-name pfc-watchdog recovery time
After the recovery time passes, the PFC watchdog re-enables PFC on the affected queues.
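To see the detection, mitigation, and restoration statements together, the following Python sketch assembles the set commands for one profile from chosen values. It is an illustration only: the helper name pfc_watchdog_commands is hypothetical, the statement spellings are copied from the command lines in this document, and the range check covers only the recovery limits stated above.

```python
def pfc_watchdog_commands(profile, poll_ms=100, detection=2, recovery_ms=200):
    """Build the PFC watchdog set commands for one congestion notification profile."""
    if not 200 <= recovery_ms <= 10000:
        raise ValueError("recovery must be between 200 and 10000 ms")
    base = (
        "set class-of-service congestion-notification-profile "
        f"{profile} pfc-watchdog"
    )
    return [
        f"{base} poll-interval {poll_ms}",
        f"{base} detection {detection}",
        f"{base} watchdog-action drop",
        f"{base} recovery {recovery_ms}",
    ]


# Example values: 200 ms poll, 3 polling intervals, 1000 ms recovery.
for line in pfc_watchdog_commands("cnp", poll_ms=200, detection=3, recovery_ms=1000):
    print(line)
```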
Verification
Use the following command to verify you have configured the PFC watchdog correctly:
show class-of-service congestion-notification-profile
Name: cnp, Index: 0
Type: Input
  Cable Length: 100
Type: Output
  Priority   Flow-Control-Queues
  011        3
  Priority   Flow-Control-Queues
  100        4
PFC Watchdog : enabled
PFC-action : drop
Polling Interval : 100 ms
Detection Time : 200 ms
Recovery Time : 200 ms
The detection time shown is the polling interval multiplied by the detection period. In this case, the polling interval is 100 ms and the detection time is 200 ms, so the configured detection period is two polling intervals.
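The same division can be scripted when checking many profiles. This Python sketch is a rough illustration that assumes the field labels appear exactly as in the sample output above; the function name detection_period is hypothetical.

```python
import re

# Excerpt of the verification output shown in this section.
SAMPLE = """\
PFC Watchdog : enabled
PFC-action : drop
Polling Interval : 100 ms
Detection Time : 200 ms
Recovery Time : 200 ms
"""


def detection_period(show_output):
    """Recover the configured detection period from the displayed timers."""
    fields = dict(
        re.findall(
            r"^\s*(Polling Interval|Detection Time)\s*:\s*(\d+) ms",
            show_output,
            re.MULTILINE,
        )
    )
    poll_ms = int(fields["Polling Interval"])
    detection_ms = int(fields["Detection Time"])
    # Detection time = polling interval x detection period.
    return detection_ms // poll_ms


print(detection_period(SAMPLE))  # 2
```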
Monitoring PFC Watchdog
You can view the number of PFC pause storms that have been detected and recovered, as well as the number of packets that have been dropped, on the PFC queues on an interface. Use the following command to view the PFC watchdog statistics on a particular interface.
show interfaces interface extensive
...
Priority Flow Control Watchdog Statistics:
             Detected   Recovered   LastPacketDropCount   TotalPacketDropCount
Queue : 0    0          0           0                     0
Queue : 1    0          0           0                     0
Queue : 2    0          0           0                     0
Queue : 3    0          0           0                     0
Queue : 4    0          0           0                     0
Queue : 5    0          0           0                     0
Queue : 6    0          0           0                     0
Queue : 7    0          0           0                     0
...
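If you collect this output from several interfaces, a small parser can pick out the queues that have seen PFC pause storms. The Python sketch below assumes each row has the "Queue : n" label followed by the four counters in the column order shown above. The function names are hypothetical and the non-zero counter values in the sample are invented for illustration.

```python
import re

ROW = re.compile(r"Queue\s*:\s*(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)")
FIELDS = ("detected", "recovered", "last_drop", "total_drop")


def parse_watchdog_stats(text):
    """Map each queue number to its four PFC watchdog counters."""
    stats = {}
    for match in ROW.finditer(text):
        queue, *values = map(int, match.groups())
        stats[queue] = dict(zip(FIELDS, values))
    return stats


def queues_with_storms(stats):
    """Return the queue numbers with at least one detected storm."""
    return sorted(q for q, s in stats.items() if s["detected"] > 0)


# Hypothetical sample: queue 3 has seen two storms; the numbers are made up.
sample = """\
Priority Flow Control Watchdog Statistics:
             Detected   Recovered   LastPacketDropCount   TotalPacketDropCount
Queue : 3    2          2           150                   420
Queue : 4    0          0           0                     0
"""
stats = parse_watchdog_stats(sample)
print(queues_with_storms(stats))  # [3]
```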
You can view the actions that PFC watchdog takes in the system log.
- When the PFC watchdog is enabled on a new port, the system log displays this message:
CDA PfcWd: PFC Watchdog detection enabled on ifd: et-0/0/16 Poll Interval:100ms Detection Period:200ms Recovery Interval:200ms
- When the PFC watchdog detects a stall condition, the system log displays this message:
CDA PfcWd: PFC Storm Detected! on ifd:et-0/0/16 Queue: 3 Priority: 3 BLOCKED for AutoRecovery Recovery Time: 200ms
- When the queue recovers from the PFC pause storm, the system log displays this message:
CDA PfcWd: PFC Storm Recovered on Port ifd:et-0/0/16 Queue: 3 Priority: 3 UNBLOCKED after AutoRecovery Recovery Time: 200ms
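The storm messages above follow a regular pattern, so they can be matched with a regular expression when you scan collected logs. The Python sketch below is written against the two sample messages in this section only. Other releases may format these messages differently, and the function name parse_event is hypothetical.

```python
import re

EVENT = re.compile(
    r"PFC Storm (?P<event>Detected|Recovered)!? on (?:Port )?ifd:\s*(?P<port>\S+) "
    r"Queue: (?P<queue>\d+) Priority: (?P<priority>\d+)"
)


def parse_event(line):
    """Return the storm event fields from one log line, or None if no match."""
    match = EVENT.search(line)
    if match is None:
        return None
    return {
        "event": match["event"].lower(),
        "port": match["port"],
        "queue": int(match["queue"]),
        "priority": int(match["priority"]),
    }


detected = parse_event(
    "CDA PfcWd: PFC Storm Detected! on ifd:et-0/0/16 Queue: 3 Priority: 3 "
    "BLOCKED for AutoRecovery Recovery Time: 200ms"
)
print(detected)
```

Lines that are not storm events, such as the message logged when detection is enabled on a port, return None.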