Traffic Black Hole Caused by Fabric Degradation

A traffic black hole occurs when a router drops packets without notifying anyone. Other connected routers continue to forward traffic to the affected router, which degrades network performance. A severely degraded fabric plane is one possible cause of a traffic black hole.

The M320, T640, and T1600 routers limit the black-hole time by detecting unreachable destination Packet Forwarding Engines and signaling connected routers when they cannot carry traffic because of a severely degraded fabric.

Packet Forwarding Engine destinations can become unreachable for the following reasons:

  • The fabric Switch Interface Boards (SIBs) go offline as a result of a CLI command or a pressed physical button.
  • The fabric SIBs are turned offline by the Switch Processor Mezzanine Board (SPMB) because of high temperature.
  • The SPMB detects voltage or polled I/O errors in the SIBs.
  • All Packet Forwarding Engines receive destination errors on all planes from remote Packet Forwarding Engines, even when the SIBs are online.
  • Complete fabric loss caused by destination timeouts, even when the SIBs are online.
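The conditions above can be pictured as a simple classification. The following Python sketch is illustrative only: the `Cause` enum, the `FabricStatus` fields, and `unreachable_cause` are hypothetical names invented for this example and do not correspond to any Junos OS interface. It assumes the conditions are checked in the order listed.

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import Optional


class Cause(Enum):
    """Hypothetical labels for the five conditions listed above."""
    SIB_OFFLINE_BY_OPERATOR = auto()       # CLI command or physical button
    SIB_OFFLINE_HIGH_TEMPERATURE = auto()  # SPMB took the SIBs offline
    SIB_VOLTAGE_OR_IO_ERROR = auto()       # SPMB detected the error
    DESTINATION_ERRORS_ALL_PLANES = auto()
    DESTINATION_TIMEOUTS = auto()


@dataclass
class FabricStatus:
    """Hypothetical snapshot of fabric health; every field is an assumption."""
    offline_by_operator: bool = False
    offline_high_temperature: bool = False
    voltage_or_io_error: bool = False
    dest_errors_on_all_planes: bool = False   # can be True with SIBs online
    complete_loss_from_timeouts: bool = False  # can be True with SIBs online


def unreachable_cause(status: FabricStatus) -> Optional[Cause]:
    """Return the first matching cause, or None if destinations are reachable."""
    checks = [
        (status.offline_by_operator, Cause.SIB_OFFLINE_BY_OPERATOR),
        (status.offline_high_temperature, Cause.SIB_OFFLINE_HIGH_TEMPERATURE),
        (status.voltage_or_io_error, Cause.SIB_VOLTAGE_OR_IO_ERROR),
        (status.dest_errors_on_all_planes, Cause.DESTINATION_ERRORS_ALL_PLANES),
        (status.complete_loss_from_timeouts, Cause.DESTINATION_TIMEOUTS),
    ]
    for flagged, cause in checks:
        if flagged:
            return cause
    return None
```

The last two fields do not depend on SIB state, which mirrors the point that destinations can be unreachable even when the SIBs are online.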

When the system detects any unreachable Packet Forwarding Engine destinations, it attempts to heal the traffic black hole. If healing fails, the system turns off the interfaces, which stops the traffic black hole.

The recovery process consists of the following phases:

  1. Fabric plane restart phase: Healing is attempted by restarting the fabric planes one by one. This phase does not start if the fabric planes are functioning properly and a single Flexible PIC Concentrator (FPC) is bad.
  2. Fabric plane and FPC restart phase: Healing is attempted by restarting both the fabric planes and the FPCs. If any bad FPCs are unable to initiate high-speed links to the fabric after reboot, the traffic black hole is still limited because no interfaces are created for these FPCs.
  3. FPC offline phase: The traffic black hole is limited by turning the FPCs offline and by turning off interfaces, because previous attempts at recovery have failed.
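The escalation through these phases can be sketched as follows. This is a simplified model, not the router's implementation: `recover`, `try_heal`, and the phase strings are hypothetical names, and the sketch assumes each phase runs only if the previous one failed to heal the fabric.

```python
from typing import Callable, List

# Phase names taken from the list above; the constants are illustrative.
PLANE_RESTART = "fabric plane restart"
PLANE_AND_FPC_RESTART = "fabric plane and FPC restart"
FPC_OFFLINE = "FPC offline"


def recover(
    fabric_planes_healthy: bool,
    bad_fpc_count: int,
    try_heal: Callable[[str], bool],
) -> List[str]:
    """Run the recovery phases in order and return the phases that ran.

    try_heal is a hypothetical callback that performs the restart for a
    phase and returns True if destinations became reachable again.
    """
    ran: List[str] = []

    # Phase 1 is skipped when the planes are fine and a single FPC is bad.
    skip_plane_restart = fabric_planes_healthy and bad_fpc_count == 1
    if not skip_plane_restart:
        ran.append(PLANE_RESTART)
        if try_heal(PLANE_RESTART):
            return ran

    # Phase 2 restarts both the fabric planes and the FPCs.
    ran.append(PLANE_AND_FPC_RESTART)
    if try_heal(PLANE_AND_FPC_RESTART):
        return ran

    # Phase 3 is containment, not healing: FPCs and interfaces go offline.
    ran.append(FPC_OFFLINE)
    return ran
```

For example, `recover(True, 1, lambda phase: False)` skips the first phase and ends in the FPC offline phase, which matches the case where one FPC is bad and the fabric planes are healthy.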

By default, the system limits black-hole time by detecting severely degraded fabric. No user interaction is necessary.

Published: 2012-07-03