Understanding Switching Control Board Redundancy
SUMMARY Switching control board redundancy allows your device to continue routing and switching functions if a primary control board fails.
In this section, the term failover refers to an automatic event, whereas switchover refers to either an automatic or a manual event.
Redundant CFEBs on the M10i Router
On the M10i router, the CFEB performs the following functions:
- Route lookups: Performs route lookups using the forwarding table stored in synchronous SRAM (SSRAM).
- Management of shared memory: Uniformly allocates incoming data packets throughout the router's shared memory.
- Transfer of outgoing data packets: Passes data packets to the destination Fixed Interface Card (FIC) or Physical Interface Card (PIC) when the data is ready to be transmitted.
- Transfer of exception and control packets: Passes exception packets to the microprocessor on the CFEB, which processes almost all of them. The remainder are sent to the Routing Engine for further processing. Any errors originating in the Packet Forwarding Engine and detected by the CFEB are sent to the Routing Engine using system log messages.
The M10i router has two CFEBs: one is configured to act as the primary, and the other serves as a backup in case the primary fails. You can initiate a manual switchover by issuing the request chassis cfeb master switch command. For more information, see the Junos OS Administration Library for Routing Devices.
Redundant FEBs on the M120 Router
The M120 router supports up to six Forwarding Engine Boards (FEBs). Flexible PIC Concentrators (FPCs), which host PICs, are separate from the FEBs, which handle packet forwarding. FPCs are located on the front of the chassis and provide power and management to PICs through the midplane. FEBs are located on the back of the chassis and receive signals from the midplane, which the FEBs process for packet forwarding. The midplane allows any FEB to carry traffic for any FPC.
To configure the mapping of FPCs to FEBs, use the fpc-feb-connectivity statement as described in the Junos OS Administration Library for Routing Devices. You cannot specify a connection between an FPC and a FEB configured as a backup. If an FPC is not specified to connect to a FEB, the FPC is assigned automatically to the FEB with the same slot number. For example, the FPC in slot 1 is assigned to the FEB in slot 1.
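The mapping rules above can be sketched as a small Python model. This is an illustration only, not Junos code: the function name resolve_fpc_to_feb and its parameters are hypothetical, and the case where the same-slot default FEB is itself a backup is left unhandled because this section does not describe it.

```python
from typing import Dict, Iterable, Set


def resolve_fpc_to_feb(
    fpc_slots: Iterable[int],
    explicit: Dict[int, int],
    backup_febs: Set[int],
) -> Dict[int, int]:
    """Return an FPC-slot -> FEB-slot map (hypothetical helper).

    explicit    models the fpc-feb-connectivity statement.
    backup_febs holds the slots of FEBs configured as backups.
    """
    mapping: Dict[int, int] = {}
    for fpc in fpc_slots:
        if fpc in explicit:
            feb = explicit[fpc]
            # An FPC cannot be explicitly connected to a backup FEB.
            if feb in backup_febs:
                raise ValueError(f"FPC {fpc} cannot be mapped to backup FEB {feb}")
        else:
            # No explicit mapping: the FPC uses the FEB in the same slot.
            # (Behavior when that FEB is a backup is not modeled here.)
            feb = fpc
        mapping[fpc] = feb
    return mapping


if __name__ == "__main__":
    print(resolve_fpc_to_feb([0, 1, 2], {2: 3}, backup_febs={5}))
```

In this sketch, FPCs 0 and 1 fall back to the same-slot FEB, while FPC 2 follows its explicit mapping to FEB 3.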
You can configure one FEB as a backup for one or more FEBs by configuring a FEB redundancy group. When a FEB fails, the backup FEB can quickly take over packet forwarding. A redundancy group must contain exactly one backup FEB and can optionally contain one primary FEB and multiple other FEBs. A FEB can belong to only one group. A group can provide backup on a one-to-one basis (primary-to-backup), a many-to-one basis (two or more other-FEBs-to-backup), or a combination of both (one primary-to-backup and one or more other-FEBs-to-backup).
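As a rough illustration of the membership rules above, the following Python sketch checks a set of redundancy groups. The RedundancyGroup class, the role strings, and validate_groups are hypothetical names chosen for this example; they do not correspond to any Junos API.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class RedundancyGroup:
    name: str
    # FEB slot -> role, where role is "primary", "backup", or "other".
    roles: Dict[int, str]


def validate_groups(groups: List[RedundancyGroup]) -> List[str]:
    """Return a list of rule violations; an empty list means valid."""
    errors: List[str] = []
    owner: Dict[int, str] = {}  # FEB slot -> first group that claimed it
    for group in groups:
        counts = Counter(group.roles.values())
        # A group must contain exactly one backup FEB.
        if counts["backup"] != 1:
            errors.append(f"{group.name}: expected 1 backup FEB, found {counts['backup']}")
        # A primary FEB is optional, but there can be at most one.
        if counts["primary"] > 1:
            errors.append(f"{group.name}: more than one primary FEB")
        # A FEB can belong to only one group.
        for slot in group.roles:
            if slot in owner:
                errors.append(f"FEB {slot} is in both {owner[slot]} and {group.name}")
            else:
                owner[slot] = group.name
    return errors


if __name__ == "__main__":
    combined = RedundancyGroup("group0", {0: "primary", 1: "other", 2: "backup"})
    print(validate_groups([combined]))
```

The group in the usage example is the combined case: one primary-to-backup pairing plus one other FEB sharing the same backup.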
When you configure a primary FEB in a redundancy group, the backup FEB mirrors the exact forwarding state of the primary FEB. If switchover occurs from a primary FEB, the backup FEB does not reboot. A manual switchover from the primary FEB to the backup FEB results in less than 1 second of traffic loss. Failover from the primary FEB to the backup FEB results in less than 10 seconds of traffic loss.
If a failover occurs from a FEB other than the primary and a primary FEB is specified for the group, the backup FEB reboots so that the forwarding state from that FEB can be downloaded to the backup FEB and forwarding can continue. Automatic failover from a FEB that is not specified as a primary FEB results in higher packet loss. The duration of packet loss depends on the number of interfaces and on the size of the routing table, but it can be minutes.
If a failover from a FEB occurs when no primary FEB is specified in the redundancy group, the backup FEB does not reboot and the interfaces on the FPC connected to the previously active FEB remain online. The backup FEB must obtain the entire forwarding state from the Routing Engine after a switchover, and this update may take a few minutes. If you do not want the interfaces to remain online during the switchover for the other FEB, configure a primary FEB for the redundancy group.
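The three cases described above can be summarized in a short Python sketch. The function and field names are hypothetical, the loss figures are copied from the text rather than measured, and treating a manual switchover from a non-primary FEB the same as an automatic failover is an assumption of this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Outcome:
    backup_reboots: bool
    expected_loss: str


def switchover_outcome(group_has_primary: bool, from_primary: bool, manual: bool) -> Outcome:
    """Summarize what happens to the backup FEB on a switchover."""
    if from_primary and not group_has_primary:
        raise ValueError("cannot switch from a primary FEB in a group without one")
    if from_primary:
        # The backup mirrors the primary's forwarding state, so no reboot.
        return Outcome(False, "less than 1 second" if manual else "less than 10 seconds")
    if group_has_primary:
        # The backup mirrors the primary, not this FEB, so it must reboot
        # and download the other FEB's forwarding state.
        # Assumption: a manual switchover behaves the same way here.
        return Outcome(True, "can be minutes")
    # No primary in the group: no reboot, interfaces stay online, and the
    # forwarding state is obtained from the Routing Engine.
    return Outcome(False, "a few minutes for the state update")


if __name__ == "__main__":
    print(switchover_outcome(group_has_primary=True, from_primary=True, manual=False))
```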
Failover to a backup FEB occurs automatically if a FEB in a redundancy group fails. You can disable automatic failover for any redundancy group by including the no-auto-failover statement at the [edit chassis redundancy feb redundancy-group group-name] hierarchy level.
You can also initiate a manual switchover by issuing the request chassis redundancy feb slot slot-number switch-to-backup command, where slot-number is the number of the active FEB. For more information, see the CLI Explorer.
The following conditions result in failover as long as the backup FEB in a redundancy group is available:
- The FEB is absent.
- The FEB experienced a hard error while coming online.
- A software failure on the FEB resulted in a crash.
- Ethernet connectivity from a FEB to a Routing Engine failed.
- A hard error on the FEB, such as a power failure, occurred.
- The FEB was disabled when the offline button for the FEB was pressed.
- The software watchdog timer on the FEB expired.
- Errors occurred on the links between all the active fabric planes and the FEB. This situation results in failover to the backup FEB if it has at least one valid fabric link.
- Errors occurred on the link between the FEB and all of the FPCs connected to it.
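The failover conditions listed above, together with the no-auto-failover setting, can be modeled as a simple decision function. This is a minimal sketch: the FebFault enumeration and the should_fail_over function are hypothetical names for illustration, not part of Junos OS.

```python
from enum import Enum, auto


class FebFault(Enum):
    ABSENT = auto()
    HARD_ERROR_COMING_ONLINE = auto()
    SOFTWARE_CRASH = auto()
    ROUTING_ENGINE_ETHERNET_LOST = auto()
    HARD_ERROR = auto()
    OFFLINE_BUTTON_PRESSED = auto()
    WATCHDOG_EXPIRED = auto()
    ALL_FABRIC_LINKS_ERRORED = auto()
    ALL_FPC_LINKS_ERRORED = auto()


def should_fail_over(
    fault: FebFault,
    backup_available: bool,
    auto_failover: bool = True,
    backup_valid_fabric_links: int = 1,
) -> bool:
    """Decide whether a fault triggers failover to the backup FEB."""
    # no-auto-failover disables automatic failover for the group, and
    # nothing can happen if the backup is already in use or missing.
    if not auto_failover or not backup_available:
        return False
    # Fabric link errors only trigger failover when the backup FEB has
    # at least one valid fabric link of its own.
    if fault is FebFault.ALL_FABRIC_LINKS_ERRORED:
        return backup_valid_fabric_links >= 1
    return True


if __name__ == "__main__":
    print(should_fail_over(FebFault.WATCHDOG_EXPIRED, backup_available=True))
```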
After a switchover occurs, a backup FEB is no longer available for the redundancy group. You can revert from the backup FEB to the previously active FEB by issuing the operational mode command request chassis redundancy feb slot slot-number revert-from-backup, where slot-number is the number of the previously active FEB. For more information, see the CLI Explorer.
When you revert from the backup FEB, it becomes available again for a switchover. If the redundancy group does not have a primary FEB, the backup FEB reboots after you revert to the previously active FEB. If the FEB to which you revert is not a primary FEB, the backup FEB is rebooted so that it can be aligned with the state of the primary FEB.
If you modify the configuration for an existing redundancy group so that a FEB connects to a different FPC, the FEB is rebooted unless the FEB was already connected to one or two Type 1 FPCs and the change only resulted in the FEB being connected either to one additional or one fewer Type 1 FPC. For more information about how to map a connection between an FPC and a FEB, see the Junos OS Administration Library for Routing Devices. If you change the primary FEB in a redundancy group, the backup FEB is rebooted. The FEB is also rebooted if you change a backup FEB to a nonbackup FEB or change an active FEB to a backup FEB.
To view the status of configured FEB redundancy groups, issue the show chassis redundancy feb operational mode command. For more information, see the CLI Explorer.
Redundant SSBs on the M20 Router
The System and Switch Board (SSB) on the M20 router performs the following major functions:
- Shared memory management on the FPCs: The Distributed Buffer Manager ASIC on the SSB uniformly allocates incoming data packets throughout shared memory on the FPCs.
- Outgoing data cell transfer to the FPCs: A second Distributed Buffer Manager ASIC on the SSB passes data cells to the FPCs for packet reassembly when the data is ready to be transmitted.
- Route lookups: The Internet Processor ASIC on the SSB performs route lookups using the forwarding table stored in SSRAM. After performing the lookup, the Internet Processor ASIC informs the midplane of the forwarding decision, and the midplane forwards the decision to the appropriate outgoing interface.
- System component monitoring: The SSB monitors other system components for failure and alarm conditions. It collects statistics from all sensors in the system and relays them to the Routing Engine, which sets the appropriate alarm. For example, if a temperature sensor exceeds the first internally defined threshold, the Routing Engine issues a "high temp" alarm. If the sensor exceeds the second threshold, the Routing Engine initiates a system shutdown.
- Exception and control packet transfer: The Internet Processor ASIC passes exception packets to a microprocessor on the SSB, which processes almost all of them. The remaining packets are sent to the Routing Engine for further processing. Any errors that originate in the Packet Forwarding Engine and are detected by the SSB are sent to the Routing Engine using system log messages.
- FPC reset control: The SSB monitors the operation of the FPCs. If it detects errors in an FPC, the SSB attempts to reset the FPC. After three unsuccessful resets, the SSB takes the FPC offline and informs the Routing Engine. Other FPCs are unaffected, and normal system operation continues.
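Two of the behaviors in the list above, the two-level temperature response and the three-attempt FPC reset policy, can be sketched in Python as follows. All names are hypothetical. The actual thresholds are internally defined and are not given in this section, so the sketch takes them as parameters instead of assuming values.

```python
from typing import Callable

MAX_RESET_ATTEMPTS = 3  # from the text: three unsuccessful resets


def temperature_action(reading: float, first_threshold: float, second_threshold: float) -> str:
    """Map a sensor reading to the Routing Engine's response."""
    if reading > second_threshold:
        return "shutdown"
    if reading > first_threshold:
        return "high temp alarm"
    return "ok"


def recover_fpc(try_reset: Callable[[], bool]) -> str:
    """Attempt to reset a faulty FPC; return its resulting state.

    try_reset is a caller-supplied function that performs one reset
    attempt and returns True on success.
    """
    for _ in range(MAX_RESET_ATTEMPTS):
        if try_reset():
            return "online"
    # All attempts failed: the FPC is taken offline and the Routing
    # Engine is informed. Other FPCs are unaffected.
    return "offline"


if __name__ == "__main__":
    results = iter([False, False, True])
    print(recover_fpc(lambda: next(results)))
```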
The M20 router holds up to two SSBs. One SSB is configured to act as the primary and the other is configured to serve as a backup in case the primary fails. You can initiate a manual switchover by issuing the request chassis ssb master switch command. For more information, see the CLI Explorer.
Redundant SFMs on the M40e and M160 Routers
The M40e and M160 routers have redundant Switching and Forwarding Modules (SFMs). The SFMs contain the Internet Processor II ASIC and two Distributed Buffer Manager ASICs. SFMs ensure that all traffic leaving the FPCs is handled properly. SFMs provide route lookup, filtering, and switching.
The M40e router holds up to two SFMs, one that is configured to act as the primary and the other configured to serve as a backup in case the primary fails. Removing the standby SFM has no effect on router function. If the active SFM fails or is removed from the chassis, forwarding halts until the standby SFM boots and becomes active. It takes approximately 1 minute for the new SFM to become active. Synchronizing router configuration information can take additional time, depending on the complexity of the configuration.
The M160 router holds up to four SFMs. All SFMs are active at the same time. If an SFM fails or is taken offline, router function is unaffected and forwarding continues uninterrupted.
You can initiate a manual switchover by issuing the request chassis sfm master switch command. For more information, see the CLI Explorer.