Understanding Graceful Routing Engine Switchover
This topic contains the following sections:
- Graceful Routing Engine Switchover Concepts
- Effects of a Routing Engine Switchover
- Graceful Routing Engine Switchover on Aggregated Services Interfaces
- Platform-Specific GRES Behavior
Graceful Routing Engine Switchover Concepts
The graceful Routing Engine switchover (GRES) feature in Junos OS and Junos OS Evolved enables a device with redundant Routing Engines to continue forwarding packets even if one Routing Engine fails. GRES preserves interface and kernel information, and traffic is not interrupted. However, GRES does not preserve the control plane.
Review the Platform-Specific GRES Behavior section for notes related to your platform.
Neighboring devices detect that the device has experienced a restart and react to the event in a manner prescribed by individual routing protocol specifications.
To preserve routing during a switchover, GRES must be combined with either:
- Graceful restart protocol extensions
- Nonstop active routing (NSR)
Any updates to the primary Routing Engine are replicated to the backup Routing Engine as soon as they occur.
Because of its synchronization requirements and logic, NSR/GRES performance is limited by the slowest Routing Engine in the system.
Primary role switches to the backup Routing Engine if:
- The primary Routing Engine kernel stops operating.
- The primary Routing Engine experiences a hardware failure.
- The administrator initiates a manual switchover.
To quickly restore or to preserve routing protocol state information during a switchover, GRES must be combined with either graceful restart or nonstop active routing, respectively. For more information about graceful restart, see Graceful Restart Concepts. For more information about nonstop active routing, see Nonstop Active Routing Concepts.
If the backup Routing Engine does not receive a keepalive from the primary Routing Engine after 2 seconds, it determines that the primary Routing Engine has failed and assumes the primary role.
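As a rough illustration of this timeout rule, the following Python sketch models a backup Routing Engine that assumes the primary role once 2 seconds pass without a keepalive. The class and method names are hypothetical and do not correspond to any Junos OS process or API. Only the 2-second threshold comes from the text, and treating exactly 2 seconds as a timeout is an assumption.

```python
from dataclasses import dataclass

KEEPALIVE_TIMEOUT_S = 2.0  # threshold taken from the text above


@dataclass
class BackupRoutingEngine:
    """Hypothetical model of the backup Routing Engine's keepalive watchdog."""
    last_keepalive: float = 0.0
    role: str = "backup"

    def on_keepalive(self, now: float) -> None:
        # A keepalive from the primary resets the timer.
        self.last_keepalive = now

    def tick(self, now: float) -> str:
        # Assumption: the timeout fires once the full 2 seconds have elapsed.
        if self.role == "backup" and now - self.last_keepalive >= KEEPALIVE_TIMEOUT_S:
            self.role = "primary"
        return self.role


backup = BackupRoutingEngine()
backup.on_keepalive(now=10.0)
print(backup.tick(now=11.5))  # still inside the timeout window
print(backup.tick(now=12.0))  # 2 seconds without a keepalive
```

Once the role changes to primary in this model it stays there, which mirrors the text: the backup takes over and the Packet Forwarding Engine reconnects to it.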
The Packet Forwarding Engine:
- Seamlessly disconnects from the old primary Routing Engine
- Reconnects to the new primary Routing Engine
- Does not reboot
- Does not interrupt traffic
The new primary Routing Engine and the Packet Forwarding Engine then become synchronized. If the new primary Routing Engine detects that the Packet Forwarding Engine state is not up to date, it resends state update messages.
Note the following GRES behaviors, recommendations, or requirements:
- Starting with Junos OS Release 12.2, if adjacencies between the restarting device and the neighboring peer 'helper' devices time out, graceful restart protocol extensions are unable to notify the peer 'helper' devices about the impending restart. Graceful restart can then stop and cause interruptions in traffic.
  To ensure that these adjacencies are maintained, change the hold-time for IS-IS protocols from the default of 27 seconds to a value higher than 40 seconds.
- Successive Routing Engine switchover events must be a minimum of 240 seconds (4 minutes) apart after both Routing Engines have come up.
  If the device displays a warning message similar to:
  Standby Routing Engine is not ready for graceful switchover. Packet Forwarding Engines that are not ready for graceful switchover might be reset
  then do not attempt a switchover. If you choose to proceed with switchover, the device resets only the Packet Forwarding Engines that were not ready for graceful switchover. None of the FPCs should spontaneously restart. We recommend that you wait until the warning no longer appears and then proceed with the switchover.
- We do not recommend:
  - Doing a commit operation on the backup Routing Engine when GRES is enabled on the device.
  - Enabling GRES on the backup Routing Engine in any scenario.
-
Figure 1 shows the system architecture of graceful Routing Engine switchover and the process a routing platform follows to prepare for a switchover.
Check GRES readiness by executing both:
- The request chassis routing-engine master switch check command from the primary Routing Engine
- The show system switchover command from the backup Routing Engine
The switchover preparation process for GRES is as follows:
- The primary Routing Engine starts.
- The routing platform processes (such as the chassis process [chassisd]) start.
- The Packet Forwarding Engine starts and connects to the primary Routing Engine.
- All state information is updated in the system.
- The backup Routing Engine starts.
- The system determines whether GRES has been enabled.
- The kernel synchronization process (ksyncd) synchronizes the backup Routing Engine with the primary Routing Engine.
- After ksyncd completes the synchronization, all state information and the forwarding table are updated.
Figure 2 shows the effects of a switchover on the routing (or switching) platform.
A switchover process comprises the following steps:
- When keepalives from the primary Routing Engine are lost, the system switches over gracefully to the backup Routing Engine.
- The Packet Forwarding Engine connects to the backup Routing Engine, which becomes the new primary.
- Routing platform processes that are not part of GRES (such as the routing protocol process rpd) restart.
- State information learned from the point of the switchover is updated in the system.
- If configured, graceful restart protocol extensions collect and restore routing information from neighboring peer helper devices.
Effects of a Routing Engine Switchover
Table 1 describes the effects of a Routing Engine switchover when different features are enabled:
- No high availability features
- Graceful Routing Engine switchover
- Graceful restart
- Nonstop active routing
Feature | Benefits | Considerations |
---|---|---|
Dual Routing Engines only (no features enabled) | | |
GRES enabled | | |
GRES and NSR enabled | | |
GRES and graceful restart enabled | | |
Graceful Routing Engine Switchover on Aggregated Services Interfaces
If a graceful Routing Engine switchover (GRES) is triggered by an operational mode command, the device does not preserve the state of aggregated services interfaces (ASIs). For example:
request interface <switchover | revert> asi-interface
However, if GRES is triggered by a CLI commit or FPC restart or crash, the backup Routing Engine updates the ASI state. For example:
set interface si-x/y/z disable
commit
Or:
request chassis fpc restart
Platform-Specific GRES Behavior
Use Feature Explorer to confirm platform and release support for specific features.
Use the following table to review platform-specific behaviors for your platform:
Platform | Difference |
---|---|
MX Series | |
PTX Series | |
QFX Series | |
Graceful Routing Engine Switchover System Requirements
Graceful Routing Engine switchover is supported on all routing (or switching) platforms that contain dual Routing Engines. All Routing Engines configured for graceful Routing Engine switchover must run the same Junos OS release. Hardware and software support for graceful Routing Engine switchover is described in the following sections:
- Graceful Routing Engine Switchover Platform Support
- Graceful Routing Engine Switchover Feature Support
- Graceful Routing Engine Switchover DPC Support
- Graceful Routing Engine Switchover and Subscriber Access
- Graceful Routing Engine Switchover PIC Support
Graceful Routing Engine Switchover Platform Support
To enable graceful Routing Engine switchover, your system must meet these minimum requirements:
M20 and M40e routers—Junos OS Release 5.7 or later
M10i router—Junos OS Release 6.1 or later
M320 router—Junos OS Release 6.2 or later
T320 router, T640 router, and TX Matrix router—Junos OS Release 7.0 or later
M120 router—Junos OS Release 8.2 or later
MX960 router—Junos OS Release 8.3 or later
MX480 router—Junos OS Release 8.4 or later (8.4R2 recommended)
MX240 router—Junos OS Release 9.0 or later
PTX5000 router—Junos OS Release 12.1X48 or later
Standalone T1600 router—Junos OS Release 8.5 or later
Standalone T4000 router—Junos OS Release 12.1R2 or later
TX Matrix Plus router—Junos OS Release 9.6 or later
TX Matrix Plus router with 3D SIBs—Junos OS Release 13.1 or later
EX Series switches with dual Routing Engines or in a Virtual Chassis—Junos OS Release 9.2 or later
QFX Series switches in a Virtual Chassis—Junos OS Release 13.2 or later
EX Series or QFX Series switches in a Virtual Chassis Fabric—Junos OS Release 13.2X51-D20 or later
For more information about support for graceful Routing Engine switchover, see the sections that follow.
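The minimum-release list above lends itself to a simple lookup. The following Python sketch is only an illustration: the function names are hypothetical, the table copies a subset of the platforms listed above, and the comparison looks only at the leading major.minor pair. It deliberately ignores suffixes such as R2, X48, or -D20, so it is too coarse for platforms whose minimum release includes such a suffix.

```python
import re

# Minimum GRES releases copied from the list above (major, minor only).
MIN_GRES_RELEASE = {
    "M10i": (6, 1),
    "M320": (6, 2),
    "M120": (8, 2),
    "MX960": (8, 3),
    "MX480": (8, 4),
    "MX240": (9, 0),
}


def major_minor(release: str) -> tuple:
    """Extract the leading major.minor pair, for example '9.4R2' -> (9, 4)."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognized release string: {release!r}")
    return int(match.group(1)), int(match.group(2))


def meets_gres_minimum(platform: str, release: str) -> bool:
    """Return True if the release is at or above the listed minimum."""
    return major_minor(release) >= MIN_GRES_RELEASE[platform]


print(meets_gres_minimum("MX480", "8.4R2"))  # at the listed minimum
print(meets_gres_minimum("MX240", "8.5"))    # below the listed minimum
```

Comparing integer tuples rather than strings avoids the usual pitfall where "10.0" sorts before "9.0" lexically.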
Graceful Routing Engine Switchover Feature Support
Graceful Routing Engine switchover supports most Junos OS features in Release 5.7 and later. Particular Junos OS features require specific versions of Junos OS. See Table 2.
Application | Junos OS Release |
---|---|
Aggregated Ethernet interfaces with Link Aggregation Control Protocol (LACP) and aggregated SONET interfaces | 6.2 |
Asynchronous Transfer Mode (ATM) virtual circuits (VCs) | 6.2 |
Logical systems (Note: In Junos OS Release 9.3 and later, the logical router feature is renamed to logical system.) | 6.3 |
Multicast | 6.4 (7.0 for TX Matrix router) |
Multilink Point-to-Point Protocol (MLPPP) and Multilink Frame Relay (MLFR) | 7.0 |
Automatic Protection Switching (APS)—The current active interface (either the designated working or the designated protect interface) remains the active interface during a Routing Engine switchover. | 7.4 |
Point-to-multipoint Multiprotocol Label Switching (MPLS) LSPs (transit only) | 7.4 |
Compressed Real-Time Transport Protocol (CRTP) | 7.6 |
Virtual private LAN service (VPLS) | 8.2 |
Ethernet Operation, Administration, and Management (OAM) as defined by IEEE 802.3ah | 8.5 |
Extended DHCP relay agent | 8.5 |
Ethernet OAM as defined by IEEE 802.1ag | 9.0 |
Packet Gateway Control Protocol (PGCP) process (pgcpd) on Multiservices 500 PICs on T640 routers | 9.0 |
Subscriber access | 9.4 |
Layer 2 Circuit and LDP-based VPLS pseudowire redundant configuration | 9.6 |
The following constraints apply to graceful Routing Engine switchover feature support:
When graceful Routing Engine switchover and aggregated Ethernet interfaces are configured in the same system, the aggregated Ethernet interfaces must not be configured for fast-polling LACP. When fast polling is configured, the LACP polls time out at the remote end during the Routing Engine primary-role switchover, and the aggregated link and interface are disabled. The primary-role change is fast enough that standard and slow LACP polling do not time out during the procedure. This restriction does not apply to MX Series routers that are running Junos OS Release 9.4 or later with distributed periodic packet management (PPM) enabled, which is the default configuration. On those routers, you can configure graceful Routing Engine switchover and fast-polling LACP on aggregated Ethernet interfaces on the same device.
Note: MACsec sessions flap when a graceful Routing Engine switchover occurs.
Starting with Junos OS Release 13.2, when a graceful Routing Engine switchover occurs, the VRRP state does not change. Graceful Routing Engine switchover supports VRRP only when PPM delegation is enabled (which is the default).
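The LACP constraint above reduces to a three-part condition. The following Python sketch restates it as a predicate. The function name and its parameters are hypothetical and exist only for illustration. The release is passed as a (major, minor) tuple, and nothing here queries a real device.

```python
def fast_lacp_allowed_with_gres(series: str, release: tuple, distributed_ppm: bool) -> bool:
    """Return True if fast-polling LACP can coexist with GRES per the text above."""
    # The exemption applies only to MX Series routers running Junos OS
    # Release 9.4 or later with distributed PPM enabled (the default).
    return series == "MX" and release >= (9, 4) and distributed_ppm


print(fast_lacp_allowed_with_gres("MX", (9, 4), True))  # exemption applies
print(fast_lacp_allowed_with_gres("MX", (9, 3), True))  # release too old
```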
Graceful Routing Engine Switchover DPC Support
Graceful Routing Engine switchover supports all Dense Port Concentrators (DPCs) on the MX Series 5G Universal Routing Platforms running the appropriate version of Junos OS as shown in Graceful Routing Engine Switchover Platform Support. For more information about DPCs, see the MX Series DPC Guide.
Graceful Routing Engine Switchover and Subscriber Access
Graceful Routing Engine switchover currently supports most of the features directly associated with dynamic DHCP and dynamic PPPoE subscriber access. Graceful Routing Engine switchover also supports the unified in-service software upgrade (ISSU) for the DHCP access model and the PPPoE access model used by subscriber access.
When graceful Routing Engine switchover is enabled for subscriber management, all Routing Engines in the router must have the same amount of DRAM for stable operation.
Graceful Routing Engine Switchover PIC Support
Graceful Routing Engine switchover is supported on most PICs, except for the services PICs listed in this section. The PIC must be on a supported routing platform running the appropriate version of Junos OS. For information about FPC types, FPC/PIC compatibility, and the initial Junos OS Release in which an FPC supported a particular PIC, see the PIC guide for your router platform.
The following constraints apply to graceful Routing Engine switchover support for services PICs:
- You can include the graceful-switchover statement at the [edit chassis redundancy] hierarchy level on a router with Adaptive Services, Multiservices, and Tunnel Services PICs configured on it and successfully commit the configuration. However, all services on these PICs, except the Layer 2 service packages and extension-provider and SDK applications on Multiservices PICs, are reset during a switchover.
- Graceful Routing Engine switchover is not supported on any Monitoring Services PICs or Multilink Services PICs. If you include the graceful-switchover statement at the [edit chassis redundancy] hierarchy level on a router with either of these PIC types configured on it and issue the commit command, the commit fails.
- Graceful Routing Engine switchover is not supported on Multiservices 400 PICs configured for monitoring services applications. If you include the graceful-switchover statement, the commit fails.
When an unsupported PIC is online, you cannot enable graceful Routing Engine switchover. If graceful Routing Engine switchover is already enabled, an unsupported PIC cannot come online.
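The services PIC constraints above can be summarized as a small decision function. The Python sketch below is illustrative only: the string labels for PIC families and the function name are hypothetical identifiers chosen for this example, not Junos OS names, and the function does not inspect a real configuration.

```python
# PIC families named in the constraints above. The labels are
# hypothetical identifiers for this sketch, not Junos OS names.
COMMIT_FAILS = {
    "monitoring-services",
    "multilink-services",
    "multiservices-400-monitoring",
}
SERVICES_RESET = {"adaptive-services", "multiservices", "tunnel-services"}


def gres_commit_outcome(pics: set) -> str:
    """Summarize what the constraints above imply for a set of PIC types."""
    if pics & COMMIT_FAILS:
        return "commit fails"
    if pics & SERVICES_RESET:
        # Layer 2 service packages and extension-provider and SDK applications
        # on Multiservices PICs are the stated exceptions to the reset.
        return "commit succeeds; services reset on switchover"
    return "commit succeeds"


print(gres_commit_outcome({"tunnel-services"}))
print(gres_commit_outcome({"tunnel-services", "multilink-services"}))
```

The unsupported families are checked first because a single unsupported PIC is enough to fail the commit, regardless of what else is installed.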
Change History Table
Feature support is determined by the platform and release you are using. Use Feature Explorer to determine if a feature is supported on your platform.