Understanding Chassis Cluster Control Link Failure and Recovery
If the control link fails, JUNOS Software disables the secondary node to prevent the possibility of each node becoming primary for all redundancy groups, including redundancy group 0.
In the event of a legitimate control link failure, redundancy group 0 remains primary on the node on which it is currently primary, inactive redundancy groups x on the primary node become active, and the secondary node enters a disabled state in which it is not handling traffic.
Note: When the secondary node is disabled, you can still log in to the management port and run diagnostics.
To determine if a legitimate control link failure has occurred, the system relies on redundant liveliness signals sent across the control link and the data link.
The system periodically transmits probes over the fabric data link and heartbeat signals over the control link. Probes and heartbeat signals share a common sequence number that maps them to a unique time event. The software identifies a legitimate control link failure if the following two conditions exist:
- The threshold number of heartbeats was lost.
- At least one probe with a sequence number corresponding to that of a missing heartbeat signal was received on the data link.
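The two conditions above can be sketched as a small check. This is only an illustration of the rule as stated in the text, not the JUNOS implementation: the function name, its parameters, and the use of plain integer sequence numbers are assumptions made for the example.

```python
def is_legitimate_control_link_failure(missed_heartbeats, received_probes, threshold):
    """Return True when both conditions from the text hold.

    All names here are hypothetical:
    missed_heartbeats: sequence numbers of heartbeats that never arrived
    received_probes:   sequence numbers of probes seen on the fabric data link
    threshold:         number of lost heartbeats that triggers the check
    """
    missed = set(missed_heartbeats)
    # Condition 1: the threshold number of heartbeats was lost.
    if len(missed) < threshold:
        return False
    # Condition 2: at least one probe carries the sequence number of a missing
    # heartbeat, so the peer was alive for that time event and only the
    # control link failed.
    return bool(missed & set(received_probes))


# Heartbeats 7, 8, 9 were lost while probes 8 and 9 arrived on the fabric link.
print(is_legitimate_control_link_failure([7, 8, 9], [8, 9], threshold=3))
# Heartbeats were lost and no matching probe arrived.
print(is_legitimate_control_link_failure([7, 8, 9], [], threshold=3))
```

The shared sequence number is what lets the check tell a failed control link apart from a silent peer: a probe that arrives for the same time event as a missing heartbeat shows that the other node was still sending.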
When a legitimate control link failure occurs, the following conditions apply:
- Redundancy group 0 remains primary on the node on which it is presently primary (and thus its Routing Engine remains active), and all redundancy groups x on the node become primary.
If the system cannot determine which Routing Engine is primary, the node with the higher priority value for redundancy group 0 is primary and its Routing Engine is active. (You configure the priority for each node when you configure the redundancy-group statement for redundancy group 0.)
- The system disables the secondary node.
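The outcome described in the list above can be sketched as follows, assuming a two-node cluster. The function and parameter names are hypothetical, and the handling of equal priorities is left as an error because the text does not cover that case.

```python
from typing import Dict, Optional, Tuple


def resolve_after_control_link_failure(
    rg0_priority: Dict[int, int],
    current_rg0_primary: Optional[int],
) -> Tuple[int, int]:
    """Return (primary_node, disabled_node) for a two-node cluster.

    rg0_priority:        node id -> configured priority for redundancy group 0
    current_rg0_primary: node id currently primary for redundancy group 0,
                         or None if the system cannot determine it
    """
    node_a, node_b = sorted(rg0_priority)
    if current_rg0_primary is not None:
        if current_rg0_primary not in rg0_priority:
            raise ValueError("unknown node id")
        # Redundancy group 0 stays where it is already primary.
        primary = current_rg0_primary
    else:
        if rg0_priority[node_a] == rg0_priority[node_b]:
            # Assumption: the text does not describe a tie-break.
            raise ValueError("equal priorities are not covered by the text")
        # The node with the higher priority for redundancy group 0 is primary.
        primary = max(rg0_priority, key=rg0_priority.get)
    # The other node is the secondary, which the system disables.
    disabled = node_b if primary == node_a else node_a
    return primary, disabled


print(resolve_after_control_link_failure({0: 100, 1: 200}, current_rg0_primary=0))
print(resolve_after_control_link_failure({0: 100, 1: 200}, current_rg0_primary=None))
```

In the first call node 0 keeps the primary role despite its lower priority, because priority is consulted only when the current primary cannot be determined.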
To recover a device from disabled mode, you must reboot the device. When you reboot the disabled node, the node will synchronize its dynamic state with the primary node.
Note: If you make any changes to the configuration while the secondary node is disabled, execute the commit command to synchronize the configuration after you reboot the node. If you did not make configuration changes, the configuration file remains synchronized with that of the primary node.
You cannot enable preemption for redundancy group 0. If you want to change the primary node for redundancy group 0, you must do a manual failover.
When you use dual control links (supported on the SRX5000 and SRX3000 lines), note the following:
- Host inbound or outbound traffic can be impacted for up to 3 seconds during a control link failure. For example, consider a case where redundancy group 0 is primary on node 0 and there is a Telnet session to the Routing Engine through a network interface port on node 1. If the currently active control link fails, the Telnet session will lose packets for 3 seconds, until this failure is detected.
- A control link failure that occurs while the commit process is running across two nodes might lead to commit failure. In this situation, run the commit command again after 3 seconds.
Note: Dual control links require a second Routing Engine on each node of the chassis cluster. For more information, see Understanding Chassis Cluster Dual Control Links.
To have the system recover from a control link failure automatically, configure the control-link-recovery statement. In this case, once the system determines that the control link is healthy, it automatically reboots the disabled node. When the disabled node reboots, it joins the cluster again.
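The difference between manual and automatic recovery can be summarized in a short decision sketch. The function name, its boolean inputs, and the returned labels are all hypothetical; they only restate the behavior described in the text.

```python
def recovery_action(node_disabled: bool,
                    control_link_healthy: bool,
                    control_link_recovery: bool) -> str:
    """Return a hypothetical label for what happens to a node.

    control_link_recovery models whether the control-link-recovery
    statement is configured.
    """
    if not node_disabled:
        return "none"
    if not control_link_recovery:
        # Without the statement, an operator must reboot the disabled node.
        return "manual-reboot-required"
    if control_link_healthy:
        # With the statement, the system reboots the node once the link is healthy.
        return "automatic-reboot"
    # Assumption: until the link is judged healthy, nothing happens automatically.
    return "wait-for-healthy-link"


print(recovery_action(node_disabled=True, control_link_healthy=True,
                      control_link_recovery=True))
print(recovery_action(node_disabled=True, control_link_healthy=True,
                      control_link_recovery=False))
```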
Related Topics
- JUNOS Software Feature Support Reference for SRX Series and J Series Devices
- Understanding Chassis Cluster Control Links
- Understanding Chassis Cluster Dual Control Links
- Connecting Dual Control Links for SRX Series Devices in a Chassis Cluster
- Understanding Chassis Cluster Control Link Heartbeats
- Example: Configuring Chassis Cluster Control Link Recovery (CLI)
- Verifying Chassis Cluster Control Plane Statistics