Troubleshooting an SRX Chassis Cluster with One Node in the Hold State and the Other Node in the Lost State
Problem
Description
The nodes of the SRX chassis cluster are in hold and lost states.
Environment
SRX chassis cluster
Symptoms
One node of the SRX chassis cluster is in the hold state and the other node is in the lost state after you connect the cables and reboot the devices in cluster mode. Run the show chassis cluster status command on each node to view the status of the node. Here is a sample output:
    {hold:node0}
    user@node0> show chassis cluster status
    Cluster ID: 1, Redundancy-group: 0
        Node name    Priority    Status    Preempt    Manual failover
        node0        100         hold      No         No
        node1        1           lost      No         No

    {hold:node1}
    user@node1> show chassis cluster status
    Cluster ID: 1, Redundancy-group: 0
        Node name    Priority    Status    Preempt    Manual failover
        node0        100         lost      No         No
        node1        1           hold      No         No
If the status of a node is hold, the node is not ready to operate in a chassis cluster.
This issue does not impact high-end SRX Series Firewalls because these devices have dedicated control and management ports.
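The symptom check described above can be automated. The following is a minimal Python sketch, assuming the command output has been captured as text in the layout of the sample above; the function names `parse_cluster_status` and `is_hold_lost` are hypothetical helpers, not part of any Juniper tool. Only the first three columns of each node row are read, because later columns can differ between releases.

```python
import re

# Matches a node row such as "node0   100   hold   No   No".
# Only the name, priority, and status columns are parsed.
NODE_ROW = re.compile(r"^\s*(node\d+)\s+(\d+)\s+(\S+)")


def parse_cluster_status(output: str) -> dict[str, str]:
    """Return {node name: status} from 'show chassis cluster status' text."""
    statuses = {}
    for line in output.splitlines():
        match = NODE_ROW.match(line)
        if match:
            statuses[match.group(1)] = match.group(3)
    return statuses


def is_hold_lost(output: str) -> bool:
    """True when one node reports hold and the other reports lost."""
    return sorted(parse_cluster_status(output).values()) == ["hold", "lost"]


# Sample text copied from the node0 output shown above.
SAMPLE = """\
{hold:node0}
user@node0> show chassis cluster status
Cluster ID: 1, Redundancy-group: 0
    Node name    Priority    Status    Preempt    Manual failover
    node0        100         hold      No         No
    node1        1           lost      No         No
"""

print(parse_cluster_status(SAMPLE))
print(is_hold_lost(SAMPLE))
```

If the output lists several redundancy groups, this sketch keeps only the last status seen for each node, so it is best applied to the output for a single redundancy group.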
Cause
When you boot a branch SRX Series Firewall in cluster mode, two revenue interfaces (depending upon the model of the device) are designated for the out-of-band management link (fxp0) and control link (fxp1) of the chassis cluster. The fxp0 and fxp1 ports cannot be used for transit traffic.
If the interfaces that are designated as the fxp0 and fxp1 ports still carry configuration, the chassis cluster goes into the hold/lost state. The following table (Table 1) lists the ports that are designated as fxp0 and fxp1 ports for branch SRX Series Firewalls:
Device | Management (fxp0) | HA Control (fxp1) | Fabric (fab0 and fab1), must be configured |
---|---|---|---|
SRX300 | ge-0/0/0 | ge-0/0/1 | Any ge interface |
SRX320 | ge-0/0/0 | ge-0/0/1 | Any ge interface |
SRX340, SRX345, and SRX380 | MGMT | ge-0/0/1 | Any ge interface |
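For scripted checks, the port mapping in the table above can be held in a small lookup. The following is a minimal Python sketch that mirrors only the rows listed in the table; the names `RESERVED_PORTS` and `reserved_ports` are hypothetical, and models not listed in the table are deliberately rejected rather than guessed.

```python
# Mirrors the table above: device model -> (fxp0 port, fxp1 port).
RESERVED_PORTS = {
    "SRX300": ("ge-0/0/0", "ge-0/0/1"),
    "SRX320": ("ge-0/0/0", "ge-0/0/1"),
    "SRX340": ("MGMT", "ge-0/0/1"),
    "SRX345": ("MGMT", "ge-0/0/1"),
    "SRX380": ("MGMT", "ge-0/0/1"),
}


def reserved_ports(model: str) -> tuple[str, str]:
    """Return the (fxp0, fxp1) ports for a model listed in the table."""
    key = model.strip().upper()
    if key not in RESERVED_PORTS:
        raise KeyError(f"{model!r} is not listed in the table")
    return RESERVED_PORTS[key]


fxp0, fxp1 = reserved_ports("srx345")
print(f"fxp0 uses {fxp0}, fxp1 uses {fxp1}")
```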
Resolution
- Remove the Configuration on a Device Running the Factory-Default Configuration
- Remove the Configuration on a Device Operating as a Standalone Device
Remove the Configuration on a Device Running the Factory-Default Configuration
The factory-default configuration includes configuration for the interfaces that are transformed into fxp0 and fxp1 interfaces. You must delete these configurations before enabling chassis cluster mode. A device can have the factory-default configuration in the following scenarios:
- Typically, new devices are used in a chassis cluster. These new devices ship with the factory-default configuration, which includes configuration for the interfaces.
- If a device that is in chassis cluster mode crashes, the device might come up with the factory-default configuration.
To remove the configuration on the interfaces, delete the factory-default configuration and reconfigure the device.
The following procedure removes the current configuration.
1. Log in to the device and enter configuration mode.
2. Run the delete command to delete the current configuration from the device.

       root# delete
       This will delete the entire configuration
       Delete everything under this level? [yes,no] (no) yes

3. Configure the root password and commit the configuration:

       root# set system root-authentication plain-text-password
       root# commit
Remove the Configuration on a Device Operating as a Standalone Device
If the device is currently running in a production environment, then check whether the interfaces that are designated as the fxp0 and fxp1 interfaces are configured. To determine which interfaces are transformed into fxp0 and fxp1 interfaces, see Table 1.
Run the following commands in configuration mode to list the configuration for the fxp0 and fxp1 interfaces:

    show | display set | match <physical interface of the management port (fxp0)>
    show | display set | match <physical interface of the control port (fxp1)>

For example, from operational mode:

    show configuration | display set | match ge-0/0/0
    show configuration | display set | match ge-0/0/1
Delete all the configurations related to the interfaces from every configuration hierarchy.
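When the display set output is long, the matching lines can be collected offline and turned into candidate delete statements. The following is a minimal Python sketch, assuming the output has been saved as a list of set lines; the helpers `lines_referencing` and `to_delete` are hypothetical, and the configuration lines are invented for illustration. Unlike a plain match on ge-0/0/1, the pattern used here does not also select ge-0/0/10.

```python
import re


def lines_referencing(set_lines: list[str], interface: str) -> list[str]:
    """Return the 'set' lines that mention the interface or one of its units.

    A trailing digit is rejected, so ge-0/0/1 does not match ge-0/0/10,
    while ge-0/0/1.0 (a logical unit) still matches.
    """
    pattern = re.compile(rf"(?<![\w/-]){re.escape(interface)}(?!\d)")
    return [line for line in set_lines if pattern.search(line)]


def to_delete(set_line: str) -> str:
    """Turn 'set <path>' into 'delete <path>' for review before use."""
    if not set_line.startswith("set "):
        raise ValueError(f"not a set command: {set_line!r}")
    return "delete " + set_line[len("set "):]


# Hypothetical configuration lines, for illustration only.
CONFIG = [
    "set interfaces ge-0/0/1 unit 0 family inet address 192.0.2.1/24",
    "set security zones security-zone trust interfaces ge-0/0/1.0",
    "set interfaces ge-0/0/10 unit 0 family inet address 198.51.100.1/24",
]

for line in lines_referencing(CONFIG, "ge-0/0/1"):
    print(to_delete(line))
```

Treat the generated statements as a review list rather than a script to paste blindly. Deleting individual leaf statements can leave empty parent stanzas behind, so deleting at a higher level of each hierarchy may be the cleaner choice.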
You can also choose to delete the entire configuration and reconfigure the device:
root# delete