Troubleshooting a Redundancy Group that Does Not Fail Over in an SRX Chassis Cluster

30-May-23

Problem

Description

A redundancy group (RG) in a high-availability (HA) SRX chassis cluster does not fail over.

Environment

SRX chassis cluster

Diagnosis

From the command prompt of the SRX Series Services Gateway that is part of the chassis cluster, run the show chassis cluster status command.

Sample output:

Cluster ID: 1
Node                     Priority     Status     Preempt    Manual failover


Redundancy group: 0 , Failover count: 0
node0                       150       primary        no               no
node1                       100       secondary      no               no


Redundancy group: 1 , Failover count: 0
node0                       255       primary        yes              no
node1                       100       secondary      yes              no

In the sample output, check the priority of the redundancy group that does not fail over.
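When the check has to be repeated across several clusters, the status output can be read programmatically. The following is a minimal Python sketch, not a Juniper tool: the function name `parse_cluster_status` is hypothetical, and the regular expressions assume the five-column layout shown in the sample output above. Other Junos OS releases may print additional columns, in which case the row pattern would need adjusting.

```python
import re

# Text copied from the sample output above.
SAMPLE = """\
Cluster ID: 1
Node                     Priority     Status     Preempt    Manual failover

Redundancy group: 0 , Failover count: 0
node0                       150       primary        no               no
node1                       100       secondary      no               no

Redundancy group: 1 , Failover count: 0
node0                       255       primary        yes              no
node1                       100       secondary      yes              no
"""

# Hypothetical helper: assumes the five-column layout of the sample.
ROW = re.compile(r"\s*(node\d+)\s+(\d+)\s+(\S+)\s+(yes|no)\s+(yes|no)\s*$")
GROUP = re.compile(r"\s*Redundancy group:\s*(\d+)")


def parse_cluster_status(text):
    """Return {redundancy_group_number: [node records]}."""
    groups = {}
    current = None
    for line in text.splitlines():
        header = GROUP.match(line)
        if header:
            current = int(header.group(1))
            groups[current] = []
            continue
        row = ROW.match(line)
        if row and current is not None:
            name, priority, status, preempt, manual = row.groups()
            groups[current].append({
                "node": name,
                "priority": int(priority),
                "status": status,
                "preempt": preempt == "yes",
                "manual_failover": manual == "yes",
            })
    return groups


status = parse_cluster_status(SAMPLE)
for rg, nodes in status.items():
    for node in nodes:
        print(rg, node["node"], node["priority"], node["status"])
```

The per-node priority is the value that the Resolution steps below branch on.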

Resolution

Redundancy Group Manual Failover

  1. Check whether a manual failover of the redundancy group was initiated earlier by using the show chassis cluster status command.

    Sample output:

    Cluster ID: 1
    Node                     Priority     Status     Preempt    Manual failover
    
    
    Redundancy group: 0 , Failover count: 0
    node0                       150       primary        yes             no
    node1                       100       secondary      yes             no
    
    
    Redundancy group: 1 , Failover count: 0
    node0                       255       primary        no              yes
    node1                       100       secondary      no              yes
    

    In the sample output, the Priority value of node0 in redundancy group 1 (RG1) is 255 and the Manual failover status is yes, which means that a manual failover of the redundancy group was initiated earlier. You must reset the redundancy group priority.

    Note:

    After a manual failover of a redundancy group, we recommend that you reset the manual failover flag in the cluster status to allow further failovers.

  2. Reset the redundancy group priority by using the request chassis cluster failover reset redundancy-group <1-128> command.

    For example:

    root@srx> request chassis cluster failover reset redundancy-group 1    
    node0:
    --------------------------------------------------------------------------
    Successfully reset manual failover for redundancy group 1
    
     
    
    node1:
    --------------------------------------------------------------------------
    No reset required for redundancy group 1.

  3. This should resolve the issue and allow further redundancy group failovers. If these steps do not resolve the issue, proceed to the What's Next section.

  4. To manually initiate a failover of redundancy group x (where x is a redundancy group numbered 1 through 128), see Understanding Chassis Cluster Redundancy Group Manual Failover.
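The decision in steps 1 and 2 can be sketched in Python as follows. This is an illustration only: the function name `reset_command_if_needed` and the tuple-based input are assumptions made for the example. The only part taken from this article is the command string and the rule that a priority of 255 together with a Manual failover value of yes indicates an earlier manual failover.

```python
def reset_command_if_needed(rg, nodes):
    """Return the reset command for redundancy group `rg`, or None.

    `nodes` is a list of (priority, manual_failover) tuples, one per node,
    as read from the show chassis cluster status output.
    Hypothetical helper; it only builds the command text and runs nothing.
    """
    for priority, manual_failover in nodes:
        # Priority 255 with the manual failover flag set means a manual
        # failover was initiated earlier and has not been reset.
        if priority == 255 and manual_failover:
            return f"request chassis cluster failover reset redundancy-group {rg}"
    return None


# Values taken from the sample output in step 1.
rg1_command = reset_command_if_needed(1, [(255, True), (100, True)])
rg0_command = reset_command_if_needed(0, [(150, False), (100, False)])
print(rg1_command)
print(rg0_command)
```

Returning the command as text rather than executing it keeps the operator in control of when the reset is issued on the device.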

Redundancy Group Auto Failover

  1. Check the configuration and link status of the control and fabric links by using the show chassis cluster interfaces command.

    Sample output for a branch SRX Series Services Gateway:

    {primary:node0}
    root@SRX_Branch> show chassis cluster interfaces
    Control link 0 name: fxp1
    Control link status: Up
    
    Fabric interfaces:
    Name    Child-interface    Status
    fab0    ge-0/0/2           down
    fab0
    fab1    ge-9/0/2           down
    fab1
    Fabric link status: down
    

    Sample output for a high-end SRX Series Services Gateway:

    {primary:node0}
    root@SRX_HighEnd> show chassis cluster interfaces
    Control link 0 name: em0
    Control link 1 name: em1
    Control link status: up
    
    Fabric interfaces:
    Name    Child-interface    Status
    fab0    ge-0/0/5           down
    fab0
    Fabric link status: down

  2. If both the control link and the fabric link are up, proceed to step 3.

  3. Check the interface monitoring and IP monitoring configurations that are in use. If a configuration is incorrect, correct it. If the configurations are correct, proceed to step 4.

  4. Check the priority of each node in the output of the show chassis cluster status command.

    • If the priority is 0, see KB article KB16869 for JSRP (Junos OS Services Redundancy Protocol) chassis clusters and KB article KB19431 for branch SRX Series Firewalls.

    • If the priority is 255, see Redundancy Group Manual Failover.

    • If the priority is between 1 and 254 and the redundancy group still does not fail over, proceed to the What's Next section.
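The link check in steps 1 and 2 and the priority branching in step 4 can be sketched in Python as follows. This is a sketch under stated assumptions: the helper names `link_status` and `next_step` are hypothetical, the patterns assume the "Control link status:" and "Fabric link status:" lines appear as in the sample outputs above, and the returned strings only restate the guidance in this article.

```python
import re

# Text copied from the branch SRX sample output in step 1.
SAMPLE = """\
Control link 0 name: fxp1
Control link status: Up

Fabric interfaces:
Name Child-interface Status
fab0 ge-0/0/2 down
fab0
fab1 ge-9/0/2 down
fab1
Fabric link status: down
"""


def link_status(text):
    """Return (control, fabric) link status in lowercase, or None if absent."""
    control = re.search(r"Control link status:\s*(\S+)", text, re.IGNORECASE)
    fabric = re.search(r"Fabric link status:\s*(\S+)", text, re.IGNORECASE)
    return (
        control.group(1).lower() if control else None,
        fabric.group(1).lower() if fabric else None,
    )


def next_step(priority):
    """Map a node priority to the follow-up named in step 4."""
    if priority == 0:
        return "See KB16869 (JSRP chassis clusters) or KB19431 (branch SRX)."
    if priority == 255:
        return "See Redundancy Group Manual Failover."
    if 1 <= priority <= 254:
        return "Proceed to What's Next."
    raise ValueError(f"priority out of range: {priority}")


control, fabric = link_status(SAMPLE)
both_links_up = control == "up" and fabric == "up"
print(control, fabric, both_links_up)
print(next_step(255))
```

The status values are lowercased before comparison because the sample outputs print both "Up" and "up".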

What's Next

  1. If these steps do not resolve the issue, see KB article KB15911 for redundancy group failover tips.

  2. If you wish to debug further, see KB article KB21164 to check the debug logs.

  3. Before you open a JTAC case with the Juniper Networks Support team, see Data Collection for Customer Support for the data you should collect to assist in troubleshooting.
