Upgrading Both Devices in a Chassis Cluster Using a Low-Impact ISSU

On some platforms, you can upgrade the devices in a chassis cluster without a service disruption by using an in-service software upgrade (ISSU). The chassis cluster ISSU feature upgrades both devices in a cluster from a supported JUNOS version, with a traffic impact similar to that of a redundancy group failover.

Note: If you upgrade from a JUNOS Software version that supports only IPv4 to a version that supports both IPv4 and IPv6, the IPv4 traffic will continue to work during the upgrade process. If you upgrade from a JUNOS Software version that supports both IPv4 and IPv6 to a version that supports both IPv4 and IPv6, both the IPv4 and IPv6 traffic will continue to work during the upgrade process. JUNOS Release 10.2 and later releases support flow-based processing for IPv6 traffic. For more information, see “Enabling Flow-Based Processing for IPv6 Traffic” in the JUNOS Software Security Configuration Guide.

Before You Begin

  1. Confirm that both devices run JUNOS Release 9.6 or later. In-service software upgrades (ISSUs) are not available in earlier releases.
  2. Before starting an ISSU, fail over all redundancy groups so that they are all active on only one device. See Initiating a Chassis Cluster Manual Redundancy Group Failover for more information.
  3. We also recommend that you enable graceful restart for routing protocols before starting an ISSU.
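
As an illustration of the preparation steps above, the following sketch fails over redundancy groups 0 and 1 to node 0, checks the result, and enables graceful restart. The redundancy group numbers and the choice of node 0 are assumptions made for this example; substitute the values that apply to your cluster.

    user@host> request chassis cluster failover redundancy-group 0 node 0
    user@host> request chassis cluster failover redundancy-group 1 node 0
    user@host> show chassis cluster status
    user@host> configure
    user@host# set routing-options graceful-restart
    user@host# commit

In the show chassis cluster status output, every redundancy group should list the same node as primary before you start the ISSU.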

To upgrade both devices in the cluster with an ISSU:

  1. Fail over all redundancy groups to one device.
  2. Start the ISSU by entering the following command:
    user@host> request system software in-service-upgrade image_name reboot

    If you omit reboot from the command, you must manually reboot each device after the ISSU finishes updating its software image.

  3. Wait for both devices to complete the upgrade, then verify that all policies, zones, redundancy groups, and other runtime objects (RTOs) return to their correct states. Also verify that both devices in the cluster are running the new JUNOS Software build.
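
The checks in step 3 might start with the following sketch. It covers only the redundancy group state and the software version; which other objects you inspect (policies, zones, and so on) depends on your configuration.

    user@host> show chassis cluster status
    user@host> show version

Compare the release reported for each node against the image you installed.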

Note: During the upgrade, both devices might experience redundancy group failovers, but traffic will not be disrupted. Each device validates the package and checks version compatibility before starting the upgrade. If the new package is not version compatible with the currently installed version, the device either refuses the upgrade or prompts you to take corrective action. If only a single feature is incompatible, the upgrade software prompts you to either abort the upgrade or turn off the feature before proceeding.

This feature is available only via the command-line interface. See the “request system software in-service-upgrade” section of the JUNOS Software CLI Reference.

Rolling Back Devices in a Chassis Cluster After an ISSU

If the ISSU fails to complete and only one device in the cluster has been upgraded, you can roll back that device alone to the previous software version by entering the following commands on the upgraded device:

  1. request chassis cluster in-service-upgrade abort
  2. request system software rollback
  3. request system reboot

Guarding Against Service Failure in a Chassis Cluster ISSU

The ISSU command accepts the no-old-master-upgrade option. This option leaves the current master device in a nonupgraded state as a precaution against service failure: if the newly upgraded device does not operate correctly, routing control can be quickly returned to the old master device.

Use of the no-old-master-upgrade option will require you to run a standard upgrade on the old master device after the ISSU is completed on the backup device.
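
For example, assuming the option is appended to the ISSU command in the same position as reboot (confirm the exact syntax for your release in the CLI reference), the invocation might look like this:

    user@host> request system software in-service-upgrade image_name no-old-master-upgrade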

If you use the no-old-master-upgrade option, wait until the backup device completes its upgrade and you are confident that the new build is operating as expected. Then upgrade the old master as follows:

  1. Run request system software add image_name
  2. Run request chassis cluster in-service-upgrade abort to stop the ISSU process.
  3. Run request system reboot

Enabling an Automatic Chassis Cluster Node Failback After an ISSU

If you want redundancy groups to automatically return to node 0 as the primary after the ISSU is complete, you must set the redundancy group priority such that node 0 is primary and enable the preempt option. Note that this method will work for all redundancy groups except redundancy group 0. You must manually fail over redundancy group 0. To set the redundancy group priority and enable the preempt option, see Example: Configuring Chassis Cluster Redundancy Groups (CLI). To manually fail over a redundancy group, see Initiating a Chassis Cluster Manual Redundancy Group Failover.
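
A minimal configuration sketch for the failback described above, assuming redundancy group 1 and illustrative priority values of 200 and 100 (the node with the higher priority becomes primary):

    user@host# set chassis cluster redundancy-group 1 node 0 priority 200
    user@host# set chassis cluster redundancy-group 1 node 1 priority 100
    user@host# set chassis cluster redundancy-group 1 preempt
    user@host# commit

Redundancy group 0 still has to be failed over manually, for example:

    user@host> request chassis cluster failover redundancy-group 0 node 0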

Note: To upgrade node 0 and make it available in the chassis cluster, manually reboot node 0. Node 0 does not reboot automatically.

Troubleshooting Chassis Cluster ISSU Failures

Certain circumstances might cause an ISSU attempt to fail. This section explains two of them.

Deciphering Mismatched Control Link Statistics During a Chassis Cluster ISSU

When you use dual control links (supported on the SRX5000 and SRX3000 lines only), the show chassis cluster statistics and show chassis cluster control-plane statistics commands might report mismatched control link statistics during an ISSU while the two nodes are running different releases. (ISSUs are available in JUNOS Release 9.6 and later, and dual control links are available in JUNOS Release 10.0 and later.) For example, assume that one node is running JUNOS Release 9.6 and the other node is running JUNOS Release 10.0. In this case, a mismatch might occur because the node running Release 10.0 sends heartbeats on both control links, but the node running Release 9.6 receives heartbeats on only one control link.
