Upgrading a Chassis Cluster Using In-Service Software Upgrade
In-service software upgrade (ISSU) enables a software upgrade from one Junos OS version to a later Junos OS version with minimal downtime. For more information, see the following topics:
Understanding ISSU for a Chassis Cluster
In-service software upgrade (ISSU) enables a software upgrade from one Junos OS version to a later Junos OS version with little or no downtime. ISSU is performed when the devices are operating in chassis cluster mode only.
The chassis cluster ISSU feature enables both devices in a cluster to be upgraded from supported Junos OS versions with a minimal disruption in traffic and without a disruption in service.
Starting with Junos OS Release 15.1X49-D80, SRX4100 and SRX4200 devices support ISSU.
Starting with Junos OS Release 15.1X49-D70, SRX1500 devices support ISSU.
Starting with Junos OS Release 23.4R1, SRX1600 and SRX2300 devices support ISSU.
Starting with Junos OS Release 24.2R1, SRX4300 devices support ISSU.
- On SRX1500, SRX4100, and SRX4200 devices, ISSU is not supported for upgrading to 17.4 releases from previous Junos OS releases. ISSU is supported for upgrading from Junos OS 17.4 to successive 17.4 releases.
- On SRX5400, SRX5600, and SRX5800 devices, ISSU is not supported for upgrading to 17.3 and higher releases from earlier Junos OS releases. ISSU is supported for upgrading from Junos OS 17.3 to successive 17.3 releases.
- SRX300 Series devices and vSRX Virtual Firewall do not support ISSU.
ISSU provides the following benefits:
- Eliminates network downtime during software image upgrades
- Reduces operating costs, while delivering higher service levels
- Allows fast implementation of new features
ISSU has the following limitations:
- ISSU is available only for Junos OS Release 10.4R4 or later.
- ISSU does not support software downgrades.
- If you upgrade from a Junos OS version that supports only IPv4 to a version that supports both IPv4 and IPv6, the IPv4 traffic continues to work during the upgrade process. If you upgrade from a Junos OS version that supports both IPv4 and IPv6 to a version that supports both IPv4 and IPv6, both the IPv4 and IPv6 traffic continue to work during the upgrade process. Junos OS Release 10.2 and later releases support flow-based processing for IPv6 traffic.
- During an ISSU, you cannot bring any PICs online. You cannot perform operations such as commit, restart, or halt.
- During an ISSU, operations like fabric monitoring, control link recovery, and RGX preempt are suspended.
- During an ISSU, you cannot commit any configurations.
For details about ISSU support status, see knowledge base article KB17946.
The following process occurs during an ISSU for devices in a chassis cluster. The sequences given below are applicable when RG-0 is node 0 (primary node). Note that you must initiate an ISSU from RG-0 primary. If you initiate the upgrade on node 1 (RG-0 secondary), an error message is displayed.
- At the beginning of a chassis cluster ISSU, the system automatically fails over all RG-1+ redundancy groups that are not primary on the node from which the ISSU is initiated. This action ensures that all the redundancy groups are active on only the RG-0 primary node.

  The automatic failover of all RG-1+ redundancy groups is available in Junos OS Release 12.1 and later. If you are using Junos OS Release 11.4 or earlier, before starting the ISSU, ensure that all the redundancy groups are active on only the RG-0 primary node.

  After the system fails over all RG-1+ redundancy groups, it sets the manual failover bit and changes all RG-1+ primary node priorities to 255, regardless of whether the redundancy group failed over to the RG-0 primary node.
- The primary node (node 0) validates the device configuration to ensure that it can be committed using the new software version. Checks are made for disk space availability for the /var file system on both nodes, unsupported configurations, and unsupported Physical Interface Cards (PICs).

  If the disk space available on either of the Routing Engines is insufficient, the ISSU process fails and returns an error message. However, unsupported PICs do not prevent the ISSU. The software issues a warning to indicate that these PICs will restart during the upgrade. Similarly, an unsupported protocol configuration does not prevent the ISSU. However, the software issues a warning that packet loss might occur for the protocol during the upgrade.
- When the validation succeeds, the kernel state synchronization daemon (ksyncd) synchronizes the kernel on the secondary node (node 1) with node 0.
- Node 1 is upgraded with the new software image. Before being upgraded, node 1 gets the configuration file from node 0 and validates the configuration to ensure that it can be committed using the new software version. After being upgraded, it is resynchronized with node 0.
- The chassis cluster process (chassisd) on node 0 prepares other software processes for the ISSU. When all the processes are ready, chassisd sends a message to the PICs installed in the device.
- The Packet Forwarding Engine on each Flexible PIC Concentrator (FPC) saves its state and downloads the new software image from node 1. Next, each Packet Forwarding Engine sends a message (unified-ISSU ready) to the chassisd.
- After receiving the message (unified-ISSU ready) from a Packet Forwarding Engine, the chassisd sends a reboot message to the FPC on which the Packet Forwarding Engine resides. The FPC reboots with the new software image. After the FPC is rebooted, the Packet Forwarding Engine restores the FPC state and a high-speed internal link is established with node 1 running the new software. The chassisd is also reestablished with node 0.
- After all Packet Forwarding Engines have sent a ready message using the chassisd on node 0, other software processes are prepared for a node switchover. The system is ready for a switchover at this point.
- Node switchover occurs and node 1 becomes the new primary node (hitherto secondary node 1).
- The new secondary node (hitherto primary node 0) is now upgraded to the new software image.
When both nodes are successfully upgraded, the ISSU is complete.
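The validation step in the sequence above treats its findings differently: insufficient /var space on either node stops the ISSU, while unsupported PICs and unsupported protocol configurations only produce warnings. The following Python sketch models that distinction. It is illustrative only and is not a Junos OS API; the function name `validate_for_issu` and its parameters are hypothetical, and the inputs are assumed to have been gathered by some other means.

```python
def validate_for_issu(var_space_ok_by_node, unsupported_pics, unsupported_protocols):
    """Classify ISSU validation findings into fatal errors and warnings.

    var_space_ok_by_node: dict mapping node id -> True if /var has enough space
    unsupported_pics: names of PICs the new image does not support
    unsupported_protocols: names of protocol configurations that are unsupported
    Returns (proceed, errors, warnings). All names here are hypothetical.
    """
    errors = []
    warnings = []

    # Insufficient disk space on either node is fatal.
    for node, space_ok in sorted(var_space_ok_by_node.items()):
        if not space_ok:
            errors.append(f"node{node}: insufficient disk space in /var")

    # Unsupported PICs do not block the upgrade, but they restart during it.
    for pic in unsupported_pics:
        warnings.append(f"{pic}: unsupported PIC, will restart during the upgrade")

    # Unsupported protocol configuration does not block the upgrade either.
    for protocol in unsupported_protocols:
        warnings.append(f"{protocol}: packet loss might occur during the upgrade")

    proceed = not errors
    return proceed, errors, warnings
```

In this model, a cluster with enough /var space on both nodes proceeds even when the warning list is not empty; a single node with insufficient space sets `proceed` to `False`.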
When upgrading a cluster from a version that does not support encryption to a version that supports encryption, upgrade the first node to the new version. As long as encryption is not configured and enabled, two nodes running different versions can still communicate with each other and service is not interrupted. After upgrading the first node, upgrade the second node to the new version. You can decide whether to turn on the encryption feature after completing the upgrade. Encryption must be deactivated before downgrading to a version that does not support encryption. This ensures that communication between a node running an encryption-capable version and a downgraded node does not break, because neither node is encrypting traffic.
The policies in the Routing Engine and Packet Forwarding Engine must be in sync for the configuration to be committed. When the policy configurations are modified and the policies are out of sync, the system displays an error message.
As a workaround, if you notice that security policies are out of sync after an upgrade, use the request security policies resync command to synchronize the configuration of security policies in the Routing Engine and Packet Forwarding Engine.
ISSU System Requirements
You can use ISSU to upgrade from an ISSU-capable software release to a later release.
To perform an ISSU, your device must be running a Junos OS release that supports ISSU for the specific platform. See Table 1 for platform support.
Device | Junos OS Release |
---|---|
SRX5800 | 10.4R4 or later |
SRX5600 | 10.4R4 or later |
SRX5400 | 12.1X46-D20 or later |
SRX1500 | 15.1X49-D70 or later |
SRX1600 | 23.4R1 or later |
SRX2300 | 23.4R1 or later |
SRX4100 | 15.1X49-D80 or later |
SRX4200 | 15.1X49-D80 or later |
SRX4300 | 24.2R1 or later |
SRX4600 | 17.4R1 or later |
For additional details on ISSU support and limitations, see ISSU/ICU Upgrade Limitations on SRX Series Devices.
Note the following limitations related to an ISSU:
- The ISSU process is terminated if the Junos OS version specified for installation is earlier than the version currently running on the device.
- The ISSU process is terminated if the specified upgrade conflicts with the current configuration, the supported components, and so forth.
- ISSU does not support extension application packages developed using the Junos OS SDK.
- ISSU does not support version downgrades on any supported SRX Series Firewall.
- ISSU occasionally fails under heavy CPU load.
To downgrade from an ISSU-capable release to an earlier release (ISSU-capable or not), use the request system software add command. Unlike an upgrade using the ISSU process, a downgrade using the request system software add command might cause network disruptions and loss of data.
We strongly recommend that you perform ISSU under the following conditions:
- When both the primary and secondary nodes are healthy
- During a system maintenance period
- During the lowest possible traffic period
- When the Routing Engine CPU usage is less than 40 percent
In cases where ISSU is not supported or not recommended, but downtime during the system upgrade must still be minimized, you can use the minimal downtime procedure. See knowledge base article KB17947.
Upgrading Both Devices in a Chassis Cluster Using ISSU
Before you begin the ISSU for upgrading both the devices, note the following guidelines:
Ensure the following ISSU pre-check requirements are met:
- The priority of every redundancy group is greater than 0
- Every redundancy group is in either the primary or the secondary state
- Enough free space (double the image size) is available in /var/tmp
- CPU usage stays under 80 percent over a 5-second period
If the pre-check requirements are not met, ISSU will terminate at the beginning.
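As a rough illustration, the pre-check requirements listed above can be expressed as a single function that returns every failed condition. This Python sketch is not part of Junos OS; the `RgNodeStatus` type, the `issu_precheck` function, and the idea of passing CPU readings as a list of samples taken over the 5-second window are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class RgNodeStatus:
    """Status of one redundancy group on one node (hypothetical structure)."""
    group_id: int
    node: int
    priority: int
    state: str  # for example "primary" or "secondary"


def issu_precheck(rg_statuses, var_tmp_free_bytes, image_size_bytes, cpu_samples_percent):
    """Return a list of failed pre-check descriptions; an empty list means pass."""
    failures = []

    for rg in rg_statuses:
        # Every redundancy group priority must be greater than 0.
        if rg.priority <= 0:
            failures.append(f"RG{rg.group_id} node{rg.node}: priority is {rg.priority}")
        # Every redundancy group must be either primary or secondary.
        if rg.state not in ("primary", "secondary"):
            failures.append(f"RG{rg.group_id} node{rg.node}: state is {rg.state}")

    # /var/tmp needs at least double the image size in free space.
    if var_tmp_free_bytes < 2 * image_size_bytes:
        failures.append("/var/tmp: less than double the image size is free")

    # CPU usage must stay under 80 percent for every sample in the window.
    if any(sample >= 80 for sample in cpu_samples_percent):
        failures.append("CPU usage reached 80 percent or more during the sampling window")

    return failures
```

Returning all failures at once, rather than stopping at the first one, lets an operator fix every problem before retrying the upgrade.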
Back up the system software to the device’s hard disk by using the request system snapshot command on each Routing Engine. The request system snapshot command is not supported on SRX1500, SRX1600, SRX2300, SRX4100, SRX4200, SRX4300, and SRX4600 platforms.
If you are using Junos OS Release 11.4 or earlier, before starting the ISSU, set the failover for all redundancy groups so that they are all active on only one node (primary). See Initiating a Chassis Cluster Manual Redundancy Group Failover.
If you are using Junos OS Release 12.1 or later, Junos OS automatically fails over all RGs to the RG0 primary.
We recommend that you enable graceful restart for routing protocols before you start an ISSU.
On all supported SRX Series Firewalls, the earliest release from which ISSU is recommended is Junos OS Release 10.4R4.
The chassis cluster ISSU feature enables both devices in a cluster to be upgraded from supported Junos OS versions with a traffic impact similar to that of redundancy group failovers.
Starting with Junos OS Release 15.1X49-D70, SRX1500 devices support ISSU.
Starting with Junos OS Release 15.1X49-D80, SRX4100 and SRX4200 devices support ISSU.
Starting with Junos OS Release 17.4R1, SRX4600 devices support ISSU.
To perform an ISSU from the CLI on Routing Engine2:
If you want redundancy groups to automatically return to node 0 as the primary after an in-service software upgrade (ISSU), you must set the redundancy group priority such that node 0 is primary and enable the preempt option. Note that this method works for all redundancy groups except redundancy group 0. You must manually set the failover for redundancy group 0.
To set the redundancy group priority and enable the preempt option, see Example: Configuring Chassis Cluster Redundancy Groups.
To manually set the failover for a redundancy group, see Initiating a Chassis Cluster Manual Redundancy Group Failover.
During the upgrade, both devices might experience redundancy group failovers, but traffic is not disrupted. Each device validates the package and checks version compatibility before beginning the upgrade. If the system finds that the new package version is not compatible with the currently installed version, the device refuses the upgrade or prompts you to take corrective action. Sometimes a single feature is not compatible, in which case, the upgrade software prompts you to either terminate the upgrade or turn off the feature before beginning the upgrade.
If you want to return the SRX Series Firewall to standalone operation or remove a node from a chassis cluster, ensure that you have terminated the ISSU procedure on both nodes (if an ISSU procedure was initiated).
To start ISSU process on SRX5K devices with Routing Engine3 and on SRX1600, SRX2300, and SRX4300 devices:
Run the following command to start ISSU:
user@host> request vmhost software in-service-upgrade image-name-with-full-path
Rolling Back Devices in a Chassis Cluster After an ISSU
If an ISSU fails to complete and only one device in the cluster is upgraded, you can roll back to the previous configuration on the upgraded device alone by issuing one of the following commands on the upgraded device:
request chassis cluster in-service-upgrade abort
request system software rollback node node-id reboot
request system reboot
Enabling an Automatic Chassis Cluster Node Failback After an ISSU
If you want redundancy groups to automatically return to node 0 as the primary after an in-service software upgrade (ISSU), you must set the redundancy group priority such that node 0 is primary and enable the preempt option. Note that this method works for all redundancy groups except redundancy group 0. You must manually set the failover for redundancy group 0. To set the redundancy group priority and enable the preempt option, see Example: Configuring Chassis Cluster Redundancy Groups. To manually set the failover for a redundancy group, see Initiating a Chassis Cluster Manual Redundancy Group Failover.
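The failback rule described above can be summarized in a small model: redundancy group 0 never returns automatically, and any other redundancy group returns to the higher-priority node only when preempt is enabled. The Python sketch below is a simplified illustration under those assumptions, not device behavior in full; the function name `primary_after_failback` and its parameters are hypothetical, and the choice to leave the current primary in place when priorities are equal is an assumption of the example.

```python
def primary_after_failback(group_id, priorities, preempt, current_primary):
    """Return the node expected to be primary once both nodes are back in the cluster.

    group_id: redundancy group number (0 is the Routing Engine group)
    priorities: dict mapping node id -> configured priority
    preempt: True if the preempt option is enabled for this group
    current_primary: node id that is primary right after the ISSU
    """
    # Redundancy group 0 does not preempt; it needs a manual failover.
    if group_id == 0:
        return current_primary

    # Without preempt, the group stays where the ISSU left it.
    if not preempt:
        return current_primary

    # With preempt, a strictly higher-priority node takes over.
    best_node = max(priorities, key=priorities.get)
    if priorities[best_node] > priorities[current_primary]:
        return best_node
    return current_primary
```

For example, with node 0 at a higher priority than node 1 and preempt enabled, the model returns node 0 for redundancy group 1 but leaves redundancy group 0 on node 1 until it is failed over manually.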
To upgrade node 0 and make it available in the chassis cluster, manually reboot node 0. Node 0 does not reboot automatically.
Change History Table
Feature support is determined by the platform and release you are using. Use Feature Explorer to determine if a feature is supported on your platform.