Understanding the Low-Impact ISSU Process on Devices in a Chassis Cluster

In-service software upgrade (ISSU) allows a software upgrade from one Junos OS version to a later Junos OS version with little or no downtime.

The chassis cluster ISSU feature allows both devices in a cluster to be upgraded from supported Junos OS versions with a minimal disruption in traffic and without a disruption in service.

An ISSU provides the following benefits:

  • Eliminates network downtime during software image upgrades
  • Reduces operating costs, while delivering higher service levels
  • Allows fast implementation of new features

Note:

The following limitations apply to an ISSU:

  • ISSU is available only for Junos OS Release 10.4R4 or later.
  • ISSU does not support software downgrades.
  • If you upgrade from a Junos OS version that supports only IPv4 to a version that supports both IPv4 and IPv6, IPv4 traffic continues to work during the upgrade process. If both the old and the new versions support IPv4 and IPv6, both IPv4 and IPv6 traffic continue to work during the upgrade process. Junos OS Release 10.2 and later releases support flow-based processing for IPv6 traffic.
  • During an ISSU, you cannot bring any PICs online, and you cannot perform operations such as commit, restart, or halt.
  • During an ISSU, operations like fabric monitoring, control link recovery, and RGX preempt are suspended.
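The release-related limitations above can be expressed as a small eligibility check. The following Python sketch is illustrative only and is not a Junos API: the names `parse_release` and `issu_allowed` are hypothetical, the parser handles only release strings of the form `major.minorRbuild` (for example, `10.4R4`), and it assumes the 10.4R4 minimum applies to the release currently running on the cluster.

```python
import re
from typing import Tuple

# Hypothetical helper: matches simple release names such as "10.4R4" or "12.1R1.9".
# Other Junos naming schemes are deliberately not handled in this sketch.
_RELEASE = re.compile(r"^(\d+)\.(\d+)R(\d+)")


def parse_release(name: str) -> Tuple[int, int, int]:
    """Parse a release string into a comparable (major, minor, build) tuple."""
    match = _RELEASE.match(name)
    if match is None:
        raise ValueError(f"unsupported release format: {name!r}")
    major, minor, build = match.groups()
    return int(major), int(minor), int(build)


MINIMUM_ISSU_RELEASE = parse_release("10.4R4")


def issu_allowed(current: str, target: str) -> bool:
    """Return True if an ISSU from `current` to `target` satisfies the listed limits."""
    cur, tgt = parse_release(current), parse_release(target)
    # Assumption: the running release must be 10.4R4 or later.
    if cur < MINIMUM_ISSU_RELEASE:
        return False
    # ISSU upgrades to a later version only; downgrades are not supported.
    return tgt > cur
```

For example, `issu_allowed("10.4R4", "11.4R1")` returns `True`, whereas `issu_allowed("12.1R2", "11.4R5")` returns `False` because the target is an earlier release.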

Note: For details on ISSU support status, see Knowledge Base Article KB17946.

The following process occurs during an ISSU for devices in a chassis cluster. The sequence below applies when node 0 is the RG-0 primary node. You must initiate an ISSU from the RG-0 primary node. If you initiate the ISSU on node 1 (the RG-0 secondary node), an error message is displayed.

  1. At the beginning of a chassis cluster ISSU, the system automatically fails over all RG-1+ redundancy groups that are not primary on the node from which the ISSU is started. This action ensures that the redundancy groups are all active on only the RG-0 primary node.

    Note: The automatic failover of all RG-1+ redundancy groups is available in Junos OS Release 12.1 and later. If you are using Junos OS Release 11.4 or earlier, before starting an ISSU, ensure that the redundancy groups are all active on only the RG-0 primary node.

    After the system fails over all RG-1+ redundancy groups, it sets the manual failover bit and changes all RG-1+ primary node priorities to 255, regardless of whether the redundancy group failed over to the RG-0 primary node.

  2. The primary node (node 0) validates the device configuration to ensure that it can be committed using the new software version. Checks are made for disk space available for the /var file system on both nodes, unsupported configurations, and unsupported Physical Interface Cards (PICs).

    If there is insufficient disk space available on either of the Routing Engines, the ISSU process fails and returns an error message. However, unsupported PICs do not prevent an ISSU. The software issues a warning to indicate that these PICs will restart during the upgrade. Similarly, an unsupported protocol configuration does not prevent an ISSU. The software issues a warning that packet loss might occur for the protocol during the upgrade.

  3. When the validation succeeds, the kernel state synchronization daemon (ksyncd) synchronizes the kernel on the secondary node (node 1) with node 0.
  4. Node 1 is upgraded with the new software image. Before being upgraded, node 1 gets the configuration file from node 0 and validates the configuration to ensure that it can be committed using the new software version. After being upgraded, node 1 is resynchronized with node 0.
  5. The chassis cluster process (chassisd) on node 0 prepares other software processes for the low-impact ISSU. When all the processes are ready, chassisd sends a message to the PICs installed in the device.
  6. The Packet Forwarding Engine on each Flexible PIC Concentrator (FPC) saves its state and downloads the new software image from node 1. Next, each Packet Forwarding Engine sends an ISSU ready message to chassisd.
  7. After receiving the ISSU ready message from a Packet Forwarding Engine, chassisd sends a reboot message to the FPC on which the Packet Forwarding Engine resides. The FPC reboots with the new software image. After the FPC reboots, the Packet Forwarding Engine restores the FPC state, and a high-speed internal link is established with node 1, which is running the new software. chassisd is also reestablished with node 0.
  8. After all Packet Forwarding Engines have sent a ready message using chassisd on node 0, other software processes are prepared for a node switchover. The system is ready for a switchover at this point.
  9. The node switchover occurs, and node 1 (the old secondary node) becomes the new primary node.
  10. The new secondary node (the old primary node, node 0) is then upgraded to the new software image.
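The redundancy group handling in step 1 can be modeled in a few lines. This Python sketch is a simplified illustration under stated assumptions, not how Junos OS implements it: the `RedundancyGroup` class and the `prepare_redundancy_groups` function are hypothetical names, and a failover is reduced to changing a field.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class RedundancyGroup:
    """Hypothetical model of an RG-1+ redundancy group (RG-0 is not included)."""
    group_id: int
    primary_node: int          # node that is currently primary for this group
    primary_priority: int      # priority of the primary node
    manual_failover: bool = False


def prepare_redundancy_groups(groups: List[RedundancyGroup], rg0_primary: int) -> List[int]:
    """Apply the step 1 behavior and return the IDs of groups that failed over."""
    moved = []
    for group in groups:
        # Fail over only the groups that are not already primary on the RG-0 primary node.
        if group.primary_node != rg0_primary:
            group.primary_node = rg0_primary
            moved.append(group.group_id)
        # Applied to every group, whether or not it failed over.
        group.manual_failover = True
        group.primary_priority = 255
    return moved
```

With one group primary on node 0 and another primary on node 1, calling `prepare_redundancy_groups(groups, rg0_primary=0)` moves only the second group, but both groups end with the manual failover bit set and a primary priority of 255.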

When both nodes are successfully upgraded, the ISSU is complete.
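The validation outcomes described in step 2 distinguish between conditions that stop the ISSU and conditions that only produce a warning. The Python sketch below illustrates that distinction. It is not a Junos API: the function and class names are hypothetical, and `required_bytes` is an assumed parameter because the required amount of free space in /var is not stated here.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ValidationResult:
    ok: bool
    errors: List[str] = field(default_factory=list)
    warnings: List[str] = field(default_factory=list)


def validate_for_issu(
    free_var_bytes: Dict[str, int],      # free space in /var, keyed by node name
    required_bytes: int,                 # assumed threshold, not a documented value
    unsupported_pics: List[str],
    unsupported_protocols: List[str],
) -> ValidationResult:
    """Classify pre-upgrade findings as blocking errors or non-blocking warnings."""
    # Insufficient disk space on either node causes the ISSU to fail.
    errors = [
        f"{node}: insufficient disk space in /var"
        for node, free in free_var_bytes.items()
        if free < required_bytes
    ]
    # Unsupported PICs and protocol configurations only produce warnings.
    warnings = [f"PIC {pic} will restart during the upgrade" for pic in unsupported_pics]
    warnings += [
        f"packet loss might occur for {proto} during the upgrade"
        for proto in unsupported_protocols
    ]
    return ValidationResult(ok=not errors, errors=errors, warnings=warnings)
```

A result with `ok` set to `True` can still carry warnings, which mirrors the behavior described above: unsupported PICs and protocol configurations do not prevent the ISSU.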

Note: When upgrading a cluster from a version that does not support encryption to a version that does support encryption, upgrade the first node to the new version. As long as encryption is not configured and enabled, two nodes running different versions can still communicate with each other, and service is not interrupted. Then upgrade the second node to the new version. You can decide whether to turn on the encryption feature after completing the upgrade.

Encryption must be deactivated before downgrading to a version that does not support encryption. Because neither node is then using encryption, communication between the node running the encryption-capable version and the downgraded node does not break.
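The encryption rules above reduce to two simple conditions, sketched here in Python. This is an illustrative model only: the `Node` class and both function names are hypothetical, and the sketch assumes that nodes running mixed versions can communicate whenever encryption is disabled, or when both versions support it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """Hypothetical model of a cluster node and its software capability."""
    name: str
    supports_encryption: bool


def nodes_can_communicate(a: Node, b: Node, encryption_enabled: bool) -> bool:
    """Return True if the two nodes can keep communicating during a mixed-version window."""
    # With encryption off, nodes on different versions still communicate.
    if not encryption_enabled:
        return True
    # Assumption: with encryption on, both nodes must run a version that supports it.
    return a.supports_encryption and b.supports_encryption


def downgrade_allowed(target_supports_encryption: bool, encryption_enabled: bool) -> bool:
    """Encryption must be deactivated before moving to a version that lacks it."""
    return target_supports_encryption or not encryption_enabled
```

During an upgrade in which only the first node has the new version, `nodes_can_communicate` returns `True` as long as encryption stays disabled, which is why the feature is turned on only after both nodes are upgraded.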

Modified: 2015-07-06