Upgrading Contrail Networking with Red Hat OpenStack 13 using ISSU
This document describes how to upgrade Contrail Networking with an in-service software upgrade (ISSU) in an environment using Red Hat OpenStack Platform 13 (RHOSP13).
When to Use This Procedure
Use this procedure to upgrade Contrail Networking when it is running in environments using RHOSP13.
This procedure has been validated for the following Contrail Networking upgrades:
Starting Contrail Networking Release | Target Contrail Networking Upgrade Release |
---|---|
5.0 or 5.1 | 1907 |
1907 | 1908 |
1908 | 1909 |
1909 | 1910 |
1910 | 1911 |
1911 | 1912 |
Starting in Contrail Networking Releases 1912.L0 and 2003, use the Zero Impact Upgrade (ZIU) procedure to upgrade Contrail Networking in environments using Red Hat Openstack orchestration. See Updating Contrail Networking using the Zero Impact Upgrade Process in an Environment using Red Hat Openstack.
If you want to use this procedure to upgrade your Contrail Networking release to other releases, you must engage Juniper Networks professional services. Contact your Juniper representative for information on working with professional services.
Before you begin
Obtain the ContrailImageTag value for your Contrail Networking release. You can obtain this value from the readme files at the following locations:
Contrail Networking Release 20: README Access to Contrail Networking Registry 20xx
Contrail Networking Release 19: README Access to Contrail Registry 19XX
Enable RHEL subscription for the overcloud nodes.
Enable SSH migration for the compute nodes if you do not have Ceph or similar shared storage.
Upgrading the compute nodes requires migrating workloads. Ceph or similar shared storage allows VM migration; without it, migration relies on SSH between the compute nodes.
Modify the MigrationSshKey value in the ~/tripleo-heat-templates/environments/contrail/contrail-services.yaml file.
The MigrationSshKey parameter, which holds the SSH keys for migration, is typically provided during the overcloud deployment. It is used to pass SSH keys between compute nodes so that a VM can migrate from one compute node to another. The parameter is optional and is not included in the contrail-services.yaml file by default, so you might need to add it.
Run the following commands to display the SSH keys:
(undercloud) [stack@queensa ~]$ cat .ssh/id_rsa
(undercloud) [stack@queensa ~]$ cat .ssh/id_rsa.pub
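As an illustration, the following sketch prints a MigrationSshKey fragment built from the key pair displayed above, ready to be merged into contrail-services.yaml by hand. The helper name render_migration_key is hypothetical and not part of TripleO or Contrail. The sketch assumes that MigrationSshKey takes a dictionary with private_key and public_key entries under parameter_defaults; verify this against the templates in your own deployment before using the output.

```shell
# Hypothetical helper: print a MigrationSshKey YAML fragment from a key pair.
# Assumption: the parameter is a dictionary with private_key and public_key.
render_migration_key() {
  private_key_file="$1"
  public_key_file="$2"
  printf 'parameter_defaults:\n'
  printf '  MigrationSshKey:\n'
  printf '    private_key: |\n'
  # Indent each private-key line so it stays inside the YAML block scalar.
  while IFS= read -r line || [ -n "$line" ]; do
    printf '      %s\n' "$line"
  done < "$private_key_file"
  printf '    public_key: %s\n' "$(cat "$public_key_file")"
}

# Example usage on the undercloud (review the output before merging it):
# render_migration_key ~/.ssh/id_rsa ~/.ssh/id_rsa.pub
```

The function only writes to standard output, so the existing contrail-services.yaml file is never modified automatically.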
Back up the Contrail configuration database.
See How to Backup and Restore Contrail Databases in JSON Format.
Procedure
Troubleshoot
The following are known issues:
- Failed upgrade run command for OpenStack controller
- Failed upgrade run command for any overcloud node
Failed upgrade run command for OpenStack controller
Problem
Description
You see the following error:
nodes=overcloud-controller-0
openstack overcloud upgrade run --nodes $nodes --playbook upgrade_steps_playbook.yaml
...
TASK [Enable the cinder_volume cluster resource] *******************************
Thursday 25 July 2019 11:38:57 -0400 (0:00:00.887) 0:03:16.905 *********
FAILED - RETRYING: Enable the cinder_volume cluster resource (5 retries left).
FAILED - RETRYING: Enable the cinder_volume cluster resource (4 retries left).
FAILED - RETRYING: Enable the cinder_volume cluster resource (3 retries left).
FAILED - RETRYING: Enable the cinder_volume cluster resource (2 retries left).
FAILED - RETRYING: Enable the cinder_volume cluster resource (1 retries left).
fatal: [overcloud-controller-0]: FAILED! => {"attempts": 5, "changed": false, "error": "Error: resource 'openstack-cinder-volume' is not running on any node\n", "msg": "Failed, to set the resource openstack-cinder-volume to the state enable", "output": "", "rc": 1}
PLAY RECAP *********************************************************************
overcloud-controller-0 : ok=149 changed=68 unreachable=0 failed=1
Thursday 25 July 2019 11:39:31 -0400 (0:00:34.195) 0:03:51.101 *********
For details, refer to https://access.redhat.com/solutions/4122571.
Solution
Connect to the OpenStack controller node using SSH.
Run the following command:
sudo docker rm cinder_volume_init_bundle
Check whether the cinder volume appears in the failed resources list.
sudo pcs status
Clean up the failed resources so that the cinder volume no longer appears in the failed resources list.
sudo pcs resource cleanup
Re-run the upgrade run command.
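The recovery steps above can be sketched as a small script to run on the OpenStack controller node. Both function names are hypothetical. The check assumes that the failed section of the pcs status output begins with a line starting with the word Failed and that failed entries mention openstack-cinder-volume; adjust the patterns if your output differs. Running the cleanup only when the check matches is an interpretation of the steps, not a requirement stated by them.

```shell
# Hypothetical helper: read `pcs status` output on stdin and succeed when
# openstack-cinder-volume is mentioned after a line that starts with "Failed".
# Assumption: the failed section header begins with the word "Failed".
cinder_volume_failed() {
  awk '/^Failed/ { in_failed = 1; next }
       in_failed && /openstack-cinder-volume/ { found = 1 }
       END { exit found ? 0 : 1 }'
}

# Sketch of the recovery sequence from the steps above.
recover_cinder_volume() {
  # Remove the leftover init bundle container.
  sudo docker rm cinder_volume_init_bundle
  # Clean up resource state only if cinder volume is reported as failed.
  if sudo pcs status | cinder_volume_failed; then
    sudo pcs resource cleanup
  fi
}

# Example usage on the controller node:
# recover_cinder_volume
```

After the script finishes, run sudo pcs status once more to confirm the result before re-running the upgrade run command.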
Failed upgrade run command for any overcloud node
Problem
Description
You see the following error:
******************************************************
TASK [include_tasks] ***********************************************************
Wednesday 02 October 2019 09:21:02 -0400 (0:00:00.448) 0:00:29.101 *****
fatal: [overcloud-novacompute-1]: FAILED! => {"msg": "No variable found with this name: Compute_pre_deployments"}
NO MORE HOSTS LEFT *******************************************************
Solution
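This error is caused by broken default handling: the lookup('vars', ...) call fails when the variable is missing, so the default([]) fallback is never applied.
Edit the tripleo-heat-templates/common/deploy-steps.j2 file to apply the following change: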
(undercloud) [stack@queensa common]$ diff -U 3 deploy-steps.j2.org deploy-steps.j2
--- deploy-steps.j2.org 2019-10-04 09:09:57.414000000 -0400
+++ deploy-steps.j2 2019-10-04 09:13:51.120000000 -0400
@@ -433,7 +433,7 @@
 - include_tasks: deployments.yaml
   vars:
     force: false
-  with_items: "{{ '{{' }} lookup('vars', tripleo_role_name + '_pre_deployments')|default([]) {{ '}}' }}"
+  with_items: "{{ '{{' }} hostvars[inventory_hostname][tripleo_role_name ~ '_pre_deployments']|default([]) {{ '}}' }}"
   tags:
     - overcloud
     - pre_deploy_steps
@@ -521,7 +521,7 @@
 - include_tasks: deployments.yaml
   vars:
     force: false
-  with_items: "{{ '{{' }} lookup('vars', tripleo_role_name + '_post_deployments')|default([]) {{ '}}' }}"
+  with_items: "{{ '{{' }} hostvars[inventory_hostname][tripleo_role_name ~ '_post_deployments']|default([]) {{ '}}' }}"
   tags:
     - overcloud
     - post_deploy_steps
After editing the deploy-steps.j2 file, run the prepare command as given in step 5.6.c. Once it is completed, continue the upgrade procedure where you left off.
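If you prefer to script the deploy-steps.j2 edit instead of making it by hand, one possible approach is a sed substitution, sketched below. The function name patch_deploy_steps is hypothetical, and the sketch assumes that the two with_items lines in your template match the text shown in the diff exactly. It keeps a copy of the original as deploy-steps.j2.org, matching the file name used in the diff.

```shell
# Hypothetical helper: replace the lookup('vars', ...) expressions for the
# pre and post deployments with the hostvars form shown in the diff.
# Assumption: the template lines match the diff text exactly.
patch_deploy_steps() {
  template="$1"
  # Keep the original so the change can be reviewed or reverted.
  cp "$template" "$template.org"
  sed "s/lookup('vars', tripleo_role_name + '_\([a-z]*\)_deployments')/hostvars[inventory_hostname][tripleo_role_name ~ '_\1_deployments']/" \
    "$template.org" > "$template"
}

# Example usage on the undercloud, followed by a review of the result:
# patch_deploy_steps ~/tripleo-heat-templates/common/deploy-steps.j2
# diff -U 3 deploy-steps.j2.org deploy-steps.j2
```

Compare the output of the diff command with the change shown above before running the prepare command, because the substitution silently does nothing if the template text differs.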