Guidance for Migrating to CentOS 7 for NorthStar 6.0.0 and Later
If you are already using CentOS or RHEL 7.x, you do not need these instructions. Instead, follow the installation procedures in the NorthStar Controller/Planner Getting Started Guide to install or upgrade your NorthStar application.
These instructions are intended to assist you in migrating a working NorthStar 5.1.0 three-node cluster running on CentOS or RHEL 6.10 to a NorthStar 5.1.0 three-node cluster on CentOS or RHEL 7.7. This creates an upgrade path from NorthStar 5.1.0 to NorthStar 6.0.0 or later, because CentOS and RHEL 6.x are no longer supported. If you are running on a VM, take a snapshot; otherwise, use your existing backup plan to create a backup before proceeding, because these instructions involve wiping your HDD/SSD and removing all data on those drives.
This guidance assumes familiarity with the NorthStar installation and configuration process. If you have never installed/configured NorthStar before, we recommend you read the NorthStar Getting Started Guide for background, and have it available for reference.
You must upgrade the operating system first because NorthStar 6.0.0 or later installation requires CentOS or RHEL 7.6 or 7.7. The order of these procedures is important:
Back up your data.
The following files should be backed up:
/opt/northstar/data/*.json
/opt/northstar/data/northstar.cfg*
/opt/northstar/data/crpd/juniper.conf*
/opt/pcs/db/sys/npatpw
Output from the /opt/northstar/utils/cmgd_cli -c "show config" command.
Upgrade the operating system to CentOS or RHEL 7.7.
Install NorthStar 5.1.0 on the upgraded operating system.
When all nodes are running CentOS or RHEL 7.7 and NorthStar 5.1.0, upgrade NorthStar to 6.0.0 or later.
Example Scenario
For example purposes, these instructions assume you are migrating from CentOS 6.10 to CentOS 7.7, and your network configuration includes:
Three NorthStar application servers in a cluster
Three analytics servers in a cluster
Three collector nodes
Your actual operating system version and network topology might be different, but the principles still apply.
We recommend backing up your operating system files and directories so you have a reference since some of the files differ between CentOS 6.x and CentOS 7.x. Back up these operating system files and directories, and save them to an external or network drive:
/etc/selinux/config
/etc/sysconfig/
/etc/hosts
/etc/ntp.conf
/etc/resolv.conf
/etc/ssh/
/root/.ssh/
Back up these NorthStar files and directories, and save them to an external or network drive:
/opt/pcs/db/sys/npatpw
/opt/northstar/data/northstar.cfg
/opt/northstar/data/*.json
/opt/northstar/data/junosvm.conf
/opt/northstar/northstar.env
/opt/northstar/thirdparty/netconfd/templates
/opt/northstar/saved_models (if used for saving NorthStar Planner projects)
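As a concrete illustration, backing up these files from a node might look like the following sketch, assuming an external or network drive mounted at /mnt/backup (a placeholder; adjust the paths to your environment):
# Back up the operating system files and directories
mkdir -p /mnt/backup/$(hostname)
tar czvf /mnt/backup/$(hostname)/os-files.tar.gz \
    /etc/selinux/config /etc/sysconfig/ /etc/hosts /etc/ntp.conf \
    /etc/resolv.conf /etc/ssh/ /root/.ssh/
# Back up the NorthStar files and directories (omit saved_models if you do not save NorthStar Planner projects)
tar czvf /mnt/backup/$(hostname)/northstar-files.tar.gz \
    /opt/pcs/db/sys/npatpw /opt/northstar/data/northstar.cfg \
    /opt/northstar/data/*.json /opt/northstar/data/junosvm.conf \
    /opt/northstar/northstar.env /opt/northstar/thirdparty/netconfd/templates \
    /opt/northstar/saved_models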
The Basic Work Flow
For any node, whether it is a NorthStar application node, an analytics node, or a collector node, the work flow to upgrade your operating system while preserving your clusters and data is essentially the same:
Power down one standby node in the cluster setup.
Boot that node from the operating system minimal ISO.
CentOS 7.7 minimal ISO is available here:
http://mirrors.mit.edu/centos/7.7.1908/isos/x86_64/
http://mirrors.tripadvisor.com/centos/7.7.1908/isos/x86_64/
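For example, you might download the minimal image directly from one of these mirrors. The filename below is the usual name of the 7.7.1908 minimal ISO; verify it against the mirror's directory listing, since older point releases are sometimes relocated (for example, to vault.centos.org):
wget http://mirrors.mit.edu/centos/7.7.1908/isos/x86_64/CentOS-7-x86_64-Minimal-1908.iso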
Install the operating system on the node.
Run
yum -y update
to address any critical or security updates.
Install the recommended packages:
yum -y install net-tools bridge-utils ntp wget ksh telnet java-1.8.0-openjdk-headless
Install the NorthStar 5.1.0 application on this same node, setting it up as a standalone host.
Note: For NorthStar application nodes, you will need a new license because the interface names change from ethx to ensx when you upgrade the operating system. You will not need a new license for analytics or collector nodes.
For NorthStar application nodes, launch the web UI at https://northstar_ip_address:8443 to ensure the license is working and you can log in successfully (a quick check example follows this list).
You can check the status of the NorthStar processes by running the
supervisorctl status
command.
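After installing NorthStar 5.1.0 on a rebuilt node, a quick sanity check might look like the following sketch (northstar_ip_address is a placeholder; the -k flag skips certificate validation for the default self-signed certificate):
# Confirm no NorthStar process is in a FATAL or EXITED state
supervisorctl status
# Confirm the web UI answers on port 8443 (expect 200 or a redirect code)
curl -k -s -o /dev/null -w "%{http_code}\n" https://northstar_ip_address:8443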
In this procedure, we have you start with upgrading the operating system on your analytics cluster, then your NorthStar application cluster, and your collector cluster last. However, this order is not a strict requirement. When all nodes in all clusters are running the upgraded operating system and NorthStar 5.1.0, you then upgrade to NorthStar 6.0.0 or later.
Upgrade the Operating System on Your Analytics Nodes
For analytics nodes, Elasticsearch will self-form the cluster and distribute the data per the replication policy. Therefore, there is no need to first delete the node from Elasticsearch history. To migrate your analytics cluster, use the following procedure:
Install CentOS 7.7 on a standby analytics node, including the previously stated recommended packages.
Install NorthStar-Bundle-5.1.0-20191210_220522_bb37a329b_64.x86_64.rpm on the node where you have the freshly installed operating system.
Copy the SSH keys from the existing active node in the analytics cluster and all application nodes to the new analytics node:
ssh-copy-id root@new_analytics_node_ip_address
Working from an existing node in the cluster, add the new analytics node into the cluster:
From net_setup.py, select Analytics Data Collector Setting (G) for external standalone/cluster analytics server setup.
Select Add new Collector node to existing cluster (E).
You can use the previous node’s ID and other setup information.
Once this process is complete for the first node, repeat the steps on the remaining analytics cluster nodes. When all three nodes are done, your analytics cluster will be up and running with CentOS 7.7 and NorthStar 5.1.0.
The following are useful Elasticsearch (REST API) commands you can use before, during and after upgrading your operating system. Run these from an existing node in the analytics cluster.
curl -X GET "localhost:9200/_cluster/health?pretty"
curl -X GET "localhost:9200/_cat/nodes?v"
curl -X GET "localhost:9200/_cat/indices"
curl -X GET "localhost:9200/_cat/shards"
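For example, before moving on to the next analytics node, you might poll cluster health until the status returns to green (a sketch based on the health API above):
# Re-check every 10 seconds; interrupt with Ctrl+C once the status shows "green"
watch -n 10 'curl -s -X GET "localhost:9200/_cluster/health?pretty" | grep -E "status|number_of_nodes"'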
Use the following command to check that all nodes in your analytics cluster are up:
[root@centos-610-analytics1 root]# /opt/northstar/utils/cluster_status.py -u admin -p %password% | grep -v Connection | grep -v OAuth2
ZooKeeper cluster status:
Host Name              IPv4            Mode      Version
centOS-610-analytics1  172.25.153.167  follower  3.5.4-beta-7f51e5b68cf2f80176ff944a9ebd2abbc65e7327, built on 05/11/2018 16
centOS-610-analytics3  172.25.153.70   leader    3.5.4-beta-7f51e5b68cf2f80176ff944a9ebd2abbc65e7327, built on 05/11/2018 16
centOS-610-analytics2  172.25.153.62   follower  3.5.4-beta-7f51e5b68cf2f80176ff944a9ebd2abbc65e7327, built on 05/11/2018 16
Upgrade the Operating System on Your NorthStar Application Nodes
Use the following procedure to upgrade your operating system on the NorthStar application nodes:
For additional background, see the Replace a Failed Node if Necessary section of the NorthStar Getting Started Guide.
Install CentOS 7.7 on one of the NorthStar application standby nodes (server or VM), including the recommended packages listed previously.
Install the NorthStar 5.1.0 application software (NorthStar-Bundle-5.1.0-20191210_220522_bb37a329b_64.x86_64.rpm). It is important to provide the installation script with the same database password that is on the existing nodes. If necessary, you can reset the database passwords on the existing nodes for consistency before adding the node into the cluster.
Copy your npatpw file to /opt/pcs/db/sys/npatpw. Then run the
chown pcs:pcs /opt/pcs/db/sys/npatpw
command.
Update /opt/northstar/netconfd/templates with your backed-up templates.
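For example, restoring the license file from a backup location might look like this sketch (/path/to/backup is a placeholder):
cp /path/to/backup/npatpw /opt/pcs/db/sys/npatpw
chown pcs:pcs /opt/pcs/db/sys/npatpw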
Copy the SSH keys from the existing active node in the NorthStar cluster and all application nodes to the new node:
ssh-copy-id root@new_northstar_node_ip_address
From an existing node in the cluster, delete the knowledge of the CentOS 6.x node from the cluster, then add it back as a new node:
The example below shows identifying the node that needs to be deleted (the one that is down), removing the node from Cassandra, and then observing the output of status commands as the new node is added back into the cluster. UN = up normal, DN = down normal, UJ = up joining. The goal is to replace all nodes and see them return to UN status.
[root@node-1 ~]# . /opt/northstar/northstar.env
[root@node-1 ~]# nodetool status
Datacenter: datacenter1
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address       Load       Tokens  Owns (effective)  Host ID                               Rack
UN  172.16.18.11  1.28 MB    256     100.0%            56ae8cb0-8ee6-4d3a-9cc0-9499faf60a5f  rack1
UN  172.16.18.12  1.3 MB     256     100.0%            c4566fc1-3b31-40ce-adcc-729bbabc174e  rack1
DN  172.16.18.13  2.4 MB     256     100.0%            1cd5aa2f-b8c9-40bb-8aa0-a7c211842c62  rack1

# identify which node needs to be deleted... it will be in Down (D) state
[root@GNAQP13B1 northstar]# nodetool removenode 1cd5aa2f-b8c9-40bb-8aa0-a7c211842c62
[root@GNAQP13B1 northstar]# nodetool status
Datacenter: datacenter1
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address       Load       Tokens  Owns (effective)  Host ID                               Rack
UN  172.16.18.11  1.28 MB    256     100.0%            56ae8cb0-8ee6-4d3a-9cc0-9499faf60a5f  rack1
UN  172.16.18.12  1.31 MB    256     100.0%            c4566fc1-3b31-40ce-adcc-729bbabc174e  rack1

# later, when the node is being added back (track in the Cassandra log on the new node)
[root@GNAQP13B1 northstar]# nodetool status
Datacenter: datacenter1
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address       Load       Tokens  Owns (effective)  Host ID                               Rack
UN  172.16.18.11  1.28 MB    256     100.0%            56ae8cb0-8ee6-4d3a-9cc0-9499faf60a5f  rack1
UN  172.16.18.12  1.95 MB    256     100.0%            c4566fc1-3b31-40ce-adcc-729bbabc174e  rack1
UJ  172.16.18.13  265.45 KB  256     ?                 d068ca2f-9fd4-438f-9df6-6d9c7fa5bdd9  rack1

[root@GNAQP13B1 northstar]# nodetool status
Datacenter: datacenter1
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address       Load       Tokens  Owns (effective)  Host ID                               Rack
UN  172.16.18.11  1.28 MB    256     100.0%            56ae8cb0-8ee6-4d3a-9cc0-9499faf60a5f  rack1
UN  172.16.18.12  1.95 MB    256     100.0%            c4566fc1-3b31-40ce-adcc-729bbabc174e  rack1
UN  172.16.18.13  265.45 KB  256     100.0%            d068ca2f-9fd4-438f-9df6-6d9c7fa5bdd9  rack1
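While the rebuilt node joins the Cassandra ring, you can simply poll nodetool until its state changes from UJ to UN, for example:
# Load the NorthStar environment so nodetool is on the PATH, then poll every 30 seconds
. /opt/northstar/northstar.env
watch -n 30 nodetool status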
It is important that you resynchronize all your SSH keys once you have rebuilt each node, which includes updating the SSH key on your JunosVM.
After the SSH keys are updated on each JunosVM, back up any changes made to the JunosVM by using the net_setup.py script and selecting Option D > Option 1.
From the net_setup.py main menu, select HA Setup (E).
Select Add a new node to existing cluster (J), using the existing node data in the script, and allow HA deployment to complete.
Monitor failover to ensure that it completes properly:
Check the output of the
supervisorctl status
command on the current active node to ensure all processes come up.
Check the cluster status using the following command:
/opt/northstar/utils/cluster_status.py -u admin -p %password%
On the node with the VIP (the active node), test failover using the following command:
supervisorctl restart infra:ha_agent
On the restored node being promoted to active (taking the VIP), use the following command to observe the failover process:
tail -f /opt/northstar/logs/ha_agent.msg
Test the failover process between the three nodes. Optionally, you can add host priority using the net_setup.py script option E (HA Settings).
Run the following command to determine which nodes are currently standby nodes. They should be the two with the higher priority numbers:
/opt/northstar/utils/cluster_status.py -u admin -p %password%
Check the NorthStar web UI again for each node while it is the active node, to make sure the data is synchronized properly between the three nodes.
At this point, you should have a fully-functioning NorthStar 5.1.0 three-node cluster running on the CentOS 7.7 operating system.
Upgrade the Operating System on Your Collector Nodes
Collector nodes operate independently, but are tied to the application VIP. Each one can be removed and reinstalled on its own, so proceed with reinstallation one node at a time.
All three collectors are currently running CentOS 6.10 with NorthStar 5.1.0 (NorthStar-Bundle-5.1.0-20191210_220522_bb37a329b_64.x86_64.rpm).
If you have not already done so, back up the NorthStar files and directories listed previously, and save them to an external or network drive.
Install the CentOS 7.7 operating system minimal installation on any one of the collector nodes.
Install the following recommended packages: net-tools, bridge-utils, wget, ntp, telnet, ksh, java-1.8.0-openjdk-headless.
Bring the system back online with the same IP address. Download the NorthStar 5.1.0 package and install it.
rpm -Uvh NorthStar-Bundle-5.1.0-20191210_220522_bb37a329b_64.x86_64.rpm
Run the collector install script.
cd /opt/northstar/northstar_bundle_5.1.0/ && ./collector.sh install
Config file /opt/northstar/data/northstar.cfg does not exist
copying it from Northstar APP server, please enter below info:
-----------------------------------------------------------------------------------------
Please enter application server IP address or host name: 172.25.153.89   (IP of APP Server or VIP)
Please enter Admin Web UI username: admin
Please enter Admin Web UI password:
retrieving config file from application server...
Saving to /opt/northstar/data/northstar.cfg
Collector installed....
Repeat this process on the remaining collector nodes, one at a time.
Special Notes for Nested JunosVM Nodes
The following additional procedure applies to migrating a nested JunosVM setup:
Copy (back up) the configuration file at /opt/northstar/data/junosvm/junosvm.conf.
Use the net_setup.py script to assign the JunosVM IP address back to the JunosVM.
Copy your backup of junosvm.conf into /opt/northstar/data/junosvm/junosvm.conf.
Restart the JunosVM:
supervisorctl restart junos:junosvm
Observe the JunosVM boot process using this command:
tail -f /opt/northstar/logs/junosvm_telnet.log
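Taken together, restoring and restarting a nested JunosVM might look like this sketch (the backup path is a placeholder):
cp /path/to/backup/junosvm.conf /opt/northstar/data/junosvm/junosvm.conf
supervisorctl restart junos:junosvm
tail -f /opt/northstar/logs/junosvm_telnet.log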
Upgrade All Nodes to NorthStar 6.0.0 or Later
Now that your network and configuration are upgraded to CentOS 7.7, you can proceed with upgrading NorthStar to 6.0.0 or later.
- Analytics Node Upgrade to NorthStar 6.0.0 or Later
- NorthStar Application Node Upgrade to NorthStar 6.0.0 or Later
- Collector Node Upgrade to NorthStar 6.0.0 or Later
Analytics Node Upgrade to NorthStar 6.0.0 or Later
Upgrade the nodes in the analytics cluster using the following procedure:
Determine which nodes are standby versus active using this command:
/opt/northstar/utils/cluster_status.py -u admin -p %password% | grep -v Connection | grep -v OAuth2
Back up any NorthStar files to an external or network directory.
Download the official NorthStar 6.0.0 or later RPM.
Install NorthStar using this command:
yum -y install NorthStar-Bundle-6.x.x-20200427_213714_5096f11f3_41.x86_64.rpm
Install the analytics application using this command:
cd /opt/northstar/northstar_bundle_6.x.x/ && ./install-analytics.sh
Netflowd will be in a FATAL state until the NorthStar application nodes are upgraded and the analytics data collector settings are redeployed, because netflowd cannot communicate with cMGD until then. This is an expected error.
[root@centos-7-analytics3 northstar_bundle_6.x.x]# supervisorctl status
analytics:elasticsearch    RUNNING   pid 14595, uptime 0:19:10
analytics:esauthproxy      RUNNING   pid 14592, uptime 0:19:10
analytics:logstash         RUNNING   pid 14809, uptime 0:18:08
analytics:netflowd         FATAL     Exited too quickly (process log may have details)
analytics:pipeline         RUNNING   pid 14593, uptime 0:19:10
bmp:bmpMonitor             RUNNING   pid 13016, uptime 0:30:57
infra:ha_agent             RUNNING   pid 12656, uptime 0:31:41
infra:healthmonitor        RUNNING   pid 15317, uptime 0:12:50
infra:zookeeper            RUNNING   pid 12653, uptime 0:31:41
listener1:listener1_00     RUNNING   pid 13113, uptime 0:30:26
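To confirm that netflowd is the only process affected on an analytics node, you can filter the status output for anything not in the RUNNING state, for example:
supervisorctl status | grep -v RUNNING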
Repeat this process on the remaining standby nodes, then do the same on the active node.
Check the Zookeeper status of the analytics cluster:
/opt/northstar/utils/cluster_status.py -u admin -p %password% | grep -v Connection | grep -v OAuth2
ZooKeeper cluster status:
Host Name              IPv4            Mode      Version
centOS-610-analytics1  172.25.153.167  follower  3.5.4-beta-7f51e5b68cf2f80176ff944a9ebd2abbc65e7327, built on 05/11/2018 16
centOS-610-analytics3  172.25.153.70   leader    3.5.4-beta-7f51e5b68cf2f80176ff944a9ebd2abbc65e7327, built on 05/11/2018 16
centOS-610-analytics2  172.25.153.62   follower  3.5.4-beta-7f51e5b68cf2f80176ff944a9ebd2abbc65e7327, built on 05/11/2018 16
NorthStar Application Node Upgrade to NorthStar 6.0.0 or Later
Upgrade the NorthStar application nodes using the following procedure:
Back up any NorthStar files on all nodes.
Determine which nodes are standby versus active using this command:
/opt/northstar/utils/cluster_status.py -u admin -p %password%
Start the upgrade procedure on standby nodes first.
Download the official NorthStar 6.0.0 or later RPM.
Install NorthStar using these commands:
yum -y install NorthStar-Bundle-6.x.x-20200427_213714_5096f11f3_41.x86_64.rpm
cd /opt/northstar/northstar_bundle_6.x.x/ && ./install.sh --skip-bridge --yes
Once installation is complete, set the cMGD root password. If this is not done, the cMGD-rest service will continually loop. The requirement to set a cMGD-rest password is due to the addition of the cMGD service in NorthStar 6.0.0.
In net_setup.py, select Maintenance & Troubleshooting (D).
Select Change cMGD Root Password (8).
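To verify that cMGD-rest has stopped restarting after the password change, a quick check might look like this (exact process names can vary slightly by release):
supervisorctl status | grep -i cmgd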
Redeploy the analytics data collector configuration settings so netflowd can communicate with cMGD.
In net_setup.py, select Analytics Data Collector Setting (G) for external standalone/cluster analytics server setup.
Select Prepare and Deploy SINGLE Data Collector Setting (A), Prepare and Deploy HA Analytics Data Collector Setting (B), or Prepare and Deploy GEO-HA Analytics Data Collector Setting (C), whichever you had set up before the upgrade.
Upgrading a standby node should not trigger a failover. Failover should only occur when the active node is upgraded. At that time, the active node should fail over to an already upgraded standby node.
After all standby nodes are upgraded, upgrade the active node to NorthStar 6.0.0 or later.
Once all nodes are upgraded and one of the standby nodes has assumed the active role and VIP, monitor the cluster using the following procedure:
Check the status of the NorthStar processes on the current active node using this command:
supervisorctl status
Check the cluster status using this command:
/opt/northstar/utils/cluster_status.py -u admin -p %password%
On the node with the VIP, test the failover using this command:
supervisorctl restart infra:ha_agent
Use the following command to monitor the progress of the failover on the restored node being promoted to active node (with the VIP):
tail -f /opt/northstar/logs/ha_agent.msg
Optionally, add priority to the nodes using the net_setup.py script, Option E (HA Settings). Test the failover process between the three nodes to ensure the priorities are working properly.
Run the following command to find which nodes are currently standby nodes and ensure that failover is proceeding. The standby nodes should be the two with the higher priority numbers.
/opt/northstar/utils/cluster_status.py -u admin -p %password%
Check the NorthStar web UI again for each node while it is the active node to make sure the data is synchronized properly between the three nodes. Check your nodes, links, LSPs, device profiles, and so on.
At this point you should have a fully functioning 6.0.0 (or later) three-node NorthStar application cluster running on the CentOS 7.7 operating system.
Collector Node Upgrade to NorthStar 6.0.0 or Later
Upgrade your collector nodes using the following procedure:
Back up any NorthStar files to an external or network drive.
Download the official NorthStar RPM.
Install NorthStar.
yum -y install NorthStar-Bundle-6.x.x-20200427_213714_5096f11f3_41.x86_64.rpm
Install the NorthStar Collector Application.
cd /opt/northstar/northstar_bundle_6.x.x/ && ./collector.sh install
Adding config file /opt/northstar/data/northstar.cfg from Northstar APP server,
Please enter below info:
---------------------------------------------------------------------------------------------------------------------------
Please enter application server IP address or host name: 172.25.153.119
Please enter Admin Web UI username: admin
Please enter Admin Web UI password:
Error sending request to: 172.25.153.119
Collector installed....
collector_main: stopped
collector_main: removed process group
collector:worker1: stopped
collector:worker3: stopped
collector:worker2: stopped
collector:worker4: stopped
collector:worker1: started
collector:worker3: started
collector:worker2: started
collector:worker4: started
Repeat this process on all remaining collector nodes. When complete, your collector nodes are running NorthStar 6.0.0 or later on CentOS 7.7.