Edit Cluster Nodes

Use the information provided in this topic to edit operational Paragon Automation cluster nodes. You can use the repair command to add, remove, or replace cluster nodes, and repair failed nodes. The repair process rebuilds the cluster node and restarts the pods in the node.

Edit Primary Nodes in Multi-Primary Node Clusters and Worker Nodes in All Clusters

In clusters with multiple primary nodes, you can add or remove both primary and worker nodes. However, when you add or remove primary nodes, you must ensure that the total number of primary nodes remains an odd number. You must also have a minimum of three primary nodes for high availability in the control plane. Use the following procedure to edit nodes in multi-primary node clusters.

You can also use the same procedure to edit only worker nodes in single-primary node clusters.
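The primary-node rule above (an odd total, with a minimum of three) can be checked before you edit the inventory file. The following is a minimal sketch only; the function name valid_primary_count is hypothetical and is not part of Paragon Automation.

```shell
# Hypothetical helper (not a Paragon Automation command).
# Returns 0 if the planned number of primary nodes is odd and at least three,
# 1 if it breaks that rule, and 2 if the argument is not a whole number.
valid_primary_count() {
    count="${1:-}"
    case "$count" in
        ''|*[!0-9]*) return 2 ;;    # empty or contains a non-digit
    esac
    [ "$count" -ge 3 ] || return 1  # fewer than three primary nodes
    case "$count" in
        *[13579]) return 0 ;;       # last digit is odd
        *) return 1 ;;              # even total
    esac
}
```

For example, valid_primary_count 5 succeeds, while valid_primary_count 4 fails because the total is even.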

  1. Prepare the new node or the replacement node and ensure that it meets all the cluster node prerequisites. See Prepare Ubuntu Cluster Nodes or Prepare RHEL Cluster Nodes, depending on your base OS.
  2. Log in to the node that you want to add or repair.
  3. Disable the udevd daemon.
    1. Check whether udevd is running.

      # systemctl is-active systemd-udevd

    2. If udevd is active, disable it.

      # systemctl mask systemd-udevd --now

  4. Log in to the control host.
  5. If you are adding a node, edit the inventory file to add the IP address of the new node.

    If you are removing a node, edit the inventory file to delete the IP address of the node you want to remove.

    If you are replacing a node, and the IP address of the replacement node is different from the current node, update the inventory file to replace the old node address with the new node address.

    If you are repairing a node and the IP address is unchanged, you need not edit the inventory file.

  6. Run one of the following commands:

    If the node address is unchanged or you are adding or removing a node, use

    ./run -c config-dir repair node-ip-address-or-hostname

    If the node address has changed, use

    ./run -c config-dir repair old-node-ip-address-or-hostname,new-node-ip-address-or-hostname

  7. When a node is repaired or replaced, the Ceph distributed filesystems are not automatically updated. If the data disks were destroyed as part of the repair process, then the object storage daemons (OSDs) hosted on those data disks must be recovered.

    1. Connect to the Ceph toolbox and view the status of OSDs. The ceph-tools script is installed on a primary node. You can log in to the primary node and use the kubectl interface to access ceph-tools. To use a node other than the primary node, you must copy the admin.conf file (in the config-dir on the control host) to that node and set the KUBECONFIG environment variable, for example with the export KUBECONFIG=config-dir/admin.conf command.

      $ ceph-tools
      # ceph osd status

    2. Verify that all OSDs are listed as exists,up. If OSDs are damaged, follow the troubleshooting instructions explained in Troubleshoot Ceph and Rook.

  8. After verifying that all OSDs are created, log in to the node that you added or repaired.
  9. Re-enable udevd on that node.

    systemctl unmask systemd-udevd
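The choice between the two repair command forms in the procedure above can be sketched as a small shell function. This is an illustration only: repair_command is a hypothetical name, and the function prints the command line instead of running it so that you can review it first.

```shell
# Hypothetical helper (not a Paragon Automation command).
# $1 = config directory, $2 = current node address or hostname,
# $3 = optional replacement address or hostname.
# Prints the repair command line; it does not execute it.
repair_command() {
    config_dir="${1:-}"
    old_addr="${2:-}"
    new_addr="${3:-}"
    if [ -z "$config_dir" ] || [ -z "$old_addr" ]; then
        echo "usage: repair_command config-dir node [new-node]" >&2
        return 1
    fi
    if [ -n "$new_addr" ] && [ "$new_addr" != "$old_addr" ]; then
        # Address changed: pass the old and new addresses, comma-separated.
        printf './run -c %s repair %s,%s\n' "$config_dir" "$old_addr" "$new_addr"
    else
        # Address unchanged, or a node is being added or removed.
        printf './run -c %s repair %s\n' "$config_dir" "$old_addr"
    fi
}
```

Run the printed command from the control host, in the directory that contains the run script.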

Edit Primary Nodes in Single-Primary Node Clusters

In single-primary node clusters, you can edit both primary and worker nodes. However, you cannot remove or add primary nodes.

Note:

You can add additional primary nodes only if your existing cluster is already a multiple-primary cluster.

During node repair, you cannot schedule new pods, and existing pods remain nonoperational, resulting in service degradation.

You need the latest version of the etcd-snapshot.db file to restore the primary node in single-primary node clusters.

Note:

The etcd-snapshot.db file is backed up locally in /export/backup/etcd-snapshot.db every five minutes. We recommend that you copy this file to a separate remote location at regular intervals or mount /export/backup/ to an external fileserver.
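One way to follow this recommendation is to copy the snapshot under a timestamped name at regular intervals, for example from a scheduled job. The sketch below assumes that the destination directory is already a mounted remote location; the function name archive_snapshot and the naming scheme for the copies are hypothetical.

```shell
# Hypothetical helper (not a Paragon Automation command).
# $1 = path to the snapshot file, $2 = destination directory
# (assumed here to be a mounted remote location).
# Copies the snapshot under a UTC-timestamped name and prints the new path.
archive_snapshot() {
    src="${1:-}"
    dest_dir="${2:-}"
    [ -f "$src" ] || { echo "snapshot not found: $src" >&2; return 1; }
    [ -d "$dest_dir" ] || { echo "not a directory: $dest_dir" >&2; return 1; }
    stamp=$(date -u +%Y%m%dT%H%M%SZ)
    target="$dest_dir/etcd-snapshot-$stamp.db"
    # -p preserves the modification time of the original snapshot.
    cp -p "$src" "$target" && printf '%s\n' "$target"
}
```

For example, archive_snapshot /export/backup/etcd-snapshot.db followed by the path of your remote mount copies the current snapshot and prints the path of the copy.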

To replace or repair the primary node, you must have the etcd-snapshot.db file available.

  1. Log in to the node that you want to replace or repair.
  2. Disable the udevd process.
    1. Check whether udevd is running.

      # systemctl is-active systemd-udevd

    2. If udevd is active, disable it.

      # systemctl mask systemd-udevd --now

  3. Log in to the control host.
  4. Copy the etcd-snapshot.db file to the control host or restore the external /export/backup/ mount.
  5. Run one of the following commands to replace or repair the node:

    If the node address is unchanged, use

    ./run -c config-dir repair node-ip-address-or-hostname -e etcd_backup=path-to-etcd-snapshot.db

    If the node address has changed, use

    ./run -c config-dir repair old-node-ip-address-or-hostname,new-node-ip-address-or-hostname -e etcd_backup=path-to-etcd-snapshot.db

  6. When a node is repaired or replaced, the Ceph distributed filesystems are not automatically updated. If the data disks were destroyed as part of the repair process, then the object storage daemons (OSDs) hosted on those data disks must be recovered.

    1. Connect to the Ceph toolbox and view the status of OSDs. The ceph-tools script is installed on a primary node. You can log in to the primary node and use the kubectl interface to access ceph-tools. To use a node other than the primary node, you must copy the admin.conf file (in the config-dir on the control host) to that node and set the KUBECONFIG environment variable, for example with the export KUBECONFIG=config-dir/admin.conf command.

      $ ceph-tools
      # ceph osd status

    2. Verify that all OSDs are listed as exists,up. If OSDs are damaged, follow the troubleshooting instructions explained in Troubleshoot Ceph and Rook.

  7. After verifying that all OSDs are created, log in to the node that you repaired or replaced.
  8. Re-enable udevd on that node.

    systemctl unmask systemd-udevd
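Both procedures ask you to verify that every OSD is listed as exists,up before you re-enable udevd. If you save the output of ceph osd status to a file, a check along the following lines could automate that verification. This is a sketch under an assumption: it treats every line whose first column is a numeric OSD ID as one OSD row, with or without table borders, and the exact layout can differ between Ceph releases. The function name all_osds_up is hypothetical.

```shell
# Hypothetical helper (not part of ceph-tools).
# Reads saved "ceph osd status" output on standard input.
# Assumption: each OSD row starts with a numeric ID, optionally after a "|".
# Succeeds only if at least one OSD row exists and every row shows exists,up.
all_osds_up() {
    awk '
        { gsub(/\|/, " ") }               # drop table borders, re-split fields
        $1 ~ /^[0-9]+$/ {
            total++
            if ($0 !~ /exists,up/) bad++  # row is missing the expected state
        }
        END { exit ((total > 0 && bad == 0) ? 0 : 1) }
    '
}
```

For example, all_osds_up < osd-status.txt returns a nonzero exit status if any OSD row lacks exists,up or if no OSD rows are found. In that case, follow Troubleshoot Ceph and Rook.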