Edit Cluster Nodes
Use the information provided in this topic to edit operational Paragon Automation cluster nodes. You can use the repair command to add, remove, or replace cluster nodes, and to repair failed nodes. The repair process rebuilds the cluster node and restarts the pods in the node.
Edit Primary Nodes in Multi-Primary Node Clusters and Worker Nodes in All Clusters
In clusters with multiple primary nodes, you can edit both primary and worker nodes. You can add or remove primary and worker nodes. However, when you add or remove primary nodes, you must ensure that the total number of primary nodes remains an odd number. You must also have a minimum of three primary nodes for high availability in the control plane. Use the following procedure to edit nodes in multi-primary node clusters.
You can also use the same procedure to edit only worker nodes in single-primary node clusters.
- Prepare the new node or the replacement node and ensure that it meets all the cluster node prerequisites. See Prepare CentOS Cluster Nodes or Prepare Ubuntu Cluster Nodes depending on your base OS.
- Log in to the node that you want to add or repair.
- Disable the udevd daemon (sample output is shown after these substeps).
  - Check whether udevd is running.
    # systemctl is-active systemd-udevd
  - If udevd is active, disable it.
    # systemctl mask systemd-udevd --now
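  For reference, on a node where udevd is still running, the check returns active; after the mask command, the same check returns inactive or masked. The output below is illustrative only:
    # systemctl is-active systemd-udevd
    active
    # systemctl mask systemd-udevd --now
    # systemctl is-active systemd-udevd
    inactive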
- Log in to the control host.
- Edit the inventory file, if required. A hypothetical inventory sketch follows this list.
  - If you are adding a node, add the IP address of the new node to the inventory file.
  - If you are removing a node, delete the IP address of the node that you want to remove.
  - If you are replacing a node and the IP address of the replacement node is different from that of the current node, replace the old node address with the new node address.
  - If you are repairing a node and the IP address is unchanged, you need not edit the inventory file.
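  As an illustration only, an inventory file typically lists node addresses under host groups. The group names, layout, and addresses below are hypothetical and may not match your installation; always edit the inventory file that was generated in your config-dir.
    [master]
    10.1.2.11
    10.1.2.12
    10.1.2.13

    [node]
    10.1.2.21
    10.1.2.22    # add, remove, or replace the address of the affected node here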
- Run one of the following commands. Example invocations follow this step.
  - If the node address is unchanged, or if you are adding or removing a node, use
    ./run -c config-dir repair node-ip-address-or-hostname
  - If the node address has changed, use
    ./run -c config-dir repair old-node-ip-address-or-hostname,new-node-ip-address-or-hostname
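  For example, assuming a configuration directory named config and hypothetical node addresses, the commands look like the following. Here 10.1.2.22 is the node being added, removed, or repaired, and 10.1.2.23 is a replacement node with a new address:
    # ./run -c config repair 10.1.2.22
    # ./run -c config repair 10.1.2.22,10.1.2.23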
- When a node is repaired or replaced, the Ceph distributed filesystems are not automatically updated. If the data disks were destroyed as part of the repair process, then the object storage daemons (OSDs) hosted on those data disks must be recovered.
- Connect to the Ceph toolbox and view the status of the OSDs. The ceph-tools script is installed on a primary node. You can log in to the primary node and use the kubectl interface to access ceph-tools. To use a node other than the primary node, you must copy the admin.conf file (in the config-dir on the control host) and set the KUBECONFIG environment variable, or use the export KUBECONFIG=config-dir/admin.conf command.
  $ ceph-tools
  # ceph osd status
- Verify that all OSDs are listed as exists,up, as in the illustrative output after this step. If any OSDs are damaged, follow the troubleshooting instructions in Troubleshoot Ceph and Rook.
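  The exact format of the ceph osd status output varies by Ceph release, but a healthy three-OSD cluster looks roughly like the following. The IDs, hostnames, and sizes are placeholders:
    ID  HOST    USED  AVAIL  WR OPS  WR DATA  RD OPS  RD DATA  STATE
     0  node1  1026M  98.9G      0        0       0        0   exists,up
     1  node2  1026M  98.9G      0        0       0        0   exists,up
     2  node3  1026M  98.9G      0        0       0        0   exists,up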
- Log in to the node that you added or repaired after verifying that all OSDs are created.
- Reenable udevd on that node.
  # systemctl unmask systemd-udevd
Edit Primary Nodes in Single-Primary Node Clusters
In single-primary node clusters, you can edit both the primary node and worker nodes. However, you cannot add or remove primary nodes; you can add primary nodes only if your existing cluster is already a multiple-primary cluster.
During node repair, you cannot schedule new pods, and existing pods remain nonoperational, resulting in service degradation.
You need the latest version of the etcd-snapshot.db file to restore the primary node in single-primary node clusters.
The etcd-snapshot.db file is backed up locally in /export/backup/etcd-snapshot.db every five minutes. We recommend that you copy this file to a separate remote location at regular intervals, or mount /export/backup/ to an external fileserver. Before you replace or repair the primary node, ensure that the etcd-snapshot.db file is available.
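As an illustration, you could copy the latest snapshot to a remote host at regular intervals; the remote username, hostname, and destination path here are placeholders:
  # scp /export/backup/etcd-snapshot.db backup-user@remote-host:/backups/etcd-snapshot.db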
- Log in to the node that you want to replace or repair.
- Disable the udevd daemon.
  - Check whether udevd is running.
    # systemctl is-active systemd-udevd
  - If udevd is active, disable it.
    # systemctl mask systemd-udevd --now
- Log in to the control host.
- Copy the etcd-snapshot.db file to the control host or restore the external /export/backup/ mount.
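  For example, if the snapshot was copied to a hypothetical remote backup host, you could pull it back to the control host like this (the username, hostname, and paths are placeholders):
    # scp backup-user@remote-host:/backups/etcd-snapshot.db /root/etcd-snapshot.db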
- Run one of the following commands to replace or repair the node. An example invocation follows this step.
  - If the node address is unchanged, use
    ./run -c config-dir repair node-ip-address-or-hostname -e etcd_backup=path-to-etcd-snapshot.db
  - If the node address has changed, use
    ./run -c config-dir repair old-node-ip-address-or-hostname,new-node-ip-address-or-hostname -e etcd_backup=path-to-etcd-snapshot.db
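  For example, with a configuration directory named config, a hypothetical primary node at 10.1.2.11, and the snapshot available at /root/etcd-snapshot.db, the command would look like this:
    # ./run -c config repair 10.1.2.11 -e etcd_backup=/root/etcd-snapshot.db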
- When a node is repaired or replaced, the Ceph distributed filesystems are not automatically updated. If the data disks were destroyed as part of the repair process, then the object storage daemons (OSDs) hosted on those data disks must be recovered.
- Connect to the Ceph toolbox and view the status of the OSDs. The ceph-tools script is installed on a primary node. You can log in to the primary node and use the kubectl interface to access ceph-tools. To use a node other than the primary node, you must copy the admin.conf file (in the config-dir on the control host) and set the KUBECONFIG environment variable, or use the export KUBECONFIG=config-dir/admin.conf command.
  $ ceph-tools
  # ceph osd status
- Verify that all OSDs are listed as exists,up. If any OSDs are damaged, follow the troubleshooting instructions in Troubleshoot Ceph and Rook.
- Log in to the node that you added or repaired after verifying that all OSDs are created.
- Reenable udevd on that node.
  # systemctl unmask systemd-udevd