Back Up and Restore Contrail Etcd
SUMMARY Learn how to back up and restore the Contrail etcd database.
In release 22.1, Contrail stores its data in the main OpenShift etcd database. When you back up and restore the main etcd database in release 22.1, you implicitly back up and restore Contrail data.
Starting in release 22.2, Contrail stores its data in its own etcd database. Use the procedures in this section to back up and restore the Contrail etcd database.
Back Up the Contrail Etcd Database in Release 22.2
Use this example procedure in release 22.2 to back up the Contrail etcd database. In release 22.2, you run etcdctl commands in the contrail-etcd pods themselves.
-
Get a list of the contrail-etcd pods.
user@ai-client:~# kubectl get pods -A -o wide | grep contrail-etcd
Take note of the contrail-etcd pod names and IP addresses. You will refer to these names and IP addresses in the next step.
-
Back up the etcd database.
-
Get a shell into one of the contrail-etcd pods.
For example:
kubectl exec -it contrail-etcd-xxx -c contrail-etcd -n contrail-system -- sh
where contrail-etcd-xxx is the etcd pod that you want to get a shell into.
-
Back up the etcd database.
This example saves the database to /tmp/etcdbackup.db.
etcdctl snapshot save /tmp/etcdbackup.db --endpoints=<etcd-pod-ip>:<etcd-port>
exit
where <etcd-pod-ip> is the IP address of the pod and <etcd-port> is the port that etcd is listening on (by default, 12379).
-
Copy the database to a safe location.
For example:
user@ai-client:~# kubectl cp contrail-system/contrail-etcd-xxx:/tmp/etcdbackup.db -c contrail-etcd ./etcdbackup.db
where contrail-etcd-xxx is the etcd pod where you backed up the database.
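The interactive steps above can also be expressed as two non-interactive commands run from the client. The following is a minimal sketch, not part of the documented procedure: the function name print_backup_commands and the sample pod IP address are hypothetical, and the function only prints the commands (reusing the pod, container, namespace, and paths from the steps above) so that you can review them before running anything.

```shell
#!/bin/sh
# Print (do not run) the release 22.2 backup commands for one contrail-etcd pod.
# Arguments: pod name, pod IP address, etcd port (defaults to 12379, as noted above).
print_backup_commands() {
    pod="$1"
    ip="$2"
    port="${3:-12379}"
    # Take the snapshot inside the pod without opening an interactive shell.
    echo "kubectl exec $pod -c contrail-etcd -n contrail-system -- etcdctl snapshot save /tmp/etcdbackup.db --endpoints=$ip:$port"
    # Copy the snapshot from the pod to the current directory.
    echo "kubectl cp contrail-system/$pod:/tmp/etcdbackup.db -c contrail-etcd ./etcdbackup.db"
}

# Example with a hypothetical pod name and IP address:
print_backup_commands contrail-etcd-0 10.0.0.11
```

Printing rather than executing keeps the sketch safe to try on a machine without cluster access. Whether etcdctl behaves the same under kubectl exec as in an interactive shell is an assumption you should verify in your environment.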
Back Up the Contrail Etcd Database in Release 22.3
Use this example procedure in release 22.3 to back up the Contrail etcd database. In release 22.3, you run etcdctl commands on the control plane nodes.
-
Install etcdctl on all control plane nodes.
-
Log in to one of the control plane nodes.
For example:
ssh core@172.16.0.11
-
Download etcd. This example downloads to the /tmp directory.
ETCD_VER=v3.4.13
curl -L https://storage.googleapis.com/etcd/${ETCD_VER}/etcd-${ETCD_VER}-linux-amd64.tar.gz -o /tmp/etcd-${ETCD_VER}-linux-amd64.tar.gz
-
Untar and move the etcdctl executable to a directory in your path (for example, /usr/local/bin).
tar -xzvf /tmp/etcd-${ETCD_VER}-linux-amd64.tar.gz -C /tmp
sudo mv /tmp/etcd-${ETCD_VER}-linux-amd64/etcdctl /usr/local/bin
-
Check that you've installed etcdctl.
[core@ocp1]$ etcdctl version
etcdctl version: 3.4.13
API version: 3.4
- Repeat on all the control plane nodes.
-
Get a list of the contrail-etcd pods.
user@ai-client:~# kubectl get pods -A -o wide | grep contrail-etcd
Take note of the contrail-etcd pod names, the IP addresses, and the nodes they're running on. You'll need this information in the next few steps.
-
Copy the etcd certificate and key files from the pods to the control plane nodes.
We run kubectl on the control plane nodes in this step. We assume you've set up kubeconfig on the control plane nodes in its default location (~/.kube/config).
- Pick a contrail-etcd pod (for example, contrail-etcd-0) and log in to the control plane node that's hosting that pod.
-
Copy the certificate and key files from that contrail-etcd pod to the hosting control plane node.
In this example, we're copying the certificate and key files from the contrail-etcd-0 pod to ca.crt, tls.crt, and tls.key in the current directory on this control plane node.
kubectl exec --namespace contrail-system contrail-etcd-0 -c contrail-etcd -- cat /etc/member-tls/ca.crt > ./ca.crt
kubectl exec --namespace contrail-system contrail-etcd-0 -c contrail-etcd -- cat /etc/member-tls/tls.crt > ./tls.crt
kubectl exec --namespace contrail-system contrail-etcd-0 -c contrail-etcd -- cat /etc/member-tls/tls.key > ./tls.key
- Repeat for each contrail-etcd pod.
-
Back up the etcd database on one of the control plane nodes. You only need to back up the database on one node.
- Log back in to one of the control plane nodes.
-
Back up the etcd database.
This example saves the database to /tmp/etcdbackup.db on this control plane node.
etcdctl snapshot save /tmp/etcdbackup.db --endpoints=<etcd-pod-ip>:<etcd-port> --cacert=ca.crt --cert=tls.crt --key=tls.key
where <etcd-pod-ip> is the IP address of the pod on this node and <etcd-port> is the port that etcd is listening on (by default, 12379).
- Copy the database to a safe location.
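Because the certificate copy step redirects the output of kubectl exec into local files, a failed kubectl exec can still leave an empty ca.crt, tls.crt, or tls.key behind. As a sketch only (the function name check_tls_files is hypothetical and not part of the documented procedure), you could confirm that the three files exist and are non-empty before running etcdctl snapshot save:

```shell
#!/bin/sh
# Return 0 only if ca.crt, tls.crt, and tls.key exist and are non-empty
# in the given directory (defaults to the current directory).
check_tls_files() {
    dir="${1:-.}"
    status=0
    for f in ca.crt tls.crt tls.key; do
        if [ ! -s "$dir/$f" ]; then
            echo "missing or empty: $dir/$f" >&2
            status=1
        fi
    done
    return "$status"
}

# Example: check the current directory before taking the snapshot.
if check_tls_files .; then
    echo "TLS files look present"
else
    echo "TLS files are missing or empty"
fi
```

This check only tests for presence and size. It does not validate that the files are well-formed certificates or that they match the etcd cluster.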
Restore the Contrail Etcd Database in Release 22.2
-
Copy the snapshot to all the contrail-etcd pods.
kubectl cp etcdbackup.db contrail-system/contrail-etcd-xxx:/tmp/etcdbackup.db -c contrail-etcd
Repeat for the other contrail-etcd pods.
-
Restore the snapshot.
-
Get a shell into one of the contrail-etcd pods.
For example:
kubectl exec -it contrail-etcd-xxx -n contrail-system -- sh
where contrail-etcd-xxx is the etcd pod that you want to get a shell into.
-
Restore the etcd database.
This creates a <contrail-etcd-xxx>.etcd directory on the pod.
ETCDCTL_API=3 etcdctl snapshot restore /tmp/etcdbackup.db \
  --name=<contrail-etcd-xxx> \
  --initial-cluster=<contrail-etcd-xxx>=https://<contrail-etcd-xxx-ip>:12380,<contrail-etcd-yyy>=https://<contrail-etcd-yyy-ip>:12380,<contrail-etcd-zzz>=https://<contrail-etcd-zzz-ip>:12380 \
  --initial-advertise-peer-urls=https://<contrail-etcd-xxx-ip>:12380
exit
where <contrail-etcd-xxx> is the name of the contrail-etcd pod that you're currently in and <contrail-etcd-xxx-ip> is the IP address of that pod. The <contrail-etcd-yyy> and <contrail-etcd-zzz> placeholders refer to the other contrail-etcd pods.
-
Repeat for the other contrail-etcd pods, substituting the --name and --initial-advertise-peer-urls values with the respective contrail-etcd pod name and IP address.
-
Copy the saved etcd data from the contrail-etcd pods to their respective control plane nodes.
- SSH into one of the control plane nodes.
-
Copy the saved contrail-etcd-xxx.etcd directory from the respective contrail-etcd pod to the node.
For example:
kubectl cp contrail-system/contrail-etcd-xxx:/app/cmd/operator/operator.runfiles/cn2/contrail-etcd-xxx.etcd -c contrail-etcd ./contrail-etcd-xxx.etcd
exit
where contrail-etcd-xxx is the name of the contrail-etcd pod on the control plane node that you logged in to.
- Repeat for the other control plane nodes.
-
Stop the contrail-etcd pods.
This sets the replicas to 0, which effectively stops the pods.
user@ai-client:~# kubectl patch etcds.datastore.juniper.net contrail-etcd -n contrail-system --type=merge -p '{"spec": {"common": {"replicas": 0}}}'
-
Replace contrail-etcd data with the data from the snapshot.
- SSH into one of the control plane nodes.
-
Replace the data.
sudo rm -rf /var/lib/contrail-etcd/snapshots
sudo mv /var/lib/contrail-etcd/etcd/member /var/lib/contrail-etcd/etcd/member.bak
sudo mv contrail-etcd-xxx.etcd/member /var/lib/contrail-etcd/etcd/
exit
where contrail-etcd-xxx is the name of the contrail-etcd pod on the control plane node that you logged in to.
- Repeat for the other control plane nodes.
-
Start the contrail-etcd pods.
This sets the replicas to 3, which effectively starts the pods.
user@ai-client:~# kubectl patch etcds.datastore.juniper.net contrail-etcd -n contrail-system --type=merge -p '{"spec": {"common": {"replicas": 3}}}'
-
Restart the contrail-system apiserver and controller.
Delete all the contrail-k8s-apiserver and contrail-k8s-controller pods.
kubectl delete pod <contrail-k8s-apiserver-xxx> -n contrail-system
kubectl delete pod <contrail-k8s-controller-xxx> -n contrail-system
These pods will automatically restart.
-
Restart the vrouters.
Delete all the contrail-vrouter-masters and contrail-vrouter-nodes pods.
kubectl delete pod <contrail-vrouter-masters-xxx> -n contrail
kubectl delete pod <contrail-vrouter-nodes-xxx> -n contrail
These pods will automatically restart.
-
Check that all pods are in running state.
user@ai-client:~# kubectl get pods -n contrail-system
user@ai-client:~# kubectl get pods -n contrail
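The --initial-cluster value used in the restore step of this procedure is easy to mistype because it repeats the pod name, IP address, and peer port for every member. The sketch below assembles that string from name=IP pairs. The function name build_initial_cluster and the sample IP addresses are hypothetical; the peer port 12380 is taken from the restore example above.

```shell
#!/bin/sh
# Build an etcd --initial-cluster value from "name=ip" arguments.
build_initial_cluster() {
    peer_port=12380
    result=""
    for member in "$@"; do
        name="${member%%=*}"   # text before the first '='
        ip="${member#*=}"      # text after the first '='
        entry="$name=https://$ip:$peer_port"
        if [ -z "$result" ]; then
            result="$entry"
        else
            result="$result,$entry"
        fi
    done
    printf '%s\n' "$result"
}

# Example with hypothetical IP addresses:
build_initial_cluster contrail-etcd-0=10.0.0.11 contrail-etcd-1=10.0.0.12 contrail-etcd-2=10.0.0.13
```

Generating the string once and pasting it into each restore command helps keep the member list identical on every pod, which the restore relies on.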
Restore the Contrail Etcd Database in Release 22.3
-
Copy the snapshot you want to restore to all the control plane nodes.
The steps below assume you've copied the snapshot to /tmp/etcdbackup.db on all the control plane nodes.
-
Restore the snapshot.
- Log in to one of the control plane nodes. In this example, we're logging in to the control plane node that is hosting contrail-etcd-0.
-
Restore the etcd database to the contrail-etcd-0 pod on this control plane node.
This creates a contrail-etcd-0.etcd directory on the node.
ETCDCTL_API=3 etcdctl snapshot restore /tmp/etcdbackup.db \
  --name=contrail-etcd-0 \
  --initial-cluster=contrail-etcd-0=https://<contrail-etcd-0-ip>:12380,contrail-etcd-1=https://<contrail-etcd-1-ip>:12380,contrail-etcd-2=https://<contrail-etcd-2-ip>:12380 \
  --initial-advertise-peer-urls=https://<contrail-etcd-0-ip>:12380 \
  --cacert=ca.crt --cert=tls.crt --key=tls.key
where --name=contrail-etcd-0 specifies that this command is restoring the database to contrail-etcd-0, --initial-cluster=... lists all the contrail-etcd members in the cluster, and --initial-advertise-peer-urls=... refers to the IP address and port number that the contrail-etcd-0 pod is listening on.
-
Repeat for the other contrail-etcd pods on their respective control plane nodes, substituting the --name and --initial-advertise-peer-urls values with the respective contrail-etcd pod name and IP address.
-
Stop the contrail-etcd pods.
This sets the replicas to 0, which effectively stops the pods.
user@ai-client:~# kubectl patch etcds.datastore.juniper.net contrail-etcd -n contrail-system --type=merge -p '{"spec": {"common": {"replicas": 0}}}'
-
Replace contrail-etcd data with the data from the snapshot.
- SSH into one of the control plane nodes.
-
Replace the data. Recall that the snapshot is stored in the contrail-etcd-<xxx>.etcd directory.
sudo rm -rf /var/lib/contrail-etcd/snapshots
sudo mv /var/lib/contrail-etcd/etcd/member /var/lib/contrail-etcd/etcd/member.bak
sudo mv contrail-etcd-<xxx>.etcd/member /var/lib/contrail-etcd/etcd/
where contrail-etcd-<xxx> is the name of the contrail-etcd pod on the control plane node that you logged in to.
- Repeat for the other control plane nodes.
-
Start the contrail-etcd pods.
This sets the replicas to 3, which effectively starts the pods.
user@ai-client:~# kubectl patch etcds.datastore.juniper.net contrail-etcd -n contrail-system --type=merge -p '{"spec": {"common": {"replicas": 3}}}'
-
Restart the contrail-system apiserver and controller.
Delete all the contrail-k8s-apiserver and contrail-k8s-controller pods.
kubectl delete pod <contrail-k8s-apiserver-xxx> -n contrail-system
kubectl delete pod <contrail-k8s-controller-xxx> -n contrail-system
These pods will automatically restart.
-
Restart the vrouters.
Delete all the contrail-vrouter-masters and contrail-vrouter-nodes pods.
kubectl delete pod <contrail-vrouter-masters-xxx> -n contrail
kubectl delete pod <contrail-vrouter-nodes-xxx> -n contrail
These pods will automatically restart.
-
Check that all pods are in running state.
user@ai-client:~# kubectl get pods -n contrail-system
user@ai-client:~# kubectl get pods -n contrail
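To spot pods that have not come back after the restore, you can filter the kubectl get pods output instead of scanning it by eye. This is a sketch under assumptions: the function name not_running is hypothetical, it assumes the default kubectl get pods column layout (NAME, READY, STATUS, and so on), and the sample pod lines are made up for illustration.

```shell
#!/bin/sh
# Read `kubectl get pods -n <namespace>` output on stdin and print the
# name and status of every pod whose STATUS column is not "Running".
# The header line (first line) is skipped.
not_running() {
    awk 'NR > 1 && $3 != "Running" { print $1, $3 }'
}

# Typical use (requires cluster access):
#   kubectl get pods -n contrail-system | not_running
#   kubectl get pods -n contrail | not_running

# Example with made-up sample output:
not_running <<'EOF'
NAME                        READY   STATUS    RESTARTS   AGE
contrail-etcd-0             1/1     Running   0          5m
contrail-k8s-apiserver-abc  0/1     Pending   0          10s
EOF
```

No output means every listed pod reports Running. Pods in other states, including Completed, are listed, so treat the result as a prompt for a closer look rather than proof of a failure.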