Replace a master node
Steps to replace a master node of an OpenShift 4 cluster on cloudscale.
Starting situation
- You already have an OpenShift 4 cluster on cloudscale.
- You have admin-level access to the cluster.
- You want to replace a master node of the cluster.
Prerequisites
The following CLI utilities need to be available locally:
- kubectl
- oc
- git
- jq
- yq
- vault (Vault CLI)
Preparation
- Update the master node etcd-N DNS records to have a lower TTL, so the Puppet LB HAProxy backends get updated quicker (the dig sketch after this list can be used to verify the change):

```
etcd-0 300 IN A 172.18.200.a
etcd-1 300 IN A 172.18.200.b
etcd-2 300 IN A 172.18.200.c
```

  The per-record TTL (the `300` before the record type) is in seconds. Ideally this is done at least an hour (one regular TTL for the zone) before actually starting to replace nodes, so that records with the regular TTL have time to expire.

- Set downtime for the "HAProxy socket" check for the cluster's Puppet LBs in Icinga2.
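A minimal sketch to confirm the lowered TTL is actually being served; the zone name `example.com` is an assumption here, substitute the cluster's real zone. `dig +noall +answer` prints the remaining TTL in the second column:

```
# Hypothetical zone name; replace with the zone that holds the etcd records
for n in 0 1 2; do
  dig +noall +answer "etcd-${n}.example.com" A
done
```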
Set up Terraform credentials

- Ensure your KUBECONFIG points to the target cluster:

```
kubectl cluster-info
kubectl get nodes
```

- Set up access to VSHN systems.

  Access to VSHN GitLab:

```
# From https://git.vshn.net/-/user_settings/personal_access_tokens, "api" scope is sufficient
export GITLAB_TOKEN=<gitlab-api-token>
export GITLAB_USER=<gitlab-user-name>
```

  Access to VSHN Lieutenant:

```
# For example: https://api.syn.vshn.net
# IMPORTANT: do NOT add a trailing `/`. Commands below will fail.
export COMMODORE_API_URL=<lieutenant-api-endpoint>

# Set Project Syn cluster and tenant ID
export CLUSTER_ID=<lieutenant-cluster-id> # Looks like: c-<something>
export TENANT_ID=$(curl -sH "Authorization: Bearer $(commodore fetch-token)" ${COMMODORE_API_URL}/clusters/${CLUSTER_ID} | jq -r .tenant)
```

  Configuration for hieradata commits:

```
export GIT_AUTHOR_NAME=$(git config --global user.name)
export GIT_AUTHOR_EMAIL=$(git config --global user.email)
# Use your personal SERVERS API token from https://control.vshn.net/tokens
export TF_VAR_control_vshn_net_token=<control-vshn-net-token>
```
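Before continuing, it may be worth a quick sanity check that the GitLab token works; this sketch simply asks the standard GitLab API v4 `/user` endpoint who you are:

```
# Should print your GitLab user name, not an authentication error
curl -sH "Authorization: Bearer ${GITLAB_TOKEN}" \
  https://git.vshn.net/api/v4/user | jq -r .username
```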
- Extract the cloudscale token from the cluster:

```
export CLOUDSCALE_API_TOKEN=$(kubectl --as=system:admin -n openshift-machine-api \
  get secret cloudscale-rw-token -ogo-template='{{.data.token|base64decode}}')
```

  You can also fetch the token from Vault:

```
export VAULT_ADDR=https://vault-prod.syn.vshn.net
vault login -method=oidc
export CLOUDSCALE_API_TOKEN=$(vault kv get -format=json \
  clusters/kv/${TENANT_ID}/${CLUSTER_ID}/cloudscale | \
  jq -r '.data.data.token')
```
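To verify the token before handing it to Terraform, a hedged sketch against the cloudscale API (the `/v1/servers` endpoint and the `name` field are part of the public cloudscale.ch API; listing servers only needs read access):

```
# Should list the cluster's servers, not an authentication error
curl -sH "Authorization: Bearer ${CLOUDSCALE_API_TOKEN}" \
  https://api.cloudscale.ch/v1/servers | jq -r '.[].name'
```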
- Compile the catalog. We use `commodore catalog compile` here to fetch the cluster catalog. You can also use an existing up-to-date catalog checkout.

```
commodore catalog compile "${CLUSTER_ID}"
```
- Configure Terraform:

```
cat > terraform.env <<EOF
CLOUDSCALE_API_TOKEN
GIT_AUTHOR_NAME
GIT_AUTHOR_EMAIL
EOF
```

  The file lists only variable names: for names-only entries, `docker run --env-file` passes the variables through with their values taken from your current shell environment (see the sketch below).
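A minimal demonstration of that pass-through behavior, assuming Docker and the `alpine` image are available locally:

```
# FOO has no value in the env-file, so Docker copies it from the host shell
export FOO=from-the-host
echo FOO > demo.env
docker run --rm --env-file demo.env alpine env | grep FOO
# Prints: FOO=from-the-host
rm demo.env
```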
- Set up Terraform.

  Prepare the Terraform execution environment:

```
# Set the Terraform image and tag to be used
tf_image=$(\
  yq eval ".parameters.openshift4_terraform.images.terraform.image" \
  dependencies/openshift4-terraform/class/defaults.yml)
tf_tag=$(\
  yq eval ".parameters.openshift4_terraform.images.terraform.tag" \
  dependencies/openshift4-terraform/class/defaults.yml)

# Generate the terraform alias
base_dir=$(pwd)
alias terraform='touch .terraformrc; docker run -it --rm \
  -e REAL_UID=$(id -u) \
  -e TF_CLI_CONFIG_FILE=/tf/.terraformrc \
  --env-file ${base_dir}/terraform.env \
  -w /tf \
  -v $(pwd):/tf \
  --ulimit memlock=-1 \
  "${tf_image}:${tf_tag}" /tf/terraform.sh'

export GITLAB_REPOSITORY_URL=$(curl -sH "Authorization: Bearer $(commodore fetch-token)" ${COMMODORE_API_URL}/clusters/${CLUSTER_ID} | jq -r '.gitRepo.url' | sed 's|ssh://||; s|/|:|')
export GITLAB_REPOSITORY_NAME=${GITLAB_REPOSITORY_URL##*/}
export GITLAB_CATALOG_PROJECT_ID=$(curl -sH "Authorization: Bearer ${GITLAB_TOKEN}" "https://git.vshn.net/api/v4/projects?simple=true&search=${GITLAB_REPOSITORY_NAME/.git}" | jq -r ".[] | select(.ssh_url_to_repo == \"${GITLAB_REPOSITORY_URL}\") | .id")
export GITLAB_STATE_URL="https://git.vshn.net/api/v4/projects/${GITLAB_CATALOG_PROJECT_ID}/terraform/state/cluster"

pushd catalog/manifests/openshift4-terraform/
```

  Initialize Terraform:

```
terraform init \
  "-backend-config=address=${GITLAB_STATE_URL}" \
  "-backend-config=lock_address=${GITLAB_STATE_URL}/lock" \
  "-backend-config=unlock_address=${GITLAB_STATE_URL}/lock" \
  "-backend-config=username=${GITLAB_USER}" \
  "-backend-config=password=${GITLAB_TOKEN}" \
  "-backend-config=lock_method=POST" \
  "-backend-config=unlock_method=DELETE" \
  "-backend-config=retry_wait_min=5"
```
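If `init` succeeds, a quick hedged check that the GitLab state backend is wired up correctly is to list the resources Terraform already knows about (`terraform state list` is a standard command):

```
# Expect entries such as module.cluster.module.master.cloudscale_server.node[...]
terraform state list
```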
Replace the master node
Note: You can repeat the steps in this section if you need to replace multiple master nodes.
- Select the master node you want to replace:

```
node=<master-XXXX>
kubectl get node "${node}"
```
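If you're unsure of the exact node names, the masters can be listed by their role label (a standard OpenShift node label):

```
kubectl get nodes -l node-role.kubernetes.io/master
```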
- Drain the master node:

```
kubectl --as=system:admin drain --ignore-daemonsets \
  --delete-emptydir-data --force "${node}"
```
- Stop the master node:

```
oc debug "node/${node}" -n syn-debug-nodes --as=system:admin \
  -- chroot /host shutdown -h now

kubectl wait --for condition=ready=unknown --timeout=600s \
  "node/${node}"
```

  You can also stop the node via the cloudscale control panel.
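To double-check that the VM is actually powered off, a sketch against the cloudscale API; this assumes the cloudscale server name contains the node name, and relies on the documented `status` field of the servers endpoint:

```
# Expect "stopped" once the shutdown has completed
curl -sH "Authorization: Bearer ${CLOUDSCALE_API_TOKEN}" \
  https://api.cloudscale.ch/v1/servers | \
  jq -r --arg node "${node}" '.[] | select(.name | contains($node)) | .status'
```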
- Remove the master node object in the cluster:

```
kubectl --as=system:admin delete node "${node}"
```
- Remove the master node from the etcd cluster.

  Don't skip this step: as far as we know, leaving the old member registered in etcd will break things later on.

```
etcd_pod=$(kubectl -n openshift-etcd get pods -l app=etcd -oname | \
  grep -v "${node}" | head -n1)

member_id=$(kubectl --as=system:admin -n openshift-etcd exec "${etcd_pod}" \
  -- etcdctl member list | grep "${node}" | cut -d, -f1)

kubectl --as=system:admin -n openshift-etcd exec "${etcd_pod}" \
  -- etcdctl member remove "${member_id}"
```
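You can confirm the removal by listing the members again from the same surviving etcd pod; the old node should no longer appear:

```
# Expect one entry per remaining master, none matching ${node}
kubectl --as=system:admin -n openshift-etcd exec "${etcd_pod}" \
  -- etcdctl member list
```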
- Delete the etcd secrets for the removed node:

```
kubectl --as=system:admin -n openshift-etcd \
  delete $(kubectl -n openshift-etcd get secrets --as=system:admin -oname | grep "${node}")
```
- Remove the node and its random id from the Terraform state:

```
terraform state pull > state.json

state_index=$(jq --arg node "${node}" -r \
  '.resources[] | select(.module == "module.cluster.module.master" and .type == "random_id") | .instances[] | select(.attributes.hex == $node).index_key' \
  state.json)

terraform state rm "module.cluster.module.master.random_id.node[$state_index]"
terraform state rm "module.cluster.module.master.cloudscale_server.node[$state_index]"

rm state.json
```
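A quick sanity check before applying: list the state again and make sure both addresses for the removed index are gone:

```
# Expect no ...random_id.node[<index>] or ...cloudscale_server.node[<index>]
# entries left for the removed node
terraform state list
```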
- Create the new node via Terraform:

```
terraform apply
```
- Check the output of the Terraform run and update the appropriate etcd-N DNS record with the IP of the new master node.
- Wait for the new master node to come online and approve its CSRs:

```
# Once CSRs in state Pending show up, approve them.
# Needs to be run three times: three CSRs need to be approved for each node.
kubectl --as=system:admin get csr -w

oc --as=system:admin get csr -o go-template='{{range .items}}{{if not .status}}{{.metadata.name}}{{"\n"}}{{end}}{{end}}' | \
  xargs oc --as=system:admin adm certificate approve

kubectl --as=system:admin get nodes
```
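If you'd rather not re-run the approval command by hand, a hypothetical convenience loop (it assumes GNU xargs for `--no-run-if-empty`; interrupt it with Ctrl-C once the new node shows up as Ready):

```
# Approve any pending CSRs every 30 seconds
while true; do
  oc --as=system:admin get csr -o go-template='{{range .items}}{{if not .status}}{{.metadata.name}}{{"\n"}}{{end}}{{end}}' | \
    xargs --no-run-if-empty oc --as=system:admin adm certificate approve
  sleep 30
done
```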
- Wait until the etcd and kube-apiserver cluster operators are healthy again:

```
kubectl wait --for condition=progressing=false --timeout=15m co etcd kube-apiserver
```

  It's especially important to wait here when you need to replace multiple master nodes in succession, to ensure the etcd quorum is never destroyed.

  If you're observing the etcd update, it's normal that the new etcd pod crashes initially, before it gets added to the cluster correctly.
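For a closer look at etcd while waiting, this sketch checks cluster-wide endpoint health from inside one of the etcd pods (`etcdctl endpoint health --cluster` is a standard etcdctl command):

```
etcd_pod=$(kubectl -n openshift-etcd get pods -l app=etcd -oname | head -n1)
kubectl --as=system:admin -n openshift-etcd exec "${etcd_pod}" \
  -- etcdctl endpoint health --cluster
```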
- Delete the old master node in the cloudscale control panel.
Finalize replacement
- Once you're done with all master nodes that need to be replaced, revert the short TTL for the etcd-N DNS records.
- Double-check that you've deleted all the old nodes in the cloudscale control panel.
- Remove the downtime for the "HAProxy socket" check for the cluster's Puppet LBs in Icinga2.