Replace a master node

Steps to replace a master node of an OpenShift 4 cluster on Exoscale.

Starting situation

  • You already have an OpenShift 4 cluster on Exoscale

  • You have admin-level access to the cluster

  • You want to replace a master node of the cluster

Prerequisites

The following CLI utilities need to be available locally:

  • kubectl

  • oc

  • git

  • jq

  • yq

  • vault CLI

Preparation

  1. Set downtime for the "HAProxy socket" check for the cluster’s Puppet LBs in Icinga2.

Set up Terraform credentials

  1. Ensure your KUBECONFIG points to the target cluster

    kubectl cluster-info
    kubectl get nodes
  2. Set up access to VSHN systems

    Access to VSHN GitLab
    # From https://git.vshn.net/-/user_settings/personal_access_tokens, "api" scope is sufficient
    export GITLAB_TOKEN=<gitlab-api-token>
    export GITLAB_USER=<gitlab-user-name>
    Access to VSHN Lieutenant
    # For example: https://api.syn.vshn.net
    # IMPORTANT: do NOT add a trailing `/`. Commands below will fail.
    export COMMODORE_API_URL=<lieutenant-api-endpoint>
    
    # Set Project Syn cluster and tenant ID
    export CLUSTER_ID=<lieutenant-cluster-id> # Looks like: c-<something>
    export TENANT_ID=$(curl -sH "Authorization: Bearer $(commodore fetch-token)" ${COMMODORE_API_URL}/clusters/${CLUSTER_ID} | jq -r .tenant)
    Configuration for hieradata commits
    export GIT_AUTHOR_NAME=$(git config --global user.name)
    export GIT_AUTHOR_EMAIL=$(git config --global user.email)
    export TF_VAR_control_vshn_net_token=<control-vshn-net-token> # use your personal SERVERS API token from https://control.vshn.net/tokens
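The Lieutenant API call above returns a JSON cluster object; the tenant ID is read from its .tenant field with jq. A minimal offline sketch of that extraction, using a made-up API response:

```shell
# Hypothetical Lieutenant response for a cluster (same shape as used
# above; the real response contains more fields).
response='{"id":"c-example-cluster","tenant":"t-example-tenant","displayName":"Example"}'

# Same jq filter as in the TENANT_ID export above
tenant=$(echo "${response}" | jq -r .tenant)
echo "${tenant}"
```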
  3. Get an Exoscale IAM key for Terraform

    Access to cloud API
    export EXOSCALE_API_KEY=<exoscale-key> (1)
    export EXOSCALE_API_SECRET=<exoscale-secret>
    export EXOSCALE_ZONE=<exoscale-zone> (2)
    export EXOSCALE_S3_ENDPOINT="sos-${EXOSCALE_ZONE}.exo.io"
    1 We recommend using the IAMv3 role called Owner for the API Key. This role gives full access to the project.
    2 All lower case. For example ch-dk-2.
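If you're unsure about the zone spelling, the derived S3 endpoint can be sanity-checked locally (ch-dk-2 is used as the example zone):

```shell
EXOSCALE_ZONE=ch-dk-2   # example zone, all lower case
EXOSCALE_S3_ENDPOINT="sos-${EXOSCALE_ZONE}.exo.io"
echo "${EXOSCALE_S3_ENDPOINT}"
```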
  4. Compile catalog

    We use commodore catalog compile here to fetch the cluster catalog. You can also use an existing up-to-date catalog checkout.

    commodore catalog compile "${CLUSTER_ID}"
  5. Configure Terraform

    cat >terraform.env <<EOF
    EXOSCALE_API_KEY
    EXOSCALE_API_SECRET
    GIT_AUTHOR_NAME
    GIT_AUTHOR_EMAIL
    EOF
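The file above intentionally lists variable names without values: docker run --env-file passes names-only lines through from the calling environment. A small sketch (not part of the original steps) that verifies the listed variables are actually exported before Terraform runs; the example values here are for illustration only:

```shell
# Example values, assigned for illustration only; in the real run they
# come from the earlier export steps.
export EXOSCALE_API_KEY="EXOexample"
export EXOSCALE_API_SECRET="example-secret"
export GIT_AUTHOR_NAME="Jane Doe"
export GIT_AUTHOR_EMAIL="jane.doe@example.com"

# Warn for every variable from terraform.env that isn't exported,
# since names-only lines in an --env-file inherit their values from
# the environment.
unset_count=0
for var in EXOSCALE_API_KEY EXOSCALE_API_SECRET GIT_AUTHOR_NAME GIT_AUTHOR_EMAIL; do
  if [ -z "$(printenv "${var}")" ]; then
    echo "WARNING: ${var} is not set"
    unset_count=$((unset_count + 1))
  fi
done
echo "${unset_count} variable(s) missing"
```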
  6. Setup Terraform

    Prepare Terraform execution environment
    # Set terraform image and tag to be used
    tf_image=$(\
      yq eval ".parameters.openshift4_terraform.images.terraform.image" \
      dependencies/openshift4-terraform/class/defaults.yml)
    tf_tag=$(\
      yq eval ".parameters.openshift4_terraform.images.terraform.tag" \
      dependencies/openshift4-terraform/class/defaults.yml)
    
    # Generate the terraform alias
    base_dir=$(pwd)
    alias terraform='touch .terraformrc; docker run -it --rm \
      -e REAL_UID=$(id -u) \
      -e TF_CLI_CONFIG_FILE=/tf/.terraformrc \
      --env-file ${base_dir}/terraform.env \
      -w /tf \
      -v $(pwd):/tf \
      --ulimit memlock=-1 \
      "${tf_image}:${tf_tag}" /tf/terraform.sh'
    
    export GITLAB_REPOSITORY_URL=$(curl -sH "Authorization: Bearer $(commodore fetch-token)" ${COMMODORE_API_URL}/clusters/${CLUSTER_ID} | jq -r '.gitRepo.url' | sed 's|ssh://||; s|/|:|')
    export GITLAB_REPOSITORY_NAME=${GITLAB_REPOSITORY_URL##*/}
    export GITLAB_CATALOG_PROJECT_ID=$(curl -sH "Authorization: Bearer ${GITLAB_TOKEN}" "https://git.vshn.net/api/v4/projects?simple=true&search=${GITLAB_REPOSITORY_NAME/.git}" | jq -r ".[] | select(.ssh_url_to_repo == \"${GITLAB_REPOSITORY_URL}\") | .id")
    export GITLAB_STATE_URL="https://git.vshn.net/api/v4/projects/${GITLAB_CATALOG_PROJECT_ID}/terraform/state/cluster"
    
    pushd catalog/manifests/openshift4-terraform/
    Initialize Terraform
    terraform init \
      "-backend-config=address=${GITLAB_STATE_URL}" \
      "-backend-config=lock_address=${GITLAB_STATE_URL}/lock" \
      "-backend-config=unlock_address=${GITLAB_STATE_URL}/lock" \
      "-backend-config=username=${GITLAB_USER}" \
      "-backend-config=password=${GITLAB_TOKEN}" \
      "-backend-config=lock_method=POST" \
      "-backend-config=unlock_method=DELETE" \
      "-backend-config=retry_wait_min=5"
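The sed pipeline in the GITLAB_REPOSITORY_URL export rewrites the Lieutenant-provided SSH URL into the scp-like syntax that GitLab reports in ssh_url_to_repo, so the select() comparison in the project lookup matches. An offline sketch with a made-up repository path:

```shell
# Hypothetical gitRepo.url value from Lieutenant
url="ssh://git@git.vshn.net/syn/customer-cluster-catalog.git"

# Same transformation as the GITLAB_REPOSITORY_URL export above:
# drop the scheme, then turn the first '/' into ':'
repo_url=$(echo "${url}" | sed 's|ssh://||; s|/|:|')
repo_name=${repo_url##*/}

echo "${repo_url}"
echo "${repo_name}"
```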

Replace the master node

You can repeat the steps in this section if you need to replace multiple master nodes.

  1. Select the master node you want to replace

    node=<master-XXXX>
    kubectl get node "${node}"
  2. Drain the master node

    kubectl --as=system:admin drain --ignore-daemonsets \
      --delete-emptydir-data --force "${node}"
  3. Stop the master node

    You can also stop the node via the Exoscale Portal.
    oc debug "node/${node}" -n syn-debug-nodes --as=system:admin \
      -- chroot /host shutdown -h now
    kubectl wait --for condition=ready=unknown --timeout=600s \
      "node/${node}"
  4. Remove the master node object in the cluster

    kubectl --as=system:admin delete node "${node}"
  5. Remove the master node from the etcd cluster

    Don't skip this step: if the old member isn't removed, the replacement node can't join the etcd cluster cleanly and the etcd cluster-operator will report a degraded cluster later on.
    etcd_pod=$(kubectl -n openshift-etcd get pods -l app=etcd -oname | grep -v "${node}" | head -n1)
    member_id=$(kubectl --as=system:admin -n openshift-etcd exec "${etcd_pod}" \
      -- etcdctl member list | grep "${node}" | cut -d, -f1)
    kubectl --as=system:admin -n openshift-etcd exec "${etcd_pod}" \
      -- etcdctl member remove "${member_id}"
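etcdctl member list prints one comma-separated line per member (ID, status, name, peer URLs, client URLs); the grep/cut pipeline above extracts the ID column for the node's entry. A sketch against made-up output:

```shell
node="master-1234"

# Hypothetical `etcdctl member list` output (IDs and IPs are made up)
member_list='11aa22bb33cc44dd, started, master-5678, https://10.0.0.5:2380, https://10.0.0.5:2379, false
55ee66ff77aa88bb, started, master-1234, https://10.0.0.6:2380, https://10.0.0.6:2379, false'

# Same extraction as above: pick the line for the node, keep the ID column
member_id=$(echo "${member_list}" | grep "${node}" | cut -d, -f1)
echo "${member_id}"
```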
  6. Delete etcd secrets for the removed node

    kubectl --as=system:admin -n openshift-etcd \
      delete $(kubectl -n openshift-etcd get secrets --as=system:admin -oname | grep "${node}")
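On current OpenShift 4 versions each control plane node has three etcd secrets (etcd-peer-<node>, etcd-serving-<node>, etcd-serving-metrics-<node>), so the grep above should match exactly three objects. A sketch against a made-up secret list:

```shell
node="master-1234"

# Hypothetical secret names for a two-node subset of the control plane
secrets='secret/etcd-peer-master-1234
secret/etcd-peer-master-5678
secret/etcd-serving-master-1234
secret/etcd-serving-master-5678
secret/etcd-serving-metrics-master-1234
secret/etcd-serving-metrics-master-5678'

# The same filter as above matches only the removed node's secrets
count=$(echo "${secrets}" | grep -c "${node}")
echo "${count} secrets matched"
```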
  7. Remove node and id from terraform state

    terraform state pull > state.json
    state_index=$(jq --arg node "${node}" -r '.resources[]
      | select(.module == "module.cluster.module.master" and .type == "random_id")
      | .instances[]
      | select(.attributes.hex==$node).index_key' \
      state.json)
    terraform state rm "module.cluster.module.master.random_id.node_id[$state_index]"
    terraform state rm "module.cluster.module.master.exoscale_compute_instance.nodes[$state_index]"
    rm state.json
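The jq filter above looks up the random_id instance whose hex attribute equals the node name and returns its index_key, which then identifies the resources to drop from the state. An offline sketch against a minimal, made-up state file:

```shell
node="master-1234"

# Minimal made-up Terraform state containing only the fields the
# filter relies on
cat > /tmp/state.json <<'EOF'
{
  "resources": [
    {
      "module": "module.cluster.module.master",
      "type": "random_id",
      "instances": [
        {"index_key": 0, "attributes": {"hex": "master-5678"}},
        {"index_key": 1, "attributes": {"hex": "master-1234"}}
      ]
    }
  ]
}
EOF

# Same jq filter as in the step above
state_index=$(jq --arg node "${node}" -r '.resources[]
  | select(.module == "module.cluster.module.master" and .type == "random_id")
  | .instances[]
  | select(.attributes.hex==$node).index_key' \
  /tmp/state.json)
echo "${state_index}"
```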
  8. Create new node via terraform

    terraform apply
    On Exoscale, Terraform will update the etcd-N DNS record with the new master node IP.
  9. Wait for the new master node to come online and approve CSRs

    # Once CSRs in state Pending show up, approve them
    # The approve command needs to be run twice: each node issues two CSRs
    # (first the client certificate, then the serving certificate)
    
    kubectl --as=system:admin get csr -w
    
    oc --as=system:admin get csr -o go-template='{{range .items}}{{if not .status}}{{.metadata.name}}{{"\n"}}{{end}}{{end}}' | \
      xargs oc --as=system:admin adm certificate approve
    
    kubectl --as=system:admin get nodes
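The go-template above prints the names of CSRs that have no status yet, i.e. pending ones. The same selection, sketched offline with jq over a made-up CSR list:

```shell
# Hypothetical `get csr -o json` output, reduced to the relevant fields:
# a pending CSR has an empty status, an approved one has conditions.
csrs='{"items":[
  {"metadata":{"name":"csr-aaaaa"},"status":{}},
  {"metadata":{"name":"csr-bbbbb"},"status":{"conditions":[{"type":"Approved"}]}}
]}'

# Keep only CSRs whose status is still empty (pending)
pending=$(echo "${csrs}" | jq -r '.items[]
  | select(.status == {} or .status == null)
  | .metadata.name')
echo "${pending}"
```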
  10. Wait until the etcd and kube-apiserver cluster-operators are healthy again

    It’s especially important to wait here when replacing multiple master nodes in succession, to ensure the etcd quorum is never lost!
    kubectl wait --for condition=progressing=false --timeout=15m co etcd kube-apiserver

    If you’re observing the etcd update, it’s normal that the new etcd pod crashes initially before it gets added to the cluster correctly.

  11. Delete the old master node in the Exoscale Portal.

Finalize replacement

  1. Double-check that you’ve deleted all the old nodes in the Exoscale Portal.

  2. Remove downtime for the "HAProxy socket" check for the cluster’s Puppet LBs in Icinga2.