Change worker node type (instance pool)

Steps to change the instance type of the worker or infra nodes of an OpenShift 4 cluster on Exoscale that uses instance pools.

Starting situation

  • You already have an OpenShift 4 cluster on Exoscale

  • Your cluster uses Exoscale instance pools for the worker and infra nodes

  • You have admin-level access to the cluster

  • Your kubectl context points to the cluster you’re modifying

  • You want to change the node type (size) of the worker or infra nodes

High-level overview

  • Update the instance pool with the new desired type

  • Replace each existing node with a new node

Prerequisites

The following CLI utilities need to be available locally (they're all used by the commands in this guide):

  • commodore

  • exo (Exoscale CLI)

  • vault (Vault CLI)

  • docker

  • jq

  • yq (version 4 or later)

  • curl

  • git

  • kubectl and oc
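
As a quick sanity check, you can verify that these tools are on your PATH before starting (a minimal sketch; adjust the list to match your setup):

for tool in commodore exo vault docker jq yq curl git kubectl oc; do
  command -v "$tool" >/dev/null || echo "missing: $tool"
done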

Prepare local environment

  1. Create a local directory to work in

    We strongly recommend creating an empty directory, unless you already have a work directory for the cluster you’re about to work on. This guide will run Commodore in the directory created in this step.

    export WORK_DIR=/path/to/work/dir
    mkdir -p "${WORK_DIR}"
    pushd "${WORK_DIR}"
  2. Configure API access

    Access to cloud API
    export EXOSCALE_API_KEY=<exoscale-key> (1)
    export EXOSCALE_API_SECRET=<exoscale-secret>
    export EXOSCALE_ZONE=<exoscale-zone> (2)
    export EXOSCALE_S3_ENDPOINT="sos-${EXOSCALE_ZONE}.exo.io"
    1 We recommend using the IAMv3 role called Owner for the API Key. This role gives full access to the project.
    2 All lower case. For example ch-dk-2.
    Access to VSHN GitLab
    # From https://git.vshn.net/-/user_settings/personal_access_tokens, "api" scope is sufficient
    export GITLAB_TOKEN=<gitlab-api-token>
    export GITLAB_USER=<gitlab-user-name>
    Access to VSHN Lieutenant
    # For example: https://api.syn.vshn.net
    # IMPORTANT: do NOT add a trailing `/`. Commands below will fail.
    export COMMODORE_API_URL=<lieutenant-api-endpoint>
    
    # Set Project Syn cluster and tenant ID
    export CLUSTER_ID=<lieutenant-cluster-id> # Looks like: c-<something>
    export TENANT_ID=$(curl -sH "Authorization: Bearer $(commodore fetch-token)" ${COMMODORE_API_URL}/clusters/${CLUSTER_ID} | jq -r .tenant)
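
    To verify that the Exoscale credentials and the cluster ID are set up correctly, you can run a quick check (optional; this assumes the exo CLI picks up EXOSCALE_API_KEY and EXOSCALE_API_SECRET from the environment):

    exo compute instance-pool list
    echo "${TENANT_ID}" # should print a tenant ID, typically t-<something>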
  3. Get required tokens from Vault

    Connect with Vault
    export VAULT_ADDR=https://vault-prod.syn.vshn.net
    vault login -method=oidc
    Grab the LB hieradata repo token from Vault
    export HIERADATA_REPO_SECRET=$(vault kv get \
      -format=json "clusters/kv/lbaas/hieradata_repo_token" | jq '.data.data')
    export HIERADATA_REPO_USER=$(echo "${HIERADATA_REPO_SECRET}" | jq -r '.user')
    export HIERADATA_REPO_TOKEN=$(echo "${HIERADATA_REPO_SECRET}" | jq -r '.token')
  4. Compile the catalog for the cluster. Having the catalog available locally enables us to run Terraform for the cluster to make any required changes.

    commodore catalog compile "${CLUSTER_ID}"

Update Cluster Config

  1. Set new desired node type

    new_type=<exoscale instance type> (1)
    1 An Exoscale instance type, for example standard.huge.
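
    If you're unsure which instance types are available, the Exoscale CLI can list them (assuming a current exo CLI version):

    exo compute instance-type list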
  2. Update cluster config

    pushd "inventory/classes/${TENANT_ID}/"
    
    yq eval -i ".parameters.openshift4_terraform.terraform_variables.worker_type = \"${new_type}\"" \
      ${CLUSTER_ID}.yml
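
    To confirm the change before committing, you can read the value back (optional):

    yq eval ".parameters.openshift4_terraform.terraform_variables.worker_type" \
      "${CLUSTER_ID}.yml"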
  3. Review and commit

    # Have a look at the file ${CLUSTER_ID}.yml.
    
    git commit -a -m "Update worker nodes of cluster ${CLUSTER_ID} to ${new_type}"
    git push
    
    popd
  4. Compile and push cluster catalog

    commodore catalog compile ${CLUSTER_ID} --push -i

Run Terraform

  1. Configure Terraform secrets

    cat <<EOF > ./terraform.env
    EXOSCALE_API_KEY
    EXOSCALE_API_SECRET
    TF_VAR_control_vshn_net_token
    GIT_AUTHOR_NAME
    GIT_AUTHOR_EMAIL
    HIERADATA_REPO_TOKEN
    EOF
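
    Variables listed without a value in this file are passed through from the local shell environment by docker run --env-file, so make sure they're all exported before running Terraform; a sketch (the token placeholder is illustrative, use your actual value):

    export GIT_AUTHOR_NAME=$(git config user.name)
    export GIT_AUTHOR_EMAIL=$(git config user.email)
    export TF_VAR_control_vshn_net_token=<control-vshn-net-token>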
  2. Set up Terraform

    Prepare Terraform execution environment
    # Set terraform image and tag to be used
    tf_image=$(\
      yq eval ".parameters.openshift4_terraform.images.terraform.image" \
      dependencies/openshift4-terraform/class/defaults.yml)
    tf_tag=$(\
      yq eval ".parameters.openshift4_terraform.images.terraform.tag" \
      dependencies/openshift4-terraform/class/defaults.yml)
    
    # Generate the terraform alias
    base_dir=$(pwd)
    alias terraform='touch .terraformrc; docker run -it --rm \
      -e REAL_UID=$(id -u) \
      -e TF_CLI_CONFIG_FILE=/tf/.terraformrc \
      --env-file ${base_dir}/terraform.env \
      -w /tf \
      -v $(pwd):/tf \
      --ulimit memlock=-1 \
      "${tf_image}:${tf_tag}" /tf/terraform.sh'
    
    export GITLAB_REPOSITORY_URL=$(curl -sH "Authorization: Bearer $(commodore fetch-token)" ${COMMODORE_API_URL}/clusters/${CLUSTER_ID} | jq -r '.gitRepo.url' | sed 's|ssh://||; s|/|:|')
    export GITLAB_REPOSITORY_NAME=${GITLAB_REPOSITORY_URL##*/}
    export GITLAB_CATALOG_PROJECT_ID=$(curl -sH "Authorization: Bearer ${GITLAB_TOKEN}" "https://git.vshn.net/api/v4/projects?simple=true&search=${GITLAB_REPOSITORY_NAME/.git}" | jq -r ".[] | select(.ssh_url_to_repo == \"${GITLAB_REPOSITORY_URL}\") | .id")
    export GITLAB_STATE_URL="https://git.vshn.net/api/v4/projects/${GITLAB_CATALOG_PROJECT_ID}/terraform/state/cluster"
    
    pushd catalog/manifests/openshift4-terraform/
    Initialize Terraform
    terraform init \
      "-backend-config=address=${GITLAB_STATE_URL}" \
      "-backend-config=lock_address=${GITLAB_STATE_URL}/lock" \
      "-backend-config=unlock_address=${GITLAB_STATE_URL}/lock" \
      "-backend-config=username=${GITLAB_USER}" \
      "-backend-config=password=${GITLAB_TOKEN}" \
      "-backend-config=lock_method=POST" \
      "-backend-config=unlock_method=DELETE" \
      "-backend-config=retry_wait_min=5"
  3. Run Terraform

    This doesn’t make changes to existing instances. However, after this step, any new instances created for the instance pool will use the new configuration.

    terraform apply
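
    Once the apply has finished, you can run another plan through the same alias to confirm that the instance pool template was updated and no further changes are pending (it should report no changes):

    terraform plan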

Apply new instance pool configuration

Double-check that your kubectl context points to the cluster you're working on.
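
For example, verify the active context and that the node list matches the cluster you expect:

kubectl config current-context
kubectl --as=cluster-admin get nodes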

Depending on the number of nodes you’re updating, you may want to execute the steps in this section for a subset of the nodes at a time.

On clusters with dedicated hypervisors, you’ll need to execute the steps for each worker instance pool. You can list the worker instance pools with

exo compute instance-pool list -Ojson | jq -r '.[]|select(.name|contains("worker"))|.name'

If you’re using this how-to for changing the instance type of the infra nodes, you must run Terraform again after replacing nodes to ensure that the LB hieradata is updated with the new infra node IPs.

When replacing infra nodes, we strongly recommend doing so in two batches to ensure availability of the cluster ingress.

  1. Select the instance pool

    pool_name="${CLUSTER_ID}_worker-0" (1)
    1 Replace worker-0 with the name of the instance pool you want to update. On clusters with dedicated hypervisors, use one of the pool names listed above.
  2. Compute the new instance count

    new_count=$(exo compute instance-pool show "${pool_name}" -Ojson | jq -r '.size * 2')

    For larger clusters, you’ll probably want to do something like the following to replace nodes in batches. If you do this, you’ll need to repeat the steps below this one for each batch.

    batch_size=3 (1)
    new_count=$(exo compute instance-pool show "${pool_name}" -Ojson | \
      jq --argjson batch "$batch_size" -r '.size + $batch')
    1 Replace with the desired batch size. Adjust the final batch so that you don't provision extra nodes if your node count isn't evenly divisible by the selected batch size.
  3. Get the list of old nodes

    NODES_TO_REMOVE=$(exo compute instance-pool show "${pool_name}" -Ojson | \
      jq -r '.instances|join(" ")')

    If you’re replacing nodes in batches, save the list of old nodes in a file:

    exo compute instance-pool show "${pool_name}" -Ojson | jq -r '.instances' > old-nodes.json (1)
    1 Run this only once before starting to replace nodes.

    Compute a batch of old nodes to remove and drop those from the file:

    NODES_TO_REMOVE=$(jq --argjson batch "$batch_size" -r '.[:$batch]|join(" ")' old-nodes.json)
    jq --argjson batch "$batch_size" -r '.[$batch:]' old-nodes.json > old-nodes-rem.json && \
      mv old-nodes-rem.json old-nodes.json
  4. Scale up the instance pool to create new instances with the new desired type

    exo compute instance-pool scale "${pool_name}" "${new_count}" -z "${EXOSCALE_ZONE}"
  5. Approve CSRs of new nodes

    # Once CSRs in state Pending show up, approve them
    # The approve command needs to be run twice, since two CSRs must be approved for each node
    
    kubectl --as=cluster-admin get csr -w
    
    oc --as=cluster-admin get csr -o go-template='{{range .items}}{{if not .status}}{{.metadata.name}}{{"\n"}}{{end}}{{end}}' | \
      xargs oc --as=cluster-admin adm certificate approve
    
    kubectl --as=cluster-admin get nodes
  6. Label nodes

    kubectl get node -ojson | \
      jq -r '.items[] | select(.metadata.name | test("infra|master|storage-")|not).metadata.name' | \
      xargs -I {} kubectl label node {} node-role.kubernetes.io/app=
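
    As an optional check, the newly added nodes should now show up with the app role:

    kubectl get nodes -l node-role.kubernetes.io/app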
  7. Drain and remove old nodes

    • If you are working on a production cluster, you need to schedule the node drain for the next maintenance.

      Schedule node drain (production clusters)
      1. Create an adhoc-config for the UpgradeJobHook that will drain the node.

        pushd "../../../inventory/classes/$TENANT_ID"
        cat > manifests/$CLUSTER_ID/drain_node_hook.yaml <<EOF
        ---
        apiVersion: managedupgrade.appuio.io/v1beta1
        kind: UpgradeJobHook
        metadata:
          name: drain-node
          namespace: appuio-openshift-upgrade-controller
        spec:
          events:
            - Finish
          selector:
            matchLabels:
              appuio-managed-upgrade: "true"
          run: Next
          template:
            spec:
              template:
                spec:
                  containers:
                    - args:
                        - -c
                        - |
                          #!/bin/sh
                          set -e
                          oc adm drain ${NODES_TO_REMOVE} --delete-emptydir-data --ignore-daemonsets
                      command:
                        - sh
                      image: quay.io/appuio/oc:v4.13
                      name: remove-nodes
                      env:
                        - name: HOME
                          value: /export
                      volumeMounts:
                        - mountPath: /export
                          name: export
                      workingDir: /export
                  restartPolicy: Never
                  volumes:
                    - emptyDir: {}
                      name: export
        ---
        apiVersion: rbac.authorization.k8s.io/v1
        kind: ClusterRoleBinding
        metadata:
          name: drain-nodes-upgrade-controller
        roleRef:
          apiGroup: rbac.authorization.k8s.io
          kind: ClusterRole
          name: cluster-admin
        subjects:
          - kind: ServiceAccount
            name: default
            namespace: appuio-openshift-upgrade-controller
        EOF
        
        git commit -am "Schedule drain of node ${NODES_TO_REMOVE} on cluster $CLUSTER_ID"
        git push
        popd
      2. Wait until after the next maintenance window.

      3. Confirm the node has been drained.

        kubectl get node ${NODES_TO_REMOVE}
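
        A drained node reports SchedulingDisabled in its status. To double-check, you can also list the pods still running on the nodes; only daemonset-managed pods should remain (a sketch):

        for node in $(echo -n ${NODES_TO_REMOVE}); do
          kubectl --as=cluster-admin get pods -A --field-selector spec.nodeName="${node}"
        done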
      4. Clean up UpgradeJobHook

        # After redoing the local environment setup and Terraform preparation:
        pushd "../../../inventory/classes/$TENANT_ID"
        rm manifests/$CLUSTER_ID/drain_node_hook.yaml
        git commit -am "Remove UpgradeJobHook to drain node ${NODES_TO_REMOVE} on cluster $CLUSTER_ID"
        git push
        popd
      5. Delete the node(s) from the cluster

        for node in $(echo -n ${NODES_TO_REMOVE}); do
          kubectl --as=cluster-admin delete node "${node}"
        done
    • If you are working on a non-production cluster, you may drain and remove the nodes immediately.

      Drain and remove node immediately
      1. Drain the node(s)

        for node in $(echo -n ${NODES_TO_REMOVE}); do
          kubectl --as=cluster-admin drain "${node}" \
            --delete-emptydir-data --ignore-daemonsets
        done
      2. Delete the node(s) from the cluster

        for node in $(echo -n ${NODES_TO_REMOVE}); do
          kubectl --as=cluster-admin delete node "${node}"
        done
  8. Remove old VMs from instance pool

    Only do this after the previous step is completed. On production clusters this must happen after the maintenance.

    for node in "$NODES_TO_REMOVE"; do
      exo compute instance-pool evict "${pool_name}" "${node}" -z "${EXOSCALE_ZONE}"
    done
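
    Afterwards, you can verify that the pool is back at its original size and only contains the new instances (optional):

    exo compute instance-pool show "${pool_name}" -Ojson | jq '.size, .instances'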