Uninstallation on cloudscale.ch

Steps to remove an OpenShift 4 cluster from cloudscale.ch.

  • The commands are idempotent and can be retried if any of the steps fail.

  • In the future, this procedure will be mostly automated.

Prerequisites

Cluster Decommission

  1. Export the following variables

    export CLOUDSCALE_API_TOKEN=<cloudscale-api-token> # From https://control.cloudscale.ch/service/PROJECT_ID/api-token
    export CLUSTER_ID=<lieutenant-cluster-id>
    export TENANT_ID=<lieutenant-tenant-id>
    export REGION=<region> # rma or lpg (without the zone number)
    export GITLAB_TOKEN=<gitlab-api-token> # From https://git.vshn.net/-/profile/personal_access_tokens
    export GITLAB_USER=<gitlab-user-name>
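
    Because the later steps silently build wrong URLs and paths when one of these variables is empty, it can help to check them before continuing. The following is a minimal sketch, assuming bash; `require_vars` is a hypothetical helper and isn't part of any tooling referenced here.

    ```shell
    # Hypothetical helper: report every named variable that is unset or empty.
    # Returns 0 if all are set, 1 otherwise. Requires bash (indirect expansion).
    require_vars() {
      local name missing=0
      for name in "$@"; do
        # ${!name} expands to the value of the variable whose name is in $name
        if [ -z "${!name:-}" ]; then
          echo "missing: ${name}" >&2
          missing=1
        fi
      done
      return "${missing}"
    }

    # Usage:
    #   require_vars CLOUDSCALE_API_TOKEN CLUSTER_ID TENANT_ID REGION GITLAB_TOKEN GITLAB_USER
    ```

    The Terraform setup step also reads COMMODORE_API_URL, which isn't part of the export list above, so it may be worth adding to the check.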
  2. Set up Terraform

    Prepare Terraform execution environment
    # Set terraform image and tag to be used
    tf_image=$(\
      yq eval ".parameters.openshift4_terraform.images.terraform.image" \
      dependencies/openshift4-terraform/class/defaults.yml)
    tf_tag=$(\
      yq eval ".parameters.openshift4_terraform.images.terraform.tag" \
      dependencies/openshift4-terraform/class/defaults.yml)
    
    # Generate the terraform alias
    base_dir=$(pwd)
    alias terraform='docker run -it --rm \
      -e REAL_UID=$(id -u) \
      --env-file ${base_dir}/terraform.env \
      -w /tf \
      -v $(pwd):/tf \
      --ulimit memlock=-1 \
      "${tf_image}:${tf_tag}" /tf/terraform.sh'
    
    export GITLAB_REPOSITORY_URL=$(curl -sH "Authorization: Bearer $(commodore fetch-token)" ${COMMODORE_API_URL}/clusters/${CLUSTER_ID} | jq -r '.gitRepo.url' | sed 's|ssh://||; s|/|:|')
    export GITLAB_REPOSITORY_NAME=${GITLAB_REPOSITORY_URL##*/}
    export GITLAB_CATALOG_PROJECT_ID=$(curl -sH "Authorization: Bearer ${GITLAB_TOKEN}" "https://git.vshn.net/api/v4/projects?simple=true&search=${GITLAB_REPOSITORY_NAME/.git}" | jq -r ".[] | select(.ssh_url_to_repo == \"${GITLAB_REPOSITORY_URL}\") | .id")
    export GITLAB_STATE_URL="https://git.vshn.net/api/v4/projects/${GITLAB_CATALOG_PROJECT_ID}/terraform/state/cluster"
    
    pushd catalog/manifests/openshift4-terraform/
    Initialize Terraform
    terraform init \
      "-backend-config=address=${GITLAB_STATE_URL}" \
      "-backend-config=lock_address=${GITLAB_STATE_URL}/lock" \
      "-backend-config=unlock_address=${GITLAB_STATE_URL}/lock" \
      "-backend-config=username=${GITLAB_USER}" \
      "-backend-config=password=${GITLAB_TOKEN}" \
      "-backend-config=lock_method=POST" \
      "-backend-config=unlock_method=DELETE" \
      "-backend-config=retry_wait_min=5"
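
    The GITLAB_REPOSITORY_* variables above are derived from the catalog repository URL with `sed` and bash parameter expansion. The sketch below walks through that derivation on a made-up URL; `git.example.com` and the repository path are placeholders, not real values.

    ```shell
    # Hypothetical gitRepo.url value; the real one comes from the Lieutenant API.
    url='ssh://git@git.example.com/tenant/cluster-catalog.git'

    # Same sed program as in the step: drop the scheme, then turn the first "/" into ":".
    repo_url=$(printf '%s\n' "$url" | sed 's|ssh://||; s|/|:|')

    # Strip everything up to and including the last "/".
    repo_name=${repo_url##*/}

    # Remove the first occurrence of ".git" to get the GitLab search term.
    search_term=${repo_name/.git}

    echo "$repo_url"
    echo "$repo_name"
    echo "$search_term"
    ```

    Note that `${repo_name/.git}` removes the first `.git` anywhere in the name. For repository names that contain `.git` in the middle, `${repo_name%.git}` (suffix removal) would be the stricter choice.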
  3. Note the location of the LB backups and any Icinga2 satellite host before decommissioning the VMs.

    declare -a LB_FQDNS
    for id in 1 2; do
      LB_FQDNS[$id]=$(terraform state show "module.cluster.module.lb.cloudscale_server.lb[$((id - 1))]" | grep fqdn | awk '{print $2}' | sed -e 's/"//g')
    done
    for lb in ${LB_FQDNS[*]}; do
      ssh "${lb}" "sudo grep 'server =' /etc/burp/burp.conf && sudo grep 'ParentZone' /etc/icinga2/constants.conf"
    done
  4. Set downtimes for both LBs in Icinga2.

  5. Remove APPUiO hieradata Git repository resource from Terraform state

    terraform state rm module.cluster.module.lb.module.hiera.gitfile_checkout.appuio_hieradata
    This step is necessary to ensure the subsequent terraform destroy completes without errors.
  6. Delete resources from cloudscale.ch using Terraform

    # The first run is expected to fail
    terraform destroy
    # Destroy a second time to delete private networks
    terraform destroy
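
    Since the commands are idempotent, the "run it again" pattern can be wrapped in a small retry loop. This is only a sketch: `retry` is a hypothetical helper, and it assumes `terraform` resolves to a binary or shell function, because a shell alias isn't expanded when it's passed as an argument.

    ```shell
    # Hypothetical helper: run a command up to N times, stopping at the first success.
    retry() {
      local attempts="$1" i
      shift
      for ((i = 1; i <= attempts; i++)); do
        if "$@"; then
          return 0
        fi
        echo "attempt ${i}/${attempts} failed: $*" >&2
      done
      return 1
    }

    # Usage (assumes `terraform` is a binary or function, not an alias):
    #   retry 2 terraform destroy
    ```

    Each attempt still prompts for confirmation, so the destroy plan can be reviewed before every run.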
  7. After all resources are deleted, remove the buckets

    # Use the already existing bucket user
    response=$(curl -sH "Authorization: Bearer ${CLOUDSCALE_API_TOKEN}" \
      https://api.cloudscale.ch/v1/objects-users | \
      jq -e ".[] | select(.display_name == \"${CLUSTER_ID}\")")
    
    # Configure the MinIO client to use the bucket user's keys
    mc config host add \
      "${CLUSTER_ID}" "https://objects.${REGION}.cloudscale.ch" \
      "$(echo "$response" | jq -r '.keys[0].access_key')" \
      "$(echo "$response" | jq -r '.keys[0].secret_key')"
    
    # Delete the bootstrap-ignition bucket
    mc rb "${CLUSTER_ID}/${CLUSTER_ID}-bootstrap-ignition" --force
    
    # Delete the image-registry bucket
    mc rb "${CLUSTER_ID}/${CLUSTER_ID}-image-registry" --force
    
    # Delete the cloudscale.ch objects user
    curl -i -H "Authorization: Bearer ${CLOUDSCALE_API_TOKEN}" -X DELETE "$(echo "$response" | jq -r '.href')"
  8. Delete Vault entries:

    # Vault login
    export VAULT_ADDR=https://vault-prod.syn.vshn.net
    vault login -method=oidc
    
    # delete token secret
    vault kv delete clusters/kv/${TENANT_ID}/${CLUSTER_ID}/cloudscale
    
    # delete registry secret
    vault kv delete clusters/kv/${TENANT_ID}/${CLUSTER_ID}/registry
    
    # delete ldap secret
    vault kv delete clusters/kv/${TENANT_ID}/${CLUSTER_ID}/vshn-ldap
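
    The three deletions share the same path layout, so they can also be expressed as a loop. The sketch below only prints the paths; `cluster_secret_paths` is a hypothetical helper, and the secret names are the ones used in this step.

    ```shell
    # Hypothetical helper: print the Vault KV paths of the three cluster secrets.
    cluster_secret_paths() {
      local tenant="$1" cluster="$2" name
      for name in cloudscale registry vshn-ldap; do
        printf 'clusters/kv/%s/%s/%s\n' "$tenant" "$cluster" "$name"
      done
    }

    # Usage:
    #   cluster_secret_paths "$TENANT_ID" "$CLUSTER_ID" | while read -r path; do
    #     vault kv delete "$path"
    #   done
    ```

    Printing the paths first makes it easy to review them before piping them into `vault kv delete`.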
  9. Decommission Puppet-managed LBs according to the VSHN documentation (Internal link).

    Don’t forget to remove the LB configuration in the APPUiO hieradata and the nodes hieradata.
  10. Delete cluster from Lieutenant API (via portal)

    • Select the Lieutenant API Endpoint

    • Search for the cluster name

    • Delete cluster entry using the delete button

  11. Delete all remaining volumes which were associated with the cluster in the cloudscale.ch project.

    This step is required because the csi-cloudscale driver doesn’t have time to properly clean up PVs when the cluster is decommissioned with terraform destroy.
  12. Delete the cluster-backup bucket in the cloudscale.ch project

    Verify that the cluster backups aren’t needed anymore before cleaning up the backup bucket. Consider extracting the most recent cluster objects and etcd backups before deleting the bucket. See the Recover objects from backup how-to for instructions. At this point in the decommissioning process, you’ll have to extract the Restic configuration from Vault instead of the cluster itself.

  13. Delete all other Vault entries

  14. Delete LDAP service (via portal)

    • Search for the cluster name

    • Delete the cluster's service entry using the delete button

  15. Remove IPs from LDAP allowlist

    • Search for the cluster IPs and remove those lines along with any related comments.

    • Create a Merge Request and ask a colleague to review, approve, and merge it.

  16. Delete all DNS records related to the cluster (zonefiles)

  17. Update any related documentation