Destroy a Google Cloud Platform hosted cluster

General Procedure
  1. Destroy cluster

  2. Clean up resources

    • The GCP project

    • Syn cluster object

    • Vault hosted secrets

    • Everything else

Destroy the cluster

  1. Get the metadata.json

    This file was generated by the installer. If it’s still available, get yourself a copy of it. Otherwise you can recreate it.

    cluster_name=$(oc get Infrastructure cluster -o json | jq -r .status.etcdDiscoveryDomain | cut -d. -f1)
    cluster_id=$(oc get ClusterVersion version -o json | jq -r .spec.clusterID)
    gcp_region=$(oc get Infrastructure cluster -o json | jq -r .status.platformStatus.gcp.region)
    gcp_project=$(oc get Infrastructure cluster -o json | jq -r .status.platformStatus.gcp.projectID)
    infra_id=$(oc get Infrastructure cluster -o json | jq -r .status.infrastructureName)
    cat << EOF > metadata.json
    {
      "clusterName": "${cluster_name}",
      "clusterID": "${cluster_id}",
      "infraID": "${infra_id}",
      "gcp": {
        "projectID": "${gcp_project}",
        "region": "${gcp_region}"
      }
    }
    EOF
  2. Get the service account access key

    Get yourself a copy of the service account access key and place it at ~/.gcp/osServiceAccount.json. See Service Account Key.

    The installer uses whatever service account access key is placed at ~/.gcp/osServiceAccount.json. If that file doesn’t exist, it asks for one and copies it there. Whenever you work with the installer, check that the correct access key is placed at ~/.gcp/osServiceAccount.json.

  3. Destroy the cluster

    openshift-install destroy cluster --log-level debug --dir <dir> (1)
    1 The path to the directory containing the metadata.json

    The installer is likely to enter an infinite loop. The reason is that resources not known to the installer block the deletion. If this happens, delete those resources using gcloud or the web console. Once the offending resource is deleted, the installer continues the destroy.
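The two prerequisites from the steps above (a complete metadata.json and a key file in the expected place) can be checked before the installer is called. The following is a minimal sketch, not part of the installer: the function names `require_value` and `preflight_destroy` and their messages are made up for illustration. It relies on the fact that `jq -r` prints the literal string `null` when a queried field is missing.

```shell
# Pre-flight checks before `openshift-install destroy cluster`.
# Function names and messages are illustrative, not part of the installer.

# Fail if a value extracted with `jq -r` is empty or the literal "null",
# which is what `jq -r` prints when the queried field is missing.
#   $1: label used in the error message
#   $2: the extracted value
require_value() {
  if [ -z "$2" ] || [ "$2" = "null" ]; then
    echo "missing value for $1" >&2
    return 1
  fi
}

# Fail unless both files the installer relies on are present.
#   $1: directory expected to contain metadata.json
#   $2: path to the service account key (defaults to the path named above)
preflight_destroy() {
  preflight_dir=$1
  preflight_key=${2:-"$HOME/.gcp/osServiceAccount.json"}
  if [ ! -f "$preflight_dir/metadata.json" ]; then
    echo "no metadata.json in ${preflight_dir}" >&2
    return 1
  fi
  # -s: the key file must exist and must not be empty
  if [ ! -s "$preflight_key" ]; then
    echo "no service account key at ${preflight_key}" >&2
    return 1
  fi
}
```

One possible use is to run `require_value infra_id "${infra_id}"` for each extracted variable before writing metadata.json, and `preflight_destroy <dir>` right before the destroy command. The sketch only checks that the files exist; it can't tell whether the key belongs to the right service account.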

Clean up resources

Clean up the GCP project

A lot of decision trees ahead. Read through carefully before starting to delete things.

The installer has removed everything related to the cluster except for the DNS zone. Check if there is still a need for that zone. If not, remove it.

---
gcloud --project <project name> dns managed-zones delete <zone name>
---
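Cloud DNS refuses to delete a managed zone that still holds record sets other than the NS and SOA records at the zone apex. A minimal sketch for spotting such leftovers follows. The function name `blocking_records` is hypothetical, and the input format (record type and record name, separated by whitespace, one pair per line) is an assumption chosen to match what `gcloud dns record-sets list` can print with `--format 'value(type,name)'`.

```shell
# List record sets that would block deleting a Cloud DNS managed zone.
# Reads "TYPE NAME" pairs (one per line) on stdin and prints every pair
# except the NS and SOA records at the zone apex.
#   $1: DNS name of the zone with trailing dot, for example "example.com."
blocking_records() {
  apex=$1
  while read -r rtype rname; do
    # ignore blank lines
    [ -n "$rtype" ] || continue
    if [ "$rname" = "$apex" ]; then
      case $rtype in
        NS|SOA) continue ;;
      esac
    fi
    printf '%s %s\n' "$rtype" "$rname"
  done
}
```

It could be fed like this, with the same placeholders as in the command above:

    gcloud --project <project name> dns record-sets list --zone <zone name> --format 'value(type,name)' | blocking_records <zone dns name>

Empty output suggests the zone is ready to be deleted. Anything printed needs to be removed first.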

After deleting the zone, also remove the name server records pointing to it. These could be the name servers configured at the domain’s registrar, or an NS record in another zone.

If the cluster was the only thing within that project, one might opt to just delete the project.

The project might hold resources which need to remain. This could be data generated by things that were running on the decommissioned cluster. It could also be things not related to the cluster at all.

Remove the cluster from SYN

Use the Lieutenant API of your SYN setup to delete the cluster. See wiki.vshn.net/x/ngMBCg#ClusterRegistryinLieutenantSynfectaCluster(synfection)-Delete
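As a rough sketch, the delete call could look like the following. Everything specific here is an assumption to verify against your own setup: that the API exposes cluster objects at `<api url>/clusters/<cluster id>`, that it accepts a bearer token, and the variable name `LIEUTENANT_TOKEN`. The function names are made up for illustration.

```shell
# Build the URL of a cluster object in the Lieutenant API.
# ASSUMPTION: clusters are exposed at <api url>/clusters/<cluster id>.
#   $1: base URL of the API
#   $2: cluster id
lieutenant_cluster_url() {
  # strip one trailing slash from the base URL, if present
  printf '%s/clusters/%s\n' "${1%/}" "$2"
}

# Send the DELETE request for the cluster object.
# ASSUMPTION: LIEUTENANT_TOKEN (hypothetical variable name) holds a bearer
# token accepted by your Lieutenant API.
delete_syn_cluster() {
  curl --fail --silent --show-error -X DELETE \
    -H "Authorization: Bearer ${LIEUTENANT_TOKEN}" \
    "$(lieutenant_cluster_url "$1" "$2")"
}
```

Printing the result of `lieutenant_cluster_url` first is a cheap way to confirm the target before `delete_syn_cluster` sends anything.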

VSHN

At the time of this writing, the git operator doesn’t take care of cleaning up when a cluster gets deleted. For that reason, one must remove the following resources manually:

  • The cluster catalog repository.

  • The cluster config file in the Tenant object.

  • The cluster config file within the Tenant’s config git repository.

Remove secrets from Vault

---
TENANT_ID=…
CLUSTER_ID=…
vault kv list -format=json kv/${TENANT_ID}/${CLUSTER_ID}/ | jq -r '.[]' | xargs -I{} vault kv metadata delete kv/${TENANT_ID}/${CLUSTER_ID}/{}
---

Vault might contain encryption keys for backups or other types of data that still exist. Don’t delete those keys unless you are sure they are no longer needed.
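One way to act on that warning is to filter the listed names before they reach the delete command. The sketch below assumes a hypothetical helper `secrets_to_delete` that takes the names to keep as arguments. It also skips entries ending in `/`, which is how `vault kv list` marks sub-paths, since the one-liner above doesn't descend into them.

```shell
# Filter the names printed by `vault kv list` before they are deleted.
# Reads one name per line on stdin. Names given as arguments are kept,
# meaning they are not printed. Names ending in "/" are sub-paths; they
# are reported on stderr and skipped.
secrets_to_delete() {
  while read -r name; do
    # ignore blank lines
    [ -n "$name" ] || continue
    case $name in
      */)
        echo "skipping sub-path ${name}" >&2
        continue
        ;;
    esac
    keep=no
    for kept in "$@"; do
      if [ "$name" = "$kept" ]; then
        keep=yes
      fi
    done
    if [ "$keep" = no ]; then
      printf '%s\n' "$name"
    fi
  done
}
```

Placed between `jq -r '.[]'` and `xargs` in the command above, for example as `secrets_to_delete <name to keep>`, it lets you review the remaining names before anything is deleted. Run the pipeline without the `xargs` part first to see what would be removed.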

Other things to remove

The cluster might have been embedded into a bigger setup of monitoring, log aggregation etc. Look out for things that are no longer required and get rid of them.

VSHN

In the case of VSHN, these are the places to look: