Adopt worker nodes with the cloudscale Machine API Provider
Steps to adopt worker nodes on cloudscale with the cloudscale Machine API Provider.
Starting situation
- You already have an OpenShift 4 cluster on cloudscale
- You have admin-level access to the cluster
- You want the nodes adopted by the cloudscale Machine API Provider
Prerequisites
The following CLI utilities need to be available locally:
- commodore, see Running Commodore
- docker
- kubectl
- vault
- yq
Prepare local environment
- Create a local directory to work in.

We strongly recommend creating an empty directory, unless you already have a work directory for the cluster you're about to work on. This guide runs Commodore in the directory created in this step.

export WORK_DIR=/path/to/work/dir
mkdir -p "${WORK_DIR}"
pushd "${WORK_DIR}"
- Configure API access

Access to cloud API:

# From https://control.cloudscale.ch/service/<your-project>/api-token
export CLOUDSCALE_API_TOKEN=<cloudscale-api-token>

Access to VSHN GitLab:

# From https://git.vshn.net/-/user_settings/personal_access_tokens, "api" scope is sufficient
export GITLAB_TOKEN=<gitlab-api-token>
export GITLAB_USER=<gitlab-user-name>

# For example: https://api.syn.vshn.net
# IMPORTANT: do NOT add a trailing `/`. Commands below will fail.
export COMMODORE_API_URL=<lieutenant-api-endpoint>

# Set Project Syn cluster and tenant ID
export CLUSTER_ID=<lieutenant-cluster-id> # Looks like: c-<something>
export TENANT_ID=$(curl -sH "Authorization: Bearer $(commodore fetch-token)" ${COMMODORE_API_URL}/clusters/${CLUSTER_ID} | jq -r .tenant)

export GIT_AUTHOR_NAME=$(git config --global user.name)
export GIT_AUTHOR_EMAIL=$(git config --global user.email)

export TF_VAR_control_vshn_net_token=<control-vshn-net-token> # use your personal SERVERS API token from https://control.vshn.net/tokens
- Get required tokens from Vault

Connect with Vault:

export VAULT_ADDR=https://vault-prod.syn.vshn.net
vault login -method=oidc

Grab the LB hieradata repo token from Vault:

export HIERADATA_REPO_SECRET=$(vault kv get \
  -format=json "clusters/kv/lbaas/hieradata_repo_token" | jq '.data.data')
export HIERADATA_REPO_USER=$(echo "${HIERADATA_REPO_SECRET}" | jq -r '.user')
export HIERADATA_REPO_TOKEN=$(echo "${HIERADATA_REPO_SECRET}" | jq -r '.token')

Get Floaty credentials:

export TF_VAR_lb_cloudscale_api_secret=$(vault kv get \
  -format=json "clusters/kv/${TENANT_ID}/${CLUSTER_ID}/floaty" | jq -r '.data.data.iam_secret')
- Compile the catalog for the cluster. Having the catalog available locally enables us to run Terraform for the cluster to make any required changes.

commodore catalog compile "${CLUSTER_ID}"
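A quick way to confirm that the compile produced the Terraform configuration used in the following steps is to list the generated directory (the same path this guide changes into later):

# Optional: verify the compiled Terraform configuration exists
ls catalog/manifests/openshift4-terraform/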
Update Cluster Config
- Update cluster config

pushd inventory/classes/"${TENANT_ID}"

yq -i '.applications += "machine-api-provider-cloudscale"' \
  ${CLUSTER_ID}.yml

yq eval -i ".parameters.openshift4_terraform.terraform_variables.make_worker_adoptable_by_provider = true" \
  ${CLUSTER_ID}.yml

yq eval -i '.parameters.machine_api_provider_cloudscale.secrets["cloudscale-user-data"].stringData.ignitionCA = "${openshift4_terraform:terraform_variables:ignition_ca}"' \
  ${CLUSTER_ID}.yml

git commit -m "Allow adoption of worker nodes" "${CLUSTER_ID}.yml"
git push
popd
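To double-check the change, you can read the flag back from the cluster file (optional sketch, run from the work directory):

# Optional: should print "true" after the edit above
yq '.parameters.openshift4_terraform.terraform_variables.make_worker_adoptable_by_provider' \
  "inventory/classes/${TENANT_ID}/${CLUSTER_ID}.yml"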
- Compile and push the cluster catalog.

commodore catalog compile "${CLUSTER_ID}" --push
Prepare Terraform environment
- Configure Terraform secrets

cat <<EOF > ./terraform.env
CLOUDSCALE_API_TOKEN
TF_VAR_ignition_bootstrap
TF_VAR_lb_cloudscale_api_secret
TF_VAR_control_vshn_net_token
GIT_AUTHOR_NAME
GIT_AUTHOR_EMAIL
HIERADATA_REPO_TOKEN
EOF
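Note that terraform.env lists only variable names: `docker run --env-file` passes name-only entries through from the calling shell's environment, so the token values themselves never end up in the file. If you want to make sure the listed variables are set before running Terraform, here is a small optional check (a bash-specific sketch; an empty variable isn't necessarily a problem, this only prints a reminder):

# Optional: warn about variables listed in terraform.env that aren't set in this shell
while read -r var; do
  [ -n "${!var}" ] || echo "NOTE: ${var} is empty or not set"
done < terraform.env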
- Setup Terraform

Prepare Terraform execution environment:

# Set terraform image and tag to be used
tf_image=$(\
  yq eval ".parameters.openshift4_terraform.images.terraform.image" \
  dependencies/openshift4-terraform/class/defaults.yml)
tf_tag=$(\
  yq eval ".parameters.openshift4_terraform.images.terraform.tag" \
  dependencies/openshift4-terraform/class/defaults.yml)

# Generate the terraform alias
base_dir=$(pwd)
alias terraform='touch .terraformrc; docker run -it --rm \
  -e REAL_UID=$(id -u) \
  -e TF_CLI_CONFIG_FILE=/tf/.terraformrc \
  --env-file ${base_dir}/terraform.env \
  -w /tf \
  -v $(pwd):/tf \
  --ulimit memlock=-1 \
  "${tf_image}:${tf_tag}" /tf/terraform.sh'

export GITLAB_REPOSITORY_URL=$(curl -sH "Authorization: Bearer $(commodore fetch-token)" ${COMMODORE_API_URL}/clusters/${CLUSTER_ID} | jq -r '.gitRepo.url' | sed 's|ssh://||; s|/|:|')
export GITLAB_REPOSITORY_NAME=${GITLAB_REPOSITORY_URL##*/}
export GITLAB_CATALOG_PROJECT_ID=$(curl -sH "Authorization: Bearer ${GITLAB_TOKEN}" "https://git.vshn.net/api/v4/projects?simple=true&search=${GITLAB_REPOSITORY_NAME/.git}" | jq -r ".[] | select(.ssh_url_to_repo == \"${GITLAB_REPOSITORY_URL}\") | .id")
export GITLAB_STATE_URL="https://git.vshn.net/api/v4/projects/${GITLAB_CATALOG_PROJECT_ID}/terraform/state/cluster"

pushd catalog/manifests/openshift4-terraform/

Initialize Terraform:

terraform init \
  "-backend-config=address=${GITLAB_STATE_URL}" \
  "-backend-config=lock_address=${GITLAB_STATE_URL}/lock" \
  "-backend-config=unlock_address=${GITLAB_STATE_URL}/lock" \
  "-backend-config=username=${GITLAB_USER}" \
  "-backend-config=password=${GITLAB_TOKEN}" \
  "-backend-config=lock_method=POST" \
  "-backend-config=unlock_method=DELETE" \
  "-backend-config=retry_wait_min=5"
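Before changing anything, you can confirm that the init picked up the existing GitLab-managed state by listing it (optional sketch, runs inside the container via the alias defined above):

# Optional: the existing cluster resources should be listed here
terraform state list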
Run terraform
- Verify the Terraform output and apply the changes if everything looks good.

Terraform will tag the nodes as preparation for the adoption by the cloudscale Machine API Provider.

terraform apply
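If you'd like to verify the tagging, the cloudscale API can show the tags on your servers. This is an optional sketch; the exact tag keys Terraform sets aren't spelled out in this guide, so simply review the tags field of the cluster's nodes:

# Optional: list server names and their tags via the cloudscale API
curl -sH "Authorization: Bearer ${CLOUDSCALE_API_TOKEN}" https://api.cloudscale.ch/v1/servers \
  | jq '.[] | {name, tags}'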
Apply Machine and MachineSet manifests
Please ensure the terraform apply has completed successfully before proceeding with this step. Without the tags applied by Terraform, nodes will be duplicated under the same name and unexpected behavior may follow.
- Copy worker-machines_yml from the Terraform output and apply it to the cluster.

terraform output -raw worker-machines_yml | yq -P > worker-machines.yml
head worker-machines.yml
kubectl apply -f worker-machines.yml
- Check that all machines are in the Running state.

kubectl get -f worker-machines.yml
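Adoption can take a moment; instead of re-running the command, you can watch the machines until they reach the Running state (optional sketch):

# Optional: watch the machines listed in the manifest until they are Running
kubectl get -f worker-machines.yml -w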
- Copy worker-machineset_yml from the Terraform output and apply it to the cluster.

terraform output -raw worker-machineset_yml | yq -P > worker-machineset.yml
head worker-machineset.yml
kubectl apply -f worker-machineset.yml
- Copy infra-machines_yml from the Terraform output and apply it to the cluster.

terraform output -raw infra-machines_yml | yq -P > infra-machines.yml
head infra-machines.yml
kubectl apply -f infra-machines.yml
- Check that all machines are in the Running state.

kubectl get -f infra-machines.yml
- Copy infra-machineset_yml from the Terraform output and apply it to the cluster.

terraform output -raw infra-machineset_yml | yq -P > infra-machineset.yml
head infra-machineset.yml
kubectl apply -f infra-machineset.yml
- Check for additional worker groups and apply them if necessary.

terraform output -raw additional-worker-machines_yml > /dev/null 2>&1 || echo "No additional worker groups"
- If the output shows "No additional worker groups", jump to Remove nodes from the Terraform state.
- Copy additional-worker-machines_yml from the Terraform output and apply it to the cluster.

terraform output -raw additional-worker-machines_yml | yq -P > additional-worker-machines.yml
head additional-worker-machines.yml
kubectl apply -f additional-worker-machines.yml
- Check that all machines are in the Running state.

kubectl get -f additional-worker-machines.yml
- Copy additional-worker-machinesets_yml from the Terraform output and apply it to the cluster.

terraform output -raw additional-worker-machinesets_yml | yq -P > additional-worker-machinesets.yml
head additional-worker-machinesets.yml
kubectl apply -f additional-worker-machinesets.yml
Remove nodes from the Terraform state
- Remove the nodes from the Terraform state.

terraform state rm module.cluster.module.worker
terraform state rm module.cluster.module.infra
terraform state rm module.cluster.module.additional_worker

cat > override.tf <<EOF
module "cluster" {
  infra_count              = 0
  worker_count             = 0
  additional_worker_groups = {}
}
EOF
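To double-check that the worker, infra, and additional worker modules are gone from the state before planning, you can list what's left (optional sketch):

# Optional: no worker/infra/additional_worker entries should remain in the state
terraform state list | grep -E 'module\.(worker|infra|additional_worker)' \
  || echo "No worker/infra modules left in state"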
- Check the terraform plan output and apply the changes. There should be no server recreation. Hieradata changes must be ignored, otherwise the cluster ingress controller will become unavailable.

terraform plan
terraform apply
Cleanup
- Persist the Terraform changes and start managing the machine sets.

popd
pushd "inventory/classes/${TENANT_ID}"

yq -i e '.parameters.openshift4_terraform.terraform_variables.additional_worker_groups = {}' \
  "${CLUSTER_ID}.yml"
yq -i e '.parameters.openshift4_terraform.terraform_variables.infra_count = 0' \
  "${CLUSTER_ID}.yml"
yq -i e '.parameters.openshift4_terraform.terraform_variables.worker_count = 0' \
  "${CLUSTER_ID}.yml"

yq -i ea 'select(fileIndex == 0) as $cluster | $cluster.parameters.openshift4_nodes.machineSets = ([select(fileIndex > 0)][] as $ms ireduce ({}; $ms.metadata.name as $msn | del($ms.apiVersion) | del($ms.kind) | del($ms.metadata.name) | del($ms.metadata.labels.name) | del($ms.metadata.namespace) | . * {$msn: $ms} )) | $cluster' \
  "${CLUSTER_ID}.yml" ../../../catalog/manifests/openshift4-terraform/*machineset*.yml

git commit -am "Persist provider adopted machine and terraform state for ${CLUSTER_ID}"
git push origin master
popd

commodore catalog compile "${CLUSTER_ID}" --push
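As a final check, you can confirm that the machine sets ended up under the openshift4_nodes parameters in the cluster file (optional sketch, run from the work directory):

# Optional: list the machine set names now managed via the cluster config
yq '.parameters.openshift4_nodes.machineSets | keys' \
  "inventory/classes/${TENANT_ID}/${CLUSTER_ID}.yml"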