Installation on cloudscale.ch

Steps to install an OpenShift 4 cluster on cloudscale.ch.

These steps follow the Installing a cluster on bare metal docs to set up a user-provisioned infrastructure (UPI) installation. Terraform is used to provision the cloud infrastructure.

The commands are idempotent and can be retried if any of the steps fail.

The certificates created during bootstrap are only valid for 24 hours, so make sure you complete these steps within that time.

This how-to guide is still a work in progress and will change. It’s currently very specific to VSHN and needs further changes to be more generic.

Starting situation

  • You already have a Tenant and its git repository

  • You have a CCSP Red Hat login and are logged into the Red Hat OpenShift Cluster Manager

  • You want to register a new cluster in Lieutenant and are about to install OpenShift 4 on cloudscale.ch

Prerequisites

Make sure the openshift-install version and the RHCOS image version match, otherwise Ignition will fail.
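
For example, to check which installer version is on your path before fetching the RHCOS image for the same release:

openshift-install version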

Cluster Installation

Register the new OpenShift 4 cluster in Lieutenant.

Lieutenant API endpoint

Use the following endpoint for Lieutenant:

Set up LDAP service

  1. Create an LDAP service

    Use control.vshn.net/vshn/services/_create to create a service. The name must contain the customer and the cluster name. Then put the LDAP service ID and password into the following variables:

    export LDAP_ID="Your_LDAP_ID_here"
    export LDAP_PASSWORD="Your_LDAP_pw_here"

Configure input

export CLOUDSCALE_TOKEN=<cloudscale-api-token> # From https://control.cloudscale.ch/user/api-tokens
export GITLAB_TOKEN=<gitlab-api-token> # From https://git.vshn.net/profile/personal_access_tokens
export GITLAB_CATALOG_PROJECT_ID=<project-id> # GitLab numerical project ID of the catalog repo
export REGION=rma # rma or lpg (without the zone number)
export CLUSTER_ID=<lieutenant-cluster-id> # Looks like: c-<verb>-<noun>-<number>
export TENANT_ID=<lieutenant-tenant-id> # Looks like: t-<verb>-<noun>-<number>
export BASE_DOMAIN=appuio-beta.ch
export PULL_SECRET='<redhat-pull-secret>' # As copied from https://cloud.redhat.com/openshift/install/pull-secret "Copy pull secret". The value must be inside quotes.
export COMMODORE_API_URL=<lieutenant-api-endpoint> # For example: https://api-int.syn.vshn.net
export COMMODORE_API_TOKEN=<lieutenant-api-token> # See https://wiki.vshn.net/pages/viewpage.action?pageId=167838622#ClusterRegistryinLieutenantSynfectaCluster(synfection)-Preparation

For an explanation of BASE_DOMAIN, see the DNS Scheme.
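
A quick sanity check that nothing was missed (a minimal sketch, assuming a Bash shell):

for var in CLOUDSCALE_TOKEN GITLAB_TOKEN GITLAB_CATALOG_PROJECT_ID REGION \
           CLUSTER_ID TENANT_ID BASE_DOMAIN PULL_SECRET \
           COMMODORE_API_URL COMMODORE_API_TOKEN; do
  # Warn about every variable that's still empty
  [ -n "${!var}" ] || echo "WARNING: $var is not set"
done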

Set up S3 bucket

  1. Create S3 bucket

    1. If a bucket user already exists for this cluster:

      # Use already existing bucket user
      response=$(curl -sH "Authorization: Bearer ${CLOUDSCALE_TOKEN}" \
        https://api.cloudscale.ch/v1/objects-users | \
        jq -e ".[] | select(.display_name == \"${CLUSTER_ID}\")")
    2. To create a new bucket user:

      # Create a new user
      response=$(curl -sH "Authorization: Bearer ${CLOUDSCALE_TOKEN}" \
        -F display_name=${CLUSTER_ID} \
        https://api.cloudscale.ch/v1/objects-users)
  2. Configure the Minio client

    mc config host add \
      "${CLUSTER_ID}" "https://objects.${REGION}.cloudscale.ch" \
      $(echo $response | jq -r '.keys[0].access_key') \
      $(echo $response | jq -r '.keys[0].secret_key')
    
    mc mb --ignore-existing \
      "${CLUSTER_ID}/${CLUSTER_ID}-bootstrap-ignition"

Set secrets in Vault

Connect with Vault

export VAULT_ADDR=https://vault-prod.syn.vshn.net
vault login -method=ldap username=<your.name>

Store various secrets in Vault

# Set the cloudscale.ch access secrets
vault kv put clusters/kv/${TENANT_ID}/${CLUSTER_ID}/cloudscale \
  token=${CLOUDSCALE_TOKEN} \
  s3_access_key=$(mc config host ls ${CLUSTER_ID} -json | jq -r .accessKey) \
  s3_secret_key=$(mc config host ls ${CLUSTER_ID} -json | jq -r .secretKey)

# Generate an HTTP secret for the registry
vault kv put clusters/kv/${TENANT_ID}/${CLUSTER_ID}/registry \
  httpSecret=$(LC_ALL=C tr -cd "A-Za-z0-9" </dev/urandom | head -c 128)

# Set the LDAP password
vault kv put clusters/kv/${TENANT_ID}/${CLUSTER_ID}/vshn-ldap \
  bindPassword=${LDAP_PASSWORD}

# Generate a master password for backups
vault kv put clusters/kv/${TENANT_ID}/${CLUSTER_ID}/cluster-backup \
  password=$(LC_ALL=C tr -cd "A-Za-z0-9" </dev/urandom | head -c 32)
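
Optionally, verify that the secrets can be read back (assuming your Vault token has read access to the path):

vault kv get clusters/kv/${TENANT_ID}/${CLUSTER_ID}/cloudscale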

OpenShift Installer Setup

For the following steps, change into a clean directory (for example a directory in your home).

These are the only steps which aren't idempotent and have to be completed uninterrupted in one go. If you have to recreate the install config or any of the generated manifests, you need to rerun all of the subsequent steps.
  1. Prepare install-config.yaml

    mkdir -p target
    
    cat > target/install-config.yaml <<EOF
    apiVersion: v1
    metadata:
      name: ${CLUSTER_ID}
    baseDomain: ${BASE_DOMAIN}
    platform:
      none: {}
    pullSecret: |
      ${PULL_SECRET}
    EOF
  2. Render install manifests (this will consume the install-config.yaml)

    openshift-install --dir target \
      create manifests
    1. If you want to change the default "apps" domain for the cluster:

      yq w -i target/manifests/cluster-ingress-02-config.yml \
        spec.domain apps.example.com
  3. Render and upload ignition config (this will consume all the manifests)

    openshift-install --dir target \
      create ignition-configs
    
    mc cp target/bootstrap.ign "${CLUSTER_ID}/${CLUSTER_ID}-bootstrap-ignition/"
    
    export TF_VAR_ignition_bootstrap=$(mc share download \
      --json --expire=4h \
      "${CLUSTER_ID}/${CLUSTER_ID}-bootstrap-ignition/bootstrap.ign" | jq -r '.share')

Terraform Cluster Config

Check Running Commodore for details on how to run commodore.

  1. Prepare Commodore inventory.

    This command will fail due to circular dependencies in the Commodore setup. You will see error messages starting with Cannot resolve ${openshift:*}. As long as all components have been cloned for the cluster, you can proceed.

    This can be improved once this issue is solved.

    # This will fail
    commodore catalog compile ${CLUSTER_ID}
  2. Prepare Terraform cluster config

    CA_CERT=$(jq -r '.ignition.security.tls.certificateAuthorities[0].source' \
      target/master.ign | \
      awk -F ',' '{ print $2 }' | \
      base64 --decode)
    
    pushd "inventory/classes/${TENANT_ID}/"
    
    yq w -i "${CLUSTER_ID}.yml" \
      "applications[+]" "openshift4-cloudscale"
    
    yq w -i "${CLUSTER_ID}.yml" \
      parameters.openshift.infraID -- "$(jq -r .infraID ../../../target/metadata.json)"
    yq w -i "${CLUSTER_ID}.yml" \
      parameters.openshift.clusterID -- "$(jq -r .clusterID ../../../target/metadata.json)"
    yq w -i "${CLUSTER_ID}.yml" \
      parameters.openshift.appsDomain -- "apps.${CLUSTER_ID}.${BASE_DOMAIN}"
    
    yq w -i "${CLUSTER_ID}.yml" \
      parameters.openshift4_cloudscale.variables.base_domain -- "${BASE_DOMAIN}"
    yq w -i "${CLUSTER_ID}.yml" \
      parameters.openshift4_cloudscale.variables.ignition_ca -- "${CA_CERT}"
    
    yq w -i "${CLUSTER_ID}.yml" \
      parameters.vshnLdap.serviceId -- ${LDAP_ID}
    
    git commit -a -m "Setup cluster ${CLUSTER_ID}"
    git push
    
    popd
  3. Compile and push Terraform setup

    commodore catalog compile ${CLUSTER_ID} --push -i

Provision Infrastructure

  1. Setup Terraform

    Prepare Terraform

    # Set terraform image and tag to be used
    tf_image=$(\
      yq r dependencies/openshift4-cloudscale/class/defaults.yml \
      parameters.openshift4_cloudscale.images.terraform.image)
    tf_tag=$(\
      yq r dependencies/openshift4-cloudscale/class/defaults.yml \
      parameters.openshift4_cloudscale.images.terraform.tag)
    
    # Generate the terraform alias
    alias terraform='docker run -it --rm \
      -e CLOUDSCALE_TOKEN="${CLOUDSCALE_TOKEN}" \
      -e TF_VAR_ignition_bootstrap="${TF_VAR_ignition_bootstrap}" \
      -w /tf \
      -v $(pwd):/tf \
      -v $CLUSTER_ID:/tf/.terraform \
      --ulimit memlock=-1 \
      ${tf_image}:${tf_tag} terraform'
    
    export GITLAB_STATE_URL="https://git.vshn.net/api/v4/projects/${GITLAB_CATALOG_PROJECT_ID}/terraform/state/cluster"
    
    pushd catalog/manifests/openshift4-cloudscale/

    Initialize Terraform

    terraform init \
      "-backend-config=address=${GITLAB_STATE_URL}" \
      "-backend-config=lock_address=${GITLAB_STATE_URL}/lock" \
      "-backend-config=unlock_address=${GITLAB_STATE_URL}/lock" \
      "-backend-config=username=$(whoami)" \
      "-backend-config=password=${GITLAB_TOKEN}" \
      "-backend-config=lock_method=POST" \
      "-backend-config=unlock_method=DELETE" \
      "-backend-config=retry_wait_min=5"
  2. Provision bootstrap node

    cat > override.tf <<EOF
    module "cluster" {
      bootstrap_count = 1
      master_count    = 0
      infra_count     = 0
      worker_count    = 0
    }
    EOF
    
    terraform apply
  3. Create the DNS records currently shown in the Terraform output (the remaining ones are added in a later step)
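
    They're part of the Terraform output and can be displayed with:

    terraform output -json | jq -r ".cluster_dns.value"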

  4. Wait for the DNS records to propagate!

    sleep 600
    host "api.${CLUSTER_ID}.${BASE_DOMAIN}"
  5. Provision master nodes

    cat > override.tf <<EOF
    module "cluster" {
      bootstrap_count = 1
      infra_count     = 0
      worker_count    = 0
    }
    EOF
    
    terraform apply
  6. Add the remaining DNS records to the previous ones.

    terraform output -json | jq -r ".cluster_dns.value"
  7. Wait for bootstrap to complete

    openshift-install --dir ../../../target \
      wait-for bootstrap-complete
  8. Remove bootstrap node and provision infra nodes

    cat > override.tf <<EOF
    module "cluster" {
      worker_count    = 0
    }
    EOF
    
    terraform apply
    
    export KUBECONFIG="$(pwd)/../../../target/auth/kubeconfig"
    
    # Once CSRs in state Pending show up, approve them
    # Needs to be run twice, two CSRs for each node need to be approved
    while sleep 3; do \
      oc get csr -o go-template='{{range .items}}{{if not .status}}{{.metadata.name}}{{"\n"}}{{end}}{{end}}' | \
      xargs oc adm certificate approve; \
    done
    
    kubectl get nodes -lnode-role.kubernetes.io/worker
    kubectl label node -lnode-role.kubernetes.io/worker \
      node-role.kubernetes.io/infra=""
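
    To verify the nodes picked up the infra role label (optional):

    kubectl get nodes -lnode-role.kubernetes.io/infra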
  9. Wait for installation to complete

    openshift-install --dir ../../../target \
      wait-for install-complete
  10. Provision worker nodes

    rm override.tf
    
    terraform apply
    
    # Once CSRs in state Pending show up, approve them
    # Needs to be run twice, two CSRs for each node need to be approved
    while sleep 3; do \
      oc get csr -o go-template='{{range .items}}{{if not .status}}{{.metadata.name}}{{"\n"}}{{end}}{{end}}' | \
      xargs oc adm certificate approve; \
    done
    
    kubectl label --overwrite node -lnode-role.kubernetes.io/worker \
      node-role.kubernetes.io/app=""
    kubectl label node -lnode-role.kubernetes.io/infra \
      node-role.kubernetes.io/app-
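
    A quick way to double-check the resulting node roles (optional):

    kubectl get nodes -L node-role.kubernetes.io/app -L node-role.kubernetes.io/infra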
  11. Create secret with S3 credentials for the registry (will be automated)

    oc create secret generic image-registry-private-configuration-user \
      --namespace openshift-image-registry \
      --from-literal=REGISTRY_STORAGE_S3_ACCESSKEY=$(mc config host ls ${CLUSTER_ID} -json | jq -r .accessKey) \
      --from-literal=REGISTRY_STORAGE_S3_SECRETKEY=$(mc config host ls ${CLUSTER_ID} -json | jq -r .secretKey)
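
    To confirm the secret exists (optional):

    oc -n openshift-image-registry get secret image-registry-private-configuration-user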
  12. Make the cluster Project Syn enabled

    Install Steward on the cluster (see wiki for more details):

    export LIEUTENANT_NS="lieutenant-prod" # or lieutenant-[dev,int] accordingly
    export LIEUTENANT_AUTH="Authorization:Bearer ${COMMODORE_API_TOKEN}"
    
    # Reset the token
    curl \
      -H "${LIEUTENANT_AUTH}" \
      -H "Content-Type: application/json-patch+json" \
      -X PATCH \
      -d '[{ "op": "remove", "path": "/status/bootstrapToken" }]' \
      "https://rancher.vshn.net/k8s/clusters/c-c6j2w/apis/syn.tools/v1alpha1/namespaces/${LIEUTENANT_NS}/clusters/${CLUSTER_ID}/status"
    
    kubectl --kubeconfig target/auth/kubeconfig apply -f $(curl -sH "${LIEUTENANT_AUTH}" "${COMMODORE_API_URL}/clusters/${CLUSTER_ID}" | jq -r ".installURL")
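
    Once the manifest is applied, Steward should start on the cluster; a quick check (assuming the default syn namespace and the KUBECONFIG exported earlier):

    kubectl -n syn get pods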
  13. Save the admin credentials in the password manager. You can find the password in the file target/auth/kubeadmin-password and the kubeconfig in target/auth/kubeconfig.

    popd
    ls -l target/auth/
  14. Delete local config files

    rm -r target/