Install OpenShift 4 on OpenStack

Steps to install an OpenShift 4 cluster on Red Hat OpenStack.

These steps follow the Installing a cluster on OpenStack docs to set up an installer-provisioned infrastructure (IPI) installation.

This how-to guide is an early draft. So far, we’ve set up only one cluster using the instructions in this guide.

The certificates created during bootstrap are only valid for 24 hours, so make sure you complete these steps within 24 hours.

Starting situation

  • You already have a Project Syn Tenant and its Git repository

  • You have a CCSP Red Hat login and are logged into the Red Hat OpenShift Cluster Manager

    Don’t use your personal account to log in to the cluster manager for installation.
  • You want to register a new cluster in Lieutenant and are about to install OpenShift 4 on OpenStack

Prerequisites

  • jq

  • yq YAML processor (version 4 or higher; use the Go version by mikefarah, not the jq wrapper by kislyuk)

  • openshift-install (direct download: linux, macOS)

  • oc (direct download: linux, macOS)

  • kubectl

  • vault (Vault CLI)

  • curl

  • gzip

  • docker

  • unzip

  • openstack CLI

    The OpenStack CLI is available as a Python package.

    Ubuntu/Debian
    sudo apt install python3-openstackclient
    Arch
    yay -S python-openstackclient
    macOS
    brew install openstackclient

    Optionally, you can also install additional CLIs for object storage (swift) and images (glance).
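
    For example, on Ubuntu/Debian (assuming the distribution packages python3-swiftclient and python3-glanceclient):

    sudo apt install python3-swiftclient python3-glanceclient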

Cluster Installation

Register the new OpenShift 4 cluster in Lieutenant.

Lieutenant API endpoint

Use the following endpoint for Lieutenant:

Set cluster facts

For customer clusters, set the following cluster facts in Lieutenant:

  • access_policy: Access-Policy of the cluster, such as regular or swissonly

  • service_level: Name of the service level agreement for this cluster, such as guaranteed-availability

  • sales_order: Name of the sales order to which the cluster is billed, such as S10000

  • release_channel: Name of the syn component release channel to use, such as stable
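
For illustration, the resulting facts might look like this on the cluster’s Lieutenant object (example values only, adjust for your cluster):

facts:
  access_policy: regular
  service_level: guaranteed-availability
  sales_order: S10000
  release_channel: stable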

Set up LDAP service

  1. Create an LDAP service

    Use control.vshn.net/vshn/services/_create to create a service. The name must contain the customer and the cluster name. Then put the LDAP service ID and password in the following variables:

    export LDAP_ID="Your_LDAP_ID_here"
    export LDAP_PASSWORD="Your_LDAP_pw_here"

Use the same casing as the underlying LDAP service. You can find the service ID in the hover text in the VSHN Control Panel.

LDAP Service hover text

Configure input

OpenStack API
export OS_AUTH_URL=<openstack authentication URL> (1)
1 Provide the URL with the leading https://
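
For example, a hypothetical Keystone v3 endpoint:

export OS_AUTH_URL=https://keystone.cloud.example.com:5000/v3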
OpenStack credentials
export OS_USERNAME=<username>
export OS_PASSWORD=<password>
OpenStack project, region and domain details
export OS_PROJECT_NAME=<project name>
export OS_PROJECT_DOMAIN_NAME=<project domain name>
export OS_USER_DOMAIN_NAME=<user domain name>
export OS_REGION_NAME=<region name>
export OS_PROJECT_ID=$(openstack project show $OS_PROJECT_NAME -f json | jq -r .id) (1)
1 TBD if really needed
Cluster machine network
export MACHINE_NETWORK_CIDR=<machine network cidr>
export EXTERNAL_NETWORK_NAME=<external network name> (1)
1 The instructions create floating IPs for the API and ingress in the specified network.
VM flavors
export CONTROL_PLANE_FLAVOR=<flavor name> (1)
export INFRA_FLAVOR=<flavor name> (1)
export APP_FLAVOR=<flavor name> (1)
1 Check openstack flavor list for available options.
Access to VSHN Lieutenant
# For example: https://api.syn.vshn.net
# IMPORTANT: do NOT add a trailing `/`. Commands below will fail.
export COMMODORE_API_URL=<lieutenant-api-endpoint>

# Set Project Syn cluster and tenant ID
export CLUSTER_ID=<lieutenant-cluster-id> # Looks like: c-<something>
export TENANT_ID=$(curl -sH "Authorization: Bearer $(commodore fetch-token)" ${COMMODORE_API_URL}/clusters/${CLUSTER_ID} | jq -r .tenant)
OpenShift configuration
export BASE_DOMAIN=<your-base-domain> # customer-provided base domain without cluster name, e.g. "zrh.customer.vshnmanaged.net"
export PULL_SECRET='<redhat-pull-secret>' # As copied from https://cloud.redhat.com/openshift/install/pull-secret ("Copy pull secret"). The value must be inside quotes.

For BASE_DOMAIN explanation, see DNS Scheme.

Set secrets in Vault

Connect with Vault
export VAULT_ADDR=https://vault-prod.syn.vshn.net
vault login -method=oidc
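
If you’re unsure whether your Vault session is still valid, you can check it with:

vault token lookup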
Store various secrets in Vault
# Store OpenStack credentials
vault kv put  clusters/kv/${TENANT_ID}/${CLUSTER_ID}/openstack/credentials \
  username=${OS_USERNAME} \
  password=${OS_PASSWORD}

# Generate an HTTP secret for the registry
vault kv put clusters/kv/${TENANT_ID}/${CLUSTER_ID}/registry \
  httpSecret=$(LC_ALL=C tr -cd "A-Za-z0-9" </dev/urandom | head -c 128)

# Set the LDAP password
vault kv put clusters/kv/${TENANT_ID}/${CLUSTER_ID}/vshn-ldap \
  bindPassword=${LDAP_PASSWORD}

# Generate a master password for K8up backups
vault kv put clusters/kv/${TENANT_ID}/${CLUSTER_ID}/global-backup \
  password=$(LC_ALL=C tr -cd "A-Za-z0-9" </dev/urandom | head -c 32)

# Generate a password for the cluster object backups
vault kv put clusters/kv/${TENANT_ID}/${CLUSTER_ID}/cluster-backup \
  password=$(LC_ALL=C tr -cd "A-Za-z0-9" </dev/urandom | head -c 32)

# Copy the VSHN acme-dns registration password
vault kv get -format=json "clusters/kv/template/cert-manager" | jq '.data.data' \
  | vault kv put -cas=0 "clusters/kv/${TENANT_ID}/${CLUSTER_ID}/cert-manager" -

Setup floating IPs and DNS records for the API and ingress

  1. Create floating IPs in the OpenStack API

    export API_VIP=$(openstack floating ip create \
      --description "API ${CLUSTER_ID}.${BASE_DOMAIN}" "${EXTERNAL_NETWORK_NAME}" \
      -f json | jq -r .floating_ip_address)
    export INGRESS_VIP=$(openstack floating ip create \
      --description "Ingress ${CLUSTER_ID}.${BASE_DOMAIN}" "${EXTERNAL_NETWORK_NAME}" \
      -f json | jq -r .floating_ip_address)
  2. Create the initial DNS zone for the cluster

    cat <<EOF
    \$ORIGIN ${CLUSTER_ID}.${BASE_DOMAIN}.
    
    api       IN A     ${API_VIP}
    ingress   IN A     ${INGRESS_VIP}
    
    *.apps    IN CNAME ingress.${CLUSTER_ID}.${BASE_DOMAIN}.
    EOF

    This step assumes that DNS for the cluster is managed by VSHN. See the VSHN zonefiles repo for details.

Create security group for Cilium

  1. Create a security group

    CILIUM_SECURITY_GROUP_ID=$(openstack security group create ${CLUSTER_ID}-cilium \
      --description "Cilium CNI security group rules for ${CLUSTER_ID}" -f json | \
      jq -r .id)
  2. Create rules for Cilium traffic

    openstack security group rule create --protocol tcp --remote-ip "$MACHINE_NETWORK_CIDR" \
      --dst-port 4240 --description "Cilium health checks" "$CILIUM_SECURITY_GROUP_ID"
    openstack security group rule create --protocol tcp --remote-ip "$MACHINE_NETWORK_CIDR" \
      --dst-port 4244 --description "Cilium Hubble server" "$CILIUM_SECURITY_GROUP_ID"
    openstack security group rule create --protocol tcp --remote-ip "$MACHINE_NETWORK_CIDR" \
      --dst-port 4245 --description "Cilium Hubble relay" "$CILIUM_SECURITY_GROUP_ID"
    openstack security group rule create --protocol tcp --remote-ip "$MACHINE_NETWORK_CIDR" \
      --dst-port 6942 --description "Cilium operator metrics" "$CILIUM_SECURITY_GROUP_ID"
    openstack security group rule create --protocol tcp --remote-ip "$MACHINE_NETWORK_CIDR" \
      --dst-port 2112 --description "Cilium Hubble enterprise metrics" "$CILIUM_SECURITY_GROUP_ID"
    openstack security group rule create --protocol udp --remote-ip "$MACHINE_NETWORK_CIDR" \
      --dst-port 8472 --description "Cilium VXLAN" "$CILIUM_SECURITY_GROUP_ID"
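
    To review the rules you just created, list them for the security group:

    openstack security group rule list "$CILIUM_SECURITY_GROUP_ID"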

Prepare Cluster Repository

Starting with this section, we recommend that you change into a clean directory (for example, a new directory in your home directory).

Check Running Commodore for details on how to run commodore.

  1. Prepare Commodore inventory.

    mkdir -p inventory/classes/
    git clone $(curl -sH"Authorization: Bearer $(commodore fetch-token)" "${COMMODORE_API_URL}/tenants/${TENANT_ID}" | jq -r '.gitRepo.url') inventory/classes/${TENANT_ID}
  2. Add Cilium to cluster configuration

    pushd "inventory/classes/${TENANT_ID}/"
    
    yq eval -i '.applications += ["cilium"]' ${CLUSTER_ID}.yml
    
    yq eval -i '.parameters.networkpolicy.networkPlugin = "cilium"' ${CLUSTER_ID}.yml
    yq eval -i '.parameters.networkpolicy.ignoredNamespaces = ["openshift-oauth-apiserver"]' ${CLUSTER_ID}.yml
    
    yq eval -i '.parameters.openshift4_monitoring.upstreamRules.networkPlugin = "cilium"' ${CLUSTER_ID}.yml
    
    yq eval -i '.parameters.openshift.infraID = "TO_BE_DEFINED"' ${CLUSTER_ID}.yml
    yq eval -i '.parameters.openshift.clusterID = "TO_BE_DEFINED"' ${CLUSTER_ID}.yml
    
    git commit -a -m "Add Cilium addon to ${CLUSTER_ID}"
    
    git push
    popd
  3. Compile catalog

    commodore catalog compile ${CLUSTER_ID} --push -i \
      --dynamic-fact kubernetesVersion.major=$(echo "1.26" | awk -F. '{print $1}') \
      --dynamic-fact kubernetesVersion.minor=$(echo "1.26" | awk -F. '{print $2}') \
      --dynamic-fact openshiftVersion.Major=$(echo "4.13" | awk -F. '{print $1}') \
      --dynamic-fact openshiftVersion.Minor=$(echo "4.13" | awk -F. '{print $2}')
    This commodore call requires Commodore v1.5.0 or newer. Please make sure to update your local installation.

Configure the OpenShift Installer

  1. Generate SSH key

    We generate a unique SSH key pair for the cluster as this gives us troubleshooting access.

    SSH_PRIVATE_KEY="$(pwd)/ssh_$CLUSTER_ID"
    export SSH_PUBLIC_KEY="${SSH_PRIVATE_KEY}.pub"
    
    ssh-keygen -C "vault@$CLUSTER_ID" -t ed25519 -f $SSH_PRIVATE_KEY -N ''
    
    BASE64_NO_WRAP='base64'
    if [[ "$OSTYPE" == "linux"* ]]; then
      BASE64_NO_WRAP='base64 --wrap 0'
    fi
    
    vault kv put clusters/kv/${TENANT_ID}/${CLUSTER_ID}/openstack/ssh \
      private_key=$(cat $SSH_PRIVATE_KEY | eval "$BASE64_NO_WRAP")
    
    ssh-add $SSH_PRIVATE_KEY
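
    If you later need the key for troubleshooting, you can retrieve it from Vault again (a sketch; use base64 -D instead of --decode on older macOS):

    vault kv get -field=private_key \
      "clusters/kv/${TENANT_ID}/${CLUSTER_ID}/openstack/ssh" | base64 --decode > "ssh_${CLUSTER_ID}"
    chmod 600 "ssh_${CLUSTER_ID}"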
  2. Prepare install-config.yaml

    You can add more options to the install-config.yaml file. Have a look at the installation configuration parameters for more information.

    export INSTALLER_DIR="$(pwd)/target"
    mkdir -p "${INSTALLER_DIR}"
    
    cat > "${INSTALLER_DIR}/clouds.yaml" <<EOF
    clouds:
      shiftstack:
        auth:
          auth_url: ${OS_AUTH_URL}
          project_name: ${OS_PROJECT_NAME}
          username: ${OS_USERNAME}
          password: ${OS_PASSWORD}
          user_domain_name: ${OS_USER_DOMAIN_NAME}
          project_domain_name: ${OS_PROJECT_DOMAIN_NAME}
    EOF
    
    cat > "${INSTALLER_DIR}/install-config.yaml" <<EOF
    apiVersion: v1
    metadata:
      name: ${CLUSTER_ID}
    baseDomain: ${BASE_DOMAIN}
    compute: (1)
      - architecture: amd64
        hyperthreading: Enabled
        name: worker
        replicas: 3
        platform:
          openstack:
            type: ${APP_FLAVOR}
            rootVolume:
              size: 100
              type: __DEFAULT__ # TODO: is this generally applicable?
            additionalSecurityGroupIDs: (2)
              - ${CILIUM_SECURITY_GROUP_ID}
    controlPlane:
      architecture: amd64
      hyperthreading: Enabled
      name: master
      replicas: 3
      platform:
        openstack:
          type: ${CONTROL_PLANE_FLAVOR}
          rootVolume:
            size: 100
            type: __DEFAULT__ # TODO: is this generally applicable?
          additionalSecurityGroupIDs: (2)
            - ${CILIUM_SECURITY_GROUP_ID}
    platform:
      openstack:
        cloud: shiftstack (3)
        externalNetwork: ${EXTERNAL_NETWORK_NAME}
        apiFloatingIP: ${API_VIP}
        ingressFloatingIP: ${INGRESS_VIP}
    networking:
      networkType: Cilium
      machineNetwork:
        - cidr: ${MACHINE_NETWORK_CIDR}
    pullSecret: |
      ${PULL_SECRET}
    sshKey: "$(cat $SSH_PUBLIC_KEY)"
    EOF
    1 We only provision a single compute machine set. The final machine sets will be configured through Project Syn.
    2 We attach the Cilium security group to both the control plane and the worker nodes. This ensures that there are no issues with Cilium traffic during bootstrapping.
    3 This field must match the entry in clouds in the clouds.yaml file. If you’re following this guide, you shouldn’t need to adjust this.

    If you set a custom CIDR for the OpenShift networking, update the corresponding values in your Commodore cluster definitions. See Cilium Component Defaults and Parameter Reference. Verify the rendered configuration with less catalog/manifests/cilium/olm/*ciliumconfig.yaml.

Prepare the OpenShift Installer

The steps in this section aren’t idempotent and have to be completed uninterrupted in one go. If you have to recreate the install config or any of the generated manifests, you need to rerun all of the subsequent steps.
  1. Render install manifests (this will consume the install-config.yaml)

    openshift-install --dir "${INSTALLER_DIR}" \
      create manifests
    1. If you want to change the default "apps" domain for the cluster:

      yq eval -i '.spec.domain = "apps.example.com"' \
        "${INSTALLER_DIR}/manifests/cluster-ingress-02-config.yml"
  2. Copy pre-rendered Cilium manifests

    cp catalog/manifests/cilium/olm/* "${INSTALLER_DIR}/manifests/"
  3. Extract the cluster domain from the generated manifests

    export CLUSTER_DOMAIN=$(yq e '.spec.baseDomain' \
      "${INSTALLER_DIR}/manifests/cluster-dns-02-config.yml")
  4. Prepare install manifests and ignition config

    openshift-install --dir "${INSTALLER_DIR}" \
      create ignition-configs

Update Project Syn cluster config

  1. Switch to the tenant repo

    pushd "inventory/classes/${TENANT_ID}/"
  2. Include openshift4.yml if it exists

    if ls openshift4.y*ml 1>/dev/null 2>&1; then
        yq eval -i '.classes += ".openshift4"' ${CLUSTER_ID}.yml;
    fi
  3. Update cluster config

    yq eval -i ".parameters.openshift.baseDomain = \"${CLUSTER_DOMAIN}\"" \
      ${CLUSTER_ID}.yml
    
    yq eval -i ".parameters.openshift.infraID = \"$(jq -r .infraID "${INSTALLER_DIR}/metadata.json")\"" \
      ${CLUSTER_ID}.yml
    
    yq eval -i ".parameters.openshift.clusterID = \"$(jq -r .clusterID "${INSTALLER_DIR}/metadata.json")\"" \
      ${CLUSTER_ID}.yml
    
    yq eval -i ".parameters.openshift.ssh_key = \"$(cat ${SSH_PUBLIC_KEY})\"" \
      ${CLUSTER_ID}.yml
    
    yq eval -i ".parameters.vshnLdap.serviceId = \"${LDAP_ID}\"" \
      ${CLUSTER_ID}.yml

    If you use a custom "apps" domain, make sure to set parameters.openshift.appsDomain accordingly.

    APPS_DOMAIN=your.custom.apps.domain
    yq eval -i ".parameters.openshift.appsDomain = \"${APPS_DOMAIN}\"" \
      ${CLUSTER_ID}.yml

    By default, the cluster’s update channel is derived from the cluster’s reported OpenShift version. If you want to use a custom update channel, make sure to set parameters.openshift4_version.spec.channel accordingly.

    # Configure the OpenShift update channel as `fast` (adjust 4.13 to your cluster's minor version)
    yq eval -i ".parameters.openshift4_version.spec.channel = \"fast-4.13\"" \
      ${CLUSTER_ID}.yml
  4. Configure OpenStack parameters

    yq eval -i ".parameters.openshift.openstack.app_flavor = \"${APP_FLAVOR}\"" \
      ${CLUSTER_ID}.yml
    
    yq eval -i ".parameters.openshift.openstack.infra_flavor = \"${INFRA_FLAVOR}\"" \
      ${CLUSTER_ID}.yml

Commit changes and compile cluster catalog

  1. Review changes. Have a look at the file ${CLUSTER_ID}.yml. Override default parameters or add more component configurations as required for your cluster.

  2. Commit changes

    git commit -a -m "Setup cluster ${CLUSTER_ID}"
    git push
    
    popd
  3. Compile and push cluster catalog

    commodore catalog compile ${CLUSTER_ID} --push -i \
      --dynamic-fact kubernetesVersion.major=$(echo "1.26" | awk -F. '{print $1}') \
      --dynamic-fact kubernetesVersion.minor=$(echo "1.26" | awk -F. '{print $2}') \
      --dynamic-fact openshiftVersion.Major=$(echo "4.13" | awk -F. '{print $1}') \
      --dynamic-fact openshiftVersion.Minor=$(echo "4.13" | awk -F. '{print $2}')
    This commodore call requires Commodore v1.5.0 or newer. Please make sure to update your local installation.

Provision the cluster

The steps in this section must be run on a host which can reach the OpenStack API. If you can’t reach the OpenStack API directly, but an SSH jumphost is available, you can set up a SOCKS5 proxy with the following commands:

export JUMPHOST_FQDN=<jumphost fqdn or alias from your SSH config> (1)
ssh -D 12000 -q -f -N ${JUMPHOST_FQDN} (2)
export https_proxy=socks5://localhost:12000 (3)
export CURL_OPTS="-xsocks5h://localhost:12000"
1 The FQDN or SSH alias of the host which can reach the OpenStack API
2 This command expects that your SSH config is set up so that ssh ${JUMPHOST_FQDN} works without further configuration
3 The openshift-install tool respects the https_proxy environment variable
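
To verify that the proxy works, you can, for example, query the OpenStack API through it, reusing the CURL_OPTS variable set above:

curl ${CURL_OPTS} -s "${OS_AUTH_URL}" | jq .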
  1. Run the OpenShift installer

    openshift-install --dir "${INSTALLER_DIR}" \
      create cluster --log-level=debug

Access cluster API

  1. Export kubeconfig

    export KUBECONFIG="${INSTALLER_DIR}/auth/kubeconfig"
  2. Verify API access

    kubectl cluster-info

If the cluster API is only reachable with a SOCKS5 proxy, run the following commands instead:

cp ${INSTALLER_DIR}/auth/kubeconfig ${INSTALLER_DIR}/auth/kubeconfig-socks5
yq eval -i '.clusters[0].cluster.proxy-url="socks5://localhost:12000"' \
    ${INSTALLER_DIR}/auth/kubeconfig-socks5
export KUBECONFIG="${INSTALLER_DIR}/auth/kubeconfig-socks5"

Create a server group for the infra nodes

  1. Create the server group

    openstack server group create "$(jq -r .infraID "${INSTALLER_DIR}/metadata.json")-infra" \
      --policy soft-anti-affinity
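
    To double-check, you can list the existing server groups:

    openstack server group list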

Configure registry S3 credentials

  1. Create secret with S3 credentials for the registry

    oc create secret generic image-registry-private-configuration-user \
      --namespace openshift-image-registry \
      --from-literal=REGISTRY_STORAGE_S3_ACCESSKEY=<TBD> \
      --from-literal=REGISTRY_STORAGE_S3_SECRETKEY=<TBD>

    If the registry S3 credentials are created too long after the initial cluster setup, it’s possible that the openshift-samples operator has disabled itself because it couldn’t find a working in-cluster registry.

    If the samples operator is disabled, no templates and builder images will be available on the cluster.

    You can check the samples-operator’s state with the following command:

    kubectl get config.samples cluster -ojsonpath='{.spec.managementState}'

    If the command returns Removed, verify that the in-cluster registry pods are now running, and enable the samples operator again:

    kubectl patch config.samples cluster -p '{"spec":{"managementState":"Managed"}}'

    See the upstream documentation for more details on the samples operator.

Setup acme-dns CNAME records for the cluster

You can skip this section if you’re not using Let’s Encrypt for the cluster’s API and default wildcard certificates.
  1. Extract the acme-dns subdomain for the cluster after cert-manager has been deployed via Project Syn.

    fulldomain=$(kubectl -n syn-cert-manager \
      get secret acme-dns-client \
      -o jsonpath='{.data.acmedns\.json}' | \
      base64 -d  | \
      jq -r '[.[]][0].fulldomain')
    echo "$fulldomain"
  2. Add the following CNAME records to the cluster’s DNS zone

    The _acme-challenge records must be created in the same zone as the cluster’s api and apps records respectively.

    $ORIGIN <cluster-zone> (2)
    _acme-challenge.api  IN CNAME <fulldomain>. (1)
    $ORIGIN <apps-base-domain> (3)
    _acme-challenge.apps IN CNAME <fulldomain>. (1)
    1 Replace <fulldomain> with the output of the previous step.
    2 The _acme-challenge.api record must be created in the same origin as the api record.
    3 The _acme-challenge.apps record must be created in the same origin as the apps record.

Store the cluster’s admin credentials in the password manager

  1. Once the cluster’s production API certificate has been deployed, edit the cluster’s admin kubeconfig file to remove the initial API certificate CA.

    You may see the error Unable to connect to the server: x509: certificate signed by unknown authority when executing kubectl or oc commands after the cluster’s production API certificate has been deployed by Project Syn.

    This error can be addressed by removing the initial CA certificate data from the admin kubeconfig as shown in this step.

    yq e -i 'del(.clusters[0].cluster.certificate-authority-data)' \
      "${INSTALLER_DIR}/auth/kubeconfig"
  2. Save the admin credentials in the password manager. You can find the password in the file target/auth/kubeadmin-password and the kubeconfig in target/auth/kubeconfig.

    ls -l ${INSTALLER_DIR}/auth/

Finalize installation

  1. Verify that the Project Syn-managed machine sets have been provisioned

    kubectl -n openshift-machine-api get machineset -l argocd.argoproj.io/instance

    The command should show something like

    NAME    DESIRED   CURRENT   READY   AVAILABLE   AGE
    app     3         3         3       3           4d5h (1)
    infra   4         4         4       4           4d5h (1)
    1 The values for DESIRED and AVAILABLE should match.

    If there are discrepancies between the desired and available counts of the machine sets, you can list the machine objects which aren’t in phase "Running":

    kubectl -n openshift-machine-api get machine | grep -v Running

    You can see errors by looking at an individual machine object with kubectl describe.
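
    For example, replace the machine name with one from the listing above:

    kubectl -n openshift-machine-api describe machine <machine-name>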

  2. If the Project Syn-managed machine sets are healthy, scale down the initial worker machine set

    If the Project Syn-managed machine sets aren’t healthy, this step may reduce the cluster capacity to the point where infrastructure components can’t run. Make sure you have sufficient cluster capacity before continuing.

    INFRA_ID=$(jq -r .infraID "${INSTALLER_DIR}/metadata.json")
    kubectl -n openshift-machine-api patch machineset ${INFRA_ID}-worker \
      -p '{"spec": {"replicas": 0}}' --type merge
  3. Once the initial machine set is scaled down, verify that all pods are still running. The command below should produce no output.

    kubectl get pods -A | grep -vw -e Running -e Completed
  4. If all pods are still running, delete the initial machine set

    kubectl -n openshift-machine-api delete machineset ${INFRA_ID}-worker
  5. Delete local config files

    rm -r ${INSTALLER_DIR}/

Post tasks

VSHN

  1. Enable automated upgrades

  2. Add the cluster to the maintenance template, if necessary

Generic