# Maintenance and Update of an OpenShift 4 cluster

1. Get list of available updates:

``````oc adm upgrade --as cluster-admin

VERSION IMAGE
4.5.19  quay.io/openshift-release-dev/ocp-release@sha256:bae5510f19324d8e9c313aaba767e93c3a311902f5358fe2569e380544d9113e
4.5.20  quay.io/openshift-release-dev/ocp-release@sha256:78b878986d2d0af6037d637aa63e7b6f80fc8f17d0f0d5b077ac6aca83f792a0
4.5.24  quay.io/openshift-release-dev/ocp-release@sha256:f3ce0aeebb116bbc7d8982cc347ffc68151c92598dfb0cc45aaf3ce03bb09d11``````

or

``kubectl --as cluster-admin get clusterversion version -o json | jq '.status.availableUpdates[] | {image: .image, version: .version}'``
 If you don’t get the newest available version, this might be intended. Red Hat does release new updates to specific cluster, when they do have no known issues. So on a stable channel you need some patience!
1. Update the configuration hierarchy

Set the following parameters to the values retrieved in the previous step:

• `parameters.openshift4_version.spec.desiredUpdate.image`

• `parameters.openshift4_version.spec.desiredUpdate.version`

2. Compile the cluster catalog

3. Enjoy the show

Let the OpenShift operators do their job.

``kubectl --as cluster-admin get clusterversion version --watch``
4. Check the upgrade state via the `oc` command:

``````$oc adm upgrade --as cluster-admin Cluster version is 4.5.24 No updates available. You may force an upgrade to a specific release image, but doing so may not be supported and result in downtime or data loss.``````  Even if `oc adm upgrade` shows that the upgrade has completed, it’s possible that nodes are still being upgraded. 5. Check node upgrade status by checking the status of the `MachineConfigPool` resources: ``````$ oc --as=cluster-admin -n openshift-machine-config-operator get machineconfigpool
master   rendered-master-92e100dc64d7c9ecf669b1f69cdb5dca   True      False      False      3              3                   3                     0                      19d
worker   rendered-worker-4648c4badfb057c7e3e9f1030fa42507   True      False      False      6              6                   6                     0                      19d``````
 Applications on the cluster may get rescheduled without prior notice as long as the worker `MachineConfigPool` doesn’t show `Updated=True`. You can observe the progress of the node upgrades with ``oc --as=cluster-admin get mcp -w``

So far, the upgrade process mostly just worked. Nevertheless, we’ve started documenting how to observe the upgrade process in the following section. More troubleshooting instructions will be added there as we gain experience.

For general information about the upgrade process, check out Updating a cluster between minor versions of the OpenShift 4 documentation.

Also have a look at the blog post The Ultimate Guide to OpenShift Release and Upgrade Process for Cluster Administrators which is an excellent source to understand the process.

## Troubleshooting

### General troubleshooting

Get some more detailed information about the current state of the upgrade:

``kubectl --as cluster-admin get clusterversions.config.openshift.io version -o jsonpath={.status.conditions}  | jq .``
 In general: In the case of an error or warning, try to get more detailed information from the log from the specific operator or controller. Be patient as some warnings are just temporary and the operators might be able to relieve themselves from a degraded state!

Follow the log of an operator or controller:

``````kubectl --as cluster-admin -n openshift-machine-config-operator logs deployment/machine-config-operator -f
kubectl --as cluster-admin -n openshift-machine-config-operator logs deployment/machine-config-controller -f``````

• List latest `MachineConfig` object for each machine pool:

``````POOL_COUNT=$(kubectl --as=cluster-admin -n openshift-machine-config-operator get machineconfigpool --no-headers | wc -l) kubectl --as=cluster-admin -n openshift-machine-config-operator get machineconfig \ --sort-by=".metadata.creationTimestamp" | grep "^rendered-" | tail -n "${POOL_COUNT}"``````
• List nodes with their current and desired `MachineConfig` objects:

``kubectl --as=cluster-admin get nodes -ocustom-columns="NAME:.metadata.name,Current Config:.metadata.annotations.machineconfiguration\.openshift\.io/currentConfig,Desired Config:.metadata.annotations.machineconfiguration\.openshift\.io/desiredConfig"``
• Check `machine-config-daemon` pod logs on the node(s) for which current and desired `MachineConfig` objects don’t match.

The `machine-config-daemon` logs contain the `kubectl drain` logs for the node among other things.

``````NODE=<node-name>
POD=$(kubectl --as=cluster-admin -o jsonpath='{.items[0].metadata.name}' \ -n openshift-machine-config-operator get pods \ --field-selector="spec.nodeName=${NODE}" -l k8s-app=machine-config-daemon)
• If nodes get stuck in `NotReady` during the upgrade process, check whether the VM got stuck trying to reboot itself into the new image: