Force reboot of all nodes in a machine config pool

Starting situation

  • You have admin-level access to the OpenShift 4 cluster

  • You want to trigger node reboots for a whole machine config pool

Prerequisites

The following CLI utilities must be installed:

  • kubectl

  • oc (The commands assume you have v4.13 or newer)

  • jq
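
A quick way to confirm the tools listed above are on your PATH before starting is sketched below. The helper name `missing_tools` is made up for this example and is not part of any of the listed CLIs. It only checks that each command can be found, not which version is installed.

```shell
# Hypothetical helper: print the names of all given tools that are not on PATH.
missing_tools() {
  result=""
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      # Append with a separating space, but only after the first entry
      result="${result:+$result }$tool"
    fi
  done
  printf '%s' "$result"
}

missing=$(missing_tools kubectl oc jq)
if [ -n "$missing" ]; then
  echo "Missing required tools: $missing" >&2
fi
```

If nothing is printed, all three utilities were found.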

Reboot nodes

  1. Select the machine config pool whose nodes you want to reboot

    MCP=<name> (1)
    1 Replace <name> with the name of the machine config pool whose nodes you want to reboot
  2. List all nodes belonging to the pool

    node_selector=$( \
      kubectl get mcp "${MCP}" -ojsonpath='{.spec.nodeSelector.matchLabels}' | \
      jq -r '. as $root | [. | keys[] | "\(.)=\($root[.])"] | join(",")' \
    )
    kubectl get nodes -l "$node_selector"
  3. Prepare the nodes for a forced machine config resync

    for node in $(kubectl get nodes -oname -l "$node_selector"); do
      oc --as=cluster-admin debug "$node" -- chroot /host touch /run/machine-config-daemon-force
    done
  4. Select an old rendered machine config for the pool

    The command selects the second-newest rendered machine config. The exact value doesn’t matter, but the currentConfig annotation must be overwritten with the name of an existing machine config so that the operator doesn’t mark the nodes as degraded.

    old_mc=$(kubectl get mc -o json | \
      jq --arg mcp "rendered-${MCP}" -r \
      '[.items[] | select(.metadata.name | contains($mcp))]
      | sort_by(.metadata.creationTimestamp) | reverse
      | .[1] | .metadata.name' \
    )
  5. Trigger machine config daemon resync for one node at a time

    Don’t do this for multiple nodes at the same time: every node for which this step is executed is immediately drained and rebooted.

    timeout=300s (1)
    for node in $(kubectl get node -o name -l "$node_selector"); do
      echo "Rebooting $node"
      kubectl --as=cluster-admin annotate --overwrite "$node" \
        machineconfiguration.openshift.io/currentConfig="$old_mc"
      echo "Waiting for drain... (up to $timeout)"
      if ! oc wait --timeout="$timeout" "$node" --for condition=ready=false; then
        echo "$node didn't drain and reboot, please check status, aborting loop"
        break
      fi
      echo "Waiting for reboot to complete... (up to $timeout)"
      if ! oc wait --timeout="$timeout" "$node" --for condition=ready; then
        echo "$node didn't become ready, please check status, aborting loop"
        break
      fi
    done
    1 Adjust if you expect node drains and reboots to be slower or faster than 5 minutes
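
One case worth guarding against between steps 4 and 5: if the pool has only one rendered machine config, the `.[1]` lookup in step 4 finds nothing and `jq -r` prints the literal string `null`. Step 5 would then write `null` into the currentConfig annotation, which is not an existing machine config. The following is a minimal sketch of a sanity check. The function name `check_old_mc` is made up for this example, and the check is only a name-prefix comparison, so it does not prove that the config exists in the cluster.

```shell
# Hypothetical helper: check_old_mc NAME POOL
# Succeeds only if NAME is non-empty, is not the string "null",
# and starts with "rendered-POOL-" (prefix check only).
check_old_mc() {
  case "$1" in
    ""|null) return 1 ;;
    "rendered-$2-"*) return 0 ;;
    *) return 1 ;;
  esac
}

if ! check_old_mc "${old_mc:-}" "${MCP:-}"; then
  echo "old_mc='${old_mc:-}' is not a rendered config of pool '${MCP:-}', do not continue with step 5" >&2
fi
```

Run it after step 4. If it prints nothing, `old_mc` has the expected shape and step 5 can proceed.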