Cluster Certificate Renewal

A typical OpenShift (or Kubernetes) cluster uses certificate-based encryption, authentication and authorization in many places. This document explains how those certificates can be renewed, replaced or rotated.

All cluster-managed certificates are monitored (see Cluster-wide certificate expiry check in monitoring).

On-cluster components

For most components that are deployed on the cluster (Router, Registry, Metrics, Logging) and have a public Route, certificates are provided via Let’s Encrypt. Where that is not the case, the route certificate can be configured via openshift-ansible.

The certificates themselves are usually provisioned to the Ansible Master host using the profile_certificates Puppet module. Add it to the Ansible Master’s Hieradata:

infra.yaml
...
certificates:
  wildcard.example.com:
    type: trusted
    privatekey: |
      -----BEGIN RSA PRIVATE KEY-----
      ...
      -----END RSA PRIVATE KEY-----
    certificate: |
      -----BEGIN CERTIFICATE-----
      ...
      -----END CERTIFICATE-----
    chain: |
      -----BEGIN CERTIFICATE-----
      ...
      -----END CERTIFICATE-----
...
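
Before a new certificate is committed to Hieradata, it can be worth checking that the certificate and the private key actually belong together. The following is a minimal sketch that compares the public key embedded in each file; the function name and the file names in the usage comment are hypothetical and not part of the profile_certificates module.

```shell
#!/bin/sh
# Succeeds (exit 0) if the certificate and the private key share the same
# public key, fails with 1 if they differ and with 2 if a file is unreadable.
# Arguments: <certificate.pem> <privatekey.pem>
cert_matches_key() {
  cert_pub=$(openssl x509 -in "$1" -noout -pubkey) || return 2
  key_pub=$(openssl pkey -in "$2" -pubout) || return 2
  [ "$cert_pub" = "$key_pub" ]
}

# Usage (hypothetical file names):
#   cert_matches_key wildcard.example.com.crt wildcard.example.com.key \
#     && echo "certificate and key match"
```

Comparing public keys rather than RSA moduli keeps the check independent of the key type.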

After you have replaced the certificate and run Puppet on the Ansible Master system, run the appropriate playbooks to apply the new certificate.

Most components also have a redeploy-certificates Playbook in openshift-ansible. However, those Playbooks can only replace certificates created by openshift-ansible, so custom certificates will not be replaced when running them! Use the Playbooks documented below instead.

Default Router certificate

Configured via openshift_hosted_router_certificate.

Rollout:

ansible-playbook /usr/share/openshift-ansible/playbooks/openshift-hosted/redeploy-router-certificates.yml
ansible-playbook /usr/share/mungg/playbooks/postconfig.yml --tags objects

If you do not configure openshift_hosted_router_certificate, openshift-ansible will generate a wildcard certificate that matches openshift_master_default_subdomain and is signed by the OpenShift internal CA.

Registry

The ROUTE certificate (for access from outside of the cluster) is configured via openshift_hosted_registry_routecertificates.

Rollout:

ansible-playbook /usr/share/openshift-ansible/playbooks/openshift-hosted/deploy_registry.yml
ansible-playbook /usr/share/mungg/playbooks/postconfig.yml --tags objects

The INTERNAL certificate (for cluster-internal communication) is generated automatically from the cluster-internal CA.

To renew:

ansible-playbook /usr/share/openshift-ansible/playbooks/openshift-hosted/redeploy-registry-certificates.yml
ansible-playbook /usr/share/mungg/playbooks/postconfig.yml --tags objects

Logging

Configured via openshift_logging_kibana_{cert,key,ca}.

Rollout:

ansible-playbook /usr/share/openshift-ansible/playbooks/openshift-logging/config.yml
ansible-playbook /usr/share/mungg/playbooks/postconfig.yml --tags logging,objects

Metrics

Configured via openshift_metrics_hawkular_{cert,key,ca}.

Rollout:

ansible-playbook /usr/share/openshift-ansible/playbooks/openshift-metrics/config.yml
ansible-playbook /usr/share/mungg/playbooks/postconfig.yml --tags objects

Kubelet certificate

By default, node certificates are valid for one year. OpenShift Container Platform automatically rotates node certificates when they get close to expiring. If automatic approval isn’t configured, you must manually approve the certificate signing requests (CSRs).

Automatic approval is done by a component called "bootstrap auto approver" and should be configured on most clusters. If it is not, check for open CSRs:

oc get csr

If there are open CSRs, approve them (or at least the most recent ones) using the following command:

oc adm certificate approve <csr_name>

It’s not necessary to approve ALL CSRs, only the most recent one per node. If a CSR isn’t approved, the Kubelet will keep creating new ones, which leads to a buildup of CSRs on clusters without bootstrap auto approver.
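
Picking the most recent pending CSR per node can be sketched as a small filter over the `oc get csr` table. This is only a sketch: the function name is hypothetical, and it assumes the output columns are NAME, AGE, REQUESTOR and CONDITION, that the input is sorted oldest first, and that the requestor (for example system:node:<hostname>) identifies the node.

```shell
#!/bin/sh
# Print the name of the newest Pending CSR for each requestor.
# Reads `oc get csr` table output on stdin, sorted oldest first.
# Assumption: columns are NAME AGE REQUESTOR CONDITION.
latest_pending_csrs() {
  awk 'NR > 1 && $NF == "Pending" { latest[$3] = $1 }
       END { for (r in latest) print latest[r] }' | sort
}

# Usage (not executed here):
#   oc get csr --sort-by=.metadata.creationTimestamp \
#     | latest_pending_csrs \
#     | xargs oc adm certificate approve
# With GNU xargs, adding -r avoids running the approve command on empty input.
```

Later rows overwrite earlier ones for the same requestor, so only the newest pending request per requestor is printed.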

Force renewal

To force renewal of a Kubelet certificate, simply remove it from the node and restart the Kubelet:

backup_dir="/etc/origin/node/certificates-$(date +%F)"
mkdir "$backup_dir"
mv /etc/origin/node/certificates/* "$backup_dir"
mv /etc/origin/node/node.kubeconfig "$backup_dir"
systemctl restart atomic-openshift-node

Then approve the new CSR (the command below approves every CSR it finds):

oc get csr -o name | xargs oc adm certificate approve
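
Once the CSR is approved, the node should receive a fresh certificate. One way to confirm this is to check how long the certificate on the node remains valid. The sketch below wraps `openssl x509 -checkend`; the function name is hypothetical, and the certificate path in the usage comment is an assumption that may differ on your installation.

```shell
#!/bin/sh
# Succeeds (exit 0) if the certificate in <file> is still valid
# <days> days from now, fails otherwise.
# Arguments: <certificate.pem> <days>
cert_valid_for_days() {
  openssl x509 -in "$1" -noout -checkend "$(( $2 * 86400 ))" >/dev/null
}

# Usage on a node (path is an assumption, adjust as needed):
#   cert_valid_for_days /etc/origin/node/certificates/kubelet-client-current.pem 300 \
#     && echo "certificate looks freshly renewed"
```

Since node certificates are valid for one year by default, a certificate that is still valid in 300 days was most likely issued recently.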

OpenShift Master & etcd certificates

When renewing the OpenShift master certificates, the aggregated API server certificates for the metrics API server and the Service catalog API server need to be renewed manually.

  1. Renew master certificates

    ansible-playbook /usr/share/openshift-ansible/playbooks/redeploy-certificates.yml [-e first_master_client_binary=oc]
    ansible 'masters[0]' -a 'oc -n openshift-metrics-server delete pod --all'
    ansible 'masters[0]' -a 'oc -n kube-service-catalog delete pod --all'
  2. Renew metrics stack certificates

    ansible 'masters[0]' -m shell -a 'oc -n openshift-infra delete secret hawkular-cassandra-certs hawkular-metrics-certs heapster-certs'
    ansible-playbook /usr/share/openshift-ansible/playbooks/openshift-metrics/config.yml
  3. Renew metrics API server certificate

    ansible 'masters[0]' -m shell -a 'oc -n openshift-metrics-server delete secret metrics-server-certs'
    ansible-playbook /usr/share/openshift-ansible/playbooks/metrics-server/config.yml
  4. (If service catalog is in use) Renew service catalog API server certificates (Note: redeploys the ASB)

    ansible-playbook /usr/share/openshift-ansible/playbooks/openshift-service-catalog/config.yml
  5. Run the post-config playbook to ensure customizations that were removed by any of the playbooks are reapplied

    ansible-playbook /usr/share/mungg/playbooks/postconfig.yml
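
The steps above have to run in order, and a failed playbook should stop the sequence. As a rough sketch, they could be wrapped in a script with a dry-run switch. The `run` helper, the `DRY_RUN` variable and the `renew_master_certs` function are hypothetical names; only step 1 is shown, and the remaining steps would follow the same pattern.

```shell
#!/bin/sh
# Run a command, or only print it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = 1 ]; then
    printf '+ %s\n' "$*"
  else
    "$@"
  fi
}

# Step 1 of the procedure above; the && chain stops at the first failure.
renew_master_certs() {
  pb=/usr/share/openshift-ansible/playbooks
  run ansible-playbook "$pb/redeploy-certificates.yml" &&
  run ansible 'masters[0]' -a 'oc -n openshift-metrics-server delete pod --all' &&
  run ansible 'masters[0]' -a 'oc -n kube-service-catalog delete pod --all'
}

# Usage: review the commands first, then run them for real.
#   DRY_RUN=1 renew_master_certs
#   renew_master_certs
```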

OpenShift CA

Do this only if the CA is about to expire: all Docker daemons will be hard restarted!

Renew CA & all certificates

ansible-playbook /usr/share/openshift-ansible/playbooks/openshift-master/redeploy-openshift-ca.yml [-e first_master_client_binary=oc]
ansible-playbook /usr/share/openshift-ansible/playbooks/redeploy-certificates.yml -e openshift_redeploy_openshift_ca=true [-e first_master_client_binary=oc]

For APPUiO clusters, update CA for Aedifex:

oc -n aedifex get secret aedifex-remote-build-config -o yaml
# edit "data.docker-registry-ca"

  1. Disable Aedifex on all masters

    /etc/origin/master/master.env
    #OPENSHIFT_DOCKER_BUILDER_IMAGE='172.30.1.1:5000/aedifex/aedifex-${component}:${version}'
  2. Restart controllers

    ansible masters --forks=1 -a 'master-restart controllers'
  3. Rebuild the Aedifex image

  4. Remove old tags

  5. Remove old images from OpenShift nodes

  6. Aedifex: remove all old VMs

  7. Enable Aedifex on all masters

    /etc/origin/master/master.env
    OPENSHIFT_DOCKER_BUILDER_IMAGE='172.30.1.1:5000/aedifex/aedifex-${component}:${version}'
  8. Restart all controllers

    ansible masters --forks=1 -a 'master-restart controllers'
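
The manual edit of data.docker-registry-ca at the start of this procedure requires a base64-encoded value, because secret data is stored base64-encoded. As an alternative to editing the YAML by hand, the value could be set with a patch. This is a sketch only: the function name is hypothetical, and the CA path in the usage comment is an assumption.

```shell
#!/bin/sh
# Print a JSON patch that sets data.docker-registry-ca to the
# base64-encoded contents of the given CA file.
# Argument: <ca.crt>
registry_ca_patch() {
  encoded=$(base64 < "$1" | tr -d '\n')
  printf '{"data":{"docker-registry-ca":"%s"}}\n' "$encoded"
}

# Usage (CA path is an assumption, not executed here):
#   oc -n aedifex patch secret aedifex-remote-build-config \
#     -p "$(registry_ca_patch /etc/origin/master/ca.crt)"
```

Stripping newlines from the base64 output keeps the encoded value on a single line, which the JSON string requires.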

Other certificates

RHEL subscriptions

If monitoring complains about "RHEL subscription expiring," a simple subscription-manager refresh is usually enough.

ansible OSEv3 -a 'subscription-manager refresh'