Maintenance Alert Handling

This runbook describes how to handle alerts during automated Openshift maintenance.

Handling

Investigate

Resolve

Resolve alert as described in the runbook.

Feedback

To resolve recurring problems and improve automated maintenance we need feedback for every problem that blocked the automated maintenance.

During Maintenance

  • Create a Jira ticket

    • Project: APPUiO Managed OpenShift 4 (OCP)

    • Issue Type: Task

You don’t need to create a separate ticket for every cluster or problem that came up during maintenance. One ticket per maintenance window, including all encountered issues, is fine.

  • For every cluster that alerted provide the following infos in the ticket:

    • Link to the indivudial alerts from the Openshift dashboard

    • Description of the problem

    • Description of how the problem was solved, if not already covered by the alerts runbook

    • Was the alert a false positive, did it resolve itself (needs tuning)

  • The ticket must be linked in maintenance log under Post-Tasks

Follow Up

The dedicated Openshift team will refine the created ticket and / or create follow ups during office hours the next day.