ADR 0018 - SLA Reports Handling

Author

Gabriel Saratura

Owner

Schedar

Reviewers

Schedar

Date Created

2023-05-08

Date Updated

2023-05-09

Status

implemented

Tags

framework,sla,reports

Summary

We use a custom tool to create SLA reports.

Problem

AppCat customers do not have periodic access to their service SLAs in form of a report.

Requirements

  • An SLA Report is automatically generated each month without any inputs.

  • An SLA Report is easily and securely accessible by the customer.

  • An SLA Report contains the target SLA and the outcome of the month.

  • An SLA Report clearly states whether the SLA has been reached.

  • An SLA Report contains information as to what has been measured.

  • An SLA Report is 1 page in PDF format.

  • An SLA Report has a link to SLA exceptions.

Tool Proposals

Proposal 1.1: Opsgenie

Opsgenie came into discussion as a possible solution to this problem. After a quick research, it turned out that Opsgenie provides various reports in form of global and team level reports.

  • Global Reports are the reports that are account-wide and include generic analysis for notifications, API usage, the overall responsiveness on the alerts, MTTA/R.

  • Team Reports are the reports that focus on your team activities, performance, and the alerts that they receive. These reports also include some common looks with the Global Reports which focus only on the particular team that is selected.

We could leverage team reports, particularly the received alerts. The idea would be to catch relevant SLA alerts (or SLA alert) and design the report inside Opsgenie. The reports can be easily scheduled to be created and sent to relevant parties via email on a monthly basis.

Advantages
  • Using the same tool where the alerts are being received.

  • Easy report scheduling configuration per customer. Prometheus is indirectly configured in Opsgenie.

  • We could generate other reports in the future if need be.

Disadvantages
  • The reports cannot have any text input, the diagrams are fixed and cannot be changed therefore no VSHN customization.

  • The reports can only be sent via email and cannot be stored automatically in a place where the customer would have direct access.

  • No VSHN native template.

Proposal 1.2: Grafana

Grafana can be an alternative solution which gives very powerful tools to generate any report. It allows to add text panels which can further customize SLA reports. The free version of Grafana does not allow any automatic report generation and VSHN does not have any commercial license such as Grafana Enterprise or Grafana Advanced. It is possible to query Grafana and get relevant diagrams with different tools such as this reporter. The tool can be scheduled to run on a monthly basis.

Advantages
  • Prometheus is already configured in Grafana.

  • The SLA reports can be widely configured.

  • Future complex reports can be easily added.

Disadvantages
  • No reports out of the box with Grafana free version.

  • A tool will be needed to get the diagrams out of Grafana.

  • No VSHN native template.

Proposal 1.3: VSHN Tools

We already have a tool in house called docgen that can manually generate documents and reports. It also allows to define templates that later can be used to easily generate content. It works with AsciiDoc and the tool is not connected whatsoever to Prometheus.

Advantages
  • Full control over our SLA reports.

  • SLA reports has the same document structure as other VSHN documents.

  • Easy template creation.

Disadvantages
  • A second VSHN made tool is necessary to get data from Prometheus and trigger the report generation.

  • An update of docgen tool is necessary.

  • With current docgen version we get limited capabilities as to what we can show in the reports.

Reports Location Proposals

Proposal 2.1: S3 Buckets

An S3 bucket would be solution to where we can store these reports for our customers. One bucket per each customer. The buckets can be saved either on Exoscale or Cloudscale in their respective Organization space.

Advantages
  • Authentication and encryption is available.

Disadvantages
  • Not user friendly

  • Tedious access for customers.

  • It does not allow easy access to reports for VSHNeers.

Proposal 2.2: Nextcloud (files.vshn.net)

We have a VSHN storage place where we can save VSHN files internally and make give custom access to customers. The tool is connected to Jira therefore customer will have easy access to reports.

Advantages
  • Authentication and encryption is available.

  • UI User friendly.

  • Easy access for customer with Jira credentials

  • Easy access for VSHNeers.

Disadvantages
  • Potential security breach due to the common place of reports for all customers.

  • Arguably difficult to integrate with other tools.

Proposal 2.3: Email

Email is a reliable way to send our SLA reports. The customer does not need to access anything, the report will be provided automatically to their chosen email.

Advantages
  • The most straight forward way to send the SLA report to the customer.

Disadvantages
  • There’s no storage solution.

  • There might be day 2 operations involved.

Decision

Proposal 1.3 VSHN Tools with proposal 2.1 S3 buckets with 2.3 Email.

Rationale for VSHN Tools

The main reason for VSHN Tools is the fact that we have full control over our SLA reports and no alternative that would work right out of the box. Opsgenie currently does not satisfy SLA reports requirements meanwhile Grafana free version does not produce any reports on its own. We can integrate Grafana later on when diagrams will be required in our reports thus a combination of these proposals might happen in the future.

Rationale for S3 buckets and Email

The main reason behind emails is that we want the customer to directly receive the SLA reports. The customer should not look after reports we will provide them on a monthly basis. Since the customer might want older SLA reports we could save them easily in an S3 bucket. We can also decide how long these reports should be stored.