Ship selected cluster metrics to Mimir

This how-to shows you how to ship one or more selected metrics from a cluster to the central Mimir instance for monitoring.

Prerequisites

  • Write access to the commodore-defaults repository

  • Knowledge of which specific metrics are required for your needs

If you are unsure which specific metrics you need, you can query the cluster-local Prometheus. This shows you the available metrics, and you can formulate queries that answer the specific questions you have.

Procedure

  1. Identify whether any of your metrics can be pre-aggregated.

    • To keep the number of timeseries as low as possible, you should only ship timeseries at the granularity you need.

    • Pre-aggregation is achieved via recording rules, which should follow the Prometheus naming best practices for recording rules (level:metric:operations).

    • Metrics are sent to Mimir using Prometheus’s remote-write mechanism.
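
    The points above can be illustrated with a recording rule. This is only a sketch: the object name, the namespace, the rule name, and the choice of container_memory_working_set_bytes as the source metric are assumptions made for illustration, not values taken from the Commodore defaults. The rule aggregates per-container series into one series per namespace, so that only the aggregated series needs to be shipped.

    ```yaml
    apiVersion: monitoring.coreos.com/v1
    kind: PrometheusRule
    metadata:
      name: central-metrics-example       # hypothetical name
      namespace: my-monitoring-namespace  # hypothetical namespace
    spec:
      groups:
        - name: central-metrics-example.rules
          rules:
            # Rule name follows the level:metric:operations convention.
            - record: namespace:container_memory_working_set_bytes:sum
              expr: sum by (namespace) (container_memory_working_set_bytes{container!=""})
    ```

    Shipping the recorded series instead of the raw one reduces the number of series from one per container to one per namespace.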

  2. Add the recording rules and Prometheus remote-write configuration to the appropriate class in the Commodore defaults.

    For example, the configuration for APPUiO Managed OpenShift is managed in distribution/openshift4/central-metrics.yaml.

    For setups using component-prometheus, class apps/prometheus.yaml might be a suitable location for configuring recording rules and remote-write endpoints.
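
    As a rough sketch of what a remote-write entry restricted to selected metrics might look like, the fragment below uses the field names of the Prometheus Operator remote-write specification. The host name, the organization name, and the metric names are hypothetical, and how the entry is nested inside the Commodore class depends on the component in use. Whether the organization is set through the X-Scope-OrgID header or by a gateway in front of Mimir is also an assumption to verify for your setup.

    ```yaml
    # One entry in the list of remote-write targets (assumed structure).
    - url: https://mimir.example.com/api/v1/push  # hypothetical host; /api/v1/push is Mimir's remote-write path
      headers:
        X-Scope-OrgID: my-mimir-organization      # hypothetical organization name
      writeRelabelConfigs:
        # Keep only the selected series; everything else is dropped before sending.
        - sourceLabels: [__name__]
          regex: "namespace:container_memory_working_set_bytes:sum|my_other_metric"
          action: keep
    ```

    An explicit keep rule means that newly added metrics aren't shipped unless they're added to the regular expression.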

    If you want to forward recording rules from the OpenShift user workload monitoring stack, you’ll need to label the PrometheusRule object which defines the recording rule with openshift.io/prometheus-rule-evaluation-scope=leaf-prometheus. Otherwise, any PrometheusRule defined in the user workload monitoring stack will be evaluated by the Thanos ruler and won’t be available for the remote-write configuration in the user workload Prometheus.
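
    A minimal sketch of a PrometheusRule carrying this label is shown below. The object name, the namespace, and the rule itself (including the my_app_requests_total metric) are hypothetical; only the label key and value are taken from the text above.

    ```yaml
    apiVersion: monitoring.coreos.com/v1
    kind: PrometheusRule
    metadata:
      name: my-app-recording-rules  # hypothetical name
      namespace: my-app             # hypothetical user workload namespace
      labels:
        # Evaluate the rule in the user workload Prometheus instead of the Thanos ruler.
        openshift.io/prometheus-rule-evaluation-scope: leaf-prometheus
    spec:
      groups:
        - name: my-app.rules
          rules:
            - record: namespace:my_app_requests:rate5m
              expr: sum by (namespace) (rate(my_app_requests_total[5m]))
    ```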

  3. Set up a datasource in Grafana. This only needs to be done once for each Mimir organization.

    Before doing this, check if there’s already a configured datasource for your Mimir organization.

    The datasource configuration needs at least the following settings:

    • URL: http://vshn-appuio-mimir-query-frontend.vshn-appuio-mimir.svc.cluster.local:8080/prometheus

    • Header: X-Scope-OrgID: <name of the Mimir organization where you ship the metrics>

    • HTTP Method: POST
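
    Expressed in Grafana's datasource provisioning format, these settings might look roughly like the following sketch. The datasource name and the organization name are hypothetical, and the exact nesting of this fragment inside the Helm values isn't shown here; compare with an existing datasource entry before copying it.

    ```yaml
    apiVersion: 1
    datasources:
      - name: Mimir (my-mimir-organization)  # hypothetical display name
        type: prometheus
        access: proxy
        url: http://vshn-appuio-mimir-query-frontend.vshn-appuio-mimir.svc.cluster.local:8080/prometheus
        jsonData:
          httpMethod: POST
          # Custom headers are configured as numbered name/value pairs.
          httpHeaderName1: X-Scope-OrgID
        secureJsonData:
          httpHeaderValue1: my-mimir-organization  # hypothetical organization name
    ```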

    You can set up the datasource in the Grafana configuration (vshn_appuio_grafana.helm_values.datasources) in the APPUiO Syn tenant repo.

  4. Once your change is rolled out, metrics should start arriving in Mimir. You can then query the metrics in Grafana using the datasource for your Mimir organization.

  5. Set up a dashboard that uses your metrics.

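    • When writing queries, account for Prometheus replicas: each timeseries is shipped to Mimir twice, once per replica.

    • De-duplicate each metric with a query such as max without (prometheus_replica) (my_metric), and apply any further aggregation on top of the de-duplicated result, for example sum by (namespace) (max without (prometheus_replica) (my_metric)). Aggregating first would count each replica's copy separately.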

Importing existing Grafana dashboards

While many dashboards are purpose-built, we also use pre-built "kubernetes-mixin" dashboards. These are installed into Grafana organizations by the grafana-organizations-operator.