Espejote: An in-cluster templating controller

Problem Statement

Not every cluster configuration is fully static and can be managed through GitOps.

Some configurations depend on the cluster state or might want to be managed by upstream, by end-users, or by third-party products directly.

We got an extensive array of tools to manage these configurations, from bash "reconcilers," to cron jobs, to policy engines to fully fledged custom controllers. Those tools are all specialized and don’t cover all use-cases.

A maintainable, fully instrumented, and extensible controller that can manage arbitrary configurations in a cluster is missing. We decided to try our hand at creating one.

High Level Goals

We can manage arbitrary configurations in a cluster
- The configuration may depend on state managed by other controllers inside the cluster
- We can apply partial configurations to externally managed resources
The amount of requests and caching is tuneable
We can monitor failures and successes and have a tight feedback loop

Non-Goals

Centralized configuration management or GitOps
Security or access control

Name

"Espejote" for "big mirror" in Spanish. It’s a play on the earlier project "Espejo" which means "mirror." "Espejo" is a tool to mirror resources to multiple namespaces. "Espejote" is a more general tool to manage arbitrary configurations in a cluster.

Implementation

The controller is based on controller-runtime. It’s managed through two Custom Resource Definitions (CRD) called ManagedResource and Admission. ManagedResource implements a controller pattern that watches for changes in the cluster and applies the configuration. Admission implements a webhook pattern that applies the configuration on admission. We don’t implement any logic to automatically create admission webhooks for the resources or vice versa. The controller uses Jsonnet to render manifests. Any Kubernetes manifest can be added to the templating context.

Reconcile triggers are separated from the context resources to allow for more fine-grained control over the reconciliation.

Reconcile triggers

Reconcile triggers are used to re-render the template. They’re separated from the context resources to allow for more fine-grained control.

triggers:
- name: interval (1)
  interval: 10s
- name: ext-configmaps
  watchResource:
    apiVersion: v1
    kind: ConfigMap
    labelSelector:
      matchExpressions:
      - key: espejote.io/created-by
        operator: DoesNotExist
    minTimeBetweenTriggers: 10s
- name: my-configmap
  watchResource:
    apiVersion: v1
    kind: ConfigMap
    name: my-configmap

1	Free form name for the trigger. Can be referenced in the template and used to decide on partial re-renders.

The trigger that fires is exposed to the Jsonnet as esp.triggerData() and esp.triggerName().

local esp = import "espejote.libsonnet";

if esp.triggerName() == "ext-configmaps" then
  // Do some partial re-rendering
  local resource = esp.triggerData().resource;
else
  // Do a full re-render

Currently, we support two types of triggers:

`interval`

The template is re-rendered every interval.

`watchResource`

The template is re-rendered whenever the watched resource is created, changed, or deleted.

The minTimeBetweenTriggers field can be used to limit the rate of re-renders. This is useful to prevent a high rate of re-renders when a resource is changed multiple times in a short period. If a list of resources is watched, the minTimeBetweenTriggers is applied to each resource individually. minTimeBetweenTriggers is coming soon.

The watchResource trigger contains the full Kubernetes resource definition of the watched resource.

The resource is available in the Jsonnet template as esp.triggerData().resource.

`watchContextResource`

The watchContextResource trigger works the same as the watchResource trigger, but shares cache and configuration with a context resource. This avoids duplicate, possibly drifting, configuration when using the same configuration for both the context and the trigger. This often happens when doing partial re-renders. See the example below.

spec:
  context:
  - name: namespaces
    resource:
      apiVersion: v1
      kind: Namespace
      labelSelector:
        matchExpressions:
        - key: netpol.example.com/no-default
          operator: DoesNotExist
  triggers:
  - name: namespace
    watchContextResource:
      name: namespaces (1)
  template: |
    local esp = import 'espejote.libsonnet';
    local netpolForNs = function(ns) {
      [...]
    };
    if esp.triggerName() == 'namespace' then [
      netpolForNs(esp.triggerData().resource),
    ] else [
      netpolForNs(ns)
      for ns in esp.context().namespaces
    ]

1	The `name` field is used to reference the context resource where the configuration is copied from.

Additional triggers

Additional triggers can be added in the future. This may contain triggers based on Prometheus metrics or webhooks. Another option would be a trigger based on Kubernetes admission webhooks or Kubernetes events.

Context

The context is a list of resources that are available in the Jsonnet template. They’re not watched. If a watch is required, it should be added as a trigger.

context:
- def: configs
  resource:
    apiVersion: v1
    kind: ConfigMap
    labelSelector:
      matchExpressions:
      - key: espejote.io/created-by
        operator: DoesNotExist
  cache: true
- def: config
  resource:
    apiVersion: v1
    kind: ConfigMap
    name: my-configmap

The context is exposed to the Jsonnet as the esp.context() function.

local esp = import 'espejote.libsonnet';
local config = esp.context().config;

Currently, we support one type of context resource:

`resource`

A Kubernetes resource that’s available in the Jsonnet template. The resource can be selected by name or labelSelector.

matchNames and ignoreNames can be used to filter the resources after the labelSelector is applied. matchNames is a list of names that should be matched. ignoreNames is a list of names that should be ignored. Label selectors should be preferred over matchNames and ignoreNames because the resource can already be filtered on the API server.

It’s always a list with zero or more items. This is true even if a single resource is selected using the name field.

The cache field can be used to setup a controller-runtime cache for the resource. This is a in-memory cache with a watch on the resource. This is true by default. Can be disabled for a trade-off between memory usage and API requests. The possibility to disable the cache is coming soon.

Additional context resources

Additional context resources may be added in the future. This could contain resources based on Prometheus metrics or REST APIs.

Template

The template is a Jsonnet template that returns a list of Kubernetes resources.

It can pull in external resources through the context and triggers.

It can return a list of resources or a single resource. The resources are applied in the order they’re returned.

Template libraries

The template can use libraries to share code between templates. The libraries are stored as JsonnetLibrary resources.

The following libraries are available:

"espejote.libsonnet": The built in library for accessing the context and trigger information.
"lib/<NAME>/<KEY>" libraries in the shared library namespace. The name corresponds to the name of the JsonnetLibrary object and the key to the key in the data field. The namespace is configured at controller startup and normally points to the namespace of the controller.
"<NAME>/<KEY>" libraries in the same namespace as the ManagedResource.

The name corresponds to the name of the JsonnetLibrary object and the key to the key in the data field.

Namespace local libraries are an easy way to set configuration for a specific ManagedResource. See component-openshift4-nodes for an example: JsonnetLibrary, template.

local esp = import 'espejote.libsonnet';
local config = import 'my-managed-resource/config.json';
local sharedLib = import 'lib/shared-library/prometheus.libsonnet';

Deletion

Deletion can be achieved marking the resource for deletion using the esp.markForDelete function.

local esp = import 'espejote.libsonnet';
esp.markForDelete({
  apiVersion: 'v1',
  kind: 'ConfigMap',
  metadata: {
    namespace: 'my-namespace',
    name: 'my-configmap',
  }
})

Instrumentation

Prometheus metrics

The controller is instrumented with Prometheus metrics. The metrics are exposed on the /metrics endpoint.

The metrics should contain the following:

Amount of re-renders per trigger
Amount of re-renders per resource
Amount of errors per re-render
Amount of resources applied per re-render coming soon
Amount of admission requests with return status

The metrics should be labeled with the ManagedResource name and namespace.

Events

The controller should emit events on errors during reconciliation.

Status

The controller should update the status of the ManagedResource with the last error category or success.

Apply options

applyOptions.forceConflicts can be used to resolve field conflicts. The ManagedResource will become the sole owner of the field.
applyOptions.createOnly can be used to only create the resource. If the resource already exists, the ManagedResource won’t update it.

Permissions and RBAC

The ManagedResources should run with the least permissions possible.

Namespace isolation

By default the ManagedResource is namespace scoped. Namespace fields for triggers and contexts need to be set to "" explicitly if resources from all namespaces are required. Returned resources will default to the namespace of the ManagedResource if not explicitly set.

Running `ManagedResources` in the context of a service account

The ManagedResource is run in the context of a service account. The service account must hold all permissions required to query and apply the resources.

If no service account is specified, the ManagedResource will run context of the namespace’s default service account.

spec:
  serviceAccountRef:
    name: my-service-account

Testing

The controller should have a testing utility integrated that will render the template with a given context and triggers. The rendered manifests could then be compared to a golden file, applied to a cluster, or parts matched using further testing frameworks.

For each trigger a rendered YAML is created with the output of the template.

The utility checks that the triggers and contexts in the test file match the triggers and contexts in the ManagedResource.

The utility can export real sources from a cluster to an input file. It executes the template to to check which JsonnetLibraries are used and bundles them in the input file.

espejote collect-input my-template.yaml > my-input-1.yaml

triggers: (1)
- {} (2)
- name: namespace
  watchResource:
    apiVersion: v1
    kind: Namespace
    metadata:
      creationTimestamp: "2025-03-17T12:41:04Z"
      labels:
        kubernetes.io/metadata.name: blub
        managedresource-sample.espejote.io/inject-cm: glug
      name: blub
      resourceVersion: "1268206" (3)
      uid: 62bf8581-6e7b-49df-b1bb-3b2ad46405e9
    spec:
      finalizers:
      - kubernetes
    status:
      phase: Active
- name: namespace
  watchResource:
    apiVersion: v1
    kind: Namespace
    metadata:
      creationTimestamp: "2025-03-17T20:50:58Z"
      labels:
        kubernetes.io/metadata.name: blub2
        managedresource-sample.espejote.io/inject-cm: glugindeed
      name: blub2
      resourceVersion: "1268191"
      uid: b1d4b12a-e7d8-4334-ba68-b598e88c4dfb
    spec:
      finalizers:
      - kubernetes
    status:
      phase: Active
context:
- name: namespaces
  resources:
  - apiVersion: v1
    kind: Namespace
    metadata:
      creationTimestamp: "2025-03-17T12:41:04Z"
      labels:
        kubernetes.io/metadata.name: blub
        managedresource-sample.espejote.io/inject-cm: glug
      name: blub
      resourceVersion: "1268206"
      uid: 62bf8581-6e7b-49df-b1bb-3b2ad46405e9
    spec:
      finalizers:
      - kubernetes
    status:
      phase: Active
  - apiVersion: v1
    kind: Namespace
    metadata:
      creationTimestamp: "2025-03-17T20:50:58Z"
      labels:
        kubernetes.io/metadata.name: blub2
        managedresource-sample.espejote.io/inject-cm: glugindeed
      name: blub2
      resourceVersion: "1268191"
      uid: b1d4b12a-e7d8-4334-ba68-b598e88c4dfb
    spec:
      finalizers:
      - kubernetes
    status:
      phase: Active
libraries:
  jsonnetlibrary-sample/sample.libsonnet: "{Sample: 'Hello World'}" (4)

1	The template is rendered for every trigger. If multiple different contexts variations are required, a new file should be created for each combination.
2	Empty trigger to test a full `ManagedResource` reconciliation.
3	Real resources exported from a cluster by using `espejote collect-input my-template.yaml`
4	Any JsonnetLibraries used by the template are exported from the cluster and included in the input file.

espejote render my-template.yaml --input my-input-1.yaml --input my-input-2.yaml (1)

1	The command can be used with multiple input files to test different context variations.

Admission

Espejote manages ValidatingWebhookConfiguration or MutatingWebhookConfiguration through Admission resources. The Admission allows validating API requests or mutating them before they’re persisted in the cluster.

Webhook configuration

Most webhook configurations are exposed in the Admission resource to make configuration as flexible as possible. Espejote manages the service configuration for the generated admission webhook configurations and injects a namespace selector for namespaced admissions.

webhookConfiguration:
  objectSelector:
    matchLabels:
      cluster-autoscaler: default
      k8s-app: cluster-autoscaler
  rules:
    - apiGroups:
        - ''
      apiVersions:
        - '*'
      operations:
        - CREATE
      resources:
        - pods

Requests

The full request can be accessed using admission.admissionRequest(). The request contains the full admission request as defined in pkg.go.dev/k8s.io/api/admission/v1#AdmissionRequest.

Responses

Accepted admission responses are allowed, denied, and patched. The allowed and denied responses are used to allow or deny the admission request. The patched response is used to apply a patch to the resource. Patches are only allowed for mutating: true admissions.

All responses should be created using the (import "espejote.libsonnet").ALPHA.admission library.

If a patch is returned the patch should be tested using admission.assertPatch. This will ensure the patch can be applied to the resource or fail the admission otherwise.

Manifests

`JsonnetLibrary`

apiVersion: espejote.io/v1alpha1
kind: JsonnetLibrary
metadata:
  name: my-library
  namespace: controller-namespace
spec:
  data:
    my-function.libsonnet: |
      {
        myFunction: function() 42,
      }
    hello-world.libsonnet: |
      {
        helloWorld: function(name) 'Hello, %s!' % [name],
      }

`ManagedResource`

apiVersion: espejote.io/v1alpha1
kind: ManagedResource
metadata:
  name: copy-configmap
  namespace: my-namespace (1)
spec:
  applyOptions: (2)
    forceConflicts: true
  triggers:
  - name: interval
    interval: 10s (3)
  - name: cm
    watchResource: (4)
      apiVersion: v1
      kind: ConfigMap
      # name: my-configmap (5)
      labelSelector: (6)
        matchExpressions:
        - key: espejote.io/created-by
          operator: DoesNotExist
      # matchNames: [] (7)
      # ignoreNames: [] (8)
      minTimeBetweenTriggers: 10s (9)
  context: (10)
  - name: cm (11)
    resource: (12)
      apiVersion: v1
      kind: ConfigMap
      # name: my-configmap
      # matchNames: []
      # ignoreNames: []
      labelSelector:
        matchExpressions:
        - key: espejote.io/created-by
          operator: DoesNotExist
      cache: true (13)
  template: |
    local esp = import 'espejote.libsonnet';
    local cm = esp.context().cm; (15)

    if esp.triggerName() == "cm" then {
      local triggerData = esp.triggerData().resource, (14)
      // Do some partial re-rendering
    } else [ (16)
      c {
        metadata: {
          name: 'copy-of-' + c.metadata.name,
          // namespace: c.metadata.namespace, (17)
          labels+: {
            'espejote.io/created-by': 'copy-configmap'
          }
        }
      },
      for c in cm.items (18)
    ]
  serviceAccountRef:
    name: my-service-account (19)

1	`ManagedResource` are always namespace-scoped and can’t access any resources outside of their namespace.
2	`applyOptions` can be used to resolve field conflicts.
3	`interval` triggers the template every 10 seconds.
4	`watchResource` triggers the template whenever the watched resource is created, changed, or deleted.
5	`name` is optional and can be used to select a specific resource.
6	`labelSelector` is optional and can be used to a specific set of resources.
7	`matchNames` is optional and can be used to filter the resources after the labelSelector is applied.
8	`ignoreNames` is optional and can be used to filter the resources after the labelSelector is applied.
9	`minTimeBetweenTriggers` is optional and can be used to limit the rate of re-renders. coming soon The re-renders are limited by the specific unique resource.
10	Context is a list of resources that are available in the Jsonnet template. They’re not watched.
11	`name` defines a variable in the Jsonnet template.
12	`resource` is a Kubernetes resource that’s available in the Jsonnet template.
13	`cache` is optional and can be used to setup a controller-runtime cache for the resource. The default is true. coming soon
14	`espejote.libsonnet` is used to access the trigger data. The full manifest is available if using the `watchResource` trigger.
15	`espejote.libsonnet` is used to access the context. It returns a object with all defined resources as keys.
16	The template can return null, a list of resources, or a single resource.
17	The namespace defaults to the namespace of the `ManagedResource` if not set.
18	Resource variables are always a list with zero or more items.
19	`serviceAccountRef.name` holds the reference to the service account used to run the `ManagedResource`. Optional, the namespaces default service account is used if not set.

`Admission`

apiVersion: espejote.io/v1alpha1
kind: Admission
metadata:
  name: pods-inject-creator-annotation
  namespace: my-namespace
spec:
  mutating: true (1)
  template: |
    local esp = import 'espejote.libsonnet';
    local admission = esp.ALPHA.admission; (2)

    local user = admission.admissionRequest().userInfo.username; (3)
    local obj = admission.admissionRequest().object;

    if std.get(obj.metadata, 'annotations') == null then
      admission.patched('added user annotation', admission.assertPatch([ (4)
        admission.jsonPatchOp('add', '/metadata/annotations', { 'creator': user }),
      ]))
    else
      admission.patched('added user annotation', admission.assertPatch([
        admission.jsonPatchOp('add', '/metadata/annotations/creator', user),
      ]))
  webhookConfiguration: (5)
    rules:
      - apiGroups:
          - ''
        apiVersions:
          - '*'
        operations:
          - CREATE
        resources:
          - pods

1	`mutating` is true if the webhook should mutate the resource.
2	`espejote.libsonnet` provides facilities to access the admission request and create patches from it.
3	`admissionRequest` returns the full admission request as defined in pkg.go.dev/k8s.io/api/admission/v1#AdmissionRequest.
4	`admission.patched` returns a patch that can be applied to the resource. Other valid return values are `admission.allowed(msg)` and `admission.denied(msg)`. `admission.assertPatch` validates whether the patch can be applied to the resource and fails the admission if the patch isn’t valid. This makes failures easier to debug than failing in the Kubernetes API server.
5	`webhookConfiguration` is used to configure the webhook. The user has basically full control over the webhook configuration with all upstream selectors available. The admission is namespace scoped and the webhook configuration has a namespace selector injecting the namespace of the `Admission` resource. Cluster scoped webhooks aren’t yet supported and tracked here.

Sample use-cases

Sync upgrade notifications with a ArgoCD sync hook and bash script

We want to show a notification in the OpenShift Console when a minor upgrade is scheduled. To implement this we added a job triggered by an ArgoCD sync. The job is reading another resource and creating a console notification manifest from it.

Sync OpenShift Console TLS Secret with a bash "reconciler"

Cert-manager only allows to create the secret holding the certificate data in the same namespace as the Certificate resource. The OpenShift Console route is in the openshift-console namespace, but the secret needs to be applied in the openshift-config namespace. However, we must create the Certificate resource in openshift-console, since otherwise the OpenShift ingress doesn’t admit the HTTP challenge ingress.

We use a bash reconciler to create the secret in the correct namespace.

source_namespace="openshift-console"
target_namespace="openshift-config"

# # Wait for the secret to be created before trying to get it.
# # TODO: --for=create is included with OCP 4.17
# kubectl -n "${source_namespace}" wait secret "${SECRET_NAME}" --for=create --timeout=30m
echo "Waiting for secret ${SECRET_NAME} to be created"
while test -z "$(kubectl -n "${source_namespace}" get secret "${SECRET_NAME}" --ignore-not-found -oname)" ; do
   printf "."
   sleep 1
done
printf "\n"

# When using -w flag kubectl returns the secret once on startup and then again when it changes.
kubectl -n "${source_namespace}" get secret "${SECRET_NAME}" -ojson -w | jq -c --unbuffered | while read -r secret ; do
   echo "Syncing secret: $(printf "%s" "$secret" | jq -r '.metadata.name')"

   kubectl -n "$target_namespace" apply --server-side -f <(printf "%s" "$secret" | jq '{"apiVersion": .apiVersion, "kind": .kind, "metadata": {"name": .metadata.name}, "type": .type, "data": .data}')
done

component-openshift4-console

Default network policies for namespaces with espejo

We apply default network policies to all namespaces using espejo.

apiVersion: sync.appuio.ch/v1alpha1
kind: SyncConfig
[...]
spec:
  namespaceSelector:
    ignoreNames:
      - my-ignored-namespace
    labelSelector:
      matchExpressions:
        - key: network-policies.syn.tools/no-defaults
          operator: DoesNotExist
  syncItems:
    - apiVersion: networking.k8s.io/v1
[...]

component-networkpolicy

Deletion

Policies can also be deleted with espejo.

apiVersion: sync.appuio.ch/v1alpha1
kind: SyncConfig
[...]
spec:
  deleteItems:
    - apiVersion: networking.k8s.io/v1
      kind: NetworkPolicy
      name: allow-from-same-namespace
[...]
  namespaceSelector:
    matchNames: []

component-networkpolicy

Espejote: An in-cluster templating controller

Problem Statement

High Level Goals

Non-Goals

Name

Implementation

Reconcile triggers

interval

watchResource

watchContextResource

Additional triggers

Context

resource

Additional context resources

Template

Template libraries

Deletion

Instrumentation

Prometheus metrics

Events

Status

Apply options

Permissions and RBAC

Namespace isolation

Running ManagedResources in the context of a service account

Testing

Admission

Webhook configuration

Requests

Responses

Manifests

JsonnetLibrary

ManagedResource

Admission

Sample use-cases

Sync upgrade notifications with a ArgoCD sync hook and bash script

Sync OpenShift Console TLS Secret with a bash "reconciler"

Default network policies for namespaces with espejo

Deletion

Resources

`interval`

`watchResource`

`watchContextResource`

`resource`

Running `ManagedResources` in the context of a service account

`JsonnetLibrary`

`ManagedResource`

`Admission`