Espejote: An in-cluster templating controller
Problem Statement
Not every cluster configuration is fully static and can be managed through GitOps.
Some configurations depend on the cluster state or might want to be managed by upstream, by end-users, or by third-party products directly.
We have an extensive array of tools to manage these configurations, from bash "reconcilers" and cron jobs to policy engines and fully fledged custom controllers. Each of those tools is specialized, and none covers all use cases.
A maintainable, fully instrumented, and extensible controller that can manage arbitrary configurations in a cluster is missing. We decided to try our hand at creating one.
High Level Goals
- We can manage arbitrary configurations in a cluster
- The configuration may depend on state managed by other controllers inside the cluster
- We can apply partial configurations to externally managed resources
- The number of requests and the amount of caching are tunable
- We can monitor failures and successes and have a tight feedback loop
Name
"Espejote" for "big mirror" in Spanish. It’s a play on the earlier project "Espejo" which means "mirror." "Espejo" is a tool to mirror resources to multiple namespaces. "Espejote" is a more general tool to manage arbitrary configurations in a cluster.
Implementation
The controller is based on controller-runtime.
It's managed through two Custom Resource Definitions (CRDs) called ManagedResource and Admission.
ManagedResource implements a controller pattern that watches for changes in the cluster and applies the configuration.
Admission implements a webhook pattern that applies the configuration on admission.
We don’t implement any logic to automatically create admission webhooks for the resources or vice versa.
The controller uses Jsonnet to render manifests.
Any Kubernetes manifest can be added to the templating context.
Reconcile triggers are separated from the context resources to allow for more fine-grained control over the reconciliation.
Reconcile triggers
Reconcile triggers are used to re-render the template. They’re separated from the context resources to allow for more fine-grained control.
triggers:
  - name: interval (1)
    interval: 10s
  - name: ext-configmaps
    watchResource:
      apiVersion: v1
      kind: ConfigMap
      labelSelector:
        matchExpressions:
          - key: espejote.io/created-by
            operator: DoesNotExist
      minTimeBetweenTriggers: 10s
  - name: my-configmap
    watchResource:
      apiVersion: v1
      kind: ConfigMap
      name: my-configmap
1 | Free form name for the trigger. Can be referenced in the template and used to decide on partial re-renders. |
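The trigger that fires is exposed to the Jsonnet template as esp.triggerData() and esp.triggerName().
local esp = import 'espejote.libsonnet';
if esp.triggerName() == 'ext-configmaps' then
  // Partial re-render: only the resource that fired the trigger is needed.
  local resource = esp.triggerData().resource;
  []  // placeholder: return the manifests derived from `resource`
else
  // Full re-render.
  []  // placeholder: return all manifests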
Currently, we support two types of triggers:
watchResource
The template is re-rendered whenever the watched resource is created, changed, or deleted.
The minTimeBetweenTriggers field can be used to limit the rate of re-renders.
This is useful to prevent a high rate of re-renders when a resource is changed multiple times in a short period.
If a list of resources is watched, minTimeBetweenTriggers is applied to each resource individually.
minTimeBetweenTriggers is coming soon.
The watchResource trigger contains the full Kubernetes resource definition of the watched resource.
The resource is available in the Jsonnet template as esp.triggerData().resource.
watchContextResource
The watchContextResource trigger works the same as the watchResource trigger, but shares cache and configuration with a context resource.
This avoids duplicate, possibly drifting, configuration when using the same configuration for both the context and the trigger.
This often happens when doing partial re-renders.
See the example below.
spec:
  context:
    - name: namespaces
      resource:
        apiVersion: v1
        kind: Namespace
        labelSelector:
          matchExpressions:
            - key: netpol.example.com/no-default
              operator: DoesNotExist
  triggers:
    - name: namespace
      watchContextResource:
        name: namespaces (1)
  template: |
    local esp = import 'espejote.libsonnet';
    local netpolForNs = function(ns) {
      [...]
    };
    if esp.triggerName() == 'namespace' then [
      netpolForNs(esp.triggerData().resource),
    ] else [
      netpolForNs(ns)
      for ns in esp.context().namespaces
    ]
1 | The name field is used to reference the context resource where the configuration is copied from. |
Context
The context is a list of resources that are available in the Jsonnet template. They’re not watched. If a watch is required, it should be added as a trigger.
context:
  - name: configs
    resource:
      apiVersion: v1
      kind: ConfigMap
      labelSelector:
        matchExpressions:
          - key: espejote.io/created-by
            operator: DoesNotExist
      cache: true
  - name: config
    resource:
      apiVersion: v1
      kind: ConfigMap
      name: my-configmap
The context is exposed to the Jsonnet template as the esp.context() function.
local esp = import 'espejote.libsonnet';
local config = esp.context().config;
Currently, we support one type of context resource:
resource
A Kubernetes resource that’s available in the Jsonnet template.
The resource can be selected by name or labelSelector.
matchNames and ignoreNames can be used to filter the resources after the labelSelector is applied.
matchNames is a list of names that should be matched.
ignoreNames is a list of names that should be ignored.
Label selectors should be preferred over matchNames and ignoreNames because they let the API server filter the resources before they reach the controller.
The value exposed to the template is always a list with zero or more items. This is true even if a single resource is selected using the name field.
The cache field can be used to set up a controller-runtime cache for the resource.
This is an in-memory cache with a watch on the resource.
The cache is enabled (true) by default.
It can be disabled to trade lower memory usage for more API requests.
The possibility to disable the cache is coming soon.
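Because a context entry is always a list, a template that expects exactly one object has to unwrap it itself. The following is a minimal sketch, assuming a context entry named config that selects a single ConfigMap by name, as in the example above. The fallback object and the name of the returned ConfigMap are made up for illustration.

```jsonnet
local esp = import 'espejote.libsonnet';

// esp.context().config is a list, even though the entry selects by name.
local configs = esp.context().config;

// Fall back to an empty ConfigMap-shaped object if nothing was found,
// so the lookup below doesn't fail on a missing resource.
local config = if std.length(configs) > 0 then configs[0] else { data: {} };

{
  apiVersion: 'v1',
  kind: 'ConfigMap',
  // Hypothetical name, for illustration only.
  metadata: { name: 'copy-of-my-configmap' },
  // std.get returns the default when the ConfigMap has no data field.
  data: std.get(config, 'data', {}),
}
```

Guarding against the empty list keeps the render from failing while the selected resource doesn't exist yet.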
Template
The template is written in Jsonnet and returns the Kubernetes resources to apply.
It can pull in external resources through the context and triggers.
It can return a list of resources or a single resource. The resources are applied in the order they're returned.
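Since resources are applied in the order they're returned, a template can list dependencies first. The sketch below relies only on that ordering. All names in it are hypothetical, and 'my-namespace' is assumed to be the namespace of the ManagedResource.

```jsonnet
// Hypothetical name, used for illustration only.
local name = 'config-reader';

[
  // Returned first, so it's applied before the RoleBinding that references it.
  {
    apiVersion: 'v1',
    kind: 'ServiceAccount',
    metadata: { name: name },
  },
  {
    apiVersion: 'rbac.authorization.k8s.io/v1',
    kind: 'RoleBinding',
    metadata: { name: name },
    roleRef: {
      apiGroup: 'rbac.authorization.k8s.io',
      kind: 'ClusterRole',
      name: 'view',
    },
    subjects: [
      // The subject's namespace has to be spelled out explicitly.
      { kind: 'ServiceAccount', name: name, namespace: 'my-namespace' },
    ],
  },
]
```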
Template libraries
The template can use libraries to share code between templates.
The libraries are stored as JsonnetLibrary resources.
The following libraries are available:
- "espejote.libsonnet": The built-in library for accessing the context and trigger information.
- "lib/<NAME>/<KEY>": Libraries in the shared library namespace. The name corresponds to the name of the JsonnetLibrary object and the key to the key in the data field. The namespace is configured at controller startup and normally points to the namespace of the controller.
- "<NAME>/<KEY>": Libraries in the same namespace as the ManagedResource. The name corresponds to the name of the JsonnetLibrary object and the key to the key in the data field.
Namespace-local libraries are an easy way to set configuration for a specific ManagedResource. See component-openshift4-nodes for an example: JsonnetLibrary, template.
local esp = import 'espejote.libsonnet';
local config = import 'my-managed-resource/config.json';
local sharedLib = import 'lib/shared-library/prometheus.libsonnet';
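To illustrate the shared-library import path, the sketch below assumes a JsonnetLibrary named my-library in the shared library namespace, with a key hello-world.libsonnet that exports helloWorld(name), matching the JsonnetLibrary manifest in the Manifests section. The ConfigMap name and data key are made up.

```jsonnet
// 'lib/<NAME>/<KEY>' resolves to key <KEY> of the JsonnetLibrary <NAME>
// in the shared library namespace.
local hello = import 'lib/my-library/hello-world.libsonnet';

{
  apiVersion: 'v1',
  kind: 'ConfigMap',
  // Hypothetical name, for illustration only.
  metadata: { name: 'greeting' },
  data: {
    // helloWorld is assumed to be defined as in the JsonnetLibrary manifest.
    message: hello.helloWorld('Espejote'),
  },
}
```

With the library defined as in that manifest, the message field would render as 'Hello, Espejote!'.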
Deletion
A resource can be deleted by marking it for deletion with the esp.markForDelete function.
local esp = import 'espejote.libsonnet';
esp.markForDelete({
  apiVersion: 'v1',
  kind: 'ConfigMap',
  metadata: {
    namespace: 'my-namespace',
    name: 'my-configmap',
  },
})
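Deletion usually depends on a condition. The sketch below assumes a context entry named namespaces that lists Namespace objects, a made-up opt-out label, and a service account that is allowed to manage ConfigMaps in those namespaces. It also assumes that objects marked with esp.markForDelete can be returned in the same list as regular resources.

```jsonnet
local esp = import 'espejote.libsonnet';

// Hypothetical opt-out label, for illustration only.
local optOutLabel = 'example.com/no-default-config';

// Hypothetical ConfigMap managed in every namespace.
local configFor(ns) = {
  apiVersion: 'v1',
  kind: 'ConfigMap',
  metadata: { name: 'default-config', namespace: ns.metadata.name },
  data: { managed: 'true' },
};

// True if the namespace carries the opt-out label.
local optedOut(ns) =
  std.objectHas(std.get(ns.metadata, 'labels', {}), optOutLabel);

[
  // Remove the ConfigMap from opted-out namespaces, apply it everywhere else.
  if optedOut(ns) then esp.markForDelete(configFor(ns)) else configFor(ns)
  for ns in esp.context().namespaces
]
```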
Instrumentation
Prometheus metrics
The controller is instrumented with Prometheus metrics.
The metrics are exposed on the /metrics endpoint.
The metrics should contain the following:
- Number of re-renders per trigger
- Number of re-renders per resource
- Number of errors per re-render
- Number of resources applied per re-render (coming soon)
- Number of admission requests with return status
The metrics should be labeled with the ManagedResource name and namespace.
Apply options
- applyOptions.forceConflicts can be used to resolve field conflicts. The ManagedResource will become the sole owner of the field.
- applyOptions.createOnly can be used to only create the resource. If the resource already exists, the ManagedResource won't update it.
Permissions and RBAC
The ManagedResources should run with the least permissions possible.
Namespace isolation
By default, the ManagedResource is namespace scoped.
Namespace fields for triggers and contexts need to be set to "" explicitly if resources from all namespaces are required.
Returned resources default to the namespace of the ManagedResource if not explicitly set.
Running ManagedResources in the context of a service account
The ManagedResource is run in the context of a service account.
The service account must hold all permissions required to query and apply the resources.
If no service account is specified, the ManagedResource runs in the context of the namespace's default service account.
spec:
  serviceAccountRef:
    name: my-service-account
Testing
The controller should have a testing utility integrated that will render the template with a given context and triggers. The rendered manifests could then be compared to a golden file, applied to a cluster, or parts matched using further testing frameworks.
For each trigger a rendered YAML is created with the output of the template.
The utility checks that the triggers and contexts in the test file match the triggers and contexts in the ManagedResource.
The utility can export real resources from a cluster to an input file. It executes the template to check which JsonnetLibraries are used and bundles them in the input file.
espejote collect-input my-template.yaml > my-input-1.yaml
triggers: (1)
  - {} (2)
  - name: namespace
    watchResource:
      apiVersion: v1
      kind: Namespace
      metadata:
        creationTimestamp: "2025-03-17T12:41:04Z"
        labels:
          kubernetes.io/metadata.name: blub
          managedresource-sample.espejote.io/inject-cm: glug
        name: blub
        resourceVersion: "1268206" (3)
        uid: 62bf8581-6e7b-49df-b1bb-3b2ad46405e9
      spec:
        finalizers:
          - kubernetes
      status:
        phase: Active
  - name: namespace
    watchResource:
      apiVersion: v1
      kind: Namespace
      metadata:
        creationTimestamp: "2025-03-17T20:50:58Z"
        labels:
          kubernetes.io/metadata.name: blub2
          managedresource-sample.espejote.io/inject-cm: glugindeed
        name: blub2
        resourceVersion: "1268191"
        uid: b1d4b12a-e7d8-4334-ba68-b598e88c4dfb
      spec:
        finalizers:
          - kubernetes
      status:
        phase: Active
context:
  - name: namespaces
    resources:
      - apiVersion: v1
        kind: Namespace
        metadata:
          creationTimestamp: "2025-03-17T12:41:04Z"
          labels:
            kubernetes.io/metadata.name: blub
            managedresource-sample.espejote.io/inject-cm: glug
          name: blub
          resourceVersion: "1268206"
          uid: 62bf8581-6e7b-49df-b1bb-3b2ad46405e9
        spec:
          finalizers:
            - kubernetes
        status:
          phase: Active
      - apiVersion: v1
        kind: Namespace
        metadata:
          creationTimestamp: "2025-03-17T20:50:58Z"
          labels:
            kubernetes.io/metadata.name: blub2
            managedresource-sample.espejote.io/inject-cm: glugindeed
          name: blub2
          resourceVersion: "1268191"
          uid: b1d4b12a-e7d8-4334-ba68-b598e88c4dfb
        spec:
          finalizers:
            - kubernetes
        status:
          phase: Active
libraries:
  jsonnetlibrary-sample/sample.libsonnet: "{Sample: 'Hello World'}" (4)
1 | The template is rendered for every trigger. If multiple context variations are required, a new file should be created for each combination. |
2 | Empty trigger to test a full ManagedResource reconciliation. |
3 | Real resources exported from a cluster by using espejote collect-input my-template.yaml |
4 | Any JsonnetLibraries used by the template are exported from the cluster and included in the input file. |
espejote render my-template.yaml --input my-input-1.yaml --input my-input-2.yaml (1)
1 | The command can be used with multiple input files to test different context variations. |
Admission
Espejote manages ValidatingWebhookConfiguration or MutatingWebhookConfiguration objects through Admission resources.
An Admission allows validating API requests or mutating them before they're persisted in the cluster.
Webhook configuration
Most webhook configuration options are exposed in the Admission resource to make configuration as flexible as possible.
Espejote manages the service configuration for the generated admission webhook configurations and injects a namespace selector for namespaced admissions.
webhookConfiguration:
  objectSelector:
    matchLabels:
      cluster-autoscaler: default
      k8s-app: cluster-autoscaler
  rules:
    - apiGroups:
        - ''
      apiVersions:
        - '*'
      operations:
        - CREATE
      resources:
        - pods
Requests
The full request can be accessed using admission.admissionRequest().
The request contains the full admission request as defined in pkg.go.dev/k8s.io/api/admission/v1#AdmissionRequest.
Responses
The supported admission responses are allowed, denied, and patched.
The allowed and denied responses are used to allow or deny the admission request.
The patched response is used to apply a patch to the resource.
Patches are only allowed for mutating: true admissions.
All responses should be created using the (import "espejote.libsonnet").ALPHA.admission library.
If a patch is returned, it should be tested using admission.assertPatch.
This ensures that the patch can be applied to the resource and fails the admission otherwise.
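For a validating (non-mutating) Admission, the template only needs to return an allowed or denied response. The following is a minimal sketch with a made-up policy that requires a label on every admitted object. It uses only admission.admissionRequest, admission.allowed, and admission.denied from the library; the label name is hypothetical.

```jsonnet
local esp = import 'espejote.libsonnet';
local admission = esp.ALPHA.admission;

// Hypothetical policy: every admitted object must carry this label.
local requiredLabel = 'example.com/owner';

local obj = admission.admissionRequest().object;
// Objects without any labels have no metadata.labels field at all.
local labels = std.get(obj.metadata, 'labels', {});

if std.objectHas(labels, requiredLabel) then
  admission.allowed('label %s is set' % requiredLabel)
else
  admission.denied('label %s is required' % requiredLabel)
```

The sketch reads the object field of the request, so it fits rules limited to CREATE and UPDATE operations. For DELETE requests the Kubernetes API server doesn't populate that field.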
Manifests
JsonnetLibrary
apiVersion: espejote.io/v1alpha1
kind: JsonnetLibrary
metadata:
  name: my-library
  namespace: controller-namespace
spec:
  data:
    my-function.libsonnet: |
      {
        myFunction: function() 42,
      }
    hello-world.libsonnet: |
      {
        helloWorld: function(name) 'Hello, %s!' % [name],
      }
ManagedResource
apiVersion: espejote.io/v1alpha1
kind: ManagedResource
metadata:
  name: copy-configmap
  namespace: my-namespace (1)
spec:
  applyOptions: (2)
    forceConflicts: true
  triggers:
    - name: interval
      interval: 10s (3)
    - name: cm
      watchResource: (4)
        apiVersion: v1
        kind: ConfigMap
        # name: my-configmap (5)
        labelSelector: (6)
          matchExpressions:
            - key: espejote.io/created-by
              operator: DoesNotExist
        # matchNames: [] (7)
        # ignoreNames: [] (8)
        minTimeBetweenTriggers: 10s (9)
  context: (10)
    - name: cm (11)
      resource: (12)
        apiVersion: v1
        kind: ConfigMap
        # name: my-configmap
        # matchNames: []
        # ignoreNames: []
        labelSelector:
          matchExpressions:
            - key: espejote.io/created-by
              operator: DoesNotExist
        cache: true (13)
  template: |
    local esp = import 'espejote.libsonnet';
    local cm = esp.context().cm; (15)
    if esp.triggerName() == "cm" then {
      local triggerData = esp.triggerData().resource, (14)
      // Do some partial re-rendering
    } else [ (16)
      c {
        metadata: {
          name: 'copy-of-' + c.metadata.name,
          // namespace: c.metadata.namespace, (17)
          labels+: {
            'espejote.io/created-by': 'copy-configmap'
          }
        }
      },
      for c in cm (18)
    ]
  serviceAccountRef:
    name: my-service-account (19)
1 | ManagedResources are always namespace-scoped and can't access any resources outside of their namespace. |
2 | applyOptions can be used to resolve field conflicts. |
3 | interval triggers the template every 10 seconds. |
4 | watchResource triggers the template whenever the watched resource is created, changed, or deleted. |
5 | name is optional and can be used to select a specific resource. |
6 | labelSelector is optional and can be used to select a specific set of resources. |
7 | matchNames is optional and can be used to filter the resources after the labelSelector is applied. |
8 | ignoreNames is optional and can be used to filter the resources after the labelSelector is applied. |
9 | minTimeBetweenTriggers is optional and can be used to limit the rate of re-renders (coming soon).
The limit applies to each unique resource individually. |
10 | Context is a list of resources that are available in the Jsonnet template. They're not watched. |
11 | name defines a variable in the Jsonnet template. |
12 | resource is a Kubernetes resource that's available in the Jsonnet template. |
13 | cache is optional and can be used to set up a controller-runtime cache for the resource.
The default is true. Disabling the cache is coming soon. |
14 | espejote.libsonnet is used to access the trigger data.
The full manifest is available if using the watchResource trigger. |
15 | espejote.libsonnet is used to access the context.
It returns an object with all defined resources as keys. |
16 | The template can return null, a list of resources, or a single resource. |
17 | The namespace defaults to the namespace of the ManagedResource if not set. |
18 | Resource variables are always a list with zero or more items. |
19 | serviceAccountRef.name holds the reference to the service account used to run the ManagedResource.
Optional; the namespace's default service account is used if not set. |
Admission
apiVersion: espejote.io/v1alpha1
kind: Admission
metadata:
  name: pods-inject-creator-annotation
  namespace: my-namespace
spec:
  mutating: true (1)
  template: |
    local esp = import 'espejote.libsonnet';
    local admission = esp.ALPHA.admission; (2)
    local user = admission.admissionRequest().userInfo.username; (3)
    local obj = admission.admissionRequest().object;
    if std.get(obj.metadata, 'annotations') == null then
      admission.patched('added user annotation', admission.assertPatch([ (4)
        admission.jsonPatchOp('add', '/metadata/annotations', { 'creator': user }),
      ]))
    else
      admission.patched('added user annotation', admission.assertPatch([
        admission.jsonPatchOp('add', '/metadata/annotations/creator', user),
      ]))
  webhookConfiguration: (5)
    rules:
      - apiGroups:
          - ''
        apiVersions:
          - '*'
        operations:
          - CREATE
        resources:
          - pods
1 | mutating is true if the webhook should mutate the resource. |
2 | espejote.libsonnet provides facilities to access the admission request and create patches from it. |
3 | admissionRequest returns the full admission request as defined in pkg.go.dev/k8s.io/api/admission/v1#AdmissionRequest. |
4 | admission.patched returns a patch that can be applied to the resource.
Other valid return values are admission.allowed(msg) and admission.denied(msg) .
admission.assertPatch validates whether the patch can be applied to the resource and fails the admission if the patch isn’t valid.
This makes failures easier to debug than failing in the Kubernetes API server. |
5 | webhookConfiguration is used to configure the webhook.
The user has nearly full control over the webhook configuration, with all upstream selectors available.
The admission is namespace scoped, and the webhook configuration has a namespace selector injected that matches the namespace of the Admission resource.
Cluster-scoped webhooks aren't supported yet; this is tracked here. |
Sample use-cases
Sync upgrade notifications with an ArgoCD sync hook and bash script
We want to show a notification in the OpenShift Console when a minor upgrade is scheduled. To implement this, we added a job triggered by an ArgoCD sync. The job reads another resource and creates a console notification manifest from it.
Sync OpenShift Console TLS Secret with a bash "reconciler"
Cert-manager can only create the secret holding the certificate data in the same namespace as the Certificate resource.
The OpenShift Console route is in the openshift-console namespace, but the secret needs to be applied in the openshift-config namespace.
However, we must create the Certificate resource in openshift-console, since otherwise the OpenShift ingress doesn't admit the HTTP challenge ingress.
We use a bash reconciler to create the secret in the correct namespace.
source_namespace="openshift-console"
target_namespace="openshift-config"
# # Wait for the secret to be created before trying to get it.
# # TODO: --for=create is included with OCP 4.17
# kubectl -n "${source_namespace}" wait secret "${SECRET_NAME}" --for=create --timeout=30m
echo "Waiting for secret ${SECRET_NAME} to be created"
while test -z "$(kubectl -n "${source_namespace}" get secret "${SECRET_NAME}" --ignore-not-found -oname)" ; do
  printf "."
  sleep 1
done
printf "\n"
# When using -w flag kubectl returns the secret once on startup and then again when it changes.
kubectl -n "${source_namespace}" get secret "${SECRET_NAME}" -ojson -w | jq -c --unbuffered | while read -r secret ; do
  echo "Syncing secret: $(printf "%s" "$secret" | jq -r '.metadata.name')"
  kubectl -n "$target_namespace" apply --server-side -f <(printf "%s" "$secret" | jq '{"apiVersion": .apiVersion, "kind": .kind, "metadata": {"name": .metadata.name}, "type": .type, "data": .data}')
done
Default network policies for namespaces with espejo
We apply default network policies to all namespaces using espejo.
apiVersion: sync.appuio.ch/v1alpha1
kind: SyncConfig
[...]
spec:
  namespaceSelector:
    ignoreNames:
      - my-ignored-namespace
    labelSelector:
      matchExpressions:
        - key: network-policies.syn.tools/no-defaults
          operator: DoesNotExist
  syncItems:
    - apiVersion: networking.k8s.io/v1
      [...]