ADR 0040 - TCP Service Access
| Author | Marco De Luca |
|---|---|
| Owner | Schedar |
| Reviewers | Schedar |
| Date Created | 2025-10-01 |
| Date Updated | 2025-10-27 |
| Status | accepted |
| Tags | framework, networking, tcp |
Summary
This ADR evaluates methods to enable cluster-external TCP-based service access for AppCat/Servala instances and compares architectures for multiplexing TCP traffic across hundreds of tenants. This applies to services such as Codey (Git over SSH), PostgreSQL, Redis, MySQL, and other database or TCP-based workloads.
Context
AppCat/Servala users need direct TCP access from outside a cluster to various services running in their tenant namespaces, including Codey (Git over SSH), PostgreSQL, Redis, MySQL, and other database or TCP-based workloads.
Unlike HTTP, TCP protocols typically lack SNI (Server Name Indication), which means hostname-based routing on a single port is not possible. To multiplex multiple tenant services through a single IP address, each service instance must be assigned a unique TCP port.
Cloud provider load balancers impose hard limits on the number of listeners (ports) per IP address. For example (as of October 2025):
- Cloudscale: up to 100 listeners per LoadBalancer
- Exoscale: up to 10 listeners per LoadBalancer
Considered Options
Option 1: Per-tenant LoadBalancer (distinct IP per instance)
Each tenant service instance would expose its Service directly as type: LoadBalancer, letting the cloud provider allocate a public IP for that instance.
This is the most straightforward design and uses Kubernetes native service abstraction without any shared proxying layer.
Network traffic from the Internet terminates directly on each tenant’s service pod; no intermediate routing or multiplexing is required.
Operationally, this model scales linearly: every new tenant service gets one LoadBalancer and one public IP.
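For illustration, a per-tenant Service in this model might look like the following minimal sketch; the name, namespace, and selector are placeholders rather than values from an existing deployment:

```yaml
# Hypothetical per-tenant LoadBalancer Service: the cloud provider
# allocates a dedicated public IP for each such Service.
apiVersion: v1
kind: Service
metadata:
  name: postgresql-external   # placeholder name
  namespace: tenant-a         # placeholder tenant namespace
spec:
  type: LoadBalancer
  selector:
    app.kubernetes.io/name: postgresql
  ports:
    - name: postgresql
      protocol: TCP
      port: 5432
      targetPort: 5432
```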
- Pros
  - Simple datapath
  - Simpler to operate
  - No lateral movement risk
- Cons
  - Potentially hundreds of IPs
  - Linear cost
  - Violates consolidation goal
  - Cloud providers limit the number of LoadBalancers per tenant
Option 2: One LB + one IP + file-based TCP proxy (graceful reloads)
A single shared LoadBalancer IP would forward all inbound SSH traffic to a TCP proxy Deployment (for example HAProxy).
The proxy configuration file would define one bind :PORT and server <tenant> pair per customer instance.
A custom controller would maintain that configuration file (for example a ConfigMap) and trigger a graceful reload whenever tenants are added or removed.
During reloads, HAProxy can keep existing sessions alive but still incurs short pauses for new connections.
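As a sketch of what the controller would render (tenant names, ports, and backend addresses below are illustrative assumptions), the ConfigMap could carry an HAProxy configuration along these lines:

```yaml
# Illustrative sketch only: the controller would regenerate this ConfigMap
# and trigger a graceful HAProxy reload whenever tenants are added or removed.
apiVersion: v1
kind: ConfigMap
metadata:
  name: tcp-proxy-config   # placeholder name
  namespace: tcp-proxy     # placeholder namespace
data:
  haproxy.cfg: |
    defaults
      mode tcp
      timeout connect 5s
      timeout client  1h
      timeout server  1h

    # One frontend/backend pair per tenant instance.
    frontend tenant_a_ssh
      bind :2201
      default_backend tenant_a_ssh

    backend tenant_a_ssh
      server tenant_a forgejo-ssh.tenant-a.svc.cluster.local:2222 check

    frontend tenant_b_pg
      bind :2202
      default_backend tenant_b_pg

    backend tenant_b_pg
      server tenant_b postgresql.tenant-b.svc.cluster.local:5432 check
```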
- Pros
  - Simple architecture
  - Portable
- Cons
  - Reloads can interrupt active sessions, although sessions are short-lived for Git
  - Config management complexity
Option 3: Dynamic proxy with runtime API (HAProxy Data Plane API / Envoy xDS)
Instead of editing files, the controller would communicate with the proxy at runtime:
- For HAProxy, through its REST-based Data Plane API to add or remove frontends/backends dynamically.
- For Envoy, through an xDS control-plane that updates listeners and clusters live.
The proxy stays running continuously. New tenants are added or removed atomically without restarting the process.
The controller becomes a lightweight "routing brain" maintaining the mapping of (tenant → port → backend service) and pushing those changes to the proxy’s API.
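To make the data-plane side concrete, the static equivalent of what an xDS control plane would push for a single tenant might look roughly like this; the tenant name, external port, and backend address are assumptions for illustration:

```yaml
# Illustrative Envoy listener/cluster pair for one tenant service. With xDS,
# the controller would push equivalent resources at runtime instead of
# writing static configuration.
static_resources:
  listeners:
    - name: tenant_a_pg
      address:
        socket_address: { address: 0.0.0.0, port_value: 5001 }
      filter_chains:
        - filters:
            - name: envoy.filters.network.tcp_proxy
              typed_config:
                "@type": type.googleapis.com/envoy.extensions.filters.network.tcp_proxy.v3.TcpProxy
                stat_prefix: tenant_a_pg
                cluster: tenant_a_pg
  clusters:
    - name: tenant_a_pg
      type: STRICT_DNS
      connect_timeout: 5s
      load_assignment:
        cluster_name: tenant_a_pg
        endpoints:
          - lb_endpoints:
              - endpoint:
                  address:
                    socket_address:
                      address: postgresql.tenant-a.svc.cluster.local
                      port_value: 5432
```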
- Pros
  - Atomic updates
  - No dropped connections
  - Mature APIs
- Cons
  - Needs custom control-plane component
  - Higher operational burden
Option 4: Kubernetes Gateway API (TCPRoute per tenant)
In this model, the Kubernetes Gateway API serves as the control plane for multiplexing TCP traffic, independent of any specific proxy implementation. A conformant Gateway controller such as Envoy Gateway, Cilium Gateway, or KGateway manages a shared data plane that acts as a TCP multiplexer for all tenant service connections.
A Kubernetes Gateway resource represents the public entry point (one Service type=LoadBalancer with a single external IP).
Each tenant service receives a dedicated TCP listener bound to a unique port.
For larger deployments, additional ports can be defined through XListenerSet (an experimental feature as the X implies) resources, which extend a Gateway with extra listener definitions beyond its native limit.
A custom controller acts as a lightweight orchestrator: it allocates a free port, creates a TCPRoute pointing to the tenant’s internal Service, and adds a ReferenceGrant in the tenant’s namespace to authorize the cross-namespace reference.
The selected Gateway implementation (Envoy, Cilium, etc.) then automatically reconciles these CRDs into live configuration, handling listener creation, routing, and backend resolution without reloads or direct API calls.
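A rough sketch of the resources the controller would create for one tenant is shown below; the namespaces, names, GatewayClass, and the external port 5001 are assumptions for illustration, not values defined by this ADR:

```yaml
# Shared entry point: one Gateway per shard, exposed by the Gateway
# implementation through a single Service of type LoadBalancer.
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: tcp-shard-1            # placeholder shard name
  namespace: tcp-gateway       # placeholder infrastructure namespace
spec:
  gatewayClassName: envoy-gateway   # depends on the chosen implementation
  listeners:
    - name: tenant-a-pg
      protocol: TCP
      port: 5001               # externally assigned port for this tenant
      allowedRoutes:
        namespaces:
          from: Same
---
# Route binding the listener to the tenant's internal Service.
apiVersion: gateway.networking.k8s.io/v1alpha2
kind: TCPRoute
metadata:
  name: tenant-a-pg
  namespace: tcp-gateway
spec:
  parentRefs:
    - name: tcp-shard-1
      sectionName: tenant-a-pg
  rules:
    - backendRefs:
        - name: postgresql     # tenant's internal Service
          namespace: tenant-a
          port: 5432
---
# Authorizes the cross-namespace reference from the TCPRoute to the
# Service in the tenant namespace.
apiVersion: gateway.networking.k8s.io/v1beta1
kind: ReferenceGrant
metadata:
  name: allow-tcp-gateway
  namespace: tenant-a
spec:
  from:
    - group: gateway.networking.k8s.io
      kind: TCPRoute
      namespace: tcp-gateway
  to:
    - group: ""
      kind: Service
```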
- Pros
  - Fully Kubernetes-native and declarative, integrates cleanly with OpenShift and future Gateway API implementations.
  - Zero-downtime dynamic updates: Gateway controllers apply config atomically without restarts.
  - Simpler controller logic: only needs to manage CRDs, not interact with proxy runtime APIs.
  - Strong observability and status reporting via Gateway API metrics and proxy metrics.
- Cons
  - Each `Gateway` supports ≤ 64 listeners; additional listeners require `XListenerSet`, which can add management overhead.
  - Still constrained by cloud LoadBalancer listener caps (10-100), necessitating sharding at scale.
  - Slightly heavier footprint than a plain HAProxy setup.
Option 5: BGP-advertised Floating IPs with MetalLB or Cilium LB
This approach uses BGP to advertise floating IP addresses directly to the cluster, bypassing cloud provider LoadBalancer-as-a-Service (LBaaS) offerings.
Tools such as MetalLB or Cilium LoadBalancer mode manage IP allocation and BGP advertisements, allowing Kubernetes Service type=LoadBalancer resources to receive IPs from a provider-configured floating IP range.
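For illustration, a MetalLB-based setup would roughly consist of resources like the following; the address range, ASNs, and peer address are placeholder assumptions:

```yaml
# Pool of floating IPs that MetalLB may assign to LoadBalancer Services.
apiVersion: metallb.io/v1beta1
kind: IPAddressPool
metadata:
  name: floating-ips
  namespace: metallb-system
spec:
  addresses:
    - 192.0.2.16/28        # placeholder provider-assigned floating IP range
---
# Advertise addresses from the pool to upstream routers via BGP.
apiVersion: metallb.io/v1beta1
kind: BGPAdvertisement
metadata:
  name: floating-ips
  namespace: metallb-system
spec:
  ipAddressPools:
    - floating-ips
---
# BGP session with the provider's router (all values are placeholders).
apiVersion: metallb.io/v1beta2
kind: BGPPeer
metadata:
  name: provider-router
  namespace: metallb-system
spec:
  myASN: 64512
  peerASN: 64513
  peerAddress: 192.0.2.1
```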
- Pros
  - Independence from cloud provider LBaaS APIs and their limitations
  - Potentially lower cost at scale
- Cons
  - Each Kubernetes `Service type=LoadBalancer` still requires a unique IP address
  - Kubernetes Services cannot route traffic to endpoints across multiple namespaces
  - Does not solve the listener port multiplexing problem; a reverse proxy is still required
  - Requires BGP peering configuration with the cloud provider
  - May require external load balancer infrastructure or additional network management
  - Increases operational overhead compared to managed LBaaS
  - Feasibility depends on cloud provider support for BGP and floating IP ranges
  - Not verified on all target platforms (for example Exoscale)
Decision
We will adopt Option 4, a Gateway API-based architecture for multiplexing TCP traffic across AppCat/Servala tenant services.
This design leverages the Kubernetes Gateway API as the declarative control plane for dynamic TCP routing, while allowing flexibility in choosing the underlying implementation, such as Envoy Gateway, Cilium Gateway API, or KGateway.
The custom controller will manage Gateway API resources (Gateway, TCPRoute, ReferenceGrant, and optionally XListenerSet) to dynamically assign ports and routes for each tenant service.
The selected Gateway implementation (Envoy, Cilium, etc.) will reconcile these CRDs into live data-plane configuration without requiring reloads or manual synchronization.
- Each shard = one `Gateway` object behind its own `Service type=LoadBalancer` (one IP).
- Shard capacity is provider-specific (Cloudscale = 100 listeners/IP, Exoscale = 10 listeners/IP).
- The controller allocates ports from reserved per-shard ranges (sticky per tenant to avoid collisions) and updates each tenant service’s configuration with the assigned external port and hostname.
- DNS uses deterministic service-specific hostnames per shard: for example, `ssh1.codey.ch`, `ssh2.codey.ch` for Codey Git SSH; `db1.example.ch`, `db2.example.ch` for databases.
- Gateway API is the next-generation Kubernetes Ingress, ensuring long-term compatibility and portability.
- Multiple conformant implementations exist, providing choice and ecosystem flexibility.
- Dynamic listener and route updates occur natively through the Gateway controller: no reloads, no direct API integration required.
- The controller logic remains simple: declarative CRD management and port allocation rather than low-level proxy configuration.
- LoadBalancer sharding remains necessary due to provider listener caps (10-100 listeners/IP).
- This complexity is isolated to the controller layer (shard lifecycle, DNS updates), not the data plane.
- The architecture remains portable across providers and Gateway implementations.
Implementation Notes
- Each service instance listens internally on a service-specific port (for example, Forgejo on `2222`, PostgreSQL on `5432`, Redis on `6379`).
- The controller maintains a CRD tracking shard capacity, port ranges, and per-service assignments.
- When a shard reaches capacity, the controller provisions a new `Gateway` and assigns the next DNS alias.
- NetworkPolicies allow traffic from the Gateway namespace to tenant namespaces on the service-specific internal ports (see the sketch after this list).
- Health, metrics, and reconciliation loops ensure the Gateway state matches the controller’s record.
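A minimal sketch of such a NetworkPolicy is shown below; the namespaces, labels, and port are illustrative assumptions:

```yaml
# Allow the shared Gateway data plane to reach a tenant's PostgreSQL port;
# names, labels, and ports are placeholder assumptions.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-tcp-gateway
  namespace: tenant-a
spec:
  podSelector:
    matchLabels:
      app.kubernetes.io/name: postgresql
  policyTypes:
    - Ingress
  ingress:
    - from:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: tcp-gateway
      ports:
        - protocol: TCP
          port: 5432
```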
Consequences
- Cloudscale: one IP supports ~100 tenant services (listeners).
- Exoscale: one IP supports 10 tenant services (listeners).
- Uniform controller logic across providers and service types; only shard capacities differ.
- LoadBalancer costs grow linearly with shard count.
- Each service type (Codey, PostgreSQL, Redis, etc.) can use dedicated or shared Gateway shards depending on scale and isolation requirements.
Risks and Mitigations
- Hard provider caps: unavoidable → automatic sharding.
- Gateway listener scaling: if using Gateway API, use `XListenerSet`; enforce per-shard and per-Gateway limits.
- Port collisions: non-overlapping port ranges per shard, lease-based allocator.
- DNS complexity: deterministic shard hostnames (`sshN.codey.ch`) and automation.
- Operational drift: periodic reconcile verifies proxy state against controller CRDs.