High availability

The control plane is deployed as a standalone application. If you use our Helm chart, you can deploy multiple replicas across different availability zones for operational redundancy.

To make sure only one instance performs infrastructure changes, the control plane implements leader election and automatically selects one pod to serve all requests.

How it works

Kubernetes only

This feature is only supported when running the control plane with Kubernetes. While a different synchronization primitive could have been used, we rely on the Kubernetes Service to redirect traffic.

Control plane

When an instance of the control plane deployment is started, it attempts to acquire a Lease from the Kubernetes API. The lease is short-lived and is renewed periodically.

Whichever instance acquires the lease first reports itself as "Ready", and the Service routes traffic only to ready pods. This ensures that only one instance responds to API requests from PgDog and the web UI.
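
The acquire-and-renew behavior described above can be sketched with a small in-memory model. This is an illustration only, not the control plane's implementation: the `Lease` class, `try_acquire_or_renew`, and `is_ready` are hypothetical names, and a real deployment reads and updates a Lease object through the Kubernetes API rather than mutating local state.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Lease:
    """In-memory stand-in for a Kubernetes Lease (hypothetical, for illustration)."""
    holder: Optional[str] = None
    renewed_at: float = 0.0
    duration_secs: float = 15.0

    def expired(self, now: float) -> bool:
        # A lease with no holder, or one not renewed within its duration, is free.
        return self.holder is None or now - self.renewed_at >= self.duration_secs


def try_acquire_or_renew(lease: Lease, candidate: str, now: float) -> bool:
    """Return True if `candidate` holds the lease after this attempt."""
    if lease.holder == candidate or lease.expired(now):
        lease.holder = candidate
        lease.renewed_at = now
        return True
    # Another instance holds a live lease: stay on standby.
    return False


def is_ready(lease: Lease, instance: str, now: float) -> bool:
    """Only the current, unexpired lease holder should report itself as Ready."""
    return lease.holder == instance and not lease.expired(now)
```

In this model, a standby instance keeps calling `try_acquire_or_renew` and only succeeds once the leader has stopped renewing for a full lease duration, at which point `is_ready` flips to the new holder.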

Configuration

This feature is disabled by default. It can be enabled and configured in the Helm chart:

values.yaml
```yaml
control:
  config:
    leader:
      enabled: true
      lease_name: "control2"
      lease_duration_secs: 15
      lease_interval_secs: 5
      release_timeout_secs: 5
```

Most of these settings have sane defaults:

| Configuration | Description |
|---------------|-------------|
| `enabled` | Toggle leader election on or off. It is disabled by default (`false`). |
| `lease_name` | The name of the Lease resource. Change it if you're planning to deploy more than one control plane per namespace. |
| `lease_duration_secs` | Lease duration, in seconds. Longer values prevent lease takeover due to clock skew, but slow down redeployments after unexpected pod termination. |
| `lease_interval_secs` | How often, in seconds, the control plane leader attempts to renew the lease. |
| `release_timeout_secs` | How long, in seconds, the control plane waits for the lease to be released during a graceful shutdown. |
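
To reason about how these settings interact, the following sketch checks a configuration for obvious timing mistakes and estimates how long a takeover might take after the leader disappears. It rests on assumptions not stated in this document: that the renewal interval should be shorter than the lease duration, and that a standby can take over as soon as the lease expires. `LeaderConfig`, `check`, and `takeover_window_secs` are hypothetical helpers, not part of the control plane or the Helm chart.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class LeaderConfig:
    # Field names mirror the values.yaml keys shown above.
    lease_name: str
    lease_duration_secs: int
    lease_interval_secs: int
    release_timeout_secs: int


def check(cfg: LeaderConfig) -> List[str]:
    """Return a list of human-readable problems (empty if none were found)."""
    problems = []
    if cfg.lease_duration_secs <= 0 or cfg.lease_interval_secs <= 0:
        problems.append("durations and intervals must be positive")
    elif cfg.lease_interval_secs >= cfg.lease_duration_secs:
        # Assumption: the leader must renew at least once before the lease expires.
        problems.append("lease_interval_secs should be shorter than lease_duration_secs")
    return problems


def takeover_window_secs(cfg: LeaderConfig) -> Tuple[int, int]:
    """Estimate (shortest, longest) wait before an abandoned lease expires.

    Shortest: the leader died just before its next renewal.
    Longest: the leader died right after renewing.
    Ignores how often standbys poll, so real delays may be longer.
    """
    return (cfg.lease_duration_secs - cfg.lease_interval_secs, cfg.lease_duration_secs)
```

With the example values above (`lease_duration_secs: 15`, `lease_interval_secs: 5`), `check` finds no problems and `takeover_window_secs` returns `(10, 15)`, which matches the note that longer lease durations slow down recovery after an unexpected pod termination.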

Default deployment

By default, the control plane is deployed with only one replica. This is usually sufficient since high availability is not essential for normal operations and PgDog pods can tolerate intermittent control plane downtime.