High Availability Self-Hosted Plausible Analytics on Kubernetes

Introduction

There are several tutorials on installing Plausible Analytics on Kubernetes, but none of them explain how to achieve high availability. In this guide, I'll walk you through setting up Plausible Analytics with a highly available ClickHouse cluster and PostgreSQL cluster.

Prerequisites

  • A Kubernetes cluster with at least three worker nodes.
  • CloudNativePG - PostgreSQL Operator for Kubernetes installed.
  • Altinity Kubernetes Operator for ClickHouse installed.
  • cert-manager.io installed with a cluster issuer.
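
You can quickly confirm the operators are in place before starting (the namespaces and deployment names below are each project's install defaults; adjust them if you installed the operators differently):

kubectl get deploy -n cnpg-system cnpg-controller-manager
kubectl get deploy -n kube-system clickhouse-operator
kubectl get deploy -n cert-manager cert-manager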

Step-by-step Guide

Note: This might only work with Plausible CE v2.1.3.

  1. Create a plausible namespace:

    apiVersion: v1
    kind: Namespace
    metadata:
      name: plausible
    

    Save this as namespace.yaml and send it to K8s:

    kubectl create -f namespace.yaml
    
  2. ZooKeeper is required for ClickHouse cluster replication, so deploy an HA ZooKeeper ensemble:

    Download this zookeeper.yaml file and send it to K8s:

    kubectl create -f zookeeper.yaml
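
    Before moving on, check that the ensemble is healthy. The pod names below follow the StatefulSet from zookeeper.yaml (they match the Verify section at the end of this guide); the srvr probe assumes ZooKeeper's four-letter-word commands are whitelisted and that nc is available in the image:

    kubectl -n plausible get pods zookeeper-0 zookeeper-1 zookeeper-2
    kubectl -n plausible exec zookeeper-0 -- sh -c 'echo srvr | nc localhost 2181'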
    
  3. Deploy the HA ClickHouse cluster:

    apiVersion: 'clickhouse.altinity.com/v1'
    kind: 'ClickHouseInstallation'
    metadata:
      name: plausible
      namespace: plausible
    spec:
      configuration:
        users:
          plausible/password_sha256_hex: xxxyyy # printf 'YOUR_PASSWORD' | sha256sum
          plausible/networks/ip:
            - 0.0.0.0/0
        zookeeper:
          nodes:
            - host: zookeeper-0.zookeeper-headless.plausible.svc
            - host: zookeeper-1.zookeeper-headless.plausible.svc
            - host: zookeeper-2.zookeeper-headless.plausible.svc
        clusters:
          - name: 'cluster'
            templates:
              podTemplate: pod-template
            layout:
              shardsCount: 1
              replicasCount: 3
      templates:
        podTemplates:
          - name: pod-template
            spec:
              containers:
                - name: clickhouse
                  image: clickhouse/clickhouse-server:24.3
                  volumeMounts:
                    - name: clickhouse-data
                      mountPath: /var/lib/clickhouse
                    - name: clickhouse-log
                      mountPath: /var/log/clickhouse-server
        volumeClaimTemplates:
          - name: clickhouse-data
            spec:
              accessModes:
                - ReadWriteOnce
              resources:
                requests:
                  storage: 2Gi
          - name: clickhouse-log
            spec:
              accessModes:
                - ReadWriteOnce
              resources:
                requests:
                  storage: 2Gi    
    

    Save this as clickhouse.yaml and send it to K8s:

    kubectl create -f clickhouse.yaml
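
    The operator records progress in the ClickHouseInstallation resource (chi is its short name); wait until the status reaches Completed. You can also confirm that all three replicas see each other, assuming the default user may connect from localhost inside the pod:

    kubectl -n plausible get chi plausible
    kubectl -n plausible exec chi-plausible-cluster-0-0-0 -- \
        clickhouse-client -q "SELECT host_name FROM system.clusters WHERE cluster = 'cluster'"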
    
  4. Deploy an HA PostgreSQL cluster:

    ---
    apiVersion: postgresql.cnpg.io/v1
    kind: Cluster
    metadata:
      name: postgresql-plausible
      namespace: plausible
    spec:
      # https://github.com/cloudnative-pg/postgres-containers/pkgs/container/postgresql
      # https://github.com/cloudnative-pg/postgis-containers/pkgs/container/postgis
      imageName: ghcr.io/cloudnative-pg/postgresql:17.0
      instances: 3
      postgresql:
        parameters:
          max_worker_processes: '60'
        pg_hba:
          - host all all all md5
      storage:
        size: 1Gi
      primaryUpdateMethod: switchover
      monitoring:
        enablePodMonitor: true
    ---
    apiVersion: postgresql.cnpg.io/v1
    kind: Pooler
    metadata:
      name: postgresql
      namespace: plausible
    spec:
      cluster:
        name: postgresql-plausible
      instances: 3
      type: rw
      pgbouncer:
        poolMode: session
        parameters:
          max_client_conn: '1000'
          default_pool_size: '10'    
    

    Save this as postgresql.yaml and send it to K8s:

    kubectl create -f postgresql.yaml
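
    CloudNativePG reports health on the Cluster resource; all three instances should become ready. If you have the cnpg kubectl plugin installed, its status report is more detailed:

    kubectl -n plausible get cluster postgresql-plausible
    kubectl -n plausible cnpg status postgresql-plausible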
    
  5. Provide the ENVs for Plausible through a ConfigMap and a Secret:

    You can get the PostgreSQL password from the postgresql-plausible-app secret, which CloudNativePG creates automatically:

    kubectl -n plausible get secret postgresql-plausible-app -o jsonpath="{.data.password}" | base64 --decode
    

    You can create a K8s Secret for storing the sensitive Plausible ENVs (replace the ToBeReplaced placeholders first):

    export SECRET_KEY_BASE=$(openssl rand -base64 48)
    export TOTP_VAULT_KEY=$(openssl rand -base64 32)
    export DATABASE_URL='postgres://app:ToBeReplaced@postgresql.plausible:5432/app'
    export CLICKHOUSE_DATABASE_URL='http://default:ToBeReplaced@clickhouse-plausible.plausible:8123/default'
    export SMTP_USER_PWD='toBeReplaced'
    
    kubectl create secret generic --dry-run=client \
        plausible-env \
        --namespace='plausible' \
        --from-literal=SECRET_KEY_BASE=$SECRET_KEY_BASE \
        --from-literal=TOTP_VAULT_KEY=$TOTP_VAULT_KEY \
        --from-literal=DATABASE_URL=$DATABASE_URL \
        --from-literal=CLICKHOUSE_DATABASE_URL=$CLICKHOUSE_DATABASE_URL \
        --from-literal=SMTP_USER_PWD=$SMTP_USER_PWD \
        -o yaml
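
    If you prefer, build DATABASE_URL from the secret directly instead of pasting the password by hand (a small convenience sketch; the postgresql.plausible host name matches the Pooler service created in step 4):

    export PG_PASSWORD=$(kubectl -n plausible get secret postgresql-plausible-app \
        -o jsonpath="{.data.password}" | base64 --decode)
    export DATABASE_URL="postgres://app:${PG_PASSWORD}@postgresql.plausible:5432/app"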
    

    Your final YAML file for the ENVs will look like this (the Secret below is the output of the dry-run command above):

    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: plausible-env
      namespace: plausible
    data:
      BASE_URL: 'https://analytics.domain.com' # Update this
      DISABLE_REGISTRATION: 'invite_only'
      ENABLE_EMAIL_VERIFICATION: 'true'
      MAILER_NAME: 'Plausible Analytics'
      MAILER_EMAIL: 'plausible@analytics.domain.com' # Update this
      SMTP_HOST_ADDR: 'smtp.tem.scw.cloud' # Update this
      SMTP_USER_NAME: 'xxxxxxxx-aaaa-bbbb-cccc-yyyyyyyyyyyyy' # Update this
    ---
    apiVersion: v1
    data:
      CLICKHOUSE_DATABASE_URL: XXX
      DATABASE_URL: YYY
      SECRET_KEY_BASE: ZZZ
      SMTP_USER_PWD: AAA
      TOTP_VAULT_KEY: BBB
    kind: Secret
    metadata:
      creationTimestamp: null
      name: plausible-env
      namespace: plausible
    

    Save this as env.yaml and send it to K8s:

    kubectl create -f env.yaml
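
    Confirm that both objects exist:

    kubectl -n plausible get configmap,secret plausible-env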
    
  6. Plausible doesn't natively support a highly available (HA) ClickHouse replication cluster: its migrations create non-replicated tables. To enable replication, we need to manually create the necessary Replicated tables on the ClickHouse cluster:

    Connect to ClickHouse and execute this SQL query using any preferred tool, such as DBeaver.
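
    The exact DDL must mirror the schema that Plausible's migrations create, so treat the linked query as authoritative; the pattern it follows looks like this (an illustrative sketch only, the table and columns here are placeholders):

    -- Placeholder table: real names and columns must match Plausible's schema.
    -- Replicated* engines keep replication state in ZooKeeper under the given
    -- path; the {shard} and {replica} macros are filled in by the operator.
    CREATE TABLE default.example ON CLUSTER 'cluster'
    (
        site_id   UInt64,
        timestamp DateTime
    )
    ENGINE = ReplicatedMergeTree('/clickhouse/tables/{shard}/example', '{replica}')
    ORDER BY (site_id, timestamp);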

  7. Deploy Plausible with two or more replicas:

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: plausible
      namespace: plausible
      annotations:
        reloader.stakater.com/auto: 'true'
      labels:
        app: plausible
    spec:
      replicas: 2
      selector:
        matchLabels:
          app: plausible
      template:
        metadata:
          namespace: plausible
          labels:
            app: plausible
        spec:
          initContainers:
            - name: migration
              image: ghcr.io/plausible/community-edition:v2.1.3
              command:
                - /bin/sh
                - -c
                - |
                  /entrypoint.sh db createdb && /entrypoint.sh db migrate
              envFrom:
                - secretRef:
                    name: plausible-env
                - configMapRef:
                    name: plausible-env
          containers:
            - name: plausible
              image: ghcr.io/plausible/community-edition:v2.1.3
              ports:
                - containerPort: 8000
              envFrom:
                - secretRef:
                    name: plausible-env
                - configMapRef:
                    name: plausible-env    
    

    Save this as depl.yaml and send it to K8s:

    kubectl create -f depl.yaml
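
    Note that the reloader.stakater.com/auto annotation only takes effect if Stakater Reloader is installed in the cluster; it restarts the pods whenever the referenced ConfigMap or Secret changes. Watch the rollout finish with:

    kubectl -n plausible rollout status deployment/plausible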
    
  8. Configure the Service and Ingress for Plausible:

    ---
    apiVersion: v1
    kind: Service
    metadata:
      name: plausible
      namespace: plausible
    spec:
      selector:
        app: plausible
      ports:
        - protocol: TCP
          port: 8000
          targetPort: 8000
    ---
    apiVersion: networking.k8s.io/v1
    kind: Ingress
    metadata:
      name: plausible
      namespace: plausible
      annotations:
        cert-manager.io/cluster-issuer: letsencrypt-cluster-issuer
    spec:
      tls:
        - hosts:
            - analytics.domain.com # Update this
          secretName: plausible-tls
      rules:
        - host: analytics.domain.com
          http:
            paths:
              - path: /
                pathType: Prefix
                backend:
                  service:
                    name: plausible
                    port:
                      number: 8000
    

    Save this as ingress-service.yaml and send it to K8s:

    kubectl create -f ingress-service.yaml
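
    Depending on your ingress controller, you may also need to set spec.ingressClassName (for example, nginx). Once applied, cert-manager creates a Certificate named after the TLS secret; check that it becomes ready:

    kubectl -n plausible get ingress plausible
    kubectl -n plausible get certificate plausible-tls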
    

Verify

Once everything is configured correctly, the pods should look like this:

kubectl -n plausible get pods
NAME                          READY   STATUS    RESTARTS      AGE
chi-plausible-cluster-0-0-0   1/1     Running   0             24h
chi-plausible-cluster-0-1-0   1/1     Running   0             24h
chi-plausible-cluster-0-2-0   1/1     Running   0             24h
plausible-6cc78d59bb-56wjj    1/1     Running   2 (24h ago)   24h
plausible-6cc78d59bb-kl6sf    1/1     Running   1 (24h ago)   24h
postgresql-7968d465bd-87gx4   1/1     Running   0             24h
postgresql-7968d465bd-hlrdj   1/1     Running   0             24h
postgresql-7968d465bd-wj4ln   1/1     Running   0             24h
postgresql-plausible-1        1/1     Running   0             24h
postgresql-plausible-2        1/1     Running   0             24h
postgresql-plausible-3        1/1     Running   0             24h
zookeeper-0                   1/1     Running   1 (22h ago)   24h
zookeeper-1                   1/1     Running   0             24h
zookeeper-2                   1/1     Running   0             24h

You can now access analytics.domain.com and create your Plausible account.

Conclusion

By following these steps, you now have a production-grade Plausible Analytics setup running on your Kubernetes cluster. I've been running my own Plausible instance in my cluster, and it has performed well without any downtime, even during Kubernetes node maintenance and restarts.
