High Availability Self-Hosted Plausible Analytics on Kubernetes

Introduction

There are several tutorials on installing Plausible Analytics on Kubernetes, but none of them explain how to achieve high availability. In this guide, I'll walk you through setting up Plausible Analytics with a highly available ClickHouse cluster and PostgreSQL cluster.

Prerequisites

  • A Kubernetes cluster with at least three worker nodes.
  • CloudNativePG - PostgreSQL Operator for Kubernetes installed.
  • Altinity Kubernetes Operator for ClickHouse installed.
  • cert-manager.io installed with a cluster issuer.
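
You can quickly confirm the operators are in place before starting (the namespaces and deployment names below are each project's install defaults; adjust them if you installed the operators differently):

kubectl get deploy -n cnpg-system cnpg-controller-manager
kubectl get deploy -n kube-system clickhouse-operator
kubectl get deploy -n cert-manager cert-manager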

Step-by-step Guide

Note: This might only work with Plausible CE v2.1.3.

  1. Create a plausible namespace:

    apiVersion: v1
    kind: Namespace
    metadata:
      name: plausible
    

    Save this as namespace.yaml and send it to K8s:

    kubectl create -f namespace.yaml
    
  2. ZooKeeper is required for ClickHouse cluster replication, so deploy an HA ZooKeeper ensemble:

    Download this zookeeper.yaml file and send it to K8s:

    kubectl create -f zookeeper.yaml
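
    Before moving on, check that the ensemble is healthy. The pod names below follow the StatefulSet from zookeeper.yaml (they match the Verify section at the end of this guide); the srvr probe assumes ZooKeeper's four-letter-word commands are whitelisted and that nc is available in the image:

    kubectl -n plausible get pods zookeeper-0 zookeeper-1 zookeeper-2
    kubectl -n plausible exec zookeeper-0 -- sh -c 'echo srvr | nc localhost 2181'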
    
  3. Deploy the HA ClickHouse cluster:

    apiVersion: 'clickhouse.altinity.com/v1'
    kind: 'ClickHouseInstallation'
    metadata:
      name: plausible
      namespace: plausible
    spec:
      configuration:
        users:
          plausible/password_sha256_hex: xxxyyy # printf 'YOUR_PASSWORD' | sha256sum
          plausible/networks/ip:
            - 0.0.0.0/0
        zookeeper:
          nodes:
            - host: zookeeper-0.zookeeper-headless.plausible.svc
            - host: zookeeper-1.zookeeper-headless.plausible.svc
            - host: zookeeper-2.zookeeper-headless.plausible.svc
        clusters:
          - name: 'cluster'
            templates:
              podTemplate: pod-template
            layout:
              shardsCount: 1
              replicasCount: 3
      templates:
        podTemplates:
          - name: pod-template
            spec:
              containers:
                - name: clickhouse
                  image: clickhouse/clickhouse-server:24.3
                  volumeMounts:
                    - name: clickhouse-data
                      mountPath: /var/lib/clickhouse
                    - name: clickhouse-log
                      mountPath: /var/log/clickhouse-server
        volumeClaimTemplates:
          - name: clickhouse-data
            spec:
              accessModes:
                - ReadWriteOnce
              resources:
                requests:
                  storage: 2Gi
          - name: clickhouse-log
            spec:
              accessModes:
                - ReadWriteOnce
              resources:
                requests:
                  storage: 2Gi    
    

    Save this as clickhouse.yaml and send it to K8s:

    kubectl create -f clickhouse.yaml
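
    The operator records progress in the ClickHouseInstallation resource (chi is its short name); wait until the status reaches Completed. You can also confirm that all three replicas see each other, assuming the default user may connect from localhost inside the pod:

    kubectl -n plausible get chi plausible
    kubectl -n plausible exec chi-plausible-cluster-0-0-0 -- \
        clickhouse-client -q "SELECT host_name FROM system.clusters WHERE cluster = 'cluster'"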
    
  4. Deploy an HA PostgreSQL cluster:

    ---
    apiVersion: postgresql.cnpg.io/v1
    kind: Cluster
    metadata:
      name: postgresql-plausible
      namespace: plausible
    spec:
      # https://github.com/cloudnative-pg/postgres-containers/pkgs/container/postgresql
      # https://github.com/cloudnative-pg/postgis-containers/pkgs/container/postgis
      imageName: ghcr.io/cloudnative-pg/postgresql:17.0
      instances: 3
      postgresql:
        parameters:
          max_worker_processes: '60'
        pg_hba:
          - host all all all md5
      storage:
        size: 1Gi
      primaryUpdateMethod: switchover
      monitoring:
        enablePodMonitor: true
    ---
    apiVersion: postgresql.cnpg.io/v1
    kind: Pooler
    metadata:
      name: postgresql
      namespace: plausible
    spec:
      cluster:
        name: postgresql-plausible
      instances: 3
      type: rw
      pgbouncer:
        poolMode: session
        parameters:
          max_client_conn: '1000'
          default_pool_size: '10'    
    

    Save this as postgresql.yaml and send it to K8s:

    kubectl create -f postgresql.yaml
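
    CloudNativePG reports health on the Cluster resource; all three instances should become ready. If you have the cnpg kubectl plugin installed, its status report is more detailed:

    kubectl -n plausible get cluster postgresql-plausible
    kubectl -n plausible cnpg status postgresql-plausible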
    
  5. Provide the ENVs for Plausible through a ConfigMap and a Secret:

    You can get the PostgreSQL password from the postgresql-plausible-app secret, which CloudNativePG creates automatically:

    kubectl -n plausible get secret postgresql-plausible-app -o jsonpath="{.data.password}" | base64 --decode
    

    You can create a K8s Secret for storing the sensitive Plausible ENVs (replace the ToBeReplaced placeholders first):

    export SECRET_KEY_BASE=$(openssl rand -base64 48)
    export TOTP_VAULT_KEY=$(openssl rand -base64 32)
    export DATABASE_URL='postgres://app:ToBeReplaced@postgresql.plausible:5432/app'
    export CLICKHOUSE_DATABASE_URL='http://default:ToBeReplaced@clickhouse-plausible.plausible:8123/default'
    export SMTP_USER_PWD='toBeReplaced'
    
    kubectl create secret generic --dry-run=client \
        plausible-env \
        --namespace='plausible' \
        --from-literal=SECRET_KEY_BASE=$SECRET_KEY_BASE \
        --from-literal=TOTP_VAULT_KEY=$TOTP_VAULT_KEY \
        --from-literal=DATABASE_URL=$DATABASE_URL \
        --from-literal=CLICKHOUSE_DATABASE_URL=$CLICKHOUSE_DATABASE_URL \
        --from-literal=SMTP_USER_PWD=$SMTP_USER_PWD \
        -o yaml
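
    If you prefer, build DATABASE_URL from the secret directly instead of pasting the password by hand (a small convenience sketch; the postgresql.plausible host name matches the Pooler service created in step 4):

    export PG_PASSWORD=$(kubectl -n plausible get secret postgresql-plausible-app \
        -o jsonpath="{.data.password}" | base64 --decode)
    export DATABASE_URL="postgres://app:${PG_PASSWORD}@postgresql.plausible:5432/app"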
    

    Your final YAML file for the ENVs will look like this (the Secret below is the output of the dry-run command above):

    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: plausible-env
      namespace: plausible
    data:
      BASE_URL: 'https://analytics.domain.com' # Update this
      DISABLE_REGISTRATION: 'invite_only'
      ENABLE_EMAIL_VERIFICATION: 'true'
      MAILER_NAME: 'Plausible Analytics'
      MAILER_EMAIL: 'plausible@analytics.domain.com' # Update this
      SMTP_HOST_ADDR: 'smtp.tem.scw.cloud' # Update this
      SMTP_USER_NAME: 'xxxxxxxx-aaaa-bbbb-cccc-yyyyyyyyyyyyy' # Update this
    ---
    apiVersion: v1
    data:
      CLICKHOUSE_DATABASE_URL: XXX
      DATABASE_URL: YYY
      SECRET_KEY_BASE: ZZZ
      SMTP_USER_PWD: AAA
      TOTP_VAULT_KEY: BBB
    kind: Secret
    metadata:
      creationTimestamp: null
      name: plausible-env
      namespace: plausible
    

    Save this as env.yaml and send it to K8s:

    kubectl create -f env.yaml
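
    Confirm that both objects exist:

    kubectl -n plausible get configmap,secret plausible-env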
    
  6. Plausible doesn't natively support a highly available (HA) ClickHouse replication cluster: its migrations create non-replicated tables. To enable replication, we need to manually create the necessary Replicated tables on the ClickHouse cluster:

    Connect to ClickHouse and execute this SQL query using any preferred tool, such as DBeaver.
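
    The exact DDL must mirror the schema that Plausible's migrations create, so treat the linked query as authoritative; the pattern it follows looks like this (an illustrative sketch only, the table and columns here are placeholders):

    -- Placeholder table: real names and columns must match Plausible's schema.
    -- Replicated* engines keep replication state in ZooKeeper under the given
    -- path; the {shard} and {replica} macros are filled in by the operator.
    CREATE TABLE default.example ON CLUSTER 'cluster'
    (
        site_id   UInt64,
        timestamp DateTime
    )
    ENGINE = ReplicatedMergeTree('/clickhouse/tables/{shard}/example', '{replica}')
    ORDER BY (site_id, timestamp);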

  7. Deploy Plausible with two or more replicas:

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: plausible
      namespace: plausible
      annotations:
        reloader.stakater.com/auto: 'true'
      labels:
        app: plausible
    spec:
      replicas: 2
      selector:
        matchLabels:
          app: plausible
      template:
        metadata:
          namespace: plausible
          labels:
            app: plausible
        spec:
          initContainers:
            - name: migration
              image: ghcr.io/plausible/community-edition:v2.1.3
              command:
                - /bin/sh
                - -c
                - |
                  /entrypoint.sh db createdb && /entrypoint.sh db migrate
              envFrom:
                - secretRef:
                    name: plausible-env
                - configMapRef:
                    name: plausible-env
          containers:
            - name: plausible
              image: ghcr.io/plausible/community-edition:v2.1.3
              ports:
                - containerPort: 8000
              envFrom:
                - secretRef:
                    name: plausible-env
                - configMapRef:
                    name: plausible-env    
    

    Save this as depl.yaml and send it to K8s:

    kubectl create -f depl.yaml
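
    Note that the reloader.stakater.com/auto annotation only takes effect if Stakater Reloader is installed in the cluster; it restarts the pods whenever the referenced ConfigMap or Secret changes. Watch the rollout finish with:

    kubectl -n plausible rollout status deployment/plausible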
    
  8. Configure the Service and Ingress for Plausible:

    ---
    apiVersion: v1
    kind: Service
    metadata:
      name: plausible
      namespace: plausible
    spec:
      selector:
        app: plausible
      ports:
        - protocol: TCP
          port: 8000
          targetPort: 8000
    ---
    apiVersion: networking.k8s.io/v1
    kind: Ingress
    metadata:
      name: plausible
      namespace: plausible
      annotations:
        cert-manager.io/cluster-issuer: letsencrypt-cluster-issuer
    spec:
      tls:
        - hosts:
            - analytics.domain.com # Update this
          secretName: plausible-tls
      rules:
        - host: analytics.domain.com
          http:
            paths:
              - path: /
                pathType: Prefix
                backend:
                  service:
                    name: plausible
                    port:
                      number: 8000
    

    Save this as ingress-service.yaml and send it to K8s:

    kubectl create -f ingress-service.yaml
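
    Depending on your ingress controller, you may also need to set spec.ingressClassName (for example, nginx). Once applied, cert-manager creates a Certificate named after the TLS secret; check that it becomes ready:

    kubectl -n plausible get ingress plausible
    kubectl -n plausible get certificate plausible-tls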
    

Verify

Once everything is configured correctly, the pods should look like this:

kubectl -n plausible get pods
NAME                          READY   STATUS    RESTARTS      AGE
chi-plausible-cluster-0-0-0   1/1     Running   0             24h
chi-plausible-cluster-0-1-0   1/1     Running   0             24h
chi-plausible-cluster-0-2-0   1/1     Running   0             24h
plausible-6cc78d59bb-56wjj    1/1     Running   2 (24h ago)   24h
plausible-6cc78d59bb-kl6sf    1/1     Running   1 (24h ago)   24h
postgresql-7968d465bd-87gx4   1/1     Running   0             24h
postgresql-7968d465bd-hlrdj   1/1     Running   0             24h
postgresql-7968d465bd-wj4ln   1/1     Running   0             24h
postgresql-plausible-1        1/1     Running   0             24h
postgresql-plausible-2        1/1     Running   0             24h
postgresql-plausible-3        1/1     Running   0             24h
zookeeper-0                   1/1     Running   1 (22h ago)   24h
zookeeper-1                   1/1     Running   0             24h
zookeeper-2                   1/1     Running   0             24h

You can now access analytics.domain.com and create your Plausible account.

Conclusion

By following these steps, you now have a production-grade Plausible Analytics setup running on your Kubernetes cluster. I've been running my own Plausible instance in my cluster, and it has performed well without any downtime, even during Kubernetes node maintenance and restarts.
