Deploy high-availability Stalwart mail server for production

HA Stalwart

Introduction

Stalwart serves as an all-in-one, open-source mail solution, offering robust support for essential communication standards including SMTP, JMAP, IMAP4, POP3, and the full WebDAV suite (CalDAV/CardDAV). I've been running Stalwart in production for some time now, and I've learned a lot. In this post, I'll share my step-by-step method for deploying a fully redundant, high-availability (HA) Stalwart cluster.

Prerequisites

To get our high-availability cluster working, we'll need a few other services set up first. The great news is that all of these are completely free.

  1. Two server instances (2 CPUs and 12 GB RAM each) from the Oracle Cloud Always Free tier.
  2. A MySQL HeatWave database from the Oracle Cloud Always Free tier.
  3. A free, replicated, managed Redis instance from Upstash.
  4. A free CockroachDB (PostgreSQL-compatible) database from Cockroach Labs.
  5. One IPv4 and one IPv6 address for each server, with proper hostnames and reverse DNS (rDNS) records for all of them.
  6. Cloudflare DNS for automated certificate issuance and renewal.

Step-by-step

The following steps assume your domain is hostname.me, the first server’s hostname is mx1.hostname.me, the second server’s hostname is mx2.hostname.me, and the load balancer’s hostname is mx.hostname.me.

  1. Provision two free VM.Standard.A1.Flex instances on Oracle Cloud, distributing them across separate availability domains: Oracle Arm Instances

  2. Create a MySQL HeatWave database system using the Oracle Cloud Always Free tier: Oracle MySQL HeatWave

  3. Create a free, highly available CockroachDB cluster - fully PostgreSQL-compatible - by configuring three different regions. This will give you a cluster similar to the one shown in the image below: HA CockroachDB

  4. Create a free, replicated Redis instance using Upstash. You will need to start with a single region and subsequently add a read region to form the replicated cluster: Upstash Redis

  5. Create a free Object Storage bucket in Oracle and generate the necessary access credentials: Oracle Cloud Object Storage

  6. Configure the DNS for your server hostnames, then contact Oracle Support to request the creation of PTR records for your IP addresses:

    1st_server_ipv4 -> mx1.hostname.me
    1st_server_ipv6 -> mx1.hostname.me
    2nd_server_ipv4 -> mx2.hostname.me
    2nd_server_ipv6 -> mx2.hostname.me
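
Many receiving mail servers compare a connecting IP's PTR record with the forward records of the hostname it returns, so it is worth checking the mapping once the PTR records exist. The following is a minimal Python sketch, not part of the original setup, using only the standard library. The function names ptr_name and is_forward_confirmed are hypothetical, and the addresses in the usage note are documentation-range examples, not real server IPs.

```python
import ipaddress
import socket


def ptr_name(ip: str) -> str:
    """Return the DNS name under which the PTR record for `ip` is published."""
    return ipaddress.ip_address(ip).reverse_pointer


def is_forward_confirmed(ip: str, expected_host: str) -> bool:
    """Live check: the PTR of `ip` must be `expected_host`, and
    `expected_host` must resolve back to `ip`. Needs working DNS."""
    try:
        ptr_host, _, _ = socket.gethostbyaddr(ip)
    except OSError:
        return False  # no PTR record, or the lookup failed
    if ptr_host.rstrip(".").lower() != expected_host.lower():
        return False
    try:
        infos = socket.getaddrinfo(expected_host, None)
    except OSError:
        return False  # the hostname does not resolve
    resolved = {ipaddress.ip_address(info[4][0]) for info in infos}
    return ipaddress.ip_address(ip) in resolved
```

For example, ptr_name("192.0.2.10") gives the name to look up for an IPv4 address, and is_forward_confirmed("192.0.2.10", "mx1.hostname.me") would be run once per address of each server.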
    
  7. Create a Network Security Group to allow incoming traffic on the ports used by Stalwart: Stalwart Network Security Group

  8. Create a free, public Network Load Balancer. Configure the two previously created servers as the backend set, and assign the Stalwart Network Security Group: Stalwart Network Load Balancer

  9. Point mx.hostname.me to the load balancer IP:

    dig mx.hostname.me A +short
    145.241.xxx.yyy
    
  10. Create a Cloudflare API key for automated certificate issuance and renewal, and for updating TLSA records:

    CLOUDFLARE_API_KEY=xxxyyyzzz
    
  11. Install Docker on the servers, then create a Docker Compose file compose.yaml with the following content:

    networks:
      ip6net:
        enable_ipv6: true
        ipam:
          config:
            - subnet: 2001:db8::/64
    services:
      stalwart:
        image: stalwartlabs/stalwart:v0.14.1
        container_name: stalwart
        restart: unless-stopped
        networks:
          - ip6net  
        ports:
          - "443:443"
          - "8080:8080"
          - "25:25"
          - "587:587"
          - "465:465"
          - "143:143"
          - "993:993"
          - "4190:4190"
          - "110:110"
          - "995:995"
        volumes:
          - ./data:/opt/stalwart
    
  12. Start Stalwart on one server, then check the logs for the admin password:

    docker compose up -d
    docker compose logs stalwart
    
  13. Access the UI via the load balancer IP/hostname (open port 8080 temporarily):

    http://mx.hostname.me:8080
    
  14. Configure all the backend stores: Stalwart Stores

  15. Configure the data store, blob store, full-text store and in-memory store: Stalwart backend stores

  16. Configure the default ACME provider for your domain: set the subject names to hostname.me and *.hostname.me, select Cloudflare as the DNS provider, and enter the API key: Stalwart ACME provider

  17. Configure MySQL HeatWave as the tracing history store: Stalwart Tracing History

  18. Configure the MX patterns in the MTA-STS policy so that mail servers use the server hostnames (mx1.hostname.me and mx2.hostname.me) instead of the load balancer hostname: Stalwart MTA-STS Policy

  19. Configure the Connect stage to use each server's own hostname instead of the load balancer hostname: Stalwart Connect Stage

  20. Create the stalwart-tlsa-updater API key for automatic TLSA record updates: Stalwart API Key

  21. Select Redis as the cluster coordinator backend: Stalwart Redis Coordinator

  22. Your Stalwart configuration is now ready for the first server. You can now update the data/etc/config.toml file to add additional settings:

    cluster.node-id = 1  # 1st server id
    queue.connection.default.ehlo-hostname = "mx1.hostname.me" # -> 1st server hostname
    server.hostname = "mx.hostname.me" # -> load balancer hostname
    ...
    
  23. Copy the config.toml to the 2nd server with the following updates:

    cluster.node-id = 2 # 2nd server id
    queue.connection.default.ehlo-hostname = "mx2.hostname.me" # -> 2nd server hostname
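
Only the node ID and the EHLO hostname differ between the two servers, so the per-node lines can be generated rather than edited by hand. Below is a small Python sketch under that assumption. The helper name node_overrides is hypothetical; the three keys are the ones shown in the steps above, and everything else in config.toml is left untouched.

```python
def node_overrides(node_id: int, node_hostname: str, lb_hostname: str) -> str:
    """Build the per-node lines to append to data/etc/config.toml."""
    lines = [
        # Unique ID of this server within the cluster
        f"cluster.node-id = {node_id}",
        # Hostname this server announces in EHLO when sending mail
        f'queue.connection.default.ehlo-hostname = "{node_hostname}"',
        # Shared hostname of the load balancer, identical on every node
        f'server.hostname = "{lb_hostname}"',
    ]
    return "\n".join(lines) + "\n"
```

Calling node_overrides(2, "mx2.hostname.me", "mx.hostname.me") returns the block for the second server.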
    
  24. Start Stalwart on the 2nd server:

    docker compose up -d
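
With both servers running, a quick reachability test can confirm that the published ports answer through the load balancer and on each server. This is a minimal Python sketch and not part of the original setup. The names MAIL_PORTS, port_open and closed_ports are hypothetical, the port list is taken from the compose.yaml above (without the temporary admin port 8080), and the 3-second timeout is an arbitrary choice.

```python
import socket

# Ports published in compose.yaml, minus the temporary admin port 8080.
MAIL_PORTS = [25, 465, 587, 143, 993, 110, 995, 4190, 443]


def port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout`."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False  # refused, timed out, or the name did not resolve


def closed_ports(host: str, ports=MAIL_PORTS, timeout: float = 3.0) -> list:
    """Return the ports from `ports` that do not accept connections on `host`."""
    return [p for p in ports if not port_open(host, p, timeout)]
```

Running closed_ports("mx.hostname.me") and then the same for mx1.hostname.me and mx2.hostname.me should return an empty list for each host if the Network Security Group and the load balancer listeners are in place. Note that some networks block outbound port 25, so run it from a machine where that port is allowed.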
    
  25. Run the Stalwart TLSA Updater by adding a new service to the Docker Compose file:

    services:
      tlsa:
        image: ghcr.io/harrytang/stalwart-tlsa-updater:v0.1.2
        container_name: tlsa
        restart: unless-stopped
        environment:
          - STALWART_API_KEY=YOUR_API_KEY_ABOVE
          - STALWART_API_URL=https://mx.hostname.me/api # load balancer hostname
          - STALWART_DOMAIN=hostname.me # main domain
          - STALWART_HOSTNAMES=mx1.hostname.me,mx2.hostname.me
          - CLOUDFLARE_API_URL=https://api.cloudflare.com/client/v4
          - CLOUDFLARE_API_KEY=YOUR_CF_API_KEY
          - API_KEY=009c9e24-3412-4aa4-a18d-4ad2c57870e8 # tlsa-updater auth api key
          - REDIS_URL=YOUR_UPSTASH_REDIS_URL
        networks:
          - ip6net
        ports:
          - "3000:3000"
    
  26. You can now begin adding domains to the Stalwart system. Start by adding hostname.me as the primary domain, then add any additional domains you need. You can then manage the domains and view their DNS records. Copy these records and add them to your domain's DNS settings. The only change needed is to the MX record: the default one points to mx.hostname.me, so delete it and add two MX records pointing to mx1.hostname.me and mx2.hostname.me instead: Stalwart Domains
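
To make the MX change concrete, the sketch below prints the two replacement records in zone-file notation. It is an illustration only: the helper name mx_records is hypothetical, and the preference of 10 and TTL of 3600 seconds are assumed values, not ones taken from Stalwart's generated records. Giving both hosts the same preference lets sending servers spread deliveries across them.

```python
def mx_records(domain: str, mail_hosts: list, preference: int = 10, ttl: int = 3600) -> list:
    """Return one zone-file style MX line per mail host, all with equal preference."""
    return [f"{domain}. {ttl} IN MX {preference} {host}." for host in mail_hosts]


for line in mx_records("hostname.me", ["mx1.hostname.me", "mx2.hostname.me"]):
    print(line)
```

The same two targets apply to every additional domain you add, with only the domain name on the left changing.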

  27. Create a webhook so that when the certificate is renewed, Stalwart notifies the Stalwart TLSA Updater, which will update the TLSA records via the Cloudflare DNS API: Stalwart Webhook
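
For context on why the TLSA records need refreshing after a renewal: a TLSA record can carry a hash of the certificate, so a new certificate means a new record value. The Python sketch below shows how such a value can be computed with the standard library. It is not the Stalwart TLSA Updater's code, and the post does not state which TLSA parameters that tool publishes. As an assumption, the sketch uses "3 0 1" (DANE-EE, full certificate, SHA-256), and the function names are hypothetical.

```python
import hashlib
import ssl


def tlsa_rdata(cert_der: bytes) -> str:
    """TLSA record data for usage 3, selector 0, matching type 1:
    the SHA-256 digest of the whole DER-encoded certificate."""
    return f"3 0 1 {hashlib.sha256(cert_der).hexdigest()}"


def tlsa_rdata_from_pem(cert_pem: str) -> str:
    """Same as tlsa_rdata, starting from a single PEM-encoded certificate."""
    return tlsa_rdata(ssl.PEM_cert_to_DER_cert(cert_pem))


def tlsa_owner(host: str, port: int = 25, proto: str = "tcp") -> str:
    """DNS name under which the TLSA record for host:port is published."""
    return f"_{port}._{proto}.{host}."
```

For SMTP on the two servers, the owner names would be tlsa_owner("mx1.hostname.me") and tlsa_owner("mx2.hostname.me"), each paired with the record data of the currently served certificate.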

  28. Everything is set - you're ready to send emails: Stalwart mail-tester

Conclusion

You now have a fully functional, highly available (replicated) Stalwart mail system running entirely free of charge. I’ve been using this setup for my own email and for clients, and it has been running smoothly. Stalwart v1.0 is coming soon, so stay tuned.

References