Nomad cluster with Consul service mesh

by Afanasy Barbarov

Complete Guide: Production Nomad Cluster with Consul Service Mesh

A comprehensive, step-by-step guide to building a production-ready Nomad cluster with Consul service mesh, Vault secrets management, and persistent storage on Hetzner Cloud.

Prerequisites

  • Hetzner Cloud account with API token
  • Cloudflare account for tunnel access
  • Local machine with Terraform installed
  • SSH key for server access

Architecture Overview

                    ┌─────────────┐
                    │   Bastion   │
                    │ (10.0.1.2)  │ ◄── Internet
                    │ NAT Gateway │
                    └─────────────┘

         ┌──────────────────┼──────────────────┐
         │                  │                  │
┌─────────────┐    ┌─────────────┐    ┌─────────────┐
│   Node 1    │    │   Node 2    │    │   Node 3    │
│ (10.0.1.3)  │◄──►│ (10.0.1.4)  │◄──►│ (10.0.1.5)  │
│  Consul     │    │  Consul     │    │  Consul     │
│  Vault      │    │  Vault      │    │  Vault      │
│  Nomad      │    │  Nomad      │    │  Nomad      │
└─────────────┘    └─────────────┘    └─────────────┘
       │                  │                  │
       └──────────────────┼──────────────────┘

           ┌──────────────────────────────────┐
           │                                  │
           ▼                                  ▼
    ┌─────────────┐                  ┌─────────────┐
    │  Services   │                  │  Storage    │
    │             │                  │             │
    │ • Traefik   │                  │ • PostgreSQL│
    │ • Web Apps  │◄────────────────►│ • MinIO     │
    │ • APIs      │  Service Mesh    │ • CSI Vols  │
    │             │   (Connect)      │             │
    └─────────────┘                  └─────────────┘

Step 1: Infrastructure Deployment

1.1 Terraform Configuration

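Create main.tf. The excerpt below covers the network, load balancer, firewall, and SSH key; the server resources and the outputs read in step 1.2 are not shown: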

# Network and Load Balancer
resource "hcloud_network" "private_network" {
  name     = "private-network"
  ip_range = "10.0.0.0/16"
}

resource "hcloud_network_subnet" "private_subnet" {
  network_id   = hcloud_network.private_network.id
  type         = "cloud"
  network_zone = "eu-central"
  ip_range     = "10.0.1.0/24"
}

resource "hcloud_load_balancer" "lb" {
  name               = "load-balancer"
  load_balancer_type = "lb11"
  location           = "nbg1"
}

# Firewall rules
resource "hcloud_firewall" "firewall" {
  name = "cluster-firewall"
  
  rule {
    direction  = "in"
    protocol   = "tcp"
    port       = "22"
    source_ips = ["0.0.0.0/0"]
  }
  
  rule {
    direction  = "in"
    protocol   = "tcp"
    port       = "any"
    source_ips = [hcloud_network_subnet.private_subnet.ip_range]
  }
}

# SSH Key
resource "tls_private_key" "ssh_key" {
  algorithm = "RSA"
  rsa_bits  = 4096
}

resource "hcloud_ssh_key" "default" {
  name       = "cluster-ssh-key"
  public_key = tls_private_key.ssh_key.public_key_openssh
}

resource "local_file" "private_key" {
  content         = tls_private_key.ssh_key.private_key_pem
  filename        = "cluster-ssh-key.pem"
  file_permission = "0600"
}
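
Step 1.2 reads a `bastion_ip` output with an `.ipv4` field, which the excerpt above does not define. A minimal sketch of what the bastion server and that output might look like follows. The resource name `bastion`, the server type, and the image are assumptions, not values from the original configuration.

```hcl
# Hypothetical bastion server; server_type and image are placeholders.
resource "hcloud_server" "bastion" {
  name         = "bastion"
  server_type  = "cx22"         # assumption: pick a type available to your project
  image        = "ubuntu-24.04" # assumption
  location     = "nbg1"
  ssh_keys     = [hcloud_ssh_key.default.id]
  firewall_ids = [hcloud_firewall.firewall.id]

  network {
    network_id = hcloud_network.private_network.id
    ip         = "10.0.1.2" # matches the bastion address in the diagram
  }

  # The subnet must exist before a server can attach to the network.
  depends_on = [hcloud_network_subnet.private_subnet]
}

# Shaped as an object so that `jq -r '.ipv4'` in step 1.2 returns the address.
output "bastion_ip" {
  value = {
    ipv4 = hcloud_server.bastion.ipv4_address
  }
}
```

The three cluster nodes would follow the same pattern with addresses 10.0.1.3 to 10.0.1.5.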

1.2 Deploy Infrastructure

terraform init
terraform plan
terraform apply

Save the bastion IP from outputs:

terraform output -json bastion_ip | jq -r '.ipv4'

Step 2: Initial Server Setup

2.1 SSH Connection Setup

eval $(ssh-agent)
ssh-add ./cluster-ssh-key.pem
ssh -i ./cluster-ssh-key.pem -A root@<BASTION_IP>

# From bastion, access cluster nodes
ssh root@10.0.1.3
ssh root@10.0.1.4
ssh root@10.0.1.5

2.2 Network Verification

On each node, verify connectivity:

ping -c 4 8.8.8.8
ping -c 4 google.com

Step 3: Consul Cluster Setup

3.1 Generate Gossip Key

On any node, generate the encryption key:

consul keygen
# Example output: HpEfIFftGyPG1gdjpJ185/J2V2R/HnSkruG1tWbQYOE=

3.2 Configure Consul

On each node, edit /etc/consul.d/consul.hcl:

datacenter = "eu-central"
data_dir = "/opt/consul"
encrypt = "HpEfIFftGyPG1gdjpJ185/J2V2R/HnSkruG1tWbQYOE="  # Replace with your own key from consul keygen
retry_join = ["10.0.1.3", "10.0.1.4", "10.0.1.5"]
bind_addr = "0.0.0.0"
client_addr = "0.0.0.0"
advertise_addr = "10.0.1.X"  # Node-specific IP
bootstrap_expect = 3
server = true

acl {
  enabled        = true
  default_policy = "allow"
  enable_token_persistence = true
}

ports {
  grpc = 8502
}

ui_config {
  enabled = true
}

connect {
  enabled = true
}

3.3 Start Consul

On each node:

systemctl daemon-reload
systemctl enable consul
systemctl start consul
systemctl status consul
consul members

Step 4: Vault Cluster Setup

4.1 Configure Vault

On each node, edit /etc/vault.d/vault.hcl:

ui = true
disable_mlock = true

storage "raft" {
  path    = "/opt/vault/data"
  node_id = "vault-X"  # 1, 2, or 3
  
  retry_join {
    leader_api_addr = "http://10.0.1.3:8200"
  }
  retry_join {
    leader_api_addr = "http://10.0.1.4:8200"
  }
  retry_join {
    leader_api_addr = "http://10.0.1.5:8200"
  }
}

listener "tcp" {
  address     = "0.0.0.0:8200"
  cluster_address = "0.0.0.0:8201"
  tls_disable = 1
}

api_addr     = "http://10.0.1.X:8200"  # Node-specific IP
cluster_addr = "http://10.0.1.X:8201"

4.2 Initialize Vault

On node 1:

systemctl start vault
export VAULT_ADDR="http://127.0.0.1:8200"
vault operator init -key-shares=3 -key-threshold=2

# Save the unseal keys and root token!
vault operator unseal  # Use key 1
vault operator unseal  # Use key 2
export VAULT_TOKEN="hvs.EXAMPLE_ROOT_TOKEN"

On nodes 2 and 3:

systemctl start vault
export VAULT_ADDR="http://127.0.0.1:8200"
vault operator unseal  # Use key 1
vault operator unseal  # Use key 2

Step 5: Nomad Configuration

5.1 Configure Nomad

On each node, edit /etc/nomad.d/nomad.hcl:

datacenter = "eu-central"
data_dir   = "/opt/nomad/data"
bind_addr  = "0.0.0.0"

server {
  enabled          = true
  bootstrap_expect = 3
  server_join {
    retry_join = ["10.0.1.3", "10.0.1.4", "10.0.1.5"]
  }
}

advertise {
  http = "10.0.1.X:4646"  # Node-specific IP
  rpc  = "10.0.1.X:4647"
  serf = "10.0.1.X:4648"
}

client {
  enabled           = true
  network_interface = "enp7s0"
  servers           = ["10.0.1.3", "10.0.1.4", "10.0.1.5"]
  
  cni_path = "/opt/cni/bin"
  cni_config_dir = "/etc/cni/net.d"
  
  meta {
    role     = "server"
    location = "nbg1"  # fsn1 for node 2, hel1 for node 3
    node_id  = "X"     # 1, 2, or 3
  }
}

consul {
  address             = "127.0.0.1:8500"
  grpc_address        = "127.0.0.1:8502"
  server_service_name = "nomad-server"
  client_service_name = "nomad-client"
  auto_advertise      = true
  server_auto_join    = true
  client_auto_join    = true
}

plugin "docker" {
  config {
    allow_privileged = true
    volumes {
      enabled = true
    }
  }
}

plugin "csi" {
  enabled = true
}

5.2 Start Nomad

systemctl enable nomad
systemctl start nomad
nomad server members
nomad node status

Step 6: Cloudflare Tunnel Setup

6.1 Create Tunnel Configuration

Create /opt/nomad/jobs/cloudflare-tunnel.yml:

ingress:
  - hostname: nomad.example.com
    service: http://10.0.1.3:4646
  - hostname: consul.example.com
    service: http://10.0.1.3:8500
  - hostname: vault.example.com
    service: http://10.0.1.3:8200
  - hostname: traefik.example.com
    service: http://10.0.1.3:8080
  - service: http_status:404

6.2 Deploy Tunnel Job

Create cf-tunnel.nomad.hcl:

job "cloudflare-tunnel" {
  datacenters = ["eu-central"]
  type        = "service"

  group "tunnel" {
    task "cloudflared" {
      driver = "docker"
      
      config {
        image = "cloudflare/cloudflared:latest"
        args = [
          "tunnel", 
          "--config", "/local/tunnel.yml",
          "run", "--token", "CLOUDFLARE_TUNNEL_TOKEN"
        ]
      }
      
      template {
        destination = "local/tunnel.yml"
        data = file("cloudflare-tunnel.yml")
      }
    }
  }
}

Deploy the tunnel:

nomad job run cf-tunnel.nomad.hcl

Step 7: Vault Policies and Secrets

7.1 Create Vault Policies

Create nomad-server-policy.hcl:

path "auth/token/create/nomad-cluster" {
  capabilities = ["update"]
}

path "auth/token/roles/nomad-cluster" {
  capabilities = ["read"]
}

path "secret/data/*" {
  capabilities = ["read"]
}

Apply policies:

vault policy write nomad-server nomad-server-policy.hcl
vault policy write ssl-access ssl-access-policy.hcl
vault policy write pg-access pg-access-policy.hcl
vault policy write minio-access minio-access-policy.hcl
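
The three access policies applied above are not shown in this guide. A minimal sketch of what they might contain, assuming each one only needs read access to the matching KV v2 paths written in step 7.3. Only the file names and secret paths come from the guide; the policy bodies are an assumption.

```hcl
# pg-access-policy.hcl
path "secret/data/pg/*" {
  capabilities = ["read"]
}

# ssl-access-policy.hcl
path "secret/data/ssl/*" {
  capabilities = ["read"]
}

# minio-access-policy.hcl
path "secret/data/minio/*" {
  capabilities = ["read"]
}
```

The `data/` segment is part of the API path for a KV v2 mount, even though `vault kv put` hides it on the command line.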

7.2 Create Nomad Token Role

Create nomad-cluster-role.json:

{
  "disallowed_policies": "nomad-server",
  "token_explicit_max_ttl": 0,
  "name": "nomad-cluster",
  "orphan": true,
  "token_period": 2592000,
  "renewable": true
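}

Write the role, then create the token that the Nomad servers will use. The server token needs the nomad-server policy, which the role itself disallows for task tokens:

vault write /auth/token/roles/nomad-cluster @nomad-cluster-role.json
vault token create -policy=nomad-server -period=730h -orphan

Use the token from the output as the token value in step 7.4.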

7.3 Store Secrets

vault secrets enable -path=secret kv-v2

# Database passwords (use strong passwords)
vault kv put secret/pg/primary username="postgres" password="STRONG_PASSWORD_1"
vault kv put secret/pg/replica1 username="postgres" password="STRONG_PASSWORD_2" 
vault kv put secret/pg/replica2 username="postgres" password="STRONG_PASSWORD_3"

# MinIO credentials
vault kv put secret/minio/keys access_key="MINIO_ACCESS_KEY" secret_key="MINIO_SECRET_KEY"

# SSL certificates (use your actual certificate and key files; the field names must match the Traefik templates)
vault kv put secret/ssl/example.com cert=@<CERT_FILE> key=@<KEY_FILE>
vault kv put secret/ssl/example2.com cert=@<CERT_FILE_2> key=@<KEY_FILE_2>

7.4 Update Nomad Configuration

Add to each node's /etc/nomad.d/nomad.hcl:

vault {
  enabled          = true
  address          = "http://127.0.0.1:8200"
  tls_skip_verify  = true
  task_token_ttl   = "1h"
  create_from_role = "nomad-cluster"
  token            = "NOMAD_VAULT_TOKEN"
}

Then restart Nomad on each node:

systemctl restart nomad

Step 8: Storage Infrastructure

8.1 Deploy CSI Controller

Create hcloud-csi-controller.hcl:

job "hcloud-csi-controller" {
  datacenters = ["eu-central"]
  type        = "system"

  group "controller" {
    task "hcloud-csi-controller" {
      driver = "docker"
      
      config {
        image = "hetznercloud/hcloud-csi-driver:latest"
        args = [
          "--endpoint=unix:///csi/csi.sock",
          "--metrics-addr=0.0.0.0:9189"
        ]
      }

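      env {
        HCLOUD_TOKEN = "HETZNER_API_TOKEN"
      }

      # Register this task with Nomad as the CSI controller plugin.
      # The id must match plugin_id in the volume definitions.
      csi_plugin {
        id        = "csi.hetzner.cloud"
        type      = "controller"
        mount_dir = "/csi"
      }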
    }
  }
}

8.2 Deploy CSI Node Plugin

Create hcloud-csi-node.hcl:

job "hcloud-csi-node" {
  datacenters = ["eu-central"]
  type        = "system"

  group "node" {
    task "hcloud-csi-node" {
      driver = "docker"
      
      config {
        image = "hetznercloud/hcloud-csi-driver:latest"
        privileged = true
        args = [
          "--endpoint=unix:///csi/csi.sock",
          "--metrics-addr=0.0.0.0:9189"
        ]
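      }

      # Register this task with Nomad as the CSI node plugin.
      # The id must match plugin_id in the volume definitions.
      csi_plugin {
        id        = "csi.hetzner.cloud"
        type      = "node"
        mount_dir = "/csi"
      }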
    }
  }
}

Deploy CSI:

nomad job run hcloud-csi-controller.hcl
nomad job run hcloud-csi-node.hcl

8.3 Create Volumes

Create volume definitions:

db-volume-1.hcl:

type         = "csi"
id           = "db-volume-1"
name         = "db-volume-1"
external_id  = "db-volume-1"
plugin_id    = "csi.hetzner.cloud"
capacity_min = "10G"
capacity_max = "20G"

capability {
  access_mode     = "single-node-writer"
  attachment_mode = "file-system"
}

Create volumes:

nomad volume create db-volume-1.hcl
nomad volume create db-volume-2.hcl
nomad volume create db-volume-3.hcl
nomad volume create storage-volume-1.hcl
nomad volume create storage-volume-2.hcl
nomad volume create storage-volume-3.hcl
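
Only db-volume-1.hcl is shown above. The other five files presumably follow the same shape with a different id and name. As a sketch, storage-volume-1.hcl (the volume the MinIO job mounts in step 10) could look like this; the sizes are assumptions copied from the database volume.

```hcl
# storage-volume-1.hcl (sketch; capacity values are assumptions)
type         = "csi"
id           = "storage-volume-1"
name         = "storage-volume-1"
plugin_id    = "csi.hetzner.cloud"
capacity_min = "10G"
capacity_max = "20G"

# external_id is left out here on the assumption that the storage
# provider assigns it when the volume is created.

capability {
  access_mode     = "single-node-writer"
  attachment_mode = "file-system"
}
```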

Step 9: Database Deployment

9.1 PostgreSQL Primary

Create postgres-primary.hcl:

job "postgresql-primary" {
  datacenters = ["eu-central"]
  type        = "service"

  group "postgresql" {
    count = 1

    constraint {
      attribute = "${meta.node_id}"
      value     = "1"
    }

    volume "pgdata" {
      type            = "csi"
      source          = "db-volume-1"
      attachment_mode = "file-system"
      access_mode     = "single-node-writer"
    }

    network {
      mode = "host"
      port "db" {
        static = 5432
      }
    }

    service {
      name = "postgresql-primary"
      port = "db"

      check {
        type     = "tcp"
        interval = "30s"
        timeout  = "5s"
      }
    }

    task "postgres" {
      driver = "docker"

      vault {
        policies = ["pg-access"]
      }

      config {
        image = "postgres:14"
        network_mode = "host"
        volumes = [
          "local/postgresql.conf:/etc/postgresql/postgresql.conf",
          "local/pg_hba.conf:/etc/postgresql/pg_hba.conf"
        ]
      }

      template {
        destination = "secrets/env"
        env         = true
        data = <<EOF
POSTGRES_PASSWORD={{ with secret "secret/pg/primary" }}{{ .Data.data.password }}{{ end }}
POSTGRES_DB=appsdb1
EOF
      }

      volume_mount {
        volume      = "pgdata"
        destination = "/var/lib/postgresql/data"
      }

      resources {
        cpu    = 500
        memory = 512
      }
    }
  }
}
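
The job above bind-mounts `local/postgresql.conf` and `local/pg_hba.conf`, but no template renders them. One way to close that gap is sketched below as two blocks to place inside the `postgres` task. The settings are illustrative assumptions (listen on all interfaces, password authentication from the private subnet), not the author's configuration.

```hcl
# Inside task "postgres": render the two files that the config block mounts.
template {
  destination = "local/postgresql.conf"
  data        = <<EOF
listen_addresses = '*'
hba_file = '/etc/postgresql/pg_hba.conf'
EOF
}

template {
  destination = "local/pg_hba.conf"
  data        = <<EOF
# TYPE  DATABASE  USER  ADDRESS       METHOD
local   all       all                 trust
host    all       all   127.0.0.1/32  scram-sha-256
host    all       all   10.0.1.0/24   scram-sha-256
EOF
}
```

PostgreSQL only reads the mounted file if it is started with `-c config_file=/etc/postgresql/postgresql.conf`, which can be passed through `args` in the task's `config` block.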

9.2 Deploy Database

nomad job run postgres-primary.hcl
nomad job run postgres-replica-1.hcl
nomad job run postgres-replica-2.hcl

Step 10: Object Storage

10.1 MinIO Deployment

Create minio.hcl:

job "minio" {
  datacenters = ["eu-central"]
  type        = "service"

  group "minio-1" {
    constraint {
      attribute = "${meta.node_id}"
      value     = "1"
    }

    volume "minio_data" {
      type            = "csi"
      source          = "storage-volume-1"
      attachment_mode = "file-system"
      access_mode     = "single-node-writer"
    }

    network {
      mode = "host"
      port "minio" {
        static = 9000
      }
      port "console" {
        static = 9001
      }
    }

    task "minio" {
      driver = "docker"

      vault {
        policies = ["minio-access"]
      }

      config {
        image = "minio/minio:latest"
        network_mode = "host"
        args = [
          "server",
          "--address", "0.0.0.0:9000",
          "--console-address", "0.0.0.0:9001",
          "/data"
        ]
      }

      template {
        destination = "secrets/env"
        env         = true
        data = <<EOF
MINIO_ROOT_USER={{ with secret "secret/minio/keys" }}{{ .Data.data.access_key }}{{ end }}
MINIO_ROOT_PASSWORD={{ with secret "secret/minio/keys" }}{{ .Data.data.secret_key }}{{ end }}
EOF
      }

      volume_mount {
        volume      = "minio_data"
        destination = "/data"
      }
    }
  }
}

Deploy MinIO:

nomad job run minio.hcl

Step 11: Web Stack Deployment

11.1 Traefik Load Balancer

Create traefik.nomad.hcl:

job "traefik" {
  datacenters = ["eu-central"]
  type        = "system"

  group "traefik" {
    network {
      port "dashboard" {
        static = 8080
      }
      port "http" {
        static = 80
      }
      port "https" {
        static = 443
      }
    }

    task "traefik" {
      driver = "docker"

      vault {
        policies = ["ssl-access"]
      }

      config {
        image = "traefik:v3.1.6"
        network_mode = "host"
        args = [
          "--api.dashboard=true",
          "--api.insecure=true",
          "--entryPoints.http.address=:80",
          "--entryPoints.https.address=:443",
          "--providers.consulcatalog=true",
          "--providers.nomad=true",
          "--providers.nomad.endpoint.address=http://127.0.0.1:4646",
          "--providers.file.filename=/local/traefik.toml"
        ]
      }

      template {
        destination = "local/traefik.toml"
        data = <<EOF
[tls]
  [[tls.certificates]]
    certFile = "/local/ssl/example.com/cert.pem"
    keyFile = "/local/ssl/example.com/key.pem"

  [[tls.certificates]]
    certFile = "/local/ssl/example2.com/cert.pem"
    keyFile = "/local/ssl/example2.com/key.pem"
EOF
      }

      template {
        data = <<EOF
{{ with secret "secret/ssl/example.com" }}{{ .Data.data.cert }}{{ end }}
EOF
        destination = "local/ssl/example.com/cert.pem"
      }

      template {
        data = <<EOF
{{ with secret "secret/ssl/example.com" }}{{ .Data.data.key }}{{ end }}
EOF
        destination = "local/ssl/example.com/key.pem"
      }
    }
  }
}
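
The `traefik.toml` template above references certificate files for example2.com, but only the example.com files are rendered. A sketch of the two missing template blocks follows, to be placed inside the `traefik` task. They mirror the example.com blocks and read the `secret/ssl/example2.com` secret stored in step 7.3.

```hcl
# Inside task "traefik": certificate and key for the second domain.
template {
  data = <<EOF
{{ with secret "secret/ssl/example2.com" }}{{ .Data.data.cert }}{{ end }}
EOF
  destination = "local/ssl/example2.com/cert.pem"
}

template {
  data = <<EOF
{{ with secret "secret/ssl/example2.com" }}{{ .Data.data.key }}{{ end }}
EOF
  destination = "local/ssl/example2.com/key.pem"
}
```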

11.2 Main Website

Create web-main.hcl:

job "web-example-com" {
  datacenters = ["eu-central"]
  type        = "service"

  group "main" {
    count = 1

    network {
      mode = "bridge"
      port "http" {
        to = 80
      }
    }

    service {
      port = "http"
      name = "main"
      provider = "nomad"
      tags = [
        "traefik.enable=true",
        "traefik.http.routers.main.rule=Host(`example.com`)",
        "traefik.http.routers.main.entrypoints=https",
        "traefik.http.routers.main.service=main",
        "traefik.http.routers.main.tls=true",
      ]
      check {
        type     = "http"
        path     = "/"
        interval = "10s"
        timeout  = "2s"
      }
    }

    task "main" {
      driver = "docker"

      config {
        image = "nginx:latest"
        auth_soft_fail = true
        ports = ["http"]
        volumes = [
          "local/nginx.conf:/etc/nginx/nginx.conf",
          "local/index.html:/usr/share/nginx/html/index.html"
        ]
      }

      template {
        destination = "local/nginx.conf"
        data = <<EOF
events {
    worker_connections 1024;
}
http {
    server {
        listen 80;
        root /usr/share/nginx/html;
        index index.html;
        location / {
            try_files $uri $uri/ =404;
        }
    }
}
EOF
      }

      template {
        destination = "local/index.html"
        data = <<EOF
<!DOCTYPE html>
<html>
<head>
    <title>Main Site</title>
</head>
<body>
    <h1>Welcome to Example.com</h1>
</body>
</html>
EOF
      }

      resources {
        cpu    = 100
        memory = 128
      }
    }
  }
}

11.3 Blog Application

Create web-blog.hcl:

job "web-blog-example-com" {
  datacenters = ["eu-central"]
  type        = "service"

  group "blog" {
    count = 1

    network {
      mode = "bridge"
      port "http" {
        to = 5000
      }
    }

    service {
      port = "http"
      name = "blog"
      provider = "nomad"
      tags = [
        "traefik.enable=true",
        "traefik.http.routers.blog.rule=Host(`blog.example.com`)",
        "traefik.http.routers.blog.entrypoints=https",
        "traefik.http.routers.blog.service=blog",
        "traefik.http.routers.blog.tls=true",
      ]

      check {
        name     = "HTTP Check"
        type     = "http"
        path     = "/"
        interval = "10s"
        timeout  = "2s"
      }
    }

    task "blog" {
      driver = "docker"

      vault {
        policies = ["pg-access"]
      }

      config {
        image = "python:3.9-slim"
        auth_soft_fail = true
        ports = ["http"]
        entrypoint = ["/bin/sh", "-c"]
        args = [
          <<-EOF
            pip install flask psycopg2-binary && \
            python -c "
            import os
            from flask import Flask
            import psycopg2

            app = Flask(__name__)

            @app.route('/')
            def hello():
                conn = psycopg2.connect(
                    dbname='appsdb1',
                    user='postgres',
                    password=os.environ['NOMAD_POSTGRES_PASSWORD'],
                    host=os.environ['DB_HOST']
                )
                cur = conn.cursor()
                cur.execute('SELECT version();')
                version = cur.fetchone()
                conn.close()
                return f'Blog powered by PostgreSQL: {version[0]}'

            if __name__ == '__main__':
                app.run(host='0.0.0.0', port=5000)"
          EOF
        ]
      }

      template {
        destination = "secrets/env"
        env         = true
        data = <<-EOF
          NOMAD_POSTGRES_PASSWORD={{ with secret "secret/pg/primary" }}{{ .Data.data.password }}{{ end }}
          DB_HOST=10.0.1.3
        EOF
      }

      resources {
        cpu    = 100
        memory = 128
      }
    }
  }
}

11.4 Bot Application

Create web-bot.hcl:

job "web-bot-example2-ai" {
  datacenters = ["eu-central"]
  type        = "service"

  group "bot" {
    count = 1

    constraint {
      attribute = "${meta.node_id}"
      value     = "1"
    }

    volume "bot_data" {
      type      = "host"
      source    = "bot_data"
      read_only = false
    }

    network {
      mode = "bridge"
      port "http" {
        to = 5000
      }
    }

    service {
      port = "http"
      name = "bot"
      provider = "nomad"
      tags = [
        "traefik.enable=true",
        "traefik.http.routers.bot.rule=Host(`bot.example2.ai`)",
        "traefik.http.routers.bot.entrypoints=https",
        "traefik.http.routers.bot.service=bot",
        "traefik.http.routers.bot.tls=true",
      ]
    }

    task "bot" {
      driver = "docker"

      vault {
        policies = ["pg-access"]
      }

      config {
        image = "rust:alpine"
        auth_soft_fail = true
        ports = ["http"]
        entrypoint = ["/bin/sh", "-c"]
        args = [
          <<-EOF
            apk add --no-cache gcc musl-dev && \
            echo 'fn main() { println!("Bot service running"); }' > main.rs && \
            rustc main.rs -o bot && \
            ./bot
          EOF
        ]
      }

      template {
        destination = "local/env"
        env = true
        data = <<EOF
RUST_LOG=info
UPLOADS_DIR=/app/data/uploads
DB_PATH=/app/data/db.sqlite
BASE_URL=https://bot.example2.ai
BASE_PORT=5000
BASE_ENV="prod"
PG_URL="10.0.1.3:5432"
PG_DB="bot_prod"
PG_USER="bot_prod_user"
PG_PASSWORD="{{ with nomadVar "nomad/jobs/web-bot-example2-ai" }}{{ .pg_password }}{{ end }}"
OPENAI_API_KEY="{{ with nomadVar "nomad/jobs/web-bot-example2-ai" }}{{ .openai_api_key }}{{ end }}"
TELEGRAM_TOKEN="{{ with nomadVar "nomad/jobs/web-bot-example2-ai" }}{{ .telegram_token }}{{ end }}"
EOF
      }

      resources {
        cpu    = 250
        memory = 256
      }

      volume_mount {
        volume      = "bot_data"
        destination = "/app/data"
        read_only   = false
      }

      logs {
        max_files     = 3
        max_file_size = 50
      }
    }
  }
}
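
The `bot_data` volume above has type `host`, so Nomad can only place the job on a client that declares a host volume with that name. The client configuration in step 5.1 does not show one. A sketch of the block that would go into `/etc/nomad.d/nomad.hcl` on node 1 follows; the directory path is an assumption, and the directory has to exist before Nomad is restarted.

```hcl
# Inside the existing client { ... } block on node 1:
host_volume "bot_data" {
  path      = "/opt/nomad/volumes/bot_data" # hypothetical path; create it first
  read_only = false
}
```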

Step 12: Development Environment with Service Mesh

12.1 Dev Backend Service

Create dev-backend.hcl:

job "web-dev-example-com" {
  datacenters = ["eu-central"]
  type        = "service"

  group "dev-example-com-backend" {
    network {
      mode = "bridge"
    }

    service {
      name = "dev-example-com-backend"
      port = "9001"

      connect {
        sidecar_service {}

        sidecar_task {
          driver = "docker"

          config {
            image           = "envoyproxy/envoy:v1.31-latest"
            auth_soft_fail  = true
          }

          resources {
            cpu    = 133
            memory = 63
          }
        }
      }
    }

    task "dev-example-com-backend" {
      driver = "docker"

      config {
        image = "hashicorpdev/counter-api:v3-arm64"
        auth_soft_fail  = true
      }

      resources {
        cpu    = 100
        memory = 32
      }
    }
  }

  group "dev-example-com-frontend" {
    network {
      mode = "bridge"

      port "http" {
        to = 9002
      }
    }

    service {
      name = "dev-example-com-frontend"
      port = "9002"

      connect {
        sidecar_service {
          proxy {
            upstreams {
              destination_name = "dev-example-com-backend"
              local_bind_port  = 8080
            }
          }
        }

        sidecar_task {
          driver = "docker"

          config {
            image           = "envoyproxy/envoy:v1.31-latest"
            auth_soft_fail  = true
          }

          resources {
            cpu    = 133
            memory = 63
          }
        }
      }
    }

    task "dev-example-com-frontend" {
      driver = "docker"

      env {
        COUNTING_SERVICE_URL = "http://${NOMAD_UPSTREAM_ADDR_dev_example_com_backend}"
      }

      config {
        image = "hashicorpdev/counter-dashboard:v3-arm64"
        auth_soft_fail  = true
      }

      resources {
        cpu    = 100
        memory = 32
      }
    }
  }
}

12.2 Dev Ingress Gateway

Create dev-ingress.hcl:

job "ingress-dev-example-com" {
  datacenters = ["eu-central"]
  type        = "service"

  group "ingress-dev-example-com" {
    network {
      mode = "bridge"

      port "frontend" {
        to = 9002
      }
    }

    service {
      name = "ingress-dev-example-com"
      port = "frontend"

      tags = [
        "traefik.enable=true",
        "traefik.http.routers.dev-example.rule=Host(`dev.example.com`)",
        "traefik.http.routers.dev-example.entrypoints=https",
        "traefik.http.routers.dev-example.service=ingress-dev-example-com",
        "traefik.http.routers.dev-example.tls=true"
      ]

      connect {
        gateway {
          proxy {}

          ingress {
            listener {
              port     = 9002
              protocol = "tcp"

              service {
                name = "dev-example-com-frontend"
              }
            }
          }
        }

        sidecar_task {
          driver = "docker"

          config {
            image           = "envoyproxy/envoy:v1.31-latest"
            auth_soft_fail  = true
          }

          resources {
            cpu    = 133
            memory = 63
          }
        }
      }
    }
  }
}

12.3 Deploy Development Environment

# Deploy backend and frontend services
nomad job run dev-backend.hcl

# Deploy ingress gateway
nomad job run dev-ingress.hcl

# Verify service mesh connectivity (the allocation also runs the Envoy sidecar, so name the task)
nomad alloc logs -f $(nomad job allocs web-dev-example-com | grep frontend | head -1 | awk '{print $1}') dev-example-com-frontend

This demonstrates a complete service mesh pattern with:

  • Backend service with sidecar proxy
  • Frontend service connecting to backend via service mesh
  • Ingress gateway terminating external connections
  • Automatic service discovery and load balancing
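
Because the Consul ACL block in step 3.2 sets default_policy to "allow", mesh services are presumably free to call each other without restriction. If you want the backend to accept only its intended caller, a service-intentions config entry is one way to express that. This is a sketch to apply with `consul config write`; the file name is hypothetical.

```hcl
# backend-intentions.hcl (hypothetical file name)
Kind = "service-intentions"
Name = "dev-example-com-backend"

Sources = [
  {
    # The frontend is the only service that should reach the backend.
    Name   = "dev-example-com-frontend"
    Action = "allow"
  },
  {
    # Every other mesh service is refused.
    Name   = "*"
    Action = "deny"
  }
]
```

A matching entry for `dev-example-com-frontend` would list `ingress-dev-example-com` as its allowed source.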

Step 13: Deployment and Verification

13.1 Create Nomad Variables

For applications using nomadVar:

# Bot application variables
nomad var put nomad/jobs/web-bot-example2-ai \
  pg_password="STRONG_BOT_DB_PASSWORD" \
  openai_api_key="sk-YOUR_OPENAI_API_KEY" \
  telegram_token="YOUR_TELEGRAM_BOT_TOKEN"

# Verify variables
nomad var get nomad/jobs/web-bot-example2-ai

13.2 Deploy Core Services

# Deploy in order
nomad job run traefik.nomad.hcl
nomad job run web-main.hcl
nomad job run web-blog.hcl
nomad job run web-bot.hcl
nomad job run dev-backend.hcl
nomad job run dev-ingress.hcl

# Verify services
nomad job status
consul catalog services
vault status

13.3 Access Management Interfaces

  • Nomad: https://nomad.example.com
  • Consul: https://consul.example.com
  • Vault: https://vault.example.com
  • Traefik: https://traefik.example.com
  • MinIO: https://minio.example.com

13.4 Application Access

  • Main Site: https://example.com
  • Blog: https://blog.example.com
  • Bot Service: https://bot.example2.ai
  • Development Environment: https://dev.example.com

13.5 Health Checks

# Check cluster health
nomad server members
consul members
vault status

# Check service connectivity
curl -k https://example.com
curl -k https://blog.example.com
curl -k https://bot.example2.ai

# Check database connectivity
psql -h 10.0.1.3 -p 5432 -U postgres -d appsdb1

# Check MinIO
curl -k https://minio.example.com

Step 14: Production Considerations

14.1 Backup Strategy

  • Vault: Regular snapshots of raft storage
  • Consul: Snapshot automation via API
  • PostgreSQL: WAL archiving and base backups
  • Nomad: State backup via API

14.2 Monitoring Integration

  • Prometheus metrics from all services
  • Health check automation
  • Log aggregation through service mesh
  • Alert manager configuration
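
For the Prometheus bullet, each HashiCorp agent has to be told to expose metrics in Prometheus format. A sketch of the telemetry blocks, one per agent configuration file, is shown here; the retention value is an assumption.

```hcl
# /etc/nomad.d/nomad.hcl
telemetry {
  prometheus_metrics         = true
  publish_allocation_metrics = true
  publish_node_metrics       = true
}

# /etc/consul.d/consul.hcl
telemetry {
  prometheus_retention_time = "60s"
}

# /etc/vault.d/vault.hcl
telemetry {
  prometheus_retention_time = "60s"
  disable_hostname          = true
}
```

Nomad and Consul then serve metrics at `/v1/metrics?format=prometheus`, and Vault at `/v1/sys/metrics?format=prometheus`. Vault's endpoint requires a token by default.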

14.3 Security Hardening

  • Set the Consul ACL default_policy to "deny" and enable Nomad ACLs in production
  • Use TLS everywhere (remove tls_disable from Vault and tls_skip_verify from Nomad)
  • Rotate secrets regularly
  • Network segmentation with strict firewall rules
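
As a sketch of the ACL bullet: the Consul configuration in step 3.2 enables ACLs but allows everything by default, and the Nomad configuration in step 5.1 has no ACL block. Tightening both might look like the fragments below. Token values are placeholders, and both systems still need their ACLs bootstrapped (`consul acl bootstrap`, `nomad acl bootstrap`) before any tokens exist.

```hcl
# /etc/consul.d/consul.hcl
acl {
  enabled                  = true
  default_policy           = "deny" # was "allow" in step 3.2
  enable_token_persistence = true

  tokens {
    agent = "CONSUL_AGENT_TOKEN" # placeholder
  }
}

# /etc/nomad.d/nomad.hcl
acl {
  enabled = true
}

# Added to the existing consul { ... } block in nomad.hcl:
# token = "CONSUL_TOKEN_FOR_NOMAD" # placeholder
```

With a deny default, every job that registers a Consul service needs a token whose policy permits it, so plan the rollout before flipping the setting on a running cluster.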

Conclusion

This setup provides a production-ready foundation with:

  • High Availability: Multi-node clusters with automatic failover
  • Security: Service mesh encryption, secret management, network isolation
  • Scalability: Horizontal scaling through load balancing and replication
  • Observability: Built-in health checks and metrics collection
  • Reproducibility: Infrastructure as code with Terraform and Nomad

The service mesh removes most manual network wiring while keeping traffic between services encrypted. Adding a new service mostly comes down to writing a job definition, plus a Vault policy if the service needs secrets.

This architecture scales from development environments to production workloads, providing a solid foundation for containerized applications with enterprise-grade reliability.

Written by Afanasy Barbarov, Tech Lead with 15+ years shipping production systems in Rust, Go, and TypeScript.
