When v2.0.0 added the `golang.org/x/image/webp` dependency (used by the media scrubber for WebP decoding), `go mod tidy` bumped the module's minimum Go version in `go.mod`:

```
module go-blockchain

go 1.25.0
```

The three Dockerfiles in the repo were still pinned to older images:

```
/Dockerfile                        FROM golang:1.24-alpine
/deploy/prod/Dockerfile.slim       FROM golang:1.24-alpine
/docker/media-sidecar/Dockerfile   FROM golang:1.22-alpine
```

Result: `docker build` on any of them fails at `go mod download` with

```
go: go.mod requires go >= 1.25.0 (running go 1.24.13; GOTOOLCHAIN=local)
```

because Alpine's golang image pins `GOTOOLCHAIN=local` to keep the toolchain reproducible.

Fix: bump all three to `golang:1.25-alpine`. The media-sidecar module doesn't actually need 1.25 (it's self-contained and only uses the stdlib), but keeping all three in sync avoids a surprise the next time somebody adds a dep.
# DChain production deployment
Turn-key-ish stack: 3 validators + Caddy TLS edge + optional Prometheus/Grafana, behind auto-HTTPS.
## Prerequisites

- Docker + Compose v2
- A public IP and open ports 80, 443, 4001 (libp2p) on every host
- DNS A-record pointing `DOMAIN` at the host running Caddy
- Basic familiarity with editing env files
## Layout (single-host pilot)

```
                  ┌─ Caddy :443 ── TLS terminate ──┬─ node1:8080 ──┐
internet ────────→│                                ├─ node2:8080   │ round-robin /api/*
                  └─ Caddy :4001 (passthrough)     └─ node3:8080   │ ip_hash /api/ws
...
Prometheus → node{1,2,3}:8080/metrics
Grafana    ← Prometheus data source
```
For a real multi-datacentre deployment, copy this whole directory onto each
VPS, edit docker-compose.yml to keep only the node that runs there, and
put Caddy on one dedicated edge host (or none — point clients at one node
directly and accept the lower availability).
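For the multi-host variant, one low-churn alternative to deleting services from docker-compose.yml is to start only the services that live on each host and override the announce address per host. A sketch, assuming the service names from the diagram above (the actual compose file and override filename here are hypothetical):

```yaml
# docker-compose.host-b.yml: hypothetical override for the VPS that runs only node2.
# Start with:
#   docker compose -f docker-compose.yml -f docker-compose.host-b.yml up -d node2
services:
  node2:
    environment:
      # Announce this host's public address instead of the Docker network name.
      DCHAIN_ANNOUNCE: /ip4/<public-ip>/tcp/4001
```

Starting a named service (`up -d node2`) leaves the other services defined but not running, so the three hosts can share one compose file instead of maintaining divergent copies.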
## First-boot procedure

1. **Generate keys for each validator.** Easiest way:

   ```sh
   # On any box with the repo checked out
   docker build -t dchain-node-slim -f deploy/prod/Dockerfile.slim .
   mkdir -p deploy/prod/keys
   for i in 1 2 3; do
     docker run --rm -v "$PWD/deploy/prod/keys:/out" dchain-node-slim \
       /usr/local/bin/client keygen --out /out/node$i.json
   done
   cat deploy/prod/keys/node*.json | jq -r .pub_key   # → copy into DCHAIN_VALIDATORS
   ```

2. **Configure env files.** Copy `node.env.example` to `node1.env`, `node2.env`, `node3.env`. Paste the three pubkeys from step 1 into `DCHAIN_VALIDATORS` in ALL THREE files. Set `DOMAIN` to your public host.

3. **Start the network:**

   ```sh
   DOMAIN=dchain.example.com docker compose up -d
   docker compose logs -f node1   # watch genesis + first blocks
   ```

   The first block is genesis (index 0), created only by `node1` because it has the `--genesis` flag. After you see blocks #1, #2, #3… committing, edit `docker-compose.yml`, remove the `--genesis` flag from node1's command section, then run `docker compose up -d node1` to re-create it without that flag. Leaving `--genesis` in is a no-op on a non-empty DB, but it is noise in the logs.

4. **Verify HTTPS and the HTTP-to-HTTPS redirect:**

   ```sh
   curl -s https://$DOMAIN/api/netstats | jq
   curl -s https://$DOMAIN/api/well-known-contracts | jq
   ```

   Caddy should have issued a cert automatically from Let's Encrypt.

5. **(Optional) observability:**

   ```sh
   GRAFANA_ADMIN_PW=$(openssl rand -hex 24) docker compose --profile monitor up -d
   # Grafana at http://<host>:3000, user admin, password from env
   ```

   Add a "Prometheus" data source pointing at `http://prometheus:9090`, then import a dashboard that graphs:

   - `dchain_blocks_total` (rate)
   - `dchain_tx_submit_accepted_total` / `rejected_total`
   - `dchain_ws_connections`
   - `dchain_peer_count_live`
   - `rate(dchain_block_commit_seconds_sum[5m]) / rate(dchain_block_commit_seconds_count[5m])`
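Several of the steps above reduce to "poll an endpoint until it answers" (watching for first blocks, checking a node has caught up). A minimal sketch assuming only `curl`; the `wait_healthy` helper is illustrative, not part of the repo:

```shell
# Poll a URL until it responds (or give up after N tries, 1s apart).
wait_healthy() {
  url=$1; tries=${2:-30}
  i=0
  while [ "$i" -lt "$tries" ]; do
    if curl -sf "$url" >/dev/null 2>&1; then
      echo "up: $url"
      return 0
    fi
    i=$((i+1))
    sleep 1
  done
  echo "timed out waiting for $url" >&2
  return 1
}

# e.g. wait_healthy "https://$DOMAIN/api/netstats" 60
```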
## Common tasks

### Add a 4th validator
The new node joins as an observer via --join, then an existing validator
promotes it on-chain:
```sh
# On the new box
docker run -d --name node4 \
  -v chaindata:/data \
  -e DCHAIN_ANNOUNCE=/ip4/<public-ip>/tcp/4001 \
  dchain-node-slim \
  --db=/data/chain --join=https://$DOMAIN --register-relay
```
Then from any existing validator:
```sh
docker compose exec node1 /usr/local/bin/client add-validator \
  --key /keys/node.json \
  --node http://localhost:8080 \
  --target <NEW_PUBKEY>
```
The new node starts signing as soon as it sees itself in the validator set on-chain — no restart needed.
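The `<NEW_PUBKEY>` value comes straight out of the keygen output from first-boot step 1. A small sketch, assuming the key files carry a `.pub_key` field as shown there; `extract_pubkey` is an illustrative helper, not a repo tool:

```shell
# Pull the pubkey out of a keygen-produced key file (.pub_key field, see first boot).
extract_pubkey() { jq -r .pub_key "$1"; }

# e.g.:
# client add-validator ... --target "$(extract_pubkey deploy/prod/keys/node4.json)"
```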
### Upgrade without downtime

PBFT tolerates f faulty nodes out of 3f+1 total. For 3 validators that means
f = 0: any offline node halts consensus. So for 3-node clusters:

- `docker compose pull && docker compose build` on all three hosts first, so each restart window is as short as possible.
- Graceful one-at-a-time: `docker compose up -d --no-deps node1`, wait for `/api/netstats` to show it catching up, then do node2, then node3.

For 4+ nodes you can afford one-at-a-time hot rolls.
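The roll order can be scripted; a sketch where `compose` wraps `docker compose` so the sequencing is the only logic here (the catch-up wait is deliberately elided):

```shell
# Roll validators one at a time. Replace the placeholder comment with a real
# catch-up check against /api/netstats before moving to the next node.
compose() { docker compose "$@"; }

roll() {
  for n in node1 node2 node3; do
    compose up -d --no-deps "$n"
    # ...wait until /api/netstats shows $n caught up...
    echo "rolled $n"
  done
}
```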
### Back up the chain

```sh
docker run --rm -v node1_data:/data -v "$PWD":/bak alpine \
  tar czf /bak/dchain-backup-$(date +%F).tar.gz -C /data .
```
Restore by swapping the file back into a fresh named volume before node startup.
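A backup is only as good as its restore; a quick local integrity check on the tarball before filing it away (`verify_backup` is an illustrative helper, assuming `gzip` and `tar` are available):

```shell
# Verify a chain backup tarball: gzip integrity plus a non-empty file list.
verify_backup() {
  f=$1
  gzip -t "$f" || return 1              # archive not corrupted
  [ "$(tar tzf "$f" | wc -l)" -gt 0 ]   # contains at least one entry
}

# e.g. verify_backup "dchain-backup-$(date +%F).tar.gz" && echo OK
```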
### Remove a bad validator
Same as adding but with remove-validator. Only works if a majority of
CURRENT validators cosign the removal — intentional, keeps one rogue
validator from kicking others unilaterally (see ROADMAP P2.1).
## Security notes

- `/metrics` is firewalled to internal networks by Caddy. If you need external scraping, add proper auth (Caddy `basicauth` or mTLS).
- All public endpoints are rate-limited per-IP by the node itself (see `api_guards.go`). Adjust the limits before releasing to the open internet.
- Each node runs as non-root inside a read-only-rootfs container with all capabilities dropped. If you need to exec into one: `docker compose exec --user root nodeN sh`.
- The Ed25519 key files mounted at `/keys/node.json` are your validator identities. Losing them means losing your ability to produce blocks; get them onto the host via your normal secret management (Vault, sealed-secrets, an encrypted tarball at deploy time). Never commit them to git.
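One concrete shape for the "encrypted tarball at deploy time" option, sketched with openssl's password-based encryption. `encrypt_dir`, `decrypt_to`, and the `KEYS_PW` variable are illustrative names; any vetted secret-transport mechanism works as well:

```shell
# Encrypt a key directory for transport; the password comes from $KEYS_PW.
encrypt_dir() {  # encrypt_dir <dir> <outfile>
  tar czf - -C "$(dirname "$1")" "$(basename "$1")" \
    | openssl enc -aes-256-cbc -pbkdf2 -salt -pass env:KEYS_PW -out "$2"
}
decrypt_to() {   # decrypt_to <file> <destdir>
  openssl enc -d -aes-256-cbc -pbkdf2 -pass env:KEYS_PW -in "$1" | tar xzf - -C "$2"
}

# e.g. KEYS_PW=... encrypt_dir deploy/prod/keys keys.tar.gz.enc
```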
## Troubleshooting

| Symptom | Check |
|---|---|
| Caddy keeps issuing `failed to get certificate` | Is port 80 open? Is the DNS A-record pointing here? `docker compose logs caddy` |
| New node can't sync: `FATAL: genesis hash mismatch` | The `--db` volume has data from a different chain. `docker volume rm nodeN_data` and re-up |
| Chain stops producing blocks | `docker compose logs nodeN \| tail -100`; look for `SLOW AddBlock` or validator silence |
| `/api/ws` returns 429 | Client opened more than `WSMaxConnectionsPerIP` (default 10). Check `ws.go` for the per-IP cap |
| Disk usage growing | Background vlog GC runs every 5 min. Manual trigger: `docker compose exec nodeN /bin/sh -c 'kill -USR1 1'` (see `StartValueLogGC`) |