Deployment
latchgate up starts the gate in embedded mode (SQLite + embedded policy) with zero external dependencies. For production with HA replay and defense-in-depth egress, use latchgate up --infra, or manage Redis, OPA, and Squid yourself and start with latchgate serve.
Production checklist
Before deploying LatchGate to production, verify:
- Identity provider is peercred (or OIDC/mTLS when available), not none
- Named operator credentials with DPoP ([operator_credentials.NAME]), not shared key
- Signing keys are persisted to disk (receipt_signing_key_path, grant_signing_key_path)
- Receipt keys JWKS file is persisted (receipt_keys_jwks_path)
- response_schema_enforcement = "deny"
- TCP listener is disabled (no unsafe_expose_http)
- Binary is a release build (not compiled with --features unsafe-dev)
- Redis is reachable and persistent (appendonly yes)
- OPA is reachable with the policy bundle loaded
- Signing key files are backed up
- sops_secrets_file is set for actions that declare secrets (see Secrets Management)
- sops_key_file is set (or SOPS backend auth is configured via environment)
- Age key file has restricted permissions (chmod 600)
- egress_proxy_url is set AND Squid is running, if any action uses proxy_allowlist (strongly recommended; see Egress Proxy). Without it, kernel-only enforcement (Layer 1) applies and a startup warning is emitted.
- latchgate doctor passes all checks including SOPS and egress proxy
- Evidence ledger backup schedule configured (SQLite online backup)
- Monitoring alerts on latchgate_unresolved_intents > 0 (see Troubleshooting)
- Monitoring alerts on webhook_outbox_pending growth (see Webhooks)
Pre-flight check
Run latchgate doctor before starting the gate to verify that all dependencies and configuration are correct:

    latchgate doctor

This checks Redis connectivity, OPA reachability, egress proxy reachability, provider module digests, manifest integrity, SOPS binary availability (when sops_secrets_file is configured), and WASM host capabilities. See CLI Reference for details.
Recommended architecture

    ┌────────────────────────────────────────┐
    │  Agent container / VM                  │
    │                                        │
    │  agent process                         │
    │    │                                   │
    │    │ UDS                               │
    │    ▼                                   │
    │  /run/latchgate/gate.sock              │
    │    │                                   │
    │  latchgate serve                       │
    │    │                                   │
    │    ├── Redis  (replay, budgets)        │
    │    ├── OPA    (policy)                 │
    │    └── Squid  (egress proxy; required  │
    │                for proxy_allowlist)    │
    └────────────────────────────────────────┘
             │
             │  host I/O (HTTP via Squid in v0.1; SMTP, SQL, AMQP, S3 planned)
             ▼
       external systems

The agent process communicates with LatchGate exclusively over a Unix domain socket. The agent has no direct network access to external systems; all side effects go through LatchGate’s host I/O layer.
Transport
Client socket
Agent processes connect to the client socket:

    listen_uds_path = "/run/latchgate/gate.sock"

This exposes:
- POST /v1/leases
- GET /.well-known/jwks.json
- POST /v1/actions/{id}/execute
- GET /v1/actions
- GET /v1/actions/{id}
- GET /v1/actions/{id}/schema/request
- GET /v1/approvals/{id}/poll
- GET /v1/receipts/{id}
- health endpoints
Admin socket
Operator tools (CLI, dashboards) connect to the admin socket:

    listen_admin_uds_path = "/run/latchgate/gate-admin.sock"

This exposes: approval endpoints, audit queries, receipt retrieval (with operator auth), revocation, receipt key export, domain management, path management, policy ACL management, and metrics. Agent processes cannot reach admin APIs.
Rate limits: 20 req/s on operator write endpoints, 100 req/s on operator read endpoints (token-bucket, per-process).
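To make the token-bucket behaviour concrete, here is a minimal Python sketch of the general technique. It is not LatchGate's implementation: the class name, the injectable clock, and the choice of a burst capacity equal to the per-second rate are all assumptions made for illustration.

```python
import time


class TokenBucket:
    """Generic token bucket: refills at `rate` tokens per second up to `capacity`."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock
        self.tokens = float(capacity)
        self.last = clock()

    def allow(self):
        """Consume one token if available; return False when the bucket is empty."""
        now = self.clock()
        # Refill in proportion to elapsed time, never above capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


# Hypothetical wiring of the documented limits (burst size is an assumption).
write_limiter = TokenBucket(rate=20, capacity=20)
read_limiter = TokenBucket(rate=100, capacity=100)
```

Because the limit is per-process, a bucket like this would live in the gate's memory and reset on restart.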
Note: The receipt endpoint is available on both sockets with different auth models. Client socket uses lease-based DPoP auth. Admin socket uses operator auth. Both return identical response bodies.
Why UDS?
Unix domain sockets provide kernel-enforced caller identity via SO_PEERCRED. The kernel guarantees the peer UID, which cannot be forged. This is the foundation for peercred identity (mapping UIDs to principals without any client-side authentication).
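As an illustration of what SO_PEERCRED returns, the following Linux-only Python sketch reads the peer credentials from a connected Unix socket. The helper name peer_credentials is hypothetical, and the demonstration uses a socket pair inside a single process, so the peer is the process itself. It shows the kernel mechanism only, not how LatchGate consumes it.

```python
import os
import socket
import struct

# struct ucred on Linux: pid_t pid, uid_t uid, gid_t gid.
UCRED_FORMAT = "iII"


def peer_credentials(sock):
    """Return (pid, uid, gid) of the peer on a connected Unix socket (Linux only)."""
    size = struct.calcsize(UCRED_FORMAT)
    raw = sock.getsockopt(socket.SOL_SOCKET, socket.SO_PEERCRED, size)
    return struct.unpack(UCRED_FORMAT, raw)


# Demonstration: both ends of the pair belong to this process.
server_side, client_side = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
pid, uid, gid = peer_credentials(server_side)
server_side.close()
client_side.close()
```

The client sends nothing to identify itself; the values come from the kernel's record of who created the socket.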
Identity: peercred setup
Map each agent’s Unix UID to a principal name and scope set:

    [identity]
    provider = "peercred"

    [identity.peercred]
    allow_unmapped = false

    [identity.peercred.principals]
    1001 = { principal = "agent-support", scopes = ["tools:call"], owner = "alice@company.com" }
    1002 = { principal = "agent-ops", scopes = ["tools:call", "db:query"], owner = "bob@company.com" }

With allow_unmapped = false, any UID not in the map is denied at lease issuance.
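The lookup this configuration implies can be sketched in a few lines of Python. This is a conceptual model of the documented allow_unmapped = false behaviour; the function and exception names are hypothetical, and the dict simply mirrors the principals table above.

```python
# Mirrors the [identity.peercred.principals] table as a Python dict.
PRINCIPALS = {
    1001: {"principal": "agent-support", "scopes": ["tools:call"], "owner": "alice@company.com"},
    1002: {"principal": "agent-ops", "scopes": ["tools:call", "db:query"], "owner": "bob@company.com"},
}


class UnmappedUid(Exception):
    """Raised when a UID has no principal entry (allow_unmapped = false)."""


def resolve_principal(uid, principals=PRINCIPALS):
    """Return the principal entry for a kernel-reported UID, or deny."""
    entry = principals.get(uid)
    if entry is None:
        raise UnmappedUid(f"uid {uid} is not mapped to a principal")
    return entry


ops = resolve_principal(1002)  # entry for agent-ops

try:
    resolve_principal(4242)
    denied = False
except UnmappedUid:
    denied = True  # unmapped UIDs never receive a lease
```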
Key management
Signing keys
LatchGate uses two Ed25519 signing keys:
- Receipt signing key — signs ExecutionReceipts for the evidence ledger
- Grant signing key — signs ExecutionGrants (separate key for defense-in-depth)
Keys are auto-generated on first run (32-byte seed, mode 0600). Back them up. If a receipt key is lost, receipts signed with it become unverifiable.
On load, LatchGate checks that key files have not been widened beyond 0600 (owner-only). If group or world bits are set, a SECURITY warning is emitted to structured logs. This matches OpenSSH’s private key permission check behavior.
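A check of this kind can be approximated with the standard library. The sketch below is not LatchGate's code; the function name is hypothetical, and the throwaway file only stands in for a real signing key.

```python
import os
import stat
import tempfile


def key_file_too_permissive(path):
    """Return True if any group or world permission bit is set on the file."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return bool(mode & (stat.S_IRWXG | stat.S_IRWXO))


# Demonstration on a temporary file standing in for a 32-byte key seed.
with tempfile.TemporaryDirectory() as tmp:
    key_path = os.path.join(tmp, "example.key")
    with open(key_path, "wb") as f:
        f.write(b"\x00" * 32)

    os.chmod(key_path, 0o600)
    strict_ok = not key_file_too_permissive(key_path)

    os.chmod(key_path, 0o640)  # group-readable: should be flagged
    widened_flagged = key_file_too_permissive(key_path)
```

The same function can be run against the configured key paths as part of a deployment audit.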
Key rotation
When the receipt signing key is rotated, the old verifying key is appended to the JWKS file. Receipts carry a signing_key_id (kid) and the /v1/receipt-keys endpoint returns all historical verifying keys. Old receipts remain verifiable.
Never delete a verifying key from receipt-keys.jwks unless every receipt signed with that key has been externally verified and archived.
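The sketch below shows why deleting a key breaks verification: a verifier has to find the key whose kid matches the receipt's signing_key_id. It assumes the JWKS file follows the standard JWK Set layout (a top-level keys array of OKP/Ed25519 entries); the kid values and key material are placeholders, and signature verification itself is omitted.

```python
import json

# Hypothetical JWKS contents; "x" values are placeholders, not real keys.
JWKS_TEXT = """
{"keys": [
  {"kty": "OKP", "crv": "Ed25519", "kid": "receipt-key-1", "x": "PLACEHOLDER_OLD"},
  {"kty": "OKP", "crv": "Ed25519", "kid": "receipt-key-2", "x": "PLACEHOLDER_NEW"}
]}
"""


def find_verifying_key(jwks, kid):
    """Return the JWK whose kid matches, or None if it is no longer present."""
    for key in jwks.get("keys", []):
        if key.get("kid") == kid:
            return key
    return None


jwks = json.loads(JWKS_TEXT)

# A receipt signed before rotation still resolves to its historical key.
old_receipt = {"signing_key_id": "receipt-key-1"}
old_key = find_verifying_key(jwks, old_receipt["signing_key_id"])

# If the entry had been deleted, lookup fails and the receipt is unverifiable.
missing_key = find_verifying_key(jwks, "receipt-key-0")
```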
Secrets
Secrets for action execution are stored in a SOPS-encrypted file and decrypted just-in-time. See Secrets Management for setup, rotation, and encryption backend options.
Egress proxy
For defense-in-depth, configure a Squid forward proxy for outbound HTTP from WASM providers. When actions use proxy_allowlist but no proxy is configured, the gate starts with a warning and uses kernel-only enforcement (Layer 1: sink validation + SSRF protection + manifest domain allowlists). The proxy adds an independent Layer 2 backstop.

    egress_proxy_url = "http://squid.internal:3128"

See Egress Proxy for the full setup: allowlist generation, Squid configuration, troubleshooting, and how the kernel + proxy layers cooperate.
Docker
Pre-built images are published to GHCR on every release:

    docker pull ghcr.io/latchgate-ai/latchgate:latest
    docker pull ghcr.io/latchgate-ai/latchgate:0.1.0   # pinned version

The runtime image includes a Docker HEALTHCHECK instruction that polls /healthz every 10 seconds. Container orchestrators (ECS, Compose, Swarm) use this to detect unresponsive instances and trigger restarts automatically.
Docker Compose profiles
The docker-compose.yml at the repo root uses Compose profiles to opt into optional services. Pick one based on what you need:

    docker compose up                        # core deps only: redis + opa
    docker compose --profile dev up          # core deps + Squid + Prometheus (gate runs on host)
    docker compose --profile quickstart up   # full self-contained stack: gate + redis + opa

- Default (no profile): starts only Redis and OPA. Use this when you run latchgate serve on the host and only need its dependencies.
- --profile dev: adds the Squid egress proxy and Prometheus alongside the core deps. The gate itself still runs on the host, exercising the egress proxy locally.
- --profile quickstart: runs the gate inside Docker too, in a single self-contained stack. Enables HTTP transport for easy demo access. Not for production: production deployments must use UDS with no HTTP exposure.
To build the image from source instead of pulling from GHCR:

    docker build -t latchgate .

Kill switch
In an emergency, revoke all active leases and grants:

    latchgate revoke

The kill switch requires operator DPoP authentication. Use the CLI, which handles DPoP proof construction automatically, or call the API with Authorization: DPoP <key> and DPoP: <proof> headers.
This advances the revocation epoch. All leases and grants from prior epochs are immediately invalid. Agents must re-authenticate.
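The epoch mechanism can be pictured with a small conceptual model. This Python sketch is an assumption about the general technique (stamp each credential with the epoch at issuance, compare on use); the class and field names are hypothetical and do not describe LatchGate's internal data structures.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Lease:
    principal: str
    epoch: int  # revocation epoch at the time the lease was issued


class GateModel:
    """Toy model of epoch-based revocation."""

    def __init__(self):
        self.revocation_epoch = 0

    def issue_lease(self, principal):
        return Lease(principal=principal, epoch=self.revocation_epoch)

    def revoke_all(self):
        """Kill switch: advance the epoch, invalidating everything issued earlier."""
        self.revocation_epoch += 1

    def is_valid(self, lease):
        return lease.epoch == self.revocation_epoch


gate = GateModel()
lease = gate.issue_lease("agent-support")
valid_before = gate.is_valid(lease)

gate.revoke_all()
valid_after = gate.is_valid(lease)  # stale epoch: rejected

fresh = gate.issue_lease("agent-support")  # re-authentication yields a new lease
fresh_valid = gate.is_valid(fresh)
```

One attraction of this design is that revocation is a single counter update rather than a sweep over every outstanding credential.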
Monitoring
- /healthz: liveness probe (returns {"status":"ok"})
- /readyz: readiness probe (returns 503 until all startup checks pass)
- /v1/admin/status: operational status snapshot (version, uptime, dependency health, pending approvals, unresolved intents, revocation epoch, webhook state); admin socket, operator auth required
- /metrics: Prometheus-format metrics (admin socket only)
- JSONL audit export for SIEM integration
- Outbound webhooks for real-time alerting on approvals, denials, revocations, and failures
Key metrics to alert on
- latchgate_unresolved_intents: should be 0 in steady state; non-zero indicates evidence gaps
- latchgate_webhook_outbox_pending: growing trend indicates webhook delivery issues
- latchgate_oldest_pending_approval_seconds: growing trend indicates operator response delays
- latchgate_audit_write_error_total: any increment is a critical incident
- latchgate_budget_exhausted_total: indicates undersized budgets or runaway agents
- readyz_degraded_total{reason="..."}: per-cause degradation counters
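In most deployments these conditions would be written as alerting rules in the monitoring system. For a quick scripted check, the two "must be zero" metrics can be read straight from the /metrics text, as in this Python sketch. The sample payload and label value are hypothetical, the parser handles only unlabelled series, and treating a non-zero counter as an alert only approximates "any increment".

```python
# Hypothetical scrape output in Prometheus text exposition format.
SAMPLE = """\
# TYPE latchgate_unresolved_intents gauge
latchgate_unresolved_intents 2
latchgate_webhook_outbox_pending 0
latchgate_audit_write_error_total 0
readyz_degraded_total{reason="redis"} 1
"""


def parse_metrics(text):
    """Parse unlabelled samples; labelled series are skipped in this sketch."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        name, _, rest = line.partition(" ")
        fields = rest.split()
        if "{" in name or not fields:
            continue
        values[name] = float(fields[0])
    return values


def critical_alerts(values):
    """Return messages for metrics that should be zero in steady state."""
    alerts = []
    if values.get("latchgate_unresolved_intents", 0) > 0:
        alerts.append("unresolved intents present (evidence gap)")
    if values.get("latchgate_audit_write_error_total", 0) > 0:
        alerts.append("audit write errors recorded")
    return alerts


values = parse_metrics(SAMPLE)
alerts = critical_alerts(values)
```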
Graceful shutdown
Orchestrators (systemd, Kubernetes) should follow this sequence:
- Call POST /v1/admin/drain (the gate refuses new requests with 503)
- Poll /v1/admin/status until in_flight_executions == 0
- Send SIGTERM
If the gate receives SIGTERM without a prior drain, it waits up to 30 seconds for in-flight executions to complete before aborting them. Aborted executions produce unresolved intents, so avoid this path.
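The drain sequence can be scripted as a small orchestration helper. The Python sketch below takes the three operations as callables so that it stays independent of how the admin socket is reached; the function name, parameters, and timeout behaviour are assumptions, and only the in_flight_executions field comes from the status endpoint described above.

```python
import time


def drain_and_stop(post_drain, get_status, send_sigterm,
                   poll_interval=1.0, timeout=60.0,
                   sleep=time.sleep, clock=time.monotonic):
    """Drain, wait for in-flight executions to reach zero, then send SIGTERM.

    post_drain   -- callable that issues POST /v1/admin/drain
    get_status   -- callable returning the parsed /v1/admin/status body
    send_sigterm -- callable that signals the gate process
    Returns True if the gate was fully drained before SIGTERM was sent.
    """
    post_drain()
    deadline = clock() + timeout
    drained = False
    while clock() < deadline:
        if get_status()["in_flight_executions"] == 0:
            drained = True
            break
        sleep(poll_interval)
    # On timeout this sketch still sends SIGTERM; the gate's own grace
    # period then applies to whatever is left in flight.
    send_sigterm()
    return drained


# Usage with stand-in callables (no real gate involved).
calls = []
statuses = iter([
    {"in_flight_executions": 2},
    {"in_flight_executions": 1},
    {"in_flight_executions": 0},
])
result = drain_and_stop(
    post_drain=lambda: calls.append("drain"),
    get_status=lambda: next(statuses),
    send_sigterm=lambda: calls.append("sigterm"),
    sleep=lambda seconds: None,
)
```

Whether to send SIGTERM after a drain timeout or to page an operator instead is a policy decision; the return value lets the caller distinguish the two outcomes.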
For configuration reference, see Configuration. For the full threat model, see Security Model. For secrets setup, see Secrets Management. For egress proxy setup, see Egress Proxy.