Epic: Secure secret delivery to provisioned containers #129

Open
opened 2026-04-15 00:24:54 +02:00 by claude · 0 comments
Owner

Problem

The Cameleer3 server orchestrates Docker containers for customer apps (SaaS mode). Currently, all secrets are passed as plaintext environment variables: customEnvVars is stored as unencrypted JSONB in PostgreSQL and passed to the Docker API as KEY=VALUE strings via DeploymentExecutor.buildEnvVars().

Current Flow

PostgreSQL (plaintext JSONB) → ConfigMerger → buildEnvVars() → Docker API → Container env vars

Exposure Points

  1. PostgreSQL: the container_config and resolved_config JSONB columns store secrets unencrypted
  2. Docker inspect: docker inspect <container> shows all env vars in plaintext
  3. REST API: GET /api/v1/apps/{slug} returns containerConfig with customEnvVars unmasked
  4. Deployment record: deployments.resolved_config persists a copy of all resolved config, including secrets
  5. Container logs: if an app logs its env vars, they are stored unredacted in ClickHouse
  6. Bootstrap token: a single CAMELEER_AGENT_AUTH_TOKEN is shared across all containers

Target Platforms

  • Docker standalone (single-host dev/staging)
  • Docker Swarm (multi-host production)
  • Kubernetes (k3s production, current deployment target)

Scope

Evaluate and select the most secure, practical approach for delivering secrets to provisioned containers across all three platforms. Each option is tracked as a separate issue with detailed research.

Options Evaluated

| # | Option | Issue | Verdict |
|---|--------|-------|---------|
| 1 | Platform-native secrets (Swarm + K8s) | #130 | ⭐⭐ (2/5): Swarm incompatible with createContainerCmd; K8s good as Phase 2 |
| 2 | Tmpfs-mounted secret files | #134 | ⭐⭐⭐½ (3.5/5): Solid security upgrade; restart resilience is the weak point |
| 3 | Application-level encryption at rest | #131 | ⭐⭐⭐⭐½ (4.5/5): Implement regardless of delivery choice; protects DB/backups |
| 4 | External vault (Vault, OpenBao, Infisical) | #132 | ⭐⭐⭐⭐ (4/5): Infisical best fit if adding infra; OpenBao for long-term |
| 5 | Per-container JWT with secret claims (JWE) | #133 | ⭐⭐ (2/5): Fatal chicken-and-egg problem; industry advises against |
| 6 | Server-side bootstrap callback | #135 | ⭐⭐⭐⭐½ (4.5/5): Recommended primary mechanism; Vault cubbyhole without Vault |

Recommended Strategy

Layer 1 — Encryption at rest (#131): Implement first

Encrypt customEnvVars values in PostgreSQL with AES-256-GCM (Google Tink). Independent encryption key. Protects against DB dumps, backups, SQL injection. Satisfies SOC2/GDPR. 2-3 days effort.
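
A minimal sketch of per-value AES-256-GCM encryption, to make the storage format concrete. It uses the JDK's javax.crypto rather than Google Tink so that it runs without dependencies; the class name EnvVarCipher, the "IV + ciphertext, Base64-encoded" layout, and binding the variable name as associated data are all assumptions for illustration, not the design chosen in #131.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.Base64;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;

/** Hypothetical helper (not existing Cameleer3 code): encrypts single env var values. */
final class EnvVarCipher {
    private static final int IV_BYTES = 12;  // 96-bit nonce, the usual size for GCM
    private static final int TAG_BITS = 128; // full-length authentication tag
    private static final SecureRandom RANDOM = new SecureRandom();

    private final SecretKey key;

    EnvVarCipher(SecretKey key) {
        this.key = key;
    }

    /** Generates a 256-bit AES key; a real deployment would load the key from outside the DB. */
    static SecretKey newKey() {
        try {
            KeyGenerator generator = KeyGenerator.getInstance("AES");
            generator.init(256);
            return generator.generateKey();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("AES key generation failed", e);
        }
    }

    /** Returns Base64(IV || ciphertext+tag). The variable name is authenticated but not encrypted. */
    String encrypt(String name, String plaintext) {
        try {
            byte[] iv = new byte[IV_BYTES];
            RANDOM.nextBytes(iv); // a fresh random nonce for every value
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(TAG_BITS, iv));
            cipher.updateAAD(name.getBytes(StandardCharsets.UTF_8));
            byte[] ciphertext = cipher.doFinal(plaintext.getBytes(StandardCharsets.UTF_8));
            byte[] out = ByteBuffer.allocate(iv.length + ciphertext.length)
                    .put(iv).put(ciphertext).array();
            return Base64.getEncoder().encodeToString(out);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("encryption failed", e);
        }
    }

    /** Reverses encrypt(); fails if the value was modified or moved to another variable name. */
    String decrypt(String name, String encoded) {
        try {
            byte[] in = Base64.getDecoder().decode(encoded);
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.DECRYPT_MODE, key, new GCMParameterSpec(TAG_BITS, in, 0, IV_BYTES));
            cipher.updateAAD(name.getBytes(StandardCharsets.UTF_8));
            byte[] plaintext = cipher.doFinal(in, IV_BYTES, in.length - IV_BYTES);
            return new String(plaintext, StandardCharsets.UTF_8);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("decryption failed", e);
        }
    }
}
```

With this layout only the values inside the customEnvVars JSONB would change; the keys stay readable, so the existing config merge logic could keep working on names.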

Layer 2 — Bootstrap callback (#135): Primary delivery mechanism

One-time token per container; the agent fetches secrets via HTTP at startup. Eliminates docker inspect exposure and adds tamper detection and an audit trail. OWASP ranks this as "best" delivery approach. 3-4 days effort.
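
The server side of the one-time token could look roughly like the sketch below. Everything in it is hypothetical (the BootstrapTokenStore name, the in-memory map, the TTL parameter); the actual design lives in #135. It only illustrates the core property: a token can be redeemed exactly once, and only before it expires.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.security.SecureRandom;
import java.time.Duration;
import java.time.Instant;
import java.util.Base64;
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical in-memory store for one-time bootstrap tokens (illustration only). */
final class BootstrapTokenStore {
    private static final class Pending {
        final Map<String, String> secrets;
        final Instant expiresAt;

        Pending(Map<String, String> secrets, Instant expiresAt) {
            this.secrets = secrets;
            this.expiresAt = expiresAt;
        }
    }

    // Keyed by the SHA-256 of the token, so the raw token is never kept server-side.
    private final Map<String, Pending> pendingByTokenHash = new ConcurrentHashMap<>();
    private final SecureRandom random = new SecureRandom();
    private final Duration ttl;

    BootstrapTokenStore(Duration ttl) {
        this.ttl = ttl;
    }

    /** Issues a random token for one container; the token is what gets passed to the container. */
    String issue(Map<String, String> secrets, Instant now) {
        byte[] raw = new byte[32];
        random.nextBytes(raw);
        String token = Base64.getUrlEncoder().withoutPadding().encodeToString(raw);
        pendingByTokenHash.put(sha256(token), new Pending(Map.copyOf(secrets), now.plus(ttl)));
        return token;
    }

    /** Returns the secrets on first use; empty for unknown, reused, or expired tokens. */
    Optional<Map<String, String>> redeem(String token, Instant now) {
        // remove() is atomic, so of two concurrent callers only one can get the entry.
        Pending pending = pendingByTokenHash.remove(sha256(token));
        if (pending == null || now.isAfter(pending.expiresAt)) {
            return Optional.empty();
        }
        return Optional.of(pending.secrets);
    }

    private static String sha256(String value) {
        try {
            byte[] digest = MessageDigest.getInstance("SHA-256")
                    .digest(value.getBytes(StandardCharsets.UTF_8));
            return Base64.getEncoder().encodeToString(digest);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 unavailable", e);
        }
    }
}
```

An empty result for a token that was issued earlier is the tamper signal: either the agent restarted or something else redeemed the token first. A real implementation would also need to evict expired entries and survive a server restart, which this sketch does not attempt.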

Layer 3 (future) — Platform-native enhancement (#130 K8s part)

When on K8s, use native Secrets with volume mounts via fabric8 client. Forward-compatible with the callback pattern.

Layer 4 (if needed) — External vault (#132)

If dynamic credentials or enterprise-grade audit becomes necessary, deploy Infisical (low ops burden) or OpenBao (Vault-grade power, true open source).

Complementary Improvements (regardless of chosen option)

  • Encrypt customEnvVars at rest in PostgreSQL (AES-256-GCM)
  • Strip secrets from resolved_config JSONB on deployment records
  • Redact *_SECRET, *_PASSWORD, *_TOKEN, *_KEY from API responses and logs
  • Replace shared bootstrap token with per-deployment scoped tokens
  • Audit logging for secret access/modification
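
The name-based redaction item can be sketched as follows. EnvVarRedactor and the mask string are hypothetical; the suffix list is taken from the patterns above, read as "name ends with", and matched case-insensitively as an added assumption.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

/** Hypothetical helper (not existing Cameleer3 code): masks env var values by name suffix. */
final class EnvVarRedactor {
    // Suffixes from the *_SECRET, *_PASSWORD, *_TOKEN, *_KEY patterns.
    private static final List<String> SENSITIVE_SUFFIXES =
            List.of("_SECRET", "_PASSWORD", "_TOKEN", "_KEY");

    static final String MASK = "********";

    private EnvVarRedactor() {
    }

    /** True if the variable name ends with one of the sensitive suffixes (case-insensitive). */
    static boolean isSensitive(String name) {
        String upper = name.toUpperCase(Locale.ROOT);
        return SENSITIVE_SUFFIXES.stream().anyMatch(upper::endsWith);
    }

    /** Returns a copy with sensitive values masked; the input map is left untouched. */
    static Map<String, String> redact(Map<String, String> envVars) {
        Map<String, String> redacted = new LinkedHashMap<>();
        envVars.forEach((name, value) -> redacted.put(name, isSensitive(name) ? MASK : value));
        return redacted;
    }
}
```

The same function could serve both the API response path and the copy written to resolved_config. Name matching is only a heuristic: a secret stored under a name such as DATABASE_URL would pass through unmasked, so it complements encryption at rest rather than replacing it.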

Decision Criteria

  • Security posture (encryption at rest, in transit, in use)
  • Platform compatibility (Docker / Swarm / K8s)
  • Operational complexity (dependencies, maintenance burden)
  • Developer experience (ease of configuring secrets)
  • Rotation and revocation support
  • Audit trail capability
  • Failure modes (what happens when the secret source is unavailable)

Affected Code

  • DeploymentExecutor.buildEnvVars() — env var assembly
  • DockerRuntimeOrchestrator.startContainer() — Docker API call
  • ConfigMerger.resolve() — 3-layer config merge
  • ContainerRequest / ResolvedContainerConfig — data records
  • PostgresAppRepository / PostgresEnvironmentRepository — JSONB storage
  • AppController / EnvironmentAdminController — API endpoints
  • ContainerLogForwarder — log streaming without redaction
claude added the feature, security, epic labels 2026-04-15 00:24:54 +02:00