cameleer-server/.claude/rules/docker-orchestration.md
hsiegeln 3334f0a1d2
chore: hand cameleer-runtime-loader image build to cameleer-saas
The loader is infra glue (per-replica init container that fetches the
tenant JAR from a signed URL) — same shape as runtime-base, postgres,
clickhouse, traefik, logto images already living in cameleer-saas. Move
the source + CI build there so all sidecar/infra image builds are in
one place; cameleer-server's CI is back to building only what it owns
(server, server-ui).

Coordination: cameleer-saas@ac8d628 added the build step and copied the
source verbatim. Published tag path is unchanged
(gitea.siegeln.net/cameleer/cameleer-runtime-loader:latest), so running
tenant servers continue pulling the same image without disruption.

This commit:
- Deletes cameleer-runtime-loader/ (Dockerfile, entrypoint.sh, README).
- Removes the conditional "Build and push runtime-loader" step and its
  upstream "Detect runtime-loader changes" detection from .gitea/workflows/ci.yml.
  Drops the fetch-depth: 0 + outputs.loader_changed plumbing that only
  existed for the change-detection path.
- Drops cameleer-runtime-loader from the in-job and cleanup-branch image
  cleanup loops — saas owns the registry lifecycle now.
- Rewrites LoaderHardeningIT to pull the published :latest from the
  registry (via Testcontainers GenericContainer) instead of building
  from a local Dockerfile. The IT now functions as a cross-repo contract
  test: cameleer-server's hardening expectations vs. the saas-published
  artifact. Local devs need `docker login gitea.siegeln.net`; CI runners
  are pre-authenticated.
- Updates .claude/rules/docker-orchestration.md to point at the new
  source-of-truth location and reframe LoaderHardeningIT as the
  cross-repo contract test.

The image's runtime contract (ARTIFACT_URL, ARTIFACT_EXPECTED_SIZE,
/app/jars/app.jar mount, exit code semantics) is unchanged. Future
contract changes need coordinated commits across both repos.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-28 13:02:54 +02:00


paths:
cameleer-server-app/**/runtime/**
cameleer-server-core/**/runtime/**
deploy/**
docker-compose*.yml
Dockerfile
docker-entrypoint.sh

Docker Orchestration

When deployed via the cameleer-saas platform, this server orchestrates customer app containers using Docker. Key components:

  • ConfigMerger (core/runtime/ConfigMerger.java) — pure function: resolve(globalDefaults, envConfig, appConfig) -> ResolvedContainerConfig. Three-layer merge: global (application.yml) -> environment (defaultContainerConfig JSONB) -> app (containerConfig JSONB). Includes runtimeType (default "auto") and customArgs (default "").
  • TraefikLabelBuilder (app/runtime/TraefikLabelBuilder.java) — generates Traefik Docker labels for path-based (/{envSlug}/{appSlug}/) or subdomain-based ({appSlug}-{envSlug}.{domain}) routing. Supports strip-prefix and SSL offloading toggles.
    • Per-replica identity labels: cameleer.replica (index), cameleer.generation (8-char deployment UUID prefix — pin Prometheus/Grafana deploy boundaries with this), cameleer.instance-id ({envSlug}-{appSlug}-{replicaIndex}-{generation}). Traefik router/service keys deliberately omit the generation so load balancing spans old + new replicas during a blue/green overlap.
    • When ResolvedContainerConfig.externalRouting() is false (UI: Resources → External Routing, default true), the builder emits ONLY the identity labels (managed-by, cameleer.*) and skips every traefik.* label — the container stays on cameleer-traefik and the per-env network (so sibling containers can still reach it via Docker DNS) but is invisible to Traefik.
    • The tls.certresolver label is emitted only when CAMELEER_SERVER_RUNTIME_CERTRESOLVER is set to a non-blank resolver name (matching a resolver configured in the Traefik static config). When unset (dev installs backed by a static TLS store) only tls=true is emitted and Traefik serves the default cert from the TLS store.
  • PrometheusLabelBuilder (app/runtime/PrometheusLabelBuilder.java) — generates Prometheus docker_sd_configs labels per resolved runtime type: Spring Boot /actuator/prometheus:8081, Quarkus/native /q/metrics:9000, plain Java /metrics:9464. Labels merged into container metadata alongside Traefik labels at deploy time.
  • DockerNetworkManager (app/runtime/DockerNetworkManager.java) — manages two Docker network tiers:
    • cameleer-traefik — shared network; Traefik, server, and all app containers attach here. Server joined via docker-compose with cameleer-server DNS alias.
    • cameleer-env-{slug} — per-environment isolated network; containers in the same environment discover each other via Docker DNS. In SaaS mode, env networks are tenant-scoped: cameleer-env-{tenantId}-{envSlug} (overloaded envNetworkName(tenantId, envSlug) method) to prevent cross-tenant collisions when multiple tenants have identically-named environments.
  • DockerEventMonitor (app/runtime/DockerEventMonitor.java) — persistent Docker event stream listener for containers with managed-by=cameleer-server label. Detects die/oom/start/stop events and updates deployment replica states. Periodic reconciliation (@Scheduled every 30s) inspects actual container state and corrects deployment status mismatches (fixes stale DEGRADED with all replicas healthy).
  • DeploymentProgress (ui/src/components/DeploymentProgress.tsx) — UI step indicator showing 7 deploy stages with amber active/green completed styling.
  • ContainerLogForwarder (app/runtime/ContainerLogForwarder.java) — streams Docker container stdout/stderr to ClickHouse logs table with source='container'. Uses docker logs --follow per container, batches lines every 2s or 50 lines. Parses Docker timestamp prefix, infers log level via regex. DeploymentExecutor starts capture after each replica launches with the replica's instanceId ({envSlug}-{appSlug}-{replicaIndex}-{generation}); DockerEventMonitor stops capture on die/oom. 60-second max capture timeout with 30s cleanup scheduler. Thread pool of 10 daemon threads. Container logs use the same instanceId as the agent (set via CAMELEER_AGENT_INSTANCEID env var) for unified log correlation at the instance level. Instance-id changes per deployment — cross-deploy queries aggregate on application + environment (and optionally replica_index).
  • StartupLogPanel (ui/src/components/StartupLogPanel.tsx) — collapsible log panel rendered below DeploymentProgress. Queries /api/v1/logs?source=container&application={appSlug}&environment={envSlug}. Auto-polls every 3s while deployment is STARTING; shows green "live" badge during polling, red "stopped" badge on FAILED. Uses useStartupLogs hook and LogViewer (design system).
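
The three-layer merge in ConfigMerger can be sketched as plain map layering. This is an illustrative sketch, not the real signature (the actual method returns a typed ResolvedContainerConfig): later layers override earlier ones, and the doc's two hard defaults fill any remaining gaps.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative three-layer merge: global (application.yml) -> environment
// (defaultContainerConfig JSONB) -> app (containerConfig JSONB).
public class ConfigMergeSketch {
    static Map<String, String> resolve(Map<String, String> globalDefaults,
                                       Map<String, String> envConfig,
                                       Map<String, String> appConfig) {
        Map<String, String> merged = new HashMap<>(globalDefaults); // layer 1: global
        merged.putAll(envConfig);                                   // layer 2: environment
        merged.putAll(appConfig);                                   // layer 3: app (wins)
        // Defaults named in the doc: runtimeType "auto", customArgs ""
        merged.putIfAbsent("runtimeType", "auto");
        merged.putIfAbsent("customArgs", "");
        return merged;
    }
}
```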

Container Hardening (issue #152)

DockerRuntimeOrchestrator.startContainer applies an unconditional hardening contract to BOTH the loader init-container AND the main tenant container (baseHardenedHostConfig() is the shared helper). The JVM is not a security boundary (the SecurityManager has been deprecated for removal since Java 17), so isolation must live below it. Defaults are fail-closed and have no opt-out:

  • cap_drop = every Capability.values() (effectively ALL — docker-java's enum has no ALL constant). Outbound TCP still works (no caps needed); raw sockets, ptrace, mounts, and bind <1024 are denied.
  • security_opt: no-new-privileges:true, apparmor=docker-default. Default seccomp profile is applied implicitly when seccomp= is absent.
  • read_only rootfs = true.
  • pids_limit = 512 (PIDS_LIMIT constant).
  • tmpfs mount: /tmp with rw,nosuid,size=256m. No noexec — Netty/tcnative, Snappy, LZ4, Zstd dlopen native libs from /tmp via mmap(PROT_EXEC) which noexec blocks. Issue #153 will add per-app writeableVolumes for stateful tenants (Kafka Streams etc.).
  • userns_mode = host:1000:65536 on both loader and main. Container root is never UID 0 on the host — closes the last open hardening item from issue #152.
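
The contract above can be summarized as a set of HostConfig-style key/values. This is an illustrative sketch of what baseHardenedHostConfig() sets (field names mirror the docker-java HostConfig setters but are shown as a plain map here, so the real helper's API may differ):

```java
import java.util.List;
import java.util.Map;

// Fail-closed hardening defaults as key/values (illustrative, not the real API).
public class HardeningSketch {
    static Map<String, Object> baseHardenedHostConfig() {
        return Map.of(
            "capDrop", "ALL",                               // every Capability.values()
            "securityOpt", List.of("no-new-privileges:true", "apparmor=docker-default"),
            "readonlyRootfs", true,
            "pidsLimit", 512L,                              // PIDS_LIMIT constant
            "tmpfs", Map.of("/tmp", "rw,nosuid,size=256m"), // no noexec: native libs mmap from /tmp
            "usernsMode", "host:1000:65536"                 // container root is never host UID 0
        );
    }
}
```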

Sandboxed runtime auto-detect: at construction the orchestrator calls dockerClient.infoCmd().exec().getRuntimes() and uses runsc (gVisor) when present. Override with cameleer.server.runtime.dockerruntime (e.g. kata to force Kata Containers, or any other registered runtime). Empty/blank = auto. The override always wins over auto-detect. The DockerRuntimeOrchestrator(DockerClient, String) constructor is the canonical entry point; the single-arg constructor exists only as a convenience for tests that don't need an override.
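
The resolution order described above — non-blank override always wins, otherwise prefer runsc when the daemon advertises it, otherwise fall back to the daemon default — can be sketched as (method name is an assumption):

```java
import java.util.Set;

public class RuntimeResolveSketch {
    /** Returns the Docker runtime to request, or null for the daemon default. */
    static String resolveRuntime(String override, Set<String> daemonRuntimes) {
        if (override != null && !override.isBlank()) {
            return override;                       // e.g. "kata" forces Kata Containers
        }
        return daemonRuntimes.contains("runsc")    // gVisor present per infoCmd()
                ? "runsc"
                : null;                            // null = use the daemon default
    }
}
```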

Init-Container Loader Pattern (JAR fetch)

startContainer is now a two-phase op per replica:

  1. Volume create — cameleer-jars-{containerName} named volume (per-replica, deterministic so cleanup in removeContainer can derive it).
  2. Loader container — loaderImage (default gitea.siegeln.net/cameleer/cameleer-runtime-loader:latest, built and published by the cameleer-saas repo at docker/runtime-loader/), name {containerName}-loader, mount the volume RW at /app/jars, env vars ARTIFACT_URL + ARTIFACT_EXPECTED_SIZE. Loader downloads the JAR from the signed URL into the volume and exits 0. Orchestrator blocks on waitContainerCmd().exec(WaitContainerResultCallback).awaitStatusCode(120, SECONDS). Loader container is removed in a finally block; on non-zero exit the volume is also removed and RuntimeException propagates so DeploymentExecutor marks the deployment FAILED. Loader logs are captured before removal (captureLoaderLogs via logContainerCmd with withTail(50), capped at 4096 chars, 5s timeout) and appended to the thrown RuntimeException message as ". loader output: <text>". Best-effort: log-capture failures are swallowed and don't mask the original exit. The loader image's Dockerfile pre-creates /app/jars owned by loader:loader (UID 1000) so the orchestrator's fresh named volume initialises with that ownership — without it the empty volume comes up as root:root 0755 and wget exits 1 with "Permission denied". LoaderHardeningIT is the cross-repo contract test (pulls the published :latest and asserts exit 0 under the orchestrator's hardening shape).
  3. Main container — same hardening contract, mount the same volume RO at /app/jars, entrypoint reads /app/jars/app.jar (Spring Boot/Quarkus: -jar /app/jars/app.jar; plain Java: -cp /app/jars/app.jar <MainClass>; native: exec /app/jars/app.jar).

removeContainer(id) derives the volume name from the inspected container name (Docker prefixes it with /) and removes the volume after the container removes — blue/green doesn't leak volumes.
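
The deterministic naming that makes this cleanup possible can be sketched as follows (helper names are assumptions; the "/" prefix on inspected names is standard Docker inspect behavior):

```java
public class VolumeNameSketch {
    /** Per-replica volume name is derived purely from the container name. */
    static String volumeFor(String containerName) {
        return "cameleer-jars-" + containerName;
    }

    /** inspectCmd reports names like "/tenant-dev-shop-0-ab12cd34" — strip the slash. */
    static String volumeFromInspectedName(String inspectedName) {
        String name = inspectedName.startsWith("/")
                ? inspectedName.substring(1)
                : inspectedName;
        return volumeFor(name);
    }
}
```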

DeploymentExecutor generates the signed URL via ArtifactDownloadTokenSigner.sign(appVersion.id(), Duration.ofSeconds(artifactTokenTtlSeconds)) and passes appVersion.id(), the URL, appVersion.jarSizeBytes(), and the loader image into ContainerRequest. The host filesystem is no longer involved at deploy time.

Loader → server reachability: the loader hits the Cameleer server from its primary Docker network only (request.network(), set from CAMELEER_SERVER_RUNTIME_DOCKERNETWORK). Additional networks (cameleer-traefik, per-env) are attached by DockerNetworkManager.connectContainer AFTER startContainer returns — by which time the loader has already exited. The loader cannot use them. The signed URL is built from cameleer.server.runtime.artifactbaseurl (preferred), falling back to cameleer.server.runtime.serverurl, falling back to http://cameleer-server:8081. The default works in SaaS mode because the tenant's primary network (cameleer-tenant-{slug}) hosts the tenant's own server — same CAMELEER_SERVER_RUNTIME_DOCKERNETWORK on both. For non-SaaS topologies, set CAMELEER_SERVER_RUNTIME_ARTIFACTBASEURL to a URL the loader can reach on its primary network.
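
The three-step fallback chain for the signed URL's base can be sketched as (helper name is an assumption; property names are from the doc):

```java
public class ArtifactUrlSketch {
    static String artifactBaseUrl(String artifactBaseUrl, String serverUrl) {
        if (artifactBaseUrl != null && !artifactBaseUrl.isBlank()) {
            return artifactBaseUrl;                 // cameleer.server.runtime.artifactbaseurl
        }
        if (serverUrl != null && !serverUrl.isBlank()) {
            return serverUrl;                       // cameleer.server.runtime.serverurl
        }
        // Works in SaaS mode: loader and server share the primary tenant network.
        return "http://cameleer-server:8081";
    }
}
```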

DeploymentExecutor Details

Primary network for app containers is set via CAMELEER_SERVER_RUNTIME_DOCKERNETWORK env var (in SaaS mode: cameleer-tenant-{slug}); apps also connect to cameleer-traefik (routing) and cameleer-env-{tenantId}-{envSlug} (per-environment discovery) as additional networks. Resolves runtimeType: auto to concrete type from AppVersion.detectedRuntimeType at PRE_FLIGHT (fails deployment if unresolvable). Builds Docker entrypoint per runtime type (all JVM types use -javaagent:/app/agent.jar -jar, plain Java uses -cp with main class, native runs binary directly). Sets per-replica CAMELEER_AGENT_INSTANCEID env var to {envSlug}-{appSlug}-{replicaIndex}-{generation} so container logs and agent logs share the same instance identity. Sets CAMELEER_AGENT_* env vars from ResolvedContainerConfig (routeControlEnabled, replayEnabled, health port). These are startup-only agent properties — changing them requires redeployment.
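
The per-runtime-type entrypoint shapes described above can be sketched as a switch (a sketch only — the actual command assembly in DeploymentExecutor also appends customArgs and agent env wiring):

```java
import java.util.List;

public class EntrypointSketch {
    static List<String> entrypoint(String runtimeType, String mainClass) {
        return switch (runtimeType) {
            case "spring-boot", "quarkus" ->                 // JVM types: agent + -jar
                List.of("java", "-javaagent:/app/agent.jar", "-jar", "/app/jars/app.jar");
            case "plain-java" ->                             // -cp with explicit main class
                List.of("java", "-javaagent:/app/agent.jar", "-cp", "/app/jars/app.jar", mainClass);
            case "native" ->                                 // run the binary directly
                List.of("/app/jars/app.jar");
            default ->                                       // "auto" must resolve at PRE_FLIGHT
                throw new IllegalArgumentException("unresolved runtime type: " + runtimeType);
        };
    }
}
```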

Container naming — {tenantId}-{envSlug}-{appSlug}-{replicaIndex}-{generation}, where generation is the first 8 characters of the deployment UUID. The generation suffix lets old + new replicas coexist during a blue/green swap (deterministic names without a generation used to 409). All lookups across the executor, DockerEventMonitor, and ContainerLogForwarder key on container id, not name — the name is operator-visibility only.
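
The name construction is a pure function of deployment identity (method name is an assumption):

```java
import java.util.UUID;

public class NamingSketch {
    static String containerName(String tenantId, String envSlug, String appSlug,
                                int replicaIndex, UUID deploymentId) {
        // generation = first 8 chars of the deployment UUID, so blue/green
        // replicas of the same app/env never collide on name.
        String generation = deploymentId.toString().substring(0, 8);
        return tenantId + "-" + envSlug + "-" + appSlug + "-" + replicaIndex + "-" + generation;
    }
}
```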

Strategy dispatch — DeploymentStrategy.fromWire(config.deploymentStrategy()) branches the executor. Unknown values fall back to BLUE_GREEN so misconfiguration never throws at runtime.

  • Blue/green (default): start all N new replicas → wait for ALL healthy → stop the previous deployment. Resource peak ≈ 2× replicas for the health-check window. Partial health aborts with status FAILED; the previous deployment is preserved untouched (user's safety net).
  • Rolling: replace replicas one at a time — start new[i] → wait healthy → stop old[i] → next. Resource peak = replicas + 1. Mid-rollout health failure stops in-flight new containers and aborts; already-replaced old replicas are NOT restored (not reversible) but un-replaced old[i+1..N] keep serving traffic. User redeploys to recover.
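
The fail-safe dispatch can be sketched as below. The enum and method names come from the doc; the accepted wire spellings are assumptions for illustration:

```java
public enum DeploymentStrategy {
    BLUE_GREEN, ROLLING;

    static DeploymentStrategy fromWire(String wire) {
        if (wire == null) return BLUE_GREEN;
        return switch (wire.trim().toLowerCase()) {
            case "rolling" -> ROLLING;
            case "blue_green", "blue-green" -> BLUE_GREEN;
            default -> BLUE_GREEN;  // misconfiguration never throws at runtime
        };
    }
}
```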

Traffic routing is implicit: Traefik labels (cameleer.app, cameleer.environment) are generation-agnostic, so new replicas attract load balancing as soon as they come up healthy — no explicit swap step.

Deployment Status Model

| Status | Meaning |
| --- | --- |
| STOPPED | Intentionally stopped or initial state |
| STARTING | Deploy in progress |
| RUNNING | All replicas healthy and serving |
| DEGRADED | Post-deploy: a replica died after the deploy was marked RUNNING. Set by DockerEventMonitor reconciliation, never by DeploymentExecutor directly. |
| STOPPING | Graceful shutdown in progress |
| FAILED | Terminal failure (pre-flight, health check, or crash). Partial-healthy deploys now mark FAILED — DEGRADED is reserved for post-deploy drift. |

Deploy stages (DeployStage): PRE_FLIGHT -> PULL_IMAGE -> CREATE_NETWORK -> START_REPLICAS -> HEALTH_CHECK -> SWAP_TRAFFIC -> COMPLETE (or FAILED at any stage). Rolling reuses the same stage labels inside the per-replica loop; the UI progress bar shows the most recent stage.

Deployment retention: DeploymentService.createDeployment() deletes FAILED deployments for the same app+environment before creating a new one, preventing failed-attempt buildup. STOPPED deployments are preserved as restorable checkpoints — the UI Checkpoints disclosure lists every deployment with a non-null deployed_config_snapshot (RUNNING, DEGRADED, STOPPED) minus the current one.

JAR Management

  • Retention policy per environment: configurable maximum number of JAR versions to keep. Older JARs are deleted automatically.
  • Nightly cleanup job (JarRetentionJob, Spring @Scheduled 03:00): purges JARs exceeding the retention limit and removes orphaned files not referenced by any app version. Skips versions currently deployed.
  • Storage abstraction: ArtifactStore (in cameleer-server-core/storage) is the only path that touches JAR bytes. FilesystemArtifactStore writes under cameleer.server.runtime.jarstoragepath (default /data/jars); the orchestrator never reads the host filesystem at deploy time.
  • Loader-fetch at deploy time: tenant containers no longer bind-mount JARs from the host. The loader init-container streams the JAR via a signed URL (HMAC-SHA256, TTL cameleer.server.runtime.artifacttokenttlseconds, default 600s) into a per-replica named volume; main mounts that volume RO. This works without host-path access and is the single path supported in Docker-in-Docker SaaS deployments.
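
The retention rule JarRetentionJob applies — keep the newest N versions, never delete a currently deployed one — can be sketched as (record and method names are illustrative):

```java
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

public class RetentionSketch {
    record JarVersion(String id, long uploadedAtEpochMs) {}

    static List<JarVersion> toDelete(List<JarVersion> versions, int keep, Set<String> deployedIds) {
        return versions.stream()
            .sorted((a, b) -> Long.compare(b.uploadedAtEpochMs(), a.uploadedAtEpochMs())) // newest first
            .skip(keep)                                   // retain the newest `keep` versions
            .filter(v -> !deployedIds.contains(v.id()))   // skip versions currently deployed
            .collect(Collectors.toList());
    }
}
```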

Runtime Type Detection

The server detects the app framework from uploaded JARs and builds Docker entrypoints. The agent shaded JAR bundles the log appender, so no separate cameleer-log-appender.jar or PropertiesLauncher is needed:

  • Detection (RuntimeDetector): runs at JAR upload time. Checks ZIP magic bytes (non-ZIP = native binary), then probes META-INF/MANIFEST.MF Main-Class: Spring Boot loader prefix -> spring-boot, Quarkus entry point -> quarkus, other Main-Class -> plain-java (extracts class name). Results stored on AppVersion (detected_runtime_type, detected_main_class).
  • Runtime types (RuntimeType enum): AUTO, SPRING_BOOT, QUARKUS, PLAIN_JAVA, NATIVE. Configurable per app/environment via containerConfig.runtimeType (default "auto").
  • Entrypoint per type: All JVM types use java -javaagent:/app/agent.jar -jar app.jar. Plain Java uses -cp with explicit main class instead of -jar. Native runs the binary directly.
  • Custom arguments (containerConfig.customArgs): freeform string appended to the start command. Validated against a strict pattern to prevent shell injection (entrypoint uses sh -c).
  • AUTO resolution: at deploy time (PRE_FLIGHT), "auto" resolves to the detected type from AppVersion. Fails deployment if detection was unsuccessful — user must set type explicitly.
  • UI: Resources tab shows Runtime Type dropdown (with detection hint from latest uploaded version) and Custom Arguments text field.
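
The detection order can be sketched as follows. This mirrors the described probe sequence only — the Main-Class prefixes here are illustrative assumptions, not RuntimeDetector's real matching tables:

```java
public class DetectSketch {
    static String detect(byte[] jarBytes, String manifestMainClass) {
        // Step 1: ZIP magic bytes ("PK"); anything else is treated as a native binary.
        if (jarBytes.length < 4 || jarBytes[0] != 0x50 || jarBytes[1] != 0x4B) {
            return "native";
        }
        // Step 2: classify by META-INF/MANIFEST.MF Main-Class.
        if (manifestMainClass == null) return "unknown";
        if (manifestMainClass.startsWith("org.springframework.boot.loader.")) return "spring-boot";
        if (manifestMainClass.startsWith("io.quarkus.")) return "quarkus";
        return "plain-java"; // Main-Class itself is the extracted entry point
    }
}
```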

SaaS Multi-Tenant Network Isolation

In SaaS mode, each tenant's server and its deployed apps are isolated at the Docker network level:

  • Tenant network (cameleer-tenant-{slug}) — primary internal bridge for all of a tenant's containers. Set as CAMELEER_SERVER_RUNTIME_DOCKERNETWORK for the tenant's server instance. Tenant A's apps cannot reach tenant B's apps.
  • Shared services network — server also connects to the shared infrastructure network (PostgreSQL, ClickHouse, Logto) and cameleer-traefik for HTTP routing.
  • Tenant-scoped environment networks (cameleer-env-{tenantId}-{envSlug}) — per-environment discovery is scoped per tenant, so alpha-corp's "dev" environment network is separate from beta-corp's "dev" environment network.
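
The two network-name shapes — and the tenant-scoped overload that prevents cross-tenant collisions — can be sketched as:

```java
public class NetNameSketch {
    /** Single-tenant per-environment network. */
    static String envNetworkName(String envSlug) {
        return "cameleer-env-" + envSlug;
    }

    /** SaaS mode overload: tenant-scoped so identically-named envs don't collide. */
    static String envNetworkName(String tenantId, String envSlug) {
        return "cameleer-env-" + tenantId + "-" + envSlug;
    }

    /** Primary internal bridge for all of a tenant's containers. */
    static String tenantNetworkName(String slug) {
        return "cameleer-tenant-" + slug;
    }
}
```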

nginx / Reverse Proxy

  • client_max_body_size 200m is required in the nginx config to allow JAR uploads up to 200 MB. Without this, large JAR uploads return 413.
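
A minimal placement sketch (server name and proxy target are placeholders; only the client_max_body_size directive is from the doc):

```nginx
server {
    # Allow JAR uploads up to 200 MB; nginx's default limit (1m) returns 413.
    client_max_body_size 200m;

    location / {
        proxy_pass http://cameleer-server:8081;
    }
}
```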