Tasks 9+10+11 of the init-container-jar-fetch plan, landed atomically because
9 alone leaves the orchestrator+executor referencing removed ContainerRequest
fields.

ContainerRequest (core) drops jarPath/jarVolumeName/jarVolumeMountPath; adds
appVersionId, artifactDownloadUrl, artifactExpectedSize, loaderImage.

DockerRuntimeOrchestrator (app):
- per-replica named volume "cameleer-jars-{containerName}"
- phase 1: loader container with the volume mounted RW at /app/jars,
  ARTIFACT_URL + ARTIFACT_EXPECTED_SIZE env, full hardening contract
- block on waitContainerCmd().awaitStatusCode(120s); on non-zero exit
  remove the loader, remove the volume, propagate RuntimeException so
  DeploymentExecutor marks the deployment FAILED. main is never created.
- phase 2: main container with the same volume mounted RO at /app/jars
- withUsernsMode("host:1000:65536") on BOTH containers; closes the last
  open hardening gap from issue #152
- main entrypoint paths point at /app/jars/app.jar
- extracted baseHardenedHostConfig() so loader and main share the
  cap_drop / security_opt / readonly / pids / tmpfs contract
- removeContainer() also removes the per-replica volume so blue/green
  doesn't leak volumes

DeploymentExecutor (app):
- injects ArtifactDownloadTokenSigner; new @Value props loaderimage,
  artifacttokenttlseconds, artifactbaseurl
- replaces the temporary getVersion(...).jarPath() bridge with a signed
  URL ${artifactBaseUrl}/api/v1/artifacts/{id}?exp&sig
- drops the Files.exists pre-flight check; AppVersion.jarSizeBytes is
  the size-of-record check now
- drops jarDockerVolume / jarStoragePath @Value fields and the volume
  plumbing in startReplica
- DeployCtx carries appVersionId / artifactUrl / artifactExpectedSize
  in place of jarPath

Tests:
- DockerRuntimeOrchestratorHardeningTest updated for the new shape;
  captures HostConfig on the MAIN container and asserts cap_drop ALL
  + no-new-privileges + apparmor + readonly + pids + tmpfs + the new
  withUsernsMode("host:1000:65536")
- DockerRuntimeOrchestratorLoaderTest (new): verifies volume create →
  loader create with RW bind → loader started → awaited → loader
  removed → main create with RO bind → main started; verifies abort
  + cleanup on loader exit != 0 (loader removed, volume removed, main
  NEVER created); verifies userns_mode applied to both containers.

Config:
- application.yml replaces jardockervolume with loaderimage,
  artifacttokenttlseconds, artifactbaseurl

Rules updated: .claude/rules/docker-orchestration.md (loader pattern,
userns, no more bind-mount); .claude/rules/core-classes.md
(ContainerRequest field map).

Test counts after change:
- cameleer-server-core: 116/116 unit tests pass
- cameleer-server-app: 273/273 unit tests pass

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Docker Orchestration
When deployed via the cameleer-saas platform, this server orchestrates customer app containers using Docker. Key components:
- ConfigMerger (core/runtime/ConfigMerger.java): pure function resolve(globalDefaults, envConfig, appConfig) -> ResolvedContainerConfig. Three-layer merge: global (application.yml) -> environment (defaultContainerConfig JSONB) -> app (containerConfig JSONB). Includes runtimeType (default "auto") and customArgs (default "").
- TraefikLabelBuilder (app/runtime/TraefikLabelBuilder.java): generates Traefik Docker labels for path-based (/{envSlug}/{appSlug}/) or subdomain-based ({appSlug}-{envSlug}.{domain}) routing. Supports strip-prefix and SSL offloading toggles. Per-replica identity labels: cameleer.replica (index), cameleer.generation (8-char deployment UUID prefix; pin Prometheus/Grafana deploy boundaries with this), cameleer.instance-id ({envSlug}-{appSlug}-{replicaIndex}-{generation}). Traefik router/service keys deliberately omit the generation so load balancing spans old + new replicas during a blue/green overlap. When ResolvedContainerConfig.externalRouting() is false (UI: Resources → External Routing, default true), the builder emits ONLY the identity labels (managed-by, cameleer.*) and skips every traefik.* label: the container stays on cameleer-traefik and the per-env network (so sibling containers can still reach it via Docker DNS) but is invisible to Traefik. The tls.certresolver label is emitted only when CAMELEER_SERVER_RUNTIME_CERTRESOLVER is set to a non-blank resolver name (matching a resolver configured in the Traefik static config). When unset (dev installs backed by a static TLS store) only tls=true is emitted and Traefik serves the default cert from the TLS store.
- PrometheusLabelBuilder (app/runtime/PrometheusLabelBuilder.java): generates Prometheus docker_sd_configs labels per resolved runtime type: Spring Boot /actuator/prometheus:8081, Quarkus/native /q/metrics:9000, plain Java /metrics:9464. Labels merged into container metadata alongside Traefik labels at deploy time.
- DockerNetworkManager (app/runtime/DockerNetworkManager.java): manages two Docker network tiers:
  - cameleer-traefik: shared network; Traefik, server, and all app containers attach here. Server joined via docker-compose with cameleer-server DNS alias.
  - cameleer-env-{slug}: per-environment isolated network; containers in the same environment discover each other via Docker DNS. In SaaS mode, env networks are tenant-scoped: cameleer-env-{tenantId}-{envSlug} (overloaded envNetworkName(tenantId, envSlug) method) to prevent cross-tenant collisions when multiple tenants have identically-named environments.
- DockerEventMonitor (app/runtime/DockerEventMonitor.java): persistent Docker event stream listener for containers with the managed-by=cameleer-server label. Detects die/oom/start/stop events and updates deployment replica states. Periodic reconciliation (@Scheduled every 30s) inspects actual container state and corrects deployment status mismatches (fixes stale DEGRADED with all replicas healthy).
- DeploymentProgress (ui/src/components/DeploymentProgress.tsx): UI step indicator showing 7 deploy stages with amber active/green completed styling.
- ContainerLogForwarder (app/runtime/ContainerLogForwarder.java): streams Docker container stdout/stderr to the ClickHouse logs table with source='container'. Uses docker logs --follow per container, batches lines every 2s or 50 lines. Parses Docker timestamp prefix, infers log level via regex. DeploymentExecutor starts capture after each replica launches with the replica's instanceId ({envSlug}-{appSlug}-{replicaIndex}-{generation}); DockerEventMonitor stops capture on die/oom. 60-second max capture timeout with 30s cleanup scheduler. Thread pool of 10 daemon threads. Container logs use the same instanceId as the agent (set via CAMELEER_AGENT_INSTANCEID env var) for unified log correlation at the instance level. Instance-id changes per deployment, so cross-deploy queries aggregate on application + environment (and optionally replica_index).
- StartupLogPanel (ui/src/components/StartupLogPanel.tsx): collapsible log panel rendered below DeploymentProgress. Queries /api/v1/logs?source=container&application={appSlug}&environment={envSlug}. Auto-polls every 3s while deployment is STARTING; shows green "live" badge during polling, red "stopped" badge on FAILED. Uses useStartupLogs hook and LogViewer (design system).
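The three-layer merge that ConfigMerger performs can be pictured with a minimal sketch. This is not the real implementation: the class name, the use of plain maps instead of ResolvedContainerConfig, the rule that null values do not override, and config keys such as memoryLimitMb are all assumptions made for illustration. Only the layer order and the two defaults come from the description above.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch; the real ConfigMerger returns a ResolvedContainerConfig. */
final class ConfigMergeSketch {
    private ConfigMergeSketch() {}

    /** Applies global -> environment -> app, so the most specific layer wins. */
    static Map<String, Object> resolve(Map<String, Object> globalDefaults,
                                       Map<String, Object> envConfig,
                                       Map<String, Object> appConfig) {
        Map<String, Object> merged = new HashMap<>();
        // Defaults named in the rules text.
        merged.put("runtimeType", "auto");
        merged.put("customArgs", "");
        overlay(merged, globalDefaults);
        overlay(merged, envConfig);
        overlay(merged, appConfig);
        return Map.copyOf(merged);
    }

    /** Assumption: a missing layer or a null value leaves the lower layer untouched. */
    private static void overlay(Map<String, Object> target, Map<String, Object> layer) {
        if (layer == null) {
            return;
        }
        layer.forEach((key, value) -> {
            if (value != null) {
                target.put(key, value);
            }
        });
    }
}
```

Because the function is pure, the same three inputs always resolve to the same configuration, which makes the merge easy to unit test without Docker.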
Container Hardening (issue #152)
DockerRuntimeOrchestrator.startContainer applies an unconditional hardening contract to BOTH the loader init-container AND the main tenant container (baseHardenedHostConfig() is the shared helper). Java 17 deprecates the SecurityManager for removal (JEP 411), so the JVM is not a security boundary and isolation must live below it. Defaults are fail-closed and have no opt-out:
- cap_drop = every Capability.values() (effectively ALL; docker-java's enum has no ALL constant). Outbound TCP still works (no caps needed); raw sockets, ptrace, mounts, and bind <1024 are denied.
- security_opt: no-new-privileges:true, apparmor=docker-default. Default seccomp profile is applied implicitly when seccomp= is absent.
- read_only rootfs = true.
- pids_limit = 512 (PIDS_LIMIT constant).
- tmpfs mount: /tmp with rw,nosuid,size=256m. No noexec: Netty/tcnative, Snappy, LZ4, Zstd dlopen native libs from /tmp via mmap(PROT_EXEC), which noexec blocks. Issue #153 will add per-app writeableVolumes for stateful tenants (Kafka Streams etc.).
- userns_mode = host:1000:65536 on both loader and main. Container root is never UID 0 on the host; this closes the last open hardening item from issue #152.
Sandboxed runtime auto-detect: at construction the orchestrator calls dockerClient.infoCmd().exec().getRuntimes() and uses runsc (gVisor) when present. Override with cameleer.server.runtime.dockerruntime (e.g. kata to force Kata Containers, or any other registered runtime). Empty/blank = auto. The override always wins over auto-detect. The DockerRuntimeOrchestrator(DockerClient, String) constructor is the canonical entry point; the single-arg constructor exists only as a convenience for tests that don't need an override.
Init-Container Loader Pattern (JAR fetch)
startContainer is now a two-phase op per replica:
- Volume create: cameleer-jars-{containerName} named volume (per-replica, deterministic so cleanup in removeContainer can derive it).
- Loader container: loaderImage (default gitea.siegeln.net/cameleer/cameleer-runtime-loader:latest), name {containerName}-loader, mount the volume RW at /app/jars, env vars ARTIFACT_URL + ARTIFACT_EXPECTED_SIZE. Loader downloads the JAR from the signed URL into the volume and exits 0. Orchestrator blocks on waitContainerCmd().exec(WaitContainerResultCallback).awaitStatusCode(120, SECONDS). Loader container is removed in a finally block; on non-zero exit the volume is also removed and RuntimeException propagates so DeploymentExecutor marks the deployment FAILED.
- Main container: same hardening contract, mount the same volume RO at /app/jars, entrypoint reads /app/jars/app.jar (Spring Boot/Quarkus: -jar /app/jars/app.jar; plain Java: -cp /app/jars/app.jar <MainClass>; native: exec /app/jars/app.jar).
removeContainer(id) derives the volume name from the inspected container name (Docker prefixes it with /) and removes the volume after the container is removed, so blue/green doesn't leak volumes.
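The ordering and cleanup rules of the loader pattern can be sketched without docker-java. The ContainerBackend interface below is a hypothetical stand-in for the Docker calls and is not part of the codebase. The sketch also takes container names where the real removeContainer takes an id. The volume and loader naming, the finally-block cleanup, and the "main is never created on loader failure" rule follow the description above.

```java
/** Hypothetical stand-in for the Docker calls made by the orchestrator. */
interface ContainerBackend {
    void createVolume(String name);
    void removeVolume(String name);
    /** Creates, starts, and awaits the loader; returns its exit code. */
    int runLoader(String loaderName, String volume);
    void removeContainer(String name);
    /** Creates and starts the main container; returns its id. */
    String startMain(String name, String volume);
}

final class TwoPhaseStartSketch {
    private TwoPhaseStartSketch() {}

    static String volumeFor(String containerName) {
        return "cameleer-jars-" + containerName;
    }

    /** Docker reports inspected names with a leading slash. */
    static String volumeForInspectedName(String inspectedName) {
        String name = inspectedName.startsWith("/") ? inspectedName.substring(1) : inspectedName;
        return volumeFor(name);
    }

    static String start(ContainerBackend docker, String containerName) {
        String volume = volumeFor(containerName);
        String loader = containerName + "-loader";
        docker.createVolume(volume);
        boolean loaded = false;
        try {
            int exit = docker.runLoader(loader, volume);
            if (exit != 0) {
                throw new RuntimeException("loader exited with code " + exit);
            }
            loaded = true;
        } finally {
            // The loader is always removed; the volume only when loading failed.
            docker.removeContainer(loader);
            if (!loaded) {
                docker.removeVolume(volume);
            }
        }
        // Only reached when the loader exited 0.
        return docker.startMain(containerName, volume);
    }

    static void remove(ContainerBackend docker, String inspectedName) {
        String name = inspectedName.startsWith("/") ? inspectedName.substring(1) : inspectedName;
        docker.removeContainer(name);
        docker.removeVolume(volumeForInspectedName(inspectedName));
    }
}
```

Keeping the volume name a pure function of the container name is what lets removal work from the inspected name alone, with no extra bookkeeping.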
DeploymentExecutor generates the signed URL via ArtifactDownloadTokenSigner.sign(appVersion.id(), Duration.ofSeconds(artifactTokenTtlSeconds)) and passes appVersion.id(), the URL, appVersion.jarSizeBytes(), and the loader image into ContainerRequest. The host filesystem is no longer involved at deploy time.
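A signed, expiring download URL of this shape can be produced with the JDK's HMAC-SHA256 support. The sketch below is an assumption-heavy illustration, not ArtifactDownloadTokenSigner itself. The payload layout (id, colon, expiry), the Base64url encoding, the String id type, and the explicit clock parameter are all invented here. Only the URL path, the exp and sig query parameters, and the TTL idea come from the text.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import java.time.Duration;
import java.time.Instant;
import java.util.Base64;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

/** Hypothetical sketch of an HMAC-signed artifact URL. */
final class SignedArtifactUrlSketch {
    private final byte[] secret;
    private final String baseUrl;

    SignedArtifactUrlSketch(byte[] secret, String baseUrl) {
        this.secret = secret.clone();
        this.baseUrl = baseUrl;
    }

    /** Builds {baseUrl}/api/v1/artifacts/{id}?exp=...&sig=... */
    String sign(String appVersionId, Instant now, Duration ttl) {
        long exp = now.plus(ttl).getEpochSecond();
        return baseUrl + "/api/v1/artifacts/" + appVersionId
                + "?exp=" + exp + "&sig=" + mac(appVersionId, exp);
    }

    /** Rejects expired links first, then compares signatures in constant time. */
    boolean verify(String appVersionId, long exp, String sig, Instant now) {
        if (now.getEpochSecond() > exp) {
            return false;
        }
        byte[] expected = mac(appVersionId, exp).getBytes(StandardCharsets.US_ASCII);
        return MessageDigest.isEqual(expected, sig.getBytes(StandardCharsets.US_ASCII));
    }

    private String mac(String appVersionId, long exp) {
        try {
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(new SecretKeySpec(secret, "HmacSHA256"));
            // Assumed payload layout: "<id>:<exp>".
            byte[] raw = mac.doFinal((appVersionId + ":" + exp).getBytes(StandardCharsets.UTF_8));
            return Base64.getUrlEncoder().withoutPadding().encodeToString(raw);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("HmacSHA256 unavailable", e);
        }
    }
}
```

Binding the expiry into the signed payload means a client cannot extend a link's lifetime by editing the exp parameter.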
DeploymentExecutor Details
Primary network for app containers is set via CAMELEER_SERVER_RUNTIME_DOCKERNETWORK env var (in SaaS mode: cameleer-tenant-{slug}); apps also connect to cameleer-traefik (routing) and cameleer-env-{tenantId}-{envSlug} (per-environment discovery) as additional networks. Resolves runtimeType: auto to concrete type from AppVersion.detectedRuntimeType at PRE_FLIGHT (fails deployment if unresolvable). Builds Docker entrypoint per runtime type (all JVM types use -javaagent:/app/agent.jar -jar, plain Java uses -cp with main class, native runs binary directly). Sets per-replica CAMELEER_AGENT_INSTANCEID env var to {envSlug}-{appSlug}-{replicaIndex}-{generation} so container logs and agent logs share the same instance identity. Sets CAMELEER_AGENT_* env vars from ResolvedContainerConfig (routeControlEnabled, replayEnabled, health port). These are startup-only agent properties — changing them requires redeployment.
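How the start command varies by runtime type can be sketched as a small pure function. This is an illustration only: the class and method names are hypothetical, and the exact string layout (including where custom arguments land) is an assumption. The agent path, the JAR path, and the -jar / -cp / exec split follow this document.

```java
/** Hypothetical sketch of per-runtime start commands (intended for use with sh -c). */
final class EntrypointSketch {
    enum RuntimeType { AUTO, SPRING_BOOT, QUARKUS, PLAIN_JAVA, NATIVE }

    static final String JAR = "/app/jars/app.jar";
    static final String AGENT = "-javaagent:/app/agent.jar";

    private EntrypointSketch() {}

    static String command(RuntimeType type, String mainClass, String customArgs) {
        String base = switch (type) {
            case SPRING_BOOT, QUARKUS -> "java " + AGENT + " -jar " + JAR;
            case PLAIN_JAVA -> {
                if (mainClass == null || mainClass.isBlank()) {
                    throw new IllegalArgumentException("plain Java needs a main class");
                }
                yield "java " + AGENT + " -cp " + JAR + " " + mainClass;
            }
            case NATIVE -> "exec " + JAR;
            // AUTO must already be resolved to a concrete type at PRE_FLIGHT.
            case AUTO -> throw new IllegalStateException("runtimeType auto was not resolved");
        };
        // Assumption: custom arguments were validated upstream and are appended verbatim.
        if (customArgs == null || customArgs.isBlank()) {
            return base;
        }
        return base + " " + customArgs.trim();
    }
}
```

For example, `EntrypointSketch.command(RuntimeType.PLAIN_JAVA, "com.example.Main", "")` yields a `-cp` invocation with the main class appended, while `NATIVE` skips the JVM entirely.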
Container naming — {tenantId}-{envSlug}-{appSlug}-{replicaIndex}-{generation}, where generation is the first 8 characters of the deployment UUID. The generation suffix lets old + new replicas coexist during a blue/green swap (deterministic names without a generation used to 409). All lookups across the executor, DockerEventMonitor, and ContainerLogForwarder key on container id, not name — the name is operator-visibility only.
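The naming scheme is simple enough to show directly. The helper class below is hypothetical, and the tenant, environment, and app values in the usage note are made up. The name patterns and the 8-character generation prefix are taken from the text.

```java
import java.util.UUID;

/** Hypothetical helper mirroring the documented name patterns. */
final class ReplicaNamesSketch {
    private ReplicaNamesSketch() {}

    /** First 8 characters of the deployment UUID. */
    static String generation(UUID deploymentId) {
        return deploymentId.toString().substring(0, 8);
    }

    /** {envSlug}-{appSlug}-{replicaIndex}-{generation} */
    static String instanceId(String envSlug, String appSlug, int replicaIndex, UUID deploymentId) {
        return envSlug + "-" + appSlug + "-" + replicaIndex + "-" + generation(deploymentId);
    }

    /** {tenantId}-{envSlug}-{appSlug}-{replicaIndex}-{generation} */
    static String containerName(String tenantId, String envSlug, String appSlug,
                                int replicaIndex, UUID deploymentId) {
        return tenantId + "-" + instanceId(envSlug, appSlug, replicaIndex, deploymentId);
    }
}
```

With a deployment UUID starting `3f2a9c1e`, replica 0 of app `orders` in environment `dev` for tenant `acme` would be named `acme-dev-orders-0-3f2a9c1e`. A second deployment of the same app gets a different suffix, so both containers can exist during the overlap.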
Strategy dispatch — DeploymentStrategy.fromWire(config.deploymentStrategy()) branches the executor. Unknown values fall back to BLUE_GREEN so misconfiguration never throws at runtime.
- Blue/green (default): start all N new replicas → wait for ALL healthy → stop the previous deployment. Resource peak ≈ 2× replicas for the health-check window. Partial health aborts with status FAILED; the previous deployment is preserved untouched (user's safety net).
- Rolling: replace replicas one at a time — start new[i] → wait healthy → stop old[i] → next. Resource peak = replicas + 1. Mid-rollout health failure stops in-flight new containers and aborts; already-replaced old replicas are NOT restored (not reversible) but un-replaced old[i+1..N] keep serving traffic. User redeploys to recover.
Traffic routing is implicit: Traefik labels (cameleer.app, cameleer.environment) are generation-agnostic, so new replicas attract load balancing as soon as they come up healthy — no explicit swap step.
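The fallback behaviour of the strategy dispatch and the resource peaks of the two strategies can be sketched as follows. The enum name and the accepted wire spellings (case-insensitive, hyphen or underscore) are assumptions. The fallback to BLUE_GREEN and the peak figures (about 2× replicas versus replicas + 1) come from the description above.

```java
import java.util.Locale;

/** Hypothetical sketch; not the real DeploymentStrategy enum. */
enum DeploymentStrategySketch {
    BLUE_GREEN,
    ROLLING;

    /** Unknown, blank, or null values fall back to BLUE_GREEN instead of throwing. */
    static DeploymentStrategySketch fromWire(String wire) {
        if (wire == null) {
            return BLUE_GREEN;
        }
        String normalized = wire.trim().toUpperCase(Locale.ROOT).replace('-', '_');
        try {
            return valueOf(normalized);
        } catch (IllegalArgumentException e) {
            return BLUE_GREEN;
        }
    }

    /** Approximate number of containers alive at the worst moment of a deploy. */
    int peakContainers(int replicas) {
        return this == ROLLING ? replicas + 1 : 2 * replicas;
    }
}
```

For three replicas this gives a peak of six containers under blue/green and four under rolling, which is the trade-off between a clean rollback path and a smaller resource footprint.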
Deployment Status Model
| Status | Meaning |
|---|---|
| STOPPED | Intentionally stopped or initial state |
| STARTING | Deploy in progress |
| RUNNING | All replicas healthy and serving |
| DEGRADED | Post-deploy: a replica died after the deploy was marked RUNNING. Set by DockerEventMonitor reconciliation, never by DeploymentExecutor directly. |
| STOPPING | Graceful shutdown in progress |
| FAILED | Terminal failure (pre-flight, health check, or crash). Partial-healthy deploys now mark FAILED; DEGRADED is reserved for post-deploy drift. |
Deploy stages (DeployStage): PRE_FLIGHT -> PULL_IMAGE -> CREATE_NETWORK -> START_REPLICAS -> HEALTH_CHECK -> SWAP_TRAFFIC -> COMPLETE (or FAILED at any stage). Rolling reuses the same stage labels inside the per-replica loop; the UI progress bar shows the most recent stage.
Deployment retention: DeploymentService.createDeployment() deletes FAILED deployments for the same app+environment before creating a new one, preventing failed-attempt buildup. STOPPED deployments are preserved as restorable checkpoints — the UI Checkpoints disclosure lists every deployment with a non-null deployed_config_snapshot (RUNNING, DEGRADED, STOPPED) minus the current one.
JAR Management
- Retention policy per environment: configurable maximum number of JAR versions to keep. Older JARs are deleted automatically.
- Nightly cleanup job (JarRetentionJob, Spring @Scheduled 03:00): purges JARs exceeding the retention limit and removes orphaned files not referenced by any app version. Skips versions currently deployed.
- Storage abstraction: ArtifactStore (in cameleer-server-core/storage) is the only path that touches JAR bytes. FilesystemArtifactStore writes under cameleer.server.runtime.jarstoragepath (default /data/jars); the orchestrator never reads the host filesystem at deploy time.
- Loader-fetch at deploy time: tenant containers no longer bind-mount JARs from the host. The loader init-container streams the JAR via a signed URL (HMAC-SHA256, TTL cameleer.server.runtime.artifacttokenttlseconds, default 600s) into a per-replica named volume; main mounts that volume RO. This works without host-path access and is the single path supported in Docker-in-Docker SaaS deployments.
Runtime Type Detection
The server detects the app framework from uploaded JARs and builds Docker entrypoints. The agent shaded JAR bundles the log appender, so no separate cameleer-log-appender.jar or PropertiesLauncher is needed:
- Detection (RuntimeDetector): runs at JAR upload time. Checks ZIP magic bytes (non-ZIP = native binary), then probes META-INF/MANIFEST.MF Main-Class: Spring Boot loader prefix -> spring-boot, Quarkus entry point -> quarkus, other Main-Class -> plain-java (extracts class name). Results stored on AppVersion (detected_runtime_type, detected_main_class).
- Runtime types (RuntimeType enum): AUTO, SPRING_BOOT, QUARKUS, PLAIN_JAVA, NATIVE. Configurable per app/environment via containerConfig.runtimeType (default "auto").
- Entrypoint per type: Spring Boot and Quarkus use java -javaagent:/app/agent.jar -jar /app/jars/app.jar. Plain Java uses -cp with an explicit main class instead of -jar. Native runs the binary directly.
- Custom arguments (containerConfig.customArgs): freeform string appended to the start command. Validated against a strict pattern to prevent shell injection (entrypoint uses sh -c).
- AUTO resolution: at deploy time (PRE_FLIGHT), "auto" resolves to the detected type from AppVersion. Fails deployment if detection was unsuccessful; the user must set the type explicitly.
- UI: Resources tab shows Runtime Type dropdown (with detection hint from latest uploaded version) and Custom Arguments text field.
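The detection order described in this section (magic bytes first, then the manifest's Main-Class) can be sketched with the JDK's zip and jar classes. This is not RuntimeDetector. The class name, the UNDETECTED outcome, and the exact Main-Class strings used to recognise Spring Boot and Quarkus are assumptions for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.jar.Attributes;
import java.util.jar.JarFile;
import java.util.jar.Manifest;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

/** Hypothetical sketch of upload-time runtime detection. */
final class RuntimeDetectorSketch {
    enum Kind { SPRING_BOOT, QUARKUS, PLAIN_JAVA, NATIVE, UNDETECTED }

    record Detection(Kind kind, String mainClass) {}

    private RuntimeDetectorSketch() {}

    /** ZIP local-file-header magic: 'P' 'K' 0x03 0x04. */
    static boolean isZip(byte[] bytes) {
        return bytes.length >= 4
                && bytes[0] == 0x50 && bytes[1] == 0x4B && bytes[2] == 0x03 && bytes[3] == 0x04;
    }

    /** Assumed markers: Spring Boot loader package prefix and Quarkus runner entry point. */
    static Detection classify(String mainClass) {
        if (mainClass == null || mainClass.isBlank()) {
            return new Detection(Kind.UNDETECTED, null);
        }
        if (mainClass.startsWith("org.springframework.boot.loader.")) {
            return new Detection(Kind.SPRING_BOOT, null);
        }
        if (mainClass.equals("io.quarkus.bootstrap.runner.QuarkusEntryPoint")) {
            return new Detection(Kind.QUARKUS, null);
        }
        return new Detection(Kind.PLAIN_JAVA, mainClass);
    }

    static Detection detect(byte[] artifact) {
        if (!isZip(artifact)) {
            return new Detection(Kind.NATIVE, null); // non-ZIP = native binary
        }
        try (ZipInputStream zip = new ZipInputStream(new ByteArrayInputStream(artifact))) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                if (JarFile.MANIFEST_NAME.equals(entry.getName())) {
                    Manifest manifest = new Manifest(zip);
                    return classify(manifest.getMainAttributes().getValue(Attributes.Name.MAIN_CLASS));
                }
            }
            return new Detection(Kind.UNDETECTED, null);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

An UNDETECTED result corresponds to the case where an "auto" deployment would fail at PRE_FLIGHT and the runtime type has to be set by hand.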
SaaS Multi-Tenant Network Isolation
In SaaS mode, each tenant's server and its deployed apps are isolated at the Docker network level:
- Tenant network (cameleer-tenant-{slug}): primary internal bridge for all of a tenant's containers. Set as CAMELEER_SERVER_RUNTIME_DOCKERNETWORK for the tenant's server instance. Tenant A's apps cannot reach tenant B's apps.
- Shared services network: server also connects to the shared infrastructure network (PostgreSQL, ClickHouse, Logto) and cameleer-traefik for HTTP routing.
- Tenant-scoped environment networks (cameleer-env-{tenantId}-{envSlug}): per-environment discovery is scoped per tenant, so alpha-corp's "dev" environment network is separate from beta-corp's "dev" environment network.
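The network naming that keeps tenants apart can be written out as a few pure functions. The class name and tenantNetworkName are hypothetical. The overloaded envNetworkName(tenantId, envSlug) and the name patterns are taken from this document.

```java
/** Hypothetical helper mirroring the documented network name patterns. */
final class NetworkNamesSketch {
    static final String TRAEFIK_NETWORK = "cameleer-traefik";

    private NetworkNamesSketch() {}

    /** Primary per-tenant bridge: cameleer-tenant-{slug}. */
    static String tenantNetworkName(String tenantSlug) {
        return "cameleer-tenant-" + tenantSlug;
    }

    /** Non-SaaS form: cameleer-env-{slug}. */
    static String envNetworkName(String envSlug) {
        return "cameleer-env-" + envSlug;
    }

    /** SaaS form: cameleer-env-{tenantId}-{envSlug}, so equal env slugs never collide. */
    static String envNetworkName(String tenantId, String envSlug) {
        return "cameleer-env-" + tenantId + "-" + envSlug;
    }
}
```

With this scheme, `envNetworkName("alpha-corp", "dev")` and `envNetworkName("beta-corp", "dev")` resolve to two distinct Docker networks even though both tenants call their environment "dev".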
nginx / Reverse Proxy
client_max_body_size 200m is required in the nginx config to allow JAR uploads up to 200 MB. Without this, large JAR uploads return 413.