---
paths:
- "cameleer-server-app/**/runtime/**"
- "cameleer-server-core/**/runtime/**"
- "deploy/**"
- "docker-compose*.yml"
- "Dockerfile"
- "docker-entrypoint.sh"
---
# Docker Orchestration
When deployed via the cameleer-saas platform, this server orchestrates customer app containers using Docker. Key components:
- **ConfigMerger** (`core/runtime/ConfigMerger.java`) — pure function: resolve(globalDefaults, envConfig, appConfig) -> ResolvedContainerConfig. Three-layer merge: global (application.yml) -> environment (defaultContainerConfig JSONB) -> app (containerConfig JSONB). Includes `runtimeType` (default `"auto"`) and `customArgs` (default `""`). A merge sketch follows this list.
- **TraefikLabelBuilder** (`app/runtime/TraefikLabelBuilder.java`) — generates Traefik Docker labels for path-based (`/{envSlug}/{appSlug}/`) or subdomain-based (`{appSlug}-{envSlug}.{domain}`) routing. Supports strip-prefix and SSL offloading toggles. Per-replica identity labels: `cameleer.replica` (index), `cameleer.generation` (8-char deployment UUID prefix — pin Prometheus/Grafana deploy boundaries with this), `cameleer.instance-id` (`{envSlug}-{appSlug}-{replicaIndex}-{generation}`). Traefik router/service keys deliberately omit the generation so load balancing spans old + new replicas during a blue/green overlap. When `ResolvedContainerConfig.externalRouting()` is `false` (UI: Resources → External Routing, default `true`), the builder emits ONLY the identity labels (`managed-by`, `cameleer.*`) and skips every `traefik.*` label — the container stays on `cameleer-traefik` and the per-env network (so sibling containers can still reach it via Docker DNS) but is invisible to Traefik. The `tls.certresolver` label is emitted only when `CAMELEER_SERVER_RUNTIME_CERTRESOLVER` is set to a non-blank resolver name (matching a resolver configured in the Traefik static config). When unset (dev installs backed by a static TLS store) only `tls=true` is emitted and Traefik serves the default cert from the TLS store. An identity-label sketch follows this list.
- **PrometheusLabelBuilder** (`app/runtime/PrometheusLabelBuilder.java`) — generates Prometheus `docker_sd_configs` labels per resolved runtime type: Spring Boot `/actuator/prometheus:8081`, Quarkus/native `/q/metrics:9000`, plain Java `/metrics:9464`. Labels merged into container metadata alongside Traefik labels at deploy time.
- **DockerNetworkManager** (`app/runtime/DockerNetworkManager.java`) — manages two Docker network tiers:
- `cameleer-traefik` — shared network; Traefik, server, and all app containers attach here. Server joined via docker-compose with `cameleer-server` DNS alias.
- `cameleer-env-{slug}` — per-environment isolated network; containers in the same environment discover each other via Docker DNS. In SaaS mode, env networks are tenant-scoped: `cameleer-env-{tenantId}-{envSlug}` (overloaded `envNetworkName(tenantId, envSlug)` method) to prevent cross-tenant collisions when multiple tenants have identically-named environments.
- **DockerEventMonitor** (`app/runtime/DockerEventMonitor.java`) — persistent Docker event stream listener for containers with `managed-by=cameleer-server` label. Detects die/oom/start/stop events and updates deployment replica states. Periodic reconciliation (@Scheduled every 30s) inspects actual container state and corrects deployment status mismatches (fixes stale DEGRADED with all replicas healthy).
- **DeploymentProgress** (`ui/src/components/DeploymentProgress.tsx`) — UI step indicator showing 7 deploy stages with amber active/green completed styling.
- **ContainerLogForwarder** (`app/runtime/ContainerLogForwarder.java`) — streams Docker container stdout/stderr to ClickHouse `logs` table with `source='container'`. Uses `docker logs --follow` per container, batches lines every 2s or 50 lines. Parses Docker timestamp prefix, infers log level via regex. `DeploymentExecutor` starts capture after each replica launches with the replica's `instanceId` (`{envSlug}-{appSlug}-{replicaIndex}-{generation}`); `DockerEventMonitor` stops capture on die/oom. 60-second max capture timeout with 30s cleanup scheduler. Thread pool of 10 daemon threads. Container logs use the same `instanceId` as the agent (set via `CAMELEER_AGENT_INSTANCEID` env var) for unified log correlation at the instance level. Instance-id changes per deployment — cross-deploy queries aggregate on `application + environment` (and optionally `replica_index`).
- **StartupLogPanel** (`ui/src/components/StartupLogPanel.tsx`) — collapsible log panel rendered below `DeploymentProgress`. Queries `/api/v1/logs?source=container&application={appSlug}&environment={envSlug}`. Auto-polls every 3s while deployment is STARTING; shows green "live" badge during polling, red "stopped" badge on FAILED. Uses `useStartupLogs` hook and `LogViewer` (design system).
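
Two sketches of the pure logic above. First, the three-layer merge: a minimal sketch assuming a map-based config shape (the real `ResolvedContainerConfig` is a typed record; the field handling here is illustrative):

```java
import java.util.HashMap;
import java.util.Map;

/** Illustrative three-layer merge: later layers win, absent keys fall through. */
final class ConfigMergerSketch {

    static Map<String, Object> resolve(Map<String, Object> globalDefaults,
                                       Map<String, Object> envConfig,
                                       Map<String, Object> appConfig) {
        Map<String, Object> merged = new HashMap<>(globalDefaults); // layer 1: application.yml
        merged.putAll(envConfig);                                   // layer 2: env defaultContainerConfig JSONB
        merged.putAll(appConfig);                                   // layer 3: app containerConfig JSONB
        merged.putIfAbsent("runtimeType", "auto");                  // documented defaults
        merged.putIfAbsent("customArgs", "");
        return merged;
    }
}
```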
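Second, the per-replica identity labels: a hedged sketch (label keys follow the documented values; the method shape is an assumption, and the `traefik.*` routing labels added when `externalRouting()` is `true` are omitted):

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class IdentityLabelSketch {

    // Identity labels emitted for every replica, routed or not.
    static Map<String, String> identityLabels(String envSlug, String appSlug,
                                              int replicaIndex, String deploymentUuid) {
        String generation = deploymentUuid.substring(0, 8); // 8-char deployment UUID prefix
        Map<String, String> labels = new LinkedHashMap<>();
        labels.put("managed-by", "cameleer-server");
        labels.put("cameleer.app", appSlug);
        labels.put("cameleer.environment", envSlug);
        labels.put("cameleer.replica", String.valueOf(replicaIndex));
        labels.put("cameleer.generation", generation);
        labels.put("cameleer.instance-id",
            envSlug + "-" + appSlug + "-" + replicaIndex + "-" + generation);
        return labels;
    }
}
```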
## Container Hardening (issue #152)
`DockerRuntimeOrchestrator.startContainer` applies an unconditional hardening contract to BOTH the loader init-container AND the main tenant container (`baseHardenedHostConfig()` is the shared helper, sketched after this list). The SecurityManager is deprecated for removal since Java 17, so the JVM is not a security boundary and isolation must live below it. Defaults are fail-closed and have no opt-out:
- `cap_drop` = every `Capability.values()` (effectively ALL — docker-java's enum has no `ALL` constant). Outbound TCP still works (no caps needed); raw sockets, ptrace, mounts, and binding to ports <1024 are denied.
- `security_opt`: `no-new-privileges:true`, `apparmor=docker-default`. Default seccomp profile is applied implicitly when `seccomp=` is absent.
- `read_only` rootfs = true.
- `pids_limit` = 512 (`PIDS_LIMIT` constant).
- `tmpfs` mount: `/tmp` with `rw,nosuid,size=256m`. **No `noexec`** — Netty/tcnative, Snappy, LZ4, Zstd dlopen native libs from `/tmp` via `mmap(PROT_EXEC)` which `noexec` blocks. Issue #153 will add per-app `writeableVolumes` for stateful tenants (Kafka Streams etc.).
- `userns_mode` = `host:1000:65536` on both loader and main. Container root is never UID 0 on the host — closes the last open hardening item from issue #152.
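
A minimal sketch of the shared helper, assuming docker-java's fluent `HostConfig` builder (the wiring details may differ in the real `DockerRuntimeOrchestrator`):

```java
import com.github.dockerjava.api.model.Capability;
import com.github.dockerjava.api.model.HostConfig;

import java.util.List;
import java.util.Map;

final class HardeningSketch {
    static final long PIDS_LIMIT = 512L;

    /** Fail-closed defaults shared by the loader and the main container. */
    static HostConfig baseHardenedHostConfig() {
        return HostConfig.newHostConfig()
            .withCapDrop(Capability.values())        // no ALL constant: drop every enum value
            .withSecurityOpts(List.of(
                "no-new-privileges:true",
                "apparmor=docker-default"))          // default seccomp applies implicitly
            .withReadonlyRootfs(true)
            .withPidsLimit(PIDS_LIMIT)
            .withTmpFs(Map.of("/tmp", "rw,nosuid,size=256m")) // no noexec: native libs mmap from /tmp
            .withUsernsMode("host:1000:65536");      // container root maps into a host subordinate range
    }
}
```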
**Sandboxed runtime auto-detect**: at construction the orchestrator calls `dockerClient.infoCmd().exec().getRuntimes()` and uses `runsc` (gVisor) when present. Override with `cameleer.server.runtime.dockerruntime` (e.g. `kata` to force Kata Containers, or any other registered runtime). Empty/blank = auto. The override always wins over auto-detect. The `DockerRuntimeOrchestrator(DockerClient, String)` constructor is the canonical entry point; the single-arg constructor exists only as a convenience for tests that don't need an override.
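
A sketch of the resolution order, using the `infoCmd()` call named above (the helper shape is illustrative):

```java
import com.github.dockerjava.api.DockerClient;

import java.util.Map;

final class RuntimeResolverSketch {

    /** Explicit override always wins; otherwise prefer gVisor's runsc if the daemon registers it. */
    static String resolveRuntime(DockerClient docker, String override) {
        if (override != null && !override.isBlank()) {
            return override; // cameleer.server.runtime.dockerruntime, e.g. "kata"
        }
        Map<String, ?> runtimes = docker.infoCmd().exec().getRuntimes();
        return runtimes != null && runtimes.containsKey("runsc")
            ? "runsc"  // sandboxed runtime detected
            : null;    // null = use the Docker daemon's default runtime
    }
}
```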
## Init-Container Loader Pattern (JAR fetch)
`startContainer` is now a two-phase operation per replica — a loader phase and a main phase, preceded by a volume create (sketched below):
1. **Volume create** — per-replica named volume `cameleer-jars-{containerName}` (deterministic, so cleanup in `removeContainer` can derive it).
2. **Loader container** — `loaderImage` (default `gitea.siegeln.net/cameleer/cameleer-runtime-loader:latest`), name `{containerName}-loader`, mount the volume **RW at `/app/jars`**, env vars `ARTIFACT_URL` + `ARTIFACT_EXPECTED_SIZE`. Loader downloads the JAR from the signed URL into the volume and exits 0. Orchestrator blocks on `waitContainerCmd().exec(WaitContainerResultCallback).awaitStatusCode(120, SECONDS)`. Loader container is removed in a `finally` block; on non-zero exit the volume is also removed and `RuntimeException` propagates so `DeploymentExecutor` marks the deployment FAILED.
3. **Main container** — same hardening contract, mount the same volume **RO at `/app/jars`**, entrypoint reads `/app/jars/app.jar` (Spring Boot/Quarkus: `-jar /app/jars/app.jar`; plain Java: `-cp /app/jars/app.jar <MainClass>`; native: `exec /app/jars/app.jar`).
`removeContainer(id)` derives the volume name from the inspected container name (Docker prefixes it with `/`) and removes the volume after the container is removed — blue/green doesn't leak volumes.
`DeploymentExecutor` generates the signed URL via `ArtifactDownloadTokenSigner.sign(appVersion.id(), Duration.ofSeconds(artifactTokenTtlSeconds))` and passes `appVersion.id()`, the URL, `appVersion.jarSizeBytes()`, and the loader image into `ContainerRequest`. The host filesystem is no longer involved at deploy time.
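
A condensed sketch of the flow with docker-java (the real code wraps loader removal in a `finally` block and threads `ContainerRequest` through; the parameter shape here is illustrative):

```java
import com.github.dockerjava.api.DockerClient;
import com.github.dockerjava.api.command.WaitContainerResultCallback;
import com.github.dockerjava.api.model.AccessMode;
import com.github.dockerjava.api.model.Bind;
import com.github.dockerjava.api.model.HostConfig;
import com.github.dockerjava.api.model.Volume;

import java.util.concurrent.TimeUnit;

final class LoaderFlowSketch {

    /** Returns the main container id; aborts (and cleans up) if the loader fails. */
    static String startContainer(DockerClient docker, String containerName, String mainImage,
                                 String loaderImage, String artifactUrl, long expectedSize) {
        // Phase 0: per-replica named volume, deterministic so removeContainer can derive it later.
        String volumeName = "cameleer-jars-" + containerName;
        docker.createVolumeCmd().withName(volumeName).exec();

        // Phase 1: loader downloads the JAR into the volume (RW mount), then exits.
        String loaderId = docker.createContainerCmd(loaderImage)
            .withName(containerName + "-loader")
            .withEnv("ARTIFACT_URL=" + artifactUrl, "ARTIFACT_EXPECTED_SIZE=" + expectedSize)
            .withHostConfig(baseHardenedHostConfig()
                .withBinds(new Bind(volumeName, new Volume("/app/jars"), AccessMode.rw)))
            .exec().getId();
        docker.startContainerCmd(loaderId).exec();
        int exit = docker.waitContainerCmd(loaderId)
            .exec(new WaitContainerResultCallback())
            .awaitStatusCode(120, TimeUnit.SECONDS);
        docker.removeContainerCmd(loaderId).exec();    // a finally block in the real code
        if (exit != 0) {
            docker.removeVolumeCmd(volumeName).exec(); // abort: main is never created
            throw new RuntimeException("loader exited with " + exit);
        }

        // Phase 2: main container, same hardening contract, same volume mounted read-only.
        String mainId = docker.createContainerCmd(mainImage)
            .withName(containerName)
            .withHostConfig(baseHardenedHostConfig()
                .withBinds(new Bind(volumeName, new Volume("/app/jars"), AccessMode.ro)))
            .exec().getId();
        docker.startContainerCmd(mainId).exec();
        return mainId;
    }

    private static HostConfig baseHardenedHostConfig() {
        return HostConfig.newHostConfig(); // stub: see the hardening sketch above
    }
}
```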
**Loader → server reachability**: the loader container hits the Cameleer server over HTTP from inside its
own Docker network. The signed URL is built from `cameleer.server.runtime.artifactbaseurl` (preferred), falling
back to `cameleer.server.runtime.serverurl`, falling back to `http://cameleer-server:8081`. The default works
in SaaS mode because `DockerNetworkManager` adds `cameleer-traefik` as an additional network for tenant
containers, and the server is reachable on that network via the `cameleer-server` DNS alias. For non-SaaS
topologies (server on a different network than tenants), set `CAMELEER_SERVER_RUNTIME_ARTIFACTBASEURL`
explicitly to a URL the loader can reach.
## DeploymentExecutor Details
Primary network for app containers is set via `CAMELEER_SERVER_RUNTIME_DOCKERNETWORK` env var (in SaaS mode: `cameleer-tenant-{slug}`); apps also connect to `cameleer-traefik` (routing) and `cameleer-env-{tenantId}-{envSlug}` (per-environment discovery) as additional networks. Resolves `runtimeType: auto` to concrete type from `AppVersion.detectedRuntimeType` at PRE_FLIGHT (fails deployment if unresolvable). Builds Docker entrypoint per runtime type (all JVM types use `-javaagent:/app/agent.jar -jar`, plain Java uses `-cp` with main class, native runs binary directly). Sets per-replica `CAMELEER_AGENT_INSTANCEID` env var to `{envSlug}-{appSlug}-{replicaIndex}-{generation}` so container logs and agent logs share the same instance identity. Sets `CAMELEER_AGENT_*` env vars from `ResolvedContainerConfig` (routeControlEnabled, replayEnabled, health port). These are startup-only agent properties — changing them requires redeployment.
**Container naming** — `{tenantId}-{envSlug}-{appSlug}-{replicaIndex}-{generation}`, where `generation` is the first 8 characters of the deployment UUID. The generation suffix lets old + new replicas coexist during a blue/green swap (deterministic names without a generation previously failed with 409 name conflicts). All lookups across the executor, `DockerEventMonitor`, and `ContainerLogForwarder` key on container **id**, not name — the name is operator-visibility only.
**Strategy dispatch** — `DeploymentStrategy.fromWire(config.deploymentStrategy())` branches the executor. Unknown values fall back to BLUE_GREEN so misconfiguration never throws at runtime (a decoding sketch follows below).
- **Blue/green** (default): start all N new replicas → wait for ALL healthy → stop the previous deployment. Resource peak ≈ 2× replicas for the health-check window. Partial health aborts with status FAILED; the previous deployment is preserved untouched (user's safety net).
- **Rolling**: replace replicas one at a time — start new[i] → wait healthy → stop old[i] → next. Resource peak = replicas + 1. Mid-rollout health failure stops in-flight new containers and aborts; already-replaced old replicas are NOT restored (not reversible) but un-replaced old[i+1..N] keep serving traffic. User redeploys to recover.
Traffic routing is implicit: Traefik labels (`cameleer.app`, `cameleer.environment`) are generation-agnostic, so new replicas attract load balancing as soon as they come up healthy — no explicit swap step.
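
A sketch of the fail-safe decoding, assuming a plain enum and hyphenated wire values (both assumptions; the real `DeploymentStrategy` may differ):

```java
import java.util.Locale;

enum DeploymentStrategySketch {
    BLUE_GREEN, ROLLING;

    /** Unknown or missing wire values fall back to BLUE_GREEN so misconfiguration never throws. */
    static DeploymentStrategySketch fromWire(String wire) {
        if (wire == null || wire.isBlank()) {
            return BLUE_GREEN;
        }
        try {
            // Assumption: wire values look like "blue-green" / "rolling".
            return valueOf(wire.trim().toUpperCase(Locale.ROOT).replace('-', '_'));
        } catch (IllegalArgumentException e) {
            return BLUE_GREEN;
        }
    }
}
```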
## Deployment Status Model
| Status | Meaning |
|--------|---------|
| `STOPPED` | Intentionally stopped or initial state |
| `STARTING` | Deploy in progress |
| `RUNNING` | All replicas healthy and serving |
| `DEGRADED` | Post-deploy: a replica died after the deploy was marked RUNNING. Set by `DockerEventMonitor` reconciliation, never by `DeploymentExecutor` directly. |
| `STOPPING` | Graceful shutdown in progress |
| `FAILED` | Terminal failure (pre-flight, health check, or crash). Partial-healthy deploys now mark FAILED — DEGRADED is reserved for post-deploy drift. |
**Deploy stages** (`DeployStage`): PRE_FLIGHT -> PULL_IMAGE -> CREATE_NETWORK -> START_REPLICAS -> HEALTH_CHECK -> SWAP_TRAFFIC -> COMPLETE (or FAILED at any stage). Rolling reuses the same stage labels inside the per-replica loop; the UI progress bar shows the most recent stage.
**Deployment retention**: `DeploymentService.createDeployment()` deletes FAILED deployments for the same app+environment before creating a new one, preventing failed-attempt buildup. STOPPED deployments are preserved as restorable checkpoints — the UI Checkpoints disclosure lists every deployment with a non-null `deployed_config_snapshot` (RUNNING, DEGRADED, STOPPED) minus the current one.
## JAR Management
- **Retention policy** per environment: configurable maximum number of JAR versions to keep. Older JARs are deleted automatically.
- **Nightly cleanup job** (`JarRetentionJob`, Spring `@Scheduled` 03:00): purges JARs exceeding the retention limit and removes orphaned files not referenced by any app version. Skips versions currently deployed (wiring sketched after this list).
- **Storage abstraction**: `ArtifactStore` (in `cameleer-server-core/storage`) is the only path that touches JAR bytes. `FilesystemArtifactStore` writes under `cameleer.server.runtime.jarstoragepath` (default `/data/jars`); the orchestrator never reads the host filesystem at deploy time.
- **Loader-fetch at deploy time**: tenant containers no longer bind-mount JARs from the host. The loader init-container streams the JAR via a signed URL (HMAC-SHA256, TTL `cameleer.server.runtime.artifacttokenttlseconds`, default 600s) into a per-replica named volume; main mounts that volume RO. This works without host-path access and is the single path supported in Docker-in-Docker SaaS deployments.
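
A hedged wiring sketch for the nightly job (the purge steps are summarized as comments because the repository API is not specified here):

```java
import org.springframework.scheduling.annotation.Scheduled;
import org.springframework.stereotype.Component;

@Component
class JarRetentionJobSketch {

    /** Spring cron: second minute hour day month weekday, i.e. 03:00 every night. */
    @Scheduled(cron = "0 0 3 * * *")
    void purge() {
        // Per the rules above (repository calls elided):
        // 1. per environment, list versions beyond the configured retention limit
        // 2. skip versions that are currently deployed
        // 3. delete JAR bytes through ArtifactStore
        // 4. remove orphaned files not referenced by any app version
    }
}
```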
## Runtime Type Detection
The server detects the app framework from uploaded JARs and builds Docker entrypoints. The shaded agent JAR bundles the log appender, so no separate `cameleer-log-appender.jar` or `PropertiesLauncher` is needed:
- **Detection** (`RuntimeDetector`): runs at JAR upload time. Checks ZIP magic bytes (non-ZIP = native binary), then probes `META-INF/MANIFEST.MF` Main-Class: Spring Boot loader prefix -> `spring-boot`, Quarkus entry point -> `quarkus`, other Main-Class -> `plain-java` (extracts class name). Results stored on `AppVersion` (`detected_runtime_type`, `detected_main_class`). A detection sketch follows this list.
- **Runtime types** (`RuntimeType` enum): `AUTO`, `SPRING_BOOT`, `QUARKUS`, `PLAIN_JAVA`, `NATIVE`. Configurable per app/environment via `containerConfig.runtimeType` (default `"auto"`).
- **Entrypoint per type**: All JVM types use `java -javaagent:/app/agent.jar -jar app.jar`. Plain Java uses `-cp` with explicit main class instead of `-jar`. Native runs the binary directly.
- **Custom arguments** (`containerConfig.customArgs`): freeform string appended to the start command. Validated against a strict pattern to prevent shell injection (entrypoint uses `sh -c`).
- **AUTO resolution**: at deploy time (PRE_FLIGHT), `"auto"` resolves to the detected type from `AppVersion`. Fails deployment if detection was unsuccessful — user must set type explicitly.
- **UI**: Resources tab shows Runtime Type dropdown (with detection hint from latest uploaded version) and Custom Arguments text field.
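
A minimal detection sketch with `java.util.jar` (the loader-prefix constants are assumptions based on common Spring Boot and Quarkus entry points; the real `RuntimeDetector` may match differently):

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.jar.JarFile;
import java.util.jar.Manifest;

final class RuntimeDetectorSketch {

    static String detect(Path artifact) throws IOException {
        byte[] head = new byte[2];
        try (InputStream in = Files.newInputStream(artifact)) {
            if (in.readNBytes(head, 0, 2) < 2 || head[0] != 'P' || head[1] != 'K') {
                return "native"; // simplified ZIP magic check: non-ZIP = native binary
            }
        }
        try (JarFile jar = new JarFile(artifact.toFile())) {
            Manifest mf = jar.getManifest();
            String mainClass = mf == null ? null
                : mf.getMainAttributes().getValue("Main-Class");
            if (mainClass == null) {
                return null; // no Main-Class: outcome unspecified in the rules above
            }
            if (mainClass.startsWith("org.springframework.boot.loader.")) {
                return "spring-boot"; // assumption: Boot loader prefix
            }
            if (mainClass.startsWith("io.quarkus.")) {
                return "quarkus";     // assumption: Quarkus entry-point prefix
            }
            return "plain-java";      // detected_main_class = mainClass
        }
    }
}
```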
## SaaS Multi-Tenant Network Isolation
In SaaS mode, each tenant's server and its deployed apps are isolated at the Docker network level:
- **Tenant network** (`cameleer-tenant-{slug}`) — primary internal bridge for all of a tenant's containers. Set as `CAMELEER_SERVER_RUNTIME_DOCKERNETWORK` for the tenant's server instance. Tenant A's apps cannot reach tenant B's apps.
- **Shared services network** — server also connects to the shared infrastructure network (PostgreSQL, ClickHouse, Logto) and `cameleer-traefik` for HTTP routing.
- **Tenant-scoped environment networks** (`cameleer-env-{tenantId}-{envSlug}`) — per-environment discovery is scoped per tenant, so `alpha-corp`'s "dev" environment network is separate from `beta-corp`'s "dev" environment network (naming sketched below).
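
The naming scheme, as a trivial sketch:

```java
final class NetworkNamesSketch {

    static String envNetworkName(String envSlug) {
        return "cameleer-env-" + envSlug;                  // single-tenant installs
    }

    static String envNetworkName(String tenantId, String envSlug) {
        return "cameleer-env-" + tenantId + "-" + envSlug; // SaaS: tenant-scoped, no cross-tenant collisions
    }

    static String tenantNetworkName(String tenantSlug) {
        return "cameleer-tenant-" + tenantSlug;            // primary bridge per tenant
    }
}
```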
## nginx / Reverse Proxy
- `client_max_body_size 200m` is required in the nginx config to allow JAR uploads up to 200 MB. Without this, large JAR uploads return 413.