feat(runtime): capture loader logs in failure exceptions; add LoaderHardeningIT regression guard
All checks were successful
Two diagnostics-and-confidence follow-ups to the loader-init-container pattern.

1) DockerRuntimeOrchestrator now captures the loader's last 50 lines of stdout/stderr (capped at 4096 chars, 5s timeout) before the finally-remove and appends them to the thrown RuntimeException as `. loader output: <text>`. Best-effort: log-capture failures are swallowed and never mask the original exit. Closes the visibility gap that turned a simple "wget: Permission denied" into the opaque "Loader exited 1".

2) New LoaderHardeningIT spins up a Testcontainers nginx serving a 1KB fixture, builds the loader image fresh from cameleer-runtime-loader/, and runs it under the exact baseHardenedHostConfig() shape (cap_drop ALL, readonly rootfs, /tmp tmpfs, no-new-privileges, apparmor=docker-default, pids=512) bound to a fresh named volume RW at /app/jars. Asserts exit 0. This would have caught the volume-permission regression in CI.

GenericContainer + OneShotStartupCheckStrategy is used instead of raw docker-java waitContainerCmd because docker-java's unshaded api version in this project's pom and testcontainers' shaded copy disagree on WaitContainerCmd.getCondition(); going through GenericContainer keeps the call inside testcontainers' shaded executor.

Rules doc updated to point at the captured-output behaviour and the IT.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
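For illustration only, the message-building half of item 1 might look roughly like the sketch below. It is not the code from this commit: the class name `LoaderFailureMessage`, the helper `safeCapture`, and the choice to keep the tail of the output when capping at 4096 chars are assumptions. The Docker call itself (`logContainerCmd` with `withTail(50)` and the 5s timeout) is deliberately left out so the example stays self-contained.

```java
// Hypothetical helper, not taken from DockerRuntimeOrchestrator.
final class LoaderFailureMessage {
    // Cap from the commit message; keeping the *tail* of the text is an assumption.
    static final int MAX_LOG_CHARS = 4096;

    // Builds the exception message, appending captured loader output when present.
    static String build(int exitCode, String capturedLogs) {
        String base = "Loader exited " + exitCode;
        if (capturedLogs == null || capturedLogs.isBlank()) {
            return base; // nothing captured: fall back to the plain message
        }
        String text = capturedLogs.strip();
        if (text.length() > MAX_LOG_CHARS) {
            // Keep the last MAX_LOG_CHARS characters, where the failure usually is.
            text = text.substring(text.length() - MAX_LOG_CHARS);
        }
        return base + ". loader output: " + text;
    }

    // Best-effort wrapper: any failure while capturing logs yields an empty string,
    // so it can never mask the original non-zero exit.
    static String safeCapture(java.util.concurrent.Callable<String> capture) {
        try {
            String logs = capture.call();
            return logs == null ? "" : logs;
        } catch (Exception e) {
            return "";
        }
    }
}
```

A caller would then do something like `throw new RuntimeException(LoaderFailureMessage.build(exitCode, LoaderFailureMessage.safeCapture(() -> readLogs(id))))`, where `readLogs` stands in for the real log capture.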
@@ -41,7 +41,7 @@ When deployed via the cameleer-saas platform, this server orchestrates customer
`startContainer` is now a two-phase op per replica:
1. **Volume create** — `cameleer-jars-{containerName}` named volume (per-replica, deterministic so cleanup in `removeContainer` can derive it).
-2. **Loader container** — `loaderImage` (default `gitea.siegeln.net/cameleer/cameleer-runtime-loader:latest`), name `{containerName}-loader`, mount the volume **RW at `/app/jars`**, env vars `ARTIFACT_URL` + `ARTIFACT_EXPECTED_SIZE`. Loader downloads the JAR from the signed URL into the volume and exits 0. Orchestrator blocks on `waitContainerCmd().exec(WaitContainerResultCallback).awaitStatusCode(120, SECONDS)`. Loader container is removed in a `finally` block; on non-zero exit the volume is also removed and `RuntimeException` propagates so `DeploymentExecutor` marks the deployment FAILED.
+2. **Loader container** — `loaderImage` (default `gitea.siegeln.net/cameleer/cameleer-runtime-loader:latest`), name `{containerName}-loader`, mount the volume **RW at `/app/jars`**, env vars `ARTIFACT_URL` + `ARTIFACT_EXPECTED_SIZE`. Loader downloads the JAR from the signed URL into the volume and exits 0. Orchestrator blocks on `waitContainerCmd().exec(WaitContainerResultCallback).awaitStatusCode(120, SECONDS)`. Loader container is removed in a `finally` block; on non-zero exit the volume is also removed and `RuntimeException` propagates so `DeploymentExecutor` marks the deployment FAILED. **Loader logs are captured before removal** (`captureLoaderLogs` — `logContainerCmd` with `withTail(50)`, capped at 4096 chars, 5s timeout) and appended to the thrown `RuntimeException` message as `". loader output: <text>"`. Best-effort: log-capture failures are swallowed and don't mask the original exit. The loader image's Dockerfile pre-creates `/app/jars` owned by `loader:loader` (UID 1000) so the orchestrator's fresh named volume initialises with that ownership — without it the empty volume comes up as `root:root 0755` and wget exits 1 with "Permission denied". `LoaderHardeningIT` is the regression guard.
3. **Main container** — same hardening contract, mount the same volume **RO at `/app/jars`**, entrypoint reads `/app/jars/app.jar` (Spring Boot/Quarkus: `-jar /app/jars/app.jar`; plain Java: `-cp /app/jars/app.jar <MainClass>`; native: `exec /app/jars/app.jar`).
`removeContainer(id)` derives the volume name from the inspected container name (Docker prefixes it with `/`) and removes the volume after the container removes — blue/green doesn't leak volumes.
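As a rough sketch of the naming scheme described in the rules doc above, the deterministic volume and loader names could be derived as follows. The class and method names (`JarVolumeNames`, `forContainer`, `fromInspectedName`, `loaderName`) and the container name `orders-api-0` are hypothetical. Only the `cameleer-jars-{containerName}` and `{containerName}-loader` patterns and the leading `/` on inspected names come from the doc.

```java
// Hypothetical helper illustrating the naming rules; not the project's actual code.
final class JarVolumeNames {
    static final String VOLUME_PREFIX = "cameleer-jars-";

    // Per-replica volume name used at start time: cameleer-jars-{containerName}.
    static String forContainer(String containerName) {
        return VOLUME_PREFIX + containerName;
    }

    // Docker's inspect output reports the name with a leading "/", so strip it
    // before deriving the volume name during removal.
    static String fromInspectedName(String inspectedName) {
        String name = inspectedName.startsWith("/")
                ? inspectedName.substring(1)
                : inspectedName;
        return forContainer(name);
    }

    // Loader container name: {containerName}-loader.
    static String loaderName(String containerName) {
        return containerName + "-loader";
    }
}
```

Because both paths go through `forContainer`, the name computed in `startContainer` and the one derived in `removeContainer` agree without any stored state, which is what lets cleanup find the volume.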