docs(rules): document ArtifactDownloadController + storage abstraction; drop JARDOCKERVOLUME

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
hsiegeln
2026-04-27 16:52:20 +02:00
parent cc076b1923
commit 0ee763ba51
2 changed files with 16 additions and 8 deletions

@@ -27,6 +27,7 @@ These paths intentionally stay flat (no `/environments/{envSlug}` prefix). Every
| `/api/v1/catalog`, `/api/v1/catalog/{applicationId}` | Cross-env discovery is the purpose. Env is an optional filter via `?environment=`. |
| `/api/v1/executions/{execId}`, `/processors/**` | Exchange IDs are globally unique; permalinks. |
| `/api/v1/diagrams/{contentHash}/render`, `POST /api/v1/diagrams/render` | Content-addressed or stateless. |
| `/api/v1/artifacts/{appVersionId}` | Init-container artifact pull. HMAC-signed URL is the auth — no JWT context. |
| `/api/v1/alerts/notifications/{id}/retry` | Notification IDs are globally unique; no env routing needed. |
| `/api/v1/auth/**` | Pre-auth; no env context exists. |
| `/api/v1/health`, `/prometheus`, `/api-docs/**`, `/swagger-ui/**` | Server metadata. |
@@ -122,11 +123,12 @@ Env-scoped read-path controllers (`AlertController`, `AlertRuleController`, `Ale
- `DetailController` — GET `/api/v1/executions/{executionId}` + processor snapshot endpoints.
- `MetricsController` — exposes `/api/v1/metrics` and `/api/v1/prometheus` (server-side Prometheus scrape endpoint).
- `ArtifactDownloadController` — GET `/api/v1/artifacts/{appVersionId}?exp&sig`. HMAC-signed URL is the auth (permitAll'd in `SecurityConfig`); validates via `ArtifactDownloadTokenSigner`. Streams the artifact via `ArtifactStore.get(coords)` with content type `application/java-archive`. Hit by the `cameleer-runtime-loader` init container at deploy time. 401 on bad sig, 404 on missing version, 200 on success.
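
The signed-URL check above can be sketched roughly as follows, using the scheme this document gives for `ArtifactDownloadTokenSigner` (key derived via HMAC from the JWT secret, `sig = base64url-no-pad(HMAC-SHA256(key, "{uuid}:{exp}"))`, constant-time compare). This is a minimal sketch, not the project's class: the name `ArtifactTokenSketch`, the `Token` record, the extra `now` parameter, the epoch-seconds encoding of `exp`, and the expiry check inside `verify` are all assumptions made for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import java.time.Duration;
import java.time.Instant;
import java.util.Base64;
import java.util.UUID;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical sketch; not the project's ArtifactDownloadTokenSigner.
final class ArtifactTokenSketch {

    /** The two query parameters carried by the signed URL. */
    record Token(long exp, String sig) {}

    private final byte[] key;

    ArtifactTokenSketch(String jwtSecret) {
        if (jwtSecret == null || jwtSecret.isBlank()) {
            throw new IllegalArgumentException("secret must not be null or blank");
        }
        // Key derivation: HMAC(secret, "cameleer-artifact-token-v1").
        this.key = hmac(jwtSecret.getBytes(StandardCharsets.UTF_8),
                "cameleer-artifact-token-v1".getBytes(StandardCharsets.UTF_8));
    }

    /** Assumption: exp is an epoch-seconds timestamp; 'now' is injected for testability. */
    Token sign(UUID appVersionId, Duration ttl, Instant now) {
        long exp = now.plus(ttl).getEpochSecond();
        return new Token(exp, signature(appVersionId, exp));
    }

    /** A false result maps to the documented 401. Assumption: expired tokens are rejected here. */
    boolean verify(UUID appVersionId, long exp, String sig, Instant now) {
        if (sig == null || now.getEpochSecond() > exp) {
            return false;
        }
        byte[] expected = signature(appVersionId, exp).getBytes(StandardCharsets.US_ASCII);
        // MessageDigest.isEqual compares in constant time.
        return MessageDigest.isEqual(expected, sig.getBytes(StandardCharsets.US_ASCII));
    }

    // sig = base64url-no-pad(HMAC-SHA256(key, "{uuid}:{exp}"))
    private String signature(UUID appVersionId, long exp) {
        byte[] mac = hmac(key, (appVersionId + ":" + exp).getBytes(StandardCharsets.UTF_8));
        return Base64.getUrlEncoder().withoutPadding().encodeToString(mac);
    }

    private static byte[] hmac(byte[] keyBytes, byte[] message) {
        try {
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(new SecretKeySpec(keyBytes, "HmacSHA256"));
            return mac.doFinal(message);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("HmacSHA256 unavailable", e);
        }
    }

    public static void main(String[] args) {
        ArtifactTokenSketch signer = new ArtifactTokenSketch("dev-only-secret");
        UUID id = UUID.randomUUID();
        Instant now = Instant.now();
        Token t = signer.sign(id, Duration.ofMinutes(5), now);
        System.out.println("/api/v1/artifacts/" + id + "?exp=" + t.exp() + "&sig=" + t.sig());
        System.out.println("valid: " + signer.verify(id, t.exp(), t.sig(), now));
    }
}
```

Because both `{uuid}` and `{exp}` are part of the signed message, changing either query parameter invalidates `sig`, which is why the URL can serve as the credential without a JWT.
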
## runtime/ — Docker orchestration
- `DockerRuntimeOrchestrator` — implements RuntimeOrchestrator; Docker Java client (zerodep transport), container lifecycle
- `DeploymentExecutor` — @Async staged deploy: PRE_FLIGHT -> PULL_IMAGE -> CREATE_NETWORK -> START_REPLICAS -> HEALTH_CHECK -> SWAP_TRAFFIC -> COMPLETE. Container names are `{tenantId}-{envSlug}-{appSlug}-{replicaIndex}-{generation}`, where `generation` is the first 8 chars of the deployment UUID — old and new replicas coexist during a blue/green swap. Per-replica `CAMELEER_AGENT_INSTANCEID` env var is `{envSlug}-{appSlug}-{replicaIndex}-{generation}`. Branches on `DeploymentStrategy.fromWire(config.deploymentStrategy())`: **blue-green** (default) starts all N → waits for all healthy → stops old (partial health = FAILED, preserves old untouched); **rolling** replaces replicas one at a time with rollback only for in-flight new containers (already-replaced old stay stopped; un-replaced old keep serving). DEGRADED is now only set by `DockerEventMonitor` post-deploy, never by the executor. **License compute caps**: at PRE_FLIGHT (after `ConfigMerger.resolve`, before image pull / container creation) the executor consults `LicenseUsageReader.computeUsage()` (PG aggregate over non-stopped deployments) and runs three `LicenseEnforcer.assertWithinCap(...)` checks for `max_total_cpu_millis`, `max_total_memory_mb`, and `max_total_replicas`. A `LicenseCapExceededException` propagates to the surrounding `try/catch` which marks the deployment FAILED with the cap message in `deployments.error_message`.
- `DockerRuntimeOrchestrator` — implements RuntimeOrchestrator; Docker Java client (zerodep transport), container lifecycle. **`startContainer` is a 2-phase op**: per-replica named volume → `cameleer-runtime-loader` init container fetches the JAR via signed URL → main container starts with the volume mounted RO at `/app/jars`. Both containers get `cap_drop ALL`, `no-new-privileges`, `apparmor=docker-default`, readonly rootfs, pids=512, `/tmp` tmpfs (no `noexec`), and `userns_mode=host:1000:65536`. Volume cleanup deterministic via `removeContainer` deriving the volume name from the inspected container.
- `DeploymentExecutor` — @Async staged deploy: PRE_FLIGHT -> PULL_IMAGE -> CREATE_NETWORK -> START_REPLICAS -> HEALTH_CHECK -> SWAP_TRAFFIC -> COMPLETE. Pulls both `baseImage` and `loaderImage` at PULL_IMAGE. Generates per-deploy signed download URLs via `ArtifactDownloadTokenSigner.sign(appVersionId, ttl)` — passes URL + appVersionId + jarSizeBytes + loaderImage into `ContainerRequest`. The host filesystem is no longer involved at deploy time. Container names are `{tenantId}-{envSlug}-{appSlug}-{replicaIndex}-{generation}`, where `generation` is the first 8 chars of the deployment UUID — old and new replicas coexist during a blue/green swap. Per-replica `CAMELEER_AGENT_INSTANCEID` env var is `{envSlug}-{appSlug}-{replicaIndex}-{generation}`. Branches on `DeploymentStrategy.fromWire(config.deploymentStrategy())`: **blue-green** (default) starts all N → waits for all healthy → stops old (partial health = FAILED, preserves old untouched); **rolling** replaces replicas one at a time with rollback only for in-flight new containers (already-replaced old stay stopped; un-replaced old keep serving). DEGRADED is now only set by `DockerEventMonitor` post-deploy, never by the executor. **License compute caps**: at PRE_FLIGHT (after `ConfigMerger.resolve`, before image pull / container creation) the executor consults `LicenseUsageReader.computeUsage()` (PG aggregate over non-stopped deployments) and runs three `LicenseEnforcer.assertWithinCap(...)` checks for `max_total_cpu_millis`, `max_total_memory_mb`, and `max_total_replicas`. A `LicenseCapExceededException` propagates to the surrounding `try/catch` which marks the deployment FAILED with the cap message in `deployments.error_message`.
- `DockerNetworkManager` — ensures bridge networks (cameleer-traefik, cameleer-env-{slug}), connects containers
- `DockerEventMonitor` — persistent Docker event stream listener (die, oom, start, stop), updates deployment status
- `TraefikLabelBuilder` — generates Traefik Docker labels for path-based or subdomain routing. Per-container identity labels: `cameleer.replica` (index), `cameleer.generation` (deployment-scoped 8-char id — for Prometheus/Grafana deploy-boundary annotations), `cameleer.instance-id` (`{envSlug}-{appSlug}-{replicaIndex}-{generation}`). Router/service label keys are generation-agnostic so load balancing spans old + new replicas during a blue/green overlap.
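
The naming scheme used throughout this section (container name, `CAMELEER_AGENT_INSTANCEID`, and the `cameleer.instance-id` label) can be illustrated with a small sketch. The class `ContainerNamingSketch` and its method names are hypothetical, and it assumes "first 8 chars of the deployment UUID" means the first 8 characters of the UUID's canonical string form.

```java
import java.util.UUID;

// Hypothetical helper; the real logic lives in DeploymentExecutor / TraefikLabelBuilder.
final class ContainerNamingSketch {

    /** Assumption: generation = first 8 characters of the canonical UUID string. */
    static String generation(UUID deploymentId) {
        return deploymentId.toString().substring(0, 8);
    }

    /** {envSlug}-{appSlug}-{replicaIndex}-{generation} */
    static String instanceId(String envSlug, String appSlug, int replicaIndex, UUID deploymentId) {
        return String.join("-", envSlug, appSlug,
                Integer.toString(replicaIndex), generation(deploymentId));
    }

    /** {tenantId}-{envSlug}-{appSlug}-{replicaIndex}-{generation} */
    static String containerName(String tenantId, String envSlug, String appSlug,
                                int replicaIndex, UUID deploymentId) {
        return tenantId + "-" + instanceId(envSlug, appSlug, replicaIndex, deploymentId);
    }

    public static void main(String[] args) {
        UUID oldDeploy = UUID.fromString("0a1b2c3d-0000-4000-8000-000000000000");
        UUID newDeploy = UUID.fromString("9f8e7d6c-0000-4000-8000-000000000000");
        // Same tenant, env, app, and replica index; only the generation differs,
        // so both containers can exist side by side during a blue/green swap.
        System.out.println(containerName("acme", "prod", "orders", 0, oldDeploy));
        System.out.println(containerName("acme", "prod", "orders", 0, newDeploy));
    }
}
```
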
@@ -148,6 +150,11 @@ Env-scoped read-path controllers (`AlertController`, `AlertRuleController`, `Ale
- `PostgresAuditRepository`, `PostgresOidcConfigRepository`, `PostgresClaimMappingRepository`, `PostgresSensitiveKeysRepository`
- `PostgresAppSettingsRepository`, `PostgresApplicationConfigRepository`, `PostgresThresholdRepository`. Both `app_settings` and `application_config` are env-scoped (PK `(app_id, environment)` / `(application, environment)`); finders take `(app, env)` — no env-agnostic variants.
## storage/ — Artifact storage (concrete impls)
- `FilesystemArtifactStore` — implements `ArtifactStore` interface from `cameleer-server-core`. Persists JAR bytes under `{cameleer.server.runtime.jarstoragepath}/{appId}/v{version}/app.jar` (preserves the legacy layout — historical `app_versions.jar_path` rows resolve identically). `put` writes via `<target>.tmp` + `Files.move(ATOMIC_MOVE)` so concurrent readers never see a torn file. `delete` sweeps empty parent dirs and tolerates `DirectoryNotEmptyException` from concurrent sibling-version uploads. `size(coords)` returns the actual on-disk byte count — used by `ArtifactDownloadController` for authoritative `Content-Length` instead of trusting `AppVersion.jarSizeBytes`.
- `ArtifactDownloadTokenSigner` — HMAC-SHA256 URL signer/verifier. Key derived deterministically from JWT secret via HMAC(secret, "cameleer-artifact-token-v1"). Sign produces `{exp, sig}` tuple where `sig = base64url-no-pad(HMAC-SHA256(key, "{uuid}:{exp}"))`. `verify` is constant-time via `MessageDigest.isEqual`. Used by `DeploymentExecutor` to mint download URLs and by `ArtifactDownloadController` to verify them. Rejects null/blank secret at construction.
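
The write-then-rename `put` described for `FilesystemArtifactStore` above can be sketched as follows. This is a minimal illustration, not the project's implementation: the class `AtomicPutSketch`, its method signatures, and the use of a `String` app ID and `int` version are assumptions; only the `{root}/{appId}/v{version}/app.jar` layout, the `<target>.tmp` staging file, and `Files.move(ATOMIC_MOVE)` come from the description.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical sketch of the atomic write; not the project's FilesystemArtifactStore.
final class AtomicPutSketch {

    /** Resolves {root}/{appId}/v{version}/app.jar. */
    static Path target(Path root, String appId, int version) {
        return root.resolve(appId).resolve("v" + version).resolve("app.jar");
    }

    /** Writes to <target>.tmp first, then renames it into place atomically. */
    static Path put(Path root, String appId, int version, byte[] jarBytes) throws IOException {
        Path target = target(root, appId, version);
        Files.createDirectories(target.getParent());
        Path tmp = target.resolveSibling(target.getFileName() + ".tmp");
        Files.write(tmp, jarBytes);
        // With ATOMIC_MOVE a reader sees either no file or the complete file.
        // Whether an existing target is replaced is implementation specific (see Files.move).
        Files.move(tmp, target, StandardCopyOption.ATOMIC_MOVE);
        return target;
    }

    /** On-disk byte count, suitable for an authoritative Content-Length. */
    static long size(Path root, String appId, int version) throws IOException {
        return Files.size(target(root, appId, version));
    }

    public static void main(String[] args) throws IOException {
        Path root = Files.createTempDirectory("artifact-store-sketch");
        Path stored = put(root, "demo-app", 1, new byte[] {1, 2, 3, 4});
        System.out.println(stored + " size=" + size(root, "demo-app", 1));
    }
}
```

Staging the temporary file as a sibling of the target keeps both on the same filesystem, which an atomic rename generally requires.
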
## storage/ — ClickHouse stores
- `ClickHouseExecutionStore`, `ClickHouseMetricsStore`, `ClickHouseMetricsQueryStore`