Files
cameleer-server/.claude/rules/app-classes.md
hsiegeln 2535741474 docs(rules): document POST /auth/logout on UiAuthController
Updates both UiAuthController listings (Auth flat + security/) so future
sessions know /logout exists, that it bumps token_revoked_before with a
+1ms race-safety bump, and that it audits under AuditCategory.AUTH.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-27 12:00:05 +02:00

235 lines
36 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters
This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
---
paths:
- "cameleer-server-app/**"
---
# App Module Key Classes
`cameleer-server-app/src/main/java/com/cameleer/server/app/`
## URL taxonomy
User-facing data and config endpoints live under `/api/v1/environments/{envSlug}/...`. Env is a path segment, never a query param. The `envSlug` is resolved to an `Environment` bean via the `@EnvPath` argument resolver (`web/EnvironmentPathResolver.java`) — 404 on unknown slug.
**Slugs are immutable after creation** for both environments and apps. Slug regex: `^[a-z0-9][a-z0-9-]{0,63}$`. Validated in `EnvironmentService.create` and `AppService.createApp`. Update endpoints (`PUT`) do not accept a slug field; Jackson drops it as an unknown property.
### Flat-endpoint allow-list
These paths intentionally stay flat (no `/environments/{envSlug}` prefix). Every new endpoint should be env-scoped unless it appears here and the reason is documented.
| Path prefix | Why flat |
|---|---|
| `/api/v1/data/**` | Agent ingestion. JWT `env` claim is authoritative; URL-embedded env would invite spoofing. |
| `/api/v1/agents/register`, `/refresh`, `/{id}/heartbeat`, `/{id}/events` (SSE), `/{id}/deregister`, `/{id}/commands`, `/{id}/commands/{id}/ack`, `/{id}/replay` | Agent self-service; JWT-bound. |
| `/api/v1/agents/commands`, `/api/v1/agents/groups/{group}/commands` | Operator fan-out; target scope is explicit in query params. |
| `/api/v1/agents/config` | Agent-authoritative config read; JWT → registry → (app, env). |
| `/api/v1/admin/{users,roles,groups,oidc,license,audit,rbac/stats,claim-mappings,thresholds,sensitive-keys,usage,clickhouse,database,environments,outbound-connections}` | Truly cross-env admin. Env CRUD URLs use `{envSlug}`, not UUID. |
| `/api/v1/catalog`, `/api/v1/catalog/{applicationId}` | Cross-env discovery is the purpose. Env is an optional filter via `?environment=`. |
| `/api/v1/executions/{execId}`, `/processors/**` | Exchange IDs are globally unique; permalinks. |
| `/api/v1/diagrams/{contentHash}/render`, `POST /api/v1/diagrams/render` | Content-addressed or stateless. |
| `/api/v1/alerts/notifications/{id}/retry` | Notification IDs are globally unique; no env routing needed. |
| `/api/v1/auth/**` | Pre-auth; no env context exists. |
| `/api/v1/health`, `/prometheus`, `/api-docs/**`, `/swagger-ui/**` | Server metadata. |
## Tenant isolation invariant
ClickHouse is shared across tenants. Every ClickHouse query must filter by `tenant_id` (from `CAMELEER_SERVER_TENANT_ID` env var, resolved via `TenantContext`/config) in addition to `environment`. New controllers added under `/environments/{envSlug}/...` must preserve this — the env filter from the path does not replace the tenant filter.
## User ID conventions
`users.user_id` stores the **bare** identifier:
- Local users: `<username>` (e.g. `admin`, `alice`)
- OIDC users: `oidc:<sub>` (e.g. `oidc:c7a93b…`)
JWT subjects carry a `user:` namespace prefix (`user:admin`, `user:oidc:<sub>`) so `JwtAuthenticationFilter` can distinguish user tokens from agent tokens. All three write paths upsert the **bare** form:
- `UiAuthController.login` — computes `userId = request.username()`, signs with `subject = "user:" + userId`.
- `OidcAuthController.callback``userId = "oidc:" + oidcUser.subject()`, signs with `subject = "user:" + userId`.
- `UserAdminController.createUser``userId = request.username()`.
Env-scoped read-path controllers (`AlertController`, `AlertRuleController`, `AlertSilenceController`, `OutboundConnectionAdminController`) strip `"user:"` from `SecurityContextHolder.authentication.name` before using it as an FK. All FKs to `users(user_id)` (e.g. `alert_rules.created_by`, `outbound_connections.created_by`, `alert_reads.user_id`, `user_roles.user_id`, `user_groups.user_id`) therefore reference the bare form. If you add a new controller that needs the acting user id for an FK insert, follow the same strip pattern.
## controller/ — REST endpoints
### Env-scoped (user-facing data & config)
- `AppController``/api/v1/environments/{envSlug}/apps`. GET list / POST create / GET `{appSlug}` / DELETE `{appSlug}` / GET `{appSlug}/versions` / POST `{appSlug}/versions` (JAR upload) / PUT `{appSlug}/container-config` / GET `{appSlug}/dirty-state` (returns `DirtyStateResponse{dirty, lastSuccessfulDeploymentId, differences}` — compares current JAR+config against last RUNNING deployment snapshot; dirty=true when no snapshot exists). App slug uniqueness is per-env (`(env, app_slug)` is the natural key). `CreateAppRequest` body has no env (path), validates slug regex. Injects `DirtyStateCalculator` bean (registered in `RuntimeBeanConfig`, requires `ObjectMapper` with `JavaTimeModule`).
- `DeploymentController``/api/v1/environments/{envSlug}/apps/{appSlug}/deployments`. GET list / POST create (body `{ appVersionId }`) / POST `{id}/stop` / POST `{id}/promote` (body `{ targetEnvironment: slug }` — target app slug must exist in target env) / GET `{id}/logs`. All lifecycle ops (`POST /` deploy, `POST /{id}/stop`, `POST /{id}/promote`) audited under `AuditCategory.DEPLOYMENT`. Action codes: `deploy_app`, `stop_deployment`, `promote_deployment`. Acting user resolved via the `user:` prefix-strip convention; both SUCCESS and FAILURE branches write audit rows. `created_by` (TEXT, nullable) populated from `SecurityContextHolder` and surfaced on the `Deployment` DTO.
- `ApplicationConfigController``/api/v1/environments/{envSlug}`. GET `/config` (list), GET/PUT `/apps/{appSlug}/config`, GET `/apps/{appSlug}/processor-routes`, POST `/apps/{appSlug}/config/test-expression`. PUT accepts `?apply=staged|live` (default `live`). `live` saves to DB and pushes `CONFIG_UPDATE` SSE to live agents in this env (existing behavior); `staged` saves to DB only, skipping the SSE push — used by the unified app deployment page. Audit action is `stage_app_config` for staged writes, `update_app_config` for live. Invalid `apply` values return 400.
- `AppSettingsController``/api/v1/environments/{envSlug}`. GET `/app-settings` (list), GET/PUT/DELETE `/apps/{appSlug}/settings`. ADMIN/OPERATOR only.
- `SearchController``/api/v1/environments/{envSlug}`. GET `/executions`, POST `/executions/search`, GET `/stats`, `/stats/timeseries`, `/stats/timeseries/by-app`, `/stats/timeseries/by-route`, `/stats/punchcard`, `/attributes/keys`, `/errors/top`. GET `/executions` accepts repeat `attr` query params: `attr=order` (key-exists), `attr=order:47` (exact), `attr=order:4*` (wildcard — `*` maps to SQL LIKE `%`). First `:` splits key/value; later colons stay in the value. Invalid keys → 400. POST `/executions/search` accepts the same filters via `SearchRequest.attributeFilters` in the body.
- `LogQueryController` — GET `/api/v1/environments/{envSlug}/logs` (filters: source (multi, comma-split, OR-joined), level (multi, comma-split, OR-joined), application, agentId, exchangeId, logger, q, time range, instanceIds (multi, comma-split, AND-joined as WHERE instance_id IN (...) — used by the Checkpoint detail drawer to scope logs to a deployment's replicas); sort asc/desc). Cursor-paginated, returns `{ data, nextCursor, hasMore, levelCounts }`; cursor is base64url of `"{timestampIso}|{insert_id_uuid}"` — same-millisecond tiebreak via the `insert_id` UUID column on `logs`.
- `RouteCatalogController` — GET `/api/v1/environments/{envSlug}/routes` (merged route catalog from registry + ClickHouse; env filter unconditional).
- `RouteMetricsController` — GET `/api/v1/environments/{envSlug}/routes/metrics`, GET `/api/v1/environments/{envSlug}/routes/metrics/processors`.
- `AgentListController` — GET `/api/v1/environments/{envSlug}/agents` (registered agents with runtime metrics, filtered to env).
- `AgentEventsController` — GET `/api/v1/environments/{envSlug}/agents/events` (lifecycle events; cursor-paginated, returns `{ data, nextCursor, hasMore }`; order `(timestamp DESC, insert_id DESC)`; cursor is base64url of `"{timestampIso}|{insert_id_uuid}"``insert_id` is a stable UUID column used as a same-millisecond tiebreak).
- `AgentMetricsController` — GET `/api/v1/environments/{envSlug}/agents/{agentId}/metrics` (JVM/Camel metrics). Rejects cross-env agents (404) as defence-in-depth.
- `DiagramRenderController` — GET `/api/v1/environments/{envSlug}/apps/{appSlug}/routes/{routeId}/diagram` returns the most recent diagram for (app, env, route) via `DiagramStore.findLatestContentHashForAppRoute`. Registry-independent — routes whose publishing agents were removed still resolve. Also GET `/api/v1/diagrams/{contentHash}/render` (flat — content hashes are globally unique), the point-in-time path consumed by the exchange viewer via `ExecutionDetail.diagramContentHash`.
- `AlertRuleController``/api/v1/environments/{envSlug}/alerts/rules`. GET list / POST create / GET `{id}` / PUT `{id}` / DELETE `{id}` / POST `{id}/enable` / POST `{id}/disable` / POST `{id}/render-preview` / POST `{id}/test-evaluate`. OPERATOR+ for mutations, VIEWER+ for reads. CRITICAL: attribute keys in `ExchangeMatchCondition.filter.attributes` are validated at rule-save time against `^[a-zA-Z0-9._-]+$` — they are later inlined into ClickHouse SQL. `AgentLifecycleCondition` is allowlist-only — the `AgentLifecycleEventType` enum (REGISTERED / RE_REGISTERED / DEREGISTERED / WENT_STALE / WENT_DEAD / RECOVERED) plus the record compact ctor (non-empty `eventTypes`, `withinSeconds ≥ 1`) do the validation; custom agent-emitted event types are tracked in backlog issue #145. Webhook validation: verifies `outboundConnectionId` exists and `isAllowedInEnvironment`. Null notification templates default to `""` (NOT NULL constraint). Audit: `ALERT_RULE_CHANGE`.
- `AlertController``/api/v1/environments/{envSlug}/alerts`. GET list (inbox filtered by userId/groupIds/roleNames via `InAppInboxQuery`; optional multi-value `state`, `severity`, tri-state `acked`, tri-state `read` query params; soft-deleted rows always excluded) / GET `/unread-count` / GET `{id}` / POST `{id}/ack` / POST `{id}/read` / POST `/bulk-read` / POST `/bulk-ack` (VIEWER+) / DELETE `{id}` (OPERATOR+, soft-delete) / POST `/bulk-delete` (OPERATOR+) / POST `{id}/restore` (OPERATOR+, clears `deleted_at`). `requireLiveInstance` helper returns 404 on soft-deleted rows; `restore` explicitly fetches regardless of `deleted_at`. `BulkIdsRequest` is the shared body for bulk-read/ack/delete (`{ instanceIds }`). `AlertDto` includes `readAt`; `deletedAt` is intentionally NOT on the wire. Inbox SQL: `? = ANY(target_user_ids) OR target_group_ids && ? OR target_role_names && ?` — requires at least one matching target (no broadcast concept).
- `AlertSilenceController``/api/v1/environments/{envSlug}/alerts/silences`. GET list / POST create / DELETE `{id}`. 422 if `endsAt <= startsAt`. OPERATOR+ for mutations, VIEWER+ for list. Audit: `ALERT_SILENCE_CHANGE`.
- `AlertNotificationController` — Dual-path (no class-level prefix). GET `/api/v1/environments/{envSlug}/alerts/{alertId}/notifications` (VIEWER+); POST `/api/v1/alerts/notifications/{id}/retry` (OPERATOR+, flat — notification IDs globally unique). Retry resets attempts to 0 and sets `nextAttemptAt = now`.
### Env admin (env-slug-parameterized, not env-scoped data)
- `EnvironmentAdminController``/api/v1/admin/environments`. GET list / POST create / GET `{envSlug}` / PUT `{envSlug}` / DELETE `{envSlug}` / PUT `{envSlug}/default-container-config` / PUT `{envSlug}/jar-retention`. Slug immutable — PUT body has no slug field; any slug supplied is dropped by Jackson. Slug validated on POST. `UpdateEnvironmentRequest` carries `color` (nullable); unknown values rejected with 400 via `EnvironmentColor.isValid`. Null/absent color preserves the existing value.
### Agent-only (JWT-authoritative, intentionally flat)
- `AgentRegistrationController` — POST `/register` (requires `environmentId` in body; 400 if missing), POST `/{id}/refresh` (rejects tokens with no `env` claim), POST `/{id}/heartbeat` (env from body preferred, JWT fallback; 400 if neither), POST `/{id}/deregister`.
- `AgentSseController` — GET `/{id}/events` (SSE connection).
- `AgentCommandController` — POST `/{agentId}/commands`, POST `/groups/{group}/commands`, POST `/commands` (broadcast), POST `/{agentId}/commands/{commandId}/ack`, POST `/{agentId}/replay`.
- `AgentConfigController` — GET `/api/v1/agents/config`. Agent-authoritative config read: resolves (app, env) from JWT subject → registry (registry miss falls back to JWT env claim; no registry entry → 404 since application can't be derived).
### Ingestion (agent-only, JWT-authoritative)
- `LogIngestionController` — POST `/api/v1/data/logs` (accepts `List<LogEntry>`; WARNs on missing identity, unregistered agents, empty payloads, buffer-full drops).
- `EventIngestionController` — POST `/api/v1/data/events`.
- `ChunkIngestionController` — POST `/api/v1/data/executions`. Accepts a single `ExecutionChunk` or an array (fields include `exchangeId`, `applicationId`, `instanceId`, `routeId`, `status`, `startTime`, `endTime`, `durationMs`, `chunkSeq`, `final`, `processors: FlatProcessorRecord[]`). The accumulator merges non-final chunks by exchangeId and emits the merged envelope on the final chunk or on stale timeout. Legacy `ExecutionController` / `RouteExecution` shape is retired.
- `MetricsController` — POST `/api/v1/data/metrics`.
- `DiagramController` — POST `/api/v1/data/diagrams` (resolves applicationId + environment from the agent registry keyed on JWT subject; stamps both on the stored `TaggedDiagram`).
### Cross-env discovery (flat)
- `CatalogController` — GET `/api/v1/catalog` (merges managed apps + in-memory agents + CH stats; optional `?environment=` filter). DELETE `/api/v1/catalog/{applicationId}` (ADMIN: dismiss app, purge all CH data + PG record).
### Admin (cross-env, flat)
- `UserAdminController` — CRUD `/api/v1/admin/users`, POST `/{id}/roles`, POST `/{id}/set-password`.
- `RoleAdminController` — CRUD `/api/v1/admin/roles`.
- `GroupAdminController` — CRUD `/api/v1/admin/groups`.
- `OidcConfigAdminController` — GET/POST `/api/v1/admin/oidc`, POST `/test`.
- `OutboundConnectionAdminController``/api/v1/admin/outbound-connections`. GET list / POST create / GET `{id}` / PUT `{id}` / DELETE `{id}` / POST `{id}/test` / GET `{id}/usage`. RBAC: list/get/usage ADMIN|OPERATOR; mutations + test ADMIN.
- `SensitiveKeysAdminController` — GET/PUT `/api/v1/admin/sensitive-keys`. GET returns 200 or 204 if not configured. PUT accepts `{ keys: [...] }` with optional `?pushToAgents=true`. Fan-out iterates every distinct `(application, environment)` slice — intentional global baseline + per-env overrides.
- `ClaimMappingAdminController` — CRUD `/api/v1/admin/claim-mappings`, POST `/test`.
- `LicenseAdminController` — GET/POST `/api/v1/admin/license`. ADMIN only. GET returns `{state, invalidReason, envelope, lastValidatedAt?}` — the raw token is deliberately omitted; only the parsed `LicenseInfo` envelope is exposed. POST delegates to `LicenseService.install(token, userId, "api")` (acting userId resolved via the `user:` prefix-strip convention) — install/replace/reject all flow through `LicenseService` so audit, persistence, and `LicenseChangedEvent` publishing are uniform.
- `LicenseUsageController` — GET `/api/v1/admin/license/usage`. Returns license `state`, `expiresAt`/`daysRemaining`/`gracePeriodDays`/`tenantId`/`label`/`lastValidatedAt`, the `LicenseMessageRenderer.forState(...)` message, and a `limits[]` array (`{key, current, cap, source}`) covering every effective-limits key. `source` is `"license"` when the cap came from the license override map, `"default"` otherwise. `max_agents` reads from `AgentRegistryService.liveCount()`; all other counts come from `LicenseUsageReader.snapshot()`.
- `ThresholdAdminController` — CRUD `/api/v1/admin/thresholds`.
- `AuditLogController` — GET `/api/v1/admin/audit`.
- `RbacStatsController` — GET `/api/v1/admin/rbac/stats`.
- `UsageAnalyticsController` — GET `/api/v1/admin/usage` (ClickHouse `usage_events`).
- `ClickHouseAdminController` — GET `/api/v1/admin/clickhouse/**` (conditional on `infrastructureendpoints` flag).
- `DatabaseAdminController` — GET `/api/v1/admin/database/**` (conditional on `infrastructureendpoints` flag).
- `ServerMetricsAdminController``/api/v1/admin/server-metrics/**`. GET `/catalog`, GET `/instances`, POST `/query`. Generic read API over the `server_metrics` ClickHouse table so SaaS dashboards don't need direct CH access. Delegates to `ServerMetricsQueryStore` (impl `ClickHouseServerMetricsQueryStore`). Visibility matches ClickHouse/Database admin: `@ConditionalOnProperty(infrastructureendpoints, matchIfMissing=true)` + class-level `@PreAuthorize("hasRole('ADMIN')")`. Validation: metric/tag regex `^[a-zA-Z0-9._]+$`, statistic regex `^[a-z_]+$`, `to - from ≤ 31 days`, stepSeconds ∈ [10, 3600], response capped at 500 series. `IllegalArgumentException` → 400. `/query` supports `raw` + `delta` modes (delta does per-`server_instance_id` positive-clipped differences, then aggregates across instances). Derived `statistic=mean` for timers computes `sum(total|total_time)/sum(count)` per bucket.
### Auth (flat)
- `UiAuthController``/api/v1/auth` (login, refresh, me, logout). Local username/password against env-var admin or DB BCrypt hash. Lockout after 5 failed attempts. `POST /logout` is permitAll — controller resolves the user from the access token if present, bumps `users.token_revoked_before = now().plusMillis(1)` to invalidate all outstanding refresh + access tokens (enforced by `JwtAuthenticationFilter`), audits `AuditCategory.AUTH / logout`, returns 204. Best-effort: 204 also when called without a token so the SPA's logout never fails on already-expired sessions. The +1ms guards against same-millisecond races (JWT `iat` is ms-quantised, filter check is strict `isBefore`).
- `OidcAuthController``/api/v1/auth/oidc` (config, callback). Code → token exchange. Roles via custom JWT claim, claim mapping rules, or default roles.
- `AuthCapabilitiesController``GET /api/v1/auth/capabilities` (unauthenticated). Reports `{oidc:{enabled, providerName, primary}, localAccounts:{enabled, adminRecoveryOnly}}` so the SPA renders the login page deterministically. `oidc.primary == oidc.enabled`; `localAccounts.adminRecoveryOnly == oidc.primary`. `providerName` is best-effort label via `OidcProviderNameDeriver` (Logto / Keycloak / Auth0 / Okta / Single Sign-On). The SPA hides the local form behind `?local` when `adminRecoveryOnly` is true.
### Other (flat)
- `DetailController` — GET `/api/v1/executions/{executionId}` + processor snapshot endpoints.
- `MetricsController` — exposes `/api/v1/metrics` and `/api/v1/prometheus` (server-side Prometheus scrape endpoint).
## runtime/ — Docker orchestration
- `DockerRuntimeOrchestrator` — implements RuntimeOrchestrator; Docker Java client (zerodep transport), container lifecycle
- `DeploymentExecutor`@Async staged deploy: PRE_FLIGHT -> PULL_IMAGE -> CREATE_NETWORK -> START_REPLICAS -> HEALTH_CHECK -> SWAP_TRAFFIC -> COMPLETE. Container names are `{tenantId}-{envSlug}-{appSlug}-{replicaIndex}-{generation}`, where `generation` is the first 8 chars of the deployment UUID — old and new replicas coexist during a blue/green swap. Per-replica `CAMELEER_AGENT_INSTANCEID` env var is `{envSlug}-{appSlug}-{replicaIndex}-{generation}`. Branches on `DeploymentStrategy.fromWire(config.deploymentStrategy())`: **blue-green** (default) starts all N → waits for all healthy → stops old (partial health = FAILED, preserves old untouched); **rolling** replaces replicas one at a time with rollback only for in-flight new containers (already-replaced old stay stopped; un-replaced old keep serving). DEGRADED is now only set by `DockerEventMonitor` post-deploy, never by the executor. **License compute caps**: at PRE_FLIGHT (after `ConfigMerger.resolve`, before image pull / container creation) the executor consults `LicenseUsageReader.computeUsage()` (PG aggregate over non-stopped deployments) and runs three `LicenseEnforcer.assertWithinCap(...)` checks for `max_total_cpu_millis`, `max_total_memory_mb`, and `max_total_replicas`. A `LicenseCapExceededException` propagates to the surrounding `try/catch` which marks the deployment FAILED with the cap message in `deployments.error_message`.
- `DockerNetworkManager` — ensures bridge networks (cameleer-traefik, cameleer-env-{slug}), connects containers
- `DockerEventMonitor` — persistent Docker event stream listener (die, oom, start, stop), updates deployment status
- `TraefikLabelBuilder` — generates Traefik Docker labels for path-based or subdomain routing. Per-container identity labels: `cameleer.replica` (index), `cameleer.generation` (deployment-scoped 8-char id — for Prometheus/Grafana deploy-boundary annotations), `cameleer.instance-id` (`{envSlug}-{appSlug}-{replicaIndex}-{generation}`). Router/service label keys are generation-agnostic so load balancing spans old + new replicas during a blue/green overlap.
- `PrometheusLabelBuilder` — generates Prometheus Docker labels (`prometheus.scrape/path/port`) per runtime type for `docker_sd_configs` auto-discovery
- `ContainerLogForwarder` — streams Docker container stdout/stderr to ClickHouse with `source='container'`. One follow-stream thread per container, batches lines every 2s/50 lines via `ClickHouseLogStore.insertBufferedBatch()`. 60-second max capture timeout.
- `DisabledRuntimeOrchestrator` — no-op when runtime not enabled
## metrics/ — Prometheus observability
- `ServerMetrics` — centralized business metrics: gauges (agents by state, SSE connections, buffer depths), counters (ingestion drops, agent transitions, deployment outcomes, auth failures), timers (flush duration, deployment duration). Exposed via `/api/v1/prometheus`.
- `ServerInstanceIdConfig``@Configuration`, exposes `@Bean("serverInstanceId") String`. Resolution precedence: `cameleer.server.instance-id` property → `HOSTNAME` env → `InetAddress.getLocalHost()` → random UUID. Fixed at boot; rotates across restarts so counters restart cleanly.
- `ServerMetricsSnapshotScheduler``@Scheduled(fixedDelayString = "${cameleer.server.self-metrics.interval-ms:60000}")`. Walks `MeterRegistry.getMeters()` each tick, emits one `ServerMetricSample` per `Measurement` (Timer/DistributionSummary produce multiple rows per meter — one per Micrometer `Statistic`). Skips non-finite values; logs and swallows store failures. Disabled via `cameleer.server.self-metrics.enabled=false` (`@ConditionalOnProperty`). Write-only — no query endpoint yet; inspect via `/api/v1/admin/clickhouse/query`.
## storage/ — PostgreSQL repositories (JdbcTemplate)
- `PostgresAppRepository`, `PostgresAppVersionRepository`, `PostgresEnvironmentRepository`
- `PostgresDeploymentRepository` — includes JSONB replica_states, deploy_stage, findByContainerId. Also carries `deployed_config_snapshot` JSONB (Flyway V3) populated by `DeploymentExecutor` via `saveDeployedConfigSnapshot(UUID, DeploymentConfigSnapshot)` on successful RUNNING transition. Consumed by `DirtyStateCalculator` for the `/apps/{slug}/dirty-state` endpoint and by the UI for checkpoint restore.
- `PostgresUserRepository`, `PostgresRoleRepository`, `PostgresGroupRepository`
- `PostgresAuditRepository`, `PostgresOidcConfigRepository`, `PostgresClaimMappingRepository`, `PostgresSensitiveKeysRepository`
- `PostgresAppSettingsRepository`, `PostgresApplicationConfigRepository`, `PostgresThresholdRepository`. Both `app_settings` and `application_config` are env-scoped (PK `(app_id, environment)` / `(application, environment)`); finders take `(app, env)` — no env-agnostic variants.
## storage/ — ClickHouse stores
- `ClickHouseExecutionStore`, `ClickHouseMetricsStore`, `ClickHouseMetricsQueryStore`
- `ClickHouseStatsStore` — pre-aggregated stats, punchcard
- `ClickHouseDiagramStore`, `ClickHouseAgentEventRepository`
- `ClickHouseUsageTracker` — usage_events for billing
- `ClickHouseRouteCatalogStore` — persistent route catalog with first_seen cache, warm-loaded on startup
- `ClickHouseServerMetricsStore` — periodic dumps of the server's own Micrometer registry into the `server_metrics` table. Tenant-stamped (bound at the scheduler, not the bean); no `environment` column (server straddles envs). Batch-insert via `JdbcTemplate.batchUpdate` with `Map(String, String)` tag binding. Written by `ServerMetricsSnapshotScheduler`.
- `ClickHouseServerMetricsQueryStore` — read side of `server_metrics` for dashboards. Implements `ServerMetricsQueryStore`. `catalog(from,to)` returns name+type+statistics+tagKeys, `listInstances(from,to)` returns server_instance_ids with first/last seen, `query(request)` builds bucketed time-series with `raw` or `delta` mode and supports a derived `mean` statistic for timers. All identifier inputs regex-validated; tenant_id always bound; max range 31 days; series count capped at 500. Exposed via `ServerMetricsAdminController`.
## search/ — ClickHouse search and log stores
- `ClickHouseLogStore` — log storage and query, MDC-based exchange/processor correlation
- `ClickHouseSearchIndex` — full-text search
## security/ — Spring Security
- `SecurityConfig` — WebSecurityFilterChain, JWT filter, CORS, OIDC conditional. `/api/v1/admin/outbound-connections/**` GETs permit OPERATOR in addition to ADMIN (defense-in-depth at controller level); mutations remain ADMIN-only. Alerting matchers: GET `/environments/*/alerts/**` VIEWER+; POST/PUT/DELETE rules and silences OPERATOR+; ack/read/bulk-read VIEWER+; POST `/alerts/notifications/*/retry` OPERATOR+.
- `JwtAuthenticationFilter` — OncePerRequestFilter, validates Bearer tokens
- `JwtServiceImpl` — HMAC-SHA256 JWT (Nimbus JOSE)
- `UiAuthController``/api/v1/auth` (login, refresh, me, logout). Upserts `users.user_id = request.username()` (bare); signs JWTs with `subject = "user:" + userId`. `refresh`/`me`/`logout` strip the `"user:"` prefix from incoming subjects via `stripSubjectPrefix()` before any DB/RBAC lookup. `logout` revokes outstanding tokens by writing `users.token_revoked_before` and audits under `AuditCategory.AUTH / logout`.
- `OidcAuthController``/api/v1/auth/oidc` (login-uri, token-exchange, logout). Upserts `users.user_id = "oidc:" + oidcUser.subject()` (no `user:` prefix); signs JWTs with `subject = "user:oidc:" + oidcUser.subject()`. `applyClaimMappings` + `getSystemRoleNames` calls all use the bare `oidc:<sub>` form.
- `OidcTokenExchanger` — code -> tokens, role extraction from access_token then id_token
- `OidcProviderHelper` — OIDC discovery, JWK source cache
## agent/ — Agent lifecycle
- `SseConnectionManager` — manages per-agent SSE connections, delivers commands
- `AgentLifecycleMonitor`@Scheduled 10s, LIVE->STALE->DEAD transitions
- `SsePayloadSigner` — Ed25519 signs SSE payloads for agent verification
## retention/ — JAR cleanup
- `JarRetentionJob`@Scheduled 03:00 daily, per-environment retention, skips deployed versions
## alerting/eval/ — Rule evaluation
- `AlertEvaluatorJob`@Scheduled tick driver; per-rule claim/release via `AlertRuleRepository`, dispatches to per-kind `ConditionEvaluator`, persists advanced cursor on release via `AlertRule.withEvalState`.
- `BatchResultApplier``@Component` that wraps a single rule's tick outcome (`EvalResult.Batch` = `firings` + `nextEvalState`) in one `@Transactional` boundary: instance upserts + notification enqueues + cursor advance commit atomically or roll back together. This is the exactly-once-per-exchange guarantee for `PER_EXCHANGE` fire mode.
- `ConditionEvaluator` — interface; per-kind implementations: `ExchangeMatchEvaluator`, `AgentLifecycleEvaluator`, `AgentStateEvaluator`, `DeploymentStateEvaluator`, `JvmMetricEvaluator`, `LogPatternEvaluator`, `RouteMetricEvaluator`.
- `AlertStateTransitions` — PER_EXCHANGE vs rule-level FSM helpers (fire/resolve/ack).
- `PerKindCircuitBreaker` — trips noisy per-kind evaluators; `TickCache` — per-tick shared lookups (apps, envs, silences).
## http/ — Outbound HTTP client implementation
- `SslContextBuilder` — composes SSL context from `OutboundHttpProperties` + `OutboundHttpRequestContext`. Supports SYSTEM_DEFAULT (JDK roots + configured CA extras), TRUST_ALL (short-circuit no-op TrustManager), TRUST_PATHS (JDK roots + system extras + per-request extras). Throws `IllegalArgumentException("CA file not found: ...")` on missing PEM.
- `ApacheOutboundHttpClientFactory` — Apache HttpClient 5 impl of `OutboundHttpClientFactory`. Memoizes clients per `CacheKey(trustAll, caPaths, mode, connectTimeout, readTimeout)`. Applies `NoopHostnameVerifier` when trust-all is active.
- `config/OutboundHttpConfig``@ConfigurationProperties("cameleer.server.outbound-http")`. Exposes beans: `OutboundHttpProperties`, `SslContextBuilder`, `OutboundHttpClientFactory`. `@PostConstruct` logs WARN on trust-all and throws if configured CA paths don't exist.
## outbound/ — Admin-managed outbound connections (implementation)
- `crypto/SecretCipher` — AES-GCM symmetric cipher with key derived via HMAC-SHA256(jwtSecret, "cameleer-outbound-secret-v1"). Ciphertext format: base64(IV(12 bytes) || GCM output with 128-bit tag). `encrypt` throws `IllegalStateException`; `decrypt` throws `IllegalArgumentException` on tamper/wrong-key/malformed.
- `storage/PostgresOutboundConnectionRepository` — JdbcTemplate impl. `save()` upserts by id; JSONB serialization via ObjectMapper; UUID arrays via `ConnectionCallback`. Reads `created_by`/`updated_by` as String (= users.user_id TEXT).
- `OutboundConnectionServiceImpl` — service layer. Tenant bound at construction via `cameleer.server.tenant.id` property. Uniqueness check via `findByName`. Narrowing-envs guard: rejects update that removes envs while rules reference the connection (rulesReferencing stubbed in Plan 01, wired in Plan 02). Delete guard: rejects if referenced by rules.
- `controller/OutboundConnectionAdminController` — REST controller. Class-level `@PreAuthorize("hasRole('ADMIN')")` defaults; GETs relaxed to ADMIN|OPERATOR. Resolves acting user id via the user-id convention (strip `"user:"` from `authentication.name` → matches `users.user_id` FK). Audit via `AuditCategory.OUTBOUND_CONNECTION_CHANGE`.
- `dto/OutboundConnectionRequest` — Bean Validation: `@NotBlank` name, `@Pattern("^https://.+")` url, `@NotNull` method/tlsTrustMode/auth. Compact ctor throws `IllegalArgumentException` if TRUST_PATHS with empty paths list.
- `dto/OutboundConnectionDto` — response DTO. `hmacSecretSet: boolean` instead of the ciphertext; `authKind: OutboundAuthKind` instead of the full auth config.
- `dto/OutboundConnectionTestResult` — result of POST `/{id}/test`: status, latencyMs, responseSnippet (first 512 chars), tlsProtocol/cipherSuite/peerCertSubject (protocol is "TLS" stub; enriched in Plan 02 follow-up), error (nullable).
- `config/OutboundBeanConfig` — registers `OutboundConnectionRepository`, `SecretCipher`, `OutboundConnectionService` beans.
## license/ — License enforcement & lifecycle
- `LicenseService` — install / replace / revalidate mediator. `install(token, installedBy, source)` validates via `LicenseValidator`, on failure marks the gate INVALID + audits `reject_license` + publishes `LicenseChangedEvent` and rethrows; on success persists via `LicenseRepository.upsert(...)`, mutates `LicenseGate`, audits `install_license` or `replace_license` (detects existing row), and publishes `LicenseChangedEvent`. `loadInitial(envToken, fileToken)` boot precedence env > file > DB; ABSENT publishes a `LicenseChangedEvent(ABSENT, null)`. `revalidate()` re-runs validation against the persisted token, on success bumps `last_validated_at`; on failure marks INVALID and audits `revalidate_license` FAILURE. `getTenantId()` exposes the tenant for downstream lookups.
- `LicenseRepository` — interface in `app/license`. `Optional<LicenseRecord> findByTenantId(String)`, `void upsert(LicenseRecord)`, `int touchValidated(String tenantId, Instant)`, `int delete(String)`.
- `LicenseRecord` — record persisted in PG `license` table: `(String tenantId, String token, UUID licenseId, Instant installedAt, String installedBy, Instant expiresAt, Instant lastValidatedAt)`.
- `PostgresLicenseRepository` — JdbcTemplate impl of `LicenseRepository`. Targets PG `license` table (V5). Upsert via `INSERT ... ON CONFLICT (tenant_id) DO UPDATE`.
- `LicenseChangedEvent` — Spring application event: `(LicenseState state, LicenseInfo current)`. Published on every install / replace / revalidate / boot-time ABSENT path so downstream listeners (retention policy, metrics, etc.) react uniformly.
- `LicenseEnforcer``@Component`. `assertWithinCap(String limitKey, long currentUsage, long requestedDelta)` consults `LicenseGate.getEffectiveLimits()`. On overflow increments `cameleer_license_cap_rejections_total{limit=...}`, emits an `AuditCategory.LICENSE / cap_exceeded` audit row when `AuditService` is wired (try/catch + log.warn so audit-write failures don't suppress the 403), and throws `LicenseCapExceededException`. Unknown limit keys propagate `IllegalArgumentException` from `LicenseLimits.get(...)` (programmer error, not a 403).
- `LicenseUsageReader``@Component` over PG. `snapshot()` returns a `Map<String,Long>` of (max_environments, max_apps, max_users, max_outbound_connections, max_alert_rules, max_total_cpu_millis, max_total_memory_mb, max_total_replicas) from PG row counts and a SUM over non-stopped deployments' `deployed_config_snapshot.containerConfig` (replicas × cpuLimit / memoryLimitMb). `computeUsage()` returns the typed `ComputeUsage(cpuMillis, memoryMb, replicas)` tuple consumed by `DeploymentExecutor` PRE_FLIGHT cap checks. `agentCount(int)` echoes a registry-supplied live count (registry is in-memory; not stored in PG).
- `LicenseCapExceededException` — typed `RuntimeException(limitKey, current, cap)` with accessors. Mapped to HTTP 403 by `LicenseExceptionAdvice`.
- `LicenseExceptionAdvice``@ControllerAdvice` mapping `LicenseCapExceededException` → 403 with body `{error:"license cap reached", limit, current, cap, state, message}` where `message` is `LicenseMessageRenderer.forCap(state, info, limit, current, cap, invalidReason)`.
- `LicenseMessageRenderer` — pure formatter (utility class, no DI). `forCap(state, info, limit, current, cap[, invalidReason])` per-state human messages for cap-rejection responses; `forState(state, info[, invalidReason])` shorter state-only messages for the `/usage` endpoint and metrics surfaces.
- `RetentionPolicyApplier``@EventListener(LicenseChangedEvent.class) @Async`. For each environment × table in the static `SPECS` list (`executions`, `processor_executions`, `logs`, `agent_metrics`, `agent_events`) computes `effective = min(licenseCap, env.configuredRetentionDays)` and emits `ALTER TABLE <t> MODIFY TTL toDateTime(<col>) + INTERVAL <n> DAY DELETE WHERE environment = '<slug>'`. ClickHouse failures are logged and swallowed (best-effort; never propagates to the originating license install/revalidate). `route_diagrams` (no TTL clause) and `server_metrics` (no environment column) are intentionally excluded.
- `LicenseRevalidationJob``@Component`. `@Scheduled(cron = "0 0 3 * * *")` daily revalidation; `@EventListener(ApplicationReadyEvent.class) @Async` 60-second post-startup tick to catch ABSENT→ACTIVE when a license was inserted between server starts. Both paths call `LicenseService.revalidate()` and swallow scheduler-thread crashes.
- `LicenseMetrics``@Component`. Registers Micrometer gauges: `cameleer_license_state{state=...}` (one-hot per `LicenseState`), `cameleer_license_days_remaining` (negative when ABSENT/INVALID), `cameleer_license_last_validated_age_seconds` (0 when no DB row). Refreshed eagerly on `LicenseChangedEvent` via `@EventListener` and lazily every 60s via `@Scheduled(fixedDelay = 60_000)`.
## config/ — Spring beans
- `RuntimeOrchestratorAutoConfig` — conditional Docker/Disabled orchestrator + NetworkManager + EventMonitor
- `RuntimeBeanConfig` — DeploymentExecutor, AppService, EnvironmentService. Wires `CreateGuard` instances per service from `LicenseEnforcer.assertWithinCap(...)` so creation paths (Environment, App, Agent) consult license caps without core depending on the app module.
- `SecurityBeanConfig` — JwtService, Ed25519, BootstrapTokenValidator
- `StorageBeanConfig` — all repositories
- `ClickHouseConfig` — ClickHouse JdbcTemplate, schema initializer
- `LicenseBeanConfig` — license bean topology in dependency order: `LicenseGate``LicenseValidator` (when `cameleer.server.license.publickey` is unset, an always-failing override is returned so any loaded token still routes through `install()` and is audited as INVALID, never silently dropped) → `LicenseService``LicenseBootLoader` (`@PostConstruct` drives `loadInitial(envToken, fileToken)` once the context is ready; resolution order env var > license file > persisted DB row).