diff --git a/.claude/rules/app-classes.md b/.claude/rules/app-classes.md index 41552371..9b5d1df5 100644 --- a/.claude/rules/app-classes.md +++ b/.claude/rules/app-classes.md @@ -7,44 +7,94 @@ paths: `cameleer-server-app/src/main/java/com/cameleer/server/app/` +## URL taxonomy + +User-facing data and config endpoints live under `/api/v1/environments/{envSlug}/...`. Env is a path segment, never a query param. The `envSlug` is resolved to an `Environment` bean via the `@EnvPath` argument resolver (`web/EnvironmentPathResolver.java`) — 404 on unknown slug. + +**Slugs are immutable after creation** for both environments and apps. Slug regex: `^[a-z0-9][a-z0-9-]{0,63}$`. Validated in `EnvironmentService.create` and `AppService.createApp`. Update endpoints (`PUT`) do not accept a slug field; Jackson drops it as an unknown property. + +### Flat-endpoint allow-list + +These paths intentionally stay flat (no `/environments/{envSlug}` prefix). Every new endpoint should be env-scoped unless it appears here and the reason is documented. + +| Path prefix | Why flat | +|---|---| +| `/api/v1/data/**` | Agent ingestion. JWT `env` claim is authoritative; URL-embedded env would invite spoofing. | +| `/api/v1/agents/register`, `/refresh`, `/{id}/heartbeat`, `/{id}/events` (SSE), `/{id}/deregister`, `/{id}/commands`, `/{id}/commands/{id}/ack`, `/{id}/replay` | Agent self-service; JWT-bound. | +| `/api/v1/agents/commands`, `/api/v1/agents/groups/{group}/commands` | Operator fan-out; target scope is explicit in query params. | +| `/api/v1/agents/config` | Agent-authoritative config read; JWT → registry → (app, env). | +| `/api/v1/admin/{users,roles,groups,oidc,license,audit,rbac/stats,claim-mappings,thresholds,sensitive-keys,usage,clickhouse,database,environments}` | Truly cross-env admin. Env CRUD URLs use `{envSlug}`, not UUID. | +| `/api/v1/catalog`, `/api/v1/catalog/{applicationId}` | Cross-env discovery is the purpose. Env is an optional filter via `?environment=`. | +| `/api/v1/executions/{execId}`, `/processors/**` | Exchange IDs are globally unique; permalinks. | +| `/api/v1/diagrams/{contentHash}/render`, `POST /api/v1/diagrams/render` | Content-addressed or stateless. | +| `/api/v1/auth/**` | Pre-auth; no env context exists. | +| `/api/v1/health`, `/prometheus`, `/api-docs/**`, `/swagger-ui/**` | Server metadata. | + +## Tenant isolation invariant + +ClickHouse is shared across tenants. Every ClickHouse query must filter by `tenant_id` (from `CAMELEER_SERVER_TENANT_ID` env var, resolved via `TenantContext`/config) in addition to `environment`. New controllers added under `/environments/{envSlug}/...` must preserve this — the env filter from the path does not replace the tenant filter. + ## controller/ — REST endpoints -- `AgentRegistrationController` — POST /register (requires `environmentId` in body; 400 if missing/blank), POST /heartbeat (env from body `environmentId` → JWT `env` claim; 400 if neither present during auto-heal), GET / (list), POST /refresh-token (rejects tokens with no `env` claim) -- `AgentSseController` — GET /sse (Server-Sent Events connection) -- `AgentCommandController` — POST /broadcast, POST /{agentId}, POST /{agentId}/ack -- `AppController` — CRUD /api/v1/apps, POST /{appId}/upload-jar, GET /{appId}/versions -- `DeploymentController` — GET/POST /api/v1/apps/{appId}/deployments, POST /{id}/stop, POST /{id}/promote, GET /{id}/logs -- `EnvironmentAdminController` — CRUD /api/v1/admin/environments, PUT /{id}/jar-retention -- `ExecutionController` — GET /api/v1/executions (search + detail) -- `SearchController` — POST /api/v1/search, GET /routes, GET /top-errors, GET /punchcard -- `LogQueryController` — GET /api/v1/logs (filters: source, application, agentId, exchangeId, level, logger, q, environment, time range) -- `LogIngestionController` — POST /api/v1/data/logs (accepts `List` JSON array, each entry has `source`: app/agent). Logs WARN for: missing agent identity, unregistered agents, empty payloads, buffer-full drops, deserialization failures. Normal acceptance at DEBUG. -- `CatalogController` — GET /api/v1/catalog (unified app catalog merging PG managed apps + in-memory agents + CH stats), DELETE /api/v1/catalog/{applicationId} (ADMIN: dismiss app, purge all CH data + PG record). Auto-filters discovered apps older than `discoveryttldays` with no live agents. -- `ChunkIngestionController` — POST /api/v1/ingestion/chunk/{executions|metrics|diagrams} -- `UserAdminController` — CRUD /api/v1/admin/users, POST /{id}/roles, POST /{id}/set-password -- `RoleAdminController` — CRUD /api/v1/admin/roles -- `GroupAdminController` — CRUD /api/v1/admin/groups -- `OidcConfigAdminController` — GET/POST /api/v1/admin/oidc, POST /test -- `SensitiveKeysAdminController` — GET/PUT /api/v1/admin/sensitive-keys. GET returns 200 with config or 204 if not configured. PUT accepts `{ keys: [...] }` with optional `?pushToAgents=true`. The fan-out iterates over every distinct `(application, environment)` slice (from persisted `application_config` rows plus currently-registered agents) and pushes per-slice merged keys — intentional global baseline + per-env overrides. Stored in `server_config` table (key `sensitive_keys`). -- `AuditLogController` — GET /api/v1/admin/audit -- `MetricsController` — GET /api/v1/metrics, GET /timeseries -- `DiagramController` — GET /api/v1/diagrams/{id}, POST /api/v1/data/diagrams. Ingestion resolves applicationId + environment from the agent registry (keyed on JWT subject) and stamps both on the stored `TaggedDiagram`. `route_diagrams` CH table has an `environment` column; queries like `findProcessorRouteMapping(app, env)` filter by it. -- `DiagramRenderController` — POST /api/v1/diagrams/render (ELK layout) -- `ClaimMappingAdminController` — CRUD /api/v1/admin/claim-mappings, POST /test (accepts inline rules + claims for preview without saving) -- `LicenseAdminController` — GET/POST /api/v1/admin/license -- `AgentEventsController` — GET /api/v1/agent-events (agent state change history) -- `AgentMetricsController` — GET /api/v1/agent-metrics (JVM/Camel metrics per agent instance) -- `AppSettingsController` — GET/PUT /api/v1/admin/app-settings (list), /api/v1/admin/app-settings/{appId} (per-app). All endpoints require `?environment=`. -- `ApplicationConfigController` — `/api/v1/config` (agent/admin observability config: traced processors, taps, route recording, per-app sensitive keys). GET list requires `?environment=`. GET/PUT/DELETE for a single app are env-scoped: for AGENT role the env comes from the JWT `env` claim (query param ignored, agents cannot spoof env); for non-agent callers env must be supplied via `?environment=` (user JWTs carry a placeholder env="default" that is NOT authoritative). `defaultConfig(application, environment)` is returned when no row exists. -- `ClickHouseAdminController` — GET /api/v1/admin/clickhouse (ClickHouse admin, conditional on infrastructure endpoints) -- `DatabaseAdminController` — GET /api/v1/admin/database (PG admin, conditional on infrastructure endpoints) -- `DetailController` — GET /api/v1/detail (execution detail with processor tree) -- `EventIngestionController` — POST /api/v1/data/events (agent event ingestion) -- `RbacStatsController` — GET /api/v1/admin/rbac/stats -- `RouteCatalogController` — GET /api/v1/routes/catalog (merged route catalog from registry + ClickHouse) -- `RouteMetricsController` — GET /api/v1/route-metrics (per-route Camel metrics) -- `ThresholdAdminController` — CRUD /api/v1/admin/thresholds -- `UsageAnalyticsController` — GET /api/v1/admin/usage (ClickHouse usage_events) +### Env-scoped (user-facing data & config) + +- `AppController` — `/api/v1/environments/{envSlug}/apps`. GET list / POST create / GET `{appSlug}` / DELETE `{appSlug}` / GET `{appSlug}/versions` / POST `{appSlug}/versions` (JAR upload) / PUT `{appSlug}/container-config`. App slug uniqueness is per-env (`(env, app_slug)` is the natural key). `CreateAppRequest` body has no env (path), validates slug regex. +- `DeploymentController` — `/api/v1/environments/{envSlug}/apps/{appSlug}/deployments`. GET list / POST create (body `{ appVersionId }`) / POST `{id}/stop` / POST `{id}/promote` (body `{ targetEnvironment: slug }` — target app slug must exist in target env) / GET `{id}/logs`. +- `ApplicationConfigController` — `/api/v1/environments/{envSlug}`. GET `/config` (list), GET/PUT `/apps/{appSlug}/config`, GET `/apps/{appSlug}/processor-routes`, POST `/apps/{appSlug}/config/test-expression`. PUT also pushes `CONFIG_UPDATE` to LIVE agents in this env. +- `AppSettingsController` — `/api/v1/environments/{envSlug}`. GET `/app-settings` (list), GET/PUT/DELETE `/apps/{appSlug}/settings`. ADMIN/OPERATOR only. +- `SearchController` — `/api/v1/environments/{envSlug}`. GET `/executions`, POST `/executions/search`, GET `/stats`, `/stats/timeseries`, `/stats/timeseries/by-app`, `/stats/timeseries/by-route`, `/stats/punchcard`, `/attributes/keys`, `/errors/top`. +- `LogQueryController` — GET `/api/v1/environments/{envSlug}/logs` (filters: source, application, agentId, exchangeId, level, logger, q, time range). +- `RouteCatalogController` — GET `/api/v1/environments/{envSlug}/routes` (merged route catalog from registry + ClickHouse; env filter unconditional). +- `RouteMetricsController` — GET `/api/v1/environments/{envSlug}/routes/metrics`, GET `/api/v1/environments/{envSlug}/routes/metrics/processors`. +- `AgentListController` — GET `/api/v1/environments/{envSlug}/agents` (registered agents with runtime metrics, filtered to env). +- `AgentEventsController` — GET `/api/v1/environments/{envSlug}/agents/events` (lifecycle events). +- `AgentMetricsController` — GET `/api/v1/environments/{envSlug}/agents/{agentId}/metrics` (JVM/Camel metrics). Rejects cross-env agents (404) as defence-in-depth. +- `DiagramRenderController` — GET `/api/v1/environments/{envSlug}/apps/{appSlug}/routes/{routeId}/diagram` (env-scoped lookup). Also GET `/api/v1/diagrams/{contentHash}/render` (flat — content hashes are globally unique). + +### Env admin (env-slug-parameterized, not env-scoped data) + +- `EnvironmentAdminController` — `/api/v1/admin/environments`. GET list / POST create / GET `{envSlug}` / PUT `{envSlug}` / DELETE `{envSlug}` / PUT `{envSlug}/default-container-config` / PUT `{envSlug}/jar-retention`. Slug immutable — PUT body has no slug field; any slug supplied is dropped by Jackson. Slug validated on POST. + +### Agent-only (JWT-authoritative, intentionally flat) + +- `AgentRegistrationController` — POST `/register` (requires `environmentId` in body; 400 if missing), POST `/{id}/refresh` (rejects tokens with no `env` claim), POST `/{id}/heartbeat` (env from body preferred, JWT fallback; 400 if neither), POST `/{id}/deregister`. +- `AgentSseController` — GET `/{id}/events` (SSE connection). +- `AgentCommandController` — POST `/{agentId}/commands`, POST `/groups/{group}/commands`, POST `/commands` (broadcast), POST `/{agentId}/commands/{commandId}/ack`, POST `/{agentId}/replay`. +- `AgentConfigController` — GET `/api/v1/agents/config`. Agent-authoritative config read: resolves (app, env) from JWT subject → registry (registry miss falls back to JWT env claim; no registry entry → 404 since application can't be derived). + +### Ingestion (agent-only, JWT-authoritative) + +- `LogIngestionController` — POST `/api/v1/data/logs` (accepts `List`; WARNs on missing identity, unregistered agents, empty payloads, buffer-full drops). +- `EventIngestionController` — POST `/api/v1/data/events`. +- `ChunkIngestionController` — POST `/api/v1/ingestion/chunk/{executions|metrics|diagrams}`. +- `ExecutionController` — POST `/api/v1/data/executions` (legacy ingestion path when ClickHouse disabled). +- `MetricsController` — POST `/api/v1/data/metrics`. +- `DiagramController` — POST `/api/v1/data/diagrams` (resolves applicationId + environment from the agent registry keyed on JWT subject; stamps both on the stored `TaggedDiagram`). + +### Cross-env discovery (flat) + +- `CatalogController` — GET `/api/v1/catalog` (merges managed apps + in-memory agents + CH stats; optional `?environment=` filter). DELETE `/api/v1/catalog/{applicationId}` (ADMIN: dismiss app, purge all CH data + PG record). + +### Admin (cross-env, flat) + +- `UserAdminController` — CRUD `/api/v1/admin/users`, POST `/{id}/roles`, POST `/{id}/set-password`. +- `RoleAdminController` — CRUD `/api/v1/admin/roles`. +- `GroupAdminController` — CRUD `/api/v1/admin/groups`. +- `OidcConfigAdminController` — GET/POST `/api/v1/admin/oidc`, POST `/test`. +- `SensitiveKeysAdminController` — GET/PUT `/api/v1/admin/sensitive-keys`. GET returns 200 or 204 if not configured. PUT accepts `{ keys: [...] }` with optional `?pushToAgents=true`. Fan-out iterates every distinct `(application, environment)` slice — intentional global baseline + per-env overrides. +- `ClaimMappingAdminController` — CRUD `/api/v1/admin/claim-mappings`, POST `/test`. +- `LicenseAdminController` — GET/POST `/api/v1/admin/license`. +- `ThresholdAdminController` — CRUD `/api/v1/admin/thresholds`. +- `AuditLogController` — GET `/api/v1/admin/audit`. +- `RbacStatsController` — GET `/api/v1/admin/rbac/stats`. +- `UsageAnalyticsController` — GET `/api/v1/admin/usage` (ClickHouse `usage_events`). +- `ClickHouseAdminController` — GET `/api/v1/admin/clickhouse/**` (conditional on `infrastructureendpoints` flag). +- `DatabaseAdminController` — GET `/api/v1/admin/database/**` (conditional on `infrastructureendpoints` flag). + +### Other (flat) + +- `DetailController` — GET `/api/v1/executions/{executionId}` + processor snapshot endpoints. +- `MetricsController` — exposes `/api/v1/metrics` and `/api/v1/prometheus` (server-side Prometheus scrape endpoint). ## runtime/ — Docker orchestration diff --git a/CLAUDE.md b/CLAUDE.md index d1143786..0889aedf 100644 --- a/CLAUDE.md +++ b/CLAUDE.md @@ -37,7 +37,10 @@ java -jar cameleer-server-app/target/cameleer-server-app-1.0-SNAPSHOT.jar - Depends on `com.cameleer:cameleer-common` from Gitea Maven registry - Jackson `JavaTimeModule` for `Instant` deserialization - Communication: receives HTTP POST data from agents (executions, diagrams, metrics, logs), serves SSE event streams for config push/commands (config-update, deep-trace, replay, route-control) -- Environment filtering: all data queries filter by the selected environment. All commands target only agents in the selected environment. Backend endpoints accept optional `environment` query parameter; null = all environments (backward compatible). +- URL taxonomy: user-facing data, config, and query endpoints live under `/api/v1/environments/{envSlug}/...`. Env is a path segment, resolved via the `@EnvPath` argument resolver (404 on unknown slug). Flat endpoints are only for: agent self-service (JWT-authoritative), cross-env admin (RBAC, OIDC, audit, license, thresholds, env CRUD), cross-env discovery (`/catalog`), content-addressed lookups (`/diagrams/{contentHash}/render`, `/executions/{id}`), and auth. See `.claude/rules/app-classes.md` for the full allow-list. +- Slug immutability: environment and app slugs are immutable after creation (both appear in URLs, Docker network names, container names, and ClickHouse partition keys). Slug regex `^[a-z0-9][a-z0-9-]{0,63}$` is enforced on POST; update endpoints silently drop any slug field in the request body via Jackson's default unknown-property handling. +- App uniqueness: `(environment_id, app_slug)` is the natural key. The same app slug can legitimately exist in multiple environments; `AppService.getByEnvironmentAndSlug(envId, slug)` is the canonical lookup for controllers. Bare `getBySlug(slug)` remains for internal use but is ambiguous across envs. +- Environment filtering: all data queries filter by the selected environment. All commands target only agents in the selected environment. Env is required on every env-scoped endpoint (path param); the legacy `?environment=` query form is retired. - Maintains agent instance registry (in-memory) with states: LIVE -> STALE -> DEAD. Auto-heals from JWT `env` claim + heartbeat body on heartbeat/SSE after server restart (priority: heartbeat `environmentId` > JWT `env` claim; no silent default — missing env on heartbeat auto-heal returns 400). Registration (`POST /api/v1/agents/register`) requires `environmentId` in the request body; missing or blank returns 400. Capabilities and route states updated on every heartbeat (protocol v2). Route catalog merges three sources: in-memory agent registry, persistent `route_catalog` table (ClickHouse), and `stats_1m_route` execution stats. The persistent catalog tracks `first_seen`/`last_seen` per route per environment, updated on every registration and heartbeat. Routes appear in the sidebar when their lifecycle overlaps the selected time window (`first_seen <= to AND last_seen >= from`), so historical routes remain visible even after being dropped from newer app versions. - Multi-tenancy: each server instance serves one tenant (configured via `CAMELEER_SERVER_TENANT_ID`, default: `"default"`). Environments (dev/staging/prod) are first-class. PostgreSQL isolated via schema-per-tenant (`?currentSchema=tenant_{id}`) and `ApplicationName=tenant_{id}` on the JDBC URL. ClickHouse shared DB with `tenant_id` + `environment` columns, partitioned by `(tenant_id, toYYYYMM(timestamp))`. - Storage: PostgreSQL for RBAC, config, and audit; ClickHouse for all observability data (executions, search, logs, metrics, stats, diagrams). ClickHouse schema migrations in `clickhouse/*.sql`, run idempotently on startup by `ClickHouseSchemaInitializer`. Use `IF NOT EXISTS` for CREATE and ADD PROJECTION. @@ -91,7 +94,7 @@ When adding, removing, or renaming classes, controllers, endpoints, UI component # GitNexus — Code Intelligence -This project is indexed by GitNexus as **cameleer-server** (6404 symbols, 16118 relationships, 300 execution flows). Use the GitNexus MCP tools to understand code, assess impact, and navigate safely. +This project is indexed by GitNexus as **cameleer-server** (6436 symbols, 16257 relationships, 300 execution flows). Use the GitNexus MCP tools to understand code, assess impact, and navigate safely. > If any GitNexus tool warns the index is stale, run `npx gitnexus analyze` in terminal first.