# Environment-scoped config — fixing cross-env data bleed **Date:** 2026-04-16 **Status:** Not started **Backwards compatibility:** None (pre-1.0; user will wipe dev DB) ## Problem Two PostgreSQL tables key per-app state on the application name alone, despite environments (`dev`/`staging`/`prod`) being first-class in the rest of the system: - `application_config` PK `(application)` — traced processors, taps, route recording, per-app sensitive keys. All env-sensitive. - `app_settings` PK `(application_id)` — SLA threshold, health warn/crit thresholds. All env-sensitive. Consequences: - **Config corruption**: `PUT /api/v1/config/{app}?environment=dev` correctly scopes the SSE fan-out but overwrites the single DB row, so when `prod` agents restart and fetch config they get the `dev` config. - **Agent startup is env-blind**: `GET /api/v1/config/{app}` reads neither JWT `env` claim nor any request parameter; returns whichever row exists. - **Dashboard settings ambiguous**: `AppSettings` endpoints have no env parameter; SLA/health displayed without env context. - Ancillary: `ClickHouseDiagramStore.findProcessorRouteMapping(appId)` doesn't filter by env even though the table has an `environment` column. - Ancillary: `AgentRegistrationController` accepts registrations without `environmentId` and silently defaults to `"default"` — masks misconfigured agents. ## Non-goals / working correctly (do not touch) - All ClickHouse observability tables (executions, logs, metrics, stats_1m_*) — already env-scoped. - `AgentCommandController` / SSE command fan-out — already env-filtered via `AgentRegistryService.findByApplicationAndEnvironment`. - `SearchController` search path — fixed in commit `e2d9428`. - RBAC (users/roles/groups/claim mappings) — tenant-wide by design. - Global sensitive-keys push to all envs (`SensitiveKeysAdminController.fanOutToAllAgents`) — by design; global baseline. - Admin UI per-page env indicator — not needed, already shown in top-right of the shell. ## Design decisions (fixed) | Question | Answer | |---|---| | Schema migration strategy | Edit `V1__init.sql` in place. User wipes dev DB. | | Agent config fetch with no/unknown env | Return `404 Not Found`. No `"default"` fallback. | | `cameleer-common` `ApplicationConfig` model | Add `environment` field in-place; agent team coordinates the bump (SNAPSHOT). | | Agent registration without `environmentId` | Return `400 Bad Request`. Registration MUST include env. | | UI per-screen env display | Already covered by top-right global env indicator — no extra UI work. | ## Plan ### Phase 1 — PostgreSQL schema 1. Edit `cameleer-server-app/src/main/resources/db/migration/V1__init.sql`: - `application_config`: add `environment TEXT NOT NULL` column; change PK to `(application, environment)`. - `app_settings`: add `environment TEXT NOT NULL` column; change PK to `(application_id, environment)`. 2. Commit message MUST call out: "Wipe dev DB before deploying — Flyway V1 checksum changes." ### Phase 2 — Shared model (`cameleer-common`) 3. **`ApplicationConfig` stays untouched on the server side.** The agent team is adding `environment` to the common class separately; the server doesn't depend on it. On the server, `environment` flows as a sidecar parameter through repositories/controllers and as a dedicated `environment` column on `application_config`. The stored JSON body contains only the config content. If/when the field appears in the common class, we'll hydrate it from the DB column into the returned DTO — no code change needed today. 4. Add `environment` field to `AppSettings` record in `cameleer-server-core` (`admin/AppSettings.java`). **Done.** ### Phase 3 — Repositories 5. `PostgresApplicationConfigRepository`: - `findByApplicationAndEnvironment(String app, String env)` replaces `findByApplication(app)`. - `findAll(String env)` (env required) replaces `findAll()`. - `save(String app, String env, ApplicationConfig, String updatedBy)` replaces `save(app, config, updatedBy)`. - Keep behaviour identical except for the PK. 6. `AppSettingsRepository` interface (core) and `PostgresAppSettingsRepository` (app) — same treatment with `(applicationId, environment)`. ### Phase 4 — REST controllers 7. `ApplicationConfigController`: - `getConfig(@PathVariable app)`: dual-mode by caller role. For AGENT role → env taken from JWT `env` claim, query param ignored (agents cannot spoof env). For non-agent callers (admin UI, with user JWTs whose `env="default"` is a placeholder) → env must be passed via `?environment=` query param. If neither produces a value → **404**. - `updateConfig(@PathVariable app, @RequestParam String environment, ...)`: make `environment` required. Forward to repo save. SSE push already env-scoped — keep. - `listConfigs(@RequestParam String environment)`: require env; filter. - `getProcessorRouteMapping(@PathVariable app, @RequestParam String environment)`: require env; forward to ClickHouse. - `testExpression(@PathVariable app, @RequestParam String environment, ...)`: make env required (already accepted as optional — tighten). 8. `AppSettingsController`: - `GET /api/v1/admin/app-settings?environment=`: list filtered. - `GET /api/v1/admin/app-settings/{appId}?environment=`: require env. - `PUT /api/v1/admin/app-settings/{appId}?environment=`: require env. 9. `SensitiveKeysAdminController`: review — global sensitive keys are server-wide (one row in `server_config`), no change needed. Add code comment clarifying env-wide push is intentional. 10. `SearchController.stats`: the SLA threshold lookup `appSettingsRepository.findByApplicationId(app)` becomes env-aware via the existing `environment` query param. ### Phase 5 — Storage 11. `ClickHouseDiagramStore.findProcessorRouteMapping(app)` → `findProcessorRouteMapping(app, env)`. Include `environment = ?` in `WHERE`. ### Phase 6 — JWT surface 12. Expose `env` claim via Spring `Authentication` principal — simplest path is a small custom `AuthenticationPrincipal` or `@RequestAttribute("env")` populated by `JwtAuthenticationFilter`. Keep scope minimal; only `ApplicationConfigController.getConfig` needs it directly for the 404 rule. ### Phase 7 — Agent registration hardening 13. `AgentRegistrationController.register`: - If `request.environmentId()` is null or blank → `400 Bad Request` with an explicit error message. Drop the `"default"` fallback on line 122. - Log the rejection (agent identity + remote IP) at INFO for diagnostics. 14. `AgentRegistrationController.refreshToken`: remove the `"default"` fallback at line 211 (dead after Phase 7.13, but harmless to clean up). 15. `AgentRegistrationController.heartbeat`: already falls back to JWT claim; after Phase 7.13 every JWT has a real env, so the `"default"` fallback at line 247 is dead code — remove. ### Phase 8 — UI queries 16. `ui/src/api/queries/dashboard.ts`: `useAppSettings(appId)` → `useAppSettings(appId, environment)`; same for `useAllAppSettings()`. Pull env from `useEnvironmentStore`. 17. `ui/src/api/queries/commands.ts`: verify `useApplicationConfig(appId)` / `useUpdateApplicationConfig` already pass env. Add if missing. (Audit pass only, may be no-op.) 18. Verify no other UI hook fetches per-app state without env. ### Phase 9 — Tests 19. Integration: write config for `(app=X, env=dev)`; read for `(app=X, env=prod)` returns empty/default. 20. Integration: agent JWT with `env=dev` calling `GET /api/v1/config/X` returns the dev config row. JWT with no env claim → 404. 21. Integration: `POST /api/v1/agents/register` with no `environmentId` → 400. 22. Unit: `AppSettingsRepository` env-isolation test. ### Phase 10 — Documentation 23. `CLAUDE.md`: - "Storage" section: update `application_config` and `app_settings` PK description. - Agent lifecycle section: note that registration requires `environmentId` (was optional, defaulted to `"default"`). - Remove the "priority: heartbeat `environmentId` > JWT `env` claim > `"default"`" note — after fix, every agent has a real env on every path. 24. `.claude/rules/app-classes.md`: - `ApplicationConfigController` — reflect env-required endpoints. - `AppSettingsController` — reflect env-required endpoints. - `AgentRegistrationController` — note env required. 25. `.claude/rules/core-classes.md`: - `PostgresApplicationConfigRepository`, `PostgresAppSettingsRepository` — updated signatures. ## Execution order Phases are mostly sequential by dependency: 1 → 2 → 3 → (4, 5 in parallel) → 6 → 7 → 8 → 9 → 10. Phase 6 (JWT surfacing) is a small dependency for Phase 4 controller changes; do them together. ## Verification - `mvn clean verify` passes. - `detect_changes` scope matches the files touched per phase. - Manual: spin up two envs (`dev` + `prod`) locally; configure tap in `dev`; confirm `prod` agent doesn't receive it and its DB row is untouched. - Manual: stop an agent without env in the registration payload; confirm server returns 400. ## Out of scope / follow-ups - `audit_log` has no `environment` column; filtering audit by env would be nice-to-have but not a correctness issue. Defer. - Agent bootstrap-token scoping to env (so a dev token can't register as prod) — security hardening for after 1.0.