Commit Graph

166 Commits

Author SHA1 Message Date
hsiegeln
3a649f40cd ui(deploy): router blocker + DS dialog for unsaved edits
- Add deployedConfigSnapshot field to Deployment interface (mirrors server shape)
- Remove the Task 10.3 cast in handleRestore now that the type has the field
- New useUnsavedChangesBlocker hook (react-router useBlocker, v7.13.1)
- Wire AlertDialog into AppDeploymentPage for in-app navigation guard

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-22 23:13:36 +02:00
hsiegeln
52ff385b04 ui(api): add useDirtyState + apply=staged|live on useUpdateApplicationConfig
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-22 22:45:42 +02:00
hsiegeln
0434299d53 api(schema): regenerate OpenAPI + schema.d.ts for deployment page
Picks up GET dirty-state, PUT config ?apply=staged|live, and
deployedConfigSnapshot on Deployment for the deployment config-diff UI.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-22 22:42:10 +02:00
hsiegeln
2835d08418 ui(env): explicit switcher button+modal, forced selection, 3px color bar
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 2m6s
CI / docker (push) Successful in 1m18s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 38s
- Replace EnvironmentSelector "All Envs" dropdown with Button+Modal (DS Modal, forced on first-use).
- Add 8-swatch preset color picker in the Environment settings "Appearance" section; commits via useUpdateEnvironment.
- Render a 3px fixed top bar in the current env's color across every page (z-index 900, below DS modals).
- New env-colors tokens (--env-color-*, light + dark) and envColorVar() helper with slate fallback.
- Vitest coverage for button, modal, and color helpers (13 new specs).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-22 19:24:48 +02:00
hsiegeln
79fa4c097c api(schema): regenerate OpenAPI + schema.d.ts for env color field
UpdateEnvironmentRequest gains an optional color; Environment schema
surfaces color on GET responses.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-22 19:24:35 +02:00
hsiegeln
7677df33e5 ui(api): regen types + drop perExchangeLingerSeconds from SPA
Follows backend removal of the field (Task 3.1). Typechecker confirms
zero remaining references. The ExchangeMatchForm linger-input is
visually removed in Task 4.4.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-22 17:40:43 +02:00
hsiegeln
98cbf8f3fc refactor(search): drop dead SearchIndexer subsystem
After the ExecutionController removal (0f635576), SearchIndexer
subscribed to ExecutionUpdatedEvent but nothing publishes that event.
Every SearchIndexerStats metric returned always-zero, and the admin
/api/v1/admin/clickhouse/pipeline endpoint that surfaced those stats
carried no signal.

Backend removed:
- core: SearchIndexer, SearchIndexerStats, ExecutionUpdatedEvent
- app: IndexerPipelineResponse DTO, /pipeline endpoint on
  ClickHouseAdminController (field + ctor param)
- StorageBeanConfig.searchIndexer bean

UI removed:
- IndexerPipeline type + useIndexerPipeline hook in
  api/queries/admin/clickhouse.ts
- Indexer Pipeline card in ClickHouseAdminPage.tsx (plus ProgressBar
  import and pipeline* CSS classes)

OpenAPI schema.d.ts + openapi.json regenerated (stale /pipeline path
and IndexerPipelineResponse schema removed).

SearchIndex interface + ClickHouseSearchIndex impl kept — those are
live and used by SearchService + ExchangeMatchEvaluator.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-21 23:32:49 +02:00
hsiegeln
b7d201d743 fix(alerts): add AGENT_LIFECYCLE to condition_kind_enum + readable error toasts
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 2m5s
CI / docker (push) Successful in 1m19s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 37s
Backend
 - V18 migration adds AGENT_LIFECYCLE to condition_kind_enum. Java
   ConditionKind enum shipped with this value but no Postgres migration
   extended the type, so any AGENT_LIFECYCLE rule insert failed with
   "invalid input value for enum condition_kind_enum".
 - ALTER TYPE ... ADD VALUE lives alone in its migration per Postgres
   constraint that the new value cannot be referenced in the same tx.
 - V18MigrationIT asserts the enum now contains all 7 kinds.

Frontend
 - Add describeApiError(e) helper to unwrap openapi-fetch error bodies
   (Spring error JSON) into readable strings. String(e) on a plain
   object rendered "[object Object]" in toasts — the actual failure
   reason was hidden from the user.
 - Replace String(e) in all 13 toast descriptions across the alerting
   and outbound-connection mutation paths.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-21 20:23:14 +02:00
hsiegeln
be703eb71d feat(ui/alerts): hooks for bulk-ack, delete, bulk-delete, restore + acked/read filter params
- useAlerts gains acked/read filter params threaded into query + queryKey
- new mutations: useBulkAckAlerts, useDeleteAlert, useBulkDeleteAlerts, useRestoreAlert
- all cache-invalidate the alerts list and unread-count on success

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-21 19:00:18 +02:00
hsiegeln
207ae246af chore(ui): regenerate OpenAPI schema for alerts inbox redesign
New endpoints visible to the SPA: DELETE /alerts/{id}, POST
/alerts/{id}/restore, POST /alerts/bulk-delete, POST /alerts/bulk-ack.
GET /alerts gains tri-state acked / read query params. AlertDto now
includes readAt.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-21 18:58:26 +02:00
hsiegeln
414f7204bf feat(alerting): AGENT_LIFECYCLE condition kind with per-subject fire mode
Allows alert rules to fire on agent-lifecycle events — REGISTERED,
RE_REGISTERED, DEREGISTERED, WENT_STALE, WENT_DEAD, RECOVERED — rather
than only on current state. Each matching `(agent, eventType, timestamp)`
becomes its own ackable AlertInstance, so outages on distinct agents are
independently routable.

Core:
- New `ConditionKind.AGENT_LIFECYCLE` + `AgentLifecycleCondition` record
  (scope, eventTypes, withinSeconds). Compact ctor rejects empty
  eventTypes and withinSeconds<1.
- Strict allowlist enum `AgentLifecycleEventType` (six entries matching
  the server-emitted types in `AgentRegistrationController` and
  `AgentLifecycleMonitor`). Custom agent-emitted event types tracked in
  backlog issue #145.
- `AgentEventRepository.findInWindow(env, appSlug, agentId, eventTypes,
  from, to, limit)` — new read path ordered `(timestamp ASC, insert_id
  ASC)` used by the evaluator. Implemented on
  `ClickHouseAgentEventRepository` with tenant + env filter mandatory.

App:
- `AgentLifecycleEvaluator` queries events in the last `withinSeconds`
  window and returns `EvalResult.Batch` with one `Firing` per row.
  Every Firing carries a canonical `_subjectFingerprint` of
  `"<agentId>:<eventType>:<tsMillis>"` in context plus `agent` / `event`
  subtrees for Mustache templating.
- `NotificationContextBuilder` gains an `AGENT_LIFECYCLE` branch that
  exposes `{{agent.id}}`, `{{agent.app}}`, `{{event.type}}`,
  `{{event.timestamp}}`, `{{event.detail}}`.
- Validation is delegated to the record compact ctor + enum at Jackson
  deserialization time — matches the existing policy of keeping
  controller validators focused on env-scoped / SQL-injection concerns.

Schema:
- V16 migration generalises the V15 per-exchange discriminator on
  `alert_instances_open_rule_uq` to prefer `_subjectFingerprint` with a
  fallback to the legacy `exchange.id` expression. Scalar kinds still
  resolve to `''` and keep one-open-per-rule. Duplicate-key path in
  `PostgresAlertInstanceRepository.save` is unchanged — the index is
  the deduper.

UI:
- New `AgentLifecycleForm.tsx` wizard form with multi-select chips for
  the six allowed event types + `withinSeconds` input. Wired into
  `ConditionStep`, `form-state` (validation + defaults: WENT_DEAD,
  300 s), and `enums.ts` options. Tests in `enums.test.ts` pin the
  new option array.
- `alert-variables.ts` registers `{{agent.app}}`, `{{event.type}}`,
  `{{event.timestamp}}`, `{{event.detail}}` leaves for the new kind,
  and extends `agent.id`'s availability list to include `AGENT_LIFECYCLE`.

Tests (all passing):
- 5 new JSON-roundtrip cases on `AlertConditionJsonTest` (positive +
  empty/zero/unknown-type rejection).
- 5 new evaluator unit tests on `AgentLifecycleEvaluatorTest` (empty
  window, multi-agent fingerprint shape, scope forwarding, missing env).
- `NotificationContextBuilderTest` switch now covers the new kind.
- 119 alerting unit tests + 71 UI tests green.

Docs: `.claude/rules/{core,app,ui}` and CLAUDE.md migration list updated.
2026-04-21 14:52:08 +02:00
hsiegeln
f037d8c922 feat(alerting): server-side state+severity filters, ButtonGroup filter UI
Backend: `GET /environments/{envSlug}/alerts` now accepts optional multi-value
`state=…` and `severity=…` query params. Filters are pushed down to
PostgresAlertInstanceRepository, which appends `AND state::text = ANY(?)` /
`AND severity::text = ANY(?)` to the inbox query (null/empty = no filter).

`AlertInstanceRepository.listForInbox` gained a 7-arg overload; the old 5-arg
form is preserved as a default delegate so existing callers (evaluator,
AlertingFullLifecycleIT, PostgresAlertInstanceRepositoryIT) compile unchanged.
`InAppInboxQuery.listInbox` also has a new filtered overload.

UI: InboxPage severity filter migrated from `SegmentedTabs` (single-select,
no color cues) to `ButtonGroup` (multi-select with severity-coloured dots),
matching the topnavbar status-filter pattern. `useAlerts` forwards the
filters as query params and cache-keys on the filter tuple so each combo
is independently cached.

Unit + hook tests updated to the new contract (5 UI tests + 8 Java unit
tests passing). OpenAPI types regenerated from the fresh local backend.
2026-04-21 12:47:31 +02:00
hsiegeln
c443fc606a fix(alerts/ui): bell position, content tabs hidden, filters, novice labels
Surfaced during second smoke:

1. Notification bell moved — was first child of TopBar (left of
   breadcrumb); now rendered inside the `environment` slot so it
   sits between the env selector and the user menu, matching user
   expectations.

2. Content tabs (Exchanges/Dashboard/Runtime/Deployments) hidden on
   `/alerts/*` — the operational tabs don't apply there.

3. Inbox / All alerts filters now actually filter. `AlertController.list`
   accepts only `limit` — `state`/`severity` query params are dropped
   server-side. Move `useAlerts` to fetch once per env (limit 200) and
   apply filters client-side via react-query `select`, with a stable
   queryKey so filter toggles are instant and don't re-request. True
   server-side filter needs a backend change (follow-up).

4. Novice-friendly labels:
   - Inbox subtitle: "99 firing · 100 total" → "99 need attention ·
     100 total in inbox"
   - All alerts filter: Open/Firing/Acked/All →
     "Currently open"/"Firing now"/"Acknowledged"/"All states"
   - All alerts subtitle: "N shown" → "N matching your filter"
   - History subtitle: "N resolved" → "N resolved alert(s) in range"
   - Rules subtitle: "N total" → "N rule(s) configured"
   - Silences subtitle: "N active" → "N active silence(s)" or
     "Nothing silenced right now"
   - Column headers: "State" → "Status", rules "Kind" → "Type",
     rules "Targets" → "Notifies"
   - Buttons: "Ack" → "Acknowledge", silence "End" → "End early"

Updated alerts.test.tsx and e2e selector to match new behavior/labels.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-21 11:48:33 +02:00
hsiegeln
e590682f8f refactor(ui/alerts): address code-review findings on alerting-enums
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 2m3s
CI / docker (push) Successful in 1m22s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 41s
Follow-up to 83837ada addressing the critical-review feedback:

- Duplicate ConditionKind type consolidated: the one in
  api/queries/alertRules.ts (which was nullable — wrong) is gone;
  single source of truth lives in this module.

- Module moved out of api/ into pages/Alerts/ where it belongs.
  api/ is the data layer; labels + hide lists are view-layer concerns.

- Hidden values formalised: Comparator.EQ and JvmAggregation.LATEST
  are intentionally not surfaced in dropdowns (noisy / wrong feature
  boundary, see in-file comments). They remain in the type unions so
  rules that carry those values save/load correctly — we just don't
  advertise them in the UI.

- JvmAggregation declaration order restored to MAX/AVG/MIN (matches
  what users saw before 83837ada). LATEST declared last; hidden.

- Snapshot tests for every visible *_OPTIONS array — reviewer signal
  in future PRs when a backend enum change or hide-list edit
  silently reshapes the dropdown.

- `toOptions` gains a JSDoc noting that label-map declaration order
  is load-bearing (ES2015 Object.keys insertion-order guarantee).

- **Honest about the springdoc schema quirk**: the generated
  polymorphic condition types resolve to `never` at the TypeScript
  level (two conflicting `kind` discriminators — the class-name
  literal and the Jackson enum — intersect to never), which silently
  defeated `Record<T, string>` exhaustiveness. The previous commit's
  "schema-derived enums" claim was accurate only for the flat-field
  enums (ConditionKind, Severity, TargetKind); condition-specific
  enums (RouteMetric, Comparator, JvmAggregation, ExchangeFireMode)
  were silently `never`. Those are now declared as hand-written
  string-literal unions with a top-of-file comment spelling out the
  issue and the regen-and-compare workflow. Real upstream fix is a
  backend-side adjustment to how springdoc emits polymorphic
  `@JsonSubTypes` — out of scope for this phase.

Verified: ui build green, 56/56 vitest pass (49 pre-existing + 7
new enum snapshots).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-20 19:26:16 +02:00
hsiegeln
83837ada8f refactor(ui/alerts): derive option lists + form-state types from schema.d.ts
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 2m8s
CI / docker (push) Successful in 1m15s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 41s
Closes item 5 on the Plan 03 cleanup triage. The option arrays
("METRICS", "COMPARATORS", KIND_OPTIONS, SEVERITY_OPTIONS, FIRE_MODES)
scattered across RouteMetricForm / JvmMetricForm / ExchangeMatchForm /
ConditionStep / ScopeStep were hand-typed string literals. They drifted
silently — P95_LATENCY_MS appeared in a dropdown without a backend
counterpart (caught at runtime in bcde6678); JvmMetric.LATEST and
Comparator.EQ existed on the backend but were missing from the UI all
along.

Fix: new `ui/src/api/alerting-enums.ts` derives every enum from
schema.d.ts and pairs each with a `Record<T, string>` label map.
TypeScript enforces exhaustiveness — adding or removing a backend
value fails the build of this file until the label map is updated.
Every consumer imports the generated `*_OPTIONS` array.

Covered (schema-derived):
  - ConditionKind            → CONDITION_KIND_OPTIONS
  - Severity                 → SEVERITY_OPTIONS
  - RouteMetric              → ROUTE_METRIC_OPTIONS
  - Comparator               → COMPARATOR_OPTIONS (adds EQ that was missing)
  - JvmAggregation           → JVM_AGGREGATION_OPTIONS (adds LATEST that was missing)
  - ExchangeMatch.fireMode   → EXCHANGE_FIRE_MODE_OPTIONS
  - AlertRuleTarget.kind     → TARGET_KIND_OPTIONS

form-state.ts: `severity: 'CRITICAL' | 'WARNING' | 'INFO'` and
`kind: 'USER' | 'GROUP' | 'ROLE'` literal unions swapped for the
derived `Severity` / `TargetKind` aliases.

Not covered, backend types them as `String` (no `@Schema(allowableValues)`
annotation yet):
  - AgentStateCondition.state
  - DeploymentStateCondition.states
  - LogPatternCondition.level
  - ExchangeFilter.status
  - JvmMetricCondition.metric

These stay hand-typed with a pointer-comment. Follow-up: add
`@Schema(allowableValues = …)` to the Java record components so the
enums land in schema.d.ts; then fold them into alerting-enums.ts.

Plus: gitnexus index-stats refresh in AGENTS.md/CLAUDE.md from the
post-deploy reindex.

Verified: ui build green, 49/49 vitest pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-20 19:02:52 +02:00
hsiegeln
09b49f096c feat(alerting): per-severity breakdown on unread-count DTO
Spec §13 calls for the notification bell to colour-code by highest
unread severity (CRITICAL → error, WARNING → amber, INFO → muted).
The old { count } DTO forced the UI to pick one static colour, so
NotificationBell shipped with a TODO. Grow the contract instead:

  UnreadCountResponse = { total, bySeverity: { CRITICAL, WARNING, INFO } }

Guarantees:
- every severity is always present with a >=0 value (no undefined
  keys on the wire), so the UI can branch without defaults.
- total = sum of bySeverity values — kept explicit on the wire for
  cheap top-line display, not recomputed client-side.

Backend
- AlertInstanceRepository: replaces countUnreadForUser(long) with
  countUnreadBySeverityForUser returning Map<AlertSeverity, Long>.
  One SQL round-trip per (env, user) — GROUP BY ai.severity over the
  same NOT EXISTS(alert_reads) filter.
- UnreadCountResponse.from(Map) normalises and defensively copies;
  missing severities default to 0.
- InAppInboxQuery.countUnread now returns the DTO, caches the full
  response (still 5s TTL) so severity breakdown gets the same
  hit-rate as the total did before.
- AlertController just hands the DTO back.

Breaking change — no backwards-compat shim: the `count` field is
gone. UI and tests updated in the same commit; there are no other
API consumers in the tree.

Frontend
- Regenerated openapi.json + schema.d.ts against a fresh build of
  the new backend.
- NotificationBell branches badge colour on the highest unread
  severity (CRITICAL > WARNING > INFO) via new CSS variants.
- Tests cover all four paths: zero, critical-present, warning-only,
  info-only.

Tests: 7 unit tests + 12 ITs (incl. new grouping + empty-map)
       + 49 vitest (was 46; +3 severity-branch assertions).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-20 18:15:56 +02:00
hsiegeln
38083d7c3f fix(ui/alerts): remove dead usePageVisible subscription; align alerts test mock with DTO
NotificationBell used a usePageVisible() subscription that re-rendered on
every visibilitychange without consuming the value. TanStack Query's
refetchIntervalInBackground:false already pauses polling; the extra
subscription was speculative generality. Dropped the import + call + JSDoc
reference; usePageVisible hook + test retained as a reusable primitive.

Also: alerts.test.tsx 'returns the server payload unmodified' asserted a
pre-plan {total, bySeverity} shape, but UnreadCountResponse is actually
{count}. Fixed mock + assertion to {count: 3}.
2026-04-20 13:30:07 +02:00
hsiegeln
51bc796bec feat(ui/alerts): silence + notification query hooks 2026-04-20 13:16:57 +02:00
hsiegeln
c6c3dd9cfe feat(ui/alerts): alert rule query hooks (CRUD, enable/disable, preview, test-evaluate)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-20 13:13:07 +02:00
hsiegeln
82c29f46a5 fix(ui/alerts): bulk-read body uses instanceIds to match BulkReadRequest DTO
Plan 03 prose had 'alertInstanceIds'; backend record is 'instanceIds'.
2026-04-20 13:08:22 +02:00
hsiegeln
83a8912da6 feat(ui/alerts): TanStack Query hooks for /alerts endpoints
Adds env-scoped hooks for the alerts inbox:
- useAlerts (30s poll, background-paused, filter-aware)
- useAlert, useUnreadCount (30s poll)
- useAckAlert, useMarkAlertRead, useBulkReadAlerts (mutations that
  invalidate the alerts query key tree + unread-count)

Test file uses .tsx because the QueryClientProvider wrapper relies on
JSX; vitest picks up both .ts and .tsx via the configured include glob.
Client mock targets the actual export name (`api` in ../client) rather
than the `apiClient` alias that alertMeta re-exports.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-20 13:07:16 +02:00
hsiegeln
1a8b9eb41b feat(ui/alerts): shared env helper for alerting query hooks 2026-04-20 13:00:49 +02:00
hsiegeln
b2066cdb68 chore(ui): regenerate openapi.json + schema.d.ts from deployed Plan 02 backend
Fetched from http://192.168.50.86:30090/api/v1/api-docs via
`npm run generate-api:live`. Adds TypeScript types for the new alerting
REST surface merged in #140:

- 15 alerting paths under /environments/{envSlug}/alerts/** (rules CRUD,
  enable/disable, render-preview, test-evaluate, inbox, unread-count,
  ack/read/bulk-read, silences CRUD, per-alert notifications)
- 1 flat notification retry path /alerts/notifications/{id}/retry
- 4 outbound-connection admin paths (from Plan 01 #139)

Verified tsc -p tsconfig.app.json --noEmit exits 0 — no existing SPA
call sites break against the fresh types. Plan 03 UI work can consume
these directly.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-20 12:59:38 +02:00
hsiegeln
3c903fc8dc feat(ui): tanstack query hooks for outbound connections
Types are hand-authored (matching codebase admin-query convention);
schema.d.ts regeneration deferred until backend dev server is available.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-19 16:52:40 +02:00
hsiegeln
a3429a609e fix(ui): live-tail logs when time range is a relative preset
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m29s
CI / docker (push) Successful in 1m13s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 38s
Page 1 refetches were using the captured timeRange.end, so rows
arriving after the initial render were outside the query window and
never surfaced. When timeRange.preset is set (e.g. 'last 1h'), each
fetch now advances 'to' to Date.now() so the poll picks up new rows.
Absolute ranges are unchanged.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-17 14:48:29 +02:00
hsiegeln
a2d55f7075 fix(ui): push log sort toggle server-side
Reversing logStream.items client-side breaks across infinite-scroll
pages. Passing sort='asc'/'desc' into the query key and URL triggers
a fresh first-page fetch in the selected order.
2026-04-17 13:19:29 +02:00
hsiegeln
73309c7e63 feat(ui): replace useAgentEvents with useInfiniteAgentEvents
Cursor-paginated timeline stream matching the new /agents/events
endpoint. Consumers (AgentHealth, AgentInstance) updated in
follow-up commits.
2026-04-17 12:43:51 +02:00
hsiegeln
43f145157d feat(ui): add useInfiniteApplicationLogs hook
Server-side filters on source/level/time-range, client-side text
search on top of flattened items. Leaves useApplicationLogs and
useStartupLogs untouched for bounded consumers (LogTab, StartupLogPanel).
2026-04-17 12:42:35 +02:00
hsiegeln
bfb5a7a895 chore: regenerate openapi.json + schema.d.ts
Captures the cursor-paginated /agents/events response shape
(AgentEventPageResponse with data/nextCursor/hasMore and a new ?cursor
param). Also folds in pre-existing drift from 62dd71b (environment
field on agent event rows). Consumer UI hooks are updated in
Tasks 9-11.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-17 12:39:03 +02:00
hsiegeln
59de424ab9 chore: regenerate openapi.json + schema.d.ts from deployed server
Fetched from http://192.168.50.86:30090/api/v1/api-docs (running origin/main
through b7a107d — full P3B/P3C env-scoping migration live there).

SPA TS types now match the env-scoped URL shape used at runtime:
- /environments/{envSlug}/... for data, config, search, logs, routes, agents
- /agents/config (agent-authoritative)
- /admin/environments/{envSlug}/... (env CRUD)

Note: ExecutionDetail.environment isn't in the regenerated schema yet —
commit d02fa73 (local, not yet pushed/deployed) adds that backend field.
The local type extension in ui/src/components/ExecutionDiagram/types.ts
covers the gap until the next redeploy + regen.

UI typecheck (tsc -p tsconfig.app.json --noEmit) passes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-17 10:24:24 +02:00
hsiegeln
873e1d3df7 feat!: move query/logs/routes/diagram/agent-view endpoints under /environments/{envSlug}/
P3C — the last data/query wave of the taxonomy migration. Every user-
facing read endpoint that was keyed on env-as-query-param is now under
the env-scoped URL, making env impossible to omit and unambiguous in
server-side tenant+env filtering.

Server:
- SearchController: /api/v1/search/** → /api/v1/environments/{envSlug}/...
  Endpoints: /executions (GET), /executions/search (POST), /stats,
  /stats/timeseries, /stats/timeseries/by-app, /stats/timeseries/by-route,
  /stats/punchcard, /attributes/keys, /errors/top. Env comes from path.
- LogQueryController: /api/v1/logs → /api/v1/environments/{envSlug}/logs.
- RouteCatalogController: /api/v1/routes/catalog → /api/v1/environments/
  {envSlug}/routes. Env filter unconditional (path).
- RouteMetricsController: /api/v1/routes/metrics →
  /api/v1/environments/{envSlug}/routes/metrics (and /metrics/processors).
- DiagramRenderController: /{contentHash}/render stays flat (hashes are
  globally unique). Find-by-route moved to /api/v1/environments/{envSlug}/
  apps/{appSlug}/routes/{routeId}/diagram — the old GET /diagrams?...
  handler is removed.
- Agent views split cleanly:
  - AgentListController (new): /api/v1/environments/{envSlug}/agents
  - AgentEventsController: /api/v1/environments/{envSlug}/agents/events
  - AgentMetricsController: /api/v1/environments/{envSlug}/agents/
    {agentId}/metrics — now also rejects cross-env agents (404) as a
    defense-in-depth check, fulfilling B3.
  Agent self-service endpoints (register/refresh/heartbeat/deregister)
  remain flat at /api/v1/agents/** — JWT-authoritative.

SPA:
- queries/agents.ts, agent-metrics.ts, logs.ts, catalog.ts (route
  metrics only; /catalog stays flat), processor-metrics.ts,
  executions.ts (attributes/keys, stats, timeseries, search),
  dashboard.ts (all stats/errors/punchcard), correlation.ts,
  diagrams.ts (by-route) — all rewritten to env-scoped URLs.
- Hooks now either read env from useEnvironmentStore internally or
  require it as an argument. Query keys include env so switching env
  invalidates caches.
- useAgents/useAgentEvents signature simplified — env is no longer a
  parameter; it's read from the store. Callers (LayoutShell,
  AgentHealth, AgentInstance) updated accordingly.
- LogTab and useStartupLogs thread env through to useLogs.
- envFetch helper introduced in executions.ts for env-prefixed raw
  fetch until schema.d.ts is regenerated against the new backend.

BREAKING CHANGE: All these flat paths are removed:
  /api/v1/search/**, /api/v1/logs, /api/v1/routes/catalog,
  /api/v1/routes/metrics (and /processors), /api/v1/diagrams
  (lookup), /api/v1/agents (list), /api/v1/agents/events-log,
  /api/v1/agents/{id}/metrics, /api/v1/agent-events.
Clients must use the /api/v1/environments/{envSlug}/... equivalents.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-16 23:48:25 +02:00
hsiegeln
6d9e456b97 feat!: move apps & deployments under /api/v1/environments/{envSlug}/apps/{appSlug}/...
P3B of the taxonomy migration. App and deployment routes are now
env-scoped in the URL itself, making the (env, app_slug) uniqueness
key explicit. Previously /api/v1/apps/{appSlug} was ambiguous: with
the same app deployed to multiple environments (dev/staging/prod),
the handler called AppService.getBySlug(slug) which returns the
first row matching slug regardless of env.

Server:
- AppController: @RequestMapping("/api/v1/environments/{envSlug}/
  apps"). Every handler now calls
  appService.getByEnvironmentAndSlug(env.id(), appSlug) — 404 if the
  app doesn't exist in *this* env. CreateAppRequest body drops
  environmentId (it's in the path).
- DeploymentController: @RequestMapping("/api/v1/environments/
  {envSlug}/apps/{appSlug}/deployments"). DeployRequest body drops
  environmentId. PromoteRequest body switches from
  targetEnvironmentId (UUID) to targetEnvironment (slug);
  promote handler resolves the target env by slug and looks up the
  app with the same slug in the target env (fails with 404 if the
  target app doesn't exist yet — apps must exist in both source
  and target before promote).
- AppService: added getByEnvironmentAndSlug helper; createApp now
  validates slug against ^[a-z0-9][a-z0-9-]{0,63}$ (400 on
  invalid).

SPA:
- queries/admin/apps.ts: rewritten. Hooks take envSlug where
  env-scoped. Removed useAllApps (no flat endpoint). Renamed path
  param naming: appId → appSlug throughout. Added
  usePromoteDeployment. Query keys include envSlug so cache is
  env-scoped.
- AppsTab.tsx: call sites updated. When no environment is selected,
  the managed-app list is empty — cross-env discovery lives in the
  Runtime tab (catalog). handleDeploy/handleStop/etc. pass envSlug
  to the new hook signatures.

BREAKING CHANGE: /api/v1/apps/** paths removed. Clients must use
/api/v1/environments/{envSlug}/apps/{appSlug}/**. Request bodies
for POST /apps and POST /apps/{slug}/deployments no longer accept
environmentId (use the URL path instead). Promote body uses slug
not UUID.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-16 23:38:37 +02:00
hsiegeln
969cdb3bd0 feat!: move config & settings under /api/v1/environments/{envSlug}/...
P3A of the taxonomy migration. Env-scoped config and settings endpoints
now live under the env-prefixed URL shape, making env a first-class
path segment instead of a query param. Agent-authoritative config is
split off into a dedicated endpoint so agent env comes from the JWT
only — never spoofable via URL.

Server:
- ApplicationConfigController: @RequestMapping("/api/v1/environments/
  {envSlug}"). Handlers use @EnvPath Environment env, appSlug as
  @PathVariable. Removed the dual-mode resolveEnvironmentForRead —
  user flow only; agent flow moved to AgentConfigController.
- AgentConfigController (new): GET /api/v1/agents/config. Reads
  instanceId from JWT subject, resolves (app, env) from registry,
  returns AppConfigResponse. Registry miss → falls back to JWT env
  claim for environment, but 404s if application cannot be derived
  (no other source without registry).
- AppSettingsController: @RequestMapping("/api/v1/environments/
  {envSlug}"). List at /app-settings, per-app at /apps/{appSlug}/
  settings. Access class-wide PreAuthorize preserved (ADMIN/OPERATOR).

SPA:
- commands.ts: useAllApplicationConfigs, useApplicationConfig,
  useUpdateApplicationConfig, useProcessorRouteMapping,
  useTestExpression — rewritten URLs to /environments/{env}/apps/
  {app}/... shape. environment now required on every call. Query
  keys include environment so cache is env-scoped.
- dashboard.ts: useAppSettings, useAllAppSettings, useUpdateAppSettings
  rewritten.
- TapConfigModal: new required environment prop; callers updated.
- RouteDetail, ExchangesPage: thread selectedEnv into test-expression
  and modal.

Config changes in SecurityConfig for the new shape landed earlier in
P0.2; no security rule changes needed in this commit.

BREAKING CHANGE: /api/v1/config/** and /api/v1/admin/app-settings/**
paths removed. Agents must use /api/v1/agents/config instead of
GET /api/v1/config/{app}; users must use /api/v1/environments/{env}/
apps/{app}/config and /api/v1/environments/{env}/apps/{app}/settings.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-16 23:33:25 +02:00
hsiegeln
6b5ee10944 feat!: environment admin URLs use slug; validate and immutabilize slug
UUID-based admin paths were the only remaining UUID-in-URL pattern in
the API. Migrates /api/v1/admin/environments/{id} to /{envSlug} so
slugs are the single environment identifier in every URL. UUIDs stay
internal to the database.

- Controller: @PathVariable UUID id → @PathVariable String envSlug on
  get/update/delete and the two nested endpoints (default-container-
  config, jar-retention). Handlers resolve slug → Environment via
  EnvironmentService.getBySlug, then delegate to existing UUID-based
  service methods.
- Service: create() now validates slug against ^[a-z0-9][a-z0-9-]{0,63}$
  and returns 400 on invalid slugs. Rationale documented in the class:
  slugs are immutable after creation because they appear in URLs,
  Docker network names, container names, and ClickHouse partition keys.
- UpdateEnvironmentRequest has no slug field and Jackson's default
  ignore-unknown behavior drops any slug supplied in a PUT body;
  regression test (updateEnvironment_withSlugInBody_ignoresSlug)
  documents this invariant.
- SPA: mutation args change from { id } to { slug }. EnvironmentsPage
  still uses env.id for local selection state (UUID from DB) but
  passes env.slug to every mutation.

BREAKING CHANGE: /api/v1/admin/environments/{id:UUID}/... paths removed.
Clients must use /{envSlug}/... (slug from the environments list).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-16 23:23:31 +02:00
hsiegeln
fcb53dd010 fix!: require environment on diagram lookup and attribute keys queries
Closes two cross-env data leakage paths. Both endpoints previously
returned data aggregated across all environments, so a diagram or
attribute key from dev could appear in a prod UI query (and vice versa).

B1: GET /api/v1/diagrams?application=&routeId= now requires
?environment= and resolves agents via
registryService.findByApplicationAndEnvironment instead of
findByApplication. Prevents serving a dev diagram for a prod route.

B2: GET /api/v1/search/attributes/keys now requires ?environment=.
SearchIndex.distinctAttributeKeys gains an environment parameter and
the ClickHouse query adds the env filter alongside the existing
tenant_id filter. Prevents prod attribute names leaking into dev
autocompletion (and vice versa).

SPA hooks updated to thread environment through from
useEnvironmentStore; query keys include environment so React Query
re-fetches on env switch. No call-site changes needed — hook
signatures unchanged.

B3 (AgentMetricsController env scope) deferred to P3C: agent-env is
effectively 1:1 today via the instance_id naming
({envSlug}-{appSlug}-{replicaIndex}), and the URL migration in P3C
to /api/v1/environments/{env}/agents/{agentId}/metrics naturally
introduces env from path. A minimal P1 fix would regress the "view
metrics of a killed agent" case.

BREAKING CHANGE: Both endpoints now require ?environment= (slug).
Clients omitting the parameter receive 400.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-16 23:19:55 +02:00
hsiegeln
9b1ef51d77 feat!: scope per-app config and settings by environment
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m27s
CI / docker (push) Successful in 1m10s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 1m40s
SonarQube / sonarqube (push) Successful in 4m29s
BREAKING: wipe dev PostgreSQL before deploying — V1 checksum changes.
Agents must now send environmentId on registration (400 if missing).

Two tables previously keyed on app name alone caused cross-environment
data bleed: writing config for (app=X, env=dev) would overwrite the row
used by (app=X, env=prod) agents, and agent startup fetches ignored env
entirely.

- V1 schema: application_config and app_settings are now PK (app, env).
- Repositories: env-keyed finders/saves; env is the authoritative column,
  stamped on the stored JSON so the row agrees with itself.
- ApplicationConfigController.getConfig is dual-mode — AGENT role uses
  JWT env claim (agents cannot spoof env); non-agent callers provide env
  via ?environment= query param.
- AppSettingsController endpoints now require ?environment=.
- SensitiveKeysAdminController fan-out iterates (app, env) slices so each
  env gets its own merged keys.
- DiagramController ingestion stamps env on TaggedDiagram; ClickHouse
  route_diagrams INSERT + findProcessorRouteMapping are env-scoped.
- AgentRegistrationController: environmentId is required on register;
  removed all "default" fallbacks from register/refresh/heartbeat auto-heal.
- UI hooks (useApplicationConfig, useProcessorRouteMapping, useAppSettings,
  useAllAppSettings, useUpdateAppSettings) take env, wired to
  useEnvironmentStore at all call sites.
- New ConfigEnvIsolationIT covers env-isolation for both repositories.

Plan in docs/superpowers/plans/2026-04-16-environment-scoping.md.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-16 22:25:21 +02:00
hsiegeln
4b264b3308 feat: add CPU usage to agent response and compact cards
Backend:
- Add cpuUsage field to AgentInstanceResponse (-1 if unavailable)
- Add queryAgentCpuUsage() to AgentRegistrationController — queries
  avg CPU per instance from agent_metrics over last 2 minutes
- Wire CPU into agent list response via withCpuUsage()

Frontend:
- Add cpuUsage to schema.d.ts
- Compute maxCpu per AppGroup (max across all instances)
- Show "X% cpu" on compact cards when available (hidden when -1)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 14:12:23 +02:00
hsiegeln
cb3ebfea7c chore: rename cameleer3 to cameleer
Some checks failed
CI / cleanup-branch (push) Has been skipped
CI / build (push) Failing after 18s
CI / docker (push) Has been skipped
CI / deploy (push) Has been skipped
CI / deploy-feature (push) Has been skipped
Rename Java packages from com.cameleer3 to com.cameleer, module
directories from cameleer3-* to cameleer-*, and all references
throughout workflows, Dockerfiles, docs, migrations, and pom.xml.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 15:28:42 +02:00
hsiegeln
fd806b9f2d fix: unify container/agent log identity and fix multi-replica log capture
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m21s
CI / docker (push) Successful in 1m18s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 42s
Four logging pipeline fixes:

1. Multi-replica startup logs: remove stopLogCaptureByApp from
   SseConnectionManager — container log capture now expires naturally
   after 60s instead of being killed when the first agent connects SSE.
   This ensures all replicas' bootstrap output is captured.

2. Unified instance_id: container logs and agent logs now share the same
   instance identity ({envSlug}-{appSlug}-{replicaIndex}). DeploymentExecutor
   sets CAMELEER_AGENT_INSTANCEID per replica so the agent uses the same
   ID as ContainerLogForwarder. Instance-level log views now show both
   container and agent logs.

3. Labels-first container identity: TraefikLabelBuilder emits cameleer.replica
   and cameleer.instance-id labels. Container names are tenant-prefixed
   ({tenantId}-{envSlug}-{appSlug}-{idx}) for global Docker daemon uniqueness.

4. Environment filter on log queries: useApplicationLogs now passes the
   selected environment to the API, preventing log leakage across environments.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 10:54:05 +02:00
hsiegeln
119cf912b8 feat: add useStartupLogs hook for container startup log polling
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 23:21:23 +02:00
hsiegeln
9ac8e3604c fix: allow testing claim mapping rules before saving and keep rows editable after test
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m27s
CI / docker (push) Successful in 1m10s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 41s
The test endpoint now accepts inline rules from the client instead of reading
from the database, so unsaved rules can be tested. Matched rows show the
checkmark alongside action buttons instead of replacing them.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 18:52:18 +02:00
hsiegeln
7b73b5c9c5 feat: add per-app sensitive keys section to AppConfigDetailPage
Adds sensitiveKeys/globalSensitiveKeys/mergedSensitiveKeys fields to
ApplicationConfig, unwraps the new AppConfigResponse envelope in
useApplicationConfig, and renders an editable Sensitive Keys section
with read-only global pills and add/remove app-specific key tags.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-14 18:26:05 +02:00
hsiegeln
06c719f0dd feat: add sensitive keys API query hooks 2026-04-14 18:22:28 +02:00
hsiegeln
344700e29e feat: add React Query hooks for claim mapping rules API
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-14 16:46:40 +02:00
hsiegeln
0827fd21e3 feat: persist and display exchange properties from agent
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m59s
CI / docker (push) Successful in 2m13s
CI / deploy (push) Successful in 58s
CI / deploy-feature (push) Has been skipped
Add support for exchange properties sent by the agent alongside headers.
Properties flow through the same pipeline as headers: ClickHouse columns
(input_properties, output_properties) on both executions and
processor_executions tables, MergedExecution record, ChunkAccumulator
extraction, DetailService snapshot, and REST API response.

UI adds a Properties tab next to Headers in the process diagram detail
panel, with the same input/output split table layout.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 14:23:53 +02:00
hsiegeln
e57343e3df feat: add delta mode for counter metrics using ClickHouse lag()
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m17s
CI / docker (push) Successful in 1m12s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 42s
Counter metrics like chunks.exported.count are monotonically increasing.
Add mode=delta query parameter to the agent metrics API that computes
per-bucket deltas server-side using ClickHouse lag() window function:
max(value) per bucket, then greatest(0, current - previous) to get the
increase per period with counter-reset handling.

The chunks exported/dropped charts now show throughput per bucket
instead of the ever-increasing cumulative total.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 10:56:06 +02:00
hsiegeln
27f2503640 fix: clean up runtime UI and harden session expiry handling
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m27s
CI / docker (push) Successful in 1m13s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 42s
Remove redundant "X/X LIVE" badge from runtime page, breadcrumb trail
and routes section from agent detail page (pills moved into Process
Information card). Fix session expiry: guard against concurrent 401
refresh races and skip re-entrant triggers on auth endpoints.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 22:33:44 +02:00
hsiegeln
00c9a0006e feat: rework runtime charts and fix time range propagation
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m28s
CI / docker (push) Successful in 1m10s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 41s
Runtime page (AgentInstance):
- Rearrange charts: CPU, Memory, GC (top); Threads, Chunks Exported,
  Chunks Dropped (bottom). Removes throughput/error charts (belong on
  Dashboard, not Runtime).
- Pass global time range (from/to) to useAgentMetrics — charts now
  respect the time filter instead of always showing last 60 minutes.
- Bottom row (logs + timeline) fills remaining vertical space.

Dashboard L3:
- Processor metrics section fills remaining vertical space.
- Chart x-axis uses timestamps instead of bucket indices.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 21:59:38 +02:00
hsiegeln
51a4317440 fix: optimistically remove dismissed app from sidebar cache
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m26s
CI / docker (push) Successful in 1m14s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 38s
Sets query cache immediately on dismiss success so the sidebar updates
without waiting for the catalog refetch to complete.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 17:05:04 +02:00
hsiegeln
90c82238a0 feat: add orphaned app cleanup — auto-filter stale discovered apps, manual dismiss with data purge
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 16:19:59 +02:00