Commit Graph

1436 Commits

Author SHA1 Message Date
hsiegeln
e8de8d88ad refactor(ui/alerts/all): state filter to ButtonGroup (topnavbar style)
Replace the SegmentedTabs with multi-select ButtonGroup, matching the
topnavbar Completed/Warning/Failed/Running pattern. State dots use the
same palette as AlertStateChip (FIRING=error, ACKNOWLEDGED=warning,
PENDING=muted, RESOLVED=success). Default selection is the three "open"
states — Resolved is off by default and a single click surfaces closed
alerts without navigating to /history.
2026-04-21 13:05:32 +02:00
hsiegeln
f037d8c922 feat(alerting): server-side state+severity filters, ButtonGroup filter UI
Backend: `GET /environments/{envSlug}/alerts` now accepts optional multi-value
`state=…` and `severity=…` query params. Filters are pushed down to
PostgresAlertInstanceRepository, which appends `AND state::text = ANY(?)` /
`AND severity::text = ANY(?)` to the inbox query (null/empty = no filter).

`AlertInstanceRepository.listForInbox` gained a 7-arg overload; the old 5-arg
form is preserved as a default delegate so existing callers (evaluator,
AlertingFullLifecycleIT, PostgresAlertInstanceRepositoryIT) compile unchanged.
`InAppInboxQuery.listInbox` also has a new filtered overload.

UI: InboxPage severity filter migrated from `SegmentedTabs` (single-select,
no color cues) to `ButtonGroup` (multi-select with severity-coloured dots),
matching the topnavbar status-filter pattern. `useAlerts` forwards the
filters as query params and cache-keys on the filter tuple so each combo
is independently cached.

Unit + hook tests updated to the new contract (5 UI tests + 8 Java unit
tests passing). OpenAPI types regenerated from the fresh local backend.
2026-04-21 12:47:31 +02:00
hsiegeln
468132d1dd fix(ui/alerts): bell spacing, rule editor width, inbox bulk controls
Round 4 smoke feedback on /alerts:
- Bell now has consistent 12px gap from env selector and user name
  (wrap env + bell in flex container inside TopBar's environment prop)
- RuleEditorWizard constrained to max-width 840px (centered) and
  upgraded the page title from SectionHeader to h2 pattern used by
  the list pages
- Inbox: added select-all checkbox, severity SegmentedTabs filter
  (All / Critical / Warning / Info), and bulk-ack actions
  (Acknowledge selected + Acknowledge all firing) alongside the
  existing mark-read actions
2026-04-21 12:10:20 +02:00
hsiegeln
c443fc606a fix(alerts/ui): bell position, content tabs hidden, filters, novice labels
Surfaced during second smoke:

1. Notification bell moved — was first child of TopBar (left of
   breadcrumb); now rendered inside the `environment` slot so it
   sits between the env selector and the user menu, matching user
   expectations.

2. Content tabs (Exchanges/Dashboard/Runtime/Deployments) hidden on
   `/alerts/*` — the operational tabs don't apply there.

3. Inbox / All alerts filters now actually filter. `AlertController.list`
   accepts only `limit` — `state`/`severity` query params are dropped
   server-side. Move `useAlerts` to fetch once per env (limit 200) and
   apply filters client-side via react-query `select`, with a stable
   queryKey so filter toggles are instant and don't re-request. True
   server-side filter needs a backend change (follow-up).

4. Novice-friendly labels:
   - Inbox subtitle: "99 firing · 100 total" → "99 need attention ·
     100 total in inbox"
   - All alerts filter: Open/Firing/Acked/All →
     "Currently open"/"Firing now"/"Acknowledged"/"All states"
   - All alerts subtitle: "N shown" → "N matching your filter"
   - History subtitle: "N resolved" → "N resolved alert(s) in range"
   - Rules subtitle: "N total" → "N rule(s) configured"
   - Silences subtitle: "N active" → "N active silence(s)" or
     "Nothing silenced right now"
   - Column headers: "State" → "Status", rules "Kind" → "Type",
     rules "Targets" → "Notifies"
   - Buttons: "Ack" → "Acknowledge", silence "End" → "End early"

Updated alerts.test.tsx and e2e selector to match new behavior/labels.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-21 11:48:33 +02:00
hsiegeln
05f420d162 fix(alerts/ui): page header, scroll, title preview, bell badge polish
Visual regressions surfaced during browser smoke:

1. Page headers — `SectionHeader` renders as 12px uppercase gray (a
   section divider, not a page title). Replace with proper h2 title
   + inline subtitle (`N firing · N total` etc.) and right-aligned
   actions, styled from `alerts-page.module.css`.

2. Undefined `--space-*` tokens — the project (and `@cameleer/design-system`)
   has never shipped `--space-sm|md|lg|xl`, even though many modules
   (SensitiveKeysPage, alerts CSS, …) reference them. The fallback
   to `initial` silently collapsed gaps/paddings to 0. Define the
   scale in `ui/src/index.css` so every consumer picks it up.

3. List scrolling — DataTable was using default pagination, but with
   no flex sizing the whole page scrolled. Add `fillHeight` and raise
   `pageSize`/list `limit` to 200 so the table gets sticky header +
   internal scroll + pinned pagination footer (Gmail-style). True
   cursor-based infinite scroll needs a backend change (filed as
   follow-up — `/alerts` only accepts `limit` today).

4. Title column clipping — `.titlePreview` used `white-space: nowrap`
   + fixed `max-width`, truncating message mid-UUID. Switch to a
   2-line `-webkit-line-clamp` so full context is visible.

5. Notification bell badge invisible — `NotificationBell.module.css`
   referenced undefined tokens (`--fg`, `--hover-bg`, `--bg`,
   `--muted`). Map to real DS tokens (`--text-primary`, `--bg-hover`,
   `#fff`, `--text-muted`). The admin user currently sees no badge
   because the backend `/alerts/unread-count` returns 0 (read
   receipts) — that's data, not UI.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-21 10:40:28 +02:00
hsiegeln
10e132cd50 refactor(alerts/ui): fix leftover --muted refs in wizard steps
Two inline-style color refs in NotifyStep and TriggerStep were still
pointing at the undefined --muted token instead of the DS
--text-muted. Caught by the design-system-alignment verification
grep.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-21 10:21:00 +02:00
hsiegeln
35f17a7eeb test(alerts/e2e): adapt smoke suite to DS ConfirmDialog
The Rules list Delete and Silences End-early flows now use DS
ConfirmDialog instead of native confirm(). Update selectors to
target the dialog's role=dialog + confirm button instead of
listening for the native `dialog` event.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-21 10:19:14 +02:00
hsiegeln
e861e0199c refactor(alerts/ui): wizard banners → DS Alert, step body → section card
Promote banner and prefill warnings now render as DS Alert components
(info / warning variants). Step body wraps in sectionStyles.section
for card affordance matching other forms in the app.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-21 10:17:54 +02:00
hsiegeln
1b6e6ce40c refactor(alerts/ui): replace undefined CSS vars in wizard.module.css
Replace undefined tokens (--muted, --fg, --accent, --border,
--amber-bg) with DS tokens (--text-muted, --text-primary, --amber,
--border-subtle, --space-sm|md). Drop .promoteBanner — replaced by
DS Alert in follow-up commit.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-21 10:16:47 +02:00
hsiegeln
0037309e4f chore(alerts/ui): remove obsolete AlertRow.tsx
The feed-row component is replaced by DataTable column renderers and
the shared renderAlertExpanded content renderer. No callers remain.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-21 10:15:24 +02:00
hsiegeln
3e81572477 refactor(alerts/ui): rewrite Silences with DataTable + FormField + ConfirmDialog
Replaces raw <table> with DataTable, inline-styled form with proper
FormField hints, and native confirm() end-early with ConfirmDialog
(warning variant). Adds DS EmptyState for no-silences case.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-21 10:14:19 +02:00
hsiegeln
23f3c3990c refactor(alerts/ui): rewrite Rules list with DataTable + Dropdown + ConfirmDialog
Replaces raw <table> with DataTable, raw <select> promote control with
DS Dropdown, and native confirm() delete with ConfirmDialog. Adds DS
EmptyState with CTA for the no-rules case. Uses SectionHeader's
action slot instead of ad-hoc flex wrapper.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-21 10:12:25 +02:00
hsiegeln
436a0e4d4c refactor(alerts/ui): rewrite History as DataTable + DateRangePicker
Replaces custom feed rows with DataTable. Adds a DateRangePicker
filter (client-side) defaulting to the last 7 days. Client-side
range filter is a stopgap; a server-side range param is a future
enhancement captured in the design spec.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-21 10:10:10 +02:00
hsiegeln
a74785f64d refactor(alerts/ui): rewrite All alerts as DataTable + SegmentedTabs filter
Replaces 4-Button filter row with DS SegmentedTabs and custom row
rendering with DataTable. Shares expandedContent renderer and
severity-driven rowAccent with Inbox.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-21 10:07:38 +02:00
hsiegeln
588e0b723a refactor(alerts/ui): rewrite Inbox as DataTable with expandable rows
Replaces custom feed-row layout with the shared DataTable shell used
elsewhere in the app. Adds checkbox selection + bulk "Mark selected
read" toolbar alongside the existing "Mark all read". Uses DS
EmptyState for empty lists, severity-driven rowAccent for unread
tinting, and renderAlertExpanded for row detail.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-21 10:05:39 +02:00
hsiegeln
c87c77c1cf refactor(alerts/ui): slim alerts-page.module.css to layout-only DS tokens
Drop the feed-row classes (.row, .rowUnread, .body, .meta, .time,
.message, .actions, .empty) — these are replaced by DS DataTable +
EmptyState in follow-up tasks. Keep layout helpers for page shell,
toolbar, filter bar, bulk-action bar, title cell, and DataTable
expanded content. All colors / spacing use DS tokens.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-21 10:03:20 +02:00
hsiegeln
b16ea8b185 feat(alerts/ui): add shared renderAlertExpanded for DataTable rows
Extracts the per-row detail block used by Inbox/All/History DataTables
so the three pages share one rendering. Consumes AlertDto fields that
are nullable in the schema; hides missing fields instead of rendering
placeholders.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-21 10:01:39 +02:00
hsiegeln
4a63149338 feat(alerts/ui): add formatRelativeTime helper
Formats ISO timestamps as `Nm ago` / `Nh ago` / `Nd ago`, falling back
to an absolute locale date string for values older than 30 days. Used
by the alert DataTable Age column.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-21 10:00:15 +02:00
hsiegeln
a2b2ccbab7 feat(alerts/ui): add severityToAccent helper for DataTable rowAccent
Pure function mapping the 3-value AlertDto.severity enum to the 2-value
DataTable rowAccent prop. INFO maps to undefined (no tint) because the
DS DataTable rowAccent only supports error|warning.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-21 09:57:58 +02:00
hsiegeln
52a08a8769 docs(alerts): Implementation plan — design-system alignment for /alerts pages
Task-by-task TDD plan implementing the design spec. Splits the work
into 14 tasks: helper utilities (TDD), shared renderer, CSS token
migration, per-page rewrites (Inbox/All/History/Rules/Silences),
wizard banner migration, AlertRow deletion, E2E adaptation for
ConfirmDialog, and full verification pass. Each task produces an
atomic commit.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-21 09:49:47 +02:00
hsiegeln
3d0a4d289b docs(alerts): Design spec — design-system alignment for /alerts pages
Rework all pages under /alerts to use @cameleer/design-system components
and tokens. Unified DataTable shell for Inbox/All/History with expandable
rows; DataTable + Dropdown + ConfirmDialog for Rules list; FormField grid
+ DataTable for Silences; DS Alert for wizard banners. Replaces undefined
CSS variables (--bg, --fg, --muted, --accent) with DS tokens and removes
raw <table>/<select>/confirm() usage.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-21 09:43:19 +02:00
hsiegeln
037a27d405 fix(alerting): allow multiple open alert_instances per rule for PER_EXCHANGE
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m51s
CI / docker (push) Successful in 1m17s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 41s
V13 added a partial unique index on alert_instances(rule_id) WHERE state
IN (PENDING,FIRING,ACKNOWLEDGED). Correct for scalar condition kinds
(ROUTE_METRIC / AGENT_STATE / DEPLOYMENT_STATE / LOG_PATTERN / JVM_METRIC
/ EXCHANGE_MATCH in COUNT_IN_WINDOW) but wrong for EXCHANGE_MATCH /
PER_EXCHANGE, which by design emits one alert_instance per matching
exchange. Under V13 every PER_EXCHANGE tick with >1 match logged
"Skipped duplicate open alert_instance for rule …" at evaluator cadence
and silently lost alert fidelity — only the first matching exchange per
tick got an AlertInstance + webhook dispatch.

V15 drops the rule_id-only constraint and recreates it with a
discriminator on context->'exchange'->>'id'. Scalar kinds emit
Map.of() as context, so their expression resolves to '' — "one open per
rule" preserved. ExchangeMatchEvaluator.evaluatePerExchange always
populates exchange.id, so per-exchange instances coexist cleanly.

Two new PostgresAlertInstanceRepositoryIT tests:
  - multiple open instances for same rule + distinct exchanges all land
  - second open for identical (rule, exchange) still dedups via the
    DuplicateKeyException fallback in save() — defense-in-depth kept

Also fixes pre-existing PostgresAlertReadRepositoryIT brokenness: its
setup() inserted 3 open instances sharing one rule_id, which V13 blocked
on arrival. Migrate to one rule_id per instance (pattern already used
across other storage ITs).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-20 22:26:19 +02:00
hsiegeln
e7ce1a73d0 docs(alerting): Plan 04 implementation plan — post-ship hardening
13 atomic commits covering 5 hardening tasks:

  Task 1-2: @Schema(discriminatorMapping) on AlertCondition, derive
            polymorphic unions in enums.ts from schema
  Task 3-7: AgentState / DeploymentStatus / LogLevel / ExecutionStatus
            enum migrations + @Schema(allowableValues) on JvmMetric
  Task 8:   ContextStartupSmokeTest (unit-tier, no Testcontainers)
  Task 9-12: AlertTemplateVariables registry + round-trip test +
             SSOT endpoint + UI consumer
  Task 13:  alerting-editor.spec.ts Playwright spec

Each task has bite-sized write-test/red/green/commit steps with
exact paths and full code. Pre-flight SQL check and post-flight
self-verification scripts included.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-20 21:54:09 +02:00
hsiegeln
46867cc659 docs(alerting): Plan 04 design spec — post-ship hardening
Closes the loop on three bug classes from Plan 03 triage:
context-load regressions (missing @Autowired), UI/backend drift
on template variables, and hand-maintained TS enum unions caused
by springdoc polymorphic schema quirk.

Covers 5 tasks: context-startup smoke test, template-variables
SSOT endpoint, second Playwright spec, String-to-enum migrations
on 5 condition fields, and @DiscriminatorMapping on AlertCondition.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-20 21:44:41 +02:00
hsiegeln
efa8390108 fix(alerting): reject null fireMode on ExchangeMatchCondition + repair in-flight rows
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 2m2s
CI / docker (push) Successful in 1m20s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 37s
SonarQube / sonarqube (push) Successful in 5m31s
The rule editor wizard reset the condition payload on kind-change without
seeding a fireMode default; the ExchangeMatchCondition ctor allowed null to
pass through; AlertEvaluatorJob then NPE-looped every tick on a saved rule.

- core: compact ctor now rejects null fireMode (Jackson-deser path only — all
  production callers already pass a value).
- V14: repair existing EXCHANGE_MATCH rows with fireMode=null to
  PER_EXCHANGE + perExchangeLingerSeconds=300 (default matches the wizard).
- ui: ConditionStep.onKindChange seeds EXCHANGE_MATCH defaults so the
  Select's displayed fallback ("Per exchange") is actually in form state.
- ui: validateStep('condition', ...) now enforces fireMode presence + the
  mode-specific fields before the user reaches Review.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-20 20:05:55 +02:00
hsiegeln
e590682f8f refactor(ui/alerts): address code-review findings on alerting-enums
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 2m3s
CI / docker (push) Successful in 1m22s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 41s
Follow-up to 83837ada addressing the critical-review feedback:

- Duplicate ConditionKind type consolidated: the one in
  api/queries/alertRules.ts (which was nullable — wrong) is gone;
  single source of truth lives in this module.

- Module moved out of api/ into pages/Alerts/ where it belongs.
  api/ is the data layer; labels + hide lists are view-layer concerns.

- Hidden values formalised: Comparator.EQ and JvmAggregation.LATEST
  are intentionally not surfaced in dropdowns (noisy / wrong feature
  boundary, see in-file comments). They remain in the type unions so
  rules that carry those values save/load correctly — we just don't
  advertise them in the UI.

- JvmAggregation declaration order restored to MAX/AVG/MIN (matches
  what users saw before 83837ada). LATEST declared last; hidden.

- Snapshot tests for every visible *_OPTIONS array — reviewer signal
  in future PRs when a backend enum change or hide-list edit
  silently reshapes the dropdown.

- `toOptions` gains a JSDoc noting that label-map declaration order
  is load-bearing (ES2015 Object.keys insertion-order guarantee).

- **Honest about the springdoc schema quirk**: the generated
  polymorphic condition types resolve to `never` at the TypeScript
  level (two conflicting `kind` discriminators — the class-name
  literal and the Jackson enum — intersect to never), which silently
  defeated `Record<T, string>` exhaustiveness. The previous commit's
  "schema-derived enums" claim was accurate only for the flat-field
  enums (ConditionKind, Severity, TargetKind); condition-specific
  enums (RouteMetric, Comparator, JvmAggregation, ExchangeFireMode)
  were silently `never`. Those are now declared as hand-written
  string-literal unions with a top-of-file comment spelling out the
  issue and the regen-and-compare workflow. Real upstream fix is a
  backend-side adjustment to how springdoc emits polymorphic
  `@JsonSubTypes` — out of scope for this phase.

Verified: ui build green, 56/56 vitest pass (49 pre-existing + 7
new enum snapshots).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-20 19:26:16 +02:00
hsiegeln
83837ada8f refactor(ui/alerts): derive option lists + form-state types from schema.d.ts
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 2m8s
CI / docker (push) Successful in 1m15s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 41s
Closes item 5 on the Plan 03 cleanup triage. The option arrays
("METRICS", "COMPARATORS", KIND_OPTIONS, SEVERITY_OPTIONS, FIRE_MODES)
scattered across RouteMetricForm / JvmMetricForm / ExchangeMatchForm /
ConditionStep / ScopeStep were hand-typed string literals. They drifted
silently — P95_LATENCY_MS appeared in a dropdown without a backend
counterpart (caught at runtime in bcde6678); JvmMetric.LATEST and
Comparator.EQ existed on the backend but were missing from the UI all
along.

Fix: new `ui/src/api/alerting-enums.ts` derives every enum from
schema.d.ts and pairs each with a `Record<T, string>` label map.
TypeScript enforces exhaustiveness — adding or removing a backend
value fails the build of this file until the label map is updated.
Every consumer imports the generated `*_OPTIONS` array.

Covered (schema-derived):
  - ConditionKind            → CONDITION_KIND_OPTIONS
  - Severity                 → SEVERITY_OPTIONS
  - RouteMetric              → ROUTE_METRIC_OPTIONS
  - Comparator               → COMPARATOR_OPTIONS (adds EQ that was missing)
  - JvmAggregation           → JVM_AGGREGATION_OPTIONS (adds LATEST that was missing)
  - ExchangeMatch.fireMode   → EXCHANGE_FIRE_MODE_OPTIONS
  - AlertRuleTarget.kind     → TARGET_KIND_OPTIONS

form-state.ts: `severity: 'CRITICAL' | 'WARNING' | 'INFO'` and
`kind: 'USER' | 'GROUP' | 'ROLE'` literal unions swapped for the
derived `Severity` / `TargetKind` aliases.

Not covered, backend types them as `String` (no `@Schema(allowableValues)`
annotation yet):
  - AgentStateCondition.state
  - DeploymentStateCondition.states
  - LogPatternCondition.level
  - ExchangeFilter.status
  - JvmMetricCondition.metric

These stay hand-typed with a pointer-comment. Follow-up: add
`@Schema(allowableValues = …)` to the Java record components so the
enums land in schema.d.ts; then fold them into alerting-enums.ts.

Plus: gitnexus index-stats refresh in AGENTS.md/CLAUDE.md from the
post-deploy reindex.

Verified: ui build green, 49/49 vitest pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-20 19:02:52 +02:00
hsiegeln
f8c1ba4988 docs(auth): document user_id convention and write-path shape
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 2m2s
CI / docker (push) Successful in 1m20s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 1m41s
After the UiAuth / Oidc / UserAdmin controllers were aligned to store
bare user_ids, the rules that future sessions read were still describing
the old behaviour (OutboundConnectionAdminController "strips user:
prefix" — true mechanically but the subtlety is that the strip is
the bridge between a prefixed JWT subject and an unprefixed DB key,
not a hack).

- CLAUDE.md: expand the User persistence one-liner to state the
  convention authoritatively (local `<username>`, OIDC `oidc:<sub>`,
  JWT `user:` namespace, env-scoped controllers strip for FK).
- .claude/rules/app-classes.md:
  - Add "User ID conventions" section near the top that spells out
    write-path vs read-path behaviour in one place.
  - Add UiAuthController + OidcAuthController entries under
    security/ with their upsert shape documented.
  - Soften the OutboundConnectionAdminController line to reference
    the convention instead of restating the mechanism.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-20 18:49:22 +02:00
hsiegeln
ae6473635d fix(auth): OidcAuthController + UserAdminController upsert unprefixed
Follow-up to the UiAuthController fix: every write path that puts a row
into users/user_roles/user_groups must use the bare DB key, because
the env-scoped controllers (Alert, AlertRule, AlertSilence, Outbound)
strip "user:" before using the name as an FK. If the write path stores
prefixed, first-time alerting/outbound writes fail with
alert_rules_created_by_fkey violation.

UiAuthController shipped the model in the prior commit (bare userId
for all DB/RBAC calls, "user:"-namespaced subject for JWT signing).
Bringing the other two write paths in line:

- OidcAuthController.callback:
    userId  = "oidc:" + oidcUser.subject()    // DB key, no "user:"
    subject = "user:" + userId                // JWT subject (namespaced)
  All userRepository / rbacService / applyClaimMappings calls use
  userId. Tokens still carry the namespaced subject so
  JwtAuthenticationFilter can distinguish user vs agent tokens.

- UserAdminController.createUser: userId = request.username() (bare).
  resetPassword: dropped the "user:"-strip fallback that was only
  needed because create used to prefix — now dead.

No migration. Greenfield alpha product — any pre-existing prefixed
rows in a dev DB will become orphans on next login (login upserts
the unprefixed row, old prefixed row is harmless but unused).
Operators doing a clean re-index can wipe the DB.

Read-path controllers still strip — harmless for bare DB rows, and
OIDC humans (JWT sub "user:oidc:<s>") still resolve correctly to
the new DB key "oidc:<s>" after stripping.

Verified: 45/45 alerting + outbound ITs pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-20 18:44:17 +02:00
hsiegeln
6b5aefd4c2 docs(gitnexus): re-run analyze after cleanup-batch commits
Post-commit stats: 8524 nodes / 22174 edges / 415 clusters / 300 flows
(up from 8513/22146/409/300 after the five cleanup-batch commits).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-20 18:27:27 +02:00
hsiegeln
1ea0258393 fix(auth): upsert UI login user_id unprefixed (drop docker seeder workaround)
Root cause of the mismatch that prompted the one-shot cameleer-seed
docker service: UiAuthController stored users.user_id as the JWT
subject "user:admin" (JWT sub format). Every env-scoped controller
(Alert, AlertSilence, AlertRule, OutboundConnectionAdmin) already
strips the "user:" prefix on the read path — so the rest of the
system expects the DB key to be the bare username. With UiAuth
storing prefixed, fresh docker stacks hit
"alert_rules_created_by_fkey violation" on the first rule create.

Fix: inside login(), compute `userId = request.username()` and use
it everywhere the DB/RBAC layer is touched (isLocked, getPasswordHash,
record/clearFailedLogins, upsert, assignRoleToUser, addUserToGroup,
getSystemRoleNames). Keep `subject = "user:" + userId` — we still
sign JWTs with the namespaced subject so JwtAuthenticationFilter can
distinguish user vs agent tokens.

refresh() and me() follow the same rule via a stripSubjectPrefix()
helper (JWT subject in, bare DB key out).

With the write path aligned, the docker bridge is no longer needed:
- Deleted deploy/docker/postgres-init.sql
- Deleted cameleer-seed service from docker-compose.yml

Scope: UiAuthController only. UserAdminController + OidcAuthController
still prefix on upsert — that's the bug class the triage identified
as "Option A or B either way OK". Not changing them now because:
  a) prod admins are provisioned unprefixed through some other path,
     so those two files aren't the docker-only failure observed;
  b) stripping them would need a data migration for any existing
     prod users stored prefixed, which is out of scope for a cleanup
     phase. Follow-up worth scheduling if we ever wire OIDC or admin-
     created users into alerting FKs.

Verified: 33/33 alerting+outbound controller ITs pass (9 outbound,
10 rules, 9 silences, 5 alert inbox).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-20 18:26:03 +02:00
hsiegeln
09b49f096c feat(alerting): per-severity breakdown on unread-count DTO
Spec §13 calls for the notification bell to colour-code by highest
unread severity (CRITICAL → error, WARNING → amber, INFO → muted).
The old { count } DTO forced the UI to pick one static colour, so
NotificationBell shipped with a TODO. Grow the contract instead:

  UnreadCountResponse = { total, bySeverity: { CRITICAL, WARNING, INFO } }

Guarantees:
- every severity is always present with a >=0 value (no undefined
  keys on the wire), so the UI can branch without defaults.
- total = sum of bySeverity values — kept explicit on the wire for
  cheap top-line display, not recomputed client-side.

Backend
- AlertInstanceRepository: replaces countUnreadForUser(long) with
  countUnreadBySeverityForUser returning Map<AlertSeverity, Long>.
  One SQL round-trip per (env, user) — GROUP BY ai.severity over the
  same NOT EXISTS(alert_reads) filter.
- UnreadCountResponse.from(Map) normalises and defensively copies;
  missing severities default to 0.
- InAppInboxQuery.countUnread now returns the DTO, caches the full
  response (still 5s TTL) so severity breakdown gets the same
  hit-rate as the total did before.
- AlertController just hands the DTO back.

Breaking change — no backwards-compat shim: the `count` field is
gone. UI and tests updated in the same commit; there are no other
API consumers in the tree.

Frontend
- Regenerated openapi.json + schema.d.ts against a fresh build of
  the new backend.
- NotificationBell branches badge colour on the highest unread
  severity (CRITICAL > WARNING > INFO) via new CSS variants.
- Tests cover all four paths: zero, critical-present, warning-only,
  info-only.

Tests: 7 unit tests + 12 ITs (incl. new grouping + empty-map)
       + 49 vitest (was 46; +3 severity-branch assertions).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-20 18:15:56 +02:00
hsiegeln
18cacb33ee docs(alerting): align @JsonTypeInfo spec with shipped code
Design spec and Plan 02 described AlertCondition polymorphism as
Id.DEDUCTION, but the code that shipped in PR #140 uses Id.NAME with
property="kind" and include=EXISTING_PROPERTY. The `kind` field is
real on every subtype and the DB stores it in a separate column
(condition_kind), so reading the discriminator directly is simpler
than deduction — update the docs to match. Also add `"kind"` to the
example JSON payloads so they match on-wire reality.

OutboundAuth (Plan 01) correctly still uses Id.DEDUCTION and is
unchanged.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-20 18:04:17 +02:00
hsiegeln
d850d00bab docs(gitnexus): refresh index stats + repo name (alerting-02 → cameleer-server)
Re-ran `npx gitnexus analyze --embeddings` after PR #144 merge.
8513 symbols / 22146 relationships / 300 execution flows.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-20 18:02:18 +02:00
hsiegeln
579b5f1a04 chore(ui): delete unused usePageVisible hook
Added as a reusable primitive during Plan 03 Task 9, but the intended
consumer (NotificationBell live-region refresh) was removed during
code review, leaving the hook unused. Delete it — YAGNI; reintroduce
when a real consumer shows up.

Verified upstream impact (gitnexus): 0 callers, LOW risk.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-20 18:02:04 +02:00
ec460faf02 Merge pull request 'feat(alerting): Plan 03 — UI + backfills (SSRF guard, metrics caching, docker stack)' (#144) from feat/alerting-03-ui into main
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 2m1s
CI / docker (push) Successful in 1m16s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 42s
Reviewed-on: #144
2026-04-20 16:27:49 +02:00
hsiegeln
1ebc2fa71e test(ui/alerts): Playwright E2E smoke (sidebar, rule CRUD, CMD-K, silence CRUD)
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 2m10s
CI / cleanup-branch (pull_request) Has been skipped
CI / build (pull_request) Successful in 2m34s
CI / docker (pull_request) Has been skipped
CI / deploy (pull_request) Has been skipped
CI / deploy-feature (pull_request) Has been skipped
CI / docker (push) Successful in 5m11s
CI / deploy (push) Has been skipped
CI / deploy-feature (push) Successful in 40s
fixtures.ts: auto-applied login fixture — visits /login?local to skip OIDC
auto-redirect, fills username/password via label-matcher, clicks 'Sign in',
then selects the 'default' env so alerting hooks enable (useSelectedEnv gate).
Override via E2E_ADMIN_USER + E2E_ADMIN_PASS.

alerting.spec.ts: 4 tests against the full docker-compose stack:
 - sidebar Alerts accordion → /alerts/inbox
 - 5-step wizard: defaults-only create + row delete (unique timestamp name
   avoids strict-mode collisions with leftover rules)
 - CMD-K palette via SearchTrigger click (deterministic; Ctrl+K via keyboard
   is flaky when the canvas doesn't have focus)
 - silence matcher-based create + end-early

DS FormField renders labels as generics (not htmlFor-wired), so inputs are
targeted by placeholder or label-proximity locators instead of getByLabel.

Does not exercise fire→ack→clear; that's covered backend-side by
AlertingFullLifecycleIT (Plan 02). UI E2E for that path would need event
injection into ClickHouse, out of scope for this smoke.
2026-04-20 16:18:17 +02:00
hsiegeln
d88bede097 chore(docker): seeder service pre-creates unprefixed 'admin' user row
Alerting + outbound controllers resolve acting user via
authentication.name with 'user:' prefix stripped → 'admin'. But
UserRepository.upsert stores env-admin as 'user:admin' (JWT sub format).
The resulting FK mismatch manifests as 500 'alert_rules_created_by_fkey'
on any create operation in a fresh docker stack.

Workaround: run-once 'cameleer-seed' compose service runs psql against
deploy/docker/postgres-init.sql after the server is healthy (i.e. after
Flyway migrations have created tenant_default.users), inserting
user_id='admin' idempotently. The root-cause fix belongs in the backend
(either stop stripping the prefix in alerting/outbound controllers, or
normalise storage to the unprefixed form) and is out of scope for
Plan 03.
2026-04-20 16:18:07 +02:00
hsiegeln
bcde6678b8 fix(ui/alerts): align RouteMetric metric enum with backend; pre-populate ROUTE_METRIC defaults
- RouteMetricForm dropped P95_LATENCY_MS — not in cameleer-server-core
  RouteMetric enum (valid: ERROR_RATE, P99_LATENCY_MS, AVG_DURATION_MS,
  THROUGHPUT, ERROR_COUNT).
- initialForm now returns a ready-to-save ROUTE_METRIC condition
  (metric=ERROR_RATE, comparator=GT, threshold=0.05, windowSeconds=300),
  so clicking through the wizard with all defaults produces a valid rule.
  Prevents a 400 'missing type id property kind' + 400 on condition enum
  validation if the user leaves the condition step untouched.
2026-04-20 16:17:59 +02:00
hsiegeln
5edf7eb23a fix(alerting): @Autowired on AlertingMetrics production constructor
Task 29's refactor added a package-private test-friendly constructor
alongside the public production one. Without @Autowired Spring cannot pick
which constructor to use for the @Component, and falls back to searching
for a no-arg default — crashing startup with 'No default constructor found'.

Detected when launching the server via the new docker-compose stack; unit
tests still pass because they invoke the package-private test constructor
directly.
2026-04-20 16:02:48 +02:00
hsiegeln
1ed2d3a611 chore(docker): full-stack docker-compose mirroring deploy/ k8s manifests
Mirrors the k8s manifests in deploy/ as a local dev stack:
  - cameleer-postgres   (matches deploy/cameleer-postgres.yaml)
  - cameleer-clickhouse (matches deploy/cameleer-clickhouse.yaml, default CLICKHOUSE_DB=cameleer)
  - cameleer-server     (built from Dockerfile, env mirrors deploy/base/server.yaml)
  - cameleer-ui         (built from ui/Dockerfile, served on host :8080 to leave :5173 free for Vite dev)

Dockerfile + ui/Dockerfile: REGISTRY_TOKEN is now optional (empty → skip Maven/npm auth).
cameleer-common package is public, so anonymous pulls succeed; private packages still require the token.

Backend defaults tuned for local E2E:
  - RUNTIME_ENABLED=false (no Docker-in-Docker deployments in dev stack)
  - OUTBOUND_HTTP_ALLOW_PRIVATE_TARGETS=true (so webhook tests can target host.docker.internal etc.)
  - UIUSER/UIPASSWORD=admin/admin (matches Playwright E2E_ADMIN_USER/PASS defaults)
  - CORS includes both :5173 (Vite) and :8080 (nginx)
2026-04-20 15:52:24 +02:00
hsiegeln
f75ee9f352 docs(alerting): UI map + admin-guide walkthrough for Plan 03
.claude/rules/ui.md now maps every Plan 03 UI surface. Admin guide gains
an inbox/rules/silences walkthrough so ops teams can start in the UI
without reading the spec.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-20 14:55:36 +02:00
hsiegeln
9f109b20fd perf(alerting): 30s TTL cache on AlertingMetrics gauge suppliers
Prometheus scrapes can fire every few seconds. The open-alerts / open-rules
gauges query Postgres on each read — caching the values for 30s amortises
that to one query per half-minute. Addresses final-review NIT from Plan 02.

- Introduces a package-private TtlCache that wraps a Supplier<Long> and
  memoises the last read for a configurable Duration against a Supplier<Instant>
  clock.
- Wraps each gauge supplier (alerting_rules_total{enabled|disabled},
  alerting_instances_total{state}) in its own TtlCache.
- Adds a test-friendly constructor (package-private) taking explicit
  Duration + Supplier<Instant> so AlertingMetricsCachingTest can advance
  a fake clock without waiting wall-clock time.
- Adds AlertingMetricsCachingTest covering:
  * supplier invoked once per TTL across repeated scrapes
  * 29 s elapsed → still cached; 31 s elapsed → re-queried
  * gauge value reflects the cached result even after delegate mutates

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-20 14:22:54 +02:00
hsiegeln
5ebc729b82 feat(alerting): SSRF guard on outbound connection URL
Rejects webhook URLs that resolve to loopback, link-local, or RFC-1918
private ranges (IPv4 + IPv6 ULA fc00::/7). Enforced on both create and
update in OutboundConnectionServiceImpl before persistence; returns 400
Bad Request with "private or loopback" in the body.

Bypass via `cameleer.server.outbound-http.allow-private-targets=true`
for dev environments where webhooks legitimately point at local
services. Production default is `false`.

Test profile sets the flag to `true` in application-test.yml so the
existing ITs that post webhooks to WireMock on https://localhost:PORT
keep working. A dedicated OutboundConnectionSsrfIT overrides the flag
back to false (via @TestPropertySource + @DirtiesContext) to exercise
the reject path end-to-end through the admin controller.

Plan 01 scope; required before SaaS exposure (spec §17).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-20 14:17:44 +02:00
hsiegeln
f4c2cb120b feat(ui/alerts): CMD-K sources for alerts + alert rules
Extends operationalSearchData with open alerts (FIRING|ACKNOWLEDGED) and
all rules. Badges convey severity + state. Selecting an alert navigates to
/alerts/inbox/{id}; a rule navigates to /alerts/rules/{id}. Uses the
existing CommandPalette extension point — no new registry.
2026-04-20 14:09:39 +02:00
hsiegeln
8689643e11 feat(ui/alerts): SilencesPage with matcher-based create + end-early action
Matcher accepts ruleId and/or appSlug. Server enforces endsAt > startsAt
(V12 CHECK constraint) and matcher_matches() at dispatch time (spec §7).
2026-04-20 14:08:27 +02:00
hsiegeln
0191ca4b13 feat(ui/alerts): render promotion warnings in wizard banner
Fetches target-env apps (useCatalog) and env-allowed outbound
connections, passes them to prefillFromPromotion, and renders the
returned warnings in an amber banner above the step nav. Warnings list
the field name and the remediation message so users see crossings that
need manual adjustment before saving.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-20 14:05:08 +02:00
hsiegeln
3963ea5591 feat(ui/alerts): ReviewStep + promotion prefill warnings
Review step dumps a human summary plus raw request JSON, and (when a
setter is supplied) offers an Enabled-on-save Toggle. Promotion prefill
now returns {form, warnings}: clears agent IDs (per-env), flags missing
apps in target env, and flags webhook connections not allowed in target
env. 4 Vitest cases cover copy-name, agent clear, app-missing, and
webhook-not-allowed paths.

The wizard now consumes {form, warnings}; Task 25 renders the warnings
banner.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-20 14:04:04 +02:00
hsiegeln
816096f4d1 feat(ui/alerts): NotifyStep (MustacheEditor for title/message/body, targets, webhook bindings)
Title and message use MustacheEditor with kind-specific autocomplete.
Preview button posts to the render-preview endpoint and shows rendered
title/message inline. Targets combine users/groups/roles into a unified
Badge pill list. Webhook picker filters to outbound connections allowed
in the current env (spec 6, allowed_environment_ids). Header overrides
use plain Input rather than MustacheEditor for now.

Deviations:
 - RenderPreviewRequest is Record<string, never>, so we send {} instead
   of {titleTemplate, messageTemplate}; backend resolves from rule state.
 - RenderPreviewResponse has {title, message} (plan draft used
   renderedTitle/renderedMessage).
 - Button size="sm" not "small" (DS only accepts sm|md).
 - Target kind field renamed from targetKind to kind to match
   AlertRuleTarget DTO.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-20 14:02:26 +02:00
hsiegeln
d42a6ca6a8 feat(ui/alerts): TriggerStep (evaluation interval, for-duration, re-notify, test-evaluate)
Three numeric inputs for evaluation cadence, for-duration, and
re-notification window, plus a Test evaluate button for saved rules.
TestEvaluateRequest is empty on the wire (server uses the rule id), so
we send {} and rely on the backend to evaluate the current saved state.

Deviation: plan draft passed {condition: toRequest(form).condition} into
the request body. The generated TestEvaluateRequest type is
Record<string, never>, so we send an empty body.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-20 14:00:33 +02:00