feat(alerting): Plan 03 — UI + backfills (SSRF guard, metrics caching, docker stack) #144

claude · 2026-04-20T16:20:07+02:00

Summary

Plan 03 delivers the full alerting UI on top of Plan 02's backend, plus two backend backfills (SSRF guard on outbound URLs + 30s TTL cache on AlertingMetrics gauges) and a complete local docker-compose stack mirroring the k8s manifests in deploy/.

UI additions (all under ui/src/pages/Alerts/ + ui/src/components/):

/alerts/{inbox,all,history,rules,rules/new,rules/:id,silences} routes (lazy-loaded).
Sidebar accordion section (Inbox / All / Rules / Silences / History) + NotificationBell in TopBar that polls /alerts/unread-count every 30s (polling pauses while the tab is hidden because the query sets TanStack Query's refetchIntervalInBackground to false).
5-step rule editor wizard (Scope / Condition / Trigger / Notify / Review) with 6 kind-specific condition sub-forms (ROUTE_METRIC, EXCHANGE_MATCH, AGENT_STATE, DEPLOYMENT_STATE, LOG_PATTERN, JVM_METRIC).
Env-promotion flow (pure client-side URL prefill with warnings for cross-env agents and disallowed outbound connections — no new REST endpoint).
Shared <MustacheEditor /> (CodeMirror 6) with variable autocomplete + inline linter. Registry (alert-variables.ts) mirrors NotificationContextBuilder leaves.
AlertStateChip, SeverityBadge, InboxPage (bulk-read), AllAlertsPage (state filter), HistoryPage, RulesListPage (enable/disable + delete + promote), SilencesPage (matcher-based create + end-early).
CMD-K integration — alert + alertRule result categories via the existing LayoutShell searchData extension point.
TanStack Query hooks: alerts.ts, alertRules.ts, alertSilences.ts, alertNotifications.ts, alertMeta.ts. All env-scoped via useSelectedEnv.

Backend backfills:

SsrfGuard — rejects outbound webhook URLs that resolve to loopback, link-local, RFC-1918 private ranges, or IPv6 ULA. Wired into OutboundConnectionServiceImpl.create/update. Bypass via cameleer.server.outbound-http.allow-private-targets=true for dev.
AlertingMetrics gauges now wrap their Postgres-backed suppliers in a 30s TTL cache so Prometheus scrapes don't produce per-scrape DB queries (final-review NIT from Plan 02).
Hotfix during E2E: @Autowired on the AlertingMetrics production constructor so Spring picks it over the package-private test-friendly one (Task 29 refactor had introduced ambiguity).

Docker stack (new docker-compose.yml mirroring deploy/):

cameleer-postgres (matches deploy/cameleer-postgres.yaml).
cameleer-clickhouse (matches deploy/cameleer-clickhouse.yaml; CLICKHOUSE_DB=cameleer).
cameleer-server built from the repo Dockerfile (REGISTRY_TOKEN now optional — cameleer-common is public).
cameleer-ui built from ui/Dockerfile on host :8080 so Vite dev (npm run dev:local) keeps :5173 free.
cameleer-seed one-shot service that seeds user_id='admin' in tenant_default.users after the server is healthy, bridging a pre-existing FK mismatch between UserRepository storage (prefixed user:admin) and alerting-controller usage (stripped admin). The root-cause fix belongs in a future backend cleanup.

Docs + rules:

.claude/rules/ui.md — Alerts section mapping every Plan 03 UI surface.
docs/alerting.md — UI walkthrough (sidebar / bell / wizard / Mustache autocomplete / env promotion / CMD-K).

Plan + spec

Spec: docs/superpowers/specs/2026-04-19-alerting-design.md §9, §12, §13, §17.
Plan: docs/superpowers/plans/2026-04-20-alerting-03-ui.md.

Supersedes chore/openapi-regen-post-plan02 — delete that branch after merge.

Test plan

Frontend unit suites: 47/47 pass across 15 files (cd ui && npm test). Covers query hooks, CM6 completion + linter, Mustache variable registry, wizard form-state, promotion prefill, AlertStateChip, SeverityBadge, NotificationBell, usePageVisible.
Frontend TypeScript clean: cd ui && npx tsc -p tsconfig.app.json --noEmit → zero errors.
Frontend build succeeds: cd ui && npm run build (RuleEditorWizard chunk ~120 KB gzip incl. CM6).
Backend Plan 03 suites: 11/11 pass (mvn -pl cameleer-server-app -am test -Dtest='SsrfGuardTest,AlertingMetricsCachingTest,OutboundConnectionSsrfIT' -Dsurefire.failIfNoSpecifiedTests=false): 8 SsrfGuard + 2 AlertingMetrics caching + 1 SSRF admin-controller IT.
Regression: existing OutboundConnectionAdminControllerIT 9/9 pass with allow-private-targets=true in test profile.
Playwright E2E: 4/4 pass against the docker stack (cd ui && npx playwright test). Covers sidebar nav, rule CRUD via wizard, CMD-K open/close, silence create + end-early.
Still open: manual smoke on a real deployment (post-merge).
Still open: follow-up backend cleanup to unify UserRepository storage with alerting/outbound controller stripping so the compose seeder becomes redundant.

End-to-end fire → ack → clear is covered server-side by Plan 02's AlertingFullLifecycleIT. UI E2E for that path would require event injection into ClickHouse and is out of scope.

🤖 Generated with Claude Code


claude added 39 commits 2026-04-20 16:20:07 +02:00

docs(alerting): Plan 03 — UI + backfills implementation plan 2942025a54

32 tasks across 10 phases:
 - Foundation: Vitest, CodeMirror 6, Playwright scaffolding + schema regen.
 - API: env-scoped query hooks for alerts/rules/silences/notifications.
 - Components: AlertStateChip, SeverityBadge, NotificationBell (with tab-hidden poll pause), MustacheEditor (CM6 with variable autocomplete + linter).
 - Routes: /alerts/* section with sidebar accordion; bell mounted in TopBar.
 - Pages: Inbox / All / History / Rules (with env promotion) / Silences.
 - Wizard: 5-step editor with kind-specific condition forms + test-evaluate + render-preview + prefill warnings.
 - CMD-K: alerts + rules sources via LayoutShell extension.
 - Backend backfills: SSRF guard on outbound URL + 30s AlertingMetrics gauge cache.
 - Final: Playwright smoke, .claude/rules/ui.md + admin-guide updates, full build/test/PR.

Decisions: CM6 over Monaco/textarea (90KB gzipped, ARIA-conformant); CMD-K extension via existing LayoutShell searchData (not a new registry); REST-API-driven tests per project test policy.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

chore(ui): add Vitest + Testing Library scaffolding 0aa1776b57

Prepares for Plan 03 unit tests (MustacheEditor, NotificationBell, wizard step
validation). jsdom environment + jest-dom matchers + canary test verifies the
wiring.

chore(ui): add CodeMirror 6 + Playwright config 1260cbe674

Install @codemirror/{view,state,autocomplete,commands,language,lint}
and @lezer/common — needed by Phase 3's MustacheEditor (Task 13).
CM6 picked over a raw textarea for its small incremental-rendering
bundle, full ARIA/keyboard support, and pluggable autocomplete +
linter APIs that map cleanly to Mustache token parsing.

Add ui/playwright.config.ts wiring Task 30's E2E smoke:
- testDir ./src/test/e2e, single worker, trace+screenshot on failure
- webServer launches `npm run dev:local` (backend on :8081 required)
- PLAYWRIGHT_BASE_URL env var skips the dev server for CI against a
  pre-deployed UI

Add test:e2e / test:e2e:ui npm scripts and exclude Playwright's
test-results/ and playwright-report/ from git. @playwright/test
itself was already in devDependencies from an earlier task.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

chore(ui): regenerate openapi.json + schema.d.ts from deployed Plan 02 backend b2066cdb68

Fetched from http://192.168.50.86:30090/api/v1/api-docs via
`npm run generate-api:live`. Adds TypeScript types for the new alerting
REST surface merged in #140:

- 15 alerting paths under /environments/{envSlug}/alerts/** (rules CRUD,
  enable/disable, render-preview, test-evaluate, inbox, unread-count,
  ack/read/bulk-read, silences CRUD, per-alert notifications)
- 1 flat notification retry path /alerts/notifications/{id}/retry
- 4 outbound-connection admin paths (from Plan 01 #139)

Verified tsc -p tsconfig.app.json --noEmit exits 0 — no existing SPA
call sites break against the fresh types. Plan 03 UI work can consume
these directly.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

feat(ui/alerts): shared env helper for alerting query hooks 1a8b9eb41b

feat(ui/alerts): TanStack Query hooks for /alerts endpoints 83a8912da6

Adds env-scoped hooks for the alerts inbox:
- useAlerts (30s poll, background-paused, filter-aware)
- useAlert, useUnreadCount (30s poll)
- useAckAlert, useMarkAlertRead, useBulkReadAlerts (mutations that
  invalidate the alerts query key tree + unread-count)

Test file uses .tsx because the QueryClientProvider wrapper relies on
JSX; vitest picks up both .ts and .tsx via the configured include glob.
Client mock targets the actual export name (`api` in ../client) rather
than the `apiClient` alias that alertMeta re-exports.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

fix(ui/alerts): bulk-read body uses instanceIds to match BulkReadRequest DTO 82c29f46a5

Plan 03 prose had 'alertInstanceIds'; backend record is 'instanceIds'.

feat(ui/alerts): alert rule query hooks (CRUD, enable/disable, preview, test-evaluate) c6c3dd9cfe

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

feat(ui/alerts): silence + notification query hooks 51bc796bec

feat(ui/alerts): AlertStateChip + SeverityBadge components 31ee974830

State colors follow the convention from @cameleer/design-system (CRITICAL->error,
WARNING->warning, INFO->auto). Silenced pill stacks next to state for the spec
section 8 audit-trail surface.

feat(ui/alerts): NotificationBell with Page Visibility poll pause 197c60126c

Adds a header bell component linking to /alerts/inbox with an unread-count
badge for the selected environment. Polling pauses when the tab is hidden
via TanStack Query's refetchIntervalInBackground:false (already set on
useUnreadCount); the new usePageVisible hook gives components a
re-renders-on-visibility-change signal for future defense-in-depth.
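
The poll-pause behaviour described above can be sketched as a plain options object. This is a minimal illustration, not the hook from this PR: the helper name `unreadCountQueryOptions` and the query-key shape are assumptions; only the option names (`queryKey`, `enabled`, `refetchInterval`, `refetchIntervalInBackground`) come from TanStack Query.

```typescript
// Hypothetical helper: builds the options a `useQuery` call could receive
// for the unread-count poll. The key shape is an assumption for illustration.
interface UnreadCountQueryOptions {
  queryKey: readonly unknown[];
  enabled: boolean;
  refetchInterval: number;
  refetchIntervalInBackground: boolean;
}

function unreadCountQueryOptions(envSlug: string | undefined): UnreadCountQueryOptions {
  return {
    // Env slug is part of the key so switching environments triggers a refetch.
    queryKey: ["alerts", envSlug ?? null, "unread-count"],
    // Without a selected environment the query stays disabled.
    enabled: envSlug !== undefined && envSlug !== "",
    // Poll every 30 seconds while the tab is visible.
    refetchInterval: 30_000,
    // false: interval refetches are skipped while the tab is hidden.
    refetchIntervalInBackground: false,
  };
}
```

With this shape no extra visibility subscription is needed in the component; the query options alone decide when polling runs.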

Plan-prose deviation: the plan assumed UnreadCountResponse carries a
bySeverity map for per-severity badge coloring, but the backend DTO only
exposes a scalar `count`. The bell reads `data?.count` and renders a single
var(--error) tint; a TODO references spec §13 for future per-severity work
that would require expanding the DTO.

Tests: usePageVisible toggles on visibilitychange events; NotificationBell
renders the bell with no badge at count=0 and shows "3" at count=3.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

fix(ui/alerts): remove dead usePageVisible subscription; align alerts test mock with DTO 38083d7c3f

NotificationBell used a usePageVisible() subscription that re-rendered on
every visibilitychange without consuming the value. TanStack Query's
refetchIntervalInBackground:false already pauses polling; the extra
subscription was speculative generality. Dropped the import + call + JSDoc
reference; usePageVisible hook + test retained as a reusable primitive.

Also: alerts.test.tsx 'returns the server payload unmodified' asserted a
pre-plan {total, bySeverity} shape, but UnreadCountResponse is actually
{count}. Fixed mock + assertion to {count: 3}.

feat(ui/alerts): Mustache variable metadata registry for autocomplete 5ddd89f883

ALERT_VARIABLES mirrors the spec §8 context map. availableVariables(kind)
returns the variables valid for that kind (always-available vars + kind-specific
vars). extractReferences + unknownReferences drive the inline amber linter.
Any variable added to the backend NotificationContext must be added here too.

fix(ui/alerts): align ALERT_VARIABLES registry with NotificationContextBuilder 18e6dde67a

Plan prose had spec §8 idealized leaves, but the backend NotificationContext
only emits a subset:
  ROUTE_METRIC / EXCHANGE_MATCH → route.id + route.uri (uri added)
  LOG_PATTERN → log.pattern + log.matchCount (renamed from log.logger/level/message)
  app.slug / app.id → scoped to non-env kinds (removed from 'always')
  exchange.link / alert.comparator / alert.window / app.displayName → removed (backend doesn't emit)

Without this alignment the Task 11 linter would (1) flag valid route.uri as
unknown, (2) suggest log.{logger,level,message} as valid paths that render
empty, and (3) flag app.slug on env-wide rules.
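
A rough sketch of how such a registry can feed the linter is shown here. The function names follow the commit messages, but their signatures are assumptions, and `alert.severity` is a made-up stand-in for the always-available variables; only the route and log paths are taken from the list above.

```typescript
type ConditionKind = "ROUTE_METRIC" | "EXCHANGE_MATCH" | "LOG_PATTERN";

// Assumption: placeholder for the "always" variables (not from the source).
const ALWAYS_VARS: string[] = ["alert.severity"];

// Kind-specific paths as listed in the commit message.
const KIND_VARS: Record<ConditionKind, string[]> = {
  ROUTE_METRIC: ["route.id", "route.uri"],
  EXCHANGE_MATCH: ["route.id", "route.uri"],
  LOG_PATTERN: ["log.pattern", "log.matchCount"],
};

function availableVariables(kind: ConditionKind): string[] {
  return [...ALWAYS_VARS, ...KIND_VARS[kind]];
}

// Collects every dotted path that appears inside a closed {{ ... }} tag.
function extractReferences(template: string): string[] {
  const refs: string[] = [];
  const tag = /\{\{\s*([A-Za-z0-9_.]+)\s*\}\}/g;
  let match: RegExpExecArray | null;
  while ((match = tag.exec(template)) !== null) {
    refs.push(match[1]);
  }
  return refs;
}

// References the backend would not emit for this kind (they would render empty).
function unknownReferences(template: string, kind: ConditionKind): string[] {
  const allowed = new Set(availableVariables(kind));
  return extractReferences(template).filter((ref) => !allowed.has(ref));
}
```

For example, `unknownReferences("{{route.uri}} failed: {{ log.level }}", "ROUTE_METRIC")` yields `["log.level"]` under these assumed tables, while `route.uri` passes.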

feat(ui/alerts): CM6 completion + linter for Mustache templates ac2a943feb

completion fires after {{ and narrows as the user types; apply() closes the
tag automatically. Linter raises an error on unclosed {{, a warning on
references that aren't in the allowed-variable set for the current condition
kind. Kind-specific allowed set comes from availableVariables().
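
The two lint rules can be illustrated without CodeMirror as a pure function that returns positioned diagnostics. This is a sketch under assumptions: `lintTemplate` and `TemplateDiagnostic` are hypothetical names, and the field names are merely chosen to resemble the `from` / `to` / `severity` / `message` shape that CM6 lint sources return.

```typescript
interface TemplateDiagnostic {
  from: number; // start offset in the template
  to: number;   // end offset (exclusive)
  severity: "error" | "warning";
  message: string;
}

function lintTemplate(template: string, allowed: ReadonlySet<string>): TemplateDiagnostic[] {
  const diagnostics: TemplateDiagnostic[] = [];
  let pos = 0;
  while (pos < template.length) {
    const open = template.indexOf("{{", pos);
    if (open === -1) break;
    const close = template.indexOf("}}", open + 2);
    if (close === -1) {
      // Rule 1: an opening {{ without a matching }} is an error.
      diagnostics.push({ from: open, to: template.length, severity: "error", message: "Unclosed {{" });
      break;
    }
    const name = template.slice(open + 2, close).trim();
    if (!allowed.has(name)) {
      // Rule 2: a reference outside the allowed set is only a warning.
      diagnostics.push({ from: open, to: close + 2, severity: "warning", message: `Unknown variable: ${name}` });
    }
    pos = close + 2;
  }
  return diagnostics;
}
```

Keeping the rules in a pure function like this makes them testable without mounting an editor; the editor integration then only has to map the result onto its own diagnostic type.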

feat(ui/alerts): MustacheEditor component (CM6 shell with completion + linter) 019e79a362

Wires the mustache-completion source and mustache-linter into a CodeMirror 6
EditorView. Accepts kind (filters variables) and reducedContext (env-only for
connection URLs). singleLine prevents newlines for URL/header fields. Host
ref syncs when the parent replaces value (promotion prefill).

feat(ui/alerts): register /alerts/* routes with placeholder pages 167d0ebd42

Adds 6 lazy-loaded route entries for the alerting UI (Inbox, All, History,
Rules list, Rule editor wizard, Silences) plus an `/alerts` → `/alerts/inbox`
redirect. Page components are placeholder stubs to be replaced in Phase 5/6/7.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

feat(ui/alerts): Alerts sidebar section with Inbox/All/Rules/Silences/History 54e4217e21

Adds `buildAlertsTreeNodes` to sidebar-utils and renders an Alerts section
between Applications and Starred in LayoutShell. The section uses an
accordion pattern — entering `/alerts/*` collapses apps/admin/starred and
restores their state on leave.

gitnexus_impact(LayoutContent, upstream) = LOW (0 direct callers; rendered
only by LayoutShell's provider wrapper).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

feat(ui/alerts): mount NotificationBell in TopBar 891dcaef32

Renders the `<NotificationBell />` as the first child of `<TopBar>` (before
`<SearchTrigger>`). The bell links to `/alerts/inbox` and shows the unread
alert count for the currently selected environment.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

feat(ui/alerts): InboxPage with ack + bulk-read actions 8d8bae4e18

AlertRow is reused by AllAlertsPage and HistoryPage. A row is marked as read
when its link is followed (the detail sub-route will be added in
phase 10 polish). FIRING rows get an amber left border.

feat(ui/alerts): AllAlertsPage + HistoryPage 269a63af1f

AllAlertsPage: state filter chips (Open/Firing/Acked/All).
HistoryPage: RESOLVED filter, respects retention window.

feat(ui/alerts): RulesListPage with enable/disable, delete, env promotion 7e91459cd6

Promotion dropdown builds a /alerts/rules/new URL with promoteFrom, ruleId,
and targetEnv query params — the wizard will read these in Task 24 and
pre-fill the form with source-env prefill + client-side warnings.
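
As a sketch of the URL hand-off described above: the three query parameter names come from the commit message, while the helper names and the assumption that `promoteFrom` carries the source environment slug are illustrative only.

```typescript
interface PromotionParams {
  promoteFrom: string; // assumed: slug of the environment the rule is copied from
  ruleId: string;
  targetEnv: string;
}

// Builds the link the promotion dropdown would navigate to.
function buildPromotionUrl(params: PromotionParams): string {
  const query = new URLSearchParams({
    promoteFrom: params.promoteFrom,
    ruleId: params.ruleId,
    targetEnv: params.targetEnv,
  });
  return `/alerts/rules/new?${query.toString()}`;
}

// Reads the parameters back on the wizard side; null means "not a promotion".
function parsePromotionParams(search: string): PromotionParams | null {
  const query = new URLSearchParams(search);
  const promoteFrom = query.get("promoteFrom");
  const ruleId = query.get("ruleId");
  const targetEnv = query.get("targetEnv");
  if (promoteFrom === null || ruleId === null || targetEnv === null) return null;
  return { promoteFrom, ruleId, targetEnv };
}
```

Because everything travels in the URL, the flow needs no new REST endpoint and a promotion link can be reloaded or shared.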

feat(ui/alerts): rule editor wizard shell + form-state module 334e815c25

Wizard navigates 5 steps (scope/condition/trigger/notify/review) with
per-step validation. form-state module is the single source of truth for
the rule form; initialForm/toRequest/validateStep are unit-tested (6
tests). Step components are stubbed and will be implemented in Tasks
20-24. prefillFromPromotion is a thin wrapper in this commit; Task 24
rewrites it to compute scope-adjustment warnings.

Deviation notes:
 - FormState.targets uses {kind, targetId} to match AlertRuleTarget DTO
   field names (plan draft had targetKind).
 - toRequest casts through Record<string, unknown> so the spread over
   the Partial<AlertCondition> union typechecks.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

feat(ui/alerts): ScopeStep (name, severity, env/app/route/agent selectors) f48fc750f2

Name, description, severity, scope-kind radio, and cascading app/route/
agent selectors driven by catalog + agents data. Adjusts condition
routing by clearing routeId/agentId when the app changes.

Deviation: DS Select uses native event-based onChange; plan draft had
a value-based signature.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

feat(ui/alerts): ConditionStep with 6 kind-specific forms ef8c60c2b5

Each condition kind (ROUTE_METRIC, EXCHANGE_MATCH, AGENT_STATE,
DEPLOYMENT_STATE, LOG_PATTERN, JVM_METRIC) renders its own payload-shape
form. Changing the kind resets the condition payload to {kind, scope} so
stale fields from a previous kind don't leak into the save request.
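
The reset-on-kind-change rule could look roughly like this. The `ConditionDraft` and `ConditionScope` shapes and the `changeConditionKind` name are assumptions for illustration; only the six kind names and the `{kind, scope}` reset come from the commit message.

```typescript
type ConditionKind =
  | "ROUTE_METRIC"
  | "EXCHANGE_MATCH"
  | "AGENT_STATE"
  | "DEPLOYMENT_STATE"
  | "LOG_PATTERN"
  | "JVM_METRIC";

// Assumed scope shape; the real DTO may differ.
interface ConditionScope {
  appSlug?: string;
  routeId?: string;
  agentId?: string;
}

interface ConditionDraft {
  kind: ConditionKind;
  scope: ConditionScope;
  [field: string]: unknown; // kind-specific payload fields
}

function changeConditionKind(current: ConditionDraft, nextKind: ConditionKind): ConditionDraft {
  // Re-selecting the same kind keeps whatever the user already entered.
  if (current.kind === nextKind) return current;
  // A different kind starts from {kind, scope}; old payload fields are dropped.
  return { kind: nextKind, scope: current.scope };
}
```

Dropping the old payload at the moment of the switch means the save request never has to be cleaned up later.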

Deviation: DS Select uses native event-based onChange. Plan draft showed
a value-based signature (onChange(v) => ...).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

feat(ui/alerts): TriggerStep (evaluation interval, for-duration, re-notify, test-evaluate) d42a6ca6a8

Three numeric inputs for evaluation cadence, for-duration, and
re-notification window, plus a Test evaluate button for saved rules.
TestEvaluateRequest is empty on the wire (server uses the rule id), so
we send {} and rely on the backend to evaluate the current saved state.

Deviation: plan draft passed {condition: toRequest(form).condition} into
the request body. The generated TestEvaluateRequest type is
Record<string, never>, so we send an empty body.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

feat(ui/alerts): NotifyStep (MustacheEditor for title/message/body, targets, webhook bindings) 816096f4d1

Title and message use MustacheEditor with kind-specific autocomplete.
Preview button posts to the render-preview endpoint and shows rendered
title/message inline. Targets combine users/groups/roles into a unified
Badge pill list. Webhook picker filters to outbound connections allowed
in the current env (spec §6, allowed_environment_ids). Header overrides
use plain Input rather than MustacheEditor for now.

Deviations:
 - RenderPreviewRequest is Record<string, never>, so we send {} instead
   of {titleTemplate, messageTemplate}; backend resolves from rule state.
 - RenderPreviewResponse has {title, message} (plan draft used
   renderedTitle/renderedMessage).
 - Button size="sm" not "small" (DS only accepts sm|md).
 - Target kind field renamed from targetKind to kind to match
   AlertRuleTarget DTO.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

feat(ui/alerts): ReviewStep + promotion prefill warnings 3963ea5591

Review step dumps a human summary plus raw request JSON, and (when a
setter is supplied) offers an Enabled-on-save Toggle. Promotion prefill
now returns {form, warnings}: clears agent IDs (per-env), flags missing
apps in target env, and flags webhook connections not allowed in target
env. 4 Vitest cases cover copy-name, agent clear, app-missing, and
webhook-not-allowed paths.
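
A minimal sketch of the `{form, warnings}` contract follows. The `RuleForm` and `TargetEnvContext` shapes, the warning texts, and the " (copy)" name suffix are all assumptions made for illustration; the source only states which three situations are handled.

```typescript
interface RuleForm {
  name: string;
  appSlug?: string;
  agentId?: string;
  webhookConnectionIds: string[];
}

interface PrefillWarning {
  field: string;
  message: string;
}

// What the wizard knows about the target environment (assumed shape).
interface TargetEnvContext {
  appSlugs: string[];
  allowedConnectionIds: string[];
}

function prefillFromPromotion(
  source: RuleForm,
  target: TargetEnvContext,
): { form: RuleForm; warnings: PrefillWarning[] } {
  const warnings: PrefillWarning[] = [];
  // Copy so the source rule is never mutated. The name suffix is an assumption.
  const form: RuleForm = {
    ...source,
    name: `${source.name} (copy)`,
    webhookConnectionIds: [...source.webhookConnectionIds],
  };
  if (form.agentId !== undefined) {
    // Agent IDs are per-environment, so they cannot carry over.
    delete form.agentId;
    warnings.push({ field: "agentId", message: "Agent selection cleared; pick an agent in the target env." });
  }
  if (form.appSlug !== undefined && !target.appSlugs.includes(form.appSlug)) {
    warnings.push({ field: "appSlug", message: `App ${form.appSlug} does not exist in the target env.` });
  }
  for (const id of form.webhookConnectionIds) {
    if (!target.allowedConnectionIds.includes(id)) {
      warnings.push({ field: "webhookConnectionIds", message: `Connection ${id} is not allowed in the target env.` });
    }
  }
  return { form, warnings };
}
```

Returning warnings alongside the form, rather than throwing, lets the wizard open with the prefilled values and leave the fixes to the user.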

The wizard now consumes {form, warnings}; Task 25 renders the warnings
banner.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

feat(ui/alerts): render promotion warnings in wizard banner 0191ca4b13

Fetches target-env apps (useCatalog) and env-allowed outbound
connections, passes them to prefillFromPromotion, and renders the
returned warnings in an amber banner above the step nav. Warnings list
the field name and the remediation message so users see crossings that
need manual adjustment before saving.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

feat(ui/alerts): SilencesPage with matcher-based create + end-early action 8689643e11

Matcher accepts ruleId and/or appSlug. Server enforces endsAt > startsAt
(V12 CHECK constraint) and matcher_matches() at dispatch time (spec §7).

feat(ui/alerts): CMD-K sources for alerts + alert rules f4c2cb120b

Extends operationalSearchData with open alerts (FIRING|ACKNOWLEDGED) and
all rules. Badges convey severity + state. Selecting an alert navigates to
/alerts/inbox/{id}; a rule navigates to /alerts/rules/{id}. Uses the
existing CommandPalette extension point — no new registry.
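
The mapping from alerts to palette entries might be sketched as below. `AlertSummary`, `SearchItem`, and `alertSearchItems` are hypothetical names, and the real search-data item shape is not shown in this PR; the state filter and the navigation path are taken from the description above.

```typescript
interface AlertSummary {
  id: string;
  title: string;
  severity: "CRITICAL" | "WARNING" | "INFO";
  state: "FIRING" | "ACKNOWLEDGED" | "RESOLVED";
}

// Assumed result shape; the actual extension point may expect other fields.
interface SearchItem {
  category: "alert";
  title: string;
  badges: string[];
  path: string;
}

function alertSearchItems(alerts: AlertSummary[]): SearchItem[] {
  return alerts
    // Only open alerts are offered in the palette.
    .filter((a) => a.state === "FIRING" || a.state === "ACKNOWLEDGED")
    .map((a): SearchItem => ({
      category: "alert",
      title: a.title,
      badges: [a.severity, a.state],
      path: `/alerts/inbox/${a.id}`,
    }));
}
```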

feat(alerting): SSRF guard on outbound connection URL 5ebc729b82

Rejects webhook URLs that resolve to loopback, link-local, or RFC-1918
private ranges (IPv4 + IPv6 ULA fc00::/7). Enforced on both create and
update in OutboundConnectionServiceImpl before persistence; returns 400
Bad Request with "private or loopback" in the body.
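
The address classification behind such a guard can be sketched as follows. The real SsrfGuard is Java and checks the addresses a hostname resolves to; this TypeScript sketch only classifies IP literals, omits DNS resolution, and uses hypothetical function names. Treating IPv6 link-local as fe80::/10 is an assumption, since the commit names only the ULA range explicitly.

```typescript
function isPrivateIPv4(ip: string): boolean {
  const octets = ip.split(".");
  if (octets.length !== 4 || !octets.every((o) => /^\d{1,3}$/.test(o) && Number(o) <= 255)) {
    throw new Error(`not an IPv4 literal: ${ip}`);
  }
  const [a, b] = octets.map(Number);
  return (
    a === 127 ||                          // loopback 127.0.0.0/8
    (a === 169 && b === 254) ||           // link-local 169.254.0.0/16
    a === 10 ||                           // RFC 1918 10.0.0.0/8
    (a === 172 && b >= 16 && b <= 31) ||  // RFC 1918 172.16.0.0/12
    (a === 192 && b === 168)              // RFC 1918 192.168.0.0/16
  );
}

function isPrivateIPv6(ip: string): boolean {
  const lower = ip.toLowerCase();
  if (lower === "::1") return true; // loopback (compressed form only in this sketch)
  // First 16-bit group decides the prefix; an empty group (leading "::") is 0.
  const first = parseInt(lower.split(":")[0] || "0", 16);
  return (
    (first & 0xfe00) === 0xfc00 || // unique local addresses fc00::/7
    (first & 0xffc0) === 0xfe80    // link-local fe80::/10 (assumed to be covered)
  );
}

// Dispatches on the literal's syntax: a colon means IPv6.
function isPrivateTarget(ip: string): boolean {
  return ip.includes(":") ? isPrivateIPv6(ip) : isPrivateIPv4(ip);
}
```

A guard of this kind has to run on the resolved addresses rather than the URL text, because a public-looking hostname can resolve to a private address.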

Bypass via `cameleer.server.outbound-http.allow-private-targets=true`
for dev environments where webhooks legitimately point at local
services. Production default is `false`.

Test profile sets the flag to `true` in application-test.yml so the
existing ITs that post webhooks to WireMock on https://localhost:PORT
keep working. A dedicated OutboundConnectionSsrfIT overrides the flag
back to false (via @TestPropertySource + @DirtiesContext) to exercise
the reject path end-to-end through the admin controller.

Plan 01 scope; required before SaaS exposure (spec §17).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

perf(alerting): 30s TTL cache on AlertingMetrics gauge suppliers 9f109b20fd

Prometheus scrapes can fire every few seconds. The open-alerts / open-rules
gauges query Postgres on each read — caching the values for 30s amortises
that to one query per half-minute. Addresses final-review NIT from Plan 02.

- Introduces a package-private TtlCache that wraps a Supplier<Long> and
  memoises the last read for a configurable Duration against a Supplier<Instant>
  clock.
- Wraps each gauge supplier (alerting_rules_total{enabled|disabled},
  alerting_instances_total{state}) in its own TtlCache.
- Adds a test-friendly constructor (package-private) taking explicit
  Duration + Supplier<Instant> so AlertingMetricsCachingTest can advance
  a fake clock without waiting wall-clock time.
- Adds AlertingMetricsCachingTest covering:
  * supplier invoked once per TTL across repeated scrapes
  * 29 s elapsed → still cached; 31 s elapsed → re-queried
  * gauge value reflects the cached result even after delegate mutates
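
The caching idea translates directly into a few lines. This is a TypeScript transliteration for illustration only; the actual TtlCache is a package-private Java class built on Supplier and Duration, and the names and millisecond clock here are assumptions.

```typescript
class TtlCache<T> {
  private readonly delegate: () => T;
  private readonly ttlMs: number;
  private readonly now: () => number;
  private value: T | undefined;
  private loadedAt: number | undefined;

  // `now` is injectable so tests can advance a fake clock.
  constructor(delegate: () => T, ttlMs: number, now: () => number = () => Date.now()) {
    this.delegate = delegate;
    this.ttlMs = ttlMs;
    this.now = now;
  }

  get(): T {
    const t = this.now();
    // Reload on first read, or once the TTL has fully elapsed.
    if (this.loadedAt === undefined || t - this.loadedAt >= this.ttlMs) {
      this.value = this.delegate();
      this.loadedAt = t;
    }
    return this.value as T;
  }
}

// Usage with a fake clock: two reads inside the TTL hit the delegate once.
let fakeNow = 0;
let queries = 0;
const openAlerts = new TtlCache(() => { queries += 1; return 42; }, 30_000, () => fakeNow);
openAlerts.get();
fakeNow = 29_000;
openAlerts.get(); // still served from the cache
fakeNow = 31_000;
openAlerts.get(); // TTL elapsed, delegate is queried again
```

Injecting the clock is what lets a test cover the 29 s and 31 s boundaries without sleeping.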

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

docs(alerting): UI map + admin-guide walkthrough for Plan 03 f75ee9f352

.claude/rules/ui.md now maps every Plan 03 UI surface. Admin guide gains
an inbox/rules/silences walkthrough so ops teams can start in the UI
without reading the spec.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

chore(docker): full-stack docker-compose mirroring deploy/ k8s manifests 1ed2d3a611

Mirrors the k8s manifests in deploy/ as a local dev stack:
  - cameleer-postgres   (matches deploy/cameleer-postgres.yaml)
  - cameleer-clickhouse (matches deploy/cameleer-clickhouse.yaml, default CLICKHOUSE_DB=cameleer)
  - cameleer-server     (built from Dockerfile, env mirrors deploy/base/server.yaml)
  - cameleer-ui         (built from ui/Dockerfile, served on host :8080 to leave :5173 free for Vite dev)

Dockerfile + ui/Dockerfile: REGISTRY_TOKEN is now optional (empty → skip Maven/npm auth).
cameleer-common package is public, so anonymous pulls succeed; private packages still require the token.

Backend defaults tuned for local E2E:
  - RUNTIME_ENABLED=false (no Docker-in-Docker deployments in dev stack)
  - OUTBOUND_HTTP_ALLOW_PRIVATE_TARGETS=true (so webhook tests can target host.docker.internal etc.)
  - UIUSER/UIPASSWORD=admin/admin (matches Playwright E2E_ADMIN_USER/PASS defaults)
  - CORS includes both :5173 (Vite) and :8080 (nginx)

fix(alerting): @Autowired on AlertingMetrics production constructor 5edf7eb23a

Task 29's refactor added a package-private test-friendly constructor
alongside the public production one. Without @Autowired Spring cannot pick
which constructor to use for the @Component, and falls back to searching
for a no-arg default — crashing startup with 'No default constructor found'.

Detected when launching the server via the new docker-compose stack; unit
tests still pass because they invoke the package-private test constructor
directly.

fix(ui/alerts): align RouteMetric metric enum with backend; pre-populate ROUTE_METRIC defaults bcde6678b8

- RouteMetricForm dropped P95_LATENCY_MS — not in cameleer-server-core
  RouteMetric enum (valid: ERROR_RATE, P99_LATENCY_MS, AVG_DURATION_MS,
  THROUGHPUT, ERROR_COUNT).
- initialForm now returns a ready-to-save ROUTE_METRIC condition
  (metric=ERROR_RATE, comparator=GT, threshold=0.05, windowSeconds=300),
  so clicking through the wizard with all defaults produces a valid rule.
  Prevents a 400 'missing type id property kind' + 400 on condition enum
  validation if the user leaves the condition step untouched.
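
A sketch of the aligned enum and the ready-to-save default is given here. The metric names and default values come from the list above; the type and function names, and typing `comparator` as a plain string (only GT is named in this commit), are assumptions.

```typescript
// Metric names accepted by the backend enum, per the commit message.
const ROUTE_METRICS = [
  "ERROR_RATE",
  "P99_LATENCY_MS",
  "AVG_DURATION_MS",
  "THROUGHPUT",
  "ERROR_COUNT",
] as const;

type RouteMetric = (typeof ROUTE_METRICS)[number];

interface RouteMetricCondition {
  kind: "ROUTE_METRIC";
  metric: RouteMetric;
  comparator: string; // only "GT" is named in the source
  threshold: number;
  windowSeconds: number;
}

// Type guard for values coming from untyped input such as a select element.
function isRouteMetric(value: string): value is RouteMetric {
  return (ROUTE_METRICS as readonly string[]).includes(value);
}

// Defaults that already form a complete condition, so an untouched
// condition step still produces a request with `kind` and a valid metric.
function defaultRouteMetricCondition(): RouteMetricCondition {
  return {
    kind: "ROUTE_METRIC",
    metric: "ERROR_RATE",
    comparator: "GT",
    threshold: 0.05,
    windowSeconds: 300,
  };
}
```

Deriving the union type from one constant array keeps the dropdown options and the type check from drifting apart on the UI side; drift against the backend enum still has to be caught by schema regeneration or tests.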

chore(docker): seeder service pre-creates unprefixed 'admin' user row d88bede097

Alerting + outbound controllers resolve acting user via
authentication.name with 'user:' prefix stripped → 'admin'. But
UserRepository.upsert stores env-admin as 'user:admin' (JWT sub format).
The resulting FK mismatch manifests as 500 'alert_rules_created_by_fkey'
on any create operation in a fresh docker stack.

Workaround: run-once 'cameleer-seed' compose service runs psql against
deploy/docker/postgres-init.sql after the server is healthy (i.e. after
Flyway migrations have created tenant_default.users), inserting
user_id='admin' idempotently. The root-cause fix belongs in the backend
(either stop stripping the prefix in alerting/outbound controllers, or
normalise storage to the unprefixed form) and is out of scope for
Plan 03.

test(ui/alerts): Playwright E2E smoke (sidebar, rule CRUD, CMD-K, silence CRUD) 1ebc2fa71e

CI (push): build successful in 2m10s; docker successful in 5m11s; deploy-feature successful in 40s; cleanup-branch and deploy skipped.
CI (pull_request): build successful in 2m34s; cleanup-branch, docker, deploy, and deploy-feature skipped.

fixtures.ts: auto-applied login fixture — visits /login?local to skip OIDC
auto-redirect, fills username/password via label-matcher, clicks 'Sign in',
then selects the 'default' env so alerting hooks enable (useSelectedEnv gate).
Override via E2E_ADMIN_USER + E2E_ADMIN_PASS.

alerting.spec.ts: 4 tests against the full docker-compose stack:
 - sidebar Alerts accordion → /alerts/inbox
 - 5-step wizard: defaults-only create + row delete (unique timestamp name
   avoids strict-mode collisions with leftover rules)
 - CMD-K palette via SearchTrigger click (deterministic; Ctrl+K via keyboard
   is flaky when the canvas doesn't have focus)
 - silence matcher-based create + end-early

DS FormField renders labels as generics (not htmlFor-wired), so inputs are
targeted by placeholder or label-proximity locators instead of getByLabel.

Does not exercise fire→ack→clear; that's covered backend-side by
AlertingFullLifecycleIT (Plan 02). UI E2E for that path would need event
injection into ClickHouse, out of scope for this smoke.

hsiegeln merged commit ec460faf02 into main

2026-04-20 16:27:49 +02:00

hsiegeln referenced this issue from a commit

2026-04-20 16:27:50 +02:00

Merge pull request 'feat(alerting): Plan 03 — UI + backfills (SSRF guard, metrics caching, docker stack)' (#144) from feat/alerting-03-ui into main

claude referenced this issue from a commit

2026-04-20 18:49:30 +02:00

docs(gitnexus): refresh index stats + repo name (alerting-02 → cameleer-server)

Reference: cameleer/cameleer-server#144