Files
cameleer-server/docs/superpowers/specs/2026-04-21-alerts-inbox-redesign-design.md
hsiegeln c0b8c9a1ad docs(alerts): spec — inbox redesign (single filterable inbox)
Collapse /alerts/inbox, /alerts/all, /alerts/history into a single
filterable inbox. Drop ACKNOWLEDGED from AlertState; add read_at and
deleted_at as orthogonal timestamp flags. Retire per-user alert_reads
tracking. Add Silence-rule and Delete row/bulk actions.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-21 16:45:04 +02:00

12 KiB

Alerts Inbox Redesign — Design

Status: Approved for planning Date: 2026-04-21 Author: Hendrik + Claude

Goal

Collapse the three alert list pages (/alerts/inbox, /alerts/all, /alerts/history) into a single filterable inbox. Retire per-user read tracking: ack, read, and delete become global timestamp flags on the alert instance. Add row/bulk actions for Silence rule… and Delete directly from the list.

Motivation

Today the three pages all hit the same user-scoped InAppInboxQuery.listInbox, so "All" is misleading (it's not env-wide) and "History" is just "Inbox with status=RESOLVED". The user asked for a single clean inbox with richer filters and list-level actions, and explicitly granted simplification of the read/ack/delete tracking model: one action is visible to everyone, no per-user state.

Scope

In:

  • Data model: drop ACKNOWLEDGED from AlertState, add read_at + deleted_at columns, drop alert_reads table, rework V13 open-rule index predicate.
  • Backend: AlertController gains acked/read filter params, new DELETE /alerts/{id}, new POST /alerts/bulk-delete, new POST /alerts/bulk-ack. /read + /bulk-read rewire to update alert_instances.read_at.
  • Data migration (V17): existing ACKNOWLEDGED rows → state='FIRING' (ack_time preserved).
  • UI: rebuild InboxPage filter bar, add Silence/Delete row + bulk actions. Delete AllAlertsPage.tsx + HistoryPage.tsx. Sidebar trims to Inbox · Rules · Silences.
  • Tests + rules-file updates.

Out:

  • No redirects from /alerts/all or /alerts/history — clean break per project's no-backwards-compat policy. Stale URLs 404.
  • No per-instance silence (different from rule-silence). Silence row action always silences the rule that produced the alert.
  • No "mark unread". Read is a one-way flag.
  • No per-user actor tracking for read/deleted. acked_by stays (already exists, useful in UI), but only because it's already wired.

Architecture

Data model (alert_instances)

state enum:           PENDING · FIRING · RESOLVED       (was: + ACKNOWLEDGED)
acked_at  TIMESTAMPTZ NULL                               (existing, semantics unchanged)
acked_by  TEXT NULL → users(user_id)                     (existing, retained for UI)
read_at   TIMESTAMPTZ NULL                               (NEW, global)
deleted_at TIMESTAMPTZ NULL                              (NEW, soft delete)

Orthogonality: state describes the alert's lifecycle (is the underlying condition still met?). acked_at / read_at / deleted_at describe what humans have done to the notification. A FIRING alert can be acked (= "someone's on it") while remaining FIRING until the condition clears.

V13 open-rule unique index predicate (preserved as the evaluator's dedup key) changes from:

WHERE state IN ('PENDING','FIRING','ACKNOWLEDGED')

to:

WHERE state IN ('PENDING','FIRING') AND deleted_at IS NULL

Ack no longer "closes" the open window — a rule that's still matching stays de-duped against the open instance whether acked or not. Deleting soft-deletes the row and opens a new slot so the rule can fire again fresh if the condition re-triggers.

alert_reads table: dropped entirely. No FK references elsewhere.

Postgres enum removal

Postgres doesn't support removing a value from an enum type. Migration path:

-- V17
-- 1. Coerce existing ACKNOWLEDGED rows → FIRING (ack_time already set)
UPDATE alert_instances SET state = 'FIRING' WHERE state = 'ACKNOWLEDGED';

-- 2. Swap to a new enum type without ACKNOWLEDGED
CREATE TYPE alert_state_enum_v2 AS ENUM ('PENDING','FIRING','RESOLVED');
ALTER TABLE alert_instances
  ALTER COLUMN state TYPE alert_state_enum_v2
  USING state::text::alert_state_enum_v2;
DROP TYPE alert_state_enum;
ALTER TYPE alert_state_enum_v2 RENAME TO alert_state_enum;

-- 3. New columns
ALTER TABLE alert_instances
  ADD COLUMN read_at    timestamptz NULL,
  ADD COLUMN deleted_at timestamptz NULL;
CREATE INDEX alert_instances_read_idx     ON alert_instances (environment_id, read_at)    WHERE read_at IS NULL    AND deleted_at IS NULL;
CREATE INDEX alert_instances_deleted_idx  ON alert_instances (deleted_at)                 WHERE deleted_at IS NOT NULL;

-- 4. Rework V13/V15/V16 open-rule unique index with the new predicate
DROP INDEX IF EXISTS alert_instances_open_rule_uq;
CREATE UNIQUE INDEX alert_instances_open_rule_uq
  ON alert_instances (rule_id, (COALESCE(
        context->>'_subjectFingerprint',
        context->'exchange'->>'id',
        '')))
  WHERE rule_id IS NOT NULL
    AND state IN ('PENDING','FIRING')
    AND deleted_at IS NULL;

-- 5. Drop alert_reads
DROP TABLE alert_reads;

Backend — AlertController

Method Path Body/Query RBAC Effect
GET /alerts state, severity, acked, read, limit VIEWER+ Inbox list, always deleted_at IS NULL. state no longer accepts ACKNOWLEDGED. acked / read are tri-state: omitted = no filter; =true = only acked/read; =false = only unacked/unread. UI defaults to acked=false&read=false via the "Hide acked" + "Hide read" toggles.
GET /alerts/unread-count VIEWER+ Counts read_at IS NULL AND deleted_at IS NULL + user-visibility predicate.
GET /alerts/{id} VIEWER+ Detail. Returns 404 if deleted_at IS NOT NULL.
POST /alerts/{id}/ack VIEWER+ Sets acked_at=now, acked_by=user. No state change.
POST /alerts/{id}/read VIEWER+ Sets read_at=now if null. Idempotent.
POST /alerts/bulk-read { instanceIds: [...] } VIEWER+ UPDATE alert_instances SET read_at=now() for all ids in env.
POST /alerts/bulk-ack { instanceIds: [...] } NEW VIEWER+ Parallel to bulk-read.
DELETE /alerts/{id} NEW OPERATOR+ Sets deleted_at=now. Returns 204.
POST /alerts/bulk-delete { instanceIds: [...] } NEW OPERATOR+ Bulk soft-delete in env.

Removed:

  • AlertReadRepository bean + alert_reads usage — read_at/bulk-read now update alert_instances directly.
  • ACKNOWLEDGED handling in all backend code paths.

InAppInboxQuery.countUnread rewires to a single SQL count on alert_instances with read_at IS NULL AND deleted_at IS NULL + target-visibility predicate.

Backend — evaluator + notifier

  • AlertInstanceRepository.findOpenByRule(ruleId, subjectFingerprint) already exists; its predicate now matches the new index (state IN ('PENDING','FIRING') AND deleted_at IS NULL).
  • All test fixtures that assert state=ACKNOWLEDGED → assert acked_at IS NOT NULL.
  • Notification pipeline (AlertNotifier) already fires on state transitions; no change — ack no longer being a state means one fewer state-change branch to handle.

Silence from list — no new endpoint

UI row-action calls existing POST /alerts/silences with { matcher: { ruleId: <id> }, startsAt: now, endsAt: now + duration, reason: "Silenced from inbox" }. The duration picker is a small menu: 1h / 8h / 24h / Custom…. "Custom" routes to /alerts/silences (the existing SilencesPage form) with the ruleId pre-filled via URL search param.

UI — InboxPage.tsx

Filter bar (topnavbar-style, left-to-right):

Filter Values Default
Severity (ButtonGroup multi) CRITICAL · WARNING · INFO none (= no filter)
Status (ButtonGroup multi) PENDING · FIRING · RESOLVED FIRING selected
Hide acked (Toggle) on/off on
Hide read (Toggle) on/off on

Default state: "Show me firing things nobody's touched." Matches the "what needs attention" mental model.

Row actions column (right-aligned, shown on hover or always for the touched row):

  • Acknowledge (when acked_at IS NULL)
  • Mark read (when read_at IS NULL)
  • Silence rule… (opens quick menu: 1h / 8h / 24h / Custom…)
  • Delete (trash icon, OPERATOR+ only). Soft-delete. Undo toast for 5s invalidates the mutation.

Bulk toolbar (shown when selection > 0, above table):

  • Acknowledge N (filters to unacked)
  • Mark N read (filters to unread)
  • Silence rules (silences every unique ruleId in selection — duration menu)
  • Delete N (OPERATOR+) — opens confirmation modal: "Delete N alerts? This affects all users."

Deleted/dropped files:

  • ui/src/pages/Alerts/AllAlertsPage.tsx — removed
  • ui/src/pages/Alerts/HistoryPage.tsx — removed
  • /alerts/all and /alerts/history route definitions in router.tsx — removed

Sidebar: buildAlertsTreeNodes in ui/src/components/sidebar-utils.ts trims to: Inbox · Rules · Silences. "Inbox" stays as the first child; selecting /alerts (bare) goes to inbox.

CMD-K: buildAlertSearchData still registers alert and alertRule categories, but the alert-category deep links all point to /alerts/inbox/{id} (single detail route).

Tests

Backend:

  • AlertControllerTest — new acked + read filter cases, new DELETE + bulk-delete + bulk-ack tests, 404 on soft-deleted instance.
  • PostgresAlertInstanceRepositoryTestmarkRead/bulkMarkRead/softDelete/bulkSoftDelete SQL, index predicate correctness.
  • V17 migration test: seed an ACKNOWLEDGED row pre-migration, verify post-migration state and index.
  • Every test using AlertState.ACKNOWLEDGED — removed or switched to acked_at IS NOT NULL assertion.
  • AlertNotifierTest — confirm no regression on notification emission paths.

UI:

  • InboxPage.test.tsx — filter toggles, select-all, row actions, bulk actions, optimistic delete + undo.
  • enums.test.ts snapshot — AlertState drops ACKNOWLEDGED, new filter option arrays added.
  • Silence duration menu component test.

Docs / rules updates

  • .claude/rules/app-classes.md:
    • AlertController endpoint list updated (new DELETE, bulk-delete, bulk-ack; acked/read filter params; ACKNOWLEDGED removed from allowed state).
    • Drop AlertReadRepository from security/ or repository listings.
  • .claude/rules/ui.md:
    • Alerts section: remove "All" and "History" pages, drop their routes. Rewrite Inbox description to new filter bar + actions.
    • Note: unread-count bell now global.
  • .claude/rules/core-classes.md:
    • AlertState enum values reduced to three.
    • Note alert_reads table is retired.
  • CLAUDE.md:
    • New migration entry: V17 — Alerts: drop ACKNOWLEDGED state, add read_at/deleted_at, drop alert_reads, rework open-rule index.

Risk / open concerns

  1. Enum-type swap on a populated table. ALTER COLUMN TYPE … USING cast::text::enum_v2 rewrites every row. alert_instances is expected to remain bounded (RESOLVED rows age out via retention), but on large installs this should run during a low-traffic window. Migration is idempotent.
  2. Concurrent ack/delete races. Both are simple column updates with WHERE id=? AND deleted_at IS NULL; last-write wins is acceptable per the "no individual tracking" decision.
  3. Notification context mustache variables. No change — alert.state shape is unchanged; templates referencing state=ACKNOWLEDGED are user-authored and will start producing no matches after the migration, which is intentional. Add a release note.
  4. CMD-K deep links to deleted alert ids return 404 now (they did before for missing, now also for soft-deleted). Acceptable.

Acceptance

  • Single inbox at /alerts/inbox with four filter dimensions wired end-to-end.
  • Silence-rule menu works from row + bulk.
  • Soft-delete works from row + bulk, with OPERATOR+ guard and undo toast for single.
  • Unread count bell reflects global read_at IS NULL.
  • All existing backend/UI tests green; new test coverage as listed above.
  • V17 up-migrates ACKNOWLEDGED rows cleanly; reviewer can verify with a seeded pre-migration snapshot.