Collapse /alerts/inbox, /alerts/all, /alerts/history into a single filterable inbox. Drop ACKNOWLEDGED from AlertState; add read_at and deleted_at as orthogonal timestamp flags. Retire per-user alert_reads tracking. Add Silence-rule and Delete row/bulk actions. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
203 lines
12 KiB
Markdown
203 lines
12 KiB
Markdown
# Alerts Inbox Redesign — Design
|
|
|
|
**Status:** Approved for planning
|
|
**Date:** 2026-04-21
|
|
**Author:** Hendrik + Claude
|
|
|
|
## Goal
|
|
|
|
Collapse the three alert list pages (`/alerts/inbox`, `/alerts/all`, `/alerts/history`) into a single filterable inbox. Retire per-user read tracking: `ack`, `read`, and `delete` become global timestamp flags on the alert instance. Add row/bulk actions for **Silence rule…** and **Delete** directly from the list.
|
|
|
|
## Motivation
|
|
|
|
Today the three pages all hit the same user-scoped `InAppInboxQuery.listInbox`, so "All" is misleading (it's not env-wide) and "History" is just "Inbox with status=RESOLVED". The user asked for a single clean inbox with richer filters and list-level actions, and explicitly granted simplification of the read/ack/delete tracking model: one action is visible to everyone, no per-user state.
|
|
|
|
## Scope
|
|
|
|
**In:**
|
|
- Data model: drop `ACKNOWLEDGED` from `AlertState`, add `read_at` + `deleted_at` columns, drop `alert_reads` table, rework V13 open-rule index predicate.
|
|
- Backend: `AlertController` gains `acked`/`read` filter params, new `DELETE /alerts/{id}`, new `POST /alerts/bulk-delete`, new `POST /alerts/bulk-ack`. `/read` + `/bulk-read` rewire to update `alert_instances.read_at`.
|
|
- Data migration (V17): existing `ACKNOWLEDGED` rows → `state='FIRING'` (ack_time preserved).
|
|
- UI: rebuild `InboxPage` filter bar, add Silence/Delete row + bulk actions. Delete `AllAlertsPage.tsx` + `HistoryPage.tsx`. Sidebar trims to Inbox · Rules · Silences.
|
|
- Tests + rules-file updates.
|
|
|
|
**Out:**
|
|
- No redirects from `/alerts/all` or `/alerts/history` — clean break per project's no-backwards-compat policy. Stale URLs 404.
|
|
- No per-instance silence (different from rule-silence). Silence row action always silences the rule that produced the alert.
|
|
- No "mark unread". Read is a one-way flag.
|
|
- No per-user actor tracking for `read`/`deleted`. `acked_by` stays (already exists, useful in UI), but only because it's already wired.
|
|
|
|
## Architecture
|
|
|
|
### Data model (`alert_instances`)
|
|
|
|
```
|
|
state enum: PENDING · FIRING · RESOLVED (was: + ACKNOWLEDGED)
|
|
acked_at TIMESTAMPTZ NULL (existing, semantics unchanged)
|
|
acked_by TEXT NULL → users(user_id) (existing, retained for UI)
|
|
read_at TIMESTAMPTZ NULL (NEW, global)
|
|
deleted_at TIMESTAMPTZ NULL (NEW, soft delete)
|
|
```
|
|
|
|
**Orthogonality:** `state` describes the alert's lifecycle (is the underlying condition still met?). `acked_at` / `read_at` / `deleted_at` describe what humans have done to the notification. A FIRING alert can be acked (= "someone's on it") while remaining FIRING until the condition clears.
|
|
|
|
**V13 open-rule unique index predicate** (preserved as the evaluator's dedup key) changes from:
|
|
```sql
|
|
WHERE state IN ('PENDING','FIRING','ACKNOWLEDGED')
|
|
```
|
|
to:
|
|
```sql
|
|
WHERE state IN ('PENDING','FIRING') AND deleted_at IS NULL
|
|
```
|
|
Ack no longer "closes" the open window — a rule that's still matching stays de-duped against the open instance whether acked or not. Deleting soft-deletes the row and opens a new slot so the rule can fire again fresh if the condition re-triggers.
|
|
|
|
**`alert_reads` table:** dropped entirely. No FK references elsewhere.
|
|
|
|
### Postgres enum removal
|
|
|
|
Postgres doesn't support removing a value from an enum type. Migration path:
|
|
|
|
```sql
|
|
-- V17
|
|
-- 1. Coerce existing ACKNOWLEDGED rows → FIRING (ack_time already set)
|
|
UPDATE alert_instances SET state = 'FIRING' WHERE state = 'ACKNOWLEDGED';
|
|
|
|
-- 2. Swap to a new enum type without ACKNOWLEDGED
|
|
CREATE TYPE alert_state_enum_v2 AS ENUM ('PENDING','FIRING','RESOLVED');
|
|
ALTER TABLE alert_instances
|
|
ALTER COLUMN state TYPE alert_state_enum_v2
|
|
USING state::text::alert_state_enum_v2;
|
|
DROP TYPE alert_state_enum;
|
|
ALTER TYPE alert_state_enum_v2 RENAME TO alert_state_enum;
|
|
|
|
-- 3. New columns
|
|
ALTER TABLE alert_instances
|
|
ADD COLUMN read_at timestamptz NULL,
|
|
ADD COLUMN deleted_at timestamptz NULL;
|
|
CREATE INDEX alert_instances_read_idx ON alert_instances (environment_id, read_at) WHERE read_at IS NULL AND deleted_at IS NULL;
|
|
CREATE INDEX alert_instances_deleted_idx ON alert_instances (deleted_at) WHERE deleted_at IS NOT NULL;
|
|
|
|
-- 4. Rework V13/V15/V16 open-rule unique index with the new predicate
|
|
DROP INDEX IF EXISTS alert_instances_open_rule_uq;
|
|
CREATE UNIQUE INDEX alert_instances_open_rule_uq
|
|
ON alert_instances (rule_id, (COALESCE(
|
|
context->>'_subjectFingerprint',
|
|
context->'exchange'->>'id',
|
|
'')))
|
|
WHERE rule_id IS NOT NULL
|
|
AND state IN ('PENDING','FIRING')
|
|
AND deleted_at IS NULL;
|
|
|
|
-- 5. Drop alert_reads
|
|
DROP TABLE alert_reads;
|
|
```
|
|
|
|
### Backend — `AlertController`
|
|
|
|
| Method | Path | Body/Query | RBAC | Effect |
|
|
|---|---|---|---|---|
|
|
| GET | `/alerts` | `state, severity, acked, read, limit` | VIEWER+ | Inbox list, always `deleted_at IS NULL`. `state` no longer accepts `ACKNOWLEDGED`. `acked` / `read` are tri-state: **omitted** = no filter; `=true` = only acked/read; `=false` = only unacked/unread. UI defaults to `acked=false&read=false` via the "Hide acked" + "Hide read" toggles. |
|
|
| GET | `/alerts/unread-count` | — | VIEWER+ | Counts `read_at IS NULL AND deleted_at IS NULL` + user-visibility predicate. |
|
|
| GET | `/alerts/{id}` | — | VIEWER+ | Detail. Returns 404 if `deleted_at IS NOT NULL`. |
|
|
| POST | `/alerts/{id}/ack` | — | VIEWER+ | Sets `acked_at=now, acked_by=user`. No state change. |
|
|
| POST | `/alerts/{id}/read` | — | VIEWER+ | Sets `read_at=now` if null. Idempotent. |
|
|
| POST | `/alerts/bulk-read` | `{ instanceIds: [...] }` | VIEWER+ | UPDATE `alert_instances SET read_at=now()` for all ids in env. |
|
|
| POST | `/alerts/bulk-ack` | `{ instanceIds: [...] }` | **NEW** VIEWER+ | Parallel to bulk-read. |
|
|
| DELETE | `/alerts/{id}` | — | **NEW** OPERATOR+ | Sets `deleted_at=now`. Returns 204. |
|
|
| POST | `/alerts/bulk-delete` | `{ instanceIds: [...] }` | **NEW** OPERATOR+ | Bulk soft-delete in env. |
|
|
|
|
Removed:
|
|
- `AlertReadRepository` bean + `alert_reads` usage — `read_at`/`bulk-read` now update `alert_instances` directly.
|
|
- `ACKNOWLEDGED` handling in all backend code paths.
|
|
|
|
`InAppInboxQuery.countUnread` rewires to a single SQL count on `alert_instances` with `read_at IS NULL AND deleted_at IS NULL` + target-visibility predicate.
|
|
|
|
### Backend — evaluator + notifier
|
|
|
|
- `AlertInstanceRepository.findOpenByRule(ruleId, subjectFingerprint)` already exists; its predicate now matches the new index (`state IN ('PENDING','FIRING') AND deleted_at IS NULL`).
|
|
- All test fixtures that assert `state=ACKNOWLEDGED` → assert `acked_at IS NOT NULL`.
|
|
- Notification pipeline (`AlertNotifier`) already fires on state transitions; no change — ack no longer being a state means one fewer state-change branch to handle.
|
|
|
|
### Silence from list — no new endpoint
|
|
|
|
UI row-action calls existing `POST /alerts/silences` with `{ matcher: { ruleId: <id> }, startsAt: now, endsAt: now + duration, reason: "Silenced from inbox" }`. The duration picker is a small menu: `1h / 8h / 24h / Custom…`. "Custom" routes to `/alerts/silences` (the existing SilencesPage form) with the `ruleId` pre-filled via URL search param.
|
|
|
|
### UI — `InboxPage.tsx`
|
|
|
|
**Filter bar (topnavbar-style, left-to-right):**
|
|
|
|
| Filter | Values | Default |
|
|
|---|---|---|
|
|
| Severity (ButtonGroup multi) | CRITICAL · WARNING · INFO | none (= no filter) |
|
|
| Status (ButtonGroup multi) | PENDING · FIRING · RESOLVED | FIRING selected |
|
|
| Hide acked (Toggle) | on/off | **on** |
|
|
| Hide read (Toggle) | on/off | **on** |
|
|
|
|
Default state: "Show me firing things nobody's touched." Matches the "what needs attention" mental model.
|
|
|
|
**Row actions column** (right-aligned, shown on hover or always for the touched row):
|
|
- `Acknowledge` (when `acked_at IS NULL`)
|
|
- `Mark read` (when `read_at IS NULL`)
|
|
- `Silence rule…` (opens quick menu: `1h / 8h / 24h / Custom…`)
|
|
- `Delete` (trash icon, OPERATOR+ only). Soft-delete. Undo toast for 5s invalidates the mutation.
|
|
|
|
**Bulk toolbar** (shown when selection > 0, above table):
|
|
- `Acknowledge N` (filters to unacked)
|
|
- `Mark N read` (filters to unread)
|
|
- `Silence rules` (silences every unique ruleId in selection — duration menu)
|
|
- `Delete N` (OPERATOR+) — opens confirmation modal: "Delete N alerts? This affects all users."
|
|
|
|
**Deleted/dropped files:**
|
|
- `ui/src/pages/Alerts/AllAlertsPage.tsx` — removed
|
|
- `ui/src/pages/Alerts/HistoryPage.tsx` — removed
|
|
- `/alerts/all` and `/alerts/history` route definitions in `router.tsx` — removed
|
|
|
|
**Sidebar:**
|
|
`buildAlertsTreeNodes` in `ui/src/components/sidebar-utils.ts` trims to: Inbox · Rules · Silences. "Inbox" stays as the first child; selecting `/alerts` (bare) goes to inbox.
|
|
|
|
**CMD-K:** `buildAlertSearchData` still registers `alert` and `alertRule` categories, but the alert-category deep links all point to `/alerts/inbox/{id}` (single detail route).
|
|
|
|
### Tests
|
|
|
|
Backend:
|
|
- `AlertControllerTest` — new `acked` + `read` filter cases, new DELETE + bulk-delete + bulk-ack tests, 404 on soft-deleted instance.
|
|
- `PostgresAlertInstanceRepositoryTest` — `markRead`/`bulkMarkRead`/`softDelete`/`bulkSoftDelete` SQL, index predicate correctness.
|
|
- V17 migration test: seed an `ACKNOWLEDGED` row pre-migration, verify post-migration state and index.
|
|
- Every test using `AlertState.ACKNOWLEDGED` — removed or switched to `acked_at IS NOT NULL` assertion.
|
|
- `AlertNotifierTest` — confirm no regression on notification emission paths.
|
|
|
|
UI:
|
|
- `InboxPage.test.tsx` — filter toggles, select-all, row actions, bulk actions, optimistic delete + undo.
|
|
- `enums.test.ts` snapshot — `AlertState` drops ACKNOWLEDGED, new filter option arrays added.
|
|
- Silence duration menu component test.
|
|
|
|
### Docs / rules updates
|
|
|
|
- `.claude/rules/app-classes.md`:
|
|
- `AlertController` endpoint list updated (new DELETE, bulk-delete, bulk-ack; `acked`/`read` filter params; `ACKNOWLEDGED` removed from allowed state).
|
|
- Drop `AlertReadRepository` from `security/` or repository listings.
|
|
- `.claude/rules/ui.md`:
|
|
- Alerts section: remove "All" and "History" pages, drop their routes. Rewrite Inbox description to new filter bar + actions.
|
|
- Note: unread-count bell now global.
|
|
- `.claude/rules/core-classes.md`:
|
|
- `AlertState` enum values reduced to three.
|
|
- Note `alert_reads` table is retired.
|
|
- `CLAUDE.md`:
|
|
- New migration entry: `V17 — Alerts: drop ACKNOWLEDGED state, add read_at/deleted_at, drop alert_reads, rework open-rule index.`
|
|
|
|
## Risk / open concerns
|
|
|
|
1. **Enum-type swap on a populated table.** `ALTER COLUMN TYPE … USING cast::text::enum_v2` rewrites every row. `alert_instances` is expected to remain bounded (RESOLVED rows age out via retention), but on large installs this should run during a low-traffic window. Migration is idempotent.
|
|
2. **Concurrent ack/delete races.** Both are simple column updates with `WHERE id=? AND deleted_at IS NULL`; last-write wins is acceptable per the "no individual tracking" decision.
|
|
3. **Notification context mustache variables.** No change — `alert.state` shape is unchanged; templates referencing `state=ACKNOWLEDGED` are user-authored and will start producing no matches after the migration, which is intentional. Add a release note.
|
|
4. **CMD-K deep links** to deleted alert ids return 404 now (they did before for missing, now also for soft-deleted). Acceptable.
|
|
|
|
## Acceptance
|
|
|
|
- Single inbox at `/alerts/inbox` with four filter dimensions wired end-to-end.
|
|
- Silence-rule menu works from row + bulk.
|
|
- Soft-delete works from row + bulk, with OPERATOR+ guard and undo toast for single.
|
|
- Unread count bell reflects global `read_at IS NULL`.
|
|
- All existing backend/UI tests green; new test coverage as listed above.
|
|
- V17 up-migrates ACKNOWLEDGED rows cleanly; reviewer can verify with a seeded pre-migration snapshot.
|