Alerting: support custom agent event types in AGENT_LIFECYCLE condition #145

claude · 2026-04-21T14:13:36+02:00

Context

The AGENT_LIFECYCLE alert condition (introduced in the agent-lifecycle-conditions feature) ships with a strict allowlist of event types:

REGISTERED
RE_REGISTERED
DEREGISTERED
WENT_STALE
WENT_DEAD
RECOVERED

Agents can also post arbitrary event types via POST /api/v1/data/events (handled by EventIngestionController), which end up in the same agent_events ClickHouse table. Today those custom event types cannot be selected in an alert rule — the allowlist rejects anything outside the six entries above.

Why we shipped strict first

A rule with a mistyped event type would silently never fire. A freeform text field accepting eventType = "REGISTER" (missing ED) would look valid in the UI but match zero rows forever.
The six registry-lifecycle events are stable, well-known, and emitted by the server itself, so we can guarantee the allowlist is correct.
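
The typo risk described above can be illustrated with a minimal sketch. The names `StrictEventTypeParser`, `LifecycleEventType`, and `parseStrict` are hypothetical and not taken from the server codebase; the sketch only assumes that the six built-in types are modelled as a closed set, so a strict parse can reject `REGISTER` at save time instead of letting the rule match zero rows.

```java
import java.util.Optional;

// Hypothetical illustration; the real server types may be named and shaped differently.
final class StrictEventTypeParser {

    // The six built-in lifecycle event types listed in this issue.
    enum LifecycleEventType {
        REGISTERED, RE_REGISTERED, DEREGISTERED, WENT_STALE, WENT_DEAD, RECOVERED
    }

    private StrictEventTypeParser() {}

    // Returns the matching constant, or empty when the input is not in the allowlist.
    static Optional<LifecycleEventType> parseStrict(String raw) {
        if (raw == null) {
            return Optional.empty();
        }
        try {
            return Optional.of(LifecycleEventType.valueOf(raw));
        } catch (IllegalArgumentException e) {
            // valueOf throws for any name that is not an exact enum constant.
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        System.out.println(parseStrict("REGISTERED").isPresent()); // prints true
        System.out.println(parseStrict("REGISTER").isPresent());   // prints false
    }
}
```

With a freeform string field there is no equivalent failure point: the mistyped value would be stored as-is and simply never equal any ingested event type.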

What's needed

A way to register/declare additional event types (per-tenant? per-env? global?) so the rule editor can offer them as first-class options.
UI in the wizard: either a "custom event type" chip input gated on an admin-approved list, or a dropdown populated from a discovery query (e.g. SELECT DISTINCT event_type FROM agent_events WHERE tenant_id = ? AND timestamp > now() - INTERVAL 30 DAY).
Validation: still reject unknown types at save time, but "unknown" now means "not in allowlist ∪ registered custom set".
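
The validation rule above ("not in allowlist ∪ registered custom set") could be sketched roughly as follows. This is an assumption-laden illustration, not the project's implementation: `EventTypeValidator`, `unknownTypes`, and the custom type `ORDER_BACKLOG_HIGH` are hypothetical, and how the registered custom set is scoped or loaded (per-tenant, per-env, global) is deliberately left open, as it is in this issue.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch of save-time validation against built-in plus registered custom types.
final class EventTypeValidator {

    // The six built-in lifecycle event types listed in this issue.
    static final Set<String> BUILT_IN = Set.of(
            "REGISTERED", "RE_REGISTERED", "DEREGISTERED",
            "WENT_STALE", "WENT_DEAD", "RECOVERED");

    private EventTypeValidator() {}

    // Returns the requested types found in neither the built-in allowlist
    // nor the registered custom set, in the order they were requested.
    static Set<String> unknownTypes(List<String> requested, Set<String> registeredCustom) {
        Set<String> unknown = new LinkedHashSet<>();
        for (String type : requested) {
            if (!BUILT_IN.contains(type) && !registeredCustom.contains(type)) {
                unknown.add(type);
            }
        }
        return unknown;
    }

    public static void main(String[] args) {
        // Assumed to come from whatever registration mechanism is chosen.
        Set<String> custom = Set.of("ORDER_BACKLOG_HIGH");
        List<String> requested = List.of("WENT_DEAD", "ORDER_BACKLOG_HIGH", "REGISTER");
        System.out.println(unknownTypes(requested, custom)); // prints [REGISTER]
    }
}
```

A rule would be rejected at save time whenever the returned set is non-empty. With an empty custom set the check reduces to the current allowlist-only behaviour, which is one way existing rules could keep working unchanged.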

Acceptance criteria

Users can opt into matching a custom event type in AgentLifecycleCondition.eventTypes without losing the typo safety net.
Existing rules (allowlist-only) keep working unchanged.
.claude/rules/ updated to document the mechanism.

Priority

Backlog — the six lifecycle events cover the core "agent outage / restart" use cases. Custom events are a "nice to have" for agents emitting domain-specific signals.

hsiegeln referenced this issue from a commit

2026-04-21 19:53:13 +02:00

feat(alerting): AGENT_LIFECYCLE condition kind with per-subject fire mode

Reference: cameleer/cameleer-server#145