Agent-scoped lookups miss diagrams from routes whose publishing agents
have been redeployed or removed. The new method resolves by
(applicationId, environment, routeId) + created_at DESC, independent of
the agent registry. An in-memory cache mirrors the existing hashCache
pattern, warm-loaded at startup via argMax.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Previously `TraefikLabelBuilder` hardcoded `tls.certresolver=default` on
every router. That assumes a resolver literally named `default` exists
in the Traefik static config — true for ACME-backed installs, false for
dev/local installs that use a file-based TLS store. Traefik logs
"Router uses a nonexistent certificate resolver" for the bogus resolver
on every managed app, and any future attempt to define a differently-
named real resolver would silently skip these routers.
Server-wide setting via `CAMELEER_SERVER_RUNTIME_CERTRESOLVER` (empty by
default) flows through `ConfigMerger.GlobalRuntimeDefaults.certResolver`
into `ResolvedContainerConfig.certResolver`. When blank the
`tls.certresolver` label is omitted entirely; `tls=true` is still
emitted so Traefik serves the default TLS-store cert. When set, the
label is emitted with the configured resolver name.
Not per-app/per-env configurable: there is one Traefik per server
instance and one resolver config; app-level override would only let
users break their own routers.
TDD: TraefikLabelBuilderTest gains 3 cases (resolver set, null, blank).
Full unit suite 211/0/0.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds a boolean `externalRouting` flag (default `true`) on
ResolvedContainerConfig. When `false`, TraefikLabelBuilder emits only
the identity labels (`managed-by`, `cameleer.*`) and skips every
`traefik.*` label, so the container is not published by Traefik.
Sibling containers on `cameleer-traefik` / `cameleer-env-{tenant}-{env}`
can still reach it via Docker DNS on whatever port the app listens on.
TDD: new TraefikLabelBuilderTest covers enabled (default labels present),
disabled (zero traefik.* labels), and disabled (identity labels retained)
cases. Full module unit suite: 208/0/0.
Plumbed through ConfigMerger read, DeploymentExecutor snapshot, UI form
state, Resources tab toggle, POST payload, and snapshot-to-form mapping.
Rule files updated.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds List<String> instanceIds to LogSearchRequest (null-normalized to
List.of() in compact ctor) and generates an IN clause in both
ClickHouseLogStore.search() and countLogs(), mirroring the existing
sources pattern. LogQueryController parses ?instanceIds= as a
comma-split list. All existing LogSearchRequest call sites updated.
New ClickHouseLogStoreInstanceIdsIT covers: multi-value filter, empty
filter (all rows), null filter (all rows), single-value filter, and
coexistence with the singular instanceId field.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Appends String createdBy to the Deployment record (after createdAt), updates
both with-er methods to pass it through, threads the parameter through
DeploymentRepository.create, DeploymentService.createDeployment/promote, and
PostgresDeploymentRepository (INSERT + SELECT_COLS + mapRow). DeploymentController
passes null as placeholder (Task 4 will resolve from SecurityContextHolder).
Covers with PostgresDeploymentRepositoryCreatedByIT verifying round-trip via
both createDeployment and promote.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Backend: rename deleteTerminalByAppAndEnvironment → deleteFailedByAppAndEnvironment.
STOPPED rows were being wiped on every redeploy, so Checkpoints was always empty.
Now only FAILED rows are pruned; STOPPED deployments are retained as restorable
checkpoints (they still carry deployed_config_snapshot from their RUNNING window).
- UI filter: any deployment with a snapshot is a checkpoint (was RUNNING|DEGRADED only,
which excluded the main case — the previous blue/green deployment now in STOPPED).
- UI placement: Checkpoints disclosure now renders inside IdentitySection, matching
the design spec.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Typed enum (BLUE_GREEN, ROLLING) with fromWire/toWire kebab-case
translation. fromWire falls back to BLUE_GREEN for unknown or null
input so the executor dispatch site never null-checks and no
misconfigured container-config can throw at runtime.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Container names are deterministic: {tenant}-{envSlug}-{appSlug}-{replica}.
The prior code did the stop-existing step at SWAP_TRAFFIC, *after*
START_REPLICAS had already tried to create containers with the same
names — so a redeploy against a RUNNING app consistently failed with
Docker 409 "container name already in use".
Move the stop-existing block to run right after CREATE_NETWORK and
before START_REPLICAS. SWAP_TRAFFIC becomes a label-only marker (traffic
is swapped implicitly by Traefik labels once new replicas are healthy).
Also: add `findActiveByAppIdAndEnvironmentIdExcluding` so the SQL
excludes the current deployment by id — previously the Java-side
`!id.equals(me)` guard failed because the newly-inserted row has
status=STARTING (DB default) and ORDER BY created_at DESC LIMIT 1
picked the new row, hiding the actual previous deployment.
Trade-off: this is destroy-then-start rather than true blue/green —
brief downtime during the swap. Matches the pre-unified-page behavior
and is what users reasonably expect. True blue/green would require
per-deployment container names.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Issue 1: add List<String> sensitiveKeys as 4th field to DeploymentConfigSnapshot; populate
from agentConfig.getSensitiveKeys() in DeploymentExecutor; handleRestore hydrates from
snap.sensitiveKeys directly; Deployment type in apps.ts gains sensitiveKeys field
- Issue 2: after createApp succeeds, refetchQueries(['apps', envSlug]) before navigate so the
new app is in cache before the router renders the deployed view (eliminates transient Save-
disabled flash)
- Issue 3: useDeploymentPageState useEffect now uses prevServerStateRef to detect local edits;
background refetches only overwrite form when no local changes are present
- Issue 5: handleRedeploy invalidates dirty-state + versions queries after createDeployment
resolves; handleSave invalidates dirty-state after staged save
- Issue 10: DirtyStateCalculator strips volatile agentConfig keys (version, updatedAt, updatedBy,
environment, application) before JSON comparison via scrubAgentConfig(); adds
versionBumpDoesNotMarkDirty test
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Wires DirtyStateCalculator behind an HTTP endpoint on AppController.
Adds findLatestSuccessfulByAppAndEnv to PostgresDeploymentRepository,
registers DirtyStateCalculator as a Spring bean (with ObjectMapper for
JavaTimeModule support), and covers all three scenarios with IT.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- compareJson now recurses when both nodes are ObjectNode, so nested maps
(tracedProcessors, routeRecording, routeSamplingRates) produce deep paths
like agentConfig.tracedProcessors.proc-1 instead of a blob diff
- Extract nodeToString helper: value nodes use asText() (strips JSON quotes),
null becomes "(none)", arrays/objects get compact JSON
- Apply nodeToString in both diff-emission paths (top-level mismatch + leaf)
- Add three new tests: nullAgentConfigInSnapshot, nestedAgentField_reportsDeepPath,
stringField_differenceValueIsUnquoted (8 tests total, all pass)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Pure-logic dirty-state detection: compares desired JAR + agent config + container
config against the DeploymentConfigSnapshot from the last successful deployment.
Returns a structured DirtyStateResult with per-field differences. 5 unit tests.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Appends DeploymentConfigSnapshot deployedConfigSnapshot to the Deployment
record and adds a matching withDeployedConfigSnapshot wither. All
positional call sites (repository mapper, test fixture) updated to pass
null; Task 1.4 will wire real persistence and Task 1.5 will populate
the field on RUNNING transition.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
All other FKs to app_versions.id (e.g. Deployment.appVersionId) use UUID;
DeploymentConfigSnapshot.jarVersionId was incorrectly typed as String.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Dead field — was enforced by compact ctor as required for PER_EXCHANGE,
but never read anywhere in the codebase. Removal tightens the API surface
and is precondition for the Task 3.3 cross-field validator.
Pre-prod; no shim / migration.
Adds an optional afterExecutionId field to SearchRequest. When combined
with a non-null timeFrom, ClickHouseSearchIndex applies a strictly-after
tuple predicate (start_time > ts OR (start_time = ts AND execution_id > id))
so same-millisecond exchanges can be consumed exactly once across ticks.
When afterExecutionId is null, timeFrom keeps its existing >= semantics —
no behaviour change for any current caller.
Also adds the SearchRequest.withCursor(ts, id) wither. Threads the field
through existing withInstanceIds / withEnvironment witheres. All existing
positional call-sites (SearchController, ExchangeMatchEvaluator,
ClickHouseSearchIndexIT, ClickHouseChunkPipelineIT) pass null for the new
slot.
Task 1.2 of docs/superpowers/plans/2026-04-22-per-exchange-exactly-once.md.
The evaluator-side wiring that actually supplies the cursor is Task 1.5.
No callers after the legacy PG ingestion path was retired in 0f635576.
core-classes.md updated to drop the leftover note.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
After the ExecutionController removal (0f635576), SearchIndexer
subscribed to ExecutionUpdatedEvent but nothing publishes that event.
Every SearchIndexerStats metric returned always-zero, and the admin
/api/v1/admin/clickhouse/pipeline endpoint that surfaced those stats
carried no signal.
Backend removed:
- core: SearchIndexer, SearchIndexerStats, ExecutionUpdatedEvent
- app: IndexerPipelineResponse DTO, /pipeline endpoint on
ClickHouseAdminController (field + ctor param)
- StorageBeanConfig.searchIndexer bean
UI removed:
- IndexerPipeline type + useIndexerPipeline hook in
api/queries/admin/clickhouse.ts
- Indexer Pipeline card in ClickHouseAdminPage.tsx (plus ProgressBar
import and pipeline* CSS classes)
OpenAPI schema.d.ts + openapi.json regenerated (stale /pipeline path
and IndexerPipelineResponse schema removed).
SearchIndex interface + ClickHouseSearchIndex impl kept — those are
live and used by SearchService + ExchangeMatchEvaluator.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
ExecutionController was @ConditionalOnMissingBean(ChunkAccumulator.class),
and ChunkAccumulator is registered unconditionally — the legacy controller
never bound in any profile. Even if it had, IngestionService.ingestExecution
called executionStore.upsert(), and the only ExecutionStore impl
(ClickHouseExecutionStore) threw UnsupportedOperationException from upsert
and upsertProcessors. The entire RouteExecution → upsert path was dead code
carrying four transitive dependencies (RouteExecution import, eventPublisher
wiring, body-size-limit config, searchIndexer::onExecutionUpdated hook).
Removed:
- cameleer-server-app/.../controller/ExecutionController.java (whole file)
- ExecutionStore.upsert + upsertProcessors (interface methods)
- ClickHouseExecutionStore.upsert + upsertProcessors (thrower overrides)
- IngestionService.ingestExecution + toExecutionRecord + flattenProcessors
+ hasAnyTraceData + truncateBody + toJson/toJsonObject helpers
- IngestionService constructor now takes (DiagramStore, WriteBuffer<Metrics>);
dropped ExecutionStore + Consumer<ExecutionUpdatedEvent> + bodySizeLimit
- StorageBeanConfig.ingestionService(...) simplified accordingly
Untouched because still in use:
- ExecutionRecord / ProcessorRecord records (findById / findProcessors /
SearchIndexer / DetailController)
- SearchIndexer (its onExecutionUpdated never fires now since no-one
publishes ExecutionUpdatedEvent, but SearchIndexerStats is still
referenced by ClickHouseAdminController — separate cleanup)
- TaggedExecution record has no remaining callers after this change —
flagged in core-classes.md as a leftover; separate cleanup.
Rule docs updated:
- .claude/rules/app-classes.md: retired ExecutionController bullet, fixed
stale URL for ChunkIngestionController (it owns /api/v1/data/executions,
not /api/v1/ingestion/chunk/executions).
- .claude/rules/core-classes.md: IngestionService surface + note the dead
TaggedExecution.
Full IT suite post-removal: 560 tests run, 11 F + 1 E — same 12 failures
in the same 3 previously-parked classes (AgentSseControllerIT / SseSigningIT
SSE-timing + ClickHouseStatsStoreIT timezone bug). No regression.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Reproduction: pause a container long enough to cross both the stale
and dead thresholds, then unpause. The agent resumes sending heartbeats
but the server keeps it shown as DEAD. Only a full container restart
(which re-registers) fixes it.
Root cause: AgentRegistryService.heartbeat() only revived STALE → LIVE.
A DEAD agent's heartbeat updated lastHeartbeat but left state unchanged.
checkLifecycle() never downgrades DEAD either (no-op in that branch),
so the agent was permanently stuck in DEAD until a register() call.
Fix: extend the revival branch to also cover DEAD. Same process; a
heartbeat is proof of liveness regardless of the previous state.
Also: AgentLifecycleMonitor.mapTransitionEvent() now emits RECOVERED
for DEAD → LIVE, mirroring its behavior for STALE → LIVE, so the
lifecycle timeline captures the transition.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- new AlertInstanceRepository.filterInEnvLive(ids, env): single-query bulk ID validation
- AlertController.inEnvLiveIds now one SQL round-trip instead of N
- bulkMarkRead SQL: defense-in-depth AND deleted_at IS NULL
- bulkAck SQL already had deleted_at IS NULL guard — no change needed
- PostgresAlertInstanceRepositoryIT: add filterInEnvLive_excludes_other_env_and_soft_deleted
- V12MigrationIT: remove alert_reads assertion (table dropped by V17)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- AlertState: remove ACKNOWLEDGED case (V17 migration already dropped it from DB enum)
- AlertInstance: insert readAt + deletedAt Instant fields after lastNotifiedAt; add withReadAt/withDeletedAt withers; update all existing withers to pass both fields positionally
- AlertStateTransitions: add null,null for readAt/deletedAt in newInstance ctor call; collapse FIRING,ACKNOWLEDGED switch arm to just FIRING
- AlertScopeTest: update AlertState.values() assertion to 3 values; fix stale ConditionKind.hasSize(6) to 7 (JVM_METRIC was added earlier)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Allows alert rules to fire on agent-lifecycle events — REGISTERED,
RE_REGISTERED, DEREGISTERED, WENT_STALE, WENT_DEAD, RECOVERED — rather
than only on current state. Each matching `(agent, eventType, timestamp)`
becomes its own ackable AlertInstance, so outages on distinct agents are
independently routable.
Core:
- New `ConditionKind.AGENT_LIFECYCLE` + `AgentLifecycleCondition` record
(scope, eventTypes, withinSeconds). Compact ctor rejects empty
eventTypes and withinSeconds<1.
- Strict allowlist enum `AgentLifecycleEventType` (six entries matching
the server-emitted types in `AgentRegistrationController` and
`AgentLifecycleMonitor`). Custom agent-emitted event types tracked in
backlog issue #145.
- `AgentEventRepository.findInWindow(env, appSlug, agentId, eventTypes,
from, to, limit)` — new read path ordered `(timestamp ASC, insert_id
ASC)` used by the evaluator. Implemented on
`ClickHouseAgentEventRepository` with tenant + env filter mandatory.
App:
- `AgentLifecycleEvaluator` queries events in the last `withinSeconds`
window and returns `EvalResult.Batch` with one `Firing` per row.
Every Firing carries a canonical `_subjectFingerprint` of
`"<agentId>:<eventType>:<tsMillis>"` in context plus `agent` / `event`
subtrees for Mustache templating.
- `NotificationContextBuilder` gains an `AGENT_LIFECYCLE` branch that
exposes `{{agent.id}}`, `{{agent.app}}`, `{{event.type}}`,
`{{event.timestamp}}`, `{{event.detail}}`.
- Validation is delegated to the record compact ctor + enum at Jackson
deserialization time — matches the existing policy of keeping
controller validators focused on env-scoped / SQL-injection concerns.
Schema:
- V16 migration generalises the V15 per-exchange discriminator on
`alert_instances_open_rule_uq` to prefer `_subjectFingerprint` with a
fallback to the legacy `exchange.id` expression. Scalar kinds still
resolve to `''` and keep one-open-per-rule. Duplicate-key path in
`PostgresAlertInstanceRepository.save` is unchanged — the index is
the deduper.
UI:
- New `AgentLifecycleForm.tsx` wizard form with multi-select chips for
the six allowed event types + `withinSeconds` input. Wired into
`ConditionStep`, `form-state` (validation + defaults: WENT_DEAD,
300 s), and `enums.ts` options. Tests in `enums.test.ts` pin the
new option array.
- `alert-variables.ts` registers `{{agent.app}}`, `{{event.type}}`,
`{{event.timestamp}}`, `{{event.detail}}` leaves for the new kind,
and extends `agent.id`'s availability list to include `AGENT_LIFECYCLE`.
Tests (all passing):
- 5 new JSON-roundtrip cases on `AlertConditionJsonTest` (positive +
empty/zero/unknown-type rejection).
- 5 new evaluator unit tests on `AgentLifecycleEvaluatorTest` (empty
window, multi-agent fingerprint shape, scope forwarding, missing env).
- `NotificationContextBuilderTest` switch now covers the new kind.
- 119 alerting unit tests + 71 UI tests green.
Docs: `.claude/rules/{core,app,ui}` and CLAUDE.md migration list updated.
Backend: `GET /environments/{envSlug}/alerts` now accepts optional multi-value
`state=…` and `severity=…` query params. Filters are pushed down to
PostgresAlertInstanceRepository, which appends `AND state::text = ANY(?)` /
`AND severity::text = ANY(?)` to the inbox query (null/empty = no filter).
`AlertInstanceRepository.listForInbox` gained a 7-arg overload; the old 5-arg
form is preserved as a default delegate so existing callers (evaluator,
AlertingFullLifecycleIT, PostgresAlertInstanceRepositoryIT) compile unchanged.
`InAppInboxQuery.listInbox` also has a new filtered overload.
UI: InboxPage severity filter migrated from `SegmentedTabs` (single-select,
no color cues) to `ButtonGroup` (multi-select with severity-coloured dots),
matching the topnavbar status-filter pattern. `useAlerts` forwards the
filters as query params and cache-keys on the filter tuple so each combo
is independently cached.
Unit + hook tests updated to the new contract (5 UI tests + 8 Java unit
tests passing). OpenAPI types regenerated from the fresh local backend.
The rule editor wizard reset the condition payload on kind-change without
seeding a fireMode default; the ExchangeMatchCondition ctor allowed null to
pass through; AlertEvaluatorJob then NPE-looped every tick on a saved rule.
- core: compact ctor now rejects null fireMode (Jackson-deser path only — all
production callers already pass a value).
- V14: repair existing EXCHANGE_MATCH rows with fireMode=null to
PER_EXCHANGE + perExchangeLingerSeconds=300 (default matches the wizard).
- ui: ConditionStep.onKindChange seeds EXCHANGE_MATCH defaults so the
Select's displayed fallback ("Per exchange") is actually in form state.
- ui: validateStep('condition', ...) now enforces fireMode presence + the
mode-specific fields before the user reaches Review.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Spec §13 calls for the notification bell to colour-code by highest
unread severity (CRITICAL → error, WARNING → amber, INFO → muted).
The old { count } DTO forced the UI to pick one static colour, so
NotificationBell shipped with a TODO. Grow the contract instead:
UnreadCountResponse = { total, bySeverity: { CRITICAL, WARNING, INFO } }
Guarantees:
- every severity is always present with a >=0 value (no undefined
keys on the wire), so the UI can branch without defaults.
- total = sum of bySeverity values — kept explicit on the wire for
cheap top-line display, not recomputed client-side.
Backend
- AlertInstanceRepository: replaces countUnreadForUser(long) with
countUnreadBySeverityForUser returning Map<AlertSeverity, Long>.
One SQL round-trip per (env, user) — GROUP BY ai.severity over the
same NOT EXISTS(alert_reads) filter.
- UnreadCountResponse.from(Map) normalises and defensively copies;
missing severities default to 0.
- InAppInboxQuery.countUnread now returns the DTO, caches the full
response (still 5s TTL) so severity breakdown gets the same
hit-rate as the total did before.
- AlertController just hands the DTO back.
Breaking change — no backwards-compat shim: the `count` field is
gone. UI and tests updated in the same commit; there are no other
API consumers in the tree.
Frontend
- Regenerated openapi.json + schema.d.ts against a fresh build of
the new backend.
- NotificationBell branches badge colour on the highest unread
severity (CRITICAL > WARNING > INFO) via new CSS variants.
- Tests cover all four paths: zero, critical-present, warning-only,
info-only.
Tests: 7 unit tests + 12 ITs (incl. new grouping + empty-map)
+ 49 vitest (was 46; +3 severity-branch assertions).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
AlertNotificationRepository gains resetForRetry(UUID, Instant) which sets
attempts=0, status=PENDING, next_attempt_at=now, and clears claim/response
fields. AlertNotificationController calls resetForRetry instead of
scheduleRetry so a manual retry always starts from a clean slate.
AlertNotificationControllerIT adds retryResetsAttemptsToZero to verify
attempts==0 and status==PENDING after three prior markFailed calls.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
AlertInstanceRepository gains listFiringDueForReNotify(Instant) — only returns
instances where last_notified_at IS NOT NULL and cadence has elapsed (IS NULL
branch excluded: sweep only re-notifies, initial notify is the dispatcher's job).
AlertEvaluatorJob.sweepReNotify() runs at the end of each tick, enqueues fresh
notifications for eligible instances and stamps last_notified_at.
NotificationDispatchJob stamps last_notified_at on the alert_instance when a
notification is DELIVERED, providing the anchor timestamp for cadence checks.
PostgresAlertInstanceRepositoryIT adds listFiringDueForReNotify test covering
the three-rule eligibility matrix (never-notified, long-ago, recent).
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The evaluator mapped P95_LATENCY_MS to ExecutionStats.avgDurationMs because
stats_1m_route has no p95 column. Exposing the old name implied p95 semantics
operators did not get. Rename to AVG_DURATION_MS makes the contract honest.
Updated RouteMetric enum (with javadoc), evaluator switch, and admin guide.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Add withRuleSnapshot(Map) wither to AlertInstance (same pattern as other withers)
- Call snapshotRule(rule) + withRuleSnapshot in both applyResult (single-firing) and
applyBatchFiring paths so every persisted instance carries a non-empty JSONB snapshot
- Strip null values from the Jackson-serialized map before wrapping in the immutable
snapshot so Map.copyOf in the compact ctor does not throw NPE on nullable rule fields
- Add ruleSnapshotIsPersistedOnInstanceCreation IT: asserts name/severity/conditionKind
appear in the rule_snapshot column after a tick fires an instance
- Add historySurvivesRuleDelete IT: fires an instance, deletes the rule, asserts
rule_id IS NULL and rule_snapshot still contains the rule name (spec §5 guarantee)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Declared in cameleer-server-core pom (canonical location for unit-testable
rendering without Spring) and mirrored in cameleer-server-app pom so the
app module compiles standalone without a full reactor install.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adds AlertMatchSpec record (core) and ClickHouseSearchIndex.countExecutionsForAlerting —
no FINAL, no text subqueries. Filters by tenant, env, app, route, status, time window,
and optional after-cursor. Attributes (JSON string column) use inlined JSONExtractString
key literals since ClickHouse JDBC does not bind ? placeholders inside JSON functions.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Implements AlertInstanceRepository: save (upsert), findById, findOpenForRule,
listForInbox (3-way OR: user/group/role via && array-overlap + ANY), countUnreadForUser
(LEFT JOIN alert_reads), ack, resolve, markSilenced, deleteResolvedBefore.
Integration test covers all 9 scenarios including inbox fan-out across all
three target types. Also adds @JsonIgnoreProperties(ignoreUnknown=true) to
SilenceMatcher to suppress Jackson serializing isWildcard() as a round-trip field.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
New audit categories: OUTBOUND_CONNECTION_CHANGE, OUTBOUND_HTTP_TRUST_CHANGE.
Controller-level @PreAuthorize defaults to ADMIN; GETs relaxed to ADMIN|OPERATOR.
SecurityConfig permits OPERATOR GETs on /api/v1/admin/outbound-connections/**.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
V11 migration referenced users(id) as uuid, but V1 users table has
user_id as TEXT primary key. Amending V11 and the OutboundConnection
record before Task 7's integration tests catch this at Flyway startup.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>