Files
cameleer-server/docs/superpowers/plans/2026-04-22-per-exchange-exactly-once.md
hsiegeln 031fe725b5 docs(plan): PER_EXCHANGE exactly-once — implementation plan (21 tasks, 6 phases)
Plan for executing the tightened spec. TDD per task: RED test first,
minimal GREEN impl, commit. Phases 1-2 land the cursor + atomic batch
commit; phase 3 validates config; phase 4 fixes the UI mode-toggle
leakage + empty-targets guard + render-preview pane; phases 5-6 close
with full-lifecycle IT and regression sweep.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-22 15:39:31 +02:00

46 KiB
Raw Blame History

PER_EXCHANGE Exactly-Once-Per-Exchange Implementation Plan

For agentic workers: REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (- [ ]) syntax for tracking.

Goal: Make PER_EXCHANGE alert rules fire exactly one AlertInstance and one PENDING AlertNotification per failed exchange, durable across ticks and crashes, with a config surface that can't express nonsensical combinations.

Architecture: Three backend changes + one UI change, landed in order. §1 makes the evaluator cursor persist end-to-end using a composite (startTime, executionId) key (fixes the dead-code bug). §2 wraps per-rule batch processing in one transaction so instance writes, notification enqueues, and cursor advance commit atomically. §3 removes nonsensical knobs from the rule model, adds cross-field validation at the API edge, and tightens the Review step UX (mode-toggle field clearing, empty-targets guard, render-preview pane).

Tech Stack: Java 17 + Spring Boot 3.4.3, PostgreSQL (rule/instance/notification state), ClickHouse (exchange scan), React + Vitest (UI), Maven multi-module build. Existing test layers: JUnit 5 unit + Testcontainers IT.

Spec reference: docs/superpowers/specs/2026-04-22-per-exchange-exactly-once-design.md


Pre-flight

  • Step 0.1: Confirm clean baseline

Run: git status && git log --oneline -3 Expected: working tree clean, HEAD at or after 2f9b9c9b docs(spec): PER_EXCHANGE — tighten motivation, fold in njams review.

  • Step 0.2: Run the existing alerting suite as a known-good baseline

Run:

mvn -pl cameleer-server-app -am -Dit.test='ExchangeMatchEvaluatorTest,AlertEvaluatorJobIT,AlertingFullLifecycleIT,AlertRuleControllerIT' -DskipTests=false verify

Expected: all green. Note any pre-existing failures — do not proceed until baseline is clean.

  • Step 0.3: Confirm no stray TransactionTemplate usage

Run: grep -rn "TransactionTemplate" cameleer-server-app/src/main/java --include="*.java" || echo "none" Expected: none. Plan uses @Transactional annotation pattern (consistent with the rest of the codebase).


Phase 1: Composite cursor — make PER_EXCHANGE advance its cursor

Goal: ExchangeMatchEvaluator produces a composite (startTime, executionId) cursor; AlertEvaluatorJob persists it back into rule.evalState via releaseClaim. First-run bounded by rule.createdAt. Tests 1, 4, 5 all pass.

Task 1.1: Extend EvalResult.Batch to carry nextEvalState

Files:

  • Modify: cameleer-server-app/src/main/java/com/cameleer/server/app/alerting/eval/EvalResult.java

  • Step 1.1.1: Modify the Batch record

Replace:

record Batch(List<Firing> firings) implements EvalResult {
    public Batch {
        firings = firings == null ? List.of() : List.copyOf(firings);
    }
}

With:

record Batch(List<Firing> firings, Map<String, Object> nextEvalState) implements EvalResult {
    public Batch {
        firings = firings == null ? List.of() : List.copyOf(firings);
        nextEvalState = nextEvalState == null ? Map.of() : Map.copyOf(nextEvalState);
    }
    /** Convenience: a Batch with no cursor update (first-run empty, or no matches). */
    public static Batch empty() {
        return new Batch(List.of(), Map.of());
    }
}
  • Step 1.1.2: Update all callers referencing new EvalResult.Batch(...)

Run: grep -rn "new EvalResult.Batch\|new Batch(" cameleer-server-app/src --include="*.java"

For each hit, add Map.of() as the second arg (or replace with EvalResult.Batch.empty() where appropriate). Callers of note: ExchangeMatchEvaluator.evaluatePerExchange and any tests.

  • Step 1.1.3: Run compile sanity

Run: mvn -pl cameleer-server-app -am -DskipTests compile Expected: BUILD SUCCESS.

  • Step 1.1.4: Commit
git add cameleer-server-app/src/main/java/com/cameleer/server/app/alerting/eval/EvalResult.java
git commit -m "alerting(eval): EvalResult.Batch carries nextEvalState for cursor threading"

Task 1.2: Add afterExecutionId to SearchRequest (composite predicate input)

Files:

  • Modify: cameleer-server-core/src/main/java/com/cameleer/server/core/search/SearchRequest.java

  • Modify: cameleer-server-app/src/main/java/com/cameleer/server/app/search/ClickHouseSearchIndex.java

  • Step 1.2.1: Add the field to SearchRequest

Add String afterExecutionId as the last component before environment (keeps existing positional call-sites easy to find). Update the javadoc to document:

afterExecutionId — when combined with a non-null timeFrom, applies the composite predicate (start_time > timeFrom) OR (start_time = timeFrom AND execution_id > afterExecutionId). When null, timeFrom is applied as a plain >= lower bound (existing behaviour).

Update the two existing withers (withInstanceIds, withEnvironment) to pass the new field through. Add a new wither:

public SearchRequest withCursor(Instant ts, String afterExecutionId) {
    return new SearchRequest(
            status, ts, timeTo, durationMin, durationMax, correlationId,
            text, textInBody, textInHeaders, textInErrors,
            routeId, instanceId, processorType, applicationId, instanceIds,
            offset, limit, sortField, sortDir, afterExecutionId, environment
    );
}
  • Step 1.2.2: Update all call sites of new SearchRequest(...)

Run: grep -rn "new SearchRequest(" --include="*.java"

Every hit needs a null inserted in the new slot. The compiler will flag any miss.

  • Step 1.2.3: Teach ClickHouseSearchIndex to apply the composite predicate

In ClickHouseSearchIndex.search where the timeFrom WHERE clause is built, change from start_time >= ? to:

if (req.timeFrom() != null && req.afterExecutionId() != null) {
    // composite predicate: strictly-after in (start_time, execution_id) tuple order
    where.append(" AND (start_time > ? OR (start_time = ? AND execution_id > ?))");
    params.add(Timestamp.from(req.timeFrom()));
    params.add(Timestamp.from(req.timeFrom()));
    params.add(req.afterExecutionId());
} else if (req.timeFrom() != null) {
    where.append(" AND start_time >= ?");
    params.add(Timestamp.from(req.timeFrom()));
}

(Mirror the exact column/param style used in the surrounding code — look at how durationMin or routeId are appended to match.)

  • Step 1.2.4: Compile

Run: mvn -pl cameleer-server-app -am -DskipTests compile Expected: BUILD SUCCESS.

  • Step 1.2.5: Commit
git add cameleer-server-core/src/main/java/com/cameleer/server/core/search/SearchRequest.java \
        cameleer-server-app/src/main/java/com/cameleer/server/app/search/ClickHouseSearchIndex.java
git commit -m "search: SearchRequest.afterExecutionId — composite (startTime, execId) predicate"

Task 1.3: RED — Test 1 cursor monotonicity + Test 4 first-run in ExchangeMatchEvaluatorTest

Files:

  • Modify: cameleer-server-app/src/test/java/com/cameleer/server/app/alerting/eval/ExchangeMatchEvaluatorTest.java

  • Step 1.3.1: Add Test 1 and Test 4

Add two @Test methods. Use existing test fixtures (mock SearchIndex, build a PER_EXCHANGE AlertRule) — copy the setup pattern from the existing tests in the same file.

@Test
void cursorMonotonicity_sameMillisecondExchanges_fireExactlyOncePerTick() {
    var t = Instant.parse("2026-04-22T10:00:00Z");
    var exA = executionSummary("exec-a", t, "FAILED");
    var exB = executionSummary("exec-b", t, "FAILED");
    when(searchIndex.search(any())).thenReturn(new SearchResult<>(List.of(exA, exB), 2, false));

    AlertRule rule = perExchangeRule().withEvalState(Map.of()); // first-run
    EvalResult r1 = evaluator.evaluate(rule.condition(), rule, ctx(t.plusSeconds(1)));

    assertThat(r1).isInstanceOf(EvalResult.Batch.class);
    var batch1 = (EvalResult.Batch) r1;
    assertThat(batch1.firings()).hasSize(2);
    assertThat(batch1.nextEvalState()).containsKey("lastExchangeCursor");
    // cursor is (t, "exec-b") since "exec-b" > "exec-a" lexicographically

    // Tick 2: reflect the advanced cursor back; expect zero firings
    AlertRule advanced = rule.withEvalState(batch1.nextEvalState());
    when(searchIndex.search(any())).thenReturn(new SearchResult<>(List.of(), 0, false));
    EvalResult r2 = evaluator.evaluate(advanced.condition(), advanced, ctx(t.plusSeconds(2)));
    assertThat(((EvalResult.Batch) r2).firings()).isEmpty();

    // Tick 3: a third exchange at the same t with exec-c > exec-b; expect exactly one firing
    var exC = executionSummary("exec-c", t, "FAILED");
    when(searchIndex.search(any())).thenReturn(new SearchResult<>(List.of(exC), 1, false));
    EvalResult r3 = evaluator.evaluate(advanced.condition(), advanced, ctx(t.plusSeconds(3)));
    assertThat(((EvalResult.Batch) r3).firings()).hasSize(1);
}

@Test
void firstRun_boundedByRuleCreatedAt_notRetentionHistory() {
    var created = Instant.parse("2026-04-22T09:00:00Z");
    var before  = created.minus(Duration.ofHours(1));
    var after   = created.plus(Duration.ofMinutes(30));

    // The evaluator must pass `timeFrom = created` to the search.
    ArgumentCaptor<SearchRequest> cap = ArgumentCaptor.forClass(SearchRequest.class);
    when(searchIndex.search(cap.capture())).thenReturn(
            new SearchResult<>(List.of(executionSummary("exec-after", after, "FAILED")), 1, false));

    AlertRule rule = perExchangeRule(created).withEvalState(Map.of()); // no cursor
    EvalResult r = evaluator.evaluate(rule.condition(), rule, ctx(after.plusSeconds(10)));

    SearchRequest req = cap.getValue();
    assertThat(req.timeFrom()).isEqualTo(created);
    assertThat(((EvalResult.Batch) r).firings()).hasSize(1);
    // Pre-creation exchanges never reached the evaluator — the ClickHouse side drops them.
}

(Helpers like perExchangeRule(), executionSummary(...), ctx(...) either exist in the test class or need small additions — copy the style from the existing test file.)

Note: AlertRule.withEvalState does not exist yet — Task 1.4 adds it.

  • Step 1.3.2: Run the new tests — expect RED

Run:

mvn -pl cameleer-server-app test -Dtest='ExchangeMatchEvaluatorTest#cursorMonotonicity_sameMillisecondExchanges_fireExactlyOncePerTick+firstRun_boundedByRuleCreatedAt_notRetentionHistory'

Expected: both RED. Compile may fail until Task 1.4 lands — that's fine, keep the test in a failing state and proceed.

Task 1.4: Add AlertRule.withEvalState wither

Files:

  • Modify: cameleer-server-core/src/main/java/com/cameleer/server/core/alerting/AlertRule.java

  • Step 1.4.1: Add the wither method inside the record

public AlertRule withEvalState(Map<String, Object> newEvalState) {
    return new AlertRule(
            id, environmentId, name, description, severity, enabled,
            conditionKind, condition, evaluationIntervalSeconds,
            forDurationSeconds, reNotifyMinutes,
            notificationTitleTmpl, notificationMessageTmpl,
            webhooks, targets, nextEvaluationAt, claimedBy, claimedUntil,
            newEvalState,
            createdAt, createdBy, updatedAt, updatedBy);
}
  • Step 1.4.2: Compile

Run: mvn -pl cameleer-server-core test-compile Expected: BUILD SUCCESS.

  • Step 1.4.3: Commit (wither is a precondition for Task 1.5)
git add cameleer-server-core/src/main/java/com/cameleer/server/core/alerting/AlertRule.java
git commit -m "core(alerting): AlertRule.withEvalState wither for cursor threading"

Task 1.5: Implement composite-cursor logic in ExchangeMatchEvaluator

Files:

  • Modify: cameleer-server-app/src/main/java/com/cameleer/server/app/alerting/eval/ExchangeMatchEvaluator.java

  • Step 1.5.1: Rewrite evaluatePerExchange to use the composite cursor

Replace the body's cursor parse + SearchRequest construction + _nextCursor stamping with:

private EvalResult evaluatePerExchange(ExchangeMatchCondition c, AlertRule rule,
                                       EvalContext ctx, String envSlug) {
    String appSlug = c.scope() != null ? c.scope().appSlug() : null;
    String routeId = c.scope() != null ? c.scope().routeId() : null;
    ExchangeMatchCondition.ExchangeFilter filter = c.filter();

    // Resolve composite cursor: (startTime, executionId)
    Instant cursorTs;
    String cursorId;
    Object raw = rule.evalState().get("lastExchangeCursor");
    if (raw instanceof String s && !s.isBlank()) {
        int pipe = s.indexOf('|');
        if (pipe < 0) {
            // Malformed — treat as first-run
            cursorTs = rule.createdAt();
            cursorId = "";
        } else {
            cursorTs = Instant.parse(s.substring(0, pipe));
            cursorId = s.substring(pipe + 1);
        }
    } else {
        // First run — bounded by rule.createdAt, empty executionId so any real id sorts after it
        cursorTs = rule.createdAt();
        cursorId = "";
    }

    var req = new SearchRequest(
            filter != null ? filter.status() : null,
            cursorTs,                       // timeFrom
            ctx.now(),                      // timeTo
            null, null, null,
            null, null, null, null,
            routeId,
            null, null,
            appSlug,
            null,
            0, 50, "startTime", "asc",
            cursorId.isEmpty() ? null : cursorId,  // afterExecutionId — null on first run enables >=
            envSlug
    );

    SearchResult<ExecutionSummary> result = searchIndex.search(req);
    List<ExecutionSummary> matches = result.data();

    if (matches.isEmpty()) return EvalResult.Batch.empty();

    // Ensure deterministic ordering for cursor advance
    matches.sort(Comparator
            .comparing(ExecutionSummary::startTime)
            .thenComparing(ExecutionSummary::executionId));

    ExecutionSummary last = matches.get(matches.size() - 1);
    String nextCursorSerialized = last.startTime().toString() + "|" + last.executionId();

    List<EvalResult.Firing> firings = new ArrayList<>();
    for (ExecutionSummary ex : matches) {
        Map<String, Object> ctx2 = new HashMap<>();
        ctx2.put("exchange", Map.of(
                "id",        ex.executionId(),
                "routeId",   ex.routeId() == null ? "" : ex.routeId(),
                "status",    ex.status()  == null ? "" : ex.status(),
                "startTime", ex.startTime() == null ? "" : ex.startTime().toString()
        ));
        ctx2.put("app", Map.of("slug", ex.applicationId() == null ? "" : ex.applicationId()));
        firings.add(new EvalResult.Firing(1.0, null, ctx2));
    }

    Map<String, Object> nextEvalState = new HashMap<>(rule.evalState());
    nextEvalState.put("lastExchangeCursor", nextCursorSerialized);
    return new EvalResult.Batch(firings, nextEvalState);
}

Note: retires the _nextCursor per-firing stamp. If any downstream consumer reads it (unlikely — it was dead), the compiler won't catch it. Spec-check: Test 3 asserts the end-to-end behaviour and will fail if anything still reads _nextCursor.

  • Step 1.5.2: Remove the legacy lastExchangeTs read path

The old code also parsed lastExchangeTs as an Instant. Per spec §1, the key is retired. Delete any references to lastExchangeTs in this file.

Run: grep -n "lastExchangeTs" cameleer-server-app/src/main/java/com/cameleer/server/app/alerting/eval/ExchangeMatchEvaluator.java Expected: (no matches) after edit.

  • Step 1.5.3: Run the unit tests from Task 1.3 — expect GREEN for both

Run:

mvn -pl cameleer-server-app test -Dtest='ExchangeMatchEvaluatorTest'

Expected: all green.

  • Step 1.5.4: Commit
git add cameleer-server-app/src/main/java/com/cameleer/server/app/alerting/eval/ExchangeMatchEvaluator.java \
        cameleer-server-app/src/test/java/com/cameleer/server/app/alerting/eval/ExchangeMatchEvaluatorTest.java
git commit -m "alerting(eval): PER_EXCHANGE composite cursor — monotone across same-ms exchanges

Tests:
- cursorMonotonicity_sameMillisecondExchanges_fireExactlyOncePerTick
- firstRun_boundedByRuleCreatedAt_notRetentionHistory"

Task 1.6: Persist the advanced cursor in AlertEvaluatorJob

Files:

  • Modify: cameleer-server-app/src/main/java/com/cameleer/server/app/alerting/eval/AlertEvaluatorJob.java

  • Step 1.6.1: Thread nextEvalState from Batch into releaseClaim

Find the tick() / per-rule processing loop. Between applyResult(rule, result) and reschedule(rule, nextRun), capture the updated eval state from a Batch result:

Map<String, Object> nextEvalState = rule.evalState();
if (result instanceof EvalResult.Batch b) {
    applyResult(rule, b);
    if (!b.nextEvalState().isEmpty()) {
        nextEvalState = b.nextEvalState();
    }
} else {
    applyResult(rule, result);
}
// ...
reschedule(rule.withEvalState(nextEvalState), nextRun);

Update reschedule to pass rule.evalState() (which is now the updated value because we withEvalState-ed the local):

private void reschedule(AlertRule rule, Instant nextRun) {
    ruleRepo.releaseClaim(rule.id(), nextRun, rule.evalState());
}

(The existing signature already passes rule.evalState() — the change is that the caller hands in a mutated rule.)

  • Step 1.6.2: Delete the dead _nextCursor code path

If any lingering reference to _nextCursor exists in AlertEvaluatorJob, remove it.

Run: grep -n "_nextCursor" cameleer-server-app/src/main/java/com/cameleer/server/app/alerting/eval/AlertEvaluatorJob.java Expected: (no matches).

  • Step 1.6.3: Compile + unit tests

Run: mvn -pl cameleer-server-app test -Dtest='ExchangeMatchEvaluatorTest' Expected: green.

Task 1.7: RED — Test 5 notification-bleed regression in AlertEvaluatorJobIT

Files:

  • Modify: cameleer-server-app/src/test/java/com/cameleer/server/app/alerting/eval/AlertEvaluatorJobIT.java

  • Step 1.7.1: Add Test 5

Use the existing IT fixture pattern (Testcontainers Postgres + ClickHouse).

@Test
void tick2_noNewExchanges_enqueuesZeroAdditionalNotifications() {
    // Arrange: 2 FAILED executions in ClickHouse, 1 PER_EXCHANGE rule with 1 webhook
    seedFailedExecution("exec-1", Instant.parse("2026-04-22T10:00:00Z"));
    seedFailedExecution("exec-2", Instant.parse("2026-04-22T10:00:01Z"));
    UUID ruleId = createPerExchangeRuleWithWebhook();

    // Tick 1 — expect 2 instances, 2 notifications
    evaluator.tick();
    assertThat(instanceRepo.listByRule(ruleId)).hasSize(2);
    assertThat(notificationRepo.listByRule(ruleId)).hasSize(2);

    // Mark dispatched (simulate NotificationDispatchJob drained them)
    notificationRepo.listByRule(ruleId).forEach(n ->
            notificationRepo.markDelivered(n.id(), Instant.now(), 200, "OK"));

    // Tick 2 — no new executions, expect no new notifications
    evaluator.tick();

    assertThat(instanceRepo.listByRule(ruleId)).hasSize(2); // unchanged
    long pending = notificationRepo.listByRule(ruleId).stream()
            .filter(n -> n.status() == NotificationStatus.PENDING).count();
    assertThat(pending).isZero(); // THE BLEED — must be zero after fix
}

(listByRule, markDelivered, seedFailedExecution, createPerExchangeRuleWithWebhook — use or extend helpers already present in AlertEvaluatorJobIT. If they don't exist, add them following the style of the existing tests.)

  • Step 1.7.2: Run Test 5 — expect GREEN (§1 should have fixed it)

Run:

mvn -pl cameleer-server-app verify -Dit.test='AlertEvaluatorJobIT#tick2_noNewExchanges_enqueuesZeroAdditionalNotifications'

Expected: GREEN. If RED, §1.5 / §1.6 are incomplete — debug before proceeding.

  • Step 1.7.3: Commit Phase 1 closeout
git add cameleer-server-app/src/main/java/com/cameleer/server/app/alerting/eval/AlertEvaluatorJob.java \
        cameleer-server-app/src/test/java/com/cameleer/server/app/alerting/eval/AlertEvaluatorJobIT.java
git commit -m "alerting(eval): persist advanced cursor via releaseClaim — Phase 1 close

Fixes the notification-bleed regression pinned by
AlertEvaluatorJobIT#tick2_noNewExchanges_enqueuesZeroAdditionalNotifications."

Phase 2: Transactional coupling — atomic batch commit

Goal: Instance writes + notification enqueues + cursor advance commit atomically per rule tick. Test 2 pins this.

Task 2.1: RED — Test 2 tick atomicity in AlertEvaluatorJobIT

Files:

  • Modify: cameleer-server-app/src/test/java/com/cameleer/server/app/alerting/eval/AlertEvaluatorJobIT.java

  • Step 2.1.1: Add Test 2 with fault injection on the second notification insert

The cleanest injection point is a repository spy or a wrapped AlertNotificationRepository that throws on the Nth save() call. Look at how other ITs wrap repos; if there is a pattern, follow it. Otherwise, add a minimal testing-only FaultInjectingNotificationRepository decorator in the test's support package:

@Test
void tickRollback_faultOnSecondNotificationInsert_leavesCursorUnchanged() {
    seedFailedExecution("exec-1", Instant.parse("2026-04-22T10:00:00Z"));
    seedFailedExecution("exec-2", Instant.parse("2026-04-22T10:00:01Z"));
    seedFailedExecution("exec-3", Instant.parse("2026-04-22T10:00:02Z"));
    UUID ruleId = createPerExchangeRuleWithWebhook();

    var ruleBefore = ruleRepo.findById(ruleId).orElseThrow();
    String cursorBefore = (String) ruleBefore.evalState().get("lastExchangeCursor");
    Instant nextRunBefore = ruleBefore.nextEvaluationAt();

    faultInjectingNotificationRepo.failOnSave(2); // 2nd save() throws

    assertThatThrownBy(() -> evaluator.tick()).isInstanceOf(RuntimeException.class);

    // Post-rollback: zero instances, zero notifications, cursor unchanged, nextRunAt unchanged
    assertThat(instanceRepo.listByRule(ruleId)).isEmpty();
    assertThat(notificationRepo.listByRule(ruleId)).isEmpty();
    var ruleAfter = ruleRepo.findById(ruleId).orElseThrow();
    assertThat(ruleAfter.evalState().get("lastExchangeCursor")).isEqualTo(cursorBefore);
    assertThat(ruleAfter.nextEvaluationAt()).isEqualTo(nextRunBefore);

    // Recover: clear fault, tick again
    faultInjectingNotificationRepo.clearFault();
    evaluator.tick();
    assertThat(instanceRepo.listByRule(ruleId)).hasSize(3);
    assertThat(notificationRepo.listByRule(ruleId)).hasSize(3);
    assertThat(ruleRepo.findById(ruleId).orElseThrow()
            .evalState().get("lastExchangeCursor")).isNotEqualTo(cursorBefore);
}
  • Step 2.1.2: Run Test 2 — expect RED

Run:

mvn -pl cameleer-server-app verify -Dit.test='AlertEvaluatorJobIT#tickRollback_faultOnSecondNotificationInsert_leavesCursorUnchanged'

Expected: RED — one instance + one notification are persisted before the fault; current code is not transactional.

Task 2.2: Wrap per-rule batch processing in a single transaction

Files:

  • Modify: cameleer-server-app/src/main/java/com/cameleer/server/app/alerting/eval/AlertEvaluatorJob.java

  • Step 2.2.1: Extract the per-rule batch work into a @Transactional method

Introduce (or refactor into) a method like:

@org.springframework.transaction.annotation.Transactional
protected void processBatchResultAtomically(AlertRule rule, EvalResult.Batch batch, Instant nextRun) {
    // 1. persist all AlertInstances
    for (EvalResult.Firing f : batch.firings()) {
        applyBatchFiring(rule, f); // already persists instance + enqueues notifications
    }
    // 2. update rule: cursor + nextRunAt
    AlertRule updated = rule.withEvalState(
            batch.nextEvalState().isEmpty() ? rule.evalState() : batch.nextEvalState());
    ruleRepo.releaseClaim(updated.id(), nextRun, updated.evalState());
}

Remove the post-loop reschedule(rule, nextRun) call for the Batch path — it's folded in above. Keep it intact for Clear / Firing / Error results (non-Batch path has its own failure semantics and is untouched per spec non-goal).

The Spring proxy rule: @Transactional only applies to methods called through the bean, not via this.foo(). If AlertEvaluatorJob is itself a Spring bean called from the scheduler, splitting processBatchResultAtomically into a separate @Component (e.g. BatchResultApplier) injected into AlertEvaluatorJob is safer. Do that if the existing class structure is self-calling.

  • Step 2.2.2: Ensure the repositories participate in the outer transaction

JdbcTemplate-based repositories (PostgresAlertInstanceRepository, PostgresAlertNotificationRepository, PostgresAlertRuleRepository) use DataSourceTransactionManager automatically when the outer method is @Transactional. No change expected, but grep for explicit transactionTemplate.execute or manual getConnection().setAutoCommit(false) — if any are present in these repos, that's a problem. Fix: remove the manual transaction boundary and let the outer @Transactional drive.

Run: grep -n "setAutoCommit\|transactionTemplate" cameleer-server-app/src/main/java/com/cameleer/server/app/alerting/storage/*.java Expected: no manual tx boundaries. If any appear, delete them.

  • Step 2.2.3: Run Test 2 — expect GREEN

Run:

mvn -pl cameleer-server-app verify -Dit.test='AlertEvaluatorJobIT#tickRollback_faultOnSecondNotificationInsert_leavesCursorUnchanged'

Expected: GREEN.

  • Step 2.2.4: Run all AlertEvaluatorJobIT tests — regression check

Run:

mvn -pl cameleer-server-app verify -Dit.test='AlertEvaluatorJobIT'

Expected: all green.

  • Step 2.2.5: Commit
git add cameleer-server-app/src/main/java/com/cameleer/server/app/alerting/eval/AlertEvaluatorJob.java \
        cameleer-server-app/src/test/java/com/cameleer/server/app/alerting/eval/AlertEvaluatorJobIT.java
git commit -m "alerting(eval): atomic per-rule batch commit — Phase 2 close

Wraps instance writes, notification enqueues, and cursor advance in one
@Transactional boundary. Rollback leaves the rule replayable on next tick."

Phase 3: Config hygiene — seal the nonsensical combinations

Goal: perExchangeLingerSeconds removed from the model. API rejects reNotifyMinutes != 0 or forDurationSeconds != 0 on PER_EXCHANGE, and rejects empty webhooks+targets on save. UI clears mode-specific fields on fireMode toggle. Review step blocks save on missing targets and surfaces render-preview.

Task 3.1: Remove perExchangeLingerSeconds from the core model

Files:

  • Modify: cameleer-server-core/src/main/java/com/cameleer/server/core/alerting/ExchangeMatchCondition.java

  • Step 3.1.1: Drop the field from the record and its compact ctor validation

Delete int perExchangeLingerSeconds from the record components and remove the corresponding line in the compact constructor (the one that currently enforces non-null / positive on this field for PER_EXCHANGE).

  • Step 3.1.2: Fix all call sites

Run: grep -rn "perExchangeLingerSeconds" --include="*.java" --include="*.ts" --include="*.tsx"

Every hit in Java files needs the positional arg removed from new ExchangeMatchCondition(...) calls. For TS/TSX hits, defer to Phase 4 Task 4.1 (type regen) which will surface them.

  • Step 3.1.3: Compile

Run: mvn -pl cameleer-server-core,cameleer-server-app -DskipTests compile Expected: BUILD SUCCESS.

Task 3.2: RED — cross-field validator tests in AlertRuleControllerIT

Files:

  • Modify: cameleer-server-app/src/test/java/com/cameleer/server/app/alerting/controller/AlertRuleControllerIT.java

  • Step 3.2.1: Add three validation ITs

@Test
void createPerExchangeRule_withReNotifyMinutesNonZero_returns400() {
    var body = baseValidPerExchangeRuleRequest()
            .withReNotifyMinutes(60);
    mvc.perform(post(rulesUrl()).contentType(APPLICATION_JSON).content(asJson(body)))
       .andExpect(status().isBadRequest())
       .andExpect(jsonPath("$.message").value(containsString("reNotifyMinutes")));
}

@Test
void createPerExchangeRule_withForDurationSecondsNonZero_returns400() {
    var body = baseValidPerExchangeRuleRequest()
            .withForDurationSeconds(60);
    mvc.perform(post(rulesUrl()).contentType(APPLICATION_JSON).content(asJson(body)))
       .andExpect(status().isBadRequest())
       .andExpect(jsonPath("$.message").value(containsString("forDurationSeconds")));
}

@Test
void createAnyRule_withEmptyWebhooksAndTargets_returns400() {
    var body = baseValidPerExchangeRuleRequest()
            .withWebhooks(List.of())
            .withTargets(List.of());
    mvc.perform(post(rulesUrl()).contentType(APPLICATION_JSON).content(asJson(body)))
       .andExpect(status().isBadRequest())
       .andExpect(jsonPath("$.message").value(containsString("webhook").or(containsString("target"))));
}

(baseValidPerExchangeRuleRequest — helper builder. If missing, add one at the top of the file based on existing valid request fixtures.)

  • Step 3.2.2: Run the three tests — expect RED

Run:

mvn -pl cameleer-server-app verify -Dit.test='AlertRuleControllerIT#createPerExchangeRule_withReNotifyMinutesNonZero_returns400+createPerExchangeRule_withForDurationSecondsNonZero_returns400+createAnyRule_withEmptyWebhooksAndTargets_returns400'

Expected: 3 RED.

Task 3.3: Implement cross-field validation in AlertRuleController

Files:

  • Modify: cameleer-server-app/src/main/java/com/cameleer/server/app/alerting/controller/AlertRuleController.java

  • Modify: cameleer-server-app/src/main/java/com/cameleer/server/app/alerting/dto/AlertRuleRequest.java (if separate)

  • Step 3.3.1: Add a validateBusinessRules(AlertRuleRequest req) method

Gather the three checks in one place:

private void validateBusinessRules(AlertRuleRequest req) {
    if (req.condition() instanceof ExchangeMatchCondition ex
            && ex.fireMode() == FireMode.PER_EXCHANGE) {
        if (req.reNotifyMinutes() != 0) {
            throw new ResponseStatusException(HttpStatus.BAD_REQUEST,
                "reNotifyMinutes must be 0 for PER_EXCHANGE rules (re-notify does not apply)");
        }
        if (req.forDurationSeconds() != 0) {
            throw new ResponseStatusException(HttpStatus.BAD_REQUEST,
                "forDurationSeconds must be 0 for PER_EXCHANGE rules");
        }
    }
    boolean noWebhooks = req.webhooks() == null || req.webhooks().isEmpty();
    boolean noTargets  = req.targets()  == null || req.targets().isEmpty();
    if (noWebhooks && noTargets) {
        throw new ResponseStatusException(HttpStatus.BAD_REQUEST,
            "rule must have at least one webhook or target — otherwise it never notifies anyone");
    }
}
  • Step 3.3.2: Call it from the create AND update handlers

Run: grep -n "@PostMapping\|@PutMapping" cameleer-server-app/src/main/java/com/cameleer/server/app/alerting/controller/AlertRuleController.java

For each POST/PUT handler that accepts an AlertRuleRequest, call validateBusinessRules(req) at the top of the method (before any persistence).

  • Step 3.3.3: Run the three tests from 3.2 — expect GREEN

Run:

mvn -pl cameleer-server-app verify -Dit.test='AlertRuleControllerIT#createPerExchangeRule_withReNotifyMinutesNonZero_returns400+createPerExchangeRule_withForDurationSecondsNonZero_returns400+createAnyRule_withEmptyWebhooksAndTargets_returns400'

Expected: 3 GREEN.

  • Step 3.3.4: Run the full AlertRuleControllerIT — regression check

Run: mvn -pl cameleer-server-app verify -Dit.test='AlertRuleControllerIT' Expected: all green. Some existing fixtures may have built rules with empty webhooks+targets — fix those fixtures by adding at least one target.

  • Step 3.3.5: Commit
git add cameleer-server-app/src/main/java/com/cameleer/server/app/alerting/controller/AlertRuleController.java \
        cameleer-server-app/src/main/java/com/cameleer/server/app/alerting/dto/AlertRuleRequest.java \
        cameleer-server-app/src/test/java/com/cameleer/server/app/alerting/controller/AlertRuleControllerIT.java \
        cameleer-server-core/src/main/java/com/cameleer/server/core/alerting/ExchangeMatchCondition.java
git commit -m "alerting(api): cross-field validation for PER_EXCHANGE + empty-targets guard

Drops perExchangeLingerSeconds (unused since inception). 400s on:
- reNotifyMinutes != 0 for PER_EXCHANGE
- forDurationSeconds != 0 for PER_EXCHANGE
- empty webhooks + empty targets (rule never notifies)"

Phase 4: UI — form-state hygiene + Review step guards

Goal: ExchangeMatchForm clears mode-specific fields on fireMode toggle; ReviewStep blocks save on empty webhooks+targets and shows a render-preview pane.

Task 4.1: Regenerate OpenAPI types (reflects Phase 3 contract changes)

Files:

  • Regen: ui/src/api/openapi.json, ui/src/api/schema.d.ts

  • Step 4.1.1: Start the backend

Run (in a separate terminal, or via run_in_background):

mvn -pl cameleer-server-app spring-boot:run

Wait until Started ... in ... seconds appears.

  • Step 4.1.2: Regenerate types

Run:

cd ui && npm run generate-api:live

Expected: openapi.json + schema.d.ts updated.

  • Step 4.1.3: Surface TypeScript compile errors

Run:

cd ui && npm run typecheck

Expected: errors on every SPA call site that references the dropped perExchangeLingerSeconds or the stale rule-request shape. Fix each — either drop the field from object literals or stop surfacing it in forms.

  • Step 4.1.4: Stop the backend

Ctrl-C (or kill the background process).

  • Step 4.1.5: Commit the regenerated types
git add ui/src/api/openapi.json ui/src/api/schema.d.ts
# Fixes to other .ts/.tsx files are added in later tasks; if there are stragglers here, add them too.
git commit -m "ui(api): regen types after PER_EXCHANGE validation + linger field drop"

Task 4.2: RED — form-state clear test

Files:

  • Create: ui/src/pages/Alerts/RuleEditor/condition-forms/ExchangeMatchForm.test.tsx

  • Step 4.2.1: Write the failing Vitest

import { describe, it, expect } from "vitest";
import { initialFormState, reduceFormState } from "../form-state";

describe("ExchangeMatchForm form-state on fireMode toggle", () => {
  it("clears COUNT_IN_WINDOW fields when switching to PER_EXCHANGE", () => {
    let s = initialFormState();
    s = reduceFormState(s, { type: "SET_CONDITION_KIND", kind: "EXCHANGE_MATCH" });
    s = reduceFormState(s, { type: "SET_FIRE_MODE", mode: "COUNT_IN_WINDOW" });
    s = reduceFormState(s, { type: "SET_THRESHOLD", value: 5 });
    s = reduceFormState(s, { type: "SET_WINDOW_SECONDS", value: 300 });

    s = reduceFormState(s, { type: "SET_FIRE_MODE", mode: "PER_EXCHANGE" });

    expect(s.condition.threshold).toBe(0);
    expect(s.condition.windowSeconds).toBe(0);
  });

  it("restores defaults (not stale values) when switching back to COUNT_IN_WINDOW", () => {
    let s = initialFormState();
    s = reduceFormState(s, { type: "SET_CONDITION_KIND", kind: "EXCHANGE_MATCH" });
    s = reduceFormState(s, { type: "SET_FIRE_MODE", mode: "COUNT_IN_WINDOW" });
    s = reduceFormState(s, { type: "SET_THRESHOLD", value: 5 });
    s = reduceFormState(s, { type: "SET_FIRE_MODE", mode: "PER_EXCHANGE" });
    s = reduceFormState(s, { type: "SET_FIRE_MODE", mode: "COUNT_IN_WINDOW" });

    expect(s.condition.threshold).toBe(0); // not 5
  });

  it("clears reNotifyMinutes to 0 when switching to PER_EXCHANGE", () => {
    let s = initialFormState();
    s = reduceFormState(s, { type: "SET_CONDITION_KIND", kind: "EXCHANGE_MATCH" });
    s = reduceFormState(s, { type: "SET_FIRE_MODE", mode: "COUNT_IN_WINDOW" });
    s = reduceFormState(s, { type: "SET_RENOTIFY_MINUTES", value: 60 });
    s = reduceFormState(s, { type: "SET_FIRE_MODE", mode: "PER_EXCHANGE" });

    expect(s.reNotifyMinutes).toBe(0);
    expect(s.forDurationSeconds).toBe(0);
  });
});

(If the action names or shape differ, align to the existing form-state.ts reducer signature — read the existing file once to match.)

  • Step 4.2.2: Run — expect RED

Run:

cd ui && npm test -- ExchangeMatchForm.test.tsx

Expected: RED.

Task 4.3: Implement form-state clearing on fireMode toggle

Files:

  • Modify: ui/src/pages/Alerts/RuleEditor/form-state.ts

  • Step 4.3.1: Update the SET_FIRE_MODE reducer case

Find the existing SET_FIRE_MODE action handler. Replace it with a handler that, based on the new mode, resets the fields that are no longer in scope:

case "SET_FIRE_MODE": {
  if (action.mode === "PER_EXCHANGE") {
    return {
      ...state,
      condition: { ...state.condition, fireMode: "PER_EXCHANGE",
                   threshold: 0, windowSeconds: 0 },
      reNotifyMinutes: 0,
      forDurationSeconds: 0,
    };
  }
  // COUNT_IN_WINDOW — restore defaults for re-entered fields
  return {
    ...state,
    condition: { ...state.condition, fireMode: "COUNT_IN_WINDOW",
                 threshold: 0, windowSeconds: 0 },  // zero not stale
    // reNotifyMinutes / forDurationSeconds stay as-is or per product default
  };
}

(Adjust exact field shape to match the real FormState type from the file.)

  • Step 4.3.2: Run the test — expect GREEN

Run: cd ui && npm test -- ExchangeMatchForm.test.tsx Expected: GREEN.

  • Step 4.3.3: Commit
git add ui/src/pages/Alerts/RuleEditor/form-state.ts \
        ui/src/pages/Alerts/RuleEditor/condition-forms/ExchangeMatchForm.test.tsx
git commit -m "ui(alerts): clear mode-specific fields on fireMode toggle

Prevents stale COUNT_IN_WINDOW threshold/windowSeconds from surviving
PER_EXCHANGE save validation."

Task 4.4: Remove linger field + enforce PER_EXCHANGE UI constraints

Files:

  • Modify: ui/src/pages/Alerts/RuleEditor/condition-forms/ExchangeMatchForm.tsx

  • Step 4.4.1: Delete any perExchangeLingerSeconds input binding

Run: grep -n "perExchangeLingerSeconds\|lingerSeconds" ui/src/pages/Alerts/RuleEditor/ Expected after this step: no matches anywhere in ui/src.

  • Step 4.4.2: When fireMode === 'PER_EXCHANGE', force-disable reNotifyMinutes at 0

Near the reNotifyMinutes input, wrap it:

<label>Re-notify every (minutes)</label>
<input
  type="number"
  min={0}
  value={state.reNotifyMinutes}
  onChange={...}
  disabled={state.condition.fireMode === "PER_EXCHANGE"}
  title={state.condition.fireMode === "PER_EXCHANGE"
    ? "Per-exchange rules fire exactly once — re-notify does not apply."
    : undefined}
/>
  • Step 4.4.3: Hide forDurationSeconds entirely when PER_EXCHANGE

Wrap the forDuration input's JSX in {state.condition.fireMode !== "PER_EXCHANGE" && ( ... )}.

  • Step 4.4.4: Smoke-verify in the browser

Start the backend in a background shell, run cd ui && npm run dev, open http://localhost:5173/alerts/rules/new.

  • Pick EXCHANGE_MATCH → PER_EXCHANGE: reNotifyMinutes disabled and shows 0, forDuration hidden.

  • Flip to COUNT_IN_WINDOW: both become editable again.

  • Flip back to PER_EXCHANGE: any values entered for threshold/window are cleared (check the preview).

  • Step 4.4.5: Commit

git add ui/src/pages/Alerts/RuleEditor/condition-forms/ExchangeMatchForm.tsx
git commit -m "ui(alerts): ExchangeMatchForm — enforce PER_EXCHANGE constraints

Linger field removed. reNotifyMinutes disabled at 0 with tooltip when
PER_EXCHANGE. forDurationSeconds hidden (COUNT_IN_WINDOW only)."

Task 4.5: ReviewStep blocks save on empty webhooks+targets

Files:

  • Modify: ui/src/pages/Alerts/RuleEditor/ReviewStep.tsx

  • Step 4.5.1: Add a computed saveBlockReason

const saveBlockReason = useMemo<string | null>(() => {
  const noWebhooks = (state.webhooks ?? []).length === 0;
  const noTargets  = (state.targets  ?? []).length === 0;
  if (noWebhooks && noTargets) {
    return "Add at least one webhook or target — a rule with no recipients never notifies anyone.";
  }
  return null;
}, [state.webhooks, state.targets]);

Render a red warning banner when saveBlockReason !== null. Disable the "Save" button when non-null:

{saveBlockReason && <div className="error-banner">{saveBlockReason}</div>}
<button
  onClick={onSave}
  disabled={saveBlockReason !== null || saving}
>Save rule</button>
  • Step 4.5.2: Smoke-test in browser

Same flow as 4.4.4. Create a rule, skip the Notify step, land on Review — expect the banner and disabled Save button. Add a webhook → banner and button-disabled state clear.

Task 4.6: ReviewStep surfaces /render-preview

Files:

  • Modify: ui/src/pages/Alerts/RuleEditor/ReviewStep.tsx

  • Step 4.6.1: Add a "Preview rendered notification" call

Using the existing openapi client (apiClient.POST('/api/v1/environments/{envSlug}/alerts/rules/{id}/render-preview', ...) — check the generated schema for the exact function name), fetch preview on a button click. Display title and message in a read-only pane:

const [preview, setPreview] = useState<{title: string; message: string} | null>(null);

async function handlePreview() {
  // For new rules (no id yet), POST to a stateless preview endpoint if one exists,
  // or skip the button when editing an already-saved rule only. Check AlertRuleController
  // for the exact shape — render-preview today is `POST /{id}/render-preview`, so new-rule
  // preview requires saving first. If that's the case, show the button only in edit mode.
  const resp = await apiClient.POST(
    "/api/v1/environments/{envSlug}/alerts/rules/{id}/render-preview",
    { params: { path: { envSlug, id: ruleId } } }
  );
  setPreview(resp.data);
}

If render-preview is not new-rule-friendly (requires a persisted id), scope this to the edit flow only — show the button only when ruleId !== undefined. Flag in a code comment; wiring it for new rules is a follow-up.

  • Step 4.6.2: Smoke-test

Edit an existing PER_EXCHANGE rule. Click "Preview". Expect title/message pane populated.

  • Step 4.6.3: Commit ReviewStep changes
git add ui/src/pages/Alerts/RuleEditor/ReviewStep.tsx
git commit -m "ui(alerts): ReviewStep — block save on empty targets + render-preview pane"

Phase 5: Full-lifecycle exactly-once (Test 3)

Goal: Pin the end-to-end guarantee — 5 exchanges across 2 ticks → exactly 5 instances + 5 notifications; tick 3 no-op; ack isolation.

Task 5.1: Extend AlertingFullLifecycleIT with Test 3

Files:

  • Modify: cameleer-server-app/src/test/java/com/cameleer/server/app/alerting/AlertingFullLifecycleIT.java

  • Step 5.1.1: Add the exactly-once lifecycle test

@Test
void perExchange_5FailuresAcross2Ticks_exactlyOncePerExchange() {
    UUID ruleId = createPerExchangeRuleWithWebhook();
    var base = Instant.parse("2026-04-22T10:00:00Z");

    // Tick 1 — seed 3, tick
    seedFailedExecution("exec-1", base);
    seedFailedExecution("exec-2", base.plusSeconds(1));
    seedFailedExecution("exec-3", base.plusSeconds(2));
    evaluator.tick();

    // Tick 2 — seed 2 more, tick
    seedFailedExecution("exec-4", base.plusSeconds(3));
    seedFailedExecution("exec-5", base.plusSeconds(4));
    evaluator.tick();

    assertThat(instanceRepo.listByRule(ruleId)).hasSize(5);
    assertThat(notificationRepo.listByRule(ruleId)).hasSize(5);
    assertThat(notificationRepo.listByRule(ruleId).stream()
        .allMatch(n -> n.status() == NotificationStatus.PENDING)).isTrue();

    // Dispatch all, then tick 3 — expect no change
    dispatchAllPending();
    evaluator.tick();
    assertThat(instanceRepo.listByRule(ruleId)).hasSize(5);
    long pending = notificationRepo.listByRule(ruleId).stream()
        .filter(n -> n.status() == NotificationStatus.PENDING).count();
    assertThat(pending).isZero();

    // Ack one — others unchanged
    var first = instanceRepo.listByRule(ruleId).get(0);
    instanceRepo.ack(first.id(), "tester", Instant.now());
    var all = instanceRepo.listByRule(ruleId);
    long ackedCount = all.stream().filter(i -> i.ackedBy() != null).count();
    assertThat(ackedCount).isEqualTo(1);
}
  • Step 5.1.2: Run — expect GREEN (given Phases 13 landed)

Run:

mvn -pl cameleer-server-app verify -Dit.test='AlertingFullLifecycleIT#perExchange_5FailuresAcross2Ticks_exactlyOncePerExchange'

Expected: GREEN. If RED, investigate — the earlier phases' tests should cover the building blocks.

  • Step 5.1.3: Commit
git add cameleer-server-app/src/test/java/com/cameleer/server/app/alerting/AlertingFullLifecycleIT.java
git commit -m "alerting(it): AlertingFullLifecycleIT — exactly-once across ticks, ack isolation"

Phase 6: Final sweep

Task 6.1: Full regression suite

  • Step 6.1.1: Backend full alerting suite

Run:

mvn -pl cameleer-server-app -am -Dit.test='ExchangeMatchEvaluatorTest,AlertEvaluatorJobIT,AlertingFullLifecycleIT,AlertRuleControllerIT' verify

Expected: 0 failures.

  • Step 6.1.2: Full UI test + typecheck + build

Run:

cd ui && npm test && npm run typecheck && npm run build

Expected: all green.

  • Step 6.1.3: Manual verification per spec §Verification

Checklist from the spec:

  1. Create a PER_EXCHANGE / FAILED rule via UI — verify reNotifyMinutes fixed at 0 and disabled; linger field gone; toggling fireMode clears COUNT_IN_WINDOW-specific fields.
  2. Produce failing exchanges → exactly one Inbox instance per exchange; exactly one Slack message per exchange.
  3. Wait 3 evaluation-interval ticks with no new exchanges → no additional notifications (the pre-fix bleed is dead).
  4. Ack one instance → others unaffected.
  5. Produce another failure → only the new one appears.
  6. Save a rule with empty webhooks+targets → blocked at Review step with a reason.

Any deviation is a regression; fix before closing.

Task 6.2: Housekeeping

  • Step 6.2.1: Refresh .claude/rules if relevant class wiring changed

Per CLAUDE.md: if new classes were introduced (e.g. BatchResultApplier from Task 2.2.1), add them to .claude/rules/app-classes.md. If only existing classes were modified, no rule update is needed.

  • Step 6.2.2: Refresh GitNexus

Run:

npx gitnexus analyze --embeddings

(A PostToolUse hook may already handle this after the last commit — if so, skip.)

  • Step 6.2.3: Final commit if any housekeeping landed
git add .claude/rules/app-classes.md
git commit -m "docs(rules): app-classes — BatchResultApplier for transactional batch commit"

Deferred (explicitly parked — see spec §Follow-ups)

  • Sealed condition-type hierarchy (backend).
  • coalesceSeconds primitive on PER_EXCHANGE.
  • Ingestion-time rule matching (streaming path).
  • deployBacklogCap clamp on first-run cursor (only needed if deploying against long-lived rules in a history-rich environment; pre-prod skippable).