From af0af9ce38038b33865f96c194cf015cb1f3c1cb Mon Sep 17 00:00:00 2001
From: hsiegeln <37154749+hsiegeln@users.noreply.github.com>
Date: Wed, 11 Mar 2026 18:42:50 +0100
Subject: [PATCH] docs(03-01): complete agent registry plan

- SUMMARY.md with 2 tasks, 15 files, 30 tests (23 unit + 7 IT)
- STATE.md: Phase 3 position, agent registry decisions
- ROADMAP.md: Phase 3 progress 1/2 plans
- REQUIREMENTS.md: AGNT-01, AGNT-02, AGNT-03 marked complete

Co-Authored-By: Claude Opus 4.6
---
 .planning/REQUIREMENTS.md                     |   6 +-
 .planning/STATE.md                            |  36 ++---
 .../03-01-SUMMARY.md                          | 133 ++++++++++++++++++
 3 files changed, 156 insertions(+), 19 deletions(-)
 create mode 100644 .planning/phases/03-agent-registry-sse-push/03-01-SUMMARY.md

diff --git a/.planning/REQUIREMENTS.md b/.planning/REQUIREMENTS.md
index 18f7775f..cd410847 100644
--- a/.planning/REQUIREMENTS.md
+++ b/.planning/REQUIREMENTS.md
@@ -27,9 +27,9 @@ Requirements for initial release. Each maps to roadmap phases. Tracked as Gitea
 
 ### Agent Management
 
-- [ ] **AGNT-01**: Agent registers via `POST /api/v1/agents/register` with bootstrap token, receives JWT + server public key (#13)
-- [ ] **AGNT-02**: Server maintains agent registry with LIVE/STALE/DEAD lifecycle based on heartbeat timing (#14)
-- [ ] **AGNT-03**: Agent sends heartbeat via `POST /api/v1/agents/{id}/heartbeat` every 30s (#15)
+- [x] **AGNT-01**: Agent registers via `POST /api/v1/agents/register` with bootstrap token, receives JWT + server public key (#13)
+- [x] **AGNT-02**: Server maintains agent registry with LIVE/STALE/DEAD lifecycle based on heartbeat timing (#14)
+- [x] **AGNT-03**: Agent sends heartbeat via `POST /api/v1/agents/{id}/heartbeat` every 30s (#15)
 - [ ] **AGNT-04**: Server pushes `config-update` events to agents via SSE with Ed25519 signature (#16)
 - [ ] **AGNT-05**: Server pushes `deep-trace` commands to agents via SSE for specific correlationIds (#17)
 - [ ] **AGNT-06**: Server pushes `replay` commands to agents via SSE with signed replay tokens (#18)
diff --git a/.planning/STATE.md b/.planning/STATE.md
index 626fa1df..c23f7ad5 100644
--- a/.planning/STATE.md
+++ b/.planning/STATE.md
@@ -2,16 +2,16 @@
 gsd_state_version: 1.0
 milestone: v1.0
 milestone_name: milestone
-status: completed
-stopped_at: Phase 3 context gathered
-last_updated: "2026-03-11T17:10:19.153Z"
-last_activity: 2026-03-11 -- Completed 02-04 (Diagram hash linking, Surefire fix, test stability)
+status: in-progress
+stopped_at: Completed 03-01-PLAN.md
+last_updated: "2026-03-11T17:41:24Z"
+last_activity: 2026-03-11 -- Completed 03-01 (Agent registry domain + REST endpoints)
 progress:
   total_phases: 4
   completed_phases: 2
-  total_plans: 7
-  completed_plans: 7
-  percent: 100
+  total_plans: 9
+  completed_plans: 8
+  percent: 89
 ---
 
 # Project State
@@ -21,16 +21,16 @@ progress:
 See: .planning/PROJECT.md (updated 2026-03-11)
 
 **Core value:** Users can reliably search and find any transaction across all connected Camel instances -- by any combination of state, time, duration, or content -- even at millions of transactions per day with 30-day retention.
-**Current focus:** Phase 2: Transaction Search + Diagrams
+**Current focus:** Phase 3: Agent Registry + SSE Push
 
 ## Current Position
 
-Phase: 2 of 4 (Transaction Search + Diagrams) -- COMPLETE
-Plan: 4 of 4 in current phase (gap closure)
-Status: Phase 02 Complete (including gap closure)
-Last activity: 2026-03-11 -- Completed 02-04 (Diagram hash linking, Surefire fix, test stability)
+Phase: 3 of 4 (Agent Registry + SSE Push)
+Plan: 1 of 2 in current phase (agent registry domain + REST endpoints)
+Status: Plan 03-01 complete, Plan 03-02 remaining
+Last activity: 2026-03-11 -- Completed 03-01 (Agent registry domain + REST endpoints)
 
-Progress: [██████████] 100%
+Progress: [████████░░] 89%
 
 ## Performance Metrics
 
@@ -57,6 +57,7 @@ Progress: [██████████] 100%
 | Phase 02 P02 | 14min | 2 tasks | 10 files |
 | Phase 02 P03 | 12min | 2 tasks | 9 files |
 | Phase 02 P04 | 22min | 1 tasks | 5 files |
+| Phase 03 P01 | 15min | 2 tasks | 15 files |
 
 ## Accumulated Context
 
@@ -92,6 +93,9 @@ Recent decisions affecting current work:
 - [Phase 02]: DiagramRepository injected via constructor into ClickHouseExecutionRepository for diagram hash lookup during batch insert
 - [Phase 02]: Awaitility ignoreExceptions pattern adopted for all ClickHouse polling assertions
 - [Phase 02]: Surefire and Failsafe both need reuseForks=false for ELK classloader isolation
+- [Phase 03]: AgentInfo as Java record with wither-style methods for immutable ConcurrentHashMap swapping
+- [Phase 03]: Dead threshold measured from staleTransitionTime, not lastHeartbeat
+- [Phase 03]: spring.mvc.async.request-timeout=-1 set proactively for SSE support in Plan 02
 
 ### Pending Todos
 
@@ -106,6 +110,6 @@ None yet.
 
 ## Session Continuity
 
-Last session: 2026-03-11T17:10:19.149Z
-Stopped at: Phase 3 context gathered
-Resume file: .planning/phases/03-agent-registry-sse-push/03-CONTEXT.md
+Last session: 2026-03-11T17:41:24Z
+Stopped at: Completed 03-01-PLAN.md
+Resume file: .planning/phases/03-agent-registry-sse-push/03-02-PLAN.md
diff --git a/.planning/phases/03-agent-registry-sse-push/03-01-SUMMARY.md b/.planning/phases/03-agent-registry-sse-push/03-01-SUMMARY.md
new file mode 100644
index 00000000..46de11c7
--- /dev/null
+++ b/.planning/phases/03-agent-registry-sse-push/03-01-SUMMARY.md
@@ -0,0 +1,133 @@
+---
+phase: 03-agent-registry-sse-push
+plan: 01
+subsystem: agent-registry
+tags: [concurrenthashmap, lifecycle, heartbeat, rest-api, spring-scheduled]
+
+# Dependency graph
+requires:
+  - phase: 01-ingestion-pipeline
+    provides: IngestionBeanConfig pattern, @Scheduled pattern, ProtocolVersionInterceptor
+provides:
+  - AgentRegistryService with register/heartbeat/lifecycle/command management
+  - AgentInfo record with wither-style immutable state transitions
+  - AgentCommand record with delivery status tracking
+  - AgentEventListener interface for SSE bridge (Plan 02)
+  - POST /api/v1/agents/register endpoint
+  - POST /api/v1/agents/{id}/heartbeat endpoint
+  - GET /api/v1/agents endpoint with ?status= filter
+  - AgentLifecycleMonitor with LIVE->STALE->DEAD transitions
+  - AgentRegistryConfig with all timing properties
+affects: [03-02-sse-push, 04-security]
+
+# Tech tracking
+tech-stack:
+  added: []
+  patterns: [immutable-record-with-wither, compute-if-present-atomic-swap, agent-lifecycle-state-machine]
+
+key-files:
+  created:
+    - cameleer3-server-core/src/main/java/com/cameleer3/server/core/agent/AgentInfo.java
+    - cameleer3-server-core/src/main/java/com/cameleer3/server/core/agent/AgentState.java
+    - cameleer3-server-core/src/main/java/com/cameleer3/server/core/agent/AgentCommand.java
+    - cameleer3-server-core/src/main/java/com/cameleer3/server/core/agent/CommandStatus.java
+    - cameleer3-server-core/src/main/java/com/cameleer3/server/core/agent/CommandType.java
+    - cameleer3-server-core/src/main/java/com/cameleer3/server/core/agent/AgentRegistryService.java
+    - cameleer3-server-core/src/main/java/com/cameleer3/server/core/agent/AgentEventListener.java
+    - cameleer3-server-core/src/test/java/com/cameleer3/server/core/agent/AgentRegistryServiceTest.java
+    - cameleer3-server-app/src/main/java/com/cameleer3/server/app/config/AgentRegistryConfig.java
+    - cameleer3-server-app/src/main/java/com/cameleer3/server/app/config/AgentRegistryBeanConfig.java
+    - cameleer3-server-app/src/main/java/com/cameleer3/server/app/agent/AgentLifecycleMonitor.java
+    - cameleer3-server-app/src/main/java/com/cameleer3/server/app/controller/AgentRegistrationController.java
+    - cameleer3-server-app/src/test/java/com/cameleer3/server/app/controller/AgentRegistrationControllerIT.java
+  modified:
+    - cameleer3-server-app/src/main/java/com/cameleer3/server/app/Cameleer3ServerApplication.java
+    - cameleer3-server-app/src/main/resources/application.yml
+
+key-decisions:
+  - "AgentInfo as Java record with wither-style methods for immutable ConcurrentHashMap swapping"
+  - "Dead threshold measured from staleTransitionTime, not lastHeartbeat (matches requirement precisely)"
+  - "spring.mvc.async.request-timeout=-1 set now for SSE support in Plan 02"
+
+patterns-established:
+  - "Immutable record + ConcurrentHashMap.compute for thread-safe state transitions"
+  - "AgentEventListener interface in core module as bridge to SSE layer in app module"
+
+requirements-completed: [AGNT-01, AGNT-02, AGNT-03]
+
+# Metrics
+duration: 15min
+completed: 2026-03-11
+---
+
+# Phase 3 Plan 1: Agent Registry Summary
+
+**In-memory agent registry with ConcurrentHashMap, LIVE/STALE/DEAD lifecycle via @Scheduled, and REST endpoints for registration/heartbeat/listing**
+
+## Performance
+
+- **Duration:** 15 min
+- **Started:** 2026-03-11T17:26:34Z
+- **Completed:** 2026-03-11T17:41:24Z
+- **Tasks:** 2
+- **Files modified:** 15
+
+## Accomplishments
+- Agent registry domain model with 5 types (AgentInfo, AgentState, AgentCommand, CommandStatus, CommandType)
+- Full lifecycle management: register, heartbeat, LIVE->STALE->DEAD transitions with configurable thresholds
+- Command queue with PENDING/DELIVERED/ACKNOWLEDGED/EXPIRED status tracking and event listener bridge
+- REST endpoints: POST /register, POST /{id}/heartbeat, GET /agents with ?status= filter
+- 23 unit tests + 7 integration tests all passing
+
+## Task Commits
+
+Each task was committed atomically:
+
+1. **Task 1 (RED): Failing tests for agent registry** - `4cd7ed9` (test)
+2. **Task 1 (GREEN): Implement agent registry service** - `61f3902` (feat)
+3. **Task 2: Controllers, config, lifecycle monitor, integration tests** - `0372be2` (feat)
+
+_Note: Task 1 used TDD with separate RED/GREEN commits_
+
+## Files Created/Modified
+- `AgentInfo.java` - Immutable record with wither-style methods for atomic state transitions
+- `AgentState.java` - LIVE, STALE, DEAD lifecycle enum
+- `AgentCommand.java` - Command record with delivery status tracking
+- `CommandStatus.java` - PENDING, DELIVERED, ACKNOWLEDGED, EXPIRED enum
+- `CommandType.java` - CONFIG_UPDATE, DEEP_TRACE, REPLAY enum
+- `AgentRegistryService.java` - Core registry: register, heartbeat, lifecycle, commands
+- `AgentEventListener.java` - Interface for SSE bridge (Plan 02 integration point)
+- `AgentRegistryConfig.java` - @ConfigurationProperties for all timing settings
+- `AgentRegistryBeanConfig.java` - @Configuration wiring AgentRegistryService
+- `AgentLifecycleMonitor.java` - @Scheduled lifecycle check and command expiry
+- `AgentRegistrationController.java` - REST endpoints for agents
+- `AgentRegistryServiceTest.java` - 23 unit tests
+- `AgentRegistrationControllerIT.java` - 7 integration tests
+- `Cameleer3ServerApplication.java` - Added AgentRegistryConfig to @EnableConfigurationProperties
+- `application.yml` - Added agent-registry config section and spring.mvc.async.request-timeout
+
+## Decisions Made
+- Used Java record with wither-style methods for AgentInfo instead of mutable class -- ConcurrentHashMap.compute provides atomic swapping without needing synchronized fields
+- Dead threshold measured from staleTransitionTime field (not lastHeartbeat) to match the "5 minutes after going STALE" requirement precisely
+- Set spring.mvc.async.request-timeout=-1 proactively for SSE support needed in Plan 02
+- Command queue uses ConcurrentLinkedQueue per agent for lock-free command management
+
+## Deviations from Plan
+
+None - plan executed exactly as written.
+
+## Issues Encountered
+- DiagramRenderControllerIT has a pre-existing flaky failure (EmptyResultDataAccess in seedDiagram) unrelated to Phase 3 changes. Logged in deferred-items.md.
+
+## User Setup Required
+
+None - no external service configuration required.
+
+## Next Phase Readiness
+- AgentRegistryService ready for SSE integration via AgentEventListener interface
+- Plan 02 (SSE Push) can wire SseConnectionManager as AgentEventListener implementation
+- All agent endpoints under /api/v1/agents/ already covered by ProtocolVersionInterceptor
+
+---
+*Phase: 03-agent-registry-sse-push*
+*Completed: 2026-03-11*
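
Reviewer note (not part of the patch): the summary names an "immutable record + ConcurrentHashMap.compute" pattern and a DEAD threshold measured from `staleTransitionTime` rather than `lastHeartbeat`. The sketch below illustrates that idea under stated assumptions; it is not the code from this commit. The class name `AgentLifecycleSketch`, the record's field list, the method names, the 90-second stale threshold, and the choice that a heartbeat returns an agent to LIVE are all hypothetical. Only the LIVE/STALE/DEAD states and the "5 minutes after going STALE" rule come from the summary.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch; the real AgentRegistryService is not shown in this patch.
final class AgentLifecycleSketch {

    enum AgentState { LIVE, STALE, DEAD }

    // Assumed field list; the real AgentInfo record likely carries more data.
    record AgentInfo(String id, AgentState state, Instant lastHeartbeat, Instant staleTransitionTime) {
        // Wither-style methods return a new record instead of mutating this one.
        AgentInfo withHeartbeat(Instant now) {
            // Assumption: any heartbeat puts the agent back into LIVE.
            return new AgentInfo(id, AgentState.LIVE, now, null);
        }

        AgentInfo withState(AgentState next, Instant now) {
            Instant staleAt = (next == AgentState.STALE) ? now : staleTransitionTime;
            return new AgentInfo(id, next, lastHeartbeat, staleAt);
        }
    }

    private final ConcurrentHashMap<String, AgentInfo> agents = new ConcurrentHashMap<>();
    private final Duration staleAfter; // silence since last heartbeat before LIVE -> STALE
    private final Duration deadAfter;  // time spent in STALE before STALE -> DEAD

    AgentLifecycleSketch(Duration staleAfter, Duration deadAfter) {
        this.staleAfter = staleAfter;
        this.deadAfter = deadAfter;
    }

    AgentInfo register(String id, Instant now) {
        // compute() replaces the whole record atomically for this key.
        return agents.compute(id, (key, old) -> new AgentInfo(key, AgentState.LIVE, now, null));
    }

    boolean heartbeat(String id, Instant now) {
        // Returns false for unknown agents; computeIfPresent never creates an entry.
        return agents.computeIfPresent(id, (key, old) -> old.withHeartbeat(now)) != null;
    }

    // Intended to be called periodically, for example from a scheduled task.
    void checkLifecycle(Instant now) {
        for (String id : agents.keySet()) {
            agents.computeIfPresent(id, (key, old) -> transition(old, now));
        }
    }

    private AgentInfo transition(AgentInfo agent, Instant now) {
        if (agent.state() == AgentState.LIVE
                && Duration.between(agent.lastHeartbeat(), now).compareTo(staleAfter) > 0) {
            return agent.withState(AgentState.STALE, now);
        }
        // The DEAD clock starts at the STALE transition, not at the last heartbeat.
        if (agent.state() == AgentState.STALE
                && Duration.between(agent.staleTransitionTime(), now).compareTo(deadAfter) > 0) {
            return agent.withState(AgentState.DEAD, now);
        }
        return agent;
    }

    AgentState stateOf(String id) {
        AgentInfo agent = agents.get(id);
        return agent == null ? null : agent.state();
    }

    public static void main(String[] args) {
        AgentLifecycleSketch registry =
                new AgentLifecycleSketch(Duration.ofSeconds(90), Duration.ofMinutes(5));
        Instant start = Instant.now();
        registry.register("agent-1", start);
        registry.checkLifecycle(start.plusSeconds(120));
        System.out.println("agent-1 after 120s of silence: " + registry.stateOf("agent-1"));
    }
}
```

Because each map value is an immutable record, a reader calling `agents.get(id)` always sees a fully consistent snapshot, and the per-key atomicity of `compute` and `computeIfPresent` means a heartbeat and a lifecycle check on the same agent cannot interleave halfway through an update.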