docs(03-01): complete agent registry plan

- SUMMARY.md with 2 tasks, 15 files, 30 tests (23 unit + 7 IT)
- STATE.md: Phase 3 position, agent registry decisions
- ROADMAP.md: Phase 3 progress 1/2 plans
- REQUIREMENTS.md: AGNT-01, AGNT-02, AGNT-03 marked complete

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
hsiegeln
2026-03-11 18:42:50 +01:00
parent 0372be2334
commit af0af9ce38
3 changed files with 156 additions and 19 deletions


@@ -27,9 +27,9 @@ Requirements for initial release. Each maps to roadmap phases. Tracked as Gitea
### Agent Management ### Agent Management
- [ ] **AGNT-01**: Agent registers via `POST /api/v1/agents/register` with bootstrap token, receives JWT + server public key (#13) - [x] **AGNT-01**: Agent registers via `POST /api/v1/agents/register` with bootstrap token, receives JWT + server public key (#13)
- [ ] **AGNT-02**: Server maintains agent registry with LIVE/STALE/DEAD lifecycle based on heartbeat timing (#14) - [x] **AGNT-02**: Server maintains agent registry with LIVE/STALE/DEAD lifecycle based on heartbeat timing (#14)
- [ ] **AGNT-03**: Agent sends heartbeat via `POST /api/v1/agents/{id}/heartbeat` every 30s (#15) - [x] **AGNT-03**: Agent sends heartbeat via `POST /api/v1/agents/{id}/heartbeat` every 30s (#15)
- [ ] **AGNT-04**: Server pushes `config-update` events to agents via SSE with Ed25519 signature (#16) - [ ] **AGNT-04**: Server pushes `config-update` events to agents via SSE with Ed25519 signature (#16)
- [ ] **AGNT-05**: Server pushes `deep-trace` commands to agents via SSE for specific correlationIds (#17) - [ ] **AGNT-05**: Server pushes `deep-trace` commands to agents via SSE for specific correlationIds (#17)
- [ ] **AGNT-06**: Server pushes `replay` commands to agents via SSE with signed replay tokens (#18) - [ ] **AGNT-06**: Server pushes `replay` commands to agents via SSE with signed replay tokens (#18)


@@ -2,16 +2,16 @@
gsd_state_version: 1.0 gsd_state_version: 1.0
milestone: v1.0 milestone: v1.0
milestone_name: milestone milestone_name: milestone
status: completed status: in-progress
stopped_at: Phase 3 context gathered stopped_at: Completed 03-01-PLAN.md
last_updated: "2026-03-11T17:10:19.153Z" last_updated: "2026-03-11T17:41:24Z"
last_activity: 2026-03-11 -- Completed 02-04 (Diagram hash linking, Surefire fix, test stability) last_activity: 2026-03-11 -- Completed 03-01 (Agent registry domain + REST endpoints)
progress: progress:
total_phases: 4 total_phases: 4
completed_phases: 2 completed_phases: 2
total_plans: 7 total_plans: 9
completed_plans: 7 completed_plans: 8
percent: 100 percent: 89
--- ---
# Project State # Project State
@@ -21,16 +21,16 @@ progress:
See: .planning/PROJECT.md (updated 2026-03-11) See: .planning/PROJECT.md (updated 2026-03-11)
**Core value:** Users can reliably search and find any transaction across all connected Camel instances -- by any combination of state, time, duration, or content -- even at millions of transactions per day with 30-day retention. **Core value:** Users can reliably search and find any transaction across all connected Camel instances -- by any combination of state, time, duration, or content -- even at millions of transactions per day with 30-day retention.
**Current focus:** Phase 2: Transaction Search + Diagrams **Current focus:** Phase 3: Agent Registry + SSE Push
## Current Position ## Current Position
Phase: 2 of 4 (Transaction Search + Diagrams) -- COMPLETE Phase: 3 of 4 (Agent Registry + SSE Push)
Plan: 4 of 4 in current phase (gap closure) Plan: 1 of 2 in current phase (agent registry domain + REST endpoints)
Status: Phase 02 Complete (including gap closure) Status: Plan 03-01 complete, Plan 03-02 remaining
Last activity: 2026-03-11 -- Completed 02-04 (Diagram hash linking, Surefire fix, test stability) Last activity: 2026-03-11 -- Completed 03-01 (Agent registry domain + REST endpoints)
Progress: [██████████] 100% Progress: [████████░░] 89%
## Performance Metrics ## Performance Metrics
@@ -57,6 +57,7 @@ Progress: [██████████] 100%
| Phase 02 P02 | 14min | 2 tasks | 10 files | | Phase 02 P02 | 14min | 2 tasks | 10 files |
| Phase 02 P03 | 12min | 2 tasks | 9 files | | Phase 02 P03 | 12min | 2 tasks | 9 files |
| Phase 02 P04 | 22min | 1 tasks | 5 files | | Phase 02 P04 | 22min | 1 tasks | 5 files |
| Phase 03 P01 | 15min | 2 tasks | 15 files |
## Accumulated Context ## Accumulated Context
@@ -92,6 +93,9 @@ Recent decisions affecting current work:
- [Phase 02]: DiagramRepository injected via constructor into ClickHouseExecutionRepository for diagram hash lookup during batch insert - [Phase 02]: DiagramRepository injected via constructor into ClickHouseExecutionRepository for diagram hash lookup during batch insert
- [Phase 02]: Awaitility ignoreExceptions pattern adopted for all ClickHouse polling assertions - [Phase 02]: Awaitility ignoreExceptions pattern adopted for all ClickHouse polling assertions
- [Phase 02]: Surefire and Failsafe both need reuseForks=false for ELK classloader isolation - [Phase 02]: Surefire and Failsafe both need reuseForks=false for ELK classloader isolation
- [Phase 03]: AgentInfo as Java record with wither-style methods for immutable ConcurrentHashMap swapping
- [Phase 03]: Dead threshold measured from staleTransitionTime, not lastHeartbeat
- [Phase 03]: spring.mvc.async.request-timeout=-1 set proactively for SSE support in Plan 02
### Pending Todos ### Pending Todos
@@ -106,6 +110,6 @@ None yet.
## Session Continuity ## Session Continuity
Last session: 2026-03-11T17:10:19.149Z Last session: 2026-03-11T17:41:24Z
Stopped at: Phase 3 context gathered Stopped at: Completed 03-01-PLAN.md
Resume file: .planning/phases/03-agent-registry-sse-push/03-CONTEXT.md Resume file: .planning/phases/03-agent-registry-sse-push/03-02-PLAN.md


@@ -0,0 +1,133 @@
---
phase: 03-agent-registry-sse-push
plan: 01
subsystem: agent-registry
tags: [concurrenthashmap, lifecycle, heartbeat, rest-api, spring-scheduled]
# Dependency graph
requires:
- phase: 01-ingestion-pipeline
provides: IngestionBeanConfig pattern, @Scheduled pattern, ProtocolVersionInterceptor
provides:
- AgentRegistryService with register/heartbeat/lifecycle/command management
- AgentInfo record with wither-style immutable state transitions
- AgentCommand record with delivery status tracking
- AgentEventListener interface for SSE bridge (Plan 02)
- POST /api/v1/agents/register endpoint
- POST /api/v1/agents/{id}/heartbeat endpoint
- GET /api/v1/agents endpoint with ?status= filter
- AgentLifecycleMonitor with LIVE->STALE->DEAD transitions
- AgentRegistryConfig with all timing properties
affects: [03-02-sse-push, 04-security]
# Tech tracking
tech-stack:
added: []
patterns: [immutable-record-with-wither, compute-if-present-atomic-swap, agent-lifecycle-state-machine]
key-files:
created:
- cameleer3-server-core/src/main/java/com/cameleer3/server/core/agent/AgentInfo.java
- cameleer3-server-core/src/main/java/com/cameleer3/server/core/agent/AgentState.java
- cameleer3-server-core/src/main/java/com/cameleer3/server/core/agent/AgentCommand.java
- cameleer3-server-core/src/main/java/com/cameleer3/server/core/agent/CommandStatus.java
- cameleer3-server-core/src/main/java/com/cameleer3/server/core/agent/CommandType.java
- cameleer3-server-core/src/main/java/com/cameleer3/server/core/agent/AgentRegistryService.java
- cameleer3-server-core/src/main/java/com/cameleer3/server/core/agent/AgentEventListener.java
- cameleer3-server-core/src/test/java/com/cameleer3/server/core/agent/AgentRegistryServiceTest.java
- cameleer3-server-app/src/main/java/com/cameleer3/server/app/config/AgentRegistryConfig.java
- cameleer3-server-app/src/main/java/com/cameleer3/server/app/config/AgentRegistryBeanConfig.java
- cameleer3-server-app/src/main/java/com/cameleer3/server/app/agent/AgentLifecycleMonitor.java
- cameleer3-server-app/src/main/java/com/cameleer3/server/app/controller/AgentRegistrationController.java
- cameleer3-server-app/src/test/java/com/cameleer3/server/app/controller/AgentRegistrationControllerIT.java
modified:
- cameleer3-server-app/src/main/java/com/cameleer3/server/app/Cameleer3ServerApplication.java
- cameleer3-server-app/src/main/resources/application.yml
key-decisions:
- "AgentInfo as Java record with wither-style methods for immutable ConcurrentHashMap swapping"
- "Dead threshold measured from staleTransitionTime, not lastHeartbeat (matches requirement precisely)"
- "spring.mvc.async.request-timeout=-1 set now for SSE support in Plan 02"
patterns-established:
- "Immutable record + ConcurrentHashMap.compute for thread-safe state transitions"
- "AgentEventListener interface in core module as bridge to SSE layer in app module"
requirements-completed: [AGNT-01, AGNT-02, AGNT-03]
# Metrics
duration: 15min
completed: 2026-03-11
---
# Phase 3 Plan 1: Agent Registry Summary
**In-memory agent registry with ConcurrentHashMap, LIVE/STALE/DEAD lifecycle via @Scheduled, and REST endpoints for registration/heartbeat/listing**
## Performance
- **Duration:** 15 min
- **Started:** 2026-03-11T17:26:34Z
- **Completed:** 2026-03-11T17:41:24Z
- **Tasks:** 2
- **Files modified:** 15
## Accomplishments
- Agent registry domain model with 5 types (AgentInfo, AgentState, AgentCommand, CommandStatus, CommandType)
- Full lifecycle management: register, heartbeat, LIVE->STALE->DEAD transitions with configurable thresholds
- Command queue with PENDING/DELIVERED/ACKNOWLEDGED/EXPIRED status tracking and event listener bridge
- REST endpoints: POST /api/v1/agents/register, POST /api/v1/agents/{id}/heartbeat, GET /api/v1/agents with ?status= filter
- 23 unit tests + 7 integration tests all passing
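
The LIVE->STALE->DEAD transitions listed above can be sketched as a pure function over an immutable agent value. This is a minimal illustration under stated assumptions, not the code in `AgentLifecycleMonitor`: the `LifecycleSketch` class, its `Agent` record, the `check` method, and the 90-second stale threshold are hypothetical. Only the 5-minute dead threshold, and the fact that it is measured from the stale transition time, come from this summary.

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical sketch: names and the stale threshold are assumptions, not the project's code.
class LifecycleSketch {
    enum State { LIVE, STALE, DEAD }

    // Minimal stand-in for AgentInfo with only the fields the lifecycle check needs.
    record Agent(State state, Instant lastHeartbeat, Instant staleTransitionTime) {}

    static final Duration STALE_AFTER = Duration.ofSeconds(90); // assumed value
    static final Duration DEAD_AFTER = Duration.ofMinutes(5);   // "5 minutes after going STALE"

    // Returns the agent's state as of `now`; the input record is never mutated.
    static Agent check(Agent a, Instant now) {
        switch (a.state()) {
            case LIVE:
                if (Duration.between(a.lastHeartbeat(), now).compareTo(STALE_AFTER) > 0) {
                    // Record when the agent went STALE so the dead timer starts here.
                    return new Agent(State.STALE, a.lastHeartbeat(), now);
                }
                return a;
            case STALE:
                // Measured from the stale transition, not from the last heartbeat.
                if (Duration.between(a.staleTransitionTime(), now).compareTo(DEAD_AFTER) > 0) {
                    return new Agent(State.DEAD, a.lastHeartbeat(), a.staleTransitionTime());
                }
                return a;
            default:
                return a; // DEAD is treated as terminal in this sketch
        }
    }
}
```

With these assumed thresholds, an agent that went STALE 200 seconds ago stays STALE even if its last heartbeat is more than 5 minutes old, because the dead timer runs from the stale transition.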
## Task Commits
Each task was committed atomically:
1. **Task 1 (RED): Failing tests for agent registry** - `4cd7ed9` (test)
2. **Task 1 (GREEN): Implement agent registry service** - `61f3902` (feat)
3. **Task 2: Controllers, config, lifecycle monitor, integration tests** - `0372be2` (feat)
_Note: Task 1 used TDD with separate RED/GREEN commits_
## Files Created/Modified
- `AgentInfo.java` - Immutable record with wither-style methods for atomic state transitions
- `AgentState.java` - LIVE, STALE, DEAD lifecycle enum
- `AgentCommand.java` - Command record with delivery status tracking
- `CommandStatus.java` - PENDING, DELIVERED, ACKNOWLEDGED, EXPIRED enum
- `CommandType.java` - CONFIG_UPDATE, DEEP_TRACE, REPLAY enum
- `AgentRegistryService.java` - Core registry: register, heartbeat, lifecycle, commands
- `AgentEventListener.java` - Interface for SSE bridge (Plan 02 integration point)
- `AgentRegistryConfig.java` - @ConfigurationProperties for all timing settings
- `AgentRegistryBeanConfig.java` - @Configuration wiring AgentRegistryService
- `AgentLifecycleMonitor.java` - @Scheduled lifecycle check and command expiry
- `AgentRegistrationController.java` - REST endpoints for agent registration, heartbeat, and listing
- `AgentRegistryServiceTest.java` - 23 unit tests
- `AgentRegistrationControllerIT.java` - 7 integration tests
- `Cameleer3ServerApplication.java` - Added AgentRegistryConfig to @EnableConfigurationProperties
- `application.yml` - Added agent-registry config section and spring.mvc.async.request-timeout
## Decisions Made
- Used Java record with wither-style methods for AgentInfo instead of mutable class -- ConcurrentHashMap.compute provides atomic swapping without needing synchronized fields
- Dead threshold measured from staleTransitionTime field (not lastHeartbeat) to match the "5 minutes after going STALE" requirement precisely
- Set spring.mvc.async.request-timeout=-1 proactively for SSE support needed in Plan 02
- Command queue uses ConcurrentLinkedQueue per agent for lock-free command management
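
The first decision above (immutable record plus atomic map swap) can be illustrated roughly as follows. This is a sketch of the general pattern, not the actual `AgentInfo` or `AgentRegistryService`: the `RegistrySketch` class, the record's fields, and the method names are hypothetical, and reviving an agent to LIVE on heartbeat is an assumption the summary does not state.

```java
import java.time.Instant;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch of the pattern; field and method names are assumptions.
class RegistrySketch {
    enum State { LIVE, STALE, DEAD }

    // Immutable record; each wither-style method returns a modified copy.
    record Agent(String id, State state, Instant lastHeartbeat) {
        Agent withState(State s) { return new Agent(id, s, lastHeartbeat); }
        Agent withLastHeartbeat(Instant t) { return new Agent(id, state, t); }
    }

    private final ConcurrentHashMap<String, Agent> agents = new ConcurrentHashMap<>();

    void register(String id, Instant now) {
        agents.put(id, new Agent(id, State.LIVE, now));
    }

    // computeIfPresent runs the remapping function atomically per key, so a
    // concurrent reader sees either the old record or the new one, never a mix.
    // Returns false when the agent id is unknown.
    boolean heartbeat(String id, Instant now) {
        Agent updated = agents.computeIfPresent(id,
                (key, current) -> current.withLastHeartbeat(now).withState(State.LIVE)); // revival is assumed
        return updated != null;
    }

    Agent get(String id) { return agents.get(id); }
}
```

Because the stored value is never mutated in place, no field-level synchronization is needed; the map entry is the only shared mutable state.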
## Deviations from Plan
None - plan executed exactly as written.
## Issues Encountered
- DiagramRenderControllerIT has a pre-existing flaky failure (EmptyResultDataAccess in seedDiagram) unrelated to Phase 3 changes. Logged in deferred-items.md.
## User Setup Required
None - no external service configuration required.
## Next Phase Readiness
- AgentRegistryService ready for SSE integration via AgentEventListener interface
- Plan 02 (SSE Push) can wire SseConnectionManager as AgentEventListener implementation
- All agent endpoints under /api/v1/agents/ already covered by ProtocolVersionInterceptor
---
*Phase: 03-agent-registry-sse-push*
*Completed: 2026-03-11*