| phase | plan | type | wave | depends_on | files_modified | autonomous | requirements | must_haves |
|---|---|---|---|---|---|---|---|---|
| 03-agent-registry-sse-push | 01 | execute | 1 | | | true | | |
Purpose: Agents need to register with the server and send periodic heartbeats, and the server must track their LIVE/STALE/DEAD states. This is the foundation that the SSE push layer (Plan 02) builds on.
Output: Core domain types (AgentInfo, AgentState, AgentCommand, CommandStatus, CommandType), AgentRegistryService in the core module, registration/heartbeat/list controllers in the app module, a lifecycle monitor, and unit + integration tests.
<execution_context> @C:/Users/Hendrik/.claude/get-shit-done/workflows/execute-plan.md @C:/Users/Hendrik/.claude/get-shit-done/templates/summary.md </execution_context>
@.planning/PROJECT.md @.planning/ROADMAP.md @.planning/STATE.md @.planning/phases/03-agent-registry-sse-push/03-CONTEXT.md @.planning/phases/03-agent-registry-sse-push/03-RESEARCH.md @cameleer3-server-core/src/main/java/com/cameleer3/server/core/ingestion/IngestionService.java @cameleer3-server-app/src/main/java/com/cameleer3/server/app/config/IngestionBeanConfig.java @cameleer3-server-app/src/main/java/com/cameleer3/server/app/config/IngestionConfig.java @cameleer3-server-app/src/main/java/com/cameleer3/server/app/ingestion/ClickHouseFlushScheduler.java @cameleer3-server-app/src/main/java/com/cameleer3/server/app/config/WebConfig.java @cameleer3-server-app/src/main/java/com/cameleer3/server/app/Cameleer3ServerApplication.java @cameleer3-server-app/src/main/resources/application.yml @cameleer3-server-app/src/test/java/com/cameleer3/server/app/AbstractClickHouseIT.java
Pattern: Core module plain class, app module bean config:
- IngestionService is a plain Java class (no Spring annotations) in core module
- IngestionBeanConfig is @Configuration in app module that creates the bean
- IngestionConfig is @ConfigurationProperties in app module for YAML binding
Pattern: Controller accepts raw String body:
- Controllers use @RequestBody String body, parse with ObjectMapper
- Return ResponseEntity with serialized JSON string
Pattern: @Scheduled for periodic tasks:
- ClickHouseFlushScheduler uses @Scheduled(fixedDelayString = "${ingestion.flush-interval-ms:1000}")
- @EnableScheduling already on Cameleer3ServerApplication
Pattern: @EnableConfigurationProperties registration:
- Cameleer3ServerApplication has @EnableConfigurationProperties(IngestionConfig.class)
- New config classes must be added to this annotation
Pattern: ProtocolVersionInterceptor:
- WebConfig registers the interceptor for `"/api/v1/data/**"`, `"/api/v1/agents/**"`
- Agent endpoints already covered -- agents must send X-Cameleer-Protocol-Version:1 header
1. **AgentState enum**: LIVE, STALE, DEAD
2. **CommandType enum**: CONFIG_UPDATE, DEEP_TRACE, REPLAY
3. **CommandStatus enum**: PENDING, DELIVERED, ACKNOWLEDGED, EXPIRED
4. **AgentInfo**: Immutable record (state transitions produce updated copies rather than mutating in place) with fields:
- id (String), name (String), group (String), version (String)
- routeIds (List<String>), capabilities (Map<String, Object>)
- state (AgentState), registeredAt (Instant), lastHeartbeat (Instant)
- staleTransitionTime (Instant, nullable -- set when transitioning to STALE)
- Thread safety: ConcurrentHashMap only protects the map, not the values. Store AgentInfo records in the ConcurrentHashMap and use computeIfPresent to atomically swap in updated copies built with wither-style methods (withState, withLastHeartbeat, etc.). If AgentInfo is kept as a mutable class instead, use synchronized methods or volatile fields.
5. **AgentCommand**: Record with fields: id (String, UUID), type (CommandType), payload (String -- raw JSON), targetAgentId (String), createdAt (Instant), status (CommandStatus). Provide withStatus method.
6. **AgentEventListener**: Interface with a single method `onCommandReady(String agentId, AgentCommand command)`, which lets the SSE layer (Plan 02) be notified when a command is added. The core module defines the interface; the app module implements it.
7. **AgentRegistryService**: Plain class (no Spring annotations), constructor takes staleThresholdMs (long), deadThresholdMs (long), commandExpiryMs (long). Uses ConcurrentHashMap<String, AgentInfo> for agents and ConcurrentHashMap<String, List<AgentCommand>> (or ConcurrentHashMap<String, ConcurrentLinkedQueue<AgentCommand>>) for pending commands per agent.
Methods:
- `register(String id, String name, String group, String version, List<String> routeIds, Map<String, Object> capabilities)` -> AgentInfo
- `heartbeat(String id)` -> boolean
- `transitionState(String id, AgentState newState)` -> void (used by lifecycle monitor)
- `checkLifecycle()` -> void (iterates all agents, applies LIVE->STALE and STALE->DEAD based on thresholds)
- `findById(String id)` -> AgentInfo (nullable)
- `findAll()` -> List<AgentInfo>
- `findByState(AgentState state)` -> List<AgentInfo>
- `addCommand(String agentId, CommandType type, String payload)` -> AgentCommand (creates with PENDING, calls eventListener.onCommandReady if set)
- `acknowledgeCommand(String agentId, String commandId)` -> boolean
- `findPendingCommands(String agentId)` -> List<AgentCommand>
- `markDelivered(String agentId, String commandId)` -> void
- `expireOldCommands()` -> void (sweep commands older than commandExpiryMs)
- `setEventListener(AgentEventListener listener)` -> void (optional, for SSE integration)
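The record-plus-computeIfPresent approach and the two lifecycle thresholds can be sketched as follows. This is a minimal illustration, not the planned implementation. `AgentRegistrySketch` is a hypothetical name, the record carries only a subset of the fields, and every method takes an explicit `Instant now` (an assumption made here so the timing logic can be exercised deterministically). Reviving an agent from any state on heartbeat is also an assumption; the plan only states that a heartbeat revives STALE.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

enum AgentState { LIVE, STALE, DEAD }

// Immutable snapshot of one agent (field subset); each wither returns a modified copy.
record AgentInfo(String id, String name, AgentState state,
                 Instant lastHeartbeat, Instant staleTransitionTime) {
    AgentInfo withState(AgentState s) {
        return new AgentInfo(id, name, s, lastHeartbeat, staleTransitionTime);
    }
    AgentInfo withLastHeartbeat(Instant t) {
        return new AgentInfo(id, name, state, t, staleTransitionTime);
    }
    AgentInfo withStaleTransitionTime(Instant t) {
        return new AgentInfo(id, name, state, lastHeartbeat, t);
    }
}

// Hypothetical, trimmed-down registry; "now" is passed in explicitly (assumption).
class AgentRegistrySketch {
    private final long staleThresholdMs;
    private final long deadThresholdMs;
    private final Map<String, AgentInfo> agents = new ConcurrentHashMap<>();

    AgentRegistrySketch(long staleThresholdMs, long deadThresholdMs) {
        this.staleThresholdMs = staleThresholdMs;
        this.deadThresholdMs = deadThresholdMs;
    }

    // Re-registering an existing id overwrites the entry and resets it to LIVE.
    AgentInfo register(String id, String name, Instant now) {
        AgentInfo info = new AgentInfo(id, name, AgentState.LIVE, now, null);
        agents.put(id, info);
        return info;
    }

    // Returns false for unknown agents; otherwise atomically swaps in a revived copy.
    boolean heartbeat(String id, Instant now) {
        return agents.computeIfPresent(id, (k, a) ->
                a.withLastHeartbeat(now).withState(AgentState.LIVE)
                 .withStaleTransitionTime(null)) != null;
    }

    // Applies at most one transition per agent per sweep.
    void checkLifecycle(Instant now) {
        for (String id : agents.keySet()) {
            agents.computeIfPresent(id, (k, a) -> next(a, now));
        }
    }

    private AgentInfo next(AgentInfo a, Instant now) {
        if (a.state() == AgentState.LIVE
                && Duration.between(a.lastHeartbeat(), now).toMillis() > staleThresholdMs) {
            return a.withState(AgentState.STALE).withStaleTransitionTime(now);
        }
        // The dead threshold is measured from the STALE transition, not the last heartbeat.
        if (a.state() == AgentState.STALE
                && Duration.between(a.staleTransitionTime(), now).toMillis() > deadThresholdMs) {
            return a.withState(AgentState.DEAD);
        }
        return a;
    }

    AgentInfo findById(String id) {
        return agents.get(id);
    }
}
```

With thresholds of 90000 ms and 300000 ms, an agent that registers at time 0 and never sends a heartbeat would turn STALE on the first sweep after 90 s and DEAD on the first sweep more than 300 s after that STALE transition.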
Write tests FIRST (RED), then implement (GREEN). Test class: AgentRegistryServiceTest.
mvn test -pl cameleer3-server-core -Dtest=AgentRegistryServiceTest
All unit tests pass: registration (new + re-register), heartbeat (known + unknown), lifecycle transitions (LIVE->STALE->DEAD, heartbeat revives STALE), findAll/findByState/findById, command add/acknowledge/expire. AgentEventListener interface defined.
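The command add/acknowledge/expire behaviour covered by these tests could look roughly like the sketch below. `CommandQueueSketch` is a hypothetical name and the explicit `Instant now` parameters are an assumption for testability. Removing a command from the queue on acknowledgement and on expiry is also an assumption; the plan does not say whether such commands are removed or kept with an updated status.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.List;
import java.util.Map;
import java.util.Queue;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentLinkedQueue;

enum CommandType { CONFIG_UPDATE, DEEP_TRACE, REPLAY }
enum CommandStatus { PENDING, DELIVERED, ACKNOWLEDGED, EXPIRED }

record AgentCommand(String id, CommandType type, String payload,
                    String targetAgentId, Instant createdAt, CommandStatus status) {
    AgentCommand withStatus(CommandStatus newStatus) {
        return new AgentCommand(id, type, payload, targetAgentId, createdAt, newStatus);
    }
}

// Defined in the core module; the SSE layer would implement it.
interface AgentEventListener {
    void onCommandReady(String agentId, AgentCommand command);
}

// Hypothetical command store; "now" is passed in explicitly (assumption).
class CommandQueueSketch {
    private final long commandExpiryMs;
    private final Map<String, Queue<AgentCommand>> commands = new ConcurrentHashMap<>();
    private volatile AgentEventListener eventListener; // optional, may stay null

    CommandQueueSketch(long commandExpiryMs) {
        this.commandExpiryMs = commandExpiryMs;
    }

    void setEventListener(AgentEventListener listener) {
        this.eventListener = listener;
    }

    // Creates a PENDING command and notifies the listener if one is set.
    AgentCommand addCommand(String agentId, CommandType type, String payload, Instant now) {
        AgentCommand cmd = new AgentCommand(UUID.randomUUID().toString(), type, payload,
                agentId, now, CommandStatus.PENDING);
        commands.computeIfAbsent(agentId, k -> new ConcurrentLinkedQueue<>()).add(cmd);
        AgentEventListener listener = eventListener;
        if (listener != null) {
            listener.onCommandReady(agentId, cmd);
        }
        return cmd;
    }

    // Assumption: an acknowledged command is dropped from the queue.
    boolean acknowledgeCommand(String agentId, String commandId) {
        Queue<AgentCommand> queue = commands.get(agentId);
        return queue != null && queue.removeIf(c -> c.id().equals(commandId));
    }

    List<AgentCommand> findPendingCommands(String agentId) {
        Queue<AgentCommand> queue = commands.get(agentId);
        if (queue == null) {
            return List.of();
        }
        return queue.stream().filter(c -> c.status() == CommandStatus.PENDING).toList();
    }

    // Assumption: commands older than commandExpiryMs are removed outright.
    void expireOldCommands(Instant now) {
        for (Queue<AgentCommand> queue : commands.values()) {
            queue.removeIf(c ->
                    Duration.between(c.createdAt(), now).toMillis() > commandExpiryMs);
        }
    }
}
```

Reading the listener into a local variable before the null check avoids a race with a concurrent `setEventListener(null)` call.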
Task 2: Registration/heartbeat/list controllers, config, lifecycle monitor, integration tests
cameleer3-server-app/src/main/java/com/cameleer3/server/app/config/AgentRegistryConfig.java,
cameleer3-server-app/src/main/java/com/cameleer3/server/app/config/AgentRegistryBeanConfig.java,
cameleer3-server-app/src/main/java/com/cameleer3/server/app/agent/AgentLifecycleMonitor.java,
cameleer3-server-app/src/main/java/com/cameleer3/server/app/controller/AgentRegistrationController.java,
cameleer3-server-app/src/main/java/com/cameleer3/server/app/Cameleer3ServerApplication.java,
cameleer3-server-app/src/main/resources/application.yml,
cameleer3-server-app/src/test/java/com/cameleer3/server/app/controller/AgentRegistrationControllerIT.java
Wire the agent registry into the Spring Boot app and create REST endpoints:
1. **AgentRegistryConfig** (@ConfigurationProperties prefix "agent-registry"):
- heartbeatIntervalMs (long, default 30000)
- staleThresholdMs (long, default 90000)
- deadThresholdMs (long, default 300000) -- this is 5 minutes from staleTransitionTime, NOT from lastHeartbeat
- pingIntervalMs (long, default 15000)
- commandExpiryMs (long, default 60000)
- lifecycleCheckIntervalMs (long, default 10000)
Follow IngestionConfig pattern: plain class with getters/setters.
2. **AgentRegistryBeanConfig** (@Configuration):
- @Bean AgentRegistryService: `new AgentRegistryService(config.getStaleThresholdMs(), config.getDeadThresholdMs(), config.getCommandExpiryMs())`
Follow IngestionBeanConfig pattern.
3. **Update Cameleer3ServerApplication**: Add AgentRegistryConfig.class to @EnableConfigurationProperties.
4. **Update application.yml**: Add agent-registry section with all defaults (see RESEARCH.md code example). Also add `spring.mvc.async.request-timeout: -1` for SSE support (Plan 02 needs it, but set it now).
5. **AgentLifecycleMonitor** (@Component):
- Inject AgentRegistryService
- @Scheduled(fixedDelayString = "${agent-registry.lifecycle-check-interval-ms:10000}") calls registryService.checkLifecycle() and registryService.expireOldCommands()
- Follow ClickHouseFlushScheduler pattern but simpler (no SmartLifecycle needed -- agent state is ephemeral)
6. **AgentRegistrationController** (@RestController, @RequestMapping("/api/v1/agents")):
- Inject AgentRegistryService, ObjectMapper
- `POST /register`: Accept raw String body, parse JSON with ObjectMapper. Extract: agentId (required), name (required), group (default "default"), version, routeIds (default empty list), capabilities (default empty map). Call registryService.register(). Build response JSON: { agentId, sseEndpoint: "/api/v1/agents/{agentId}/events", heartbeatIntervalMs: from config, serverPublicKey: null (Phase 4 placeholder) }. Return 200.
- `POST /{id}/heartbeat`: Call registryService.heartbeat(id). Return 200 if true, 404 if false.
- `GET /`: Accept optional @RequestParam status. If status provided, parse to AgentState and call findByState. Otherwise call findAll. Serialize with ObjectMapper, return 200. Handle invalid status with 400.
- Add @Tag(name = "Agent Management") and @Operation annotations for OpenAPI.
7. **AgentRegistrationControllerIT** (extends AbstractClickHouseIT):
- Test register new agent: POST /api/v1/agents/register with valid payload, assert 200, response contains agentId and sseEndpoint
- Test re-register same agent: register twice with same ID, assert second returns 200, state is LIVE
- Test heartbeat known agent: register then heartbeat, assert 200
- Test heartbeat unknown agent: heartbeat without register, assert 404
- Test list all agents: register 2 agents, GET /api/v1/agents, assert both returned
- Test list by status filter: register agent, GET /api/v1/agents?status=LIVE, assert filtered correctly
- Test invalid status filter: GET /api/v1/agents?status=INVALID, assert 400
- All requests must include X-Cameleer-Protocol-Version:1 header (ProtocolVersionInterceptor covers /api/v1/agents/**)
- Use TestRestTemplate (already available from AbstractClickHouseIT's @SpringBootTest)
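The status decisions in the controller (200/404 for heartbeat, 200/400 for the list filter) are simple enough to sketch without Spring. The helper below is hypothetical and framework-free. In the real controller these results would be wrapped in ResponseEntity. Accepting lower-case filter values is an assumption, since the plan only shows `LIVE` and `INVALID`.

```java
import java.util.Locale;
import java.util.Optional;

enum AgentState { LIVE, STALE, DEAD }

// Hypothetical helper isolating the controller's status-code decisions.
final class AgentEndpointSketch {
    private AgentEndpointSketch() {}

    // Empty result means the value is not a valid AgentState name.
    static Optional<AgentState> parseStatus(String status) {
        try {
            // Assumption: filter values are matched case-insensitively.
            return Optional.of(AgentState.valueOf(status.trim().toUpperCase(Locale.ROOT)));
        } catch (IllegalArgumentException e) {
            return Optional.empty();
        }
    }

    // GET /api/v1/agents: no filter lists everything; an invalid filter is a client error.
    static int listStatusCode(String statusParam) {
        if (statusParam == null) {
            return 200;
        }
        return parseStatus(statusParam).isPresent() ? 200 : 400;
    }

    // POST /{id}/heartbeat: the boolean is the result of registryService.heartbeat(id).
    static int heartbeatStatusCode(boolean agentKnown) {
        return agentKnown ? 200 : 404;
    }
}
```

Keeping the parsing in one place means the invalid-status integration test exercises the same path that a unit test could cover without starting the application context.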
mvn test -pl cameleer3-server-core,cameleer3-server-app -Dtest="Agent*"
POST /register returns 200 with agentId + sseEndpoint + heartbeatIntervalMs. POST /{id}/heartbeat returns 200 for known agents, 404 for unknown. GET /api/v1/agents returns all agents, with an optional ?status= filter. AgentLifecycleMonitor runs on schedule. All integration tests pass. mvn clean verify passes.
mvn clean verify -- full suite green (existing Phase 1+2 tests still pass, new agent tests pass)
<success_criteria>
- Agent registration flow works end-to-end via REST
- Heartbeat updates agent state correctly
- Lifecycle monitor transitions LIVE->STALE->DEAD based on configured thresholds
- Agent list endpoint with optional status filter returns correct results
- All 7+ integration tests pass
- Existing test suite unbroken </success_criteria>