---
phase: 03-agent-registry-sse-push
plan: 01
type: execute
wave: 1
depends_on: []
files_modified:
  - cameleer-server-core/src/main/java/com/cameleer/server/core/agent/AgentInfo.java
  - cameleer-server-core/src/main/java/com/cameleer/server/core/agent/AgentState.java
  - cameleer-server-core/src/main/java/com/cameleer/server/core/agent/AgentCommand.java
  - cameleer-server-core/src/main/java/com/cameleer/server/core/agent/CommandStatus.java
  - cameleer-server-core/src/main/java/com/cameleer/server/core/agent/CommandType.java
  - cameleer-server-core/src/main/java/com/cameleer/server/core/agent/AgentRegistryService.java
  - cameleer-server-core/src/main/java/com/cameleer/server/core/agent/AgentEventListener.java
  - cameleer-server-core/src/test/java/com/cameleer/server/core/agent/AgentRegistryServiceTest.java
  - cameleer-server-app/src/main/java/com/cameleer/server/app/config/AgentRegistryConfig.java
  - cameleer-server-app/src/main/java/com/cameleer/server/app/config/AgentRegistryBeanConfig.java
  - cameleer-server-app/src/main/java/com/cameleer/server/app/agent/AgentLifecycleMonitor.java
  - cameleer-server-app/src/main/java/com/cameleer/server/app/controller/AgentRegistrationController.java
  - cameleer-server-app/src/main/java/com/cameleer/server/app/CameleerServerApplication.java
  - cameleer-server-app/src/main/resources/application.yml
  - cameleer-server-app/src/test/java/com/cameleer/server/app/controller/AgentRegistrationControllerIT.java
autonomous: true
requirements:
  - AGNT-01
  - AGNT-02
  - AGNT-03
must_haves:
  truths:
    - "Agent can register via POST /api/v1/agents/register with agentId, name, group, version, routeIds, capabilities and receive a response containing SSE endpoint URL and server config"
    - "Re-registration with the same agentId resumes existing identity (transitions back to LIVE, updates metadata)"
    - "Agent can send heartbeat via POST /api/v1/agents/{id}/heartbeat and receive 200 (or 404 if unknown)"
    - "Server transitions agents LIVE->STALE after 90s without heartbeat, STALE->DEAD 5 minutes after staleTransitionTime"
    - "Agent list endpoint GET /api/v1/agents returns all agents, filterable by ?status=LIVE|STALE|DEAD"
  artifacts:
    - path: "cameleer-server-core/src/main/java/com/cameleer/server/core/agent/AgentRegistryService.java"
      provides: "Agent registration, heartbeat, lifecycle transitions, find/filter"
    - path: "cameleer-server-core/src/main/java/com/cameleer/server/core/agent/AgentInfo.java"
      provides: "Agent record with id, name, group, version, routeIds, capabilities, state, timestamps"
    - path: "cameleer-server-app/src/main/java/com/cameleer/server/app/controller/AgentRegistrationController.java"
      provides: "POST /register, POST /{id}/heartbeat, GET /agents endpoints"
    - path: "cameleer-server-app/src/main/java/com/cameleer/server/app/agent/AgentLifecycleMonitor.java"
      provides: "@Scheduled lifecycle transitions LIVE->STALE->DEAD"
  key_links:
    - from: "AgentRegistrationController"
      to: "AgentRegistryService"
      via: "constructor injection"
      pattern: "registryService\\.register|registryService\\.heartbeat"
    - from: "AgentLifecycleMonitor"
      to: "AgentRegistryService"
      via: "@Scheduled lifecycle check"
      pattern: "registry\\.transitionState"
    - from: "AgentRegistryBeanConfig"
      to: "AgentRegistryService"
      via: "@Bean factory method"
      pattern: "new AgentRegistryService"
---

Build the agent registry domain model, registration/heartbeat REST endpoints, and lifecycle monitoring.

Purpose: Agents need to register with the server, send periodic heartbeats, and the server must track their LIVE/STALE/DEAD states. This is the foundation that the SSE push layer (Plan 02) builds on.

Output: Core domain types (AgentInfo, AgentState, AgentCommand, CommandStatus, CommandType), AgentRegistryService in core module, registration/heartbeat/list controllers in app module, lifecycle monitor, unit + integration tests.

@C:/Users/Hendrik/.claude/get-shit-done/workflows/execute-plan.md
@C:/Users/Hendrik/.claude/get-shit-done/templates/summary.md

@.planning/PROJECT.md
@.planning/ROADMAP.md
@.planning/STATE.md
@.planning/phases/03-agent-registry-sse-push/03-CONTEXT.md
@.planning/phases/03-agent-registry-sse-push/03-RESEARCH.md
@cameleer-server-core/src/main/java/com/cameleer/server/core/ingestion/IngestionService.java
@cameleer-server-app/src/main/java/com/cameleer/server/app/config/IngestionBeanConfig.java
@cameleer-server-app/src/main/java/com/cameleer/server/app/config/IngestionConfig.java
@cameleer-server-app/src/main/java/com/cameleer/server/app/ingestion/ClickHouseFlushScheduler.java
@cameleer-server-app/src/main/java/com/cameleer/server/app/config/WebConfig.java
@cameleer-server-app/src/main/java/com/cameleer/server/app/CameleerServerApplication.java
@cameleer-server-app/src/main/resources/application.yml
@cameleer-server-app/src/test/java/com/cameleer/server/app/AbstractClickHouseIT.java

Pattern: Core module plain class, app module bean config:
- IngestionService is a plain Java class (no Spring annotations) in core module
- IngestionBeanConfig is @Configuration in app module that creates the bean
- IngestionConfig is @ConfigurationProperties in app module for YAML binding

Pattern: Controller accepts raw String body:
- Controllers use @RequestBody String body, parse with ObjectMapper
- Return ResponseEntity with serialized JSON string

Pattern: @Scheduled for periodic tasks:
- ClickHouseFlushScheduler uses @Scheduled(fixedDelayString = "${ingestion.flush-interval-ms:1000}")
- @EnableScheduling already on CameleerServerApplication

Pattern: @EnableConfigurationProperties registration:
- CameleerServerApplication has @EnableConfigurationProperties(IngestionConfig.class)
- New config classes must be added to this annotation

Pattern: ProtocolVersionInterceptor:
- WebConfig registers interceptor for "/api/v1/data/**", "/api/v1/agents/**"
- Agent endpoints already covered -- agents must send X-Cameleer-Protocol-Version:1 header

Task 1: Core domain types and AgentRegistryService with unit tests

cameleer-server-core/src/main/java/com/cameleer/server/core/agent/AgentInfo.java,
cameleer-server-core/src/main/java/com/cameleer/server/core/agent/AgentState.java,
cameleer-server-core/src/main/java/com/cameleer/server/core/agent/AgentCommand.java,
cameleer-server-core/src/main/java/com/cameleer/server/core/agent/CommandStatus.java,
cameleer-server-core/src/main/java/com/cameleer/server/core/agent/CommandType.java,
cameleer-server-core/src/main/java/com/cameleer/server/core/agent/AgentRegistryService.java,
cameleer-server-core/src/main/java/com/cameleer/server/core/agent/AgentEventListener.java,
cameleer-server-core/src/test/java/com/cameleer/server/core/agent/AgentRegistryServiceTest.java

- register: new agent ID creates AgentInfo with state LIVE, returns AgentInfo
- register: same agent ID re-registers (updates metadata, transitions to LIVE, updates lastHeartbeat and registeredAt)
- heartbeat: known agent updates lastHeartbeat and transitions STALE back to LIVE, returns true
- heartbeat: unknown agent returns false
- lifecycle: LIVE agent where now - lastHeartbeat > staleThresholdMs transitions to STALE (staleTransitionTime recorded)
- lifecycle: STALE agent where now - staleTransitionTime > deadThresholdMs transitions to DEAD
- lifecycle: DEAD agent remains DEAD (no auto-purge)
- findAll: returns all agents regardless of state
- findByState: filters agents by AgentState
- findById: returns null for unknown ID
- addCommand: creates AgentCommand with PENDING status, returns the created AgentCommand
- acknowledgeCommand: transitions command from PENDING/DELIVERED to ACKNOWLEDGED
- expireOldCommands: removes PENDING commands older than commandExpiryMs
- findPendingCommands: returns PENDING commands for given agentId

Create the agent domain model in the core module (package com.cameleer.server.core.agent):

1. **AgentState enum**: LIVE, STALE, DEAD

2. **CommandType enum**: CONFIG_UPDATE, DEEP_TRACE, REPLAY

3. **CommandStatus enum**: PENDING, DELIVERED, ACKNOWLEDGED, EXPIRED

4. **AgentInfo**: Record with fields:
   - id (String), name (String), group (String), version (String)
   - routeIds (List<String>), capabilities (Map)
   - state (AgentState), registeredAt (Instant), lastHeartbeat (Instant)
   - staleTransitionTime (Instant, nullable -- set when transitioning to STALE)
   - Thread safety: ConcurrentHashMap only protects the map, not the values, so prefer an immutable record over a mutable class with synchronized methods or volatile fields. Store the records in the ConcurrentHashMap and use computeIfPresent to swap them atomically on each state transition.
   - Provide wither-style methods (withState, withLastHeartbeat, etc.).

5. **AgentCommand**: Record with fields: id (String, UUID), type (CommandType), payload (String -- raw JSON), targetAgentId (String), createdAt (Instant), status (CommandStatus). Provide withStatus method.

6. **AgentEventListener**: Interface with a single method `onCommandReady(String agentId, AgentCommand command)` -- this allows the SSE layer (Plan 02) to be notified when a command is added. The core module defines the interface; the app module implements it.

7. **AgentRegistryService**: Plain class (no Spring annotations), constructor takes staleThresholdMs (long), deadThresholdMs (long), commandExpiryMs (long). Uses ConcurrentHashMap<String, AgentInfo> for agents and a second ConcurrentHashMap, keyed by agent ID and holding a collection of AgentCommand per agent, for pending commands.
   Methods:
   - `register(String id, String name, String group, String version, List<String> routeIds, Map capabilities)` -> AgentInfo
   - `heartbeat(String id)` -> boolean
   - `transitionState(String id, AgentState newState)` -> void (used by lifecycle monitor)
   - `checkLifecycle()` -> void (iterates all agents, applies LIVE->STALE and STALE->DEAD based on thresholds)
   - `findById(String id)` -> AgentInfo (nullable)
   - `findAll()` -> List<AgentInfo>
   - `findByState(AgentState state)` -> List<AgentInfo>
   - `addCommand(String agentId, CommandType type, String payload)` -> AgentCommand (creates with PENDING, calls eventListener.onCommandReady if set)
   - `acknowledgeCommand(String agentId, String commandId)` -> boolean
   - `findPendingCommands(String agentId)` -> List<AgentCommand>
   - `markDelivered(String agentId, String commandId)` -> void
   - `expireOldCommands()` -> void (sweep commands older than commandExpiryMs)
   - `setEventListener(AgentEventListener listener)` -> void (optional, for SSE integration)

Write tests FIRST (RED), then implement (GREEN). Test class: AgentRegistryServiceTest.

`mvn test -pl cameleer-server-core -Dtest=AgentRegistryServiceTest`

All unit tests pass: registration (new + re-register), heartbeat (known + unknown), lifecycle transitions (LIVE->STALE->DEAD, heartbeat revives STALE), findAll/findByState/findById, command add/acknowledge/expire. AgentEventListener interface defined.
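
The record-plus-computeIfPresent approach and the two lifecycle thresholds from Task 1 can be sketched roughly as follows. This is a minimal illustration, not the planned implementation: `RegistryLifecycleSketch` and its trimmed-down `Agent` record are hypothetical stand-ins for AgentRegistryService and AgentInfo, the `Instant now` parameters are an assumption added so the timing is deterministic in tests, and leaving a DEAD agent DEAD on heartbeat is also an assumption, since the plan only specifies the STALE->LIVE revival.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical, trimmed-down registry: only what the lifecycle rules need.
class RegistryLifecycleSketch {

    enum AgentState { LIVE, STALE, DEAD }

    // Stand-in for AgentInfo with just the fields the lifecycle rules read.
    record Agent(String id, AgentState state, Instant lastHeartbeat, Instant staleTransitionTime) {
        Agent withState(AgentState newState, Instant newStaleTransitionTime) {
            return new Agent(id, newState, lastHeartbeat, newStaleTransitionTime);
        }

        Agent withLastHeartbeat(Instant now) {
            return new Agent(id, state, now, staleTransitionTime);
        }
    }

    private final Map<String, Agent> agents = new ConcurrentHashMap<>();
    private final long staleThresholdMs;
    private final long deadThresholdMs;

    RegistryLifecycleSketch(long staleThresholdMs, long deadThresholdMs) {
        this.staleThresholdMs = staleThresholdMs;
        this.deadThresholdMs = deadThresholdMs;
    }

    Agent register(String id, Instant now) {
        Agent agent = new Agent(id, AgentState.LIVE, now, null);
        agents.put(id, agent); // re-registration replaces the old record
        return agent;
    }

    boolean heartbeat(String id, Instant now) {
        // computeIfPresent swaps the record atomically and returns null for unknown IDs.
        return agents.computeIfPresent(id, (key, a) -> {
            Agent refreshed = a.withLastHeartbeat(now);
            // Assumption: only STALE is revived; a DEAD agent keeps its state.
            return a.state() == AgentState.STALE
                    ? refreshed.withState(AgentState.LIVE, null)
                    : refreshed;
        }) != null;
    }

    void checkLifecycle(Instant now) {
        for (String id : agents.keySet()) {
            agents.computeIfPresent(id, (key, a) -> {
                if (a.state() == AgentState.LIVE
                        && Duration.between(a.lastHeartbeat(), now).toMillis() > staleThresholdMs) {
                    return a.withState(AgentState.STALE, now); // remember when STALE began
                }
                if (a.state() == AgentState.STALE
                        && Duration.between(a.staleTransitionTime(), now).toMillis() > deadThresholdMs) {
                    return a.withState(AgentState.DEAD, a.staleTransitionTime());
                }
                return a; // DEAD stays DEAD, a fresh LIVE agent stays LIVE
            });
        }
    }

    Agent findById(String id) {
        return agents.get(id);
    }

    public static void main(String[] args) {
        RegistryLifecycleSketch registry = new RegistryLifecycleSketch(90_000, 300_000);
        Instant t0 = Instant.parse("2024-01-01T00:00:00Z");
        registry.register("agent-1", t0);
        registry.checkLifecycle(t0.plusSeconds(91));
        System.out.println(registry.findById("agent-1").state()); // prints STALE
        registry.checkLifecycle(t0.plusSeconds(91 + 301));
        System.out.println(registry.findById("agent-1").state()); // prints DEAD
    }
}
```

Because the DEAD threshold is measured from staleTransitionTime rather than lastHeartbeat, an agent can never go LIVE->DEAD in a single pass: the first pass stamps the STALE time, and only a later pass can promote it to DEAD.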

Task 2: Registration/heartbeat/list controllers, config, lifecycle monitor, integration tests

cameleer-server-app/src/main/java/com/cameleer/server/app/config/AgentRegistryConfig.java,
cameleer-server-app/src/main/java/com/cameleer/server/app/config/AgentRegistryBeanConfig.java,
cameleer-server-app/src/main/java/com/cameleer/server/app/agent/AgentLifecycleMonitor.java,
cameleer-server-app/src/main/java/com/cameleer/server/app/controller/AgentRegistrationController.java,
cameleer-server-app/src/main/java/com/cameleer/server/app/CameleerServerApplication.java,
cameleer-server-app/src/main/resources/application.yml,
cameleer-server-app/src/test/java/com/cameleer/server/app/controller/AgentRegistrationControllerIT.java

Wire the agent registry into the Spring Boot app and create REST endpoints:

1. **AgentRegistryConfig** (@ConfigurationProperties prefix "agent-registry"):
   - heartbeatIntervalMs (long, default 30000)
   - staleThresholdMs (long, default 90000)
   - deadThresholdMs (long, default 300000) -- this is 5 minutes from staleTransitionTime, NOT from lastHeartbeat
   - pingIntervalMs (long, default 15000)
   - commandExpiryMs (long, default 60000)
   - lifecycleCheckIntervalMs (long, default 10000)

   Follow IngestionConfig pattern: plain class with getters/setters.

2. **AgentRegistryBeanConfig** (@Configuration):
   - @Bean AgentRegistryService: `new AgentRegistryService(config.getStaleThresholdMs(), config.getDeadThresholdMs(), config.getCommandExpiryMs())`

   Follow IngestionBeanConfig pattern.

3. **Update CameleerServerApplication**: Add AgentRegistryConfig.class to @EnableConfigurationProperties.

4. **Update application.yml**: Add agent-registry section with all defaults (see RESEARCH.md code example). Also add `spring.mvc.async.request-timeout: -1` for SSE support (Plan 02 needs it, but set it now).

5. **AgentLifecycleMonitor** (@Component):
   - Inject AgentRegistryService
   - @Scheduled(fixedDelayString = "${agent-registry.lifecycle-check-interval-ms:10000}") calls registryService.checkLifecycle() and registryService.expireOldCommands()
   - Follow ClickHouseFlushScheduler pattern but simpler (no SmartLifecycle needed -- agent state is ephemeral)

6. **AgentRegistrationController** (@RestController, @RequestMapping("/api/v1/agents")):
   - Inject AgentRegistryService, ObjectMapper
   - `POST /register`: Accept raw String body, parse JSON with ObjectMapper. Extract: agentId (required), name (required), group (default "default"), version, routeIds (default empty list), capabilities (default empty map). Call registryService.register(). Build response JSON: { agentId, sseEndpoint: "/api/v1/agents/{agentId}/events", heartbeatIntervalMs: from config, serverPublicKey: null (Phase 4 placeholder) }. Return 200.
   - `POST /{id}/heartbeat`: Call registryService.heartbeat(id). Return 200 if true, 404 if false.
   - `GET /`: Accept optional @RequestParam status. If status provided, parse to AgentState and call findByState. Otherwise call findAll. Serialize with ObjectMapper, return 200. Handle invalid status with 400.
   - Add @Tag(name = "Agent Management") and @Operation annotations for OpenAPI.

7. **AgentRegistrationControllerIT** (extends AbstractClickHouseIT):
   - Test register new agent: POST /api/v1/agents/register with valid payload, assert 200, response contains agentId and sseEndpoint
   - Test re-register same agent: register twice with same ID, assert second returns 200, state is LIVE
   - Test heartbeat known agent: register then heartbeat, assert 200
   - Test heartbeat unknown agent: heartbeat without register, assert 404
   - Test list all agents: register 2 agents, GET /api/v1/agents, assert both returned
   - Test list by status filter: register agent, GET /api/v1/agents?status=LIVE, assert filtered correctly
   - Test invalid status filter: GET /api/v1/agents?status=INVALID, assert 400
   - All requests must include X-Cameleer-Protocol-Version:1 header (ProtocolVersionInterceptor covers /api/v1/agents/**)
   - Use TestRestTemplate (already available from AbstractClickHouseIT's @SpringBootTest)

`mvn test -pl cameleer-server-core,cameleer-server-app -Dtest="Agent*"`

POST /register returns 200 with agentId + sseEndpoint + heartbeatIntervalMs. POST /{id}/heartbeat returns 200 for known agents, 404 for unknown. GET /agents returns all agents with optional ?status= filter. AgentLifecycleMonitor runs on schedule. All integration tests pass. mvn clean verify passes.

`mvn clean verify` -- full suite green (existing Phase 1+2 tests still pass, new agent tests pass)

- Agent registration flow works end-to-end via REST
- Heartbeat updates agent state correctly
- Lifecycle monitor transitions LIVE->STALE->DEAD based on configured thresholds
- Agent list endpoint with optional status filter returns correct results
- All 7+ integration tests pass
- Existing test suite unbroken

After completion, create `.planning/phases/03-agent-registry-sse-push/03-01-SUMMARY.md`
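
For the optional `?status=` filter on GET /api/v1/agents in Task 2, the 200-versus-400 decision might look like the framework-free sketch below. Everything here is illustrative: `AgentStatusFilterSketch`, the `Result` record (standing in for ResponseEntity) and the id-to-state map are hypothetical, and exact-case matching of the status value is an assumption, since the plan only lists LIVE, STALE and DEAD as valid values.

```java
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch of the status-filter branch of the list endpoint.
class AgentStatusFilterSketch {

    enum AgentState { LIVE, STALE, DEAD }

    // Stand-in for ResponseEntity: HTTP status plus the agent IDs that would be serialized.
    record Result(int httpStatus, List<String> agentIds) {}

    // Enum.valueOf throws IllegalArgumentException for unknown names; map that to "empty".
    static Optional<AgentState> parseState(String status) {
        try {
            return Optional.of(AgentState.valueOf(status));
        } catch (IllegalArgumentException e) {
            return Optional.empty();
        }
    }

    static Result list(Map<String, AgentState> agents, String status) {
        if (status == null) {
            // No filter: the findAll path.
            return new Result(200, agents.keySet().stream().sorted().toList());
        }
        Optional<AgentState> wanted = parseState(status);
        if (wanted.isEmpty()) {
            return new Result(400, List.of()); // invalid status value
        }
        // Valid filter: the findByState path.
        List<String> ids = agents.entrySet().stream()
                .filter(entry -> entry.getValue() == wanted.get())
                .map(Map.Entry::getKey)
                .sorted()
                .toList();
        return new Result(200, ids);
    }

    public static void main(String[] args) {
        Map<String, AgentState> agents = Map.of("a", AgentState.LIVE, "b", AgentState.STALE);
        System.out.println(list(agents, null));      // both agents, status 200
        System.out.println(list(agents, "LIVE"));    // only "a", status 200
        System.out.println(list(agents, "INVALID")); // status 400
    }
}
```

Keeping the parse step separate from the lookup makes the invalid-status integration test easy to satisfy without relying on a global exception handler.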