cameleer/cameleer-server

Fork 0

Files

hsiegeln cb3ebfea7c

CI / cleanup-branch (push) Has been skipped

Details

CI / build (push) Failing after 18s

Details

CI / docker (push) Has been skipped

Details

CI / deploy (push) Has been skipped

Details

CI / deploy-feature (push) Has been skipped

Details

chore: rename cameleer3 to cameleer

Rename Java packages from com.cameleer3 to com.cameleer, module
directories from cameleer3-* to cameleer-*, and all references
throughout workflows, Dockerfiles, docs, migrations, and pom.xml.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

2026-04-15 15:28:42 +02:00

27 KiB

Raw Blame History

Phase 3: Agent Registry + SSE Push - Research

Researched: 2026-03-11 Domain: Agent lifecycle management, Server-Sent Events (SSE), in-memory registry Confidence: HIGH

Summary

This phase adds agent registration, heartbeat-based lifecycle management (LIVE/STALE/DEAD), and real-time command push via SSE to the Cameleer server. The technology stack is straightforward: Spring MVC's SseEmitter for server-push, ConcurrentHashMap for the in-memory agent registry, and @Scheduled for periodic lifecycle checks (same pattern already used by ClickHouseFlushScheduler).

The main architectural challenge is managing per-agent SSE connections reliably -- handling disconnections, timeouts, and cleanup without leaking threads or emitters. The command delivery model (PENDING with 60s expiry, acknowledgement) adds a second concurrent data structure to manage alongside the registry itself.

Primary recommendation: Use Spring MVC SseEmitter (already on classpath via spring-boot-starter-web). No new dependencies required. Follow the established core-module-plain-class / app-module-Spring-bean pattern. Agent registry service in core, SSE connection manager and controllers in app.

<user_constraints>

User Constraints (from CONTEXT.md)

Locked Decisions

Heartbeat interval: 30 seconds
STALE threshold: 90 seconds (3 missed heartbeats)
DEAD threshold: 5 minutes after going STALE
DEAD agents kept indefinitely (no auto-purge)
Agent list endpoint returns all agents (LIVE, STALE, DEAD) with ?status= filter parameter
Generic command endpoint: POST /api/v1/agents/{id}/commands with {"type": "config-update|deep-trace|replay", "payload": {...}}
Three targeting levels: single agent, group, all live agents
Agent self-declares group name at registration
Command delivery tracking: PENDING until acknowledged, 60s expiry
Agent provides its own persistent ID at registration
Rich registration payload: agent ID, name, group, version, list of route IDs, capabilities
Re-registration with same ID resumes existing identity
Heartbeat is just a ping -- no metadata update
Registration response includes: SSE endpoint URL, current server config, server public key placeholder
Last-Event-ID supported but does NOT replay missed events
Pending commands NOT auto-pushed on reconnect
SSE ping/keepalive interval: 15 seconds

Claude's Discretion

In-memory vs persistent storage for agent registry (in-memory is fine for v1)
Command acknowledgement mechanism details (heartbeat piggyback vs dedicated endpoint)
SSE implementation approach (Spring SseEmitter, WebFlux, or other)
Thread scheduling for lifecycle state transitions

Deferred Ideas (OUT OF SCOPE)

Server-side agent tags/labels for more flexible grouping
Auto-push pending commands on reconnect
Last-Event-ID replay of missed events
Agent capability negotiation

</user_constraints>

<phase_requirements>

Phase Requirements

ID	Description	Research Support
AGNT-01 (#13)	Agent registers via POST /api/v1/agents/register with bootstrap token, receives JWT + server public key	Registration controller + service; JWT/security enforcement deferred to Phase 4 but flow must work end-to-end
AGNT-02 (#14)	Server maintains agent registry with LIVE/STALE/DEAD lifecycle based on heartbeat timing	In-memory ConcurrentHashMap registry + @Scheduled lifecycle monitor
AGNT-03 (#15)	Agent sends heartbeat via POST /api/v1/agents/{id}/heartbeat every 30s	Heartbeat endpoint updates lastHeartbeat timestamp, transitions STALE back to LIVE
AGNT-04 (#16)	Server pushes config-update events to agents via SSE (Ed25519 signature deferred to Phase 4)	SseEmitter per-agent connection + command push infrastructure
AGNT-05 (#17)	Server pushes deep-trace commands to agents via SSE for specific correlationIds	Same SSE command push mechanism with deep-trace type
AGNT-06 (#18)	Server pushes replay commands to agents via SSE (signed replay tokens deferred to Phase 4)	Same SSE command push mechanism with replay type
AGNT-07 (#19)	SSE connection includes ping keepalive and supports Last-Event-ID reconnection	15s ping via @Scheduled, Last-Event-ID header read on connect

</phase_requirements>

Standard Stack

Core

Library	Version	Purpose	Why Standard
Spring MVC SseEmitter	6.2.x (via Boot 3.4.3)	Server-Sent Events	Already on classpath, servlet-based (matches existing stack), no WebFlux needed
ConcurrentHashMap	JDK 17	Agent registry storage	Thread-safe, O(1) lookup by agent ID, no external dependency
Spring @Scheduled	6.2.x (via Boot 3.4.3)	Lifecycle monitor + SSE keepalive	Already enabled in application, proven pattern in ClickHouseFlushScheduler

Supporting

Library	Version	Purpose	When to Use
Jackson ObjectMapper	2.17.3 (managed)	Command serialization/deserialization	Already configured with JavaTimeModule, used throughout codebase

Alternatives Considered

Instead of	Could Use	Tradeoff
SseEmitter (MVC)	WebFlux Flux	Would require adding spring-boot-starter-webflux and mixing reactive/servlet stacks -- unnecessary complexity for this use case
ConcurrentHashMap	Redis/ClickHouse persistence	Over-engineering for v1; in-memory is sufficient since agent state is ephemeral and rebuilt on reconnect
@Scheduled	ScheduledExecutorService	@Scheduled already works, already enabled; raw executor only needed for complex scheduling

Installation: No new dependencies required. Everything is already on the classpath.

Architecture Patterns

Recommended Project Structure

cameleer-server-core/src/main/java/com/cameleer/server/core/
├── agent/
│   ├── AgentInfo.java              # Record: id, name, group, version, routeIds, capabilities, state, timestamps
│   ├── AgentState.java             # Enum: LIVE, STALE, DEAD
│   ├── AgentRegistryService.java   # Plain class: register, heartbeat, findById, findAll, lifecycle transitions
│   ├── AgentCommand.java           # Record: id, type, payload, targetAgentId, createdAt, status
│   └── CommandStatus.java          # Enum: PENDING, DELIVERED, ACKNOWLEDGED, EXPIRED

cameleer-server-app/src/main/java/com/cameleer/server/app/
├── config/
│   ├── AgentRegistryConfig.java    # @ConfigurationProperties(prefix = "agent-registry")
│   └── AgentRegistryBeanConfig.java # @Configuration: wires AgentRegistryService as bean
├── controller/
│   ├── AgentRegistrationController.java  # POST /register, POST /{id}/heartbeat, GET /agents
│   ├── AgentCommandController.java       # POST /{id}/commands, POST /groups/{group}/commands, POST /commands
│   └── AgentSseController.java           # GET /{id}/events (SSE stream)
├── agent/
│   ├── SseConnectionManager.java         # @Component: ConcurrentHashMap<agentId, SseEmitter>, ping scheduler
│   └── AgentLifecycleMonitor.java        # @Component: @Scheduled lifecycle check (like ClickHouseFlushScheduler)

Pattern 1: Core Module Plain Class + App Module Bean Config

What: Domain logic in core module as plain Java classes; Spring wiring in app module via @Configuration When to use: Always -- this is the established codebase pattern Example:

// Core module: plain class, no Spring annotations
public class AgentRegistryService {
    private final ConcurrentHashMap<String, AgentInfo> agents = new ConcurrentHashMap<>();

    public AgentInfo register(String id, String name, String group, String version,
                              List<String> routeIds, Map<String, Object> capabilities) {
        AgentInfo existing = agents.get(id);
        if (existing != null) {
            // Re-registration: update metadata, transition back to LIVE
            AgentInfo updated = existing.withState(AgentState.LIVE)
                    .withLastHeartbeat(Instant.now());
            agents.put(id, updated);
            return updated;
        }
        AgentInfo agent = new AgentInfo(id, name, group, version, routeIds,
                capabilities, AgentState.LIVE, Instant.now(), Instant.now());
        agents.put(id, agent);
        return agent;
    }

    public boolean heartbeat(String id) {
        return agents.computeIfPresent(id, (k, v) ->
                v.withState(AgentState.LIVE).withLastHeartbeat(Instant.now())) != null;
    }
}

// App module: bean config
@Configuration
public class AgentRegistryBeanConfig {
    @Bean
    public AgentRegistryService agentRegistryService() {
        return new AgentRegistryService();
    }
}

Pattern 2: SseEmitter Per-Agent Connection

What: Each agent has one SseEmitter stored in ConcurrentHashMap, managed by a dedicated component When to use: For all SSE connections to agents Example:

@Component
public class SseConnectionManager {
    private final ConcurrentHashMap<String, SseEmitter> emitters = new ConcurrentHashMap<>();

    public SseEmitter connect(String agentId) {
        // Use Long.MAX_VALUE timeout -- we manage keepalive ourselves
        SseEmitter emitter = new SseEmitter(Long.MAX_VALUE);

        emitter.onCompletion(() -> emitters.remove(agentId));
        emitter.onTimeout(() -> emitters.remove(agentId));
        emitter.onError(e -> emitters.remove(agentId));

        // Replace any existing emitter (agent reconnect)
        SseEmitter old = emitters.put(agentId, emitter);
        if (old != null) {
            old.complete(); // Close stale connection
        }

        return emitter;
    }

    public boolean sendEvent(String agentId, String eventId, String eventType, Object data) {
        SseEmitter emitter = emitters.get(agentId);
        if (emitter == null) return false;
        try {
            emitter.send(SseEmitter.event()
                    .id(eventId)
                    .name(eventType)
                    .data(data, MediaType.APPLICATION_JSON));
            return true;
        } catch (IOException e) {
            emitters.remove(agentId);
            return false;
        }
    }

    public void sendPingToAll() {
        emitters.forEach((id, emitter) -> {
            try {
                emitter.send(SseEmitter.event().comment("ping"));
            } catch (IOException e) {
                emitters.remove(id);
            }
        });
    }
}

Pattern 3: Lifecycle Monitor via @Scheduled

What: Periodic task checks all agents' lastHeartbeat timestamps and transitions states When to use: For LIVE->STALE and STALE->DEAD transitions Example:

@Component
public class AgentLifecycleMonitor {
    private final AgentRegistryService registry;
    private final AgentRegistryConfig config;

    @Scheduled(fixedDelayString = "${agent-registry.lifecycle-check-interval-ms:10000}")
    public void checkLifecycle() {
        Instant now = Instant.now();
        for (AgentInfo agent : registry.findAll()) {
            Duration sinceHeartbeat = Duration.between(agent.lastHeartbeat(), now);
            if (agent.state() == AgentState.LIVE
                    && sinceHeartbeat.toMillis() > config.getStaleThresholdMs()) {
                registry.transitionState(agent.id(), AgentState.STALE);
            } else if (agent.state() == AgentState.STALE
                    && sinceHeartbeat.toMillis() > config.getStaleThresholdMs() + config.getDeadThresholdMs()) {
                registry.transitionState(agent.id(), AgentState.DEAD);
            }
        }
    }
}

Anti-Patterns to Avoid

Mixing WebFlux and MVC: Do not add spring-boot-starter-webflux. The project uses servlet-based MVC. Adding WebFlux creates classpath conflicts and ambiguity.
Sharing SseEmitter across threads without protection: Always use ConcurrentHashMap and handle IOException on every send. A failed send means the client disconnected.
Storing SseEmitter in the core module: SseEmitter is a Spring MVC class. Keep it in the app module only. The core module should define interfaces for "push event to agent" that the app module implements.
Not setting SseEmitter timeout: Default timeout is server-dependent (often 30s). Use Long.MAX_VALUE and manage keepalive yourself.

Don't Hand-Roll

Problem	Don't Build	Use Instead	Why
SSE protocol	Custom HTTP streaming	Spring SseEmitter	Handles text/event-stream format, event IDs, retry fields automatically
Thread-safe map	Synchronized HashMap	ConcurrentHashMap	Lock-free reads, segmented writes, battle-tested
Periodic scheduling	Manual Thread/Timer	@Scheduled + @EnableScheduling	Already configured, integrates with Spring lifecycle
JSON serialization	Manual string building	ObjectMapper (already configured)	Handles Instant, unknown fields, all edge cases
Async request timeout	Manual thread management	spring.mvc.async.request-timeout config	Spring handles Tomcat async timeout correctly

Key insight: SSE in Spring MVC is a well-supported, first-class feature. The SseEmitter API handles the wire protocol; your job is managing the lifecycle of emitters (create, store, cleanup, send).

Common Pitfalls

Pitfall 1: SseEmitter Default Timeout Kills Long-Lived Connections

What goes wrong: Emitter times out after 30s (Tomcat default), client gets disconnected Why it happens: Not setting explicit timeout on SseEmitter constructor How to avoid: Always use new SseEmitter(Long.MAX_VALUE). Also set spring.mvc.async.request-timeout=-1 in application.yml to disable the MVC-level async timeout Warning signs: Clients disconnecting every 30 seconds, reconnection storms

Pitfall 2: IOException on Send Not Handled

What goes wrong: Client disconnects but server keeps trying to send, gets IOException, does not clean up Why it happens: Not wrapping every emitter.send() in try-catch How to avoid: Every send must catch IOException, remove the emitter from the map, and log at debug level (not error -- disconnects are normal) Warning signs: Growing emitter map, increasing IOExceptions in logs

Pitfall 3: Race Condition on Agent Reconnect

What goes wrong: Agent disconnects and reconnects rapidly; old emitter and new emitter both exist briefly Why it happens: onCompletion callback of old emitter fires after new emitter is stored, removing the new one How to avoid: Use ConcurrentHashMap.put() which returns the old value. Only remove in callbacks if the emitter in the map is still the same instance (reference equality check) Warning signs: Agent SSE stream stops working after reconnect

Pitfall 4: Tomcat Thread Exhaustion with SSE

What goes wrong: Each SSE connection holds a Tomcat thread (with default sync mode) Why it happens: MVC SseEmitter uses Servlet 3.1 async support but the async processing still occupies a thread from the pool during the initial request How to avoid: Spring Boot's default Tomcat thread pool (200 threads) is sufficient for dozens to low hundreds of agents. If scaling beyond that, configure server.tomcat.threads.max. For thousands of agents, consider WebFlux (but that is a v2 concern) Warning signs: Thread pool exhaustion, connection refused errors

Pitfall 5: Command Expiry Not Cleaned Up

What goes wrong: Expired PENDING commands accumulate in memory Why it happens: No scheduled task to clean them up How to avoid: The lifecycle monitor (or a separate @Scheduled task) should also sweep expired commands every check cycle Warning signs: Memory growth over time, stale commands in API responses

Pitfall 6: SSE Endpoint Blocked by ProtocolVersionInterceptor

What goes wrong: SSE GET request rejected because it lacks X-Cameleer-Protocol-Version header Why it happens: WebConfig already registers the interceptor for /api/v1/agents/** which includes the SSE endpoint How to avoid: Either add the protocol header requirement to agents (recommended -- agents already send it for POST requests) or exclude the SSE endpoint path from the interceptor Warning signs: 400 errors on SSE connect attempts

Code Examples

Registration Controller

@RestController
@RequestMapping("/api/v1/agents")
@Tag(name = "Agent Management", description = "Agent registration and lifecycle endpoints")
public class AgentRegistrationController {

    private final AgentRegistryService registryService;
    private final ObjectMapper objectMapper;

    @PostMapping("/register")
    @Operation(summary = "Register an agent")
    public ResponseEntity<String> register(@RequestBody String body) throws JsonProcessingException {
        // Parse registration payload
        JsonNode node = objectMapper.readTree(body);
        String agentId = node.get("agentId").asText();
        String name = node.get("name").asText();
        String group = node.has("group") ? node.get("group").asText() : "default";
        // ... extract other fields

        AgentInfo agent = registryService.register(agentId, name, group, version, routeIds, capabilities);

        // Build registration response
        Map<String, Object> response = new LinkedHashMap<>();
        response.put("agentId", agent.id());
        response.put("sseEndpoint", "/api/v1/agents/" + agentId + "/events");
        response.put("heartbeatIntervalMs", 30000);
        response.put("serverPublicKey", null); // Phase 4
        // JWT token placeholder -- Phase 4 will add real JWT
        response.put("token", "placeholder-" + agentId);

        return ResponseEntity.ok(objectMapper.writeValueAsString(response));
    }

    @PostMapping("/{id}/heartbeat")
    @Operation(summary = "Agent heartbeat ping")
    public ResponseEntity<Void> heartbeat(@PathVariable String id) {
        boolean found = registryService.heartbeat(id);
        if (!found) {
            return ResponseEntity.notFound().build();
        }
        return ResponseEntity.ok().build();
    }

    @GetMapping
    @Operation(summary = "List all agents")
    public ResponseEntity<String> listAgents(
            @RequestParam(required = false) String status) throws JsonProcessingException {
        List<AgentInfo> agents;
        if (status != null) {
            AgentState stateFilter = AgentState.valueOf(status.toUpperCase());
            agents = registryService.findByState(stateFilter);
        } else {
            agents = registryService.findAll();
        }
        return ResponseEntity.ok(objectMapper.writeValueAsString(agents));
    }
}

SSE Controller

@RestController
@RequestMapping("/api/v1/agents")
@Tag(name = "Agent SSE", description = "Server-Sent Events for agent communication")
public class AgentSseController {

    private final SseConnectionManager connectionManager;
    private final AgentRegistryService registryService;

    @GetMapping(value = "/{id}/events", produces = MediaType.TEXT_EVENT_STREAM_VALUE)
    @Operation(summary = "SSE event stream for an agent")
    public SseEmitter subscribe(
            @PathVariable String id,
            @RequestHeader(value = "Last-Event-ID", required = false) String lastEventId) {

        AgentInfo agent = registryService.findById(id);
        if (agent == null) {
            throw new ResponseStatusException(HttpStatus.NOT_FOUND, "Agent not registered");
        }

        // Last-Event-ID acknowledged but no replay (per decision)
        if (lastEventId != null) {
            log.debug("Agent {} reconnected with Last-Event-ID: {} (no replay)", id, lastEventId);
        }

        return connectionManager.connect(id);
    }
}

Command Acknowledgement Endpoint (Recommended: Dedicated Endpoint)

@PostMapping("/{id}/commands/{commandId}/ack")
@Operation(summary = "Acknowledge command receipt")
public ResponseEntity<Void> acknowledgeCommand(
        @PathVariable String id,
        @PathVariable String commandId) {
    boolean acknowledged = registryService.acknowledgeCommand(id, commandId);
    if (!acknowledged) {
        return ResponseEntity.notFound().build();
    }
    return ResponseEntity.ok().build();
}

Application Configuration Addition

# application.yml additions
agent-registry:
  heartbeat-interval-ms: 30000
  stale-threshold-ms: 90000
  dead-threshold-ms: 300000   # 5 minutes after last heartbeat (not after going stale)
  ping-interval-ms: 15000
  command-expiry-ms: 60000
  lifecycle-check-interval-ms: 10000

spring:
  mvc:
    async:
      request-timeout: -1    # Disable async timeout for SSE

State of the Art

Old Approach	Current Approach	When Changed	Impact
Polling for agent status	SSE push for commands	Always SSE for server-push	Immediate delivery, lower latency
WebFlux for SSE	MVC SseEmitter	Spring 4.2+	MVC SseEmitter is sufficient for moderate scale; no need for reactive stack
Custom HTTP streaming	SseEmitter.event() builder	Spring 4.2+	Wire protocol handled automatically

Deprecated/outdated:

ResponseBodyEmitter directly for SSE: Use SseEmitter which extends it with SSE-specific features
DeferredResult for server push: Only for single-value responses, not streams

Open Questions

Command acknowledgement: dedicated endpoint vs heartbeat piggyback
- What we know: Dedicated endpoint is simpler, more explicit, and decoupled from heartbeat
- What's unclear: Whether agent-side implementation prefers one approach
- Recommendation: Use dedicated POST /{id}/commands/{commandId}/ack endpoint. Cleaner separation of concerns, easier to test, and does not complicate the heartbeat path
Dead threshold calculation: from last heartbeat or from STALE transition?
- What we know: CONTEXT.md says "5 minutes after going STALE"
- What's unclear: Whether to track staleTransitionTime separately or compute from lastHeartbeat
- Recommendation: Track staleTransitionTime in AgentInfo. Dead threshold = 5 minutes after staleTransitionTime. This matches the stated requirement precisely
Async timeout vs SseEmitter timeout
- What we know: Both spring.mvc.async.request-timeout and new SseEmitter(timeout) affect SSE lifetime
- What's unclear: Interaction between the two
- Recommendation: Set SseEmitter(Long.MAX_VALUE) AND spring.mvc.async.request-timeout=-1. Belt and suspenders -- both disabled ensures no premature timeout

Validation Architecture

Test Framework

Property	Value
Framework	JUnit 5 + Spring Boot Test (via spring-boot-starter-test)
Config file	pom.xml (Surefire + Failsafe configured)
Quick run command	`mvn test -pl cameleer-server-core -Dtest=AgentRegistryServiceTest`
Full suite command	`mvn clean verify`

Phase Requirements to Test Map

Req ID	Behavior	Test Type	Automated Command	File Exists?
AGNT-01	Agent registers and gets response	integration	`mvn test -pl cameleer-server-app -Dtest=AgentRegistrationControllerIT#registerAgent*`	No - Wave 0
AGNT-02	Lifecycle transitions LIVE/STALE/DEAD	unit	`mvn test -pl cameleer-server-core -Dtest=AgentRegistryServiceTest#lifecycle*`	No - Wave 0
AGNT-03	Heartbeat updates timestamp, returns 200/404	integration	`mvn test -pl cameleer-server-app -Dtest=AgentRegistrationControllerIT#heartbeat*`	No - Wave 0
AGNT-04	Config-update pushed via SSE	integration	`mvn test -pl cameleer-server-app -Dtest=AgentSseControllerIT#configUpdate*`	No - Wave 0
AGNT-05	Deep-trace command pushed via SSE	integration	`mvn test -pl cameleer-server-app -Dtest=AgentSseControllerIT#deepTrace*`	No - Wave 0
AGNT-06	Replay command pushed via SSE	integration	`mvn test -pl cameleer-server-app -Dtest=AgentSseControllerIT#replay*`	No - Wave 0
AGNT-07	SSE ping keepalive + Last-Event-ID	integration	`mvn test -pl cameleer-server-app -Dtest=AgentSseControllerIT#pingKeepalive*`	No - Wave 0

Sampling Rate

Per task commit: mvn test -pl cameleer-server-core,cameleer-server-app -Dtest="Agent*" (agent-related tests only)
Per wave merge: mvn clean verify
Phase gate: Full suite green before /gsd:verify-work

Wave 0 Gaps

cameleer-server-core/.../agent/AgentRegistryServiceTest.java -- covers AGNT-02, AGNT-03 (unit tests for registry logic)
cameleer-server-app/.../controller/AgentRegistrationControllerIT.java -- covers AGNT-01, AGNT-03
cameleer-server-app/.../controller/AgentSseControllerIT.java -- covers AGNT-04, AGNT-05, AGNT-06, AGNT-07
cameleer-server-app/.../controller/AgentCommandControllerIT.java -- covers command targeting (single, group, all)
No new framework install needed -- JUnit 5 + Spring Boot Test + Awaitility already in place

SSE Test Strategy

Testing SSE with TestRestTemplate requires special handling. Use Spring's WebClient from WebFlux test support or raw HttpURLConnection to read the SSE stream. Alternatively, test at the service layer (SseConnectionManager) with direct emitter interaction. The integration test should:

Register agent via POST
Open SSE connection (separate thread)
Send command via POST
Assert SSE stream received the event
Verify with Awaitility for async assertions

Sources

Primary (HIGH confidence)

SseEmitter Javadoc (Spring Framework 7.0.5) - Full API reference
Asynchronous Requests :: Spring Framework - Official async request handling docs
Task Execution and Scheduling :: Spring Boot - Official scheduling docs
Existing codebase: ClickHouseFlushScheduler, IngestionService, IngestionBeanConfig, WebConfig patterns

Secondary (MEDIUM confidence)

Spring Boot SSE SseEmitter tutorial - Complete guide with patterns
SseEmitter timeout issue #4021 - Timeout handling gotchas
SseEmitter response closed #19652 - Thread safety discussion

Tertiary (LOW confidence)

Various Medium articles on SSE patterns - used for cross-referencing community patterns only

Metadata

Confidence breakdown:

Standard stack: HIGH - SseEmitter is built into Spring MVC, already on classpath, well-documented API
Architecture: HIGH - follows established codebase patterns (core plain class, app bean config, @Scheduled)
Pitfalls: HIGH - well-known issues documented in Spring GitHub issues and multiple sources
SSE test strategy: MEDIUM - SSE testing with TestRestTemplate is non-trivial, may need adaptation

Research date: 2026-03-11 Valid until: 2026-04-11 (stable stack, no fast-moving dependencies)

27 KiB Raw Blame History

Phase 3: Agent Registry + SSE Push - Research

Summary

User Constraints (from CONTEXT.md)

Locked Decisions

Claude's Discretion

Deferred Ideas (OUT OF SCOPE)

Phase Requirements

Standard Stack

Core

Supporting

Alternatives Considered

Architecture Patterns

Recommended Project Structure

Pattern 1: Core Module Plain Class + App Module Bean Config

Pattern 2: SseEmitter Per-Agent Connection

Pattern 3: Lifecycle Monitor via @Scheduled

Anti-Patterns to Avoid

Don't Hand-Roll

Common Pitfalls

Pitfall 1: SseEmitter Default Timeout Kills Long-Lived Connections

Pitfall 2: IOException on Send Not Handled

Pitfall 3: Race Condition on Agent Reconnect

Pitfall 4: Tomcat Thread Exhaustion with SSE

Pitfall 5: Command Expiry Not Cleaned Up

Pitfall 6: SSE Endpoint Blocked by ProtocolVersionInterceptor

Code Examples

Registration Controller

SSE Controller

Command Acknowledgement Endpoint (Recommended: Dedicated Endpoint)

Application Configuration Addition

State of the Art

Open Questions

Validation Architecture

Test Framework

Phase Requirements to Test Map

Sampling Rate

Wave 0 Gaps

SSE Test Strategy

Sources

Primary (HIGH confidence)

Secondary (MEDIUM confidence)

Tertiary (LOW confidence)

Metadata

27 KiB

Raw Blame History