# Simplified Log Forwarding Architecture
**Date:** 2026-04-14
**Status:** Approved (conversation design review)
## Problem
The current log forwarding design stacks three buffering layers (BridgeAccess early buffer → LogForwarder deferred exporter → ChunkedExporter) to cover the timing gap between appender registration and server connection. Supporting that design requires:
- Framework-specific early hooks: `SpringContextAdvice` hooks `AbstractApplicationContext.refresh()` for Spring apps; `CameleerHookInstaller.preInstall()` has a fallback path for non-Spring apps
- `BridgeAccess` early buffer (5000 entries) with synchronized drain logic and classloader reflection
- `LogForwarder` deferred exporter pattern (no-arg constructor, `setExporter()` called later)
- `LogForwardingConsumer` wrapper class to avoid ByteBuddy lambda inlining issues
- Cross-classloader complexity in `SpringContextAdvice` (loads LogForwarder on system CL via reflection)
Despite this complexity, **bootstrap failures before CamelContext.start() are still not captured** — the server connection only happens in `postInstall`, so if the app dies during Spring context initialization, buffered logs die with the process.
## Decision
1. **Move server connection to `preInstall`** — connect before routes start, so the exporter is live when the appender is registered. No deferred pattern needed.
2. **External infrastructure handles pre-agent logs** — Docker/K8s logging drivers, DaemonSet shippers, or kubelet events capture container lifecycle before the JVM starts. The server team will implement this.
3. **Remove all early buffering machinery** — BridgeAccess buffer, deferred exporter, SpringContextAdvice log code, LogForwardingConsumer.
4. **Unified appender registration path** — same code for Spring Boot, Quarkus, and Plain Java, executed in preInstall/configure.
### Why preInstall instead of premain?
The original conversation explored moving the server connection to `premain()`. However, the agent uses an `AgentClassLoader` (child-first) that creates a separate class namespace from the system classloader. Objects created in `premain()` (system CL) cannot be passed to `CameleerHookInstaller` (AgentClassLoader) without complex cross-classloader reflection using only primitives. Since `preInstall` already fires before any route processes exchanges (it runs at the start of `CamelContext.start()`), it meets the requirement of "fully initialized before the first route executes" without the classloader complexity.
## Architecture
### Three-Phase Log Coverage Model
| Phase | Captured by | Mechanism |
|-------|-------------|-----------|
| Pre-JVM (container fails before Java runs) | Server team / infra | kubelet events, Docker logging drivers, DaemonSet shipper |
| JVM start → preInstall (framework bootstrap) | Infra only (not the agent) | Infra-level capture; acceptable gap |
| preInstall onward (routes executing) | Agent | In-agent appender with structured log forwarding |
### New Startup Sequence (Agent Mode)
```
premain()
→ install CamelContext transformer (existing)
→ install SendDynamicAware transformer (existing)
→ [REMOVED: Spring AbstractApplicationContext transformer]
CamelContext.start() intercepted → preInstall()
→ install InterceptStrategy (existing)
→ earlyConnect: ServerConnection + register (minimal) + ChunkedExporter
→ register log appender on root logger (unified, all frameworks)
→ create LogForwarder WITH live exporter
→ set bridge handler on LogEventBridge
→ ✅ log forwarding LIVE before any route processes an exchange
CamelContext.start() completes → postInstall()
→ create EventNotifier (existing)
→ PostStartSetup: diagrams, metrics, re-register with full info, heartbeat, SSE
```
### New Startup Sequence (Extension Mode)
```
@PostConstruct (CameleerConfigAdapter)
→ set system properties from Quarkus config
→ [REMOVED: appender registration and earlyLogForwarder]
configure() (CameleerLifecycle - before CamelContext.start())
→ earlyConnect: ServerConnection + register (minimal) + ChunkedExporter
→ create collector + InterceptStrategy + EventNotifier (existing)
→ register log appender on root logger
→ create LogForwarder WITH live exporter
→ set bridge handler
→ ✅ log forwarding LIVE
CamelContextStartedEvent
→ PostStartSetup: diagrams, metrics, re-register, heartbeat, SSE
```
### Server-Side Cutover
The `logForwarding` capability in the registration/heartbeat capabilities map signals when agent log forwarding is active. The server team can use this to stop ingesting infra-level logs for that instance, avoiding duplicates.
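A minimal sketch of how the server side could act on that flag. The spec only names the `logForwarding` key; the boolean value type, the `CutoverSketch` class, and the `shouldIngestInfraLogs` helper are assumptions made for illustration.

```java
import java.util.Map;

/** Hypothetical server-side helper; not part of the agent or of this spec. */
final class CutoverSketch {

    /**
     * Returns true while infra-level logs should still be ingested for an instance.
     * Assumption: the capability is reported as Boolean.TRUE once agent forwarding is live.
     * A missing key or any other value keeps infra-level ingestion on.
     */
    static boolean shouldIngestInfraLogs(Map<String, Object> capabilities) {
        return !Boolean.TRUE.equals(capabilities.get("logForwarding"));
    }
}
```

Defaulting to "keep ingesting" when the key is absent errs on the side of duplicates rather than lost logs, which seems the safer failure mode during cutover.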
## Changes
### New: `ServerSetup.earlyConnect(CameleerAgentConfig)`
Performs a minimal server registration before routes start. Returns `EarlyConnection(ServerConnection, ChunkedExporter)`, or `null` on failure.
- Creates `ServerConnection` with endpoint + auth token
- Registers with: instanceId, applicationId, environmentId, version, empty routeIds, minimal capabilities
- Creates `ChunkedExporter` wired to the connection
- On failure: logs warning, returns null (agent falls back to `LogExporter`)
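The contract above could look roughly like the following sketch. The `Config` record, the `Transport` seam, and every field shown are hypothetical stand-ins so the example runs without a server; the real `CameleerAgentConfig`, `ServerConnection`, and `ChunkedExporter` are not shown in this spec. The contents of "minimal capabilities" are also unspecified, so an empty map is used.

```java
import java.util.List;
import java.util.Map;

final class EarlyConnectSketch {

    /** Hypothetical subset of the agent config. */
    record Config(String endpoint, String authToken, String instanceId, String applicationId) {}

    /** Hypothetical seam standing in for the real registration call. */
    interface Transport {
        void register(Config config, List<String> routeIds, Map<String, Object> capabilities)
                throws Exception;
    }

    record ServerConnection(Config config, Transport transport) {}
    record ChunkedExporter(ServerConnection connection) {}
    record EarlyConnection(ServerConnection connection, ChunkedExporter exporter) {}

    /** Returns a live connection plus exporter, or null if registration fails. */
    static EarlyConnection earlyConnect(Config config, Transport transport) {
        try {
            ServerConnection connection = new ServerConnection(config, transport);
            // Routes are not known yet, so register with empty routeIds.
            transport.register(config, List.of(), Map.of());
            return new EarlyConnection(connection, new ChunkedExporter(connection));
        } catch (Exception e) {
            // Per the spec: log a warning and let the caller fall back to LogExporter.
            System.err.println("WARN early connect failed: " + e.getMessage());
            return null;
        }
    }
}
```

Returning `null` instead of throwing keeps `preInstall` from aborting application startup when the server is down; the caller only needs a null check to pick the fallback exporter.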
### New: `ServerSetup.setupLogForwarding(config, exporter, appClassLoader)`
Shared method for both agent and extension to register the appender and create a LogForwarder in one call.
### Modified: `ServerSetup.connect(ConnectionContext)`
- `ConnectionContext` gains an `earlyConnection` field (replaces `earlyLogForwarder`)
- If `earlyConnection` is present: reuses `ServerConnection` and `ChunkedExporter`, re-registers with full route IDs and capabilities
- If absent: performs the full connection as before (backward compatibility for edge cases)
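The reuse-or-connect branch can be sketched as follows. All types here are simplified, hypothetical stand-ins, and the `Result` record exists only to make the outcome observable; the real `connect()` also re-registers with full route IDs and capabilities, which is reduced to a comment.

```java
import java.util.List;

final class ConnectSketch {

    record ServerConnection(String endpoint) {}
    record ChunkedExporter(ServerConnection connection) {}
    record EarlyConnection(ServerConnection connection, ChunkedExporter exporter) {}

    /** Hypothetical subset of ConnectionContext; earlyConnection may be null. */
    record ConnectionContext(String endpoint, EarlyConnection earlyConnection, List<String> routeIds) {}

    /** Hypothetical result type, used only by this sketch. */
    record Result(ServerConnection connection, ChunkedExporter exporter, boolean reused) {}

    static Result connect(ConnectionContext ctx) {
        EarlyConnection early = ctx.earlyConnection();
        if (early != null) {
            // Reuse the objects created in preInstall. The real code would
            // re-register here with ctx.routeIds() and the full capabilities.
            return new Result(early.connection(), early.exporter(), true);
        }
        // No early connection: build everything from scratch, as before.
        ServerConnection connection = new ServerConnection(ctx.endpoint());
        return new Result(connection, new ChunkedExporter(connection), false);
    }
}
```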
- `installLogForwarding()` removed (log forwarding already set up in preInstall)
### Modified: `PostStartSetup.Context`
- `earlyLogForwarder` field replaced by `earlyConnection` (ServerSetup.EarlyConnection)
- `logForwarder` passed separately (already created in preInstall)
### Modified: `LogForwarder`
- No-arg constructor **removed**
- Constructor always requires a non-null `Exporter`
- `setExporter()` **kept** for one case: the early connect failed, so the forwarder started with `LogExporter`, and the later connect succeeded, so the exporter is swapped to `ChunkedExporter`
- Internal queue and flush scheduler unchanged
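A minimal sketch of the resulting constructor contract and the exporter swap. The `Exporter` interface shape is an assumption, and the internal queue and flush scheduler are deliberately left out, so `forward()` here exports synchronously, unlike the real class.

```java
import java.util.Objects;

/** Assumed shape of the exporter abstraction; the real interface is not shown in the spec. */
interface Exporter {
    void export(String entry);
}

final class LogForwarder {

    // volatile so a swap made by the connecting thread is visible to logging threads
    private volatile Exporter exporter;

    /** No no-arg constructor: a forwarder can never exist without an exporter. */
    LogForwarder(Exporter exporter) {
        this.exporter = Objects.requireNonNull(exporter, "exporter");
    }

    /** Used only to upgrade from the local fallback to the server exporter. */
    void setExporter(Exporter exporter) {
        this.exporter = Objects.requireNonNull(exporter, "exporter");
    }

    void forward(String entry) {
        exporter.export(entry);
    }
}
```

Rejecting `null` in both places means `forward()` never needs a null check or a buffer for the "no exporter yet" state.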
### Simplified: `BridgeAccess`
- `earlyBuffer` (ConcurrentLinkedQueue) **removed**
- `drainBuffer()` **removed**
- `MAX_BUFFER_SIZE` **removed**
- `forward()` calls handler directly; if no handler, entry is silently dropped
- `setHandler()`, `getHandler()`, `resolveField()` unchanged
- Test utility methods updated
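The simplified `forward()` behaviour can be sketched as below. The class name `BridgeAccessSketch`, the static `volatile` field, and the `Consumer<Object>` handler type are assumptions; the real class resolves its handler across classloaders via `resolveField()`, which is omitted here.

```java
import java.util.function.Consumer;

final class BridgeAccessSketch {

    // Assumed storage for the handler; no early buffer sits next to it anymore.
    private static volatile Consumer<Object> handler;

    static void setHandler(Consumer<Object> newHandler) {
        handler = newHandler;
    }

    static Consumer<Object> getHandler() {
        return handler;
    }

    /** Calls the handler directly; with no handler the entry is silently dropped. */
    static void forward(Object entry) {
        Consumer<Object> current = handler; // read once so a concurrent reset cannot cause an NPE
        if (current != null) {
            current.accept(entry);
        }
    }
}
```

Entries forwarded before `setHandler()` are not replayed later, which is the behaviour the new "silent drop when no handler" test would pin down.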
### Removed: `SpringContextAdvice`
- Entire class deleted (only purpose was early log registration)
### Removed: `LogForwardingConsumer`
- Entire class deleted (was only needed by SpringContextAdvice)
### Removed: `SpringContextTransformer`
- Entire class deleted (ByteBuddy transformer for SpringContextAdvice)
### Modified: `CameleerAgent.premain()`
- Remove `AbstractApplicationContext` transformer installation (lines 67-72)
### Modified: `CameleerHookInstaller`
- `preInstall()`: calls `earlyConnect()`, registers appender, creates `LogForwarder` with live exporter, stores `EarlyConnection` for postInstall
- `postInstall()`: no longer picks up `SpringContextAdvice.earlyLogForwarder`; passes `EarlyConnection` + `logForwarder` to `PostStartSetup`
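The ordering inside `preInstall()` could be sketched like this. Every type is a simplified, hypothetical stand-in: `earlyConnect` is passed in as a `Supplier`, the exporter is a one-method interface, and the bridge handler is a plain `Consumer<String>`. Appender registration is omitted because it depends on the logging framework.

```java
import java.util.Objects;
import java.util.function.Consumer;
import java.util.function.Supplier;

final class PreInstallSketch {

    interface Exporter {
        void export(String entry);
    }

    /** Stand-in for ServerSetup.EarlyConnection; only the exporter matters here. */
    record EarlyConnection(Exporter chunkedExporter) {}

    /** Stand-in for LogForwarder; the real queue and flush scheduler are omitted. */
    static final class LogForwarder {
        private final Exporter exporter;

        LogForwarder(Exporter exporter) {
            this.exporter = Objects.requireNonNull(exporter, "exporter");
        }

        void forward(String entry) {
            exporter.export(entry);
        }
    }

    EarlyConnection earlyConnection;  // stored for postInstall; null if the early connect failed
    LogForwarder logForwarder;        // stored for postInstall
    Consumer<String> bridgeHandler;   // what the log bridge would invoke per entry

    void preInstall(Supplier<EarlyConnection> earlyConnect, Exporter localFallback) {
        // 1. Try to connect before any route runs.
        earlyConnection = earlyConnect.get();
        // 2. Use the live exporter if there is one, otherwise the local fallback.
        Exporter exporter = (earlyConnection != null)
                ? earlyConnection.chunkedExporter()
                : localFallback;
        // 3. The forwarder is created with an exporter from the start.
        logForwarder = new LogForwarder(exporter);
        // 4. Only now is the bridge pointed at the forwarder.
        bridgeHandler = logForwarder::forward;
    }
}
```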
### Modified: `CameleerConfigAdapter` (Extension)
- Remove `earlyLogForwarder` field and `getEarlyLogForwarder()`
- Remove appender registration from `@PostConstruct`
### Modified: `CameleerLifecycle` (Extension)
- `configure()`: calls `earlyConnect()`, registers appender, creates `LogForwarder`
- `onCamelContextStarted()`: passes `EarlyConnection` + `logForwarder` to `PostStartSetup`
## Test Changes
- `LogForwarderTest`: remove `deferredExporter_buffersUntilExporterSet` test (deferred pattern no longer exists)
- `BridgeAccessTest`: remove buffer tests (`forward_buffersWhenNoHandler`, `forward_drainsBufferWhenHandlerSet`, `forward_bufferCapped`); add test for silent drop when no handler
- Remaining tests unchanged (they already pass an exporter to the constructor)
## Constraints
- Must not steal the customer's log stream — appender adds alongside existing appenders, never replaces
- Must work with Datadog, Elastic, and other APM agents that also add appenders
- Extension path (Quarkus native, no agent) remains separate but follows the same simplified pattern
- If the server is unreachable in `preInstall`, the agent falls back gracefully to `LogExporter`, and `postInstall` retries the connection
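The "adds alongside, never replaces" constraint can be illustrated with a small sketch. The spec does not say which logging framework the appender targets, so `java.util.logging` is used here purely because it ships with the JDK; `ForwardingHandler` and `AppenderRegistrationSketch` are hypothetical names.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.logging.Handler;
import java.util.logging.LogRecord;
import java.util.logging.Logger;

/** Stand-in for the agent's appender; it only collects messages. */
final class ForwardingHandler extends Handler {

    final List<String> forwarded = new CopyOnWriteArrayList<>();

    @Override
    public void publish(LogRecord record) {
        if (isLoggable(record)) {
            forwarded.add(record.getMessage());
        }
    }

    @Override
    public void flush() {
        // nothing is buffered in this sketch
    }

    @Override
    public void close() {
        // no resources to release
    }
}

final class AppenderRegistrationSketch {

    /** Adds the forwarding handler without touching handlers that are already attached. */
    static ForwardingHandler register(Logger logger) {
        ForwardingHandler handler = new ForwardingHandler();
        logger.addHandler(handler); // additive: existing handlers stay in place
        return handler;
    }
}
```

Because registration only ever calls `addHandler`, handlers installed by the application or by another agent keep receiving the same records.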