camel-ops-platform/docs/camel_agent_spec.md

# Apache Camel Java Agent Specification

## 1. Overview
To achieve "zero-code change" data extraction for Apache Camel applications (similar to the nJAMS agent for MuleSoft/TIBCO), we use the JVM's `java.lang.instrument` API via the `-javaagent` flag. The agent is loaded at JVM startup, hooks into Camel context initialization, and injects native Camel telemetry SPIs (`InterceptStrategy` and `EventNotifier`) without requiring developers to alter their Camel routes or application code.

## 2. Technical Implementation: The `-javaagent` Hook

### JVM Instrumentation & Bytecode Manipulation
When the application starts with `-javaagent:/path/to/camel-agent.jar`, the JVM invokes the agent's `premain` method before the application's `main` method.
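
As a rough illustration of the entry point, the following is a minimal sketch of a `premain` that registers a `ClassFileTransformer` using only JDK classes. The class name `CamelAgent` and the helper `isTargetClass` are hypothetical, the transformer only logs instead of rewriting bytecode, and the agent jar's manifest is assumed to carry a matching `Premain-Class: CamelAgent` entry.

```java
import java.lang.instrument.ClassFileTransformer;
import java.lang.instrument.Instrumentation;
import java.security.ProtectionDomain;

final class CamelAgent {

    // The JVM passes class names in internal form (slashes, not dots).
    // Assumption: these are the two context classes worth matching.
    static boolean isTargetClass(String internalName) {
        return "org/apache/camel/impl/engine/AbstractCamelContext".equals(internalName)
            || "org/apache/camel/impl/DefaultCamelContext".equals(internalName);
    }

    // Invoked by the JVM before the application's main method.
    public static void premain(String agentArgs, Instrumentation inst) {
        inst.addTransformer(new ClassFileTransformer() {
            @Override
            public byte[] transform(ClassLoader loader, String className,
                                    Class<?> classBeingRedefined,
                                    ProtectionDomain protectionDomain,
                                    byte[] classfileBuffer) {
                if (className != null && isTargetClass(className)) {
                    // A real agent would return rewritten bytecode here (e.g., produced by ByteBuddy).
                    System.err.println("[camel-agent] would instrument " + className);
                }
                return null; // null tells the JVM to keep the original bytecode
            }
        });
    }
}
```

Returning `null` for every class keeps the sketch side-effect free; in practice ByteBuddy's `AgentBuilder` would usually replace the hand-written transformer.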

1. **Bytecode Interception (ByteBuddy / ASM):**
   The agent registers a `ClassFileTransformer` via the `Instrumentation` API. Using a library like [ByteBuddy](https://bytebuddy.net/) (which is safer and easier to maintain than raw ASM), we instrument `org.apache.camel.impl.engine.AbstractCamelContext` (Camel 3.x and later) or `org.apache.camel.impl.DefaultCamelContext` (Camel 2.x), depending on the Camel version found on the classpath.

2. **Hooking Context Initialization:**
   We intercept the `start()` or `build()` methods of the `CamelContext`.

   ```java
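   // Sketch of a ByteBuddy advice class; CamelAgentRegistrar is the agent's own helper
   import net.bytebuddy.asm.Advice;
   import org.apache.camel.CamelContext;

   public class CamelContextStartAdvice {
       // Runs on entry to the instrumented method; `context` is the CamelContext being started
       @Advice.OnMethodEnter
       public static void onStart(@Advice.This CamelContext context) {
           CamelAgentRegistrar.register(context);
       }
   }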
   ```

3. **Registering Native Camel SPIs:**
   Inside `CamelAgentRegistrar.register(context)`, we inject our monitoring components directly into the context before the routes start processing messages. Because we are inside the JVM, we have direct access to the `CamelContext` object:

   *   **`EventNotifier`:** Added via `context.getManagementStrategy().addEventNotifier(...)`. This captures macro lifecycle events (`ExchangeCreatedEvent`, `ExchangeCompletedEvent`, `ExchangeFailedEvent`). These fire once per exchange rather than once per processor, so the overhead is low.
   *   **`InterceptStrategy`:** Added via `context.addInterceptStrategy(...)`; the exact accessor varies by Camel version (for example, Camel 3.x exposes it on `ExtendedCamelContext`). This wraps every `Processor` in the route definition. It allows us to capture the payload (`Exchange.getIn().getBody()`), headers, and properties before and after each discrete step in the integration flow.

This approach provides zero-code instrumentation while remaining "Camel native": telemetry is collected through Camel's own SPIs, which avoids the fragility of intercepting arbitrary application methods.
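
One detail worth sketching is idempotency: the advice fires on every invocation of the hooked method, including a repeated `start()` call on a context that is already running, so `CamelAgentRegistrar.register` should install the SPIs only once per context. The version below is a minimal sketch under stated assumptions: the parameter is typed `Object` only so the example compiles without Camel on the classpath, and `installSpis` is a hypothetical placeholder for the `EventNotifier` and `InterceptStrategy` registration described above.

```java
import java.util.Collections;
import java.util.Set;
import java.util.WeakHashMap;

final class CamelAgentRegistrar {

    // Weak keys, so a stopped CamelContext can still be garbage collected.
    // Relies on the context not overriding equals/hashCode (assumption).
    private static final Set<Object> SEEN =
        Collections.synchronizedSet(
            Collections.newSetFromMap(new WeakHashMap<Object, Boolean>()));

    /** Returns true only the first time a given context instance is seen. */
    static boolean register(Object context) {
        if (!SEEN.add(context)) {
            return false; // already instrumented, nothing to do
        }
        try {
            installSpis(context);
        } catch (Throwable t) {
            // Monitoring must never break application startup: log and continue.
            System.err.println("[camel-agent] registration failed: " + t);
        }
        return true;
    }

    // Hypothetical placeholder: a real agent would add its EventNotifier
    // and InterceptStrategy to the context here.
    static void installSpis(Object context) {
    }
}
```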

## 3. High-Performance Payload Extraction

Capturing payloads at every step of a Camel route can introduce severe GC (Garbage Collection) pauses, CPU spikes, and memory bloat. We must extract data cheaply.

### Serialization Strategy

1. **Type-Aware Truncation (The First Line of Defense):**
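   Before attempting deep serialization, we inspect the payload type. If it is a `String`, `byte[]`, or `InputStream` (all common in Camel), we copy only the first *N* bytes (e.g., a 4 KB limit) and ignore the rest. An `InputStream` must not be consumed by the agent, or downstream processors would see an empty body; the agent reads it only when it can be rewound afterwards (for example, a Camel `StreamCache` body, which supports `reset()`) and skips it otherwise. We use cached/pooled buffers to read streams without allocating new byte arrays per exchange.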

2. **Fast Object Serialization (Kryo):**
   If the payload is a complex POJO, standard Java serialization or Jackson/Gson JSON mapping is typically too slow and too allocation-heavy for per-step capture.
   *   We embed [Kryo](https://github.com/EsotericSoftware/kryo), a fast, efficient object graph serialization framework.
   *   Kryo instances are not thread-safe and are costly to construct, so we pool them (e.g., one per thread via `ThreadLocal`, or a fast object pool). Reused instances serialize quickly and produce compact binary output.
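
The type-aware truncation in step 1 can be sketched with JDK classes alone. This is an illustration, not the agent's actual code: `PayloadSnapshot`, `capture`, and the 4096-byte cap are assumed names and values, and `mark`/`reset` stands in for whatever rewind mechanism the real body type offers.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class PayloadSnapshot {
    static final int MAX_BYTES = 4096; // assumed 4 KB cap

    // One reusable scratch buffer per thread for stream reads.
    private static final ThreadLocal<byte[]> SCRATCH =
        ThreadLocal.withInitial(() -> new byte[MAX_BYTES]);

    /** Returns at most MAX_BYTES of the body, or null if no cheap, non-destructive copy is possible. */
    static byte[] capture(Object body) {
        if (body instanceof String) {
            String s = (String) body;
            // Cut the string first so we never encode more than MAX_BYTES chars.
            byte[] utf8 = s.substring(0, Math.min(s.length(), MAX_BYTES))
                           .getBytes(StandardCharsets.UTF_8);
            // Multi-byte characters can push us over the cap; the cut may split one.
            return utf8.length <= MAX_BYTES ? utf8 : Arrays.copyOf(utf8, MAX_BYTES);
        }
        if (body instanceof byte[]) {
            byte[] b = (byte[]) body;
            return Arrays.copyOf(b, Math.min(b.length, MAX_BYTES));
        }
        if (body instanceof InputStream) {
            InputStream in = (InputStream) body;
            if (!in.markSupported()) {
                return null; // reading would consume the payload, so skip it
            }
            byte[] buf = SCRATCH.get();
            try {
                in.mark(MAX_BYTES);
                int total = 0;
                try {
                    int n;
                    while (total < MAX_BYTES
                            && (n = in.read(buf, total, MAX_BYTES - total)) > 0) {
                        total += n;
                    }
                } finally {
                    in.reset(); // rewind so the route still sees the full body
                }
                // Copy out of the shared scratch buffer; a real agent might
                // write straight into a pre-allocated ring-buffer slot instead.
                return Arrays.copyOf(buf, total);
            } catch (IOException e) {
                return null; // capture is best-effort
            }
        }
        return null; // complex types go to the object serializer instead
    }
}
```

Returning `null` rather than throwing keeps capture best-effort, in line with the rule that monitoring must never disturb the route.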

### Memory & GC Pause Prevention

1. **Asynchronous Ring Buffers (LMAX Disruptor):**
   Instead of creating new Event objects on the heap for every intercepted payload and doing synchronous I/O, the `InterceptStrategy` writes the extracted (and truncated) data directly into a pre-allocated, lock-free Ring Buffer (e.g., LMAX Disruptor or a simple circular array).
   *   A dedicated, low-priority background thread consumes from this ring buffer, batches the events, and flushes them to the local OpenTelemetry (OTel) Collector or log storage.
   *   **Load Shedding:** If the system is under extreme load and the buffer fills up, the agent **drops** the telemetry data rather than blocking the Camel routing threads. Monitoring must never crash the host application.

2. **Conditional Capture (Dynamic Sampling):**
   To further reduce overhead, the agent queries the local Appliance Hub for rules:
   *   *Error-Only Mode:* Payloads are cached in a tiny thread-local circular buffer and only serialized/retained if the Exchange fails (`Exchange.isFailed()`).
   *   *Sampling:* Only a sample of exchanges (e.g., 1 in 100) is deeply inspected.
   *   *Step Filtering:* Only capture payloads at the ingress and egress endpoints of the route, ignoring intermediate data transformation steps unless debug mode is triggered.
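
The load-shedding and sampling behavior above can be sketched as follows. This is a simplified stand-in under stated assumptions: a JDK `ArrayBlockingQueue` replaces the Disruptor ring buffer (it is lock-based and does not pre-allocate events), and `TelemetryPipeline`, `shouldSample`, `publish`, and `startConsumer` are hypothetical names. Error-only mode and step filtering are not shown.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.Consumer;

final class TelemetryPipeline {
    private final BlockingQueue<byte[]> queue;
    private final int sampleEvery; // capture 1 in N exchanges
    private final AtomicLong seen = new AtomicLong();
    private final AtomicLong dropped = new AtomicLong();

    TelemetryPipeline(int capacity, int sampleEvery) {
        this.queue = new ArrayBlockingQueue<>(capacity);
        this.sampleEvery = sampleEvery;
    }

    /** Sampling decision: true for the 1st, (N+1)th, (2N+1)th ... exchange. */
    boolean shouldSample() {
        return seen.getAndIncrement() % sampleEvery == 0;
    }

    /** Called on the routing thread. Never waits for space: a full buffer drops the event. */
    boolean publish(byte[] event) {
        if (queue.offer(event)) {
            return true;
        }
        dropped.incrementAndGet();
        return false;
    }

    long droppedCount() {
        return dropped.get();
    }

    /** Starts a daemon, minimum-priority consumer that hands batches to the exporter. */
    Thread startConsumer(int maxBatch, Consumer<List<byte[]>> exporter) {
        Thread t = new Thread(() -> {
            List<byte[]> batch = new ArrayList<>(maxBatch);
            try {
                while (!Thread.currentThread().isInterrupted()) {
                    byte[] first = queue.poll(100, TimeUnit.MILLISECONDS);
                    if (first == null) {
                        continue; // nothing to flush yet
                    }
                    batch.add(first);
                    queue.drainTo(batch, maxBatch - 1);
                    try {
                        exporter.accept(batch); // exporter must not retain the list
                    } catch (RuntimeException e) {
                        // Export failures are swallowed so the consumer keeps running.
                    }
                    batch.clear();
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "camel-agent-export");
        t.setDaemon(true); // never keeps the host JVM alive
        t.setPriority(Thread.MIN_PRIORITY);
        t.start();
        return t;
    }
}
```

An interceptor would call `shouldSample()` once per exchange and `publish(...)` only for sampled ones; exposing `droppedCount()` as a metric makes shed load visible instead of silent.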