Apache Camel Java Agent Specification

1. Overview

To achieve "zero-code change" data extraction for Apache Camel applications (similar to the nJAMS agent for MuleSoft/TIBCO), we use the JVM's java.lang.instrument API via the -javaagent flag. The JVM loads the agent at startup; the agent then hooks into CamelContext initialization and registers native Camel telemetry SPIs (InterceptStrategy and EventNotifier), so developers do not have to alter their Camel routes or application code.

2. Technical Implementation: The -javaagent Hook

JVM Instrumentation & Bytecode Manipulation

When the application starts with -javaagent:/path/to/camel-agent.jar, the JVM invokes the agent's premain method before the application's main method.
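
To make the startup sequence concrete, here is a minimal sketch of the entry point using only the JDK's java.lang.instrument API. The names CamelAgent, ContextTransformer, and isTarget are hypothetical, and the transformer deliberately leaves the bytecode untouched; in the agent described here, ByteBuddy would perform the actual rewrite. The JVM locates premain through the Premain-Class attribute in the agent JAR's manifest.

```java
import java.lang.instrument.ClassFileTransformer;
import java.lang.instrument.Instrumentation;
import java.security.ProtectionDomain;

// Hypothetical agent entry class; a real agent would declare it public and
// name it in the Premain-Class attribute of the agent JAR's manifest.
final class CamelAgent {

    // Internal (slash-separated) name of the class we want to instrument.
    static final String TARGET = "org/apache/camel/impl/engine/AbstractCamelContext";

    // Invoked by the JVM before the application's main method.
    public static void premain(String agentArgs, Instrumentation inst) {
        inst.addTransformer(new ContextTransformer());
    }

    static boolean isTarget(String internalName) {
        return TARGET.equals(internalName);
    }

    static final class ContextTransformer implements ClassFileTransformer {
        @Override
        public byte[] transform(ClassLoader loader, String className,
                                Class<?> classBeingRedefined,
                                ProtectionDomain protectionDomain,
                                byte[] classfileBuffer) {
            if (!isTarget(className)) {
                return null; // null tells the JVM to keep the class unchanged
            }
            // Placeholder: the bytecode rewrite (e.g. via ByteBuddy) would go here.
            return null;
        }
    }
}
```

Returning null for every class the agent does not care about keeps the transformer cheap, since the JVM calls it for each class it loads.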

  1. Bytecode Interception (ByteBuddy / ASM): The agent registers a ClassFileTransformer via the Instrumentation API. Using a library like ByteBuddy (generally safer and easier to maintain than raw ASM), we instrument the context implementation class, which depends on the Camel version: org.apache.camel.impl.engine.AbstractCamelContext in Camel 3 and later, or org.apache.camel.impl.DefaultCamelContext in Camel 2.x.

  2. Hooking Context Initialization: We intercept the start() or build() methods of the CamelContext.

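    // Sketch of a ByteBuddy advice class (class name is illustrative); its body is inlined into start().
    import net.bytebuddy.asm.Advice;
    import org.apache.camel.CamelContext;

    public class CamelContextStartAdvice {
        @Advice.OnMethodEnter
        public static void onStart(@Advice.This CamelContext context) {
            CamelAgentRegistrar.register(context); // agent-side helper, see step 3
        }
    }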
    
  3. Registering Native Camel SPIs: Inside CamelAgentRegistrar.register(context), we inject our monitoring components directly into the context before the routes start processing messages. Because we are inside the JVM, we have direct access to the CamelContext object:

    • EventNotifier: Added via context.getManagementStrategy().addEventNotifier(...). This captures exchange-level lifecycle events (ExchangeCreatedEvent, ExchangeCompletedEvent, ExchangeFailedEvent) with low overhead.
    • InterceptStrategy: Added via context.addInterceptStrategy(...); in Camel 3 and later this method lives on ExtendedCamelContext rather than on CamelContext itself. The strategy wraps every Processor in the route definition, which lets us capture the payload (Exchange.getIn().getBody()), headers, and properties before and after each discrete step in the integration flow.

This approach provides zero-code instrumentation while remaining entirely "Camel native", and it avoids the fragility of intercepting arbitrary application methods.
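
One detail of the registration step is worth sketching: a context can be stopped and started again, so the start() hook may fire more than once for the same object, and a failure inside the agent should never escape into application startup. The following JDK-only sketch works under those assumptions. OnceRegistrar and its installer callback are hypothetical names; in the agent, the callback is where the EventNotifier and InterceptStrategy would be added.

```java
import java.util.Collections;
import java.util.Set;
import java.util.WeakHashMap;
import java.util.function.Consumer;

// Hypothetical helper: runs an installer at most once per context object.
final class OnceRegistrar {

    // Weak keys so a discarded context can still be garbage collected.
    private static final Set<Object> SEEN = Collections.synchronizedSet(
            Collections.newSetFromMap(new WeakHashMap<Object, Boolean>()));

    // Returns true only when the installer ran and completed for this context.
    static boolean register(Object context, Consumer<Object> installer) {
        if (!SEEN.add(context)) {
            return false; // already handled on an earlier start()
        }
        try {
            installer.accept(context);
            return true;
        } catch (RuntimeException e) {
            // Monitoring must not break application startup; report and move on.
            System.err.println("camel-agent: registration failed: " + e);
            return false;
        }
    }
}
```

A context whose installer failed stays marked as seen, so the sketch does not retry on later restarts; whether to retry is a design choice the spec leaves open.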

3. High-Performance Payload Extraction

Capturing payloads at every step of a Camel route can introduce severe GC (Garbage Collection) pauses, CPU spikes, and memory bloat. We must extract data cheaply.

Serialization Strategy

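  1. Type-Aware Truncation (The First Line of Defense): Before attempting deep serialization, we inspect the payload type. If it is a String, byte[], or InputStream (all common in Camel), we read only the first N bytes (e.g., a 4 KB limit) and ignore the rest. An InputStream must not be consumed in the process, because downstream processors still need the full body; the read therefore has to go through a re-readable view (for example Camel's StreamCache, or mark/reset where the stream supports it). We use cached/pooled buffers to read streams without allocating a new byte array per exchange.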

  2. Fast Object Serialization (Kryo): If the payload is a complex POJO, standard Java serialization or Jackson/Gson JSON mapping is typically too slow and allocates too much memory for per-step capture.

    • We embed Kryo, a fast, efficient object graph serialization framework.
    • Kryo instances are not thread-safe, so they are pooled (e.g., one per thread via ThreadLocal, or a fast object pool) and reused. A reused instance avoids repeated setup cost and produces compact binary output.
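
The type-aware truncation step can be sketched with the JDK alone. PayloadTruncator and its method names are hypothetical, the limit is assumed to be non-negative, and the stream branch relies on mark/reset as a stand-in for whatever re-readable view the agent ends up using; streams that support neither are skipped rather than consumed.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper illustrating type-aware truncation.
final class PayloadTruncator {

    static final int MAX_BYTES = 4096; // the 4 KB example limit

    // One scratch buffer per thread, reused for every stream read.
    private static final ThreadLocal<byte[]> SCRATCH =
            ThreadLocal.withInitial(() -> new byte[MAX_BYTES]);

    // Returns at most `limit` bytes of the body, or null when the type is not
    // cheap to capture (the caller would then fall back to object serialization).
    static byte[] capture(Object body, int limit) {
        if (body instanceof byte[]) {
            byte[] bytes = (byte[]) body;
            return Arrays.copyOf(bytes, Math.min(bytes.length, limit));
        }
        if (body instanceof String) {
            String s = (String) body;
            // Cut characters first so a huge string is never fully encoded.
            String head = s.length() > limit ? s.substring(0, limit) : s;
            byte[] utf8 = head.getBytes(StandardCharsets.UTF_8);
            // A byte-level cut may split the last multi-byte character.
            return Arrays.copyOf(utf8, Math.min(utf8.length, limit));
        }
        if (body instanceof InputStream) {
            try {
                return readHead((InputStream) body, limit);
            } catch (IOException e) {
                return null; // capture is best effort
            }
        }
        return null;
    }

    // Peeks at the head of a stream and rewinds it so the route still sees
    // every byte. Streams without mark/reset support are skipped.
    static byte[] readHead(InputStream in, int limit) throws IOException {
        if (!in.markSupported()) {
            return null;
        }
        byte[] scratch = SCRATCH.get();
        int max = Math.min(limit, scratch.length);
        in.mark(max);
        int total = 0;
        try {
            int n;
            while (total < max && (n = in.read(scratch, total, max - total)) > 0) {
                total += n;
            }
        } finally {
            in.reset();
        }
        // The copy is the retained snippet; a real agent could instead write
        // straight from the scratch buffer into its ring buffer slot.
        return Arrays.copyOf(scratch, total);
    }
}
```

For example, capture(new byte[10000], PayloadTruncator.MAX_BYTES) keeps only the first 4096 bytes, and a ByteArrayInputStream passed to capture can still be read from its first byte afterwards.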

Memory & GC Pause Prevention

  1. Asynchronous Ring Buffers (LMAX Disruptor): Instead of creating new Event objects on the heap for every intercepted payload and doing synchronous I/O, the InterceptStrategy writes the extracted (and truncated) data directly into a pre-allocated, lock-free Ring Buffer (e.g., LMAX Disruptor or a simple circular array).

    • A dedicated, low-priority background thread consumes from this ring buffer, batches the events, and flushes them to the local OpenTelemetry (OTel) Collector or to log storage.
    • Load Shedding: If the system is under extreme load and the buffer fills up, the agent drops the telemetry data rather than blocking the Camel routing threads. Monitoring must never crash the host application.

  2. Conditional Capture (Dynamic Sampling): To further reduce overhead, the agent queries the local Appliance Hub for rules:

    • Error-Only Mode: Payloads are cached in a tiny thread-local circular buffer and only serialized/retained if the Exchange fails (Exchange.isFailed()).
    • Sampling: Only 1 in 100 exchanges is deeply inspected.
    • Step Filtering: Only capture payloads at the ingress and egress endpoints of the route, ignoring intermediate data transformation steps unless debug mode is triggered.
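
The load-shedding and conditional-capture rules in this section can be sketched with JDK classes only. Note the substitution: java.util.concurrent.ArrayBlockingQueue stands in for the LMAX Disruptor here. It is bounded and its offer() never blocks, but it is lock-based and not pre-allocated, so it illustrates the drop-on-full behaviour rather than the Disruptor's performance characteristics. TelemetryBuffer, CapturePolicy, and all their members are hypothetical names, and how rules actually arrive from the Appliance Hub is not shown.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.atomic.LongAdder;
import java.util.function.Consumer;

// Hypothetical bounded hand-off between routing threads and the flusher.
final class TelemetryBuffer {

    private final ArrayBlockingQueue<byte[]> queue;
    private final LongAdder dropped = new LongAdder();

    TelemetryBuffer(int capacity) {
        this.queue = new ArrayBlockingQueue<>(capacity);
    }

    // Called on routing threads. offer() returns immediately, so a full
    // buffer costs the route nothing more than a dropped event.
    boolean publish(byte[] event) {
        boolean accepted = queue.offer(event);
        if (!accepted) {
            dropped.increment();
        }
        return accepted;
    }

    long droppedCount() {
        return dropped.sum();
    }

    // Called in a loop by the background thread: drains up to maxBatch
    // events and hands them to the sink in one call.
    int flushOnce(int maxBatch, Consumer<List<byte[]>> sink) {
        List<byte[]> batch = new ArrayList<>();
        queue.drainTo(batch, maxBatch);
        if (!batch.isEmpty()) {
            sink.accept(batch);
        }
        return batch.size();
    }
}

// Hypothetical capture rules, as they might arrive from the Appliance Hub.
final class CapturePolicy {

    private final int sampleEvery;   // 100 means 1 in 100 exchanges
    private final boolean edgesOnly; // capture only ingress/egress steps
    private final boolean debug;     // debug mode lifts the step filter
    private final boolean errorOnly; // retain payloads only for failures
    private final AtomicLong seen = new AtomicLong();

    CapturePolicy(int sampleEvery, boolean edgesOnly, boolean debug, boolean errorOnly) {
        if (sampleEvery < 1) {
            throw new IllegalArgumentException("sampleEvery must be >= 1");
        }
        this.sampleEvery = sampleEvery;
        this.edgesOnly = edgesOnly;
        this.debug = debug;
        this.errorOnly = errorOnly;
    }

    // Once per exchange: true for the first exchange and then for every
    // sampleEvery-th exchange after it.
    boolean sampleExchange() {
        return seen.getAndIncrement() % sampleEvery == 0;
    }

    // Once per step of a sampled exchange.
    boolean captureStep(boolean ingressOrEgress) {
        return debug || !edgesOnly || ingressOrEgress;
    }

    // At exchange completion; `failed` would come from Exchange.isFailed().
    boolean retain(boolean failed) {
        return !errorOnly || failed;
    }
}
```

Counting dropped events, as droppedCount() does, gives operators a way to see when load shedding is active without adding any work to the routing threads beyond a counter increment.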