.planning/phases/02-transaction-search-diagrams/02-01-PLAN.md

---
phase: 02-transaction-search-diagrams
plan: 01
type: execute
wave: 1
depends_on: []
files_modified:
  - clickhouse/init/02-search-columns.sql
  - cameleer-server-core/src/main/java/com/cameleer/server/core/search/SearchRequest.java
  - cameleer-server-core/src/main/java/com/cameleer/server/core/search/SearchResult.java
  - cameleer-server-core/src/main/java/com/cameleer/server/core/search/SearchEngine.java
  - cameleer-server-core/src/main/java/com/cameleer/server/core/search/SearchService.java
  - cameleer-server-core/src/main/java/com/cameleer/server/core/search/ExecutionSummary.java
  - cameleer-server-core/src/main/java/com/cameleer/server/core/detail/DetailService.java
  - cameleer-server-core/src/main/java/com/cameleer/server/core/detail/ExecutionDetail.java
  - cameleer-server-core/src/main/java/com/cameleer/server/core/detail/ProcessorNode.java
  - cameleer-server-core/src/main/java/com/cameleer/server/core/storage/ExecutionRepository.java
  - cameleer-server-app/src/main/java/com/cameleer/server/app/storage/ClickHouseExecutionRepository.java
  - cameleer-server-app/src/test/java/com/cameleer/server/app/AbstractClickHouseIT.java
autonomous: true
requirements:
  - SRCH-01
  - SRCH-02
  - SRCH-03
  - SRCH-04
  - SRCH-05
  - DIAG-01
  - DIAG-02

must_haves:
  truths:
    - "ClickHouse schema has columns for exchange bodies, headers, processor depths, parent indexes, diagram content hash"
    - "Ingested route executions populate depth, parent index, exchange data, and diagram hash columns"
    - "SearchEngine interface exists in core module for future OpenSearch swap"
    - "SearchRequest supports all filter combinations: status, time range, duration range, correlationId, text, per-field text"
    - "SearchResult envelope wraps paginated data with total, offset, limit"
  artifacts:
    - path: "clickhouse/init/02-search-columns.sql"
      provides: "Schema extension DDL for Phase 2 columns and skip indexes"
      contains: "exchange_bodies"
    - path: "cameleer-server-core/src/main/java/com/cameleer/server/core/search/SearchEngine.java"
      provides: "Search backend abstraction interface"
      exports: ["SearchEngine"]
    - path: "cameleer-server-core/src/main/java/com/cameleer/server/core/search/SearchRequest.java"
      provides: "Immutable search criteria record"
      exports: ["SearchRequest"]
    - path: "cameleer-server-app/src/main/java/com/cameleer/server/app/storage/ClickHouseExecutionRepository.java"
      provides: "Extended with new columns in INSERT, plus query methods"
      min_lines: 100
  key_links:
    - from: "cameleer-server-core/src/main/java/com/cameleer/server/core/search/SearchService.java"
      to: "SearchEngine"
      via: "constructor injection"
      pattern: "SearchEngine"
    - from: "cameleer-server-app/src/main/java/com/cameleer/server/app/storage/ClickHouseExecutionRepository.java"
      to: "clickhouse/init/02-search-columns.sql"
      via: "INSERT and SELECT SQL matching schema"
      pattern: "exchange_bodies|processor_depths|diagram_content_hash"
---

<objective>
Extend the ClickHouse schema and ingestion path for Phase 2 search capabilities, and create the core domain types and interfaces for the search/detail layer.

Purpose: Phase 2 search and detail endpoints need additional columns in route_executions (exchange data, tree metadata, diagram hash) and a swappable search engine abstraction. This plan lays the foundation that Plans 02 and 03 build upon.

Output: Schema migration SQL, updated ingestion INSERT with new columns, core search/detail domain types, SearchEngine interface.
</objective>

<execution_context>
@C:/Users/Hendrik/.claude/get-shit-done/workflows/execute-plan.md
@C:/Users/Hendrik/.claude/get-shit-done/templates/summary.md
</execution_context>

<context>
@.planning/PROJECT.md
@.planning/ROADMAP.md
@.planning/STATE.md
@.planning/phases/02-transaction-search-diagrams/02-CONTEXT.md
@.planning/phases/02-transaction-search-diagrams/02-RESEARCH.md

@clickhouse/init/01-schema.sql
@cameleer-server-core/src/main/java/com/cameleer/server/core/storage/ExecutionRepository.java
@cameleer-server-core/src/main/java/com/cameleer/server/core/storage/DiagramRepository.java
@cameleer-server-app/src/main/java/com/cameleer/server/app/storage/ClickHouseExecutionRepository.java
@cameleer-server-app/src/test/java/com/cameleer/server/app/AbstractClickHouseIT.java

<interfaces>
<!-- Existing interfaces the executor needs -->

From cameleer-server-core/.../storage/ExecutionRepository.java:
```java
public interface ExecutionRepository {
    void insertBatch(List<RouteExecution> executions);
}
```

From cameleer-server-core/.../storage/DiagramRepository.java:
```java
public interface DiagramRepository {
    void store(RouteGraph graph);
    Optional<RouteGraph> findByContentHash(String contentHash);
    Optional<String> findContentHashForRoute(String routeId, String agentId);
}
```

From cameleer-common (decompiled — key fields):
```java
// RouteExecution: routeId, status (ExecutionStatus enum: COMPLETED/FAILED/RUNNING),
//   startTime (Instant), endTime (Instant), durationMs (long), correlationId, exchangeId,
//   errorMessage, errorStackTrace, processors (List<ProcessorExecution>),
//   inputSnapshot (ExchangeSnapshot), outputSnapshot (ExchangeSnapshot)

// ProcessorExecution: processorId, processorType, status, startTime, endTime, durationMs,
//   children (List<ProcessorExecution>), diagramNodeId,
//   inputSnapshot (ExchangeSnapshot), outputSnapshot (ExchangeSnapshot)

// ExchangeSnapshot: body (String), headers (Map<String,String>), properties (Map<String,String>)

// RouteGraph: routeId, nodes (List<RouteNode>), edges (List<RouteEdge>), processorNodeMapping (Map<String,String>)
// RouteNode: id, label, type (NodeType enum), properties (Map<String,String>)
// RouteEdge: source, target, label
// NodeType enum: ENDPOINT, TO, TO_DYNAMIC, DIRECT, SEDA, PROCESSOR, BEAN, LOG, SET_HEADER, SET_BODY,
//   TRANSFORM, MARSHAL, UNMARSHAL, CHOICE, WHEN, OTHERWISE, SPLIT, AGGREGATE, MULTICAST,
//   FILTER, RECIPIENT_LIST, ROUTING_SLIP, DYNAMIC_ROUTER, LOAD_BALANCE, THROTTLE, DELAY,
//   ERROR_HANDLER, ON_EXCEPTION, TRY_CATCH, DO_TRY, DO_CATCH, DO_FINALLY, WIRE_TAP,
//   ENRICH, POLL_ENRICH, SORT, RESEQUENCE, IDEMPOTENT_CONSUMER, CIRCUIT_BREAKER, SAGA, LOOP
```

Existing ClickHouse schema (01-schema.sql):
```sql
-- route_executions: execution_id, route_id, agent_id, status, start_time, end_time,
--   duration_ms, correlation_id, exchange_id, error_message, error_stacktrace,
--   processor_ids, processor_types, processor_starts, processor_ends,
--   processor_durations, processor_statuses, server_received_at
-- ORDER BY (agent_id, status, start_time, execution_id)
-- PARTITION BY toYYYYMMDD(start_time)
-- Skip indexes: idx_correlation (bloom_filter), idx_error (tokenbf_v1)
```
</interfaces>
</context>

<tasks>

<task type="auto">
  <name>Task 1: Schema extension and core domain types</name>
  <files>
    clickhouse/init/02-search-columns.sql,
    cameleer-server-core/src/main/java/com/cameleer/server/core/search/SearchRequest.java,
    cameleer-server-core/src/main/java/com/cameleer/server/core/search/SearchResult.java,
    cameleer-server-core/src/main/java/com/cameleer/server/core/search/SearchEngine.java,
    cameleer-server-core/src/main/java/com/cameleer/server/core/search/SearchService.java,
    cameleer-server-core/src/main/java/com/cameleer/server/core/search/ExecutionSummary.java,
    cameleer-server-core/src/main/java/com/cameleer/server/core/detail/DetailService.java,
    cameleer-server-core/src/main/java/com/cameleer/server/core/detail/ExecutionDetail.java,
    cameleer-server-core/src/main/java/com/cameleer/server/core/detail/ProcessorNode.java,
    cameleer-server-core/src/main/java/com/cameleer/server/core/storage/ExecutionRepository.java
  </files>
  <action>
    1. Create `clickhouse/init/02-search-columns.sql` with ALTER TABLE statements to add Phase 2 columns to route_executions:
       - `exchange_bodies String DEFAULT ''` — concatenated searchable text of all exchange bodies
       - `exchange_headers String DEFAULT ''` — concatenated searchable text of all exchange headers
       - `processor_depths Array(UInt16) DEFAULT []` — depth of each processor in tree
       - `processor_parent_indexes Array(Int32) DEFAULT []` — parent index (-1 for roots) for tree reconstruction
       - `processor_error_messages Array(String) DEFAULT []` — per-processor error messages
       - `processor_error_stacktraces Array(String) DEFAULT []` — per-processor error stack traces
       - `processor_input_bodies Array(String) DEFAULT []` — per-processor input body snapshots
       - `processor_output_bodies Array(String) DEFAULT []` — per-processor output body snapshots
       - `processor_input_headers Array(String) DEFAULT []` — per-processor input headers (JSON string per element)
       - `processor_output_headers Array(String) DEFAULT []` — per-processor output headers (JSON string per element)
       - `processor_diagram_node_ids Array(String) DEFAULT []` — per-processor diagramNodeId for overlay linking
       - `diagram_content_hash String DEFAULT ''` — links execution to its active diagram version (DIAG-02)
       - Add tokenbf_v1 skip indexes on exchange_bodies and exchange_headers (GRANULARITY 4, same as idx_error)
       - Add tokenbf_v1 skip index on error_stacktrace (it has no index yet, needed for SRCH-05 full-text search across stack traces)

    2. Create core search domain types in `com.cameleer.server.core.search`:
       - `SearchRequest` record: status (String, nullable), timeFrom (Instant), timeTo (Instant), durationMin (Long), durationMax (Long), correlationId (String), text (String — global full-text), textInBody (String), textInHeaders (String), textInErrors (String), offset (int), limit (int). Compact constructor validates: limit defaults to 50 if <= 0, capped at 500; offset defaults to 0 if < 0.
       - `SearchResult<T>` record: data (List<T>), total (long), offset (int), limit (int). Include static factory `empty(int offset, int limit)`.
       - `ExecutionSummary` record: executionId (String), routeId (String), agentId (String), status (String), startTime (Instant), endTime (Instant), durationMs (long), correlationId (String), errorMessage (String), diagramContentHash (String). This is the lightweight list-view DTO — NOT the full processor arrays.
       - `SearchEngine` interface with methods: `SearchResult<ExecutionSummary> search(SearchRequest request)` and `long count(SearchRequest request)`. This is the swappable backend (ClickHouse now, OpenSearch later per user decision).
       - `SearchService` class: plain class (no Spring annotations, same pattern as IngestionService). Constructor takes SearchEngine. `search(SearchRequest)` delegates to engine.search(). This thin orchestration layer allows adding cross-cutting concerns later.
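
    The paging rules and the engine/service split in step 2 could look roughly like the sketch below. It is a minimal illustration, not the final code: the wrapper class `SearchTypesSketch` and the constants `DEFAULT_LIMIT` and `MAX_LIMIT` are hypothetical names used only to keep the example in one file, while the record fields and method signatures follow the list above.

```java
import java.time.Instant;
import java.util.List;
import java.util.Objects;

/** Hypothetical container so the sketch fits in one file; the plan puts each type in its own file. */
final class SearchTypesSketch {

    record SearchRequest(String status, Instant timeFrom, Instant timeTo,
                         Long durationMin, Long durationMax, String correlationId,
                         String text, String textInBody, String textInHeaders, String textInErrors,
                         int offset, int limit) {

        static final int DEFAULT_LIMIT = 50;  // assumed constant name
        static final int MAX_LIMIT = 500;     // assumed constant name

        // Compact constructor: normalize paging values instead of rejecting them.
        SearchRequest {
            if (limit <= 0) limit = DEFAULT_LIMIT;
            if (limit > MAX_LIMIT) limit = MAX_LIMIT;
            if (offset < 0) offset = 0;
        }
    }

    record SearchResult<T>(List<T> data, long total, int offset, int limit) {
        // Empty page that still echoes the paging parameters back to the caller.
        static <T> SearchResult<T> empty(int offset, int limit) {
            return new SearchResult<>(List.of(), 0L, offset, limit);
        }
    }

    /** Lightweight list-view row; deliberately carries no processor arrays. */
    record ExecutionSummary(String executionId, String routeId, String agentId, String status,
                            Instant startTime, Instant endTime, long durationMs,
                            String correlationId, String errorMessage, String diagramContentHash) {}

    /** Swappable backend: ClickHouse now, possibly OpenSearch later. */
    interface SearchEngine {
        SearchResult<ExecutionSummary> search(SearchRequest request);
        long count(SearchRequest request);
    }

    /** Plain class without Spring annotations; wiring happens in the app module. */
    static final class SearchService {
        private final SearchEngine engine;

        SearchService(SearchEngine engine) {
            this.engine = Objects.requireNonNull(engine, "engine");
        }

        SearchResult<ExecutionSummary> search(SearchRequest request) {
            return engine.search(request);
        }
    }
}
```

    Normalizing in the compact constructor means every caller, including a future OpenSearch backend, sees the same bounded `limit` and non-negative `offset` without repeating the checks.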

    3. Create core detail domain types in `com.cameleer.server.core.detail`:
       - `ProcessorNode` record: processorId (String), processorType (String), status (String), startTime (Instant), endTime (Instant), durationMs (long), diagramNodeId (String), errorMessage (String), errorStackTrace (String), children (List<ProcessorNode>). This is the nested tree node.
       - `ExecutionDetail` record: executionId (String), routeId (String), agentId (String), status (String), startTime (Instant), endTime (Instant), durationMs (long), correlationId (String), exchangeId (String), errorMessage (String), errorStackTrace (String), diagramContentHash (String), processors (List<ProcessorNode>). This is the full detail response.
       - `DetailService` class: plain class (no Spring annotations). Constructor takes ExecutionRepository. Method `getDetail(String executionId)` returns `Optional<ExecutionDetail>`. It calls the repository's new `findRawById` method (step 4), then calls `reconstructTree()` to convert the flat arrays into a nested ProcessorNode tree. The `reconstructTree` method takes the parallel arrays (ids, types, statuses, starts, ends, durations, diagramNodeIds, errorMessages, errorStackTraces, depths, parentIndexes), creates a ProcessorNode[] array, then wires children using parentIndexes (parentIndex == -1 means root).

    4. Extend `ExecutionRepository` interface with a new query method, keeping tree reconstruction out of the storage layer:
       - `Optional<RawExecutionRow> findRawById(String executionId)`: returns the raw flat row for one execution. The repository does not build `ExecutionDetail` itself; DetailService handles reconstruction.
       - Create `RawExecutionRow` as a record in the detail package containing the scalar execution fields and all parallel arrays needed for reconstruction.
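
    The tree reconstruction described for DetailService can be sketched as follows. This is an illustration under assumptions: `ProcessorNode` is trimmed to three fields (the plan's record carries type, status, timings, and error fields as well), and the wrapper class `TreeReconstructionSketch` is a hypothetical name. Only `parentIndexes` is needed to wire the tree; the depths array is redundant for this step.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical container for the sketch. */
final class TreeReconstructionSketch {

    /** Trimmed stand-in for the plan's ProcessorNode record. */
    record ProcessorNode(String processorId, String diagramNodeId, List<ProcessorNode> children) {}

    /**
     * Rebuilds the nested tree from parallel arrays.
     * parentIndexes[i] == -1 marks a root; any other value is the array position of the parent.
     */
    static List<ProcessorNode> reconstructTree(String[] ids, String[] diagramNodeIds, int[] parentIndexes) {
        if (ids.length != diagramNodeIds.length || ids.length != parentIndexes.length) {
            throw new IllegalArgumentException("parallel arrays must have equal length");
        }

        // Pass 1: create every node with an empty, mutable child list.
        ProcessorNode[] nodes = new ProcessorNode[ids.length];
        for (int i = 0; i < ids.length; i++) {
            nodes[i] = new ProcessorNode(ids[i], diagramNodeIds[i], new ArrayList<>());
        }

        // Pass 2: attach each node to its parent, or collect it as a root.
        List<ProcessorNode> roots = new ArrayList<>();
        for (int i = 0; i < nodes.length; i++) {
            int parent = parentIndexes[i];
            if (parent == -1) {
                roots.add(nodes[i]);
            } else if (parent >= 0 && parent < nodes.length) {
                nodes[parent].children().add(nodes[i]);
            } else {
                throw new IllegalArgumentException("invalid parent index " + parent + " at position " + i);
            }
        }
        return roots;
    }
}
```

    Because the arrays are stored in DFS order, iterating by position keeps children in their original execution order under each parent.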
  </action>
  <verify>
    <automated>cd C:/Users/Hendrik/Documents/projects/cameleer-server && mvn compile -pl cameleer-server-core</automated>
  </verify>
  <done>Schema migration SQL exists, all core domain types compile, SearchEngine interface and SearchService defined, ExecutionRepository extended with query method, DetailService has tree reconstruction logic</done>
</task>

<task type="auto" tdd="true">
  <name>Task 2: Update ingestion to populate new columns and verify with integration test</name>
  <files>
    cameleer-server-app/src/main/java/com/cameleer/server/app/storage/ClickHouseExecutionRepository.java,
    cameleer-server-app/src/test/java/com/cameleer/server/app/AbstractClickHouseIT.java,
    cameleer-server-app/src/test/java/com/cameleer/server/app/storage/IngestionSchemaIT.java
  </files>
  <behavior>
    - Test: After inserting a RouteExecution with processors that have exchange snapshots and nested children, the route_executions row has non-empty exchange_bodies, exchange_headers, processor_depths (correct depth values), processor_parent_indexes (correct parent wiring), processor_input_bodies, processor_output_bodies, processor_input_headers, processor_output_headers, and processor_diagram_node_ids columns. The diagram_content_hash column exists and holds the empty-string default, because this plan does not resolve the diagram hash at ingestion time
    - Test: Processor depths are correct for a 3-level tree: root=0, child=1, grandchild=2
    - Test: Processor parent indexes correctly reference parent positions: root=-1, child=parentIdx, grandchild=childIdx
    - Test: exchange_bodies contains concatenated body text from all processor snapshots (for LIKE search)
    - Test: Insertions that omit exchange snapshot data (null snapshots) produce empty-string defaults without error
  </behavior>
  <action>
    1. Update `AbstractClickHouseIT.initSchema()` to also load `02-search-columns.sql` after `01-schema.sql`. Use the same path resolution pattern (check `clickhouse/init/` then `../clickhouse/init/`).

    2. Update `ClickHouseExecutionRepository`:
       - Extend INSERT_SQL to include all new columns: exchange_bodies, exchange_headers, processor_depths, processor_parent_indexes, processor_error_messages, processor_error_stacktraces, processor_input_bodies, processor_output_bodies, processor_input_headers, processor_output_headers, processor_diagram_node_ids, diagram_content_hash
       - Refactor `flattenProcessors` to return a list of `FlatProcessor` records containing the original ProcessorExecution plus computed depth (int) and parentIndex (int). Use the recursive approach from the research: track depth and parent index during DFS traversal.
       - In `setValues`: build parallel arrays for all new columns from FlatProcessor list.
       - Build concatenated `exchange_bodies` string: join all processor input/output bodies plus route-level input/output snapshot bodies with space separators. Same for `exchange_headers` but serialize Map<String,String> headers to JSON string using Jackson ObjectMapper (inject via constructor or create statically).
       - For diagram_content_hash: leave as empty string for now (the ingestion endpoint does not yet resolve the active diagram hash — this is a query-time concern). Plan 03 wires this if needed, but DIAG-02 can also be satisfied by joining route_diagrams at query time.
       - Handle null ExchangeSnapshot gracefully: empty string for bodies, empty JSON object for headers.
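
    The `flattenProcessors` refactoring in step 2 could be sketched like this. `ProcessorExecution` here is a stand-in record with only the fields the traversal needs (the real class comes from cameleer-common), and the wrapper class `FlattenSketch` is a hypothetical name. The key point is that a node's parent index is the list position its parent received during the same DFS pass.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical container for the sketch. */
final class FlattenSketch {

    /** Stand-in for cameleer-common's ProcessorExecution (only the fields used here). */
    record ProcessorExecution(String processorId, List<ProcessorExecution> children) {}

    /** Original processor plus the tree metadata computed during flattening. */
    record FlatProcessor(ProcessorExecution processor, int depth, int parentIndex) {}

    /** Pre-order DFS over all roots; roots get depth 0 and parentIndex -1. */
    static List<FlatProcessor> flattenProcessors(List<ProcessorExecution> roots) {
        List<FlatProcessor> out = new ArrayList<>();
        if (roots != null) {
            for (ProcessorExecution root : roots) {
                flatten(root, 0, -1, out);
            }
        }
        return out;
    }

    private static void flatten(ProcessorExecution p, int depth, int parentIndex, List<FlatProcessor> out) {
        // The position this node takes in the output is the parent index for its children.
        int myIndex = out.size();
        out.add(new FlatProcessor(p, depth, parentIndex));
        if (p.children() == null) {
            return;
        }
        for (ProcessorExecution child : p.children()) {
            flatten(child, depth + 1, myIndex, out);
        }
    }
}
```

    For a root->child->grandchild chain this yields depths 0, 1, 2 and parent indexes -1, 0, 1, which are the values the integration test checks.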

    3. Create `IngestionSchemaIT` integration test that:
       - Extends AbstractClickHouseIT
       - Builds a RouteExecution with a 3-level processor tree where processors have ExchangeSnapshot data
       - POSTs it to /api/v1/data/executions, waits for flush
       - Queries ClickHouse directly via jdbcTemplate to verify all new columns have correct values
       - Verifies processor_depths = [0, 1, 2] for a root->child->grandchild chain
       - Verifies processor_parent_indexes = [-1, 0, 1]
       - Verifies exchange_bodies contains the body text
       - Verifies a second insertion with null snapshots succeeds with empty defaults
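
    The last two checks depend on how the repository builds `exchange_bodies` and treats missing snapshots. A minimal sketch of that concatenation is shown below, with assumptions stated: `ExchangeSnapshot` is a body-only stand-in for the cameleer-common class, `ExchangeTextSketch`, `bodyOf`, and `joinBodies` are hypothetical names, and skipping empty bodies (to avoid runs of separators) is a choice of this sketch rather than something the plan prescribes. Header serialization via Jackson is left out so the example needs no external dependency.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical container for the sketch. */
final class ExchangeTextSketch {

    /** Body-only stand-in for cameleer-common's ExchangeSnapshot. */
    record ExchangeSnapshot(String body) {}

    /** Null snapshot or null body maps to the empty string used for per-processor array elements. */
    static String bodyOf(ExchangeSnapshot snapshot) {
        return (snapshot == null || snapshot.body() == null) ? "" : snapshot.body();
    }

    /**
     * Joins route-level and processor-level bodies with single spaces.
     * Empty bodies are skipped here (an assumption of this sketch).
     */
    static String joinBodies(List<ExchangeSnapshot> snapshots) {
        List<String> parts = new ArrayList<>();
        for (ExchangeSnapshot snapshot : snapshots) {
            String body = bodyOf(snapshot);
            if (!body.isEmpty()) {
                parts.add(body);
            }
        }
        return String.join(" ", parts);
    }
}
```

    With only null snapshots the result is the empty string, which matches the column's `DEFAULT ''` and keeps the second insertion in the test free of null handling at the JDBC layer.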
  </action>
  <verify>
    <automated>cd C:/Users/Hendrik/Documents/projects/cameleer-server && mvn test -pl cameleer-server-app -Dtest=IngestionSchemaIT</automated>
  </verify>
  <done>All new columns populated correctly during ingestion, tree metadata (depth/parent) correct for nested processors, exchange data concatenated for search, existing ingestion tests still pass</done>
</task>

</tasks>

<verification>
- `mvn compile -pl cameleer-server-core` succeeds (core domain types compile)
- `mvn test -pl cameleer-server-app -Dtest=IngestionSchemaIT` passes (new columns populated correctly)
- `mvn test -pl cameleer-server-app` passes (all existing tests still green with schema extension)
</verification>

<success_criteria>
- ClickHouse schema extension SQL exists and is loaded by test infrastructure
- All 12 new columns are written during ingestion with correct values (diagram_content_hash stays at its empty-string default in this plan)
- Processor tree metadata (depth, parentIndex) correctly computed during DFS flattening
- Exchange snapshot data concatenated into searchable text columns
- SearchEngine interface exists in core module for future backend swap
- SearchRequest/SearchResult/ExecutionSummary records exist with all required fields
- DetailService can reconstruct a nested ProcessorNode tree from flat arrays
- All existing Phase 1 tests still pass
</success_criteria>

<output>
After completion, create `.planning/phases/02-transaction-search-diagrams/02-01-SUMMARY.md`
</output>