cameleer-server/.planning/phases/01-ingestion-pipeline-api-foundation/01-RESEARCH.md
hsiegeln cb3ebfea7c
chore: rename cameleer3 to cameleer
Rename Java packages from com.cameleer3 to com.cameleer, module
directories from cameleer3-* to cameleer-*, and all references
throughout workflows, Dockerfiles, docs, migrations, and pom.xml.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 15:28:42 +02:00


Phase 1: Ingestion Pipeline + API Foundation - Research

Researched: 2026-03-11
Domain: ClickHouse batch ingestion, Spring Boot REST API, write buffer with backpressure
Confidence: HIGH

Summary

Phase 1 establishes the data pipeline and API skeleton for Cameleer Server. Agents POST execution data, diagrams, and metrics to REST endpoints; the server buffers these in memory and batch-flushes to ClickHouse. The ClickHouse schema design is the most critical and least reversible decision in this phase -- ORDER BY and partitioning cannot be changed without table recreation.

The ClickHouse Java ecosystem has undergone significant changes. The recommended approach is clickhouse-jdbc v0.9.7 (JDBC V2 driver) with Spring Boot's JdbcTemplate for batch inserts. An alternative is the standalone client-v2 artifact which offers a POJO-based insert API, but JDBC integration with Spring Boot is more conventional and better documented. ClickHouse now has a native full-text index (TYPE text, GA as of March 2026) that supersedes the older tokenbf_v1 bloom filter approach -- this is relevant for Phase 2 but should be accounted for in schema design now.

Primary recommendation: Use clickhouse-jdbc 0.9.7 with Spring JdbcTemplate, ArrayBlockingQueue write buffer with scheduled batch flush, daily partitioning with TTL + ttl_only_drop_parts, and Docker Compose for local ClickHouse. Keep Spring Security out of Phase 1 -- all endpoints open, security layered in Phase 4.

<phase_requirements>

Phase Requirements

| ID | Description | Research Support |
| --- | --- | --- |
| INGST-01 (#1) | Accept RouteExecution via POST /api/v1/data/executions, return 202 | REST controller + async write buffer pattern; Jackson deserialization of cameleer-common models |
| INGST-02 (#2) | Accept RouteGraph via POST /api/v1/data/diagrams, return 202 | Same pattern; separate ClickHouse table for diagrams with content-hash dedup |
| INGST-03 (#3) | Accept metrics via POST /api/v1/data/metrics, return 202 | Same pattern; separate ClickHouse table for metrics |
| INGST-04 (#4) | In-memory batch buffer with configurable flush interval/size | ArrayBlockingQueue + @Scheduled flush; configurable via application.yml |
| INGST-05 (#5) | Return 503 when write buffer full (backpressure) | queue.offer() returns false when full -> controller returns 503 + Retry-After header |
| INGST-06 (#6) | ClickHouse TTL expires data after 30 days (configurable) | Daily partitioning + TTL + ttl_only_drop_parts=1; configurable interval |
| API-01 (#28) | All endpoints under /api/v1/ path | Spring @RequestMapping("/api/v1") base path |
| API-02 (#29) | OpenAPI/Swagger via springdoc-openapi | springdoc-openapi-starter-webmvc-ui 2.8.6 |
| API-03 (#30) | GET /api/v1/health endpoint | Spring Boot Actuator or custom health controller |
| API-04 (#31) | Validate X-Cameleer-Protocol-Version: 1 header | Spring HandlerInterceptor or servlet filter |
| API-05 (#32) | Accept unknown JSON fields (forward compat) | Spring Boot default: FAIL_ON_UNKNOWN_PROPERTIES=false |
</phase_requirements>

Standard Stack

Core (Phase 1 specific)

| Library | Version | Purpose | Why Standard |
| --- | --- | --- | --- |
| clickhouse-jdbc | 0.9.7 (classifier: all) | ClickHouse JDBC V2 driver | Latest stable; V2 rewrite with improved type handling, batch support; works with Spring JdbcTemplate |
| Spring Boot Starter Web | 3.4.3 (parent) | REST controllers, Jackson | Already in POM |
| Spring Boot Starter Actuator | 3.4.3 (parent) | Health endpoint, metrics | Standard for health checks |
| springdoc-openapi-starter-webmvc-ui | 2.8.6 | OpenAPI 3.1 + Swagger UI | Latest stable for Spring Boot 3.4; generates from annotations |
| Testcontainers (clickhouse) | 2.0.2 | Integration tests with real ClickHouse | Spins up ClickHouse in Docker for tests |
| Testcontainers (junit-jupiter) | 2.0.2 | JUnit 5 integration | Lifecycle management for test containers |
| HikariCP | (Spring Boot managed) | JDBC connection pool | Default Spring Boot pool; works with ClickHouse JDBC |

Supporting

| Library | Version | Purpose | When to Use |
| --- | --- | --- | --- |
| Jackson JavaTimeModule | (Spring Boot managed) | Instant/Duration serialization | Already noted in project; needed for all timestamp fields |
| Micrometer | (Spring Boot managed) | Buffer depth metrics, ingestion rate | Expose queue.size() and flush latency as metrics |
| Awaitility | (Spring Boot managed) | Async test assertions | Testing batch flush timing in integration tests |

Alternatives Considered

| Instead of | Could Use | Tradeoff |
| --- | --- | --- |
| clickhouse-jdbc 0.9.7 | client-v2 0.9.7 (standalone) | client-v2 has POJO insert API but no JdbcTemplate/Spring integration; JDBC is more conventional |
| ArrayBlockingQueue | LMAX Disruptor | Disruptor is faster under extreme contention but adds complexity; ABQ is sufficient for this throughput |
| Spring JdbcTemplate | Raw JDBC PreparedStatement | JdbcTemplate provides cleaner error handling and resource management; no meaningful overhead |

Installation (add to cameleer-server-app/pom.xml):

<!-- ClickHouse JDBC V2 -->
<dependency>
    <groupId>com.clickhouse</groupId>
    <artifactId>clickhouse-jdbc</artifactId>
    <version>0.9.7</version>
    <classifier>all</classifier>
</dependency>

<!-- API Documentation -->
<dependency>
    <groupId>org.springdoc</groupId>
    <artifactId>springdoc-openapi-starter-webmvc-ui</artifactId>
    <version>2.8.6</version>
</dependency>

<!-- Actuator for health endpoint -->
<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-actuator</artifactId>
</dependency>

<!-- Testing -->
<dependency>
    <groupId>org.testcontainers</groupId>
    <artifactId>testcontainers-clickhouse</artifactId>
    <version>2.0.2</version>
    <scope>test</scope>
</dependency>
<dependency>
    <groupId>org.testcontainers</groupId>
    <artifactId>testcontainers-junit-jupiter</artifactId>
    <version>2.0.2</version>
    <scope>test</scope>
</dependency>
<dependency>
    <groupId>org.awaitility</groupId>
    <artifactId>awaitility</artifactId>
    <scope>test</scope>
</dependency>

Add to cameleer-server-core/pom.xml:

<!-- SLF4J for logging (no Spring dependency) -->
<dependency>
    <groupId>org.slf4j</groupId>
    <artifactId>slf4j-api</artifactId>
</dependency>

Architecture Patterns

cameleer-server-core/src/main/java/com/cameleer/server/core/
    ingestion/
        WriteBuffer.java              # Bounded queue + flush logic
        IngestionService.java         # Accepts data, routes to buffer
    storage/
        ExecutionRepository.java      # Interface: batch insert + query
        DiagramRepository.java        # Interface: store/retrieve diagrams
        MetricsRepository.java        # Interface: store metrics
    model/
        (extend/complement cameleer-common models as needed)

cameleer-server-app/src/main/java/com/cameleer/server/app/
    config/
        ClickHouseConfig.java         # DataSource + JdbcTemplate bean
        IngestionConfig.java          # Buffer size, flush interval from YAML
        WebConfig.java                # Protocol version interceptor
    controller/
        ExecutionController.java      # POST /api/v1/data/executions
        DiagramController.java        # POST /api/v1/data/diagrams
        MetricsController.java        # POST /api/v1/data/metrics
        HealthController.java         # GET /api/v1/health (or use Actuator)
    storage/
        ClickHouseExecutionRepository.java
        ClickHouseDiagramRepository.java
        ClickHouseMetricsRepository.java
    interceptor/
        ProtocolVersionInterceptor.java

Pattern 1: Bounded Write Buffer with Scheduled Flush

What: ArrayBlockingQueue between HTTP endpoint and ClickHouse. Scheduled task drains and batch-inserts.
When to use: Always for ClickHouse ingestion.

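// In core module -- no Spring dependency
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class WriteBuffer<T> {
    private final BlockingQueue<T> queue;
    private final int capacity;

    public WriteBuffer(int capacity) {
        this.capacity = capacity;
        this.queue = new ArrayBlockingQueue<>(capacity);
    }

    /** Returns false when buffer is full (caller should return 503) */
    public synchronized boolean offer(T item) {
        return queue.offer(item);
    }

    /**
     * All-or-nothing: either every item is enqueued or none is, so a 503
     * followed by an agent retry cannot duplicate part of the batch.
     * Producers are synchronized with each other; a concurrent drain only
     * frees space, so the offers below cannot fail after the capacity check.
     */
    public synchronized boolean offerBatch(List<T> items) {
        if (items.size() > queue.remainingCapacity()) return false;
        for (T item : items) {
            queue.offer(item);
        }
        return true;
    }

    /** Drain up to maxBatch items. Called by scheduled flush. */
    public List<T> drain(int maxBatch) {
        List<T> batch = new ArrayList<>(maxBatch);
        queue.drainTo(batch, maxBatch);
        return batch;
    }

    public int size() { return queue.size(); }
    public int capacity() { return capacity; }
    public boolean isFull() { return queue.remainingCapacity() == 0; }
}

// In app module -- Spring wiring (requires @EnableScheduling on a configuration class)
import java.util.List;
import org.springframework.scheduling.annotation.Scheduled;
import org.springframework.stereotype.Component;

@Component
public class ClickHouseFlushScheduler {
    private final WriteBuffer<RouteExecution> executionBuffer;
    private final ExecutionRepository repository;
    private final IngestionConfig ingestionConfig;

    public ClickHouseFlushScheduler(WriteBuffer<RouteExecution> executionBuffer,
            ExecutionRepository repository, IngestionConfig ingestionConfig) {
        this.executionBuffer = executionBuffer;
        this.repository = repository;
        this.ingestionConfig = ingestionConfig;
    }

    @Scheduled(fixedDelayString = "${ingestion.flush-interval-ms:1000}")
    public void flushExecutions() {
        List<RouteExecution> batch = executionBuffer.drain(
            ingestionConfig.getBatchSize()); // default 5000
        if (!batch.isEmpty()) {
            // If this throws, the drained batch is lost; retry handling is not shown here
            repository.insertBatch(batch);
        }
    }
}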

Pattern 2: Controller Returns 202 or 503

What: Ingestion endpoints accept data asynchronously. Return 202 on success, 503 when buffer full.
When to use: All ingestion POST endpoints.

@RestController
@RequestMapping("/api/v1/data")
public class ExecutionController {

    @PostMapping("/executions")
    public ResponseEntity<Void> ingestExecutions(
            @RequestBody List<RouteExecution> executions) {
        if (!ingestionService.accept(executions)) {
            return ResponseEntity.status(HttpStatus.SERVICE_UNAVAILABLE)
                .header("Retry-After", "5")
                .build();
        }
        return ResponseEntity.accepted().build();
    }
}

Pattern 3: ClickHouse Batch Insert via JdbcTemplate

What: Use JdbcTemplate.batchUpdate with PreparedStatement for efficient ClickHouse inserts.

import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.List;
import org.springframework.jdbc.core.BatchPreparedStatementSetter;
import org.springframework.jdbc.core.JdbcTemplate;
import org.springframework.stereotype.Repository;

@Repository
public class ClickHouseExecutionRepository implements ExecutionRepository {

    private final JdbcTemplate jdbc;

    public ClickHouseExecutionRepository(JdbcTemplate jdbc) {
        this.jdbc = jdbc;
    }

    @Override
    public void insertBatch(List<RouteExecution> executions) {
        String sql = "INSERT INTO route_executions (execution_id, route_id, "
            + "agent_id, status, start_time, end_time, duration_ms, "
            + "correlation_id, error_message) VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?)";

        // One JDBC batch per call, so the whole list is sent as a single INSERT
        jdbc.batchUpdate(sql, new BatchPreparedStatementSetter() {
            @Override
            public void setValues(PreparedStatement ps, int i) throws SQLException {
                RouteExecution e = executions.get(i);
                ps.setString(1, e.getExecutionId());
                ps.setString(2, e.getRouteId());
                ps.setString(3, e.getAgentId());
                ps.setString(4, e.getStatus().name());
                ps.setObject(5, e.getStartTime()); // Instant -> DateTime64
                ps.setObject(6, e.getEndTime());
                ps.setLong(7, e.getDurationMs());
                ps.setString(8, e.getCorrelationId());
                ps.setString(9, e.getErrorMessage());
            }
            @Override
            public int getBatchSize() { return executions.size(); }
        });
    }
}

Pattern 4: Protocol Version Interceptor

What: Validate X-Cameleer-Protocol-Version header on all /api/v1/ requests.

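import jakarta.servlet.http.HttpServletRequest;
import jakarta.servlet.http.HttpServletResponse;
import org.springframework.http.HttpStatus;
import org.springframework.http.MediaType;
import org.springframework.web.servlet.HandlerInterceptor;

public class ProtocolVersionInterceptor implements HandlerInterceptor {
    private static final String SUPPORTED_VERSION = "1";

    @Override
    public boolean preHandle(HttpServletRequest request,
            HttpServletResponse response, Object handler) throws Exception {
        String version = request.getHeader("X-Cameleer-Protocol-Version");
        // equals() on the constant is null-safe, so a missing header is rejected too
        if (!SUPPORTED_VERSION.equals(version)) {
            response.setStatus(HttpStatus.BAD_REQUEST.value());
            response.setContentType(MediaType.APPLICATION_JSON_VALUE);
            response.getWriter().write(
                "{\"error\":\"Missing or unsupported X-Cameleer-Protocol-Version header\"}");
            return false;
        }
        return true;
    }
}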

Note: Health and OpenAPI endpoints should be excluded from this interceptor.

Anti-Patterns to Avoid

  • Individual row inserts to ClickHouse: Each insert creates a data part. At 50+ agents, you get "too many parts" errors within hours. Always batch.
  • Unbounded write buffer: Without a capacity limit, agent reconnection storms cause OOM. ArrayBlockingQueue with fixed capacity is mandatory.
  • Synchronous ClickHouse writes in controller: Blocks HTTP threads during ClickHouse inserts. Always decouple via buffer.
  • Using JPA/Hibernate with ClickHouse: ClickHouse is not relational. JPA adds friction with zero benefit. Use JdbcTemplate directly.
  • Bare DateTime in ClickHouse (no timezone): Defaults to server timezone. Always use DateTime64(3, 'UTC').

Don't Hand-Roll

| Problem | Don't Build | Use Instead | Why |
| --- | --- | --- | --- |
| JDBC connection pooling | Custom connection management | HikariCP (Spring Boot default) | Handles timeouts, leak detection, sizing |
| OpenAPI documentation | Manual JSON/YAML spec | springdoc-openapi | Generates from code; stays in sync automatically |
| Health endpoint | Custom /health servlet | Spring Boot Actuator | Standard format, integrates with Docker healthchecks |
| JSON serialization config | Custom ObjectMapper setup | Spring Boot auto-config + application.yml | Spring Boot already configures Jackson correctly |
| Test database lifecycle | Manual Docker commands | Testcontainers | Automatic container lifecycle per test class |

Common Pitfalls

Pitfall 1: Wrong ClickHouse ORDER BY Design

What goes wrong: Choosing ORDER BY (execution_id) makes time-range queries scan entire partitions.
Why it happens: Instinct carried over from relational databases, where the primary key is typically a UUID.
How to avoid: ORDER BY must match the dominant query pattern. For this project, ORDER BY (agent_id, status, start_time, execution_id) puts the most-filtered columns first; execution_id goes last because it is high-cardinality.
Warning signs: read_rows in system.query_log far exceeds the result set size, or EXPLAIN indexes = 1 shows most granules being selected.

Pitfall 2: ClickHouse TTL Fragmenting Partitions

What goes wrong: Row-level TTL rewrites data parts, causing merge pressure.
Why it happens: By default, TTL deletes individual expired rows, which forces the affected parts to be rewritten.
How to avoid: Use daily partitioning (PARTITION BY toYYYYMMDD(start_time)) combined with SETTINGS ttl_only_drop_parts = 1, so that a part is dropped whole once every row in it has expired instead of being rewritten. Alternatively, use a scheduled job with ALTER TABLE DROP PARTITION for partitions older than 30 days.
Warning signs: Continuous high merge activity, elevated CPU during TTL cleanup.
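
INGST-06 asks for a configurable retention interval, while a TTL written into the DDL is fixed at table creation. One possible way to reconcile the two is to render an ALTER TABLE ... MODIFY TTL statement from the configured clickhouse.ttl-days value at startup. The sketch below only builds the statement; the class and method names are hypothetical, executing it (for example through JdbcTemplate) is left out, and wrapping the column in toDateTime() is a defensive assumption because some ClickHouse releases reject a raw DateTime64 in a TTL expression.

```java
import java.util.regex.Pattern;

// Hypothetical helper: class and method names are illustrative assumptions,
// not part of the Cameleer code base.
final class TtlStatements {
    // Plain identifiers only, so the values are safe to splice into DDL.
    private static final Pattern IDENTIFIER = Pattern.compile("[A-Za-z_][A-Za-z0-9_]*");

    private TtlStatements() {}

    /** Renders the statement that sets a table's TTL to ttlDays days. */
    static String modifyTtl(String table, String timeColumn, int ttlDays) {
        if (!IDENTIFIER.matcher(table).matches()
                || !IDENTIFIER.matcher(timeColumn).matches()) {
            throw new IllegalArgumentException("not a plain identifier");
        }
        if (ttlDays < 1) {
            throw new IllegalArgumentException("ttlDays must be >= 1");
        }
        return "ALTER TABLE " + table
            + " MODIFY TTL toDateTime(" + timeColumn + ") + INTERVAL " + ttlDays + " DAY";
    }
}
```

Because DDL cannot take bind parameters, the helper validates its inputs instead of trusting configuration values blindly.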

Pitfall 3: Data Loss on Server Restart

What goes wrong: In-memory buffer loses unflushed data on SIGTERM or crash.
Why it happens: Default Spring Boot shutdown does not drain custom queues.
How to avoid: Implement SmartLifecycle with ordered shutdown: flush the buffer before stopping. Accept that a crash (as opposed to a graceful shutdown) may lose up to flush-interval-ms of data -- this is acceptable for observability.
Warning signs: Missing executions around deployment timestamps.
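
The final flush on graceful shutdown can be sketched without any Spring types. The helper below is a minimal illustration, assuming the buffer is a BlockingQueue and the sink is something like repository::insertBatch; the class name ShutdownDrain is hypothetical, and the SmartLifecycle.stop() wiring that would call it is not shown.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.function.Consumer;

// Hypothetical sketch of the last drain step before the application stops.
final class ShutdownDrain {
    private ShutdownDrain() {}

    /** Flushes the queue in batches until it is empty; returns the number of items flushed. */
    static <T> int drainAll(BlockingQueue<T> queue, int batchSize, Consumer<List<T>> sink) {
        int flushed = 0;
        List<T> batch = new ArrayList<>(batchSize);
        // drainTo returns 0 once the queue is empty, which ends the loop
        while (queue.drainTo(batch, batchSize) > 0) {
            sink.accept(List.copyOf(batch)); // hand over an immutable snapshot
            flushed += batch.size();
            batch.clear();
        }
        return flushed;
    }
}
```

Ingestion endpoints should stop accepting requests before this runs; otherwise producers can keep refilling the queue while it is being drained.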

Pitfall 4: DateTime Timezone Mismatch

What goes wrong: Agents send UTC Instants, ClickHouse stores in server-local timezone, queries return wrong time ranges.
Why it happens: ClickHouse DateTime defaults to server timezone if not specified.
How to avoid: Always use DateTime64(3, 'UTC') in schema. Ensure Jackson serializes Instants as ISO-8601 with Z suffix. Add server_received_at timestamp for clock skew detection.
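
A small illustration of the skew check that server_received_at makes possible, using only java.time. The class name ClockSkew and the tolerance parameter are hypothetical, and the measured difference also includes network and buffering latency, so it is an upper bound on skew rather than an exact value.

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical helper: compares an agent-supplied timestamp with the
// server-side receive time.
final class ClockSkew {
    private ClockSkew() {}

    /** Positive result: the server received the event after the agent's timestamp. */
    static Duration between(Instant agentTime, Instant serverReceivedAt) {
        return Duration.between(agentTime, serverReceivedAt);
    }

    /** True when the absolute difference is larger than the tolerance. */
    static boolean exceeds(Instant agentTime, Instant serverReceivedAt, Duration tolerance) {
        return between(agentTime, serverReceivedAt).abs().compareTo(tolerance) > 0;
    }
}
```

Instant values are always UTC, and Instant.parse accepts the ISO-8601 form with the Z suffix (for example 2026-03-11T10:15:30.123Z), which matches the serialization format recommended above.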

Pitfall 5: springdoc Not Scanning Controllers

What goes wrong: OpenAPI spec is empty; Swagger UI shows no endpoints.
Why it happens: springdoc defaults to scanning the main application package. If controllers are in a different package hierarchy, they are missed.
How to avoid: Ensure @SpringBootApplication is in a parent package of all controllers, or configure springdoc.packagesToScan in application.yml.

Code Examples

ClickHouse Schema: Route Executions Table

-- Source: ClickHouse MergeTree docs + project requirements
CREATE TABLE route_executions (
    execution_id     String,
    route_id         LowCardinality(String),
    agent_id         LowCardinality(String),
    status           LowCardinality(String),  -- COMPLETED, FAILED, RUNNING
    start_time       DateTime64(3, 'UTC'),
    end_time         Nullable(DateTime64(3, 'UTC')),
    duration_ms      UInt64,
    correlation_id   String,
    exchange_id      String,
    error_message    Nullable(String),
    error_stacktrace Nullable(String),
    -- Nested processor executions stored as arrays (ClickHouse nested pattern)
    processor_ids    Array(String),
    processor_types  Array(LowCardinality(String)),
    processor_starts Array(DateTime64(3, 'UTC')),
    processor_ends   Array(DateTime64(3, 'UTC')),
    processor_durations Array(UInt64),
    processor_statuses  Array(LowCardinality(String)),
    -- Metadata
    server_received_at DateTime64(3, 'UTC') DEFAULT now64(3, 'UTC'),
    -- Skip index for future full-text search (Phase 2)
    INDEX idx_correlation correlation_id TYPE bloom_filter GRANULARITY 4,
    INDEX idx_error error_message TYPE tokenbf_v1(32768, 3, 0) GRANULARITY 4
)
ENGINE = MergeTree()
PARTITION BY toYYYYMMDD(start_time)
ORDER BY (agent_id, status, start_time, execution_id)
TTL toDateTime(start_time) + INTERVAL 30 DAY  -- toDateTime(): some ClickHouse releases reject a raw DateTime64 in TTL
SETTINGS ttl_only_drop_parts = 1;
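
The processor_* columns form parallel arrays: element i of each array describes the same processor execution. A rough sketch of that flattening step on the Java side follows. The record shapes ProcessorRow and ProcessorColumns are hypothetical, since the real cameleer-common field names are an open question; the sketch also assumes every processor has finished (non-null end time), and binding the arrays to the PreparedStatement is driver-specific and not shown.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.List;

// Hypothetical shapes: the real cameleer-common model may differ.
record ProcessorRow(String id, String type, Instant start, Instant end, String status) {}

record ProcessorColumns(String[] ids, String[] types, Instant[] starts,
                        Instant[] ends, long[] durationsMs, String[] statuses) {

    /** Flattens rows into parallel arrays; index i refers to the same processor in each. */
    static ProcessorColumns from(List<ProcessorRow> rows) {
        int n = rows.size();
        ProcessorColumns c = new ProcessorColumns(new String[n], new String[n],
            new Instant[n], new Instant[n], new long[n], new String[n]);
        for (int i = 0; i < n; i++) {
            ProcessorRow r = rows.get(i);
            c.ids[i] = r.id();
            c.types[i] = r.type();
            c.starts[i] = r.start();
            c.ends[i] = r.end();
            c.durationsMs[i] = Duration.between(r.start(), r.end()).toMillis();
            c.statuses[i] = r.status();
        }
        return c;
    }
}
```

Keeping all arrays the same length is the invariant that matters here; a mismatch would silently misalign processor attributes at query time.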

ClickHouse Schema: Route Diagrams Table

CREATE TABLE route_diagrams (
    content_hash     String,          -- SHA-256 of definition
    route_id         LowCardinality(String),
    agent_id         LowCardinality(String),
    definition       String,          -- JSON graph definition
    created_at       DateTime64(3, 'UTC') DEFAULT now64(3, 'UTC')
    -- No TTL -- diagrams are small and versioned
)
ENGINE = ReplacingMergeTree(created_at)
ORDER BY (content_hash);
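
The content_hash key can be computed with the JDK alone. The sketch below assumes the hash is taken over the UTF-8 bytes of the JSON definition exactly as received, with no canonicalization, so two semantically equal definitions with different whitespace or key order would hash differently. The class name ContentHash is hypothetical, and HexFormat requires Java 17.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HexFormat;

// Hypothetical helper for the route_diagrams dedup key.
final class ContentHash {
    private ContentHash() {}

    /** Returns the SHA-256 digest of the definition as 64 lowercase hex characters. */
    static String sha256Hex(String definition) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            byte[] hash = digest.digest(definition.getBytes(StandardCharsets.UTF_8));
            return HexFormat.of().formatHex(hash);
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is a mandatory algorithm on every Java platform
            throw new IllegalStateException(e);
        }
    }
}
```

Note that ReplacingMergeTree deduplicates rows with the same ORDER BY key only during background merges, so reads may still see duplicates until a merge happens.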

ClickHouse Schema: Metrics Table

CREATE TABLE agent_metrics (
    agent_id         LowCardinality(String),
    collected_at     DateTime64(3, 'UTC'),
    metric_name      LowCardinality(String),
    metric_value     Float64,
    tags             Map(String, String),
    server_received_at DateTime64(3, 'UTC') DEFAULT now64(3, 'UTC')
)
ENGINE = MergeTree()
PARTITION BY toYYYYMMDD(collected_at)
ORDER BY (agent_id, metric_name, collected_at)
TTL toDateTime(collected_at) + INTERVAL 30 DAY  -- toDateTime(): some ClickHouse releases reject a raw DateTime64 in TTL
SETTINGS ttl_only_drop_parts = 1;

Docker Compose: Local ClickHouse

# docker-compose.yml (development)
services:
  clickhouse:
    image: clickhouse/clickhouse-server:25.3
    ports:
      - "8123:8123"   # HTTP interface
      - "9000:9000"   # Native protocol
    volumes:
      - clickhouse-data:/var/lib/clickhouse
      - ./clickhouse/init:/docker-entrypoint-initdb.d
    environment:
      CLICKHOUSE_USER: cameleer
      CLICKHOUSE_PASSWORD: cameleer_dev
      CLICKHOUSE_DB: cameleer
    ulimits:
      nofile:
        soft: 262144
        hard: 262144

volumes:
  clickhouse-data:

application.yml Configuration

server:
  port: 8081

spring:
  datasource:
    url: jdbc:ch://localhost:8123/cameleer
    username: cameleer
    password: cameleer_dev
    driver-class-name: com.clickhouse.jdbc.ClickHouseDriver
  jackson:
    serialization:
      write-dates-as-timestamps: false
    deserialization:
      fail-on-unknown-properties: false  # API-05: forward compat (also Spring Boot default)

ingestion:
  buffer-capacity: 50000
  batch-size: 5000
  flush-interval-ms: 1000

clickhouse:
  ttl-days: 30

springdoc:
  api-docs:
    path: /api/v1/api-docs
  swagger-ui:
    path: /api/v1/swagger-ui

management:
  endpoints:
    web:
      base-path: /api/v1
      exposure:
        include: health
  endpoint:
    health:
      show-details: always

State of the Art

| Old Approach | Current Approach | When Changed | Impact |
| --- | --- | --- | --- |
| clickhouse-http-client 0.6.x | clickhouse-jdbc 0.9.7 (V2) | 2025 | V1 client deprecated; V2 has proper type mapping, batch support |
| tokenbf_v1 bloom filter index | TYPE text() full-text index | March 2026 (GA) | Native full-text search in ClickHouse; may eliminate need for OpenSearch in Phase 2 |
| springdoc-openapi 2.3.x | springdoc-openapi 2.8.6 | 2025 | Latest for Spring Boot 3.4; v3.x is for Spring Boot 4 only |
| Testcontainers 1.19.x | Testcontainers 2.0.2 | 2025 | Major version bump; new artifact names (testcontainers-clickhouse) |

Deprecated/outdated:

  • clickhouse-http-client artifact: replaced by clickhouse-jdbc with JDBC V2
  • tokenbf_v1 / ngrambf_v1 skip indexes: deprecated in favor of TYPE text() index (though still functional)
  • Testcontainers artifact org.testcontainers:clickhouse: replaced by org.testcontainers:testcontainers-clickhouse

Open Questions

  1. Exact cameleer-common model structure

    • What we know: Models include RouteExecution, ProcessorExecution, ExchangeSnapshot, RouteGraph, RouteNode, RouteEdge
    • What's unclear: Exact field names, types, nesting structure -- needed to design ClickHouse schema precisely
    • Recommendation: Read cameleer-common source code before implementing schema. Schema must match the wire format.
  2. ClickHouse JDBC V2 + HikariCP compatibility

    • What we know: clickhouse-jdbc 0.9.7 implements JDBC spec; HikariCP is Spring Boot default
    • What's unclear: Whether HikariCP validation queries work correctly with ClickHouse JDBC V2
    • Recommendation: Test in integration test; may need spring.datasource.hikari.connection-test-query=SELECT 1
  3. Nested data: arrays vs separate table for ProcessorExecutions

    • What we know: ClickHouse supports Array columns and Nested type
    • What's unclear: Whether flattening processor executions into arrays in the execution row is better than a separate table with JOIN
    • Recommendation: Arrays are faster for co-located reads (no JOIN) but harder to query individually. Start with arrays; add a materialized view if individual processor queries are needed in Phase 2.

Validation Architecture

Test Framework

| Property | Value |
| --- | --- |
| Framework | JUnit 5 (Spring Boot managed) + Testcontainers 2.0.2 |
| Config file | cameleer-server-app/src/test/resources/application-test.yml (Wave 0) |
| Quick run command | mvn test -pl cameleer-server-core -Dtest=WriteBufferTest -q |
| Full suite command | mvn verify |

Phase Requirements -> Test Map

| Req ID | Behavior | Test Type | Automated Command | File Exists? |
| --- | --- | --- | --- | --- |
| INGST-01 | POST /api/v1/data/executions returns 202, data in ClickHouse | integration | mvn test -pl cameleer-server-app -Dtest=ExecutionControllerIT -q | Wave 0 |
| INGST-02 | POST /api/v1/data/diagrams returns 202 | integration | mvn test -pl cameleer-server-app -Dtest=DiagramControllerIT -q | Wave 0 |
| INGST-03 | POST /api/v1/data/metrics returns 202 | integration | mvn test -pl cameleer-server-app -Dtest=MetricsControllerIT -q | Wave 0 |
| INGST-04 | Buffer flushes at interval/size | unit | mvn test -pl cameleer-server-core -Dtest=WriteBufferTest -q | Wave 0 |
| INGST-05 | 503 when buffer full | unit+integration | mvn test -pl cameleer-server-app -Dtest=BackpressureIT -q | Wave 0 |
| INGST-06 | TTL removes old data | integration | mvn test -pl cameleer-server-app -Dtest=ClickHouseTtlIT -q | Wave 0 |
| API-01 | Endpoints under /api/v1/ | integration | Covered by controller ITs | Wave 0 |
| API-02 | OpenAPI docs available | integration | mvn test -pl cameleer-server-app -Dtest=OpenApiIT -q | Wave 0 |
| API-03 | GET /api/v1/health responds | integration | mvn test -pl cameleer-server-app -Dtest=HealthControllerIT -q | Wave 0 |
| API-04 | Protocol version header validated | integration | mvn test -pl cameleer-server-app -Dtest=ProtocolVersionIT -q | Wave 0 |
| API-05 | Unknown JSON fields accepted | unit | mvn test -pl cameleer-server-app -Dtest=ForwardCompatIT -q | Wave 0 |

Sampling Rate

  • Per task commit: mvn test -pl cameleer-server-core -q (unit tests, fast)
  • Per wave merge: mvn verify (full suite with Testcontainers integration tests)
  • Phase gate: Full suite green before verification

Wave 0 Gaps

  • cameleer-server-app/src/test/resources/application-test.yml -- test ClickHouse config
  • cameleer-server-core/src/test/java/.../WriteBufferTest.java -- buffer unit tests
  • cameleer-server-app/src/test/java/.../AbstractClickHouseIT.java -- shared Testcontainers base class
  • cameleer-server-app/src/test/java/.../ExecutionControllerIT.java -- ingestion integration test
  • Docker available on test machine for Testcontainers

Sources

Primary (HIGH confidence)

Secondary (MEDIUM confidence)

Tertiary (LOW confidence)

  • ClickHouse ORDER BY optimization -- based on training data knowledge of MergeTree internals; should validate with EXPLAIN on real data

Metadata

Confidence breakdown:

  • Standard stack: HIGH -- versions verified against live sources (GitHub releases, Maven Central)
  • Architecture: HIGH -- write buffer + batch flush is established ClickHouse pattern used by SigNoz, Uptrace
  • ClickHouse schema: MEDIUM -- ORDER BY design is sound but should be validated with realistic query patterns
  • Pitfalls: HIGH -- well-documented ClickHouse failure modes, confirmed by multiple sources

Research date: 2026-03-11
Valid until: 2026-04-11 (30 days -- stack is stable)