cameleer-server/.planning/phases/01-ingestion-pipeline-api-foundation/01-01-PLAN.md
hsiegeln cb3ebfea7c
chore: rename cameleer3 to cameleer
Rename Java packages from com.cameleer3 to com.cameleer, module
directories from cameleer3-* to cameleer-*, and all references
throughout workflows, Dockerfiles, docs, migrations, and pom.xml.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 15:28:42 +02:00


---
phase: 01-ingestion-pipeline-api-foundation
plan: 01
type: execute
wave: 1
depends_on: []
files_modified:
- pom.xml
- cameleer-server-core/pom.xml
- cameleer-server-app/pom.xml
- docker-compose.yml
- clickhouse/init/01-schema.sql
- cameleer-server-app/src/main/resources/application.yml
- cameleer-server-app/src/main/java/com/cameleer/server/app/config/ClickHouseConfig.java
- cameleer-server-app/src/main/java/com/cameleer/server/app/config/IngestionConfig.java
- cameleer-server-core/src/main/java/com/cameleer/server/core/ingestion/WriteBuffer.java
- cameleer-server-core/src/main/java/com/cameleer/server/core/storage/ExecutionRepository.java
- cameleer-server-core/src/main/java/com/cameleer/server/core/storage/DiagramRepository.java
- cameleer-server-core/src/main/java/com/cameleer/server/core/storage/MetricsRepository.java
- cameleer-server-core/src/test/java/com/cameleer/server/core/ingestion/WriteBufferTest.java
autonomous: true
requirements:
- INGST-04
- INGST-05
- INGST-06
must_haves:
  truths:
    - "WriteBuffer accepts items and returns false when full (backpressure signal)"
    - "WriteBuffer drains items in batches for scheduled flush"
    - "ClickHouse schema creates route_executions, route_diagrams, and agent_metrics tables with correct column types"
    - "TTL clause on route_executions and agent_metrics removes data older than the configured number of days"
    - "Docker Compose starts ClickHouse and initializes the schema"
  artifacts:
    - path: "cameleer-server-core/src/main/java/com/cameleer/server/core/ingestion/WriteBuffer.java"
      provides: "Generic bounded write buffer with offer/drain/isFull"
      min_lines: 30
    - path: "clickhouse/init/01-schema.sql"
      provides: "ClickHouse DDL for all three tables"
      contains: "CREATE TABLE route_executions"
    - path: "docker-compose.yml"
      provides: "Local ClickHouse service"
      contains: "clickhouse-server"
    - path: "cameleer-server-core/src/main/java/com/cameleer/server/core/storage/ExecutionRepository.java"
      provides: "Repository interface for execution batch inserts"
      exports: ["insertBatch"]
  key_links:
    - from: "cameleer-server-app/src/main/java/com/cameleer/server/app/config/ClickHouseConfig.java"
      to: "application.yml"
      via: "spring.datasource properties"
      pattern: "spring\\.datasource"
    - from: "cameleer-server-app/src/main/java/com/cameleer/server/app/config/IngestionConfig.java"
      to: "application.yml"
      via: "ingestion.* properties"
      pattern: "ingestion\\."
---
<objective>
Set up ClickHouse infrastructure, schema, WriteBuffer with backpressure, and repository interfaces.
Purpose: Establishes the storage foundation that all ingestion endpoints and future search queries depend on. The WriteBuffer is the central throughput mechanism -- all data flows through it before reaching ClickHouse.
Output: Working ClickHouse via Docker Compose, DDL with TTL, WriteBuffer with unit tests, repository interfaces.
</objective>
<execution_context>
@C:/Users/Hendrik/.claude/get-shit-done/workflows/execute-plan.md
@C:/Users/Hendrik/.claude/get-shit-done/templates/summary.md
</execution_context>
<context>
@.planning/PROJECT.md
@.planning/ROADMAP.md
@.planning/STATE.md
@.planning/phases/01-ingestion-pipeline-api-foundation/01-RESEARCH.md
@pom.xml
@cameleer-server-core/pom.xml
@cameleer-server-app/pom.xml
</context>
<tasks>
<task type="auto" tdd="true">
<name>Task 1: Dependencies, Docker Compose, ClickHouse schema, and application config</name>
<files>
pom.xml,
cameleer-server-core/pom.xml,
cameleer-server-app/pom.xml,
docker-compose.yml,
clickhouse/init/01-schema.sql,
cameleer-server-app/src/main/resources/application.yml
</files>
<behavior>
- `docker compose up -d` starts ClickHouse with the HTTP interface on port 8123 and the native protocol on port 9000
- Connecting to ClickHouse and running `SELECT 1` succeeds
- Tables route_executions, route_diagrams, agent_metrics exist after init
- route_executions has TTL clause with configurable interval
- route_executions has PARTITION BY toYYYYMMDD(start_time) and ORDER BY (agent_id, status, start_time, execution_id)
- route_diagrams uses ReplacingMergeTree with ORDER BY (content_hash)
- agent_metrics has TTL and daily partitioning
- Maven compile succeeds with new dependencies
</behavior>
<action>
1. Add dependencies to cameleer-server-app/pom.xml per research:
- clickhouse-jdbc 0.9.7 (classifier: all)
- spring-boot-starter-actuator
- springdoc-openapi-starter-webmvc-ui 2.8.6
- testcontainers-clickhouse 2.0.2 (test scope)
- the Testcontainers JUnit Jupiter module, version 2.0.2 (test scope)
- awaitility (test scope)
2. Add slf4j-api dependency to cameleer-server-core/pom.xml.
3. Create docker-compose.yml at project root with ClickHouse service:
- Image: clickhouse/clickhouse-server:25.3
- Ports: 8123:8123, 9000:9000
- Volume mount ./clickhouse/init to /docker-entrypoint-initdb.d
- Environment: CLICKHOUSE_USER=cameleer, CLICKHOUSE_PASSWORD=cameleer_dev, CLICKHOUSE_DB=cameleer
- ulimits nofile 262144
4. Create clickhouse/init/01-schema.sql with the three tables from research:
- route_executions: MergeTree, daily partitioning on start_time, ORDER BY (agent_id, status, start_time, execution_id), TTL start_time + INTERVAL 30 DAY, SETTINGS ttl_only_drop_parts=1. Include Array columns for processor executions (processor_ids, processor_types, processor_starts, processor_ends, processor_durations, processor_statuses). Include skip indexes for correlation_id (bloom_filter) and error_message (tokenbf_v1).
- route_diagrams: ReplacingMergeTree(created_at), ORDER BY (content_hash). No TTL.
- agent_metrics: MergeTree, daily partitioning on collected_at, ORDER BY (agent_id, metric_name, collected_at), TTL collected_at + INTERVAL 30 DAY, SETTINGS ttl_only_drop_parts=1.
- All DateTime fields use DateTime64(3, 'UTC').
5. Create cameleer-server-app/src/main/resources/application.yml with config from research:
- server.port: 8081
- spring.datasource: url=jdbc:ch://localhost:8123/cameleer, username/password, driver-class-name
- spring.jackson: write-dates-as-timestamps=false, fail-on-unknown-properties=false
- ingestion: buffer-capacity=50000, batch-size=5000, flush-interval-ms=1000
- clickhouse.ttl-days: 30
- springdoc paths under /api/v1/
- management endpoints (health under /api/v1/, show-details=always)
6. Ensure .gitattributes exists with `* text=auto eol=lf`.
</action>
<verify>
<automated>mvn clean compile -q 2>&1 | tail -5</automated>
</verify>
<done>Maven compiles successfully with all new dependencies. Docker Compose file and ClickHouse DDL exist. application.yml configures datasource, ingestion buffer, and springdoc.</done>
</task>
<task type="auto" tdd="true">
<name>Task 2: WriteBuffer, repository interfaces, IngestionConfig, and ClickHouseConfig</name>
<files>
cameleer-server-core/src/main/java/com/cameleer/server/core/ingestion/WriteBuffer.java,
cameleer-server-core/src/main/java/com/cameleer/server/core/storage/ExecutionRepository.java,
cameleer-server-core/src/main/java/com/cameleer/server/core/storage/DiagramRepository.java,
cameleer-server-core/src/main/java/com/cameleer/server/core/storage/MetricsRepository.java,
cameleer-server-core/src/test/java/com/cameleer/server/core/ingestion/WriteBufferTest.java,
cameleer-server-app/src/main/java/com/cameleer/server/app/config/ClickHouseConfig.java,
cameleer-server-app/src/main/java/com/cameleer/server/app/config/IngestionConfig.java
</files>
<behavior>
- WriteBuffer(capacity=10): offer() returns true for first 10 items, false on 11th
- WriteBuffer.drain(5) returns up to 5 items and removes them from the queue
- WriteBuffer.isFull() returns true when at capacity
- WriteBuffer.offerBatch(list) returns false without partial insert if buffer would overflow
- WriteBuffer.size() tracks current queue depth
- ExecutionRepository interface declares insertBatch(List of RouteExecution)
- DiagramRepository interface declares store(RouteGraph) and findByContentHash(String)
- MetricsRepository interface declares insertBatch(List of metric data)
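
As an illustration of the last bullet, here is a minimal sketch of the metrics contract. It assumes no shared metrics model is available, so it uses a stand-in `MetricsData` record. The record's fields mirror the `agent_metrics` sort key (agent_id, metric_name, collected_at) plus a hypothetical `value`, and `InMemoryMetricsRepository` is an invented test double, not part of the plan. The types are package-private only so the snippet compiles as a single file.

```java
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;

// Hypothetical fallback model. agentId, metricName and collectedAt follow the
// agent_metrics sort key; "value" is an assumed payload field.
record MetricsData(String agentId, String metricName, Instant collectedAt, double value) {}

// Batch-insert contract, typed here against the fallback record.
interface MetricsRepository {
    void insertBatch(List<MetricsData> metrics);
}

// Illustrative in-memory implementation (an assumption, useful as a test double).
final class InMemoryMetricsRepository implements MetricsRepository {
    private final List<MetricsData> rows = new ArrayList<>();

    @Override
    public synchronized void insertBatch(List<MetricsData> metrics) {
        rows.addAll(metrics);
    }

    // Returns an immutable snapshot of everything inserted so far.
    synchronized List<MetricsData> rows() {
        return List.copyOf(rows);
    }
}
```

Keeping the interface free of JDBC types lets the core module stay independent of ClickHouse; a JDBC-backed implementation would live in the app module.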
</behavior>
<action>
1. Create WriteBuffer<T> in core module (no Spring dependency):
- Constructor takes int capacity, creates ArrayBlockingQueue(capacity)
- offer(T item): returns queue.offer(item) -- false when full
- offerBatch(List<T> items): check remainingCapacity() >= items.size() first, then offer each. If insufficient capacity, return false immediately without adding any items.
- drain(int maxBatch): drainTo into ArrayList, return list
- size(), capacity(), isFull(), remainingCapacity() accessors
2. Create WriteBufferTest (JUnit 5, no Spring):
- Test offer succeeds until capacity
- Test offer returns false when full
- Test offerBatch all-or-nothing semantics
- Test drain returns items and removes from queue
- Test drain with empty queue returns empty list
- Test isFull/size/remainingCapacity
3. Create repository interfaces in core module:
- ExecutionRepository: void insertBatch(List<RouteExecution> executions)
- DiagramRepository: void store(RouteGraph graph), Optional<RouteGraph> findByContentHash(String hash), Optional<String> findContentHashForRoute(String routeId, String agentId)
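- MetricsRepository: void insertBatch(List<MetricsSnapshot> metrics) -- MetricsSnapshot is a placeholder for the element type: use the cameleer-common metrics model if one is available (or a generic type parameter); otherwise create a simple MetricsData record in the core module and use that instead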
4. Create IngestionConfig as @ConfigurationProperties("ingestion"):
- bufferCapacity (int, default 50000)
- batchSize (int, default 5000)
- flushIntervalMs (long, default 1000)
5. Create ClickHouseConfig as @Configuration:
- Exposes JdbcTemplate bean (Spring Boot auto-configures DataSource from spring.datasource)
- If Spring Boot's auto-configured JdbcTemplate is sufficient, no custom bean is needed; define one only when explicit JdbcTemplate customization is required
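
A minimal sketch of the WriteBuffer from step 1, under stated assumptions. The method names and the `ArrayBlockingQueue` backing come from the step itself. The `synchronized` on the two producer methods is an addition not in the plan: it keeps a concurrent `offer` from taking a slot between the capacity check and the inserts in `offerBatch`, which the check-then-offer sequence alone would not guarantee. `drain` stays unsynchronized because it only frees capacity. The class is package-private here so the snippet compiles standalone.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Sketch of a bounded write buffer with a backpressure signal (no Spring dependency).
final class WriteBuffer<T> {
    private final BlockingQueue<T> queue;
    private final int capacity;

    WriteBuffer(int capacity) {
        this.capacity = capacity;
        this.queue = new ArrayBlockingQueue<>(capacity);
    }

    // Returns false when the buffer is full; callers treat that as backpressure.
    synchronized boolean offer(T item) {
        return queue.offer(item);
    }

    // All-or-nothing: rejects the whole batch if it does not fit.
    synchronized boolean offerBatch(List<T> items) {
        if (queue.remainingCapacity() < items.size()) {
            return false;
        }
        for (T item : items) {
            // Cannot fail: producers are serialized and drains only free space.
            queue.offer(item);
        }
        return true;
    }

    // Removes and returns up to maxBatch items; empty list if nothing is queued.
    List<T> drain(int maxBatch) {
        List<T> batch = new ArrayList<>();
        queue.drainTo(batch, maxBatch);
        return batch;
    }

    int size() { return queue.size(); }
    int capacity() { return capacity; }
    int remainingCapacity() { return queue.remainingCapacity(); }
    boolean isFull() { return queue.remainingCapacity() == 0; }
}
```

With `new WriteBuffer<Integer>(10)`, the first ten `offer` calls succeed and the eleventh returns false, which matches the first behavior listed for this task.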
</action>
<verify>
<automated>mvn test -pl cameleer-server-core -Dtest=WriteBufferTest -q 2>&1 | tail -10</automated>
</verify>
<done>WriteBuffer passes all unit tests. Repository interfaces exist with correct method signatures. IngestionConfig reads from application.yml.</done>
</task>
</tasks>
<verification>
- `mvn test -pl cameleer-server-core -q` -- all WriteBuffer unit tests pass
- `mvn clean compile -q` -- full project compiles with new dependencies
- `docker compose config` -- validates Docker Compose file
- clickhouse/init/01-schema.sql contains CREATE TABLE for all three tables with correct ENGINE, ORDER BY, PARTITION BY, and TTL
</verification>
<success_criteria>
WriteBuffer unit tests green. Project compiles. ClickHouse DDL defines all three tables with TTL and correct partitioning. Repository interfaces define batch insert contracts.
</success_criteria>
<output>
After completion, create `.planning/phases/01-ingestion-pipeline-api-foundation/01-01-SUMMARY.md`
</output>