From 7f0ceca8b1acdaea9497bdf2d20d9fe59978d472 Mon Sep 17 00:00:00 2001
From: hsiegeln <37154749+hsiegeln@users.noreply.github.com>
Date: Wed, 11 Mar 2026 12:16:46 +0100
Subject: [PATCH] docs(01-02): complete ingestion endpoints plan

- SUMMARY.md with 14 files, 3 commits, 11 integration tests
- STATE.md: Phase 1 complete (3/3 plans)
- ROADMAP.md: Phase 01 progress updated
- REQUIREMENTS.md: INGST-01, INGST-02, INGST-03 marked complete

Co-Authored-By: Claude Opus 4.6
---
 .planning/REQUIREMENTS.md                          |   6 +-
 .planning/STATE.md                                 |  30 ++--
 .../01-02-SUMMARY.md                               | 148 ++++++++++++++++++
 3 files changed, 168 insertions(+), 16 deletions(-)
 create mode 100644 .planning/phases/01-ingestion-pipeline-api-foundation/01-02-SUMMARY.md

diff --git a/.planning/REQUIREMENTS.md b/.planning/REQUIREMENTS.md
index 48e1f07c..8f2064a3 100644
--- a/.planning/REQUIREMENTS.md
+++ b/.planning/REQUIREMENTS.md
@@ -9,9 +9,9 @@ Requirements for initial release. Each maps to roadmap phases. Tracked as Gitea
 
 ### Data Ingestion
 
-- [ ] **INGST-01**: Server accepts `RouteExecution` (single or array) via `POST /api/v1/data/executions` and returns `202 Accepted` (#1)
-- [ ] **INGST-02**: Server accepts `RouteGraph` (single or array) via `POST /api/v1/data/diagrams` and returns `202 Accepted` (#2)
-- [ ] **INGST-03**: Server accepts metrics snapshots via `POST /api/v1/data/metrics` and returns `202 Accepted` (#3)
+- [x] **INGST-01**: Server accepts `RouteExecution` (single or array) via `POST /api/v1/data/executions` and returns `202 Accepted` (#1)
+- [x] **INGST-02**: Server accepts `RouteGraph` (single or array) via `POST /api/v1/data/diagrams` and returns `202 Accepted` (#2)
+- [x] **INGST-03**: Server accepts metrics snapshots via `POST /api/v1/data/metrics` and returns `202 Accepted` (#3)
 - [x] **INGST-04**: Ingestion uses in-memory batch buffer with configurable flush interval/size for ClickHouse writes (#4)
 - [x] **INGST-05**: Server returns `503 Service Unavailable` when write buffer is full (backpressure) (#5)
 - [x] **INGST-06**: ClickHouse TTL automatically expires data after 30 days (configurable) (#6)
diff --git a/.planning/STATE.md b/.planning/STATE.md
index 456e6679..fdfa30b4 100644
--- a/.planning/STATE.md
+++ b/.planning/STATE.md
@@ -3,15 +3,15 @@ gsd_state_version: 1.0
 milestone: v1.0
 milestone_name: milestone
 status: executing
-stopped_at: Completed 01-03-PLAN.md
-last_updated: "2026-03-11T11:03:08.000Z"
-last_activity: 2026-03-11 -- Completed 01-03 (API foundation, protocol interceptor, integration tests)
+stopped_at: Completed 01-02-PLAN.md
+last_updated: "2026-03-11T11:14:00.000Z"
+last_activity: 2026-03-11 -- Completed 01-02 (Ingestion endpoints, ClickHouse repositories, flush scheduler, 11 ITs)
 progress:
   total_phases: 4
-  completed_phases: 0
+  completed_phases: 1
   total_plans: 3
-  completed_plans: 2
-  percent: 66
+  completed_plans: 3
+  percent: 100
 ---
 
 # Project State
@@ -25,12 +25,12 @@ See: .planning/PROJECT.md (updated 2026-03-11)
 
 ## Current Position
 
-Phase: 1 of 4 (Ingestion Pipeline + API Foundation)
-Plan: 2 of 3 in current phase
-Status: Executing
-Last activity: 2026-03-11 -- Completed 01-03 (API foundation, protocol interceptor, integration tests)
+Phase: 1 of 4 (Ingestion Pipeline + API Foundation) -- COMPLETE
+Plan: 3 of 3 in current phase
+Status: Phase 1 Complete
+Last activity: 2026-03-11 -- Completed 01-02 (Ingestion endpoints, ClickHouse repositories, flush scheduler, 11 ITs)
 
-Progress: [██████░░░░] 66%
+Progress: [██████████] 100%
 
 ## Performance Metrics
 
@@ -51,6 +51,7 @@ Progress: [██████░░░░] 66%
 *Updated after each plan completion*
 
 | Phase 01 P01 | 3min | 2 tasks | 13 files |
+| Phase 01 P02 | 7min | 2 tasks | 14 files |
 | Phase 01 P03 | 10min | 2 tasks | 12 files |
 
 ## Accumulated Context
@@ -69,6 +70,9 @@ Recent decisions affecting current work:
 - [Phase 01]: Upgraded testcontainers to 2.0.3 for Docker Desktop 29.x compatibility
 - [Phase 01]: Changed error_message/error_stacktrace to non-nullable String for tokenbf_v1 index compat
 - [Phase 01]: TTL expressions require toDateTime() cast for DateTime64 columns in ClickHouse 25.3
+- [Phase 01]: Controllers accept raw String body to support both single and array JSON payloads
+- [Phase 01]: IngestionService is a plain class in core module, wired as bean by IngestionBeanConfig in app
+- [Phase 01]: Removed @Configuration from IngestionConfig to fix duplicate bean with @EnableConfigurationProperties
 
 ### Pending Todos
 
@@ -83,6 +87,6 @@ None yet.
 
 ## Session Continuity
 
-Last session: 2026-03-11T11:03:08.000Z
-Stopped at: Completed 01-03-PLAN.md
+Last session: 2026-03-11T11:14:00.000Z
+Stopped at: Completed 01-02-PLAN.md (Phase 1 fully complete)
 Resume file: None
diff --git a/.planning/phases/01-ingestion-pipeline-api-foundation/01-02-SUMMARY.md b/.planning/phases/01-ingestion-pipeline-api-foundation/01-02-SUMMARY.md
new file mode 100644
index 00000000..6e50e8ef
--- /dev/null
+++ b/.planning/phases/01-ingestion-pipeline-api-foundation/01-02-SUMMARY.md
@@ -0,0 +1,148 @@
+---
+phase: 01-ingestion-pipeline-api-foundation
+plan: 02
+subsystem: api
+tags: [rest, ingestion, clickhouse, jdbc, backpressure, spring-boot, integration-tests]
+
+requires:
+  - phase: 01-01
+    provides: WriteBuffer, repository interfaces, IngestionConfig, ClickHouse schema
+  - phase: 01-03
+    provides: AbstractClickHouseIT base class, ProtocolVersionInterceptor, application bootstrap
+
+provides:
+  - IngestionService routing data to WriteBuffer instances
+  - ClickHouseExecutionRepository with batch insert and parallel processor arrays
+  - ClickHouseDiagramRepository with SHA-256 content-hash deduplication
+  - ClickHouseMetricsRepository with batch insert for agent_metrics
+  - ClickHouseFlushScheduler with SmartLifecycle shutdown flush
+  - POST /api/v1/data/executions endpoint (single and array)
+  - POST /api/v1/data/diagrams endpoint (single and array)
+  - POST /api/v1/data/metrics endpoint (array)
+  - Backpressure: 503 with Retry-After when buffer full
+  - 11 integration tests verifying end-to-end ingestion pipeline
+
+affects: [02-search, 03-agent-registry]
+
+tech-stack:
+  added: []
+  patterns: [single/array JSON payload parsing via raw String body, SmartLifecycle for graceful shutdown flush, BatchPreparedStatementSetter for ClickHouse batch inserts, SHA-256 content-hash dedup for diagrams]
+
+key-files:
+  created:
+    - cameleer3-server-core/src/main/java/com/cameleer3/server/core/ingestion/IngestionService.java
+    - cameleer3-server-app/src/main/java/com/cameleer3/server/app/storage/ClickHouseExecutionRepository.java
+    - cameleer3-server-app/src/main/java/com/cameleer3/server/app/storage/ClickHouseDiagramRepository.java
+    - cameleer3-server-app/src/main/java/com/cameleer3/server/app/storage/ClickHouseMetricsRepository.java
+    - cameleer3-server-app/src/main/java/com/cameleer3/server/app/ingestion/ClickHouseFlushScheduler.java
+    - cameleer3-server-app/src/main/java/com/cameleer3/server/app/config/IngestionBeanConfig.java
+    - cameleer3-server-app/src/main/java/com/cameleer3/server/app/controller/ExecutionController.java
+    - cameleer3-server-app/src/main/java/com/cameleer3/server/app/controller/DiagramController.java
+    - cameleer3-server-app/src/main/java/com/cameleer3/server/app/controller/MetricsController.java
+    - cameleer3-server-app/src/test/java/com/cameleer3/server/app/controller/ExecutionControllerIT.java
+    - cameleer3-server-app/src/test/java/com/cameleer3/server/app/controller/DiagramControllerIT.java
+    - cameleer3-server-app/src/test/java/com/cameleer3/server/app/controller/MetricsControllerIT.java
+    - cameleer3-server-app/src/test/java/com/cameleer3/server/app/controller/BackpressureIT.java
+  modified:
+    - cameleer3-server-app/src/main/java/com/cameleer3/server/app/config/IngestionConfig.java
+
+key-decisions:
+  - "Controllers accept raw String body and detect single vs array JSON to support both payload formats"
+  - "IngestionService is a plain class in core module, wired as a bean by IngestionBeanConfig in app module"
+  - "Removed @Configuration from IngestionConfig to fix duplicate bean with @EnableConfigurationProperties"
+
+patterns-established:
+  - "Controller pattern: accept raw String body, parse single/array JSON, delegate to IngestionService, return 202 or 503"
+  - "Repository pattern: BatchPreparedStatementSetter for ClickHouse batch inserts"
+  - "FlushScheduler pattern: SmartLifecycle for graceful shutdown, loop-drain until empty"
+  - "Backpressure pattern: WriteBuffer.offer returns false -> controller returns 503 + Retry-After"
+
+requirements-completed: [INGST-01, INGST-02, INGST-03, INGST-05]
+
+duration: 7min
+completed: 2026-03-11
+---
+
+# Phase 1 Plan 02: Ingestion Endpoints and ClickHouse Pipeline Summary
+
+**Three REST ingestion endpoints with ClickHouse batch insert repositories, scheduled flush with graceful shutdown, and 11 green integration tests verifying end-to-end data flow and backpressure**
+
+## Performance
+
+- **Duration:** 7 min
+- **Started:** 2026-03-11T11:06:47Z
+- **Completed:** 2026-03-11T11:14:00Z
+- **Tasks:** 2
+- **Files modified:** 14
+
+## Accomplishments
+- Complete ingestion pipeline: HTTP POST -> IngestionService -> WriteBuffer -> ClickHouseFlushScheduler -> ClickHouse repositories
+- Three REST endpoints accepting both single and array JSON payloads with 202 Accepted response
+- Backpressure returns 503 with Retry-After header when write buffer is full
+- ClickHouse repositories: batch insert for executions (with flattened processor arrays), JSON storage with SHA-256 dedup for diagrams, batch insert for metrics
+- Graceful shutdown via SmartLifecycle drains all remaining buffered data
+
+## Task Commits
+
+Each task was committed atomically:
+
+1. **Task 1: IngestionService, ClickHouse repositories, and flush scheduler** - `17a18cf` (feat)
+2. **Task 2 RED: Failing integration tests for ingestion endpoints** - `d55ebc1` (test)
+3. **Task 2 GREEN: Ingestion REST controllers with backpressure** - `8fe65f0` (feat)
+
+## Files Created/Modified
+- `cameleer3-server-core/.../ingestion/IngestionService.java` - Routes data to WriteBuffer instances
+- `cameleer3-server-app/.../storage/ClickHouseExecutionRepository.java` - Batch insert with parallel processor arrays
+- `cameleer3-server-app/.../storage/ClickHouseDiagramRepository.java` - JSON storage with SHA-256 content-hash dedup
+- `cameleer3-server-app/.../storage/ClickHouseMetricsRepository.java` - Batch insert for agent_metrics
+- `cameleer3-server-app/.../ingestion/ClickHouseFlushScheduler.java` - Scheduled drain + SmartLifecycle shutdown
+- `cameleer3-server-app/.../config/IngestionBeanConfig.java` - WriteBuffer and IngestionService bean wiring
+- `cameleer3-server-app/.../controller/ExecutionController.java` - POST /api/v1/data/executions
+- `cameleer3-server-app/.../controller/DiagramController.java` - POST /api/v1/data/diagrams
+- `cameleer3-server-app/.../controller/MetricsController.java` - POST /api/v1/data/metrics
+- `cameleer3-server-app/.../config/IngestionConfig.java` - Removed @Configuration (fix duplicate bean)
+- `cameleer3-server-app/.../controller/ExecutionControllerIT.java` - 4 tests: single, array, flush, unknown fields
+- `cameleer3-server-app/.../controller/DiagramControllerIT.java` - 3 tests: single, array, flush
+- `cameleer3-server-app/.../controller/MetricsControllerIT.java` - 2 tests: POST, flush
+- `cameleer3-server-app/.../controller/BackpressureIT.java` - 2 tests: 503 response, data not lost
+
+## Decisions Made
+- Controllers accept raw String body and detect single vs array JSON (starts with `[`), supporting both payload formats per protocol spec
+- IngestionService is a plain class in core module (no Spring annotations), wired as a bean by IngestionBeanConfig in app module
+- Removed `@Configuration` from IngestionConfig to fix duplicate bean conflict with `@EnableConfigurationProperties`
+
+## Deviations from Plan
+
+### Auto-fixed Issues
+
+**1. [Rule 1 - Bug] Fixed duplicate IngestionConfig bean**
+- **Found during:** Task 2 (integration test context startup)
+- **Issue:** IngestionConfig had both `@Configuration` and `@ConfigurationProperties`, while `@EnableConfigurationProperties(IngestionConfig.class)` on the app class created a second bean, causing "expected single matching bean but found 2"
+- **Fix:** Removed `@Configuration` from IngestionConfig, relying solely on `@EnableConfigurationProperties`
+- **Files modified:** cameleer3-server-app/src/main/java/com/cameleer3/server/app/config/IngestionConfig.java
+- **Verification:** Application context starts successfully, all tests pass
+- **Committed in:** 8fe65f0
+
+---
+
+**Total deviations:** 1 auto-fixed (1 bug)
+**Impact on plan:** Necessary fix for Spring context startup. No scope creep.
+
+## Issues Encountered
+- BackpressureIT initially failed because the scheduled flush drained the buffer before the test could fill it. Fixed by using a 60s flush interval and batch POST to fill buffer atomically.
+
+## User Setup Required
+None - no external service configuration required.
+
+## Next Phase Readiness
+- All three ingestion endpoints operational and tested
+- Phase 1 complete: ClickHouse infrastructure, API foundation, and ingestion pipeline all working
+- Ready for Phase 2 (search) and Phase 3 (agent registry) which both depend only on Phase 1
+
+## Self-Check: PASSED
+
+All 13 created files verified present. All 3 task commits verified in git log.
+
+---
+*Phase: 01-ingestion-pipeline-api-foundation*
+*Completed: 2026-03-11*