cameleer-server/.planning/phases/01-ingestion-pipeline-api-foundation/01-02-SUMMARY.md

---
phase: 01-ingestion-pipeline-api-foundation
plan: 02
subsystem: api
tags: [rest, ingestion, clickhouse, jdbc, backpressure, spring-boot, integration-tests]
requires:
- phase: 01-01
  provides: WriteBuffer, repository interfaces, IngestionConfig, ClickHouse schema
- phase: 01-03
  provides: AbstractClickHouseIT base class, ProtocolVersionInterceptor, application bootstrap
provides:
- IngestionService routing data to WriteBuffer instances
- ClickHouseExecutionRepository with batch insert and parallel processor arrays
- ClickHouseDiagramRepository with SHA-256 content-hash deduplication
- ClickHouseMetricsRepository with batch insert for agent_metrics
- ClickHouseFlushScheduler with SmartLifecycle shutdown flush
- POST /api/v1/data/executions endpoint (single and array)
- POST /api/v1/data/diagrams endpoint (single and array)
- POST /api/v1/data/metrics endpoint (array)
- Backpressure: 503 with Retry-After when buffer full
- 11 integration tests verifying end-to-end ingestion pipeline
affects: [02-search, 03-agent-registry]
tech-stack:
  added: []
  patterns: [single/array JSON payload parsing via raw String body, SmartLifecycle for graceful shutdown flush, BatchPreparedStatementSetter for ClickHouse batch inserts, SHA-256 content-hash dedup for diagrams]
key-files:
  created:
  - cameleer-server-core/src/main/java/com/cameleer/server/core/ingestion/IngestionService.java
  - cameleer-server-app/src/main/java/com/cameleer/server/app/storage/ClickHouseExecutionRepository.java
  - cameleer-server-app/src/main/java/com/cameleer/server/app/storage/ClickHouseDiagramRepository.java
  - cameleer-server-app/src/main/java/com/cameleer/server/app/storage/ClickHouseMetricsRepository.java
  - cameleer-server-app/src/main/java/com/cameleer/server/app/ingestion/ClickHouseFlushScheduler.java
  - cameleer-server-app/src/main/java/com/cameleer/server/app/config/IngestionBeanConfig.java
  - cameleer-server-app/src/main/java/com/cameleer/server/app/controller/ExecutionController.java
  - cameleer-server-app/src/main/java/com/cameleer/server/app/controller/DiagramController.java
  - cameleer-server-app/src/main/java/com/cameleer/server/app/controller/MetricsController.java
  - cameleer-server-app/src/test/java/com/cameleer/server/app/controller/ExecutionControllerIT.java
  - cameleer-server-app/src/test/java/com/cameleer/server/app/controller/DiagramControllerIT.java
  - cameleer-server-app/src/test/java/com/cameleer/server/app/controller/MetricsControllerIT.java
  - cameleer-server-app/src/test/java/com/cameleer/server/app/controller/BackpressureIT.java
  modified:
  - cameleer-server-app/src/main/java/com/cameleer/server/app/config/IngestionConfig.java
key-decisions:
- "Controllers accept raw String body and detect single vs array JSON to support both payload formats"
- "IngestionService is a plain class in core module, wired as a bean by IngestionBeanConfig in app module"
- "Removed @Configuration from IngestionConfig to fix duplicate bean with @EnableConfigurationProperties"
patterns-established:
- "Controller pattern: accept raw String body, parse single/array JSON, delegate to IngestionService, return 202 or 503"
- "Repository pattern: BatchPreparedStatementSetter for ClickHouse batch inserts"
- "FlushScheduler pattern: SmartLifecycle for graceful shutdown, loop-drain until empty"
- "Backpressure pattern: WriteBuffer.offer returns false -> controller returns 503 + Retry-After"
requirements-completed: [INGST-01, INGST-02, INGST-03, INGST-05]
duration: 7min
completed: 2026-03-11
---
# Phase 1 Plan 02: Ingestion Endpoints and ClickHouse Pipeline Summary
**Three REST ingestion endpoints with ClickHouse batch insert repositories, scheduled flush with graceful shutdown, and 11 green integration tests verifying end-to-end data flow and backpressure**
## Performance
- **Duration:** 7 min
- **Started:** 2026-03-11T11:06:47Z
- **Completed:** 2026-03-11T11:14:00Z
- **Tasks:** 2
- **Files modified:** 14
## Accomplishments
- Complete ingestion pipeline: HTTP POST -> IngestionService -> WriteBuffer -> ClickHouseFlushScheduler -> ClickHouse repositories
- Three REST endpoints accepting both single and array JSON payloads with 202 Accepted response
- Backpressure returns 503 with Retry-After header when write buffer is full
- ClickHouse repositories: batch insert for executions (with flattened processor arrays), JSON storage with SHA-256 dedup for diagrams, batch insert for metrics
- Graceful shutdown via SmartLifecycle drains all remaining buffered data
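
The loop-drain behaviour behind the scheduled flush and the shutdown flush can be sketched roughly as follows. This is a minimal illustration using only the JDK, not the project's `ClickHouseFlushScheduler`: the class name `DrainSketch`, the method `drainAll`, the batch size, and the use of a `BlockingQueue` in place of `WriteBuffer` are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.function.Consumer;

// Hypothetical stand-in for the flush scheduler's drain logic; not the project's actual class.
final class DrainSketch {

    /**
     * Drains the buffer in batches of at most batchSize until it is empty
     * and returns the total number of items handed to the writer.
     */
    static <T> int drainAll(BlockingQueue<T> buffer, int batchSize, Consumer<List<T>> batchWriter) {
        int flushed = 0;
        List<T> batch = new ArrayList<>(batchSize);
        // drainTo never blocks and returns 0 once the queue is empty.
        while (buffer.drainTo(batch, batchSize) > 0) {
            batchWriter.accept(List.copyOf(batch)); // e.g. one batch insert per call
            flushed += batch.size();
            batch.clear();
        }
        return flushed;
    }

    public static void main(String[] args) {
        BlockingQueue<String> buffer = new ArrayBlockingQueue<>(10);
        for (int i = 0; i < 7; i++) {
            buffer.offer("exec-" + i);
        }
        int flushed = drainAll(buffer, 3, b -> System.out.println("flushing " + b.size() + " items"));
        System.out.println("flushed=" + flushed + ", remaining=" + buffer.size());
    }
}
```

On shutdown, a `SmartLifecycle.stop()` implementation could call a method like this once per buffer so that nothing accepted with a 202 is dropped.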
## Task Commits
Each task was committed atomically:
1. **Task 1: IngestionService, ClickHouse repositories, and flush scheduler** - `17a18cf` (feat)
2. **Task 2 RED: Failing integration tests for ingestion endpoints** - `d55ebc1` (test)
3. **Task 2 GREEN: Ingestion REST controllers with backpressure** - `8fe65f0` (feat)
## Files Created/Modified
- `cameleer-server-core/.../ingestion/IngestionService.java` - Routes data to WriteBuffer instances
- `cameleer-server-app/.../storage/ClickHouseExecutionRepository.java` - Batch insert with parallel processor arrays
- `cameleer-server-app/.../storage/ClickHouseDiagramRepository.java` - JSON storage with SHA-256 content-hash dedup
- `cameleer-server-app/.../storage/ClickHouseMetricsRepository.java` - Batch insert for agent_metrics
- `cameleer-server-app/.../ingestion/ClickHouseFlushScheduler.java` - Scheduled drain + SmartLifecycle shutdown
- `cameleer-server-app/.../config/IngestionBeanConfig.java` - WriteBuffer and IngestionService bean wiring
- `cameleer-server-app/.../controller/ExecutionController.java` - POST /api/v1/data/executions
- `cameleer-server-app/.../controller/DiagramController.java` - POST /api/v1/data/diagrams
- `cameleer-server-app/.../controller/MetricsController.java` - POST /api/v1/data/metrics
- `cameleer-server-app/.../config/IngestionConfig.java` - Removed @Configuration (fix duplicate bean)
- `cameleer-server-app/.../controller/ExecutionControllerIT.java` - 4 tests: single, array, flush, unknown fields
- `cameleer-server-app/.../controller/DiagramControllerIT.java` - 3 tests: single, array, flush
- `cameleer-server-app/.../controller/MetricsControllerIT.java` - 2 tests: POST, flush
- `cameleer-server-app/.../controller/BackpressureIT.java` - 2 tests: 503 response, data not lost
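
As an illustration of the SHA-256 content-hash dedup used for diagrams, the sketch below hashes the diagram JSON and stores each distinct hash only once. It is an assumption-laden example: `DiagramHashSketch`, `sha256Hex`, and the in-memory map are hypothetical, and how `ClickHouseDiagramRepository` actually enforces uniqueness in ClickHouse is not shown in this summary.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical in-memory stand-in for content-hash deduplication; not the project's repository.
final class DiagramHashSketch {

    private final Map<String, String> byHash = new LinkedHashMap<>();

    /** Returns the lowercase hex SHA-256 digest (64 characters) of the UTF-8 bytes of content. */
    static String sha256Hex(String content) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            byte[] hash = digest.digest(content.getBytes(StandardCharsets.UTF_8));
            // Pad to 64 hex digits so leading zero bytes are preserved.
            return String.format("%064x", new BigInteger(1, hash));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 must be available on every Java platform", e);
        }
    }

    /** Stores the diagram under its content hash; returns true only if it was not stored before. */
    boolean store(String diagramJson) {
        return byHash.putIfAbsent(sha256Hex(diagramJson), diagramJson) == null;
    }

    int size() {
        return byHash.size();
    }

    public static void main(String[] args) {
        DiagramHashSketch repo = new DiagramHashSketch();
        System.out.println(repo.store("{\"routeId\":\"r1\"}")); // first copy is stored
        System.out.println(repo.store("{\"routeId\":\"r1\"}")); // identical content is skipped
        System.out.println(repo.size());
    }
}
```

Note that hashing the raw string treats two diagrams that differ only in whitespace or key order as different content; whether the project normalizes the JSON first is not stated here.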
## Decisions Made
- Controllers accept raw String body and detect single vs array JSON (starts with `[`), supporting both payload formats per protocol spec
- IngestionService is a plain class in core module (no Spring annotations), wired as a bean by IngestionBeanConfig in app module
- Removed `@Configuration` from IngestionConfig to fix duplicate bean conflict with `@EnableConfigurationProperties`
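
The single-versus-array detection from the first decision can be sketched with plain string handling. This is a hypothetical helper, not the controllers' code: `PayloadShapeSketch`, `isArrayPayload`, and `asJsonArray` are invented names, and skipping leading whitespace before the `[` check is an assumption (the summary only says "starts with `[`").

```java
// Hypothetical helper illustrating single-vs-array payload detection on a raw String body.
final class PayloadShapeSketch {

    /** Returns true when the first non-whitespace character of the (non-null) body is '['. */
    static boolean isArrayPayload(String body) {
        return body.stripLeading().startsWith("[");
    }

    /**
     * Normalizes a body to array form so downstream parsing can always expect a list:
     * arrays are returned unchanged, anything else is wrapped in brackets.
     */
    static String asJsonArray(String body) {
        return isArrayPayload(body) ? body : "[" + body.strip() + "]";
    }

    public static void main(String[] args) {
        System.out.println(asJsonArray("{\"routeId\":\"r1\"}"));
        System.out.println(asJsonArray("[{\"routeId\":\"r1\"},{\"routeId\":\"r2\"}]"));
    }
}
```

Accepting the raw `String` rather than a typed `@RequestBody` is what makes this check possible before any JSON deserialization happens.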
## Deviations from Plan
### Auto-fixed Issues
**1. [Rule 1 - Bug] Fixed duplicate IngestionConfig bean**
- **Found during:** Task 2 (integration test context startup)
- **Issue:** IngestionConfig had both `@Configuration` and `@ConfigurationProperties`, while `@EnableConfigurationProperties(IngestionConfig.class)` on the app class created a second bean, causing "expected single matching bean but found 2"
- **Fix:** Removed `@Configuration` from IngestionConfig, relying solely on `@EnableConfigurationProperties`
- **Files modified:** cameleer-server-app/src/main/java/com/cameleer/server/app/config/IngestionConfig.java
- **Verification:** Application context starts successfully, all tests pass
- **Committed in:** 8fe65f0
---
**Total deviations:** 1 auto-fixed (1 bug)
**Impact on plan:** Necessary fix for Spring context startup. No scope creep.
## Issues Encountered
- BackpressureIT initially failed because the scheduled flush drained the buffer before the test could fill it. Fixed by using a 60s flush interval and a single batch POST that fills the buffer atomically.
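
The backpressure contract that BackpressureIT exercises (a rejected `offer` becomes a 503 with `Retry-After`, while already buffered data stays intact) might look like the sketch below. Everything in it is illustrative: `BackpressureSketch`, its `Response` type, the `ArrayBlockingQueue` standing in for `WriteBuffer`, and the `Retry-After` value of 5 seconds are assumptions, not values taken from the project.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical sketch of offer-based backpressure; the real controllers use Spring MVC and WriteBuffer.
final class BackpressureSketch {

    /** Minimal response holder: HTTP status plus an optional Retry-After header value. */
    static final class Response {
        final int status;
        final String retryAfter; // null when the header is absent

        Response(int status, String retryAfter) {
            this.status = status;
            this.retryAfter = retryAfter;
        }
    }

    private final BlockingQueue<String> buffer;

    BackpressureSketch(int capacity) {
        this.buffer = new ArrayBlockingQueue<>(capacity);
    }

    /** Non-blocking ingest: 202 when the item was buffered, 503 with Retry-After when the buffer is full. */
    Response ingest(String item) {
        if (buffer.offer(item)) {
            return new Response(202, null);
        }
        return new Response(503, "5"); // assumed retry hint in seconds
    }

    int buffered() {
        return buffer.size();
    }

    public static void main(String[] args) {
        BackpressureSketch endpoint = new BackpressureSketch(2);
        for (int i = 0; i < 3; i++) {
            Response r = endpoint.ingest("exec-" + i);
            System.out.println(r.status + (r.retryAfter != null ? " Retry-After: " + r.retryAfter : ""));
        }
        System.out.println("buffered=" + endpoint.buffered());
    }
}
```

Because `offer` never blocks, a full buffer turns into an immediate 503 instead of a stalled request thread, and nothing already in the buffer is discarded.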
## User Setup Required
None - no external service configuration required.
## Next Phase Readiness
- All three ingestion endpoints operational and tested
- Phase 1 complete: ClickHouse infrastructure, API foundation, and ingestion pipeline all working
- Ready for Phase 2 (search) and Phase 3 (agent registry) which both depend only on Phase 1
## Self-Check: PASSED
All 13 created files verified present. All 3 task commits verified in git log.

---
*Phase: 01-ingestion-pipeline-api-foundation*
*Completed: 2026-03-11*