# Roadmap: Cameleer Server

## Overview

Build an observability server that ingests millions of Camel route transactions per day into ClickHouse, provides structured and full-text search, manages agent lifecycles via SSE, and secures all communication with JWT and Ed25519 signing. The roadmap moves from data-in (ingestion) to data-out (search) to agent management to security, each phase delivering a complete, verifiable capability.

## Phases

**Phase Numbering:**
- Integer phases (1, 2, 3): Planned milestone work
- Decimal phases (2.1, 2.2): Urgent insertions (marked with INSERTED)

Decimal phases appear between their surrounding integers in numeric order.

- [ ] **Phase 1: Ingestion Pipeline + API Foundation** - ClickHouse schema, batch write buffer, ingestion endpoints, API scaffolding
- [ ] **Phase 2: Transaction Search + Diagrams** - Structured search, full-text search, diagram versioning and rendering
- [x] **Phase 3: Agent Registry + SSE Push** - Agent lifecycle management, heartbeat monitoring, SSE config/command push (completed 2026-03-11)
- [ ] **Phase 4: Security** - JWT authentication, Ed25519 signing, bootstrap token registration, endpoint protection

## Phase Details

### Phase 1: Ingestion Pipeline + API Foundation
**Goal**: Agents can POST execution data, diagrams, and metrics to the server, which batch-writes them to ClickHouse with TTL retention and backpressure protection
**Depends on**: Nothing (first phase)
**Requirements**: INGST-01 (#1), INGST-02 (#2), INGST-03 (#3), INGST-04 (#4), INGST-05 (#5), INGST-06 (#6), API-01 (#28), API-02 (#29), API-03 (#30), API-04 (#31), API-05 (#32)
**Success Criteria** (what must be TRUE):
  1. An HTTP client can POST a RouteExecution payload to `/api/v1/data/executions` and receive 202 Accepted, and the data appears in ClickHouse within the flush interval
  2. An HTTP client can POST RouteGraph and metrics payloads to their respective endpoints and receive 202 Accepted
  3. When the write buffer is full, the server returns 503 and does not lose already-buffered data
  4. Data older than the configured TTL (default 30 days) is automatically removed by ClickHouse
  5. The health endpoint responds at `/api/v1/health`, OpenAPI docs are available, protocol version header is validated, and unknown JSON fields are accepted
**Plans:** 2/3 plans executed

Plans:
- [ ] 01-01-PLAN.md -- ClickHouse infrastructure, schema, WriteBuffer, repository interfaces, test infrastructure
- [ ] 01-02-PLAN.md -- Ingestion REST endpoints, ClickHouse repositories, flush scheduler, integration tests
- [ ] 01-03-PLAN.md -- API foundation (health, OpenAPI, protocol header, forward compat, TTL verification)

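
Success criteria 1 and 3 of Phase 1 (202 on accept, 503 when the buffer is full, no loss of already-buffered data) can be illustrated with a bounded queue. This is a minimal sketch, not the project's actual WriteBuffer: the class name `BoundedWriteBuffer`, its methods, and the capacity values are hypothetical, and the ClickHouse batch insert is left out.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical illustration only; names and sizes are assumptions, not the project's API.
final class BoundedWriteBuffer<T> {
    private final BlockingQueue<T> queue;

    BoundedWriteBuffer(int capacity) {
        this.queue = new ArrayBlockingQueue<>(capacity);
    }

    /** Returns the HTTP status an ingestion endpoint could send: 202 if buffered, 503 if full. */
    int accept(T item) {
        // offer() neither blocks nor evicts: when the queue is full only the new item is rejected,
        // so everything already buffered stays in place.
        return queue.offer(item) ? 202 : 503;
    }

    /** Removes up to maxBatch items for one batch insert; a periodic flush task would call this. */
    List<T> drainBatch(int maxBatch) {
        List<T> batch = new ArrayList<>();
        queue.drainTo(batch, maxBatch);
        return batch;
    }

    int size() {
        return queue.size();
    }

    public static void main(String[] args) {
        BoundedWriteBuffer<String> buffer = new BoundedWriteBuffer<>(2);
        System.out.println(buffer.accept("exec-1")); // 202
        System.out.println(buffer.accept("exec-2")); // 202
        System.out.println(buffer.accept("exec-3")); // 503, buffer is full
        System.out.println(buffer.drainBatch(100));  // [exec-1, exec-2]
    }
}
```

Rejecting at the edge with a non-blocking `offer` keeps request threads from piling up behind a slow ClickHouse, at the cost of requiring agents to retry on 503.
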
### Phase 2: Transaction Search + Diagrams
**Goal**: Users can find any transaction by status, time, duration, correlation ID, or content, view execution detail trees, and see versioned route diagrams linked to transactions
**Depends on**: Phase 1
**Requirements**: SRCH-01 (#7), SRCH-02 (#8), SRCH-03 (#9), SRCH-04 (#10), SRCH-05 (#11), SRCH-06 (#12), DIAG-01 (#20), DIAG-02 (#21), DIAG-03 (#22)
**Success Criteria** (what must be TRUE):
  1. User can query transactions filtered by any combination of status, date range, duration range, and correlationId, and receive matching results via REST
  2. User can full-text search across message bodies, headers, error messages, and stack traces and find matching transactions
  3. User can retrieve a transaction's detail view showing the nested processor execution tree
  4. Route diagrams are stored with content-addressable versioning (identical definitions stored once), each transaction links to its active diagram version, and diagrams can be rendered from stored definitions
**Plans:** 4 plans (3 executed, 1 gap closure)

Plans:
- [ ] 02-01-PLAN.md -- Schema extension, core domain types, ingestion updates for search/detail columns
- [ ] 02-02-PLAN.md -- Diagram rendering with ELK layout and JFreeSVG (SVG + JSON via content negotiation)
- [ ] 02-03-PLAN.md -- Search endpoints (GET + POST), transaction detail with tree reconstruction, integration tests
- [ ] 02-04-PLAN.md -- Gap closure: populate diagram_content_hash during ingestion, fix Surefire classloader isolation

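
The content-addressable versioning in Phase 2 success criterion 4 can be sketched as hashing each diagram definition and using the hash as its key, so identical definitions collapse to one stored version. The roadmap does not name a hash function; SHA-256 is an assumption here, as are the class name `DiagramStore`, the in-memory map standing in for ClickHouse, and the premise that definitions arrive in a canonical serialized form. The sketch uses `HexFormat`, which needs Java 17 or later.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashMap;
import java.util.HexFormat;
import java.util.Map;

// Hypothetical illustration only; the real storage is ClickHouse, not a HashMap.
final class DiagramStore {
    private final Map<String, String> definitionsByHash = new HashMap<>();

    /** Hex-encoded SHA-256 of the definition text (hash choice is an assumption). */
    static String contentHash(String definition) {
        try {
            MessageDigest sha256 = MessageDigest.getInstance("SHA-256");
            byte[] digest = sha256.digest(definition.getBytes(StandardCharsets.UTF_8));
            return HexFormat.of().formatHex(digest);
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is mandatory on every Java platform, so this should not happen.
            throw new IllegalStateException(e);
        }
    }

    /** Stores the definition once per distinct content and returns the hash a transaction would link to. */
    String store(String definition) {
        String hash = contentHash(definition);
        definitionsByHash.putIfAbsent(hash, definition);
        return hash;
    }

    String load(String hash) {
        return definitionsByHash.get(hash);
    }

    int versionCount() {
        return definitionsByHash.size();
    }

    public static void main(String[] args) {
        DiagramStore store = new DiagramStore();
        String first = store.store("{\"route\":\"orders\",\"nodes\":[\"from\",\"to\"]}");
        String second = store.store("{\"route\":\"orders\",\"nodes\":[\"from\",\"to\"]}");
        System.out.println(first.equals(second));   // same content, same hash
        System.out.println(store.versionCount());   // stored once
    }
}
```
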
### Phase 3: Agent Registry + SSE Push
**Goal**: Server tracks connected agents through their full lifecycle and can push configuration updates, deep-trace commands, and replay commands to specific agents in real time
**Depends on**: Phase 1
**Requirements**: AGNT-01 (#13), AGNT-02 (#14), AGNT-03 (#15), AGNT-04 (#16), AGNT-05 (#17), AGNT-06 (#18), AGNT-07 (#19)
**Success Criteria** (what must be TRUE):
  1. An agent can register via POST with a bootstrap token and receive a JWT (security enforcement deferred to Phase 4, but the registration flow and token issuance work end-to-end)
  2. Server correctly transitions agents through LIVE/STALE/DEAD states based on heartbeat timing, and the agent list endpoint reflects current states
  3. Server pushes config-update, deep-trace, and replay events to a specific agent's SSE stream, with ping keepalive and Last-Event-ID reconnection support
**Plans:** 2/2 plans complete

Plans:
- [ ] 03-01-PLAN.md -- Agent domain types, registry service, registration/heartbeat/list endpoints, lifecycle monitor
- [ ] 03-02-PLAN.md -- SSE connection management, command push (config-update, deep-trace, replay), ping keepalive, acknowledgement, integration tests

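
The LIVE/STALE/DEAD transitions in Phase 3 success criterion 2 can be sketched as a pure function of the time since the last heartbeat. The roadmap does not state the thresholds; the 90-second and 5-minute values below are placeholders, and `AgentLifecycle` is a hypothetical name rather than the project's lifecycle monitor.

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical illustration only; thresholds are assumed, not taken from the roadmap.
final class AgentLifecycle {
    enum State { LIVE, STALE, DEAD }

    static final Duration STALE_AFTER = Duration.ofSeconds(90); // assumed value
    static final Duration DEAD_AFTER = Duration.ofMinutes(5);   // assumed value

    /** Derives the agent state from how long the agent has been silent at the given instant. */
    static State stateAt(Instant lastHeartbeat, Instant now) {
        Duration silence = Duration.between(lastHeartbeat, now);
        if (silence.compareTo(DEAD_AFTER) >= 0) {
            return State.DEAD;
        }
        if (silence.compareTo(STALE_AFTER) >= 0) {
            return State.STALE;
        }
        return State.LIVE;
    }

    public static void main(String[] args) {
        Instant now = Instant.parse("2026-03-11T12:00:00Z");
        System.out.println(stateAt(now.minusSeconds(10), now));  // LIVE
        System.out.println(stateAt(now.minusSeconds(120), now)); // STALE
        System.out.println(stateAt(now.minusSeconds(600), now)); // DEAD
    }
}
```

Passing `now` explicitly rather than reading the system clock keeps the transition logic easy to test with fixed timestamps.
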
### Phase 4: Security
**Goal**: All server communication is authenticated and integrity-protected, with JWT for API access and Ed25519 signatures for pushed configuration
**Depends on**: Phase 1, Phase 3
**Requirements**: SECU-01 (#23), SECU-02 (#24), SECU-03 (#25), SECU-04 (#26), SECU-05 (#27)
**Success Criteria** (what must be TRUE):
  1. All API endpoints except health and register reject requests without a valid JWT Bearer token
  2. Agents can refresh expired JWTs via the refresh endpoint without re-registering
  3. Server generates an Ed25519 keypair at startup, delivers the public key during registration, and all config-update and replay SSE payloads carry a valid Ed25519 signature
  4. Bootstrap token from CAMELEER_AUTH_TOKEN environment variable is required for initial agent registration
**Plans:** 2/3 plans executed

Plans:
- [x] 04-01-PLAN.md -- Security service foundation: JwtService, Ed25519SigningService, BootstrapTokenValidator, Maven deps, config
- [ ] 04-02-PLAN.md -- Spring Security filter chain, JWT auth filter, registration/refresh integration, existing test adaptation
- [ ] 04-03-PLAN.md -- Ed25519 signing of SSE command payloads (config-update, deep-trace, replay)

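
Phase 4 success criterion 3 (keypair generated at startup, payloads signed, agents verifying with the delivered public key) can be sketched with the JDK's built-in Ed25519 support, available since Java 15. This is not the project's Ed25519SigningService: the class name `PayloadSigner`, the Base64 signature encoding, and signing the raw UTF-8 payload string are all assumptions made for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PublicKey;
import java.security.Signature;
import java.util.Base64;

// Hypothetical illustration only; encoding and payload format are assumptions.
final class PayloadSigner {
    private final KeyPair keyPair;

    /** Generates a fresh Ed25519 keypair, as the server would at startup. */
    PayloadSigner() {
        try {
            this.keyPair = KeyPairGenerator.getInstance("Ed25519").generateKeyPair();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("Ed25519 requires Java 15 or later", e);
        }
    }

    /** The key an agent would receive during registration. */
    PublicKey publicKey() {
        return keyPair.getPublic();
    }

    /** Signs the payload and returns the signature Base64-encoded (encoding is an assumption). */
    String sign(String payload) {
        try {
            Signature signer = Signature.getInstance("Ed25519");
            signer.initSign(keyPair.getPrivate());
            signer.update(payload.getBytes(StandardCharsets.UTF_8));
            return Base64.getEncoder().encodeToString(signer.sign());
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Agent-side check: true only if the signature matches this exact payload. */
    static boolean verify(PublicKey key, String payload, String signatureBase64) {
        try {
            Signature verifier = Signature.getInstance("Ed25519");
            verifier.initVerify(key);
            verifier.update(payload.getBytes(StandardCharsets.UTF_8));
            return verifier.verify(Base64.getDecoder().decode(signatureBase64));
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        PayloadSigner signer = new PayloadSigner();
        String payload = "{\"type\":\"config-update\",\"samplingRate\":0.5}";
        String signature = signer.sign(payload);
        System.out.println(verify(signer.publicKey(), payload, signature));        // true
        System.out.println(verify(signer.publicKey(), payload + " ", signature)); // false
    }
}
```
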
## Progress

**Execution Order:**
Phases execute in numeric order: 1 -> 2 -> 3 -> 4
Note: Phases 2 and 3 both depend only on Phase 1 and could execute in parallel.

| Phase | Plans Complete | Status | Completed |
|-------|----------------|--------|-----------|
| 1. Ingestion Pipeline + API Foundation | 3/3 | Complete | 2026-03-11 |
| 2. Transaction Search + Diagrams | 3/4 | Gap Closure | |
| 3. Agent Registry + SSE Push | 2/2 | Complete | 2026-03-11 |
| 4. Security | 2/3 | In Progress | |