Compare commits

...

690 Commits

Author SHA1 Message Date
ca78aa3962 Merge pull request 'feat(alerting): Plan 02 — backend (domain, storage, evaluators, dispatch)' (#140) from feat/alerting-02-backend into main
Some checks failed
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m58s
CI / docker (push) Successful in 34s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Failing after 2m17s
2026-04-20 09:03:15 +02:00
b763155a60 Merge pull request 'feat(alerting): Plan 01 — outbound HTTP infra + admin-managed outbound connections' (#139) from feat/alerting-01-outbound-infra into main
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m53s
CI / docker (push) Successful in 36s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 38s
Reviewed-on: #139
2026-04-20 08:57:40 +02:00
hsiegeln
aa9e93369f docs(alerting): add V11-V13 migration entries to CLAUDE.md
Some checks failed
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 3m35s
CI / docker (push) Successful in 4m34s
CI / deploy (push) Has been skipped
CI / deploy-feature (push) Failing after 2m10s
Documents the three Flyway migrations added by the alerting feature branch
so future sessions have an accurate migration map.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-20 08:27:39 +02:00
hsiegeln
b0ba08e572 test(alerting): rewrite AlertingFullLifecycleIT — REST-driven rule creation, re-notify cadence
Rule creation now goes through POST /alerts/rules (exercises saveTargets on the
write path). Clock is replaced with @MockBean(name="alertingClock") and re-stubbed
in @BeforeEach to survive Mockito's inter-test reset. Six ordered steps:

  1. seed log → tick evaluator → assert FIRING instance with non-empty targets (B-1)
  2. tick dispatcher → assert DELIVERED notification + lastNotifiedAt stamped (B-2)
  3. ack via REST → assert ACKNOWLEDGED state
  4. create silence → inject PENDING notification → tick dispatcher → assert silenced (FAILED)
  5. delete rule → assert rule_id nullified, rule_snapshot preserved (ON DELETE SET NULL)
  6. new rule with reNotifyMinutes=1 → first dispatch → advance clock 61s →
     evaluator sweep → second dispatch → verify 2 WireMock POSTs (B-2 cadence)

Background scheduler races addressed by resetting claimed_by/claimed_until before
each manual tick. Simulated clock set AFTER log insert to guarantee log timestamp
falls within the evaluator window. Re-notify notifications backdated in Postgres
to work around the simulated vs real clock gap in claimDueNotifications.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-20 08:26:38 +02:00
hsiegeln
2c82b50ea2 fix(alerting/B-1): AlertStateTransitions.newInstance() propagates rule targets to AlertInstance
newInstance() now maps rule.targets() into targetUserIds/targetGroupIds/targetRoleNames
so newly created AlertInstance rows carry the correct target arrays.
Previously these were always empty List.of(), making the inbox query return nothing.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-20 08:26:25 +02:00
hsiegeln
7e79ff4d98 fix(alerting/I-2): add unique partial index on alert_instances(rule_id) for open states
V13 migration creates alert_instances_open_rule_uq — a partial unique index on
(rule_id) WHERE state IN ('PENDING','FIRING','ACKNOWLEDGED'), preventing
duplicate open instances per rule. PostgresAlertInstanceRepository.save() catches
DuplicateKeyException and returns the existing open instance instead of failing.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-20 08:26:07 +02:00
hsiegeln
424894a3e2 fix(alerting/I-1): retry endpoint resets attempts to 0 instead of incrementing
AlertNotificationRepository gains resetForRetry(UUID, Instant) which sets
attempts=0, status=PENDING, next_attempt_at=now, and clears claim/response
fields. AlertNotificationController calls resetForRetry instead of
scheduleRetry so a manual retry always starts from a clean slate.

AlertNotificationControllerIT adds retryResetsAttemptsToZero to verify
attempts==0 and status==PENDING after three prior markFailed calls.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-20 08:25:59 +02:00
hsiegeln
d74079da63 fix(alerting/B-2): implement re-notify cadence sweep and lastNotifiedAt tracking
AlertInstanceRepository gains listFiringDueForReNotify(Instant) — only returns
instances where last_notified_at IS NOT NULL and cadence has elapsed (IS NULL
branch excluded: sweep only re-notifies, initial notify is the dispatcher's job).

AlertEvaluatorJob.sweepReNotify() runs at the end of each tick, enqueues fresh
notifications for eligible instances and stamps last_notified_at.

NotificationDispatchJob stamps last_notified_at on the alert_instance when a
notification is DELIVERED, providing the anchor timestamp for cadence checks.

PostgresAlertInstanceRepositoryIT adds listFiringDueForReNotify test covering
the three-rule eligibility matrix (never-notified, long-ago, recent).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-20 08:25:50 +02:00
hsiegeln
3f036da03d fix(alerting/B-1): PostgresAlertRuleRepository.save() now persists alert_rule_targets
saveTargets() is called unconditionally at the end of save() — it deletes
existing targets and re-inserts from the current targets list. findById()
and listByEnvironment() already call withTargets() so reads are consistent.
PostgresAlertRuleRepositoryIT adds saveTargets_roundtrip and
saveTargets_updateReplacesExistingTargets to cover the new write path.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-20 08:25:39 +02:00
hsiegeln
8bf45d5456 fix(alerting): use ALTER TABLE MODIFY SETTING to enable projections on executions ReplacingMergeTree
Investigated three approaches for CH 24.12:
- Inline SETTINGS on ADD PROJECTION: rejected (UNKNOWN_SETTING — not a query-level setting).
- ALTER TABLE MODIFY SETTING deduplicate_merge_projection_mode='rebuild': works; persists in
  table metadata across connection restarts; runs before ADD PROJECTION in the SQL script.
- Session-level JDBC URL param: not pursued (MODIFY SETTING is strictly better).

alerting_projections.sql now runs MODIFY SETTING before the two executions ADD PROJECTIONs.
AlertingProjectionsIT strengthened to assert all four projections (including alerting_app_status
and alerting_route_status on executions) exist after schema init.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-20 07:36:55 +02:00
hsiegeln
f1abca3a45 refactor(alerting): rename P95_LATENCY_MS → AVG_DURATION_MS to match what stats_1m_route exposes
The evaluator mapped P95_LATENCY_MS to ExecutionStats.avgDurationMs because
stats_1m_route has no p95 column. Exposing the old name implied p95 semantics
operators did not get. Rename to AVG_DURATION_MS makes the contract honest.
Updated RouteMetric enum (with javadoc), evaluator switch, and admin guide.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-20 07:36:43 +02:00
hsiegeln
144915563c docs(alerting): whole-branch final review report
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-20 07:25:33 +02:00
hsiegeln
c79a6234af test(alerting): fix duplicate @MockBean after AbstractPostgresIT centralised mocks + Plan 02 verification report
AbstractPostgresIT gained clickHouseSearchIndex and agentRegistryService mocks in Phase 9.
All 14 alerting IT subclasses that re-declared the same @MockBean fields now fail with
"Duplicate mock definition". Removed the redundant declarations; per-class clickHouseLogStore
mock kept where needed. 120 alerting tests now pass (0 failures).

Also adds docs/alerting-02-verification.md (Task 43).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-19 23:27:19 +02:00
hsiegeln
63669bd1d7 docs(alerting): default config + admin guide
Adds alerting stanza to application.yml with all AlertingProperties
fields backed by env-var overrides.  Creates docs/alerting.md covering
six condition kinds (with example JSON), template variables, webhook
setup (Slack/PagerDuty examples), silence patterns, circuit-breaker
and retention troubleshooting, and Prometheus metrics reference.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-19 22:16:38 +02:00
hsiegeln
840a71df94 feat(alerting): observability metrics via micrometer
AlertingMetrics @Component wraps MeterRegistry:
- Counters: alerting_eval_errors_total{kind}, alerting_circuit_opened_total{kind},
  alerting_notifications_total{status}
- Timers: alerting_eval_duration_seconds{kind}, alerting_webhook_delivery_duration_seconds
- Gauges (DB-backed): alerting_rules_total{state}, alerting_instances_total{state}

AlertEvaluatorJob records evalError + evalDuration around each evaluator call.
PerKindCircuitBreaker detects open transitions and fires metrics.circuitOpened(kind).
AlertingBeanConfig wires AlertingMetrics into the circuit breaker post-construction.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-19 22:16:30 +02:00
hsiegeln
1ab21bc019 feat(alerting): AlertingRetentionJob daily cleanup
Nightly @Scheduled(03:00) job deletes RESOLVED alert_instances older
than eventRetentionDays and DELIVERED/FAILED alert_notifications older
than notificationRetentionDays.  Uses injected Clock for testability.
IT covers: old-resolved deleted, fresh-resolved kept, FIRING kept
regardless of age, PENDING notification never deleted.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-19 22:16:21 +02:00
hsiegeln
118ace7cc3 docs(alerting): update app-classes.md for Phase 9 REST controllers (Task 36)
- Add AlertRuleController, AlertController, AlertSilenceController, AlertNotificationController entries
- Document inbox SQL visibility contract (target_user_ids/group_ids/role_names — no broadcast)
- Add /api/v1/alerts/notifications/{id}/retry to flat-endpoint allow-list
- Update SecurityConfig entry with alerting path matchers
- Note attribute-key SQL injection validation contract on AlertRuleController

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-19 22:08:38 +02:00
hsiegeln
e334dfacd3 feat(alerting): AlertNotificationController + SecurityConfig matchers + fix IT context (Task 35)
- GET /environments/{envSlug}/alerts/{alertId}/notifications — list notifications for instance (VIEWER+)
- POST /alerts/notifications/{id}/retry — manual retry of failed notification (OPERATOR+)
  Flat path because notification IDs are globally unique (no env routing needed)
- scheduleRetry resets attempts to 0 and sets nextAttemptAt = now
- Added 11 alerting path matchers to SecurityConfig before outbound-connections block
- Fixed context loading failure in 6 pre-existing alerting storage/migration ITs by adding
  @MockBean(clickHouseSearchIndex/clickHouseLogStore): ExchangeMatchEvaluator and
  LogPatternEvaluator inject the concrete classes directly (not interface beans), so the
  full Spring context fails without these mocks in tests that don't use the real CH container
- 5 IT tests: list, viewer-can-list, retry, viewer-cannot-retry, unknown-404

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-19 21:29:17 +02:00
hsiegeln
77d1718451 feat(alerting): AlertSilenceController CRUD with time-range validation + audit (Task 34)
- POST/GET/DELETE /environments/{envSlug}/alerts/silences
- 422 when endsAt <= startsAt ("endsAt must be after startsAt")
- OPERATOR+ for create/delete, VIEWER+ for list
- Audit: ALERT_SILENCE_CREATE/DELETE with AuditCategory.ALERT_SILENCE_CHANGE
- 6 IT tests: create, viewer-list, viewer-cannot-create, bad time-range, delete, audit event

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-19 21:29:03 +02:00
hsiegeln
841793d7b9 feat(alerting): AlertController in-app inbox with ack/read/bulk-read (Task 33)
- GET /environments/{envSlug}/alerts — inbox filtered by userId/groupIds/roleNames via InAppInboxQuery
- GET /unread-count — memoized unread count (5s TTL)
- GET /{id}, POST /{id}/ack, POST /{id}/read, POST /bulk-read
- bulkRead filters instanceIds to env before delegating to AlertReadRepository
- VIEWER+ for all endpoints; env isolation enforced by requireInstance
- 7 IT tests: list, env isolation, unread-count, ack flow, read, bulk-read, viewer access

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-19 21:28:55 +02:00
hsiegeln
c1b34f592b feat(alerting): AlertRuleController with attribute-key SQL injection validation (Task 32)
- POST/GET/PUT/DELETE /environments/{envSlug}/alerts/rules CRUD
- POST /{id}/enable, /{id}/disable, /{id}/render-preview, /{id}/test-evaluate
- Attribute-key validation: rejects keys not matching ^[a-zA-Z0-9._-]+$ at rule-save time
  (CRITICAL: ExchangeMatchCondition attribute keys are inlined into ClickHouse SQL)
- Webhook validation: verifies outboundConnectionId exists and is allowed in env
- Null-safe notification template defaults to "" for NOT NULL DB constraint
- Fixed misleading comment in ClickHouseSearchIndex to document validation contract
- OPERATOR+ for mutations, VIEWER+ for reads
- Audit: ALERT_RULE_CREATE/UPDATE/DELETE/ENABLE/DISABLE with AuditCategory.ALERT_RULE_CHANGE
- 11 IT tests covering RBAC, SQL-injection prevention, enable/disable, audit, render-preview

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-19 21:28:46 +02:00
hsiegeln
d3dd8882bd feat(alerting): InAppInboxQuery with 5s unread-count memoization
listInbox resolves user groups+roles via RbacService.getEffectiveGroupsForUser
/ getEffectiveRolesForUser then delegates to AlertInstanceRepository.
countUnread memoized per (envId, userId) with 5s TTL via ConcurrentHashMap
using a controllable Clock. 6 unit tests covering delegation, cache hit,
TTL expiry, and isolation between users/envs.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-19 20:25:00 +02:00
hsiegeln
6b48bc63bf feat(alerting): NotificationDispatchJob outbox loop with silence + retry
Claim-polling SchedulingConfigurer: claims due notifications, resolves
instance/connection/rule, checks active silences, dispatches via
WebhookDispatcher, classifies outcomes into DELIVERED/FAILED/retry.
Guards null rule/env after deletion. 5 Testcontainers ITs: 200/503/404
outcomes, active silence suppression, deleted connection fast-fail.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-19 20:24:54 +02:00
hsiegeln
466aceb920 feat(alerting): WebhookDispatcher with HMAC + TLS + retry classification
Renders URL/headers/body with Mustache, optionally HMAC-signs the body
(X-Cameleer-Signature), supports POST/PUT/PATCH, classifies 2xx/4xx/5xx
into DELIVERED/FAILED/retry. 8 WireMock-backed IT tests including HTTPS
TRUST_ALL against WireMock self-signed cert.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-19 20:24:47 +02:00
hsiegeln
6f1feaa4b0 feat(alerting): HmacSigner for webhook signature
HmacSHA256 signer returning sha256=<lowercase-hex>. 5 unit tests covering
known vector, prefix, hex casing, and different secrets/bodies.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-19 20:24:39 +02:00
hsiegeln
bf178ba141 fix(alerting): populate AlertInstance.rule_snapshot so history survives rule delete
- Add withRuleSnapshot(Map) wither to AlertInstance (same pattern as other withers)
- Call snapshotRule(rule) + withRuleSnapshot in both applyResult (single-firing) and
  applyBatchFiring paths so every persisted instance carries a non-empty JSONB snapshot
- Strip null values from the Jackson-serialized map before wrapping in the immutable
  snapshot so Map.copyOf in the compact ctor does not throw NPE on nullable rule fields
- Add ruleSnapshotIsPersistedOnInstanceCreation IT: asserts name/severity/conditionKind
  appear in the rule_snapshot column after a tick fires an instance
- Add historySurvivesRuleDelete IT: fires an instance, deletes the rule, asserts
  rule_id IS NULL and rule_snapshot still contains the rule name (spec §5 guarantee)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-19 20:09:28 +02:00
hsiegeln
15c0a8273c feat(alerting): AlertEvaluatorJob with claim-polling + circuit breaker
- AlertEvaluatorJob implements SchedulingConfigurer; fixed-delay tick from
  AlertingProperties.effectiveEvaluatorTickIntervalMs (5 s floor)
- Claim-polling via AlertRuleRepository.claimDueRules (FOR UPDATE SKIP LOCKED)
- Per-kind circuit breaker guards each evaluator; failures recorded, open kinds
  skipped and rescheduled without evaluation
- Single-Firing path delegates to AlertStateTransitions; new FIRING instances
  enqueue AlertNotification rows per rule.webhooks()
- Batch (PER_EXCHANGE) path creates one FIRING AlertInstance per Firing entry
- PENDING→FIRING promotion handled in applyResult via state machine
- Title/message rendered via MustacheRenderer + NotificationContextBuilder;
  environment resolved from EnvironmentRepository.findById per tick
- AlertEvaluatorJobIT (4 tests): uses named @MockBean replacements for
  ClickHouseSearchIndex + ClickHouseLogStore; @MockBean AgentRegistryService
  drives Clear/Firing/resolve cycle without timing sensitivity

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-19 19:58:27 +02:00
hsiegeln
657dc2d407 feat(alerting): AlertingProperties + AlertStateTransitions state machine
- AlertingProperties @ConfigurationProperties with effective*() accessors and
  5000 ms floor clamp on evaluatorTickIntervalMs; warn logged at startup
- AlertStateTransitions pure static state machine: Clear/Firing/Batch/Error
  branches, PENDING→FIRING promotion on forDuration elapsed; Batch delegated
  to job
- AlertInstance wither helpers: withState, withFiredAt, withResolvedAt, withAck,
  withSilenced, withTitleMessage, withLastNotifiedAt, withContext
- AlertingBeanConfig gains @EnableConfigurationProperties(AlertingProperties),
  alertingInstanceId bean (hostname:pid), alertingClock bean,
  PerKindCircuitBreaker bean wired from props
- 12 unit tests in AlertStateTransitionsTest covering all transitions

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-19 19:58:12 +02:00
hsiegeln
f8cd3f3ee4 feat(alerting): EXCHANGE_MATCH evaluator with per-exchange + count modes
PER_EXCHANGE returns EvalResult.Batch(List<Firing>); last Firing carries
_nextCursor (Instant) in its context map for the job to persist as
evalState.lastExchangeTs.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-19 19:40:54 +02:00
hsiegeln
89db8bd1c5 feat(alerting): JVM_METRIC evaluator
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-19 19:38:48 +02:00
hsiegeln
17d2be5638 feat(alerting): LOG_PATTERN evaluator
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-19 19:37:33 +02:00
hsiegeln
07d0386bf2 feat(alerting): ROUTE_METRIC evaluator
P95_LATENCY_MS maps to avgDurationMs (ExecutionStats has no p95 bucket).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-19 19:36:22 +02:00
hsiegeln
983b698266 feat(alerting): DEPLOYMENT_STATE evaluator
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-19 19:34:47 +02:00
hsiegeln
e84338fc9a feat(alerting): AGENT_STATE evaluator
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-19 19:33:13 +02:00
hsiegeln
55f4cab948 feat(alerting): evaluator scaffolding (context, result, tick cache, circuit breaker)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-19 19:32:06 +02:00
hsiegeln
891c7f87e3 feat(alerting): silence matcher for notification-time dispatch
SilenceMatcherService.matches() evaluates AND semantics across ruleId,
severity, appSlug, routeId, agentId constraints. Null fields are wildcards.
Scope-based constraints (appSlug/routeId/agentId) return false when rule is
null (deleted rule — scope cannot be verified). 17 unit tests.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-19 19:27:18 +02:00
hsiegeln
1c74ab8541 feat(alerting): NotificationContextBuilder for template context maps
Builds the Mustache context map from AlertRule + AlertInstance + Environment.
Always emits env/rule/alert subtrees; conditionally emits kind-specific
subtrees (agent, app, route, exchange, log, metric, deployment) based on
rule.conditionKind(). Missing instance.context() keys resolve to empty
string. alert.link prefixed with uiOrigin when non-null. 11 unit tests.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-19 19:27:12 +02:00
hsiegeln
92a74e7b8d feat(alerting): MustacheRenderer with literal fallback on missing vars
Sentinel-substitution approach: unresolved {{x.y.z}} tokens are replaced
with a unique NUL-delimited sentinel before Mustache compilation, rendered
as opaque text, then post-replaced with the original {{x.y.z}} literal.
Malformed templates (unclosed {{) are caught and return the raw template.
Never throws. 9 unit tests.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-19 19:27:05 +02:00
hsiegeln
c53f642838 chore(alerting): add jmustache 1.16
Declared in cameleer-server-core pom (canonical location for unit-testable
rendering without Spring) and mirrored in cameleer-server-app pom so the
app module compiles standalone without a full reactor install.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-19 19:26:57 +02:00
hsiegeln
7c0e94a425 feat(alerting): ClickHouse projections for alerting read paths
Adds alerting_projections.sql with four projections (alerting_app_status,
alerting_route_status on executions; alerting_app_level on logs;
alerting_instance_metric on agent_metrics). ClickHouseSchemaInitializer now
runs both init.sql and alerting_projections.sql, with ADD PROJECTION and
MATERIALIZE treated as non-fatal — executions (ReplacingMergeTree) requires
deduplicate_merge_projection_mode=rebuild which is unavailable via JDBC pool.
MergeTree projections (logs, agent_metrics) always succeed and are asserted in IT.

Column names confirmed from init.sql: logs uses 'application' (not application_id),
agent_metrics uses 'collected_at' (not timestamp). All column names match the plan.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-19 19:18:58 +02:00
hsiegeln
7b79d3aa64 feat(alerting): countExecutionsForAlerting for exchange-match evaluator
Adds AlertMatchSpec record (core) and ClickHouseSearchIndex.countExecutionsForAlerting —
no FINAL, no text subqueries. Filters by tenant, env, app, route, status, time window,
and optional after-cursor. Attributes (JSON string column) use inlined JSONExtractString
key literals since ClickHouse JDBC does not bind ? placeholders inside JSON functions.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-19 19:18:49 +02:00
hsiegeln
44e91ccdb5 feat(alerting): ClickHouseLogStore.countLogs for log-pattern evaluator
Adds countLogs(LogSearchRequest) — no FINAL, no cursor/sort/limit —
reusing the same WHERE-clause logic as search() for tenant, env, app,
level, q, logger, source, exchangeId, and time-range filters.
Also extends ClickHouseTestHelper with executeInitSqlWithProjections()
and makes the script runner non-fatal for ADD/MATERIALIZE PROJECTION.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-19 19:18:41 +02:00
hsiegeln
59354fae18 feat(alerting): wire all alerting repository beans
AlertingBeanConfig now exposes 4 additional @Bean methods:
alertInstanceRepository, alertSilenceRepository,
alertNotificationRepository, alertReadRepository.
AlertReadRepository takes only JdbcTemplate (no JSONB/ObjectMapper needed).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-19 19:05:06 +02:00
hsiegeln
f829929b07 feat(alerting): Postgres repositories for silences, notifications, reads
PostgresAlertSilenceRepository: save/findById roundtrip, listActive (BETWEEN
starts_at AND ends_at), listByEnvironment, delete. JSONB SilenceMatcher via ObjectMapper.

PostgresAlertNotificationRepository: save/findById, listForInstance,
claimDueNotifications (UPDATE...RETURNING with FOR UPDATE SKIP LOCKED),
markDelivered, scheduleRetry (bumps attempts + next_attempt_at), markFailed,
deleteSettledBefore (DELIVERED+FAILED rows older than cutoff). JSONB payload.

PostgresAlertReadRepository: markRead (ON CONFLICT DO NOTHING idempotent),
bulkMarkRead (iterates, handles empty list without error).

16 IT scenarios across 3 classes, all passing.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-19 19:05:01 +02:00
hsiegeln
45028de1db feat(alerting): Postgres repository for alert_instances with inbox queries
Implements AlertInstanceRepository: save (upsert), findById, findOpenForRule,
listForInbox (3-way OR: user/group/role via && array-overlap + ANY), countUnreadForUser
(LEFT JOIN alert_reads), ack, resolve, markSilenced, deleteResolvedBefore.
Integration test covers all 9 scenarios including inbox fan-out across all
three target types. Also adds @JsonIgnoreProperties(ignoreUnknown=true) to
SilenceMatcher to suppress Jackson serializing isWildcard() as a round-trip field.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-19 19:04:51 +02:00
hsiegeln
930ac20d11 fix(outbound): wire rulesReferencing to AlertRuleRepository (Plan 01 gate)
Replaces the Plan 01 stub that returned [] with a real call through
AlertRuleRepository.findRuleIdsByOutboundConnectionId. Adds AlertingBeanConfig
exposing the AlertRuleRepository bean; widens OutboundBeanConfig constructor
to inject it. Delete and narrow-envs guards now correctly block when rules
reference a connection.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-19 18:51:36 +02:00
hsiegeln
f80bc006c1 feat(alerting): Postgres repository for alert_rules
Implements AlertRuleRepository with JSONB condition/webhooks/eval_state
serialization via ObjectMapper, UPSERT on conflict, JSONB containment
query for findRuleIdsByOutboundConnectionId, and FOR UPDATE SKIP LOCKED
claim-polling for horizontal scale.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-19 18:48:15 +02:00
hsiegeln
1ff256dce0 feat(alerting): core repository interfaces 2026-04-19 18:43:36 +02:00
hsiegeln
e7a9042677 feat(alerting): core domain records (rule, instance, silence, notification) 2026-04-19 18:43:03 +02:00
hsiegeln
56a7b6de7d feat(alerting): sealed AlertCondition hierarchy with Jackson deduction 2026-04-19 18:42:04 +02:00
hsiegeln
530bc32040 feat(alerting): core enums + AlertScope 2026-04-19 18:36:29 +02:00
hsiegeln
5103dc91be feat(alerting): add ALERT_RULE_CHANGE + ALERT_SILENCE_CHANGE audit categories 2026-04-19 18:34:08 +02:00
hsiegeln
a80c376950 fix(alerting): harden V12 migration IT against shared container state
- Replace hard-coded 'u1' user_id with per-test UUID to prevent PK collision on re-runs
- Add @AfterEach null-safe cleanup for environments and users rows
- Use containsExactlyInAnyOrder for enum assertions to catch misspelled names
- Slug suffix on environment insert avoids slug uniqueness conflicts on re-runs

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-19 18:32:35 +02:00
hsiegeln
59e76bdfb6 feat(alerting): V12 flyway migration for alerting tables 2026-04-19 18:28:09 +02:00
hsiegeln
087dcee5df docs(alerting): Plan 02 — backend (domain, storage, evaluators, dispatch) 2026-04-19 18:24:16 +02:00
hsiegeln
cacedd3f16 fix(outbound): null-guard TRUST_PATHS check; add RBAC test for probe endpoint
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 3m5s
CI / build (pull_request) Successful in 2m13s
CI / cleanup-branch (pull_request) Has been skipped
CI / docker (pull_request) Has been skipped
CI / docker (push) Successful in 4m48s
CI / deploy (pull_request) Has been skipped
CI / deploy-feature (pull_request) Has been skipped
CI / deploy (push) Has been skipped
CI / deploy-feature (push) Successful in 32s
- OutboundConnectionRequest compact ctor: avoid NPE if tlsTrustMode is null
  (defense-in-depth alongside @NotNull Bean Validation).
- Add operatorCannotTest IT case to lock the ADMIN-only contract on
  POST /{id}/test — was previously untested.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-19 17:19:37 +02:00
hsiegeln
7358555d56 test(outbound): add @AfterEach cleanup to avoid leaking user/connection rows
Shared Spring test context meant seeded test-admin/test-operator/test-viewer/test-alice
users persisted across IT classes, breaking FlywayMigrationIT's "users is empty" assertion.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-19 17:10:25 +02:00
hsiegeln
609a86dd03 docs: admin guide for outbound connections
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-19 17:03:18 +02:00
hsiegeln
1dd1f10c0e docs(rules): document http/ and outbound/ packages + admin controller
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-19 17:02:09 +02:00
hsiegeln
0c5f1b5740 feat(ui): outbound connection editor — TLS config, test action, env restriction
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-19 16:59:19 +02:00
hsiegeln
e7fbf5a7b2 feat(ui): admin page for outbound connections list + navigation
Adds OutboundConnectionsPage (list view with delete), lazy route at
/admin/outbound-connections, and Outbound Connections nav node in the
admin sidebar tree. No test file created — UI codebase has no existing
test infrastructure to build on.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-19 16:55:35 +02:00
hsiegeln
3c903fc8dc feat(ui): tanstack query hooks for outbound connections
Types are hand-authored (matching codebase admin-query convention);
schema.d.ts regeneration deferred until backend dev server is available.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-19 16:52:40 +02:00
hsiegeln
87b8a71205 feat(outbound): admin test action for reachability + TLS summary
POST /{id}/test issues a synthetic probe against the connection URL.
TLS protocol/cipher/peer-cert details stubbed for now (Plan 02 follow-up).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-19 16:47:36 +02:00
hsiegeln
ea4c56e7f6 feat(outbound): admin CRUD REST + RBAC + audit
New audit categories: OUTBOUND_CONNECTION_CHANGE, OUTBOUND_HTTP_TRUST_CHANGE.
Controller-level @PreAuthorize defaults to ADMIN; GETs relaxed to ADMIN|OPERATOR.
SecurityConfig permits OPERATOR GETs on /api/v1/admin/outbound-connections/**.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-19 16:43:48 +02:00
hsiegeln
a3c35c7df9 feat(outbound): request + response + test-result DTOs with Bean Validation
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-19 16:37:00 +02:00
hsiegeln
94b5db0f5b feat(outbound): service with uniqueness + narrow-envs + delete-if-referenced guards
rulesReferencing() is stubbed; wired to AlertRuleRepository in Plan 02.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-19 16:34:09 +02:00
hsiegeln
642c040116 feat(outbound): Postgres repository for outbound_connections
- PostgresOutboundConnectionRepository: JdbcTemplate impl of
  OutboundConnectionRepository; UUID arrays via ConnectionCallback,
  JSONB for headers/auth/ca-paths, enum casts for method/trust/auth-kind
- OutboundBeanConfig: wires the repo + SecretCipher beans
- PostgresOutboundConnectionRepositoryIT: 5 Testcontainers tests
  (save+read, unique-name, allowed-env-ids round-trip, tenant isolation,
  delete); validates V11 Flyway migration end-to-end
- application-test.yml: add jwtsecret default so SecretCipher bean
  starts up in the Spring test context

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-19 16:23:51 +02:00
hsiegeln
380ccb102b fix(outbound): align user FK with users(user_id) TEXT schema
V11 migration referenced users(id) as uuid, but V1 users table has
user_id as TEXT primary key. Amending V11 and the OutboundConnection
record before Task 7's integration tests catch this at Flyway startup.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-19 16:18:12 +02:00
hsiegeln
b8565af039 feat(outbound): SecretCipher - AES-GCM with JWT-derived key for at-rest secret encryption
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-19 16:13:57 +02:00
hsiegeln
46b8f63fd1 feat(outbound): core domain records for outbound connections
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-19 16:10:17 +02:00
hsiegeln
0c9d12d8e0 test(http): tighten SSL-failure assertion + null-guard WireMock teardown
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-19 16:08:43 +02:00
hsiegeln
000e9d2847 feat(http): ApacheOutboundHttpClientFactory with memoization and startup validation
Adds ApacheOutboundHttpClientFactory (Apache HttpClient 5) that memoizes
CloseableHttpClient instances keyed on effective TLS + timeout config, and
OutboundHttpConfig (@ConfigurationProperties) that validates trusted CA paths
at startup and exposes OutboundHttpClientFactory as a Spring bean.

TRUST_ALL mode disables both cert validation (TrustAllManager in SslContextBuilder)
and hostname verification (NoopHostnameVerifier on SSLConnectionSocketFactoryBuilder).
WireMock HTTPS integration test covers trust-all bypass, system-default PKIX rejection,
and client memoization.

OIDC audit: OidcProviderHelper and OidcTokenExchanger use Nimbus SDK's own HTTP layer
(DefaultResourceRetriever for JWKS, HTTPRequest.send() for token exchange) plus the
bespoke InsecureTlsHelper for TLS skip-verify; neither uses OutboundHttpClientFactory.
Retrofit deferred to a separate follow-up per plan §20.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-19 16:03:56 +02:00
hsiegeln
4922748599 refactor(http): tighten SslContextBuilder throws clause, classpath test fixture, system trust-all test
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-19 15:59:06 +02:00
hsiegeln
262ee91684 feat(http): SslContextBuilder supports system/trust-all/trust-paths modes
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-19 15:54:15 +02:00
hsiegeln
2224f7d902 feat(http): core outbound HTTP interfaces and property records 2026-04-19 15:39:57 +02:00
hsiegeln
ffdfd6cd9a feat(outbound): add HTTPS CHECK constraint on outbound_connections.url
Defense-in-depth per code review. DTO layer already validates HTTPS at save
time; this DB-level check guards against future code paths that might bypass
the DTO validator. Mustache template variables in the URL (e.g., {{env.slug}})
remain valid since only the scheme prefix is constrained.
2026-04-19 15:37:35 +02:00
hsiegeln
116038262a feat(outbound): V11 flyway migration for outbound_connections table 2026-04-19 15:33:39 +02:00
hsiegeln
77a23c270b docs(alerting): Plan 01 — outbound HTTP infra + admin-managed outbound connections
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m57s
CI / docker (push) Successful in 1m6s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 44s
First of three sequenced plans for the alerting feature. Covers:
- Cross-cutting http/ module (OutboundHttpClientFactory, SslContextBuilder,
  TLS trust composition, startup validation)
- Admin-managed OutboundConnection with PG persistence, AES-GCM-encrypted
  HMAC secret (resolves spec §20 item 2)
- Admin CRUD REST + test endpoint + RBAC + audit
- Admin UI page with TLS config, allowed-envs multi-select, test action
- OIDC retrofit deliberately deferred (documented in Task 4 audit)

Plan 02 (alerting backend) and Plan 03 (alerting UI) written after Plan 01
executes — lets reality inform their details, especially the secret-cipher
interface and the rules-referencing integration point.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-19 15:26:00 +02:00
hsiegeln
e71edcdd5e docs(alerting): add BL-002 for native provider integrations + Mustache auto-complete
BL-002 / gitea#138 tracks deferred native provider types (Slack Block Kit,
PagerDuty Events v2, Teams connector) with shipped templates as a post-v1
fast-follow once usage data informs which providers matter.

Spec §13 folds in context-aware variable auto-complete for the shared
<MustacheEditor /> component used in rule editor, webhook overrides, and
outbound-connection admin. Available variables filter by condition kind.
Completion engine choice added to §20 as a planning-phase decision.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-19 15:10:00 +02:00
hsiegeln
a9ad0eb841 docs(alerting): spec for alerting feature + backlog entry BL-001
Comprehensive design spec for a confined, env-scoped alerting feature:
6 signal sources, shared env-scoped rules with RBAC-targeted notifications,
in-app inbox + webhook delivery via admin-managed outbound connections,
claim-based polling for horizontal scalability, 4 CH projections for hot-path
reads. Backlog entry BL-001 / gitea#137 tracks deferred managed-CA investigation
(reuse SaaS-layer CA handling first before building in-server storage).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-19 14:58:38 +02:00
hsiegeln
c4cee9718c fix(ui): align log search input styling with EventFeed, render ellipsis
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m44s
CI / docker (push) Successful in 1m16s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 1m49s
SonarQube / sonarqube (push) Successful in 4m5s
JSX attribute strings don't process JS escape sequences — "Search logs\u2026"
rendered the literal "\u2026" in the placeholder. Replaced with the actual
ellipsis character.

Also aligned .logSearchInput (Application Log search) with EventFeed's
internal search input: --bg-surface background, --border border,
mono font family, 28px height. Previously used --bg-body + --border-subtle
+ body font, which looked visibly different next to the Timeline panel.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-17 15:53:43 +02:00
hsiegeln
d40833b96a docs(rules): refresh for insert_id UUID cursor + AgentEventPage
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m23s
CI / docker (push) Successful in 1m10s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 37s
- LogQueryController: note response shape, sort param, and that the
  cursor tiebreak is the insert_id UUID column (not exchange/instance)
- AgentEventsController: cursor now carries insert_id UUID (was instanceId);
  order is (timestamp DESC, insert_id DESC)
- core-classes: add AgentEventPage record; note that the non-paginated
  AgentEventRepository.query(...) path has been removed
- core-classes: note LogSearchRequest.sources/levels are now List<String>
  with multi-value OR semantics

Keeps the rule files in sync with the cursor-pagination + multi-select
filter work on main.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-17 15:43:25 +02:00
hsiegeln
57e1d09bc6 fix(ui): align Timeline panel header with Application Log
Some checks failed
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m28s
CI / deploy (push) Has been cancelled
CI / deploy-feature (push) Has been cancelled
CI / docker (push) Has been cancelled
Both panels now use the same card wrapper (logStyles.logCard), header
container (logStyles.logHeader, 12px 16px padding), and DS SectionHeader
for the title. Previously Timeline rendered a custom 13px span while
Application Log used SectionHeader's uppercase style, so the two panels
side-by-side looked inconsistent.

Removes the now-orphaned .eventCard/.eventCardHeader/.sectionTitle and
.timelineCard/.timelineHeader CSS rules.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-17 15:41:29 +02:00
hsiegeln
9292bd5f5f fix(ui): Timeline uses EventFeed's internal scroll + load-older button
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m29s
CI / docker (push) Successful in 1m15s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 39s
EventFeed has its own search + filter toolbar inside the component.
Wrapping it in InfiniteScrollArea made the toolbar scroll out of
sight. Drop InfiniteScrollArea for the Timeline, give EventFeed a
bounded-height flex container (it scrolls its own .list internally),
and add an explicit 'Load older events' button for cursor
pagination. Polling always on for events (low volume).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-17 15:25:48 +02:00
hsiegeln
a3429a609e fix(ui): live-tail logs when time range is a relative preset
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m29s
CI / docker (push) Successful in 1m13s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 38s
Page 1 refetches were using the captured timeRange.end, so rows
arriving after the initial render were outside the query window and
never surfaced. When timeRange.preset is set (e.g. 'last 1h'), each
fetch now advances 'to' to Date.now() so the poll picks up new rows.
Absolute ranges are unchanged.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-17 14:48:29 +02:00
hsiegeln
51feacec1e fix(ui): cascade flatScroll override to descendants
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m25s
CI / docker (push) Successful in 1m12s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 38s
EventFeed's overflow-y:auto lives on its inner .list, not the root
where className lands. Extending .flatScroll to .flatScroll * covers
nested scroll containers, and relaxing the root's height:100% (which
EventFeed sets) lets content size naturally so the outer
InfiniteScrollArea owns the single scrollbar.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-17 14:44:26 +02:00
hsiegeln
806a817c07 fix(ui): suppress double scrollbar in log + timeline panels
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m27s
CI / docker (push) Successful in 1m17s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 40s
LogViewer and EventFeed each apply overflow-y:auto to their root
container, which produced a nested scrollbar inside the
InfiniteScrollArea that also scrolls. A flatScroll override class
flattens the DS component so the outer InfiniteScrollArea owns the
single scrollbar — matching the IntersectionObserver sentinels that
drive infinite-load.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-17 14:38:15 +02:00
hsiegeln
89c9b53edd fix(pagination): add insert_id UUID tiebreak to cursor keyset
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m26s
CI / docker (push) Successful in 1m12s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 43s
Same-millisecond rows were silently skipped between pages because the
log cursor had no tiebreak and the events cursor tied by instance_id
(which also collides when one instance emits multiple events within a
millisecond). Add an insert_id UUID (DEFAULT generateUUIDv4()) column
to both logs and agent_events, order by (timestamp, insert_id)
consistently, and encode the cursor as 'timestamp|insert_id'. Existing
data is materialized via ALTER TABLE MATERIALIZE COLUMN (one-time
background mutation).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-17 14:25:36 +02:00
hsiegeln
07dbfb1391 fix(ui): log header counter reflects visible (filtered) count
When a text search is active, show 'X of Y entries' rather than the
loaded total, so the number matches what's on screen.
2026-04-17 13:19:51 +02:00
hsiegeln
a2d55f7075 fix(ui): push log sort toggle server-side
Reversing logStream.items client-side breaks across infinite-scroll
pages. Passing sort='asc'/'desc' into the query key and URL triggers
a fresh first-page fetch in the selected order.
2026-04-17 13:19:29 +02:00
hsiegeln
6d3956935d refactor(events): remove dead non-paginated query path
AgentEventService.queryEvents, AgentEventRepository.query, and the
ClickHouse implementation have had no callers since /agents/events
became cursor-paginated. Remove them along with their dedicated IT
tests. queryPage and its tests remain as the single query path.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-17 13:16:28 +02:00
hsiegeln
a0a0635ddd fix(api): malformed ?from/?to returns 400 instead of 500
Extends the existing ApiExceptionHandler @RestControllerAdvice to map
DateTimeParseException and IllegalArgumentException to 400 Bad Request.
Logs and agent-events endpoints both parse ISO-8601 query params and
previously leaked parse failures as internal server errors. All
IllegalArgumentException throw sites in production code are
input-validation usages (slug validation, containerConfig validation,
cursor decoding), so mapping to 400 is correct across the board.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-17 13:14:18 +02:00
hsiegeln
f1c5a95f12 fix(logs): use parseDateTime64BestEffort for all timestamp binds
JDBC Timestamp binding shifted timestamps by the JVM local timezone
offset on both insert and query, producing asymmetric UTC offsets that
broke time-range filtering and cursor pagination. Switching inserts
(indexBatch, insertBufferedBatch) and all WHERE predicates to ISO-8601
strings via parseDateTime64BestEffort, and reading timestamps back as
epoch-millis via toUnixTimestamp64Milli, pins everything to UTC and
fixes the time-range filter test plus cursor pagination.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-17 13:13:34 +02:00
hsiegeln
5d9f6735cc docs(rules): note cursor pagination + multi-source/level filters
Reflects LogQueryController's multi-value source/level filters,
AgentEventsController's cursor pagination shape, and the new
useInfiniteStream/InfiniteScrollArea UI primitives used by streaming
views.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-17 12:57:45 +02:00
hsiegeln
4f9ee57421 feat(ui): LogTab — surface source badge on per-exchange log rows 2026-04-17 12:57:06 +02:00
hsiegeln
ef9bc5a614 feat(ui): AgentInstance — server-side multi-select filters + infinite scroll
Same pattern as AgentHealth, scoped to a single agent instance
(passes agentId to both log and timeline streams).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-17 12:55:39 +02:00
hsiegeln
7f233460aa fix(ui): stabilize infinite-stream callbacks + suppress empty-state flash
- useInfiniteStream: wrap fetchNextPage and refresh in useCallback so
  InfiniteScrollArea's IntersectionObserver does not re-subscribe on
  every parent render.
- InfiniteScrollArea: do not render 'End of stream' until at least one
  item has loaded and the initial query has settled (was flashing on
  mount before first fetch).
- AgentHealth: pass isLoading + hasItems to both InfiniteScrollArea
  wrappers.
2026-04-17 12:52:39 +02:00
hsiegeln
fb7d6db375 feat(ui): AgentHealth — server-side multi-select filters + infinite scroll
Application Log: source + level filters move server-side; text search
stays client-side. Timeline: cursor-paginated via useInfiniteAgentEvents.
Both wrapped in InfiniteScrollArea with top-gated auto-refetch.
2026-04-17 12:46:48 +02:00
hsiegeln
73309c7e63 feat(ui): replace useAgentEvents with useInfiniteAgentEvents
Cursor-paginated timeline stream matching the new /agents/events
endpoint. Consumers (AgentHealth, AgentInstance) updated in
follow-up commits.
2026-04-17 12:43:51 +02:00
hsiegeln
43f145157d feat(ui): add useInfiniteApplicationLogs hook
Server-side filters on source/level/time-range, client-side text
search on top of flattened items. Leaves useApplicationLogs and
useStartupLogs untouched for bounded consumers (LogTab, StartupLogPanel).
2026-04-17 12:42:35 +02:00
hsiegeln
c2ce508565 feat(ui): add InfiniteScrollArea component
Scrollable container with top/bottom IntersectionObserver sentinels.
Fires onTopVisibilityChange when the top is fully in view and
onEndReached when the bottom is within 100px. Used by infinite log
and event streams.
2026-04-17 12:41:17 +02:00
hsiegeln
a7f53c8993 feat(ui): add useInfiniteStream hook
Wraps tanstack useInfiniteQuery with cursor flattening, top-gated
polling, and a refresh() invalidator. Used by log and agent-event
streaming views.
2026-04-17 12:39:59 +02:00
hsiegeln
bfb5a7a895 chore: regenerate openapi.json + schema.d.ts
Captures the cursor-paginated /agents/events response shape
(AgentEventPageResponse with data/nextCursor/hasMore and a new ?cursor
param). Also folds in pre-existing drift from 62dd71b (environment
field on agent event rows). Consumer UI hooks are updated in
Tasks 9-11.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-17 12:39:03 +02:00
hsiegeln
20b8d4ccaf feat(events): cursor-paginated GET /agents/events
Returns {data, nextCursor, hasMore} instead of a bare list. Adds
?cursor= param; existing filters (appId, agentId, from, to, limit)
unchanged. Ordering is (timestamp DESC, instance_id ASC).
2026-04-17 12:22:48 +02:00
hsiegeln
0194549f25 fix(events): reject malformed pagination cursors as 400 errors
Wraps DateTimeParseException from Instant.parse in IllegalArgumentException
so the controller maps it to 400. Also rejects cursors with empty
instance_id (trailing '|') which would otherwise produce a vacuous
keyset predicate.
2026-04-17 12:02:40 +02:00
hsiegeln
d293dafb99 feat(events): cursor-paginate agent events (ClickHouse impl)
Orders by (timestamp DESC, instance_id ASC). Cursor is
base64url('timestampIso|instanceId') with a tuple keyset predicate
for stable paging across ties.
2026-04-17 11:57:35 +02:00
hsiegeln
67a834153e feat(events): add AgentEventPage + queryPage interface
Introduces cursor-paginated query on AgentEventRepository. The cursor
format is owned by the implementation. The existing non-paginated
query(...) is kept for internal consumers.
2026-04-17 11:52:42 +02:00
hsiegeln
769752a327 feat(logs): widen source filter to multi-value OR list
Replaces LogSearchRequest.source (String) with sources (List<String>)
and emits 'source IN (...)' when non-empty. LogQueryController parses
?source=a,b,c the same way it parses ?level=a,b,c.
2026-04-17 11:48:10 +02:00
hsiegeln
e8d6cc5b5d docs: implementation plan for log filters + infinite scroll
13 tasks covering backend multi-value source filter, cursor-paginated
agent events, shared useInfiniteStream/InfiniteScrollArea primitives,
and page-level refactors of AgentHealth + AgentInstance. Bounded log
views (LogTab, StartupLogPanel) keep single-page hooks.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-17 11:37:06 +02:00
hsiegeln
b14551de4e docs: spec for multi-select log filters + infinite scroll
Establishes the shared pattern for streaming views (logs + agent
events): server-side multi-select source/level filters, cursor-based
infinite scroll, and top-gated auto-refetch. Scoped to AgentHealth and
AgentInstance; bounded views (LogTab, StartupLogPanel) keep
single-page hooks.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-17 11:26:39 +02:00
hsiegeln
62dd71b860 fix: stamp environment on agent_events rows
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m28s
CI / docker (push) Successful in 1m13s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 43s
The agent_events table has an `environment` column and AgentEventsController
filters on it, but the INSERT never populated it — every row got the
column default ('default'). Result: Timeline on the Application Runtime
page was empty whenever the user's selected env was anything other than
'default'.

Thread env through the write path:
- AgentEventRepository.insert + AgentEventService.recordEvent gain an
  `environment` param; delete the no-env query overload (unused).
- ClickHouseAgentEventRepository.insert writes the column (falls back to
  'default' on null to match column DEFAULT).
- All 5 callers source env from the agent registry (AgentInfo.environmentId)
  or the registration request body; AgentLifecycleMonitor, deregister,
  command ack, event ingestion, register/re-register.
- Integration test updated for the new signatures.

Pre-existing rows in deployed CH will still report environment='default'.
New events from this build forward will carry the correct env. Backfill
(UPDATE ... FROM apps) is left as a manual DB step if historical timeline
is needed for non-default envs.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-17 10:30:56 +02:00
hsiegeln
5807cfd807 fix: make API Docs page scrollable
LayoutShell's <main> sets overflow:hidden + min-height:0; pages must
handle their own scroll. SwaggerPage's root div had no height constraint,
so the Swagger UI rendered below the viewport with no scrollbar.

Match the pattern used by AppsTab/DashboardTab/etc.: flex:1 + min-height:0
+ overflow-y:auto on the page root. Inline style since this is a leaf
page with no existing module.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-17 10:26:15 +02:00
hsiegeln
88b9faa4f8 chore: point generate-api:live at deployed server instead of localhost
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m33s
CI / docker (push) Successful in 1m41s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 38s
The deployed instance at 192.168.50.86:30090 is the canonical source of
truth for the API schema during development — regen against it instead
of requiring a local backend boot with Postgres + ClickHouse.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-17 10:24:49 +02:00
hsiegeln
59de424ab9 chore: regenerate openapi.json + schema.d.ts from deployed server
Fetched from http://192.168.50.86:30090/api/v1/api-docs (running origin/main
through b7a107d — full P3B/P3C env-scoping migration live there).

SPA TS types now match the env-scoped URL shape used at runtime:
- /environments/{envSlug}/... for data, config, search, logs, routes, agents
- /agents/config (agent-authoritative)
- /admin/environments/{envSlug}/... (env CRUD)

Note: ExecutionDetail.environment isn't in the regenerated schema yet —
commit d02fa73 (local, not yet pushed/deployed) adds that backend field.
The local type extension in ui/src/components/ExecutionDiagram/types.ts
covers the gap until the next redeploy + regen.

UI typecheck (tsc -p tsconfig.app.json --noEmit) passes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-17 10:24:24 +02:00
hsiegeln
d02fa73080 fix: scope correlation-chain query to the exchange's own env
Correlated exchanges always share the env of the one being viewed —
using the globally-selected env from the picker was wrong if the user
switched envs after opening a detail view (or arrived via permalink).

Thread `environment` through:
- `ExecutionStore.ExecutionRecord` gains `environment` field; the
  ClickHouse `executions` table already stores this, just not read back.
- `ClickHouseExecutionStore.findById` SELECT adds the column; mapper
  populates it.
- `ExecutionDetail` gains `environment`; `DetailService` passes through.
- `IngestionService.toExecutionRecord` passes null — this legacy PG
  ingestion path isn't active when ClickHouse is enabled, and the
  read-side is what drives the correlation UI.
- UI `ExchangeHeader` reads `detail.environment ?? storeEnv` and
  extends the TS type locally (schema.d.ts catches up on next regen).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-17 10:19:42 +02:00
hsiegeln
f04e77788e fix: thread environment into correlation-chain query in ExchangeHeader
The env-scoping migration (P3A) changed useCorrelationChain to require an
environment arg and gate on `enabled: !!correlationId && !!environment`,
but ExchangeHeader was still calling it with one arg. Result: the query
never fired, so the header always rendered "no correlated exchanges
found" even when 4+ exchanges shared a correlationId.

Fix: read the selected env from the Zustand environment store and pass
it through.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-17 10:13:32 +02:00
hsiegeln
b7a107d33f test: update integration tests for env-scoped URL shape
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m49s
CI / docker (push) Successful in 2m5s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 1m37s
Picks up the URL moves from P2/P3A/P3B/P3C. Also fixes a latent bug in
AppControllerIT.uploadJar_asOperator_returns201 / DeploymentControllerIT
setUp: the tests were passing the app's UUID as the {appSlug} path
variable (via `path("id").asText()`); the old AppController looked up
apps via getBySlug(), so the legacy URL call would 404 when the slug
literal was a UUID. Now the test tracks the known slug string and uses
it for every /apps/{appSlug}/... path.

Test URL updates:
- SearchControllerIT: /api/v1/search/executions →
  /api/v1/environments/default/executions (GET) and
  /api/v1/environments/default/executions/search (POST).
- AppControllerIT: /api/v1/apps → /api/v1/environments/default/apps.
  Request bodies drop environmentId (it's in the path).
- DeploymentControllerIT: /api/v1/apps/{appId}/deployments →
  /api/v1/environments/default/apps/{appSlug}/deployments. DeployRequest
  body drops environmentId.
- JwtRefreshIT + RegistrationSecurityIT: smoke-test protected endpoint
  call updated to the new /environments/default/executions shape.

All tests compile clean. Runtime behavior requires a full stack
(Postgres + ClickHouse + Docker); validating integration tests is a
pre-merge step before merging the feature branch.

Remaining pre-merge items (not blocked by code):
1. Regenerate ui/src/api/schema.d.ts + openapi.json by running
   `cd ui && npm run generate-api:live` against a running backend.
   SearchController, DeploymentController, etc. DTO signatures have
   changed; schema.d.ts is frozen at the pre-migration shape.
   Raw-fetch call sites introduced in P3A/P3C work at runtime without
   the schema; the regen only sharpens TypeScript coverage.
2. Smoke test locally: boot server, verify EnvironmentsPage,
   AppsTab, Exchanges, Dashboard, Runtime pages all function.
3. Run `mvn verify` end-to-end (Testcontainers + Docker required).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-16 23:53:55 +02:00
hsiegeln
51d7bda5b8 docs: document P3 URL taxonomy, slug immutability, tenant invariant
Locks the new conventions into rule files so future agents and humans
don't drift back into old patterns.

- .claude/rules/app-classes.md: replaces the flat endpoint catalog
  with a taxonomy-aware reorganization (env-scoped / env-admin /
  agent-only / ingestion / cross-env discovery / admin / other).
  Adds the flat-endpoint allow-list with rationale per prefix and
  documents the tenant-filter invariant for ClickHouse queries.
- CLAUDE.md: adds four convention bullets in Key Conventions —
  URL taxonomy with allow-list pointer, slug immutability rule,
  app uniqueness as (env, app_slug), env-required on env-scoped
  endpoints.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-16 23:50:38 +02:00
hsiegeln
873e1d3df7 feat!: move query/logs/routes/diagram/agent-view endpoints under /environments/{envSlug}/
P3C — the last data/query wave of the taxonomy migration. Every user-
facing read endpoint that was keyed on env-as-query-param is now under
the env-scoped URL, making env impossible to omit and unambiguous in
server-side tenant+env filtering.

Server:
- SearchController: /api/v1/search/** → /api/v1/environments/{envSlug}/...
  Endpoints: /executions (GET), /executions/search (POST), /stats,
  /stats/timeseries, /stats/timeseries/by-app, /stats/timeseries/by-route,
  /stats/punchcard, /attributes/keys, /errors/top. Env comes from path.
- LogQueryController: /api/v1/logs → /api/v1/environments/{envSlug}/logs.
- RouteCatalogController: /api/v1/routes/catalog → /api/v1/environments/
  {envSlug}/routes. Env filter unconditional (path).
- RouteMetricsController: /api/v1/routes/metrics →
  /api/v1/environments/{envSlug}/routes/metrics (and /metrics/processors).
- DiagramRenderController: /{contentHash}/render stays flat (hashes are
  globally unique). Find-by-route moved to /api/v1/environments/{envSlug}/
  apps/{appSlug}/routes/{routeId}/diagram — the old GET /diagrams?...
  handler is removed.
- Agent views split cleanly:
  - AgentListController (new): /api/v1/environments/{envSlug}/agents
  - AgentEventsController: /api/v1/environments/{envSlug}/agents/events
  - AgentMetricsController: /api/v1/environments/{envSlug}/agents/
    {agentId}/metrics — now also rejects cross-env agents (404) as a
    defense-in-depth check, fulfilling B3.
  Agent self-service endpoints (register/refresh/heartbeat/deregister)
  remain flat at /api/v1/agents/** — JWT-authoritative.

SPA:
- queries/agents.ts, agent-metrics.ts, logs.ts, catalog.ts (route
  metrics only; /catalog stays flat), processor-metrics.ts,
  executions.ts (attributes/keys, stats, timeseries, search),
  dashboard.ts (all stats/errors/punchcard), correlation.ts,
  diagrams.ts (by-route) — all rewritten to env-scoped URLs.
- Hooks now either read env from useEnvironmentStore internally or
  require it as an argument. Query keys include env so switching env
  invalidates caches.
- useAgents/useAgentEvents signature simplified — env is no longer a
  parameter; it's read from the store. Callers (LayoutShell,
  AgentHealth, AgentInstance) updated accordingly.
- LogTab and useStartupLogs thread env through to useLogs.
- envFetch helper introduced in executions.ts for env-prefixed raw
  fetch until schema.d.ts is regenerated against the new backend.

BREAKING CHANGE: All these flat paths are removed:
  /api/v1/search/**, /api/v1/logs, /api/v1/routes/catalog,
  /api/v1/routes/metrics (and /processors), /api/v1/diagrams
  (lookup), /api/v1/agents (list), /api/v1/agents/events-log,
  /api/v1/agents/{id}/metrics, /api/v1/agent-events.
Clients must use the /api/v1/environments/{envSlug}/... equivalents.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-16 23:48:25 +02:00
hsiegeln
6d9e456b97 feat!: move apps & deployments under /api/v1/environments/{envSlug}/apps/{appSlug}/...
P3B of the taxonomy migration. App and deployment routes are now
env-scoped in the URL itself, making the (env, app_slug) uniqueness
key explicit. Previously /api/v1/apps/{appSlug} was ambiguous: with
the same app deployed to multiple environments (dev/staging/prod),
the handler called AppService.getBySlug(slug) which returns the
first row matching slug regardless of env.

Server:
- AppController: @RequestMapping("/api/v1/environments/{envSlug}/
  apps"). Every handler now calls
  appService.getByEnvironmentAndSlug(env.id(), appSlug) — 404 if the
  app doesn't exist in *this* env. CreateAppRequest body drops
  environmentId (it's in the path).
- DeploymentController: @RequestMapping("/api/v1/environments/
  {envSlug}/apps/{appSlug}/deployments"). DeployRequest body drops
  environmentId. PromoteRequest body switches from
  targetEnvironmentId (UUID) to targetEnvironment (slug);
  promote handler resolves the target env by slug and looks up the
  app with the same slug in the target env (fails with 404 if the
  target app doesn't exist yet — apps must exist in both source
  and target before promote).
- AppService: added getByEnvironmentAndSlug helper; createApp now
  validates slug against ^[a-z0-9][a-z0-9-]{0,63}$ (400 on
  invalid).

SPA:
- queries/admin/apps.ts: rewritten. Hooks take envSlug where
  env-scoped. Removed useAllApps (no flat endpoint). Renamed path
  param naming: appId → appSlug throughout. Added
  usePromoteDeployment. Query keys include envSlug so cache is
  env-scoped.
- AppsTab.tsx: call sites updated. When no environment is selected,
  the managed-app list is empty — cross-env discovery lives in the
  Runtime tab (catalog). handleDeploy/handleStop/etc. pass envSlug
  to the new hook signatures.

BREAKING CHANGE: /api/v1/apps/** paths removed. Clients must use
/api/v1/environments/{envSlug}/apps/{appSlug}/**. Request bodies
for POST /apps and POST /apps/{slug}/deployments no longer accept
environmentId (use the URL path instead). Promote body uses slug
not UUID.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-16 23:38:37 +02:00
hsiegeln
969cdb3bd0 feat!: move config & settings under /api/v1/environments/{envSlug}/...
P3A of the taxonomy migration. Env-scoped config and settings endpoints
now live under the env-prefixed URL shape, making env a first-class
path segment instead of a query param. Agent-authoritative config is
split off into a dedicated endpoint so agent env comes from the JWT
only — never spoofable via URL.

Server:
- ApplicationConfigController: @RequestMapping("/api/v1/environments/
  {envSlug}"). Handlers use @EnvPath Environment env, appSlug as
  @PathVariable. Removed the dual-mode resolveEnvironmentForRead —
  user flow only; agent flow moved to AgentConfigController.
- AgentConfigController (new): GET /api/v1/agents/config. Reads
  instanceId from JWT subject, resolves (app, env) from registry,
  returns AppConfigResponse. Registry miss → falls back to JWT env
  claim for environment, but 404s if application cannot be derived
  (no other source without registry).
- AppSettingsController: @RequestMapping("/api/v1/environments/
  {envSlug}"). List at /app-settings, per-app at /apps/{appSlug}/
  settings. Access class-wide PreAuthorize preserved (ADMIN/OPERATOR).

SPA:
- commands.ts: useAllApplicationConfigs, useApplicationConfig,
  useUpdateApplicationConfig, useProcessorRouteMapping,
  useTestExpression — rewritten URLs to /environments/{env}/apps/
  {app}/... shape. environment now required on every call. Query
  keys include environment so cache is env-scoped.
- dashboard.ts: useAppSettings, useAllAppSettings, useUpdateAppSettings
  rewritten.
- TapConfigModal: new required environment prop; callers updated.
- RouteDetail, ExchangesPage: thread selectedEnv into test-expression
  and modal.

Config changes in SecurityConfig for the new shape landed earlier in
P0.2; no security rule changes needed in this commit.

BREAKING CHANGE: /api/v1/config/** and /api/v1/admin/app-settings/**
paths removed. Agents must use /api/v1/agents/config instead of
GET /api/v1/config/{app}; users must use /api/v1/environments/{env}/
apps/{app}/config and /api/v1/environments/{env}/apps/{app}/settings.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-16 23:33:25 +02:00
hsiegeln
6b5ee10944 feat!: environment admin URLs use slug; validate and immutabilize slug
UUID-based admin paths were the only remaining UUID-in-URL pattern in
the API. Migrates /api/v1/admin/environments/{id} to /{envSlug} so
slugs are the single environment identifier in every URL. UUIDs stay
internal to the database.

- Controller: @PathVariable UUID id → @PathVariable String envSlug on
  get/update/delete and the two nested endpoints (default-container-
  config, jar-retention). Handlers resolve slug → Environment via
  EnvironmentService.getBySlug, then delegate to existing UUID-based
  service methods.
- Service: create() now validates slug against ^[a-z0-9][a-z0-9-]{0,63}$
  and returns 400 on invalid slugs. Rationale documented in the class:
  slugs are immutable after creation because they appear in URLs,
  Docker network names, container names, and ClickHouse partition keys.
- UpdateEnvironmentRequest has no slug field and Jackson's default
  ignore-unknown behavior drops any slug supplied in a PUT body;
  regression test (updateEnvironment_withSlugInBody_ignoresSlug)
  documents this invariant.
- SPA: mutation args change from { id } to { slug }. EnvironmentsPage
  still uses env.id for local selection state (UUID from DB) but
  passes env.slug to every mutation.

BREAKING CHANGE: /api/v1/admin/environments/{id:UUID}/... paths removed.
Clients must use /{envSlug}/... (slug from the environments list).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-16 23:23:31 +02:00
hsiegeln
fcb53dd010 fix!: require environment on diagram lookup and attribute keys queries
Closes two cross-env data leakage paths. Both endpoints previously
returned data aggregated across all environments, so a diagram or
attribute key from dev could appear in a prod UI query (and vice versa).

B1: GET /api/v1/diagrams?application=&routeId= now requires
?environment= and resolves agents via
registryService.findByApplicationAndEnvironment instead of
findByApplication. Prevents serving a dev diagram for a prod route.

B2: GET /api/v1/search/attributes/keys now requires ?environment=.
SearchIndex.distinctAttributeKeys gains an environment parameter and
the ClickHouse query adds the env filter alongside the existing
tenant_id filter. Prevents prod attribute names leaking into dev
autocompletion (and vice versa).

SPA hooks updated to thread environment through from
useEnvironmentStore; query keys include environment so React Query
re-fetches on env switch. No call-site changes needed — hook
signatures unchanged.

B3 (AgentMetricsController env scope) deferred to P3C: agent-env is
effectively 1:1 today via the instance_id naming
({envSlug}-{appSlug}-{replicaIndex}), and the URL migration in P3C
to /api/v1/environments/{env}/agents/{agentId}/metrics naturally
introduces env from path. A minimal P1 fix would regress the "view
metrics of a killed agent" case.

BREAKING CHANGE: Both endpoints now require ?environment= (slug).
Clients omitting the parameter receive 400.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-16 23:19:55 +02:00
hsiegeln
c97d0ea061 feat: add @EnvPath resolver and security matchers for env-scoped URLs
Groundwork for the REST API taxonomy migration. Introduces the
infrastructure that future waves use to move data/query endpoints under
/api/v1/environments/{envSlug}/... without per-handler boilerplate.

- Add @EnvPath annotation + EnvironmentPathResolver: injects the
  Environment identified by the {envSlug} path variable, 404 on unknown
  slug, registered via WebConfig.addArgumentResolvers.
- Add env-scoped URL matchers to SecurityConfig (config, settings,
  executions/stats/logs/routes/agents/apps/deployments under
  /environments/*/**). Legacy flat matchers kept in place and will be
  removed per-wave as controllers migrate. New agent-authoritative
  /api/v1/agents/config matcher prepared for the agent/user split.
- Document OpenAPI schema regen workflow in CLAUDE.md so future API
  changes cover schema.d.ts regeneration as part of the change.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-16 23:13:17 +02:00
hsiegeln
9b1ef51d77 feat!: scope per-app config and settings by environment
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m27s
CI / docker (push) Successful in 1m10s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 1m40s
SonarQube / sonarqube (push) Successful in 4m29s
BREAKING: wipe dev PostgreSQL before deploying — V1 checksum changes.
Agents must now send environmentId on registration (400 if missing).

Two tables previously keyed on app name alone caused cross-environment
data bleed: writing config for (app=X, env=dev) would overwrite the row
used by (app=X, env=prod) agents, and agent startup fetches ignored env
entirely.

- V1 schema: application_config and app_settings are now PK (app, env).
- Repositories: env-keyed finders/saves; env is the authoritative column,
  stamped on the stored JSON so the row agrees with itself.
- ApplicationConfigController.getConfig is dual-mode — AGENT role uses
  JWT env claim (agents cannot spoof env); non-agent callers provide env
  via ?environment= query param.
- AppSettingsController endpoints now require ?environment=.
- SensitiveKeysAdminController fan-out iterates (app, env) slices so each
  env gets its own merged keys.
- DiagramController ingestion stamps env on TaggedDiagram; ClickHouse
  route_diagrams INSERT + findProcessorRouteMapping are env-scoped.
- AgentRegistrationController: environmentId is required on register;
  removed all "default" fallbacks from register/refresh/heartbeat auto-heal.
- UI hooks (useApplicationConfig, useProcessorRouteMapping, useAppSettings,
  useAllAppSettings, useUpdateAppSettings) take env, wired to
  useEnvironmentStore at all call sites.
- New ConfigEnvIsolationIT covers env-isolation for both repositories.

Plan in docs/superpowers/plans/2026-04-16-environment-scoping.md.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-16 22:25:21 +02:00
hsiegeln
c272ac6c24 chore: bump @cameleer/design-system to 0.1.56
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m29s
CI / docker (push) Successful in 1m34s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 39s
Picks up the Sidebar fix that aligns Sidebar.Section header and
Sidebar.FooterLink icon/label columns. Bottom-positioned admin section
and footer links (e.g. API Docs) now share the same 9px left offset and
16px icon wrapper width.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-16 21:24:30 +02:00
hsiegeln
e2d9428dff fix: drop stale instance_id filter from search and scope route stats by app
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m28s
CI / docker (push) Successful in 1m11s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 42s
The exchange search silently filtered by the in-memory agent registry's
current instance IDs on top of application_id. Historical exchanges written
by previous agent instances (or any instance not currently registered, e.g.
after a server restart before agents heartbeat back) were hidden from
results even though they matched the application filter.

Fix: drop the applicationId -> instanceIds resolution in SearchController.
Rely on application_id = ? in ClickHouseSearchIndex; keep explicit
instanceIds filtering only when a client passes them.

Related cleanup: the agentIds parameter on StatsStore.statsForRoute /
timeseriesForRoute was silently discarded inside ClickHouseStatsStore, so
per-route stats aggregated across any apps sharing a routeId. Replace with
String applicationId and add application_id to the stats_1m_route filters
so per-route stats are correctly scoped.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-16 19:49:55 +02:00
hsiegeln
1a68e1c46c fix: add --text-inverse CSS variable for badge icon contrast
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m27s
CI / docker (push) Successful in 1m12s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 37s
The design system doesn't define --text-inverse yet, so SVG icons
on colored badge backgrounds (trace, tap, status) had no fill color.
Define it in index.css as #fff until the DS adds it natively.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 19:30:51 +02:00
hsiegeln
4ee43bab95 fix: include headers and properties in has_trace_data detection
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m28s
CI / docker (push) Successful in 1m10s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 38s
IngestionService.hasAnyTraceData() and ChunkAccumulator only checked
for inputBody/outputBody when setting has_trace_data on executions.
Headers and properties captured via deep tracing were not considered,
causing the trace data indicator to be missing in the exchange list.

DetailService already checked all six fields — this aligns the
ingestion path to match.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 19:26:15 +02:00
hsiegeln
4e45be59ef docs: update CLAUDE.md with persistent route catalog conventions
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m24s
CI / docker (push) Successful in 1m9s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 41s
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 19:05:36 +02:00
hsiegeln
3a54b4d7e7 refactor: consolidate heartbeat findById into single lookup
Merge the route state update and route catalog upsert blocks to share
one registryService.findById() call instead of two, reducing overhead
on the high-frequency heartbeat path.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 19:04:41 +02:00
hsiegeln
b77968bb2d docs: update rule files with RouteCatalogStore classes
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 18:50:39 +02:00
hsiegeln
10b412c50c feat: merge persistent route catalog into legacy catalog endpoint
Add RouteCatalogStore as a third data source in RouteCatalogController so that
/api/v1/routes/catalog surfaces routes with zero executions and routes from
previous app versions that fall within the requested time window.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-16 18:50:13 +02:00
hsiegeln
24c858cca4 feat: merge persistent route catalog into unified catalog endpoint
Wires RouteCatalogStore into CatalogController as a third data source:
routes with zero executions and routes from previous app versions
(within the queried time window) now appear in the unified catalog.
Also clears route_catalog on app dismiss via deleteByApplication().

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-16 18:49:00 +02:00
hsiegeln
462b9a4bf0 feat: persist route catalog on agent register and heartbeat
Wire RouteCatalogStore into AgentRegistrationController and call upsert
after registration and heartbeat so routes survive server restarts.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-16 18:47:22 +02:00
hsiegeln
c4f4477472 feat: wire ClickHouseRouteCatalogStore bean 2026-04-16 18:45:48 +02:00
hsiegeln
961dadd1c8 feat: implement ClickHouseRouteCatalogStore with first_seen cache 2026-04-16 18:45:45 +02:00
hsiegeln
887a9b6faa feat: add RouteCatalogStore interface and RouteCatalogEntry record 2026-04-16 18:45:42 +02:00
hsiegeln
04da0af4bc feat: add route_catalog table to ClickHouse schema 2026-04-16 18:45:38 +02:00
hsiegeln
dd0f0e73b3 docs: add persistent route catalog implementation plan
9-task plan covering ClickHouse schema, core interface, cached
implementation, bean wiring, write path (register/heartbeat),
read path (both catalog controllers), dismiss cleanup, and
rule file updates.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 18:42:37 +02:00
hsiegeln
2542e430ac docs: add persistent route catalog design spec
Routes with zero executions (sub-routes) vanish from the sidebar after
server restart because the catalog is purely in-memory with a ClickHouse
stats fallback that only covers executed routes. This spec describes a
persistent route_catalog table in ClickHouse with lifecycle tracking
(first_seen/last_seen) to reconstruct the sidebar without agent
reconnection and support historical time-window queries.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 18:39:49 +02:00
hsiegeln
a1ea112876 feat: replace text labels with icons in runtime cards
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m20s
CI / docker (push) Successful in 1m38s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 53s
Use Activity, Cpu, and HeartPulse icons instead of "tps", "cpu", and
"ago" text in compact and expanded app cards. Bump design-system to
v0.1.55 for sidebar footer alignment fix.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 15:43:37 +02:00
11ad769f59 Merge pull request 'feat/runtime-compact-view' (#136) from feat/runtime-compact-view into main
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m20s
CI / docker (push) Successful in 30s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 37s
Reviewed-on: #136
2026-04-16 15:27:37 +02:00
hsiegeln
3a9f3f41de feat: match filter input to sidebar search styling
All checks were successful
CI / cleanup-branch (pull_request) Has been skipped
CI / build (pull_request) Successful in 2m11s
CI / cleanup-branch (push) Has been skipped
CI / docker (pull_request) Has been skipped
CI / build (push) Successful in 2m11s
CI / deploy (pull_request) Has been skipped
CI / deploy-feature (pull_request) Has been skipped
CI / docker (push) Successful in 1m11s
CI / deploy (push) Has been skipped
CI / deploy-feature (push) Successful in 40s
Add search icon, translucent background, and same padding/sizing
as the sidebar's built-in filter input. Placeholder changed to
"Filter..." to match sidebar convention.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 15:22:23 +02:00
hsiegeln
e84822f211 feat: add sort buttons and fix filter placeholder
Some checks failed
CI / cleanup-branch (pull_request) Has been skipped
CI / build (pull_request) Successful in 2m12s
CI / cleanup-branch (push) Has been skipped
CI / docker (pull_request) Has been skipped
CI / build (push) Successful in 2m13s
CI / deploy (pull_request) Has been skipped
CI / deploy-feature (pull_request) Has been skipped
CI / deploy (push) Has been cancelled
CI / deploy-feature (push) Has been cancelled
CI / docker (push) Has been cancelled
Add sort buttons (Status, Name, TPS, CPU, Heartbeat) to the toolbar,
right-aligned. Clicking toggles asc/desc, second sort criterion is
always name. Status sorts error > warning > success. Fix trailing
unicode escape in filter placeholder.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 15:19:10 +02:00
hsiegeln
e346b9bb9d feat: add app name filter to runtime toolbar
Some checks failed
CI / cleanup-branch (pull_request) Has been skipped
CI / build (pull_request) Successful in 2m9s
CI / cleanup-branch (push) Has been skipped
CI / docker (pull_request) Has been skipped
CI / build (push) Successful in 2m10s
CI / deploy (pull_request) Has been skipped
CI / deploy-feature (pull_request) Has been skipped
CI / deploy (push) Has been cancelled
CI / deploy-feature (push) Has been cancelled
CI / docker (push) Has been cancelled
Text input next to view toggle filters apps by name (case-insensitive
substring match). KPI stat strip uses unfiltered counts so totals
stay accurate. Clear button on non-empty input.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 15:16:28 +02:00
hsiegeln
f4811359e1 fix: keep view toggle visible in both compact and expanded modes
Some checks failed
CI / cleanup-branch (pull_request) Has been skipped
CI / build (pull_request) Successful in 2m16s
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 2m12s
CI / docker (pull_request) Has been skipped
CI / deploy (pull_request) Has been skipped
CI / deploy-feature (pull_request) Has been skipped
CI / docker (push) Has been cancelled
CI / deploy (push) Has been cancelled
CI / deploy-feature (push) Has been cancelled
Move toolbar above the grid conditional so it renders in both
view modes. Hidden only on app detail pages (isFullWidth).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 15:13:47 +02:00
hsiegeln
e5b1171833 feat: replace Errors column with CPU in expanded agent table
Some checks failed
CI / cleanup-branch (pull_request) Has been skipped
CI / docker (push) Has been cancelled
CI / deploy (push) Has been cancelled
CI / deploy-feature (push) Has been cancelled
CI / cleanup-branch (push) Has been cancelled
CI / build (push) Has been cancelled
CI / build (pull_request) Successful in 2m0s
CI / docker (pull_request) Has been skipped
CI / deploy (pull_request) Has been skipped
CI / deploy-feature (pull_request) Has been skipped
Show per-instance CPU usage percentage instead of error rate in the
DataTable. Highlights >80% CPU in error color.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 15:12:01 +02:00
hsiegeln
d27a288128 fix: resolve TS2367 — view toggle active class in compact-only branch
Some checks failed
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m32s
CI / cleanup-branch (pull_request) Has been skipped
CI / deploy (push) Has been cancelled
CI / deploy-feature (push) Has been cancelled
CI / docker (push) Has been cancelled
CI / build (pull_request) Successful in 2m15s
CI / docker (pull_request) Has been skipped
CI / deploy (pull_request) Has been skipped
CI / deploy-feature (pull_request) Has been skipped
The toggle only renders inside the compact branch, so viewMode is
always 'compact' there. Use static class assignment instead of a
comparison TypeScript correctly flags as unreachable.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 14:20:45 +02:00
hsiegeln
7825aae274 feat: show CPU usage in expanded GroupCard meta headers
Add max CPU percentage to the meta row of both the full expanded
view and the overlay expanded card, consistent with compact cards.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 14:18:48 +02:00
hsiegeln
4b264b3308 feat: add CPU usage to agent response and compact cards
Backend:
- Add cpuUsage field to AgentInstanceResponse (-1 if unavailable)
- Add queryAgentCpuUsage() to AgentRegistrationController — queries
  avg CPU per instance from agent_metrics over last 2 minutes
- Wire CPU into agent list response via withCpuUsage()

Frontend:
- Add cpuUsage to schema.d.ts
- Compute maxCpu per AppGroup (max across all instances)
- Show "X% cpu" on compact cards when available (hidden when -1)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 14:12:23 +02:00
hsiegeln
b57fe875f3 feat: click-outside dismiss and clean overlay styling
- Add invisible backdrop (z-index 99) behind expanded overlay to
  dismiss on outside click
- Remove background/padding from overlay wrapper so GroupCard
  renders without visible extra border
- Use drop-shadow filter instead of box-shadow for natural card
  shadow

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 14:04:26 +02:00
hsiegeln
911ba591a9 feat: hide toggle on app detail, left-align toolbar, TPS unit fix
- Move view toggle into compact grid conditional so it only renders
  on the overview page (not app detail /runtime/{slug})
- Left-align the toolbar buttons
- Change TPS format from "x.y/s" to "x.y tps"

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 13:59:25 +02:00
hsiegeln
9d1cf7577a feat: overlay z-index fix, app name navigation, TPS on compact cards
- Bump overlay z-index to 100 so it renders above the sidebar
- App name in compact card navigates to /runtime/{slug} on click
- Add TPS (msg/s) as third metric on compact cards between live
  count and heartbeat

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 13:52:42 +02:00
hsiegeln
1fa897fbb5 feat: move toggle to toolbar, sort apps by name, overlay expand
- Move expand/collapse toggle from stat strip to dedicated toolbar
  below KPIs
- Sort app groups alphabetically by name
- Expanded card overlays from clicked card position instead of
  pushing other cards down
- Viewport constraint: overlay flips right-alignment and limits
  height when near edges

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 13:48:39 +02:00
hsiegeln
9f7951aa2b docs: add compact view to runtime section of ui rules
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 13:42:26 +02:00
hsiegeln
61df59853b feat: add expand/collapse animation for compact card toggle
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-16 13:39:48 +02:00
hsiegeln
5229e08b27 feat: add compact app cards with inline expand to runtime dashboard
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-16 13:37:58 +02:00
hsiegeln
d0c2fd1ac3 feat: add view mode state and toggle to runtime dashboard
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-16 13:37:16 +02:00
hsiegeln
5c94881608 style: add compact view CSS classes for runtime dashboard
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-16 13:35:33 +02:00
hsiegeln
23d24487d1 docs: add runtime compact view implementation plan
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 13:32:14 +02:00
hsiegeln
bf289aa1b1 docs: add runtime compact view design spec
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 13:27:53 +02:00
hsiegeln
78396a2796 fix: sidebar route selection and missing routes after server restart
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m20s
CI / docker (push) Successful in 1m10s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 38s
Two sidebar bugs fixed:

1. Route entries never highlighted on navigation because sidebar-utils
   generated /apps/ paths for route children while effectiveSelectedPath
   normalizes to /exchanges/. The design system does exact string matching.

2. Routes disappeared from sidebar when agents had no recent exchange
   data. Heartbeat carried routeStates (with route IDs as keys) but
   AgentRegistryService.heartbeat() never updated AgentInfo.routeIds.
   After server restart, auto-heal registered agents with empty routes,
   leaving ClickHouse (24h window) as the only discovery source.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 12:42:01 +02:00
hsiegeln
810f493639 chore: track .claude/rules/ and add self-maintenance instruction
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m23s
CI / docker (push) Successful in 5m22s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 44s
Un-ignore .claude/rules/ so path-scoped rule files are shared via git.
Add instruction in CLAUDE.md to update rule files when modifying classes,
controllers, endpoints, or metrics — keeps rules current as part of
normal workflow rather than requiring separate maintenance.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 09:26:53 +02:00
hsiegeln
95730b02ad refactor: decompose CLAUDE.md into path-scoped rules for reduced startup context
Some checks failed
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m46s
CI / deploy (push) Has been cancelled
CI / deploy-feature (push) Has been cancelled
CI / docker (push) Has been cancelled
Fix all factual inaccuracies (Ed25519 methods, syncOidcRoles reference,
cpuShares->cpuRequest, deployment sub-tab order, ClickHouseLogStore package,
OidcConfig fields) and add 30+ missing classes/controllers.

Move reference material (class maps, Docker orchestration, metrics tables,
UI structure, CI/CD) into .claude/rules/ with path-scoped loading. Remove
duplicated GitNexus section (already in AGENTS.md, now in .claude/rules/).

Startup context reduced from ~13K to ~4K tokens (69% reduction).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 09:24:26 +02:00
hsiegeln
3666994b9e refactor: simplify Docker entrypoints — agent bundles log appender
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m54s
CI / docker (push) Successful in 1m16s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 57s
SonarQube / sonarqube (push) Successful in 4m0s
The agent shaded JAR now includes the log appender classes. Remove
PropertiesLauncher, -Dloader.path, and separate appender JAR references.

All JVM types now use: java -javaagent:/app/agent.jar -jar app.jar
Plain Java uses -cp with explicit main class. Native runs binary directly.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 01:20:48 +02:00
hsiegeln
a4a5986f38 feat: use cameleer.processorId MDC key for precise log-to-processor correlation
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m52s
CI / docker (push) Successful in 1m54s
CI / deploy (push) Successful in 56s
CI / deploy-feature (push) Has been skipped
LogTab now checks mdc['cameleer.processorId'] first when filtering logs
for a selected processor node, falling back to fuzzy message/loggerName
matching for older agents without the new MDC key.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 00:19:02 +02:00
hsiegeln
859cf7c10d fix: support pre-3.2 Spring Boot JARs in runtime entrypoint
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m39s
CI / docker (push) Successful in 1m38s
CI / deploy (push) Successful in 47s
CI / deploy-feature (push) Has been skipped
RuntimeDetector now derives the correct PropertiesLauncher FQN from
the JAR manifest Main-Class package. Spring Boot 3.2+ uses
org.springframework.boot.loader.launch.PropertiesLauncher, pre-3.2
uses org.springframework.boot.loader.PropertiesLauncher.

DockerRuntimeOrchestrator uses the detected class instead of a
hardcoded 3.2+ reference, falling back to 3.2+ when not auto-detected.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 23:21:01 +02:00
hsiegeln
7961c6e18c fix: expose ClickHouse HTTP port via NodePort for remote access
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m17s
CI / docker (push) Successful in 1m3s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 42s
Add cameleer-clickhouse-external Service (NodePort 30123) matching the
pattern used by cameleer-postgres-external (NodePort 30432).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 23:01:46 +02:00
hsiegeln
251c88fa63 fix: prefer cameleer.exchangeId MDC key for log correlation
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m57s
CI / docker (push) Successful in 1m6s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 37s
The agent now sets cameleer.exchangeId in MDC (persists across processor
executions, unlike Camel's camel.exchangeId which is scoped to MDCUnitOfWork).
For ON_COMPLETION exchange copies, the agent uses the parent's exchange ID.

Server changes:
- ClickHouseLogStore ingestion: extract exchange_id preferring
  cameleer.exchangeId, falling back to camel.exchangeId
- ClickHouseLogStore search: match exchangeId filter against exchange_id
  column OR cameleer.exchangeId OR camel.exchangeId in MDC
- Update CLAUDE.md with log exchange correlation documentation

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 22:35:28 +02:00
hsiegeln
64af49e0b5 fix: improve sidebar layout with scrollable sections and bottom-pinned admin
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 2m1s
CI / docker (push) Successful in 2m25s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 43s
- Applications section: maxHeight 50vh with scroll overflow
- Starred section: maxHeight 30vh with scroll overflow
- Admin section: pinned to bottom of sidebar via position="bottom"
- Update design-system to 0.1.54 (sidebar section maxHeight, position props)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 22:09:10 +02:00
hsiegeln
e4b8f1bab4 fix: allow app creation without JAR when deploy is disabled
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m21s
CI / docker (push) Successful in 1m7s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 36s
canSubmit no longer requires a JAR file when "Create only" is selected.
JAR upload and deploy steps are skipped when no file is provided.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 20:38:53 +02:00
hsiegeln
457650012b fix: resolve UI glitches and improve consistency
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m22s
CI / docker (push) Successful in 1m36s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 38s
- Sidebar: make +App button more subtle (lower opacity, brightens on hover)
- Sidebar: add filter chips to hide empty routes and offline/stale apps
- Sidebar: hide filter chips and +App button when sidebar is collapsed
- Exchange table: reorder columns to Status, Attributes, App, Route, Started, Duration; remove ExchangeId and Agent columns
- Exchange detail log tab: query by exchangeId only (no applicationId required), filter by processorId when processor selected
- KPI tooltips: styled tooltips with current/previous values, time period labels, percentage change, themed with DS variables
- KPI tooltips: fix overflow by left-aligning first two and right-aligning last two
- Exchange detail: show full datetime (YYYY-MM-DD HH:mm:ss.SSS) for start/end times
- Status labels: unify to title-case (Completed, Failed, Running) across all views
- Status filter buttons: match title-case labels (Completed, Warning, Failed, Running)
- Create app: show full external URL using routingDomain from env config or window.location.origin fallback
- Create app: add Runtime Type selector and Custom Arguments to Resources tab
- Create app: add Sensitive Keys tab with agent defaults, global keys, and app-specific keys (matching admin page design)
- Create app: add placeholder text to all Input fields for consistency
- Update design-system to 0.1.52 (sidebar collapse toggle fix)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 19:41:36 +02:00
hsiegeln
091dfb34d0 fix: rename Cameleer3ServerApplication.java to match class name
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 2m2s
CI / docker (push) Successful in 4m15s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 2m4s
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 16:15:45 +02:00
hsiegeln
92c412b198 chore: update design-system to 0.1.51 (renamed assets)
Some checks failed
CI / cleanup-branch (push) Has been skipped
CI / build (push) Failing after 1m12s
CI / docker (push) Has been skipped
CI / deploy (push) Has been skipped
CI / deploy-feature (push) Has been skipped
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 15:57:49 +02:00
hsiegeln
cb3ebfea7c chore: rename cameleer3 to cameleer
Some checks failed
CI / cleanup-branch (push) Has been skipped
CI / build (push) Failing after 18s
CI / docker (push) Has been skipped
CI / deploy (push) Has been skipped
CI / deploy-feature (push) Has been skipped
Rename Java packages from com.cameleer3 to com.cameleer, module
directories from cameleer3-* to cameleer-*, and all references
throughout workflows, Dockerfiles, docs, migrations, and pom.xml.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 15:28:42 +02:00
hsiegeln
1077293343 fix: pull base image on deploy and fix registry prefix default
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m16s
CI / docker (push) Successful in 1m8s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 38s
The PULL_IMAGE deploy stage was a no-op — Docker only pulls on create
if the image is missing entirely, not when a newer version exists.
DeploymentExecutor now calls orchestrator.pullImage() to fetch the
latest base image from the registry before creating containers.

Also fixes the default base image from 'cameleer-runtime-base:latest'
(local-only name) to the fully qualified registry path
'gitea.siegeln.net/cameleer/cameleer-runtime-base:latest'.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 11:29:58 +02:00
hsiegeln
fd806b9f2d fix: unify container/agent log identity and fix multi-replica log capture
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m21s
CI / docker (push) Successful in 1m18s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 42s
Four logging pipeline fixes:

1. Multi-replica startup logs: remove stopLogCaptureByApp from
   SseConnectionManager — container log capture now expires naturally
   after 60s instead of being killed when the first agent connects SSE.
   This ensures all replicas' bootstrap output is captured.

2. Unified instance_id: container logs and agent logs now share the same
   instance identity ({envSlug}-{appSlug}-{replicaIndex}). DeploymentExecutor
   sets CAMELEER_AGENT_INSTANCEID per replica so the agent uses the same
   ID as ContainerLogForwarder. Instance-level log views now show both
   container and agent logs.

3. Labels-first container identity: TraefikLabelBuilder emits cameleer.replica
   and cameleer.instance-id labels. Container names are tenant-prefixed
   ({tenantId}-{envSlug}-{appSlug}-{idx}) for global Docker daemon uniqueness.

4. Environment filter on log queries: useApplicationLogs now passes the
   selected environment to the API, preventing log leakage across environments.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 10:54:05 +02:00
hsiegeln
bf2d07f3ba fix: break circular dependency between runtimeOrchestrator and containerLogForwarder
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m33s
CI / docker (push) Successful in 1m11s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 40s
SonarQube / sonarqube (push) Successful in 3m34s
Extract DockerClient creation into a standalone bean so both
runtimeOrchestrator and containerLogForwarder depend on it directly
instead of on each other. DockerRuntimeOrchestrator now receives
DockerClient via constructor instead of creating it in @PostConstruct.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 00:06:40 +02:00
hsiegeln
9c912fe694 feat: distinguish agent re-registration from first registration
Some checks failed
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m16s
CI / docker (push) Successful in 1m38s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Failing after 2m19s
Detect when an agent instance already exists in the registry and record
a RE_REGISTERED event with route count and capabilities instead of a
generic REGISTERED event. UI shows a refresh icon for re-registrations.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 23:57:20 +02:00
hsiegeln
33b0bc4d98 fix: cast LogEntryResponse to LogEntry for StartupLogPanel type safety
Some checks failed
CI / cleanup-branch (push) Has been skipped
CI / docker (push) Has been cancelled
CI / deploy (push) Has been cancelled
CI / deploy-feature (push) Has been cancelled
CI / build (push) Has been cancelled
The DS LogViewer expects level as a string union, but the API response
type uses plain string. Cast at the call site to fix the TS build error.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 23:56:01 +02:00
hsiegeln
7a63135d26 fix: scope pg_stat_activity queries by ApplicationName for tenant isolation
Some checks failed
CI / cleanup-branch (push) Has been skipped
CI / build (push) Failing after 36s
CI / docker (push) Has been skipped
CI / deploy (push) Has been skipped
CI / deploy-feature (push) Has been skipped
DatabaseAdminController's active-queries and kill-query endpoints could
expose SQL text from other tenants sharing the same PostgreSQL instance.
Added ApplicationName=tenant_{id} to the JDBC URL and filter
pg_stat_activity by application_name so each tenant only sees its own
connections.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 23:51:13 +02:00
hsiegeln
c33b2a9048 docs: update CLAUDE.md with container startup log capture documentation
Some checks failed
CI / cleanup-branch (push) Has been skipped
CI / build (push) Failing after 37s
CI / docker (push) Has been skipped
CI / deploy (push) Has been skipped
CI / deploy-feature (push) Has been skipped
Add ContainerLogForwarder, StartupLogPanel, useStartupLogs to key classes
and UI files. Document log capture lifecycle and source badge rendering.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 23:36:38 +02:00
hsiegeln
0dafad883e chore: bump @cameleer/design-system to 0.1.49
Some checks failed
CI / cleanup-branch (push) Has been skipped
CI / build (push) Failing after 37s
CI / docker (push) Has been skipped
CI / deploy (push) Has been skipped
CI / deploy-feature (push) Has been skipped
LogViewer now renders source badges (container/app/agent) on log entries.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 23:33:19 +02:00
hsiegeln
1287952387 feat: show startup logs panel below deployment progress
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 23:24:08 +02:00
hsiegeln
81dd81fc07 feat: add container source option to log source filters
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 23:21:34 +02:00
hsiegeln
e7732703a6 feat: add StartupLogPanel component for deployment startup logs
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 23:21:26 +02:00
hsiegeln
119cf912b8 feat: add useStartupLogs hook for container startup log polling
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 23:21:23 +02:00
hsiegeln
81f42d5409 feat: stop container log capture on Docker die/oom events
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 23:19:26 +02:00
hsiegeln
49c7de7082 feat: stop container log capture when agent SSE connects
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 23:19:17 +02:00
hsiegeln
4940bf3376 feat: start log capture when deployment replicas are created
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 23:18:56 +02:00
hsiegeln
de85a861c7 feat: wire ContainerLogForwarder into DockerRuntimeOrchestrator
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 23:17:54 +02:00
hsiegeln
729944d3ac feat: add ContainerLogForwarder for Docker log streaming to ClickHouse
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 23:15:49 +02:00
hsiegeln
9c65a3c3b9 feat: add log capture methods to RuntimeOrchestrator interface
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 23:14:31 +02:00
hsiegeln
8fabc2b308 docs: add container startup log capture implementation plan
12 tasks covering RuntimeOrchestrator extension, ContainerLogForwarder,
deployment/SSE/event monitor integration, and UI startup log panel.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 23:12:01 +02:00
hsiegeln
14215bebec docs: add container startup log capture design spec
Covers streaming Docker logs to ClickHouse until agent SSE connect,
deployment log panel UI, and source badge in general log views.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 23:04:24 +02:00
hsiegeln
92d7f5809b improve: redesign SensitiveKeysPage with better layout and information hierarchy
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m29s
CI / docker (push) Successful in 1m11s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 37s
Show agent built-in defaults as reference Badge pills, separate editable keys
section with count badge, amber-highlighted push toggle, right-aligned save
button. Fix info text: keys add to defaults, not replace. Add ClaimMapping
controller to CLAUDE.md.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 19:03:45 +02:00
hsiegeln
9ac8e3604c fix: allow testing claim mapping rules before saving and keep rows editable after test
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m27s
CI / docker (push) Successful in 1m10s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 41s
The test endpoint now accepts inline rules from the client instead of reading
from the database, so unsaved rules can be tested. Matched rows show the
checkmark alongside action buttons instead of replacing them.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 18:52:18 +02:00
hsiegeln
891abbfcfd docs: add sensitive keys feature documentation
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m26s
CI / docker (push) Successful in 1m8s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 41s
- CLAUDE.md: add SensitiveKeysConfig, SensitiveKeysRepository, SensitiveKeysMerger
  to core admin classes; add SensitiveKeysAdminController endpoint; add
  PostgresSensitiveKeysRepository; add sensitive keys convention; add admin page
  to UI structure
- Design spec and implementation plan for the feature

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 18:29:15 +02:00
hsiegeln
7b73b5c9c5 feat: add per-app sensitive keys section to AppConfigDetailPage
Adds sensitiveKeys/globalSensitiveKeys/mergedSensitiveKeys fields to
ApplicationConfig, unwraps the new AppConfigResponse envelope in
useApplicationConfig, and renders an editable Sensitive Keys section
with read-only global pills and add/remove app-specific key tags.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-14 18:26:05 +02:00
hsiegeln
96780db9ad feat: wire SensitiveKeysPage into router and admin sidebar 2026-04-14 18:24:13 +02:00
hsiegeln
813ec6904e feat: add SensitiveKeysPage admin page
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-14 18:23:34 +02:00
hsiegeln
06c719f0dd feat: add sensitive keys API query hooks 2026-04-14 18:22:28 +02:00
hsiegeln
77aa3c3d6f test: add SensitiveKeysAdminController integration tests
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 18:21:46 +02:00
hsiegeln
2fad8811c6 feat: merge global sensitive keys into app config GET and SSE push
- GET /config/{app} now returns AppConfigResponse with globalSensitiveKeys and mergedSensitiveKeys alongside the config
- PUT /config/{app} merges global + per-app sensitive keys before pushing CONFIG_UPDATE to agents via SSE
- extractSensitiveKeys() uses JsonNode reflection to avoid compile-time dependency on cameleer3-common getSensitiveKeys()
- SensitiveKeysRepository injected as new constructor parameter

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-14 18:19:59 +02:00
hsiegeln
28e38e4dee fix: add audit logging to GET /admin/sensitive-keys
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 18:17:42 +02:00
hsiegeln
c3892151a5 feat: add SensitiveKeysAdminController with fan-out support
GET/PUT /api/v1/admin/sensitive-keys (ADMIN only). PUT accepts optional
pushToAgents param — when true, fans out merged global+per-app sensitive
keys to all live agents via CONFIG_UPDATE SSE commands with 10-second
shared deadline. Per-app keys extracted via JsonNode to avoid depending
on ApplicationConfig.getSensitiveKeys() not yet in the published
cameleer3-common jar. Includes audit logging on every PUT.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-14 18:16:27 +02:00
hsiegeln
84641fe81a feat: add PostgresSensitiveKeysRepository 2026-04-14 18:08:45 +02:00
hsiegeln
d72a6511da feat: add SensitiveKeysMerger with case-insensitive union dedup
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-14 18:07:53 +02:00
hsiegeln
86b6c85aa7 feat: add SensitiveKeysConfig record and SensitiveKeysRepository interface 2026-04-14 18:06:12 +02:00
hsiegeln
dcd0b4ebcd fix: use managed assignments for OIDC fallback role paths
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m31s
CI / docker (push) Successful in 1m7s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 38s
The roles-claim and default-roles fallback paths in applyClaimMappings
were using assignRoleToUser (origin='direct'), causing OIDC-derived
roles to accumulate across logins and never be cleared. Changed both
to assignManagedRole (origin='managed') so all OIDC-assigned roles
are cleared and re-evaluated on every login, same as claim mapping
rules. Only roles assigned directly via the admin UI are preserved.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 17:19:20 +02:00
hsiegeln
58e802e2d4 feat: close modal on successful apply, update design spec
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m24s
CI / docker (push) Successful in 1m10s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 39s
Modal auto-closes after Apply succeeds. Design spec updated to reflect
implemented behavior: local-edit-then-apply pattern, target select
dropdowns, amber pill for add-to-group, close-on-success.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 17:12:39 +02:00
hsiegeln
9959e30e1e fix: use --amber DS variable for add-to-group pill color
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 17:09:31 +02:00
hsiegeln
5edefb2180 refactor: switch claim mapping editor to local-edit-then-apply pattern
All edits (add, edit, delete, reorder) now modify local state only.
Cancel discards changes, Apply diffs local vs server and issues the
necessary create/update/delete API calls. Target selects now include
a placeholder option. Footer shows Cancel and Apply buttons.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 17:07:36 +02:00
hsiegeln
0e87161426 feat: use select dropdowns for target role/group in claim mapping editor
Populate target field from existing roles (assign role) or groups
(add to group) instead of free-text input, preventing typos.
Switching action resets the target selection.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 17:02:09 +02:00
hsiegeln
c02fd77c30 fix: use correct DS CSS variables for modal background
Replace non-existent --surface-1/--surface-2 with --bg-raised (modal)
and --bg-hover (subtle backgrounds) from the design system.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 16:59:50 +02:00
hsiegeln
a3ec0aaef3 fix: address code review findings for claim mapping rules editor
- Bump all font sizes from 11px/10px to 12px (project minimum)
- Fix handleMove race condition: use mutateAsync + Promise.all
- Clear stale test results after rule create/edit/delete/reorder
- Replace inline styles with CSS module classes in OidcConfigPage
- Remove dead .editRow CSS class
- Replace inline chevron with Lucide icon

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 16:58:06 +02:00
hsiegeln
3985bb8a43 feat: wire claim mapping rules modal into OIDC config page
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-14 16:51:28 +02:00
hsiegeln
e8a697d185 feat: add claim mapping rules editor modal component
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 16:50:00 +02:00
hsiegeln
344700e29e feat: add React Query hooks for claim mapping rules API
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-14 16:46:40 +02:00
hsiegeln
f110169d54 feat: add POST /test endpoint for claim mapping rule evaluation
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 16:42:54 +02:00
hsiegeln
90ae1d6a14 fix: include properties in hasTrace for ProcessorExecution path
Some checks failed
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m39s
CI / docker (push) Failing after 47s
CI / deploy (push) Has been skipped
CI / deploy-feature (push) Has been skipped
Now that cameleer3-common has getInputProperties/getOutputProperties on
ProcessorExecution, add the check to the processors_json deserialization
path as well.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 14:34:29 +02:00
hsiegeln
05d91c16e7 fix: include properties in hasTrace check for ProcessorRecord paths
Some checks failed
CI / build (push) Successful in 1m57s
CI / cleanup-branch (push) Has been skipped
CI / deploy (push) Has been cancelled
CI / deploy-feature (push) Has been cancelled
CI / docker (push) Has been cancelled
The hasTrace flag on ProcessorNode now also checks inputProperties and
outputProperties on the flat-record code paths (buildTreeBySeq and
buildTreeByProcessorId). The ProcessorExecution path (processors_json)
will be updated once cameleer3-common publishes the new snapshot.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 14:32:18 +02:00
hsiegeln
0827fd21e3 feat: persist and display exchange properties from agent
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m59s
CI / docker (push) Successful in 2m13s
CI / deploy (push) Successful in 58s
CI / deploy-feature (push) Has been skipped
Add support for exchange properties sent by the agent alongside headers.
Properties flow through the same pipeline as headers: ClickHouse columns
(input_properties, output_properties) on both executions and
processor_executions tables, MergedExecution record, ChunkAccumulator
extraction, DetailService snapshot, and REST API response.

UI adds a Properties tab next to Headers in the process diagram detail
panel, with the same input/output split table layout.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 14:23:53 +02:00
hsiegeln
199d0259cd feat: add "+ App" shortcut button to sidebar Applications header
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m20s
CI / docker (push) Successful in 1m12s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 37s
Adds a subtle "+ App" button in the sidebar section header for quick
app creation without navigating to the Deployments tab first. Only
visible to OPERATOR and ADMIN roles.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 09:10:41 +02:00
hsiegeln
ac680b7f3f refactor: prefix all third-party service names with cameleer-
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 2m7s
CI / docker (push) Successful in 1m33s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 1m51s
SonarQube / sonarqube (push) Successful in 3m28s
Rename all Docker/K8s service names, DNS hostnames, secrets, volumes,
and manifest files to use the cameleer- prefix, making it clear which
software package each container belongs to.

Services renamed:
- postgres → cameleer-postgres
- clickhouse → cameleer-clickhouse
- logto → cameleer-logto
- logto-postgresql → cameleer-logto-postgresql
- traefik (service) → cameleer-traefik
- postgres-external → cameleer-postgres-external

Secrets renamed:
- postgres-credentials → cameleer-postgres-credentials
- clickhouse-credentials → cameleer-clickhouse-credentials
- logto-credentials → cameleer-logto-credentials

Volumes renamed:
- pgdata → cameleer-pgdata
- chdata → cameleer-chdata
- certs → cameleer-certs
- bootstrapdata → cameleer-bootstrapdata

K8s manifests renamed:
- deploy/postgres.yaml → deploy/cameleer-postgres.yaml
- deploy/clickhouse.yaml → deploy/cameleer-clickhouse.yaml
- deploy/logto.yaml → deploy/cameleer-logto.yaml

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 22:51:08 +02:00
hsiegeln
fe283674fb fix: use relative asset paths with always-injected <base> tag
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m22s
CI / docker (push) Successful in 1m11s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 43s
Switch Vite base back to './' (relative paths) and always inject
<base href="${BASE_PATH}"> in the entrypoint, even when BASE_PATH=/.

This fixes asset loading for both deployment modes:
- Single-instance: <base href="/"> resolves ./assets/x.js to /assets/x.js
- SaaS tenant: <base href="/t/slug/"> resolves to /t/slug/assets/x.js

Previously base:'/' produced absolute /assets/ paths that the <base>
tag couldn't redirect, breaking SaaS tenants. And base:'./' without
<base> broke deep URLs in single-instance mode. Always injecting the
tag makes relative paths work universally.

The patched server-ui-entrypoint.sh in cameleer-saas (which rewrote
absolute href/src attributes via sed) is no longer needed and can be
removed.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 21:30:00 +02:00
hsiegeln
67e2c1a531 fix: revert relative base path and fix processor table overflow
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m22s
CI / docker (push) Successful in 1m10s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 42s
Revert base: './' back to '/' — relative asset paths break on deep
URLs like /dashboard/app/route where the browser resolves assets to
/dashboard/app/assets/ instead of /assets/.

Also fix processor metrics table clipping: remove flex:1/min-height:0
from .processorSection so the table takes its natural content height
and the page scrolls to show all rows (was clipping at ~12 of 18).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 21:16:59 +02:00
hsiegeln
025e1cfc34 docs: update CLAUDE.md GitNexus stats
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 2m3s
CI / docker (push) Successful in 32s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 40s
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 21:04:59 +02:00
hsiegeln
d942425cb1 fix: use relative base path for Vite assets
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m16s
CI / docker (push) Successful in 1m8s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 37s
When the server-ui is deployed under a subpath (/t/{slug}/), absolute
asset paths (/assets/...) resolve to the domain root instead of the
subpath, causing 404s. Using './' makes asset URLs relative to the
HTML page, so they resolve correctly regardless of mount path.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 20:52:49 +02:00
hsiegeln
882198d59a fix: use lagInFrame instead of lag for ClickHouse compatibility
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m29s
CI / docker (push) Successful in 1m10s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 41s
ClickHouse does not have lag() as a window function. Use lagInFrame()
with explicit ROWS BETWEEN frame instead.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 12:06:25 +02:00
hsiegeln
5edb833d21 chore: remove stats table migration logic from ClickHouseSchemaInitializer
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m48s
CI / docker (push) Successful in 1m9s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 1m44s
Not needed yet -- all deployments are under our control and can be
reset manually if the old schema is encountered.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 11:51:34 +02:00
hsiegeln
3f2392b8f7 refactor: consolidate ClickHouse init.sql as clean idempotent schema
Some checks failed
CI / build (push) Has been cancelled
CI / docker (push) Has been cancelled
CI / deploy (push) Has been cancelled
CI / deploy-feature (push) Has been cancelled
CI / cleanup-branch (push) Has been cancelled
Rewrite init.sql as a pure CREATE IF NOT EXISTS file with no DROP or
INSERT statements. Safe for repeated runs on every startup without
corrupting aggregated stats data.

Old deployments with count()-based stats tables are migrated
automatically: ClickHouseSchemaInitializer checks system.columns for
the old AggregateFunction(count) type and drops those tables before
init.sql recreates them with the correct uniq() schema. This runs
once per table and is a no-op on fresh installs or already-migrated
deployments.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 11:49:53 +02:00
hsiegeln
6b558b649d fix: use absolute asset paths and prevent route-level runtime navigation
Some checks failed
CI / build (push) Has been cancelled
CI / docker (push) Has been cancelled
CI / deploy (push) Has been cancelled
CI / deploy-feature (push) Has been cancelled
CI / cleanup-branch (push) Has been cancelled
Change vite base from './' to '/' so asset paths are absolute. With
relative paths, direct navigation to multi-segment URLs like
/runtime/app/instance resolved assets to /runtime/assets/ which 404'd.

Also fix sidebar navigation: clicking a route while on the runtime tab
no longer navigates to /runtime/{appId}/{routeId} (which the runtime
page interprets as an instanceId). It stays at /runtime/{appId}.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 11:24:15 +02:00
hsiegeln
e57343e3df feat: add delta mode for counter metrics using ClickHouse lag()
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m17s
CI / docker (push) Successful in 1m12s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 42s
Counter metrics like chunks.exported.count are monotonically increasing.
Add mode=delta query parameter to the agent metrics API that computes
per-bucket deltas server-side using ClickHouse lag() window function:
max(value) per bucket, then greatest(0, current - previous) to get the
increase per period with counter-reset handling.

The chunks exported/dropped charts now show throughput per bucket
instead of the ever-increasing cumulative total.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 10:56:06 +02:00
hsiegeln
ae908fb382 fix: deduplicate all stats MVs and preserve loop iterations
All checks were successful
CI / build (push) Successful in 2m25s
CI / cleanup-branch (push) Has been skipped
CI / docker (push) Successful in 1m20s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 1m3s
SonarQube / sonarqube (push) Successful in 3m49s
Extend uniq-based dedup from processor tables to all stats tables
(stats_1m_all, stats_1m_app, stats_1m_route). Execution-level tables
use uniq(execution_id). Processor-level tables now use
uniq(concat(execution_id, toString(seq))) so loop iterations (same
exchange, different seq) are counted while chunk retry duplicates
(same exchange+seq) are collapsed.

All stats tables are dropped, recreated, and backfilled from raw
data on startup. All Java queries updated: countMerge -> uniqMerge,
countIfMerge -> uniqIfMerge.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 23:48:01 +02:00
hsiegeln
1872d46466 fix: remove semicolons from SQL comments that broke schema initializer
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m30s
CI / docker (push) Successful in 1m11s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 50s
The ClickHouseSchemaInitializer splits on semicolons before filtering
comments, so semicolons inside comment text created invalid statements.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 23:17:35 +02:00
hsiegeln
e2f784bf82 fix: deduplicate processor stats using uniq(execution_id)
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m17s
CI / docker (push) Successful in 1m10s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 38s
Processor execution counts were inflated by duplicate inserts into the
plain MergeTree processor_executions table (chunk retries, reconnects).
Replace count()/countIf() with uniq(execution_id)/uniqIf() in both
stats_1m_processor and stats_1m_processor_detail MVs so each exchange
is counted once per processor regardless of duplicates.

Tables are dropped and rebuilt from raw data on startup. MV created
after backfill to avoid double-counting.

Also adds stats_1m_processor_detail to the catalog purge list (was
missing).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 23:12:00 +02:00
hsiegeln
27f2503640 fix: clean up runtime UI and harden session expiry handling
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m27s
CI / docker (push) Successful in 1m13s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 42s
Remove redundant "X/X LIVE" badge from runtime page, breadcrumb trail
and routes section from agent detail page (pills moved into Process
Information card). Fix session expiry: guard against concurrent 401
refresh races and skip re-entrant triggers on auth endpoints.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 22:33:44 +02:00
hsiegeln
ffce3b714f chore: update design system to 0.1.48 (x-axis tick label overlap fix)
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 2m24s
CI / docker (push) Successful in 2m12s
CI / deploy (push) Successful in 54s
CI / deploy-feature (push) Has been skipped
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 22:12:12 +02:00
hsiegeln
d34abeb5cb fix: diagram zoom not working on initial render
All checks were successful
CI / build (push) Successful in 2m10s
CI / cleanup-branch (push) Has been skipped
CI / docker (push) Successful in 1m32s
CI / deploy (push) Successful in 52s
CI / deploy-feature (push) Has been skipped
The wheel event listener was attached in a useEffect with empty deps,
but the SVG element doesn't exist during the loading state. Switch
svgRef from a plain ref to a callback ref that triggers re-attachment
when the SVG element becomes available after data loads.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 22:05:35 +02:00
hsiegeln
aaf9a00d67 fix: replace native SVG tooltip with styled heatmap tooltip overlay
Some checks failed
CI / build (push) Successful in 2m12s
CI / cleanup-branch (push) Has been skipped
CI / deploy (push) Has been cancelled
CI / deploy-feature (push) Has been cancelled
CI / docker (push) Has been cancelled
Renders an HTML tooltip below hovered diagram nodes with processor
metrics (avg, p99, % time, invocations, error rate). Styled inline
with the existing NodeToolbar pattern — positioned via screen-space
coordinates, uses DS tokens for background/border/shadow/typography.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 22:03:05 +02:00
hsiegeln
00c9a0006e feat: rework runtime charts and fix time range propagation
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m28s
CI / docker (push) Successful in 1m10s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 41s
Runtime page (AgentInstance):
- Rearrange charts: CPU, Memory, GC (top); Threads, Chunks Exported,
  Chunks Dropped (bottom). Removes throughput/error charts (belong on
  Dashboard, not Runtime).
- Pass global time range (from/to) to useAgentMetrics — charts now
  respect the time filter instead of always showing last 60 minutes.
- Bottom row (logs + timeline) fills remaining vertical space.

Dashboard L3:
- Processor metrics section fills remaining vertical space.
- Chart x-axis uses timestamps instead of bucket indices.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 21:59:38 +02:00
hsiegeln
98ce7c2204 feat: combine process diagram and processor table into toggled card
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m25s
CI / docker (push) Successful in 1m9s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 40s
Dashboard L3 now shows a single Processor Metrics card with
Diagram/Table toggle buttons. The diagram shows native tooltips on
hover with full processor metrics (avg, p99, invocations, error rate,
% time).

Also fixes:
- Chart x-axis uses actual timestamps instead of bucket indices
- formatDurationShort uses locale formatting with max 3 decimals

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 21:40:43 +02:00
hsiegeln
66248f6b1c fix: accept logs from unregistered agents using JWT claims
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m24s
CI / docker (push) Successful in 1m7s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 38s
After server restart, agents send logs before re-registering. Instead
of dropping these logs, fall back to application and environment from
the JWT token claims. Only drops logs when neither registry nor JWT
provide an applicationId.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 21:29:05 +02:00
hsiegeln
ce8a2a1525 fix: use raw timestamp string for throughput/error chart data
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m22s
CI / docker (push) Successful in 1m10s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 56s
Avoids Date round-trip that crashes with toISOString() on invalid
timestamps from the timeseries API.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 21:25:00 +02:00
hsiegeln
65ed94f0e0 fix: commit DS v0.1.47 dependency update (missed in migration commit)
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m17s
CI / docker (push) Successful in 1m34s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 36s
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 21:19:56 +02:00
hsiegeln
b1e1f789c5 ci: retrigger build (DS v0.1.47 publish race condition)
Some checks failed
CI / cleanup-branch (push) Has been skipped
CI / build (push) Failing after 37s
CI / docker (push) Has been skipped
CI / deploy (push) Has been skipped
CI / deploy-feature (push) Has been skipped
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 21:13:09 +02:00
hsiegeln
0dae1f1cc7 feat: migrate agent charts to ThemedChart + Recharts
Some checks failed
CI / cleanup-branch (push) Has been skipped
CI / docker (push) Has been cancelled
CI / deploy (push) Has been cancelled
CI / deploy-feature (push) Has been cancelled
CI / build (push) Has been cancelled
Replace custom LineChart/AreaChart/BarChart usage with ThemedChart
wrapper. Data format changed from ChartSeries[] to Recharts-native
flat objects. Uses DS v0.1.47.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-12 19:44:55 +02:00
hsiegeln
a0af53f8f5 chore: update design system to 0.1.46 (responsive charts, timestamp tooltips)
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m17s
CI / docker (push) Successful in 1m31s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 42s
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 18:52:53 +02:00
hsiegeln
aa91b867c5 fix: use Date x-values in agent charts for proper time axes
Some checks failed
CI / cleanup-branch (push) Has been skipped
CI / docker (push) Has been cancelled
CI / deploy (push) Has been cancelled
CI / deploy-feature (push) Has been cancelled
CI / build (push) Has been cancelled
All chart series now use Date objects from the API response instead
of integer indices. This gives proper date/time on x-axes and in
tooltips (leveraging DS v0.1.46 responsive charts + timestamp
tooltips). GC chart switched from BarChart to AreaChart for
consistency with Date x-values.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 18:51:18 +02:00
hsiegeln
0b19fe8319 fix: update UI metric names from JMX to Micrometer convention
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 2m34s
CI / docker (push) Successful in 2m3s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 53s
Agent team migrated from JMX to Micrometer metrics. Update the 5
hardcoded metric names in AgentInstance.tsx JVM charts:
- jvm.cpu.process → process.cpu.usage.value
- jvm.memory.heap.used → jvm.memory.used.value
- jvm.memory.heap.max → jvm.memory.max.value
- jvm.threads.count → jvm.threads.live.value
- jvm.gc.time → jvm.gc.pause.total_time

Server backend is unaffected (generic MetricsSnapshot storage).
CLAUDE.md updated with full agent metric name reference.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 18:35:29 +02:00
hsiegeln
dc29afd4c8 docs: add Prometheus metrics reference to CLAUDE.md
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m15s
CI / docker (push) Successful in 3m42s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 43s
Lists all business metrics (gauges, counters, timers) with their
tags and source classes, plus agent container label mapping table.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 18:29:01 +02:00
hsiegeln
6bf7175a6c feat: add Micrometer Prometheus metrics to server
Some checks failed
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 2m36s
CI / deploy (push) Has been cancelled
CI / docker (push) Has been cancelled
CI / deploy-feature (push) Has been cancelled
Adds micrometer-registry-prometheus and exposes /api/v1/prometheus
endpoint (unauthenticated for scraping). ServerMetrics component
provides business metrics beyond default JVM/HTTP:

Gauges: agents by state, SSE connections, buffer depths (execution,
processor, log, metrics), accumulator pending exchanges.

Counters: ingestion drops (buffer_full, no_agent, no_identity),
agent transitions (went_stale, went_dead, recovered), deployment
outcomes (running, failed, degraded), auth failures (invalid_token,
revoked, oidc_rejected).

Timers: ClickHouse flush duration by type, deployment duration.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 18:23:27 +02:00
hsiegeln
caaa1ab0cc feat: add Prometheus docker_sd_configs labels to agent containers
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m26s
CI / docker (push) Successful in 1m12s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 43s
Labels prometheus.scrape, prometheus.path, and prometheus.port are now
set on every deployed container based on the resolved runtime type,
enabling automatic Prometheus service discovery via docker_sd_configs.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 18:00:32 +02:00
hsiegeln
c1fbe1a63a chore: update design system to 0.1.45 (sidebar version styling)
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 2m0s
CI / docker (push) Successful in 1m59s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 42s
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 17:28:29 +02:00
hsiegeln
b32d4adaa5 fix: show empty state for unmanaged apps on Deployments tab
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m19s
CI / docker (push) Successful in 1m7s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 39s
Previously showed an infinite spinner because unmanaged apps have no
PostgreSQL record. Now shows an "Unmanaged Application" message with
a link to create a managed app.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 17:16:18 +02:00
hsiegeln
dadab2b5f7 fix: align payloadCaptureMode default with agent (BOTH, not NONE)
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m26s
CI / docker (push) Successful in 1m13s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 49s
Server defaultConfig() and UI fallbacks returned "NONE" for payload
capture, but the agent defaults to "BOTH". This caused unwanted
reconfiguration when users saved other settings — payload capture
would silently change from the agent's default BOTH to NONE.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 17:12:21 +02:00
hsiegeln
51a4317440 fix: optimistically remove dismissed app from sidebar cache
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m26s
CI / docker (push) Successful in 1m14s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 38s
Sets query cache immediately on dismiss success so the sidebar updates
without waiting for the catalog refetch to complete.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 17:05:04 +02:00
hsiegeln
f84bdebd09 fix: confirm dialog asks user to type app name instead of generic text
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m16s
CI / docker (push) Successful in 1m13s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 38s
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 16:59:26 +02:00
hsiegeln
c10d207d98 fix: add spacing below dismiss alert on Runtime page
Some checks failed
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m20s
CI / deploy (push) Has been cancelled
CI / docker (push) Has been cancelled
CI / deploy-feature (push) Has been cancelled
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 16:58:01 +02:00
hsiegeln
be96336974 feat: add extra Docker networks to container config
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m23s
CI / docker (push) Successful in 1m7s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 38s
Apps can now join additional Docker networks (e.g., monitoring,
prometheus) configured via containerConfig.extraNetworks. Flows through
the 3-layer config merge. Networks are created if absent and containers
are connected during deployment. UI adds a pill-list field on the
Resources tab (both create and edit views).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 16:53:01 +02:00
hsiegeln
5b6543b167 fix: use ConfirmDialog for dismiss, move warning to top of page
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m20s
CI / docker (push) Successful in 1m8s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 41s
Replace window.confirm with design system ConfirmDialog for the dismiss
action. Move the "No agents connected" section to the top of the Runtime
page using Alert component with warning variant.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 16:43:05 +02:00
hsiegeln
5724b8459d docs: document catalog cleanup, log ingestion logging, and catalog config
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m21s
CI / docker (push) Successful in 29s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 41s
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 16:37:46 +02:00
hsiegeln
223a60f374 docs: add orphaned app cleanup design spec
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m19s
CI / docker (push) Successful in 1m8s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 37s
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 16:21:00 +02:00
hsiegeln
90c82238a0 feat: add orphaned app cleanup — auto-filter stale discovered apps, manual dismiss with data purge
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 16:19:59 +02:00
hsiegeln
d161ad38a8 fix: log deserialization failures on log ingestion endpoint
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m23s
CI / docker (push) Successful in 1m4s
CI / deploy (push) Successful in 41s
CI / deploy-feature (push) Has been skipped
Spring's default handler silently returns 400 for malformed payloads
with no server-side log. Added @ExceptionHandler to catch and WARN with
the agent instance ID and root cause message.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 15:33:57 +02:00
hsiegeln
2d3817b296 fix: downgrade successful log ingestion message to DEBUG
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m26s
CI / docker (push) Successful in 1m6s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 38s
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 15:29:38 +02:00
hsiegeln
e55ee93dcf fix: add proper logging to log ingestion endpoint
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m47s
CI / docker (push) Successful in 1m38s
CI / deploy (push) Successful in 42s
CI / deploy-feature (push) Has been skipped
Previously the endpoint silently returned 202 for all failures: missing
agent identity, unregistered agents, empty payloads, and buffer-full
drops. Now logs WARN for each failure case with context (instanceId,
entry count, reason). Normal ingestion logged at INFO with accepted
count. Buffer-full drops tracked individually with accepted/dropped
counts.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 15:20:07 +02:00
hsiegeln
d02a64709c docs: update documentation for runtime type detection feature
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m25s
CI / docker (push) Successful in 1m14s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 39s
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 13:21:43 +02:00
hsiegeln
ee435985a9 feat: add Runtime Type and Custom Arguments fields to deployment Resources tab
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 13:11:59 +02:00
hsiegeln
d5b611cc32 feat: validate runtimeType and customArgs on container config save
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-12 13:08:52 +02:00
hsiegeln
e941256e6e feat: build Docker entrypoint per runtime type with custom args support
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 13:06:54 +02:00
hsiegeln
f66c8b6d18 feat: add runtimeType and customArgs to ResolvedContainerConfig and ConfigMerger
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-12 13:04:21 +02:00
hsiegeln
5e28d20e3b feat: run RuntimeDetector on JAR upload and store detected runtime
After versionRepo.create(), detect the runtime type from the saved JAR
via RuntimeDetector and persist the result via updateDetectedRuntime().
Log messages now include the detected runtime type (or 'unknown').

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-12 13:03:07 +02:00
hsiegeln
f4bbc1f65f feat: add detected_runtime_type and detected_main_class to app_versions
Flyway V10 migration adds the two nullable columns. AppVersion record,
AppVersionRepository interface, and PostgresAppVersionRepository are
updated to carry and persist detected runtime information.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-12 13:01:24 +02:00
hsiegeln
cbf29a5d87 feat: add RuntimeType enum and RuntimeDetector for JAR probing
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-12 12:59:00 +02:00
hsiegeln
51cc2c1d3c ci: retrigger build (cameleer3-common SNAPSHOT updated with source field)
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m15s
CI / docker (push) Successful in 34s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 1m33s
2026-04-12 10:42:44 +02:00
hsiegeln
0603e62a69 fix: revert LogEntry to 7-arg constructor (source is not a ctor param)
Some checks failed
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m11s
CI / deploy (push) Has been cancelled
CI / deploy-feature (push) Has been cancelled
CI / docker (push) Has been cancelled
LogEntry.getSource() exists but source is not a constructor parameter
in cameleer3-common — it uses a default value.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 10:40:48 +02:00
hsiegeln
00115a16ac fix: add source parameter to LogSearchRequest/LogEntry calls in ClickHouseLogStoreIT
Some checks failed
CI / cleanup-branch (push) Has been skipped
CI / build (push) Failing after 58s
CI / docker (push) Has been skipped
CI / deploy (push) Has been skipped
CI / deploy-feature (push) Has been skipped
All constructor calls updated to include the new source field added
in the log forwarding v2 changes.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 10:37:56 +02:00
hsiegeln
4d8df86786 docs: update for log forwarding v2 (source field, wire format)
Some checks failed
CI / cleanup-branch (push) Has been skipped
CI / build (push) Failing after 58s
CI / docker (push) Has been skipped
CI / deploy (push) Has been skipped
CI / deploy-feature (push) Has been skipped
HOWTO.md: log ingestion example updated from LogBatch wrapper to raw
JSON array with source field. CLAUDE.md: added LogIngestionController,
updated LogQueryController with new filters. SERVER-CAPABILITIES.md:
updated log ingestion and query descriptions, ClickHouse table note.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 10:33:03 +02:00
hsiegeln
6b00bf81e3 feat: add log source filter (app/agent) to runtime log viewers
Some checks failed
CI / cleanup-branch (push) Has been skipped
CI / build (push) Failing after 1m7s
CI / docker (push) Has been skipped
CI / deploy (push) Has been skipped
CI / deploy-feature (push) Has been skipped
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-12 10:27:59 +02:00
hsiegeln
b03dfee4f3 feat: log forwarding v2 — accept List<LogEntry>, add source field
Replace LogBatch wrapper with raw List<LogEntry> on the ingestion endpoint.
Add source column to ClickHouse logs table and propagate it through the
storage, search, and HTTP layers (LogSearchRequest, LogEntryResult,
LogEntryResponse, LogQueryController).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-12 10:25:46 +02:00
hsiegeln
4b18579b11 docs: document infrastructureendpoints flag
All checks were successful
CI / build (push) Successful in 1m49s
CI / cleanup-branch (push) Has been skipped
CI / docker (push) Successful in 1m37s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 39s
SonarQube / sonarqube (push) Successful in 3m33s
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-11 23:19:58 +02:00
hsiegeln
c8cdd846c0 feat: fetch server capabilities and hide infra tabs when disabled
Adds a useServerCapabilities hook that fetches /api/v1/health once per
session (staleTime: Infinity) and extracts the infrastructureEndpoints
flag. buildAdminTreeNodes now accepts an opts parameter so ClickHouse
and Database tabs are hidden when the server reports infra endpoints as
disabled. LayoutShell wires the hook result into the admin tree memo.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-11 23:12:30 +02:00
hsiegeln
9de51014e7 feat: expose infrastructureEndpoints flag in health endpoint
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-11 23:10:15 +02:00
hsiegeln
293d11e52b feat: add infrastructureendpoints flag with conditional DB/CH controllers
Add cameleer.server.security.infrastructureendpoints property (default true) and
@ConditionalOnProperty to DatabaseAdminController and ClickHouseAdminController so
the SaaS provisioner can set CAMELEER_SERVER_SECURITY_INFRASTRUCTUREENDPOINTS=false
to suppress these endpoints (404) on tenant server containers.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 23:09:28 +02:00
hsiegeln
ca89a79f8f docs: add infrastructure endpoint visibility implementation plan
14-task plan covering server-side @ConditionalOnProperty flag,
health endpoint capability exposure, UI sidebar filtering,
SaaS provisioner env var, and vendor infrastructure dashboard
with per-tenant PostgreSQL and ClickHouse visibility.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 23:06:41 +02:00
hsiegeln
01b268590d docs: add infrastructure endpoint visibility design spec
Covers restricting DB/ClickHouse admin endpoints in SaaS-managed
server instances via @ConditionalOnProperty flag, and building a
vendor-facing infrastructure dashboard in the SaaS platform with
per-tenant PostgreSQL and ClickHouse visibility.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 22:56:45 +02:00
hsiegeln
7a3256f3f6 docs: update env var and property references to new naming convention
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 2m2s
CI / docker (push) Successful in 32s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 1m35s
HOWTO.md configuration table rewritten with correct cameleer.server.*
property names, grouped by functional area. Removed stale CAMELEER_OIDC_*
env var references. SERVER-CAPABILITIES.md updated with correct env var
names for ingestion and agent registry tuning.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 21:56:19 +02:00
hsiegeln
350e769948 Group container settings under cameleer.server.runtime.container.*
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m21s
CI / docker (push) Successful in 1m2s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 37s
Move container resource defaults into their own sub-namespace for
future extensibility:

  cameleer.server.runtime.container.memorylimit → CAMELEER_SERVER_RUNTIME_CONTAINER_MEMORYLIMIT
  cameleer.server.runtime.container.cpushares   → CAMELEER_SERVER_RUNTIME_CONTAINER_CPUSHARES

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 21:33:07 +02:00
hsiegeln
534e936cd4 Group OIDC settings under cameleer.server.security.oidc.*
Some checks failed
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m59s
CI / deploy (push) Has been cancelled
CI / deploy-feature (push) Has been cancelled
CI / docker (push) Has been cancelled
Move OIDC properties into a nested Oidc class within SecurityProperties
for clearer grouping. Env vars gain an extra separator:

  cameleer.server.security.oidc.issueruri     → CAMELEER_SERVER_SECURITY_OIDC_ISSUERURI
  cameleer.server.security.oidc.jwkseturi     → CAMELEER_SERVER_SECURITY_OIDC_JWKSETURI
  cameleer.server.security.oidc.audience      → CAMELEER_SERVER_SECURITY_OIDC_AUDIENCE
  cameleer.server.security.oidc.tlsskipverify → CAMELEER_SERVER_SECURITY_OIDC_TLSSKIPVERIFY

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 21:30:33 +02:00
hsiegeln
60fb5fe21a Remove vestigial clickhouse.enabled flag
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m19s
CI / docker (push) Successful in 1m4s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 37s
ClickHouse is the only storage backend — there is no alternative.
The enabled flag created a false sense of optionality: setting it to
false would crash on startup because most beans unconditionally depend
on the ClickHouse JdbcTemplate.

Remove all @ConditionalOnProperty annotations gating ClickHouse beans,
the enabled property from application.yml, and the K8s manifest entry.
Also fix old property names in AbstractPostgresIT test config.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 21:27:10 +02:00
hsiegeln
8fe48bbf02 Migrate config to cameleer.server.* naming convention
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m52s
CI / docker (push) Successful in 1m30s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 37s
Move all configuration properties under the cameleer.server.* namespace
with all-lowercase dot-separated names and mechanical env var mapping
(dots→underscores, uppercase). This aligns with the agent's convention
(cameleer.agent.*) and establishes a predictable pattern across all
components.

Changes:
- Move 6 config prefixes under cameleer.server.*: agent-registry,
  ingestion, security, license, clickhouse, and cameleer.tenant/runtime/indexer
- Rename all kebab-case properties to concatenated lowercase
  (e.g., bootstrap-token → bootstraptoken, jar-storage-path → jarstoragepath)
- Update all env vars to CAMELEER_SERVER_* mechanical mapping
- Fix container-cpu-request/container-cpu-shares mismatch bug
- Remove displayName from AgentRegistrationRequest (redundant with instanceId)
- Update agent container env vars to CAMELEER_AGENT_* convention
- Update K8s manifests and CI workflow for new env var names
- Update CLAUDE.md, HOWTO.md, SERVER-CAPABILITIES.md documentation

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 18:10:51 +02:00
hsiegeln
3b95dc777b docs: update CLAUDE.md with route control/replay config, CA import entrypoint
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 2m3s
CI / docker (push) Successful in 36s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 35s
- ResolvedContainerConfig: added routeControlEnabled, replayEnabled
- DeploymentExecutor: documents capability env vars and startup-only nature
- Dockerfile: documents docker-entrypoint.sh CA cert import

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 12:07:26 +02:00
hsiegeln
e37003442a feat: add route control and replay toggles to environment defaults
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m32s
CI / docker (push) Successful in 1m12s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 36s
Admins can now disable route control and replay per environment via the
Default Resource Limits section. Both default to enabled. Apps in the
environment inherit these defaults unless overridden per-app.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 12:01:01 +02:00
hsiegeln
3501f32110 feat: make route control and replay configurable per environment/app
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m16s
CI / docker (push) Successful in 1m4s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 41s
Added routeControlEnabled and replayEnabled to ResolvedContainerConfig,
flowing through the three-layer config merge (global -> env -> app).
Both default to true. Admins can disable them per environment (e.g.
prod) via the defaultContainerConfig JSONB, or per app via the app's
containerConfig JSONB.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 11:56:13 +02:00
hsiegeln
4da81b21ba fix: enable route control and replay capabilities for deployed apps
Some checks failed
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m24s
CI / deploy (push) Has been cancelled
CI / deploy-feature (push) Has been cancelled
CI / docker (push) Has been cancelled
buildEnvVars was missing CAMELEER_ROUTE_CONTROL_ENABLED and
CAMELEER_REPLAY_ENABLED, so deployed app containers defaulted to false
and agents didn't announce these capabilities.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 11:53:49 +02:00
hsiegeln
1b358f2e10 fix: config bar layout — override section's flex-direction to row
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m8s
CI / docker (push) Successful in 1m5s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 37s
The .section base class sets flex-direction: column, which caused the
config bar items (App Log Level, Agent Log Level, etc.) to stack
vertically instead of displaying in a horizontal row.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 11:50:50 +02:00
hsiegeln
1539c7a67b fix: import /certs/ca.pem into JVM truststore at startup
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m13s
CI / docker (push) Successful in 1m3s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 37s
The server container mounts the platform's certs volume at /certs but
the CA bundle was never imported into the JVM truststore. OIDC discovery
failed with PKIX path building errors when a self-signed or custom CA
was in use.

The new entrypoint script splits the PEM bundle and imports each cert
via keytool before starting the app. This makes the conditional
CAMELEER_OIDC_TLS_SKIP_VERIFY logic in the SaaS provisioner work
correctly: when ca.pem exists, the JVM now actually trusts it.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 11:31:26 +02:00
hsiegeln
e9486bd05a feat: allow M2M password resets when OIDC is enabled
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m50s
CI / docker (push) Successful in 1m34s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 40s
The password reset endpoint was fully blocked under OIDC mode. Now
M2M callers (identified by oidc: principal prefix) can reset local
user passwords, enabling the SaaS platform to manage the server's
built-in admin credentials.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 09:46:26 +02:00
hsiegeln
cfc42eaf46 feat: add cameleer.tenant label to deployed app containers
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m48s
CI / docker (push) Successful in 1m32s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 43s
Allows the SaaS platform to identify and clean up all containers
belonging to a tenant on delete (cameleer/cameleer-saas#55).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 09:10:59 +02:00
hsiegeln
1a45235e30 feat: multi-format env var editor for deployment config
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m12s
CI / docker (push) Successful in 1m32s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 38s
Replace simple key-value rows with EnvEditor component that supports
editing variables as Table, Properties, YAML, or .env format.
Switching views converts data seamlessly. Includes file import
(drag-and-drop .properties/.yaml/.env) with auto-detect and merge.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 08:16:09 +02:00
hsiegeln
e9ce828e10 fix: update DS to v0.1.42 — fix double-border on environment selector
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m45s
CI / docker (push) Successful in 2m4s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 38s
SonarQube / sonarqube (push) Successful in 2m24s
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 21:02:07 +02:00
hsiegeln
491bdfe1ff fix: type-safe ExchangeStatus cast in ButtonGroup onChange
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m9s
CI / docker (push) Successful in 1m30s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 37s
Cast the Set<string> from ButtonGroup.onChange to Set<ExchangeStatus>
before iterating, fixing TS2345 from DS TopBar decomposition.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 20:10:08 +02:00
hsiegeln
2863ceef12 refactor: compose TopBar center slot with server-specific controls
Some checks failed
CI / cleanup-branch (push) Has been skipped
CI / build (push) Failing after 52s
CI / docker (push) Has been skipped
CI / deploy (push) Has been skipped
CI / deploy-feature (push) Has been skipped
Update to @cameleer/design-system@0.1.40 which decomposes TopBar into a
composable shell. Move status filters, time range, search trigger, and
auto-refresh toggle from the DS TopBar into LayoutShell as composed
children. Fixes cameleer/cameleer-saas#53.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 17:06:03 +02:00
hsiegeln
f0658cbd07 feat: hardcode Logto org scopes in auth flow, hide from admin UI
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m48s
CI / docker (push) Successful in 1m24s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 39s
Always include urn:logto:scope:organizations and
urn:logto:scope:organization_roles in OIDC auth requests. These are
required for role mapping in multi-tenant setups and harmless for
non-Logto providers (unknown scopes ignored per OIDC spec).

Filter them from the OIDC admin config page so they don't confuse
standalone server admins or SaaS tenants.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 15:37:40 +02:00
hsiegeln
0d610be3dc fix: use OIDC token roles when no claim mapping rules exist
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m29s
CI / docker (push) Successful in 1m15s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 36s
The OIDC callback extracted roles from the token's Custom JWT claim
(e.g. roles: [server:admin]) but never used them. The
applyClaimMappings fallback only assigned defaultRoles (VIEWER).

Now the fallback priority is: claim mapping rules > OIDC token
roles > defaultRoles. This ensures users get their org-mapped
roles (owner → server:admin) without requiring manual claim
mapping rule configuration.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 12:17:12 +02:00
hsiegeln
d238d2bd44 docs: update CLAUDE.md with tenant network isolation model
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m48s
CI / docker (push) Successful in 23s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 39s
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 08:41:54 +02:00
hsiegeln
2ac52d3918 feat: tenant-scoped environment network names
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m27s
CI / docker (push) Successful in 1m9s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 38s
Environment networks now include the tenant ID to prevent cross-tenant
collisions: cameleer-env-{tenantId}-{envSlug} instead of cameleer-env-
{envSlug}. Without this, two tenants with a "dev" environment would
share the same Docker network.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 08:13:47 +02:00
hsiegeln
50e3f1ade6 feat: use configured DOCKER_NETWORK as primary for deployed apps
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m13s
CI / docker (push) Successful in 1m9s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 36s
Instead of hardcoding cameleer-traefik as the primary network for
deployed app containers, use CAMELEER_DOCKER_NETWORK (env var). In
SaaS mode this is the tenant-isolated network (cameleer-tenant-{slug}).
Apps still connect to cameleer-traefik (for routing) and cameleer-env-
{slug} (for intra-environment discovery) as additional networks.

This enables per-tenant network isolation: apps deployed by tenant A
cannot reach apps deployed by tenant B since they share no network.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 08:08:48 +02:00
995d3ca00d Merge pull request 'fix: restore exchange table scroll by adding flex constraints to tableWrap' (#126) from fix/deployments-redirect-path into main
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m14s
CI / docker (push) Successful in 28s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 42s
SonarQube / sonarqube (push) Successful in 3m41s
Reviewed-on: cameleer/cameleer3-server#126
2026-04-09 19:25:00 +02:00
hsiegeln
ca18e58f5e fix: restore exchange table scroll by adding flex constraints to tableWrap
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m24s
CI / docker (push) Successful in 1m32s
CI / cleanup-branch (pull_request) Has been skipped
CI / deploy (push) Has been skipped
CI / build (pull_request) Successful in 1m45s
CI / docker (pull_request) Has been skipped
CI / deploy (pull_request) Has been skipped
CI / deploy-feature (pull_request) Has been skipped
CI / deploy-feature (push) Successful in 48s
The tableSection card wrapper broke the flex height chain — DataTable's
fillHeight couldn't constrain to viewport. Added .tableWrap with
flex: 1, min-height: 0, display: flex to re-establish the chain.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-09 19:20:50 +02:00
b345ac46a1 Merge pull request 'Fix /deployments redirect path (absolute, not relative)' (#125) from fix/deployments-redirect-path into main
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m24s
CI / docker (push) Successful in 25s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 39s
Reviewed-on: cameleer/cameleer3-server#125
2026-04-09 19:14:15 +02:00
hsiegeln
374131b7b5 fix: use absolute path for /deployments redirect
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 2m14s
CI / cleanup-branch (pull_request) Has been skipped
CI / build (pull_request) Successful in 2m4s
CI / docker (pull_request) Has been skipped
CI / docker (push) Successful in 34s
CI / deploy (pull_request) Has been skipped
CI / deploy-feature (pull_request) Has been skipped
CI / deploy (push) Has been skipped
CI / deploy-feature (push) Successful in 26s
The relative `to="apps"` didn't resolve correctly. All other legacy
redirects use absolute paths (`to="/apps"`, `to="/runtime"`).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-09 19:06:48 +02:00
0ac84a10e8 Merge pull request 'UX polish: bug fixes, design consistency, contrast, formatting' (#124) from feature/ux-polish into main
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m24s
CI / docker (push) Successful in 25s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 35s
Reviewed-on: cameleer/cameleer3-server#124
2026-04-09 19:03:53 +02:00
hsiegeln
191d4f39c1 fix: resolve 4 TypeScript compilation errors from CI
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m56s
CI / cleanup-branch (pull_request) Has been skipped
CI / build (pull_request) Successful in 1m58s
CI / docker (pull_request) Has been skipped
CI / deploy (pull_request) Has been skipped
CI / deploy-feature (pull_request) Has been skipped
CI / docker (push) Successful in 1m12s
CI / deploy (push) Has been skipped
CI / deploy-feature (push) Successful in 37s
- AuditLogPage: e.details -> e.detail (correct property name)
- AgentInstance: BarChart x: number -> x: String(i) (BarSeries requires string)
- AppsTab: add missing CatalogRoute import
- Dashboard: wrap MonoText in span for title attribute (MonoText lacks title prop)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-09 18:57:42 +02:00
hsiegeln
4bc38453fe fix: nice-to-have polish — breadcrumbs, close button, status badges
Some checks failed
CI / cleanup-branch (push) Has been skipped
CI / build (push) Failing after 40s
CI / docker (push) Has been skipped
CI / deploy (push) Has been skipped
CI / deploy-feature (push) Has been skipped
CI / cleanup-branch (pull_request) Has been skipped
CI / build (pull_request) Failing after 35s
CI / docker (pull_request) Has been skipped
CI / deploy (pull_request) Has been skipped
CI / deploy-feature (pull_request) Has been skipped
- 7.1: Add deployment status badge (StatusDot + Badge) to AppsTab app
  list, sourced from catalog.deployment.status via slug lookup
- 7.3: Add X close button to top-right of exchange detail right panel
  in ExchangesPage (position:absolute, triggers handleClearSelection)
- 7.5: PunchcardHeatmap shows "Requires at least 2 days of data"
  when timeRangeMs < 2 days; DashboardL1 passes the range down
- 7.6: Command palette exchange results truncate IDs to ...{last8}
  matching the exchanges table display

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-09 18:51:49 +02:00
hsiegeln
9466551044 fix: add unsaved changes banners to edit mode forms
Adds amber edit-mode banners to AppConfigDetailPage and both
DefaultResourcesSection/JarRetentionSection in EnvironmentsPage,
matching the existing ConfigSubTab pattern.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-09 18:47:55 +02:00
hsiegeln
39687bc8a9 fix: fix unicode in roles, add password confirmation field
- RolesTab: wrap \u00b7 in JS expression {'\u00b7'} so JSX renders the middle dot correctly instead of literal backslash-u sequence
- UsersTab: add confirm password field with mismatch validation, hint text for password policy, and reset on cancel/success
- UserManagement.module.css: add .hintText style for password policy hint

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-09 18:46:30 +02:00
hsiegeln
7ec56f3bd0 fix: add shared number formatting utilities (formatMetric, formatCount, formatPercent)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-09 18:43:52 +02:00
hsiegeln
605c8ad270 feat: add CSV export to audit log 2026-04-09 18:43:46 +02:00
hsiegeln
2ede06f32a fix: chart Y-axis auto-scaling, error rate unit, memory reference line, pointer events
- Throughput chart: divide totalCount by bucket duration (seconds) so Y-axis shows true msg/s instead of raw bucket counts; fixes flat-line appearance when TPS is low but totalCount is large
- Error Rate chart: convert failedCount/totalCount to percentage; change yLabel from "err/h" to "%" to match KPI stat card unit
- Memory chart: add threshold line at jvm.memory.heap.max so chart Y-axis extends to max heap and shows the reference line (spec 5.3)
- Agent state: suppress containerStatus badge when value is "UNKNOWN"; only render it with "Container: <state>" label when a non-UNKNOWN secondary state is present (spec 5.4)
- DashboardTab chartGrid: add pointer-events:none with pointer-events:auto on children so the chart grid overlay does not intercept clicks on the Application Health table rows below (spec 5.5)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-09 18:42:10 +02:00
hsiegeln
fb53dc6dfc fix: standardize button order, add confirmation dialogs for destructive actions
- Fix Cancel|Save order and add primary/loading props (AppConfigDetailPage)
- Add AlertDialog before stopping deployments (AppsTab)
- Add ConfirmDialog before deleting taps (TapConfigModal)
- Add AlertDialog before killing queries with toast feedback (DatabaseAdminPage)
- Add AlertDialog before removing roles from users (UsersTab)
- Standardize Cancel button to variant="ghost" (TapConfigModal, RouteDetail)
- Add loading prop to ConfirmDialogs (OidcConfigPage, RouteDetail)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-09 18:39:22 +02:00
hsiegeln
3d910af491 fix: hide empty attributes column, standardize status labels, truncate agent names
- Attributes column is now hidden when no exchanges in the current view
  have attributes; shown conditionally via hasAttributes check on rows
- Status labels already standardized via statusLabel() in ExchangeHeader
- Agent names truncated to last two hyphen-separated segments via
  shortAgentName(); full name preserved as tooltip title

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-09 18:36:06 +02:00
hsiegeln
eadcd160a3 fix: improve duration formatting (Xm Ys) and truncate exchange IDs
- formatDuration and formatDurationShort now show Xm Ys for durations >= 60s (e.g. "5m 21s" instead of "321s") and 1 decimal for 1-60s range ("6.7s" instead of "6.70s")
- Exchange ID column shows last 8 chars with ellipsis prefix; full ID on hover, copies to clipboard on click

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-09 18:34:04 +02:00
hsiegeln
ba0a1850a9 fix: WCAG AA contrast compliance for --text-muted/--text-faint, 12px font floor
Override design system tokens in app root CSS: --text-muted raised to 4.5:1
contrast in both light (#766A5E) and dark (#9A9088) modes; --text-faint dark
mode raised from catastrophic 1.4:1 to 3:1 (#6A6058). Migrate --text-faint
usages on readable text (empty states, italic notes, buttons) to --text-muted.
Raise all 10px and 11px font-size declarations to 12px floor.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-09 18:31:51 +02:00
hsiegeln
b6b93dc3cc fix: prevent admin page redirect during token refresh
adminFetch called logout() directly on 401/403 responses, which cleared
roles and caused RequireAdmin to redirect to /exchanges while users were
editing forms. Now adminFetch attempts a token refresh before failing,
and RequireAdmin tolerates a transient empty-roles state during refresh.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-09 18:28:45 +02:00
hsiegeln
3f9fd44ea5 fix: wrap app config in section cards, replace manual table with DataTable
- Add sectionStyles and tableStyles imports to AppsTab.tsx
- Wrap CreateAppView identity section and each config tab (Monitoring,
  Resources, Variables) in sectionStyles.section cards
- Wrap ConfigSubTab config tabs (Monitoring, Resources, Variables,
  Traces & Taps, Route Recording) in sectionStyles.section cards
- Replace manual <table> in OverviewSubTab with DataTable inside a
  tableStyles.tableSection card wrapper; pre-compute enriched row data
  via useMemo; handle muted non-selected-env rows via inline opacity
- Remove unused .table, .table th, .table td, .table tr:hover td, and
  .mutedRow CSS rules from AppsTab.module.css

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-09 18:28:11 +02:00
hsiegeln
ba53f91f4a fix: standardize table containment and container padding across pages
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-09 18:21:58 +02:00
hsiegeln
be585934b9 fix: show descriptive error when creating local user with OIDC enabled
Return a JSON error body from UserAdminController instead of an empty 400,
and extract API error messages in adminFetch so toasts display the reason.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-09 18:19:10 +02:00
hsiegeln
2771dffb78 fix: add /deployments redirect and fix GC Pauses chart X-axis
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-09 18:16:53 +02:00
hsiegeln
80bc092ec1 Add UX polish implementation plan (19 tasks across 8 batches)
Detailed step-by-step plan covering critical bug fixes, layout/interaction
consistency, WCAG contrast compliance, data formatting, chart fixes, and
admin polish. Each task includes exact file paths, code snippets, and
verification steps.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-09 18:13:41 +02:00
hsiegeln
4ea8bb368a Add UX polish design spec with comprehensive audit findings
Playwright-driven audit of the live UI (build 69dcce2, 60+ screenshots)
covering all pages, CRUD lifecycles, design consistency, and interaction
patterns. Spec defines 8 batches of work: critical bugs, layout
consistency, interaction consistency, contrast/readability, data
formatting, chart fixes, admin polish, and nice-to-have items.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-09 18:00:50 +02:00
hsiegeln
f24a5e5ff0 docs: update CLAUDE.md, audit, and spec for today's changes
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m20s
CI / docker (push) Successful in 27s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 40s
- CLAUDE.md: security (last-admin guard, password policy, brute-force,
  token revocation), environment filtering (queries + commands), Docker
  reconciliation, UI shared patterns, V8/V9 migrations
- UI-CONSISTENCY-AUDIT.md: marked RESOLVED
- UI consistency design spec: marked COMPLETED

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-09 16:54:54 +02:00
hsiegeln
1971c70638 fix: commands respect selected environment
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m19s
CI / docker (push) Successful in 1m4s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 40s
Backend: AgentRegistryService gains findByApplicationAndEnvironment()
and environment-aware addGroupCommandWithReplies() overload.
AgentCommandController and ApplicationConfigController accept optional
environment query parameter. When set, commands only target agents in
that environment. Backward compatible — null means all environments.

Frontend: All command mutations (config update, route control, traced
processors, tap config, route recording) now pass selectedEnv to the
backend via query parameter.

Prevents cross-environment command leakage — e.g., updating config for
prod no longer pushes to dev agents.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-09 16:28:09 +02:00
hsiegeln
69dcce2a8f fix: Runtime tab respects selected environment
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m21s
CI / docker (push) Successful in 1m5s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 41s
- Add environment parameter to AgentEventsController, AgentEventService,
  and ClickHouseAgentEventRepository (filters agent_events by environment)
- Wire selectedEnv to useAgents and useAgentEvents in both AgentHealth
  and AgentInstance pages
- Wire selectedEnv to useStatsTimeseries in AgentInstance

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-09 16:12:33 +02:00
hsiegeln
cb36d7936f fix: auto-compute environment slug + respect environment filter globally
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m26s
CI / docker (push) Successful in 1m6s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 37s
Part A: Environment creation slug is now auto-derived from display name
and shown read-only (matching app creation pattern). Removes manual slug
input.

Part B: All data queries now pass the selected environment to backend:
- Exchanges search, Dashboard L1/L2/L3 stats, Routes metrics, Route
  detail, correlation chains, and processor metrics all filter by
  selected environment.
- Backend RouteMetricsController now accepts environment parameter for
  both route and processor metrics endpoints.

Closes #XYZ

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-09 16:01:50 +02:00
hsiegeln
f95a78a380 fix: add periodic deployment status reconciliation
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m21s
CI / docker (push) Successful in 1m3s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 38s
The DockerEventMonitor only reacted to Docker events. If an event was
missed (e.g., during reconnect or startup race), a DEGRADED deployment
with all replicas healthy would never promote back to RUNNING.

Add a @Scheduled reconciliation (every 30s) that inspects actual
container state and corrects deployment status mismatches.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-09 15:40:18 +02:00
hsiegeln
3f94c98c5b refactor: replace native HTML with design system components (Phase 5)
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m26s
CI / docker (push) Successful in 1m12s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 37s
- EnvironmentSelector: bare <select> -> DS Select
- LogTab: raw <table> + <input> + <button> -> DS LogViewer + Input + Button
- AppsTab: 3 homegrown sub-tab bars -> DS Tabs, remove unused CSS
- AppConfigDetailPage: 4x <select> -> DS Select, 2x <input checkbox> ->
  DS Toggle, 7x <label> -> DS Label, 4x <button> -> DS Button
- AgentHealth: 4x <select> -> DS Select, 7x <button> -> DS Button

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-09 15:22:14 +02:00
hsiegeln
ff62a34d89 refactor: UI consistency — shared CSS, design system colors, no inline styles
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m22s
CI / docker (push) Successful in 1m9s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 36s
Phase 1: Extract 6 shared CSS modules (table-section, log-panel,
rate-colors, refresh-indicator, chart-card, section-card) eliminating
~135 duplicate class definitions across 11 files.

Phase 2: Replace all hardcoded hex colors in CSS modules with design
system variables. Strip ~55 hex fallbacks from var() patterns. Fix 4
undefined variable names (--accent, --bg-base, --surface, --bg-surface-raised).

Phase 3: Replace ~45 hardcoded hex values in ProcessDiagram SVG
components with var() CSS custom properties. Fix Dashboard.tsx color prop.

Phase 4: Create CSS modules for AdminLayout, DatabaseAdminPage,
OidcCallback (previously 100% inline). Extract shared PageLoader
component (replaces 3 copy-pasted spinner patterns). Move AppsTab
static inline styles to CSS classes. Extract LayoutShell StarredList styles.

58 files changed, net -219 lines.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-09 14:55:54 +02:00
hsiegeln
bfed8174ca docs: UI consistency audit and fix design spec
Full audit of design system adoption, color consistency, inline styles,
layout patterns, and CSS module duplication across the server UI.
Includes 6-phase fix plan.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-09 14:45:32 +02:00
hsiegeln
827ba3c798 feat: last-ADMIN guard and password hardening (#87, #89)
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m57s
CI / docker (push) Successful in 1m48s
CI / deploy (push) Successful in 51s
CI / deploy-feature (push) Has been skipped
- Prevent removal of last ADMIN role via role unassign, user delete,
  or group role removal (returns 409 Conflict)
- Add password policy: min 12 chars, 3/4 character classes, no username
- Add brute-force protection: 5 attempts then 15min lockout, IP rate limit
- Add token revocation on password change via token_revoked_before column
- V9 migration adds failed_login_attempts, locked_until, token_revoked_before

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-09 08:58:03 +02:00
hsiegeln
3bf470f83f fix: narrow DEPLOY_STATUS_DOT type to match StatusDotVariant
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m13s
CI / docker (push) Successful in 1m6s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 36s
Fixes pre-existing TS2322 where Record<string, string> was not
assignable to the StatusDotVariant union type.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-09 08:33:38 +02:00
hsiegeln
de46cee440 chore: add GitNexus config to .gitignore and CLAUDE.md
Some checks failed
CI / cleanup-branch (push) Has been skipped
CI / build (push) Failing after 50s
CI / docker (push) Has been skipped
CI / deploy (push) Has been skipped
CI / deploy-feature (push) Has been skipped
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-09 08:30:53 +02:00
hsiegeln
04c90bde06 refactor: extract duplicated utility functions into shared modules
Some checks failed
CI / cleanup-branch (push) Has been skipped
CI / build (push) Failing after 41s
CI / docker (push) Has been skipped
CI / deploy (push) Has been skipped
CI / deploy-feature (push) Has been skipped
Consolidate 20+ duplicate function definitions across UI components into
three shared util files (format-utils, agent-utils, config-draft-utils).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-09 08:28:31 +02:00
hsiegeln
2df5e0d7ba feat: active config snapshot, composite StatusDot with tooltip
Some checks failed
CI / cleanup-branch (push) Has been skipped
CI / build (push) Failing after 43s
CI / docker (push) Has been skipped
CI / deploy (push) Has been skipped
CI / deploy-feature (push) Has been skipped
Part 1 — Config snapshot:
- V8 migration adds resolved_config JSONB to deployments table
- DeploymentExecutor saves the full resolved config at deploy time
- Deployment record includes resolvedConfig for auditability

Part 2 — Composite health StatusDot:
- CatalogController computes composite health from deployment status +
  agent health (green only when RUNNING AND agent live)
- CatalogApp includes healthTooltip (e.g. "Deployment: RUNNING,
  Agents: live (1 connected)")
- StatusDot added to app detail header with deployment status Badge
- StatusDot added to deployment table rows
- Sidebar passes composite health + tooltip through to tree nodes

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-09 08:00:54 +02:00
hsiegeln
7b822a787a feat: show Redeploy button when config changed after deployment
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m20s
CI / docker (push) Successful in 1m5s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 41s
Compare app.updatedAt with deployment.deployedAt — if config was
modified after the deployment started, show a primary "Redeploy" button
in the Actions column. Also show a toast hint after saving config:
"Redeploy to apply changes to running deployments."

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-09 07:41:11 +02:00
hsiegeln
e88db56f79 refactor: CPU config to millicores, fix replica health, reorder tabs
Some checks failed
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m18s
CI / docker (push) Successful in 1m5s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Has been cancelled
- Rename cpuShares to cpuRequest (millicores), cpuLimit from cores to
  millicores. ResolvedContainerConfig translates to Docker-native units
  via dockerCpuShares() and dockerCpuQuota() helpers. Future K8s
  orchestrator can pass millicores through directly.
- Fix waitForAnyHealthy to wait for ALL replicas instead of returning
  on first healthy one. Prevents false DEGRADED status with 2+ replicas.
- Default app detail to Configuration tab (was Overview)
- Reorder config sub-tabs: Monitoring, Resources, Variables, Traces &
  Taps, Route Recording

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-09 07:38:23 +02:00
hsiegeln
eb7cd9ba62 fix: keep sidebar selection when switching tabs
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m10s
CI / docker (push) Successful in 1m8s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 37s
Normalize the sidebar selectedPath so the app highlight persists across
tab switches (Dashboard, Runtime, Deployments). Also make sidebar clicks
tab-aware: clicking an app navigates to the current tab's path instead
of always going to /exchanges/.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-09 07:13:04 +02:00
hsiegeln
b86e95f08e feat: unified catalog endpoint and slug-based app navigation
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m19s
CI / docker (push) Successful in 1m7s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 37s
SonarQube / sonarqube (push) Successful in 3m47s
Consolidate route catalog (agent-driven) and apps table (deployment-
driven) into a single GET /api/v1/catalog?environment={slug} endpoint.
Apps table is authoritative; agent data enriches with live health,
routes, and metrics. Unmanaged apps (agents without App record) appear
with managed=false.

- Add CatalogController merging App records + agent registry + ClickHouse
- Add CatalogApp DTO with deployment summary, managed flag, health
- Change AppController and DeploymentController to accept slugs (not UUIDs)
- Add AppRepository.findBySlug() and AppService.getBySlug()
- Replace useRouteCatalog() with useCatalog() across all UI components
- Navigate to /apps/{slug} instead of /apps/{UUID}
- Update sidebar, search, and all catalog lookups to use slug

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-08 23:43:14 +02:00
hsiegeln
0720053523 docs: add catalog consolidation design spec
Unify route catalog (agent-driven) and apps table (deployment-driven)
into a single catalog endpoint. Apps table becomes authoritative,
agent data enriches with live health/routes. Slug-based URLs replace
UUIDs for navigation.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-08 23:32:18 +02:00
hsiegeln
a4a569a253 fix: improve deployment progress UI and prevent duplicate deployment rows
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m55s
CI / docker (push) Successful in 1m7s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 1m6s
- Redesign DeploymentProgress component: track-based layout with amber
  brand color, checkmarks for completed steps, user-friendly labels
  (Prepare, Image, Network, Launch, Verify, Activate, Live)
- Delete terminal (STOPPED/FAILED) deployments before creating new ones
  for the same app+environment, preventing duplicate rows in the UI
- Update CLAUDE.md with comprehensive key class locations, correct deploy
  stages, database migration reference, and REST endpoint summary

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-08 23:10:59 +02:00
hsiegeln
6288084daf docs: update documentation for Docker orchestration and env var rename
All checks were successful
CI / build (push) Successful in 2m9s
CI / cleanup-branch (push) Has been skipped
CI / docker (push) Successful in 1m41s
CI / deploy (push) Successful in 56s
CI / deploy-feature (push) Has been skipped
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-08 22:09:18 +02:00
hsiegeln
64ebf19ad3 refactor: use CAMELEER_SERVER_URL for agent export endpoint
Some checks failed
CI / docker (push) Has been cancelled
CI / deploy (push) Has been cancelled
CI / deploy-feature (push) Has been cancelled
CI / cleanup-branch (push) Has been cancelled
CI / build (push) Has been cancelled
The runtime-base image and all agent Dockerfiles now read
CAMELEER_SERVER_URL instead of CAMELEER_EXPORT_ENDPOINT.
Updated the volume-mode entrypoint override to match.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-08 22:07:07 +02:00
hsiegeln
20f3dfe59d feat: support Docker volume-based JAR mounting for Docker-in-Docker
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m27s
CI / docker (push) Successful in 1m8s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 36s
When CAMELEER_JAR_DOCKER_VOLUME is set, the orchestrator mounts the
named volume at the jar storage path instead of using a host bind mount.
This solves the path translation issue in Docker-in-Docker setups where
the server runs inside a container and manages sibling containers.

The entrypoint is overridden to use the volume-mounted JAR path via
the CAMELEER_APP_JAR env var.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-08 21:38:34 +02:00
hsiegeln
c923d8233b fix: move network attachment from orchestrator to executor
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m29s
CI / docker (push) Successful in 1m10s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 36s
Docker's connectToNetworkCmd needs the network ID (not name) and the
container's network sandbox must be ready. Moving network connection
to DeploymentExecutor where DockerNetworkManager handles ID resolution
and the container is already started.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-08 21:29:13 +02:00
hsiegeln
c72424543e fix: add client_max_body_size 200m to nginx API proxy
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m22s
CI / docker (push) Successful in 1m13s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 37s
Nginx defaults to 1MB body size, causing 413 on JAR uploads through
the UI proxy. Matches the Spring Boot multipart limit.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-08 21:25:02 +02:00
hsiegeln
18ffbea9db fix: use visually-hidden clip pattern for file inputs
Some checks failed
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m25s
CI / deploy (push) Has been cancelled
CI / deploy-feature (push) Has been cancelled
CI / docker (push) Has been cancelled
The opacity:0 approach caused the native "Choose File" button to
appear in the accessibility tree and compete for clicks. The clip
pattern properly hides the input while keeping it functional for
programmatic .click().

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-08 21:23:05 +02:00
hsiegeln
19da9b9f9f fix: use opacity-based hidden input for file upload instead of display:none
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m25s
CI / docker (push) Successful in 1m14s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 38s
Some browsers block programmatic .click() on display:none inputs.
Using position:absolute + opacity:0 keeps the input in the render tree.
Also added type="button" to prevent any form-submission interference.
Applied to both create page and detail view file inputs.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-08 21:17:50 +02:00
hsiegeln
8b3c4ba2fe feat: routing mode, domain, server URL, SSL offloading on Environments page
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m26s
CI / docker (push) Successful in 1m9s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 36s
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-08 20:35:23 +02:00
hsiegeln
96fbca1b35 feat: replicas column, deploy progress, and new config fields in Deployments UI
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-08 20:33:41 +02:00
hsiegeln
977bfc1c6b feat: DeploymentProgress step indicator component
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-08 20:29:57 +02:00
hsiegeln
7e0536b5b3 feat: update Deployment interface with replicas, stages, new statuses
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-08 20:29:33 +02:00
hsiegeln
6e444a414d feat: add CAMELEER_SERVER_URL config property
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-08 20:28:44 +02:00
hsiegeln
f8d42026da feat: rewrite DeploymentExecutor with staged deploy, config merge, replicas
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-08 20:27:37 +02:00
hsiegeln
fef3ef6184 feat: DockerEventMonitor — persistent event stream for container lifecycle
Listens to Docker daemon events (die, oom, start, stop) for containers
labeled managed-by=cameleer3-server, updates replica states in Postgres,
and recomputes aggregate deployment status (RUNNING/DEGRADED/FAILED).
Bean is wired in RuntimeOrchestratorAutoConfig via instanceof guard so it
only activates when Docker is available.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-08 20:24:03 +02:00
hsiegeln
76eacb17e6 feat: DockerNetworkManager with lazy network creation and container attachment
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-08 20:21:39 +02:00
hsiegeln
3f2fec2815 feat: TraefikLabelBuilder with path-based and subdomain routing
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-08 20:21:16 +02:00
hsiegeln
55bdab472b feat: expand ContainerRequest with cpuLimit, ports, restart policy, additional networks
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-08 20:20:13 +02:00
hsiegeln
b7d00548c5 feat: ResolvedContainerConfig record and three-layer ConfigMerger
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-08 20:18:25 +02:00
hsiegeln
fef0239b1d feat: update PostgresDeploymentRepository for orchestration columns
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-08 20:16:57 +02:00
hsiegeln
6eff271238 feat(core): add orchestration fields to Deployment record
Extends Deployment with targetState, deploymentStrategy, replicaStates
(List<Map<String,Object>>), and deployStage. Updates withStatus() to
carry the new fields through.
2026-04-08 20:15:11 +02:00
hsiegeln
01e0062767 feat(core): expand DeploymentStatus and add DeployStage enum
Adds DEGRADED and STOPPING to DeploymentStatus (reordered for lifecycle
clarity). Introduces DeployStage enum for tracking orchestration progress
through PRE_FLIGHT → COMPLETE.
2026-04-08 20:15:07 +02:00
hsiegeln
0fccdb636f feat(db): add V7 deployment orchestration migration
Adds target_state, deployment_strategy, replica_states (JSONB), and
deploy_stage columns to the deployments table with backfill logic.
2026-04-08 20:15:01 +02:00
hsiegeln
123e66e44d docs: Docker container orchestration implementation plan
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m25s
CI / docker (push) Successful in 25s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 35s
17 tasks covering: migration, domain models, config merger, Traefik
labels, network manager, Docker event monitor, DeploymentExecutor
rewrite, controller updates, and UI changes (progress indicator,
replicas, new config fields).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-08 20:11:12 +02:00
hsiegeln
b196918e70 docs: revert ICC-disabled, use shared traefik network with app-level auth
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m26s
CI / docker (push) Successful in 26s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 39s
ICC=false breaks Traefik routing and agent-server communication.
Switched to shared traefik network (ICC enabled) with app-level
security boundaries. Per-env Traefik networks noted as future option.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-08 20:00:12 +02:00
hsiegeln
dd4442329c docs: add ICC-disabled traefik network isolation to orchestration spec
The cameleer-traefik network disables inter-container communication
so app containers cannot reach each other directly — only through
Traefik. Environment networks keep ICC enabled for intra-env comms.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-08 19:53:51 +02:00
hsiegeln
da6bf694f8 docs: Docker container orchestration design spec
Covers: config merging (3-layer), Traefik label generation (path +
subdomain routing), network topology (infra/traefik/env isolation),
replica management, blue/green and rolling deployment strategies,
Docker event stream monitoring, deployment status state machine
(DEGRADED/STOPPING states), pre-flight checks, and UI changes.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-08 19:48:34 +02:00
hsiegeln
7e47f1628d feat: JAR retention policy with nightly cleanup job
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m23s
CI / docker (push) Successful in 1m9s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 40s
Per-environment "keep last N versions" setting (default 5, null for
unlimited). Nightly scheduled job at 03:00 deletes old versions from
both database and disk, skipping any version that is currently deployed.

Full stack:
- V6 migration: adds jar_retention_count column to environments
- Environment record, repository, service, admin controller endpoint
- JarRetentionJob: @Scheduled nightly, iterates environments and apps
- UI: retention policy editor on admin Environments page with
  toggle between limited/unlimited and version count input
- AppVersionRepository.delete() for version cleanup

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-08 19:06:28 +02:00
hsiegeln
863a992cc4 feat: add default container config editor to Environments admin page
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m20s
CI / docker (push) Successful in 1m8s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 36s
New "Default Resource Limits" section in environment detail view with
memory limit/reserve, CPU shares/limit. These defaults apply to new
apps unless overridden per-app.

Added useUpdateDefaultContainerConfig hook for the PUT endpoint.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-08 18:52:39 +02:00
hsiegeln
0ccb8bc68d feat: extract Variables as first config tab in create and detail views
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m23s
CI / docker (push) Successful in 1m3s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 36s
Environment Variables moved from Resources into a dedicated "Variables"
tab, placed first in the tab order since it's the most commonly needed
config when creating new apps.

Tab order:
- Create page: Variables | Monitoring | Resources
- Detail page: Variables | Monitoring | Traces & Taps | Route Recording | Resources

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-08 18:47:58 +02:00
hsiegeln
0a3733f9ba feat: show live external URL preview instead of slug on create page
Some checks failed
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m19s
CI / docker (push) Successful in 1m7s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Has been cancelled
As the user types the app name, the URL builds in real-time:
  /{envSlug}/{appSlug}/

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-08 18:45:02 +02:00
hsiegeln
056b747c3f feat: replace create-app modal with full creation page at /apps/new
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m20s
CI / docker (push) Successful in 1m7s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 1m6s
Full-page creation flow with:
- Identity section: name, auto-slug, environment, JAR upload, deploy toggle
- Monitoring tab: engine level, payload capture, log levels, metrics,
  sampling, compress success, replay, route control
- Resources tab: memory, CPU, ports, environment variables

Environment variables are configurable before first deploy, addressing
the need to set app-specific config upfront.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-08 18:31:34 +02:00
hsiegeln
0b2d231b6b feat: split config into 4 tabs and fix JAR upload 413
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m19s
CI / docker (push) Successful in 1m5s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 37s
Config sub-tabs are now: Monitoring | Traces & Taps | Route Recording | Resources
(renamed from Agent/Infrastructure, with traces and recording as their own tabs).

Also increase Spring multipart max-file-size and max-request-size to 200MB
to fix HTTP 413 on JAR uploads.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-08 18:22:39 +02:00
hsiegeln
7503641afe chore: remove dead LogsTab and AppConfigPage files
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m13s
CI / docker (push) Successful in 1m5s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 36s
Both replaced by consolidated Deployments tab. ~1300 lines removed.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-08 18:11:05 +02:00
hsiegeln
967156d41b feat: migrate traces/taps and route recording into Deployments config
Some checks failed
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m19s
CI / deploy (push) Has been cancelled
CI / deploy-feature (push) Has been cancelled
CI / docker (push) Has been cancelled
ConfigSubTab now uses inner tabs (Agent / Infrastructure):
- Agent: observability settings, compress success, traces & taps table,
  route recording toggles
- Infrastructure: container resources, exposed ports, environment variables

This completes the Config tab consolidation — all features from the
standalone Config page now live in the Deployments tab.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-08 18:09:12 +02:00
hsiegeln
0a0733def7 refactor: consolidate tabs — remove standalone Logs and Config tabs
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m18s
CI / docker (push) Successful in 1m5s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 42s
Logs functionality already exists in Runtime tab (AgentHealth/AgentInstance).
Config functionality moved to Deployments tab ConfigSubTab.
Old routes redirect to /runtime and /apps respectively.
Navigation links updated throughout.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-08 18:02:29 +02:00
hsiegeln
b7f215e90c feat: add delete confirmation dialog for apps
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m21s
CI / docker (push) Successful in 1m4s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 41s
Prevents accidental app deletion by requiring the user to type the app
slug before confirming.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-08 17:55:37 +02:00
hsiegeln
6a32b83326 feat: single-step app creation with auto-slug, JAR upload, and deploy
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m20s
CI / docker (push) Successful in 1m5s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 37s
Replace inline create form with a modal that handles the full flow:
- Name → auto-computed slug (editable if needed)
- Environment picker
- JAR file upload
- "Deploy immediately" toggle (on by default)
- Single "Create & Deploy" button runs all three API calls sequentially
  with step indicator

After creation, navigates directly to the new app's detail view.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-08 17:48:20 +02:00
hsiegeln
c4fe992179 feat: redesign Deployments tab with Overview + Configuration sub-tabs
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m20s
CI / docker (push) Successful in 1m4s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 37s
Overview sub-tab:
- Deployments table with env badge, version, status, URL, deployed time
- Actions (Start/Stop) scoped to selected environment; other envs show
  "switch env to manage" hint with muted rows
- Versions list with per-env deploy target picker

Configuration sub-tab:
- Read-only by default with Edit mode gate (Cancel/Save banner)
- Agent observability: engine level, payload capture with size unit
  selector, log levels, metrics toggle, sampling, replay and route
  control (default enabled)
- Container resources: memory/CPU limits, exposed ports as deletable
  pills with inline add input
- Environment variables: key-value editor with add/remove
- Reuses existing ApplicationConfig API for agent config push via SSE

Tab renamed from "Apps" to "Deployments" in the tab bar.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-08 17:36:09 +02:00
hsiegeln
01ac47eeb4 chore: update @cameleer/design-system to stable v0.1.39
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m26s
CI / docker (push) Successful in 1m41s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 36s
Replaces snapshot dependency with tagged release.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-08 16:59:20 +02:00
hsiegeln
1c5ecb02e3 fix: make environment list accessible to all authenticated users
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m25s
CI / docker (push) Successful in 1m10s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 38s
The list endpoint on EnvironmentAdminController now overrides the
class-level ADMIN guard with isAuthenticated(), so VIEWERs can see
the environment selector. The LayoutShell merges environments from
both the table and agent heartbeats, so the selector always shows
configured environments even when no agents are connected.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-08 16:50:31 +02:00
hsiegeln
b1b7e142bb fix: remove duplicate updated_at column from V5 migration
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m24s
CI / docker (push) Successful in 1m5s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 37s
apps.updated_at already exists from V3. The duplicate ALTER caused
Flyway to fail on startup.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-08 16:31:06 +02:00
hsiegeln
de4ca10fa5 feat: move Apps from admin to main tab bar with container config
Some checks failed
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m20s
CI / docker (push) Successful in 1m8s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Failing after 2m16s
- Apps tab visible to OPERATOR+ (hidden for VIEWER), scoped by
  sidebar app selection and environment filter
- List view: DataTable with name, environment, updated, created columns
- Detail view: deployments across all envs, version upload with
  per-env deploy target, container config form (resources, ports,
  custom env vars) with explicit Save
- Memory reserve field disabled for non-production environments
  with info hint
- Admin sidebar sorted alphabetically, Applications entry removed
- Old admin AppsPage deleted

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-08 16:23:30 +02:00
hsiegeln
875062e59a feat: add container config to apps and default config to environments
- V5 migration: container_config JSONB + updated_at on apps,
  default_container_config JSONB on environments
- App/Environment records updated with new fields
- PUT /apps/{id}/container-config endpoint for per-app config
- PUT /admin/environments/{id}/default-container-config for env defaults
- GET /apps now supports optional environmentId (lists all when omitted)
- AppRepository.findAll() for cross-environment app listing

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-08 16:18:08 +02:00
hsiegeln
e04dca55aa feat: add Applications admin page with version upload and deployments
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m12s
CI / docker (push) Successful in 1m3s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 41s
- SplitPane layout with environment selector, app list, and detail pane
- Create/delete apps with slug uniqueness validation
- Upload JAR versions with file size display
- Deploy versions and stop running deployments with status badges
- Deployment list auto-refreshes every 5s for live status updates
- Registered at /admin/apps with sidebar entry

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-08 12:24:22 +02:00
hsiegeln
448a63adc9 feat: add About Me dialog showing user info, roles, and groups
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m48s
CI / docker (push) Successful in 1m45s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 37s
- Add GET /api/v1/auth/me endpoint returning current user's UserDetail
- Add AboutMeDialog component with role badges and group memberships
- Add userMenuItems prop to TopBar via design-system update
- Wire "About Me" menu item into user dropdown above Logout

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-08 12:12:29 +02:00
hsiegeln
a8b977a2db fix: include managed role assignments in direct roles query
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m18s
CI / docker (push) Successful in 1m2s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 38s
getDirectRolesForUser filtered on origin='direct', which excluded
roles assigned via claim mapping (origin='managed'). This caused
OIDC users to appear roleless even when claim mappings matched.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-08 11:52:50 +02:00
hsiegeln
529e2c727c fix: apply defaultRoles fallback when no claim mapping rules match
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m19s
CI / docker (push) Successful in 1m3s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 36s
When no claim mapping rules are configured or none match the JWT
claims, fall back to assigning the OidcConfig.defaultRoles (e.g.
VIEWER). This restores the behavior that was lost when syncOidcRoles
was replaced with claim mapping.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-08 11:33:24 +02:00
hsiegeln
9af0043915 feat: add Environment admin UI page
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m19s
CI / docker (push) Successful in 1m6s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 41s
SplitPane with create/edit/delete, production flag toggle,
enabled/disabled toggle. Follows existing admin page patterns.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-08 11:19:05 +02:00
hsiegeln
2e006051bc feat: add production/enabled flags to environments, drop status enum
Environments now have:
- production (bool): prod vs non-prod resource allocation
- enabled (bool): disabled blocks new deployments

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-08 11:16:09 +02:00
hsiegeln
d9160b7d0e fix: allow local login to coexist with OIDC
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m44s
CI / docker (push) Successful in 1m2s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 38s
Local login was blocked when OIDC env vars were present, causing
bootstrap to fail (chicken-and-egg: bootstrap needs local auth to
configure OIDC). The backend now accepts both auth paths; the
frontend/UI decides which login flow to present.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-08 09:09:24 +02:00
hsiegeln
36e8b2d8ff test: add integration tests for runtime management API
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m40s
CI / docker (push) Successful in 4m11s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 37s
- EnvironmentAdminControllerIT: CRUD, access control, default env protection
- AppControllerIT: create, list, JAR upload, viewer access denied
- DeploymentControllerIT: deploy, list, not-found handling
- Fix bean name conflict: rename executor bean to deploymentTaskExecutor

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 23:52:07 +02:00
hsiegeln
3d20d7a0cb feat: add runtime management configuration properties
- JAR storage path, base image, Docker network
- Container memory/CPU limits, health check timeout
- Routing mode and domain for Traefik integration

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 23:47:43 +02:00
hsiegeln
8f2aafadc1 feat: add REST controllers for environment, app, and deployment management
- EnvironmentAdminController: CRUD under /api/v1/admin/environments (ADMIN)
- AppController: CRUD + JAR upload under /api/v1/apps (OPERATOR+)
- DeploymentController: deploy, stop, promote, logs under /api/v1/apps/{appId}/deployments
- Security rule for /api/v1/apps/** requiring OPERATOR or ADMIN role

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 23:47:05 +02:00
hsiegeln
248b716cb9 feat: implement async DeploymentExecutor pipeline
- Async container deployment with health check polling
- Stops previous deployment before starting new one
- Configurable memory, CPU, health timeout via application properties
- @EnableAsync on application class for Spring async proxy

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 23:45:38 +02:00
hsiegeln
b05b7e5597 feat: implement DockerRuntimeOrchestrator with volume-mount JAR deployment
- DockerRuntimeOrchestrator: docker-java based container lifecycle
- DisabledRuntimeOrchestrator: no-op for observability-only mode
- RuntimeOrchestratorAutoConfig: auto-detects Docker socket availability

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 23:44:32 +02:00
hsiegeln
585e078667 feat: implement PostgreSQL repositories for runtime management
- PostgresEnvironmentRepository, PostgresAppRepository
- PostgresAppVersionRepository, PostgresDeploymentRepository
- RuntimeBeanConfig wiring repositories, services, and async executor

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 23:43:35 +02:00
hsiegeln
55068ff625 feat: add EnvironmentService, AppService, DeploymentService
- EnvironmentService: CRUD with slug uniqueness, default env protection
- AppService: CRUD, JAR upload with SHA-256 checksumming
- DeploymentService: create, promote, status transitions

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 23:41:48 +02:00
hsiegeln
17f45645ff feat: add runtime repository interfaces and RuntimeOrchestrator
- EnvironmentRepository, AppRepository, AppVersionRepository, DeploymentRepository
- RuntimeOrchestrator interface with ContainerRequest and ContainerStatus

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 23:41:05 +02:00
hsiegeln
fd2e52e155 feat: add runtime management domain records
- Environment, EnvironmentStatus, App, AppVersion
- Deployment, DeploymentStatus, RoutingMode

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 23:40:39 +02:00
hsiegeln
85530d5ea3 feat: add runtime management database schema (environments, apps, versions, deployments)
- environments, apps, app_versions, deployments tables
- Default environment seeded on migration
- Foreign keys with CASCADE delete

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 23:40:18 +02:00
hsiegeln
32ae642fab chore: add docker-java dependency for runtime orchestration
- docker-java-core 3.4.1
- docker-java-transport-zerodep 3.4.1

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 23:39:57 +02:00
hsiegeln
ec9856d8a2 fix: Ed25519SigningService falls back to ephemeral key when jwt-secret is absent
- SecurityBeanConfig uses Ed25519SigningServiceImpl.ephemeral() when no jwt-secret
- Fixes pre-existing application context failure in integration tests
- Reverts test jwt-secret from application-test.yml (no longer needed)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 23:34:55 +02:00
hsiegeln
847c1f792b test: add integration tests for claim mapping admin API
- ClaimMappingAdminControllerIT with create+list and delete tests
- Add adminHeaders() convenience method to TestSecurityHelper
- Add jwt-secret to test profile (fixes pre-existing Ed25519 init failure)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 23:20:58 +02:00
hsiegeln
ac9ce4f2e7 feat: add ClaimMappingAdminController for CRUD on mapping rules
- ADMIN-only REST endpoints at /api/v1/admin/claim-mappings
- Full CRUD: list, get by ID, create, update, delete
- OpenAPI annotations for Swagger documentation

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 23:16:23 +02:00
hsiegeln
7657081b78 feat: disable local auth when OIDC is configured (resource server mode)
- UiAuthController.login returns 404 when OIDC issuer is configured
- JwtAuthenticationFilter skips internal user tokens in OIDC mode (agents still work)
- UserAdminController.createUser and resetPassword return 400 in OIDC mode

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 23:15:47 +02:00
hsiegeln
b5e85162f8 feat: replace syncOidcRoles with claim mapping evaluation on OIDC login
- OidcUserInfo now includes allClaims map from id_token + access_token
- OidcAuthController.callback() calls applyClaimMappings instead of syncOidcRoles
- applyClaimMappings evaluates rules, clears managed assignments, applies new ones
- Supports both assignRole and addToGroup actions

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 23:13:52 +02:00
hsiegeln
7904a18f67 feat: add origin-aware managed/direct assignment methods to RbacService
- Add clearManagedAssignments, assignManagedRole, addUserToManagedGroup to interface
- Update assignRoleToUser and addUserToGroup to explicitly set origin='direct'
- Update getDirectRolesForUser to filter by origin='direct'
- Implement managed assignment methods with ON CONFLICT upsert

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 23:12:07 +02:00
hsiegeln
67ca1e726f feat: add license admin API for runtime license updates
- GET /api/v1/admin/license returns current license info
- POST /api/v1/admin/license validates and loads new license token
- Requires ADMIN role, validates Ed25519 signature before applying
- OpenAPI annotations for Swagger documentation
2026-04-07 23:12:03 +02:00
hsiegeln
b969075007 feat: add license loading at startup from env var or file
- LicenseBeanConfig wires LicenseGate bean with startup validation
- Supports token from CAMELEER_LICENSE_TOKEN env var or CAMELEER_LICENSE_FILE path
- Falls back to open mode when no license or no public key configured
- Add license config properties to application.yml
2026-04-07 23:11:02 +02:00
hsiegeln
d734597ec3 feat: implement PostgresClaimMappingRepository and wire beans
- JdbcTemplate-based CRUD for claim_mapping_rules table
- RbacBeanConfig wires ClaimMappingRepository and ClaimMappingService beans

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 23:10:38 +02:00
hsiegeln
dd5cf1b38c feat: implement LicenseGate for feature checking
- Thread-safe AtomicReference-based license holder
- Defaults to open mode (all features enabled) when no license loaded
- Runtime license loading with feature/limit queries
- Unit tests for open mode and licensed mode
2026-04-07 23:10:14 +02:00
hsiegeln
e1cb17707b feat: implement ClaimMappingService with equals/contains/regex matching
- Evaluates JWT claims against mapping rules
- Supports equals, contains (list + space-separated), regex match types
- Results sorted by priority
- 7 unit tests covering all match types and edge cases

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 23:09:50 +02:00
hsiegeln
b5cf35ef9a feat: implement LicenseValidator with Ed25519 signature verification
- Validates payload.signature license tokens using Ed25519 public key
- Parses tier, features, limits, timestamps from JSON payload
- Rejects expired and tampered tokens
- Unit tests for valid, expired, and tampered license scenarios
2026-04-07 23:08:04 +02:00
hsiegeln
2f8fcb866e feat: add ClaimMappingRule domain model and repository interface
- AssignmentOrigin enum (direct/managed)
- ClaimMappingRule record with match type and action enums
- ClaimMappingRepository interface for CRUD operations

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 23:07:57 +02:00
hsiegeln
bd78207060 feat: add claim mapping rules table and origin tracking to RBAC assignments
- Add origin and mapping_id columns to user_roles and user_groups
- Create claim_mapping_rules table with match_type and action constraints
- Update primary keys to include origin column
- Add indexes for fast managed assignment cleanup

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 23:07:30 +02:00
hsiegeln
96ba7cd711 feat: add LicenseInfo and Feature domain model
- Feature enum with topology, lineage, correlation, debugger, replay
- LicenseInfo record with tier, features, limits, issuedAt, expiresAt
- Open mode factory method for standalone/dev usage
2026-04-07 23:06:17 +02:00
hsiegeln
c6682c4c9c fix: update package-lock.json for DS v0.1.38
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m10s
CI / docker (push) Successful in 1m33s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 38s
SonarQube / sonarqube (push) Successful in 2m4s
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 11:47:54 +02:00
hsiegeln
6a1d3bb129 refactor: move inline styles to CSS modules
Some checks failed
CI / cleanup-branch (push) Has been skipped
CI / build (push) Failing after 13s
CI / docker (push) Has been skipped
CI / deploy (push) Has been skipped
CI / deploy-feature (push) Has been skipped
Extract inline fontSize/color styles from LogTab, LayoutShell,
UsersTab, GroupsTab, RolesTab, and LevelFilterBar into CSS modules.
Follows project convention of CSS modules over inline styles.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 11:45:02 +02:00
hsiegeln
9cbf647203 chore: update DS to v0.1.38, enforce 12px font size floor
Some checks failed
CI / cleanup-branch (push) Has been skipped
CI / build (push) Failing after 22s
CI / docker (push) Has been skipped
CI / deploy (push) Has been skipped
CI / deploy-feature (push) Has been skipped
Update @cameleer/design-system to v0.1.38 (12px minimum font size).
Replace all 10px and 11px font sizes with 12px across 25 CSS modules
and 5 TSX inline styles to match the new DS floor.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 11:41:51 +02:00
hsiegeln
07f3c2584c fix: syncOidcRoles uses direct roles only, always overwrites
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m19s
CI / docker (push) Successful in 1m0s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 35s
- Expose getDirectRolesForUser on RbacService interface so syncOidcRoles
  compares against directly-assigned roles only, not group-inherited ones
- Remove early-return that preserved existing roles when OIDC returned
  none — now always applies defaultRoles as fallback
- Update CLAUDE.md and SERVER-CAPABILITIES.md to reflect changes

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 10:56:40 +02:00
hsiegeln
ca1b549f10 docs: document OIDC access_token role extraction and audience config
Some checks failed
CI / cleanup-branch (push) Has been skipped
CI / docker (push) Has been cancelled
CI / deploy (push) Has been cancelled
CI / deploy-feature (push) Has been cancelled
CI / build (push) Has been cancelled
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 10:55:01 +02:00
hsiegeln
7d5866bca8 chore: remove debug logging from OidcTokenExchanger
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m16s
CI / docker (push) Successful in 1m2s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 37s
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 10:50:27 +02:00
hsiegeln
f601074e78 fix: include resource parameter in OIDC token exchange request
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m17s
CI / docker (push) Successful in 59s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 37s
Logto returns opaque access tokens unless the resource parameter is
included in both the authorization request AND the token exchange.
Append resource to the token endpoint POST body per RFC 8707 so Logto
returns a JWT access token with Custom JWT claims.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 10:45:44 +02:00
hsiegeln
725f826513 debug: log access_token format to diagnose opaque vs JWT
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m18s
CI / docker (push) Successful in 1m1s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 38s
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 10:39:53 +02:00
hsiegeln
52f5a0414e debug: temporarily log access_token decode failures at WARN level
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m7s
CI / docker (push) Successful in 59s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 37s
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 10:34:15 +02:00
hsiegeln
11fc85e2b9 fix: log access_token claims and audience mismatch during OIDC exchange
Some checks failed
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m13s
CI / deploy (push) Has been cancelled
CI / deploy-feature (push) Has been cancelled
CI / docker (push) Has been cancelled
Helps diagnose whether rolesClaim path matches the actual token structure.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 10:32:34 +02:00
hsiegeln
d4b530ff8a refactor: remove PKCE from OIDC flow (confidential client)
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m16s
CI / docker (push) Successful in 1m2s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 37s
Backend holds client_secret and does the token exchange server-side,
making PKCE redundant. Removes code_verifier/code_challenge from all
frontend auth paths and backend exchange method. Eliminates the source
of "grant request is invalid" errors from verifier mismatches.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 10:22:13 +02:00
hsiegeln
03ff9a3813 feat: generic OIDC role extraction from access token
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m48s
CI / docker (push) Successful in 1m1s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 38s
The OIDC login flow now reads roles from the access_token (JWT) in
addition to the id_token. This fixes role extraction with providers
like Logto that put scopes/roles in access tokens rather than id_tokens.

- Add audience and additionalScopes to OidcConfig for RFC 8707 resource
  indicator support and configurable extra scopes
- OidcTokenExchanger decodes access_token with at+jwt-compatible processor,
  falls back to id_token if access_token is opaque or has no roles
- syncOidcRoles preserves existing local roles when OIDC returns none
- SPA includes resource and additionalScopes in authorization requests
- Admin UI exposes new config fields

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 10:16:52 +02:00
hsiegeln
95eb388283 fix: handle space-delimited scope string in OIDC role extraction
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m6s
CI / docker (push) Successful in 1m12s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 39s
extractRoles() only handled List claims (JSON arrays). When rolesClaim
is configured as "scope", the JWT value is a space-delimited string,
which was silently returning [] and falling back to defaultRoles.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 09:20:37 +02:00
hsiegeln
8852ec1483 feat: add diagnostic logging for OIDC scope and role extraction
Some checks failed
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m7s
CI / docker (push) Has started running
CI / deploy (push) Has been cancelled
CI / deploy-feature (push) Has been cancelled
Logs received scopes, rolesClaim path, extracted roles, and all claim
keys at each stage of the OIDC auth flow to help debug Logto integration.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 09:16:42 +02:00
hsiegeln
23e90d6afb fix: postinstall creates public/ dir before copying favicon
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m8s
CI / docker (push) Successful in 1m20s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 40s
SonarQube / sonarqube (push) Successful in 3m31s
Docker build copies package.json before source, so public/ doesn't
exist when npm ci runs postinstall. Use mkdirSync(recursive:true).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 23:38:43 +02:00
hsiegeln
d19551f8aa chore: auto-sync favicon from DS via postinstall script
Some checks failed
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m7s
CI / docker (push) Failing after 52s
CI / deploy (push) Has been skipped
CI / deploy-feature (push) Has been skipped
favicon.svg is now copied from @cameleer/design-system/assets on
npm install via postinstall hook. Removed from git tracking
(.gitignore). Updates automatically when DS version changes.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 23:25:44 +02:00
hsiegeln
b2e4b91d94 chore: update design system to v0.1.37 (improved SVG logo)
Some checks failed
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m8s
CI / deploy (push) Has been cancelled
CI / deploy-feature (push) Has been cancelled
CI / docker (push) Has been cancelled
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 23:24:12 +02:00
hsiegeln
95b35f6203 fix: make OIDC logout resilient to end-session endpoint failures
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m32s
CI / docker (push) Successful in 1m13s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 37s
Fire end-session via fetch(no-cors) instead of window.location redirect.
Always navigate to /login?local regardless of whether end-session
succeeds, preventing broken JSON responses from blocking logout.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 23:06:56 +02:00
hsiegeln
a443abe6ae refactor: unify all brand icons to single SVG from DS v0.1.36
Some checks failed
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 2m0s
CI / deploy (push) Has been cancelled
CI / deploy-feature (push) Has been cancelled
CI / docker (push) Has been cancelled
Replace PNG favicons and brand logos with cameleer3-logo.svg from
@cameleer/design-system/assets. Favicon, login dialog, and sidebar
all use the same SVG. Remove PNG favicon files from public/.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 23:03:30 +02:00
hsiegeln
a5340059d7 refactor: import brand assets directly from DS v0.1.34
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m16s
CI / docker (push) Successful in 1m30s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 36s
DS now exports ./assets/* — import PNGs directly via Vite instead of
copying to public/. Removes duplicated brand files from public/.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 22:47:31 +02:00
hsiegeln
45cccdbd8a fix: revert to public/ brand assets — DS exports field blocks imports
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m10s
CI / docker (push) Successful in 2m7s
CI / deploy (push) Successful in 51s
CI / deploy-feature (push) Has been skipped
The @cameleer/design-system package.json exports field doesn't include
assets/, causing production build failures. Copy PNGs to public/ and
reference via basePath until DS adds asset exports.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 22:41:20 +02:00
hsiegeln
281e168790 fix: pass commit short hash as version to UI sidebar
Some checks failed
CI / build (push) Failing after 38s
CI / cleanup-branch (push) Has been skipped
CI / docker (push) Has been skipped
CI / deploy (push) Has been skipped
CI / deploy-feature (push) Has been skipped
Add VITE_APP_VERSION build arg to UI Dockerfile, pass short SHA from
CI docker build step. vite.config.ts truncates to 7 chars so both
CI build and Docker build produce consistent short hashes.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 22:37:46 +02:00
hsiegeln
1386e80670 refactor: import brand icons directly from design system
Some checks failed
CI / build (push) Failing after 36s
CI / cleanup-branch (push) Has been skipped
CI / docker (push) Has been skipped
CI / deploy (push) Has been skipped
CI / deploy-feature (push) Has been skipped
Import PNGs via Vite from @cameleer/design-system/assets instead of
copying to public/. Only favicons remain in public/ (needed by HTML).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 22:20:07 +02:00
hsiegeln
f372d0d63c chore: update design system to v0.1.33 (transparent brand icons)
Some checks failed
CI / cleanup-branch (push) Has been skipped
CI / docker (push) Has been cancelled
CI / deploy (push) Has been cancelled
CI / deploy-feature (push) Has been cancelled
CI / build (push) Has been cancelled
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 22:18:26 +02:00
hsiegeln
6ef66a14ec fix: use full-color brand PNGs for login dialog and sidebar
All checks were successful
CI / build (push) Successful in 1m32s
CI / cleanup-branch (push) Has been skipped
CI / docker (push) Successful in 1m44s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 40s
The SVG uses fill=currentColor (inherits text color). Switch to the
full-color PNG brand icons: 192px for login dialog, 48px for sidebar.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 22:10:48 +02:00
hsiegeln
0761d0dbee feat: use design system brand icons for favicon, login, sidebar
Some checks failed
CI / cleanup-branch (push) Has been skipped
CI / docker (push) Has been cancelled
CI / deploy (push) Has been cancelled
CI / deploy-feature (push) Has been cancelled
CI / build (push) Has been cancelled
Replace hand-crafted favicon.svg with official brand assets from
@cameleer/design-system v0.1.32: PNG favicons (16/32px) and
camel-logo.svg for login dialog and sidebar. Update SecurityConfig
public endpoints accordingly. Update documentation for architecture
cleanup (PKCE, OidcProviderHelper, role normalization, K8s hardening,
Dockerfile credential removal, CI deduplication, sidebar path fix).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 22:08:58 +02:00
hsiegeln
0de392ff6e fix: remove securityContext from UI pod — nginx needs root for setup
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m6s
CI / docker (push) Successful in 39s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 41s
The standard nginx image requires root to modify /etc/nginx/conf.d
and create /var/cache/nginx directories during startup.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 22:06:07 +02:00
hsiegeln
c502a42f17 refactor: architecture cleanup — OIDC dedup, PKCE, K8s hardening
Some checks failed
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m6s
CI / docker (push) Successful in 59s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Failing after 2m59s
- Extract OidcProviderHelper for shared discovery + JWK source construction
- Add SystemRole.normalizeScope() to centralize role normalization
- Merge duplicate claim extraction in OidcTokenExchanger
- Add PKCE (S256) to OIDC authorization flow (frontend + backend)
- Add SecurityContext (runAsNonRoot) to all K8s deployments
- Fix postgres probe to use $POSTGRES_USER instead of hardcoded username
- Remove default credentials from Dockerfile
- Extract sanitize_branch() to shared .gitea/sanitize-branch.sh
- Fix sidebar to use /exchanges/ paths directly, remove legacy redirects
- Centralize basePath computation in router.tsx via config module

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 21:57:29 +02:00
hsiegeln
07ff576eb6 fix: prevent SSO re-login loop on OIDC logout
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m12s
CI / docker (push) Successful in 1m1s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 36s
Logout now always redirects to /login?local, either via OIDC
end_session or as a direct fallback, preventing prompt=none
auto-redirect from logging the user back in immediately.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 17:37:35 +02:00
hsiegeln
c249c6f3e0 docs: update Config tab navigation behavior and role gating
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m31s
CI / docker (push) Successful in 13s
CI / deploy (push) Successful in 46s
CI / deploy-feature (push) Has been skipped
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 16:29:20 +02:00
hsiegeln
bb6a9c9269 fix: Config tab sidebar navigation stays on config for app and route clicks
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m11s
CI / docker (push) Successful in 58s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 36s
When on Config tab: clicking an app navigates to /config/:appId (shows
that app's config with detail panel). Clicking a route navigates to
/config/:appId (same app config, since config is per-app not per-route).
Clicking Applications header navigates to /config (all apps table).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 16:13:39 +02:00
hsiegeln
c6a8a4471f fix: always show Config tab and fix 404 on sidebar navigation
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m4s
CI / docker (push) Successful in 57s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 38s
Config tab now always visible (not just when app selected). Shows all-
app config table at /config, single-app detail at /config/:appId.

Fixed 404 when clicking sidebar nodes while on Config tab — the sidebar
navigation built /config/appId/routeId which had no route. Now falls
back to exchanges tab for route-level navigation from config.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 16:10:02 +02:00
hsiegeln
640a48114d docs: document UI role gating for VIEWER/OPERATOR/ADMIN
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m37s
CI / docker (push) Successful in 1m0s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 36s
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 15:52:25 +02:00
hsiegeln
b1655b366e feat: role-based UI access control
Some checks failed
CI / cleanup-branch (push) Has been skipped
CI / docker (push) Has been cancelled
CI / deploy (push) Has been cancelled
CI / deploy-feature (push) Has been cancelled
CI / build (push) Has been cancelled
- Hide Admin sidebar section for non-ADMIN users
- Add RequireAdmin route guard — /admin/* redirects to / for non-admin
- Move App Config from admin section to main Config tab (per-app,
  visible when app selected). VIEWER sees read-only, OPERATOR+ can edit
- Hide diagram node toolbar for VIEWER (onNodeAction conditional)
- Add useIsAdmin/useCanControl helpers to centralize role checks
- Remove App Config from admin sidebar tree

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 15:51:15 +02:00
hsiegeln
e54f308607 docs: add role-based UI access control design spec
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m5s
CI / docker (push) Successful in 10s
CI / deploy (push) Successful in 39s
CI / deploy-feature (push) Has been skipped
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 15:35:59 +02:00
hsiegeln
e69b44f566 docs: document configurable userIdClaim for OIDC
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m5s
CI / docker (push) Successful in 57s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 36s
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 10:20:50 +02:00
hsiegeln
0c77f8d594 feat: add User ID Claim field to OIDC admin config UI
Some checks failed
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m5s
CI / deploy (push) Has been cancelled
CI / deploy-feature (push) Has been cancelled
CI / docker (push) Has been cancelled
New input in the Claim Mapping section lets admins configure which
id_token claim is used as the unique user identifier (default: sub).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 10:19:38 +02:00
hsiegeln
a96cf2afed feat: add configurable userIdClaim for OIDC user identification
Some checks failed
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m12s
CI / docker (push) Has been cancelled
CI / deploy (push) Has been cancelled
CI / deploy-feature (push) Has been cancelled
The OIDC user login ID is now configurable via the admin OIDC setup
dialog (userIdClaim field). Supports dot-separated claim paths (e.g.
'email', 'preferred_username', 'custom.user_id'). Defaults to 'sub'
for backwards compatibility. Throws if the configured claim is missing
from the id_token.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 10:18:03 +02:00
hsiegeln
549dbaa322 docs: document OIDC role sync on every login
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m8s
CI / docker (push) Successful in 40s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 35s
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 10:11:49 +02:00
hsiegeln
f4eafd9a0f feat: sync OIDC roles on every login, not just first
Some checks failed
CI / cleanup-branch (push) Has been skipped
CI / docker (push) Has been cancelled
CI / deploy (push) Has been cancelled
CI / deploy-feature (push) Has been cancelled
CI / build (push) Has been cancelled
Roles from the id_token's rolesClaim are now diffed against stored
system roles on each OIDC login. Missing roles are added, revoked
roles are removed. Group memberships (manually assigned) are never
touched. This propagates scope revocations from the OIDC provider
on next user login.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 10:11:06 +02:00
hsiegeln
4e12fcbe7a docs: document server:-prefixed scopes and case-insensitive role mapping
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m7s
CI / docker (push) Successful in 40s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 37s
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 10:06:11 +02:00
hsiegeln
9c2e6aacad feat: support server:-prefixed scopes and case-insensitive role mapping
Some checks failed
CI / docker (push) Has been cancelled
CI / deploy (push) Has been cancelled
CI / deploy-feature (push) Has been cancelled
CI / cleanup-branch (push) Has been cancelled
CI / build (push) Has been cancelled
M2M scope mapping now accepts both 'server:admin' and 'admin' (case-
insensitive). OIDC user login role assignment strips the 'server:'
prefix before looking up SystemRole, so 'server:viewer' from the
id_token maps to VIEWER correctly.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 10:05:13 +02:00
hsiegeln
c757a0ea51 fix: replace last hardcoded paths with BASE_PATH-aware alternatives
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m13s
CI / docker (push) Successful in 57s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 36s
- index.html: change /src/main.tsx to ./src/main.tsx (relative, respects
  <base> tag)
- AgentRegistrationController: derive SSE endpoint URL from request
  context via ServletUriComponentsBuilder instead of hardcoding /api/v1

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 09:53:00 +02:00
hsiegeln
9a40626a27 fix: include BASE_PATH and ?local in OIDC post-logout redirect URI
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m4s
CI / docker (push) Successful in 57s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 37s
Without BASE_PATH the redirect fails behind a reverse proxy. Adding
?local prevents the SSO auto-redirect from immediately signing the
user back in after logout.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 09:45:46 +02:00
hsiegeln
4496be08bd docs: document SSO auto-redirect, consent handling, and auto-signup
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m11s
CI / docker (push) Successful in 10s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 34s
SonarQube / sonarqube (push) Successful in 3m36s
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 01:45:45 +02:00
hsiegeln
e8bcc39ca9 fix: add ES384 to OidcTokenExchanger JWT algorithm list
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m13s
CI / docker (push) Successful in 43s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 40s
Logto signs id_tokens with ES384 by default. SecurityConfig already
included it but OidcTokenExchanger only had RS256 and ES256.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 01:37:22 +02:00
hsiegeln
94bfb8fc4a fix: Back to Login button navigates to /login?local to prevent auto-redirect
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m8s
CI / docker (push) Successful in 59s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 37s
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 01:31:57 +02:00
hsiegeln
c628c25081 fix: handle consent_required by retrying OIDC without prompt=none
Some checks failed
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m36s
CI / deploy (push) Has been cancelled
CI / docker (push) Has been cancelled
CI / deploy-feature (push) Has been cancelled
When prompt=none fails with consent_required (scopes not yet granted),
retry the OIDC flow without prompt=none so the user can grant consent
once. Uses sessionStorage flag to prevent infinite loops — falls back
to local login if the retry also fails.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 01:29:31 +02:00
hsiegeln
3cea306e17 feat: auto-redirect to OIDC provider for true SSO
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m51s
CI / docker (push) Successful in 2m37s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 54s
When OIDC is configured, the login page automatically redirects to the
provider with prompt=none. If the user has an active OIDC session, they
are signed in without seeing a login page. If the provider returns
login_required (no session), falls back to the login form via ?local.
Users can bypass auto-redirect with /login?local.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 01:20:55 +02:00
hsiegeln
4244dd82e9 fix: use BASE_PATH for favicon references in subpath deployments
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m6s
CI / docker (push) Successful in 57s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 36s
Hardcoded /favicon.svg paths skip the <base> tag and fail when served
from a subpath like /server/. Now uses config.basePath in TSX and a
relative href in index.html so the <base> tag resolves correctly.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 01:17:17 +02:00
hsiegeln
d7001804f7 fix: permit branding endpoints without authentication
Some checks failed
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m7s
CI / deploy (push) Has been cancelled
CI / deploy-feature (push) Has been cancelled
CI / docker (push) Has been cancelled
The login page loads the branding logo before the user is signed in.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 01:15:21 +02:00
hsiegeln
5c4c7ad321 fix: include BASE_PATH in OIDC redirect_uri for subpath deployments
Some checks failed
CI / cleanup-branch (push) Has been skipped
CI / build (push) Has started running
CI / docker (push) Has been cancelled
CI / deploy (push) Has been cancelled
CI / deploy-feature (push) Has been cancelled
Behind a reverse proxy with strip-prefix (e.g., Traefik at /server/),
the OIDC redirect_uri must include the prefix so the callback routes
back through the proxy. Now uses config.basePath (from <base href>)
instead of hardcoding '/'.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 01:14:34 +02:00
hsiegeln
0fab20e67a fix: append .well-known/openid-configuration to issuerUri in token exchanger
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m12s
CI / docker (push) Successful in 40s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 38s
OidcTokenExchanger fetched the discovery document from the issuerUri
as-is, but the database stores the issuer URI (e.g. /oidc), not the
full discovery URL. Logto returns 404 for the bare issuer path.
SecurityConfig already appended the well-known suffix — now the token
exchanger does the same.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 01:04:57 +02:00
hsiegeln
d7563902a7 fix: read oidcTlsSkipVerify at call time instead of caching in constructor
Some checks failed
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m13s
CI / docker (push) Successful in 42s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Has been cancelled
OidcTokenExchanger cached securityProperties.isOidcTlsSkipVerify() in
the constructor as a boolean field. If Spring constructed the bean
before property binding completed, the cached value was false even when
the env var was set. SecurityConfig worked because it read the property
at call time. Now OidcTokenExchanger stores the SecurityProperties
reference and reads the flag on each call, matching SecurityConfig's
pattern.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 01:02:36 +02:00
hsiegeln
99e2a8354f fix: handle HTTPS redirects in InsecureTlsHelper for OIDC discovery
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m13s
CI / docker (push) Successful in 42s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 38s
Java's automatic redirect following creates new connections that do NOT
inherit custom SSLSocketFactory/HostnameVerifier. This caused the OIDC
discovery fetch to fail on redirect even with TLS_SKIP_VERIFY=true.
Now disables auto-redirect and follows manually with SSL on each hop.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 00:51:49 +02:00
hsiegeln
083cb8b9ec feat: add CAMELEER_CORS_ALLOWED_ORIGINS for multi-origin CORS support
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m7s
CI / docker (push) Successful in 41s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 37s
Behind a reverse proxy the browser sends Origin matching the proxy's
public URL, which the single-origin CAMELEER_UI_ORIGIN rejects.
New env var accepts comma-separated origins and takes priority over
UI_ORIGIN, which remains as a backwards-compatible fallback.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 00:41:00 +02:00
hsiegeln
0609220cdf docs: add CAMELEER_OIDC_TLS_SKIP_VERIFY to all documentation
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m14s
CI / docker (push) Successful in 11s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 35s
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 00:30:18 +02:00
hsiegeln
ca92b3ce7d feat: add CAMELEER_OIDC_TLS_SKIP_VERIFY to bypass cert verification for OIDC
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m7s
CI / docker (push) Successful in 43s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 36s
Self-signed CA certs on the OIDC provider (e.g. Logto behind a reverse
proxy) cause the login flow to fail because Java's truststore rejects
the connection. This adds an opt-in env var that creates a trust-all
SSLContext scoped to OIDC HTTP calls only (discovery, token exchange,
JWKS fetch) without affecting system-wide TLS.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 00:26:40 +02:00
hsiegeln
7ebbc18b31 fix: make API calls respect BASE_PATH for subpath deployments
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m7s
CI / docker (push) Successful in 57s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 37s
config.apiBaseUrl now derives from <base> tag when no explicit config
is set (e.g., /server/api/v1 instead of /api/v1). commands.ts authFetch
prepends apiBaseUrl and uses relative paths.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 00:04:52 +02:00
hsiegeln
5b7c92848d fix: remove path-rewriting sed that doubled BASE_PATH in <base> tag
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m43s
CI / docker (push) Successful in 1m9s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 36s
The second sed matched the just-injected <base href="/server/"> and
rewrote it to <base href="/server/server/">. Since Vite builds with
base: './' (relative paths), the <base> tag alone is sufficient.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 23:52:30 +02:00
hsiegeln
44f3821df4 docs: add CAMELEER_OIDC_JWK_SET_URI to all documentation
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m40s
CI / docker (push) Successful in 12s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 38s
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 22:58:05 +02:00
hsiegeln
51abe45fba feat: add BASE_PATH env var for serving UI from a subpath
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m9s
CI / docker (push) Successful in 1m4s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 37s
When BASE_PATH is set (e.g., /server/), the entrypoint script injects
a <base> tag and rewrites asset paths in index.html. React Router reads
the basename from the <base> tag. Vite builds with relative paths.
Default / for standalone mode (no changes).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 21:04:28 +02:00
hsiegeln
3c70313d78 feat: add CAMELEER_OIDC_JWK_SET_URI for direct JWKS fetching
Some checks failed
CI / cleanup-branch (push) Has been skipped
CI / docker (push) Has been cancelled
CI / deploy (push) Has been cancelled
CI / deploy-feature (push) Has been cancelled
CI / build (push) Has been cancelled
When set, fetches JWKs from this URL directly instead of discovering
from the OIDC well-known endpoint. Needed when the public issuer URL
(e.g., https://domain.com/oidc) isn't reachable from inside containers
but the internal URL (http://logto:3001/oidc/jwks) is.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 21:02:51 +02:00
hsiegeln
12bb734c2d fix: use tcpSocket probe for logto-postgresql instead of pg_isready
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m4s
CI / docker (push) Successful in 39s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 35s
pg_isready without -U defaults to OS user "root" which doesn't exist
as a PostgreSQL role, causing noisy log entries.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 13:44:59 +02:00
hsiegeln
cbeaf30bc7 fix: move PG_USER/PG_PASSWORD before DB_URL in logto.yaml
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m3s
CI / docker (push) Successful in 40s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 1m9s
K8s $(VAR) substitution only resolves env vars defined earlier in the
list. PG_USER and PG_PASSWORD must come before DB_URL.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 13:39:50 +02:00
hsiegeln
c4d2fa90ab docs: clarify Logto proxy setup and ENDPOINT/ADMIN_ENDPOINT semantics
Some checks failed
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m3s
CI / docker (push) Successful in 42s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Failing after 3m15s
LOGTO_ENDPOINT and LOGTO_ADMIN_ENDPOINT are public-facing URLs that
Logto uses for OIDC discovery, issuer URI, and redirects. When behind
a reverse proxy (e.g., Traefik), set these to the external URLs.
Logto requires its own subdomain (not a path prefix).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 13:31:17 +02:00
hsiegeln
e9ef97bc20 docs: add Logto OIDC resource server spec and implementation plan
Some checks failed
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m33s
CI / docker (push) Successful in 3m13s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Has been cancelled
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 13:25:24 +02:00
hsiegeln
eecb0adf93 docs: replace Authentik with Logto, document OIDC resource server
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-05 13:15:09 +02:00
hsiegeln
c47b8b9998 ci: replace Authentik with Logto in deployment pipeline 2026-04-05 13:12:38 +02:00
hsiegeln
22d812d832 feat: replace Authentik with Logto K8s deployment 2026-04-05 13:12:01 +02:00
hsiegeln
fec6717a85 feat: update default rolesClaim to 'roles' for Logto compatibility
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 13:10:53 +02:00
hsiegeln
3bd07c9b07 feat: add OIDC resource server support with JWKS discovery and scope-based roles
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-05 13:10:08 +02:00
hsiegeln
a5c4e0cead feat: add spring-boot-starter-oauth2-resource-server and OIDC properties
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-05 13:06:53 +02:00
hsiegeln
de85cdf5a2 fix: let SPRING_DATASOURCE_URL fully control datasource connection
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m5s
CI / docker (push) Successful in 41s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 37s
SonarQube / sonarqube (push) Successful in 3m26s
Explicit spring.datasource.url in YAML takes precedence over the env var,
causing deployed containers to connect to localhost instead of the postgres
service. Now the YAML uses ${SPRING_DATASOURCE_URL:...} so the env var
wins when set. Flyway inherits from the datasource (no separate URL).
Removed CAMELEER_DB_SCHEMA — schema is part of the datasource URL.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-04 23:24:22 +02:00
hsiegeln
2277a0498f fix: set CAMELEER_DB_SCHEMA=public for existing main deployment
Some checks failed
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m1s
CI / docker (push) Successful in 41s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Has been cancelled
Existing deployment has tables in public schema. The new tenant_default
default breaks startup because Flyway sees an empty schema. Override to
public for backward compat; new deployments use the tenant-derived default.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-04 23:21:17 +02:00
hsiegeln
ac87aa6eb2 fix: derive PG schema from tenant ID instead of defaulting to public
Some checks failed
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m6s
CI / docker (push) Successful in 43s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Failing after 2m17s
Schema now defaults to tenant_${cameleer.tenant.id} (e.g. tenant_default,
tenant_acme) instead of public. Flyway create-schemas: true ensures the
schema is auto-created on first startup. CAMELEER_DB_SCHEMA env var still
available as override for feature branch isolation. Removed hardcoded
public schema from K8s base and main overlay.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-04 21:46:57 +02:00
hsiegeln
f16d331621 docs: add SERVER-CAPABILITIES.md for SaaS integration reference
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m8s
CI / docker (push) Successful in 10s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 36s
Comprehensive standalone document covering API surface, agent protocol,
security, storage, multi-tenancy, deployment, and configuration — designed
for external systems (like the SaaS orchestration layer) that need to
understand and manage Cameleer3 Server instances.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-04 20:30:42 +02:00
hsiegeln
69055f7d74 fix: persist environment selection in Zustand store instead of URL params
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m5s
CI / docker (push) Successful in 57s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 36s
Environment selector was losing its value on navigation because URL search
params were silently dropped by navigate() calls. Moved to a Zustand store
with localStorage persistence so the selection survives navigation, page
refresh, and new tabs. Switching environment now resets all filters, clears
URL params, invalidates queries, and remounts pages via Outlet key. Also
syncs openapi.json schema with running backend.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-04 17:12:16 +02:00
hsiegeln
37eb56332a fix: use environmentId from heartbeat body for auto-heal
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m5s
CI / docker (push) Successful in 42s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 37s
HeartbeatRequest now carries environmentId (cameleer3-common update).
Auto-heal prefers the heartbeat value (most current) over the JWT
claim, ensuring agents recover their correct environment immediately
on the first heartbeat after server restart.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-04 16:21:55 +02:00
hsiegeln
72ec87a3ba fix: persist environment in JWT claims for auto-heal recovery
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m4s
CI / docker (push) Successful in 1m7s
CI / deploy (push) Successful in 45s
CI / deploy-feature (push) Has been skipped
Add 'env' claim to agent JWTs (set at registration, carried through
refresh). Auto-heal on heartbeat/SSE now reads environment from the
JWT instead of hardcoding 'default', so agents retain their correct
environment after server restart.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-04 16:12:25 +02:00
hsiegeln
346e38ee1d fix: update DS to v0.1.31, simplify env selector styles
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m4s
CI / docker (push) Successful in 1m23s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 37s
DS v0.1.31 changes .env wrapper to neutral button style matching
other TopBar controls. Simplified selector CSS to inherit all
font/color properties from the wrapper.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-04 16:01:58 +02:00
hsiegeln
39d9ec9cd6 fix: restyle environment selector to match DS TopBar pill
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m4s
CI / docker (push) Successful in 1m9s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 44s
Make the select transparent (no border, no background) so it
inherits the DS .env pill styling (success-colored badge with
mono font). Negative margins compensate for the pill padding.
Dropdown chevron uses currentColor to match the pill text.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-04 15:53:09 +02:00
hsiegeln
08f2a01057 fix: always show environment selector in TopBar
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m4s
CI / docker (push) Successful in 1m12s
CI / deploy (push) Successful in 44s
CI / deploy-feature (push) Has been skipped
Use unfiltered agent query to discover environments (avoids circular
filter). Always show selector even with single environment so it's
visible as a label. Default to ['default'] when no agents connected.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-04 15:47:48 +02:00
hsiegeln
574f82b731 docs: add historical implementation plans
Some checks failed
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m7s
CI / docker (push) Successful in 37s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Has been cancelled
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-04 15:45:49 +02:00
hsiegeln
c2d4d38bfb feat: move environment selector into TopBar (DS v0.1.30)
Some checks failed
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m6s
CI / deploy (push) Has been cancelled
CI / deploy-feature (push) Has been cancelled
CI / docker (push) Has been cancelled
Update @cameleer/design-system to v0.1.30 which accepts ReactNode
for the environment prop. Move EnvironmentSelector from standalone
div into TopBar, rendering between theme toggle and user menu.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-04 15:43:43 +02:00
hsiegeln
694d0eef59 feat: add environment filtering across all APIs and UI
Some checks failed
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m8s
CI / deploy (push) Has been cancelled
CI / deploy-feature (push) Has been cancelled
CI / docker (push) Has been cancelled
Backend: Added optional `environment` query parameter to catalog,
search, stats, timeseries, punchcard, top-errors, logs, and agents
endpoints. ClickHouse queries filter by environment when specified
(literal SQL for AggregatingMergeTree, ? binds for raw tables).
StatsStore interface methods all accept environment parameter.

UI: Added EnvironmentSelector component (compact native select).
LayoutShell extracts distinct environments from agent data and
passes selected environment to catalog and agent queries via URL
search param (?env=). TopBar shows current environment label.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-04 15:42:26 +02:00
hsiegeln
babdc1d7a4 docs: update CLAUDE.md with multitenancy architecture
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m8s
CI / docker (push) Successful in 11s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 35s
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-04 15:14:38 +02:00
hsiegeln
a188308ec5 feat: implement multitenancy with tenant isolation + environment support
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m8s
CI / docker (push) Successful in 42s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 1m25s
Adds configurable tenant ID (CAMELEER_TENANT_ID env var, default:
"default") and environment as a first-class concept. Each server
instance serves one tenant with multiple environments.

Changes across 36 files:
- TenantProperties config bean for tenant ID injection
- AgentInfo: added environmentId field
- AgentRegistrationRequest: added environmentId field
- All 9 ClickHouse stores: inject tenant ID, replace hardcoded
  "default" constant, add environment to writes/reads
- ChunkAccumulator: configurable tenant ID + environment resolver
- MergedExecution/ProcessorBatch/BufferedLogEntry: added environment
- ClickHouse init.sql: added environment column to all tables,
  updated ORDER BY (tenant→time→env→app), added tenant_id to
  usage_events, updated all MV GROUP BY clauses
- Controllers: pass environmentId through registration/auto-heal
- K8s deploy: added CAMELEER_TENANT_ID env var
- All tests updated for new signatures

Closes #123

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-04 15:00:18 +02:00
hsiegeln
ee7226cf1c docs: multitenancy architecture design spec
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m4s
CI / docker (push) Successful in 10s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 36s
Covers tenant isolation (1 tenant = 1 server instance), environment
support (first-class agent property), ClickHouse partitioning
(tenant → time → environment → application), PostgreSQL schema-per-
tenant via JDBC currentSchema, and agent protocol changes.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-04 14:37:00 +02:00
hsiegeln
7429b85964 feat: show route control bar on topology diagram
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m8s
CI / docker (push) Successful in 59s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 38s
When no exchange is selected, the topology-only diagram now shows
the RouteControlBar above it (if the agent supports routeControl
or replay and the user has OPERATOR/ADMIN role). This fixes a gap
where suspended routes with no recent exchanges had no way to be
resumed from the UI.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-04 13:49:28 +02:00
hsiegeln
a5c07b8585 docs: update CLAUDE.md with heartbeat capabilities restoration
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m28s
CI / docker (push) Successful in 10s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 36s
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-04 13:31:33 +02:00
hsiegeln
45a74075a1 feat: restore agent capabilities from heartbeat after server restart
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m8s
CI / docker (push) Successful in 40s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 37s
The heartbeat now carries capabilities (per protocol v2 update).
On each heartbeat, capabilities are updated in the agent registry.
On auto-heal (server restart), capabilities from the heartbeat
are used instead of empty Map.of(), so the agent's feature flags
(replay, routeControl, logForwarding, etc.) are restored
immediately on the first heartbeat.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-04 13:19:15 +02:00
hsiegeln
abed4dc96f security: fix SQL injection in ClickHouse query escaping
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m48s
CI / docker (push) Successful in 1m6s
CI / deploy (push) Successful in 47s
CI / deploy-feature (push) Has been skipped
Convert ClickHouseUsageTracker and ClickHouseMetricsQueryStore to
use JDBC parameterized queries (? binds) — these query raw tables
without AggregateFunction columns.

Fix lit(String) in RouteMetricsController and ClickHouseStatsStore
to escape backslashes before single quotes. Without this, an input
like \' breaks out of the string literal in ClickHouse (where \
is an escaped backslash). These must remain as literal SQL because
the ClickHouse JDBC 0.9.x driver wraps PreparedStatement in
sub-queries that strip AggregateFunction types, breaking -Merge
combinators.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-04 12:17:12 +02:00
hsiegeln
170b2c4a02 fix: run sonar:sonar in same reactor as verify
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m5s
CI / docker (push) Successful in 10s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 35s
Running mvn sonar:sonar as a separate invocation skips child
modules. Combining verify and sonar:sonar in a single mvn
command ensures the reactor processes all modules.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-04 11:57:05 +02:00
hsiegeln
66e91ba18c fix: remove explicit sonar.sources/tests from mvn sonar:sonar
All checks were successful
CI / build (push) Successful in 2m0s
CI / cleanup-branch (push) Has been skipped
CI / docker (push) Successful in 14s
CI / deploy (push) Successful in 40s
CI / deploy-feature (push) Has been skipped
Maven sonar plugin auto-detects sources and tests from the POM
module structure. Passing sonar.sources as CLI args caused path
doubling (module-dir/module-dir/src) in multi-module projects.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-04 11:13:47 +02:00
hsiegeln
e30b561dfe fix: use mvn sonar:sonar instead of standalone sonar-scanner
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m31s
CI / docker (push) Successful in 14s
CI / deploy (push) Successful in 44s
CI / deploy-feature (push) Has been skipped
The standalone sonar-scanner CLI has Java discovery issues in the
build container. Switch to the Maven sonar plugin (same approach
as cameleer3 agent repo), which uses Maven's own JDK. This also
removes the sonar-scanner download/install step entirely.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-04 11:07:49 +02:00
hsiegeln
5ae94e1e2c fix: set SONAR_SCANNER_JAVA_HOME for sonar-scanner 6.x
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m42s
CI / docker (push) Successful in 15s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 48s
sonar-scanner 6.x checks SONAR_SCANNER_JAVA_HOME, not JAVA_HOME.
Despite JAVA_HOME being correct and java being on PATH, the scanner
uses its own env var for Java discovery.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-04 11:04:03 +02:00
hsiegeln
7dca8f2609 fix: derive JAVA_HOME from jar binary and add to PATH
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m16s
CI / docker (push) Successful in 13s
CI / deploy (push) Successful in 50s
CI / deploy-feature (push) Has been skipped
java binary may not be on PATH directly in the build container.
Derive JAVA_HOME from the jar binary location (which we know works)
and prepend JAVA_HOME/bin to PATH so sonar-scanner can find java.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-04 10:59:45 +02:00
hsiegeln
2589c681c5 fix: derive JAVA_HOME for sonar-scanner in CI workflow
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m53s
CI / docker (push) Successful in 14s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 35s
sonar-scanner 6.x requires JAVA_HOME or java on PATH. The build
container has Java installed but doesn't export JAVA_HOME, so
derive it from the java binary location.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-04 10:05:18 +02:00
hsiegeln
352fa43ef8 fix: add chmod +x for sonar-scanner binary after jar extraction
All checks were successful
CI / build (push) Successful in 2m5s
CI / cleanup-branch (push) Has been skipped
CI / docker (push) Successful in 10s
CI / deploy (push) Successful in 51s
CI / deploy-feature (push) Has been skipped
jar xf doesn't preserve Unix file permissions from zip entries,
so the sonar-scanner binary lacks the execute bit.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-04 09:57:48 +02:00
hsiegeln
b04b12220b fix: resolve 25 SonarQube code smells across 21 files
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m2s
CI / docker (push) Successful in 45s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 38s
Remove unused fields (log, rbacService, roleRepository, jwt),
unused variables (agentTps, routeKeys, updated), unused imports
(HttpHeaders, JdbcTemplate). Rename restricted identifier 'record'
to 'auditRecord'/'event'. Return empty collections instead of null.
Replace .collect(Collectors.toList()) with .toList(). Simplify
conditional return in BootstrapTokenValidator.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-04 09:36:13 +02:00
hsiegeln
633a61d89d perf: batch processor and log inserts to reduce ClickHouse part creation
Some checks failed
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m7s
CI / docker (push) Successful in 39s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 1m2s
SonarQube / sonarqube (push) Failing after 1m58s
Diagnostics showed ~3,200 tiny inserts per 5 minutes:
- processor_executions: 2,376 inserts (14 rows avg) — one per chunk
- logs: 803 inserts (5 rows avg) — synchronous in HTTP handler

Fix 1: Consolidate processor inserts — new insertProcessorBatches() method
flattens all ProcessorBatch records into a single INSERT per flush cycle.

Fix 2: Buffer log inserts — route through WriteBuffer<BufferedLogEntry>,
flushed on the same 5s interval as executions. LogIngestionController now
pushes to buffer instead of inserting directly.

Also reverts async_insert config (doesn't work with JDBC inline VALUES).

Expected: ~3,200 inserts/5min → ~160 (20x reduction in part creation,
MV triggers, and background merge work).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-03 22:48:04 +02:00
hsiegeln
e0aac4bf0a perf: enable ClickHouse async_insert to batch small inserts server-side
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m6s
CI / docker (push) Successful in 41s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 43s
Diagnostics showed 3,200 tiny inserts per 5 minutes (processor_executions:
2,376 at 14 rows avg, logs: 803 at 5 rows avg), each creating a new part
and triggering MV aggregations + background merges. This was the root cause
of ~400m CPU usage at 3 tx/s.

async_insert=1 with 5s busy timeout lets ClickHouse buffer incoming inserts
and consolidate them into fewer, larger parts before writing to disk.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-03 22:33:48 +02:00
hsiegeln
ac94a67a49 fix: reduce ClickHouse CPU by increasing flush interval, rename LIVE→AUTO labels
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m7s
CI / docker (push) Successful in 1m24s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 41s
- Increase ingestion flush interval from 500ms to 5000ms to reduce MV merge storms
- Reduce ClickHouse background_schedule_pool_size from 8 to 4
- Rename LIVE/PAUSED badge labels to AUTO/MANUAL across all pages
- Update design system to v0.1.29

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-03 22:05:29 +02:00
hsiegeln
e1cb9d7872 fix: extract snapshot data from chunks, reduce ClickHouse log noise
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m11s
CI / docker (push) Successful in 41s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 37s
- ChunkAccumulator now extracts inputBody/outputBody/inputHeaders/outputHeaders
  from ExecutionChunk.inputSnapshot/outputSnapshot instead of storing empty strings
- Set ClickHouse server log level to warning (was trace by default)
- Update CLAUDE.md to document Ed25519 key derivation

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-03 17:58:54 +02:00
hsiegeln
a9ec424d52 fix: derive Ed25519 signing key from JWT secret, no DB storage
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m11s
CI / docker (push) Successful in 42s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 37s
Replace DB-persisted keypair with deterministic derivation from
CAMELEER_JWT_SECRET via HMAC-SHA256 seed + seeded SHA1PRNG KeyPairGenerator.
Same secret = same key pair across restarts, no private key in the database.

Closes #121

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-03 17:18:43 +02:00
hsiegeln
81f13396a0 fix: persist Ed25519 signing key to survive server restarts
All checks were successful
CI / build (push) Successful in 2m8s
CI / cleanup-branch (push) Has been skipped
CI / docker (push) Successful in 50s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 54s
The keypair was generated ephemerally on each startup, causing agents
to reject all commands after a server restart (signature mismatch).
Now persisted to PostgreSQL server_config table and restored on startup.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-03 17:13:40 +02:00
hsiegeln
670e458376 fix: update ITs to use consolidated init.sql, remove dead code
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m23s
CI / docker (push) Successful in 1m29s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 50s
- All 7 ClickHouse integration tests now load init.sql via shared
  ClickHouseTestHelper instead of deleted V1-V11 migration files
- Remove unused useScope exports (setApp, setRoute, setExchange, clearScope)
- Remove unused CSS classes (monoCell, punchcardStack)
- Update ui/README.md DS version to v0.1.28

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-03 17:03:54 +02:00
hsiegeln
d4327af6a4 refactor: consolidate ClickHouse schema into single init.sql, cache diagrams
All checks were successful
CI / build (push) Successful in 2m2s
CI / cleanup-branch (push) Has been skipped
CI / docker (push) Successful in 51s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 37s
- Merge all V1-V11 migration scripts into one idempotent init.sql
- Simplify ClickHouseSchemaInitializer to load single file
- Replace route_diagrams projection with in-memory caches:
  hashCache (routeId+instanceId → contentHash) warm-loaded on startup,
  graphCache (contentHash → RouteGraph) lazy-populated on access
- Eliminates 9M+ row scans on diagram lookups

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-03 15:24:53 +02:00
hsiegeln
bb3e1e2bc3 fix: set deduplicate_merge_projection_mode for ReplacingMergeTree projection
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m8s
CI / docker (push) Successful in 42s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 37s
ClickHouse 24.12 requires this setting before adding projections to
ReplacingMergeTree tables. Using 'drop' mode which discards the projection
during deduplication merges and rebuilds it afterward.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-03 15:14:56 +02:00
hsiegeln
984bb2d40f fix: sort ClickHouse migration scripts by numeric version prefix
All checks were successful
CI / build (push) Successful in 2m32s
CI / cleanup-branch (push) Has been skipped
CI / docker (push) Successful in 55s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 52s
Alphabetical sort put V10/V11 before V2-V9 ("V11" < "V1_" in ASCII),
causing the route_diagrams projection to run before the table existed.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-03 15:06:56 +02:00
hsiegeln
6f00ff2e28 fix: reduce ClickHouse log noise, admin query spam, and diagram scan perf
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m7s
CI / docker (push) Successful in 1m25s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 40s
- Set com.clickhouse log level to INFO and org.apache.hc.client5 to WARN
- Admin hooks (useUsers/useGroups/useRoles) now only fetch on admin pages,
  eliminating AUDIT view_users entries on every UI click
- Add ClickHouse projection on route_diagrams for (tenant_id, route_id,
  instance_id, created_at) to avoid full table scans on diagram lookups
- Bump @cameleer/design-system to v0.1.28 (PAUSED mode time range fix,
  refreshTimeRange API)
- Call refreshTimeRange before invalidateQueries in PAUSED mode manual
  refresh so sidebar clicks use current time window

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-03 14:48:30 +02:00
hsiegeln
2708bcec17 fix: first exchange click doesn't highlight selected row
All checks were successful
CI / build (push) Successful in 1m47s
CI / cleanup-branch (push) Has been skipped
CI / docker (push) Successful in 1m45s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 53s
On first click, Dashboard was in non-split mode. The click set
selectedId locally then triggered split view, which remounted
Dashboard — losing the selectedId state.

Added activeExchangeId prop passed from ExchangesPage so the
selection survives the remount. Also syncs via useEffect when
parent changes selection (e.g. correlated exchange navigation).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-03 11:28:26 +02:00
hsiegeln
901dfd1eb8 fix: PAUSED mode disabled queries entirely instead of just polling
Some checks failed
CI / build (push) Successful in 1m49s
CI / cleanup-branch (push) Has been skipped
CI / deploy (push) Has been cancelled
CI / deploy-feature (push) Has been cancelled
CI / docker (push) Has been cancelled
useLiveQuery returned enabled:false when paused, which prevented
queries from running at all. Changed to enabled:true always —
PAUSED now means "fetch once, no polling" instead of "don't fetch".

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-03 11:25:04 +02:00
hsiegeln
726e77bb91 docs: update all documentation for session changes
Some checks failed
CI / build (push) Successful in 2m2s
CI / cleanup-branch (push) Has been skipped
CI / docker (push) Has been cancelled
CI / deploy (push) Has been cancelled
CI / deploy-feature (push) Has been cancelled
CLAUDE.md:
- Agent registry auto-heal note (in-memory, JWT fallback)
- Usage analytics (ClickHouse usage_events table)

HOWTO.md:
- Architecture diagram: added deploy-demo (NodePort 30092) and cameleer-demo namespace
- Access URLs: added Deploy Demo
- Agent registry: server restart resilience documentation
- Route control: CommandGroupResponse note

ui/README.md:
- Fixed outdated generate-api command
- Added DS version (v0.1.26)
- Fixed VITE_API_TARGET (30081 not 30090)
- Added key features section (cmd-k, LIVE mode, route control, event icons)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-03 11:22:44 +02:00
hsiegeln
d30c267292 fix: route catalog missing routes after server restart
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m20s
CI / docker (push) Successful in 52s
CI / deploy (push) Successful in 54s
CI / deploy-feature (push) Has been skipped
After server restart, auto-healed agents register with empty
routeIds. The catalog only looked at agent registry for routes,
so routes and counts disappeared.

Now merges route IDs from ClickHouse stats_1m_route into the
catalog. Also includes apps that only exist in ClickHouse data
(no agent currently registered). Routes and exchange counts
survive server restarts.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-03 11:14:27 +02:00
hsiegeln
37c10ae0a6 feat: manual refresh on sidebar navigation when LIVE mode is off
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m9s
CI / docker (push) Successful in 57s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 37s
When autoRefresh is disabled, sidebar clicks now invalidate all
queries (queryClient.invalidateQueries()), triggering a re-fetch.
This gives users "click to refresh" behavior instead of stale data.

When LIVE mode is on, queries already poll at intervals, so no
invalidation is needed.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-03 11:01:29 +02:00
hsiegeln
c16f0e62ed fix: clicking Applications header navigates back to all apps
Some checks failed
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m31s
CI / docker (push) Successful in 1m22s
CI / deploy (push) Failing after 2m26s
CI / deploy-feature (push) Has been skipped
When the Applications section is already expanded, clicking the
header now navigates to /{tab} (all applications) instead of
collapsing. When collapsed, clicking expands as before.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-03 10:49:54 +02:00
hsiegeln
2bc3efad7f fix: agent auth, heartbeat, and SSE all break after server restart
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m9s
CI / docker (push) Successful in 41s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 37s
Three related issues caused by in-memory agent registry being empty
after server restart:

1. JwtAuthenticationFilter rejected valid agent JWTs if agent wasn't
   in registry — now authenticates any valid JWT regardless

2. Heartbeat returned 404 for unknown agents — now auto-registers
   the agent from JWT claims (subject, application)

3. SSE endpoint returned 404 — same auto-registration fix

JWT validation result is stored as a request attribute so downstream
controllers can extract the application claim for auto-registration.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-03 10:41:23 +02:00
hsiegeln
0632f1c6a8 fix: agent token refresh returns 404 after server restart
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m8s
CI / docker (push) Successful in 42s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 1m23s
The refresh endpoint required the agent to exist in the in-memory
registry. After server restart the registry is empty, so all refresh
attempts got 404. The refresh token itself is self-contained with
subject, application, and roles — the registry lookup is optional.

Now uses application from the JWT, falling back to registry only
if the agent happens to be registered.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-03 10:37:57 +02:00
hsiegeln
bdac363e40 fix: active queries list always showed itself
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m8s
CI / docker (push) Successful in 40s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 36s
The system.processes query was returning its own row. Added
filter: query NOT LIKE '%system.processes%'

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-03 10:33:47 +02:00
hsiegeln
d9615204bf fix: admin pages not scrollable (content clipped by overflow:hidden)
Some checks failed
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m8s
CI / docker (push) Successful in 1m0s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Has been cancelled
AdminLayout was a plain div with padding but no scroll. The parent
<main> has overflow:hidden, so admin page content beyond viewport
height was clipped. Added flex:1, overflow:auto, minHeight:0 to
make AdminLayout a proper scroll container.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-03 10:31:04 +02:00
hsiegeln
2896bb90a9 fix: usage events never flushed to ClickHouse
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m11s
CI / docker (push) Successful in 42s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 1m7s
UsageFlushScheduler was a @Component with @ConditionalOnBean, but
ClickHouseUsageTracker is created via @Bean — component scan runs
first, so the condition always evaluated false. Events accumulated
in the WriteBuffer but flush() was never called.

Moved scheduler to @Bean in StorageBeanConfig with the same
@ConditionalOnProperty guard as the tracker.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-03 10:07:13 +02:00
hsiegeln
a036d8a027 docs: spec for cameleer-deploy-demo prototype
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-03 00:13:04 +02:00
hsiegeln
44a37317d1 fix: cmd-k context key for tab reset and Enter-to-navigate on admin pages
Some checks failed
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m10s
CI / docker (push) Successful in 1m27s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 51s
SonarQube / sonarqube (push) Failing after 2m22s
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 23:53:09 +02:00
hsiegeln
146398b183 feat: RBAC page reads cmd-k navigation state for tab switch and highlight
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-02 23:42:18 +02:00
hsiegeln
69ca52b25e feat: handle admin cmd-k selection with tab navigation state 2026-04-02 23:38:06 +02:00
hsiegeln
111bcc302d feat: build admin search data for cmd-k on admin pages
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-02 23:34:52 +02:00
hsiegeln
cf36f81ef1 chore: bump @cameleer/design-system to v0.1.26 2026-04-02 23:33:00 +02:00
hsiegeln
28f38331cc docs: implementation plan for context-aware cmd-k search
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 23:27:07 +02:00
hsiegeln
394fde30c7 docs: spec for context-aware cmd-k search
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 23:21:53 +02:00
hsiegeln
62b5c56c56 feat: event-type icons for agent event feeds
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m7s
CI / docker (push) Successful in 1m0s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 53s
Icons now reflect event type (UserPlus for registration, Skull
for dead, HeartPulse for recovery, Route for state changes, etc.)
while severity still drives the color. Updated in both
AgentInstance and AgentHealth pages.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 23:06:01 +02:00
hsiegeln
9b401558a5 fix: make disabled route control buttons visually distinct
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m6s
CI / docker (push) Successful in 59s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 41s
Disabled buttons now show reduced opacity (0.35) and muted icon
color instead of just changing the cursor.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 22:58:46 +02:00
hsiegeln
38b76513c7 feat: route control buttons reflect current route state
Some checks failed
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m8s
CI / deploy (push) Has been cancelled
CI / deploy-feature (push) Has been cancelled
CI / docker (push) Has been cancelled
Buttons are disabled based on route state: Started disables
Start/Resume, Stopped disables Stop/Suspend/Resume, Suspended
disables Start/Suspend. State looked up from catalog API.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 22:56:49 +02:00
hsiegeln
2265ebf801 chore: bump @cameleer/design-system to v0.1.25
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m9s
CI / docker (push) Successful in 1m23s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 1m12s
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 22:47:53 +02:00
hsiegeln
20af81a5dc feat: show server version in sidebar header
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m19s
CI / docker (push) Successful in 1m30s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 1m39s
Version injected at build time via VITE_APP_VERSION env var.
CI sets it to branch@sha. Falls back to 'dev' in local dev.
Displayed next to "Cameleer" in the sidebar header.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 22:42:06 +02:00
hsiegeln
d819f88ae4 fix: starred routes not showing — starKey prefix mismatch
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m11s
CI / docker (push) Successful in 1m1s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 53s
collectStarredItems used 'app:' prefix for route keys but
buildAppTreeNodes uses 'route:' prefix. Routes were starred
but never matched in the starred section.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 22:36:28 +02:00
hsiegeln
5880abdd93 fix: keep admin section in place, don't move to top
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m13s
CI / docker (push) Successful in 57s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 39s
Admin section stays in its fixed position (after Starred, before
Footer). Entering admin mode collapses Applications and Starred
but does not reorder sections.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 22:32:53 +02:00
hsiegeln
b676450995 fix: simplify sidebar to Applications + Starred + Admin footer
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m7s
CI / docker (push) Successful in 59s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 42s
Remove Agents and Routes sections from sidebar. Layout is now:
Header (camel logo + Cameleer) → Search → Applications section →
Starred section (when items exist) → Footer (Admin + API Docs).

Admin accordion: clicking Admin navigates to /admin/rbac and
expands Admin section at top while collapsing Applications and
Starred. Clicking Applications exits admin mode.

Removed buildAgentTreeNodes and buildRouteTreeNodes from
sidebar-utils (no longer needed).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 22:29:44 +02:00
hsiegeln
e495b80432 fix: increase ClickHouse pool size and reduce flush interval
All checks were successful
CI / build (push) Successful in 1m49s
CI / cleanup-branch (push) Has been skipped
CI / docker (push) Successful in 2m10s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 43s
Pool was hardcoded to 10 connections serving 7 concurrent write
streams + UI reads, causing "too many simultaneous queries" and
WriteBuffer overflow. Pool now defaults to 50 (configurable via
clickhouse.pool-size), flush interval reduced from 1000ms to 500ms.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 22:11:15 +02:00
hsiegeln
45eab761b7 chore: bump @cameleer/design-system to v0.1.24
Some checks failed
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 2m3s
CI / deploy (push) Has been cancelled
CI / deploy-feature (push) Has been cancelled
CI / docker (push) Has been cancelled
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 22:06:13 +02:00
hsiegeln
8d899cc70c refactor: use HeartbeatRequest from cameleer3-common
Some checks failed
CI / docker (push) Has been cancelled
CI / deploy (push) Has been cancelled
CI / deploy-feature (push) Has been cancelled
CI / cleanup-branch (push) Has been cancelled
CI / build (push) Has been cancelled
Replace local HeartbeatRequest DTO with the shared model from
cameleer3-common. Message types exchanged between server and agent
belong in the common module.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 22:05:26 +02:00
hsiegeln
520b80444a feat(#119): accept route states in heartbeat and state-change events
Some checks failed
CI / cleanup-branch (push) Has been skipped
CI / build (push) Failing after 34s
CI / docker (push) Has been skipped
CI / deploy (push) Has been skipped
CI / deploy-feature (push) Has been skipped
Replace ACK-based route state inference with agent-reported state.
Heartbeats now carry optional routeStates map, and ROUTE_STATE_CHANGED
events update the registry immediately.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 21:45:13 +02:00
hsiegeln
17aff5ef9d docs: route state protocol extension spec
Defines two backward-compatible mechanisms for accurate route state
tracking: heartbeat extension (routeStates map in heartbeat body)
and ROUTE_STATE_CHANGED events for real-time updates. Covers
agent-side detection via Camel EventNotifier, server-side handling,
multi-agent conflict resolution, and migration path.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 19:26:38 +02:00
hsiegeln
b714d3363f feat(#119): expose route state in catalog API and sidebar/dashboard
Some checks failed
CI / cleanup-branch (push) Has been skipped
CI / build (push) Failing after 29s
CI / docker (push) Has been skipped
CI / deploy (push) Has been skipped
CI / deploy-feature (push) Has been skipped
Add routeState field to RouteSummary DTO (null for started, 'stopped'
or 'suspended' for non-default states). Sidebar shows stop/pause icons
and state badge for affected routes in both Apps and Routes sections.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 19:15:46 +02:00
hsiegeln
0acceaf1a9 feat(#119): add RouteStateRegistry for tracking route operational state
In-memory registry that infers route state (started/stopped/suspended)
from successful route-control command ACKs. Updates state only when all
agents in a group confirm success.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 19:15:35 +02:00
hsiegeln
ca1d472b78 feat(#117): agent-count toasts and persistent error toast dismiss
Some checks failed
CI / cleanup-branch (push) Has been skipped
CI / build (push) Failing after 30s
CI / docker (push) Has been skipped
CI / deploy (push) Has been skipped
CI / deploy-feature (push) Has been skipped
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 19:08:00 +02:00
hsiegeln
c3b4f70913 feat(#116): update command hooks for synchronous group response
Some checks failed
CI / cleanup-branch (push) Has been skipped
CI / build (push) Failing after 30s
CI / docker (push) Has been skipped
CI / deploy (push) Has been skipped
CI / deploy-feature (push) Has been skipped
Add CommandGroupResponse and ConfigUpdateResponse types. Switch
useSendGroupCommand and useSendRouteCommand from openapi-fetch to authFetch
returning CommandGroupResponse. Update useUpdateApplicationConfig to return
ConfigUpdateResponse and fix all consumer onSuccess callbacks to access
saved.config.version instead of saved.version.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 19:01:06 +02:00
hsiegeln
027e45aadf feat(#116): synchronous group command dispatch with multi-agent response collection
Add addGroupCommandWithReplies() to AgentRegistryService that sends commands
to all LIVE agents in a group and returns CompletableFuture per agent for
collecting replies. Update sendGroupCommand() and pushConfigToAgents() to
wait with a shared 10-second deadline, returning CommandGroupResponse with
per-agent status, timeouts, and overall success. Config update endpoint now
returns ConfigUpdateResponse wrapping both the saved config and push result.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 19:00:56 +02:00
hsiegeln
f39f07e7bf feat(#118): add confirmation dialog for stop and suspend commands
Some checks failed
CI / cleanup-branch (push) Has been skipped
CI / build (push) Failing after 35s
CI / docker (push) Has been skipped
CI / deploy (push) Has been skipped
CI / deploy-feature (push) Has been skipped
Stop and suspend route commands now show a ConfirmDialog requiring
typed confirmation before dispatch. Start and resume execute
immediately without confirmation.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 18:54:23 +02:00
hsiegeln
d21d8b2c48 fix(#112): initialize sidebar accordion state from initial route
Some checks failed
CI / build (push) Failing after 43s
CI / cleanup-branch (push) Has been skipped
CI / docker (push) Has been skipped
CI / deploy (push) Has been skipped
CI / deploy-feature (push) Has been skipped
Direct navigation to /admin/* now correctly opens Admin section
and collapses operational sections on first render. Previously
the accordion effect only triggered on route transitions.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 18:36:43 +02:00
hsiegeln
d5f5601554 fix(#112): add missing Routes section, fix admin double padding
Review feedback: buildRouteTreeNodes was defined but never rendered.
Added Routes section between Agents and Admin. Removed duplicate
padding on admin pages (AdminLayout handles its own padding).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 18:32:26 +02:00
hsiegeln
00042b1d14 feat(#112): remove admin tabs, sidebar handles navigation
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 18:29:29 +02:00
hsiegeln
fe49eb5aba feat(#112): migrate to composable sidebar with accordion and collapse
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 18:29:25 +02:00
hsiegeln
bc913eef6e feat(#112): extract sidebar tree builders and types from DS
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 18:29:22 +02:00
hsiegeln
d70ad91b33 docs: clarify search ownership and icon-rail click behavior
Search: DS renders dumb input, app owns filterQuery state and
passes it to each SidebarTree. Icon-rail click: fires both
onCollapseToggle and onToggle simultaneously, no navigation.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 17:41:31 +02:00
hsiegeln
ba361af2d7 docs: composable sidebar design spec for #112
Replaces the previous "hide sidebar on admin" approach with a
composable compound component design. DS provides shell + building
blocks (Sidebar, Section, Footer, SidebarTree); consuming app
controls all content, section ordering, accordion behavior, and
icon-rail collapse.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 17:38:01 +02:00
hsiegeln
78777d2ba6 Revert "feat(#112): hide sidebar, topbar, cmd palette on admin pages"
This reverts commit d95e518622.
2026-04-02 17:22:06 +02:00
hsiegeln
3f8a9715a4 Revert "feat(#112): add admin header bar with back button and logout"
This reverts commit a484364029.
2026-04-02 17:22:06 +02:00
hsiegeln
f00a3e8b97 Revert "fix(#112): remove dead admin breadcrumb code, add logout aria-label"
This reverts commit d5028193c0.
2026-04-02 17:22:06 +02:00
hsiegeln
d5028193c0 fix(#112): remove dead admin breadcrumb code, add logout aria-label
Review feedback: breadcrumb memo had an unused isAdminPage branch
(TopBar no longer renders on admin pages). Added aria-label to
icon-only logout button for screen readers.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 17:16:01 +02:00
hsiegeln
a484364029 feat(#112): add admin header bar with back button and logout
AdminLayout gains a self-contained header (Back / Admin / user+logout)
with CSS module styles, replacing the inline padding wrapper. Admin
pages now render fully without the main app chrome.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-02 17:12:50 +02:00
hsiegeln
d95e518622 feat(#112): hide sidebar, topbar, cmd palette on admin pages
Pass null as sidebar prop, guard TopBar and CommandPalette with
!isAdminPage, and remove conditional admin padding from main element.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-02 17:12:44 +02:00
hsiegeln
56297701e6 fix: use ILIKE for case-insensitive log search in ClickHouse
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m4s
CI / docker (push) Successful in 57s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 1m29s
LIKE is case-sensitive in ClickHouse. Switch to ILIKE for message,
stack_trace, and logger_name searches so queries match regardless
of casing.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 16:35:34 +02:00
hsiegeln
8c7c9911c4 feat: highlight search matches in log results
Some checks failed
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m7s
CI / docker (push) Has been cancelled
CI / deploy (push) Has been cancelled
CI / deploy-feature (push) Has been cancelled
Recursive case-insensitive highlighting of the search query in
collapsed message, expanded full message, and stack trace. Uses the
project's amber accent color for the highlight mark.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 16:34:15 +02:00
hsiegeln
4d66d6ab23 fix: use deterministic badge color for app names in Logs tab
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m9s
CI / docker (push) Successful in 1m0s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 42s
Use attributeBadgeColor() (hash-based) instead of "auto" so the same
application name gets the same badge color across all pages.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 16:31:04 +02:00
hsiegeln
b73f5e6dd4 feat: add Logs tab with cursor-paginated search, level filters, and live tail
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m3s
CI / docker (push) Successful in 1m11s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 49s
- Extend GET /api/v1/logs with cursor pagination, multi-level filtering,
  optional application scoping, and level count aggregation
- Add exchangeId, instanceId, application, mdc fields to log responses
- Refactor ClickHouseLogStore with keyset pagination (N+1 pattern)
- Add LogSearchRequest/LogSearchResponse core domain records
- Create LogSearchPageResponse wrapper DTO
- Add Logs as 4th content tab (Exchanges | Dashboard | Runtime | Logs)
- Implement LogSearch component with debounced search, level filter bar,
  expandable log entries, cursor pagination, and live tail mode
- Add cross-navigation: exchange header → logs, log tab → logs tab
- Update ClickHouseLogStoreIT with cursor, multi-level, cross-app tests

Closes: #104

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 08:47:16 +02:00
hsiegeln
a52751da1b fix: avoid alias shadowing in processor metrics -Merge query
Some checks failed
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m11s
CI / docker (push) Successful in 43s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 37s
SonarQube / sonarqube (push) Failing after 1m52s
ClickHouse 24.12 new query analyzer resolves countMerge(total_count)
in the CASE WHEN to the SELECT alias (UInt64) instead of the original
AggregateFunction column when the alias has the same name. Renamed
aliases to tc/fc to avoid the collision.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-01 22:24:50 +02:00
hsiegeln
51780031ea fix: use alias in ORDER BY for processor metrics query
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m8s
CI / docker (push) Successful in 44s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 39s
ClickHouse rejects countMerge() in ORDER BY after GROUP BY because the
column is already finalized to UInt64. Use the SELECT alias instead.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-01 22:11:54 +02:00
hsiegeln
eb2cafc7fa fix: use jar instead of unzip in sonarqube workflow
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m5s
CI / docker (push) Successful in 39s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 47s
The build container lacks unzip. The JDK jar command handles zip
extraction natively.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-01 22:02:09 +02:00
hsiegeln
805e6d51cb fix: add processor_type to stats_1m_processor_detail MV
Some checks failed
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m14s
CI / deploy (push) Has been cancelled
CI / deploy-feature (push) Has been cancelled
CI / docker (push) Has been cancelled
The table and materialized view were missing the processor_type column,
causing the RouteMetricsController query to fail and the dashboard
processor metrics table to render empty.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-01 22:00:23 +02:00
hsiegeln
f3feaddbfe feat: show distinct attribute keys in cmd-k Attributes tab
All checks were successful
CI / build (push) Successful in 1m58s
CI / cleanup-branch (push) Has been skipped
CI / docker (push) Successful in 1m46s
CI / deploy (push) Successful in 47s
CI / deploy-feature (push) Has been skipped
Add GET /search/attributes/keys endpoint that queries distinct
attribute key names from ClickHouse using JSONExtractKeys. Attribute
keys appear in the cmd-k Attributes tab alongside attribute value
matches from exchange results.

- SearchIndex.distinctAttributeKeys() interface method
- ClickHouseSearchIndex implementation using arrayJoin(JSONExtractKeys)
- SearchController /attributes/keys endpoint
- useAttributeKeys() React Query hook
- buildSearchData includes attribute keys as 'attribute' category items

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-01 21:39:27 +02:00
hsiegeln
9057981cf7 fix: use composite ID for routes in command palette search data
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m7s
CI / docker (push) Successful in 1m1s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 55s
Routes with the same name across different applications (e.g., "route1"
in both QUARKUS-APP and BACKEND-APP) were deduplicated because they
shared the same id (routeId). Use appId/routeId as the id so all
routes appear in cmd-k results.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-01 21:33:23 +02:00
hsiegeln
b30a5b5760 fix: prevent cmd-k scroll reset on catalog poll refresh
All checks were successful
CI / build (push) Successful in 1m49s
CI / cleanup-branch (push) Has been skipped
CI / docker (push) Successful in 2m3s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 1m0s
The searchData useMemo recomputed on every catalog poll cycle because
catalogData got a new array reference even when content was unchanged.
This caused the CommandPalette list to re-render and reset scroll.

Use a ref with deep equality check to keep a stable catalog reference,
only updating when the actual data changes.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-01 21:22:50 +02:00
hsiegeln
910230cbf8 fix: add <mark> highlighting to search match context snippets
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m14s
CI / docker (push) Successful in 46s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 38s
The command palette renders matchContext via dangerouslySetInnerHTML
expecting HTML with <mark> tags, but extractSnippet() returned plain
text. Wrap the matched term in <mark> tags and escape surrounding
text to prevent XSS.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-01 21:18:04 +02:00
hsiegeln
1d791bb329 fix: use exact match for ID fields in full-text search
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m5s
CI / docker (push) Successful in 41s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 36s
ID fields (execution_id, correlation_id, exchange_id) should use
exact equality, not LIKE with wildcards. LIKE is only needed for
the _search_text full-text columns.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-01 21:13:54 +02:00
hsiegeln
9781fe0d7c fix: include execution/correlation/exchange IDs in full-text search
Some checks failed
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m12s
CI / deploy (push) Has been cancelled
CI / deploy-feature (push) Has been cancelled
CI / docker (push) Has been cancelled
The _search_text materialized column only contained error messages,
bodies, and headers — not execution_id, correlation_id, exchange_id,
or route_id. Searching by ID via cmd-k returned no results.

- Add ID fields to _search_text in ClickHouse DDL (covered by ngram
  bloom filter index)
- Add direct LIKE matches on execution_id, correlation_id, exchange_id
  in the text search WHERE clause for faster exact ID lookups

Requires ClickHouse table recreation (fresh install).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-01 21:12:15 +02:00
hsiegeln
92951f1dcf chore: update @cameleer/design-system to v0.1.22
Some checks failed
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m12s
CI / docker (push) Successful in 1m27s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Has been cancelled
Sidebar selectedPath now uses sidebarReveal on all tabs, not just
exchanges. This fixes sidebar highlighting on dashboard and runtime.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-01 21:09:20 +02:00
hsiegeln
a7d256b38a fix: compute hasTraceData from processor records in chunk accumulator
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m8s
CI / docker (push) Successful in 43s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 38s
The chunked ingestion path hardcoded hasTraceData=false because the
execution envelope doesn't carry processor bodies. But the processor
records DO have inputBody/outputBody — we just need to check them.

Track hasTraceData across chunks in PendingExchange and pass it to
MergedExecution when the final chunk arrives or on stale sweep.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-01 21:04:34 +02:00
hsiegeln
e26266532a fix: regenerate OpenAPI types, fix search scoping by applicationId
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m10s
CI / docker (push) Successful in 59s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 40s
The identity rename (application→applicationId) broke search filtering
because the stale schema.d.ts still had 'application' as the field name.
The backend silently ignored the unknown field, returning unfiltered results.

- Regenerate openapi.json and schema.d.ts from live backend
- Fix Dashboard: application→applicationId in search request
- Fix RouteDetail: application→applicationId in search request (2 places)
- LayoutShell: scope command palette search by appId/routeId
- LayoutShell: pass sidebarReveal state on sidebar click navigation

Note for DS team: the Sidebar selectedPath logic (line 5451 in dist)
has a hardcoded pathname.startsWith("/exchanges/") guard. This should
be broadened to simply `S ? S : $.pathname` so sidebarReveal works on
all tabs (dashboard, runtime), not just exchanges.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-01 20:55:19 +02:00
hsiegeln
178bc40706 Revert "fix: sidebar selection highlight and scoped command palette search"
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m19s
CI / docker (push) Successful in 9s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 36s
This reverts commit 4168a6d45b.
2026-04-01 20:43:27 +02:00
hsiegeln
4168a6d45b fix: sidebar selection highlight and scoped command palette search
Some checks failed
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m9s
CI / deploy (push) Has been cancelled
CI / deploy-feature (push) Has been cancelled
CI / docker (push) Has been cancelled
Two fixes:
- Pass sidebarReveal state on sidebar navigation so the design system
  can highlight the selected entry (it compares internal /apps/... paths
  against this state value, not the browser URL)
- Command palette search now includes scope.appId and scope.routeId
  so results are filtered to the current sidebar selection

Note: sidebar highlighting works on the exchanges tab. The design
system's selectedPath logic only checks pathname.startsWith("/exchanges/")
for sidebarReveal — a DS update is needed to support /dashboard/ and
/runtime/ tabs too.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-01 20:41:42 +02:00
hsiegeln
a028905e41 fix: update agent field names in frontend to match backend DTO
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m6s
CI / docker (push) Successful in 57s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 37s
The AgentInstanceResponse backend DTO uses instanceId, displayName,
applicationId, status — but the stale schema.d.ts still had id, name,
application, state. This caused the runtime table to show no data.

- Update schema.d.ts AgentInstanceResponse fields
- Fix AgentHealth: row.id→instanceId, row.name→displayName,
  row.application→applicationId, inst.id→instanceId
- Fix AgentInstance: agent.id→instanceId, agent.name→displayName
- Fix ExchangeHeader: agent.id→instanceId, agent.state→status
- Fix LayoutShell search: agent.state→status, agentTps→tps

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-01 20:36:31 +02:00
hsiegeln
f82aa26371 fix: improve ClickHouse admin page, fix AgentHealth type error
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m13s
CI / docker (push) Successful in 3m46s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 58s
Rewrite ClickHouse admin to show useful storage metrics instead of
often-empty system.events data. Add active queries section.

- Replace performance endpoint: query system.parts for disk size,
  uncompressed size, compression ratio, total rows, part count
- Add /queries endpoint querying system.processes for active queries
- Frontend: storage overview strip, tables with total size, active
  queries DataTable
- Fix AgentHealth.tsx type: agentId → instanceId in inline type cast

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-01 20:18:06 +02:00
hsiegeln
188810e54b feat: remove TimescaleDB, dead PG stores, and storage feature flags
Some checks failed
CI / cleanup-branch (push) Has been skipped
CI / build (push) Failing after 32s
CI / docker (push) Has been skipped
CI / deploy (push) Has been skipped
CI / deploy-feature (push) Has been skipped
Complete the ClickHouse migration by removing all PostgreSQL analytics
code. PostgreSQL now serves only RBAC, config, and audit — all
observability data is exclusively in ClickHouse.

- Delete 6 dead PostgreSQL store classes (executions, stats, diagrams,
  events, metrics, metrics-query) and 2 integration tests
- Delete RetentionScheduler (ClickHouse TTL handles retention)
- Remove all 7 cameleer.storage.* feature flags from application.yml
- Remove all @ConditionalOnProperty from ClickHouse beans in StorageBeanConfig
- Consolidate 14 Flyway migrations (V1-V14) into single clean V1 with
  only RBAC/config/audit tables (no TimescaleDB, no analytics tables)
- Switch from timescale/timescaledb-ha:pg16 to postgres:16 everywhere
  (docker-compose, deploy/postgres.yaml, test containers)
- Remove TimescaleDB check and /metrics-pipeline from DatabaseAdminController
- Set clickhouse.enabled default to true

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-01 20:10:58 +02:00
hsiegeln
283e38a20d feat: remove OpenSearch, add ClickHouse admin page
Some checks failed
CI / cleanup-branch (push) Has been skipped
CI / build (push) Failing after 33s
CI / docker (push) Has been skipped
CI / deploy (push) Has been skipped
CI / deploy-feature (push) Has been skipped
Remove all OpenSearch code, dependencies, configuration, deployment
manifests, and CI/CD references. Replace the OpenSearch admin page
with a ClickHouse admin page showing cluster status, table sizes,
performance metrics, and indexer pipeline stats.

- Delete 11 OpenSearch Java files (config, search impl, admin controller, DTOs, tests)
- Delete 3 OpenSearch frontend files (admin page, CSS, query hooks)
- Delete deploy/opensearch.yaml K8s manifest
- Remove opensearch Maven dependencies from pom.xml
- Remove opensearch config from application.yml, Dockerfile, docker-compose
- Remove opensearch from CI workflow (secrets, deploy, cleanup steps)
- Simplify ThresholdConfig (remove OpenSearch thresholds, database-only)
- Change default search backend from opensearch to clickhouse
- Add ClickHouseAdminController with /status, /tables, /performance, /pipeline
- Add ClickHouseAdminPage with StatCards, pipeline ProgressBar, tables DataTable
- Update CLAUDE.md, HOWTO.md, and source comments

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-01 18:56:06 +02:00
hsiegeln
5ed7d38bf7 fix: sort sidebar entries alphanumerically
Some checks failed
CI / cleanup-branch (push) Has been skipped
CI / build (push) Failing after 29s
CI / docker (push) Has been skipped
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Has been skipped
Applications, routes within each app, and agents within each app
are now sorted by name using localeCompare.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-01 18:24:39 +02:00
hsiegeln
4cdbcdaeea fix: update frontend field names for identity rename (applicationId, instanceId)
Some checks failed
CI / cleanup-branch (push) Has been skipped
CI / build (push) Failing after 32s
CI / docker (push) Has been skipped
CI / deploy (push) Has been skipped
CI / deploy-feature (push) Has been skipped
The backend identity rename (applicationName → applicationId,
agentId → instanceId) was not reflected in the frontend. This caused
drilldown to fail (detail.applicationName was undefined, disabling
the diagram fetch) and various display issues.

Updated schema.d.ts, ExchangeHeader, ExecutionDiagram, Dashboard,
AgentHealth, AgentInstance, LayoutShell, LogTab, InfoTab, DetailPanel,
ExchangesPage, and tracing-store.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-01 18:22:16 +02:00
hsiegeln
aa2d203f4e feat: add UI usage analytics tracking
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m9s
CI / docker (push) Successful in 1m14s
CI / deploy (push) Successful in 46s
CI / deploy-feature (push) Has been skipped
Tracks authenticated UI user requests to understand usage patterns:
- New ClickHouse usage_events table with 90-day TTL
- UsageTrackingInterceptor captures method, path, duration, user
- Path normalization groups dynamic segments ({id}, {hash})
- Buffered writes via WriteBuffer + periodic flush
- Admin endpoint GET /api/v1/admin/usage with groupBy=endpoint|user|hour
- Skips agent requests, health checks, and data ingestion

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-01 17:53:32 +02:00
hsiegeln
ce4abaf862 fix: infer compound node color from descendants when no own overlay state
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m17s
CI / docker (push) Successful in 1m9s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 49s
Path containers (EIP_WHEN, EIP_OTHERWISE, etc.) don't have their own
processor records, so they never get an overlay entry. Now inferred
from descendants: green if any descendant executed, red if any failed.
Gated (amber) only when no descendants executed at all.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-01 17:37:47 +02:00
hsiegeln
40ce4a57b4 fix: only show amber on containers where gate blocked all children
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m12s
CI / docker (push) Successful in 1m6s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 49s
A container is only gated (amber) when filterMatched=false or
duplicateMessage=true AND no descendants were executed. Containers
with executed children (split, choice, idempotent that passed) now
correctly show green/red based on their execution status.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-01 17:32:39 +02:00
hsiegeln
b44ffd08be fix: color compound nodes by execution status in overlay
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m12s
CI / docker (push) Successful in 1m5s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 51s
CompoundNode now uses execution overlay status to color its header:
failed (red) > completed (green) > default. Previously only used
static type-based color regardless of execution state.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-01 17:20:59 +02:00
hsiegeln
cf439248b5 feat: expose iteration/iterationSize fields for diagram overlay
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m12s
CI / docker (push) Successful in 1m5s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 52s
Replace synthetic wrapper node approach with direct iteration fields:
- ProcessorNode gains iteration (child's index) and iterationSize
  (container's total) fields, populated from ClickHouse flat records
- Frontend hooks detect iteration containers from iterationSize != null
  instead of scanning for wrapper processorTypes
- useExecutionOverlay filters children by iteration field instead of
  wrapper nodes, eliminating ITERATION_WRAPPER_TYPES entirely
- Cleaner data contract: API returns exactly what the DB stores

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-01 17:14:36 +02:00
hsiegeln
e8f9ada1d1 fix: inject ClickHouse JdbcTemplate into stats-querying controllers
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m9s
CI / docker (push) Successful in 49s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 44s
RouteCatalogController, RouteMetricsController, and AgentRegistrationController
had unqualified JdbcTemplate injection, receiving the PostgreSQL template
instead of ClickHouse. The stats queries silently failed (caught exception)
returning 0 counts. Added @Qualifier("clickHouseJdbcTemplate") to all three.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-01 16:34:56 +02:00
hsiegeln
bc70797e31 fix: force UTC timezone in Docker runtime
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m8s
CI / docker (push) Successful in 47s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 42s
Sets TZ=UTC and -Duser.timezone=UTC to guarantee all JVM time operations
use UTC regardless of the container's base image or host configuration.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-01 16:24:23 +02:00
hsiegeln
f6123b8a7c fix: use explicit UTC formatting in ClickHouse DateTime literals
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m9s
CI / docker (push) Successful in 50s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 57s
Timestamp.toString() uses JVM local timezone which can mismatch with
ClickHouse's UTC timezone, causing time-filtered queries to return empty
results. Replaced with DateTimeFormatter.withZone(UTC) in all lit() methods.

Also added warn logging to RouteCatalogController catch blocks to surface
query errors instead of silently swallowing them.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-01 16:13:52 +02:00
hsiegeln
d739094a56 fix: update ClickHouse DDL files with new column names instead of ALTER RENAME
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m9s
CI / docker (push) Successful in 45s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 43s
ClickHouse can't rename columns that are part of ORDER BY keys.
Updated V1-V8 DDL files directly with new column names (instance_id,
application_id) and removed V9 migration. Wipe ClickHouse and restart.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-01 12:40:54 +02:00
hsiegeln
91400defe9 fix: add missing V9 (ClickHouse) and V14 (PostgreSQL) identity column rename migrations
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m7s
CI / docker (push) Successful in 45s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 39s
Migration files were lost during worktree merge — recreated.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-01 12:33:02 +02:00
hsiegeln
909d713837 feat: rename agent identity fields for protocol v2 + add SHUTDOWN lifecycle state
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m7s
CI / docker (push) Successful in 45s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 22s
Align all internal naming with the agent team's protocol v2 identity rename:
- agentId → instanceId (unique per-JVM identifier)
- applicationName → applicationId (shared app identifier)
- AgentInfo: id → instanceId, name → displayName, application → applicationId

Add SHUTDOWN lifecycle state for graceful agent shutdowns:
- New POST /data/events endpoint receives agent lifecycle events
- AGENT_STOPPED event transitions agent to SHUTDOWN (skips STALE/DEAD)
- New POST /{id}/deregister endpoint removes agent from registry
- Server now distinguishes graceful shutdown from crash (heartbeat timeout)

Includes ClickHouse V9 and PostgreSQL V14 migrations for column renames.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-01 12:22:42 +02:00
hsiegeln
ad8dd73596 fix: update ChunkAccumulator tests for DiagramStore constructor param
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m5s
CI / docker (push) Successful in 1m6s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 52s
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-01 10:58:27 +02:00
hsiegeln
e50c9fa60d fix: address SonarQube reliability issues
Some checks failed
CI / cleanup-branch (push) Has been skipped
CI / build (push) Failing after 39s
CI / docker (push) Has been skipped
CI / deploy (push) Has been skipped
CI / deploy-feature (push) Has been skipped
- ElkDiagramRenderer.getElkRoot(): add null guard to prevent NPE
  when node is null (SQ java:S2259)
- WriteBuffer: add offerOrWarn() that logs when buffer is full instead
  of silently dropping data. ChunkAccumulator now uses this method
  so ingestion backpressure is visible in logs (SQ java:S899)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-01 10:55:31 +02:00
hsiegeln
d4dbfa7ae6 fix: populate diagramContentHash in chunked ingestion pipeline
Some checks failed
CI / cleanup-branch (push) Has been skipped
CI / build (push) Failing after 43s
CI / docker (push) Has been skipped
CI / deploy (push) Has been skipped
CI / deploy-feature (push) Has been skipped
ChunkAccumulator now injects DiagramStore and looks up the content hash
when converting to MergedExecution. Without this, the detail page had
no diagram hash, so the overlay couldn't find the route diagram.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-01 10:50:34 +02:00
hsiegeln
59374482bc fix: replace PostgreSQL aggregate functions with ClickHouse -Merge combinators
RouteCatalogController, RouteMetricsController, AgentRegistrationController
all had inline SQL using SUM() on AggregateFunction columns from stats_1m_*
AggregatingMergeTree tables. Replace with countMerge/countIfMerge/sumMerge.
Also fix time_bucket() → toStartOfInterval() and ::double → toFloat64().

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-01 10:49:06 +02:00
hsiegeln
43e187a023 fix: ChunkIngestionController ObjectMapper missing FAIL_ON_UNKNOWN_PROPERTIES
Adds DeserializationFeature.FAIL_ON_UNKNOWN_PROPERTIES=false (required
by PROTOCOL.md) and explicit TypeReference<List<ExecutionChunk>> for
array parsing. Without this, batched chunks from ChunkedExporter
(2+ chunks in a JSON array) were silently rejected, causing final:true
chunks to be lost and all exchanges to go stale.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-01 10:45:12 +02:00
hsiegeln
bc1c71277c fix: resolve duplicate ExecutionStore bean conflict
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m5s
CI / docker (push) Successful in 49s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 55s
ClickHouseExecutionStore implements ExecutionStore, so the concrete bean
already satisfies the interface — remove redundant wrapper bean. Align
ChunkAccumulator and ExecutionFlushScheduler conditions to
cameleer.storage.executions flag.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-01 09:44:02 +02:00
hsiegeln
520181d241 test(clickhouse): add integration tests for execution read path and tree reconstruction
Some checks failed
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m7s
CI / docker (push) Successful in 46s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Failing after 2m16s
SonarQube / sonarqube (push) Failing after 2m21s
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-01 00:11:44 +02:00
hsiegeln
95b9dea5c4 feat(clickhouse): wire ClickHouseExecutionStore as active ExecutionStore
Add cameleer.storage.executions feature flag (default: clickhouse).
PostgresExecutionStore activates only when explicitly set to postgres.
Add by-seq snapshot endpoint for iteration-aware processor lookup.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-01 00:09:14 +02:00
hsiegeln
151b96a680 feat: seq-based tree reconstruction for ClickHouse flat processor model
Dual-mode buildTree: detects seq presence and uses seq/parentSeq linkage
instead of processorId map. Handles duplicate processorIds across
iterations correctly. Old processorId-based mode kept for PG compat.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-01 00:07:20 +02:00
hsiegeln
0661fd995f feat(clickhouse): add read methods to ClickHouseExecutionStore
Implements ExecutionStore interface with findById (FINAL for
ReplacingMergeTree), findProcessors (ORDER BY seq), findProcessorById,
and findProcessorBySeq. Write methods unchanged.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-01 00:04:03 +02:00
hsiegeln
190ae2797d refactor: extend ProcessorRecord with seq/iteration fields for ClickHouse model
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-01 00:02:03 +02:00
hsiegeln
968117c41a feat(clickhouse): wire Phase 4 stores with feature flags
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m7s
CI / docker (push) Successful in 43s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 44s
Add conditional beans for ClickHouseDiagramStore, ClickHouseAgentEventRepository,
and ClickHouseLogStore. All default to ClickHouse (matchIfMissing=true).
PG/OS stores activate only when explicitly configured.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-31 23:44:10 +02:00
hsiegeln
7d7eb52afb feat(clickhouse): add ClickHouseLogStore with LogIndex interface
Extract LogIndex interface from OpenSearchLogIndex. Both ClickHouseLogStore
and OpenSearchLogIndex implement it. Controllers now inject LogIndex.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-31 23:42:07 +02:00
hsiegeln
c73e4abf68 feat(clickhouse): add ClickHouseAgentEventRepository with integration tests
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-31 23:37:51 +02:00
hsiegeln
cd63d300b3 feat(clickhouse): add ClickHouseDiagramStore with integration tests
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-31 23:35:32 +02:00
hsiegeln
f7daadaaa9 feat(clickhouse): add DDL for route_diagrams, agent_events, and logs tables
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-31 23:30:38 +02:00
hsiegeln
af080337f5 feat: comprehensive ClickHouse low-memory tuning and switch all storage to ClickHouse
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m9s
CI / docker (push) Successful in 42s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 58s
Replace partial memory config with full Altinity low-memory guide
settings. Revert container limit from 6Gi back to 4Gi — proper
tuning (mlock=false, reduced caches/pools/threads, disk spill for
aggregations) makes the original budget sufficient.

Switch all storage feature flags to ClickHouse:
- CAMELEER_STORAGE_SEARCH: opensearch → clickhouse
- CAMELEER_STORAGE_METRICS: postgres → clickhouse
- CAMELEER_STORAGE_STATS: already clickhouse

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-31 23:27:10 +02:00
hsiegeln
606f81a970 fix: align server with protocol v2 chunked transport spec
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m45s
CI / docker (push) Successful in 59s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 46s
- ChunkIngestionController: /data/chunks → /data/executions (matches
  PROTOCOL.md endpoint the agent actually posts to)
- ExecutionController: conditional on ClickHouse being disabled to
  avoid mapping conflict
- Persist originalExchangeId and replayExchangeId from ExecutionChunk
  envelope through to ClickHouse (was silently dropped)
- V5 migration adds the two new columns to executions table

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-31 23:18:35 +02:00
hsiegeln
154bce366a fix: remove references to deleted ProcessorExecution tree fields
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m15s
CI / docker (push) Successful in 44s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 1m0s
cameleer3-common removed children, loopIndex, splitIndex,
multicastIndex from ProcessorExecution (flat model only now).
Iteration context lives on synthetic wrapper nodes via processorType.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-31 23:00:11 +02:00
hsiegeln
a669df08bd fix(clickhouse): tune memory settings to prevent OOM on insert
Some checks failed
CI / cleanup-branch (push) Has been skipped
CI / build (push) Failing after 40s
CI / docker (push) Has been skipped
CI / deploy (push) Has been skipped
CI / deploy-feature (push) Has been skipped
ClickHouse 24.12 auto-sizes caches from the cgroup limit, leaving
insufficient headroom for MV processing and background merges.
Adds a custom config that shrinks mark/index/expression caches and
caps per-query memory at 2 GiB. Bumps container limit 4Gi → 6Gi.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-31 22:54:43 +02:00
hsiegeln
af18fc4142 Merge branch 'worktree-clickhouse-phase2'
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m8s
CI / docker (push) Successful in 45s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 42s
2026-03-31 22:06:35 +02:00
hsiegeln
1a00eed389 fix: schema initializer skips comment-only SQL segments
The V4 DDL had a semicolon inside a comment which caused the
split-on-semicolon logic to produce a comment-only segment that
ClickHouse rejected as empty query. Fixed the comment and made
the initializer strip comment-only segments before execution.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-31 22:06:31 +02:00
hsiegeln
0423518f72 feat: ClickHouse Phase 3 — Stats & Analytics (materialized views)
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m8s
CI / docker (push) Successful in 43s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 44s
- DDL for 5 AggregatingMergeTree tables + 5 materialized views
- ClickHouseStatsStore: all 15 StatsStore methods using -Merge combinators
- Stats/timeseries read from pre-aggregated MVs (countMerge, sumMerge, quantileMerge)
- SLA/topErrors/punchcard query raw executions FINAL table
- Feature flag: cameleer.storage.stats (default: clickhouse)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-31 21:52:13 +02:00
hsiegeln
9df00fdde0 feat(clickhouse): wire ClickHouseStatsStore with cameleer.storage.stats feature flag (default: clickhouse)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-31 21:51:45 +02:00
hsiegeln
052990bb59 feat(clickhouse): add ClickHouseStatsStore with -Merge aggregate queries
Implements StatsStore interface for ClickHouse using AggregatingMergeTree
tables with -Merge combinators (countMerge, countIfMerge, sumMerge,
quantileMerge). Uses literal SQL for aggregate table queries to avoid
ClickHouse JDBC driver PreparedStatement issues with AggregateFunction
columns. Raw table queries (SLA, topErrors, activeErrorTypes) use normal
prepared statements.

Includes 13 integration tests covering stats, timeseries, grouped
timeseries, SLA compliance, SLA counts by app/route, top errors, active
error types, punchcard, and processor stats. Also fixes AggregateFunction
type signatures in V4 DDL (count() takes no args, countIf takes UInt8).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-31 21:49:22 +02:00
hsiegeln
eb0d26814f feat(clickhouse): add stats materialized views DDL (5 tables + 5 MVs)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-31 20:11:38 +02:00
hsiegeln
c8e6bbe059 Merge branch 'worktree-clickhouse-phase2'
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m2s
CI / docker (push) Successful in 43s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 44s
2026-03-31 20:02:49 +02:00
hsiegeln
a9eabe97f7 fix: wire @Primary JdbcTemplate to the @Primary DataSource bean
The jdbcTemplate() method was calling dataSource(properties) directly,
creating a new DataSource instance instead of using the Spring-managed
@Primary bean. This caused some repositories to receive the ClickHouse
connection instead of PostgreSQL.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-31 20:02:44 +02:00
hsiegeln
e724607a66 feat: ClickHouse Phase 2 — Executions + Search (chunked transport)
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m36s
CI / docker (push) Successful in 3m21s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 41s
- DDL for executions (ReplacingMergeTree) and processor_executions (MergeTree with seq/parentSeq/iteration)
- ClickHouseExecutionStore with batch INSERT for both tables
- ChunkAccumulator: buffers exchange envelope across chunks, inserts processors immediately, writes execution on final chunk
- ExecutionFlushScheduler drains WriteBuffers to ClickHouse
- ChunkIngestionController: POST /api/v1/data/chunks endpoint
- ClickHouseSearchIndex: ngram-accelerated SQL search implementing SearchIndex interface
- Feature flags: cameleer.storage.search=opensearch|clickhouse
- Uses cameleer3-common ExecutionChunk and FlatProcessorRecord models

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-31 19:37:21 +02:00
hsiegeln
07f215b0fd refactor: replace server-side DTOs with cameleer3-common ExecutionChunk and FlatProcessorRecord
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-31 19:33:49 +02:00
hsiegeln
38551eac9d test(clickhouse): add end-to-end chunk pipeline integration test
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-31 19:24:55 +02:00
hsiegeln
31f7113b3f feat(clickhouse): wire ChunkAccumulator, flush scheduler, and search feature flag
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-31 19:21:19 +02:00
hsiegeln
6052407c82 feat(clickhouse): add ClickHouseSearchIndex with ngram-accelerated SQL search
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-31 19:18:01 +02:00
hsiegeln
776f2ce90d feat(clickhouse): add ExecutionFlushScheduler and ChunkIngestionController
ExecutionFlushScheduler drains MergedExecution and ProcessorBatch write
buffers on a fixed interval and delegates batch inserts to
ClickHouseExecutionStore. Also sweeps stale exchanges every 60s.

ChunkIngestionController exposes POST /api/v1/data/chunks, accepts
single or array ExecutionChunk payloads, and feeds them into the
ChunkAccumulator. Conditional on ChunkAccumulator bean (clickhouse.enabled).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-31 19:12:38 +02:00
hsiegeln
62420cf0c2 feat(clickhouse): add ChunkAccumulator for chunked execution ingestion
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-31 19:10:21 +02:00
hsiegeln
81f7f8afe1 feat(clickhouse): add ClickHouseExecutionStore with batch insert for chunked format
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-31 19:07:33 +02:00
hsiegeln
b30dfa39f4 feat(clickhouse): add executions and processor_executions DDL for chunked transport
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-31 19:04:19 +02:00
hsiegeln
20c8e17843 feat: add server-side ExecutionChunk and FlatProcessorRecord DTOs
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-31 19:02:47 +02:00
a96fe59840 Merge pull request 'fix: add @Primary PG DataSource/JdbcTemplate to prevent CH bean conflict' (#99) from feature/clickhouse-phase1 into main
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m46s
CI / docker (push) Successful in 11s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 42s
Reviewed-on: cameleer/cameleer3-server#99
2026-03-31 18:21:00 +02:00
hsiegeln
7cf849269f fix: add @Primary PG DataSource/JdbcTemplate to prevent CH bean conflict
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m6s
CI / docker (push) Successful in 41s
CI / deploy (push) Has been skipped
CI / deploy-feature (push) Successful in 38s
CI / cleanup-branch (pull_request) Has been skipped
CI / build (pull_request) Successful in 1m51s
CI / docker (pull_request) Has been skipped
CI / deploy (pull_request) Has been skipped
CI / deploy-feature (pull_request) Has been skipped
When clickhouse.enabled=true, the ClickHouse JdbcTemplate bean prevents
Spring Boot auto-config from creating the default PG JdbcTemplate.
All PG repositories then get the CH JdbcTemplate and fail with
"Table cameleer.audit_log does not exist".

Fix: explicitly create @Primary DataSource and JdbcTemplate from
DataSourceProperties so PG remains the default for unqualified injections.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-31 18:18:09 +02:00
76afcaa637 Merge pull request 'fix: cast DateTime64 to DateTime in ClickHouse TTL expression' (#98) from feature/clickhouse-phase1 into main
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m55s
CI / docker (push) Successful in 14s
CI / deploy (push) Successful in 30s
CI / deploy-feature (push) Has been skipped
Reviewed-on: cameleer/cameleer3-server#98
2026-03-31 18:10:58 +02:00
hsiegeln
b1c5cc0616 fix: cast DateTime64 to DateTime in ClickHouse TTL expression
Some checks failed
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m23s
CI / cleanup-branch (pull_request) Has been skipped
CI / build (pull_request) Successful in 1m46s
CI / docker (pull_request) Has been skipped
CI / deploy (pull_request) Has been skipped
CI / deploy-feature (pull_request) Has been skipped
CI / docker (push) Successful in 1m8s
CI / deploy (push) Has been skipped
CI / deploy-feature (push) Failing after 2m19s
2026-03-31 18:10:20 +02:00
8838077eff Merge pull request 'fix: remove unsupported async_insert params from ClickHouse JDBC URL' (#97) from feature/clickhouse-phase1 into main
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m39s
CI / docker (push) Successful in 10s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 34s
Reviewed-on: cameleer/cameleer3-server#97
2026-03-31 18:04:22 +02:00
hsiegeln
8eeaecf6f3 fix: remove unsupported async_insert params from ClickHouse JDBC URL
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m6s
CI / docker (push) Successful in 55s
CI / cleanup-branch (pull_request) Has been skipped
CI / build (pull_request) Successful in 1m39s
CI / deploy (push) Has been skipped
CI / docker (pull_request) Has been skipped
CI / deploy (pull_request) Has been skipped
CI / deploy-feature (push) Successful in 51s
CI / deploy-feature (pull_request) Has been skipped
clickhouse-jdbc 0.9.7 rejects async_insert and wait_for_async_insert as
unknown URL parameters. These are server-side settings, not driver config.
Can be set per-query later if needed via custom_settings.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-31 18:02:53 +02:00
b54bef302d Merge pull request 'fix: ClickHouse auth credentials and non-fatal schema init' (#96) from feature/clickhouse-phase1 into main
Some checks failed
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m48s
CI / docker (push) Successful in 9s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Failing after 2m17s
Reviewed-on: cameleer/cameleer3-server#96
2026-03-31 17:57:27 +02:00
hsiegeln
f8505401d7 fix: ClickHouse auth credentials and non-fatal schema init
Some checks failed
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m5s
CI / docker (push) Successful in 43s
CI / deploy (push) Has been skipped
CI / deploy-feature (push) Failing after 13s
CI / cleanup-branch (pull_request) Has been skipped
CI / build (pull_request) Successful in 1m47s
CI / docker (pull_request) Has been skipped
CI / deploy (pull_request) Has been skipped
CI / deploy-feature (pull_request) Has been skipped
- Set CLICKHOUSE_USER/PASSWORD via k8s secret (fixes "disabling network
  access for user 'default'" when no password is set)
- Add clickhouse-credentials secret to CI deploy + feature branch copy
- Pass CLICKHOUSE_USERNAME/PASSWORD env vars to server pod
- Make schema initializer non-fatal so server starts even if CH is
  temporarily unavailable

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-31 17:54:44 +02:00
a0f1a4aba4 Merge pull request 'feature/clickhouse-phase1' (#95) from feature/clickhouse-phase1 into main
Some checks failed
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m5s
CI / docker (push) Successful in 9s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Failing after 2m41s
Reviewed-on: cameleer/cameleer3-server#95
2026-03-31 17:48:41 +02:00
hsiegeln
aa5fc1b830 ci: retrigger after transient GitHub actions/cache 500 error
Some checks failed
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m44s
CI / cleanup-branch (pull_request) Has been skipped
CI / build (pull_request) Successful in 1m44s
CI / docker (pull_request) Has been skipped
CI / deploy (pull_request) Has been skipped
CI / deploy-feature (pull_request) Has been skipped
CI / docker (push) Successful in 11s
CI / deploy (push) Has been skipped
CI / deploy-feature (push) Failing after 2m15s
2026-03-31 17:43:40 +02:00
hsiegeln
c42e13932b ci: deploy ClickHouse StatefulSet in main deploy job
Some checks failed
CI / cleanup-branch (push) Has been skipped
CI / build (pull_request) Failing after 45s
CI / cleanup-branch (pull_request) Has been skipped
CI / docker (pull_request) Has been skipped
CI / deploy (pull_request) Has been skipped
CI / deploy-feature (pull_request) Has been skipped
CI / build (push) Failing after 1m6s
CI / docker (push) Has been skipped
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Has been skipped
The deploy/clickhouse.yaml manifest was created but not referenced
in the CI workflow. Add kubectl apply between OpenSearch and Authentik.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-31 17:41:15 +02:00
hsiegeln
59dd629b0e fix: create cameleer database on ClickHouse startup
Some checks failed
CI / cleanup-branch (push) Has been skipped
CI / build (pull_request) Successful in 1m49s
CI / cleanup-branch (pull_request) Has been skipped
CI / docker (pull_request) Has been skipped
CI / deploy (pull_request) Has been skipped
CI / deploy-feature (pull_request) Has been skipped
CI / build (push) Successful in 1m7s
CI / docker (push) Successful in 10s
CI / deploy (push) Has been skipped
CI / deploy-feature (push) Has been cancelled
ClickHouse only has the 'default' database out of the box. The JDBC URL
connects to 'cameleer', so the database must exist before the server starts.
Uses /docker-entrypoint-initdb.d/ init script via ConfigMap.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-31 17:31:17 +02:00
hsiegeln
697c689192 fix: rename ClickHouse tests to *IT pattern for CI compatibility
Some checks failed
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 2m28s
CI / cleanup-branch (pull_request) Has been skipped
CI / build (pull_request) Successful in 2m27s
CI / docker (pull_request) Has been skipped
CI / deploy (pull_request) Has been skipped
CI / deploy-feature (pull_request) Has been skipped
CI / docker (push) Successful in 3m32s
CI / deploy (push) Has been skipped
CI / deploy-feature (push) Failing after 2m17s
Testcontainers tests need Docker which isn't available in CI.
Rename to *IT so Surefire skips them (Failsafe runs them with -DskipITs=false).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-31 17:19:33 +02:00
hsiegeln
7a2a0ee649 test: add ClickHouse testcontainer to integration test base
Some checks failed
CI / cleanup-branch (push) Has been skipped
CI / build (push) Failing after 2m29s
CI / docker (push) Has been skipped
CI / deploy (push) Has been skipped
CI / deploy-feature (push) Has been skipped
CI / cleanup-branch (pull_request) Has been skipped
CI / build (pull_request) Failing after 2m28s
CI / docker (pull_request) Has been skipped
CI / deploy (pull_request) Has been skipped
CI / deploy-feature (pull_request) Has been skipped
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-31 17:09:09 +02:00
hsiegeln
1b991f99a3 deploy: add ClickHouse StatefulSet and server env vars
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-31 17:08:42 +02:00
hsiegeln
21991b6cf8 feat: wire MetricsStore and MetricsQueryStore with feature flag
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-31 17:07:35 +02:00
hsiegeln
53766aeb56 feat: add ClickHouseMetricsQueryStore with time-bucketed queries
Implements MetricsQueryStore using ClickHouse toStartOfInterval() for
time-bucketed aggregation queries; verified with 4 Testcontainers tests.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-31 17:05:45 +02:00
hsiegeln
bf0e9ea418 refactor: extract MetricsQueryStore interface from AgentMetricsController
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-31 17:00:57 +02:00
hsiegeln
6e30b7ec65 feat: add ClickHouseMetricsStore with batch insert
TDD implementation of MetricsStore backed by ClickHouse. Uses native
Map(String,String) column type (no JSON cast), relies on ClickHouse
DEFAULT for server_received_at, and handles null tags by substituting
an empty HashMap. All 4 Testcontainers tests pass.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-31 16:58:20 +02:00
hsiegeln
08934376df feat: add ClickHouse schema initializer with agent_metrics DDL
Adds ClickHouseSchemaInitializer that runs on ApplicationReadyEvent,
scanning classpath:clickhouse/*.sql in filename order and executing each
statement. Adds V1__agent_metrics.sql with MergeTree table, tenant/agent
partitioning, and 365-day TTL.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-31 16:51:21 +02:00
hsiegeln
23f901279a feat: add ClickHouse DataSource and JdbcTemplate configuration
Adds ClickHouseProperties (bound to clickhouse.*), ClickHouseConfig
(conditional HikariDataSource + JdbcTemplate beans), and extends
application.yml with clickhouse.enabled/url/username/password and
cameleer.storage.metrics properties.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-31 16:51:14 +02:00
hsiegeln
6171827243 build: add clickhouse-jdbc and testcontainers-clickhouse dependencies
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-31 16:49:04 +02:00
hsiegeln
c77d8a7af0 docs: add Phase 1 implementation plan for ClickHouse migration
10-task TDD plan covering: CH dependency, config, schema init,
ClickHouseMetricsStore, MetricsQueryStore interface extraction,
ClickHouseMetricsQueryStore, feature flag wiring, k8s deployment,
integration tests.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-31 16:43:14 +02:00
hsiegeln
e7eda7a7b3 docs: add ClickHouse migration design and append-only protocol spec
Design for replacing PostgreSQL/TimescaleDB + OpenSearch with ClickHouse
OSS. Covers table schemas, ingestion pipeline (ExecutionAccumulator),
ngram search indexes, materialized views, multitenancy, and retention.

Companion doc proposes append-only execution protocol for the agent repo.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-31 16:36:22 +02:00
979 changed files with 82777 additions and 13335 deletions

View File

@@ -0,0 +1,183 @@
---
paths:
- "cameleer-server-app/**"
---
# App Module Key Classes
`cameleer-server-app/src/main/java/com/cameleer/server/app/`
## URL taxonomy
User-facing data and config endpoints live under `/api/v1/environments/{envSlug}/...`. Env is a path segment, never a query param. The `envSlug` is resolved to an `Environment` bean via the `@EnvPath` argument resolver (`web/EnvironmentPathResolver.java`) — 404 on unknown slug.
**Slugs are immutable after creation** for both environments and apps. Slug regex: `^[a-z0-9][a-z0-9-]{0,63}$`. Validated in `EnvironmentService.create` and `AppService.createApp`. Update endpoints (`PUT`) do not accept a slug field; Jackson drops it as an unknown property.
### Flat-endpoint allow-list
These paths intentionally stay flat (no `/environments/{envSlug}` prefix). Every new endpoint should be env-scoped unless it appears here and the reason is documented.
| Path prefix | Why flat |
|---|---|
| `/api/v1/data/**` | Agent ingestion. JWT `env` claim is authoritative; URL-embedded env would invite spoofing. |
| `/api/v1/agents/register`, `/refresh`, `/{id}/heartbeat`, `/{id}/events` (SSE), `/{id}/deregister`, `/{id}/commands`, `/{id}/commands/{id}/ack`, `/{id}/replay` | Agent self-service; JWT-bound. |
| `/api/v1/agents/commands`, `/api/v1/agents/groups/{group}/commands` | Operator fan-out; target scope is explicit in query params. |
| `/api/v1/agents/config` | Agent-authoritative config read; JWT → registry → (app, env). |
| `/api/v1/admin/{users,roles,groups,oidc,license,audit,rbac/stats,claim-mappings,thresholds,sensitive-keys,usage,clickhouse,database,environments,outbound-connections}` | Truly cross-env admin. Env CRUD URLs use `{envSlug}`, not UUID. |
| `/api/v1/catalog`, `/api/v1/catalog/{applicationId}` | Cross-env discovery is the purpose. Env is an optional filter via `?environment=`. |
| `/api/v1/executions/{execId}`, `/processors/**` | Exchange IDs are globally unique; permalinks. |
| `/api/v1/diagrams/{contentHash}/render`, `POST /api/v1/diagrams/render` | Content-addressed or stateless. |
| `/api/v1/alerts/notifications/{id}/retry` | Notification IDs are globally unique; no env routing needed. |
| `/api/v1/auth/**` | Pre-auth; no env context exists. |
| `/api/v1/health`, `/prometheus`, `/api-docs/**`, `/swagger-ui/**` | Server metadata. |
## Tenant isolation invariant
ClickHouse is shared across tenants. Every ClickHouse query must filter by `tenant_id` (from `CAMELEER_SERVER_TENANT_ID` env var, resolved via `TenantContext`/config) in addition to `environment`. New controllers added under `/environments/{envSlug}/...` must preserve this — the env filter from the path does not replace the tenant filter.
## controller/ — REST endpoints
### Env-scoped (user-facing data & config)
- `AppController``/api/v1/environments/{envSlug}/apps`. GET list / POST create / GET `{appSlug}` / DELETE `{appSlug}` / GET `{appSlug}/versions` / POST `{appSlug}/versions` (JAR upload) / PUT `{appSlug}/container-config`. App slug uniqueness is per-env (`(env, app_slug)` is the natural key). `CreateAppRequest` body has no env (path), validates slug regex.
- `DeploymentController``/api/v1/environments/{envSlug}/apps/{appSlug}/deployments`. GET list / POST create (body `{ appVersionId }`) / POST `{id}/stop` / POST `{id}/promote` (body `{ targetEnvironment: slug }` — target app slug must exist in target env) / GET `{id}/logs`.
- `ApplicationConfigController``/api/v1/environments/{envSlug}`. GET `/config` (list), GET/PUT `/apps/{appSlug}/config`, GET `/apps/{appSlug}/processor-routes`, POST `/apps/{appSlug}/config/test-expression`. PUT also pushes `CONFIG_UPDATE` to LIVE agents in this env.
- `AppSettingsController``/api/v1/environments/{envSlug}`. GET `/app-settings` (list), GET/PUT/DELETE `/apps/{appSlug}/settings`. ADMIN/OPERATOR only.
- `SearchController``/api/v1/environments/{envSlug}`. GET `/executions`, POST `/executions/search`, GET `/stats`, `/stats/timeseries`, `/stats/timeseries/by-app`, `/stats/timeseries/by-route`, `/stats/punchcard`, `/attributes/keys`, `/errors/top`.
- `LogQueryController` — GET `/api/v1/environments/{envSlug}/logs` (filters: source (multi, comma-split, OR-joined), level (multi, comma-split, OR-joined), application, agentId, exchangeId, logger, q, time range; sort asc/desc). Cursor-paginated, returns `{ data, nextCursor, hasMore, levelCounts }`; cursor is base64url of `"{timestampIso}|{insert_id_uuid}"` — same-millisecond tiebreak via the `insert_id` UUID column on `logs`.
- `RouteCatalogController` — GET `/api/v1/environments/{envSlug}/routes` (merged route catalog from registry + ClickHouse; env filter unconditional).
- `RouteMetricsController` — GET `/api/v1/environments/{envSlug}/routes/metrics`, GET `/api/v1/environments/{envSlug}/routes/metrics/processors`.
- `AgentListController` — GET `/api/v1/environments/{envSlug}/agents` (registered agents with runtime metrics, filtered to env).
- `AgentEventsController` — GET `/api/v1/environments/{envSlug}/agents/events` (lifecycle events; cursor-paginated, returns `{ data, nextCursor, hasMore }`; order `(timestamp DESC, insert_id DESC)`; cursor is base64url of `"{timestampIso}|{insert_id_uuid}"``insert_id` is a stable UUID column used as a same-millisecond tiebreak).
- `AgentMetricsController` — GET `/api/v1/environments/{envSlug}/agents/{agentId}/metrics` (JVM/Camel metrics). Rejects cross-env agents (404) as defence-in-depth.
- `DiagramRenderController` — GET `/api/v1/environments/{envSlug}/apps/{appSlug}/routes/{routeId}/diagram` (env-scoped lookup). Also GET `/api/v1/diagrams/{contentHash}/render` (flat — content hashes are globally unique).
- `AlertRuleController``/api/v1/environments/{envSlug}/alerts/rules`. GET list / POST create / GET `{id}` / PUT `{id}` / DELETE `{id}` / POST `{id}/enable` / POST `{id}/disable` / POST `{id}/render-preview` / POST `{id}/test-evaluate`. OPERATOR+ for mutations, VIEWER+ for reads. CRITICAL: attribute keys in `ExchangeMatchCondition.filter.attributes` are validated at rule-save time against `^[a-zA-Z0-9._-]+$` — they are later inlined into ClickHouse SQL. Webhook validation: verifies `outboundConnectionId` exists and `isAllowedInEnvironment`. Null notification templates default to `""` (NOT NULL constraint). Audit: `ALERT_RULE_CHANGE`.
- `AlertController``/api/v1/environments/{envSlug}/alerts`. GET list (inbox filtered by userId/groupIds/roleNames via `InAppInboxQuery`) / GET `/unread-count` / GET `{id}` / POST `{id}/ack` / POST `{id}/read` / POST `/bulk-read`. VIEWER+ for all. Inbox SQL: `? = ANY(target_user_ids) OR target_group_ids && ? OR target_role_names && ?` — requires at least one matching target (no broadcast concept).
- `AlertSilenceController``/api/v1/environments/{envSlug}/alerts/silences`. GET list / POST create / DELETE `{id}`. 422 if `endsAt <= startsAt`. OPERATOR+ for mutations, VIEWER+ for list. Audit: `ALERT_SILENCE_CHANGE`.
- `AlertNotificationController` — Dual-path (no class-level prefix). GET `/api/v1/environments/{envSlug}/alerts/{alertId}/notifications` (VIEWER+); POST `/api/v1/alerts/notifications/{id}/retry` (OPERATOR+, flat — notification IDs globally unique). Retry resets attempts to 0 and sets `nextAttemptAt = now`.
### Env admin (env-slug-parameterized, not env-scoped data)
- `EnvironmentAdminController``/api/v1/admin/environments`. GET list / POST create / GET `{envSlug}` / PUT `{envSlug}` / DELETE `{envSlug}` / PUT `{envSlug}/default-container-config` / PUT `{envSlug}/jar-retention`. Slug immutable — PUT body has no slug field; any slug supplied is dropped by Jackson. Slug validated on POST.
### Agent-only (JWT-authoritative, intentionally flat)
- `AgentRegistrationController` — POST `/register` (requires `environmentId` in body; 400 if missing), POST `/{id}/refresh` (rejects tokens with no `env` claim), POST `/{id}/heartbeat` (env from body preferred, JWT fallback; 400 if neither), POST `/{id}/deregister`.
- `AgentSseController` — GET `/{id}/events` (SSE connection).
- `AgentCommandController` — POST `/{agentId}/commands`, POST `/groups/{group}/commands`, POST `/commands` (broadcast), POST `/{agentId}/commands/{commandId}/ack`, POST `/{agentId}/replay`.
- `AgentConfigController` — GET `/api/v1/agents/config`. Agent-authoritative config read: resolves (app, env) from JWT subject → registry (registry miss falls back to JWT env claim; no registry entry → 404 since application can't be derived).
### Ingestion (agent-only, JWT-authoritative)
- `LogIngestionController` — POST `/api/v1/data/logs` (accepts `List<LogEntry>`; WARNs on missing identity, unregistered agents, empty payloads, buffer-full drops).
- `EventIngestionController` — POST `/api/v1/data/events`.
- `ChunkIngestionController` — POST `/api/v1/ingestion/chunk/{executions|metrics|diagrams}`.
- `ExecutionController` — POST `/api/v1/data/executions` (legacy ingestion path when ClickHouse disabled).
- `MetricsController` — POST `/api/v1/data/metrics`.
- `DiagramController` — POST `/api/v1/data/diagrams` (resolves applicationId + environment from the agent registry keyed on JWT subject; stamps both on the stored `TaggedDiagram`).
### Cross-env discovery (flat)
- `CatalogController` — GET `/api/v1/catalog` (merges managed apps + in-memory agents + CH stats; optional `?environment=` filter). DELETE `/api/v1/catalog/{applicationId}` (ADMIN: dismiss app, purge all CH data + PG record).
### Admin (cross-env, flat)
- `UserAdminController` — CRUD `/api/v1/admin/users`, POST `/{id}/roles`, POST `/{id}/set-password`.
- `RoleAdminController` — CRUD `/api/v1/admin/roles`.
- `GroupAdminController` — CRUD `/api/v1/admin/groups`.
- `OidcConfigAdminController` — GET/POST `/api/v1/admin/oidc`, POST `/test`.
- `OutboundConnectionAdminController``/api/v1/admin/outbound-connections`. GET list / POST create / GET `{id}` / PUT `{id}` / DELETE `{id}` / POST `{id}/test` / GET `{id}/usage`. RBAC: list/get/usage ADMIN|OPERATOR; mutations + test ADMIN.
- `SensitiveKeysAdminController` — GET/PUT `/api/v1/admin/sensitive-keys`. GET returns 200 or 204 if not configured. PUT accepts `{ keys: [...] }` with optional `?pushToAgents=true`. Fan-out iterates every distinct `(application, environment)` slice — intentional global baseline + per-env overrides.
- `ClaimMappingAdminController` — CRUD `/api/v1/admin/claim-mappings`, POST `/test`.
- `LicenseAdminController` — GET/POST `/api/v1/admin/license`.
- `ThresholdAdminController` — CRUD `/api/v1/admin/thresholds`.
- `AuditLogController` — GET `/api/v1/admin/audit`.
- `RbacStatsController` — GET `/api/v1/admin/rbac/stats`.
- `UsageAnalyticsController` — GET `/api/v1/admin/usage` (ClickHouse `usage_events`).
- `ClickHouseAdminController` — GET `/api/v1/admin/clickhouse/**` (conditional on `infrastructureendpoints` flag).
- `DatabaseAdminController` — GET `/api/v1/admin/database/**` (conditional on `infrastructureendpoints` flag).
### Other (flat)
- `DetailController` — GET `/api/v1/executions/{executionId}` + processor snapshot endpoints.
- `MetricsController` — exposes `/api/v1/metrics` and `/api/v1/prometheus` (server-side Prometheus scrape endpoint).
## runtime/ — Docker orchestration
- `DockerRuntimeOrchestrator` — implements RuntimeOrchestrator; Docker Java client (zerodep transport), container lifecycle
- `DeploymentExecutor`@Async staged deploy: PRE_FLIGHT -> PULL_IMAGE -> CREATE_NETWORK -> START_REPLICAS -> HEALTH_CHECK -> SWAP_TRAFFIC -> COMPLETE. Container names are `{tenantId}-{envSlug}-{appSlug}-{replicaIndex}` (globally unique on Docker daemon). Sets per-replica `CAMELEER_AGENT_INSTANCEID` env var to `{envSlug}-{appSlug}-{replicaIndex}`.
- `DockerNetworkManager` — ensures bridge networks (cameleer-traefik, cameleer-env-{slug}), connects containers
- `DockerEventMonitor` — persistent Docker event stream listener (die, oom, start, stop), updates deployment status
- `TraefikLabelBuilder` — generates Traefik Docker labels for path-based or subdomain routing. Also emits `cameleer.replica` and `cameleer.instance-id` labels per container for labels-first identity.
- `PrometheusLabelBuilder` — generates Prometheus Docker labels (`prometheus.scrape/path/port`) per runtime type for `docker_sd_configs` auto-discovery
- `ContainerLogForwarder` — streams Docker container stdout/stderr to ClickHouse with `source='container'`. One follow-stream thread per container, batches lines every 2s/50 lines via `ClickHouseLogStore.insertBufferedBatch()`. 60-second max capture timeout.
- `DisabledRuntimeOrchestrator` — no-op when runtime not enabled
## metrics/ — Prometheus observability
- `ServerMetrics` — centralized business metrics: gauges (agents by state, SSE connections, buffer depths), counters (ingestion drops, agent transitions, deployment outcomes, auth failures), timers (flush duration, deployment duration). Exposed via `/api/v1/prometheus`.
## storage/ — PostgreSQL repositories (JdbcTemplate)
- `PostgresAppRepository`, `PostgresAppVersionRepository`, `PostgresEnvironmentRepository`
- `PostgresDeploymentRepository` — includes JSONB replica_states, deploy_stage, findByContainerId
- `PostgresUserRepository`, `PostgresRoleRepository`, `PostgresGroupRepository`
- `PostgresAuditRepository`, `PostgresOidcConfigRepository`, `PostgresClaimMappingRepository`, `PostgresSensitiveKeysRepository`
- `PostgresAppSettingsRepository`, `PostgresApplicationConfigRepository`, `PostgresThresholdRepository`. Both `app_settings` and `application_config` are env-scoped (PK `(app_id, environment)` / `(application, environment)`); finders take `(app, env)` — no env-agnostic variants.
## storage/ — ClickHouse stores
- `ClickHouseExecutionStore`, `ClickHouseMetricsStore`, `ClickHouseMetricsQueryStore`
- `ClickHouseStatsStore` — pre-aggregated stats, punchcard
- `ClickHouseDiagramStore`, `ClickHouseAgentEventRepository`
- `ClickHouseUsageTracker` — usage_events for billing
- `ClickHouseRouteCatalogStore` — persistent route catalog with first_seen cache, warm-loaded on startup
## search/ — ClickHouse search and log stores
- `ClickHouseLogStore` — log storage and query, MDC-based exchange/processor correlation
- `ClickHouseSearchIndex` — full-text search
## security/ — Spring Security
- `SecurityConfig` — WebSecurityFilterChain, JWT filter, CORS, OIDC conditional. `/api/v1/admin/outbound-connections/**` GETs permit OPERATOR in addition to ADMIN (defense-in-depth at controller level); mutations remain ADMIN-only. Alerting matchers: GET `/environments/*/alerts/**` VIEWER+; POST/PUT/DELETE rules and silences OPERATOR+; ack/read/bulk-read VIEWER+; POST `/alerts/notifications/*/retry` OPERATOR+.
- `JwtAuthenticationFilter` — OncePerRequestFilter, validates Bearer tokens
- `JwtServiceImpl` — HMAC-SHA256 JWT (Nimbus JOSE)
- `OidcAuthController` — /api/v1/auth/oidc (login-uri, token-exchange, logout)
- `OidcTokenExchanger` — code -> tokens, role extraction from access_token then id_token
- `OidcProviderHelper` — OIDC discovery, JWK source cache
## agent/ — Agent lifecycle
- `SseConnectionManager` — manages per-agent SSE connections, delivers commands
- `AgentLifecycleMonitor`@Scheduled 10s, LIVE->STALE->DEAD transitions
- `SsePayloadSigner` — Ed25519 signs SSE payloads for agent verification
## retention/ — JAR cleanup
- `JarRetentionJob`@Scheduled 03:00 daily, per-environment retention, skips deployed versions
## http/ — Outbound HTTP client implementation
- `SslContextBuilder` — composes SSL context from `OutboundHttpProperties` + `OutboundHttpRequestContext`. Supports SYSTEM_DEFAULT (JDK roots + configured CA extras), TRUST_ALL (short-circuit no-op TrustManager), TRUST_PATHS (JDK roots + system extras + per-request extras). Throws `IllegalArgumentException("CA file not found: ...")` on missing PEM.
- `ApacheOutboundHttpClientFactory` — Apache HttpClient 5 impl of `OutboundHttpClientFactory`. Memoizes clients per `CacheKey(trustAll, caPaths, mode, connectTimeout, readTimeout)`. Applies `NoopHostnameVerifier` when trust-all is active.
- `config/OutboundHttpConfig``@ConfigurationProperties("cameleer.server.outbound-http")`. Exposes beans: `OutboundHttpProperties`, `SslContextBuilder`, `OutboundHttpClientFactory`. `@PostConstruct` logs WARN on trust-all and throws if configured CA paths don't exist.
## outbound/ — Admin-managed outbound connections (implementation)
- `crypto/SecretCipher` — AES-GCM symmetric cipher with key derived via HMAC-SHA256(jwtSecret, "cameleer-outbound-secret-v1"). Ciphertext format: base64(IV(12 bytes) || GCM output with 128-bit tag). `encrypt` throws `IllegalStateException`; `decrypt` throws `IllegalArgumentException` on tamper/wrong-key/malformed.
- `storage/PostgresOutboundConnectionRepository` — JdbcTemplate impl. `save()` upserts by id; JSONB serialization via ObjectMapper; UUID arrays via `ConnectionCallback`. Reads `created_by`/`updated_by` as String (= users.user_id TEXT).
- `OutboundConnectionServiceImpl` — service layer. Tenant bound at construction via `cameleer.server.tenant.id` property. Uniqueness check via `findByName`. Narrowing-envs guard: rejects update that removes envs while rules reference the connection (rulesReferencing stubbed in Plan 01, wired in Plan 02). Delete guard: rejects if referenced by rules.
- `controller/OutboundConnectionAdminController` — REST controller. Class-level `@PreAuthorize("hasRole('ADMIN')")` defaults; GETs relaxed to ADMIN|OPERATOR. Extracts acting user id from `SecurityContextHolder.authentication.name`, strips "user:" prefix. Audit via `AuditCategory.OUTBOUND_CONNECTION_CHANGE`.
- `dto/OutboundConnectionRequest` — Bean Validation: `@NotBlank` name, `@Pattern("^https://.+")` url, `@NotNull` method/tlsTrustMode/auth. Compact ctor throws `IllegalArgumentException` if TRUST_PATHS with empty paths list.
- `dto/OutboundConnectionDto` — response DTO. `hmacSecretSet: boolean` instead of the ciphertext; `authKind: OutboundAuthKind` instead of the full auth config.
- `dto/OutboundConnectionTestResult` — result of POST `/{id}/test`: status, latencyMs, responseSnippet (first 512 chars), tlsProtocol/cipherSuite/peerCertSubject (protocol is "TLS" stub; enriched in Plan 02 follow-up), error (nullable).
- `config/OutboundBeanConfig` — registers `OutboundConnectionRepository`, `SecretCipher`, `OutboundConnectionService` beans.
## config/ — Spring beans
- `RuntimeOrchestratorAutoConfig` — conditional Docker/Disabled orchestrator + NetworkManager + EventMonitor
- `RuntimeBeanConfig` — DeploymentExecutor, AppService, EnvironmentService
- `SecurityBeanConfig` — JwtService, Ed25519, BootstrapTokenValidator
- `StorageBeanConfig` — all repositories
- `ClickHouseConfig` — ClickHouse JdbcTemplate, schema initializer

24
.claude/rules/cicd.md Normal file
View File

@@ -0,0 +1,24 @@
---
paths:
- ".gitea/**"
- "deploy/**"
- "Dockerfile"
- "docker-entrypoint.sh"
---
# CI/CD & Deployment
- CI workflow: `.gitea/workflows/ci.yml` — build -> docker -> deploy on push to main or feature branches
- Build step skips integration tests (`-DskipITs`) — Testcontainers needs Docker daemon
- Docker: multi-stage build (`Dockerfile`), `$BUILDPLATFORM` for native Maven on ARM64 runner, amd64 runtime. `docker-entrypoint.sh` imports `/certs/ca.pem` into JVM truststore before starting the app (supports custom CAs for OIDC discovery without `CAMELEER_SERVER_SECURITY_OIDCTLSSKIPVERIFY`).
- `REGISTRY_TOKEN` build arg required for `cameleer-common` dependency resolution
- Registry: `gitea.siegeln.net/cameleer/cameleer-server` (container images)
- K8s manifests in `deploy/` — Kustomize base + overlays (main/feature), shared infra (PostgreSQL, ClickHouse, Logto) as top-level manifests
- Deployment target: k3s at 192.168.50.86, namespace `cameleer` (main), `cam-<slug>` (feature branches)
- Feature branches: isolated namespace, PG schema; Traefik Ingress at `<slug>-api.cameleer.siegeln.net`
- Secrets managed in CI deploy step (idempotent `--dry-run=client | kubectl apply`): `cameleer-auth`, `cameleer-postgres-credentials`, `cameleer-clickhouse-credentials`
- K8s probes: server uses `/api/v1/health`, PostgreSQL uses `pg_isready -U "$POSTGRES_USER"` (env var, not hardcoded)
- K8s security: server and database pods run with `securityContext.runAsNonRoot`. UI (nginx) runs without securityContext (needs root for entrypoint setup).
- Docker: server Dockerfile has no default credentials — all DB config comes from env vars at runtime
- Docker build uses buildx registry cache + `--provenance=false` for Gitea compatibility
- CI: branch slug sanitization extracted to `.gitea/sanitize-branch.sh`, sourced by docker and deploy-feature jobs

View File

@@ -0,0 +1,114 @@
---
paths:
- "cameleer-server-core/**"
---
# Core Module Key Classes
`cameleer-server-core/src/main/java/com/cameleer/server/core/`
## agent/ — Agent lifecycle and commands
- `AgentRegistryService` — in-memory registry (ConcurrentHashMap), register/heartbeat/lifecycle
- `AgentInfo` — record: id, name, application, environmentId, version, routeIds, capabilities, state
- `AgentCommand` — record: id, type, targetAgent, payload, createdAt, expiresAt
- `AgentEventService` — records agent state changes, heartbeats
- `AgentState` — enum: LIVE, STALE, DEAD, SHUTDOWN
- `CommandType` — enum for command types (config-update, deep-trace, replay, route-control, etc.)
- `CommandStatus` — enum for command acknowledgement states
- `CommandReply` — record: command execution result from agent
- `AgentEventRecord`, `AgentEventRepository` — event persistence. `AgentEventRepository.queryPage(...)` is cursor-paginated (`AgentEventPage{data, nextCursor, hasMore}`); the legacy non-paginated `query(...)` path is gone.
- `AgentEventPage` — record: `(List<AgentEventRecord> data, String nextCursor, boolean hasMore)` returned by `AgentEventRepository.queryPage`
- `AgentEventListener` — callback interface for agent events
- `RouteStateRegistry` — tracks per-agent route states
## runtime/ — App/Environment/Deployment domain
- `App` — record: id, environmentId, slug, displayName, containerConfig (JSONB)
- `AppVersion` — record: id, appId, version, jarPath, detectedRuntimeType, detectedMainClass
- `Environment` — record: id, slug, jarRetentionCount
- `Deployment` — record: id, appId, appVersionId, environmentId, status, targetState, deploymentStrategy, replicaStates (JSONB), deployStage, containerId, containerName
- `DeploymentStatus` — enum: STOPPED, STARTING, RUNNING, DEGRADED, STOPPING, FAILED
- `DeployStage` — enum: PRE_FLIGHT, PULL_IMAGE, CREATE_NETWORK, START_REPLICAS, HEALTH_CHECK, SWAP_TRAFFIC, COMPLETE
- `DeploymentService` — createDeployment (deletes terminal deployments first), markRunning, markFailed, markStopped
- `RuntimeType` — enum: AUTO, SPRING_BOOT, QUARKUS, PLAIN_JAVA, NATIVE
- `RuntimeDetector` — probes JAR files at upload time: detects runtime from manifest Main-Class (Spring Boot loader, Quarkus entry point, plain Java) or native binary (non-ZIP magic bytes)
- `ContainerRequest` — record: 20 fields for Docker container creation (includes runtimeType, customArgs, mainClass)
- `ContainerStatus` — record: state, running, exitCode, error
- `ResolvedContainerConfig` — record: typed config with memoryLimitMb, memoryReserveMb, cpuRequest, cpuLimit, appPort, exposedPorts, customEnvVars, stripPathPrefix, sslOffloading, routingMode, routingDomain, serverUrl, replicas, deploymentStrategy, routeControlEnabled, replayEnabled, runtimeType, customArgs, extraNetworks
- `RoutingMode` — enum for routing strategies
- `ConfigMerger` — pure function: resolve(globalDefaults, envConfig, appConfig) -> ResolvedContainerConfig
- `RuntimeOrchestrator` — interface: startContainer, stopContainer, getContainerStatus, getLogs, startLogCapture, stopLogCapture
- `AppRepository`, `AppVersionRepository`, `EnvironmentRepository`, `DeploymentRepository` — repository interfaces
- `AppService`, `EnvironmentService` — domain services
## search/ — Execution search and stats
- `SearchService` — search, count, stats, statsForApp, statsForRoute, timeseries, timeseriesForApp, timeseriesForRoute, timeseriesGroupedByApp, timeseriesGroupedByRoute, slaCompliance, slaCountsByApp, slaCountsByRoute, topErrors, activeErrorTypes, punchcard, distinctAttributeKeys. `statsForRoute`/`timeseriesForRoute` take `(routeId, applicationId)` — app filter is applied to `stats_1m_route`.
- `SearchRequest` / `SearchResult` — search DTOs
- `ExecutionStats`, `ExecutionSummary` — stats aggregation records
- `StatsTimeseries`, `TopError` — timeseries and error DTOs
- `LogSearchRequest` / `LogSearchResponse` — log search DTOs. `LogSearchRequest.sources` / `levels` are `List<String>` (null-normalized, multi-value OR); `cursor` + `limit` + `sort` drive keyset pagination. Response carries `nextCursor` + `hasMore` + per-level `levelCounts`.
## storage/ — Storage abstractions
- `ExecutionStore`, `MetricsStore`, `MetricsQueryStore`, `StatsStore`, `DiagramStore`, `RouteCatalogStore`, `SearchIndex`, `LogIndex` — interfaces
- `RouteCatalogEntry` — record: applicationId, routeId, environment, firstSeen, lastSeen
- `LogEntryResult` — log query result record
- `model/``ExecutionDocument`, `MetricTimeSeries`, `MetricsSnapshot`
## rbac/ — Role-based access control
- `RbacService` — interface: role/group CRUD, assignRoleToUser, removeRoleFromUser, addUserToGroup, removeUserFromGroup, getDirectRolesForUser, getEffectiveRolesForUser, clearManagedAssignments, assignManagedRole, addUserToManagedGroup, getStats, listUsers
- `SystemRole` — enum: AGENT, VIEWER, OPERATOR, ADMIN; `normalizeScope()` maps scopes
- `UserDetail`, `RoleDetail`, `GroupDetail` — records
- `UserSummary`, `RoleSummary`, `GroupSummary` — lightweight list records
- `RbacStats` — aggregate stats record
- `AssignmentOrigin` — enum: DIRECT, CLAIM_MAPPING (tracks how roles were assigned)
- `ClaimMappingRule` — record: OIDC claim-to-role mapping rule
- `ClaimMappingService` — interface: CRUD for claim mapping rules
- `ClaimMappingRepository` — persistence interface
- `RoleRepository`, `GroupRepository` — persistence interfaces
## admin/ — Server-wide admin config
- `SensitiveKeysConfig` — record: keys (List<String>, immutable)
- `SensitiveKeysRepository` — interface: find(), save()
- `SensitiveKeysMerger` — pure function: merge(global, perApp) -> union with case-insensitive dedup, preserves first-seen casing. Returns null when both inputs null.
- `AppSettings`, `AppSettingsRepository` — per-app-per-env settings config and persistence. Record carries `(applicationId, environment, …)`; repository methods are `findByApplicationAndEnvironment`, `findByEnvironment`, `save`, `delete(appId, env)`. `AppSettings.defaults(appId, env)` produces a default instance scoped to an environment.
- `ThresholdConfig`, `ThresholdRepository` — alerting threshold config and persistence
- `AuditService` — audit logging facade
- `AuditRecord`, `AuditResult`, `AuditCategory` (enum: `INFRA, AUTH, USER_MGMT, CONFIG, RBAC, AGENT, OUTBOUND_CONNECTION_CHANGE, OUTBOUND_HTTP_TRUST_CHANGE`), `AuditRepository` — audit trail records and persistence
## http/ — Outbound HTTP primitives (cross-cutting)
- `OutboundHttpClientFactory` — interface: `clientFor(context)` returns memoized `CloseableHttpClient`
- `OutboundHttpProperties` — record: `trustAll, trustedCaPemPaths, defaultConnectTimeout, defaultReadTimeout, proxyUrl, proxyUsername, proxyPassword`
- `OutboundHttpRequestContext` — record of per-call TLS/timeout overrides; `systemDefault()` static factory
- `TrustMode` — enum: `SYSTEM_DEFAULT | TRUST_ALL | TRUST_PATHS`
## outbound/ — Admin-managed outbound connections
- `OutboundConnection` — record: id, tenantId, name, description, url, method, defaultHeaders, defaultBodyTmpl, tlsTrustMode, tlsCaPemPaths, hmacSecretCiphertext, auth, allowedEnvironmentIds, createdAt, createdBy (String user_id), updatedAt, updatedBy (String user_id). `isAllowedInEnvironment(envId)` returns true when allowed-envs list is empty OR contains the env.
- `OutboundAuth` — sealed interface + records: `None | Bearer(tokenCiphertext) | Basic(username, passwordCiphertext)`. Jackson `@JsonTypeInfo(use = DEDUCTION)` — wire shape has no discriminator, subtype inferred from fields.
- `OutboundAuthKind`, `OutboundMethod` — enums
- `OutboundConnectionRepository` — CRUD by (tenantId, id): save/findById/findByName/listByTenant/delete
- `OutboundConnectionService` — create/update/delete/get/list with uniqueness + narrow-envs + delete-if-referenced guards. `rulesReferencing(id)` stubbed in Plan 01 (returns `[]`); populated in Plan 02 against `AlertRuleRepository`.
## security/ — Auth
- `JwtService` — interface: createAccessToken, createRefreshToken, validateAccessToken, validateRefreshToken
- `Ed25519SigningService` — interface: sign, getPublicKeyBase64 (config signing)
- `OidcConfig` — record: enabled, issuerUri, clientId, clientSecret, rolesClaim, defaultRoles, autoSignup, displayNameClaim, userIdClaim, audience, additionalScopes
- `OidcConfigRepository` — persistence interface
- `PasswordPolicyValidator` — min 12 chars, 3-of-4 character classes, no username match
- `UserInfo`, `UserRepository` — user identity records and persistence
- `InvalidTokenException` — thrown on revoked/expired tokens
## ingestion/ — Buffered data pipeline
- `IngestionService` — ingestExecution, ingestMetric, ingestLog, ingestDiagram
- `ChunkAccumulator` — batches data for efficient flush
- `WriteBuffer` — bounded ring buffer for async flush
- `BufferedLogEntry` — log entry wrapper with metadata
- `MergedExecution`, `TaggedExecution`, `TaggedDiagram` — tagged ingestion records. `TaggedDiagram` carries `(instanceId, applicationId, environment, graph)` — env is resolved from the agent registry in the controller and stamped on the ClickHouse `route_diagrams` row.

View File

@@ -0,0 +1,76 @@
---
paths:
- "cameleer-server-app/**/runtime/**"
- "cameleer-server-core/**/runtime/**"
- "deploy/**"
- "docker-compose*.yml"
- "Dockerfile"
- "docker-entrypoint.sh"
---
# Docker Orchestration
When deployed via the cameleer-saas platform, this server orchestrates customer app containers using Docker. Key components:
- **ConfigMerger** (`core/runtime/ConfigMerger.java`) — pure function: resolve(globalDefaults, envConfig, appConfig) -> ResolvedContainerConfig. Three-layer merge: global (application.yml) -> environment (defaultContainerConfig JSONB) -> app (containerConfig JSONB). Includes `runtimeType` (default `"auto"`) and `customArgs` (default `""`).
- **TraefikLabelBuilder** (`app/runtime/TraefikLabelBuilder.java`) — generates Traefik Docker labels for path-based (`/{envSlug}/{appSlug}/`) or subdomain-based (`{appSlug}-{envSlug}.{domain}`) routing. Supports strip-prefix and SSL offloading toggles. Also sets per-replica identity labels: `cameleer.replica` (index) and `cameleer.instance-id` (`{envSlug}-{appSlug}-{replicaIndex}`). Internal processing uses labels (not container name parsing) for extensibility.
- **PrometheusLabelBuilder** (`app/runtime/PrometheusLabelBuilder.java`) — generates Prometheus `docker_sd_configs` labels per resolved runtime type: Spring Boot `/actuator/prometheus:8081`, Quarkus/native `/q/metrics:9000`, plain Java `/metrics:9464`. Labels merged into container metadata alongside Traefik labels at deploy time.
- **DockerNetworkManager** (`app/runtime/DockerNetworkManager.java`) — manages two Docker network tiers:
- `cameleer-traefik` — shared network; Traefik, server, and all app containers attach here. Server joined via docker-compose with `cameleer-server` DNS alias.
- `cameleer-env-{slug}` — per-environment isolated network; containers in the same environment discover each other via Docker DNS. In SaaS mode, env networks are tenant-scoped: `cameleer-env-{tenantId}-{envSlug}` (overloaded `envNetworkName(tenantId, envSlug)` method) to prevent cross-tenant collisions when multiple tenants have identically-named environments.
- **DockerEventMonitor** (`app/runtime/DockerEventMonitor.java`) — persistent Docker event stream listener for containers with `managed-by=cameleer-server` label. Detects die/oom/start/stop events and updates deployment replica states. Periodic reconciliation (@Scheduled every 30s) inspects actual container state and corrects deployment status mismatches (fixes stale DEGRADED with all replicas healthy).
- **DeploymentProgress** (`ui/src/components/DeploymentProgress.tsx`) — UI step indicator showing 7 deploy stages with amber active/green completed styling.
- **ContainerLogForwarder** (`app/runtime/ContainerLogForwarder.java`) — streams Docker container stdout/stderr to ClickHouse `logs` table with `source='container'`. Uses `docker logs --follow` per container, batches lines every 2s or 50 lines. Parses Docker timestamp prefix, infers log level via regex. `DeploymentExecutor` starts capture after each replica launches with the replica's `instanceId` (`{envSlug}-{appSlug}-{replicaIndex}`); `DockerEventMonitor` stops capture on die/oom. 60-second max capture timeout with 30s cleanup scheduler. Thread pool of 10 daemon threads. Container logs use the same `instanceId` as the agent (set via `CAMELEER_AGENT_INSTANCEID` env var) for unified log correlation at the instance level.
- **StartupLogPanel** (`ui/src/components/StartupLogPanel.tsx`) — collapsible log panel rendered below `DeploymentProgress`. Queries `/api/v1/logs?source=container&application={appSlug}&environment={envSlug}`. Auto-polls every 3s while deployment is STARTING; shows green "live" badge during polling, red "stopped" badge on FAILED. Uses `useStartupLogs` hook and `LogViewer` (design system).
## DeploymentExecutor Details
Primary network for app containers is set via `CAMELEER_SERVER_RUNTIME_DOCKERNETWORK` env var (in SaaS mode: `cameleer-tenant-{slug}`); apps also connect to `cameleer-traefik` (routing) and `cameleer-env-{tenantId}-{envSlug}` (per-environment discovery) as additional networks. Resolves `runtimeType: auto` to concrete type from `AppVersion.detectedRuntimeType` at PRE_FLIGHT (fails deployment if unresolvable). Builds Docker entrypoint per runtime type (all JVM types use `-javaagent:/app/agent.jar -jar`, plain Java uses `-cp` with main class, native runs binary directly). Sets per-replica `CAMELEER_AGENT_INSTANCEID` env var to `{envSlug}-{appSlug}-{replicaIndex}` so container logs and agent logs share the same instance identity. Sets `CAMELEER_AGENT_*` env vars from `ResolvedContainerConfig` (routeControlEnabled, replayEnabled, health port). These are startup-only agent properties — changing them requires redeployment.
## Deployment Status Model
| Status | Meaning |
|--------|---------|
| `STOPPED` | Intentionally stopped or initial state |
| `STARTING` | Deploy in progress |
| `RUNNING` | All replicas healthy and serving |
| `DEGRADED` | Some replicas healthy, some dead |
| `STOPPING` | Graceful shutdown in progress |
| `FAILED` | Terminal failure (pre-flight, health check, or crash) |
**Replica support**: deployments can specify a replica count. `DEGRADED` is used when at least one but not all replicas are healthy.
**Deploy stages** (`DeployStage`): PRE_FLIGHT -> PULL_IMAGE -> CREATE_NETWORK -> START_REPLICAS -> HEALTH_CHECK -> SWAP_TRAFFIC -> COMPLETE (or FAILED at any stage).
**Blue/green strategy**: when re-deploying, new replicas are started and health-checked before old ones are stopped, minimising downtime.
**Deployment uniqueness**: `DeploymentService.createDeployment()` deletes any STOPPED/FAILED deployments for the same app+environment before creating a new one, preventing duplicate rows.
## JAR Management
- **Retention policy** per environment: configurable maximum number of JAR versions to keep. Older JARs are deleted automatically.
- **Nightly cleanup job** (`JarRetentionJob`, Spring `@Scheduled` 03:00): purges JARs exceeding the retention limit and removes orphaned files not referenced by any app version. Skips versions currently deployed.
- **Volume-based JAR mounting** for Docker-in-Docker setups: set `CAMELEER_SERVER_RUNTIME_JARDOCKERVOLUME` to the Docker volume name that contains the JAR storage directory. When set, the orchestrator mounts this volume into the container instead of bind-mounting the host path (required when the SaaS container itself runs inside Docker and the host path is not accessible from sibling containers).
## Runtime Type Detection
The server detects the app framework from uploaded JARs and builds Docker entrypoints. The agent shaded JAR bundles the log appender, so no separate `cameleer-log-appender.jar` or `PropertiesLauncher` is needed:
- **Detection** (`RuntimeDetector`): runs at JAR upload time. Checks ZIP magic bytes (non-ZIP = native binary), then probes `META-INF/MANIFEST.MF` Main-Class: Spring Boot loader prefix -> `spring-boot`, Quarkus entry point -> `quarkus`, other Main-Class -> `plain-java` (extracts class name). Results stored on `AppVersion` (`detected_runtime_type`, `detected_main_class`).
- **Runtime types** (`RuntimeType` enum): `AUTO`, `SPRING_BOOT`, `QUARKUS`, `PLAIN_JAVA`, `NATIVE`. Configurable per app/environment via `containerConfig.runtimeType` (default `"auto"`).
- **Entrypoint per type**: All JVM types use `java -javaagent:/app/agent.jar -jar app.jar`. Plain Java uses `-cp` with explicit main class instead of `-jar`. Native runs the binary directly.
- **Custom arguments** (`containerConfig.customArgs`): freeform string appended to the start command. Validated against a strict pattern to prevent shell injection (entrypoint uses `sh -c`).
- **AUTO resolution**: at deploy time (PRE_FLIGHT), `"auto"` resolves to the detected type from `AppVersion`. Fails deployment if detection was unsuccessful — user must set type explicitly.
- **UI**: Resources tab shows Runtime Type dropdown (with detection hint from latest uploaded version) and Custom Arguments text field.
## SaaS Multi-Tenant Network Isolation
In SaaS mode, each tenant's server and its deployed apps are isolated at the Docker network level:
- **Tenant network** (`cameleer-tenant-{slug}`) — primary internal bridge for all of a tenant's containers. Set as `CAMELEER_SERVER_RUNTIME_DOCKERNETWORK` for the tenant's server instance. Tenant A's apps cannot reach tenant B's apps.
- **Shared services network** — server also connects to the shared infrastructure network (PostgreSQL, ClickHouse, Logto) and `cameleer-traefik` for HTTP routing.
- **Tenant-scoped environment networks** (`cameleer-env-{tenantId}-{envSlug}`) — per-environment discovery is scoped per tenant, so `alpha-corp`'s "dev" environment network is separate from `beta-corp`'s "dev" environment network.
## nginx / Reverse Proxy
- `client_max_body_size 200m` is required in the nginx config to allow JAR uploads up to 200 MB. Without this, large JAR uploads return 413.

98
.claude/rules/gitnexus.md Normal file
View File

@@ -0,0 +1,98 @@
# GitNexus — Code Intelligence
This project is indexed by GitNexus as **cameleer-server** (6306 symbols, 15892 relationships, 300 execution flows). Use the GitNexus MCP tools to understand code, assess impact, and navigate safely.
> If any GitNexus tool warns the index is stale, run `npx gitnexus analyze` in terminal first.
## Always Do
- **MUST run impact analysis before editing any symbol.** Before modifying a function, class, or method, run `gitnexus_impact({target: "symbolName", direction: "upstream"})` and report the blast radius (direct callers, affected processes, risk level) to the user.
- **MUST run `gitnexus_detect_changes()` before committing** to verify your changes only affect expected symbols and execution flows.
- **MUST warn the user** if impact analysis returns HIGH or CRITICAL risk before proceeding with edits.
- When exploring unfamiliar code, use `gitnexus_query({query: "concept"})` to find execution flows instead of grepping. It returns process-grouped results ranked by relevance.
- When you need full context on a specific symbol — callers, callees, which execution flows it participates in — use `gitnexus_context({name: "symbolName"})`.
## When Debugging
1. `gitnexus_query({query: "<error or symptom>"})` — find execution flows related to the issue
2. `gitnexus_context({name: "<suspect function>"})` — see all callers, callees, and process participation
3. `READ gitnexus://repo/cameleer-server/process/{processName}` — trace the full execution flow step by step
4. For regressions: `gitnexus_detect_changes({scope: "compare", base_ref: "main"})` — see what your branch changed
## When Refactoring
- **Renaming**: MUST use `gitnexus_rename({symbol_name: "old", new_name: "new", dry_run: true})` first. Review the preview — graph edits are safe, text_search edits need manual review. Then run with `dry_run: false`.
- **Extracting/Splitting**: MUST run `gitnexus_context({name: "target"})` to see all incoming/outgoing refs, then `gitnexus_impact({target: "target", direction: "upstream"})` to find all external callers before moving code.
- After any refactor: run `gitnexus_detect_changes({scope: "all"})` to verify only expected files changed.
## Never Do
- NEVER edit a function, class, or method without first running `gitnexus_impact` on it.
- NEVER ignore HIGH or CRITICAL risk warnings from impact analysis.
- NEVER rename symbols with find-and-replace — use `gitnexus_rename` which understands the call graph.
- NEVER commit changes without running `gitnexus_detect_changes()` to check affected scope.
## Tools Quick Reference
| Tool | When to use | Command |
|------|-------------|---------|
| `query` | Find code by concept | `gitnexus_query({query: "auth validation"})` |
| `context` | 360-degree view of one symbol | `gitnexus_context({name: "validateUser"})` |
| `impact` | Blast radius before editing | `gitnexus_impact({target: "X", direction: "upstream"})` |
| `detect_changes` | Pre-commit scope check | `gitnexus_detect_changes({scope: "staged"})` |
| `rename` | Safe multi-file rename | `gitnexus_rename({symbol_name: "old", new_name: "new", dry_run: true})` |
| `cypher` | Custom graph queries | `gitnexus_cypher({query: "MATCH ..."})` |
## Impact Risk Levels
| Depth | Meaning | Action |
|-------|---------|--------|
| d=1 | WILL BREAK — direct callers/importers | MUST update these |
| d=2 | LIKELY AFFECTED — indirect deps | Should test |
| d=3 | MAY NEED TESTING — transitive | Test if critical path |
## Resources
| Resource | Use for |
|----------|---------|
| `gitnexus://repo/cameleer-server/context` | Codebase overview, check index freshness |
| `gitnexus://repo/cameleer-server/clusters` | All functional areas |
| `gitnexus://repo/cameleer-server/processes` | All execution flows |
| `gitnexus://repo/cameleer-server/process/{name}` | Step-by-step execution trace |
## Self-Check Before Finishing
Before completing any code modification task, verify:
1. `gitnexus_impact` was run for all modified symbols
2. No HIGH/CRITICAL risk warnings were ignored
3. `gitnexus_detect_changes()` confirms changes match expected scope
4. All d=1 (WILL BREAK) dependents were updated
## Keeping the Index Fresh
After committing code changes, the GitNexus index becomes stale. Re-run analyze to update it:
```bash
npx gitnexus analyze
```
If the index previously included embeddings, preserve them by adding `--embeddings`:
```bash
npx gitnexus analyze --embeddings
```
To check whether embeddings exist, inspect `.gitnexus/meta.json` — the `stats.embeddings` field shows the count (0 means no embeddings). **Running analyze without `--embeddings` will delete any previously generated embeddings.**
> Claude Code users: A PostToolUse hook handles this automatically after `git commit` and `git merge`.
## CLI
| Task | Read this skill file |
|------|---------------------|
| Understand architecture / "How does X work?" | `.claude/skills/gitnexus/gitnexus-exploring/SKILL.md` |
| Blast radius / "What breaks if I change X?" | `.claude/skills/gitnexus/gitnexus-impact-analysis/SKILL.md` |
| Trace bugs / "Why is X failing?" | `.claude/skills/gitnexus/gitnexus-debugging/SKILL.md` |
| Rename / extract / split / refactor | `.claude/skills/gitnexus/gitnexus-refactoring/SKILL.md` |
| Tools, resources, schema reference | `.claude/skills/gitnexus/gitnexus-guide/SKILL.md` |
| Index, status, clean, wiki CLI commands | `.claude/skills/gitnexus/gitnexus-cli/SKILL.md` |

85
.claude/rules/metrics.md Normal file
View File

@@ -0,0 +1,85 @@
---
paths:
- "cameleer-server-app/**/metrics/**"
- "cameleer-server-app/**/ServerMetrics*"
- "ui/src/pages/RuntimeTab/**"
- "ui/src/pages/DashboardTab/**"
---
# Prometheus Metrics
Server exposes `/api/v1/prometheus` (unauthenticated, Prometheus text format). Spring Boot Actuator provides JVM, GC, thread pool, and `http.server.requests` metrics automatically. Business metrics via `ServerMetrics` component:
## Gauges (auto-polled)
| Metric | Tags | Source |
|--------|------|--------|
| `cameleer.agents.connected` | `state` (live, stale, dead, shutdown) | `AgentRegistryService.findByState()` |
| `cameleer.agents.sse.active` | — | `SseConnectionManager.getConnectionCount()` |
| `cameleer.ingestion.buffer.size` | `type` (execution, processor, log, metrics) | `WriteBuffer.size()` |
| `cameleer.ingestion.accumulator.pending` | — | `ChunkAccumulator.getPendingCount()` |
## Counters
| Metric | Tags | Instrumented in |
|--------|------|-----------------|
| `cameleer.ingestion.drops` | `reason` (buffer_full, no_agent, no_identity) | `LogIngestionController` |
| `cameleer.agents.transitions` | `transition` (went_stale, went_dead, recovered) | `AgentLifecycleMonitor` |
| `cameleer.deployments.outcome` | `status` (running, failed, degraded) | `DeploymentExecutor` |
| `cameleer.auth.failures` | `reason` (invalid_token, revoked, oidc_rejected) | `JwtAuthenticationFilter` |
## Timers
| Metric | Tags | Instrumented in |
|--------|------|-----------------|
| `cameleer.ingestion.flush.duration` | `type` (execution, processor, log) | `ExecutionFlushScheduler` |
| `cameleer.deployments.duration` | — | `DeploymentExecutor` |
## Agent container Prometheus labels (set by PrometheusLabelBuilder at deploy time)
| Runtime Type | `prometheus.path` | `prometheus.port` |
|---|---|---|
| `spring-boot` | `/actuator/prometheus` | `8081` |
| `quarkus` / `native` | `/q/metrics` | `9000` |
| `plain-java` | `/metrics` | `9464` |
All containers also get `prometheus.scrape=true`. These labels enable Prometheus `docker_sd_configs` auto-discovery.
## Agent Metric Names (Micrometer)
Agents send `MetricsSnapshot` records with Micrometer-convention metric names. The server stores them generically (ClickHouse `agent_metrics.metric_name`). The UI references specific names in `AgentInstance.tsx` for JVM charts.
### JVM metrics (used by UI)
| Metric name | UI usage |
|---|---|
| `process.cpu.usage.value` | CPU % stat card + chart |
| `jvm.memory.used.value` | Heap MB stat card + chart (tags: `area=heap`) |
| `jvm.memory.max.value` | Heap max for % calculation (tags: `area=heap`) |
| `jvm.threads.live.value` | Thread count chart |
| `jvm.gc.pause.total_time` | GC time chart |
### Camel route metrics (stored, queried by dashboard)
| Metric name | Type | Tags |
|---|---|---|
| `camel.exchanges.succeeded.count` | counter | `routeId`, `camelContext` |
| `camel.exchanges.failed.count` | counter | `routeId`, `camelContext` |
| `camel.exchanges.total.count` | counter | `routeId`, `camelContext` |
| `camel.exchanges.failures.handled.count` | counter | `routeId`, `camelContext` |
| `camel.route.policy.count` | count | `routeId`, `camelContext` |
| `camel.route.policy.total_time` | total | `routeId`, `camelContext` |
| `camel.route.policy.max` | gauge | `routeId`, `camelContext` |
| `camel.routes.running.value` | gauge | — |
Mean processing time = `camel.route.policy.total_time / camel.route.policy.count`. Min processing time is not available (Micrometer does not track minimums).
### Cameleer agent metrics
| Metric name | Type | Tags |
|---|---|---|
| `cameleer.chunks.exported.count` | counter | `instanceId` |
| `cameleer.chunks.dropped.count` | counter | `instanceId`, `reason` |
| `cameleer.sse.reconnects.count` | counter | `instanceId` |
| `cameleer.taps.evaluated.count` | counter | `instanceId` |
| `cameleer.metrics.exported.count` | counter | `instanceId` |

46
.claude/rules/ui.md Normal file
View File

@@ -0,0 +1,46 @@
---
paths:
- "ui/**"
---
# UI Structure
The UI has 4 main tabs: **Exchanges**, **Dashboard**, **Runtime**, **Deployments**.
- **Exchanges** — route execution search and detail (`ui/src/pages/Exchanges/`)
- **Dashboard** — metrics and stats with L1/L2/L3 drill-down (`ui/src/pages/DashboardTab/`)
- **Runtime** — live agent status, logs, commands (`ui/src/pages/RuntimeTab/`). AgentHealth supports compact view (dense health-tinted cards) and expanded view (full GroupCard+DataTable per app). View mode persisted to localStorage.
- **Deployments** — app management, JAR upload, deployment lifecycle (`ui/src/pages/AppsTab/`)
- Config sub-tabs: **Monitoring | Resources | Variables | Traces & Taps | Route Recording**
- Create app: full page at `/apps/new` (not a modal)
- Deployment progress: `ui/src/components/DeploymentProgress.tsx` (7-stage step indicator)
**Admin pages** (ADMIN-only, under `/admin/`):
- **Sensitive Keys** (`ui/src/pages/Admin/SensitiveKeysPage.tsx`) — global sensitive key masking config. Shows agent built-in defaults as outlined Badge reference, editable Tag pills for custom keys, amber-highlighted push-to-agents toggle. Keys add to (not replace) agent defaults. Per-app sensitive key additions managed via `ApplicationConfigController` API. Note: `AppConfigDetailPage.tsx` exists but is not routed in `router.tsx`.
## Key UI Files
- `ui/src/router.tsx` — React Router v6 routes
- `ui/src/config.ts` — apiBaseUrl, basePath
- `ui/src/auth/auth-store.ts` — Zustand: accessToken, user, roles, login/logout
- `ui/src/api/environment-store.ts` — Zustand: selected environment (localStorage)
- `ui/src/components/ContentTabs.tsx` — main tab switcher
- `ui/src/components/ExecutionDiagram/` — interactive trace view (canvas)
- `ui/src/components/ProcessDiagram/` — ELK-rendered route diagram
- `ui/src/hooks/useScope.ts` — TabKey type, scope inference
- `ui/src/components/StartupLogPanel.tsx` — deployment startup log viewer (container logs from ClickHouse, polls 3s while STARTING)
- `ui/src/api/queries/logs.ts``useStartupLogs` hook for container startup log polling, `useLogs`/`useApplicationLogs` for bounded log search (single page), `useInfiniteApplicationLogs` for streaming log views (cursor-paginated, server-side source/level filters)
- `ui/src/api/queries/agents.ts``useAgents` for agent list, `useInfiniteAgentEvents` for cursor-paginated timeline stream
- `ui/src/hooks/useInfiniteStream.ts` — tanstack `useInfiniteQuery` wrapper with top-gated auto-refetch, flattened `items[]`, and `refresh()` invalidator
- `ui/src/components/InfiniteScrollArea.tsx` — scrollable container with IntersectionObserver top/bottom sentinels. Streaming log/event views use this + `useInfiniteStream`. Bounded views (LogTab, StartupLogPanel) keep `useLogs`/`useStartupLogs`
## UI Styling
- Always use `@cameleer/design-system` CSS variables for colors (`var(--amber)`, `var(--error)`, `var(--success)`, etc.) — never hardcode hex values. This applies to CSS modules, inline styles, and SVG `fill`/`stroke` attributes. SVG presentation attributes resolve `var()` correctly. All colors use CSS variables (no hardcoded hex).
- Shared CSS modules in `ui/src/styles/` (table-section, log-panel, rate-colors, refresh-indicator, chart-card, section-card) — import these instead of duplicating patterns.
- Shared `PageLoader` component replaces copy-pasted spinner patterns.
- Design system components used consistently: `Select`, `Tabs`, `Toggle`, `Button`, `LogViewer`, `Label` — prefer DS components over raw HTML elements. `LogViewer` renders optional source badges (`container`, `app`, `agent`) via `LogEntry.source` field (DS v0.1.49+).
- Environment slugs are auto-computed from display name (read-only in UI).
- Brand assets: `@cameleer/design-system/assets/` provides `camel-logo.svg` (currentColor), `cameleer-{16,32,48,192,512}.png`, and `cameleer-logo.png`. Copied to `ui/public/` for use as favicon (`favicon-16.png`, `favicon-32.png`) and logo (`camel-logo.svg` — login dialog 36px, sidebar 28x24px).
- Sidebar generates `/exchanges/` paths directly (no legacy `/apps/` redirects). basePath is centralized in `ui/src/config.ts`; router.tsx imports it instead of re-reading `<base>` tag.
- Global user preferences (environment selection) use Zustand stores with localStorage persistence — never URL search params. URL params are for page-specific state only (e.g. `?text=` search query). Switching environment resets all filters and remounts pages.

11
.gitea/sanitize-branch.sh Normal file
View File

@@ -0,0 +1,11 @@
#!/bin/sh
# Shared branch slug sanitization for CI jobs.
# Strips prefix (feature/, fix/, etc.), lowercases, replaces non-alphanum, truncates to 20 chars.
sanitize_branch() {
echo "$1" | sed -E 's#^(feature|fix|feat|hotfix)/##' \
| tr '[:upper:]' '[:lower:]' \
| sed 's/[^a-z0-9-]/-/g' \
| sed 's/--*/-/g; s/^-//; s/-$//' \
| cut -c1-20 \
| sed 's/-$//'
}

View File

@@ -53,6 +53,7 @@ jobs:
npm run build
env:
REGISTRY_TOKEN: ${{ secrets.REGISTRY_TOKEN }}
VITE_APP_VERSION: ${{ github.sha }}
- name: Build and Test
run: mvn clean verify -DskipITs -U --batch-mode
@@ -78,14 +79,7 @@ jobs:
REGISTRY_TOKEN: ${{ secrets.REGISTRY_TOKEN }}
- name: Compute branch slug
run: |
sanitize_branch() {
echo "$1" | sed -E 's#^(feature|fix|feat|hotfix)/##' \
| tr '[:upper:]' '[:lower:]' \
| sed 's/[^a-z0-9-]/-/g' \
| sed 's/--*/-/g; s/^-//; s/-$//' \
| cut -c1-20 \
| sed 's/-$//'
}
. .gitea/sanitize-branch.sh
if [ "$GITHUB_REF_NAME" = "main" ]; then
echo "BRANCH_SLUG=main" >> "$GITHUB_ENV"
echo "IMAGE_TAGS=latest" >> "$GITHUB_ENV"
@@ -99,31 +93,33 @@ jobs:
- name: Build and push server
run: |
docker buildx create --use --name cibuilder
TAGS="-t gitea.siegeln.net/cameleer/cameleer3-server:${{ github.sha }}"
TAGS="-t gitea.siegeln.net/cameleer/cameleer-server:${{ github.sha }}"
for TAG in $IMAGE_TAGS; do
TAGS="$TAGS -t gitea.siegeln.net/cameleer/cameleer3-server:$TAG"
TAGS="$TAGS -t gitea.siegeln.net/cameleer/cameleer-server:$TAG"
done
docker buildx build --platform linux/amd64 \
--build-arg REGISTRY_TOKEN="$REGISTRY_TOKEN" \
$TAGS \
--cache-from type=registry,ref=gitea.siegeln.net/cameleer/cameleer3-server:buildcache \
--cache-to type=registry,ref=gitea.siegeln.net/cameleer/cameleer3-server:buildcache,mode=max \
--cache-from type=registry,ref=gitea.siegeln.net/cameleer/cameleer-server:buildcache \
--cache-to type=registry,ref=gitea.siegeln.net/cameleer/cameleer-server:buildcache,mode=max \
--provenance=false \
--push .
env:
REGISTRY_TOKEN: ${{ secrets.REGISTRY_TOKEN }}
- name: Build and push UI
run: |
TAGS="-t gitea.siegeln.net/cameleer/cameleer3-server-ui:${{ github.sha }}"
TAGS="-t gitea.siegeln.net/cameleer/cameleer-server-ui:${{ github.sha }}"
for TAG in $IMAGE_TAGS; do
TAGS="$TAGS -t gitea.siegeln.net/cameleer/cameleer3-server-ui:$TAG"
TAGS="$TAGS -t gitea.siegeln.net/cameleer/cameleer-server-ui:$TAG"
done
SHORT_SHA=$(echo "${{ github.sha }}" | cut -c1-7)
docker buildx build --platform linux/amd64 \
-f ui/Dockerfile \
--build-arg REGISTRY_TOKEN="$REGISTRY_TOKEN" \
--build-arg VITE_APP_VERSION="$SHORT_SHA" \
$TAGS \
--cache-from type=registry,ref=gitea.siegeln.net/cameleer/cameleer3-server-ui:buildcache \
--cache-to type=registry,ref=gitea.siegeln.net/cameleer/cameleer3-server-ui:buildcache,mode=max \
--cache-from type=registry,ref=gitea.siegeln.net/cameleer/cameleer-server-ui:buildcache \
--cache-to type=registry,ref=gitea.siegeln.net/cameleer/cameleer-server-ui:buildcache,mode=max \
--provenance=false \
--push ui/
env:
@@ -141,7 +137,7 @@ jobs:
if [ "$BRANCH_SLUG" != "main" ]; then
KEEP_TAGS="$KEEP_TAGS branch-$BRANCH_SLUG"
fi
for PKG in cameleer3-server cameleer3-server-ui; do
for PKG in cameleer-server cameleer-server-ui; do
curl -sf -H "$AUTH" "$API/packages/cameleer/container/$PKG" | \
jq -r '.[] | "\(.id) \(.version)"' | \
while read id version; do
@@ -196,49 +192,50 @@ jobs:
kubectl create secret generic cameleer-auth \
--namespace=cameleer \
--from-literal=CAMELEER_AUTH_TOKEN="$CAMELEER_AUTH_TOKEN" \
--from-literal=CAMELEER_UI_USER="${CAMELEER_UI_USER:-admin}" \
--from-literal=CAMELEER_UI_PASSWORD="${CAMELEER_UI_PASSWORD:-admin}" \
--from-literal=CAMELEER_JWT_SECRET="${CAMELEER_JWT_SECRET}" \
--from-literal=CAMELEER_SERVER_SECURITY_BOOTSTRAPTOKEN="$CAMELEER_AUTH_TOKEN" \
--from-literal=CAMELEER_SERVER_SECURITY_UIUSER="${CAMELEER_UI_USER:-admin}" \
--from-literal=CAMELEER_SERVER_SECURITY_UIPASSWORD="${CAMELEER_UI_PASSWORD:-admin}" \
--from-literal=CAMELEER_SERVER_SECURITY_JWTSECRET="${CAMELEER_JWT_SECRET}" \
--dry-run=client -o yaml | kubectl apply -f -
kubectl create secret generic postgres-credentials \
kubectl create secret generic cameleer-postgres-credentials \
--namespace=cameleer \
--from-literal=POSTGRES_USER="$POSTGRES_USER" \
--from-literal=POSTGRES_PASSWORD="$POSTGRES_PASSWORD" \
--from-literal=POSTGRES_DB="${POSTGRES_DB:-cameleer}" \
--dry-run=client -o yaml | kubectl apply -f -
kubectl create secret generic opensearch-credentials \
kubectl create secret generic cameleer-logto-credentials \
--namespace=cameleer \
--from-literal=OPENSEARCH_USER="${OPENSEARCH_USER:-admin}" \
--from-literal=OPENSEARCH_PASSWORD="$OPENSEARCH_PASSWORD" \
--from-literal=PG_USER="${LOGTO_PG_USER:-logto}" \
--from-literal=PG_PASSWORD="${LOGTO_PG_PASSWORD}" \
--from-literal=ENDPOINT="${LOGTO_ENDPOINT}" \
--from-literal=ADMIN_ENDPOINT="${LOGTO_ADMIN_ENDPOINT}" \
--dry-run=client -o yaml | kubectl apply -f -
kubectl create secret generic authentik-credentials \
kubectl create secret generic cameleer-clickhouse-credentials \
--namespace=cameleer \
--from-literal=PG_USER="${AUTHENTIK_PG_USER:-authentik}" \
--from-literal=PG_PASSWORD="${AUTHENTIK_PG_PASSWORD}" \
--from-literal=AUTHENTIK_SECRET_KEY="${AUTHENTIK_SECRET_KEY}" \
--from-literal=CLICKHOUSE_USER="${CLICKHOUSE_USER:-default}" \
--from-literal=CLICKHOUSE_PASSWORD="$CLICKHOUSE_PASSWORD" \
--dry-run=client -o yaml | kubectl apply -f -
kubectl apply -f deploy/postgres.yaml
kubectl -n cameleer rollout status statefulset/postgres --timeout=120s
kubectl apply -f deploy/cameleer-postgres.yaml
kubectl -n cameleer rollout status statefulset/cameleer-postgres --timeout=120s
kubectl apply -f deploy/opensearch.yaml
kubectl -n cameleer rollout status statefulset/opensearch --timeout=180s
kubectl apply -f deploy/cameleer-clickhouse.yaml
kubectl -n cameleer rollout status statefulset/cameleer-clickhouse --timeout=180s
kubectl apply -f deploy/authentik.yaml
kubectl -n cameleer rollout status deployment/authentik-server --timeout=180s
kubectl apply -f deploy/cameleer-logto.yaml
kubectl -n cameleer rollout status deployment/cameleer-logto --timeout=180s
kubectl apply -k deploy/overlays/main
kubectl -n cameleer set image deployment/cameleer3-server \
server=gitea.siegeln.net/cameleer/cameleer3-server:${{ github.sha }}
kubectl -n cameleer rollout status deployment/cameleer3-server --timeout=120s
kubectl -n cameleer set image deployment/cameleer-server \
server=gitea.siegeln.net/cameleer/cameleer-server:${{ github.sha }}
kubectl -n cameleer rollout status deployment/cameleer-server --timeout=120s
kubectl -n cameleer set image deployment/cameleer3-ui \
ui=gitea.siegeln.net/cameleer/cameleer3-server-ui:${{ github.sha }}
kubectl -n cameleer rollout status deployment/cameleer3-ui --timeout=120s
kubectl -n cameleer set image deployment/cameleer-ui \
ui=gitea.siegeln.net/cameleer/cameleer-server-ui:${{ github.sha }}
kubectl -n cameleer rollout status deployment/cameleer-ui --timeout=120s
env:
REGISTRY_TOKEN: ${{ secrets.REGISTRY_TOKEN }}
CAMELEER_AUTH_TOKEN: ${{ secrets.CAMELEER_AUTH_TOKEN }}
@@ -248,11 +245,12 @@ jobs:
POSTGRES_USER: ${{ secrets.POSTGRES_USER }}
POSTGRES_PASSWORD: ${{ secrets.POSTGRES_PASSWORD }}
POSTGRES_DB: ${{ secrets.POSTGRES_DB }}
OPENSEARCH_USER: ${{ secrets.OPENSEARCH_USER }}
OPENSEARCH_PASSWORD: ${{ secrets.OPENSEARCH_PASSWORD }}
AUTHENTIK_PG_USER: ${{ secrets.AUTHENTIK_PG_USER }}
AUTHENTIK_PG_PASSWORD: ${{ secrets.AUTHENTIK_PG_PASSWORD }}
AUTHENTIK_SECRET_KEY: ${{ secrets.AUTHENTIK_SECRET_KEY }}
LOGTO_PG_USER: ${{ secrets.LOGTO_PG_USER }}
LOGTO_PG_PASSWORD: ${{ secrets.LOGTO_PG_PASSWORD }}
LOGTO_ENDPOINT: ${{ secrets.LOGTO_ENDPOINT }}
LOGTO_ADMIN_ENDPOINT: ${{ secrets.LOGTO_ADMIN_ENDPOINT }}
CLICKHOUSE_USER: ${{ secrets.CLICKHOUSE_USER }}
CLICKHOUSE_PASSWORD: ${{ secrets.CLICKHOUSE_PASSWORD }}
deploy-feature:
needs: docker
@@ -274,14 +272,7 @@ jobs:
KUBECONFIG_B64: ${{ secrets.KUBECONFIG_BASE64 }}
- name: Compute branch variables
run: |
sanitize_branch() {
echo "$1" | sed -E 's#^(feature|fix|feat|hotfix)/##' \
| tr '[:upper:]' '[:lower:]' \
| sed 's/[^a-z0-9-]/-/g' \
| sed 's/--*/-/g; s/^-//; s/-$//' \
| cut -c1-20 \
| sed 's/-$//'
}
. .gitea/sanitize-branch.sh
SLUG=$(sanitize_branch "$GITHUB_REF_NAME")
NS="cam-${SLUG}"
SCHEMA="cam_$(echo $SLUG | tr '-' '_')"
@@ -292,7 +283,7 @@ jobs:
run: kubectl create namespace "$BRANCH_NS" --dry-run=client -o yaml | kubectl apply -f -
- name: Copy secrets from cameleer namespace
run: |
for SECRET in gitea-registry postgres-credentials opensearch-credentials cameleer-auth; do
for SECRET in gitea-registry cameleer-postgres-credentials cameleer-clickhouse-credentials cameleer-auth; do
kubectl get secret "$SECRET" -n cameleer -o json \
| jq 'del(.metadata.namespace, .metadata.resourceVersion, .metadata.uid, .metadata.creationTimestamp, .metadata.managedFields)' \
| kubectl apply -n "$BRANCH_NS" -f -
@@ -318,9 +309,9 @@ jobs:
kubectl -n "$BRANCH_NS" wait --for=condition=complete job/init-schema --timeout=60s || \
echo "Warning: init-schema job did not complete in time"
- name: Wait for server rollout
run: kubectl -n "$BRANCH_NS" rollout status deployment/cameleer3-server --timeout=120s
run: kubectl -n "$BRANCH_NS" rollout status deployment/cameleer-server --timeout=120s
- name: Wait for UI rollout
run: kubectl -n "$BRANCH_NS" rollout status deployment/cameleer3-ui --timeout=60s
run: kubectl -n "$BRANCH_NS" rollout status deployment/cameleer-ui --timeout=60s
- name: Print deployment URLs
run: |
echo "===================================="
@@ -367,25 +358,16 @@ jobs:
--namespace=cameleer \
--image=postgres:16 \
--restart=Never \
--env="PGPASSWORD=$(kubectl get secret postgres-credentials -n cameleer -o jsonpath='{.data.POSTGRES_PASSWORD}' | base64 -d)" \
--command -- sh -c "psql -h postgres -U $(kubectl get secret postgres-credentials -n cameleer -o jsonpath='{.data.POSTGRES_USER}' | base64 -d) -d cameleer3 -c 'DROP SCHEMA IF EXISTS ${BRANCH_SCHEMA} CASCADE'"
--env="PGPASSWORD=$(kubectl get secret cameleer-postgres-credentials -n cameleer -o jsonpath='{.data.POSTGRES_PASSWORD}' | base64 -d)" \
--command -- sh -c "psql -h cameleer-postgres -U $(kubectl get secret cameleer-postgres-credentials -n cameleer -o jsonpath='{.data.POSTGRES_USER}' | base64 -d) -d cameleer -c 'DROP SCHEMA IF EXISTS ${BRANCH_SCHEMA} CASCADE'"
kubectl wait --for=condition=Ready pod/cleanup-schema-${BRANCH_SLUG} -n cameleer --timeout=30s || true
kubectl wait --for=jsonpath='{.status.phase}'=Succeeded pod/cleanup-schema-${BRANCH_SLUG} -n cameleer --timeout=60s || true
kubectl delete pod cleanup-schema-${BRANCH_SLUG} -n cameleer --ignore-not-found
- name: Delete OpenSearch indices
run: |
kubectl run cleanup-indices-${BRANCH_SLUG} \
--namespace=cameleer \
--image=curlimages/curl:latest \
--restart=Never \
--command -- curl -sf -X DELETE "http://opensearch:9200/cam-${BRANCH_SLUG}-*"
kubectl wait --for=jsonpath='{.status.phase}'=Succeeded pod/cleanup-indices-${BRANCH_SLUG} -n cameleer --timeout=60s || true
kubectl delete pod cleanup-indices-${BRANCH_SLUG} -n cameleer --ignore-not-found
- name: Cleanup Docker images
run: |
API="https://gitea.siegeln.net/api/v1"
AUTH="Authorization: token ${REGISTRY_TOKEN}"
for PKG in cameleer3-server cameleer3-server-ui; do
for PKG in cameleer-server cameleer-server-ui; do
# Delete branch-specific tag
curl -sf -X DELETE -H "$AUTH" "$API/packages/cameleer/container/$PKG/branch-${BRANCH_SLUG}" || true
done

View File

@@ -42,9 +42,6 @@ jobs:
key: ${{ runner.os }}-maven-${{ hashFiles('**/pom.xml') }}
restore-keys: ${{ runner.os }}-maven-
- name: Build and Test Java
run: mvn clean verify -DskipITs -U --batch-mode
- name: Install UI dependencies
working-directory: ui
run: |
@@ -57,33 +54,10 @@ jobs:
working-directory: ui
run: npm run lint -- --format json --output-file eslint-report.json || true
- name: Install sonar-scanner
- name: Build, Test and Analyze
run: |
SONAR_SCANNER_VERSION=6.2.1.4610
ARCH=$(uname -m)
case "$ARCH" in
aarch64|arm64) PLATFORM="linux-aarch64" ;;
*) PLATFORM="linux-x64" ;;
esac
curl -sSLo sonar-scanner.zip "https://binaries.sonarsource.com/Distribution/sonar-scanner-cli/sonar-scanner-cli-${SONAR_SCANNER_VERSION}-${PLATFORM}.zip"
unzip -q sonar-scanner.zip
ln -s "$(pwd)/sonar-scanner-${SONAR_SCANNER_VERSION}-${PLATFORM}/bin/sonar-scanner" /usr/local/bin/sonar-scanner
- name: SonarQube Analysis
run: |
sonar-scanner \
-Dsonar.host.url="$SONAR_HOST_URL" \
-Dsonar.token="$SONAR_TOKEN" \
-Dsonar.projectKey=cameleer3-server \
-Dsonar.projectName="Cameleer3 Server" \
-Dsonar.sources=cameleer3-server-core/src/main/java,cameleer3-server-app/src/main/java,ui/src \
-Dsonar.tests=cameleer3-server-core/src/test/java,cameleer3-server-app/src/test/java \
-Dsonar.java.binaries=cameleer3-server-core/target/classes,cameleer3-server-app/target/classes \
-Dsonar.java.test.binaries=cameleer3-server-core/target/test-classes,cameleer3-server-app/target/test-classes \
-Dsonar.java.libraries="$HOME/.m2/repository/**/*.jar" \
-Dsonar.typescript.eslint.reportPaths=ui/eslint-report.json \
-Dsonar.eslint.reportPaths=ui/eslint-report.json \
-Dsonar.exclusions="ui/node_modules/**,ui/dist/**,**/target/**"
env:
SONAR_HOST_URL: ${{ secrets.SONAR_HOST_URL }}
SONAR_TOKEN: ${{ secrets.SONAR_TOKEN }}
mvn clean verify sonar:sonar -DskipITs -U --batch-mode \
-Dsonar.host.url=${{ secrets.SONAR_HOST_URL }} \
-Dsonar.token=${{ secrets.SONAR_TOKEN }} \
-Dsonar.projectKey=cameleer-server \
-Dsonar.projectName="Cameleer Server"

6
.gitignore vendored
View File

@@ -38,5 +38,9 @@ Thumbs.db
logs/
# Claude
.claude/
.claude/*
!.claude/rules/
.superpowers/
.playwright-mcp/
.worktrees/
.gitnexus

View File

@@ -1,8 +1,8 @@
# Cameleer3 Server
# Cameleer Server
## What This Is
An observability server that receives, stores, and serves Apache Camel route execution data from distributed Cameleer3 agents. Think njams Server (by Integration Matters) — but built incrementally, API-first, with a modern stack. Users can search through millions of recorded transactions by state, time, duration, full text, and correlate executions across multiple Camel instances. The server also pushes configuration, tracing controls, and ad-hoc commands to agents via SSE.
An observability server that receives, stores, and serves Apache Camel route execution data from distributed Cameleer agents. Think njams Server (by Integration Matters) — but built incrementally, API-first, with a modern stack. Users can search through millions of recorded transactions by state, time, duration, full text, and correlate executions across multiple Camel instances. The server also pushes configuration, tracing controls, and ad-hoc commands to agents via SSE.
## Core Value
@@ -16,7 +16,7 @@ Users can reliably search and find any transaction across all connected Camel in
### Active
- [ ] Receive and ingest transaction/activity data from Cameleer3 agents via HTTP POST
- [ ] Receive and ingest transaction/activity data from Cameleer agents via HTTP POST
- [ ] Store transactions in a high-volume, horizontally scalable data store with 30-day retention
- [ ] Search transactions by state, execution date/time, duration, and full-text content
- [ ] Correlate activities across multiple routes and Camel instances within a single transaction
@@ -38,8 +38,8 @@ Users can reliably search and find any transaction across all connected Camel in
## Context
- **Agent side**: cameleer3 agent (`https://gitea.siegeln.net/cameleer/cameleer3`) is under active development; already supports creating diagrams and capturing executions
- **Shared library**: `com.cameleer3:cameleer3-common` contains shared models and the graph API; protocol defined in `cameleer3-common/PROTOCOL.md`
- **Agent side**: cameleer agent (`https://gitea.siegeln.net/cameleer/cameleer`) is under active development; already supports creating diagrams and capturing executions
- **Shared library**: `com.cameleer:cameleer-common` contains shared models and the graph API; protocol defined in `cameleer-common/PROTOCOL.md`
- **Data model**: Hierarchical — a **transaction** represents a message's full journey, containing **activities** per route execution. Transactions can span multiple Camel instances (e.g., route A calls route B on another instance via endpoint)
- **Scale target**: Millions of transactions per day, 50+ connected agents, 30-day data retention
- **Query pattern**: Incident-driven — mostly recent data queries, with deep historical dives during incidents
@@ -49,7 +49,7 @@ Users can reliably search and find any transaction across all connected Camel in
## Constraints
- **Tech stack**: Java 17+, Spring Boot 3.4.3, Maven multi-module — already established
- **Dependency**: Must consume `com.cameleer3:cameleer3-common` from Gitea Maven registry
- **Dependency**: Must consume `com.cameleer:cameleer-common` from Gitea Maven registry
- **Protocol**: Agent protocol is still evolving — server must adapt as it stabilizes
- **Incremental delivery**: Build step by step; storage and search first, then layer features on top

View File

@@ -1,4 +1,4 @@
# Requirements: Cameleer3 Server
# Requirements: Cameleer Server
**Defined:** 2026-03-11
**Core Value:** Users can reliably search and find any transaction across all connected Camel instances — by any combination of state, time, duration, or content — even at millions of transactions per day with 30-day retention.

View File

@@ -1,4 +1,4 @@
# Roadmap: Cameleer3 Server
# Roadmap: Cameleer Server
## Overview

View File

@@ -75,7 +75,7 @@ Recent decisions affecting current work:
- [Roadmap]: Phases 2 and 3 can execute in parallel (both depend only on Phase 1)
- [Roadmap]: Web UI deferred to v2
- [Phase 01]: Used spring-boot-starter-jdbc for JdbcTemplate + HikariCP auto-config
- [Phase 01]: Created MetricsSnapshot record in core module (cameleer3-common has no metrics model)
- [Phase 01]: Created MetricsSnapshot record in core module (cameleer-common has no metrics model)
- [Phase 01]: Upgraded testcontainers to 2.0.3 for Docker Desktop 29.x compatibility
- [Phase 01]: Changed error_message/error_stacktrace to non-nullable String for tokenbf_v1 index compat
- [Phase 01]: TTL expressions require toDateTime() cast for DateTime64 columns in ClickHouse 25.3
@@ -121,7 +121,7 @@ None yet.
### Blockers/Concerns
- [Phase 1]: ClickHouse Java client API needs phase-specific research (library has undergone changes)
- [Phase 1]: Must read cameleer3-common PROTOCOL.md before designing ClickHouse schema
- [Phase 1]: Must read cameleer-common PROTOCOL.md before designing ClickHouse schema
- [Phase 2]: Diagram rendering library selection is an open question (Batik, jsvg, JGraphX, or client-side)
- [Phase 2]: ClickHouse skip indexes may not suffice for full-text; decision point during Phase 2

View File

@@ -6,18 +6,18 @@ wave: 1
depends_on: []
files_modified:
- pom.xml
- cameleer3-server-core/pom.xml
- cameleer3-server-app/pom.xml
- cameleer-server-core/pom.xml
- cameleer-server-app/pom.xml
- docker-compose.yml
- clickhouse/init/01-schema.sql
- cameleer3-server-app/src/main/resources/application.yml
- cameleer3-server-app/src/main/java/com/cameleer3/server/app/config/ClickHouseConfig.java
- cameleer3-server-app/src/main/java/com/cameleer3/server/app/config/IngestionConfig.java
- cameleer3-server-core/src/main/java/com/cameleer3/server/core/ingestion/WriteBuffer.java
- cameleer3-server-core/src/main/java/com/cameleer3/server/core/storage/ExecutionRepository.java
- cameleer3-server-core/src/main/java/com/cameleer3/server/core/storage/DiagramRepository.java
- cameleer3-server-core/src/main/java/com/cameleer3/server/core/storage/MetricsRepository.java
- cameleer3-server-core/src/test/java/com/cameleer3/server/core/ingestion/WriteBufferTest.java
- cameleer-server-app/src/main/resources/application.yml
- cameleer-server-app/src/main/java/com/cameleer/server/app/config/ClickHouseConfig.java
- cameleer-server-app/src/main/java/com/cameleer/server/app/config/IngestionConfig.java
- cameleer-server-core/src/main/java/com/cameleer/server/core/ingestion/WriteBuffer.java
- cameleer-server-core/src/main/java/com/cameleer/server/core/storage/ExecutionRepository.java
- cameleer-server-core/src/main/java/com/cameleer/server/core/storage/DiagramRepository.java
- cameleer-server-core/src/main/java/com/cameleer/server/core/storage/MetricsRepository.java
- cameleer-server-core/src/test/java/com/cameleer/server/core/ingestion/WriteBufferTest.java
autonomous: true
requirements:
- INGST-04
@@ -32,7 +32,7 @@ must_haves:
- "TTL clause on tables removes data older than configured days"
- "Docker Compose starts ClickHouse and initializes the schema"
artifacts:
- path: "cameleer3-server-core/src/main/java/com/cameleer3/server/core/ingestion/WriteBuffer.java"
- path: "cameleer-server-core/src/main/java/com/cameleer/server/core/ingestion/WriteBuffer.java"
provides: "Generic bounded write buffer with offer/drain/isFull"
min_lines: 30
- path: "clickhouse/init/01-schema.sql"
@@ -41,15 +41,15 @@ must_haves:
- path: "docker-compose.yml"
provides: "Local ClickHouse service"
contains: "clickhouse-server"
- path: "cameleer3-server-core/src/main/java/com/cameleer3/server/core/storage/ExecutionRepository.java"
- path: "cameleer-server-core/src/main/java/com/cameleer/server/core/storage/ExecutionRepository.java"
provides: "Repository interface for execution batch inserts"
exports: ["insertBatch"]
key_links:
- from: "cameleer3-server-app/src/main/java/com/cameleer3/server/app/config/ClickHouseConfig.java"
- from: "cameleer-server-app/src/main/java/com/cameleer/server/app/config/ClickHouseConfig.java"
to: "application.yml"
via: "spring.datasource properties"
pattern: "spring\\.datasource"
- from: "cameleer3-server-app/src/main/java/com/cameleer3/server/app/config/IngestionConfig.java"
- from: "cameleer-server-app/src/main/java/com/cameleer/server/app/config/IngestionConfig.java"
to: "application.yml"
via: "ingestion.* properties"
pattern: "ingestion\\."
@@ -74,8 +74,8 @@ Output: Working ClickHouse via Docker Compose, DDL with TTL, WriteBuffer with un
@.planning/phases/01-ingestion-pipeline-api-foundation/01-RESEARCH.md
@pom.xml
@cameleer3-server-core/pom.xml
@cameleer3-server-app/pom.xml
@cameleer-server-core/pom.xml
@cameleer-server-app/pom.xml
</context>
<tasks>
@@ -84,11 +84,11 @@ Output: Working ClickHouse via Docker Compose, DDL with TTL, WriteBuffer with un
<name>Task 1: Dependencies, Docker Compose, ClickHouse schema, and application config</name>
<files>
pom.xml,
cameleer3-server-core/pom.xml,
cameleer3-server-app/pom.xml,
cameleer-server-core/pom.xml,
cameleer-server-app/pom.xml,
docker-compose.yml,
clickhouse/init/01-schema.sql,
cameleer3-server-app/src/main/resources/application.yml
cameleer-server-app/src/main/resources/application.yml
</files>
<behavior>
- docker compose up -d starts ClickHouse on ports 8123/9000
@@ -101,7 +101,7 @@ Output: Working ClickHouse via Docker Compose, DDL with TTL, WriteBuffer with un
- Maven compile succeeds with new dependencies
</behavior>
<action>
1. Add dependencies to cameleer3-server-app/pom.xml per research:
1. Add dependencies to cameleer-server-app/pom.xml per research:
- clickhouse-jdbc 0.9.7 (classifier: all)
- spring-boot-starter-actuator
- springdoc-openapi-starter-webmvc-ui 2.8.6
@@ -109,13 +109,13 @@ Output: Working ClickHouse via Docker Compose, DDL with TTL, WriteBuffer with un
- junit-jupiter from testcontainers 2.0.2 (test scope)
- awaitility (test scope)
2. Add slf4j-api dependency to cameleer3-server-core/pom.xml.
2. Add slf4j-api dependency to cameleer-server-core/pom.xml.
3. Create docker-compose.yml at project root with ClickHouse service:
- Image: clickhouse/clickhouse-server:25.3
- Ports: 8123:8123, 9000:9000
- Volume mount ./clickhouse/init to /docker-entrypoint-initdb.d
- Environment: CLICKHOUSE_USER=cameleer, CLICKHOUSE_PASSWORD=cameleer_dev, CLICKHOUSE_DB=cameleer3
- Environment: CLICKHOUSE_USER=cameleer, CLICKHOUSE_PASSWORD=cameleer_dev, CLICKHOUSE_DB=cameleer
- ulimits nofile 262144
4. Create clickhouse/init/01-schema.sql with the three tables from research:
@@ -124,9 +124,9 @@ Output: Working ClickHouse via Docker Compose, DDL with TTL, WriteBuffer with un
- agent_metrics: MergeTree, daily partitioning on collected_at, ORDER BY (agent_id, metric_name, collected_at), TTL collected_at + INTERVAL 30 DAY, SETTINGS ttl_only_drop_parts=1.
- All DateTime fields use DateTime64(3, 'UTC').
5. Create cameleer3-server-app/src/main/resources/application.yml with config from research:
5. Create cameleer-server-app/src/main/resources/application.yml with config from research:
- server.port: 8081
- spring.datasource: url=jdbc:ch://localhost:8123/cameleer3, username/password, driver-class-name
- spring.datasource: url=jdbc:ch://localhost:8123/cameleer, username/password, driver-class-name
- spring.jackson: write-dates-as-timestamps=false, fail-on-unknown-properties=false
- ingestion: buffer-capacity=50000, batch-size=5000, flush-interval-ms=1000
- clickhouse.ttl-days: 30
@@ -144,13 +144,13 @@ Output: Working ClickHouse via Docker Compose, DDL with TTL, WriteBuffer with un
<task type="auto" tdd="true">
<name>Task 2: WriteBuffer, repository interfaces, IngestionConfig, and ClickHouseConfig</name>
<files>
cameleer3-server-core/src/main/java/com/cameleer3/server/core/ingestion/WriteBuffer.java,
cameleer3-server-core/src/main/java/com/cameleer3/server/core/storage/ExecutionRepository.java,
cameleer3-server-core/src/main/java/com/cameleer3/server/core/storage/DiagramRepository.java,
cameleer3-server-core/src/main/java/com/cameleer3/server/core/storage/MetricsRepository.java,
cameleer3-server-core/src/test/java/com/cameleer3/server/core/ingestion/WriteBufferTest.java,
cameleer3-server-app/src/main/java/com/cameleer3/server/app/config/ClickHouseConfig.java,
cameleer3-server-app/src/main/java/com/cameleer3/server/app/config/IngestionConfig.java
cameleer-server-core/src/main/java/com/cameleer/server/core/ingestion/WriteBuffer.java,
cameleer-server-core/src/main/java/com/cameleer/server/core/storage/ExecutionRepository.java,
cameleer-server-core/src/main/java/com/cameleer/server/core/storage/DiagramRepository.java,
cameleer-server-core/src/main/java/com/cameleer/server/core/storage/MetricsRepository.java,
cameleer-server-core/src/test/java/com/cameleer/server/core/ingestion/WriteBufferTest.java,
cameleer-server-app/src/main/java/com/cameleer/server/app/config/ClickHouseConfig.java,
cameleer-server-app/src/main/java/com/cameleer/server/app/config/IngestionConfig.java
</files>
<behavior>
- WriteBuffer(capacity=10): offer() returns true for first 10 items, false on 11th
@@ -181,7 +181,7 @@ Output: Working ClickHouse via Docker Compose, DDL with TTL, WriteBuffer with un
3. Create repository interfaces in core module:
- ExecutionRepository: void insertBatch(List<RouteExecution> executions)
- DiagramRepository: void store(RouteGraph graph), Optional<RouteGraph> findByContentHash(String hash), Optional<String> findContentHashForRoute(String routeId, String agentId)
- MetricsRepository: void insertBatch(List<MetricsSnapshot> metrics) -- use a generic type or the cameleer3-common metrics model if available; if not, create a simple MetricsData record in core module
- MetricsRepository: void insertBatch(List<MetricsSnapshot> metrics) -- use a generic type or the cameleer-common metrics model if available; if not, create a simple MetricsData record in core module
4. Create IngestionConfig as @ConfigurationProperties("ingestion"):
- bufferCapacity (int, default 50000)
@@ -193,7 +193,7 @@ Output: Working ClickHouse via Docker Compose, DDL with TTL, WriteBuffer with un
- No custom bean needed if relying on auto-config; only create if explicit JdbcTemplate customization required
</action>
<verify>
<automated>mvn test -pl cameleer3-server-core -Dtest=WriteBufferTest -q 2>&1 | tail -10</automated>
<automated>mvn test -pl cameleer-server-core -Dtest=WriteBufferTest -q 2>&1 | tail -10</automated>
</verify>
<done>WriteBuffer passes all unit tests. Repository interfaces exist with correct method signatures. IngestionConfig reads from application.yml.</done>
</task>
@@ -201,7 +201,7 @@ Output: Working ClickHouse via Docker Compose, DDL with TTL, WriteBuffer with un
</tasks>
<verification>
- `mvn test -pl cameleer3-server-core -q` -- all WriteBuffer unit tests pass
- `mvn test -pl cameleer-server-core -q` -- all WriteBuffer unit tests pass
- `mvn clean compile -q` -- full project compiles with new dependencies
- `docker compose config` -- validates Docker Compose file
- clickhouse/init/01-schema.sql contains CREATE TABLE for all three tables with correct ENGINE, ORDER BY, PARTITION BY, and TTL

View File

@@ -26,22 +26,22 @@ key-files:
created:
- docker-compose.yml
- clickhouse/init/01-schema.sql
- cameleer3-server-core/src/main/java/com/cameleer3/server/core/ingestion/WriteBuffer.java
- cameleer3-server-core/src/main/java/com/cameleer3/server/core/storage/ExecutionRepository.java
- cameleer3-server-core/src/main/java/com/cameleer3/server/core/storage/DiagramRepository.java
- cameleer3-server-core/src/main/java/com/cameleer3/server/core/storage/MetricsRepository.java
- cameleer3-server-core/src/main/java/com/cameleer3/server/core/storage/model/MetricsSnapshot.java
- cameleer3-server-app/src/main/java/com/cameleer3/server/app/config/IngestionConfig.java
- cameleer3-server-app/src/main/java/com/cameleer3/server/app/config/ClickHouseConfig.java
- cameleer3-server-core/src/test/java/com/cameleer3/server/core/ingestion/WriteBufferTest.java
- cameleer-server-core/src/main/java/com/cameleer/server/core/ingestion/WriteBuffer.java
- cameleer-server-core/src/main/java/com/cameleer/server/core/storage/ExecutionRepository.java
- cameleer-server-core/src/main/java/com/cameleer/server/core/storage/DiagramRepository.java
- cameleer-server-core/src/main/java/com/cameleer/server/core/storage/MetricsRepository.java
- cameleer-server-core/src/main/java/com/cameleer/server/core/storage/model/MetricsSnapshot.java
- cameleer-server-app/src/main/java/com/cameleer/server/app/config/IngestionConfig.java
- cameleer-server-app/src/main/java/com/cameleer/server/app/config/ClickHouseConfig.java
- cameleer-server-core/src/test/java/com/cameleer/server/core/ingestion/WriteBufferTest.java
modified:
- cameleer3-server-core/pom.xml
- cameleer3-server-app/pom.xml
- cameleer3-server-app/src/main/resources/application.yml
- cameleer-server-core/pom.xml
- cameleer-server-app/pom.xml
- cameleer-server-app/src/main/resources/application.yml
key-decisions:
- "Used spring-boot-starter-jdbc for JdbcTemplate + HikariCP auto-config rather than manual DataSource"
- "Created MetricsSnapshot record in core module since cameleer3-common has no metrics model"
- "Created MetricsSnapshot record in core module since cameleer-common has no metrics model"
- "ClickHouseConfig exposes JdbcTemplate bean; relies on Spring Boot DataSource auto-config"
patterns-established:
@@ -84,21 +84,21 @@ Each task was committed atomically:
## Files Created/Modified
- `docker-compose.yml` - ClickHouse service with ports 8123/9000, init volume mount
- `clickhouse/init/01-schema.sql` - DDL for route_executions, route_diagrams, agent_metrics
- `cameleer3-server-core/src/main/java/.../ingestion/WriteBuffer.java` - Bounded queue with offer/offerBatch/drain
- `cameleer3-server-core/src/main/java/.../storage/ExecutionRepository.java` - Batch insert interface for RouteExecution
- `cameleer3-server-core/src/main/java/.../storage/DiagramRepository.java` - Store/find interface for RouteGraph
- `cameleer3-server-core/src/main/java/.../storage/MetricsRepository.java` - Batch insert interface for MetricsSnapshot
- `cameleer3-server-core/src/main/java/.../storage/model/MetricsSnapshot.java` - Metrics data record
- `cameleer3-server-app/src/main/java/.../config/IngestionConfig.java` - Buffer capacity, batch size, flush interval
- `cameleer3-server-app/src/main/java/.../config/ClickHouseConfig.java` - JdbcTemplate bean
- `cameleer3-server-core/src/test/java/.../ingestion/WriteBufferTest.java` - 10 unit tests for WriteBuffer
- `cameleer3-server-core/pom.xml` - Added slf4j-api
- `cameleer3-server-app/pom.xml` - Added clickhouse-jdbc, springdoc, actuator, testcontainers, awaitility
- `cameleer3-server-app/src/main/resources/application.yml` - Full config with datasource, ingestion, springdoc, actuator
- `cameleer-server-core/src/main/java/.../ingestion/WriteBuffer.java` - Bounded queue with offer/offerBatch/drain
- `cameleer-server-core/src/main/java/.../storage/ExecutionRepository.java` - Batch insert interface for RouteExecution
- `cameleer-server-core/src/main/java/.../storage/DiagramRepository.java` - Store/find interface for RouteGraph
- `cameleer-server-core/src/main/java/.../storage/MetricsRepository.java` - Batch insert interface for MetricsSnapshot
- `cameleer-server-core/src/main/java/.../storage/model/MetricsSnapshot.java` - Metrics data record
- `cameleer-server-app/src/main/java/.../config/IngestionConfig.java` - Buffer capacity, batch size, flush interval
- `cameleer-server-app/src/main/java/.../config/ClickHouseConfig.java` - JdbcTemplate bean
- `cameleer-server-core/src/test/java/.../ingestion/WriteBufferTest.java` - 10 unit tests for WriteBuffer
- `cameleer-server-core/pom.xml` - Added slf4j-api
- `cameleer-server-app/pom.xml` - Added clickhouse-jdbc, springdoc, actuator, testcontainers, awaitility
- `cameleer-server-app/src/main/resources/application.yml` - Full config with datasource, ingestion, springdoc, actuator
## Decisions Made
- Used spring-boot-starter-jdbc to get JdbcTemplate and HikariCP auto-configuration rather than manually wiring a DataSource
- Created MetricsSnapshot record in core module since cameleer3-common does not include a metrics model
- Created MetricsSnapshot record in core module since cameleer-common does not include a metrics model
- ClickHouseConfig is minimal -- relies on Spring Boot auto-configuring DataSource from spring.datasource properties
## Deviations from Plan

View File

@@ -5,18 +5,18 @@ type: execute
wave: 3
depends_on: ["01-01", "01-03"]
files_modified:
- cameleer3-server-app/src/main/java/com/cameleer3/server/app/controller/ExecutionController.java
- cameleer3-server-app/src/main/java/com/cameleer3/server/app/controller/DiagramController.java
- cameleer3-server-app/src/main/java/com/cameleer3/server/app/controller/MetricsController.java
- cameleer3-server-app/src/main/java/com/cameleer3/server/app/storage/ClickHouseExecutionRepository.java
- cameleer3-server-app/src/main/java/com/cameleer3/server/app/storage/ClickHouseDiagramRepository.java
- cameleer3-server-app/src/main/java/com/cameleer3/server/app/storage/ClickHouseMetricsRepository.java
- cameleer3-server-app/src/main/java/com/cameleer3/server/app/ingestion/ClickHouseFlushScheduler.java
- cameleer3-server-core/src/main/java/com/cameleer3/server/core/ingestion/IngestionService.java
- cameleer3-server-app/src/test/java/com/cameleer3/server/app/controller/ExecutionControllerIT.java
- cameleer3-server-app/src/test/java/com/cameleer3/server/app/controller/DiagramControllerIT.java
- cameleer3-server-app/src/test/java/com/cameleer3/server/app/controller/MetricsControllerIT.java
- cameleer3-server-app/src/test/java/com/cameleer3/server/app/controller/BackpressureIT.java
- cameleer-server-app/src/main/java/com/cameleer/server/app/controller/ExecutionController.java
- cameleer-server-app/src/main/java/com/cameleer/server/app/controller/DiagramController.java
- cameleer-server-app/src/main/java/com/cameleer/server/app/controller/MetricsController.java
- cameleer-server-app/src/main/java/com/cameleer/server/app/storage/ClickHouseExecutionRepository.java
- cameleer-server-app/src/main/java/com/cameleer/server/app/storage/ClickHouseDiagramRepository.java
- cameleer-server-app/src/main/java/com/cameleer/server/app/storage/ClickHouseMetricsRepository.java
- cameleer-server-app/src/main/java/com/cameleer/server/app/ingestion/ClickHouseFlushScheduler.java
- cameleer-server-core/src/main/java/com/cameleer/server/core/ingestion/IngestionService.java
- cameleer-server-app/src/test/java/com/cameleer/server/app/controller/ExecutionControllerIT.java
- cameleer-server-app/src/test/java/com/cameleer/server/app/controller/DiagramControllerIT.java
- cameleer-server-app/src/test/java/com/cameleer/server/app/controller/MetricsControllerIT.java
- cameleer-server-app/src/test/java/com/cameleer/server/app/controller/BackpressureIT.java
autonomous: true
requirements:
- INGST-01
@@ -32,16 +32,16 @@ must_haves:
- "Data posted to endpoints appears in ClickHouse after flush interval"
- "When buffer is full, endpoints return 503 with Retry-After header"
artifacts:
- path: "cameleer3-server-app/src/main/java/com/cameleer3/server/app/controller/ExecutionController.java"
- path: "cameleer-server-app/src/main/java/com/cameleer/server/app/controller/ExecutionController.java"
provides: "POST /api/v1/data/executions endpoint"
min_lines: 20
- path: "cameleer3-server-app/src/main/java/com/cameleer3/server/app/storage/ClickHouseExecutionRepository.java"
- path: "cameleer-server-app/src/main/java/com/cameleer/server/app/storage/ClickHouseExecutionRepository.java"
provides: "Batch insert to route_executions table via JdbcTemplate"
min_lines: 30
- path: "cameleer3-server-app/src/main/java/com/cameleer3/server/app/ingestion/ClickHouseFlushScheduler.java"
- path: "cameleer-server-app/src/main/java/com/cameleer/server/app/ingestion/ClickHouseFlushScheduler.java"
provides: "Scheduled drain of WriteBuffer into ClickHouse"
min_lines: 20
- path: "cameleer3-server-core/src/main/java/com/cameleer3/server/core/ingestion/IngestionService.java"
- path: "cameleer-server-core/src/main/java/com/cameleer/server/core/ingestion/IngestionService.java"
provides: "Routes data to appropriate WriteBuffer instances"
min_lines: 20
key_links:
@@ -92,7 +92,7 @@ Output: Working ingestion flow verified by integration tests against Testcontain
<!-- Interfaces from Plan 01 that this plan depends on -->
<interfaces>
From cameleer3-server-core WriteBuffer.java:
From cameleer-server-core WriteBuffer.java:
```java
public class WriteBuffer<T> {
public WriteBuffer(int capacity);
@@ -106,7 +106,7 @@ public class WriteBuffer<T> {
}
```
From cameleer3-server-core repository interfaces:
From cameleer-server-core repository interfaces:
```java
public interface ExecutionRepository {
void insertBatch(List<RouteExecution> executions);
@@ -138,11 +138,11 @@ public class IngestionConfig {
<task type="auto" tdd="false">
<name>Task 1: IngestionService, ClickHouse repositories, and flush scheduler</name>
<files>
cameleer3-server-core/src/main/java/com/cameleer3/server/core/ingestion/IngestionService.java,
cameleer3-server-app/src/main/java/com/cameleer3/server/app/storage/ClickHouseExecutionRepository.java,
cameleer3-server-app/src/main/java/com/cameleer3/server/app/storage/ClickHouseDiagramRepository.java,
cameleer3-server-app/src/main/java/com/cameleer3/server/app/storage/ClickHouseMetricsRepository.java,
cameleer3-server-app/src/main/java/com/cameleer3/server/app/ingestion/ClickHouseFlushScheduler.java
cameleer-server-core/src/main/java/com/cameleer/server/core/ingestion/IngestionService.java,
cameleer-server-app/src/main/java/com/cameleer/server/app/storage/ClickHouseExecutionRepository.java,
cameleer-server-app/src/main/java/com/cameleer/server/app/storage/ClickHouseDiagramRepository.java,
cameleer-server-app/src/main/java/com/cameleer/server/app/storage/ClickHouseMetricsRepository.java,
cameleer-server-app/src/main/java/com/cameleer/server/app/ingestion/ClickHouseFlushScheduler.java
</files>
<action>
1. Create IngestionService in core module (no Spring annotations -- it's a plain class):
@@ -188,13 +188,13 @@ public class IngestionConfig {
<task type="auto" tdd="true">
<name>Task 2: Ingestion REST controllers and integration tests</name>
<files>
cameleer3-server-app/src/main/java/com/cameleer3/server/app/controller/ExecutionController.java,
cameleer3-server-app/src/main/java/com/cameleer3/server/app/controller/DiagramController.java,
cameleer3-server-app/src/main/java/com/cameleer3/server/app/controller/MetricsController.java,
cameleer3-server-app/src/test/java/com/cameleer3/server/app/controller/ExecutionControllerIT.java,
cameleer3-server-app/src/test/java/com/cameleer3/server/app/controller/DiagramControllerIT.java,
cameleer3-server-app/src/test/java/com/cameleer3/server/app/controller/MetricsControllerIT.java,
cameleer3-server-app/src/test/java/com/cameleer3/server/app/controller/BackpressureIT.java
cameleer-server-app/src/main/java/com/cameleer/server/app/controller/ExecutionController.java,
cameleer-server-app/src/main/java/com/cameleer/server/app/controller/DiagramController.java,
cameleer-server-app/src/main/java/com/cameleer/server/app/controller/MetricsController.java,
cameleer-server-app/src/test/java/com/cameleer/server/app/controller/ExecutionControllerIT.java,
cameleer-server-app/src/test/java/com/cameleer/server/app/controller/DiagramControllerIT.java,
cameleer-server-app/src/test/java/com/cameleer/server/app/controller/MetricsControllerIT.java,
cameleer-server-app/src/test/java/com/cameleer/server/app/controller/BackpressureIT.java
</files>
<behavior>
- POST /api/v1/data/executions with single RouteExecution JSON returns 202
@@ -245,7 +245,7 @@ public class IngestionConfig {
Note: All integration tests must include X-Cameleer-Protocol-Version:1 header (API-04 will be enforced by Plan 03's interceptor, but include the header now for forward compatibility).
</action>
<verify>
<automated>mvn test -pl cameleer3-server-app -Dtest="ExecutionControllerIT,DiagramControllerIT,MetricsControllerIT,BackpressureIT" -q 2>&1 | tail -15</automated>
<automated>mvn test -pl cameleer-server-app -Dtest="ExecutionControllerIT,DiagramControllerIT,MetricsControllerIT,BackpressureIT" -q 2>&1 | tail -15</automated>
</verify>
<done>All three ingestion endpoints return 202 on valid data. Data arrives in ClickHouse after flush. Buffer-full returns 503 with Retry-After. Unknown JSON fields accepted. Integration tests green.</done>
</task>
@@ -253,7 +253,7 @@ public class IngestionConfig {
</tasks>
<verification>
- `mvn test -pl cameleer3-server-app -Dtest="ExecutionControllerIT,DiagramControllerIT,MetricsControllerIT,BackpressureIT" -q` -- all integration tests pass
- `mvn test -pl cameleer-server-app -Dtest="ExecutionControllerIT,DiagramControllerIT,MetricsControllerIT,BackpressureIT" -q` -- all integration tests pass
- POST to /api/v1/data/executions returns 202
- POST to /api/v1/data/diagrams returns 202
- POST to /api/v1/data/metrics returns 202

View File

@@ -30,21 +30,21 @@ tech-stack:
key-files:
created:
- cameleer3-server-core/src/main/java/com/cameleer3/server/core/ingestion/IngestionService.java
- cameleer3-server-app/src/main/java/com/cameleer3/server/app/storage/ClickHouseExecutionRepository.java
- cameleer3-server-app/src/main/java/com/cameleer3/server/app/storage/ClickHouseDiagramRepository.java
- cameleer3-server-app/src/main/java/com/cameleer3/server/app/storage/ClickHouseMetricsRepository.java
- cameleer3-server-app/src/main/java/com/cameleer3/server/app/ingestion/ClickHouseFlushScheduler.java
- cameleer3-server-app/src/main/java/com/cameleer3/server/app/config/IngestionBeanConfig.java
- cameleer3-server-app/src/main/java/com/cameleer3/server/app/controller/ExecutionController.java
- cameleer3-server-app/src/main/java/com/cameleer3/server/app/controller/DiagramController.java
- cameleer3-server-app/src/main/java/com/cameleer3/server/app/controller/MetricsController.java
- cameleer3-server-app/src/test/java/com/cameleer3/server/app/controller/ExecutionControllerIT.java
- cameleer3-server-app/src/test/java/com/cameleer3/server/app/controller/DiagramControllerIT.java
- cameleer3-server-app/src/test/java/com/cameleer3/server/app/controller/MetricsControllerIT.java
- cameleer3-server-app/src/test/java/com/cameleer3/server/app/controller/BackpressureIT.java
- cameleer-server-core/src/main/java/com/cameleer/server/core/ingestion/IngestionService.java
- cameleer-server-app/src/main/java/com/cameleer/server/app/storage/ClickHouseExecutionRepository.java
- cameleer-server-app/src/main/java/com/cameleer/server/app/storage/ClickHouseDiagramRepository.java
- cameleer-server-app/src/main/java/com/cameleer/server/app/storage/ClickHouseMetricsRepository.java
- cameleer-server-app/src/main/java/com/cameleer/server/app/ingestion/ClickHouseFlushScheduler.java
- cameleer-server-app/src/main/java/com/cameleer/server/app/config/IngestionBeanConfig.java
- cameleer-server-app/src/main/java/com/cameleer/server/app/controller/ExecutionController.java
- cameleer-server-app/src/main/java/com/cameleer/server/app/controller/DiagramController.java
- cameleer-server-app/src/main/java/com/cameleer/server/app/controller/MetricsController.java
- cameleer-server-app/src/test/java/com/cameleer/server/app/controller/ExecutionControllerIT.java
- cameleer-server-app/src/test/java/com/cameleer/server/app/controller/DiagramControllerIT.java
- cameleer-server-app/src/test/java/com/cameleer/server/app/controller/MetricsControllerIT.java
- cameleer-server-app/src/test/java/com/cameleer/server/app/controller/BackpressureIT.java
modified:
- cameleer3-server-app/src/main/java/com/cameleer3/server/app/config/IngestionConfig.java
- cameleer-server-app/src/main/java/com/cameleer/server/app/config/IngestionConfig.java
key-decisions:
- "Controllers accept raw String body and detect single vs array JSON to support both payload formats"
@@ -91,20 +91,20 @@ Each task was committed atomically:
3. **Task 2 GREEN: Ingestion REST controllers with backpressure** - `8fe65f0` (feat)
## Files Created/Modified
- `cameleer3-server-core/.../ingestion/IngestionService.java` - Routes data to WriteBuffer instances
- `cameleer3-server-app/.../storage/ClickHouseExecutionRepository.java` - Batch insert with parallel processor arrays
- `cameleer3-server-app/.../storage/ClickHouseDiagramRepository.java` - JSON storage with SHA-256 content-hash dedup
- `cameleer3-server-app/.../storage/ClickHouseMetricsRepository.java` - Batch insert for agent_metrics
- `cameleer3-server-app/.../ingestion/ClickHouseFlushScheduler.java` - Scheduled drain + SmartLifecycle shutdown
- `cameleer3-server-app/.../config/IngestionBeanConfig.java` - WriteBuffer and IngestionService bean wiring
- `cameleer3-server-app/.../controller/ExecutionController.java` - POST /api/v1/data/executions
- `cameleer3-server-app/.../controller/DiagramController.java` - POST /api/v1/data/diagrams
- `cameleer3-server-app/.../controller/MetricsController.java` - POST /api/v1/data/metrics
- `cameleer3-server-app/.../config/IngestionConfig.java` - Removed @Configuration (fix duplicate bean)
- `cameleer3-server-app/.../controller/ExecutionControllerIT.java` - 4 tests: single, array, flush, unknown fields
- `cameleer3-server-app/.../controller/DiagramControllerIT.java` - 3 tests: single, array, flush
- `cameleer3-server-app/.../controller/MetricsControllerIT.java` - 2 tests: POST, flush
- `cameleer3-server-app/.../controller/BackpressureIT.java` - 2 tests: 503 response, data not lost
- `cameleer-server-core/.../ingestion/IngestionService.java` - Routes data to WriteBuffer instances
- `cameleer-server-app/.../storage/ClickHouseExecutionRepository.java` - Batch insert with parallel processor arrays
- `cameleer-server-app/.../storage/ClickHouseDiagramRepository.java` - JSON storage with SHA-256 content-hash dedup
- `cameleer-server-app/.../storage/ClickHouseMetricsRepository.java` - Batch insert for agent_metrics
- `cameleer-server-app/.../ingestion/ClickHouseFlushScheduler.java` - Scheduled drain + SmartLifecycle shutdown
- `cameleer-server-app/.../config/IngestionBeanConfig.java` - WriteBuffer and IngestionService bean wiring
- `cameleer-server-app/.../controller/ExecutionController.java` - POST /api/v1/data/executions
- `cameleer-server-app/.../controller/DiagramController.java` - POST /api/v1/data/diagrams
- `cameleer-server-app/.../controller/MetricsController.java` - POST /api/v1/data/metrics
- `cameleer-server-app/.../config/IngestionConfig.java` - Removed @Configuration (fix duplicate bean)
- `cameleer-server-app/.../controller/ExecutionControllerIT.java` - 4 tests: single, array, flush, unknown fields
- `cameleer-server-app/.../controller/DiagramControllerIT.java` - 3 tests: single, array, flush
- `cameleer-server-app/.../controller/MetricsControllerIT.java` - 2 tests: POST, flush
- `cameleer-server-app/.../controller/BackpressureIT.java` - 2 tests: 503 response, data not lost
## Decisions Made
- Controllers accept raw String body and detect single vs array JSON (starts with `[`), supporting both payload formats per protocol spec
@@ -119,7 +119,7 @@ Each task was committed atomically:
- **Found during:** Task 2 (integration test context startup)
- **Issue:** IngestionConfig had both `@Configuration` and `@ConfigurationProperties`, while `@EnableConfigurationProperties(IngestionConfig.class)` on the app class created a second bean, causing "expected single matching bean but found 2"
- **Fix:** Removed `@Configuration` from IngestionConfig, relying solely on `@EnableConfigurationProperties`
- **Files modified:** cameleer3-server-app/src/main/java/com/cameleer3/server/app/config/IngestionConfig.java
- **Files modified:** cameleer-server-app/src/main/java/com/cameleer/server/app/config/IngestionConfig.java
- **Verification:** Application context starts successfully, all tests pass
- **Committed in:** 8fe65f0

View File

@@ -5,15 +5,15 @@ type: execute
wave: 2
depends_on: ["01-01"]
files_modified:
- cameleer3-server-app/src/main/java/com/cameleer3/server/app/interceptor/ProtocolVersionInterceptor.java
- cameleer3-server-app/src/main/java/com/cameleer3/server/app/config/WebConfig.java
- cameleer3-server-app/src/main/java/com/cameleer3/server/app/Cameleer3ServerApplication.java
- cameleer3-server-app/src/test/resources/application-test.yml
- cameleer3-server-app/src/test/java/com/cameleer3/server/app/AbstractClickHouseIT.java
- cameleer3-server-app/src/test/java/com/cameleer3/server/app/interceptor/ProtocolVersionIT.java
- cameleer3-server-app/src/test/java/com/cameleer3/server/app/controller/HealthControllerIT.java
- cameleer3-server-app/src/test/java/com/cameleer3/server/app/controller/OpenApiIT.java
- cameleer3-server-app/src/test/java/com/cameleer3/server/app/controller/ForwardCompatIT.java
- cameleer-server-app/src/main/java/com/cameleer/server/app/interceptor/ProtocolVersionInterceptor.java
- cameleer-server-app/src/main/java/com/cameleer/server/app/config/WebConfig.java
- cameleer-server-app/src/main/java/com/cameleer/server/app/CameleerServerApplication.java
- cameleer-server-app/src/test/resources/application-test.yml
- cameleer-server-app/src/test/java/com/cameleer/server/app/AbstractClickHouseIT.java
- cameleer-server-app/src/test/java/com/cameleer/server/app/interceptor/ProtocolVersionIT.java
- cameleer-server-app/src/test/java/com/cameleer/server/app/controller/HealthControllerIT.java
- cameleer-server-app/src/test/java/com/cameleer/server/app/controller/OpenApiIT.java
- cameleer-server-app/src/test/java/com/cameleer/server/app/controller/ForwardCompatIT.java
autonomous: true
requirements:
- API-01
@@ -34,13 +34,13 @@ must_haves:
- "Unknown JSON fields in request body do not cause deserialization errors"
- "ClickHouse tables have TTL clause for 30-day retention"
artifacts:
- path: "cameleer3-server-app/src/main/java/com/cameleer3/server/app/interceptor/ProtocolVersionInterceptor.java"
- path: "cameleer-server-app/src/main/java/com/cameleer/server/app/interceptor/ProtocolVersionInterceptor.java"
provides: "Validates X-Cameleer-Protocol-Version:1 header on data endpoints"
min_lines: 20
- path: "cameleer3-server-app/src/main/java/com/cameleer3/server/app/config/WebConfig.java"
- path: "cameleer-server-app/src/main/java/com/cameleer/server/app/config/WebConfig.java"
provides: "Registers interceptor with path patterns"
min_lines: 10
- path: "cameleer3-server-app/src/test/java/com/cameleer3/server/app/AbstractClickHouseIT.java"
- path: "cameleer-server-app/src/test/java/com/cameleer/server/app/AbstractClickHouseIT.java"
provides: "Shared Testcontainers base class for integration tests"
min_lines: 20
key_links:
@@ -83,11 +83,11 @@ Output: AbstractClickHouseIT base class, working health, Swagger UI, protocol he
<task type="auto">
<name>Task 1: Test infrastructure, protocol version interceptor, WebConfig, and Spring Boot application class</name>
<files>
cameleer3-server-app/src/test/resources/application-test.yml,
cameleer3-server-app/src/test/java/com/cameleer3/server/app/AbstractClickHouseIT.java,
cameleer3-server-app/src/main/java/com/cameleer3/server/app/interceptor/ProtocolVersionInterceptor.java,
cameleer3-server-app/src/main/java/com/cameleer3/server/app/config/WebConfig.java,
cameleer3-server-app/src/main/java/com/cameleer3/server/app/Cameleer3ServerApplication.java
cameleer-server-app/src/test/resources/application-test.yml,
cameleer-server-app/src/test/java/com/cameleer/server/app/AbstractClickHouseIT.java,
cameleer-server-app/src/main/java/com/cameleer/server/app/interceptor/ProtocolVersionInterceptor.java,
cameleer-server-app/src/main/java/com/cameleer/server/app/config/WebConfig.java,
cameleer-server-app/src/main/java/com/cameleer/server/app/CameleerServerApplication.java
</files>
<action>
1. Create application-test.yml for test profile:
@@ -113,15 +113,15 @@ Output: AbstractClickHouseIT base class, working health, Swagger UI, protocol he
- Override addInterceptors: register interceptor with pathPatterns "/api/v1/data/**" and "/api/v1/agents/**"
- Explicitly EXCLUDE: "/api/v1/health", "/api/v1/api-docs/**", "/api/v1/swagger-ui/**", "/api/v1/swagger-ui.html"
5. Create or update Cameleer3ServerApplication:
- @SpringBootApplication in package com.cameleer3.server.app
5. Create or update CameleerServerApplication:
- @SpringBootApplication in package com.cameleer.server.app
- @EnableScheduling (needed for ClickHouseFlushScheduler from Plan 02)
- @EnableConfigurationProperties(IngestionConfig.class)
- Main method with SpringApplication.run()
- Ensure package scanning covers com.cameleer3.server.app and com.cameleer3.server.core
- Ensure package scanning covers com.cameleer.server.app and com.cameleer.server.core
</action>
<verify>
<automated>mvn clean compile -pl cameleer3-server-app -q 2>&1 | tail -5</automated>
<automated>mvn clean compile -pl cameleer-server-app -q 2>&1 | tail -5</automated>
</verify>
<done>AbstractClickHouseIT base class ready for integration tests. ProtocolVersionInterceptor validates header on data/agent paths. Health, swagger, and api-docs paths excluded. Application class enables scheduling and config properties.</done>
</task>
@@ -129,10 +129,10 @@ Output: AbstractClickHouseIT base class, working health, Swagger UI, protocol he
<task type="auto" tdd="true">
<name>Task 2: Health, OpenAPI, protocol version, forward compat, and TTL integration tests</name>
<files>
cameleer3-server-app/src/test/java/com/cameleer3/server/app/controller/HealthControllerIT.java,
cameleer3-server-app/src/test/java/com/cameleer3/server/app/controller/OpenApiIT.java,
cameleer3-server-app/src/test/java/com/cameleer3/server/app/interceptor/ProtocolVersionIT.java,
cameleer3-server-app/src/test/java/com/cameleer3/server/app/controller/ForwardCompatIT.java
cameleer-server-app/src/test/java/com/cameleer/server/app/controller/HealthControllerIT.java,
cameleer-server-app/src/test/java/com/cameleer/server/app/controller/OpenApiIT.java,
cameleer-server-app/src/test/java/com/cameleer/server/app/interceptor/ProtocolVersionIT.java,
cameleer-server-app/src/test/java/com/cameleer/server/app/controller/ForwardCompatIT.java
</files>
<behavior>
- GET /api/v1/health returns 200 with JSON containing status field
@@ -178,7 +178,7 @@ Output: AbstractClickHouseIT base class, working health, Swagger UI, protocol he
Note: All tests that POST to data endpoints must include X-Cameleer-Protocol-Version:1 header.
</action>
<verify>
<automated>mvn test -pl cameleer3-server-app -Dtest="HealthControllerIT,OpenApiIT,ProtocolVersionIT,ForwardCompatIT" -q 2>&1 | tail -15</automated>
<automated>mvn test -pl cameleer-server-app -Dtest="HealthControllerIT,OpenApiIT,ProtocolVersionIT,ForwardCompatIT" -q 2>&1 | tail -15</automated>
</verify>
<done>Health returns 200. OpenAPI docs are available and list endpoints. Protocol version header enforced on data paths, not on health/docs. Unknown JSON fields accepted. TTL confirmed in ClickHouse DDL via HealthControllerIT test methods.</done>
</task>
@@ -186,7 +186,7 @@ Output: AbstractClickHouseIT base class, working health, Swagger UI, protocol he
</tasks>
<verification>
- `mvn test -pl cameleer3-server-app -Dtest="HealthControllerIT,OpenApiIT,ProtocolVersionIT,ForwardCompatIT" -q` -- all tests pass
- `mvn test -pl cameleer-server-app -Dtest="HealthControllerIT,OpenApiIT,ProtocolVersionIT,ForwardCompatIT" -q` -- all tests pass
- GET /api/v1/health returns 200
- GET /api/v1/api-docs returns OpenAPI spec
- Missing protocol header returns 400 on data endpoints

View File

@@ -12,7 +12,7 @@ provides:
- AbstractClickHouseIT base class for all integration tests
- ProtocolVersionInterceptor enforcing X-Cameleer-Protocol-Version:1 on data/agent paths
- WebConfig with interceptor registration and path exclusions
- Cameleer3ServerApplication with @EnableScheduling and component scanning
- CameleerServerApplication with @EnableScheduling and component scanning
- 12 passing integration tests (health, OpenAPI, protocol version, forward compat, TTL)
affects: [01-02, 02-search, 03-agent-registry]
@@ -23,17 +23,17 @@ tech-stack:
key-files:
created:
- cameleer3-server-app/src/main/java/com/cameleer3/server/app/Cameleer3ServerApplication.java
- cameleer3-server-app/src/main/java/com/cameleer3/server/app/interceptor/ProtocolVersionInterceptor.java
- cameleer3-server-app/src/main/java/com/cameleer3/server/app/config/WebConfig.java
- cameleer3-server-app/src/test/java/com/cameleer3/server/app/AbstractClickHouseIT.java
- cameleer3-server-app/src/test/resources/application-test.yml
- cameleer3-server-app/src/test/java/com/cameleer3/server/app/controller/HealthControllerIT.java
- cameleer3-server-app/src/test/java/com/cameleer3/server/app/controller/OpenApiIT.java
- cameleer3-server-app/src/test/java/com/cameleer3/server/app/interceptor/ProtocolVersionIT.java
- cameleer3-server-app/src/test/java/com/cameleer3/server/app/controller/ForwardCompatIT.java
- cameleer-server-app/src/main/java/com/cameleer/server/app/CameleerServerApplication.java
- cameleer-server-app/src/main/java/com/cameleer/server/app/interceptor/ProtocolVersionInterceptor.java
- cameleer-server-app/src/main/java/com/cameleer/server/app/config/WebConfig.java
- cameleer-server-app/src/test/java/com/cameleer/server/app/AbstractClickHouseIT.java
- cameleer-server-app/src/test/resources/application-test.yml
- cameleer-server-app/src/test/java/com/cameleer/server/app/controller/HealthControllerIT.java
- cameleer-server-app/src/test/java/com/cameleer/server/app/controller/OpenApiIT.java
- cameleer-server-app/src/test/java/com/cameleer/server/app/interceptor/ProtocolVersionIT.java
- cameleer-server-app/src/test/java/com/cameleer/server/app/controller/ForwardCompatIT.java
modified:
- cameleer3-server-app/pom.xml
- cameleer-server-app/pom.xml
- pom.xml
- clickhouse/init/01-schema.sql
@@ -70,7 +70,7 @@ completed: 2026-03-11
- ProtocolVersionInterceptor validates X-Cameleer-Protocol-Version:1 on /api/v1/data/** and /api/v1/agents/** paths, returning 400 JSON error for missing or wrong version
- AbstractClickHouseIT base class with Testcontainers ClickHouse 25.3, shared static container, schema init from 01-schema.sql
- 12 integration tests: health endpoint (2), OpenAPI docs (2), protocol version enforcement (5), forward compatibility (1), TTL verification (2)
- Cameleer3ServerApplication with @EnableScheduling, @EnableConfigurationProperties, and dual package scanning
- CameleerServerApplication with @EnableScheduling, @EnableConfigurationProperties, and dual package scanning
## Task Commits
@@ -80,17 +80,17 @@ Each task was committed atomically:
2. **Task 2: Integration tests for health, OpenAPI, protocol version, forward compat, and TTL** - `2d3fde3` (test)
## Files Created/Modified
- `cameleer3-server-app/src/main/java/.../Cameleer3ServerApplication.java` - Spring Boot entry point with scheduling and config properties
- `cameleer3-server-app/src/main/java/.../interceptor/ProtocolVersionInterceptor.java` - Validates protocol version header on data/agent paths
- `cameleer3-server-app/src/main/java/.../config/WebConfig.java` - Registers interceptor with path patterns and exclusions
- `cameleer3-server-app/src/test/java/.../AbstractClickHouseIT.java` - Shared Testcontainers base class for ITs
- `cameleer3-server-app/src/test/resources/application-test.yml` - Test profile with small buffer config
- `cameleer3-server-app/src/test/java/.../controller/HealthControllerIT.java` - Health endpoint and TTL tests
- `cameleer3-server-app/src/test/java/.../controller/OpenApiIT.java` - OpenAPI and Swagger UI tests
- `cameleer3-server-app/src/test/java/.../interceptor/ProtocolVersionIT.java` - Protocol header enforcement tests
- `cameleer3-server-app/src/test/java/.../controller/ForwardCompatIT.java` - Unknown JSON fields test
- `cameleer-server-app/src/main/java/.../CameleerServerApplication.java` - Spring Boot entry point with scheduling and config properties
- `cameleer-server-app/src/main/java/.../interceptor/ProtocolVersionInterceptor.java` - Validates protocol version header on data/agent paths
- `cameleer-server-app/src/main/java/.../config/WebConfig.java` - Registers interceptor with path patterns and exclusions
- `cameleer-server-app/src/test/java/.../AbstractClickHouseIT.java` - Shared Testcontainers base class for ITs
- `cameleer-server-app/src/test/resources/application-test.yml` - Test profile with small buffer config
- `cameleer-server-app/src/test/java/.../controller/HealthControllerIT.java` - Health endpoint and TTL tests
- `cameleer-server-app/src/test/java/.../controller/OpenApiIT.java` - OpenAPI and Swagger UI tests
- `cameleer-server-app/src/test/java/.../interceptor/ProtocolVersionIT.java` - Protocol header enforcement tests
- `cameleer-server-app/src/test/java/.../controller/ForwardCompatIT.java` - Unknown JSON fields test
- `pom.xml` - Override testcontainers.version to 2.0.3
- `cameleer3-server-app/pom.xml` - Remove junit-jupiter, upgrade testcontainers-clickhouse to 2.0.3
- `cameleer-server-app/pom.xml` - Remove junit-jupiter, upgrade testcontainers-clickhouse to 2.0.3
- `clickhouse/init/01-schema.sql` - Fix TTL expressions and error column types
## Decisions Made
@@ -107,7 +107,7 @@ Each task was committed atomically:
- **Found during:** Task 2 (compilation)
- **Issue:** org.testcontainers:junit-jupiter:2.0.2 does not exist in Maven Central
- **Fix:** Removed junit-jupiter dependency, upgraded to TC 2.0.3, managed container lifecycle manually via static initializer
- **Files modified:** cameleer3-server-app/pom.xml, pom.xml, AbstractClickHouseIT.java
- **Files modified:** cameleer-server-app/pom.xml, pom.xml, AbstractClickHouseIT.java
- **Verification:** All tests compile and pass
- **Committed in:** 2d3fde3

View File

@@ -6,7 +6,7 @@
## Summary
Phase 1 establishes the data pipeline and API skeleton for Cameleer3 Server. Agents POST execution data, diagrams, and metrics to REST endpoints; the server buffers these in memory and batch-flushes to ClickHouse. The ClickHouse schema design is the most critical and least reversible decision in this phase -- ORDER BY and partitioning cannot be changed without table recreation.
Phase 1 establishes the data pipeline and API skeleton for Cameleer Server. Agents POST execution data, diagrams, and metrics to REST endpoints; the server buffers these in memory and batch-flushes to ClickHouse. The ClickHouse schema design is the most critical and least reversible decision in this phase -- ORDER BY and partitioning cannot be changed without table recreation.
The ClickHouse Java ecosystem has undergone significant changes. The recommended approach is **clickhouse-jdbc v0.9.7** (JDBC V2 driver) with Spring Boot's JdbcTemplate for batch inserts. An alternative is the standalone **client-v2** artifact which offers a POJO-based insert API, but JDBC integration with Spring Boot is more conventional and better documented. ClickHouse now has a native full-text index (TYPE text, GA as of March 2026) that supersedes the older tokenbf_v1 bloom filter approach -- this is relevant for Phase 2 but should be accounted for in schema design now.
@@ -17,7 +17,7 @@ The ClickHouse Java ecosystem has undergone significant changes. The recommended
| ID | Description | Research Support |
|----|-------------|-----------------|
| INGST-01 (#1) | Accept RouteExecution via POST /api/v1/data/executions, return 202 | REST controller + async write buffer pattern; Jackson deserialization of cameleer3-common models |
| INGST-01 (#1) | Accept RouteExecution via POST /api/v1/data/executions, return 202 | REST controller + async write buffer pattern; Jackson deserialization of cameleer-common models |
| INGST-02 (#2) | Accept RouteGraph via POST /api/v1/data/diagrams, return 202 | Same pattern; separate ClickHouse table for diagrams with content-hash dedup |
| INGST-03 (#3) | Accept metrics via POST /api/v1/data/metrics, return 202 | Same pattern; separate ClickHouse table for metrics |
| INGST-04 (#4) | In-memory batch buffer with configurable flush interval/size | ArrayBlockingQueue + @Scheduled flush; configurable via application.yml |
@@ -60,7 +60,7 @@ The ClickHouse Java ecosystem has undergone significant changes. The recommended
| ArrayBlockingQueue | LMAX Disruptor | Disruptor is faster under extreme contention but adds complexity; ABQ is sufficient for this throughput |
| Spring JdbcTemplate | Raw JDBC PreparedStatement | JdbcTemplate provides cleaner error handling and resource management; no meaningful overhead |
**Installation (add to cameleer3-server-app/pom.xml):**
**Installation (add to cameleer-server-app/pom.xml):**
```xml
<!-- ClickHouse JDBC V2 -->
<dependency>
@@ -103,7 +103,7 @@ The ClickHouse Java ecosystem has undergone significant changes. The recommended
</dependency>
```
**Add to cameleer3-server-core/pom.xml:**
**Add to cameleer-server-core/pom.xml:**
```xml
<!-- SLF4J for logging (no Spring dependency) -->
<dependency>
@@ -117,7 +117,7 @@ The ClickHouse Java ecosystem has undergone significant changes. The recommended
### Recommended Project Structure
```
cameleer3-server-core/src/main/java/com/cameleer3/server/core/
cameleer-server-core/src/main/java/com/cameleer/server/core/
ingestion/
WriteBuffer.java # Bounded queue + flush logic
IngestionService.java # Accepts data, routes to buffer
@@ -126,9 +126,9 @@ cameleer3-server-core/src/main/java/com/cameleer3/server/core/
DiagramRepository.java # Interface: store/retrieve diagrams
MetricsRepository.java # Interface: store metrics
model/
(extend/complement cameleer3-common models as needed)
(extend/complement cameleer-common models as needed)
cameleer3-server-app/src/main/java/com/cameleer3/server/app/
cameleer-server-app/src/main/java/com/cameleer/server/app/
config/
ClickHouseConfig.java # DataSource + JdbcTemplate bean
IngestionConfig.java # Buffer size, flush interval from YAML
@@ -424,7 +424,7 @@ services:
environment:
CLICKHOUSE_USER: cameleer
CLICKHOUSE_PASSWORD: cameleer_dev
CLICKHOUSE_DB: cameleer3
CLICKHOUSE_DB: cameleer
ulimits:
nofile:
soft: 262144
@@ -442,7 +442,7 @@ server:
spring:
datasource:
url: jdbc:ch://localhost:8123/cameleer3
url: jdbc:ch://localhost:8123/cameleer
username: cameleer
password: cameleer_dev
driver-class-name: com.clickhouse.jdbc.ClickHouseDriver
@@ -493,10 +493,10 @@ management:
## Open Questions
1. **Exact cameleer3-common model structure**
1. **Exact cameleer-common model structure**
- What we know: Models include RouteExecution, ProcessorExecution, ExchangeSnapshot, RouteGraph, RouteNode, RouteEdge
- What's unclear: Exact field names, types, nesting structure -- needed to design ClickHouse schema precisely
- Recommendation: Read cameleer3-common source code before implementing schema. Schema must match the wire format.
- Recommendation: Read cameleer-common source code before implementing schema. Schema must match the wire format.
2. **ClickHouse JDBC V2 + HikariCP compatibility**
- What we know: clickhouse-jdbc 0.9.7 implements JDBC spec; HikariCP is Spring Boot default
@@ -515,36 +515,36 @@ management:
| Property | Value |
|----------|-------|
| Framework | JUnit 5 (Spring Boot managed) + Testcontainers 2.0.2 |
| Config file | cameleer3-server-app/src/test/resources/application-test.yml (Wave 0) |
| Quick run command | `mvn test -pl cameleer3-server-core -Dtest=WriteBufferTest -q` |
| Config file | cameleer-server-app/src/test/resources/application-test.yml (Wave 0) |
| Quick run command | `mvn test -pl cameleer-server-core -Dtest=WriteBufferTest -q` |
| Full suite command | `mvn verify` |
### Phase Requirements -> Test Map
| Req ID | Behavior | Test Type | Automated Command | File Exists? |
|--------|----------|-----------|-------------------|-------------|
| INGST-01 | POST /api/v1/data/executions returns 202, data in ClickHouse | integration | `mvn test -pl cameleer3-server-app -Dtest=ExecutionControllerIT -q` | Wave 0 |
| INGST-02 | POST /api/v1/data/diagrams returns 202 | integration | `mvn test -pl cameleer3-server-app -Dtest=DiagramControllerIT -q` | Wave 0 |
| INGST-03 | POST /api/v1/data/metrics returns 202 | integration | `mvn test -pl cameleer3-server-app -Dtest=MetricsControllerIT -q` | Wave 0 |
| INGST-04 | Buffer flushes at interval/size | unit | `mvn test -pl cameleer3-server-core -Dtest=WriteBufferTest -q` | Wave 0 |
| INGST-05 | 503 when buffer full | unit+integration | `mvn test -pl cameleer3-server-app -Dtest=BackpressureIT -q` | Wave 0 |
| INGST-06 | TTL removes old data | integration | `mvn test -pl cameleer3-server-app -Dtest=ClickHouseTtlIT -q` | Wave 0 |
| INGST-01 | POST /api/v1/data/executions returns 202, data in ClickHouse | integration | `mvn test -pl cameleer-server-app -Dtest=ExecutionControllerIT -q` | Wave 0 |
| INGST-02 | POST /api/v1/data/diagrams returns 202 | integration | `mvn test -pl cameleer-server-app -Dtest=DiagramControllerIT -q` | Wave 0 |
| INGST-03 | POST /api/v1/data/metrics returns 202 | integration | `mvn test -pl cameleer-server-app -Dtest=MetricsControllerIT -q` | Wave 0 |
| INGST-04 | Buffer flushes at interval/size | unit | `mvn test -pl cameleer-server-core -Dtest=WriteBufferTest -q` | Wave 0 |
| INGST-05 | 503 when buffer full | unit+integration | `mvn test -pl cameleer-server-app -Dtest=BackpressureIT -q` | Wave 0 |
| INGST-06 | TTL removes old data | integration | `mvn test -pl cameleer-server-app -Dtest=ClickHouseTtlIT -q` | Wave 0 |
| API-01 | Endpoints under /api/v1/ | integration | Covered by controller ITs | Wave 0 |
| API-02 | OpenAPI docs available | integration | `mvn test -pl cameleer3-server-app -Dtest=OpenApiIT -q` | Wave 0 |
| API-03 | GET /api/v1/health responds | integration | `mvn test -pl cameleer3-server-app -Dtest=HealthControllerIT -q` | Wave 0 |
| API-04 | Protocol version header validated | integration | `mvn test -pl cameleer3-server-app -Dtest=ProtocolVersionIT -q` | Wave 0 |
| API-05 | Unknown JSON fields accepted | unit | `mvn test -pl cameleer3-server-app -Dtest=ForwardCompatIT -q` | Wave 0 |
| API-02 | OpenAPI docs available | integration | `mvn test -pl cameleer-server-app -Dtest=OpenApiIT -q` | Wave 0 |
| API-03 | GET /api/v1/health responds | integration | `mvn test -pl cameleer-server-app -Dtest=HealthControllerIT -q` | Wave 0 |
| API-04 | Protocol version header validated | integration | `mvn test -pl cameleer-server-app -Dtest=ProtocolVersionIT -q` | Wave 0 |
| API-05 | Unknown JSON fields accepted | unit | `mvn test -pl cameleer-server-app -Dtest=ForwardCompatIT -q` | Wave 0 |
### Sampling Rate
- **Per task commit:** `mvn test -pl cameleer3-server-core -q` (unit tests, fast)
- **Per task commit:** `mvn test -pl cameleer-server-core -q` (unit tests, fast)
- **Per wave merge:** `mvn verify` (full suite with Testcontainers integration tests)
- **Phase gate:** Full suite green before verification
### Wave 0 Gaps
- [ ] `cameleer3-server-app/src/test/resources/application-test.yml` -- test ClickHouse config
- [ ] `cameleer3-server-core/src/test/java/.../WriteBufferTest.java` -- buffer unit tests
- [ ] `cameleer3-server-app/src/test/java/.../AbstractClickHouseIT.java` -- shared Testcontainers base class
- [ ] `cameleer3-server-app/src/test/java/.../ExecutionControllerIT.java` -- ingestion integration test
- [ ] `cameleer-server-app/src/test/resources/application-test.yml` -- test ClickHouse config
- [ ] `cameleer-server-core/src/test/java/.../WriteBufferTest.java` -- buffer unit tests
- [ ] `cameleer-server-app/src/test/java/.../AbstractClickHouseIT.java` -- shared Testcontainers base class
- [ ] `cameleer-server-app/src/test/java/.../ExecutionControllerIT.java` -- ingestion integration test
- [ ] Docker available on test machine for Testcontainers
## Sources

View File

@@ -18,8 +18,8 @@ created: 2026-03-11
| Property | Value |
|----------|-------|
| **Framework** | JUnit 5 (Spring Boot managed) + Testcontainers 2.0.2 |
| **Config file** | cameleer3-server-app/src/test/resources/application-test.yml (Wave 0) |
| **Quick run command** | `mvn test -pl cameleer3-server-core -Dtest=WriteBufferTest -q` |
| **Config file** | cameleer-server-app/src/test/resources/application-test.yml (Wave 0) |
| **Quick run command** | `mvn test -pl cameleer-server-core -Dtest=WriteBufferTest -q` |
| **Full suite command** | `mvn verify` |
| **Estimated runtime** | ~30 seconds |
@@ -27,7 +27,7 @@ created: 2026-03-11
## Sampling Rate
- **After every task commit:** Run `mvn test -pl cameleer3-server-core -q`
- **After every task commit:** Run `mvn test -pl cameleer-server-core -q`
- **After every plan wave:** Run `mvn verify`
- **Before `/gsd:verify-work`:** Full suite must be green
- **Max feedback latency:** 30 seconds
@@ -38,17 +38,17 @@ created: 2026-03-11
| Task ID | Plan | Wave | Requirement | Test Type | Automated Command | File Exists | Status |
|---------|------|------|-------------|-----------|-------------------|-------------|--------|
| 1-01-01 | 01 | 1 | INGST-04 | unit | `mvn test -pl cameleer3-server-core -Dtest=WriteBufferTest -q` | no W0 | pending |
| 1-01-02 | 01 | 1 | INGST-01 | integration | `mvn test -pl cameleer3-server-app -Dtest=ExecutionControllerIT -q` | no W0 | pending |
| 1-01-03 | 01 | 1 | INGST-05 | integration | `mvn test -pl cameleer3-server-app -Dtest=BackpressureIT -q` | no W0 | pending |
| 1-01-04 | 01 | 1 | INGST-06 | integration | `mvn test -pl cameleer3-server-app -Dtest=HealthControllerIT#ttlConfigured* -q` | no W0 | pending |
| 1-02-01 | 02 | 1 | INGST-01 | integration | `mvn test -pl cameleer3-server-app -Dtest=ExecutionControllerIT -q` | no W0 | pending |
| 1-02-02 | 02 | 1 | INGST-02 | integration | `mvn test -pl cameleer3-server-app -Dtest=DiagramControllerIT -q` | no W0 | pending |
| 1-02-03 | 02 | 1 | INGST-03 | integration | `mvn test -pl cameleer3-server-app -Dtest=MetricsControllerIT -q` | no W0 | pending |
| 1-02-04 | 02 | 1 | API-02 | integration | `mvn test -pl cameleer3-server-app -Dtest=OpenApiIT -q` | no W0 | pending |
| 1-02-05 | 02 | 1 | API-03 | integration | `mvn test -pl cameleer3-server-app -Dtest=HealthControllerIT -q` | no W0 | pending |
| 1-02-06 | 02 | 1 | API-04 | integration | `mvn test -pl cameleer3-server-app -Dtest=ProtocolVersionIT -q` | no W0 | pending |
| 1-02-07 | 02 | 1 | API-05 | unit | `mvn test -pl cameleer3-server-app -Dtest=ForwardCompatIT -q` | no W0 | pending |
| 1-01-01 | 01 | 1 | INGST-04 | unit | `mvn test -pl cameleer-server-core -Dtest=WriteBufferTest -q` | no W0 | pending |
| 1-01-02 | 01 | 1 | INGST-01 | integration | `mvn test -pl cameleer-server-app -Dtest=ExecutionControllerIT -q` | no W0 | pending |
| 1-01-03 | 01 | 1 | INGST-05 | integration | `mvn test -pl cameleer-server-app -Dtest=BackpressureIT -q` | no W0 | pending |
| 1-01-04 | 01 | 1 | INGST-06 | integration | `mvn test -pl cameleer-server-app -Dtest=HealthControllerIT#ttlConfigured* -q` | no W0 | pending |
| 1-02-01 | 02 | 1 | INGST-01 | integration | `mvn test -pl cameleer-server-app -Dtest=ExecutionControllerIT -q` | no W0 | pending |
| 1-02-02 | 02 | 1 | INGST-02 | integration | `mvn test -pl cameleer-server-app -Dtest=DiagramControllerIT -q` | no W0 | pending |
| 1-02-03 | 02 | 1 | INGST-03 | integration | `mvn test -pl cameleer-server-app -Dtest=MetricsControllerIT -q` | no W0 | pending |
| 1-02-04 | 02 | 1 | API-02 | integration | `mvn test -pl cameleer-server-app -Dtest=OpenApiIT -q` | no W0 | pending |
| 1-02-05 | 02 | 1 | API-03 | integration | `mvn test -pl cameleer-server-app -Dtest=HealthControllerIT -q` | no W0 | pending |
| 1-02-06 | 02 | 1 | API-04 | integration | `mvn test -pl cameleer-server-app -Dtest=ProtocolVersionIT -q` | no W0 | pending |
| 1-02-07 | 02 | 1 | API-05 | unit | `mvn test -pl cameleer-server-app -Dtest=ForwardCompatIT -q` | no W0 | pending |
*Status: pending / green / red / flaky*
@@ -56,10 +56,10 @@ created: 2026-03-11
## Wave 0 Requirements
- [ ] `cameleer3-server-app/src/test/resources/application-test.yml` — test ClickHouse config
- [ ] `cameleer3-server-core/src/test/java/.../WriteBufferTest.java` — buffer unit tests
- [ ] `cameleer3-server-app/src/test/java/.../AbstractClickHouseIT.java` — shared Testcontainers base class
- [ ] `cameleer3-server-app/src/test/java/.../ExecutionControllerIT.java` — ingestion integration test
- [ ] `cameleer-server-app/src/test/resources/application-test.yml` — test ClickHouse config
- [ ] `cameleer-server-core/src/test/java/.../WriteBufferTest.java` — buffer unit tests
- [ ] `cameleer-server-app/src/test/java/.../AbstractClickHouseIT.java` — shared Testcontainers base class
- [ ] `cameleer-server-app/src/test/java/.../ExecutionControllerIT.java` — ingestion integration test
- [ ] Docker available on test machine for Testcontainers
*If none: "Existing infrastructure covers all phase requirements."*

View File

@@ -35,27 +35,27 @@ re_verification: false
| Artifact | Expected | Status | Details |
|---|---|---|---|
| `cameleer3-server-core/src/main/java/com/cameleer3/server/core/ingestion/WriteBuffer.java` | Generic bounded write buffer with offer/drain/isFull | VERIFIED | 80 lines; `ArrayBlockingQueue`-backed; implements `offer`, `offerBatch` (all-or-nothing), `drain`, `isFull`, `size`, `capacity`, `remainingCapacity` |
| `cameleer-server-core/src/main/java/com/cameleer/server/core/ingestion/WriteBuffer.java` | Generic bounded write buffer with offer/drain/isFull | VERIFIED | 80 lines; `ArrayBlockingQueue`-backed; implements `offer`, `offerBatch` (all-or-nothing), `drain`, `isFull`, `size`, `capacity`, `remainingCapacity` |
| `clickhouse/init/01-schema.sql` | ClickHouse DDL for all three tables | VERIFIED | Contains `CREATE TABLE route_executions`, `route_diagrams`, `agent_metrics`; correct ENGINE, ORDER BY, PARTITION BY, TTL with `toDateTime()` cast |
| `docker-compose.yml` | Local ClickHouse service | VERIFIED | `clickhouse/clickhouse-server:25.3`; ports 8123/9000; init volume mount; credentials configured |
| `cameleer3-server-core/src/main/java/com/cameleer3/server/core/storage/ExecutionRepository.java` | Repository interface for execution batch inserts | VERIFIED | Declares `void insertBatch(List<RouteExecution> executions)` |
| `cameleer-server-core/src/main/java/com/cameleer/server/core/storage/ExecutionRepository.java` | Repository interface for execution batch inserts | VERIFIED | Declares `void insertBatch(List<RouteExecution> executions)` |
#### Plan 01-02 Artifacts
| Artifact | Expected | Status | Details |
|---|---|---|---|
| `cameleer3-server-app/src/main/java/com/cameleer3/server/app/controller/ExecutionController.java` | POST /api/v1/data/executions endpoint | VERIFIED | 79 lines; `@PostMapping("/executions")`; handles single/array via raw String parsing; returns 202 or 503 + Retry-After |
| `cameleer3-server-app/src/main/java/com/cameleer3/server/app/storage/ClickHouseExecutionRepository.java` | Batch insert to route_executions via JdbcTemplate | VERIFIED | 118 lines; `@Repository`; `BatchPreparedStatementSetter`; flattens processor tree to parallel arrays |
| `cameleer3-server-app/src/main/java/com/cameleer3/server/app/ingestion/ClickHouseFlushScheduler.java` | Scheduled drain of WriteBuffer into ClickHouse | VERIFIED | 160 lines; `@Scheduled(fixedDelayString="${ingestion.flush-interval-ms:1000}")`; implements `SmartLifecycle` for shutdown drain |
| `cameleer3-server-core/src/main/java/com/cameleer3/server/core/ingestion/IngestionService.java` | Routes data to appropriate WriteBuffer instances | VERIFIED | 115 lines; plain class; `acceptExecution`, `acceptExecutions`, `acceptDiagram`, `acceptDiagrams`, `acceptMetrics`; delegates to typed `WriteBuffer` instances |
| `cameleer-server-app/src/main/java/com/cameleer/server/app/controller/ExecutionController.java` | POST /api/v1/data/executions endpoint | VERIFIED | 79 lines; `@PostMapping("/executions")`; handles single/array via raw String parsing; returns 202 or 503 + Retry-After |
| `cameleer-server-app/src/main/java/com/cameleer/server/app/storage/ClickHouseExecutionRepository.java` | Batch insert to route_executions via JdbcTemplate | VERIFIED | 118 lines; `@Repository`; `BatchPreparedStatementSetter`; flattens processor tree to parallel arrays |
| `cameleer-server-app/src/main/java/com/cameleer/server/app/ingestion/ClickHouseFlushScheduler.java` | Scheduled drain of WriteBuffer into ClickHouse | VERIFIED | 160 lines; `@Scheduled(fixedDelayString="${ingestion.flush-interval-ms:1000}")`; implements `SmartLifecycle` for shutdown drain |
| `cameleer-server-core/src/main/java/com/cameleer/server/core/ingestion/IngestionService.java` | Routes data to appropriate WriteBuffer instances | VERIFIED | 115 lines; plain class; `acceptExecution`, `acceptExecutions`, `acceptDiagram`, `acceptDiagrams`, `acceptMetrics`; delegates to typed `WriteBuffer` instances |
#### Plan 01-03 Artifacts
| Artifact | Expected | Status | Details |
|---|---|---|---|
| `cameleer3-server-app/src/main/java/com/cameleer3/server/app/interceptor/ProtocolVersionInterceptor.java` | Validates X-Cameleer-Protocol-Version:1 header on data endpoints | VERIFIED | 47 lines; implements `HandlerInterceptor.preHandle`; returns 400 JSON on missing/wrong version |
| `cameleer3-server-app/src/main/java/com/cameleer3/server/app/config/WebConfig.java` | Registers interceptor with path patterns | VERIFIED | 35 lines; `addInterceptors` registers interceptor on `/api/v1/data/**` and `/api/v1/agents/**`; excludes health, api-docs, swagger-ui |
| `cameleer3-server-app/src/test/java/com/cameleer3/server/app/AbstractClickHouseIT.java` | Shared Testcontainers base class for integration tests | VERIFIED | 73 lines; static `ClickHouseContainer`; `@DynamicPropertySource`; `@BeforeAll` schema init from SQL file; `JdbcTemplate` exposed to subclasses |
| `cameleer-server-app/src/main/java/com/cameleer/server/app/interceptor/ProtocolVersionInterceptor.java` | Validates X-Cameleer-Protocol-Version:1 header on data endpoints | VERIFIED | 47 lines; implements `HandlerInterceptor.preHandle`; returns 400 JSON on missing/wrong version |
| `cameleer-server-app/src/main/java/com/cameleer/server/app/config/WebConfig.java` | Registers interceptor with path patterns | VERIFIED | 35 lines; `addInterceptors` registers interceptor on `/api/v1/data/**` and `/api/v1/agents/**`; excludes health, api-docs, swagger-ui |
| `cameleer-server-app/src/test/java/com/cameleer/server/app/AbstractClickHouseIT.java` | Shared Testcontainers base class for integration tests | VERIFIED | 73 lines; static `ClickHouseContainer`; `@DynamicPropertySource`; `@BeforeAll` schema init from SQL file; `JdbcTemplate` exposed to subclasses |
---
@@ -113,7 +113,7 @@ No orphaned requirements — all 11 IDs declared in plan frontmatter match the R
### Anti-Patterns Found
No anti-patterns detected. Scanned all source files in `cameleer3-server-app/src/main` and `cameleer3-server-core/src/main` for TODO/FIXME/PLACEHOLDER/stub return patterns. None found.
No anti-patterns detected. Scanned all source files in `cameleer-server-app/src/main` and `cameleer-server-core/src/main` for TODO/FIXME/PLACEHOLDER/stub return patterns. None found.
One minor observation (not a blocker):

View File

@@ -6,17 +6,17 @@ wave: 1
depends_on: []
files_modified:
- clickhouse/init/02-search-columns.sql
- cameleer3-server-core/src/main/java/com/cameleer3/server/core/search/SearchRequest.java
- cameleer3-server-core/src/main/java/com/cameleer3/server/core/search/SearchResult.java
- cameleer3-server-core/src/main/java/com/cameleer3/server/core/search/SearchEngine.java
- cameleer3-server-core/src/main/java/com/cameleer3/server/core/search/SearchService.java
- cameleer3-server-core/src/main/java/com/cameleer3/server/core/search/ExecutionSummary.java
- cameleer3-server-core/src/main/java/com/cameleer3/server/core/detail/DetailService.java
- cameleer3-server-core/src/main/java/com/cameleer3/server/core/detail/ExecutionDetail.java
- cameleer3-server-core/src/main/java/com/cameleer3/server/core/detail/ProcessorNode.java
- cameleer3-server-core/src/main/java/com/cameleer3/server/core/storage/ExecutionRepository.java
- cameleer3-server-app/src/main/java/com/cameleer3/server/app/storage/ClickHouseExecutionRepository.java
- cameleer3-server-app/src/test/java/com/cameleer3/server/app/AbstractClickHouseIT.java
- cameleer-server-core/src/main/java/com/cameleer/server/core/search/SearchRequest.java
- cameleer-server-core/src/main/java/com/cameleer/server/core/search/SearchResult.java
- cameleer-server-core/src/main/java/com/cameleer/server/core/search/SearchEngine.java
- cameleer-server-core/src/main/java/com/cameleer/server/core/search/SearchService.java
- cameleer-server-core/src/main/java/com/cameleer/server/core/search/ExecutionSummary.java
- cameleer-server-core/src/main/java/com/cameleer/server/core/detail/DetailService.java
- cameleer-server-core/src/main/java/com/cameleer/server/core/detail/ExecutionDetail.java
- cameleer-server-core/src/main/java/com/cameleer/server/core/detail/ProcessorNode.java
- cameleer-server-core/src/main/java/com/cameleer/server/core/storage/ExecutionRepository.java
- cameleer-server-app/src/main/java/com/cameleer/server/app/storage/ClickHouseExecutionRepository.java
- cameleer-server-app/src/test/java/com/cameleer/server/app/AbstractClickHouseIT.java
autonomous: true
requirements:
- SRCH-01
@@ -38,21 +38,21 @@ must_haves:
- path: "clickhouse/init/02-search-columns.sql"
provides: "Schema extension DDL for Phase 2 columns and skip indexes"
contains: "exchange_bodies"
- path: "cameleer3-server-core/src/main/java/com/cameleer3/server/core/search/SearchEngine.java"
- path: "cameleer-server-core/src/main/java/com/cameleer/server/core/search/SearchEngine.java"
provides: "Search backend abstraction interface"
exports: ["SearchEngine"]
- path: "cameleer3-server-core/src/main/java/com/cameleer3/server/core/search/SearchRequest.java"
- path: "cameleer-server-core/src/main/java/com/cameleer/server/core/search/SearchRequest.java"
provides: "Immutable search criteria record"
exports: ["SearchRequest"]
- path: "cameleer3-server-app/src/main/java/com/cameleer3/server/app/storage/ClickHouseExecutionRepository.java"
- path: "cameleer-server-app/src/main/java/com/cameleer/server/app/storage/ClickHouseExecutionRepository.java"
provides: "Extended with new columns in INSERT, plus query methods"
min_lines: 100
key_links:
- from: "cameleer3-server-core/src/main/java/com/cameleer3/server/core/search/SearchService.java"
- from: "cameleer-server-core/src/main/java/com/cameleer/server/core/search/SearchService.java"
to: "SearchEngine"
via: "constructor injection"
pattern: "SearchEngine"
- from: "cameleer3-server-app/src/main/java/com/cameleer3/server/app/storage/ClickHouseExecutionRepository.java"
- from: "cameleer-server-app/src/main/java/com/cameleer/server/app/storage/ClickHouseExecutionRepository.java"
to: "clickhouse/init/02-search-columns.sql"
via: "INSERT and SELECT SQL matching schema"
pattern: "exchange_bodies|processor_depths|diagram_content_hash"
@@ -79,22 +79,22 @@ Output: Schema migration SQL, updated ingestion INSERT with new columns, core se
@.planning/phases/02-transaction-search-diagrams/02-RESEARCH.md
@clickhouse/init/01-schema.sql
@cameleer3-server-core/src/main/java/com/cameleer3/server/core/storage/ExecutionRepository.java
@cameleer3-server-core/src/main/java/com/cameleer3/server/core/storage/DiagramRepository.java
@cameleer3-server-app/src/main/java/com/cameleer3/server/app/storage/ClickHouseExecutionRepository.java
@cameleer3-server-app/src/test/java/com/cameleer3/server/app/AbstractClickHouseIT.java
@cameleer-server-core/src/main/java/com/cameleer/server/core/storage/ExecutionRepository.java
@cameleer-server-core/src/main/java/com/cameleer/server/core/storage/DiagramRepository.java
@cameleer-server-app/src/main/java/com/cameleer/server/app/storage/ClickHouseExecutionRepository.java
@cameleer-server-app/src/test/java/com/cameleer/server/app/AbstractClickHouseIT.java
<interfaces>
<!-- Existing interfaces the executor needs -->
From cameleer3-server-core/.../storage/ExecutionRepository.java:
From cameleer-server-core/.../storage/ExecutionRepository.java:
```java
public interface ExecutionRepository {
void insertBatch(List<RouteExecution> executions);
}
```
From cameleer3-server-core/.../storage/DiagramRepository.java:
From cameleer-server-core/.../storage/DiagramRepository.java:
```java
public interface DiagramRepository {
void store(RouteGraph graph);
@@ -103,7 +103,7 @@ public interface DiagramRepository {
}
```
From cameleer3-common (decompiled — key fields):
From cameleer-common (decompiled — key fields):
```java
// RouteExecution: routeId, status (ExecutionStatus enum: COMPLETED/FAILED/RUNNING),
// startTime (Instant), endTime (Instant), durationMs (long), correlationId, exchangeId,
@@ -145,15 +145,15 @@ Existing ClickHouse schema (01-schema.sql):
<name>Task 1: Schema extension and core domain types</name>
<files>
clickhouse/init/02-search-columns.sql,
cameleer3-server-core/src/main/java/com/cameleer3/server/core/search/SearchRequest.java,
cameleer3-server-core/src/main/java/com/cameleer3/server/core/search/SearchResult.java,
cameleer3-server-core/src/main/java/com/cameleer3/server/core/search/SearchEngine.java,
cameleer3-server-core/src/main/java/com/cameleer3/server/core/search/SearchService.java,
cameleer3-server-core/src/main/java/com/cameleer3/server/core/search/ExecutionSummary.java,
cameleer3-server-core/src/main/java/com/cameleer3/server/core/detail/DetailService.java,
cameleer3-server-core/src/main/java/com/cameleer3/server/core/detail/ExecutionDetail.java,
cameleer3-server-core/src/main/java/com/cameleer3/server/core/detail/ProcessorNode.java,
cameleer3-server-core/src/main/java/com/cameleer3/server/core/storage/ExecutionRepository.java
cameleer-server-core/src/main/java/com/cameleer/server/core/search/SearchRequest.java,
cameleer-server-core/src/main/java/com/cameleer/server/core/search/SearchResult.java,
cameleer-server-core/src/main/java/com/cameleer/server/core/search/SearchEngine.java,
cameleer-server-core/src/main/java/com/cameleer/server/core/search/SearchService.java,
cameleer-server-core/src/main/java/com/cameleer/server/core/search/ExecutionSummary.java,
cameleer-server-core/src/main/java/com/cameleer/server/core/detail/DetailService.java,
cameleer-server-core/src/main/java/com/cameleer/server/core/detail/ExecutionDetail.java,
cameleer-server-core/src/main/java/com/cameleer/server/core/detail/ProcessorNode.java,
cameleer-server-core/src/main/java/com/cameleer/server/core/storage/ExecutionRepository.java
</files>
<action>
1. Create `clickhouse/init/02-search-columns.sql` with ALTER TABLE statements to add Phase 2 columns to route_executions:
@@ -172,14 +172,14 @@ Existing ClickHouse schema (01-schema.sql):
- Add tokenbf_v1 skip indexes on exchange_bodies and exchange_headers (GRANULARITY 4, same as idx_error)
- Add tokenbf_v1 skip index on error_stacktrace (it has no index yet, needed for SRCH-05 full-text search across stack traces)
2. Create core search domain types in `com.cameleer3.server.core.search`:
2. Create core search domain types in `com.cameleer.server.core.search`:
- `SearchRequest` record: status (String, nullable), timeFrom (Instant), timeTo (Instant), durationMin (Long), durationMax (Long), correlationId (String), text (String — global full-text), textInBody (String), textInHeaders (String), textInErrors (String), offset (int), limit (int). Compact constructor validates: limit defaults to 50 if <= 0, capped at 500; offset defaults to 0 if < 0.
- `SearchResult<T>` record: data (List<T>), total (long), offset (int), limit (int). Include static factory `empty(int offset, int limit)`.
- `ExecutionSummary` record: executionId (String), routeId (String), agentId (String), status (String), startTime (Instant), endTime (Instant), durationMs (long), correlationId (String), errorMessage (String), diagramContentHash (String). This is the lightweight list-view DTO — NOT the full processor arrays.
- `SearchEngine` interface with methods: `SearchResult<ExecutionSummary> search(SearchRequest request)` and `long count(SearchRequest request)`. This is the swappable backend (ClickHouse now, OpenSearch later per user decision).
- `SearchService` class: plain class (no Spring annotations, same pattern as IngestionService). Constructor takes SearchEngine. `search(SearchRequest)` delegates to engine.search(). This thin orchestration layer allows adding cross-cutting concerns later.
3. Create core detail domain types in `com.cameleer3.server.core.detail`:
3. Create core detail domain types in `com.cameleer.server.core.detail`:
- `ProcessorNode` record: processorId (String), processorType (String), status (String), startTime (Instant), endTime (Instant), durationMs (long), diagramNodeId (String), errorMessage (String), errorStackTrace (String), children (List<ProcessorNode>). This is the nested tree node.
- `ExecutionDetail` record: executionId (String), routeId (String), agentId (String), status (String), startTime (Instant), endTime (Instant), durationMs (long), correlationId (String), exchangeId (String), errorMessage (String), errorStackTrace (String), diagramContentHash (String), processors (List<ProcessorNode>). This is the full detail response.
- `DetailService` class: plain class (no Spring annotations). Constructor takes ExecutionRepository. Method `getDetail(String executionId)` returns `Optional<ExecutionDetail>`. Calls repository's new `findDetailById` method, then calls `reconstructTree()` to convert flat arrays into nested ProcessorNode tree. The `reconstructTree` method: takes parallel arrays (ids, types, statuses, starts, ends, durations, diagramNodeIds, errorMessages, errorStackTraces, depths, parentIndexes), creates ProcessorNode[] array, then wires children using parentIndexes (parentIndex == -1 means root).
@@ -190,7 +190,7 @@ Existing ClickHouse schema (01-schema.sql):
Actually, use a different approach per the layering: add a `findRawById(String executionId)` method that returns `Optional<RawExecutionRow>` — a new record containing all parallel arrays. DetailService takes this and reconstructs. Create `RawExecutionRow` as a record in the detail package with all fields needed for reconstruction.
</action>
<verify>
<automated>cd C:/Users/Hendrik/Documents/projects/cameleer3-server && mvn compile -pl cameleer3-server-core</automated>
<automated>cd C:/Users/Hendrik/Documents/projects/cameleer-server && mvn compile -pl cameleer-server-core</automated>
</verify>
<done>Schema migration SQL exists, all core domain types compile, SearchEngine interface and SearchService defined, ExecutionRepository extended with query method, DetailService has tree reconstruction logic</done>
</task>
@@ -198,9 +198,9 @@ Existing ClickHouse schema (01-schema.sql):
<task type="auto" tdd="true">
<name>Task 2: Update ingestion to populate new columns and verify with integration test</name>
<files>
cameleer3-server-app/src/main/java/com/cameleer3/server/app/storage/ClickHouseExecutionRepository.java,
cameleer3-server-app/src/test/java/com/cameleer3/server/app/AbstractClickHouseIT.java,
cameleer3-server-app/src/test/java/com/cameleer3/server/app/storage/IngestionSchemaIT.java
cameleer-server-app/src/main/java/com/cameleer/server/app/storage/ClickHouseExecutionRepository.java,
cameleer-server-app/src/test/java/com/cameleer/server/app/AbstractClickHouseIT.java,
cameleer-server-app/src/test/java/com/cameleer/server/app/storage/IngestionSchemaIT.java
</files>
<behavior>
- Test: After inserting a RouteExecution with processors that have exchange snapshots and nested children, the route_executions row has non-empty exchange_bodies, exchange_headers, processor_depths (correct depth values), processor_parent_indexes (correct parent wiring), processor_input_bodies, processor_output_bodies, processor_input_headers, processor_output_headers, processor_diagram_node_ids, and diagram_content_hash columns
@@ -231,7 +231,7 @@ Existing ClickHouse schema (01-schema.sql):
- Verifies a second insertion with null snapshots succeeds with empty defaults
</action>
<verify>
<automated>cd C:/Users/Hendrik/Documents/projects/cameleer3-server && mvn test -pl cameleer3-server-app -Dtest=IngestionSchemaIT</automated>
<automated>cd C:/Users/Hendrik/Documents/projects/cameleer-server && mvn test -pl cameleer-server-app -Dtest=IngestionSchemaIT</automated>
</verify>
<done>All new columns populated correctly during ingestion, tree metadata (depth/parent) correct for nested processors, exchange data concatenated for search, existing ingestion tests still pass</done>
</task>
@@ -239,9 +239,9 @@ Existing ClickHouse schema (01-schema.sql):
</tasks>
<verification>
- `mvn compile -pl cameleer3-server-core` succeeds (core domain types compile)
- `mvn test -pl cameleer3-server-app -Dtest=IngestionSchemaIT` passes (new columns populated correctly)
- `mvn test -pl cameleer3-server-app` passes (all existing tests still green with schema extension)
- `mvn compile -pl cameleer-server-core` succeeds (core domain types compile)
- `mvn test -pl cameleer-server-app -Dtest=IngestionSchemaIT` passes (new columns populated correctly)
- `mvn test -pl cameleer-server-app` passes (all existing tests still green with schema extension)
</verification>
<success_criteria>

View File

@@ -22,22 +22,22 @@ tech-stack:
key-files:
created:
- clickhouse/init/02-search-columns.sql
- cameleer3-server-core/src/main/java/com/cameleer3/server/core/search/SearchRequest.java
- cameleer3-server-core/src/main/java/com/cameleer3/server/core/search/SearchResult.java
- cameleer3-server-core/src/main/java/com/cameleer3/server/core/search/ExecutionSummary.java
- cameleer3-server-core/src/main/java/com/cameleer3/server/core/search/SearchEngine.java
- cameleer3-server-core/src/main/java/com/cameleer3/server/core/search/SearchService.java
- cameleer3-server-core/src/main/java/com/cameleer3/server/core/detail/DetailService.java
- cameleer3-server-core/src/main/java/com/cameleer3/server/core/detail/ExecutionDetail.java
- cameleer3-server-core/src/main/java/com/cameleer3/server/core/detail/ProcessorNode.java
- cameleer3-server-core/src/main/java/com/cameleer3/server/core/detail/RawExecutionRow.java
- cameleer3-server-core/src/main/java/com/cameleer3/server/core/diagram/DiagramRenderer.java
- cameleer3-server-core/src/main/java/com/cameleer3/server/core/diagram/DiagramLayout.java
- cameleer3-server-app/src/test/java/com/cameleer3/server/app/storage/IngestionSchemaIT.java
- cameleer-server-core/src/main/java/com/cameleer/server/core/search/SearchRequest.java
- cameleer-server-core/src/main/java/com/cameleer/server/core/search/SearchResult.java
- cameleer-server-core/src/main/java/com/cameleer/server/core/search/ExecutionSummary.java
- cameleer-server-core/src/main/java/com/cameleer/server/core/search/SearchEngine.java
- cameleer-server-core/src/main/java/com/cameleer/server/core/search/SearchService.java
- cameleer-server-core/src/main/java/com/cameleer/server/core/detail/DetailService.java
- cameleer-server-core/src/main/java/com/cameleer/server/core/detail/ExecutionDetail.java
- cameleer-server-core/src/main/java/com/cameleer/server/core/detail/ProcessorNode.java
- cameleer-server-core/src/main/java/com/cameleer/server/core/detail/RawExecutionRow.java
- cameleer-server-core/src/main/java/com/cameleer/server/core/diagram/DiagramRenderer.java
- cameleer-server-core/src/main/java/com/cameleer/server/core/diagram/DiagramLayout.java
- cameleer-server-app/src/test/java/com/cameleer/server/app/storage/IngestionSchemaIT.java
modified:
- cameleer3-server-core/src/main/java/com/cameleer3/server/core/storage/ExecutionRepository.java
- cameleer3-server-app/src/main/java/com/cameleer3/server/app/storage/ClickHouseExecutionRepository.java
- cameleer3-server-app/src/test/java/com/cameleer3/server/app/AbstractClickHouseIT.java
- cameleer-server-core/src/main/java/com/cameleer/server/core/storage/ExecutionRepository.java
- cameleer-server-app/src/main/java/com/cameleer/server/app/storage/ClickHouseExecutionRepository.java
- cameleer-server-app/src/test/java/com/cameleer/server/app/AbstractClickHouseIT.java
key-decisions:
- "FlatProcessor record captures depth and parentIndex during DFS traversal"
@@ -86,21 +86,21 @@ Each task was committed atomically:
## Files Created/Modified
- `clickhouse/init/02-search-columns.sql` - ALTER TABLE adding 12 columns + 3 skip indexes
- `cameleer3-server-core/.../search/SearchRequest.java` - Immutable search criteria record with validation
- `cameleer3-server-core/.../search/SearchResult.java` - Paginated result envelope
- `cameleer3-server-core/.../search/ExecutionSummary.java` - Lightweight list-view DTO
- `cameleer3-server-core/.../search/SearchEngine.java` - Swappable search backend interface
- `cameleer3-server-core/.../search/SearchService.java` - Search orchestration layer
- `cameleer3-server-core/.../detail/DetailService.java` - Tree reconstruction from flat arrays
- `cameleer3-server-core/.../detail/ExecutionDetail.java` - Full execution detail record
- `cameleer3-server-core/.../detail/ProcessorNode.java` - Nested tree node (mutable children)
- `cameleer3-server-core/.../detail/RawExecutionRow.java` - DB-to-domain intermediate record
- `cameleer3-server-core/.../diagram/DiagramRenderer.java` - Diagram rendering interface (stub)
- `cameleer3-server-core/.../diagram/DiagramLayout.java` - JSON layout record (stub)
- `cameleer3-server-core/.../storage/ExecutionRepository.java` - Extended with findRawById
- `cameleer3-server-app/.../storage/ClickHouseExecutionRepository.java` - INSERT extended with 12 new columns
- `cameleer3-server-app/src/test/.../AbstractClickHouseIT.java` - Loads 02-search-columns.sql
- `cameleer3-server-app/src/test/.../storage/IngestionSchemaIT.java` - 3 integration tests
- `cameleer-server-core/.../search/SearchRequest.java` - Immutable search criteria record with validation
- `cameleer-server-core/.../search/SearchResult.java` - Paginated result envelope
- `cameleer-server-core/.../search/ExecutionSummary.java` - Lightweight list-view DTO
- `cameleer-server-core/.../search/SearchEngine.java` - Swappable search backend interface
- `cameleer-server-core/.../search/SearchService.java` - Search orchestration layer
- `cameleer-server-core/.../detail/DetailService.java` - Tree reconstruction from flat arrays
- `cameleer-server-core/.../detail/ExecutionDetail.java` - Full execution detail record
- `cameleer-server-core/.../detail/ProcessorNode.java` - Nested tree node (mutable children)
- `cameleer-server-core/.../detail/RawExecutionRow.java` - DB-to-domain intermediate record
- `cameleer-server-core/.../diagram/DiagramRenderer.java` - Diagram rendering interface (stub)
- `cameleer-server-core/.../diagram/DiagramLayout.java` - JSON layout record (stub)
- `cameleer-server-core/.../storage/ExecutionRepository.java` - Extended with findRawById
- `cameleer-server-app/.../storage/ClickHouseExecutionRepository.java` - INSERT extended with 12 new columns
- `cameleer-server-app/src/test/.../AbstractClickHouseIT.java` - Loads 02-search-columns.sql
- `cameleer-server-app/src/test/.../storage/IngestionSchemaIT.java` - 3 integration tests
## Decisions Made
- Used FlatProcessor record to carry depth and parentIndex alongside the ProcessorExecution during DFS flattening -- single pass, no separate traversal
@@ -116,9 +116,9 @@ Each task was committed atomically:
**1. [Rule 3 - Blocking] Created DiagramRenderer and DiagramLayout stub interfaces**
- **Found during:** Task 2 (compilation step)
- **Issue:** Pre-existing `ElkDiagramRenderer` in app module referenced `DiagramRenderer` and `DiagramLayout` interfaces that did not exist in core module, causing compilation failure
- **Fix:** Created minimal stub interfaces in `com.cameleer3.server.core.diagram` package
- **Fix:** Created minimal stub interfaces in `com.cameleer.server.core.diagram` package
- **Files created:** DiagramRenderer.java, DiagramLayout.java
- **Verification:** `mvn compile -pl cameleer3-server-core` and `mvn compile -pl cameleer3-server-app` succeed
- **Verification:** `mvn compile -pl cameleer-server-core` and `mvn compile -pl cameleer-server-app` succeed
- **Committed in:** f6ff279 (Task 2 GREEN commit)
**2. [Rule 1 - Bug] Fixed ClickHouse Array type handling in IngestionSchemaIT**

View File

@@ -5,16 +5,16 @@ type: execute
wave: 1
depends_on: []
files_modified:
- cameleer3-server-app/pom.xml
- cameleer3-server-core/src/main/java/com/cameleer3/server/core/diagram/DiagramRenderer.java
- cameleer3-server-core/src/main/java/com/cameleer3/server/core/diagram/DiagramLayout.java
- cameleer3-server-core/src/main/java/com/cameleer3/server/core/diagram/PositionedNode.java
- cameleer3-server-core/src/main/java/com/cameleer3/server/core/diagram/PositionedEdge.java
- cameleer3-server-app/src/main/java/com/cameleer3/server/app/diagram/ElkDiagramRenderer.java
- cameleer3-server-app/src/main/java/com/cameleer3/server/app/controller/DiagramRenderController.java
- cameleer3-server-app/src/main/java/com/cameleer3/server/app/config/DiagramBeanConfig.java
- cameleer3-server-app/src/test/java/com/cameleer3/server/app/controller/DiagramRenderControllerIT.java
- cameleer3-server-app/src/test/java/com/cameleer3/server/app/diagram/ElkDiagramRendererTest.java
- cameleer-server-app/pom.xml
- cameleer-server-core/src/main/java/com/cameleer/server/core/diagram/DiagramRenderer.java
- cameleer-server-core/src/main/java/com/cameleer/server/core/diagram/DiagramLayout.java
- cameleer-server-core/src/main/java/com/cameleer/server/core/diagram/PositionedNode.java
- cameleer-server-core/src/main/java/com/cameleer/server/core/diagram/PositionedEdge.java
- cameleer-server-app/src/main/java/com/cameleer/server/app/diagram/ElkDiagramRenderer.java
- cameleer-server-app/src/main/java/com/cameleer/server/app/controller/DiagramRenderController.java
- cameleer-server-app/src/main/java/com/cameleer/server/app/config/DiagramBeanConfig.java
- cameleer-server-app/src/test/java/com/cameleer/server/app/controller/DiagramRenderControllerIT.java
- cameleer-server-app/src/test/java/com/cameleer/server/app/diagram/ElkDiagramRendererTest.java
autonomous: true
requirements:
- DIAG-03
@@ -27,13 +27,13 @@ must_haves:
- "Node colors match the route-diagram-example.html style: blue endpoints, green processors, red error handlers, purple EIPs"
- "Nested processors (inside split, choice, try-catch) are rendered in compound/swimlane groups"
artifacts:
- path: "cameleer3-server-core/src/main/java/com/cameleer3/server/core/diagram/DiagramRenderer.java"
- path: "cameleer-server-core/src/main/java/com/cameleer/server/core/diagram/DiagramRenderer.java"
provides: "Renderer interface for SVG and JSON layout output"
exports: ["DiagramRenderer"]
- path: "cameleer3-server-app/src/main/java/com/cameleer3/server/app/diagram/ElkDiagramRenderer.java"
- path: "cameleer-server-app/src/main/java/com/cameleer/server/app/diagram/ElkDiagramRenderer.java"
provides: "ELK + JFreeSVG implementation of DiagramRenderer"
min_lines: 100
- path: "cameleer3-server-app/src/main/java/com/cameleer3/server/app/controller/DiagramRenderController.java"
- path: "cameleer-server-app/src/main/java/com/cameleer/server/app/controller/DiagramRenderController.java"
provides: "GET /api/v1/diagrams/{hash} with content negotiation"
exports: ["DiagramRenderController"]
key_links:
@@ -70,14 +70,14 @@ Output: DiagramRenderer interface in core, ElkDiagramRenderer implementation in
@.planning/phases/02-transaction-search-diagrams/02-CONTEXT.md
@.planning/phases/02-transaction-search-diagrams/02-RESEARCH.md
@cameleer3-server-core/src/main/java/com/cameleer3/server/core/storage/DiagramRepository.java
@cameleer3-server-app/src/main/java/com/cameleer3/server/app/storage/ClickHouseDiagramRepository.java
@cameleer3-server-app/pom.xml
@cameleer-server-core/src/main/java/com/cameleer/server/core/storage/DiagramRepository.java
@cameleer-server-app/src/main/java/com/cameleer/server/app/storage/ClickHouseDiagramRepository.java
@cameleer-server-app/pom.xml
<interfaces>
<!-- Existing interfaces needed -->
From cameleer3-server-core/.../storage/DiagramRepository.java:
From cameleer-server-core/.../storage/DiagramRepository.java:
```java
public interface DiagramRepository {
void store(RouteGraph graph);
@@ -86,7 +86,7 @@ public interface DiagramRepository {
}
```
From cameleer3-common (decompiled — diagram models):
From cameleer-common (decompiled — diagram models):
```java
// RouteGraph: routeId (String), nodes (List<RouteNode>), edges (List<RouteEdge>),
// processorNodeMapping (Map<String,String>)
@@ -114,14 +114,14 @@ NodeType color mapping (from CONTEXT.md, matching route-diagram-example.html):
<task type="auto">
<name>Task 1: Add ELK/JFreeSVG dependencies and create core diagram rendering interfaces</name>
<files>
cameleer3-server-app/pom.xml,
cameleer3-server-core/src/main/java/com/cameleer3/server/core/diagram/DiagramRenderer.java,
cameleer3-server-core/src/main/java/com/cameleer3/server/core/diagram/DiagramLayout.java,
cameleer3-server-core/src/main/java/com/cameleer3/server/core/diagram/PositionedNode.java,
cameleer3-server-core/src/main/java/com/cameleer3/server/core/diagram/PositionedEdge.java
cameleer-server-app/pom.xml,
cameleer-server-core/src/main/java/com/cameleer/server/core/diagram/DiagramRenderer.java,
cameleer-server-core/src/main/java/com/cameleer/server/core/diagram/DiagramLayout.java,
cameleer-server-core/src/main/java/com/cameleer/server/core/diagram/PositionedNode.java,
cameleer-server-core/src/main/java/com/cameleer/server/core/diagram/PositionedEdge.java
</files>
<action>
1. Add Maven dependencies to `cameleer3-server-app/pom.xml`:
1. Add Maven dependencies to `cameleer-server-app/pom.xml`:
```xml
<dependency>
<groupId>org.eclipse.elk</groupId>
@@ -140,7 +140,7 @@ NodeType color mapping (from CONTEXT.md, matching route-diagram-example.html):
</dependency>
```
2. Create core diagram rendering interfaces in `com.cameleer3.server.core.diagram`:
2. Create core diagram rendering interfaces in `com.cameleer.server.core.diagram`:
- `PositionedNode` record: id (String), label (String), type (String — NodeType name), x (double), y (double), width (double), height (double), children (List<PositionedNode> — for compound/swimlane groups). JSON-serializable for the JSON layout response.
@@ -154,7 +154,7 @@ NodeType color mapping (from CONTEXT.md, matching route-diagram-example.html):
Both methods take a RouteGraph and produce output. The interface lives in core so it can be swapped (e.g., for a different renderer).
</action>
<verify>
<automated>cd C:/Users/Hendrik/Documents/projects/cameleer3-server && mvn compile -pl cameleer3-server-core && mvn dependency:resolve -pl cameleer3-server-app -q</automated>
<automated>cd C:/Users/Hendrik/Documents/projects/cameleer-server && mvn compile -pl cameleer-server-core && mvn dependency:resolve -pl cameleer-server-app -q</automated>
</verify>
<done>ELK and JFreeSVG dependencies resolve, DiagramRenderer interface and layout DTOs compile in core module</done>
</task>
@@ -162,11 +162,11 @@ NodeType color mapping (from CONTEXT.md, matching route-diagram-example.html):
<task type="auto" tdd="true">
<name>Task 2: Implement ElkDiagramRenderer, DiagramRenderController, and integration tests</name>
<files>
cameleer3-server-app/src/main/java/com/cameleer3/server/app/diagram/ElkDiagramRenderer.java,
cameleer3-server-app/src/main/java/com/cameleer3/server/app/controller/DiagramRenderController.java,
cameleer3-server-app/src/main/java/com/cameleer3/server/app/config/DiagramBeanConfig.java,
cameleer3-server-app/src/test/java/com/cameleer3/server/app/diagram/ElkDiagramRendererTest.java,
cameleer3-server-app/src/test/java/com/cameleer3/server/app/controller/DiagramRenderControllerIT.java
cameleer-server-app/src/main/java/com/cameleer/server/app/diagram/ElkDiagramRenderer.java,
cameleer-server-app/src/main/java/com/cameleer/server/app/controller/DiagramRenderController.java,
cameleer-server-app/src/main/java/com/cameleer/server/app/config/DiagramBeanConfig.java,
cameleer-server-app/src/test/java/com/cameleer/server/app/diagram/ElkDiagramRendererTest.java,
cameleer-server-app/src/test/java/com/cameleer/server/app/controller/DiagramRenderControllerIT.java
</files>
<behavior>
- Unit test: ElkDiagramRenderer.renderSvg with a simple 3-node graph (from->process->to) produces valid SVG containing svg element, rect elements for nodes, line/path elements for edges
@@ -179,7 +179,7 @@ NodeType color mapping (from CONTEXT.md, matching route-diagram-example.html):
- Integration test: GET /api/v1/diagrams/{hash} with no Accept preference defaults to SVG
</behavior>
<action>
1. Create `ElkDiagramRenderer` implementing `DiagramRenderer` in `com.cameleer3.server.app.diagram`:
1. Create `ElkDiagramRenderer` implementing `DiagramRenderer` in `com.cameleer.server.app.diagram`:
**Layout phase (shared by both SVG and JSON):**
- Convert RouteGraph to ELK graph: create ElkNode root, set properties for LayeredOptions.ALGORITHM_ID, Direction.DOWN (top-to-bottom per user decision), spacing 40px node-node, 20px edge-node.
@@ -206,10 +206,10 @@ NodeType color mapping (from CONTEXT.md, matching route-diagram-example.html):
**JSON layout (layoutJson):**
- Run layout phase, return DiagramLayout directly. Jackson will serialize it to JSON.
2. Create `DiagramBeanConfig` in `com.cameleer3.server.app.config`:
2. Create `DiagramBeanConfig` in `com.cameleer.server.app.config`:
- @Configuration class that creates DiagramRenderer bean (ElkDiagramRenderer) and SearchService bean wiring (prepare for Plan 03).
3. Create `DiagramRenderController` in `com.cameleer3.server.app.controller`:
3. Create `DiagramRenderController` in `com.cameleer.server.app.controller`:
- `GET /api/v1/diagrams/{contentHash}/render` — renders the diagram
- Inject DiagramRepository and DiagramRenderer.
- Look up RouteGraph via `diagramRepository.findByContentHash(contentHash)`. If empty, return 404.
@@ -233,7 +233,7 @@ NodeType color mapping (from CONTEXT.md, matching route-diagram-example.html):
- GET /api/v1/diagrams/{hash}/render with no Accept header -> assert SVG response (default).
</action>
<verify>
<automated>cd C:/Users/Hendrik/Documents/projects/cameleer3-server && mvn test -pl cameleer3-server-app -Dtest="ElkDiagramRendererTest,DiagramRenderControllerIT"</automated>
<automated>cd C:/Users/Hendrik/Documents/projects/cameleer-server && mvn test -pl cameleer-server-app -Dtest="ElkDiagramRendererTest,DiagramRenderControllerIT"</automated>
</verify>
<done>Diagram rendering produces color-coded top-to-bottom SVG and JSON layout, content negotiation works via Accept header, compound nodes group nested processors, all tests pass</done>
</task>
@@ -241,8 +241,8 @@ NodeType color mapping (from CONTEXT.md, matching route-diagram-example.html):
</tasks>
<verification>
- `mvn test -pl cameleer3-server-app -Dtest=ElkDiagramRendererTest` passes (unit tests for layout and SVG)
- `mvn test -pl cameleer3-server-app -Dtest=DiagramRenderControllerIT` passes (integration tests for REST endpoint)
- `mvn test -pl cameleer-server-app -Dtest=ElkDiagramRendererTest` passes (unit tests for layout and SVG)
- `mvn test -pl cameleer-server-app -Dtest=DiagramRenderControllerIT` passes (integration tests for REST endpoint)
- `mvn clean verify` passes (all existing tests still green)
- SVG output contains color-coded nodes matching the NodeType color scheme
</verification>

View File

@@ -21,17 +21,17 @@ tech-stack:
key-files:
created:
- cameleer3-server-core/src/main/java/com/cameleer3/server/core/diagram/DiagramRenderer.java
- cameleer3-server-core/src/main/java/com/cameleer3/server/core/diagram/DiagramLayout.java
- cameleer3-server-core/src/main/java/com/cameleer3/server/core/diagram/PositionedNode.java
- cameleer3-server-core/src/main/java/com/cameleer3/server/core/diagram/PositionedEdge.java
- cameleer3-server-app/src/main/java/com/cameleer3/server/app/diagram/ElkDiagramRenderer.java
- cameleer3-server-app/src/main/java/com/cameleer3/server/app/controller/DiagramRenderController.java
- cameleer3-server-app/src/main/java/com/cameleer3/server/app/config/DiagramBeanConfig.java
- cameleer3-server-app/src/test/java/com/cameleer3/server/app/diagram/ElkDiagramRendererTest.java
- cameleer3-server-app/src/test/java/com/cameleer3/server/app/controller/DiagramRenderControllerIT.java
- cameleer-server-core/src/main/java/com/cameleer/server/core/diagram/DiagramRenderer.java
- cameleer-server-core/src/main/java/com/cameleer/server/core/diagram/DiagramLayout.java
- cameleer-server-core/src/main/java/com/cameleer/server/core/diagram/PositionedNode.java
- cameleer-server-core/src/main/java/com/cameleer/server/core/diagram/PositionedEdge.java
- cameleer-server-app/src/main/java/com/cameleer/server/app/diagram/ElkDiagramRenderer.java
- cameleer-server-app/src/main/java/com/cameleer/server/app/controller/DiagramRenderController.java
- cameleer-server-app/src/main/java/com/cameleer/server/app/config/DiagramBeanConfig.java
- cameleer-server-app/src/test/java/com/cameleer/server/app/diagram/ElkDiagramRendererTest.java
- cameleer-server-app/src/test/java/com/cameleer/server/app/controller/DiagramRenderControllerIT.java
modified:
- cameleer3-server-app/pom.xml
- cameleer-server-app/pom.xml
key-decisions:
- "Used ELK layered algorithm with top-to-bottom direction for route diagram layout"
@@ -78,16 +78,16 @@ Each task was committed atomically:
2. **Task 2: Implement ElkDiagramRenderer, DiagramRenderController, and integration tests** - `c1bc32d` (feat, TDD)
## Files Created/Modified
- `cameleer3-server-core/.../diagram/DiagramRenderer.java` - Renderer interface with renderSvg and layoutJson
- `cameleer3-server-core/.../diagram/DiagramLayout.java` - Layout record (width, height, nodes, edges)
- `cameleer3-server-core/.../diagram/PositionedNode.java` - Node record with position, dimensions, children
- `cameleer3-server-core/.../diagram/PositionedEdge.java` - Edge record with waypoints
- `cameleer3-server-app/.../diagram/ElkDiagramRenderer.java` - ELK + JFreeSVG implementation (~400 lines)
- `cameleer3-server-app/.../controller/DiagramRenderController.java` - GET /api/v1/diagrams/{hash}/render
- `cameleer3-server-app/.../config/DiagramBeanConfig.java` - Spring bean wiring for DiagramRenderer
- `cameleer3-server-app/pom.xml` - Added ELK, JFreeSVG, xtext dependencies
- `cameleer3-server-app/.../diagram/ElkDiagramRendererTest.java` - 11 unit tests
- `cameleer3-server-app/.../controller/DiagramRenderControllerIT.java` - 4 integration tests
- `cameleer-server-core/.../diagram/DiagramRenderer.java` - Renderer interface with renderSvg and layoutJson
- `cameleer-server-core/.../diagram/DiagramLayout.java` - Layout record (width, height, nodes, edges)
- `cameleer-server-core/.../diagram/PositionedNode.java` - Node record with position, dimensions, children
- `cameleer-server-core/.../diagram/PositionedEdge.java` - Edge record with waypoints
- `cameleer-server-app/.../diagram/ElkDiagramRenderer.java` - ELK + JFreeSVG implementation (~400 lines)
- `cameleer-server-app/.../controller/DiagramRenderController.java` - GET /api/v1/diagrams/{hash}/render
- `cameleer-server-app/.../config/DiagramBeanConfig.java` - Spring bean wiring for DiagramRenderer
- `cameleer-server-app/pom.xml` - Added ELK, JFreeSVG, xtext dependencies
- `cameleer-server-app/.../diagram/ElkDiagramRendererTest.java` - 11 unit tests
- `cameleer-server-app/.../controller/DiagramRenderControllerIT.java` - 4 integration tests
## Decisions Made
- Used ELK layered algorithm (org.eclipse.elk.alg.layered) -- well-maintained, supports compound nodes natively
@@ -104,7 +104,7 @@ Each task was committed atomically:
- **Found during:** Task 2 (ElkDiagramRenderer implementation)
- **Issue:** ELK 0.11.0 LayeredMetaDataProvider references org.eclipse.xtext.xbase.lib.CollectionLiterals at class initialization, causing NoClassDefFoundError
- **Fix:** Added org.eclipse.xtext:org.eclipse.xtext.xbase.lib:2.37.0 dependency to app pom.xml
- **Files modified:** cameleer3-server-app/pom.xml
- **Files modified:** cameleer-server-app/pom.xml
- **Verification:** All unit tests pass after adding dependency
- **Committed in:** c1bc32d (Task 2 commit)
@@ -119,7 +119,7 @@ Each task was committed atomically:
**3. [Rule 1 - Bug] Adapted to actual NodeType enum naming (EIP_ prefix)**
- **Found during:** Task 2 (ElkDiagramRenderer implementation)
- **Issue:** Plan referenced CHOICE, SPLIT etc. but actual enum values are EIP_CHOICE, EIP_SPLIT etc.
- **Fix:** Used correct enum names from decompiled cameleer3-common jar in all color mapping sets
- **Fix:** Used correct enum names from decompiled cameleer-common jar in all color mapping sets
- **Files modified:** ElkDiagramRenderer.java
- **Verification:** Unit tests verify correct colors for endpoint and processor nodes
- **Committed in:** c1bc32d (Task 2 commit)

View File

@@ -6,14 +6,14 @@ wave: 2
depends_on:
- "02-01"
files_modified:
- cameleer3-server-app/src/main/java/com/cameleer3/server/app/search/ClickHouseSearchEngine.java
- cameleer3-server-app/src/main/java/com/cameleer3/server/app/controller/SearchController.java
- cameleer3-server-app/src/main/java/com/cameleer3/server/app/controller/DetailController.java
- cameleer3-server-app/src/main/java/com/cameleer3/server/app/config/SearchBeanConfig.java
- cameleer3-server-app/src/main/java/com/cameleer3/server/app/storage/ClickHouseExecutionRepository.java
- cameleer3-server-app/src/test/java/com/cameleer3/server/app/controller/SearchControllerIT.java
- cameleer3-server-app/src/test/java/com/cameleer3/server/app/controller/DetailControllerIT.java
- cameleer3-server-core/src/test/java/com/cameleer3/server/core/detail/TreeReconstructionTest.java
- cameleer-server-app/src/main/java/com/cameleer/server/app/search/ClickHouseSearchEngine.java
- cameleer-server-app/src/main/java/com/cameleer/server/app/controller/SearchController.java
- cameleer-server-app/src/main/java/com/cameleer/server/app/controller/DetailController.java
- cameleer-server-app/src/main/java/com/cameleer/server/app/config/SearchBeanConfig.java
- cameleer-server-app/src/main/java/com/cameleer/server/app/storage/ClickHouseExecutionRepository.java
- cameleer-server-app/src/test/java/com/cameleer/server/app/controller/SearchControllerIT.java
- cameleer-server-app/src/test/java/com/cameleer/server/app/controller/DetailControllerIT.java
- cameleer-server-core/src/test/java/com/cameleer/server/core/detail/TreeReconstructionTest.java
autonomous: true
requirements:
- SRCH-01
@@ -35,16 +35,16 @@ must_haves:
- "Detail response includes diagramContentHash for linking to diagram endpoint"
- "Search results are paginated with total count, offset, and limit"
artifacts:
- path: "cameleer3-server-app/src/main/java/com/cameleer3/server/app/search/ClickHouseSearchEngine.java"
- path: "cameleer-server-app/src/main/java/com/cameleer/server/app/search/ClickHouseSearchEngine.java"
provides: "ClickHouse implementation of SearchEngine with dynamic WHERE building"
min_lines: 80
- path: "cameleer3-server-app/src/main/java/com/cameleer3/server/app/controller/SearchController.java"
- path: "cameleer-server-app/src/main/java/com/cameleer/server/app/controller/SearchController.java"
provides: "GET + POST /api/v1/search/executions endpoints"
exports: ["SearchController"]
- path: "cameleer3-server-app/src/main/java/com/cameleer3/server/app/controller/DetailController.java"
- path: "cameleer-server-app/src/main/java/com/cameleer/server/app/controller/DetailController.java"
provides: "GET /api/v1/executions/{id} endpoint returning nested tree"
exports: ["DetailController"]
- path: "cameleer3-server-app/src/test/java/com/cameleer3/server/app/controller/SearchControllerIT.java"
- path: "cameleer-server-app/src/test/java/com/cameleer/server/app/controller/SearchControllerIT.java"
provides: "Integration tests for all search filter combinations"
min_lines: 100
key_links:
@@ -92,13 +92,13 @@ Output: SearchController (GET + POST), DetailController, ClickHouseSearchEngine,
@clickhouse/init/01-schema.sql
@clickhouse/init/02-search-columns.sql
@cameleer3-server-app/src/main/java/com/cameleer3/server/app/storage/ClickHouseExecutionRepository.java
@cameleer3-server-app/src/main/java/com/cameleer3/server/app/controller/ExecutionController.java
@cameleer-server-app/src/main/java/com/cameleer/server/app/storage/ClickHouseExecutionRepository.java
@cameleer-server-app/src/main/java/com/cameleer/server/app/controller/ExecutionController.java
<interfaces>
<!-- Core types created by Plan 01 — executor reads these from plan 01 SUMMARY -->
From cameleer3-server-core/.../search/SearchEngine.java:
From cameleer-server-core/.../search/SearchEngine.java:
```java
public interface SearchEngine {
SearchResult<ExecutionSummary> search(SearchRequest request);
@@ -106,7 +106,7 @@ public interface SearchEngine {
}
```
From cameleer3-server-core/.../search/SearchRequest.java:
From cameleer-server-core/.../search/SearchRequest.java:
```java
public record SearchRequest(
String status, // nullable, filter by ExecutionStatus name
@@ -124,14 +124,14 @@ public record SearchRequest(
) { /* compact constructor with validation */ }
```
From cameleer3-server-core/.../search/SearchResult.java:
From cameleer-server-core/.../search/SearchResult.java:
```java
public record SearchResult<T>(List<T> data, long total, int offset, int limit) {
public static <T> SearchResult<T> empty(int offset, int limit);
}
```
From cameleer3-server-core/.../search/ExecutionSummary.java:
From cameleer-server-core/.../search/ExecutionSummary.java:
```java
public record ExecutionSummary(
String executionId, String routeId, String agentId, String status,
@@ -140,7 +140,7 @@ public record ExecutionSummary(
) {}
```
From cameleer3-server-core/.../detail/DetailService.java:
From cameleer-server-core/.../detail/DetailService.java:
```java
public class DetailService {
// Constructor takes ExecutionRepository (or a query interface)
@@ -149,7 +149,7 @@ public class DetailService {
}
```
From cameleer3-server-core/.../detail/ExecutionDetail.java:
From cameleer-server-core/.../detail/ExecutionDetail.java:
```java
public record ExecutionDetail(
String executionId, String routeId, String agentId, String status,
@@ -160,7 +160,7 @@ public record ExecutionDetail(
) {}
```
From cameleer3-server-core/.../detail/ProcessorNode.java:
From cameleer-server-core/.../detail/ProcessorNode.java:
```java
public record ProcessorNode(
String processorId, String processorType, String status,
@@ -201,10 +201,10 @@ Established controller pattern (from Phase 1):
<task type="auto" tdd="true">
<name>Task 1: ClickHouseSearchEngine, SearchController, and search integration tests</name>
<files>
cameleer3-server-app/src/main/java/com/cameleer3/server/app/search/ClickHouseSearchEngine.java,
cameleer3-server-app/src/main/java/com/cameleer3/server/app/controller/SearchController.java,
cameleer3-server-app/src/main/java/com/cameleer3/server/app/config/SearchBeanConfig.java,
cameleer3-server-app/src/test/java/com/cameleer3/server/app/controller/SearchControllerIT.java
cameleer-server-app/src/main/java/com/cameleer/server/app/search/ClickHouseSearchEngine.java,
cameleer-server-app/src/main/java/com/cameleer/server/app/controller/SearchController.java,
cameleer-server-app/src/main/java/com/cameleer/server/app/config/SearchBeanConfig.java,
cameleer-server-app/src/test/java/com/cameleer/server/app/controller/SearchControllerIT.java
</files>
<behavior>
- Test searchByStatus: Insert 3 executions (COMPLETED, FAILED, RUNNING). GET /api/v1/search/executions?status=FAILED returns only the FAILED execution. Response has envelope: {"data":[...],"total":1,"offset":0,"limit":50}
@@ -221,7 +221,7 @@ Established controller pattern (from Phase 1):
- Test emptyResults: Search with no matches returns {"data":[],"total":0,"offset":0,"limit":50}
</behavior>
<action>
1. Create `ClickHouseSearchEngine` in `com.cameleer3.server.app.search`:
1. Create `ClickHouseSearchEngine` in `com.cameleer.server.app.search`:
- Implements SearchEngine interface from core module.
- Constructor takes JdbcTemplate.
- `search(SearchRequest)` method:
@@ -244,13 +244,13 @@ Established controller pattern (from Phase 1):
- `escapeLike(String)` utility: escape `%`, `_`, `\` characters in user input to prevent LIKE injection. Replace `\` with `\\`, `%` with `\%`, `_` with `\_`.
- `count(SearchRequest)` method: same WHERE building, just count query.
2. Create `SearchBeanConfig` in `com.cameleer3.server.app.config`:
2. Create `SearchBeanConfig` in `com.cameleer.server.app.config`:
- @Configuration class that creates:
- `ClickHouseSearchEngine` bean (takes JdbcTemplate)
- `SearchService` bean (takes SearchEngine)
- `DetailService` bean (takes the execution query interface from Plan 01)
3. Create `SearchController` in `com.cameleer3.server.app.controller`:
3. Create `SearchController` in `com.cameleer.server.app.controller`:
- Inject SearchService.
- `GET /api/v1/search/executions` with @RequestParam for basic filters:
- status (optional String)
@@ -274,7 +274,7 @@ Established controller pattern (from Phase 1):
- Assert response structure matches the envelope format.
</action>
<verify>
<automated>cd C:/Users/Hendrik/Documents/projects/cameleer3-server && mvn test -pl cameleer3-server-app -Dtest=SearchControllerIT</automated>
<automated>cd C:/Users/Hendrik/Documents/projects/cameleer-server && mvn test -pl cameleer-server-app -Dtest=SearchControllerIT</automated>
</verify>
<done>All search filter types work independently and in combination, response envelope has correct format, pagination works correctly, full-text search finds matches in all text fields, LIKE patterns are properly escaped</done>
</task>
@@ -282,10 +282,10 @@ Established controller pattern (from Phase 1):
<task type="auto" tdd="true">
<name>Task 2: DetailController, tree reconstruction, exchange snapshot endpoint, and integration tests</name>
<files>
cameleer3-server-app/src/main/java/com/cameleer3/server/app/storage/ClickHouseExecutionRepository.java,
cameleer3-server-app/src/main/java/com/cameleer3/server/app/controller/DetailController.java,
cameleer3-server-core/src/test/java/com/cameleer3/server/core/detail/TreeReconstructionTest.java,
cameleer3-server-app/src/test/java/com/cameleer3/server/app/controller/DetailControllerIT.java
cameleer-server-app/src/main/java/com/cameleer/server/app/storage/ClickHouseExecutionRepository.java,
cameleer-server-app/src/main/java/com/cameleer/server/app/controller/DetailController.java,
cameleer-server-core/src/test/java/com/cameleer/server/core/detail/TreeReconstructionTest.java,
cameleer-server-app/src/test/java/com/cameleer/server/app/controller/DetailControllerIT.java
</files>
<behavior>
- Unit test: reconstructTree with [root, child, grandchild], depths=[0,1,2], parents=[-1,0,1] produces single root with one child that has one grandchild
@@ -308,7 +308,7 @@ Established controller pattern (from Phase 1):
- Add `findRawById(String executionId)` method that queries all columns from route_executions WHERE execution_id = ?. Return Optional<RawExecutionRow> (use the record created in Plan 01 or create it here if needed). The RawExecutionRow should contain ALL columns including the parallel arrays for processors.
- Add `findProcessorSnapshot(String executionId, int processorIndex)` method: queries processor_input_bodies[index+1], processor_output_bodies[index+1], processor_input_headers[index+1], processor_output_headers[index+1] for the given execution. Returns a DTO with inputBody, outputBody, inputHeaders, outputHeaders. ClickHouse arrays are 1-indexed in SQL, so add 1 to the Java 0-based index.
3. Create `DetailController` in `com.cameleer3.server.app.controller`:
3. Create `DetailController` in `com.cameleer.server.app.controller`:
- Inject DetailService.
- `GET /api/v1/executions/{executionId}`: call detailService.getDetail(executionId). If empty, return 404. Otherwise return 200 with ExecutionDetail JSON. The processors field is a nested tree of ProcessorNode objects.
- `GET /api/v1/executions/{executionId}/processors/{index}/snapshot`: call repository's findProcessorSnapshot. If execution not found or index out of bounds, return 404. Return JSON with inputBody, outputBody, inputHeaders, outputHeaders. Per user decision: exchange snapshot data fetched separately per processor, not inlined in detail response.
@@ -323,7 +323,7 @@ Established controller pattern (from Phase 1):
- Test GET /api/v1/executions/{id}/processors/999/snapshot: returns 404 for out-of-bounds index.
</action>
<verify>
<automated>cd C:/Users/Hendrik/Documents/projects/cameleer3-server && mvn test -pl cameleer3-server-core -Dtest=TreeReconstructionTest && mvn test -pl cameleer3-server-app -Dtest=DetailControllerIT</automated>
<automated>cd C:/Users/Hendrik/Documents/projects/cameleer-server && mvn test -pl cameleer-server-core -Dtest=TreeReconstructionTest && mvn test -pl cameleer-server-app -Dtest=DetailControllerIT</automated>
</verify>
<done>Tree reconstruction correctly rebuilds nested processor trees from flat arrays, detail endpoint returns nested tree with all fields, snapshot endpoint returns per-processor exchange data, diagram hash included in detail response, all tests pass</done>
</task>
@@ -331,9 +331,9 @@ Established controller pattern (from Phase 1):
</tasks>
<verification>
- `mvn test -pl cameleer3-server-core -Dtest=TreeReconstructionTest` passes (unit test for tree rebuild)
- `mvn test -pl cameleer3-server-app -Dtest=SearchControllerIT` passes (all search filters)
- `mvn test -pl cameleer3-server-app -Dtest=DetailControllerIT` passes (detail + snapshot)
- `mvn test -pl cameleer-server-core -Dtest=TreeReconstructionTest` passes (unit test for tree rebuild)
- `mvn test -pl cameleer-server-app -Dtest=SearchControllerIT` passes (all search filters)
- `mvn test -pl cameleer-server-app -Dtest=DetailControllerIT` passes (detail + snapshot)
- `mvn clean verify` passes (full suite green)
</verification>

View File

@@ -21,16 +21,16 @@ tech-stack:
key-files:
created:
- cameleer3-server-app/src/main/java/com/cameleer3/server/app/search/ClickHouseSearchEngine.java
- cameleer3-server-app/src/main/java/com/cameleer3/server/app/controller/SearchController.java
- cameleer3-server-app/src/main/java/com/cameleer3/server/app/controller/DetailController.java
- cameleer3-server-app/src/main/java/com/cameleer3/server/app/config/SearchBeanConfig.java
- cameleer3-server-app/src/test/java/com/cameleer3/server/app/controller/SearchControllerIT.java
- cameleer3-server-app/src/test/java/com/cameleer3/server/app/controller/DetailControllerIT.java
- cameleer3-server-core/src/test/java/com/cameleer3/server/core/detail/TreeReconstructionTest.java
- cameleer-server-app/src/main/java/com/cameleer/server/app/search/ClickHouseSearchEngine.java
- cameleer-server-app/src/main/java/com/cameleer/server/app/controller/SearchController.java
- cameleer-server-app/src/main/java/com/cameleer/server/app/controller/DetailController.java
- cameleer-server-app/src/main/java/com/cameleer/server/app/config/SearchBeanConfig.java
- cameleer-server-app/src/test/java/com/cameleer/server/app/controller/SearchControllerIT.java
- cameleer-server-app/src/test/java/com/cameleer/server/app/controller/DetailControllerIT.java
- cameleer-server-core/src/test/java/com/cameleer/server/core/detail/TreeReconstructionTest.java
modified:
- cameleer3-server-app/src/main/java/com/cameleer3/server/app/storage/ClickHouseExecutionRepository.java
- cameleer3-server-core/pom.xml
- cameleer-server-app/src/main/java/com/cameleer/server/app/storage/ClickHouseExecutionRepository.java
- cameleer-server-core/pom.xml
key-decisions:
- "Search tests use correlationId scoping and >= assertions for shared ClickHouse isolation"
@@ -77,15 +77,15 @@ Each task was committed atomically:
3. **Task 2 fix: Test isolation for shared ClickHouse** - `079dce5` (fix)
## Files Created/Modified
- `cameleer3-server-app/.../search/ClickHouseSearchEngine.java` - Dynamic SQL search with LIKE escape, implements SearchEngine
- `cameleer3-server-app/.../controller/SearchController.java` - GET + POST /api/v1/search/executions endpoints
- `cameleer3-server-app/.../controller/DetailController.java` - GET /api/v1/executions/{id} and processor snapshot endpoints
- `cameleer3-server-app/.../config/SearchBeanConfig.java` - Wires SearchEngine, SearchService, DetailService beans
- `cameleer3-server-app/.../storage/ClickHouseExecutionRepository.java` - Added findRawById, findProcessorSnapshot, array extraction helpers
- `cameleer3-server-app/.../controller/SearchControllerIT.java` - 13 integration tests for search
- `cameleer3-server-app/.../controller/DetailControllerIT.java` - 6 integration tests for detail/snapshot
- `cameleer3-server-core/.../detail/TreeReconstructionTest.java` - 5 unit tests for tree reconstruction
- `cameleer3-server-core/pom.xml` - Added assertj and mockito test dependencies
- `cameleer-server-app/.../search/ClickHouseSearchEngine.java` - Dynamic SQL search with LIKE escape, implements SearchEngine
- `cameleer-server-app/.../controller/SearchController.java` - GET + POST /api/v1/search/executions endpoints
- `cameleer-server-app/.../controller/DetailController.java` - GET /api/v1/executions/{id} and processor snapshot endpoints
- `cameleer-server-app/.../config/SearchBeanConfig.java` - Wires SearchEngine, SearchService, DetailService beans
- `cameleer-server-app/.../storage/ClickHouseExecutionRepository.java` - Added findRawById, findProcessorSnapshot, array extraction helpers
- `cameleer-server-app/.../controller/SearchControllerIT.java` - 13 integration tests for search
- `cameleer-server-app/.../controller/DetailControllerIT.java` - 6 integration tests for detail/snapshot
- `cameleer-server-core/.../detail/TreeReconstructionTest.java` - 5 unit tests for tree reconstruction
- `cameleer-server-core/pom.xml` - Added assertj and mockito test dependencies
## Decisions Made
- Search tests use correlationId scoping and >= assertions to remain stable when other test classes seed data in the shared ClickHouse container
@@ -99,8 +99,8 @@ Each task was committed atomically:
**1. [Rule 3 - Blocking] Added assertj and mockito test dependencies to core module**
- **Found during:** Task 2 (TreeReconstructionTest compilation)
- **Issue:** Core module only had JUnit Jupiter as test dependency, TreeReconstructionTest needed assertj for assertions and mockito for mock(ExecutionRepository.class)
- **Fix:** Added assertj-core and mockito-core test-scoped dependencies to cameleer3-server-core/pom.xml
- **Files modified:** cameleer3-server-core/pom.xml
- **Fix:** Added assertj-core and mockito-core test-scoped dependencies to cameleer-server-core/pom.xml
- **Files modified:** cameleer-server-core/pom.xml
- **Committed in:** 0615a98 (Task 2 commit)
**2. [Rule 1 - Bug] Fixed search tests failing with shared ClickHouse data**

View File

@@ -5,9 +5,9 @@ type: execute
wave: 1
depends_on: ["02-01", "02-02", "02-03"]
files_modified:
- cameleer3-server-app/src/main/java/com/cameleer3/server/app/storage/ClickHouseExecutionRepository.java
- cameleer3-server-app/pom.xml
- cameleer3-server-app/src/test/java/com/cameleer3/server/app/storage/DiagramLinkingIT.java
- cameleer-server-app/src/main/java/com/cameleer/server/app/storage/ClickHouseExecutionRepository.java
- cameleer-server-app/pom.xml
- cameleer-server-app/src/test/java/com/cameleer/server/app/storage/DiagramLinkingIT.java
autonomous: true
gap_closure: true
requirements: ["DIAG-02"]
@@ -17,13 +17,13 @@ must_haves:
- "Each transaction links to the RouteGraph version that was active at execution time"
- "Full test suite passes with mvn clean verify (no classloader failures)"
artifacts:
- path: "cameleer3-server-app/src/main/java/com/cameleer3/server/app/storage/ClickHouseExecutionRepository.java"
- path: "cameleer-server-app/src/main/java/com/cameleer/server/app/storage/ClickHouseExecutionRepository.java"
provides: "Diagram hash lookup during batch insert"
contains: "findContentHashForRoute"
- path: "cameleer3-server-app/pom.xml"
- path: "cameleer-server-app/pom.xml"
provides: "Surefire fork configuration isolating ELK classloader"
contains: "reuseForks"
- path: "cameleer3-server-app/src/test/java/com/cameleer3/server/app/storage/DiagramLinkingIT.java"
- path: "cameleer-server-app/src/test/java/com/cameleer/server/app/storage/DiagramLinkingIT.java"
provides: "Integration test proving diagram hash is stored during ingestion"
key_links:
- from: "ClickHouseExecutionRepository"
@@ -57,18 +57,18 @@ Prior plan summaries (needed — touches same files):
<interfaces>
<!-- Key types and contracts the executor needs. -->
From cameleer3-server-core/.../storage/DiagramRepository.java:
From cameleer-server-core/.../storage/DiagramRepository.java:
```java
Optional<String> findContentHashForRoute(String routeId, String agentId);
```
From cameleer3-server-app/.../storage/ClickHouseExecutionRepository.java (line 141):
From cameleer-server-app/.../storage/ClickHouseExecutionRepository.java (line 141):
```java
ps.setString(col++, ""); // diagram_content_hash (wired later)
```
The class is @Repository annotated, constructor takes JdbcTemplate only. It needs DiagramRepository injected to perform the lookup.
From cameleer3-server-app/.../storage/ClickHouseDiagramRepository.java:
From cameleer-server-app/.../storage/ClickHouseDiagramRepository.java:
```java
@Repository
public class ClickHouseDiagramRepository implements DiagramRepository {
@@ -83,9 +83,9 @@ public class ClickHouseDiagramRepository implements DiagramRepository {
<task type="auto" tdd="true">
<name>Task 1: Populate diagram_content_hash during ingestion and fix Surefire forks</name>
<files>
cameleer3-server-app/src/main/java/com/cameleer3/server/app/storage/ClickHouseExecutionRepository.java,
cameleer3-server-app/pom.xml,
cameleer3-server-app/src/test/java/com/cameleer3/server/app/storage/DiagramLinkingIT.java
cameleer-server-app/src/main/java/com/cameleer/server/app/storage/ClickHouseExecutionRepository.java,
cameleer-server-app/pom.xml,
cameleer-server-app/src/test/java/com/cameleer/server/app/storage/DiagramLinkingIT.java
</files>
<behavior>
- Test 1: When a RouteGraph is ingested before a RouteExecution for the same routeId+agentId, the execution's diagram_content_hash column contains the SHA-256 hash of the diagram (not empty string)
@@ -117,7 +117,7 @@ public class ClickHouseDiagramRepository implements DiagramRepository {
**Gap 2 — Surefire classloader isolation:**
5. In `cameleer3-server-app/pom.xml`, add a `<build><plugins>` section (after the existing `spring-boot-maven-plugin`) with `maven-surefire-plugin` configuration:
5. In `cameleer-server-app/pom.xml`, add a `<build><plugins>` section (after the existing `spring-boot-maven-plugin`) with `maven-surefire-plugin` configuration:
```xml
<plugin>
<groupId>org.apache.maven.plugins</groupId>
@@ -131,12 +131,12 @@ public class ClickHouseDiagramRepository implements DiagramRepository {
This forces Surefire to fork a fresh JVM for each test class, isolating ELK's static initializer (LayeredMetaDataProvider + xtext CollectionLiterals) from Spring Boot's classloader. Trade-off: slightly slower test execution, but correct results.
</action>
<verify>
<automated>cd C:/Users/Hendrik/Documents/projects/cameleer3-server && mvn clean verify -pl cameleer3-server-app -am 2>&1 | tail -30</automated>
<automated>cd C:/Users/Hendrik/Documents/projects/cameleer-server && mvn clean verify -pl cameleer-server-app -am 2>&1 | tail -30</automated>
</verify>
<done>
- diagram_content_hash is populated with the active diagram's SHA-256 hash during ingestion (not empty string)
- DiagramLinkingIT passes with both positive and negative cases
- `mvn clean verify` passes for cameleer3-server-app (no classloader failures from ElkDiagramRendererTest)
- `mvn clean verify` passes for cameleer-server-app (no classloader failures from ElkDiagramRendererTest)
</done>
</task>

View File

@@ -20,12 +20,12 @@ tech-stack:
key-files:
created:
- cameleer3-server-app/src/test/java/com/cameleer3/server/app/storage/DiagramLinkingIT.java
- cameleer-server-app/src/test/java/com/cameleer/server/app/storage/DiagramLinkingIT.java
modified:
- cameleer3-server-app/src/main/java/com/cameleer3/server/app/storage/ClickHouseExecutionRepository.java
- cameleer3-server-app/pom.xml
- cameleer3-server-app/src/test/java/com/cameleer3/server/app/storage/IngestionSchemaIT.java
- cameleer3-server-app/src/test/java/com/cameleer3/server/app/controller/SearchControllerIT.java
- cameleer-server-app/src/main/java/com/cameleer/server/app/storage/ClickHouseExecutionRepository.java
- cameleer-server-app/pom.xml
- cameleer-server-app/src/test/java/com/cameleer/server/app/storage/IngestionSchemaIT.java
- cameleer-server-app/src/test/java/com/cameleer/server/app/controller/SearchControllerIT.java
key-decisions:
- "DiagramRepository injected via constructor into ClickHouseExecutionRepository for diagram hash lookup during batch insert"
@@ -69,11 +69,11 @@ Each task was committed atomically:
**Plan metadata:** (pending)
## Files Created/Modified
- `cameleer3-server-app/src/main/java/com/cameleer3/server/app/storage/ClickHouseExecutionRepository.java` - Added DiagramRepository injection, diagram hash lookup in insertBatch
- `cameleer3-server-app/pom.xml` - Added maven-surefire-plugin and maven-failsafe-plugin with reuseForks=false
- `cameleer3-server-app/src/test/java/com/cameleer3/server/app/storage/DiagramLinkingIT.java` - Integration test for diagram hash linking
- `cameleer3-server-app/src/test/java/com/cameleer3/server/app/storage/IngestionSchemaIT.java` - Added ignoreExceptions + increased timeouts
- `cameleer3-server-app/src/test/java/com/cameleer3/server/app/controller/SearchControllerIT.java` - Adjusted pagination assertion count
- `cameleer-server-app/src/main/java/com/cameleer/server/app/storage/ClickHouseExecutionRepository.java` - Added DiagramRepository injection, diagram hash lookup in insertBatch
- `cameleer-server-app/pom.xml` - Added maven-surefire-plugin and maven-failsafe-plugin with reuseForks=false
- `cameleer-server-app/src/test/java/com/cameleer/server/app/storage/DiagramLinkingIT.java` - Integration test for diagram hash linking
- `cameleer-server-app/src/test/java/com/cameleer/server/app/storage/IngestionSchemaIT.java` - Added ignoreExceptions + increased timeouts
- `cameleer-server-app/src/test/java/com/cameleer/server/app/controller/SearchControllerIT.java` - Adjusted pagination assertion count
## Decisions Made
- DiagramRepository injected via constructor into ClickHouseExecutionRepository -- both are @Repository Spring beans, so constructor injection autowires cleanly

View File

@@ -44,7 +44,7 @@ Users can find any transaction by status, time, duration, correlation ID, or con
- No execution overlay in server-rendered SVG — the UI handles overlay with theme support (dark/light)
- Top-to-bottom node layout flow
- Nested processors (inside for-each, split, try-catch) rendered in swimlanes to highlight nesting/scope
- Reference: cameleer3 agent repo `examples/route-diagram-example.html` for visual style inspiration (color-coded node types, EIP icons)
- Reference: cameleer agent repo `examples/route-diagram-example.html` for visual style inspiration (color-coded node types, EIP icons)
### Claude's Discretion
- Pagination implementation details (offset/limit vs cursor)
@@ -58,7 +58,7 @@ Users can find any transaction by status, time, duration, correlation ID, or con
<specifics>
## Specific Ideas
- "We want a cmd+k type of search in the UI" — see `cameleer3/examples/cmd-k-search-example.html` for the target UX. Key features:
- "We want a cmd+k type of search in the UI" — see `cameleer/examples/cmd-k-search-example.html` for the target UX. Key features:
- Cross-entity search: single query hits Executions, Routes, Exchanges, Agents with scope tabs and counts
- Filter chips in the input (e.g., `route:order` prefix filtering)
- Inline preview pane with JSON syntax highlighting for selected result

View File

@@ -112,7 +112,7 @@ The existing Phase 1 code provides a solid foundation: `ClickHouseExecutionRepos
### Recommended Project Structure (additions for Phase 2)
```
cameleer3-server-core/src/main/java/com/cameleer3/server/core/
cameleer-server-core/src/main/java/com/cameleer/server/core/
├── search/
│ ├── SearchService.java # Orchestrates search, delegates to SearchEngine
│ ├── SearchEngine.java # Interface for search backends (ClickHouse now, OpenSearch later)
@@ -127,7 +127,7 @@ cameleer3-server-core/src/main/java/com/cameleer3/server/core/
├── ExecutionRepository.java # Extended with query methods
└── DiagramRepository.java # Extended with lookup methods
cameleer3-server-app/src/main/java/com/cameleer3/server/app/
cameleer-server-app/src/main/java/com/cameleer/server/app/
├── controller/
│ ├── SearchController.java # GET + POST /api/v1/search/executions
│ ├── DetailController.java # GET /api/v1/executions/{id}
@@ -517,25 +517,25 @@ private void populateExchangeColumns(PreparedStatement ps, List<FlatProcessor> p
| Property | Value |
|----------|-------|
| Framework | JUnit 5 + Spring Boot Test + Testcontainers ClickHouse 25.3 |
| Config file | cameleer3-server-app/pom.xml (testcontainers dep), AbstractClickHouseIT base class |
| Quick run command | `mvn test -pl cameleer3-server-app -Dtest=SearchControllerIT -Dfailsafe.skip=true` |
| Config file | cameleer-server-app/pom.xml (testcontainers dep), AbstractClickHouseIT base class |
| Quick run command | `mvn test -pl cameleer-server-app -Dtest=SearchControllerIT -Dfailsafe.skip=true` |
| Full suite command | `mvn clean verify` |
### Phase Requirements -> Test Map
| Req ID | Behavior | Test Type | Automated Command | File Exists? |
|--------|----------|-----------|-------------------|-------------|
| SRCH-01 | Filter by status returns matching executions | integration | `mvn test -pl cameleer3-server-app -Dtest=SearchControllerIT#searchByStatus` | No -- Wave 0 |
| SRCH-02 | Filter by time range returns matching executions | integration | `mvn test -pl cameleer3-server-app -Dtest=SearchControllerIT#searchByTimeRange` | No -- Wave 0 |
| SRCH-03 | Filter by duration range returns matching | integration | `mvn test -pl cameleer3-server-app -Dtest=SearchControllerIT#searchByDuration` | No -- Wave 0 |
| SRCH-04 | Filter by correlationId returns correlated | integration | `mvn test -pl cameleer3-server-app -Dtest=SearchControllerIT#searchByCorrelationId` | No -- Wave 0 |
| SRCH-05 | Full-text search across bodies/headers/errors | integration | `mvn test -pl cameleer3-server-app -Dtest=SearchControllerIT#fullTextSearch` | No -- Wave 0 |
| SRCH-06 | Detail returns nested processor tree | integration | `mvn test -pl cameleer3-server-app -Dtest=DetailControllerIT#detailReturnsNestedTree` | No -- Wave 0 |
| DIAG-01 | Content-hash dedup stores identical defs once | integration | `mvn test -pl cameleer3-server-app -Dtest=DiagramControllerIT#contentHashDedup` | Partial (ingestion test exists) |
| DIAG-02 | Transaction links to active diagram version | integration | `mvn test -pl cameleer3-server-app -Dtest=DetailControllerIT#detailIncludesDiagramHash` | No -- Wave 0 |
| DIAG-03 | Diagram rendered as SVG or JSON layout | integration | `mvn test -pl cameleer3-server-app -Dtest=DiagramRenderControllerIT#renderSvg` | No -- Wave 0 |
| SRCH-01 | Filter by status returns matching executions | integration | `mvn test -pl cameleer-server-app -Dtest=SearchControllerIT#searchByStatus` | No -- Wave 0 |
| SRCH-02 | Filter by time range returns matching executions | integration | `mvn test -pl cameleer-server-app -Dtest=SearchControllerIT#searchByTimeRange` | No -- Wave 0 |
| SRCH-03 | Filter by duration range returns matching | integration | `mvn test -pl cameleer-server-app -Dtest=SearchControllerIT#searchByDuration` | No -- Wave 0 |
| SRCH-04 | Filter by correlationId returns correlated | integration | `mvn test -pl cameleer-server-app -Dtest=SearchControllerIT#searchByCorrelationId` | No -- Wave 0 |
| SRCH-05 | Full-text search across bodies/headers/errors | integration | `mvn test -pl cameleer-server-app -Dtest=SearchControllerIT#fullTextSearch` | No -- Wave 0 |
| SRCH-06 | Detail returns nested processor tree | integration | `mvn test -pl cameleer-server-app -Dtest=DetailControllerIT#detailReturnsNestedTree` | No -- Wave 0 |
| DIAG-01 | Content-hash dedup stores identical defs once | integration | `mvn test -pl cameleer-server-app -Dtest=DiagramControllerIT#contentHashDedup` | Partial (ingestion test exists) |
| DIAG-02 | Transaction links to active diagram version | integration | `mvn test -pl cameleer-server-app -Dtest=DetailControllerIT#detailIncludesDiagramHash` | No -- Wave 0 |
| DIAG-03 | Diagram rendered as SVG or JSON layout | integration | `mvn test -pl cameleer-server-app -Dtest=DiagramRenderControllerIT#renderSvg` | No -- Wave 0 |
### Sampling Rate
- **Per task commit:** `mvn test -pl cameleer3-server-app -Dtest=<relevant>IT`
- **Per task commit:** `mvn test -pl cameleer-server-app -Dtest=<relevant>IT`
- **Per wave merge:** `mvn clean verify`
- **Phase gate:** Full suite green before `/gsd:verify-work`
@@ -551,7 +551,7 @@ private void populateExchangeColumns(PreparedStatement ps, List<FlatProcessor> p
### Primary (HIGH confidence)
- ClickHouse JDBC 0.9.7, ClickHouse 25.3 -- verified from project pom.xml and AbstractClickHouseIT
- cameleer3-common 1.0-SNAPSHOT JAR -- decompiled to verify RouteGraph, RouteNode, RouteEdge, NodeType, ProcessorExecution, ExchangeSnapshot field structures
- cameleer-common 1.0-SNAPSHOT JAR -- decompiled to verify RouteGraph, RouteNode, RouteEdge, NodeType, ProcessorExecution, ExchangeSnapshot field structures
- Existing Phase 1 codebase -- ClickHouseExecutionRepository, ClickHouseDiagramRepository, schema, test patterns
### Secondary (MEDIUM confidence)

View File

@@ -18,8 +18,8 @@ created: 2026-03-11
| Property | Value |
|----------|-------|
| **Framework** | JUnit 5 + Spring Boot Test + Testcontainers ClickHouse 25.3 |
| **Config file** | cameleer3-server-app/pom.xml (testcontainers dep), AbstractClickHouseIT base class |
| **Quick run command** | `mvn test -pl cameleer3-server-app -Dtest=SearchControllerIT` |
| **Config file** | cameleer-server-app/pom.xml (testcontainers dep), AbstractClickHouseIT base class |
| **Quick run command** | `mvn test -pl cameleer-server-app -Dtest=SearchControllerIT` |
| **Full suite command** | `mvn clean verify` |
| **Estimated runtime** | ~45 seconds |
@@ -27,7 +27,7 @@ created: 2026-03-11
## Sampling Rate
- **After every task commit:** Run `mvn test -pl cameleer3-server-app -Dtest=<relevant>IT`
- **After every task commit:** Run `mvn test -pl cameleer-server-app -Dtest=<relevant>IT`
- **After every plan wave:** Run `mvn clean verify`
- **Before `/gsd:verify-work`:** Full suite must be green
- **Max feedback latency:** 45 seconds
@@ -38,15 +38,15 @@ created: 2026-03-11
| Task ID | Plan | Wave | Requirement | Test Type | Automated Command | File Exists | Status |
|---------|------|------|-------------|-----------|-------------------|-------------|--------|
| 02-01-01 | 01 | 1 | SRCH-01 | integration | `mvn test -pl cameleer3-server-app -Dtest=SearchControllerIT#searchByStatus` | ❌ W0 | ⬜ pending |
| 02-01-02 | 01 | 1 | SRCH-02 | integration | `mvn test -pl cameleer3-server-app -Dtest=SearchControllerIT#searchByTimeRange` | ❌ W0 | ⬜ pending |
| 02-01-03 | 01 | 1 | SRCH-03 | integration | `mvn test -pl cameleer3-server-app -Dtest=SearchControllerIT#searchByDuration` | ❌ W0 | ⬜ pending |
| 02-01-04 | 01 | 1 | SRCH-04 | integration | `mvn test -pl cameleer3-server-app -Dtest=SearchControllerIT#searchByCorrelationId` | ❌ W0 | ⬜ pending |
| 02-01-05 | 01 | 1 | SRCH-05 | integration | `mvn test -pl cameleer3-server-app -Dtest=SearchControllerIT#fullTextSearch` | ❌ W0 | ⬜ pending |
| 02-01-06 | 01 | 1 | SRCH-06 | integration | `mvn test -pl cameleer3-server-app -Dtest=DetailControllerIT#detailReturnsNestedTree` | ❌ W0 | ⬜ pending |
| 02-02-01 | 02 | 1 | DIAG-01 | integration | `mvn test -pl cameleer3-server-app -Dtest=DiagramControllerIT#contentHashDedup` | Partial | ⬜ pending |
| 02-02-02 | 02 | 1 | DIAG-02 | integration | `mvn test -pl cameleer3-server-app -Dtest=DetailControllerIT#detailIncludesDiagramHash` | ❌ W0 | ⬜ pending |
| 02-02-03 | 02 | 1 | DIAG-03 | integration | `mvn test -pl cameleer3-server-app -Dtest=DiagramRenderControllerIT#renderSvg` | ❌ W0 | ⬜ pending |
| 02-01-01 | 01 | 1 | SRCH-01 | integration | `mvn test -pl cameleer-server-app -Dtest=SearchControllerIT#searchByStatus` | ❌ W0 | ⬜ pending |
| 02-01-02 | 01 | 1 | SRCH-02 | integration | `mvn test -pl cameleer-server-app -Dtest=SearchControllerIT#searchByTimeRange` | ❌ W0 | ⬜ pending |
| 02-01-03 | 01 | 1 | SRCH-03 | integration | `mvn test -pl cameleer-server-app -Dtest=SearchControllerIT#searchByDuration` | ❌ W0 | ⬜ pending |
| 02-01-04 | 01 | 1 | SRCH-04 | integration | `mvn test -pl cameleer-server-app -Dtest=SearchControllerIT#searchByCorrelationId` | ❌ W0 | ⬜ pending |
| 02-01-05 | 01 | 1 | SRCH-05 | integration | `mvn test -pl cameleer-server-app -Dtest=SearchControllerIT#fullTextSearch` | ❌ W0 | ⬜ pending |
| 02-01-06 | 01 | 1 | SRCH-06 | integration | `mvn test -pl cameleer-server-app -Dtest=DetailControllerIT#detailReturnsNestedTree` | ❌ W0 | ⬜ pending |
| 02-02-01 | 02 | 1 | DIAG-01 | integration | `mvn test -pl cameleer-server-app -Dtest=DiagramControllerIT#contentHashDedup` | Partial | ⬜ pending |
| 02-02-02 | 02 | 1 | DIAG-02 | integration | `mvn test -pl cameleer-server-app -Dtest=DetailControllerIT#detailIncludesDiagramHash` | ❌ W0 | ⬜ pending |
| 02-02-03 | 02 | 1 | DIAG-03 | integration | `mvn test -pl cameleer-server-app -Dtest=DiagramRenderControllerIT#renderSvg` | ❌ W0 | ⬜ pending |
*Status: ⬜ pending · ✅ green · ❌ red · ⚠️ flaky*

View File

@@ -50,9 +50,9 @@ human_verification:
| Artifact | Status | Notes |
|----------|--------|-------|
| `cameleer3-server-app/src/main/java/com/cameleer3/server/app/storage/ClickHouseExecutionRepository.java` | VERIFIED | DiagramRepository injected via constructor (line 59); `findContentHashForRoute` called in `setValues()` (lines 144147); former `""` placeholder removed |
| `cameleer3-server-app/pom.xml` | VERIFIED | `maven-surefire-plugin` with `forkCount=1` `reuseForks=false` at lines 95100; `maven-failsafe-plugin` same config at lines 103108 |
| `cameleer3-server-app/src/test/java/com/cameleer3/server/app/storage/DiagramLinkingIT.java` | VERIFIED | 152 lines; 2 integration tests; positive case asserts 64-char hex hash; negative case asserts empty string; uses `ignoreExceptions()` for ClickHouse eventual consistency |
| `cameleer-server-app/src/main/java/com/cameleer/server/app/storage/ClickHouseExecutionRepository.java` | VERIFIED | DiagramRepository injected via constructor (line 59); `findContentHashForRoute` called in `setValues()` (lines 144147); former `""` placeholder removed |
| `cameleer-server-app/pom.xml` | VERIFIED | `maven-surefire-plugin` with `forkCount=1` `reuseForks=false` at lines 95100; `maven-failsafe-plugin` same config at lines 103108 |
| `cameleer-server-app/src/test/java/com/cameleer/server/app/storage/DiagramLinkingIT.java` | VERIFIED | 152 lines; 2 integration tests; positive case asserts 64-char hex hash; negative case asserts empty string; uses `ignoreExceptions()` for ClickHouse eventual consistency |
### Key Link Verification
@@ -62,7 +62,7 @@ human_verification:
| `SearchController` | `SearchService` | constructor injection, `searchService.search()` | WIRED | Previously verified; no regression |
| `DetailController` | `DetailService` | constructor injection, `detailService.getDetail()` | WIRED | Previously verified; no regression |
| `DiagramRenderController` | `DiagramRepository` + `DiagramRenderer` | `findByContentHash()` + `renderSvg()`/`layoutJson()` | WIRED | Previously verified; no regression |
| `Surefire/Failsafe` | ELK classloader isolation | `reuseForks=false` forces fresh JVM per test class | WIRED | Lines 95116 in `cameleer3-server-app/pom.xml` |
| `Surefire/Failsafe` | ELK classloader isolation | `reuseForks=false` forces fresh JVM per test class | WIRED | Lines 95116 in `cameleer-server-app/pom.xml` |
### Requirements Coverage
@@ -108,7 +108,7 @@ Two blockers from the initial verification (2026-03-11T16:00:00Z) have been reso
**Gap 1 resolved — DIAG-02 diagram hash linking:** `ClickHouseExecutionRepository` now injects `DiagramRepository` via constructor and calls `findContentHashForRoute(exec.getRouteId(), "")` in `insertBatch()`. Both the diagram store path and the execution ingest path use `agent_id=""` consistently, so the lookup is correct. `DiagramLinkingIT` provides integration test coverage for both the positive case (hash populated when diagram exists) and negative case (empty string when no diagram exists for the route).
**Gap 2 resolved — Test suite stability:** Both `maven-surefire-plugin` and `maven-failsafe-plugin` in `cameleer3-server-app/pom.xml` are now configured with `forkCount=1` `reuseForks=false`. This forces a fresh JVM per test class, isolating ELK's `LayeredMetaDataProvider` static initializer from Spring Boot's classloader. The SUMMARY reports 51 tests, 0 failures. Test count across 16 test files totals 80 `@Test` methods; the difference from 51 reflects how Surefire/Failsafe counts parameterized and nested tests vs. raw annotation count.
**Gap 2 resolved — Test suite stability:** Both `maven-surefire-plugin` and `maven-failsafe-plugin` in `cameleer-server-app/pom.xml` are now configured with `forkCount=1` `reuseForks=false`. This forces a fresh JVM per test class, isolating ELK's `LayeredMetaDataProvider` static initializer from Spring Boot's classloader. The SUMMARY reports 51 tests, 0 failures. Test count across 16 test files totals 80 `@Test` methods; the difference from 51 reflects how Surefire/Failsafe counts parameterized and nested tests vs. raw annotation count.
No regressions were introduced. All 10 observable truths and all 9 phase requirements are now satisfied. Two items remain for human visual verification (SVG rendering correctness).

View File

@@ -5,21 +5,21 @@ type: execute
wave: 1
depends_on: []
files_modified:
- cameleer3-server-core/src/main/java/com/cameleer3/server/core/agent/AgentInfo.java
- cameleer3-server-core/src/main/java/com/cameleer3/server/core/agent/AgentState.java
- cameleer3-server-core/src/main/java/com/cameleer3/server/core/agent/AgentCommand.java
- cameleer3-server-core/src/main/java/com/cameleer3/server/core/agent/CommandStatus.java
- cameleer3-server-core/src/main/java/com/cameleer3/server/core/agent/CommandType.java
- cameleer3-server-core/src/main/java/com/cameleer3/server/core/agent/AgentRegistryService.java
- cameleer3-server-core/src/main/java/com/cameleer3/server/core/agent/AgentEventListener.java
- cameleer3-server-core/src/test/java/com/cameleer3/server/core/agent/AgentRegistryServiceTest.java
- cameleer3-server-app/src/main/java/com/cameleer3/server/app/config/AgentRegistryConfig.java
- cameleer3-server-app/src/main/java/com/cameleer3/server/app/config/AgentRegistryBeanConfig.java
- cameleer3-server-app/src/main/java/com/cameleer3/server/app/agent/AgentLifecycleMonitor.java
- cameleer3-server-app/src/main/java/com/cameleer3/server/app/controller/AgentRegistrationController.java
- cameleer3-server-app/src/main/java/com/cameleer3/server/app/Cameleer3ServerApplication.java
- cameleer3-server-app/src/main/resources/application.yml
- cameleer3-server-app/src/test/java/com/cameleer3/server/app/controller/AgentRegistrationControllerIT.java
- cameleer-server-core/src/main/java/com/cameleer/server/core/agent/AgentInfo.java
- cameleer-server-core/src/main/java/com/cameleer/server/core/agent/AgentState.java
- cameleer-server-core/src/main/java/com/cameleer/server/core/agent/AgentCommand.java
- cameleer-server-core/src/main/java/com/cameleer/server/core/agent/CommandStatus.java
- cameleer-server-core/src/main/java/com/cameleer/server/core/agent/CommandType.java
- cameleer-server-core/src/main/java/com/cameleer/server/core/agent/AgentRegistryService.java
- cameleer-server-core/src/main/java/com/cameleer/server/core/agent/AgentEventListener.java
- cameleer-server-core/src/test/java/com/cameleer/server/core/agent/AgentRegistryServiceTest.java
- cameleer-server-app/src/main/java/com/cameleer/server/app/config/AgentRegistryConfig.java
- cameleer-server-app/src/main/java/com/cameleer/server/app/config/AgentRegistryBeanConfig.java
- cameleer-server-app/src/main/java/com/cameleer/server/app/agent/AgentLifecycleMonitor.java
- cameleer-server-app/src/main/java/com/cameleer/server/app/controller/AgentRegistrationController.java
- cameleer-server-app/src/main/java/com/cameleer/server/app/CameleerServerApplication.java
- cameleer-server-app/src/main/resources/application.yml
- cameleer-server-app/src/test/java/com/cameleer/server/app/controller/AgentRegistrationControllerIT.java
autonomous: true
requirements:
- AGNT-01
@@ -34,13 +34,13 @@ must_haves:
- "Server transitions agents LIVE->STALE after 90s without heartbeat, STALE->DEAD 5 minutes after staleTransitionTime"
- "Agent list endpoint GET /api/v1/agents returns all agents, filterable by ?status=LIVE|STALE|DEAD"
artifacts:
- path: "cameleer3-server-core/src/main/java/com/cameleer3/server/core/agent/AgentRegistryService.java"
- path: "cameleer-server-core/src/main/java/com/cameleer/server/core/agent/AgentRegistryService.java"
provides: "Agent registration, heartbeat, lifecycle transitions, find/filter"
- path: "cameleer3-server-core/src/main/java/com/cameleer3/server/core/agent/AgentInfo.java"
- path: "cameleer-server-core/src/main/java/com/cameleer/server/core/agent/AgentInfo.java"
provides: "Agent record with id, name, group, version, routeIds, capabilities, state, timestamps"
- path: "cameleer3-server-app/src/main/java/com/cameleer3/server/app/controller/AgentRegistrationController.java"
- path: "cameleer-server-app/src/main/java/com/cameleer/server/app/controller/AgentRegistrationController.java"
provides: "POST /register, POST /{id}/heartbeat, GET /agents endpoints"
- path: "cameleer3-server-app/src/main/java/com/cameleer3/server/app/agent/AgentLifecycleMonitor.java"
- path: "cameleer-server-app/src/main/java/com/cameleer/server/app/agent/AgentLifecycleMonitor.java"
provides: "@Scheduled lifecycle transitions LIVE->STALE->DEAD"
key_links:
- from: "AgentRegistrationController"
@@ -76,14 +76,14 @@ Output: Core domain types (AgentInfo, AgentState, AgentCommand, CommandStatus, C
@.planning/phases/03-agent-registry-sse-push/03-CONTEXT.md
@.planning/phases/03-agent-registry-sse-push/03-RESEARCH.md
@cameleer3-server-core/src/main/java/com/cameleer3/server/core/ingestion/IngestionService.java
@cameleer3-server-app/src/main/java/com/cameleer3/server/app/config/IngestionBeanConfig.java
@cameleer3-server-app/src/main/java/com/cameleer3/server/app/config/IngestionConfig.java
@cameleer3-server-app/src/main/java/com/cameleer3/server/app/ingestion/ClickHouseFlushScheduler.java
@cameleer3-server-app/src/main/java/com/cameleer3/server/app/config/WebConfig.java
@cameleer3-server-app/src/main/java/com/cameleer3/server/app/Cameleer3ServerApplication.java
@cameleer3-server-app/src/main/resources/application.yml
@cameleer3-server-app/src/test/java/com/cameleer3/server/app/AbstractClickHouseIT.java
@cameleer-server-core/src/main/java/com/cameleer/server/core/ingestion/IngestionService.java
@cameleer-server-app/src/main/java/com/cameleer/server/app/config/IngestionBeanConfig.java
@cameleer-server-app/src/main/java/com/cameleer/server/app/config/IngestionConfig.java
@cameleer-server-app/src/main/java/com/cameleer/server/app/ingestion/ClickHouseFlushScheduler.java
@cameleer-server-app/src/main/java/com/cameleer/server/app/config/WebConfig.java
@cameleer-server-app/src/main/java/com/cameleer/server/app/CameleerServerApplication.java
@cameleer-server-app/src/main/resources/application.yml
@cameleer-server-app/src/test/java/com/cameleer/server/app/AbstractClickHouseIT.java
<interfaces>
<!-- Established codebase patterns the executor must follow -->
@@ -99,10 +99,10 @@ Pattern: Controller accepts raw String body:
Pattern: @Scheduled for periodic tasks:
- ClickHouseFlushScheduler uses @Scheduled(fixedDelayString = "${ingestion.flush-interval-ms:1000}")
- @EnableScheduling already on Cameleer3ServerApplication
- @EnableScheduling already on CameleerServerApplication
Pattern: @EnableConfigurationProperties registration:
- Cameleer3ServerApplication has @EnableConfigurationProperties(IngestionConfig.class)
- CameleerServerApplication has @EnableConfigurationProperties(IngestionConfig.class)
- New config classes must be added to this annotation
Pattern: ProtocolVersionInterceptor:
@@ -116,14 +116,14 @@ Pattern: ProtocolVersionInterceptor:
<task type="auto" tdd="true">
<name>Task 1: Core domain types and AgentRegistryService with unit tests</name>
<files>
cameleer3-server-core/src/main/java/com/cameleer3/server/core/agent/AgentInfo.java,
cameleer3-server-core/src/main/java/com/cameleer3/server/core/agent/AgentState.java,
cameleer3-server-core/src/main/java/com/cameleer3/server/core/agent/AgentCommand.java,
cameleer3-server-core/src/main/java/com/cameleer3/server/core/agent/CommandStatus.java,
cameleer3-server-core/src/main/java/com/cameleer3/server/core/agent/CommandType.java,
cameleer3-server-core/src/main/java/com/cameleer3/server/core/agent/AgentRegistryService.java,
cameleer3-server-core/src/main/java/com/cameleer3/server/core/agent/AgentEventListener.java,
cameleer3-server-core/src/test/java/com/cameleer3/server/core/agent/AgentRegistryServiceTest.java
cameleer-server-core/src/main/java/com/cameleer/server/core/agent/AgentInfo.java,
cameleer-server-core/src/main/java/com/cameleer/server/core/agent/AgentState.java,
cameleer-server-core/src/main/java/com/cameleer/server/core/agent/AgentCommand.java,
cameleer-server-core/src/main/java/com/cameleer/server/core/agent/CommandStatus.java,
cameleer-server-core/src/main/java/com/cameleer/server/core/agent/CommandType.java,
cameleer-server-core/src/main/java/com/cameleer/server/core/agent/AgentRegistryService.java,
cameleer-server-core/src/main/java/com/cameleer/server/core/agent/AgentEventListener.java,
cameleer-server-core/src/test/java/com/cameleer/server/core/agent/AgentRegistryServiceTest.java
</files>
<behavior>
- register: new agent ID creates AgentInfo with state LIVE, returns AgentInfo
@@ -142,7 +142,7 @@ Pattern: ProtocolVersionInterceptor:
- findPendingCommands: returns PENDING commands for given agentId
</behavior>
<action>
Create the agent domain model in the core module (package com.cameleer3.server.core.agent):
Create the agent domain model in the core module (package com.cameleer.server.core.agent):
1. **AgentState enum**: LIVE, STALE, DEAD
@@ -182,7 +182,7 @@ Pattern: ProtocolVersionInterceptor:
Write tests FIRST (RED), then implement (GREEN). Test class: AgentRegistryServiceTest.
</action>
<verify>
<automated>mvn test -pl cameleer3-server-core -Dtest=AgentRegistryServiceTest</automated>
<automated>mvn test -pl cameleer-server-core -Dtest=AgentRegistryServiceTest</automated>
</verify>
<done>All unit tests pass: registration (new + re-register), heartbeat (known + unknown), lifecycle transitions (LIVE->STALE->DEAD, heartbeat revives STALE), findAll/findByState/findById, command add/acknowledge/expire. AgentEventListener interface defined.</done>
</task>
@@ -190,13 +190,13 @@ Pattern: ProtocolVersionInterceptor:
<task type="auto">
<name>Task 2: Registration/heartbeat/list controllers, config, lifecycle monitor, integration tests</name>
<files>
cameleer3-server-app/src/main/java/com/cameleer3/server/app/config/AgentRegistryConfig.java,
cameleer3-server-app/src/main/java/com/cameleer3/server/app/config/AgentRegistryBeanConfig.java,
cameleer3-server-app/src/main/java/com/cameleer3/server/app/agent/AgentLifecycleMonitor.java,
cameleer3-server-app/src/main/java/com/cameleer3/server/app/controller/AgentRegistrationController.java,
cameleer3-server-app/src/main/java/com/cameleer3/server/app/Cameleer3ServerApplication.java,
cameleer3-server-app/src/main/resources/application.yml,
cameleer3-server-app/src/test/java/com/cameleer3/server/app/controller/AgentRegistrationControllerIT.java
cameleer-server-app/src/main/java/com/cameleer/server/app/config/AgentRegistryConfig.java,
cameleer-server-app/src/main/java/com/cameleer/server/app/config/AgentRegistryBeanConfig.java,
cameleer-server-app/src/main/java/com/cameleer/server/app/agent/AgentLifecycleMonitor.java,
cameleer-server-app/src/main/java/com/cameleer/server/app/controller/AgentRegistrationController.java,
cameleer-server-app/src/main/java/com/cameleer/server/app/CameleerServerApplication.java,
cameleer-server-app/src/main/resources/application.yml,
cameleer-server-app/src/test/java/com/cameleer/server/app/controller/AgentRegistrationControllerIT.java
</files>
<action>
Wire the agent registry into the Spring Boot app and create REST endpoints:
@@ -214,7 +214,7 @@ Pattern: ProtocolVersionInterceptor:
- @Bean AgentRegistryService: `new AgentRegistryService(config.getStaleThresholdMs(), config.getDeadThresholdMs(), config.getCommandExpiryMs())`
Follow IngestionBeanConfig pattern.
3. **Update Cameleer3ServerApplication**: Add AgentRegistryConfig.class to @EnableConfigurationProperties.
3. **Update CameleerServerApplication**: Add AgentRegistryConfig.class to @EnableConfigurationProperties.
4. **Update application.yml**: Add agent-registry section with all defaults (see RESEARCH.md code example). Also add `spring.mvc.async.request-timeout: -1` for SSE support (Plan 02 needs it, but set it now).
@@ -242,7 +242,7 @@ Pattern: ProtocolVersionInterceptor:
- Use TestRestTemplate (already available from AbstractClickHouseIT's @SpringBootTest)
</action>
<verify>
<automated>mvn test -pl cameleer3-server-core,cameleer3-server-app -Dtest="Agent*"</automated>
<automated>mvn test -pl cameleer-server-core,cameleer-server-app -Dtest="Agent*"</automated>
</verify>
<done>POST /register returns 200 with agentId + sseEndpoint + heartbeatIntervalMs. POST /{id}/heartbeat returns 200 for known agents, 404 for unknown. GET /agents returns all agents with optional ?status= filter. AgentLifecycleMonitor runs on schedule. All integration tests pass. mvn clean verify passes.</done>
</task>

View File

@@ -27,22 +27,22 @@ tech-stack:
key-files:
created:
- cameleer3-server-core/src/main/java/com/cameleer3/server/core/agent/AgentInfo.java
- cameleer3-server-core/src/main/java/com/cameleer3/server/core/agent/AgentState.java
- cameleer3-server-core/src/main/java/com/cameleer3/server/core/agent/AgentCommand.java
- cameleer3-server-core/src/main/java/com/cameleer3/server/core/agent/CommandStatus.java
- cameleer3-server-core/src/main/java/com/cameleer3/server/core/agent/CommandType.java
- cameleer3-server-core/src/main/java/com/cameleer3/server/core/agent/AgentRegistryService.java
- cameleer3-server-core/src/main/java/com/cameleer3/server/core/agent/AgentEventListener.java
- cameleer3-server-core/src/test/java/com/cameleer3/server/core/agent/AgentRegistryServiceTest.java
- cameleer3-server-app/src/main/java/com/cameleer3/server/app/config/AgentRegistryConfig.java
- cameleer3-server-app/src/main/java/com/cameleer3/server/app/config/AgentRegistryBeanConfig.java
- cameleer3-server-app/src/main/java/com/cameleer3/server/app/agent/AgentLifecycleMonitor.java
- cameleer3-server-app/src/main/java/com/cameleer3/server/app/controller/AgentRegistrationController.java
- cameleer3-server-app/src/test/java/com/cameleer3/server/app/controller/AgentRegistrationControllerIT.java
- cameleer-server-core/src/main/java/com/cameleer/server/core/agent/AgentInfo.java
- cameleer-server-core/src/main/java/com/cameleer/server/core/agent/AgentState.java
- cameleer-server-core/src/main/java/com/cameleer/server/core/agent/AgentCommand.java
- cameleer-server-core/src/main/java/com/cameleer/server/core/agent/CommandStatus.java
- cameleer-server-core/src/main/java/com/cameleer/server/core/agent/CommandType.java
- cameleer-server-core/src/main/java/com/cameleer/server/core/agent/AgentRegistryService.java
- cameleer-server-core/src/main/java/com/cameleer/server/core/agent/AgentEventListener.java
- cameleer-server-core/src/test/java/com/cameleer/server/core/agent/AgentRegistryServiceTest.java
- cameleer-server-app/src/main/java/com/cameleer/server/app/config/AgentRegistryConfig.java
- cameleer-server-app/src/main/java/com/cameleer/server/app/config/AgentRegistryBeanConfig.java
- cameleer-server-app/src/main/java/com/cameleer/server/app/agent/AgentLifecycleMonitor.java
- cameleer-server-app/src/main/java/com/cameleer/server/app/controller/AgentRegistrationController.java
- cameleer-server-app/src/test/java/com/cameleer/server/app/controller/AgentRegistrationControllerIT.java
modified:
- cameleer3-server-app/src/main/java/com/cameleer3/server/app/Cameleer3ServerApplication.java
- cameleer3-server-app/src/main/resources/application.yml
- cameleer-server-app/src/main/java/com/cameleer/server/app/CameleerServerApplication.java
- cameleer-server-app/src/main/resources/application.yml
key-decisions:
- "AgentInfo as Java record with wither-style methods for immutable ConcurrentHashMap swapping"
@@ -103,7 +103,7 @@ _Note: Task 1 used TDD with separate RED/GREEN commits_
- `AgentRegistrationController.java` - REST endpoints for agents
- `AgentRegistryServiceTest.java` - 23 unit tests
- `AgentRegistrationControllerIT.java` - 7 integration tests
- `Cameleer3ServerApplication.java` - Added AgentRegistryConfig to @EnableConfigurationProperties
- `CameleerServerApplication.java` - Added AgentRegistryConfig to @EnableConfigurationProperties
- `application.yml` - Added agent-registry config section and spring.mvc.async.request-timeout
## Decisions Made

View File

@@ -5,12 +5,12 @@ type: execute
wave: 2
depends_on: ["03-01"]
files_modified:
- cameleer3-server-app/src/main/java/com/cameleer3/server/app/agent/SseConnectionManager.java
- cameleer3-server-app/src/main/java/com/cameleer3/server/app/controller/AgentSseController.java
- cameleer3-server-app/src/main/java/com/cameleer3/server/app/controller/AgentCommandController.java
- cameleer3-server-app/src/main/java/com/cameleer3/server/app/config/WebConfig.java
- cameleer3-server-app/src/test/java/com/cameleer3/server/app/controller/AgentSseControllerIT.java
- cameleer3-server-app/src/test/java/com/cameleer3/server/app/controller/AgentCommandControllerIT.java
- cameleer-server-app/src/main/java/com/cameleer/server/app/agent/SseConnectionManager.java
- cameleer-server-app/src/main/java/com/cameleer/server/app/controller/AgentSseController.java
- cameleer-server-app/src/main/java/com/cameleer/server/app/controller/AgentCommandController.java
- cameleer-server-app/src/main/java/com/cameleer/server/app/config/WebConfig.java
- cameleer-server-app/src/test/java/com/cameleer/server/app/controller/AgentSseControllerIT.java
- cameleer-server-app/src/test/java/com/cameleer/server/app/controller/AgentCommandControllerIT.java
autonomous: true
requirements:
- AGNT-04
@@ -30,11 +30,11 @@ must_haves:
- "SSE events include event ID for Last-Event-ID reconnection support (no replay of missed events)"
- "Agent can acknowledge command receipt via POST /api/v1/agents/{id}/commands/{commandId}/ack"
artifacts:
- path: "cameleer3-server-app/src/main/java/com/cameleer3/server/app/agent/SseConnectionManager.java"
- path: "cameleer-server-app/src/main/java/com/cameleer/server/app/agent/SseConnectionManager.java"
provides: "Per-agent SseEmitter management, event sending, ping keepalive"
- path: "cameleer3-server-app/src/main/java/com/cameleer3/server/app/controller/AgentSseController.java"
- path: "cameleer-server-app/src/main/java/com/cameleer/server/app/controller/AgentSseController.java"
provides: "GET /{id}/events SSE endpoint"
- path: "cameleer3-server-app/src/main/java/com/cameleer3/server/app/controller/AgentCommandController.java"
- path: "cameleer-server-app/src/main/java/com/cameleer/server/app/controller/AgentCommandController.java"
provides: "POST command endpoints (single, group, broadcast) + ack endpoint"
key_links:
- from: "AgentCommandController"
@@ -75,49 +75,49 @@ Output: SseConnectionManager, SSE endpoint, command controller (single/group/bro
@.planning/phases/03-agent-registry-sse-push/03-RESEARCH.md
@.planning/phases/03-agent-registry-sse-push/03-01-SUMMARY.md
@cameleer3-server-app/src/main/java/com/cameleer3/server/app/config/WebConfig.java
@cameleer3-server-app/src/main/resources/application.yml
@cameleer3-server-app/src/test/java/com/cameleer3/server/app/AbstractClickHouseIT.java
@cameleer-server-app/src/main/java/com/cameleer/server/app/config/WebConfig.java
@cameleer-server-app/src/main/resources/application.yml
@cameleer-server-app/src/test/java/com/cameleer/server/app/AbstractClickHouseIT.java
<interfaces>
<!-- From Plan 01 (must exist before this plan executes) -->
From cameleer3-server-core/.../agent/AgentInfo.java:
From cameleer-server-core/.../agent/AgentInfo.java:
```java
// Record or class with fields:
// id, name, group, version, routeIds, capabilities, state, registeredAt, lastHeartbeat, staleTransitionTime
// Methods: withState(), withLastHeartbeat(), etc.
```
From cameleer3-server-core/.../agent/AgentState.java:
From cameleer-server-core/.../agent/AgentState.java:
```java
public enum AgentState { LIVE, STALE, DEAD }
```
From cameleer3-server-core/.../agent/CommandType.java:
From cameleer-server-core/.../agent/CommandType.java:
```java
public enum CommandType { CONFIG_UPDATE, DEEP_TRACE, REPLAY }
```
From cameleer3-server-core/.../agent/CommandStatus.java:
From cameleer-server-core/.../agent/CommandStatus.java:
```java
public enum CommandStatus { PENDING, DELIVERED, ACKNOWLEDGED, EXPIRED }
```
From cameleer3-server-core/.../agent/AgentCommand.java:
From cameleer-server-core/.../agent/AgentCommand.java:
```java
// Record: id (UUID string), type (CommandType), payload (String JSON), targetAgentId, createdAt, status
// Method: withStatus()
```
From cameleer3-server-core/.../agent/AgentEventListener.java:
From cameleer-server-core/.../agent/AgentEventListener.java:
```java
public interface AgentEventListener {
void onCommandReady(String agentId, AgentCommand command);
}
```
From cameleer3-server-core/.../agent/AgentRegistryService.java:
From cameleer-server-core/.../agent/AgentRegistryService.java:
```java
// Key methods:
// register(id, name, group, version, routeIds, capabilities) -> AgentInfo
@@ -131,7 +131,7 @@ From cameleer3-server-core/.../agent/AgentRegistryService.java:
// setEventListener(listener) -> void
```
From cameleer3-server-app/.../config/AgentRegistryConfig.java:
From cameleer-server-app/.../config/AgentRegistryConfig.java:
```java
// @ConfigurationProperties(prefix = "agent-registry")
// getPingIntervalMs(), getCommandExpiryMs(), etc.
@@ -144,11 +144,11 @@ From cameleer3-server-app/.../config/AgentRegistryConfig.java:
<task type="auto">
<name>Task 1: SseConnectionManager, SSE controller, and command controller</name>
<files>
cameleer3-server-app/src/main/java/com/cameleer3/server/app/agent/SseConnectionManager.java,
cameleer3-server-app/src/main/java/com/cameleer3/server/app/controller/AgentSseController.java,
cameleer3-server-app/src/main/java/com/cameleer3/server/app/controller/AgentCommandController.java,
cameleer3-server-app/src/main/java/com/cameleer3/server/app/config/AgentRegistryBeanConfig.java,
cameleer3-server-app/src/main/java/com/cameleer3/server/app/config/WebConfig.java
cameleer-server-app/src/main/java/com/cameleer/server/app/agent/SseConnectionManager.java,
cameleer-server-app/src/main/java/com/cameleer/server/app/controller/AgentSseController.java,
cameleer-server-app/src/main/java/com/cameleer/server/app/controller/AgentCommandController.java,
cameleer-server-app/src/main/java/com/cameleer/server/app/config/AgentRegistryBeanConfig.java,
cameleer-server-app/src/main/java/com/cameleer/server/app/config/WebConfig.java
</files>
<action>
Build the SSE infrastructure and command delivery system:
@@ -181,7 +181,7 @@ From cameleer3-server-app/.../config/AgentRegistryConfig.java:
5. **Update WebConfig**: The SSE endpoint GET /api/v1/agents/{id}/events is already covered by the interceptor pattern "/api/v1/agents/**". Agents send the protocol version header on all requests (per research recommendation), so no exclusion needed. However, if the SSE GET causes issues because browsers/clients may not easily add custom headers to EventSource, add the SSE events path to excludePathPatterns: `/api/v1/agents/*/events`. This is a practical consideration -- add the exclusion to be safe.
</action>
<verify>
<automated>mvn compile -pl cameleer3-server-core,cameleer3-server-app</automated>
<automated>mvn compile -pl cameleer-server-core,cameleer-server-app</automated>
</verify>
<done>SseConnectionManager, AgentSseController, and AgentCommandController compile. SSE endpoint returns SseEmitter. Command endpoints accept type/payload and deliver via SSE. Ping keepalive scheduled. WebConfig updated if needed.</done>
</task>
@@ -189,8 +189,8 @@ From cameleer3-server-app/.../config/AgentRegistryConfig.java:
<task type="auto">
<name>Task 2: Integration tests for SSE, commands, and full flow</name>
<files>
cameleer3-server-app/src/test/java/com/cameleer3/server/app/controller/AgentSseControllerIT.java,
cameleer3-server-app/src/test/java/com/cameleer3/server/app/controller/AgentCommandControllerIT.java
cameleer-server-app/src/test/java/com/cameleer/server/app/controller/AgentSseControllerIT.java,
cameleer-server-app/src/test/java/com/cameleer/server/app/controller/AgentCommandControllerIT.java
</files>
<action>
Write integration tests covering SSE connection, command delivery, ping, and acknowledgement:
@@ -224,7 +224,7 @@ From cameleer3-server-app/.../config/AgentRegistryConfig.java:
**Test configuration**: If ping interval needs to be shorter for tests, add to test application.yml or use @TestPropertySource with agent-registry.ping-interval-ms=1000.
</action>
<verify>
<automated>mvn test -pl cameleer3-server-core,cameleer3-server-app -Dtest="Agent*"</automated>
<automated>mvn test -pl cameleer-server-core,cameleer-server-app -Dtest="Agent*"</automated>
</verify>
<done>All SSE integration tests pass: connect/disconnect, config-update/deep-trace/replay delivery via SSE, ping keepalive received, Last-Event-ID accepted, command targeting (single/group/broadcast), command acknowledgement. mvn clean verify passes with all existing tests still green.</done>
</task>

View File

@@ -24,14 +24,14 @@ tech-stack:
key-files:
created:
- cameleer3-server-app/src/main/java/com/cameleer3/server/app/agent/SseConnectionManager.java
- cameleer3-server-app/src/main/java/com/cameleer3/server/app/controller/AgentSseController.java
- cameleer3-server-app/src/main/java/com/cameleer3/server/app/controller/AgentCommandController.java
- cameleer3-server-app/src/test/java/com/cameleer3/server/app/controller/AgentSseControllerIT.java
- cameleer3-server-app/src/test/java/com/cameleer3/server/app/controller/AgentCommandControllerIT.java
- cameleer-server-app/src/main/java/com/cameleer/server/app/agent/SseConnectionManager.java
- cameleer-server-app/src/main/java/com/cameleer/server/app/controller/AgentSseController.java
- cameleer-server-app/src/main/java/com/cameleer/server/app/controller/AgentCommandController.java
- cameleer-server-app/src/test/java/com/cameleer/server/app/controller/AgentSseControllerIT.java
- cameleer-server-app/src/test/java/com/cameleer/server/app/controller/AgentCommandControllerIT.java
modified:
- cameleer3-server-app/src/main/java/com/cameleer3/server/app/config/WebConfig.java
- cameleer3-server-app/src/test/resources/application-test.yml
- cameleer-server-app/src/main/java/com/cameleer/server/app/config/WebConfig.java
- cameleer-server-app/src/test/resources/application-test.yml
key-decisions:
- "SSE events path excluded from ProtocolVersionInterceptor for EventSource client compatibility"

View File

@@ -6,7 +6,7 @@
## Summary
This phase adds agent registration, heartbeat-based lifecycle management (LIVE/STALE/DEAD), and real-time command push via SSE to the Cameleer3 server. The technology stack is straightforward: Spring MVC's `SseEmitter` for server-push, `ConcurrentHashMap` for the in-memory agent registry, and `@Scheduled` for periodic lifecycle checks (same pattern already used by `ClickHouseFlushScheduler`).
This phase adds agent registration, heartbeat-based lifecycle management (LIVE/STALE/DEAD), and real-time command push via SSE to the Cameleer server. The technology stack is straightforward: Spring MVC's `SseEmitter` for server-push, `ConcurrentHashMap` for the in-memory agent registry, and `@Scheduled` for periodic lifecycle checks (same pattern already used by `ClickHouseFlushScheduler`).
The main architectural challenge is managing per-agent SSE connections reliably -- handling disconnections, timeouts, and cleanup without leaking threads or emitters. The command delivery model (PENDING with 60s expiry, acknowledgement) adds a second concurrent data structure to manage alongside the registry itself.
@@ -93,7 +93,7 @@ No new dependencies required. Everything is already on the classpath.
### Recommended Project Structure
```
cameleer3-server-core/src/main/java/com/cameleer3/server/core/
cameleer-server-core/src/main/java/com/cameleer/server/core/
├── agent/
│ ├── AgentInfo.java # Record: id, name, group, version, routeIds, capabilities, state, timestamps
│ ├── AgentState.java # Enum: LIVE, STALE, DEAD
@@ -101,7 +101,7 @@ cameleer3-server-core/src/main/java/com/cameleer3/server/core/
│ ├── AgentCommand.java # Record: id, type, payload, targetAgentId, createdAt, status
│ └── CommandStatus.java # Enum: PENDING, DELIVERED, ACKNOWLEDGED, EXPIRED
cameleer3-server-app/src/main/java/com/cameleer3/server/app/
cameleer-server-app/src/main/java/com/cameleer/server/app/
├── config/
│ ├── AgentRegistryConfig.java # @ConfigurationProperties(prefix = "agent-registry")
│ └── AgentRegistryBeanConfig.java # @Configuration: wires AgentRegistryService as bean
@@ -452,30 +452,30 @@ spring:
|----------|-------|
| Framework | JUnit 5 + Spring Boot Test (via spring-boot-starter-test) |
| Config file | pom.xml (Surefire + Failsafe configured) |
| Quick run command | `mvn test -pl cameleer3-server-core -Dtest=AgentRegistryServiceTest` |
| Quick run command | `mvn test -pl cameleer-server-core -Dtest=AgentRegistryServiceTest` |
| Full suite command | `mvn clean verify` |
### Phase Requirements to Test Map
| Req ID | Behavior | Test Type | Automated Command | File Exists? |
|--------|----------|-----------|-------------------|-------------|
| AGNT-01 | Agent registers and gets response | integration | `mvn test -pl cameleer3-server-app -Dtest=AgentRegistrationControllerIT#registerAgent*` | No - Wave 0 |
| AGNT-02 | Lifecycle transitions LIVE/STALE/DEAD | unit | `mvn test -pl cameleer3-server-core -Dtest=AgentRegistryServiceTest#lifecycle*` | No - Wave 0 |
| AGNT-03 | Heartbeat updates timestamp, returns 200/404 | integration | `mvn test -pl cameleer3-server-app -Dtest=AgentRegistrationControllerIT#heartbeat*` | No - Wave 0 |
| AGNT-04 | Config-update pushed via SSE | integration | `mvn test -pl cameleer3-server-app -Dtest=AgentSseControllerIT#configUpdate*` | No - Wave 0 |
| AGNT-05 | Deep-trace command pushed via SSE | integration | `mvn test -pl cameleer3-server-app -Dtest=AgentSseControllerIT#deepTrace*` | No - Wave 0 |
| AGNT-06 | Replay command pushed via SSE | integration | `mvn test -pl cameleer3-server-app -Dtest=AgentSseControllerIT#replay*` | No - Wave 0 |
| AGNT-07 | SSE ping keepalive + Last-Event-ID | integration | `mvn test -pl cameleer3-server-app -Dtest=AgentSseControllerIT#pingKeepalive*` | No - Wave 0 |
| AGNT-01 | Agent registers and gets response | integration | `mvn test -pl cameleer-server-app -Dtest=AgentRegistrationControllerIT#registerAgent*` | No - Wave 0 |
| AGNT-02 | Lifecycle transitions LIVE/STALE/DEAD | unit | `mvn test -pl cameleer-server-core -Dtest=AgentRegistryServiceTest#lifecycle*` | No - Wave 0 |
| AGNT-03 | Heartbeat updates timestamp, returns 200/404 | integration | `mvn test -pl cameleer-server-app -Dtest=AgentRegistrationControllerIT#heartbeat*` | No - Wave 0 |
| AGNT-04 | Config-update pushed via SSE | integration | `mvn test -pl cameleer-server-app -Dtest=AgentSseControllerIT#configUpdate*` | No - Wave 0 |
| AGNT-05 | Deep-trace command pushed via SSE | integration | `mvn test -pl cameleer-server-app -Dtest=AgentSseControllerIT#deepTrace*` | No - Wave 0 |
| AGNT-06 | Replay command pushed via SSE | integration | `mvn test -pl cameleer-server-app -Dtest=AgentSseControllerIT#replay*` | No - Wave 0 |
| AGNT-07 | SSE ping keepalive + Last-Event-ID | integration | `mvn test -pl cameleer-server-app -Dtest=AgentSseControllerIT#pingKeepalive*` | No - Wave 0 |
### Sampling Rate
- **Per task commit:** `mvn test -pl cameleer3-server-core,cameleer3-server-app -Dtest="Agent*"` (agent-related tests only)
- **Per task commit:** `mvn test -pl cameleer-server-core,cameleer-server-app -Dtest="Agent*"` (agent-related tests only)
- **Per wave merge:** `mvn clean verify`
- **Phase gate:** Full suite green before /gsd:verify-work
### Wave 0 Gaps
- [ ] `cameleer3-server-core/.../agent/AgentRegistryServiceTest.java` -- covers AGNT-02, AGNT-03 (unit tests for registry logic)
- [ ] `cameleer3-server-app/.../controller/AgentRegistrationControllerIT.java` -- covers AGNT-01, AGNT-03
- [ ] `cameleer3-server-app/.../controller/AgentSseControllerIT.java` -- covers AGNT-04, AGNT-05, AGNT-06, AGNT-07
- [ ] `cameleer3-server-app/.../controller/AgentCommandControllerIT.java` -- covers command targeting (single, group, all)
- [ ] `cameleer-server-core/.../agent/AgentRegistryServiceTest.java` -- covers AGNT-02, AGNT-03 (unit tests for registry logic)
- [ ] `cameleer-server-app/.../controller/AgentRegistrationControllerIT.java` -- covers AGNT-01, AGNT-03
- [ ] `cameleer-server-app/.../controller/AgentSseControllerIT.java` -- covers AGNT-04, AGNT-05, AGNT-06, AGNT-07
- [ ] `cameleer-server-app/.../controller/AgentCommandControllerIT.java` -- covers command targeting (single, group, all)
- [ ] No new framework install needed -- JUnit 5 + Spring Boot Test + Awaitility already in place
### SSE Test Strategy

View File

@@ -18,8 +18,8 @@ created: 2026-03-11
| Property | Value |
|----------|-------|
| **Framework** | JUnit 5 + Spring Boot Test + Testcontainers ClickHouse 25.3 |
| **Config file** | cameleer3-server-app/pom.xml (Surefire + Failsafe configured) |
| **Quick run command** | `mvn test -pl cameleer3-server-core -Dtest=AgentRegistryServiceTest` |
| **Config file** | cameleer-server-app/pom.xml (Surefire + Failsafe configured) |
| **Quick run command** | `mvn test -pl cameleer-server-core -Dtest=AgentRegistryServiceTest` |
| **Full suite command** | `mvn clean verify` |
| **Estimated runtime** | ~50 seconds |
@@ -27,7 +27,7 @@ created: 2026-03-11
## Sampling Rate
- **After every task commit:** Run `mvn test -pl cameleer3-server-core,cameleer3-server-app -Dtest="Agent*"`
- **After every task commit:** Run `mvn test -pl cameleer-server-core,cameleer-server-app -Dtest="Agent*"`
- **After every plan wave:** Run `mvn clean verify`
- **Before `/gsd:verify-work`:** Full suite must be green
- **Max feedback latency:** 50 seconds
@@ -38,13 +38,13 @@ created: 2026-03-11
| Task ID | Plan | Wave | Requirement | Test Type | Automated Command | File Exists | Status |
|---------|------|------|-------------|-----------|-------------------|-------------|--------|
| 03-01-01 | 01 | 1 | AGNT-01 | integration | `mvn test -pl cameleer3-server-app -Dtest=AgentRegistrationControllerIT#registerAgent*` | ❌ W0 | ⬜ pending |
| 03-01-02 | 01 | 1 | AGNT-02 | unit | `mvn test -pl cameleer3-server-core -Dtest=AgentRegistryServiceTest#lifecycle*` | ❌ W0 | ⬜ pending |
| 03-01-03 | 01 | 1 | AGNT-03 | integration | `mvn test -pl cameleer3-server-app -Dtest=AgentRegistrationControllerIT#heartbeat*` | ❌ W0 | ⬜ pending |
| 03-02-01 | 02 | 1 | AGNT-04 | integration | `mvn test -pl cameleer3-server-app -Dtest=AgentSseControllerIT#configUpdate*` | ❌ W0 | ⬜ pending |
| 03-02-02 | 02 | 1 | AGNT-05 | integration | `mvn test -pl cameleer3-server-app -Dtest=AgentSseControllerIT#deepTrace*` | ❌ W0 | ⬜ pending |
| 03-02-03 | 02 | 1 | AGNT-06 | integration | `mvn test -pl cameleer3-server-app -Dtest=AgentSseControllerIT#replay*` | ❌ W0 | ⬜ pending |
| 03-02-04 | 02 | 1 | AGNT-07 | integration | `mvn test -pl cameleer3-server-app -Dtest=AgentSseControllerIT#pingKeepalive*` | ❌ W0 | ⬜ pending |
| 03-01-01 | 01 | 1 | AGNT-01 | integration | `mvn test -pl cameleer-server-app -Dtest=AgentRegistrationControllerIT#registerAgent*` | ❌ W0 | ⬜ pending |
| 03-01-02 | 01 | 1 | AGNT-02 | unit | `mvn test -pl cameleer-server-core -Dtest=AgentRegistryServiceTest#lifecycle*` | ❌ W0 | ⬜ pending |
| 03-01-03 | 01 | 1 | AGNT-03 | integration | `mvn test -pl cameleer-server-app -Dtest=AgentRegistrationControllerIT#heartbeat*` | ❌ W0 | ⬜ pending |
| 03-02-01 | 02 | 1 | AGNT-04 | integration | `mvn test -pl cameleer-server-app -Dtest=AgentSseControllerIT#configUpdate*` | ❌ W0 | ⬜ pending |
| 03-02-02 | 02 | 1 | AGNT-05 | integration | `mvn test -pl cameleer-server-app -Dtest=AgentSseControllerIT#deepTrace*` | ❌ W0 | ⬜ pending |
| 03-02-03 | 02 | 1 | AGNT-06 | integration | `mvn test -pl cameleer-server-app -Dtest=AgentSseControllerIT#replay*` | ❌ W0 | ⬜ pending |
| 03-02-04 | 02 | 1 | AGNT-07 | integration | `mvn test -pl cameleer-server-app -Dtest=AgentSseControllerIT#pingKeepalive*` | ❌ W0 | ⬜ pending |
*Status: ⬜ pending · ✅ green · ❌ red · ⚠️ flaky*

View File

@@ -51,18 +51,18 @@ re_verification: false
| Artifact | Expected | Status | Details |
|----------|----------|--------|---------|
| `cameleer3-server-core/src/main/java/com/cameleer3/server/core/agent/AgentRegistryService.java` | Registration, heartbeat, lifecycle, find/filter, commands | VERIFIED | 281 lines; full implementation with ConcurrentHashMap, compute-based atomic swaps, eventListener bridge |
| `cameleer3-server-core/src/main/java/com/cameleer3/server/core/agent/AgentInfo.java` | Immutable record with all fields and wither methods | VERIFIED | 63 lines; record with 10 fields and 5 wither-style methods |
| `cameleer3-server-app/src/main/java/com/cameleer3/server/app/controller/AgentRegistrationController.java` | POST /register, POST /{id}/heartbeat, GET /agents | VERIFIED | 153 lines; all three endpoints implemented with OpenAPI annotations |
| `cameleer3-server-app/src/main/java/com/cameleer3/server/app/agent/AgentLifecycleMonitor.java` | @Scheduled LIVE->STALE->DEAD transitions | VERIFIED | 37 lines; calls `registryService.checkLifecycle()` and `expireOldCommands()` on schedule |
| `cameleer-server-core/src/main/java/com/cameleer/server/core/agent/AgentRegistryService.java` | Registration, heartbeat, lifecycle, find/filter, commands | VERIFIED | 281 lines; full implementation with ConcurrentHashMap, compute-based atomic swaps, eventListener bridge |
| `cameleer-server-core/src/main/java/com/cameleer/server/core/agent/AgentInfo.java` | Immutable record with all fields and wither methods | VERIFIED | 63 lines; record with 10 fields and 5 wither-style methods |
| `cameleer-server-app/src/main/java/com/cameleer/server/app/controller/AgentRegistrationController.java` | POST /register, POST /{id}/heartbeat, GET /agents | VERIFIED | 153 lines; all three endpoints implemented with OpenAPI annotations |
| `cameleer-server-app/src/main/java/com/cameleer/server/app/agent/AgentLifecycleMonitor.java` | @Scheduled LIVE->STALE->DEAD transitions | VERIFIED | 37 lines; calls `registryService.checkLifecycle()` and `expireOldCommands()` on schedule |
### Plan 02 Artifacts
| Artifact | Expected | Status | Details |
|----------|----------|--------|---------|
| `cameleer3-server-app/src/main/java/com/cameleer3/server/app/agent/SseConnectionManager.java` | Per-agent SseEmitter management, event sending, ping | VERIFIED | 158 lines; implements AgentEventListener, reference-equality removal, @PostConstruct registration |
| `cameleer3-server-app/src/main/java/com/cameleer3/server/app/controller/AgentSseController.java` | GET /{id}/events SSE endpoint | VERIFIED | 67 lines; checks agent exists, delegates to connectionManager.connect() |
| `cameleer3-server-app/src/main/java/com/cameleer3/server/app/controller/AgentCommandController.java` | POST commands (single/group/broadcast) + ack | VERIFIED | 182 lines; all four endpoints implemented |
| `cameleer-server-app/src/main/java/com/cameleer/server/app/agent/SseConnectionManager.java` | Per-agent SseEmitter management, event sending, ping | VERIFIED | 158 lines; implements AgentEventListener, reference-equality removal, @PostConstruct registration |
| `cameleer-server-app/src/main/java/com/cameleer/server/app/controller/AgentSseController.java` | GET /{id}/events SSE endpoint | VERIFIED | 67 lines; checks agent exists, delegates to connectionManager.connect() |
| `cameleer-server-app/src/main/java/com/cameleer/server/app/controller/AgentCommandController.java` | POST commands (single/group/broadcast) + ack | VERIFIED | 182 lines; all four endpoints implemented |
### Supporting Artifacts (confirmed present)
@@ -77,7 +77,7 @@ re_verification: false
| `AgentRegistryBeanConfig.java` (@Configuration) | VERIFIED — creates AgentRegistryService with config values |
| `application.yml` | VERIFIED — agent-registry section present; `spring.mvc.async.request-timeout: -1` present |
| `application-test.yml` | VERIFIED — `agent-registry.ping-interval-ms: 1000` for fast SSE test assertions |
| `Cameleer3ServerApplication.java` | VERIFIED — `AgentRegistryConfig.class` added to `@EnableConfigurationProperties` |
| `CameleerServerApplication.java` | VERIFIED — `AgentRegistryConfig.class` added to `@EnableConfigurationProperties` |
---

View File

@@ -5,19 +5,19 @@ type: execute
wave: 1
depends_on: []
files_modified:
- cameleer3-server-app/pom.xml
- cameleer3-server-core/src/main/java/com/cameleer3/server/core/security/JwtService.java
- cameleer3-server-core/src/main/java/com/cameleer3/server/core/security/Ed25519SigningService.java
- cameleer3-server-app/src/main/java/com/cameleer3/server/app/security/JwtServiceImpl.java
- cameleer3-server-app/src/main/java/com/cameleer3/server/app/security/Ed25519SigningServiceImpl.java
- cameleer3-server-app/src/main/java/com/cameleer3/server/app/security/BootstrapTokenValidator.java
- cameleer3-server-app/src/main/java/com/cameleer3/server/app/security/SecurityProperties.java
- cameleer3-server-app/src/main/java/com/cameleer3/server/app/security/SecurityBeanConfig.java
- cameleer3-server-app/src/main/resources/application.yml
- cameleer3-server-app/src/test/resources/application-test.yml
- cameleer3-server-app/src/test/java/com/cameleer3/server/app/security/JwtServiceTest.java
- cameleer3-server-app/src/test/java/com/cameleer3/server/app/security/Ed25519SigningServiceTest.java
- cameleer3-server-app/src/test/java/com/cameleer3/server/app/security/BootstrapTokenValidatorTest.java
- cameleer-server-app/pom.xml
- cameleer-server-core/src/main/java/com/cameleer/server/core/security/JwtService.java
- cameleer-server-core/src/main/java/com/cameleer/server/core/security/Ed25519SigningService.java
- cameleer-server-app/src/main/java/com/cameleer/server/app/security/JwtServiceImpl.java
- cameleer-server-app/src/main/java/com/cameleer/server/app/security/Ed25519SigningServiceImpl.java
- cameleer-server-app/src/main/java/com/cameleer/server/app/security/BootstrapTokenValidator.java
- cameleer-server-app/src/main/java/com/cameleer/server/app/security/SecurityProperties.java
- cameleer-server-app/src/main/java/com/cameleer/server/app/security/SecurityBeanConfig.java
- cameleer-server-app/src/main/resources/application.yml
- cameleer-server-app/src/test/resources/application-test.yml
- cameleer-server-app/src/test/java/com/cameleer/server/app/security/JwtServiceTest.java
- cameleer-server-app/src/test/java/com/cameleer/server/app/security/Ed25519SigningServiceTest.java
- cameleer-server-app/src/test/java/com/cameleer/server/app/security/BootstrapTokenValidatorTest.java
autonomous: true
requirements:
- SECU-03
@@ -31,15 +31,15 @@ must_haves:
- "BootstrapTokenValidator accepts CAMELEER_AUTH_TOKEN and optionally CAMELEER_AUTH_TOKEN_PREVIOUS using constant-time comparison"
- "Server fails fast on startup if CAMELEER_AUTH_TOKEN is not set"
artifacts:
- path: "cameleer3-server-core/src/main/java/com/cameleer3/server/core/security/JwtService.java"
- path: "cameleer-server-core/src/main/java/com/cameleer/server/core/security/JwtService.java"
provides: "JWT service interface with createAccessToken, createRefreshToken, validateAndExtractAgentId"
- path: "cameleer3-server-core/src/main/java/com/cameleer3/server/core/security/Ed25519SigningService.java"
- path: "cameleer-server-core/src/main/java/com/cameleer/server/core/security/Ed25519SigningService.java"
provides: "Ed25519 signing interface with sign(payload) and getPublicKeyBase64()"
- path: "cameleer3-server-app/src/main/java/com/cameleer3/server/app/security/JwtServiceImpl.java"
- path: "cameleer-server-app/src/main/java/com/cameleer/server/app/security/JwtServiceImpl.java"
provides: "Nimbus JOSE+JWT HMAC-SHA256 implementation"
- path: "cameleer3-server-app/src/main/java/com/cameleer3/server/app/security/Ed25519SigningServiceImpl.java"
- path: "cameleer-server-app/src/main/java/com/cameleer/server/app/security/Ed25519SigningServiceImpl.java"
provides: "JDK 17 Ed25519 KeyPairGenerator implementation"
- path: "cameleer3-server-app/src/main/java/com/cameleer3/server/app/security/BootstrapTokenValidator.java"
- path: "cameleer-server-app/src/main/java/com/cameleer/server/app/security/BootstrapTokenValidator.java"
provides: "Constant-time bootstrap token validation with dual-token rotation"
key_links:
- from: "JwtServiceImpl"
@@ -76,10 +76,10 @@ Output: Working JwtService, Ed25519SigningService, BootstrapTokenValidator with
@.planning/phases/04-security/04-RESEARCH.md
@.planning/phases/04-security/04-VALIDATION.md
@cameleer3-server-app/pom.xml
@cameleer3-server-app/src/main/resources/application.yml
@cameleer3-server-app/src/test/resources/application-test.yml
@cameleer3-server-app/src/main/java/com/cameleer3/server/app/config/AgentRegistryConfig.java
@cameleer-server-app/pom.xml
@cameleer-server-app/src/main/resources/application.yml
@cameleer-server-app/src/test/resources/application-test.yml
@cameleer-server-app/src/main/java/com/cameleer/server/app/config/AgentRegistryConfig.java
<interfaces>
<!-- Existing patterns to follow: core module = interfaces/domain, app module = Spring implementations -->
@@ -106,19 +106,19 @@ public class AgentRegistryConfig { ... }
<task type="auto" tdd="true">
<name>Task 1: Core interfaces + app implementations + Maven deps</name>
<files>
cameleer3-server-app/pom.xml,
cameleer3-server-core/src/main/java/com/cameleer3/server/core/security/JwtService.java,
cameleer3-server-core/src/main/java/com/cameleer3/server/core/security/Ed25519SigningService.java,
cameleer3-server-app/src/main/java/com/cameleer3/server/app/security/JwtServiceImpl.java,
cameleer3-server-app/src/main/java/com/cameleer3/server/app/security/Ed25519SigningServiceImpl.java,
cameleer3-server-app/src/main/java/com/cameleer3/server/app/security/BootstrapTokenValidator.java,
cameleer3-server-app/src/main/java/com/cameleer3/server/app/security/SecurityProperties.java,
cameleer3-server-app/src/main/java/com/cameleer3/server/app/security/SecurityBeanConfig.java,
cameleer3-server-app/src/main/resources/application.yml,
cameleer3-server-app/src/test/resources/application-test.yml,
cameleer3-server-app/src/test/java/com/cameleer3/server/app/security/JwtServiceTest.java,
cameleer3-server-app/src/test/java/com/cameleer3/server/app/security/Ed25519SigningServiceTest.java,
cameleer3-server-app/src/test/java/com/cameleer3/server/app/security/BootstrapTokenValidatorTest.java
cameleer-server-app/pom.xml,
cameleer-server-core/src/main/java/com/cameleer/server/core/security/JwtService.java,
cameleer-server-core/src/main/java/com/cameleer/server/core/security/Ed25519SigningService.java,
cameleer-server-app/src/main/java/com/cameleer/server/app/security/JwtServiceImpl.java,
cameleer-server-app/src/main/java/com/cameleer/server/app/security/Ed25519SigningServiceImpl.java,
cameleer-server-app/src/main/java/com/cameleer/server/app/security/BootstrapTokenValidator.java,
cameleer-server-app/src/main/java/com/cameleer/server/app/security/SecurityProperties.java,
cameleer-server-app/src/main/java/com/cameleer/server/app/security/SecurityBeanConfig.java,
cameleer-server-app/src/main/resources/application.yml,
cameleer-server-app/src/test/resources/application-test.yml,
cameleer-server-app/src/test/java/com/cameleer/server/app/security/JwtServiceTest.java,
cameleer-server-app/src/test/java/com/cameleer/server/app/security/Ed25519SigningServiceTest.java,
cameleer-server-app/src/test/java/com/cameleer/server/app/security/BootstrapTokenValidatorTest.java
</files>
<behavior>
JwtService tests:
@@ -145,7 +145,7 @@ public class AgentRegistryConfig { ... }
- Uses constant-time comparison (MessageDigest.isEqual)
</behavior>
<action>
1. Add Maven dependencies to cameleer3-server-app/pom.xml:
1. Add Maven dependencies to cameleer-server-app/pom.xml:
- `spring-boot-starter-security` (managed version)
- `com.nimbusds:nimbus-jose-jwt:9.47` (explicit, may not be transitive without OAuth2 resource server)
- `spring-security-test` scope test (managed version)
@@ -165,12 +165,12 @@ public class AgentRegistryConfig { ... }
5. Update application-test.yml: Add `security.bootstrap-token: test-bootstrap-token`, `security.bootstrap-token-previous: old-bootstrap-token`. Also set `CAMELEER_AUTH_TOKEN: test-bootstrap-token` as an env override if needed.
6. IMPORTANT: Adding spring-boot-starter-security will break ALL existing tests immediately (401 on all endpoints). To prevent this during Plan 01 (before the security filter chain is configured in Plan 02), add a temporary test security config class `src/test/java/com/cameleer3/server/app/security/TestSecurityConfig.java` annotated `@TestConfiguration` that creates a `SecurityFilterChain` permitting all requests. This keeps existing tests green while security services are built. Plan 02 will replace this with real security config and update tests.
6. IMPORTANT: Adding spring-boot-starter-security will break ALL existing tests immediately (401 on all endpoints). To prevent this during Plan 01 (before the security filter chain is configured in Plan 02), add a temporary test security config class `src/test/java/com/cameleer/server/app/security/TestSecurityConfig.java` annotated `@TestConfiguration` that creates a `SecurityFilterChain` permitting all requests. This keeps existing tests green while security services are built. Plan 02 will replace this with real security config and update tests.
7. Write unit tests per the behavior spec above. Tests should NOT require Spring context -- construct implementations directly with test SecurityProperties.
</action>
<verify>
<automated>cd /c/Users/Hendrik/Documents/projects/cameleer3-server && mvn test -pl cameleer3-server-app -Dtest="JwtServiceTest,Ed25519SigningServiceTest,BootstrapTokenValidatorTest" -Dsurefire.reuseForks=false</automated>
<automated>cd /c/Users/Hendrik/Documents/projects/cameleer-server && mvn test -pl cameleer-server-app -Dtest="JwtServiceTest,Ed25519SigningServiceTest,BootstrapTokenValidatorTest" -Dsurefire.reuseForks=false</automated>
</verify>
<done>
- JwtService creates and validates access/refresh JWTs with correct claims and expiry

View File

@@ -25,22 +25,22 @@ tech-stack:
key-files:
created:
- cameleer3-server-core/src/main/java/com/cameleer3/server/core/security/JwtService.java
- cameleer3-server-core/src/main/java/com/cameleer3/server/core/security/Ed25519SigningService.java
- cameleer3-server-core/src/main/java/com/cameleer3/server/core/security/InvalidTokenException.java
- cameleer3-server-app/src/main/java/com/cameleer3/server/app/security/JwtServiceImpl.java
- cameleer3-server-app/src/main/java/com/cameleer3/server/app/security/Ed25519SigningServiceImpl.java
- cameleer3-server-app/src/main/java/com/cameleer3/server/app/security/BootstrapTokenValidator.java
- cameleer3-server-app/src/main/java/com/cameleer3/server/app/security/SecurityProperties.java
- cameleer3-server-app/src/main/java/com/cameleer3/server/app/security/SecurityBeanConfig.java
- cameleer3-server-app/src/test/java/com/cameleer3/server/app/security/TestSecurityConfig.java
- cameleer3-server-app/src/test/java/com/cameleer3/server/app/security/JwtServiceTest.java
- cameleer3-server-app/src/test/java/com/cameleer3/server/app/security/Ed25519SigningServiceTest.java
- cameleer3-server-app/src/test/java/com/cameleer3/server/app/security/BootstrapTokenValidatorTest.java
- cameleer-server-core/src/main/java/com/cameleer/server/core/security/JwtService.java
- cameleer-server-core/src/main/java/com/cameleer/server/core/security/Ed25519SigningService.java
- cameleer-server-core/src/main/java/com/cameleer/server/core/security/InvalidTokenException.java
- cameleer-server-app/src/main/java/com/cameleer/server/app/security/JwtServiceImpl.java
- cameleer-server-app/src/main/java/com/cameleer/server/app/security/Ed25519SigningServiceImpl.java
- cameleer-server-app/src/main/java/com/cameleer/server/app/security/BootstrapTokenValidator.java
- cameleer-server-app/src/main/java/com/cameleer/server/app/security/SecurityProperties.java
- cameleer-server-app/src/main/java/com/cameleer/server/app/security/SecurityBeanConfig.java
- cameleer-server-app/src/test/java/com/cameleer/server/app/security/TestSecurityConfig.java
- cameleer-server-app/src/test/java/com/cameleer/server/app/security/JwtServiceTest.java
- cameleer-server-app/src/test/java/com/cameleer/server/app/security/Ed25519SigningServiceTest.java
- cameleer-server-app/src/test/java/com/cameleer/server/app/security/BootstrapTokenValidatorTest.java
modified:
- cameleer3-server-app/pom.xml
- cameleer3-server-app/src/main/resources/application.yml
- cameleer3-server-app/src/test/resources/application-test.yml
- cameleer-server-app/pom.xml
- cameleer-server-app/src/main/resources/application.yml
- cameleer-server-app/src/test/resources/application-test.yml
key-decisions:
- "HMAC-SHA256 with ephemeral 256-bit secret for JWT signing (simpler than Ed25519 for tokens, Ed25519 reserved for config signing)"
@@ -91,21 +91,21 @@ _No REFACTOR commit needed -- implementations are clean and minimal._
## Files Created/Modified
- `cameleer3-server-core/.../security/JwtService.java` - JWT service interface with create/validate methods
- `cameleer3-server-core/.../security/Ed25519SigningService.java` - Ed25519 signing interface with sign/getPublicKeyBase64
- `cameleer3-server-core/.../security/InvalidTokenException.java` - Runtime exception for invalid/expired/wrong-type tokens
- `cameleer3-server-app/.../security/JwtServiceImpl.java` - Nimbus JOSE+JWT HMAC-SHA256 implementation
- `cameleer3-server-app/.../security/Ed25519SigningServiceImpl.java` - JDK 17 Ed25519 KeyPairGenerator implementation
- `cameleer3-server-app/.../security/BootstrapTokenValidator.java` - Constant-time bootstrap token validation
- `cameleer3-server-app/.../security/SecurityProperties.java` - Config properties for token expiry and bootstrap tokens
- `cameleer3-server-app/.../security/SecurityBeanConfig.java` - Bean wiring with fail-fast startup validation
- `cameleer3-server-app/.../security/TestSecurityConfig.java` - Temporary permit-all for existing test compatibility
- `cameleer3-server-app/pom.xml` - Added nimbus-jose-jwt, spring-boot-starter-security, spring-security-test
- `cameleer3-server-app/.../application.yml` - Security config section with env var mapping
- `cameleer3-server-app/.../application-test.yml` - Test bootstrap token values
- `cameleer3-server-app/.../security/JwtServiceTest.java` - 7 unit tests for JWT creation/validation
- `cameleer3-server-app/.../security/Ed25519SigningServiceTest.java` - 5 unit tests for signing/verification
- `cameleer3-server-app/.../security/BootstrapTokenValidatorTest.java` - 6 unit tests for token matching
- `cameleer-server-core/.../security/JwtService.java` - JWT service interface with create/validate methods
- `cameleer-server-core/.../security/Ed25519SigningService.java` - Ed25519 signing interface with sign/getPublicKeyBase64
- `cameleer-server-core/.../security/InvalidTokenException.java` - Runtime exception for invalid/expired/wrong-type tokens
- `cameleer-server-app/.../security/JwtServiceImpl.java` - Nimbus JOSE+JWT HMAC-SHA256 implementation
- `cameleer-server-app/.../security/Ed25519SigningServiceImpl.java` - JDK 17 Ed25519 KeyPairGenerator implementation
- `cameleer-server-app/.../security/BootstrapTokenValidator.java` - Constant-time bootstrap token validation
- `cameleer-server-app/.../security/SecurityProperties.java` - Config properties for token expiry and bootstrap tokens
- `cameleer-server-app/.../security/SecurityBeanConfig.java` - Bean wiring with fail-fast startup validation
- `cameleer-server-app/.../security/TestSecurityConfig.java` - Temporary permit-all for existing test compatibility
- `cameleer-server-app/pom.xml` - Added nimbus-jose-jwt, spring-boot-starter-security, spring-security-test
- `cameleer-server-app/.../application.yml` - Security config section with env var mapping
- `cameleer-server-app/.../application-test.yml` - Test bootstrap token values
- `cameleer-server-app/.../security/JwtServiceTest.java` - 7 unit tests for JWT creation/validation
- `cameleer-server-app/.../security/Ed25519SigningServiceTest.java` - 5 unit tests for signing/verification
- `cameleer-server-app/.../security/BootstrapTokenValidatorTest.java` - 6 unit tests for token matching
## Decisions Made

View File

@@ -5,17 +5,17 @@ type: execute
wave: 2
depends_on: ["04-01"]
files_modified:
- cameleer3-server-app/src/main/java/com/cameleer3/server/app/security/JwtAuthenticationFilter.java
- cameleer3-server-app/src/main/java/com/cameleer3/server/app/security/SecurityConfig.java
- cameleer3-server-app/src/main/java/com/cameleer3/server/app/controller/AgentRegistrationController.java
- cameleer3-server-app/src/main/java/com/cameleer3/server/app/controller/AgentSseController.java
- cameleer3-server-app/src/main/java/com/cameleer3/server/app/config/WebConfig.java
- cameleer3-server-app/src/test/java/com/cameleer3/server/app/security/SecurityFilterIT.java
- cameleer3-server-app/src/test/java/com/cameleer3/server/app/security/JwtRefreshIT.java
- cameleer3-server-app/src/test/java/com/cameleer3/server/app/security/RegistrationSecurityIT.java
- cameleer3-server-app/src/test/java/com/cameleer3/server/app/security/BootstrapTokenIT.java
- cameleer3-server-app/src/test/java/com/cameleer3/server/app/TestSecurityHelper.java
- cameleer3-server-app/src/test/java/com/cameleer3/server/app/security/TestSecurityConfig.java
- cameleer-server-app/src/main/java/com/cameleer/server/app/security/JwtAuthenticationFilter.java
- cameleer-server-app/src/main/java/com/cameleer/server/app/security/SecurityConfig.java
- cameleer-server-app/src/main/java/com/cameleer/server/app/controller/AgentRegistrationController.java
- cameleer-server-app/src/main/java/com/cameleer/server/app/controller/AgentSseController.java
- cameleer-server-app/src/main/java/com/cameleer/server/app/config/WebConfig.java
- cameleer-server-app/src/test/java/com/cameleer/server/app/security/SecurityFilterIT.java
- cameleer-server-app/src/test/java/com/cameleer/server/app/security/JwtRefreshIT.java
- cameleer-server-app/src/test/java/com/cameleer/server/app/security/RegistrationSecurityIT.java
- cameleer-server-app/src/test/java/com/cameleer/server/app/security/BootstrapTokenIT.java
- cameleer-server-app/src/test/java/com/cameleer/server/app/TestSecurityHelper.java
- cameleer-server-app/src/test/java/com/cameleer/server/app/security/TestSecurityConfig.java
autonomous: true
requirements:
- SECU-01
@@ -30,11 +30,11 @@ must_haves:
- "SSE endpoint accepts JWT via ?token= query parameter"
- "Health endpoint and Swagger UI remain publicly accessible"
artifacts:
- path: "cameleer3-server-app/src/main/java/com/cameleer3/server/app/security/JwtAuthenticationFilter.java"
- path: "cameleer-server-app/src/main/java/com/cameleer/server/app/security/JwtAuthenticationFilter.java"
provides: "OncePerRequestFilter extracting JWT from header or query param"
- path: "cameleer3-server-app/src/main/java/com/cameleer3/server/app/security/SecurityConfig.java"
- path: "cameleer-server-app/src/main/java/com/cameleer/server/app/security/SecurityConfig.java"
provides: "SecurityFilterChain with permitAll for public paths, authenticated for rest"
- path: "cameleer3-server-app/src/main/java/com/cameleer3/server/app/controller/AgentRegistrationController.java"
- path: "cameleer-server-app/src/main/java/com/cameleer/server/app/controller/AgentRegistrationController.java"
provides: "Updated register endpoint with bootstrap token validation, JWT issuance, public key"
key_links:
- from: "JwtAuthenticationFilter"
@@ -76,11 +76,11 @@ Output: Working security filter chain with protected/public endpoints, registrat
@.planning/phases/04-security/04-VALIDATION.md
@.planning/phases/04-security/04-01-SUMMARY.md
@cameleer3-server-app/src/main/java/com/cameleer3/server/app/controller/AgentRegistrationController.java
@cameleer3-server-app/src/main/java/com/cameleer3/server/app/controller/AgentSseController.java
@cameleer3-server-app/src/main/java/com/cameleer3/server/app/config/WebConfig.java
@cameleer3-server-app/src/test/java/com/cameleer3/server/app/AbstractClickHouseIT.java
@cameleer3-server-app/src/test/java/com/cameleer3/server/app/controller/AgentRegistrationControllerIT.java
@cameleer-server-app/src/main/java/com/cameleer/server/app/controller/AgentRegistrationController.java
@cameleer-server-app/src/main/java/com/cameleer/server/app/controller/AgentSseController.java
@cameleer-server-app/src/main/java/com/cameleer/server/app/config/WebConfig.java
@cameleer-server-app/src/test/java/com/cameleer/server/app/AbstractClickHouseIT.java
@cameleer-server-app/src/test/java/com/cameleer/server/app/controller/AgentRegistrationControllerIT.java
<interfaces>
<!-- From Plan 01 (will exist after execution): -->
@@ -136,11 +136,11 @@ public class AgentRegistryService {
<task type="auto">
<name>Task 1: SecurityFilterChain + JwtAuthenticationFilter + registration/refresh integration</name>
<files>
cameleer3-server-app/src/main/java/com/cameleer3/server/app/security/JwtAuthenticationFilter.java,
cameleer3-server-app/src/main/java/com/cameleer3/server/app/security/SecurityConfig.java,
cameleer3-server-app/src/main/java/com/cameleer3/server/app/controller/AgentRegistrationController.java,
cameleer3-server-app/src/main/java/com/cameleer3/server/app/controller/AgentSseController.java,
cameleer3-server-app/src/main/java/com/cameleer3/server/app/config/WebConfig.java
cameleer-server-app/src/main/java/com/cameleer/server/app/security/JwtAuthenticationFilter.java,
cameleer-server-app/src/main/java/com/cameleer/server/app/security/SecurityConfig.java,
cameleer-server-app/src/main/java/com/cameleer/server/app/controller/AgentRegistrationController.java,
cameleer-server-app/src/main/java/com/cameleer/server/app/controller/AgentSseController.java,
cameleer-server-app/src/main/java/com/cameleer/server/app/config/WebConfig.java
</files>
<action>
1. Create `JwtAuthenticationFilter extends OncePerRequestFilter` (NOT annotated @Component -- constructed in SecurityConfig to avoid double registration):
@@ -176,7 +176,7 @@ public class AgentRegistryService {
6. Update `WebConfig` if needed: The `ProtocolVersionInterceptor` excluded paths should align with Spring Security public paths. The SSE events path is already excluded from protocol version check (Phase 3 decision). Verify no conflicts.
</action>
<verify>
<automated>cd /c/Users/Hendrik/Documents/projects/cameleer3-server && mvn clean compile -pl cameleer3-server-app</automated>
<automated>cd /c/Users/Hendrik/Documents/projects/cameleer-server && mvn clean compile -pl cameleer-server-app</automated>
</verify>
<done>
- SecurityConfig creates stateless filter chain with correct public/protected path split
@@ -190,28 +190,28 @@ public class AgentRegistryService {
<task type="auto">
<name>Task 2: Security integration tests + existing test adaptation</name>
<files>
cameleer3-server-app/src/test/java/com/cameleer3/server/app/TestSecurityHelper.java,
cameleer3-server-app/src/test/java/com/cameleer3/server/app/security/TestSecurityConfig.java,
cameleer3-server-app/src/test/java/com/cameleer3/server/app/security/SecurityFilterIT.java,
cameleer3-server-app/src/test/java/com/cameleer3/server/app/security/JwtRefreshIT.java,
cameleer3-server-app/src/test/java/com/cameleer3/server/app/security/RegistrationSecurityIT.java,
cameleer3-server-app/src/test/java/com/cameleer3/server/app/security/BootstrapTokenIT.java,
cameleer3-server-app/src/test/java/com/cameleer3/server/app/controller/AgentRegistrationControllerIT.java,
cameleer3-server-app/src/test/java/com/cameleer3/server/app/controller/ExecutionControllerIT.java,
cameleer3-server-app/src/test/java/com/cameleer3/server/app/controller/DiagramControllerIT.java,
cameleer3-server-app/src/test/java/com/cameleer3/server/app/controller/MetricsControllerIT.java,
cameleer3-server-app/src/test/java/com/cameleer3/server/app/controller/BackpressureIT.java,
cameleer3-server-app/src/test/java/com/cameleer3/server/app/controller/DiagramRenderControllerIT.java,
cameleer3-server-app/src/test/java/com/cameleer3/server/app/controller/DetailControllerIT.java,
cameleer3-server-app/src/test/java/com/cameleer3/server/app/controller/SearchControllerIT.java,
cameleer3-server-app/src/test/java/com/cameleer3/server/app/controller/AgentCommandControllerIT.java,
cameleer3-server-app/src/test/java/com/cameleer3/server/app/controller/AgentSseControllerIT.java,
cameleer3-server-app/src/test/java/com/cameleer3/server/app/storage/DiagramLinkingIT.java,
cameleer3-server-app/src/test/java/com/cameleer3/server/app/storage/IngestionSchemaIT.java,
cameleer3-server-app/src/test/java/com/cameleer3/server/app/interceptor/ProtocolVersionIT.java,
cameleer3-server-app/src/test/java/com/cameleer3/server/app/controller/OpenApiIT.java,
cameleer3-server-app/src/test/java/com/cameleer3/server/app/controller/ForwardCompatIT.java,
cameleer3-server-app/src/test/java/com/cameleer3/server/app/controller/HealthControllerIT.java
cameleer-server-app/src/test/java/com/cameleer/server/app/TestSecurityHelper.java,
cameleer-server-app/src/test/java/com/cameleer/server/app/security/TestSecurityConfig.java,
cameleer-server-app/src/test/java/com/cameleer/server/app/security/SecurityFilterIT.java,
cameleer-server-app/src/test/java/com/cameleer/server/app/security/JwtRefreshIT.java,
cameleer-server-app/src/test/java/com/cameleer/server/app/security/RegistrationSecurityIT.java,
cameleer-server-app/src/test/java/com/cameleer/server/app/security/BootstrapTokenIT.java,
cameleer-server-app/src/test/java/com/cameleer/server/app/controller/AgentRegistrationControllerIT.java,
cameleer-server-app/src/test/java/com/cameleer/server/app/controller/ExecutionControllerIT.java,
cameleer-server-app/src/test/java/com/cameleer/server/app/controller/DiagramControllerIT.java,
cameleer-server-app/src/test/java/com/cameleer/server/app/controller/MetricsControllerIT.java,
cameleer-server-app/src/test/java/com/cameleer/server/app/controller/BackpressureIT.java,
cameleer-server-app/src/test/java/com/cameleer/server/app/controller/DiagramRenderControllerIT.java,
cameleer-server-app/src/test/java/com/cameleer/server/app/controller/DetailControllerIT.java,
cameleer-server-app/src/test/java/com/cameleer/server/app/controller/SearchControllerIT.java,
cameleer-server-app/src/test/java/com/cameleer/server/app/controller/AgentCommandControllerIT.java,
cameleer-server-app/src/test/java/com/cameleer/server/app/controller/AgentSseControllerIT.java,
cameleer-server-app/src/test/java/com/cameleer/server/app/storage/DiagramLinkingIT.java,
cameleer-server-app/src/test/java/com/cameleer/server/app/storage/IngestionSchemaIT.java,
cameleer-server-app/src/test/java/com/cameleer/server/app/interceptor/ProtocolVersionIT.java,
cameleer-server-app/src/test/java/com/cameleer/server/app/controller/OpenApiIT.java,
cameleer-server-app/src/test/java/com/cameleer/server/app/controller/ForwardCompatIT.java,
cameleer-server-app/src/test/java/com/cameleer/server/app/controller/HealthControllerIT.java
</files>
<action>
1. Replace the Plan 01 temporary `TestSecurityConfig` (permit-all) with real security active in tests. Remove the permit-all override so tests run with actual security enforcement.
@@ -259,7 +259,7 @@ public class AgentRegistryService {
- Test: New access token from refresh can access protected endpoints
</action>
<verify>
<automated>cd /c/Users/Hendrik/Documents/projects/cameleer3-server && mvn clean verify</automated>
<automated>cd /c/Users/Hendrik/Documents/projects/cameleer-server && mvn clean verify</automated>
</verify>
<done>
- All 17 existing ITs pass with JWT authentication

View File

@@ -25,31 +25,31 @@ tech-stack:
key-files:
created:
- cameleer3-server-app/src/main/java/com/cameleer3/server/app/security/JwtAuthenticationFilter.java
- cameleer3-server-app/src/main/java/com/cameleer3/server/app/security/SecurityConfig.java
- cameleer3-server-app/src/test/java/com/cameleer3/server/app/TestSecurityHelper.java
- cameleer3-server-app/src/test/java/com/cameleer3/server/app/security/SecurityFilterIT.java
- cameleer3-server-app/src/test/java/com/cameleer3/server/app/security/BootstrapTokenIT.java
- cameleer3-server-app/src/test/java/com/cameleer3/server/app/security/RegistrationSecurityIT.java
- cameleer3-server-app/src/test/java/com/cameleer3/server/app/security/JwtRefreshIT.java
- cameleer-server-app/src/main/java/com/cameleer/server/app/security/JwtAuthenticationFilter.java
- cameleer-server-app/src/main/java/com/cameleer/server/app/security/SecurityConfig.java
- cameleer-server-app/src/test/java/com/cameleer/server/app/TestSecurityHelper.java
- cameleer-server-app/src/test/java/com/cameleer/server/app/security/SecurityFilterIT.java
- cameleer-server-app/src/test/java/com/cameleer/server/app/security/BootstrapTokenIT.java
- cameleer-server-app/src/test/java/com/cameleer/server/app/security/RegistrationSecurityIT.java
- cameleer-server-app/src/test/java/com/cameleer/server/app/security/JwtRefreshIT.java
modified:
- cameleer3-server-app/src/main/java/com/cameleer3/server/app/controller/AgentRegistrationController.java
- cameleer3-server-app/src/main/java/com/cameleer3/server/app/config/WebConfig.java
- cameleer3-server-app/src/test/java/com/cameleer3/server/app/security/TestSecurityConfig.java
- cameleer3-server-app/src/test/java/com/cameleer3/server/app/controller/AgentRegistrationControllerIT.java
- cameleer3-server-app/src/test/java/com/cameleer3/server/app/controller/ExecutionControllerIT.java
- cameleer3-server-app/src/test/java/com/cameleer3/server/app/controller/DiagramControllerIT.java
- cameleer3-server-app/src/test/java/com/cameleer3/server/app/controller/MetricsControllerIT.java
- cameleer3-server-app/src/test/java/com/cameleer3/server/app/controller/BackpressureIT.java
- cameleer3-server-app/src/test/java/com/cameleer3/server/app/controller/DiagramRenderControllerIT.java
- cameleer3-server-app/src/test/java/com/cameleer3/server/app/controller/DetailControllerIT.java
- cameleer3-server-app/src/test/java/com/cameleer3/server/app/controller/SearchControllerIT.java
- cameleer3-server-app/src/test/java/com/cameleer3/server/app/controller/AgentCommandControllerIT.java
- cameleer3-server-app/src/test/java/com/cameleer3/server/app/controller/AgentSseControllerIT.java
- cameleer3-server-app/src/test/java/com/cameleer3/server/app/storage/DiagramLinkingIT.java
- cameleer3-server-app/src/test/java/com/cameleer3/server/app/storage/IngestionSchemaIT.java
- cameleer3-server-app/src/test/java/com/cameleer3/server/app/interceptor/ProtocolVersionIT.java
- cameleer3-server-app/src/test/java/com/cameleer3/server/app/controller/ForwardCompatIT.java
- cameleer-server-app/src/main/java/com/cameleer/server/app/controller/AgentRegistrationController.java
- cameleer-server-app/src/main/java/com/cameleer/server/app/config/WebConfig.java
- cameleer-server-app/src/test/java/com/cameleer/server/app/security/TestSecurityConfig.java
- cameleer-server-app/src/test/java/com/cameleer/server/app/controller/AgentRegistrationControllerIT.java
- cameleer-server-app/src/test/java/com/cameleer/server/app/controller/ExecutionControllerIT.java
- cameleer-server-app/src/test/java/com/cameleer/server/app/controller/DiagramControllerIT.java
- cameleer-server-app/src/test/java/com/cameleer/server/app/controller/MetricsControllerIT.java
- cameleer-server-app/src/test/java/com/cameleer/server/app/controller/BackpressureIT.java
- cameleer-server-app/src/test/java/com/cameleer/server/app/controller/DiagramRenderControllerIT.java
- cameleer-server-app/src/test/java/com/cameleer/server/app/controller/DetailControllerIT.java
- cameleer-server-app/src/test/java/com/cameleer/server/app/controller/SearchControllerIT.java
- cameleer-server-app/src/test/java/com/cameleer/server/app/controller/AgentCommandControllerIT.java
- cameleer-server-app/src/test/java/com/cameleer/server/app/controller/AgentSseControllerIT.java
- cameleer-server-app/src/test/java/com/cameleer/server/app/storage/DiagramLinkingIT.java
- cameleer-server-app/src/test/java/com/cameleer/server/app/storage/IngestionSchemaIT.java
- cameleer-server-app/src/test/java/com/cameleer/server/app/interceptor/ProtocolVersionIT.java
- cameleer-server-app/src/test/java/com/cameleer/server/app/controller/ForwardCompatIT.java
key-decisions:
- "Added /error to SecurityConfig permitAll to allow Spring Boot error page forwarding through security"

View File

@@ -5,10 +5,10 @@ type: execute
wave: 2
depends_on: ["04-01"]
files_modified:
- cameleer3-server-app/src/main/java/com/cameleer3/server/app/agent/SseConnectionManager.java
- cameleer3-server-app/src/main/java/com/cameleer3/server/app/agent/SsePayloadSigner.java
- cameleer3-server-app/src/test/java/com/cameleer3/server/app/security/SseSigningIT.java
- cameleer3-server-app/src/test/java/com/cameleer3/server/app/agent/SsePayloadSignerTest.java
- cameleer-server-app/src/main/java/com/cameleer/server/app/agent/SseConnectionManager.java
- cameleer-server-app/src/main/java/com/cameleer/server/app/agent/SsePayloadSigner.java
- cameleer-server-app/src/test/java/com/cameleer/server/app/security/SseSigningIT.java
- cameleer-server-app/src/test/java/com/cameleer/server/app/agent/SsePayloadSignerTest.java
autonomous: true
requirements:
- SECU-04
@@ -19,9 +19,9 @@ must_haves:
- "Signature is computed over the payload JSON without the signature field, then added as a 'signature' field"
- "Agent can verify the signature using the public key received at registration"
artifacts:
- path: "cameleer3-server-app/src/main/java/com/cameleer3/server/app/agent/SsePayloadSigner.java"
- path: "cameleer-server-app/src/main/java/com/cameleer/server/app/agent/SsePayloadSigner.java"
provides: "Component that signs SSE command payloads before delivery"
- path: "cameleer3-server-app/src/main/java/com/cameleer3/server/app/agent/SseConnectionManager.java"
- path: "cameleer-server-app/src/main/java/com/cameleer/server/app/agent/SseConnectionManager.java"
provides: "Updated onCommandReady with signing before sendEvent"
key_links:
- from: "SseConnectionManager.onCommandReady"
@@ -54,7 +54,7 @@ Output: All SSE command events carry verifiable Ed25519 signatures.
@.planning/phases/04-security/04-RESEARCH.md
@.planning/phases/04-security/04-01-SUMMARY.md
@cameleer3-server-app/src/main/java/com/cameleer3/server/app/agent/SseConnectionManager.java
@cameleer-server-app/src/main/java/com/cameleer/server/app/agent/SseConnectionManager.java
<interfaces>
<!-- From Plan 01 (will exist after execution): -->
@@ -98,10 +98,10 @@ public record AgentCommand(String id, CommandType type, String payload, String a
<task type="auto" tdd="true">
<name>Task 1: SsePayloadSigner + signing integration in SseConnectionManager</name>
<files>
cameleer3-server-app/src/main/java/com/cameleer3/server/app/agent/SsePayloadSigner.java,
cameleer3-server-app/src/main/java/com/cameleer3/server/app/agent/SseConnectionManager.java,
cameleer3-server-app/src/test/java/com/cameleer3/server/app/agent/SsePayloadSignerTest.java,
cameleer3-server-app/src/test/java/com/cameleer3/server/app/security/SseSigningIT.java
cameleer-server-app/src/main/java/com/cameleer/server/app/agent/SsePayloadSigner.java,
cameleer-server-app/src/main/java/com/cameleer/server/app/agent/SseConnectionManager.java,
cameleer-server-app/src/test/java/com/cameleer/server/app/agent/SsePayloadSignerTest.java,
cameleer-server-app/src/test/java/com/cameleer/server/app/security/SseSigningIT.java
</files>
<behavior>
SsePayloadSigner unit tests:
@@ -155,7 +155,7 @@ public record AgentCommand(String id, CommandType type, String payload, String a
- NOTE: This test depends on Plan 02's bootstrap token and JWT auth being in place. If Plan 03 executes before Plan 02, the test will need the TestSecurityHelper or a different auth approach. Since both are Wave 2 but independent, document this: "If Plan 02 is not yet complete, use TestSecurityHelper from Plan 01's temporary permit-all config."
</action>
<verify>
<automated>cd /c/Users/Hendrik/Documents/projects/cameleer3-server && mvn test -pl cameleer3-server-app -Dtest="SsePayloadSignerTest,SseSigningIT" -Dsurefire.reuseForks=false</automated>
<automated>cd /c/Users/Hendrik/Documents/projects/cameleer-server && mvn test -pl cameleer-server-app -Dtest="SsePayloadSignerTest,SseSigningIT" -Dsurefire.reuseForks=false</automated>
</verify>
<done>
- SsePayloadSigner signs JSON payloads with Ed25519 and adds signature field

View File

@@ -22,11 +22,11 @@ tech-stack:
key-files:
created:
- cameleer3-server-app/src/main/java/com/cameleer3/server/app/agent/SsePayloadSigner.java
- cameleer3-server-app/src/test/java/com/cameleer3/server/app/agent/SsePayloadSignerTest.java
- cameleer3-server-app/src/test/java/com/cameleer3/server/app/security/SseSigningIT.java
- cameleer-server-app/src/main/java/com/cameleer/server/app/agent/SsePayloadSigner.java
- cameleer-server-app/src/test/java/com/cameleer/server/app/agent/SsePayloadSignerTest.java
- cameleer-server-app/src/test/java/com/cameleer/server/app/security/SseSigningIT.java
modified:
- cameleer3-server-app/src/main/java/com/cameleer3/server/app/agent/SseConnectionManager.java
- cameleer-server-app/src/main/java/com/cameleer/server/app/agent/SseConnectionManager.java
key-decisions:
- "Signed payload parsed to JsonNode before passing to SseEmitter to avoid double-quoting raw JSON strings"
@@ -73,10 +73,10 @@ _No REFACTOR commit needed -- implementation is clean and minimal._
## Files Created/Modified
- `cameleer3-server-app/.../agent/SsePayloadSigner.java` - Component that signs JSON payloads with Ed25519 and adds signature field
- `cameleer3-server-app/.../agent/SseConnectionManager.java` - Updated onCommandReady to sign payload before SSE delivery
- `cameleer3-server-app/.../agent/SsePayloadSignerTest.java` - 7 unit tests for signing behavior and edge cases
- `cameleer3-server-app/.../security/SseSigningIT.java` - 2 integration tests for end-to-end signature verification
- `cameleer-server-app/.../agent/SsePayloadSigner.java` - Component that signs JSON payloads with Ed25519 and adds signature field
- `cameleer-server-app/.../agent/SseConnectionManager.java` - Updated onCommandReady to sign payload before SSE delivery
- `cameleer-server-app/.../agent/SsePayloadSignerTest.java` - 7 unit tests for signing behavior and edge cases
- `cameleer-server-app/.../security/SseSigningIT.java` - 2 integration tests for end-to-end signature verification
## Decisions Made

View File

@@ -6,7 +6,7 @@
## Summary
This phase adds authentication and integrity protection to the Cameleer3 server. The implementation uses Spring Security 6.4.3 (managed by Spring Boot 3.4.3) with a custom `OncePerRequestFilter` for JWT validation, JDK 17 built-in Ed25519 for signing SSE payloads, and environment variable-based bootstrap tokens for agent registration. The approach is deliberately simple -- no OAuth2 resource server, no external identity provider, just symmetric HMAC JWTs for access control and Ed25519 signatures for payload integrity.
This phase adds authentication and integrity protection to the Cameleer server. The implementation uses Spring Security 6.4.3 (managed by Spring Boot 3.4.3) with a custom `OncePerRequestFilter` for JWT validation, JDK 17 built-in Ed25519 for signing SSE payloads, and environment variable-based bootstrap tokens for agent registration. The approach is deliberately simple -- no OAuth2 resource server, no external identity provider, just symmetric HMAC JWTs for access control and Ed25519 signatures for payload integrity.
The existing codebase has clear integration points: `AgentRegistrationController.register()` already returns `serverPublicKey: null` as a placeholder, `SseConnectionManager.onCommandReady()` is the signing hook for SSE events, and `WebConfig` already defines excluded paths that align with the public endpoint list. Spring Security's `SecurityFilterChain` replaces the need for hand-rolled authorization logic -- endpoints are protected by default, with explicit `permitAll()` for health, register, and docs.
@@ -89,7 +89,7 @@ The existing codebase has clear integration points: `AgentRegistrationController
- **Ed25519 library:** Use JDK built-in. Zero external dependencies, native performance, well-tested in JDK 17+.
- **Refresh token storage:** Use stateless signed refresh tokens (also HMAC-signed JWTs with different claims/expiry). This avoids any in-memory storage for refresh tokens and scales naturally. The refresh token is just a JWT with `type=refresh`, `sub=agentId`, and 7-day expiry. On refresh, validate the refresh JWT, check agent still exists, issue new access JWT.
**Installation (add to cameleer3-server-app pom.xml):**
**Installation (add to cameleer-server-app pom.xml):**
```xml
<dependency>
<groupId>org.springframework.boot</groupId>
@@ -108,12 +108,12 @@ Note: If `spring-boot-starter-security` brings Nimbus transitively (via `spring-
### Recommended Project Structure
```
cameleer3-server-core/src/main/java/com/cameleer3/server/core/
cameleer-server-core/src/main/java/com/cameleer/server/core/
security/
JwtService.java # Interface: createAccessToken, createRefreshToken, validateToken, extractAgentId
Ed25519SigningService.java # Interface: sign(payload) -> signature, getPublicKeyBase64()
cameleer3-server-app/src/main/java/com/cameleer3/server/app/
cameleer-server-app/src/main/java/com/cameleer/server/app/
security/
JwtServiceImpl.java # Nimbus JOSE+JWT HMAC implementation
Ed25519SigningServiceImpl.java # JDK Ed25519 keypair + signing implementation
@@ -439,23 +439,23 @@ public boolean validateBootstrapToken(String provided) {
| Property | Value |
|----------|-------|
| Framework | JUnit 5 + Spring Boot Test (spring-boot-starter-test) |
| Config file | `cameleer3-server-app/src/test/resources/application-test.yml` |
| Quick run command | `mvn test -pl cameleer3-server-app -Dtest=Security*Test -Dsurefire.reuseForks=false` |
| Config file | `cameleer-server-app/src/test/resources/application-test.yml` |
| Quick run command | `mvn test -pl cameleer-server-app -Dtest=Security*Test -Dsurefire.reuseForks=false` |
| Full suite command | `mvn clean verify` |
### Phase Requirements to Test Map
| Req ID | Behavior | Test Type | Automated Command | File Exists? |
|--------|----------|-----------|-------------------|-------------|
| SECU-01 | Protected endpoints reject requests without JWT; public endpoints accessible | integration | `mvn test -pl cameleer3-server-app -Dtest=SecurityFilterIT -Dsurefire.reuseForks=false` | No -- Wave 0 |
| SECU-02 | Refresh endpoint issues new access JWT from valid refresh token | integration | `mvn test -pl cameleer3-server-app -Dtest=JwtRefreshIT -Dsurefire.reuseForks=false` | No -- Wave 0 |
| SECU-03 | Ed25519 keypair generated at startup; public key in registration response | integration | `mvn test -pl cameleer3-server-app -Dtest=RegistrationSecurityIT -Dsurefire.reuseForks=false` | No -- Wave 0 |
| SECU-04 | SSE payloads carry valid Ed25519 signature | integration | `mvn test -pl cameleer3-server-app -Dtest=SseSigningIT -Dsurefire.reuseForks=false` | No -- Wave 0 |
| SECU-05 | Bootstrap token required for registration; rejects invalid/missing tokens | integration | `mvn test -pl cameleer3-server-app -Dtest=BootstrapTokenIT -Dsurefire.reuseForks=false` | No -- Wave 0 |
| N/A | JWT creation, validation, expiry logic | unit | `mvn test -pl cameleer3-server-app -Dtest=JwtServiceTest -Dsurefire.reuseForks=false` | No -- Wave 0 |
| N/A | Ed25519 signing and verification roundtrip | unit | `mvn test -pl cameleer3-server-app -Dtest=Ed25519SigningServiceTest -Dsurefire.reuseForks=false` | No -- Wave 0 |
| SECU-01 | Protected endpoints reject requests without JWT; public endpoints accessible | integration | `mvn test -pl cameleer-server-app -Dtest=SecurityFilterIT -Dsurefire.reuseForks=false` | No -- Wave 0 |
| SECU-02 | Refresh endpoint issues new access JWT from valid refresh token | integration | `mvn test -pl cameleer-server-app -Dtest=JwtRefreshIT -Dsurefire.reuseForks=false` | No -- Wave 0 |
| SECU-03 | Ed25519 keypair generated at startup; public key in registration response | integration | `mvn test -pl cameleer-server-app -Dtest=RegistrationSecurityIT -Dsurefire.reuseForks=false` | No -- Wave 0 |
| SECU-04 | SSE payloads carry valid Ed25519 signature | integration | `mvn test -pl cameleer-server-app -Dtest=SseSigningIT -Dsurefire.reuseForks=false` | No -- Wave 0 |
| SECU-05 | Bootstrap token required for registration; rejects invalid/missing tokens | integration | `mvn test -pl cameleer-server-app -Dtest=BootstrapTokenIT -Dsurefire.reuseForks=false` | No -- Wave 0 |
| N/A | JWT creation, validation, expiry logic | unit | `mvn test -pl cameleer-server-app -Dtest=JwtServiceTest -Dsurefire.reuseForks=false` | No -- Wave 0 |
| N/A | Ed25519 signing and verification roundtrip | unit | `mvn test -pl cameleer-server-app -Dtest=Ed25519SigningServiceTest -Dsurefire.reuseForks=false` | No -- Wave 0 |
### Sampling Rate
- **Per task commit:** `mvn test -pl cameleer3-server-app -Dsurefire.reuseForks=false`
- **Per task commit:** `mvn test -pl cameleer-server-app -Dsurefire.reuseForks=false`
- **Per wave merge:** `mvn clean verify`
- **Phase gate:** Full suite green before `/gsd:verify-work`

View File

@@ -18,8 +18,8 @@ created: 2026-03-11
| Property | Value |
|----------|-------|
| **Framework** | JUnit 5 + Spring Boot Test + Spring Security Test |
| **Config file** | cameleer3-server-app/src/test/resources/application-test.yml |
| **Quick run command** | `mvn test -pl cameleer3-server-app -Dtest="Security*,Jwt*,Bootstrap*,Ed25519*" -Dsurefire.reuseForks=false` |
| **Config file** | cameleer-server-app/src/test/resources/application-test.yml |
| **Quick run command** | `mvn test -pl cameleer-server-app -Dtest="Security*,Jwt*,Bootstrap*,Ed25519*" -Dsurefire.reuseForks=false` |
| **Full suite command** | `mvn clean verify` |
| **Estimated runtime** | ~60 seconds |
@@ -27,7 +27,7 @@ created: 2026-03-11
## Sampling Rate
- **After every task commit:** Run `mvn test -pl cameleer3-server-app -Dsurefire.reuseForks=false`
- **After every task commit:** Run `mvn test -pl cameleer-server-app -Dsurefire.reuseForks=false`
- **After every plan wave:** Run `mvn clean verify`
- **Before `/gsd:verify-work`:** Full suite must be green
- **Max feedback latency:** 60 seconds
@@ -38,13 +38,13 @@ created: 2026-03-11
| Task ID | Plan | Wave | Requirement | Test Type | Automated Command | File Exists | Status |
|---------|------|------|-------------|-----------|-------------------|-------------|--------|
| 04-01-01 | 01 | 1 | SECU-03 | unit | `mvn test -pl cameleer3-server-app -Dtest=Ed25519SigningServiceTest -Dsurefire.reuseForks=false` | ❌ W0 | ⬜ pending |
| 04-01-02 | 01 | 1 | SECU-01 | unit | `mvn test -pl cameleer3-server-app -Dtest=JwtServiceTest -Dsurefire.reuseForks=false` | ❌ W0 | ⬜ pending |
| 04-01-03 | 01 | 1 | SECU-05 | integration | `mvn test -pl cameleer3-server-app -Dtest=BootstrapTokenIT -Dsurefire.reuseForks=false` | ❌ W0 | ⬜ pending |
| 04-01-04 | 01 | 1 | SECU-01 | integration | `mvn test -pl cameleer3-server-app -Dtest=SecurityFilterIT -Dsurefire.reuseForks=false` | ❌ W0 | ⬜ pending |
| 04-01-05 | 01 | 1 | SECU-02 | integration | `mvn test -pl cameleer3-server-app -Dtest=JwtRefreshIT -Dsurefire.reuseForks=false` | ❌ W0 | ⬜ pending |
| 04-01-06 | 01 | 1 | SECU-04 | integration | `mvn test -pl cameleer3-server-app -Dtest=SseSigningIT -Dsurefire.reuseForks=false` | ❌ W0 | ⬜ pending |
| 04-01-07 | 01 | 1 | N/A | integration | `mvn test -pl cameleer3-server-app -Dtest=RegistrationSecurityIT -Dsurefire.reuseForks=false` | ❌ W0 | ⬜ pending |
| 04-01-01 | 01 | 1 | SECU-03 | unit | `mvn test -pl cameleer-server-app -Dtest=Ed25519SigningServiceTest -Dsurefire.reuseForks=false` | ❌ W0 | ⬜ pending |
| 04-01-02 | 01 | 1 | SECU-01 | unit | `mvn test -pl cameleer-server-app -Dtest=JwtServiceTest -Dsurefire.reuseForks=false` | ❌ W0 | ⬜ pending |
| 04-01-03 | 01 | 1 | SECU-05 | integration | `mvn test -pl cameleer-server-app -Dtest=BootstrapTokenIT -Dsurefire.reuseForks=false` | ❌ W0 | ⬜ pending |
| 04-01-04 | 01 | 1 | SECU-01 | integration | `mvn test -pl cameleer-server-app -Dtest=SecurityFilterIT -Dsurefire.reuseForks=false` | ❌ W0 | ⬜ pending |
| 04-01-05 | 01 | 1 | SECU-02 | integration | `mvn test -pl cameleer-server-app -Dtest=JwtRefreshIT -Dsurefire.reuseForks=false` | ❌ W0 | ⬜ pending |
| 04-01-06 | 01 | 1 | SECU-04 | integration | `mvn test -pl cameleer-server-app -Dtest=SseSigningIT -Dsurefire.reuseForks=false` | ❌ W0 | ⬜ pending |
| 04-01-07 | 01 | 1 | N/A | integration | `mvn test -pl cameleer-server-app -Dtest=RegistrationSecurityIT -Dsurefire.reuseForks=false` | ❌ W0 | ⬜ pending |
*Status: ⬜ pending · ✅ green · ❌ red · ⚠️ flaky*

View File

@@ -39,19 +39,19 @@ All truths drawn from PLAN frontmatter must_haves across plans 01, 02, and 03.
| Artifact | Provides | Status | Details |
|----------|----------|--------|---------|
| `cameleer3-server-core/.../security/JwtService.java` | JWT interface: createAccessToken, createRefreshToken, validateAndExtractAgentId, validateRefreshToken | VERIFIED | 49 lines, substantive interface with 4 methods |
| `cameleer3-server-core/.../security/Ed25519SigningService.java` | Ed25519 interface: sign(payload), getPublicKeyBase64() | VERIFIED | 29 lines, substantive interface with 2 methods |
| `cameleer3-server-app/.../security/JwtServiceImpl.java` | Nimbus JOSE+JWT HMAC-SHA256 implementation | VERIFIED | 120 lines; uses `MACSigner`/`MACVerifier`, ephemeral 256-bit secret, correct claims |
| `cameleer3-server-app/.../security/Ed25519SigningServiceImpl.java` | JDK 17 Ed25519 KeyPairGenerator implementation | VERIFIED | 54 lines; `KeyPairGenerator.getInstance("Ed25519")`, `Signature.getInstance("Ed25519")`, Base64-encoded output |
| `cameleer3-server-app/.../security/BootstrapTokenValidator.java` | Constant-time bootstrap token validation with dual-token rotation | VERIFIED | 50 lines; `MessageDigest.isEqual()`, checks current and previous token, null/blank guard |
| `cameleer3-server-app/.../security/SecurityProperties.java` | Config binding with env var mapping | VERIFIED | 48 lines; `@ConfigurationProperties(prefix="security")`; all 4 fields with defaults |
| `cameleer3-server-app/.../security/SecurityBeanConfig.java` | Bean wiring with fail-fast validation | VERIFIED | 43 lines; `@EnableConfigurationProperties`, all 3 service beans, `InitializingBean` check |
| `cameleer3-server-app/.../security/JwtAuthenticationFilter.java` | OncePerRequestFilter extracting JWT from header or query param | VERIFIED | 72 lines; extracts from `Authorization: Bearer` then `?token=` query param; sets `SecurityContextHolder` |
| `cameleer3-server-app/.../security/SecurityConfig.java` | SecurityFilterChain with permitAll for public paths, authenticated for rest | VERIFIED | 54 lines; stateless, CSRF disabled, correct permitAll list, `addFilterBefore` JwtAuthenticationFilter |
| `cameleer3-server-app/.../controller/AgentRegistrationController.java` | Updated register endpoint with bootstrap token validation, JWT issuance, public key; refresh endpoint | VERIFIED | 230 lines; both `/register` and `/{id}/refresh` endpoints fully wired |
| `cameleer3-server-app/.../agent/SsePayloadSigner.java` | Component that signs SSE command payloads | VERIFIED | 77 lines; `@Component`, signs then adds field, defensive null/blank handling |
| `cameleer3-server-app/.../agent/SseConnectionManager.java` | Updated onCommandReady with signing before sendEvent | VERIFIED | `onCommandReady()` calls `ssePayloadSigner.signPayload()`, parses to `JsonNode` to avoid double-quoting |
| `cameleer3-server-app/.../resources/application.yml` | Security config with env var mapping | VERIFIED | `security.bootstrap-token: ${CAMELEER_AUTH_TOKEN:}` and `security.bootstrap-token-previous: ${CAMELEER_AUTH_TOKEN_PREVIOUS:}` present |
| `cameleer-server-core/.../security/JwtService.java` | JWT interface: createAccessToken, createRefreshToken, validateAndExtractAgentId, validateRefreshToken | VERIFIED | 49 lines, substantive interface with 4 methods |
| `cameleer-server-core/.../security/Ed25519SigningService.java` | Ed25519 interface: sign(payload), getPublicKeyBase64() | VERIFIED | 29 lines, substantive interface with 2 methods |
| `cameleer-server-app/.../security/JwtServiceImpl.java` | Nimbus JOSE+JWT HMAC-SHA256 implementation | VERIFIED | 120 lines; uses `MACSigner`/`MACVerifier`, ephemeral 256-bit secret, correct claims |
| `cameleer-server-app/.../security/Ed25519SigningServiceImpl.java` | JDK 17 Ed25519 KeyPairGenerator implementation | VERIFIED | 54 lines; `KeyPairGenerator.getInstance("Ed25519")`, `Signature.getInstance("Ed25519")`, Base64-encoded output |
| `cameleer-server-app/.../security/BootstrapTokenValidator.java` | Constant-time bootstrap token validation with dual-token rotation | VERIFIED | 50 lines; `MessageDigest.isEqual()`, checks current and previous token, null/blank guard |
| `cameleer-server-app/.../security/SecurityProperties.java` | Config binding with env var mapping | VERIFIED | 48 lines; `@ConfigurationProperties(prefix="security")`; all 4 fields with defaults |
| `cameleer-server-app/.../security/SecurityBeanConfig.java` | Bean wiring with fail-fast validation | VERIFIED | 43 lines; `@EnableConfigurationProperties`, all 3 service beans, `InitializingBean` check |
| `cameleer-server-app/.../security/JwtAuthenticationFilter.java` | OncePerRequestFilter extracting JWT from header or query param | VERIFIED | 72 lines; extracts from `Authorization: Bearer` then `?token=` query param; sets `SecurityContextHolder` |
| `cameleer-server-app/.../security/SecurityConfig.java` | SecurityFilterChain with permitAll for public paths, authenticated for rest | VERIFIED | 54 lines; stateless, CSRF disabled, correct permitAll list, `addFilterBefore` JwtAuthenticationFilter |
| `cameleer-server-app/.../controller/AgentRegistrationController.java` | Updated register endpoint with bootstrap token validation, JWT issuance, public key; refresh endpoint | VERIFIED | 230 lines; both `/register` and `/{id}/refresh` endpoints fully wired |
| `cameleer-server-app/.../agent/SsePayloadSigner.java` | Component that signs SSE command payloads | VERIFIED | 77 lines; `@Component`, signs then adds field, defensive null/blank handling |
| `cameleer-server-app/.../agent/SseConnectionManager.java` | Updated onCommandReady with signing before sendEvent | VERIFIED | `onCommandReady()` calls `ssePayloadSigner.signPayload()`, parses to `JsonNode` to avoid double-quoting |
| `cameleer-server-app/.../resources/application.yml` | Security config with env var mapping | VERIFIED | `security.bootstrap-token: ${CAMELEER_AUTH_TOKEN:}` and `security.bootstrap-token-previous: ${CAMELEER_AUTH_TOKEN_PREVIOUS:}` present |
### Key Link Verification

View File

@@ -57,7 +57,7 @@ Agents (50+) Users / UI
Agent POST /api/v1/ingest
|
v
[IngestController] -- validates JWT, deserializes using cameleer3-common models
[IngestController] -- validates JWT, deserializes using cameleer-common models
|
v
[IngestionService.accept(batch)] -- accepts TransactionData/ActivityData
@@ -126,7 +126,7 @@ Each registered SseEmitter sends event to connected agent
Agent POST /api/v1/diagrams (on startup or route change)
|
v
[DiagramController] -- receives route definition (XML/YAML/JSON from cameleer3-common)
[DiagramController] -- receives route definition (XML/YAML/JSON from cameleer-common)
|
v
[DiagramService.storeVersion(definition)]
@@ -341,11 +341,11 @@ public record PageCursor(Instant timestamp, String id) {}
## Module Boundary Design
### Core Module (`cameleer3-server-core`)
### Core Module (`cameleer-server-core`)
The core module is the domain layer. It contains:
- **Domain models** -- Transaction, Activity, Agent, DiagramVersion, etc. (may extend or complement cameleer3-common models)
- **Domain models** -- Transaction, Activity, Agent, DiagramVersion, etc. (may extend or complement cameleer-common models)
- **Service interfaces and implementations** -- TransactionService, AgentRegistryService, DiagramService, QueryEngine
- **Repository interfaces** -- TransactionRepository, DiagramRepository, AgentRepository (interfaces only, no implementations)
- **Ingestion logic** -- WriteBuffer, batch assembly, backpressure signaling
@@ -356,7 +356,7 @@ The core module is the domain layer. It contains:
**No Spring Boot dependencies.** Jackson is acceptable (already present). JUnit for tests.
### App Module (`cameleer3-server-app`)
### App Module (`cameleer-server-app`)
The app module is the infrastructure/adapter layer. It contains:
@@ -376,8 +376,8 @@ The app module is the infrastructure/adapter layer. It contains:
```
app --> core (allowed)
core --> app (NEVER)
core --> cameleer3-common (allowed)
app --> cameleer3-common (transitively via core)
core --> cameleer-common (allowed)
app --> cameleer-common (transitively via core)
```
## Ingestion Pipeline Detail
@@ -393,7 +393,7 @@ Use a two-stage approach:
- WriteBuffer has a bounded capacity (configurable, default 50,000 items).
- When buffer is >80% full, respond with HTTP 429 + `Retry-After` header.
- Agents (cameleer3) should implement exponential backoff on 429.
- Agents (cameleer) should implement exponential backoff on 429.
- Monitor buffer fill level as a metric.
### Batch Size Tuning

View File

@@ -1,6 +1,6 @@
# Domain Pitfalls
**Domain:** Transaction monitoring / observability server (Cameleer3 Server)
**Domain:** Transaction monitoring / observability server (Cameleer Server)
**Researched:** 2026-03-11
**Confidence:** MEDIUM (based on established patterns for ClickHouse, SSE, high-volume ingestion; no web verification available)

View File

@@ -1,6 +1,6 @@
# Technology Stack
**Project:** Cameleer3 Server
**Project:** Cameleer Server
**Researched:** 2026-03-11
**Overall confidence:** MEDIUM (no live source verification available; versions based on training data up to May 2025)

View File

@@ -1,4 +1,4 @@
# Research Summary: Cameleer3 Server
# Research Summary: Cameleer Server
**Domain:** Transaction observability server for Apache Camel integrations
**Researched:** 2026-03-11
@@ -6,7 +6,7 @@
## Executive Summary
Cameleer3 Server is a write-heavy, read-occasional observability system that receives millions of transaction records per day from distributed Apache Camel agents, stores them with 30-day retention, and provides structured + full-text search. The architecture closely parallels established observability platforms like Jaeger, Zipkin, and njams Server, with the key differentiator being Camel route diagram visualization tied to individual transactions.
Cameleer Server is a write-heavy, read-occasional observability system that receives millions of transaction records per day from distributed Apache Camel agents, stores them with 30-day retention, and provides structured + full-text search. The architecture closely parallels established observability platforms like Jaeger, Zipkin, and njams Server, with the key differentiator being Camel route diagram visualization tied to individual transactions.
The recommended stack centers on **ClickHouse** as the primary data store. ClickHouse's columnar MergeTree engine provides the exact properties this project needs: massive batch insert throughput, excellent time-range query performance, native TTL-based retention, and 10-20x compression on structured observability data. This is a well-established pattern used by production observability platforms (SigNoz, Uptrace, PostHog all run on ClickHouse).
@@ -93,9 +93,9 @@ Based on research, suggested phase structure:
## Gaps to Address
- **ClickHouse Java client API:** The clickhouse-java library has undergone significant changes. Exact API, connection pooling, and Spring Boot integration patterns need phase-specific research
- **cameleer3-common PROTOCOL.md:** Must read the agent protocol definition before designing ClickHouse schema -- this defines the exact data structures being ingested
- **cameleer-common PROTOCOL.md:** Must read the agent protocol definition before designing ClickHouse schema -- this defines the exact data structures being ingested
- **ClickHouse Docker setup:** Optimal ClickHouse Docker configuration (memory limits, merge settings) for development and production
- **Full-text search decision:** ClickHouse skip indexes may or may not meet the "search by any content" requirement. This needs prototyping with realistic data
- **Diagram rendering library:** Server-side route diagram rendering is a significant unknown; needs prototyping with actual Camel route graph data from cameleer3-common
- **Diagram rendering library:** Server-side route diagram rendering is a significant unknown; needs prototyping with actual Camel route graph data from cameleer-common
- **Frontend framework:** No research on UI technology -- deferred to UI phase
- **Agent protocol stability:** The cameleer3-common protocol is still evolving. Schema evolution strategy needs alignment with agent development
- **Agent protocol stability:** The cameleer-common protocol is still evolving. Schema evolution strategy needs alignment with agent development

Binary file not shown.

Before

Width:  |  Height:  |  Size: 142 KiB

Binary file not shown.

Before

Width:  |  Height:  |  Size: 141 KiB

Binary file not shown.

Before

Width:  |  Height:  |  Size: 6.7 KiB

Binary file not shown.

Before

Width:  |  Height:  |  Size: 92 KiB

Binary file not shown.

Before

Width:  |  Height:  |  Size: 115 KiB

101
AGENTS.md Normal file
View File

@@ -0,0 +1,101 @@
<!-- gitnexus:start -->
# GitNexus — Code Intelligence
This project is indexed by GitNexus as **alerting-02** (7810 symbols, 20082 relationships, 300 execution flows). Use the GitNexus MCP tools to understand code, assess impact, and navigate safely.
> If any GitNexus tool warns the index is stale, run `npx gitnexus analyze` in terminal first.
## Always Do
- **MUST run impact analysis before editing any symbol.** Before modifying a function, class, or method, run `gitnexus_impact({target: "symbolName", direction: "upstream"})` and report the blast radius (direct callers, affected processes, risk level) to the user.
- **MUST run `gitnexus_detect_changes()` before committing** to verify your changes only affect expected symbols and execution flows.
- **MUST warn the user** if impact analysis returns HIGH or CRITICAL risk before proceeding with edits.
- When exploring unfamiliar code, use `gitnexus_query({query: "concept"})` to find execution flows instead of grepping. It returns process-grouped results ranked by relevance.
- When you need full context on a specific symbol — callers, callees, which execution flows it participates in — use `gitnexus_context({name: "symbolName"})`.
## When Debugging
1. `gitnexus_query({query: "<error or symptom>"})` — find execution flows related to the issue
2. `gitnexus_context({name: "<suspect function>"})` — see all callers, callees, and process participation
3. `READ gitnexus://repo/alerting-02/process/{processName}` — trace the full execution flow step by step
4. For regressions: `gitnexus_detect_changes({scope: "compare", base_ref: "main"})` — see what your branch changed
## When Refactoring
- **Renaming**: MUST use `gitnexus_rename({symbol_name: "old", new_name: "new", dry_run: true})` first. Review the preview — graph edits are safe, text_search edits need manual review. Then run with `dry_run: false`.
- **Extracting/Splitting**: MUST run `gitnexus_context({name: "target"})` to see all incoming/outgoing refs, then `gitnexus_impact({target: "target", direction: "upstream"})` to find all external callers before moving code.
- After any refactor: run `gitnexus_detect_changes({scope: "all"})` to verify only expected files changed.
## Never Do
- NEVER edit a function, class, or method without first running `gitnexus_impact` on it.
- NEVER ignore HIGH or CRITICAL risk warnings from impact analysis.
- NEVER rename symbols with find-and-replace — use `gitnexus_rename` which understands the call graph.
- NEVER commit changes without running `gitnexus_detect_changes()` to check affected scope.
## Tools Quick Reference
| Tool | When to use | Command |
|------|-------------|---------|
| `query` | Find code by concept | `gitnexus_query({query: "auth validation"})` |
| `context` | 360-degree view of one symbol | `gitnexus_context({name: "validateUser"})` |
| `impact` | Blast radius before editing | `gitnexus_impact({target: "X", direction: "upstream"})` |
| `detect_changes` | Pre-commit scope check | `gitnexus_detect_changes({scope: "staged"})` |
| `rename` | Safe multi-file rename | `gitnexus_rename({symbol_name: "old", new_name: "new", dry_run: true})` |
| `cypher` | Custom graph queries | `gitnexus_cypher({query: "MATCH ..."})` |
## Impact Risk Levels
| Depth | Meaning | Action |
|-------|---------|--------|
| d=1 | WILL BREAK — direct callers/importers | MUST update these |
| d=2 | LIKELY AFFECTED — indirect deps | Should test |
| d=3 | MAY NEED TESTING — transitive | Test if critical path |
## Resources
| Resource | Use for |
|----------|---------|
| `gitnexus://repo/alerting-02/context` | Codebase overview, check index freshness |
| `gitnexus://repo/alerting-02/clusters` | All functional areas |
| `gitnexus://repo/alerting-02/processes` | All execution flows |
| `gitnexus://repo/alerting-02/process/{name}` | Step-by-step execution trace |
## Self-Check Before Finishing
Before completing any code modification task, verify:
1. `gitnexus_impact` was run for all modified symbols
2. No HIGH/CRITICAL risk warnings were ignored
3. `gitnexus_detect_changes()` confirms changes match expected scope
4. All d=1 (WILL BREAK) dependents were updated
## Keeping the Index Fresh
After committing code changes, the GitNexus index becomes stale. Re-run analyze to update it:
```bash
npx gitnexus analyze
```
If the index previously included embeddings, preserve them by adding `--embeddings`:
```bash
npx gitnexus analyze --embeddings
```
To check whether embeddings exist, inspect `.gitnexus/meta.json` — the `stats.embeddings` field shows the count (0 means no embeddings). **Running analyze without `--embeddings` will delete any previously generated embeddings.**
> Claude Code users: A PostToolUse hook handles this automatically after `git commit` and `git merge`.
## CLI
| Task | Read this skill file |
|------|---------------------|
| Understand architecture / "How does X work?" | `.claude/skills/gitnexus/gitnexus-exploring/SKILL.md` |
| Blast radius / "What breaks if I change X?" | `.claude/skills/gitnexus/gitnexus-impact-analysis/SKILL.md` |
| Trace bugs / "Why is X failing?" | `.claude/skills/gitnexus/gitnexus-debugging/SKILL.md` |
| Rename / extract / split / refactor | `.claude/skills/gitnexus/gitnexus-refactoring/SKILL.md` |
| Tools, resources, schema reference | `.claude/skills/gitnexus/gitnexus-guide/SKILL.md` |
| Index, status, clean, wiki CLI commands | `.claude/skills/gitnexus/gitnexus-cli/SKILL.md` |
<!-- gitnexus:end -->

183
CLAUDE.md
View File

@@ -4,18 +4,18 @@ This file provides guidance to Claude Code (claude.ai/code) when working with co
## Project
Cameleer3 Server — observability server that receives, stores, and serves Camel route execution data and route diagrams from Cameleer3 agents. Pushes config and commands to agents via SSE.
Cameleer Server — observability server that receives, stores, and serves Camel route execution data and route diagrams from Cameleer agents. Pushes config and commands to agents via SSE. Also orchestrates Docker container deployments when running under cameleer-saas.
## Related Project
- **cameleer3** (`https://gitea.siegeln.net/cameleer/cameleer3`) — the Java agent that instruments Camel applications
- Protocol defined in `cameleer3-common/PROTOCOL.md` in the agent repo
- This server depends on `com.cameleer3:cameleer3-common` (shared models and graph API)
- **cameleer** (`https://gitea.siegeln.net/cameleer/cameleer`) — the Java agent that instruments Camel applications
- Protocol defined in `cameleer-common/PROTOCOL.md` in the agent repo
- This server depends on `com.cameleer:cameleer-common` (shared models and graph API)
## Modules
- `cameleer3-server-core` — domain logic, storage, agent registry
- `cameleer3-server-app` — Spring Boot web app, REST controllers, SSE, static resources
- `cameleer-server-core` — domain logic, storage interfaces, services (no Spring dependencies)
- `cameleer-server-app` — Spring Boot web app, REST controllers, SSE, persistence, Docker orchestration
## Build Commands
@@ -27,40 +27,171 @@ mvn clean verify # Full build with tests
## Run
```bash
java -jar cameleer3-server-app/target/cameleer3-server-app-1.0-SNAPSHOT.jar
java -jar cameleer-server-app/target/cameleer-server-app-1.0-SNAPSHOT.jar
```
## Key Conventions
- Java 17+ required
- Spring Boot 3.4.3 parent POM
- Depends on `com.cameleer3:cameleer3-common` from Gitea Maven registry
- Depends on `com.cameleer:cameleer-common` from Gitea Maven registry
- Jackson `JavaTimeModule` for `Instant` deserialization
- Communication: receives HTTP POST data from agents (executions, diagrams, metrics, logs), serves SSE event streams for config push/commands (config-update, deep-trace, replay, route-control)
- Maintains agent instance registry with states: LIVE → STALE → DEAD
- Storage: PostgreSQL (TimescaleDB) for structured data, OpenSearch for full-text search and application log storage
- Security: JWT auth with RBAC (AGENT/VIEWER/OPERATOR/ADMIN roles), Ed25519 config signing, bootstrap token for registration
- OIDC: Optional external identity provider support (token exchange pattern). Configured via admin API, stored in database (`server_config` table)
- URL taxonomy: user-facing data, config, and query endpoints live under `/api/v1/environments/{envSlug}/...`. Env is a path segment, resolved via the `@EnvPath` argument resolver (404 on unknown slug). Flat endpoints are only for: agent self-service (JWT-authoritative), cross-env admin (RBAC, OIDC, audit, license, thresholds, env CRUD), cross-env discovery (`/catalog`), content-addressed lookups (`/diagrams/{contentHash}/render`, `/executions/{id}`), and auth. See `.claude/rules/app-classes.md` for the full allow-list.
- Slug immutability: environment and app slugs are immutable after creation (both appear in URLs, Docker network names, container names, and ClickHouse partition keys). Slug regex `^[a-z0-9][a-z0-9-]{0,63}$` is enforced on POST; update endpoints silently drop any slug field in the request body via Jackson's default unknown-property handling.
- App uniqueness: `(environment_id, app_slug)` is the natural key. The same app slug can legitimately exist in multiple environments; `AppService.getByEnvironmentAndSlug(envId, slug)` is the canonical lookup for controllers. Bare `getBySlug(slug)` remains for internal use but is ambiguous across envs.
- Environment filtering: all data queries filter by the selected environment. All commands target only agents in the selected environment. Env is required on every env-scoped endpoint (path param); the legacy `?environment=` query form is retired.
- Maintains agent instance registry (in-memory) with states: LIVE -> STALE -> DEAD. Auto-heals from JWT `env` claim + heartbeat body on heartbeat/SSE after server restart (priority: heartbeat `environmentId` > JWT `env` claim; no silent default — missing env on heartbeat auto-heal returns 400). Registration (`POST /api/v1/agents/register`) requires `environmentId` in the request body; missing or blank returns 400. Capabilities and route states updated on every heartbeat (protocol v2). Route catalog merges three sources: in-memory agent registry, persistent `route_catalog` table (ClickHouse), and `stats_1m_route` execution stats. The persistent catalog tracks `first_seen`/`last_seen` per route per environment, updated on every registration and heartbeat. Routes appear in the sidebar when their lifecycle overlaps the selected time window (`first_seen <= to AND last_seen >= from`), so historical routes remain visible even after being dropped from newer app versions.
- Multi-tenancy: each server instance serves one tenant (configured via `CAMELEER_SERVER_TENANT_ID`, default: `"default"`). Environments (dev/staging/prod) are first-class. PostgreSQL isolated via schema-per-tenant (`?currentSchema=tenant_{id}`) and `ApplicationName=tenant_{id}` on the JDBC URL. ClickHouse shared DB with `tenant_id` + `environment` columns, partitioned by `(tenant_id, toYYYYMM(timestamp))`.
- Storage: PostgreSQL for RBAC, config, and audit; ClickHouse for all observability data (executions, search, logs, metrics, stats, diagrams). ClickHouse schema migrations in `clickhouse/*.sql`, run idempotently on startup by `ClickHouseSchemaInitializer`. Use `IF NOT EXISTS` for CREATE and ADD PROJECTION.
- Log exchange correlation: `ClickHouseLogStore` extracts `exchange_id` from log entry MDC, preferring `cameleer.exchangeId` over `camel.exchangeId` (fallback for older agents). For `ON_COMPLETION` exchange copies, the agent sets `cameleer.exchangeId` to the parent's exchange ID via `CORRELATION_ID`.
- Log processor correlation: The agent sets `cameleer.processorId` in MDC, identifying which processor node emitted a log line.
- Logging: ClickHouse JDBC set to INFO (`com.clickhouse`), HTTP client to WARN (`org.apache.hc.client5`) in application.yml
- Security: JWT auth with RBAC (AGENT/VIEWER/OPERATOR/ADMIN roles), Ed25519 config signing (key derived deterministically from JWT secret via HMAC-SHA256), bootstrap token for registration. CORS: `CAMELEER_SERVER_SECURITY_CORSALLOWEDORIGINS` (comma-separated) overrides `CAMELEER_SERVER_SECURITY_UIORIGIN` for multi-origin setups. Infrastructure access: `CAMELEER_SERVER_SECURITY_INFRASTRUCTUREENDPOINTS=false` disables Database and ClickHouse admin endpoints. Last-ADMIN guard: system prevents removal of the last ADMIN role (409 Conflict). Password policy: min 12 chars, 3-of-4 character classes, no username match. Brute-force protection: 5 failed attempts -> 15 min lockout. Token revocation: `token_revoked_before` column on users, checked in `JwtAuthenticationFilter`, set on password change.
- OIDC: Optional external identity provider support (token exchange pattern). Configured via admin API/UI, stored in database (`server_config` table). Resource server mode: accepts external access tokens (Logto M2M) via JWKS validation when `CAMELEER_SERVER_SECURITY_OIDCISSUERURI` is set. Scope-based role mapping via `SystemRole.normalizeScope()`. System roles synced on every OIDC login via `applyClaimMappings()` in `OidcAuthController` (calls `clearManagedAssignments` + `assignManagedRole` on `RbacService`) — always overwrites managed role assignments; uses managed assignment origin to avoid touching group-inherited or directly-assigned roles. Supports ES384, ES256, RS256.
- OIDC role extraction: `OidcTokenExchanger` reads roles from the **access_token** first (JWT with `at+jwt` type), then falls back to id_token. `OidcConfig` includes `audience` (RFC 8707 resource indicator) and `additionalScopes`. All provider-specific configuration is external — no provider-specific code in the server.
- Sensitive keys: Global enforced baseline for masking sensitive data in agent payloads. Merge rule: `final = global UNION per-app` (case-insensitive dedup, per-app can only add, never remove global keys).
- User persistence: PostgreSQL `users` table, admin CRUD at `/api/v1/admin/users`
- Usage analytics: ClickHouse `usage_events` table tracks authenticated UI requests, flushed every 5s
## CI/CD & Deployment
## Database Migrations
- CI workflow: `.gitea/workflows/ci.yml` — build → docker → deploy on push to main or feature branches
- Build step skips integration tests (`-DskipITs`) — Testcontainers needs Docker daemon
- Docker: multi-stage build (`Dockerfile`), `$BUILDPLATFORM` for native Maven on ARM64 runner, amd64 runtime
- `REGISTRY_TOKEN` build arg required for `cameleer3-common` dependency resolution
- Registry: `gitea.siegeln.net/cameleer/cameleer3-server` (container images)
- K8s manifests in `deploy/` — Kustomize base + overlays (main/feature), shared infra (PostgreSQL, OpenSearch, Authentik) as top-level manifests
- Deployment target: k3s at 192.168.50.86, namespace `cameleer` (main), `cam-<slug>` (feature branches)
- Feature branches: isolated namespace, PG schema, OpenSearch index prefix; Traefik Ingress at `<slug>-api.cameleer.siegeln.net`
- Secrets managed in CI deploy step (idempotent `--dry-run=client | kubectl apply`): `cameleer-auth`, `postgres-credentials`, `opensearch-credentials`
- K8s probes: server uses `/api/v1/health`, PostgreSQL uses `pg_isready`, OpenSearch uses `/_cluster/health`
- Docker build uses buildx registry cache + `--provenance=false` for Gitea compatibility
PostgreSQL (Flyway): `cameleer-server-app/src/main/resources/db/migration/`
- V1 — RBAC (users, roles, groups, audit_log). `application_config` PK is `(application, environment)`; `app_settings` PK is `(application_id, environment)` — both tables are env-scoped.
- V2 — Claim mappings (OIDC)
- V3 — Runtime management (apps, environments, deployments, app_versions)
- V4 — Environment config (default_container_config JSONB)
- V5 — App container config (container_config JSONB on apps)
- V6 — JAR retention policy (jar_retention_count on environments)
- V7 — Deployment orchestration (target_state, deployment_strategy, replica_states JSONB, deploy_stage)
- V8 — Deployment active config (resolved_config JSONB on deployments)
- V9 — Password hardening (failed_login_attempts, locked_until, token_revoked_before on users)
- V10 — Runtime type detection (detected_runtime_type, detected_main_class on app_versions)
- V11 — Outbound connections (outbound_connections table, enums)
- V12 — Alerting tables (alert_rules, alert_rule_targets, alert_instances, alert_notifications, alert_reads, alert_silences)
- V13 — alert_instances open-rule unique index (alert_instances_open_rule_uq partial index on rule_id WHERE state IN PENDING/FIRING/ACKNOWLEDGED)
## UI Styling
ClickHouse: `cameleer-server-app/src/main/resources/clickhouse/init.sql` (run idempotently on startup)
- Always use `@cameleer/design-system` CSS variables for colors (`var(--amber)`, `var(--error)`, `var(--success)`, etc.) — never hardcode hex values. This applies to CSS modules, inline styles, and SVG `fill`/`stroke` attributes. SVG presentation attributes resolve `var()` correctly.
## Regenerating OpenAPI schema (SPA types)
After any change to REST controller paths, request/response DTOs, or `@PathVariable`/`@RequestParam`/`@RequestBody` signatures, regenerate the TypeScript types the SPA consumes. Required for every controller-level change.
```bash
# Backend must be running on :8081
cd ui && npm run generate-api:live # fetches fresh openapi.json AND regenerates schema.d.ts
# OR, if openapi.json was updated by other means:
cd ui && npm run generate-api # regenerates schema.d.ts from existing openapi.json
```
After regeneration, `ui/src/api/schema.d.ts` and `ui/src/api/openapi.json` will update. The TypeScript compiler then surfaces every SPA call site that needs updating — fix all compile errors before testing in the browser. Commit the regenerated files with the controller change.
## Maintaining .claude/rules/
When adding, removing, or renaming classes, controllers, endpoints, UI components, or metrics, update the corresponding `.claude/rules/` file as part of the same change. The rule files are the class/API map that future sessions rely on — stale rules cause wrong assumptions. Treat rule file updates like updating an import: part of the change, not a separate task.
## Disabled Skills
- Do NOT use any `gsd:*` skills in this project. This includes all `/gsd:` prefixed commands.
<!-- gitnexus:start -->
# GitNexus — Code Intelligence
This project is indexed by GitNexus as **alerting-02** (7810 symbols, 20082 relationships, 300 execution flows). Use the GitNexus MCP tools to understand code, assess impact, and navigate safely.
> If any GitNexus tool warns the index is stale, run `npx gitnexus analyze` in terminal first.
## Always Do
- **MUST run impact analysis before editing any symbol.** Before modifying a function, class, or method, run `gitnexus_impact({target: "symbolName", direction: "upstream"})` and report the blast radius (direct callers, affected processes, risk level) to the user.
- **MUST run `gitnexus_detect_changes()` before committing** to verify your changes only affect expected symbols and execution flows.
- **MUST warn the user** if impact analysis returns HIGH or CRITICAL risk before proceeding with edits.
- When exploring unfamiliar code, use `gitnexus_query({query: "concept"})` to find execution flows instead of grepping. It returns process-grouped results ranked by relevance.
- When you need full context on a specific symbol — callers, callees, which execution flows it participates in — use `gitnexus_context({name: "symbolName"})`.
## When Debugging
1. `gitnexus_query({query: "<error or symptom>"})` — find execution flows related to the issue
2. `gitnexus_context({name: "<suspect function>"})` — see all callers, callees, and process participation
3. `READ gitnexus://repo/alerting-02/process/{processName}` — trace the full execution flow step by step
4. For regressions: `gitnexus_detect_changes({scope: "compare", base_ref: "main"})` — see what your branch changed
## When Refactoring
- **Renaming**: MUST use `gitnexus_rename({symbol_name: "old", new_name: "new", dry_run: true})` first. Review the preview — graph edits are safe, text_search edits need manual review. Then run with `dry_run: false`.
- **Extracting/Splitting**: MUST run `gitnexus_context({name: "target"})` to see all incoming/outgoing refs, then `gitnexus_impact({target: "target", direction: "upstream"})` to find all external callers before moving code.
- After any refactor: run `gitnexus_detect_changes({scope: "all"})` to verify only expected files changed.
## Never Do
- NEVER edit a function, class, or method without first running `gitnexus_impact` on it.
- NEVER ignore HIGH or CRITICAL risk warnings from impact analysis.
- NEVER rename symbols with find-and-replace — use `gitnexus_rename` which understands the call graph.
- NEVER commit changes without running `gitnexus_detect_changes()` to check affected scope.
## Tools Quick Reference
| Tool | When to use | Command |
|------|-------------|---------|
| `query` | Find code by concept | `gitnexus_query({query: "auth validation"})` |
| `context` | 360-degree view of one symbol | `gitnexus_context({name: "validateUser"})` |
| `impact` | Blast radius before editing | `gitnexus_impact({target: "X", direction: "upstream"})` |
| `detect_changes` | Pre-commit scope check | `gitnexus_detect_changes({scope: "staged"})` |
| `rename` | Safe multi-file rename | `gitnexus_rename({symbol_name: "old", new_name: "new", dry_run: true})` |
| `cypher` | Custom graph queries | `gitnexus_cypher({query: "MATCH ..."})` |
## Impact Risk Levels
| Depth | Meaning | Action |
|-------|---------|--------|
| d=1 | WILL BREAK — direct callers/importers | MUST update these |
| d=2 | LIKELY AFFECTED — indirect deps | Should test |
| d=3 | MAY NEED TESTING — transitive | Test if critical path |
## Resources
| Resource | Use for |
|----------|---------|
| `gitnexus://repo/alerting-02/context` | Codebase overview, check index freshness |
| `gitnexus://repo/alerting-02/clusters` | All functional areas |
| `gitnexus://repo/alerting-02/processes` | All execution flows |
| `gitnexus://repo/alerting-02/process/{name}` | Step-by-step execution trace |
## Self-Check Before Finishing
Before completing any code modification task, verify:
1. `gitnexus_impact` was run for all modified symbols
2. No HIGH/CRITICAL risk warnings were ignored
3. `gitnexus_detect_changes()` confirms changes match expected scope
4. All d=1 (WILL BREAK) dependents were updated
## Keeping the Index Fresh
After committing code changes, the GitNexus index becomes stale. Re-run analyze to update it:
```bash
npx gitnexus analyze
```
If the index previously included embeddings, preserve them by adding `--embeddings`:
```bash
npx gitnexus analyze --embeddings
```
To check whether embeddings exist, inspect `.gitnexus/meta.json` — the `stats.embeddings` field shows the count (0 means no embeddings). **Running analyze without `--embeddings` will delete any previously generated embeddings.**
> Claude Code users: A PostToolUse hook handles this automatically after `git commit` and `git merge`.
## CLI
| Task | Read this skill file |
|------|---------------------|
| Understand architecture / "How does X work?" | `.claude/skills/gitnexus/gitnexus-exploring/SKILL.md` |
| Blast radius / "What breaks if I change X?" | `.claude/skills/gitnexus/gitnexus-impact-analysis/SKILL.md` |
| Trace bugs / "Why is X failing?" | `.claude/skills/gitnexus/gitnexus-debugging/SKILL.md` |
| Rename / extract / split / refactor | `.claude/skills/gitnexus/gitnexus-refactoring/SKILL.md` |
| Tools, resources, schema reference | `.claude/skills/gitnexus/gitnexus-guide/SKILL.md` |
| Index, status, clean, wiki CLI commands | `.claude/skills/gitnexus/gitnexus-cli/SKILL.md` |
<!-- gitnexus:end -->

View File

@@ -1,14 +1,14 @@
FROM --platform=$BUILDPLATFORM maven:3.9-eclipse-temurin-17 AS build
WORKDIR /build
# Configure Gitea Maven Registry for cameleer3-common dependency
# Configure Gitea Maven Registry for cameleer-common dependency
ARG REGISTRY_TOKEN
RUN mkdir -p ~/.m2 && \
echo '<settings><servers><server><id>gitea</id><username>cameleer</username><password>'${REGISTRY_TOKEN}'</password></server></servers></settings>' > ~/.m2/settings.xml
COPY pom.xml .
COPY cameleer3-server-core/pom.xml cameleer3-server-core/
COPY cameleer3-server-app/pom.xml cameleer3-server-app/
COPY cameleer-server-core/pom.xml cameleer-server-core/
COPY cameleer-server-app/pom.xml cameleer-server-app/
# Cache deps — only re-downloaded when POMs change
RUN mvn dependency:go-offline -B || true
COPY . .
@@ -16,12 +16,10 @@ RUN mvn clean package -DskipTests -U -B
FROM eclipse-temurin:17-jre
WORKDIR /app
COPY --from=build /build/cameleer3-server-app/target/cameleer3-server-app-*.jar /app/server.jar
ENV SPRING_DATASOURCE_URL=jdbc:postgresql://postgres:5432/cameleer3
ENV SPRING_DATASOURCE_USERNAME=cameleer
ENV SPRING_DATASOURCE_PASSWORD=cameleer_dev
ENV OPENSEARCH_URL=http://opensearch:9200
COPY --from=build /build/cameleer-server-app/target/cameleer-server-app-*.jar /app/server.jar
COPY docker-entrypoint.sh /app/
RUN chmod +x /app/docker-entrypoint.sh
EXPOSE 8081
ENTRYPOINT exec java -jar /app/server.jar
ENV TZ=UTC
ENTRYPOINT ["/app/docker-entrypoint.sh"]

257
HOWTO.md
View File

@@ -1,4 +1,4 @@
# HOWTO — Cameleer3 Server
# HOWTO — Cameleer Server
## Prerequisites
@@ -6,7 +6,7 @@
- Maven 3.9+
- Node.js 22+ and npm
- Docker & Docker Compose
- Access to the Gitea Maven registry (for `cameleer3-common` dependency)
- Access to the Gitea Maven registry (for `cameleer-common` dependency)
## Build
@@ -21,31 +21,36 @@ mvn clean verify # compile + run all tests (needs Docker for integrati
## Infrastructure Setup
Start PostgreSQL and OpenSearch:
Start PostgreSQL:
```bash
docker compose up -d
```
This starts TimescaleDB (PostgreSQL 16) and OpenSearch 2.19. The database schema is applied automatically via Flyway migrations on server startup.
This starts PostgreSQL 16. The database schema is applied automatically via Flyway migrations on server startup. ClickHouse tables are created by the schema initializer on startup.
| Service | Port | Purpose |
|------------|------|----------------------|
| PostgreSQL | 5432 | JDBC (Spring JDBC) |
| OpenSearch | 9200 | REST API (full-text) |
PostgreSQL credentials: `cameleer` / `cameleer_dev`, database `cameleer3`.
PostgreSQL credentials: `cameleer` / `cameleer_dev`, database `cameleer`.
## Run the Server
```bash
mvn clean package -DskipTests
CAMELEER_AUTH_TOKEN=my-secret-token java -jar cameleer3-server-app/target/cameleer3-server-app-1.0-SNAPSHOT.jar
SPRING_DATASOURCE_URL=jdbc:postgresql://localhost:5432/cameleer \
SPRING_DATASOURCE_USERNAME=cameleer \
SPRING_DATASOURCE_PASSWORD=cameleer_dev \
CAMELEER_SERVER_SECURITY_BOOTSTRAPTOKEN=my-secret-token \
java -jar cameleer-server-app/target/cameleer-server-app-1.0-SNAPSHOT.jar
```
The server starts on **port 8081**. The `CAMELEER_AUTH_TOKEN` environment variable is **required** — the server fails fast on startup if it is not set.
> **Note:** The Docker image no longer includes default database credentials. When running via `docker run`, pass `-e SPRING_DATASOURCE_URL=...` etc. The docker-compose setup provides these automatically.
For token rotation without downtime, set `CAMELEER_AUTH_TOKEN_PREVIOUS` to the old token while rolling out the new one. The server accepts both during the overlap window.
The server starts on **port 8081**. The `CAMELEER_SERVER_SECURITY_BOOTSTRAPTOKEN` environment variable is **required** — the server fails fast on startup if it is not set.
For token rotation without downtime, set `CAMELEER_SERVER_SECURITY_BOOTSTRAPTOKENPREVIOUS` to the old token while rolling out the new one. The server accepts both during the overlap window.
## API Endpoints
@@ -84,7 +89,7 @@ curl -s -X POST http://localhost:8081/api/v1/auth/refresh \
-d '{"refreshToken":"<refreshToken>"}'
```
UI credentials are configured via `CAMELEER_UI_USER` / `CAMELEER_UI_PASSWORD` env vars (default: `admin` / `admin`).
UI credentials are configured via `CAMELEER_SERVER_SECURITY_UIUSER` / `CAMELEER_SERVER_SECURITY_UIPASSWORD` env vars (default: `admin` / `admin`).
**Public endpoints (no JWT required):** `GET /api/v1/health`, `POST /api/v1/agents/register` (uses bootstrap token), `POST /api/v1/auth/**`, OpenAPI/Swagger docs.
@@ -101,12 +106,14 @@ JWTs carry a `roles` claim. Endpoints are restricted by role:
| Role | Access |
|------|--------|
| `AGENT` | Data ingestion (`/data/**` — executions, diagrams, metrics, logs), heartbeat, SSE events, command ack |
| `VIEWER` | Search, execution detail, diagrams, agent list |
| `OPERATOR` | VIEWER + send commands to agents |
| `ADMIN` | OPERATOR + user management (`/admin/**`) |
| `VIEWER` | Search, execution detail, diagrams, agent list, app config (read-only) |
| `OPERATOR` | VIEWER + send commands to agents, route control, replay, edit app config |
| `ADMIN` | OPERATOR + user management, audit log, OIDC config, database admin (`/admin/**`) |
The env-var local user gets `ADMIN` role. Agents get `AGENT` role at registration.
**UI role gating:** The sidebar hides the Admin section for non-ADMIN users. Admin routes (`/admin/*`) redirect to `/` for non-admin. The diagram node toolbar and route control bar are hidden for VIEWER. Config is a main tab (`/config` shows all apps, `/config/:appId` filters to one app with detail panel; sidebar clicks stay on config tab, route clicks resolve to parent app). VIEWER sees read-only, OPERATOR+ can edit.
### OIDC Login (Optional)
OIDC configuration is stored in PostgreSQL and managed via the admin API or UI. The SPA checks if OIDC is available:
@@ -139,7 +146,7 @@ curl -s -X PUT http://localhost:8081/api/v1/admin/oidc \
-H "Authorization: Bearer $TOKEN" \
-d '{
"enabled": true,
"issuerUri": "http://authentik:9000/application/o/cameleer/",
"issuerUri": "http://cameleer-logto:3001/oidc",
"clientId": "your-client-id",
"clientSecret": "your-client-secret",
"rolesClaim": "realm_access.roles",
@@ -155,30 +162,44 @@ curl -s -X DELETE http://localhost:8081/api/v1/admin/oidc \
-H "Authorization: Bearer $TOKEN"
```
**Initial provisioning**: OIDC can also be seeded from `CAMELEER_OIDC_*` env vars on first startup (when DB is empty). After that, the admin API takes over.
**Initial provisioning**: OIDC can also be seeded from `CAMELEER_SERVER_SECURITY_OIDC*` env vars on first startup (when DB is empty). After that, the admin API takes over.
### Authentik Setup (OIDC Provider)
### Logto Setup (OIDC Provider)
Authentik is deployed alongside the Cameleer stack. After first deployment:
Logto is deployed alongside the Cameleer stack. After first deployment:
1. **Initial setup**: Open `http://192.168.50.86:30950/if/flow/initial-setup/` and create the admin account
2. **Create provider**: Admin Interface → Providers → Create → OAuth2/OpenID Provider
- Name: `Cameleer`
- Authorization flow: `default-provider-authorization-explicit-consent`
- Client type: `Confidential`
- Redirect URIs: `http://192.168.50.86:30090/callback` (or your UI URL)
Logto is proxy-aware via `TRUST_PROXY_HEADER=1`. The `LOGTO_ENDPOINT` and `LOGTO_ADMIN_ENDPOINT` secrets define the public-facing URLs that Logto uses for OIDC discovery, issuer URI, and redirect URLs. When behind a reverse proxy (e.g., Traefik), set these to the external URLs (e.g., `https://auth.cameleer.my.domain`). Logto needs its own subdomain — it cannot be path-prefixed under another app.
1. **Initial setup**: Open the Logto admin console (the `LOGTO_ADMIN_ENDPOINT` URL) and create the admin account
2. **Create SPA application**: Applications → Create → Single Page App
- Name: `Cameleer UI`
- Redirect URI: your UI URL + `/oidc/callback`
- Note the **Client ID**
3. **Create API Resource**: API Resources → Create
- Name: `Cameleer Server API`
- Indicator: your API URL (e.g., `https://cameleer.siegeln.net/api`)
- Add permissions: `server:admin`, `server:operator`, `server:viewer`
4. **Create M2M application** (for SaaS platform): Applications → Create → Machine-to-Machine
- Name: `Cameleer SaaS`
- Assign the API Resource created above with `server:admin` scope
- Note the **Client ID** and **Client Secret**
3. **Create application**: Admin Interface → Applications → Create
- Name: `Cameleer`
- Provider: select `Cameleer` (created above)
4. **Configure roles** (optional): Create groups in Authentik and map them to Cameleer roles via the `roles-claim` config. Default claim path is `realm_access.roles`. For Authentik, you may need to customize the OIDC scope to include group claims.
5. **Configure Cameleer**: Use the admin API (`PUT /api/v1/admin/oidc`) or set env vars for initial seeding:
5. **Configure Cameleer OIDC login**: Use the admin API (`PUT /api/v1/admin/oidc`) or the admin UI. OIDC login configuration is stored in the database — no env vars needed for the SPA OIDC flow.
6. **Configure resource server** (for M2M token validation):
```
CAMELEER_OIDC_ENABLED=true
CAMELEER_OIDC_ISSUER=http://authentik:9000/application/o/cameleer/
CAMELEER_OIDC_CLIENT_ID=<client-id-from-step-2>
CAMELEER_OIDC_CLIENT_SECRET=<client-secret-from-step-2>
CAMELEER_SERVER_SECURITY_OIDCISSUERURI=<LOGTO_ENDPOINT>/oidc
CAMELEER_SERVER_SECURITY_OIDCJWKSETURI=http://cameleer-logto:3001/oidc/jwks
CAMELEER_SERVER_SECURITY_OIDCAUDIENCE=<api-resource-indicator-from-step-3>
CAMELEER_SERVER_SECURITY_OIDCTLSSKIPVERIFY=true # optional — skip cert verification for self-signed CAs
```
`OIDCJWKSETURI` is needed when the public issuer URL isn't reachable from inside containers — it fetches JWKS directly from the internal Logto service. `OIDCTLSSKIPVERIFY` disables certificate verification for all OIDC HTTP calls (discovery, token exchange, JWKS); use only when the provider has a self-signed CA.
### SSO Behavior
When OIDC is configured and enabled, the UI automatically redirects to the OIDC provider for silent SSO (`prompt=none`). Users with an active provider session are signed in without seeing a login form. On first login, the provider may show a consent screen (scopes), after which subsequent logins are seamless. If auto-signup is enabled, new users are automatically provisioned with the configured default roles.
- **Bypass SSO**: Navigate to `/login?local` to see the local login form
- **Subpath deployments**: The OIDC redirect_uri respects `BASE_PATH` (e.g., `https://host/server/oidc/callback`)
- **Role sync**: System roles (ADMIN/OPERATOR/VIEWER) are synced from OIDC scopes on every login — revoking a scope in the provider takes effect on next login. Manually assigned group memberships are preserved.
### User Management (ADMIN only)
@@ -221,19 +242,18 @@ curl -s -X POST http://localhost:8081/api/v1/data/metrics \
-H "Authorization: Bearer $TOKEN" \
-d '[{"agentId":"agent-1","metricName":"cpu","value":42.0,"timestamp":"2026-03-11T00:00:00Z","tags":{}}]'
# Post application log entries (batch)
# Post application log entries (raw JSON array — no wrapper)
curl -s -X POST http://localhost:8081/api/v1/data/logs \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $TOKEN" \
-d '{
"entries": [{
"timestamp": "2026-03-25T10:00:00Z",
"level": "INFO",
"loggerName": "com.acme.MyService",
"message": "Processing order #12345",
"threadName": "main"
}]
}'
-d '[{
"timestamp": "2026-03-25T10:00:00Z",
"level": "INFO",
"loggerName": "com.acme.MyService",
"message": "Processing order #12345",
"threadName": "main",
"source": "app"
}]'
```
**Note:** The `X-Protocol-Version: 1` header is required on all `/api/v1/data/**` endpoints. Missing or wrong version returns 400.
@@ -344,45 +364,98 @@ curl -s -X POST http://localhost:8081/api/v1/agents/agent-1/commands/{commandId}
**Agent lifecycle:** LIVE (heartbeat within 90s) → STALE (missed 3 heartbeats) → DEAD (5min after STALE). DEAD agents kept indefinitely.
**Server restart resilience:** The agent registry is in-memory and lost on server restart. Agents auto-re-register on their next heartbeat or SSE connection — the server reconstructs registry entries from JWT claims (subject, application). Route catalog uses ClickHouse execution data as fallback until agents re-register with full route IDs. Agents should also handle 404 on heartbeat by triggering a full re-registration.
**SSE events:** `config-update`, `deep-trace`, `replay`, `route-control` commands pushed in real time. Server sends ping keepalive every 15s.
**Command expiry:** Unacknowledged commands expire after 60 seconds.
**Route control responses:** Route control commands return `CommandGroupResponse` with per-agent status, response count, and timed-out agent IDs.
### Backpressure
When the write buffer is full (default capacity: 50,000), ingestion endpoints return **503 Service Unavailable**. Already-buffered data is not lost.
## Configuration
Key settings in `cameleer3-server-app/src/main/resources/application.yml`:
Key settings in `cameleer-server-app/src/main/resources/application.yml`. All custom properties live under `cameleer.server.*`. Env vars are a mechanical 1:1 mapping (dots to underscores, uppercase).
| Setting | Default | Description |
|---------|---------|-------------|
| `server.port` | 8081 | Server port |
| `ingestion.buffer-capacity` | 50000 | Max items in write buffer |
| `ingestion.batch-size` | 5000 | Items per batch insert |
| `ingestion.flush-interval-ms` | 1000 | Buffer flush interval (ms) |
| `agent-registry.heartbeat-interval-seconds` | 30 | Expected heartbeat interval |
| `agent-registry.stale-threshold-seconds` | 90 | Time before agent marked STALE |
| `agent-registry.dead-threshold-seconds` | 300 | Time after STALE before DEAD |
| `agent-registry.command-expiry-seconds` | 60 | Pending command TTL |
| `agent-registry.keepalive-interval-seconds` | 15 | SSE ping keepalive interval |
| `security.access-token-expiry-ms` | 3600000 | JWT access token lifetime (1h) |
| `security.refresh-token-expiry-ms` | 604800000 | Refresh token lifetime (7d) |
| `security.bootstrap-token` | `${CAMELEER_AUTH_TOKEN}` | Bootstrap token for agent registration (required) |
| `security.bootstrap-token-previous` | `${CAMELEER_AUTH_TOKEN_PREVIOUS}` | Previous bootstrap token for rotation (optional) |
| `security.ui-user` | `admin` | UI login username (`CAMELEER_UI_USER` env var) |
| `security.ui-password` | `admin` | UI login password (`CAMELEER_UI_PASSWORD` env var) |
| `security.ui-origin` | `http://localhost:5173` | CORS allowed origin for UI (`CAMELEER_UI_ORIGIN` env var) |
| `security.jwt-secret` | *(random)* | HMAC secret for JWT signing (`CAMELEER_JWT_SECRET`). If set, tokens survive restarts |
| `security.oidc.enabled` | `false` | Enable OIDC login (`CAMELEER_OIDC_ENABLED`) |
| `security.oidc.issuer-uri` | | OIDC provider issuer URL (`CAMELEER_OIDC_ISSUER`) |
| `security.oidc.client-id` | | OAuth2 client ID (`CAMELEER_OIDC_CLIENT_ID`) |
| `security.oidc.client-secret` | | OAuth2 client secret (`CAMELEER_OIDC_CLIENT_SECRET`) |
| `security.oidc.roles-claim` | `realm_access.roles` | JSONPath to roles in OIDC id_token (`CAMELEER_OIDC_ROLES_CLAIM`) |
| `security.oidc.default-roles` | `VIEWER` | Default roles for new OIDC users (`CAMELEER_OIDC_DEFAULT_ROLES`) |
| `opensearch.log-index-prefix` | `logs-` | OpenSearch index prefix for application logs (`CAMELEER_LOG_INDEX_PREFIX`) |
| `opensearch.log-retention-days` | `7` | Days before log indices are deleted (`CAMELEER_LOG_RETENTION_DAYS`) |
**Security** (`cameleer.server.security.*`):
| Setting | Default | Env var | Description |
|---------|---------|---------|-------------|
| `cameleer.server.security.bootstraptoken` | *(required)* | `CAMELEER_SERVER_SECURITY_BOOTSTRAPTOKEN` | Bootstrap token for agent registration |
| `cameleer.server.security.bootstraptokenprevious` | *(empty)* | `CAMELEER_SERVER_SECURITY_BOOTSTRAPTOKENPREVIOUS` | Previous bootstrap token for rotation |
| `cameleer.server.security.uiuser` | `admin` | `CAMELEER_SERVER_SECURITY_UIUSER` | UI login username |
| `cameleer.server.security.uipassword` | `admin` | `CAMELEER_SERVER_SECURITY_UIPASSWORD` | UI login password |
| `cameleer.server.security.uiorigin` | `http://localhost:5173` | `CAMELEER_SERVER_SECURITY_UIORIGIN` | CORS allowed origin for UI |
| `cameleer.server.security.corsallowedorigins` | *(empty)* | `CAMELEER_SERVER_SECURITY_CORSALLOWEDORIGINS` | Comma-separated CORS origins — overrides `uiorigin` when set |
| `cameleer.server.security.jwtsecret` | *(random)* | `CAMELEER_SERVER_SECURITY_JWTSECRET` | HMAC secret for JWT signing. If set, tokens survive restarts |
| `cameleer.server.security.accesstokenexpiryms` | `3600000` | `CAMELEER_SERVER_SECURITY_ACCESSTOKENEXPIRYMS` | JWT access token lifetime (1h) |
| `cameleer.server.security.refreshtokenexpiryms` | `604800000` | `CAMELEER_SERVER_SECURITY_REFRESHTOKENEXPIRYMS` | Refresh token lifetime (7d) |
| `cameleer.server.security.infrastructureendpoints` | `true` | `CAMELEER_SERVER_SECURITY_INFRASTRUCTUREENDPOINTS` | Show DB/ClickHouse admin endpoints. Set `false` in SaaS-managed mode |
**OIDC resource server** (`cameleer.server.security.oidc.*`):
| Setting | Default | Env var | Description |
|---------|---------|---------|-------------|
| `cameleer.server.security.oidc.issueruri` | *(empty)* | `CAMELEER_SERVER_SECURITY_OIDC_ISSUERURI` | OIDC issuer URI — enables resource server mode |
| `cameleer.server.security.oidc.jwkseturi` | *(empty)* | `CAMELEER_SERVER_SECURITY_OIDC_JWKSETURI` | Direct JWKS URL — bypasses OIDC discovery |
| `cameleer.server.security.oidc.audience` | *(empty)* | `CAMELEER_SERVER_SECURITY_OIDC_AUDIENCE` | Expected JWT audience |
| `cameleer.server.security.oidc.tlsskipverify` | `false` | `CAMELEER_SERVER_SECURITY_OIDC_TLSSKIPVERIFY` | Skip TLS cert verification for OIDC calls |
**Note:** OIDC *login* configuration (issuer, client ID, client secret, roles claim, default roles) is stored in the database and managed via the admin API (`PUT /api/v1/admin/oidc`) or admin UI. The env vars above are for resource server mode (M2M token validation) only.
**Ingestion** (`cameleer.server.ingestion.*`):
| Setting | Default | Env var | Description |
|---------|---------|---------|-------------|
| `cameleer.server.ingestion.buffercapacity` | `50000` | `CAMELEER_SERVER_INGESTION_BUFFERCAPACITY` | Max items in write buffer |
| `cameleer.server.ingestion.batchsize` | `5000` | `CAMELEER_SERVER_INGESTION_BATCHSIZE` | Items per batch insert |
| `cameleer.server.ingestion.flushintervalms` | `5000` | `CAMELEER_SERVER_INGESTION_FLUSHINTERVALMS` | Buffer flush interval (ms) |
| `cameleer.server.ingestion.bodysizelimit` | `16384` | `CAMELEER_SERVER_INGESTION_BODYSIZELIMIT` | Max body size per execution (bytes) |
**Agent registry** (`cameleer.server.agentregistry.*`):
| Setting | Default | Env var | Description |
|---------|---------|---------|-------------|
| `cameleer.server.agentregistry.heartbeatintervalms` | `30000` | `CAMELEER_SERVER_AGENTREGISTRY_HEARTBEATINTERVALMS` | Expected heartbeat interval (ms) |
| `cameleer.server.agentregistry.stalethresholdms` | `90000` | `CAMELEER_SERVER_AGENTREGISTRY_STALETHRESHOLDMS` | Time before agent marked STALE (ms) |
| `cameleer.server.agentregistry.deadthresholdms` | `300000` | `CAMELEER_SERVER_AGENTREGISTRY_DEADTHRESHOLDMS` | Time after STALE before DEAD (ms) |
| `cameleer.server.agentregistry.pingintervalms` | `15000` | `CAMELEER_SERVER_AGENTREGISTRY_PINGINTERVALMS` | SSE ping keepalive interval (ms) |
| `cameleer.server.agentregistry.commandexpiryms` | `60000` | `CAMELEER_SERVER_AGENTREGISTRY_COMMANDEXPIRYMS` | Pending command TTL (ms) |
| `cameleer.server.agentregistry.lifecyclecheckintervalms` | `10000` | `CAMELEER_SERVER_AGENTREGISTRY_LIFECYCLECHECKINTERVALMS` | Lifecycle monitor interval (ms) |
**Runtime** (`cameleer.server.runtime.*`):
| Setting | Default | Env var | Description |
|---------|---------|---------|-------------|
| `cameleer.server.runtime.enabled` | `true` | `CAMELEER_SERVER_RUNTIME_ENABLED` | Enable Docker orchestration |
| `cameleer.server.runtime.baseimage` | `cameleer-runtime-base:latest` | `CAMELEER_SERVER_RUNTIME_BASEIMAGE` | Base Docker image for app containers |
| `cameleer.server.runtime.dockernetwork` | `cameleer` | `CAMELEER_SERVER_RUNTIME_DOCKERNETWORK` | Primary Docker network |
| `cameleer.server.runtime.jarstoragepath` | `/data/jars` | `CAMELEER_SERVER_RUNTIME_JARSTORAGEPATH` | JAR file storage directory |
| `cameleer.server.runtime.jardockervolume` | *(empty)* | `CAMELEER_SERVER_RUNTIME_JARDOCKERVOLUME` | Docker volume for JAR sharing |
| `cameleer.server.runtime.routingmode` | `path` | `CAMELEER_SERVER_RUNTIME_ROUTINGMODE` | `path` or `subdomain` Traefik routing |
| `cameleer.server.runtime.routingdomain` | `localhost` | `CAMELEER_SERVER_RUNTIME_ROUTINGDOMAIN` | Domain for Traefik routing labels |
| `cameleer.server.runtime.serverurl` | *(empty)* | `CAMELEER_SERVER_RUNTIME_SERVERURL` | Server URL injected into app containers |
| `cameleer.server.runtime.agenthealthport` | `9464` | `CAMELEER_SERVER_RUNTIME_AGENTHEALTHPORT` | Agent health check port |
| `cameleer.server.runtime.healthchecktimeout` | `60` | `CAMELEER_SERVER_RUNTIME_HEALTHCHECKTIMEOUT` | Health check timeout (seconds) |
| `cameleer.server.runtime.container.memorylimit` | `512m` | `CAMELEER_SERVER_RUNTIME_CONTAINER_MEMORYLIMIT` | Default memory limit for app containers |
| `cameleer.server.runtime.container.cpushares` | `512` | `CAMELEER_SERVER_RUNTIME_CONTAINER_CPUSHARES` | Default CPU shares for app containers |
**Other** (`cameleer.server.*`):
| Setting | Default | Env var | Description |
|---------|---------|---------|-------------|
| `cameleer.server.catalog.discoveryttldays` | `7` | `CAMELEER_SERVER_CATALOG_DISCOVERYTTLDAYS` | Days before stale discovered apps auto-hide from sidebar |
| `cameleer.server.tenant.id` | `default` | `CAMELEER_SERVER_TENANT_ID` | Tenant identifier |
| `cameleer.server.indexer.debouncems` | `2000` | `CAMELEER_SERVER_INDEXER_DEBOUNCEMS` | Search indexer debounce delay (ms) |
| `cameleer.server.indexer.queuesize` | `10000` | `CAMELEER_SERVER_INDEXER_QUEUESIZE` | Search indexer queue capacity |
| `cameleer.server.license.token` | *(empty)* | `CAMELEER_SERVER_LICENSE_TOKEN` | License token |
| `cameleer.server.license.publickey` | *(empty)* | `CAMELEER_SERVER_LICENSE_PUBLICKEY` | License verification public key |
| `cameleer.server.clickhouse.url` | `jdbc:clickhouse://localhost:8123/cameleer` | `CAMELEER_SERVER_CLICKHOUSE_URL` | ClickHouse JDBC URL |
| `cameleer.server.clickhouse.username` | `default` | `CAMELEER_SERVER_CLICKHOUSE_USERNAME` | ClickHouse user |
| `cameleer.server.clickhouse.password` | *(empty)* | `CAMELEER_SERVER_CLICKHOUSE_PASSWORD` | ClickHouse password |
## Web UI Development
@@ -393,7 +466,7 @@ npm run dev # Vite dev server on http://localhost:5173 (proxies /api to
npm run build # Production build to ui/dist/
```
Login with `admin` / `admin` (or whatever `CAMELEER_UI_USER` / `CAMELEER_UI_PASSWORD` are set to).
Login with `admin` / `admin` (or whatever `CAMELEER_SERVER_SECURITY_UIUSER` / `CAMELEER_SERVER_SECURITY_UIPASSWORD` are set to).
The UI uses runtime configuration via `public/config.js`. In Kubernetes, a ConfigMap overrides this file to set the correct API base URL.
@@ -407,17 +480,17 @@ npm run generate-api # Requires backend running on :8081
## Running Tests
Integration tests use Testcontainers (starts PostgreSQL and OpenSearch automatically — requires Docker):
Integration tests use Testcontainers (starts PostgreSQL automatically — requires Docker):
```bash
# All tests
mvn verify
# Unit tests only (no Docker needed)
mvn test -pl cameleer3-server-core
mvn test -pl cameleer-server-core
# Specific integration test
mvn test -pl cameleer3-server-app -Dtest=ExecutionControllerIT
mvn test -pl cameleer-server-app -Dtest=ExecutionControllerIT
```
## Verify Database Data
@@ -425,7 +498,7 @@ mvn test -pl cameleer3-server-app -Dtest=ExecutionControllerIT
After posting data and waiting for the flush interval (1s default):
```bash
docker exec -it cameleer3-server-postgres-1 psql -U cameleer -d cameleer3 \
docker exec -it cameleer-server-postgres-1 psql -U cameleer -d cameleer \
-c "SELECT count(*) FROM route_executions"
```
@@ -437,14 +510,16 @@ The full stack is deployed to k3s via CI/CD on push to `main`. K8s manifests are
```
cameleer namespace:
PostgreSQL (StatefulSet, 10Gi PVC) ← postgres:5432 (ClusterIP)
OpenSearch (StatefulSet, 10Gi PVC) ← opensearch:9200 (ClusterIP)
cameleer3-server (Deployment) ← NodePort 30081
cameleer3-ui (Deployment, Nginx) ← NodePort 30090
Authentik Server (Deployment) ← NodePort 30950
Authentik Worker (Deployment)
Authentik PostgreSQL (StatefulSet, 1Gi) ← ClusterIP
Authentik Redis (Deployment) ← ClusterIP
PostgreSQL (StatefulSet, 10Gi PVC) ← cameleer-postgres:5432 (ClusterIP)
ClickHouse (StatefulSet, 10Gi PVC) ← cameleer-clickhouse:8123 (ClusterIP)
cameleer-server (Deployment) ← NodePort 30081
cameleer-ui (Deployment, Nginx) ← NodePort 30090
cameleer-deploy-demo (Deployment) ← NodePort 30092
Logto Server (Deployment) ← NodePort 30951/30952
Logto PostgreSQL (StatefulSet, 1Gi) ← ClusterIP
cameleer-demo namespace:
(deployed Camel applications — managed by cameleer-deploy-demo)
```
### Access (from your network)
@@ -454,13 +529,15 @@ cameleer namespace:
| Web UI | `http://192.168.50.86:30090` |
| Server API | `http://192.168.50.86:30081/api/v1/health` |
| Swagger UI | `http://192.168.50.86:30081/api/v1/swagger-ui.html` |
| Authentik | `http://192.168.50.86:30950` |
| Deploy Demo | `http://192.168.50.86:30092` |
| Logto API | `LOGTO_ENDPOINT` secret (NodePort 30951 direct, or behind reverse proxy) |
| Logto Admin | `LOGTO_ADMIN_ENDPOINT` secret (NodePort 30952 direct, or behind reverse proxy) |
### CI/CD Pipeline
Push to `main` triggers: **build** (UI npm + Maven, unit tests) → **docker** (buildx amd64 for server + UI, push to Gitea registry) → **deploy** (kubectl apply + rolling update).
Required Gitea org secrets: `REGISTRY_TOKEN`, `KUBECONFIG_BASE64`, `CAMELEER_AUTH_TOKEN`, `CAMELEER_JWT_SECRET`, `POSTGRES_USER`, `POSTGRES_PASSWORD`, `POSTGRES_DB`, `OPENSEARCH_USER`, `OPENSEARCH_PASSWORD`, `CAMELEER_UI_USER` (optional), `CAMELEER_UI_PASSWORD` (optional), `AUTHENTIK_PG_USER`, `AUTHENTIK_PG_PASSWORD`, `AUTHENTIK_SECRET_KEY`, `CAMELEER_OIDC_ENABLED`, `CAMELEER_OIDC_ISSUER`, `CAMELEER_OIDC_CLIENT_ID`, `CAMELEER_OIDC_CLIENT_SECRET`.
Required Gitea org secrets: `REGISTRY_TOKEN`, `KUBECONFIG_BASE64`, `CAMELEER_SERVER_SECURITY_BOOTSTRAPTOKEN`, `CAMELEER_SERVER_SECURITY_JWTSECRET`, `POSTGRES_USER`, `POSTGRES_PASSWORD`, `POSTGRES_DB`, `CLICKHOUSE_USER`, `CLICKHOUSE_PASSWORD`, `CAMELEER_SERVER_SECURITY_UIUSER` (optional), `CAMELEER_SERVER_SECURITY_UIPASSWORD` (optional), `LOGTO_PG_USER`, `LOGTO_PG_PASSWORD`, `LOGTO_ENDPOINT` (public-facing Logto URL, e.g., `https://auth.cameleer.my.domain`), `LOGTO_ADMIN_ENDPOINT` (admin console URL), `CAMELEER_SERVER_SECURITY_OIDCISSUERURI` (optional, for resource server M2M token validation), `CAMELEER_SERVER_SECURITY_OIDCAUDIENCE` (optional, API resource indicator), `CAMELEER_SERVER_SECURITY_OIDCTLSSKIPVERIFY` (optional, skip TLS cert verification for self-signed CAs).
### Manual K8s Commands
@@ -469,14 +546,14 @@ Required Gitea org secrets: `REGISTRY_TOKEN`, `KUBECONFIG_BASE64`, `CAMELEER_AUT
kubectl -n cameleer get pods
# View server logs
kubectl -n cameleer logs -f deploy/cameleer3-server
kubectl -n cameleer logs -f deploy/cameleer-server
# View PostgreSQL logs
kubectl -n cameleer logs -f statefulset/postgres
kubectl -n cameleer logs -f statefulset/cameleer-postgres
# View OpenSearch logs
kubectl -n cameleer logs -f statefulset/opensearch
# View ClickHouse logs
kubectl -n cameleer logs -f statefulset/cameleer-clickhouse
# Restart server
kubectl -n cameleer rollout restart deployment/cameleer3-server
kubectl -n cameleer rollout restart deployment/cameleer-server
```

259
UI-CONSISTENCY-AUDIT.md Normal file
View File

@@ -0,0 +1,259 @@
> **Status: RESOLVED** — All phases (1-5) executed on 2026-04-09. Remaining: responsive design (separate initiative).
# UI Consistency Audit — cameleer-server
**Date:** 2026-04-09
**Scope:** All files under `ui/src/` (26 CSS modules, ~45 TSX components, ~15 pages)
**Verdict:** ~55% design system adoption for interactive UI. Significant duplication and inline style debt.
---
## Executive Summary
| Dimension | Score | Key Issue |
|-----------|-------|-----------|
| Design system component adoption | 55% | 32 raw `<button>`, 12 raw `<select>`, 8 raw `<input>` should use DS |
| Color consistency | Poor | ~140 violations: 45 hardcoded hex in TSX, 13 naked hex in CSS, ~55 fallback hex in `var()` |
| Inline styles | Poor | 55 RED (static inline styles), 8 YELLOW, 14 GREEN (justified) |
| Layout consistency | Mixed | 3 different page padding values, mixed gap/margin approaches |
| CSS module duplication | 22% | ~135 of 618 classes are copy-pasted across files |
| Responsive design | None | Zero `@media` queries in entire UI |
---
## 1. Critical: Hardcoded Colors (CLAUDE.md violation)
The project rule states: *"Always use `@cameleer/design-system` CSS variables for colors — never hardcode hex values."*
### Worst offenders
| File | Violations | Severity |
|------|-----------|----------|
| `ProcessDiagram/DiagramNode.tsx` | ~20 hex values in SVG fill/stroke | Critical |
| `ExecutionDiagram/ExecutionDiagram.module.css` | 17 naked hex + ~40 hex fallbacks in `var()` | Critical |
| `ProcessDiagram/CompoundNode.tsx` | 8 hex values | Critical |
| `ProcessDiagram/DiagramEdge.tsx` | 3 hex values | High |
| `ProcessDiagram/ConfigBadge.tsx` | 3 hex values | High |
| `ProcessDiagram/ErrorSection.tsx` | 2 hex values | High |
| `ProcessDiagram/NodeToolbar.tsx` | 2 hex values | High |
| `ProcessDiagram/Minimap.tsx` | 3 hex values | High |
| `Dashboard/Dashboard.module.css` | `#5db866` (not even a DS color) | High |
| `AppsTab/AppsTab.module.css` | `var(--accent, #6c7aff)` (undefined DS variable) | Medium |
### Undefined CSS variables (not in design system)
| Variable | Files | Should be |
|----------|-------|-----------|
| `--accent` | EnvironmentSelector, AppsTab | `--amber` (or define in DS) |
| `--bg-base` | LoginPage | `--bg-body` |
| `--surface` | ContentTabs, ExchangeHeader | `--bg-surface` |
| `--bg-surface-raised` | AgentHealth | `--bg-raised` |
### Missing DS tokens needed
Several tint/background colors are used repeatedly but have no DS variable:
- `--error-bg` (used as `#FDF2F0`, `#F9E0DC`)
- `--success-bg` (used as `#F0F9F1`)
- `--amber-bg` / `--warning-bg` (used as `#FFF8F0`)
- `--bg-inverse` / `--text-inverse` (used as `#1A1612` / `#E4DFD8`)
---
## 2. Critical: CSS Module Duplication (~22%)
~135 of 618 class definitions are copy-pasted across files.
### Table section pattern — 5 files, ~35 duplicate classes
`.tableSection`, `.tableHeader`, `.tableTitle`, `.tableMeta`, `.tableRight` are **identical** in:
- `DashboardTab.module.css`
- `AuditLogPage.module.css`
- `ClickHouseAdminPage.module.css`
- `RoutesMetrics.module.css`
- `RouteDetail.module.css`
### Log viewer panel — 2 files, ~50 lines identical
`.logCard`, `.logHeader`, `.logToolbar`, `.logSearchWrap`, `.logSearchInput`, `.logSearchClear`, `.logClearFilters`, `.logEmpty`, `.sortBtn`, `.refreshBtn`, `.headerActions` — byte-for-byte identical in `AgentHealth.module.css` and `AgentInstance.module.css`.
### Tap modal form — 2 files, ~40 lines identical
`.typeSelector`, `.typeOption`, `.typeOptionActive`, `.testSection`, `.testTabs`, `.testTabBtn`, `.testTabBtnActive`, `.testBody`, `.testResult`, `.testSuccess`, `.testError` — identical in `TapConfigModal.module.css` and `RouteDetail.module.css`.
### Other duplicates
| Pattern | Files | Lines |
|---------|-------|-------|
| Rate color classes (`.rateGood/.rateWarn/.rateBad/.rateNeutral`) | DashboardTab, RouteDetail, RoutesMetrics | ~12 each |
| Refresh indicator + `@keyframes pulse` | DashboardTab, RoutesMetrics | ~15 each |
| Chart card (`.chartCard`) | AgentInstance, RouteDetail | ~6 each |
| Section card (`.section`) | AppConfigDetailPage, OidcConfigPage | ~7 each |
| Meta grid (`.metaGrid/.metaLabel/.metaValue`) | AboutMeDialog, UserManagement | ~9 each |
---
## 3. High: Inline Styles (55 RED violations)
### Files with zero CSS modules (all inline)
| File | Issue |
|------|-------|
| `pages/Admin/AdminLayout.tsx` | Entire layout wrapper is inline styled |
| `pages/Admin/DatabaseAdminPage.tsx` | All layout, typography, spacing inline — no CSS module |
| `auth/OidcCallback.tsx` | Full-page layout inline — no CSS module |
### Most inline violations
| File | RED count | Primary patterns |
|------|-----------|-----------------|
| `pages/AppsTab/AppsTab.tsx` | ~25 | Fixed-width inputs (`width: 50-90px` x18), visually-hidden pattern x2, table cell layouts |
| `components/LayoutShell.tsx` | 6 | StarredList sub-component, sidebar layout |
| `pages/Admin/EnvironmentsPage.tsx` | 8 | Raw `<select>` fully styled inline, save/cancel button rows |
| `pages/Routes/RouteDetail.tsx` | 5 | Heading styles, tab panel margins |
### Repeated inline patterns that need extraction
| Pattern | Occurrences | Fix |
|---------|-------------|-----|
| `style={{ display: 'flex', justifyContent: 'center', padding: '4rem' }}` (loading fallback) | 3 files | Create shared `<PageLoader>` |
| `style={{ position: 'absolute', width: 1, height: 1, clip: 'rect(0,0,0,0)' }}` (visually hidden) | 2 in AppsTab | Create `.visuallyHidden` utility class |
| `style={{ width: N }}` on `<Input>`/`<Select>` (fixed widths) | 18+ in AppsTab | Size classes or CSS module rules |
| `style={{ marginTop: 8, display: 'flex', gap: 8, justifyContent: 'flex-end' }}` (action row) | 3+ in EnvironmentsPage | Shared `.editActions` class |
---
## 4. High: Design System Component Adoption Gaps
### Native HTML that should use DS components
| Element | Instances | Files | DS Replacement |
|---------|-----------|-------|---------------|
| `<button>` | 32 | 8 files | `Button`, `SegmentedTabs` |
| `<select>` | 12 | 4 files | `Select` |
| `<input>` | 8 | 4 files | `Input`, `Toggle`, `Checkbox` |
| `<label>` | 9 | 2 files | `FormField`, `Label` |
| `<table>` (data) | 2 | 2 files | `DataTable`, `LogViewer` |
### Highest-priority replacements
1. **`EnvironmentSelector.tsx`** — zero DS imports, entire component is a bare `<select>`. Used globally in sidebar.
2. **`ExecutionDiagram/tabs/LogTab.tsx`** — reimplements LogViewer from scratch (raw table + input + button). AgentInstance and AgentHealth already use DS `LogViewer` correctly.
3. **`AppsTab.tsx` sub-tabs** — 3 instances of homegrown `<button>` tab bars. DS provides `SegmentedTabs` and `Tabs`.
4. **`AppConfigDetailPage.tsx`** — 4x `<select>`, 4x `<label>`, 2x `<input type="checkbox">`, 4x `<button>` — all have DS equivalents already used elsewhere.
5. **`AgentHealth.tsx`** — config bar uses `Toggle` (correct) alongside raw `<select>` and `<button>` (incorrect).
### Cross-page inconsistencies
| Pattern | Correct usage | Incorrect usage |
|---------|--------------|-----------------|
| Log viewer | AgentInstance, AgentHealth use DS `LogViewer` | LogTab rebuilds from scratch |
| Config edit form | Both pages render same 4 fields | AgentHealth uses `Toggle`, AppConfigDetail uses `<input type="checkbox">` |
| Sub-tabs | RbacPage uses DS `Tabs` | AppsTab uses homegrown `<button>` tabs with non-DS `--accent` color |
| Select dropdowns | AppsTab uses DS `Select` for some fields | Same file uses raw `<select>` for other fields |
---
## 5. Medium: Layout Inconsistencies
### Page padding (3 different values)
| Pages | Padding |
|-------|---------|
| AgentHealth, AgentInstance, AdminLayout | `20px 24px 40px` |
| AppsTab | `16px` (all sides) |
| DashboardTab, Dashboard | No padding (full-bleed) |
### Section gap spacing (mixed approaches)
| Approach | Pages |
|----------|-------|
| CSS `gap: 20px` on flex container | DashboardTab, RoutesMetrics |
| `margin-bottom: 20px` | AgentInstance |
| Mixed `margin-bottom: 16px` and `20px` on same page | AgentHealth, ClickHouseAdminPage |
### Typography inconsistencies
| Issue | Details |
|-------|---------|
| Card title weight | Most use `font-weight: 600`, RouteDetail `.paneTitle` uses `700` |
| Chart title style | RouteDetail: `12px/700/uppercase`, AgentHealth: `12px/600/uppercase` |
| Font units | ExchangeHeader + TabKpis use `rem`, everything else uses `px` |
| Raw headings | DatabaseAdminPage uses `<h2>`/`<h3>` with inline styles; all others use DS `SectionHeader` or CSS classes |
| Table header padding | Most: `12px 16px`, Dashboard: `8px 12px`, AgentHealth eventCard: `10px 16px` |
### Stat strip layouts
| Page | Layout | Gap |
|------|--------|-----|
| AgentHealth, AgentInstance, RbacPage | CSS grid `repeat(N, 1fr)` | `10px` |
| ClickHouseAdminPage | Flexbox (unequal widths) | `10px` |
| DatabaseAdminPage | Inline flex | `1rem` (16px) |
### Empty state patterns (4 different approaches)
1. DS `<EmptyState>` component (AgentInstance — correct)
2. `EntityList emptyMessage` prop (EnvironmentsPage, RbacPage)
3. `.logEmpty` CSS class, `12px`, `var(--text-faint)` (AgentHealth, AgentInstance)
4. `.emptyNote` CSS class, `12px`, `italic` (AppsTab)
5. Inline `0.875rem`, `var(--text-muted)` (ExchangesPage)
### Loading state patterns (3 different approaches)
1. `<Spinner size="lg">` in flex div with inline `padding: 4rem` — copy-pasted 3 times
2. `<Spinner size="md">` returned directly, no centering (EnvironmentsPage)
3. No loading UI, data simply absent (DashboardL1/L2/L3)
---
## 6. Low: Other Findings
- **`!important`**: 1 use in `RouteControlBar.module.css` — works around specificity conflict
- **Zero responsive design**: no `@media` queries anywhere
- **Z-index**: only 4 uses, all in diagram components (5 and 10), consistent
- **Naming convention**: all camelCase — consistent, no issues
- **Unused CSS classes**: ~11 likely unused in AppsTab (old create-modal classes) and TapConfigModal
---
## Recommended Fix Order
### Phase 1: Design system tokens (unblocks everything else)
1. Add missing DS variables: `--error-bg`, `--success-bg`, `--amber-bg`, `--bg-inverse`, `--text-inverse`
2. Fix undefined variables: `--accent` -> `--amber`, `--bg-base` -> `--bg-body`, `--surface` -> `--bg-surface`
### Phase 2: Eliminate CSS duplication (~22% of all classes)
3. Extract shared `tableSection` pattern to shared CSS module (saves ~140 duplicate lines across 5 files)
4. Extract shared log viewer CSS to shared module (saves ~50 lines across 2 files)
5. Remove duplicate tap modal CSS from RouteDetail (saves ~40 lines)
6. Extract shared rate/refresh/chart patterns
### Phase 3: Fix hardcoded colors
7. Replace all hex in `ProcessDiagram/*.tsx` SVG components (~45 values)
8. Replace all hex in `ExecutionDiagram.module.css` (~17 naked + strip ~40 fallbacks)
9. Fix remaining CSS hex violations (Dashboard, AppsTab, AgentHealth)
### Phase 4: Replace native HTML with DS components
10. `EnvironmentSelector` -> DS `Select`
11. `LogTab` -> DS `LogViewer`
12. `AppsTab` sub-tabs -> DS `SegmentedTabs`
13. `AppConfigDetailPage` form elements -> DS `Select`/`Toggle`/`FormField`/`Button`
14. Remaining `<button>` -> DS `Button`
### Phase 5: Eliminate inline styles
15. Create CSS modules for AdminLayout, DatabaseAdminPage, OidcCallback
16. Extract shared `<PageLoader>` component
17. Move AppsTab fixed-width inputs to CSS module size classes
18. Move remaining inline margins/flex patterns to CSS classes
### Phase 6: Standardize layout patterns
19. Unify page padding to `20px 24px 40px`
20. Standardize section gaps to `gap: 20px` on flex containers
21. Normalize font units to `px` throughout
22. Standardize empty state to DS `<EmptyState>`
23. Standardize loading state to shared `<PageLoader>`

View File

@@ -1,4 +1,4 @@
# UI/UX Evaluation Report — Cameleer3 Server
# UI/UX Evaluation Report — Cameleer Server
**Date:** 2026-03-25
**Evaluated URL:** http://192.168.50.86:30090/
@@ -8,7 +8,7 @@
## Executive Summary
The Cameleer3 dashboard has a **distinctive, well-crafted warm amber design language** that stands out in the observability space. The core monitoring pages (Dashboard, Exchange Detail, Routes, Agents) are polished and consistent. The design system provides a solid foundation.
The Cameleer dashboard has a **distinctive, well-crafted warm amber design language** that stands out in the observability space. The core monitoring pages (Dashboard, Exchange Detail, Routes, Agents) are polished and consistent. The design system provides a solid foundation.
**Key strengths:** KPI strip pattern, command palette (Ctrl+K), agent card grouping, dark mode token system, cohesive brand identity.

BIN
audit/01-exchanges-list.png Normal file

Binary file not shown.

After

Width:  |  Height:  |  Size: 269 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 54 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 54 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 271 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 51 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 61 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 200 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 270 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 168 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 68 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 193 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 202 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 61 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 69 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 83 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 90 KiB

BIN
audit/07-dashboard-full.png Normal file

Binary file not shown.

After

Width:  |  Height:  |  Size: 118 KiB

BIN
audit/07-dashboard-tab.png Normal file

Binary file not shown.

After

Width:  |  Height:  |  Size: 119 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 69 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 135 KiB

BIN
audit/09-runtime-full.png Normal file

Binary file not shown.

After

Width:  |  Height:  |  Size: 144 KiB

BIN
audit/09-runtime-tab.png Normal file

Binary file not shown.

After

Width:  |  Height:  |  Size: 144 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 69 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 129 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 73 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 95 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 95 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 60 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 62 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 67 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 68 KiB

BIN
audit/13-groups-tab.png Normal file

Binary file not shown.

After

Width:  |  Height:  |  Size: 51 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 244 KiB

Some files were not shown because too many files have changed in this diff Show More