The jdbcTemplate() method was calling dataSource(properties) directly,
creating a new DataSource instance instead of using the Spring-managed
@Primary bean. This caused some repositories to receive the ClickHouse
connection instead of PostgreSQL.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
ExecutionFlushScheduler drains MergedExecution and ProcessorBatch write
buffers on a fixed interval and delegates batch inserts to
ClickHouseExecutionStore. Also sweeps stale exchanges every 60s.
ChunkIngestionController exposes POST /api/v1/data/chunks, accepts
single or array ExecutionChunk payloads, and feeds them into the
ChunkAccumulator. Conditional on ChunkAccumulator bean (clickhouse.enabled).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
When clickhouse.enabled=true, the ClickHouse JdbcTemplate bean prevents
Spring Boot auto-config from creating the default PG JdbcTemplate.
All PG repositories then get the CH JdbcTemplate and fail with
"Table cameleer.audit_log does not exist".
Fix: explicitly create @Primary DataSource and JdbcTemplate from
DataSourceProperties so PG remains the default for unqualified injections.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
clickhouse-jdbc 0.9.7 rejects async_insert and wait_for_async_insert as
unknown URL parameters. These are server-side settings, not driver config.
Can be set per-query later if needed via custom_settings.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Set CLICKHOUSE_USER/PASSWORD via k8s secret (fixes "disabling network
access for user 'default'" when no password is set)
- Add clickhouse-credentials secret to CI deploy + feature branch copy
- Pass CLICKHOUSE_USERNAME/PASSWORD env vars to server pod
- Make schema initializer non-fatal so server starts even if CH is
temporarily unavailable
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Testcontainers tests need Docker which isn't available in CI.
Rename to *IT so Surefire skips them (Failsafe runs them with -DskipITs=false).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Implements MetricsQueryStore using ClickHouse toStartOfInterval() for
time-bucketed aggregation queries; verified with 4 Testcontainers tests.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
TDD implementation of MetricsStore backed by ClickHouse. Uses native
Map(String,String) column type (no JSON cast), relies on ClickHouse
DEFAULT for server_received_at, and handles null tags by substituting
an empty HashMap. All 4 Testcontainers tests pass.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adds ClickHouseSchemaInitializer that runs on ApplicationReadyEvent,
scanning classpath:clickhouse/*.sql in filename order and executing each
statement. Adds V1__agent_metrics.sql with MergeTree table, tenant/agent
partitioning, and 365-day TTL.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adds ClickHouseProperties (bound to clickhouse.*), ClickHouseConfig
(conditional HikariDataSource + JdbcTemplate beans), and extends
application.yml with clickhouse.enabled/url/username/password and
cameleer.storage.metrics properties.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The ExecutionDocument and ExecutionRecord records gained an isReplay
field but the integration tests were not updated, breaking CI.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- ElkDiagramRenderer: guard against null containingNode before getElkRoot()
- OpenSearchAdminController: return 503/502 instead of 200 on errors
- DatabaseAdminController: return 503 instead of 200 on connection failure
- SpaForwardController: replace unbound {path} variables with /** wildcards
- WriteBuffer: check offer() return value and log on unexpected rejection
- ApiExceptionHandler: extract getReason() to local var for null safety
- Admin UI pages: handle isError state for disconnected service display
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Detect replayed exchanges via X-Cameleer-Replay header during ingestion,
persist the flag through PostgreSQL and OpenSearch, and surface it in
the dashboard (amber replay icon) and exchange detail chain view.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replay audit log now records the agent's reply status (SUCCESS/FAILURE),
message, and error details. Timeout and internal errors are also logged
as FAILURE with the cause.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add dedicated POST /agents/{id}/replay endpoint that uses
addCommandWithReply to wait for the agent ACK (30s timeout).
Returns the actual replay result (status, message, data) instead
of just a delivery confirmation.
Frontend toast now reflects the agent's response: "Replay completed"
on success, agent error message on failure, timeout message if the
agent doesn't respond.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add ROUTE_CONTROL command type and route-control mapping in
AgentCommandController. New RouteControlBar component in the exchange
header shows Start/Stop/Suspend/Resume actions (grouped pill bar) and
a Replay button, gated by agent capabilities and OPERATOR/ADMIN role.
Fix useReplayExchange hook to match protocol section 16: payload now
uses { routeId, exchange: { body, headers }, originalExchangeId, nonce }
instead of the flat { headers, body } format.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Treemap: rectangle area = transaction volume, color = SLA compliance
(green→red). Shows apps at L1, routes at L2. Click navigates deeper.
Punchcard heatmap: 7-day rolling weekday x 24-hour grid showing
transaction volume and error patterns. Two side-by-side views
(transactions + errors) reveal temporal clustering.
Backend: new GET /search/stats/punchcard endpoint aggregating
stats_1m_all/app by DOW x hour over rolling 7 days.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
RouteGraph no longer stores a separate nodes list; getNodes() computes
from root tree. Tests now build proper tree via setRoot() + setChildren()
instead of calling setNodes().
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The agent now sends shallow copies (without children) in the flat nodes
list. Build nodeById map by walking graph.getRoot() tree which preserves
children, falling back to flat list via putIfAbsent for compatibility.
Also adds EIP_FILTER, EIP_IDEMPOTENT_CONSUMER, EIP_RECIPIENT_LIST as
new compound container types per updated DIAGRAMS.md.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Restores e8039f9. The compound rendering regression was caused by
the agent sending flat nodes without children, not the renderer code.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Reverting e8039f9 to diagnose compound rendering regression affecting
all compound types (SPLIT, CHOICE, LOOP, DO_TRY) and error handlers.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Follow the DO_TRY pattern: virtual _CB_MAIN wrapper for main path children,
onFallback rendered as _CB_FALLBACK section with purple dashed border.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The cross-root boundary check in createElkEdges() was too aggressive,
skipping all edges where source and target have different ELK roots.
Compound nodes are their own ELK roots, so valid continuation edges
from the last child inside a compound to the next sibling were lost.
Now allows edges when nodes share a common grandparent or when one
node exits/enters a compound boundary.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Show resolved endpoint URI as teal italic line on diagram nodes
when execution overlay is active
- Enable drill-down for TO and TO_DYNAMIC nodes (not just DIRECT/SEDA)
- Use runtime resolvedEndpointUri from execution overlay for drill-down
when static endpointUri doesn't match
- Increase node height from 50px to 56px to accommodate the third line
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The toMap() method was missing the has_trace_data field, so it was
never indexed despite being read back in hitToSummary().
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Trace data visibility:
- ProcessorNode now includes hasTraceData flag computed from captured
body/headers during tree conversion
- ConfigBadge shows teal for tracing configured, green when data captured
- Search results show green footprints icon for exchanges with trace data
- New has_trace_data column on executions table (V11 migration with backfill)
- OpenSearch documents and ExecutionSummary include the flag
Inline tap configuration:
- Extracted reusable TapConfigModal component from RouteDetail
- Diagram context menu opens tap modal inline instead of navigating away
- Toggle-trace action works immediately with toast feedback
- Modal closes only on ESC, Cancel, Save, or Delete (not backdrop click)
Detail panel tab gating:
- Headers, Input, Output tabs disabled when no data is available
- Works at both exchange and processor level
- Falls back to Info tab when active tab becomes empty
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Fixes iteration overlay corruption caused by flat storage collapsing
duplicate processorIds across loop iterations.
Server:
- Store raw processor tree as processors_json JSONB on executions table
- Detail endpoint serves from processors_json (faithful tree), falls back
to flat record reconstruction for older executions
- V10 migration: processors_json, error categorization (errorType,
errorCategory, rootCauseType, rootCauseMessage), OTel (traceId, spanId),
circuit breaker (circuitBreakerState, fallbackTriggered), drops
erroneous splitDepth/loopDepth columns
- Add all new fields through full ingestion/storage/API chain
UI:
- Fix overlay wrapper filtering: check wrapper type before status filter
- Add new fields to schema.d.ts
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add resolvedEndpointUri, splitDepth, loopDepth arguments to
ProcessorRecord constructors in TreeReconstructionTest and
PostgresExecutionStoreIT.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Wire resolvedEndpointUri through the full chain:
- V9 migration adds resolved_endpoint_uri column
- IngestionService extracts from ProcessorExecution
- PostgresExecutionStore persists and reads the column
- ProcessorNode includes field in detail API response
- UI schema updated for ProcessorNode and PositionedNode
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Server:
- Add endpointUri to PositionedNode (from RouteNode)
- Add fromEndpointUri to RouteSummary (catalog API)
- Catalog controller resolves endpoint URI from diagram store
UI:
- Build endpointRouteMap from catalog's fromEndpointUri field
- Drill-down uses exact match on node.endpointUri against the map
- Remove label parsing heuristics (extractTargetEndpoint, camelToKebab)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Remove PORT_ALIGNMENT_DEFAULT=BEGIN so NETWORK_SIMPLEX centers edges
at the vertical midpoint of the compound instead of the top.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Left-align all sections (try_body, doFinally, doCatch) within DO_TRY
- Shrink DO_TRY height to match actual content, removing bottom padding
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Use NETWORK_SIMPLEX placement for vertical centering of root flow nodes
- Skip structural edges from all compound nodes to descendants (not just DO_TRY)
- Reduce DO_TRY section spacing from NODE_SPACING*0.4 to fixed 20px
- Use SVG clipPath for node text instead of character-count truncation
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Increase node width (160→220), height (40→50), spacing (90→120)
- Use SVG clipPath for text instead of character-count truncation
- Add UI sources, ESLint report, and sonar-scanner CLI to SonarQube workflow
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Port alignment BEGIN on DO_TRY compounds makes edges attach at the top
instead of center, keeping the main flow level. Post-processing also
stretches all DO_TRY sections (doFinally, doCatch) to match the widest
section's width for visual consistency.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
ELK's partitioning doesn't reliably order disconnected children within
a compound node. Instead, let ELK lay out freely then re-stack sections
in correct order (try_body → doFinally → doCatch) by adjusting Y
positions in the ELK graph before extraction. This propagates correctly
to both node and edge coordinates via getAbsoluteY().
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>