OIDC tokens had subject "oidc:<sub>" which didn't match the "ui:" prefix
check in JwtAuthenticationFilter, causing every post-login API call to
return 401 and trigger automatic logout. Renamed the prefix from "ui:"
to "user:" across all auth code for clarity (it covers both browser and
API clients, not just UI).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add dedicated request/response DTOs for all controllers, replacing raw
JsonNode parameters with validated types. Move OpenAPI path-prefix stripping
and ProcessorNode children injection into OpenApiCustomizer beans so the
spec served at /api/v1/api-docs is already clean — eliminating the need for
the ui/scripts/process-openapi.mjs post-processing script.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Backend:
- Expose end_session_endpoint from OIDC provider metadata in /auth/oidc/config
- Add getEndSessionEndpoint() to OidcTokenExchanger
Frontend:
- On OIDC logout, redirect to provider's end_session_endpoint to clear SSO session
- Strip /api/v1 prefix from OpenAPI paths to match client baseUrl convention
- Add schema-types.ts with convenience type re-exports from generated schema
- Fix all type imports to use schema-types instead of raw generated schema
- Fix optional field access (processors, children, duration) with proper typing
- Fix AgentInstance.state → status field name
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The SPA catch-all was missing these paths, causing 404 when Authentik
redirected back to /oidc/callback after authentication.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Add "Sign in with SSO" button on login page (shown when OIDC is configured)
- Add /oidc/callback route to exchange authorization code for JWT tokens
- Add loginWithOidcCode action to auth store
- Treat issuer URI as complete discovery URL (no auto-append of .well-known)
- Update admin page placeholder to show full discovery URL format
- Fix datetime picker calendar icon visibility in dark mode (color-scheme)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The CREATE TABLE IF NOT EXISTS won't add new columns to an existing table.
Add 05-oidc-auto-signup.sql with ALTER TABLE ADD COLUMN IF NOT EXISTS and
register it in ClickHouseConfig startup schema + test init.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Backend: add autoSignup field to OidcConfig, ClickHouse schema, repository,
and admin controller. Gate OIDC login when auto-signup is disabled and user
is not pre-created (returns 403).
Frontend: add OIDC admin page with full CRUD (save/test/delete), role-gated
Admin nav link parsed from JWT, and matching design system styles.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
OIDC provider settings (issuer, client ID/secret, roles claim) are
now stored in ClickHouse and managed via admin REST API at
/api/v1/admin/oidc. This allows runtime configuration from the UI
without server restarts.
- New oidc_config table (ReplacingMergeTree, singleton row)
- OidcConfig record + OidcConfigRepository interface in core
- ClickHouseOidcConfigRepository implementation
- OidcConfigAdminController: GET/PUT/DELETE config, POST test
connectivity, client_secret masked in responses
- OidcTokenExchanger: reads config from DB, invalidateCache()
on config change
- OidcAuthController: always registered (no @ConditionalOnProperty),
returns 404 when OIDC not configured
- Startup seeder: env vars seed DB on first boot only, then admin
API takes over
- HOWTO.md updated with admin OIDC config API examples
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Implement three-phase security upgrade:
Phase 1 - RBAC: Extend JWT with roles claim, populate Spring
GrantedAuthority in filter, enforce role-based access (AGENT for
data/heartbeat/SSE, VIEWER+ for search/diagrams, OPERATOR+ for
commands, ADMIN for user management). Configurable JWT secret via
CAMELEER_JWT_SECRET env var for token persistence across restarts.
Phase 2 - User persistence: ClickHouse users table with
ReplacingMergeTree, UserRepository interface + ClickHouse impl,
UserAdminController for CRUD at /api/v1/admin/users. Local login
upserts user on each authentication.
Phase 3 - OIDC: Token exchange flow where SPA sends auth code,
server exchanges it server-side (keeping client_secret secure),
validates id_token via JWKS, resolves roles (DB override > OIDC
claim > default), issues internal JWT. Conditional on
CAMELEER_OIDC_ENABLED=true. Uses oauth2-oidc-sdk for standards
compliance.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
ClickHouse avg() and quantile() return nan/inf on zero rows, which
toInt64() cannot convert. Wrap with ifNotFinite(..., 0) to default to
zero. Applied to both stats and timeseries queries.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Stats endpoint now returns current + previous period (24h shift) values
plus today's total count. UI shows:
- Total Matches: "of 12.3K today"
- Avg Duration: arrow + % vs yesterday
- Failure Rate: percentage of errors vs total, arrow + % vs yesterday
- P99 Latency: arrow + % vs yesterday
- In-Flight: unchanged (running executions)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
All stat card values now come from the /search/stats endpoint which
queries the full time window, not just the current page of results.
Consolidated into a single ClickHouse query for efficiency.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Execution rows are wide (29 cols with serialized arrays/JSON), so 500
rows can exceed ClickHouse's memory limit. Reduce default batch size
from 500 to 100 and bump ClickHouse memory limit from 2Gi to 4Gi.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
quantile(0.99) returns Float64 which ClickHouse JDBC cannot cast to
Long directly. Same toInt64() pattern already used in timeseries query.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
P99 latency and active count now use the same from/to parameters as the
timeseries sparklines, so all stat cards are consistent with the user's
selected time range.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replace toStartOfInterval with intDiv on epoch seconds, and cast
avg/quantile results to Int64 to avoid Float64 JDBC mapping issues.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
New /search/stats/timeseries endpoint returns bucketed counts/metrics
over a time window using ClickHouse toStartOfInterval(). Frontend
Sparkline component renders SVG polyline + gradient fill on each
stat card, driven by a useStatsTimeseries query hook.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Backend now returns 401 instead of 403 for unauthenticated requests
via HttpStatusEntryPoint. UI middleware handles both 401 and 403,
triggering token refresh and redirecting to /login on failure.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Status filter now parses comma-separated values into SQL IN clause
instead of exact match, so filtering by multiple statuses works.
Added GET /api/v1/search/stats returning P99 latency (last hour) and
active execution count, wired into the UI stat cards with 10s polling.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Propagate authenticated agent identity through write buffers via
TaggedExecution/TaggedDiagram wrappers so ClickHouse rows get real
agent IDs instead of empty strings
- Add execution_id to text search LIKE clause so selecting an execution
by ID in the palette actually finds it
- Clear status filter to all three statuses on palette selection so the
chosen execution/agent isn't filtered out
- Add disabled Routes and Exchanges scope tabs with "coming soon" state
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Backend: add routeId, agentId, processorType filter fields to SearchRequest
and ClickHouseSearchEngine. Expand global text search to match route_id and
agent_id columns.
Frontend: new command palette component (portal overlay, Zustand store,
TanStack Query search hook with 300ms debounce, filter chip parsing,
keyboard navigation, scope tabs). Search bar in SearchFilters and TopNav
now open the palette. Selecting a result writes filters to the execution
search store to drive the results table.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Scaffold Vite + React + TypeScript frontend in ui/ with full design
system (dark/light themes) matching the HTML mockups
- Implement Execution Explorer page: search filters, results table with
expandable processor tree and exchange detail sidebar, pagination
- Add UI authentication: UiAuthController (login/refresh endpoints),
JWT filter handles ui: subject prefix, CORS configuration
- Shared components: StatusPill, DurationBar, StatCard, AppBadge,
FilterChip, Pagination — all using CSS Modules with design tokens
- API client layer: openapi-fetch with auth middleware, TanStack Query
hooks for search/detail/snapshot queries, Zustand for state
- Standalone deployment: Nginx Dockerfile, K8s Deployment + ConfigMap +
NodePort (30080), runtime config.js for API base URL
- Embedded mode: maven-resources-plugin copies ui/dist into JAR static
resources, SPA forward controller for client-side routing
- CI/CD: UI build step, Docker build/push for server-ui image, K8s
deploy step for UI, UI credential secrets
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Increase ClickHouse memory limit from 1Gi to 2Gi and reduce default
batch size from 5000 to 500. During VM backup snapshots, I/O contention
prevents ClickHouse from flushing writes fast enough, causing buffer
accumulation that exceeds the 1Gi container limit.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The DriverManager-based approach likely failed because the ClickHouse
JDBC driver wasn't registered with DriverManager. The original
JdbcTemplate approach worked for route_diagrams and agent_metrics —
only route_executions was skipped due to the comment-parsing bug.
Reverts to simple JdbcTemplate-based init with unqualified table names
(DataSource targets cameleer3 database). The CLICKHOUSE_DB env var on
the ClickHouse container handles database creation.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
split(';') produced chunks starting with '--' comment lines, causing
the startsWith('--') check to skip the entire CREATE TABLE statement
for route_executions. Now strips comment lines before splitting.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The auto-configured DataSource targets jdbc:ch://.../cameleer3 which fails
if the database doesn't exist yet. Schema init now uses a direct JDBC
connection to the root URL, creates the database first, then applies all
schema SQL with fully qualified cameleer3.* table names.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
ClickHouseConfig.ensureDatabaseExists() connects without the database path
to run CREATE DATABASE IF NOT EXISTS before the main DataSource is used.
Removes the ConfigMap-based init scripts from the K8s manifest — the server
is now the single owner of all ClickHouse schema management.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Server now applies schema via @PostConstruct using classpath SQL files.
All statements use IF NOT EXISTS/IF NOT EXISTS so it's idempotent and
safe to run on every startup. Removes ConfigMap and init script mount
from K8s manifest since ClickHouse no longer needs to manage the schema.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- SsePayloadSigner signs JSON payloads and adds signature field before SSE delivery
- SseConnectionManager signs all command payloads via SsePayloadSigner before sendEvent
- Signed payload parsed to JsonNode for correct SseEmitter serialization
- Integration tests use bootstrap token + JWT auth (adapts to Plan 02 security layer)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- SsePayloadSignerTest: 7 unit tests for sign/verify roundtrip and edge cases
- SseSigningIT: 2 integration tests for end-to-end SSE signature verification
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- JwtServiceImpl: HMAC-SHA256 via Nimbus JOSE+JWT with ephemeral 256-bit secret
- Ed25519SigningServiceImpl: JDK 17 KeyPairGenerator with ephemeral keypair
- BootstrapTokenValidator: constant-time comparison with dual-token rotation
- SecurityBeanConfig: bean wiring with fail-fast validation for CAMELEER_AUTH_TOKEN
- SecurityProperties: config binding for token expiry and bootstrap tokens
- TestSecurityConfig: permit-all filter chain to keep existing tests green
- application.yml: security config with env var mapping
- All 18 security unit tests pass, all 71 tests pass in full verify
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- SseConnectionManager with per-agent SseEmitter, ping keepalive, event delivery
- AgentSseController GET /{id}/events SSE endpoint with Last-Event-ID support
- AgentCommandController with single/group/broadcast command targeting + ack
- WebConfig excludes SSE events path from protocol version interceptor
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Inject DiagramRepository into ClickHouseExecutionRepository for hash lookup
- Replace empty string placeholder with actual SHA-256 diagram hash in insertBatch
- Add Surefire/Failsafe forkCount=1 reuseForks=false for classloader isolation
- Add failsafe-plugin integration-test/verify goals for IT execution
- Create DiagramLinkingIT with positive (hash populated) and negative (empty fallback) cases
- Fix flaky awaitility assertions with ignoreExceptions for EmptyResultDataAccess
- Increase IngestionSchemaIT timeouts to 30s for reliable batch flush waits
- Adjust SearchControllerIT pagination assertion to match correct seed data count
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Test 1: diagram_content_hash populated with SHA-256 when RouteGraph exists
- Test 2: diagram_content_hash empty when no RouteGraph exists for route
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Use correlationId scoping and >= assertions for status/duration tests
- Prevents false failures when other test classes seed data in same container
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Implement findRawById and findProcessorSnapshot in ClickHouseExecutionRepository
- DetailController with GET /executions/{id} returning nested processor tree
- GET /executions/{id}/processors/{index}/snapshot for per-processor exchange data
- 5 unit tests for tree reconstruction (linear, branching, multiple roots, empty)
- 6 integration tests for detail endpoint, snapshot, and 404 handling
- Added assertj and mockito test dependencies to core module
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- ClickHouseSearchEngine with dynamic WHERE clause building and LIKE escape
- SearchController with GET (basic filters) and POST (advanced JSON body)
- SearchBeanConfig wiring SearchEngine, SearchService, DetailService beans
- 13 integration tests covering all filter types, combinations, pagination, empty results
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add Eclipse ELK core + layered algorithm and JFreeSVG dependencies to app module
- Create DiagramRenderer interface in core with renderSvg and layoutJson methods
- Create DiagramLayout, PositionedNode, PositionedEdge records for layout data
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- ExecutionController: POST /api/v1/data/executions (single or array)
- DiagramController: POST /api/v1/data/diagrams (single or array)
- MetricsController: POST /api/v1/data/metrics (array)
- All return 202 Accepted or 503 with Retry-After when buffer full
- Fix duplicate IngestionConfig bean (remove @Configuration, use @EnableConfigurationProperties)
- Fix BackpressureIT timing by using batch POST and 60s flush interval
- All 11 integration tests green
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- ExecutionControllerIT: single/array POST, flush verification, unknown fields
- DiagramControllerIT: single/array POST, flush verification
- MetricsControllerIT: POST metrics, flush verification
- BackpressureIT: buffer-full returns 503, buffered data not lost
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>