OIDC tokens had subject "oidc:<sub>" which didn't match the "ui:" prefix
check in JwtAuthenticationFilter, causing every post-login API call to
return 401 and trigger automatic logout. Renamed the prefix from "ui:"
to "user:" across all auth code for clarity (it covers both browser and
API clients, not just UI).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add dedicated request/response DTOs for all controllers, replacing raw
JsonNode parameters with validated types. Move OpenAPI path-prefix stripping
and ProcessorNode children injection into OpenApiCustomizer beans so the
spec served at /api/v1/api-docs is already clean — eliminating the need for
the ui/scripts/process-openapi.mjs post-processing script.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Backend:
- Expose end_session_endpoint from OIDC provider metadata in /auth/oidc/config
- Add getEndSessionEndpoint() to OidcTokenExchanger
Frontend:
- On OIDC logout, redirect to provider's end_session_endpoint to clear SSO session
- Strip /api/v1 prefix from OpenAPI paths to match client baseUrl convention
- Add schema-types.ts with convenience type re-exports from generated schema
- Fix all type imports to use schema-types instead of raw generated schema
- Fix optional field access (processors, children, duration) with proper typing
- Fix AgentInstance.state → status field name
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The SPA catch-all was missing these paths, causing 404 when Authentik
redirected back to /oidc/callback after authentication.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Add "Sign in with SSO" button on login page (shown when OIDC is configured)
- Add /oidc/callback route to exchange authorization code for JWT tokens
- Add loginWithOidcCode action to auth store
- Treat issuer URI as complete discovery URL (no auto-append of .well-known/openid-configuration)
- Update admin page placeholder to show full discovery URL format
- Fix datetime picker calendar icon visibility in dark mode (color-scheme)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
CREATE TABLE IF NOT EXISTS does not add new columns to an existing table.
Add 05-oidc-auto-signup.sql with ALTER TABLE ADD COLUMN IF NOT EXISTS and
register it in the ClickHouseConfig startup schema and the test init.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Backend: add autoSignup field to OidcConfig, ClickHouse schema, repository,
and admin controller. Gate OIDC login when auto-signup is disabled and user
is not pre-created (returns 403).
Frontend: add OIDC admin page with full CRUD (save/test/delete), role-gated
Admin nav link parsed from JWT, and matching design system styles.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
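The auto-signup gate described above reduces to a small decision. A minimal sketch, assuming a hypothetical helper name and that a successful login maps to 200 (only the 403 case is stated in the commit):

```typescript
// Hypothetical helper, not the real controller code.
// autoSignup: value of the OidcConfig autoSignup field.
// userExists: whether the user was pre-created before this login.
function oidcLoginStatus(autoSignup: boolean, userExists: boolean): 200 | 403 {
  // Login is rejected only when signup is disabled AND the user is unknown.
  return autoSignup || userExists ? 200 : 403;
}
```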
OIDC provider settings (issuer, client ID/secret, roles claim) are
now stored in ClickHouse and managed via admin REST API at
/api/v1/admin/oidc. This allows runtime configuration from the UI
without server restarts.
- New oidc_config table (ReplacingMergeTree, singleton row)
- OidcConfig record + OidcConfigRepository interface in core
- ClickHouseOidcConfigRepository implementation
- OidcConfigAdminController: GET/PUT/DELETE config, POST test
  connectivity, client_secret masked in responses
- OidcTokenExchanger: reads config from DB, invalidateCache()
  on config change
- OidcAuthController: always registered (no @ConditionalOnProperty),
  returns 404 when OIDC not configured
- Startup seeder: env vars seed DB on first boot only, then admin
  API takes over
- HOWTO.md updated with admin OIDC config API examples
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
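The client_secret masking mentioned above could look roughly like this. It is only a sketch: the field names and the mask string are assumptions, not the actual OidcConfig record or controller code.

```typescript
// Hypothetical view of the stored config; the real record may differ.
interface OidcConfigView {
  issuerUri: string;
  clientId: string;
  clientSecret: string;
  rolesClaim: string;
}

// Assumed placeholder; the real mask value is not stated in the commit.
const MASK = "********";

// Returns a copy that is safe to serialize into an API response.
// The input object is left untouched.
function maskSecret(config: OidcConfigView): OidcConfigView {
  return { ...config, clientSecret: config.clientSecret ? MASK : "" };
}
```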
- deploy/authentik.yaml: PostgreSQL StatefulSet, Redis, Authentik
  server (NodePort 30900) and worker, all in cameleer namespace
- deploy/server.yaml: Add CAMELEER_JWT_SECRET and CAMELEER_OIDC_*
  env vars from secrets (all optional for backward compat)
- ci.yml: Create authentik-credentials and cameleer-oidc secrets,
  deploy Authentik before the server
- HOWTO.md: Authentik setup instructions, updated architecture
  diagram and Gitea secrets list
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add RBAC role table, OIDC login flow, user admin API examples, and
new configuration properties to HOWTO.md. Update CLAUDE.md with RBAC
roles, OIDC support, and user persistence. Add user repository to
ARCHITECTURE.md component table.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Implement three-phase security upgrade:
Phase 1 - RBAC: Extend JWT with roles claim, populate Spring
GrantedAuthority in filter, enforce role-based access (AGENT for
data/heartbeat/SSE, VIEWER+ for search/diagrams, OPERATOR+ for
commands, ADMIN for user management). Configurable JWT secret via
CAMELEER_JWT_SECRET env var for token persistence across restarts.
Phase 2 - User persistence: ClickHouse users table with
ReplacingMergeTree, UserRepository interface + ClickHouse impl,
UserAdminController for CRUD at /api/v1/admin/users. Local login
upserts user on each authentication.
Phase 3 - OIDC: Token exchange flow where SPA sends auth code,
server exchanges it server-side (keeping client_secret secure),
validates id_token via JWKS, resolves roles (DB override > OIDC
claim > default), issues internal JWT. Conditional on
CAMELEER_OIDC_ENABLED=true. Uses oauth2-oidc-sdk for standards
compliance.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
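The role precedence (DB override > OIDC claim > default) and the "VIEWER+" / "OPERATOR+" checks above can be sketched as follows. The function names, the treatment of an empty list as "no value", and the exact ranking are assumptions for illustration; the server implements this in Java.

```typescript
type Role = "AGENT" | "VIEWER" | "OPERATOR" | "ADMIN";

// Precedence from the commit message: DB override > OIDC claim > default.
// Assumption: an empty or missing list means "no value at this level".
function resolveRoles(
  dbOverride: Role[] | undefined,
  oidcClaim: Role[] | undefined,
  defaults: Role[],
): Role[] {
  if (dbOverride && dbOverride.length > 0) return dbOverride;
  if (oidcClaim && oidcClaim.length > 0) return oidcClaim;
  return defaults;
}

// Assumed ordering behind "VIEWER+" and "OPERATOR+".
// AGENT is treated as a separate role outside this ranking.
const USER_RANK: Record<string, number> = { VIEWER: 1, OPERATOR: 2, ADMIN: 3 };

// True when any held role ranks at or above the required one.
function hasAtLeast(roles: Role[], required: "VIEWER" | "OPERATOR" | "ADMIN"): boolean {
  return roles.some((r) => (USER_RANK[r] || 0) >= (USER_RANK[required] || 0));
}
```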
Replace hardcoded purple badge and plain text with AppBadge component
so agent names show the same deterministic color across the UI.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
ClickHouse avg() and quantile() return nan/inf on zero rows, which
toInt64() cannot convert. Wrap with ifNotFinite(..., 0) to default to
zero. Applied to both stats and timeseries queries.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
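The wrapping described above can be illustrated with a small query-fragment builder. This is a sketch only: the helper name and the duration_ms column are assumptions, while ifNotFinite() and toInt64() are the ClickHouse functions named in the commits.

```typescript
// Wraps a ClickHouse aggregate so that nan/inf (e.g. avg over zero rows)
// becomes 0 before the Int64 conversion.
function finiteInt64(expr: string): string {
  return `toInt64(ifNotFinite(${expr}, 0))`;
}

// Example fragment; duration_ms is a hypothetical column name.
const statsSql =
  `SELECT ${finiteInt64("avg(duration_ms)")} AS avg_duration, ` +
  `${finiteInt64("quantile(0.99)(duration_ms)")} AS p99 ` +
  `FROM route_executions`;
```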
Stats endpoint now returns current + previous period (24h shift) values
plus today's total count. UI shows:
- Total Matches: "of 12.3K today"
- Avg Duration: arrow + % vs yesterday
- Failure Rate: percentage of errors vs total, arrow + % vs yesterday
- P99 Latency: arrow + % vs yesterday
- In-Flight: unchanged (running executions)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
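The "arrow + % vs yesterday" values above come down to a relative change between the current and previous period. A minimal sketch, assuming hypothetical names and that a zero previous value is shown without a percentage:

```typescript
interface Trend {
  direction: "up" | "down" | "flat";
  // Absolute percentage change, or null when there is no baseline.
  percent: number | null;
}

// current: value for the selected window; previous: same window shifted by 24h.
function trend(current: number, previous: number): Trend {
  // Assumption: with no baseline, a percentage would be meaningless.
  if (previous === 0) {
    return { direction: current > 0 ? "up" : "flat", percent: null };
  }
  const change = ((current - previous) / previous) * 100;
  const direction = change > 0 ? "up" : change < 0 ? "down" : "flat";
  return { direction, percent: Math.abs(change) };
}
```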
All stat card values now come from the /search/stats endpoint which
queries the full time window, not just the current page of results.
Consolidated into a single ClickHouse query for efficiency.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Execution rows are wide (29 cols with serialized arrays/JSON), so 500
rows can exceed ClickHouse's memory limit. Reduce default batch size
from 500 to 100 and bump ClickHouse memory limit from 2Gi to 4Gi.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
quantile(0.99) returns Float64, which ClickHouse JDBC cannot cast to
Long directly. Wrap it in toInt64(), the same pattern already used in the
timeseries query.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
P99 latency and active count now use the same from/to parameters as the
timeseries sparklines, so all stat cards are consistent with the user's
selected time range.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The auth store loads tokens from localStorage synchronously at import
time, but configureAuth() was deferred to a useEffect — so the first
API requests fired before the token getter was wired, causing 401s on
hard refresh. Now getAccessToken reads from the store by default.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replace useId() with colon-free ref-based ID generator to avoid SVG
url() gradient resolution failures, and add placeholderData to
timeseries query to prevent flash during refetch.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
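A colon-free ID generator along the lines described above might look like this. It is a framework-free sketch with hypothetical names; in the component the generated value would be held in a ref so it stays stable across renders.

```typescript
// Returns a function that yields IDs such as "sparkline-gradient-0".
// The IDs contain only the prefix, a dash, and digits, so they are safe
// to use inside SVG url(#...) references.
function createIdGenerator(prefix: string): () => string {
  let counter = 0;
  return () => `${prefix}-${counter++}`;
}

// Hypothetical shared generator for gradient definitions.
const nextGradientId = createIdGenerator("sparkline-gradient");
```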
- Add LIVE/PAUSED toggle button that auto-refreshes search results every 5s
- Source environment badge from VITE_ENV_NAME env var (defaults to DEV locally, PRODUCTION in Docker)
- Remove search trigger button from topnav (command palette still available via keyboard)
- Rename "Transaction Explorer" to "Route Explorer" and "Active Now" to "In-Flight"
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace toStartOfInterval with intDiv on epoch seconds, and cast
avg/quantile results to Int64 to avoid Float64 JDBC mapping issues.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
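Bucketing with integer division on epoch seconds works as sketched below. The function name is hypothetical; in ClickHouse the equivalent expression would be along the lines of intDiv(toUnixTimestamp(ts), interval) * interval, with the column name being an assumption.

```typescript
// Maps a timestamp to the start of its bucket, both in epoch seconds.
// Mirrors integer division followed by multiplication by the interval.
function bucketStart(epochSeconds: number, intervalSeconds: number): number {
  return Math.floor(epochSeconds / intervalSeconds) * intervalSeconds;
}
```

For example, with a 300-second interval, timestamps 900 through 1199 all fall into the bucket starting at 900.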
New /search/stats/timeseries endpoint returns bucketed counts/metrics
over a time window using ClickHouse toStartOfInterval(). Frontend
Sparkline component renders SVG polyline + gradient fill on each
stat card, driven by a useStatsTimeseries query hook.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
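The polyline rendering above depends on mapping bucket values to SVG coordinates. The following is a sketch of that mapping only, under the assumption that values are scaled between their minimum and maximum; the function name is hypothetical and the actual Sparkline component may scale differently.

```typescript
// Converts a series into the "x,y x,y ..." string expected by the
// points attribute of an SVG <polyline>. SVG y grows downward, so the
// largest value maps to y = 0 and the smallest to y = height.
function sparklinePoints(values: number[], width: number, height: number): string {
  if (values.length === 0) return "";
  const min = Math.min(...values);
  const max = Math.max(...values);
  // Avoid division by zero for a flat series.
  const range = max - min || 1;
  const step = values.length > 1 ? width / (values.length - 1) : 0;
  return values
    .map((v, i) => {
      const x = i * step;
      const y = height - ((v - min) / range) * height;
      return `${x},${y}`;
    })
    .join(" ");
}
```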
- Rename "Agents" scope/labels to "Applications" throughout command palette
- Remove "Exchanges" scope (was disabled placeholder)
- Implement "Routes" scope: derives routes from agents' routeIds, filterable
  by route ID or owning application name
- Selecting a route filters executions by routeId
- Route results show purple icon, route ID, and owning application(s)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
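Deriving routes from the agents' routeIds, as described above, can be sketched like this. The AgentInfo shape (in particular the application field) and the function names are assumptions for illustration.

```typescript
// Hypothetical subset of the agent data used by the palette.
interface AgentInfo {
  id: string;
  application: string;
  routeIds: string[];
}

interface RouteEntry {
  routeId: string;
  applications: string[];
}

// Groups route IDs across agents and collects the owning applications.
function deriveRoutes(agents: AgentInfo[]): RouteEntry[] {
  const byRoute = new Map<string, Set<string>>();
  for (const agent of agents) {
    for (const routeId of agent.routeIds) {
      if (!byRoute.has(routeId)) byRoute.set(routeId, new Set<string>());
      byRoute.get(routeId)!.add(agent.application);
    }
  }
  return Array.from(byRoute.entries()).map(([routeId, apps]) => ({
    routeId,
    applications: Array.from(apps).sort(),
  }));
}

// Case-insensitive match on route ID or owning application name.
function filterRoutes(routes: RouteEntry[], query: string): RouteEntry[] {
  const q = query.toLowerCase();
  return routes.filter(
    (r) =>
      r.routeId.toLowerCase().indexOf(q) >= 0 ||
      r.applications.some((a) => a.toLowerCase().indexOf(q) >= 0),
  );
}
```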
- Table headers are now clickable to sort by column (client-side)
- From date picker defaults to today 00:00 instead of empty
- Command palette expands inline from search bar instead of modal dialog
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Backend AgentInfo record uses 'id' but UI schema had 'agentId',
causing undefined property access crash in command palette.
Regenerated openapi.json and aligned all UI types with live spec.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Backend now returns 401 instead of 403 for unauthenticated requests
via HttpStatusEntryPoint. UI middleware handles both 401 and 403,
triggering token refresh and redirecting to /login on failure.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Status filter now parses comma-separated values into SQL IN clause
instead of exact match, so filtering by multiple statuses works.
Added GET /api/v1/search/stats returning P99 latency (last hour) and
active execution count, wired into the UI stat cards with 10s polling.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
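Turning a comma-separated status filter into an IN clause can be sketched as below. The helper name and the use of positional placeholders are assumptions; the server-side implementation lives in the Java search engine.

```typescript
// Parses "COMPLETED, FAILED" into a parameterized IN clause.
// Values are passed as bind parameters rather than concatenated into SQL.
function statusInClause(raw: string): { clause: string; params: string[] } {
  const params = raw
    .split(",")
    .map((s) => s.trim())
    .filter((s) => s.length > 0);
  // No usable values: emit no clause so the query is not restricted.
  if (params.length === 0) return { clause: "", params };
  const placeholders = params.map(() => "?").join(", ");
  return { clause: `status IN (${placeholders})`, params };
}
```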
The ProtocolVersionInterceptor requires X-Cameleer-Protocol-Version: 1
on /api/v1/agents/** but the UI client middleware wasn't sending it,
causing the agents GET to fail silently in the command palette.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
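The missing header could be attached in the client middleware roughly as follows. This is a sketch with a hypothetical helper; it assumes the header is only added for agent paths, whereas the real middleware may send it on every request.

```typescript
const PROTOCOL_HEADER = "X-Cameleer-Protocol-Version";

// Returns the headers to send for a request path, adding the protocol
// version for /api/v1/agents/** as required by ProtocolVersionInterceptor.
function withProtocolVersion(
  path: string,
  headers: Record<string, string>,
): Record<string, string> {
  if (path.indexOf("/api/v1/agents") !== 0) return headers;
  return { ...headers, [PROTOCOL_HEADER]: "1" };
}
```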
- Propagate authenticated agent identity through write buffers via
  TaggedExecution/TaggedDiagram wrappers so ClickHouse rows get real
  agent IDs instead of empty strings
- Add execution_id to text search LIKE clause so selecting an execution
  by ID in the palette actually finds it
- Clear status filter to all three statuses on palette selection so the
  chosen execution/agent isn't filtered out
- Add disabled Routes and Exchanges scope tabs with "coming soon" state
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Backend: add routeId, agentId, processorType filter fields to SearchRequest
and ClickHouseSearchEngine. Expand global text search to match route_id and
agent_id columns.
Frontend: new command palette component (portal overlay, Zustand store,
TanStack Query search hook with 300ms debounce, filter chip parsing,
keyboard navigation, scope tabs). Search bar in SearchFilters and TopNav
now open the palette. Selecting a result writes filters to the execution
search store to drive the results table.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Validated against live OpenAPI spec at /api/v1/api-docs. Fixes:
- duration → durationMs (all models)
- Remove processorCount (not in ExecutionSummary)
- Remove ProcessorNode.index and .uri (not in backend)
- ProcessorSnapshot is Record<string,string>, not structured object
- Add missing fields: endTime, diagramContentHash, exchangeId, etc.
- Save openapi.json from live server
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Port 30080 is already allocated. Updated deploy manifests,
CORS origin, and HOWTO.md references.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The CI runner is ARM64 and buildx was running npm ci under QEMU
amd64 emulation, causing a V8 crash. Use --platform=$BUILDPLATFORM
on the build stage so Node runs natively, matching the server Dockerfile.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Vite 8 requires Node.js 20.19+ or 22.12+. The previous apt install
gave Node.js 18. Switch to NodeSource repo for Node.js 22.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Scaffold Vite + React + TypeScript frontend in ui/ with full design
  system (dark/light themes) matching the HTML mockups
- Implement Execution Explorer page: search filters, results table with
  expandable processor tree and exchange detail sidebar, pagination
- Add UI authentication: UiAuthController (login/refresh endpoints),
  JWT filter handles ui: subject prefix, CORS configuration
- Shared components: StatusPill, DurationBar, StatCard, AppBadge,
  FilterChip, Pagination, all using CSS Modules with design tokens
- API client layer: openapi-fetch with auth middleware, TanStack Query
  hooks for search/detail/snapshot queries, Zustand for state
- Standalone deployment: Nginx Dockerfile, K8s Deployment + ConfigMap +
  NodePort (30080), runtime config.js for API base URL
- Embedded mode: maven-resources-plugin copies ui/dist into JAR static
  resources, SPA forward controller for client-side routing
- CI/CD: UI build step, Docker build/push for server-ui image, K8s
  deploy step for UI, UI credential secrets
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- ClickHouse user/password now injected via `clickhouse-credentials` Secret
  instead of hardcoded plaintext in deploy manifests (#33)
- CI deploy step creates the secret idempotently from Gitea CI secrets
- Added liveness/readiness probes: server uses /api/v1/health, ClickHouse
  uses /ping (#35)
- Updated HOWTO.md and CLAUDE.md with new secrets and probe details
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Increase ClickHouse memory limit from 1Gi to 2Gi and reduce default
batch size from 5000 to 500. During VM backup snapshots, I/O contention
prevents ClickHouse from flushing writes fast enough, causing buffer
accumulation that exceeds the 1Gi container limit.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The DriverManager-based approach likely failed because the ClickHouse
JDBC driver wasn't registered with DriverManager. The original
JdbcTemplate approach worked for route_diagrams and agent_metrics —
only route_executions was skipped due to the comment-parsing bug.
Reverts to simple JdbcTemplate-based init with unqualified table names
(DataSource targets cameleer3 database). The CLICKHOUSE_DB env var on
the ClickHouse container handles database creation.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
split(';') produced chunks starting with '--' comment lines, causing
the startsWith('--') check to skip the entire CREATE TABLE statement
for route_executions. Now strips comment lines before splitting.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
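The fix above (strip comment lines first, then split on ';') can be sketched as follows. The function name is hypothetical, and the sketch deliberately ignores semicolons inside string literals and trailing inline comments.

```typescript
// Splits a schema script into executable statements.
// Whole-line "--" comments are removed BEFORE splitting, so a comment
// directly above a CREATE TABLE can no longer cause the statement to be skipped.
function splitStatements(sql: string): string[] {
  const withoutComments = sql
    .split("\n")
    .filter((line) => line.trim().indexOf("--") !== 0)
    .join("\n");
  return withoutComments
    .split(";")
    .map((statement) => statement.trim())
    .filter((statement) => statement.length > 0);
}
```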
The auto-configured DataSource targets jdbc:ch://.../cameleer3 which fails
if the database doesn't exist yet. Schema init now uses a direct JDBC
connection to the root URL, creates the database first, then applies all
schema SQL with fully qualified cameleer3.* table names.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
ClickHouseConfig.ensureDatabaseExists() connects without the database path
to run CREATE DATABASE IF NOT EXISTS before the main DataSource is used.
Removes the ConfigMap-based init scripts from the K8s manifest — the server
is now the single owner of all ClickHouse schema management.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>