Commit Graph

33 Commits

Author SHA1 Message Date
hsiegeln
cc433b4215 feat: add GET /routes/metrics/processors endpoint
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-23 18:10:54 +01:00
hsiegeln
2b111c603c feat: migrate UI to @cameleer/design-system, add backend endpoints
Some checks failed
CI / build (push) Failing after 47s
CI / cleanup-branch (push) Has been skipped
CI / docker (push) Has been skipped
CI / deploy (push) Has been skipped
CI / deploy-feature (push) Has been skipped
Backend:
- Add agent_events table (V5) and lifecycle event recording
- Add route catalog endpoint (GET /routes/catalog)
- Add route metrics endpoint (GET /routes/metrics)
- Add agent events endpoint (GET /agents/events-log)
- Enrich AgentInstanceResponse with tps, errorRate, activeRoutes, uptimeSeconds
- Add TimescaleDB retention/compression policies (V6)

Frontend:
- Replace custom Mission Control UI with @cameleer/design-system components
- Rebuild all pages: Dashboard, ExchangeDetail, RoutesMetrics, AgentHealth,
  AgentInstance, RBAC, AuditLog, OIDC, DatabaseAdmin, OpenSearchAdmin, Swagger
- New LayoutShell with design system AppShell, Sidebar, TopBar, CommandPalette
- Consume design system from Gitea npm registry (@cameleer/design-system@0.0.1)
- Add .npmrc for scoped registry, update Dockerfile with REGISTRY_TOKEN arg

CI:
- Pass REGISTRY_TOKEN build-arg to UI Docker build step

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-19 17:38:39 +01:00
hsiegeln
82124c3145 fix: remove RBAC user_roles insert from agent registration
All checks were successful
CI / build (push) Successful in 1m22s
CI / cleanup-branch (push) Has been skipped
CI / docker (push) Successful in 42s
CI / deploy (push) Successful in 44s
CI / deploy-feature (push) Has been skipped
Agents are transient and should not be persisted in the users table.
The assignRoleToUser call caused a FK violation (user_roles → users),
resulting in HTTP 500 on registration. The AGENT role is already
embedded directly in the JWT claims.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-18 22:10:48 +01:00
hsiegeln
17ef48e392 fix: return rotated refresh token from agent token refresh endpoint
All checks were successful
CI / build (push) Successful in 1m22s
CI / cleanup-branch (push) Has been skipped
CI / docker (push) Successful in 56s
CI / deploy (push) Successful in 47s
CI / deploy-feature (push) Has been skipped
Previously the refresh endpoint only returned a new accessToken, causing
agents to lose their refreshToken after the first refresh cycle and
forcing a full re-registration every ~2 hours.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-18 16:44:16 +01:00
hsiegeln
5a0a915cc6 fix: scope admin infra pages to current tenant's tables and indices
All checks were successful
CI / build (push) Successful in 1m14s
CI / cleanup-branch (push) Has been skipped
CI / docker (push) Successful in 44s
CI / deploy (push) Has been skipped
CI / deploy-feature (push) Successful in 35s
Database tables filtered to current_schema(), active queries to
current_database(), OpenSearch indices to configured index-prefix.
Delete endpoint rejects indices outside application scope.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-18 09:29:06 +01:00
hsiegeln
6f5b5b8655 feat: add password support for local user creation and per-user login
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-17 19:08:19 +01:00
hsiegeln
f42e6279e6 fix: null safety in role/group creation, add user create/update endpoints
- RoleAdminController.createRole: default null description to "" and null scope to "custom"
- RoleAdminController.updateRole: pass null audit details to avoid NPE when name is null
- GroupAdminController.updateGroup: pass null audit details to avoid NPE when name is null
- UserAdminController: add POST / createUser endpoint with default VIEWER role assignment
- UserAdminController: add PUT /{userId} updateUser endpoint for displayName/email updates
2026-03-17 18:49:34 +01:00
hsiegeln
01295c84d8 feat: add Group, Role, and RBAC stats admin controllers
GroupAdminController with cycle detection, RoleAdminController
with system role protection, RbacStatsController for dashboard.
Rewrite UserAdminController to use RbacService.
2026-03-17 17:47:26 +01:00
hsiegeln
eb0cc8c141 feat: replace flat users.roles with relational RBAC model
New package com.cameleer3.server.core.rbac with SystemRole constants,
detail/summary records, GroupRepository, RoleRepository, RbacService.
Remove roles field from UserInfo. Implement PostgresGroupRepository,
PostgresRoleRepository, RbacServiceImpl with inheritance computation.
Update UiAuthController, OidcAuthController, AgentRegistrationController
to assign roles via user_roles table. JWT populated from effective system roles.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-17 17:44:32 +01:00
hsiegeln
321b8808cc feat: add ThresholdAdminController and AuditLogController with integration tests
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-17 15:57:23 +01:00
hsiegeln
c6da858c2f feat: add OpenSearchAdminController with status, pipeline, indices, performance, and delete endpoints
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-17 15:57:18 +01:00
hsiegeln
c6b2f7c331 feat: add DatabaseAdminController with status, pool, tables, queries, and kill endpoints
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-17 15:57:14 +01:00
hsiegeln
1d6ae00b1c feat: wire AuditService, enable method security, retrofit audit logging into existing controllers
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-17 15:51:22 +01:00
hsiegeln
565b548ac1 refactor: remove all ClickHouse code, old interfaces, and SQL migrations
- Delete all ClickHouse storage implementations and config
- Delete old core interfaces (ExecutionRepository, DiagramRepository, MetricsRepository, SearchEngine, RawExecutionRow)
- Delete ClickHouse SQL migration files
- Delete AbstractClickHouseIT
- Update controllers to use new store interfaces (DiagramStore, ExecutionStore)
- Fix IngestionService calls in controllers for new synchronous API
- Migrate all ITs from AbstractClickHouseIT to AbstractPostgresIT
- Fix count() syntax and remove ClickHouse-specific test assertions
- Update TreeReconstructionTest for new buildTree() method

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-16 18:56:13 +01:00
hsiegeln
7778793e7b Add route diagram page with execution overlay and group-aware APIs
All checks were successful
CI / build (push) Successful in 1m10s
CI / docker (push) Successful in 1m3s
CI / deploy (push) Successful in 31s
Backend: Add group filtering to agent list, search, stats, and timeseries
endpoints. Add diagram lookup by group+routeId. Resolve application group
to agent IDs server-side for ClickHouse IN-clause queries.

Frontend: New route detail page at /apps/{group}/routes/{routeId} with
three tabs (Diagram, Performance, Processor Tree). SVG diagram rendering
with panzoom, execution overlay (glow effects, duration/sequence badges,
flow particles, minimap), and processor detail panel. uPlot charts for
performance tab replacing old SVG sparklines. Ctrl+Click from
ExecutionExplorer navigates to route diagram with overlay.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-14 21:35:42 +01:00
hsiegeln
b64edaa16f Server-side sorting for execution search results
All checks were successful
CI / build (push) Successful in 1m12s
CI / docker (push) Successful in 50s
CI / deploy (push) Successful in 33s
Sorting now applies to the entire result set via ClickHouse ORDER BY
instead of only sorting the current page client-side. Default sort
order is timestamp descending. Supported sort columns: startTime,
status, agentId, routeId, correlationId, durationMs.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-14 19:34:22 +01:00
hsiegeln
463cab1196 Add displayName to auth response and configurable display name claim for OIDC
Some checks failed
CI / build (push) Successful in 1m11s
CI / docker (push) Successful in 49s
CI / deploy (push) Failing after 2m9s
- Add displayName field to AuthTokenResponse so the UI shows human-readable
  names instead of internal JWT subjects (e.g. user:oidc:<hash>)
- Add displayNameClaim to OIDC config (default: "name") allowing admins to
  configure which ID token claim contains the user's display name
- Support dot-separated claim paths (e.g. profile.display_name) like rolesClaim
- Add admin UI field for Display Name Claim on the OIDC config page
- ClickHouse migration: ALTER TABLE adds display_name_claim column

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-14 16:09:24 +01:00
hsiegeln
465f210aee Contract-first API with DTOs, validation, and server-side OpenAPI post-processing
All checks were successful
CI / build (push) Successful in 1m27s
CI / docker (push) Successful in 2m6s
CI / deploy (push) Successful in 30s
Add dedicated request/response DTOs for all controllers, replacing raw
JsonNode parameters with validated types. Move OpenAPI path-prefix stripping
and ProcessorNode children injection into OpenApiCustomizer beans so the
spec served at /api/v1/api-docs is already clean — eliminating the need for
the ui/scripts/process-openapi.mjs post-processing script.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-14 15:33:37 +01:00
hsiegeln
0c47ac9b1a Add OIDC admin config page with auto-signup toggle
Some checks failed
CI / build (push) Successful in 1m12s
CI / docker (push) Successful in 50s
CI / deploy (push) Failing after 2m10s
Backend: add autoSignup field to OidcConfig, ClickHouse schema, repository,
and admin controller. Gate OIDC login when auto-signup is disabled and user
is not pre-created (returns 403).

Frontend: add OIDC admin page with full CRUD (save/test/delete), role-gated
Admin nav link parsed from JWT, and matching design system styles.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-14 13:56:02 +01:00
hsiegeln
9d2e6f30a7 Move OIDC config from env vars to database with admin API
All checks were successful
CI / build (push) Successful in 1m9s
CI / docker (push) Successful in 41s
CI / deploy (push) Successful in 2m11s
OIDC provider settings (issuer, client ID/secret, roles claim) are
now stored in ClickHouse and managed via admin REST API at
/api/v1/admin/oidc. This allows runtime configuration from the UI
without server restarts.

- New oidc_config table (ReplacingMergeTree, singleton row)
- OidcConfig record + OidcConfigRepository interface in core
- ClickHouseOidcConfigRepository implementation
- OidcConfigAdminController: GET/PUT/DELETE config, POST test
  connectivity, client_secret masked in responses
- OidcTokenExchanger: reads config from DB, invalidateCache()
  on config change
- OidcAuthController: always registered (no @ConditionalOnProperty),
  returns 404 when OIDC not configured
- Startup seeder: env vars seed DB on first boot only, then admin
  API takes over
- HOWTO.md updated with admin OIDC config API examples

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-14 13:01:05 +01:00
hsiegeln
a4de2a7b79 Add RBAC with role-based endpoint authorization and OIDC support
Some checks failed
CI / build (push) Successful in 1m19s
CI / docker (push) Successful in 1m38s
CI / deploy (push) Has been cancelled
Implement three-phase security upgrade:

Phase 1 - RBAC: Extend JWT with roles claim, populate Spring
GrantedAuthority in filter, enforce role-based access (AGENT for
data/heartbeat/SSE, VIEWER+ for search/diagrams, OPERATOR+ for
commands, ADMIN for user management). Configurable JWT secret via
CAMELEER_JWT_SECRET env var for token persistence across restarts.

Phase 2 - User persistence: ClickHouse users table with
ReplacingMergeTree, UserRepository interface + ClickHouse impl,
UserAdminController for CRUD at /api/v1/admin/users. Local login
upserts user on each authentication.

Phase 3 - OIDC: Token exchange flow where SPA sends auth code,
server exchanges it server-side (keeping client_secret secure),
validates id_token via JWKS, resolves roles (DB override > OIDC
claim > default), issues internal JWT. Conditional on
CAMELEER_OIDC_ENABLED=true. Uses oauth2-oidc-sdk for standards
compliance.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-14 12:35:45 +01:00
hsiegeln
cdf4c93630 Make stats endpoint respect selected time window instead of hardcoded last hour
All checks were successful
CI / build (push) Successful in 1m10s
CI / docker (push) Successful in 48s
CI / deploy (push) Successful in 28s
P99 latency and active count now use the same from/to parameters as the
timeseries sparklines, so all stat cards are consistent with the user's
selected time range.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-13 22:19:59 +01:00
hsiegeln
9e6e1b350a Add stat card sparkline graphs with timeseries backend endpoint
All checks were successful
CI / build (push) Successful in 1m0s
CI / docker (push) Successful in 45s
CI / deploy (push) Successful in 23s
New /search/stats/timeseries endpoint returns bucketed counts/metrics
over a time window using ClickHouse toStartOfInterval(). Frontend
Sparkline component renders SVG polyline + gradient fill on each
stat card, driven by a useStatsTimeseries query hook.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-13 18:20:08 +01:00
hsiegeln
3f98467ba5 Fix status filter OR logic and add P99/active stats endpoint
All checks were successful
CI / build (push) Successful in 1m1s
CI / docker (push) Successful in 47s
CI / deploy (push) Successful in 29s
Status filter now parses comma-separated values into SQL IN clause
instead of exact match, so filtering by multiple statuses works.

Added GET /api/v1/search/stats returning P99 latency (last hour) and
active execution count, wired into the UI stat cards with 10s polling.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-13 17:34:11 +01:00
hsiegeln
86e016874a Fix command palette: agent ID propagation, result selection, and scope tabs
All checks were successful
CI / build (push) Successful in 59s
CI / docker (push) Successful in 46s
CI / deploy (push) Successful in 25s
- Propagate authenticated agent identity through write buffers via
  TaggedExecution/TaggedDiagram wrappers so ClickHouse rows get real
  agent IDs instead of empty strings
- Add execution_id to text search LIKE clause so selecting an execution
  by ID in the palette actually finds it
- Clear status filter to all three statuses on palette selection so the
  chosen execution/agent isn't filtered out
- Add disabled Routes and Exchanges scope tabs with "coming soon" state

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-13 17:13:14 +01:00
hsiegeln
64b03a4e2f Add Cmd+K command palette for searching executions and agents
All checks were successful
CI / build (push) Successful in 59s
CI / docker (push) Successful in 56s
CI / deploy (push) Successful in 26s
Backend: add routeId, agentId, processorType filter fields to SearchRequest
and ClickHouseSearchEngine. Expand global text search to match route_id and
agent_id columns.

Frontend: new command palette component (portal overlay, Zustand store,
TanStack Query search hook with 300ms debounce, filter chip parsing,
keyboard navigation, scope tabs). Search bar in SearchFilters and TopNav
now open the palette. Selecting a result writes filters to the execution
search store to drive the results table.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-13 16:28:16 +01:00
hsiegeln
387e2e66b2 feat(04-02): wire Spring Security filter chain with JWT auth, bootstrap registration, and refresh endpoint
- JwtAuthenticationFilter extracts JWT from Authorization header or query param, validates via JwtService
- SecurityConfig creates stateless SecurityFilterChain with public/protected endpoint split
- AgentRegistrationController requires bootstrap token, returns accessToken + refreshToken + serverPublicKey
- New POST /agents/{id}/refresh endpoint issues new access JWT from valid refresh token

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-11 20:13:53 +01:00
hsiegeln
5746886a0b feat(03-02): SSE connection manager, SSE endpoint, and command controller
- SseConnectionManager with per-agent SseEmitter, ping keepalive, event delivery
- AgentSseController GET /{id}/events SSE endpoint with Last-Event-ID support
- AgentCommandController with single/group/broadcast command targeting + ack
- WebConfig excludes SSE events path from protocol version interceptor

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-11 18:45:47 +01:00
hsiegeln
0372be2334 feat(03-01): add agent registration controller, config, lifecycle monitor
- AgentRegistryConfig: heartbeat, stale, dead, ping, command expiry settings
- AgentRegistryBeanConfig: wires AgentRegistryService as Spring bean
- AgentLifecycleMonitor: @Scheduled lifecycle check + command expiry sweep
- AgentRegistrationController: POST /register, POST /{id}/heartbeat, GET /agents
- Updated Cameleer3ServerApplication with AgentRegistryConfig
- Updated application.yml with agent-registry section and async timeout
- 7 integration tests: register, re-register, heartbeat, list, filter, invalid status

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-11 18:40:57 +01:00
hsiegeln
0615a9851d feat(02-03): detail controller, tree reconstruction, processor snapshot endpoint
- Implement findRawById and findProcessorSnapshot in ClickHouseExecutionRepository
- DetailController with GET /executions/{id} returning nested processor tree
- GET /executions/{id}/processors/{index}/snapshot for per-processor exchange data
- 5 unit tests for tree reconstruction (linear, branching, multiple roots, empty)
- 6 integration tests for detail endpoint, snapshot, and 404 handling
- Added assertj and mockito test dependencies to core module

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-11 16:29:53 +01:00
hsiegeln
82a190c8e2 feat(02-03): ClickHouse search engine, search controller, and 13 integration tests
- ClickHouseSearchEngine with dynamic WHERE clause building and LIKE escape
- SearchController with GET (basic filters) and POST (advanced JSON body)
- SearchBeanConfig wiring SearchEngine, SearchService, DetailService beans
- 13 integration tests covering all filter types, combinations, pagination, empty results

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-11 16:23:20 +01:00
hsiegeln
c1bc32d50a feat(02-02): implement ELK diagram renderer with SVG/JSON content negotiation
- ElkDiagramRenderer: ELK layered layout (top-to-bottom) with JFreeSVG rendering
- Color-coded nodes: blue endpoints, green processors, red errors, purple EIPs, cyan cross-route
- Compound node support for CHOICE/SPLIT/TRY_CATCH swimlane groups
- DiagramRenderController: GET /api/v1/diagrams/{hash}/render with Accept header negotiation
- DiagramBeanConfig for Spring wiring
- 11 unit tests (layout, SVG structure, colors, compound nodes)
- 4 integration tests (SVG, JSON, 404, default format)
- Added xtext xbase lib dependency for ELK compatibility

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-11 16:17:13 +01:00
hsiegeln
8fe65f083c feat(01-02): implement ingestion REST controllers with backpressure
- ExecutionController: POST /api/v1/data/executions (single or array)
- DiagramController: POST /api/v1/data/diagrams (single or array)
- MetricsController: POST /api/v1/data/metrics (array)
- All return 202 Accepted or 503 with Retry-After when buffer full
- Fix duplicate IngestionConfig bean (remove @Configuration, use @EnableConfigurationProperties)
- Fix BackpressureIT timing by using batch POST and 60s flush interval
- All 11 integration tests green

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-11 12:13:27 +01:00