Commit Graph

56 Commits

Author SHA1 Message Date
hsiegeln
b3c5e87230 fix: expose exchange body in API, fix RouteFlow index mapping
Some checks failed
CI / build (push) Failing after 25s
CI / cleanup-branch (push) Has been skipped
CI / docker (push) Has been skipped
CI / deploy (push) Has been skipped
CI / deploy-feature (push) Has been skipped
Add inputBody/outputBody/inputHeaders/outputHeaders to ExecutionDetail
DTO so exchange-level bodies are returned by the detail endpoint. Show
"Exchange Input" and "Exchange Output" panels on the detail page when
the data is available.

Fix RouteFlow node click selecting the wrong processor snapshot by
building a flowToTreeIndex mapping that correctly translates flow
display index → diagram node index → processorId → processor tree
index. Previously the diagram node index was used directly as the
processor tree index, which broke when the two orderings differed.
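
The index translation described above can be sketched as follows. This is an illustration in Java, not the actual RouteFlow code; the class name, the method signature, and the three input structures (`flowToNode`, `nodeProcessorIds`, `treeProcessorIds`) are all hypothetical names chosen for the example.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: translate flow display index -> diagram node index
// -> processorId -> processor tree index.
final class FlowIndexMapping {

    /**
     * @param flowToNode       diagram node index shown at each flow display position
     * @param nodeProcessorIds processorId of each diagram node, by diagram node index
     * @param treeProcessorIds processorId of each processor tree entry, by tree index
     * @return for each flow display position, the processor tree index, or -1 if no match
     */
    static int[] flowToTreeIndex(int[] flowToNode,
                                 List<String> nodeProcessorIds,
                                 List<String> treeProcessorIds) {
        // Index the processor tree by processorId so lookups do not depend on ordering.
        Map<String, Integer> treeIndexById = new HashMap<>();
        for (int i = 0; i < treeProcessorIds.size(); i++) {
            treeIndexById.putIfAbsent(treeProcessorIds.get(i), i);
        }
        int[] result = new int[flowToNode.length];
        for (int flow = 0; flow < flowToNode.length; flow++) {
            String processorId = nodeProcessorIds.get(flowToNode[flow]);
            result[flow] = treeIndexById.getOrDefault(processorId, -1);
        }
        return result;
    }
}
```

Going through the processorId is what keeps the selection correct when the diagram and the processor tree list their entries in different orders.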

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-25 22:02:26 +01:00
hsiegeln
7fd55ea8ba fix: remove core LogIndexService to fix CI snapshot resolution
Some checks failed
CI / build (push) Failing after 1m11s
CI / cleanup-branch (push) Has been skipped
CI / docker (push) Has been skipped
CI / deploy (push) Has been skipped
CI / deploy-feature (push) Has been skipped
LogIndexService in server-core imported LogEntry from cameleer3-common,
but the SNAPSHOT on the registry may not have it yet when the server CI
runs. Moved the dependency to server-app where both the controller and
OpenSearch implementation live.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-25 13:11:11 +01:00
hsiegeln
7423e2ca14 feat: add application log ingestion with OpenSearch storage
Some checks failed
CI / cleanup-branch (push) Has been skipped
CI / build (push) Failing after 59s
CI / docker (push) Has been skipped
CI / deploy (push) Has been skipped
CI / deploy-feature (push) Has been skipped
Agents can now send application log entries in batches via POST /api/v1/data/logs.
Logs are indexed directly into OpenSearch daily indices (logs-{yyyy-MM-dd}) using
the bulk API. Index template defines explicit mappings for full-text search readiness.
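
A minimal sketch of how a daily index name following the logs-{yyyy-MM-dd} pattern might be derived from a log entry's timestamp. The class and method names are hypothetical, and the use of UTC for the day boundary is an assumption; the commit does not say which time zone is used.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

// Hypothetical helper: builds the daily index name for a log timestamp.
final class LogIndexNames {
    // Assumption: days are cut at UTC midnight.
    private static final DateTimeFormatter DAY =
            DateTimeFormatter.ofPattern("yyyy-MM-dd").withZone(ZoneOffset.UTC);

    static String forTimestamp(Instant timestamp) {
        return "logs-" + DAY.format(timestamp);
    }
}
```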

New DTOs (LogEntry, LogBatch) added to cameleer3-common in the agent repo.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-25 11:53:27 +01:00
hsiegeln
d9c8816647 feat: add OpenSearch highlight snippets to search results
All checks were successful
CI / build (push) Successful in 1m23s
CI / cleanup-branch (push) Has been skipped
CI / docker (push) Successful in 54s
CI / deploy (push) Successful in 39s
CI / deploy-feature (push) Has been skipped
- Add highlight field to ExecutionSummary record
- Request highlight fragments from OpenSearch when full-text search is active
- Pass matchContext to command palette for display

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-25 09:29:07 +01:00
hsiegeln
6fea5f2c5b fix: use .keyword suffix for text field sorting in OpenSearch
All checks were successful
CI / build (push) Successful in 1m22s
CI / cleanup-branch (push) Has been skipped
CI / docker (push) Successful in 44s
CI / deploy (push) Successful in 39s
CI / deploy-feature (push) Has been skipped
OpenSearch dynamically maps string fields as text with a .keyword
subfield. Sorting on text fields throws an error; only .keyword,
date, and numeric fields support sorting. Add .keyword suffix to
all string sort columns (status, routeId, agentId, executionId,
correlationId, applicationName) while keeping start_time and
duration_ms as-is.
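
The rule above can be sketched as a small lookup. This is only an illustration: the class name is hypothetical, the snake_case field names other than start_time, duration_ms, route_id and application_name are guesses, and the fallback to start_time for unknown columns is an assumption.

```java
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: resolve an API sort column to an OpenSearch sort field.
final class SortFields {
    // API sort name -> OpenSearch field name (several of these names are assumed).
    private static final Map<String, String> FIELD = Map.of(
            "startTime", "start_time",
            "durationMs", "duration_ms",
            "status", "status",
            "routeId", "route_id",
            "agentId", "agent_id",
            "executionId", "execution_id",
            "correlationId", "correlation_id",
            "applicationName", "application_name");

    // Date and numeric fields sort directly; they get no .keyword suffix.
    private static final Set<String> NON_TEXT = Set.of("start_time", "duration_ms");

    static String resolve(String apiName) {
        // Assumption: unknown columns fall back to start_time.
        String field = FIELD.getOrDefault(apiName, "start_time");
        return NON_TEXT.contains(field) ? field : field + ".keyword";
    }
}
```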

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-24 17:56:18 +01:00
hsiegeln
b7cac68ee1 fix: filter exchanges by application and restore snake_case sort columns
All checks were successful
CI / build (push) Successful in 1m23s
CI / cleanup-branch (push) Has been skipped
CI / docker (push) Successful in 41s
CI / deploy (push) Successful in 39s
CI / deploy-feature (push) Has been skipped
Add application_name filter to OpenSearch query builder — sidebar
app selection now correctly filters the exchange list. The
application field was being resolved to agentIds in the controller
but never applied as a query filter in OpenSearch.

Also restore snake_case sort column mapping since the OpenSearch
toMap() serializer uses snake_case field names (start_time, route_id,
etc.), not camelCase.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-24 17:41:07 +01:00
hsiegeln
cdbe330c47 fix: support all sortable columns and use camelCase for OpenSearch
All checks were successful
CI / build (push) Successful in 1m24s
CI / cleanup-branch (push) Has been skipped
CI / docker (push) Successful in 45s
CI / deploy (push) Successful in 37s
CI / deploy-feature (push) Has been skipped
Add executionId and applicationName to allowed sort fields. Fix sort
column mapping to use camelCase field names matching the OpenSearch
ExecutionDocument fields instead of snake_case DB column names. This
was causing sorts on most columns to either silently fall back to
startTime or return empty results from OpenSearch.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-24 17:37:01 +01:00
e7835e1100 feat: map engineLevel and route-level snapshots in IngestionService
Some checks failed
CI / docker (push) Has been cancelled
CI / deploy (push) Has been cancelled
CI / deploy-feature (push) Has been cancelled
CI / cleanup-branch (push) Has been cancelled
CI / build (push) Has been cancelled
Extract inputBody/outputBody/inputHeaders/outputHeaders from RouteExecution
snapshots and pass to ExecutionRecord. Maps engineLevel field. Critical for
REGULAR mode where no processor records exist but route-level payloads do.
2026-03-24 16:11:55 +01:00
ed65b87af2 feat: add engineLevel and route-level snapshot fields to ExecutionRecord
Some checks failed
CI / docker (push) Has been cancelled
CI / deploy (push) Has been cancelled
CI / deploy-feature (push) Has been cancelled
CI / cleanup-branch (push) Has been cancelled
CI / build (push) Has been cancelled
Adds engineLevel (NONE/MINIMAL/REGULAR/COMPLETE) and inputBody/outputBody/
inputHeaders/outputHeaders to ExecutionRecord so REGULAR mode route-level
payloads are persisted (previously only processor-level records had payloads).
2026-03-24 16:11:26 +01:00
292a38fe30 feat: add SET_TRACED_PROCESSORS command type for per-processor overrides
Some checks failed
CI / docker (push) Has been cancelled
CI / deploy (push) Has been cancelled
CI / deploy-feature (push) Has been cancelled
CI / cleanup-branch (push) Has been cancelled
CI / build (push) Has been cancelled
2026-03-24 16:10:21 +01:00
hsiegeln
ff76751629 refactor: rename agent group→application across entire codebase
All checks were successful
CI / build (push) Successful in 1m22s
CI / cleanup-branch (push) Has been skipped
CI / docker (push) Successful in 52s
CI / deploy (push) Successful in 39s
CI / deploy-feature (push) Has been skipped
Complete the group→application terminology rename in the agent
registry subsystem:

- AgentInfo: field group → application, all wither methods updated
- AgentRegistryService: findByGroup → findByApplication
- AgentInstanceResponse: field group → application (API response)
- AgentRegistrationRequest: field group → application (API request)
- JwtServiceImpl: parameter names group → application (JWT claim
  string "group" preserved for token backward compatibility)
- All controllers, lifecycle monitor, command controller updated
- Integration tests: JSON request bodies "group" → "application"
- Frontend: schema.d.ts, openapi.json, agent queries, AgentHealth

RBAC group references (groups table, GroupAdminController, etc.)
are NOT affected — they are a separate domain concept.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-24 08:48:12 +01:00
hsiegeln
413839452c fix: use statsForApp when application is set without routeId
All checks were successful
CI / build (push) Successful in 1m21s
CI / cleanup-branch (push) Has been skipped
CI / docker (push) Successful in 44s
CI / deploy (push) Successful in 38s
CI / deploy-feature (push) Has been skipped
The stats endpoint was calling statsForRoute(null, agentIds) when
only application was set — this filtered by route_id=null, returning
zero results. Now correctly routes to statsForApp/timeseriesForApp
which queries the stats_1m_app continuous aggregate by application_name.

Also reverts the group parameter alias workaround — the deployed
backend correctly accepts 'application'.

Three code paths now:
- No filters → stats_1m_all (global)
- application only → stats_1m_app (per-app)
- routeId (±application) → stats_1m_route (per-route)
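
The three code paths listed above amount to a simple precedence check, sketched here under assumed names (`StatsRouting`, `Aggregate`, and `select` are hypothetical; only the aggregate names come from this commit message).

```java
// Hypothetical sketch: pick the continuous aggregate from the request filters.
final class StatsRouting {
    enum Aggregate { STATS_1M_ALL, STATS_1M_APP, STATS_1M_ROUTE }

    static Aggregate select(String application, String routeId) {
        if (routeId != null) {
            return Aggregate.STATS_1M_ROUTE; // routeId wins, with or without application
        }
        if (application != null) {
            return Aggregate.STATS_1M_APP;   // application only
        }
        return Aggregate.STATS_1M_ALL;       // no filters
    }
}
```

Checking routeId first avoids the original bug, where an application-only request ended up on the per-route path with a null route_id.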

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-24 08:28:05 +01:00
hsiegeln
8ad0016a8e refactor: rename group/groupName to application/applicationName
Some checks failed
CI / build (push) Failing after 40s
CI / cleanup-branch (push) Has been skipped
CI / docker (push) Has been skipped
CI / deploy (push) Has been skipped
CI / deploy-feature (push) Has been skipped
The execution-related "group" concept actually represents the
application name. Rename all Java fields, API parameters, and frontend
types from groupName→applicationName and group→application for clarity.

- Java records: ExecutionSummary, ExecutionDetail, ExecutionDocument,
  ExecutionRecord, ProcessorRecord
- API params: SearchRequest.group→application, SearchController
  @RequestParam group→application
- Services: IngestionService, DetailService, SearchIndexer, StatsStore
- Frontend: schema.d.ts, Dashboard, ExchangeDetail, RouteDetail,
  executions query hooks

Database column names (group_name) and OpenSearch field names are
unchanged — only the API-facing Java/TS field names are renamed.

RBAC group references (groups table, GroupRepository, GroupsTab) are
a separate domain concept and are NOT affected by this change.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-23 21:21:38 +01:00
hsiegeln
2ae2871822 fix: add groupName to ExecutionDetail, rewrite ExchangeDetail to match mock
Some checks failed
CI / build (push) Failing after 40s
CI / cleanup-branch (push) Has been skipped
CI / docker (push) Has been skipped
CI / deploy (push) Has been skipped
CI / deploy-feature (push) Has been skipped
- Add groupName field to ExecutionDetail record and DetailService
- Dashboard: fix TDZ error (rows referenced before definition), add
  selectedRow fallback for diagram groupName lookup
- ExchangeDetail: rewrite to match mock layout — auto-select first
  processor, Message IN/OUT split panels with header key-value rows,
  error panel for failed processors, Timeline/Flow toggle buttons
- Track diagram-mapping utility (was untracked, caused CI build failure)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-23 21:02:14 +01:00
hsiegeln
a72b0954db fix: add groupName to ExecutionSummary, locale format stat values, inspect column, fix duplicate keys
Some checks failed
CI / build (push) Failing after 40s
CI / cleanup-branch (push) Has been skipped
CI / docker (push) Has been skipped
CI / deploy (push) Has been skipped
CI / deploy-feature (push) Has been skipped
- Added groupName field to ExecutionSummary Java record and OpenSearch mapper
- Dashboard stat cards use locale-formatted numbers (en-US)
- Added inspect column (↗) linking directly to exchange detail page
- Fixed duplicate React key warning from two columns sharing executionId key

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-23 20:41:46 +01:00
hsiegeln
2b111c603c feat: migrate UI to @cameleer/design-system, add backend endpoints
Some checks failed
CI / build (push) Failing after 47s
CI / cleanup-branch (push) Has been skipped
CI / docker (push) Has been skipped
CI / deploy (push) Has been skipped
CI / deploy-feature (push) Has been skipped
Backend:
- Add agent_events table (V5) and lifecycle event recording
- Add route catalog endpoint (GET /routes/catalog)
- Add route metrics endpoint (GET /routes/metrics)
- Add agent events endpoint (GET /agents/events-log)
- Enrich AgentInstanceResponse with tps, errorRate, activeRoutes, uptimeSeconds
- Add TimescaleDB retention/compression policies (V6)

Frontend:
- Replace custom Mission Control UI with @cameleer/design-system components
- Rebuild all pages: Dashboard, ExchangeDetail, RoutesMetrics, AgentHealth,
  AgentInstance, RBAC, AuditLog, OIDC, DatabaseAdmin, OpenSearchAdmin, Swagger
- New LayoutShell with design system AppShell, Sidebar, TopBar, CommandPalette
- Consume design system from Gitea npm registry (@cameleer/design-system@0.0.1)
- Add .npmrc for scoped registry, update Dockerfile with REGISTRY_TOKEN arg

CI:
- Pass REGISTRY_TOKEN build-arg to UI Docker build step

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-19 17:38:39 +01:00
hsiegeln
6f5b5b8655 feat: add password support for local user creation and per-user login
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-17 19:08:19 +01:00
hsiegeln
4842507ff3 feat: seed built-in Admins group and assign admin users on login
- Add V2 Flyway migration to create built-in Admins group (id: ...0010) with ADMIN role
- Add ADMINS_GROUP_ID constant to SystemRole
- Add user to Admins group on successful local login alongside role assignment
2026-03-17 18:30:16 +01:00
hsiegeln
eb0cc8c141 feat: replace flat users.roles with relational RBAC model
New package com.cameleer3.server.core.rbac with SystemRole constants,
detail/summary records, GroupRepository, RoleRepository, RbacService.
Remove roles field from UserInfo. Implement PostgresGroupRepository,
PostgresRoleRepository, RbacServiceImpl with inheritance computation.
Update UiAuthController, OidcAuthController, AgentRegistrationController
to assign roles via user_roles table. JWT populated from effective system roles.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-17 17:44:32 +01:00
hsiegeln
4d33592015 feat: add ThresholdConfig, ThresholdRepository, SearchIndexerStats, and instrument SearchIndexer
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-17 15:43:16 +01:00
hsiegeln
a0944a1c72 feat: add audit domain model, repository interface, AuditService, and unit test
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-17 15:36:21 +01:00
hsiegeln
796be06a09 fix: resolve all integration test failures after storage layer refactor
- Use singleton container pattern for PostgreSQL + OpenSearch testcontainers
  (fixes container lifecycle issues with @TestInstance(PER_CLASS))
- Fix table name route_executions → executions in DetailControllerIT and
  ExecutionControllerIT
- Serialize processor headers as JSON (ObjectMapper) instead of Map.toString()
  for JSONB column compatibility
- Add nested mapping for processors field in OpenSearch index template
- Use .keyword sub-field for term queries on dynamically mapped text fields
- Add wildcard fallback queries for all text searches (substring matching)
- Isolate stats tests with unique route names to prevent data contamination
- Wait for OpenSearch indexing in SearchControllerIT with targeted Awaitility
- Reduce OpenSearch debounce to 100ms in test profile

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-17 00:02:19 +01:00
hsiegeln
26f5a2ce3b fix: update remaining ITs for synchronous ingestion and PostgreSQL storage
- SearchControllerIT: remove @TestInstance(PER_CLASS), use @BeforeEach with
  static guard, fix table name (route_executions -> executions), remove
  Awaitility polling
- OpenSearchIndexIT: replace Thread.sleep with explicit index refresh via
  OpenSearchClient
- DiagramLinkingIT: fix table name, remove Awaitility awaits (writes are
  synchronous)
- IngestionSchemaIT: rewrite queries for PostgreSQL relational model
  (processor_executions table instead of ClickHouse array columns)
- PostgresStatsStoreIT: use explicit time bounds in
  refresh_continuous_aggregate calls
- IngestionService: populate diagramContentHash during execution ingestion
  by looking up the latest diagram for the route+agent

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-16 22:03:29 +01:00
hsiegeln
565b548ac1 refactor: remove all ClickHouse code, old interfaces, and SQL migrations
- Delete all ClickHouse storage implementations and config
- Delete old core interfaces (ExecutionRepository, DiagramRepository, MetricsRepository, SearchEngine, RawExecutionRow)
- Delete ClickHouse SQL migration files
- Delete AbstractClickHouseIT
- Update controllers to use new store interfaces (DiagramStore, ExecutionStore)
- Fix IngestionService calls in controllers for new synchronous API
- Migrate all ITs from AbstractClickHouseIT to AbstractPostgresIT
- Fix count() syntax and remove ClickHouse-specific test assertions
- Update TreeReconstructionTest for new buildTree() method

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-16 18:56:13 +01:00
hsiegeln
c48e0bdfde feat: implement debounced SearchIndexer for async OpenSearch indexing
2026-03-16 18:25:54 +01:00
hsiegeln
85ebe76111 refactor: IngestionService uses synchronous ExecutionStore writes with event publishing
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-16 18:18:54 +01:00
hsiegeln
adf4b44d78 refactor: DetailService uses ExecutionStore, tree built from parentProcessorId
2026-03-16 18:18:53 +01:00
hsiegeln
84b93d74c7 refactor: SearchService uses SearchIndex + StatsStore instead of SearchEngine
2026-03-16 18:18:52 +01:00
hsiegeln
a55fc3c10d feat: add new storage interfaces for PostgreSQL/OpenSearch backends
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-16 18:16:53 +01:00
hsiegeln
55ed3be71a feat: add ExecutionDocument model and ExecutionUpdatedEvent
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-16 18:16:42 +01:00
hsiegeln
7778793e7b Add route diagram page with execution overlay and group-aware APIs
All checks were successful
CI / build (push) Successful in 1m10s
CI / docker (push) Successful in 1m3s
CI / deploy (push) Successful in 31s
Backend: Add group filtering to agent list, search, stats, and timeseries
endpoints. Add diagram lookup by group+routeId. Resolve application group
to agent IDs server-side for ClickHouse IN-clause queries.

Frontend: New route detail page at /apps/{group}/routes/{routeId} with
three tabs (Diagram, Performance, Processor Tree). SVG diagram rendering
with panzoom, execution overlay (glow effects, duration/sequence badges,
flow particles, minimap), and processor detail panel. uPlot charts for
performance tab replacing old SVG sparklines. Ctrl+Click from
ExecutionExplorer navigates to route diagram with overlay.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-14 21:35:42 +01:00
hsiegeln
b64edaa16f Server-side sorting for execution search results
All checks were successful
CI / build (push) Successful in 1m12s
CI / docker (push) Successful in 50s
CI / deploy (push) Successful in 33s
Sorting now applies to the entire result set via ClickHouse ORDER BY
instead of only sorting the current page client-side. Default sort
order is timestamp descending. Supported sort columns: startTime,
status, agentId, routeId, correlationId, durationMs.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-14 19:34:22 +01:00
hsiegeln
463cab1196 Add displayName to auth response and configurable display name claim for OIDC
Some checks failed
CI / build (push) Successful in 1m11s
CI / docker (push) Successful in 49s
CI / deploy (push) Failing after 2m9s
- Add displayName field to AuthTokenResponse so the UI shows human-readable
  names instead of internal JWT subjects (e.g. user:oidc:<hash>)
- Add displayNameClaim to OIDC config (default: "name") allowing admins to
  configure which ID token claim contains the user's display name
- Support dot-separated claim paths (e.g. profile.display_name) like rolesClaim
- Add admin UI field for Display Name Claim on the OIDC config page
- ClickHouse migration: ALTER TABLE adds display_name_claim column
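
Resolving a dot-separated claim path such as profile.display_name could look roughly like this. It is a sketch that assumes the ID token claims are already available as nested maps; the class and method names are hypothetical.

```java
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch: walk a dot-separated path through nested claim maps.
final class ClaimPaths {
    static Optional<Object> resolve(Map<String, Object> claims, String path) {
        Object current = claims;
        for (String segment : path.split("\\.")) {
            if (!(current instanceof Map)) {
                return Optional.empty(); // path goes deeper than the claim structure
            }
            current = ((Map<?, ?>) current).get(segment);
        }
        return Optional.ofNullable(current);
    }
}
```

A plain claim name such as "name" is just a path with a single segment, so the default and the nested case share one code path.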

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-14 16:09:24 +01:00
hsiegeln
6676e209c7 Fix OIDC login immediate logout — rename JWT subject prefix ui: → user:
All checks were successful
CI / build (push) Successful in 1m12s
CI / docker (push) Successful in 41s
CI / deploy (push) Successful in 30s
OIDC tokens had subject "oidc:<sub>" which didn't match the "ui:" prefix
check in JwtAuthenticationFilter, causing every post-login API call to
return 401 and trigger automatic logout. Renamed the prefix from "ui:"
to "user:" across all auth code for clarity (it covers both browser and
API clients, not just UI).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-14 15:55:10 +01:00
hsiegeln
0c47ac9b1a Add OIDC admin config page with auto-signup toggle
Some checks failed
CI / build (push) Successful in 1m12s
CI / docker (push) Successful in 50s
CI / deploy (push) Failing after 2m10s
Backend: add autoSignup field to OidcConfig, ClickHouse schema, repository,
and admin controller. Gate OIDC login when auto-signup is disabled and user
is not pre-created (returns 403).

Frontend: add OIDC admin page with full CRUD (save/test/delete), role-gated
Admin nav link parsed from JWT, and matching design system styles.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-14 13:56:02 +01:00
hsiegeln
9d2e6f30a7 Move OIDC config from env vars to database with admin API
All checks were successful
CI / build (push) Successful in 1m9s
CI / docker (push) Successful in 41s
CI / deploy (push) Successful in 2m11s
OIDC provider settings (issuer, client ID/secret, roles claim) are
now stored in ClickHouse and managed via admin REST API at
/api/v1/admin/oidc. This allows runtime configuration from the UI
without server restarts.

- New oidc_config table (ReplacingMergeTree, singleton row)
- OidcConfig record + OidcConfigRepository interface in core
- ClickHouseOidcConfigRepository implementation
- OidcConfigAdminController: GET/PUT/DELETE config, POST test
  connectivity, client_secret masked in responses
- OidcTokenExchanger: reads config from DB, invalidateCache()
  on config change
- OidcAuthController: always registered (no @ConditionalOnProperty),
  returns 404 when OIDC not configured
- Startup seeder: env vars seed DB on first boot only, then admin
  API takes over
- HOWTO.md updated with admin OIDC config API examples

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-14 13:01:05 +01:00
hsiegeln
a4de2a7b79 Add RBAC with role-based endpoint authorization and OIDC support
Some checks failed
CI / build (push) Successful in 1m19s
CI / docker (push) Successful in 1m38s
CI / deploy (push) Has been cancelled
Implement three-phase security upgrade:

Phase 1 - RBAC: Extend JWT with roles claim, populate Spring
GrantedAuthority in filter, enforce role-based access (AGENT for
data/heartbeat/SSE, VIEWER+ for search/diagrams, OPERATOR+ for
commands, ADMIN for user management). Configurable JWT secret via
CAMELEER_JWT_SECRET env var for token persistence across restarts.

Phase 2 - User persistence: ClickHouse users table with
ReplacingMergeTree, UserRepository interface + ClickHouse impl,
UserAdminController for CRUD at /api/v1/admin/users. Local login
upserts user on each authentication.

Phase 3 - OIDC: Token exchange flow where SPA sends auth code,
server exchanges it server-side (keeping client_secret secure),
validates id_token via JWKS, resolves roles (DB override > OIDC
claim > default), issues internal JWT. Conditional on
CAMELEER_OIDC_ENABLED=true. Uses oauth2-oidc-sdk for standards
compliance.
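
The role precedence from Phase 3 (DB override > OIDC claim > default) can be sketched as below. The class and parameter names are hypothetical, and treating an empty list the same as "not set" is an assumption.

```java
import java.util.List;

// Hypothetical sketch: resolve effective roles by precedence.
final class RoleResolution {
    static List<String> resolve(List<String> dbOverride,
                                List<String> oidcClaim,
                                List<String> defaults) {
        // Assumption: null or empty means "no value at this level".
        if (dbOverride != null && !dbOverride.isEmpty()) {
            return dbOverride;
        }
        if (oidcClaim != null && !oidcClaim.isEmpty()) {
            return oidcClaim;
        }
        return defaults;
    }
}
```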

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-14 12:35:45 +01:00
hsiegeln
3641dffecc Add comparison stats: failure rate %, vs-yesterday change, today total
All checks were successful
CI / build (push) Successful in 1m11s
CI / docker (push) Successful in 48s
CI / deploy (push) Successful in 37s
Stats endpoint now returns current + previous period (24h shift) values
plus today's total count. UI shows:
- Total Matches: "of 12.3K today"
- Avg Duration: arrow + % vs yesterday
- Failure Rate: percentage of errors vs total, arrow + % vs yesterday
- P99 Latency: arrow + % vs yesterday
- In-Flight: unchanged (running executions)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-14 09:29:14 +01:00
hsiegeln
393d19e3f4 Move failed count and avg duration from page-derived to backend stats
Some checks failed
CI / build (push) Successful in 1m11s
CI / deploy (push) Has been cancelled
CI / docker (push) Has been cancelled
All stat card values now come from the /search/stats endpoint which
queries the full time window, not just the current page of results.
Consolidated into a single ClickHouse query for efficiency.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-13 22:51:43 +01:00
hsiegeln
cdf4c93630 Make stats endpoint respect selected time window instead of hardcoded last hour
All checks were successful
CI / build (push) Successful in 1m10s
CI / docker (push) Successful in 48s
CI / deploy (push) Successful in 28s
P99 latency and active count now use the same from/to parameters as the
timeseries sparklines, so all stat cards are consistent with the user's
selected time range.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-13 22:19:59 +01:00
hsiegeln
9e6e1b350a Add stat card sparkline graphs with timeseries backend endpoint
All checks were successful
CI / build (push) Successful in 1m0s
CI / docker (push) Successful in 45s
CI / deploy (push) Successful in 23s
New /search/stats/timeseries endpoint returns bucketed counts/metrics
over a time window using ClickHouse toStartOfInterval(). Frontend
Sparkline component renders SVG polyline + gradient fill on each
stat card, driven by a useStatsTimeseries query hook.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-13 18:20:08 +01:00
hsiegeln
3f98467ba5 Fix status filter OR logic and add P99/active stats endpoint
All checks were successful
CI / build (push) Successful in 1m1s
CI / docker (push) Successful in 47s
CI / deploy (push) Successful in 29s
Status filter now parses comma-separated values into SQL IN clause
instead of exact match, so filtering by multiple statuses works.
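
A minimal sketch of that parsing step, using bind placeholders rather than string concatenation. The class and method names are hypothetical, and the status values in the usage note are only examples.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch: turn "A,B,C" into values plus a parameterized IN clause.
final class StatusFilter {
    /** Splits on commas, trims whitespace, and drops empty entries. */
    static List<String> parse(String raw) {
        List<String> values = new ArrayList<>();
        if (raw == null) {
            return values;
        }
        for (String part : raw.split(",")) {
            String trimmed = part.trim();
            if (!trimmed.isEmpty()) {
                values.add(trimmed);
            }
        }
        return values;
    }

    /** Builds "column IN (?, ?, ...)" with one placeholder per value. */
    static String inClause(String column, int count) {
        return column + " IN (" + String.join(", ", Collections.nCopies(count, "?")) + ")";
    }
}
```

For an input like "COMPLETED, FAILED" this yields two values and the clause `status IN (?, ?)`. A caller would skip the filter entirely when the parsed list is empty, since an empty IN list is not valid SQL.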

Added GET /api/v1/search/stats returning P99 latency (last hour) and
active execution count, wired into the UI stat cards with 10s polling.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-13 17:34:11 +01:00
hsiegeln
86e016874a Fix command palette: agent ID propagation, result selection, and scope tabs
All checks were successful
CI / build (push) Successful in 59s
CI / docker (push) Successful in 46s
CI / deploy (push) Successful in 25s
- Propagate authenticated agent identity through write buffers via
  TaggedExecution/TaggedDiagram wrappers so ClickHouse rows get real
  agent IDs instead of empty strings
- Add execution_id to text search LIKE clause so selecting an execution
  by ID in the palette actually finds it
- Clear status filter to all three statuses on palette selection so the
  chosen execution/agent isn't filtered out
- Add disabled Routes and Exchanges scope tabs with "coming soon" state

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-13 17:13:14 +01:00
hsiegeln
64b03a4e2f Add Cmd+K command palette for searching executions and agents
All checks were successful
CI / build (push) Successful in 59s
CI / docker (push) Successful in 56s
CI / deploy (push) Successful in 26s
Backend: add routeId, agentId, processorType filter fields to SearchRequest
and ClickHouseSearchEngine. Expand global text search to match route_id and
agent_id columns.

Frontend: new command palette component (portal overlay, Zustand store,
TanStack Query search hook with 300ms debounce, filter chip parsing,
keyboard navigation, scope tabs). Search bar in SearchFilters and TopNav
now open the palette. Selecting a result writes filters to the execution
search store to drive the results table.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-13 16:28:16 +01:00
hsiegeln
51a02700dd test(04-01): add failing tests for security services
- JwtService: 7 tests for access/refresh token creation and validation
- Ed25519SigningService: 5 tests for keypair, signing, verification
- BootstrapTokenValidator: 6 tests for token matching and rotation
- Core interfaces and stub implementations (all throw UnsupportedOperationException)
- Added nimbus-jose-jwt and spring-boot-starter-security dependencies

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-11 19:58:59 +01:00
hsiegeln
61f39021b3 feat(03-01): implement agent registry service and domain types
- AgentRegistryService: register, heartbeat, lifecycle, commands
- ConcurrentHashMap with atomic record-swapping for thread safety
- LIVE->STALE->DEAD lifecycle transitions via checkLifecycle()
- Heartbeat revives STALE agents back to LIVE
- Command queue with PENDING/DELIVERED/ACKNOWLEDGED/EXPIRED states
- AgentEventListener callback for SSE bridge integration
- All 23 unit tests pass
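
The lifecycle and record-swapping approach above might look roughly like the following sketch. It is not the AgentRegistryService code: the class, the `Agent` record, the thresholds, and the choice to leave DEAD agents untouched on heartbeat are all assumptions made for illustration.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: immutable records swapped atomically in a ConcurrentHashMap.
final class AgentLifecycleSketch {
    enum State { LIVE, STALE, DEAD }

    record Agent(String id, State state, Instant lastHeartbeat) {}

    private final ConcurrentHashMap<String, Agent> agents = new ConcurrentHashMap<>();
    private final Duration staleAfter;
    private final Duration deadAfter;

    AgentLifecycleSketch(Duration staleAfter, Duration deadAfter) {
        this.staleAfter = staleAfter;
        this.deadAfter = deadAfter;
    }

    void register(String id, Instant now) {
        agents.put(id, new Agent(id, State.LIVE, now));
    }

    /** A heartbeat refreshes the timestamp and revives STALE agents back to LIVE. */
    void heartbeat(String id, Instant now) {
        // Assumption: DEAD agents are left as-is and must register again.
        agents.computeIfPresent(id, (key, old) ->
                old.state() == State.DEAD ? old : new Agent(key, State.LIVE, now));
    }

    /** Moves each agent at most one step along LIVE -> STALE -> DEAD per call. */
    void checkLifecycle(Instant now) {
        agents.replaceAll((id, agent) -> {
            Duration silent = Duration.between(agent.lastHeartbeat(), now);
            if (agent.state() == State.LIVE && silent.compareTo(staleAfter) > 0) {
                return new Agent(id, State.STALE, agent.lastHeartbeat());
            }
            if (agent.state() == State.STALE && silent.compareTo(deadAfter) > 0) {
                return new Agent(id, State.DEAD, agent.lastHeartbeat());
            }
            return agent;
        });
    }

    State stateOf(String id) {
        Agent agent = agents.get(id);
        return agent == null ? null : agent.state();
    }
}
```

Because each `Agent` is immutable and replaced as a whole, readers never observe a half-updated agent.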

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-11 18:30:02 +01:00
hsiegeln
4cd7ed9e9a test(03-01): add failing tests for agent registry service
- 23 unit tests covering registration, heartbeat, lifecycle, queries, commands
- Domain types: AgentInfo, AgentState, AgentCommand, CommandStatus, CommandType
- AgentEventListener interface for SSE bridge
- AgentRegistryService stub with UnsupportedOperationException

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-11 18:28:28 +01:00
hsiegeln
0615a9851d feat(02-03): detail controller, tree reconstruction, processor snapshot endpoint
- Implement findRawById and findProcessorSnapshot in ClickHouseExecutionRepository
- DetailController with GET /executions/{id} returning nested processor tree
- GET /executions/{id}/processors/{index}/snapshot for per-processor exchange data
- 5 unit tests for tree reconstruction (linear, branching, multiple roots, empty)
- 6 integration tests for detail endpoint, snapshot, and 404 handling
- Added assertj and mockito test dependencies to core module

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-11 16:29:53 +01:00
hsiegeln
c0922430c4 test(02-01): add failing IngestionSchemaIT for new column population
- Tests processor tree metadata (depths, parent indexes)
- Tests exchange body concatenation for search
- Tests null snapshot graceful handling
- AbstractClickHouseIT loads 02-search-columns.sql
- DiagramRenderer/DiagramLayout stubs to fix pre-existing compilation error

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-11 16:09:45 +01:00
hsiegeln
044259535a feat(02-01): add schema extension and core search/detail domain types
- ClickHouse schema migration SQL with 12 new columns and skip indexes
- SearchRequest, SearchResult, ExecutionSummary records in core search package
- SearchEngine interface for swappable search backend (ClickHouse/OpenSearch)
- SearchService orchestration layer
- ProcessorNode, ExecutionDetail, RawExecutionRow, DetailService in core detail package
- DetailService reconstructs nested processor tree from flat parallel arrays
- ExecutionRepository extended with findRawById query method
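
Rebuilding a nested tree from flat parallel arrays can be sketched as below. This is an illustration, not the DetailService implementation: the class and field names are hypothetical, and it assumes each entry carries a parent index, with a negative value marking a root.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: rebuild a processor tree from parallel arrays.
final class TreeSketch {
    static final class Node {
        final String processorId;
        final List<Node> children = new ArrayList<>();

        Node(String processorId) {
            this.processorId = processorId;
        }
    }

    /**
     * @param processorIds  one id per processor, in stored order
     * @param parentIndexes index of each processor's parent in the same arrays,
     *                      or a negative value for a root (assumed convention)
     */
    static List<Node> buildTree(String[] processorIds, int[] parentIndexes) {
        List<Node> nodes = new ArrayList<>();
        for (String id : processorIds) {
            nodes.add(new Node(id));
        }
        List<Node> roots = new ArrayList<>();
        for (int i = 0; i < nodes.size(); i++) {
            int parent = parentIndexes[i];
            if (parent < 0) {
                roots.add(nodes.get(i));
            } else {
                nodes.get(parent).children.add(nodes.get(i));
            }
        }
        return roots;
    }
}
```

Creating all nodes first and linking them in a second pass handles linear chains, branches, multiple roots, and empty input without special cases.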

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-11 16:05:14 +01:00