Commit Graph

30 Commits

Author SHA1 Message Date
hsiegeln
875062e59a feat: add container config to apps and default config to environments
- V5 migration: container_config JSONB + updated_at on apps,
  default_container_config JSONB on environments
- App/Environment records updated with new fields
- PUT /apps/{id}/container-config endpoint for per-app config
- PUT /admin/environments/{id}/default-container-config for env defaults
- GET /apps now supports optional environmentId (lists all when omitted)
- AppRepository.findAll() for cross-environment app listing

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-08 16:18:08 +02:00
hsiegeln
2e006051bc feat: add production/enabled flags to environments, drop status enum
Environments now have:
- production (bool): prod vs non-prod resource allocation
- enabled (bool): disabled blocks new deployments

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-08 11:16:09 +02:00
hsiegeln
85530d5ea3 feat: add runtime management database schema (environments, apps, versions, deployments)
- environments, apps, app_versions, deployments tables
- Default environment seeded on migration
- Foreign keys with CASCADE delete

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 23:40:18 +02:00
hsiegeln
bd78207060 feat: add claim mapping rules table and origin tracking to RBAC assignments
- Add origin and mapping_id columns to user_roles and user_groups
- Create claim_mapping_rules table with match_type and action constraints
- Update primary keys to include origin column
- Add indexes for fast managed assignment cleanup

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 23:07:30 +02:00
hsiegeln
188810e54b feat: remove TimescaleDB, dead PG stores, and storage feature flags
Some checks failed
CI / cleanup-branch (push) Has been skipped
CI / build (push) Failing after 32s
CI / docker (push) Has been skipped
CI / deploy (push) Has been skipped
CI / deploy-feature (push) Has been skipped
Complete the ClickHouse migration by removing all PostgreSQL analytics
code. PostgreSQL now serves only RBAC, config, and audit — all
observability data is exclusively in ClickHouse.

- Delete 6 dead PostgreSQL store classes (executions, stats, diagrams,
  events, metrics, metrics-query) and 2 integration tests
- Delete RetentionScheduler (ClickHouse TTL handles retention)
- Remove all 7 cameleer.storage.* feature flags from application.yml
- Remove all @ConditionalOnProperty from ClickHouse beans in StorageBeanConfig
- Consolidate 14 Flyway migrations (V1-V14) into single clean V1 with
  only RBAC/config/audit tables (no TimescaleDB, no analytics tables)
- Switch from timescale/timescaledb-ha:pg16 to postgres:16 everywhere
  (docker-compose, deploy/postgres.yaml, test containers)
- Remove TimescaleDB check and /metrics-pipeline from DatabaseAdminController
- Set clickhouse.enabled default to true

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-01 20:10:58 +02:00
hsiegeln
91400defe9 fix: add missing V9 (ClickHouse) and V14 (PostgreSQL) identity column rename migrations
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m7s
CI / docker (push) Successful in 45s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 39s
Migration files were lost during worktree merge — recreated.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-01 12:33:02 +02:00
hsiegeln
ab7031e6ed feat: add is_replay flag to execution pipeline and UI
Detect replayed exchanges via X-Cameleer-Replay header during ingestion,
persist the flag through PostgreSQL and OpenSearch, and surface it in
the dashboard (amber replay icon) and exchange detail chain view.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-31 14:39:40 +02:00
hsiegeln
213aa86c47 feat: progressive drill-down dashboard with RED metrics and SLA compliance (#94)
Three-level dashboard driven by sidebar selection:
- L1 (no selection): all-apps overview with health table, per-app charts
- L2 (app selected): route performance table, error velocity, top errors
- L3 (route selected): processor table, latency heatmap data, bottleneck KPI

Backend: 3 new endpoints (timeseries/by-app, timeseries/by-route, errors/top),
per-app SLA settings (app_settings table, V12 migration), exact SLA compliance
from executions hypertable, error velocity with acceleration detection.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-29 23:29:20 +02:00
hsiegeln
3d71345181 feat: trace data indicators, inline tap config, and detail tab gating
All checks were successful
CI / build (push) Successful in 1m46s
CI / cleanup-branch (push) Has been skipped
CI / docker (push) Successful in 1m25s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 1m57s
Trace data visibility:
- ProcessorNode now includes hasTraceData flag computed from captured
  body/headers during tree conversion
- ConfigBadge shows teal for tracing configured, green when data captured
- Search results show green footprints icon for exchanges with trace data
- New has_trace_data column on executions table (V11 migration with backfill)
- OpenSearch documents and ExecutionSummary include the flag

Inline tap configuration:
- Extracted reusable TapConfigModal component from RouteDetail
- Diagram context menu opens tap modal inline instead of navigating away
- Toggle-trace action works immediately with toast feedback
- Modal closes only on ESC, Cancel, Save, or Delete (not backdrop click)

Detail panel tab gating:
- Headers, Input, Output tabs disabled when no data is available
- Works at both exchange and processor level
- Falls back to Info tab when active tab becomes empty

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-29 13:08:58 +02:00
hsiegeln
30344d29b1 feat: store raw processor tree JSON and add error categorization fields
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m0s
CI / docker (push) Successful in 53s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 37s
Fixes iteration overlay corruption caused by flat storage collapsing
duplicate processorIds across loop iterations.

Server:
- Store raw processor tree as processors_json JSONB on executions table
- Detail endpoint serves from processors_json (faithful tree), falls back
  to flat record reconstruction for older executions
- V10 migration: processors_json, error categorization (errorType,
  errorCategory, rootCauseType, rootCauseMessage), OTel (traceId, spanId),
  circuit breaker (circuitBreakerState, fallbackTriggered), drops
  erroneous splitDepth/loopDepth columns
- Add all new fields through full ingestion/storage/API chain

UI:
- Fix overlay wrapper filtering: check wrapper type before status filter
- Add new fields to schema.d.ts

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-28 21:44:54 +01:00
hsiegeln
faf5d505f4 feat: support iteration wrapper nodes and filter overlay by selected iteration
Some checks failed
CI / cleanup-branch (push) Has been skipped
CI / build (push) Failing after 38s
CI / docker (push) Has been skipped
CI / deploy (push) Has been skipped
CI / deploy-feature (push) Has been skipped
Server:
- Add split_depth and loop_depth columns (V9 migration)
- Persist splitDepth/loopDepth with reflection fallback for older agent versions

UI:
- Detect iterations via wrapper processorTypes (loopIteration, splitIteration, multicastBranch)
- Filter overlay by selected iteration at the wrapper level
- Skip non-selected iteration wrappers entirely (wrapper + children)
- Don't add synthetic wrappers to overlay (no diagram node correspondence)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-28 18:57:27 +01:00
hsiegeln
c4b396e618 feat: persist and expose resolvedEndpointUri for execution-level drill-down
Wire resolvedEndpointUri through the full chain:
- V9 migration adds resolved_endpoint_uri column
- IngestionService extracts from ProcessorExecution
- PostgresExecutionStore persists and reads the column
- ProcessorNode includes field in detail API response
- UI schema updated for ProcessorNode and PositionedNode

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-28 18:37:11 +01:00
hsiegeln
edd841ffeb feat: add iteration fields to processor execution storage
Add loop_index, loop_size, split_index, split_size, multicast_index
columns to processor_executions table and thread them through the
full storage → ingestion → detail pipeline. These fields enable
execution overlay to display iteration context for loop, split,
and multicast EIPs.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-27 18:32:47 +01:00
hsiegeln
d6c1f2c25b refactor: derive processor-route mapping from diagrams instead of executions
Some checks failed
CI / cleanup-branch (push) Has been skipped
CI / build (push) Failing after 37s
CI / docker (push) Has been skipped
CI / deploy (push) Has been skipped
CI / deploy-feature (push) Has been skipped
Store application_name in route_diagrams at ingestion time (V7 migration),
resolve from agent registry same as ExecutionController. Move
findProcessorRouteMapping from ExecutionStore to DiagramStore using a
JSONB query that extracts node IDs directly from stored RouteGraph
definitions. This makes the mapping available as soon as diagrams are
sent, before any executions are recorded.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-26 23:00:10 +01:00
hsiegeln
100b780b47 refactor: remove diagramNodeId indirection, use processorId directly
Some checks failed
CI / cleanup-branch (push) Has been skipped
CI / build (push) Failing after 37s
CI / docker (push) Has been skipped
CI / deploy (push) Has been skipped
CI / deploy-feature (push) Has been skipped
Agent now uses Camel processorId as RouteNode.id, eliminating the
nodeId mapping layer. Drop diagram_node_id column (V6 migration),
remove from ProcessorRecord/ProcessorNode/IngestionService/DetailService,
add /processor-routes endpoint for processorId→routeId lookup,
simplify frontend diagram-mapping and ExchangeDetail overlays,
replace N diagram fetches in AppConfigPage with single hook.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-26 22:44:07 +01:00
hsiegeln
f08461cf35 feat(db): add attributes JSONB columns to executions and processor_executions (Task 1)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-26 18:23:26 +01:00
hsiegeln
69a3eb192f feat: persistent per-application config with GET/PUT endpoints
Some checks failed
CI / build (push) Failing after 1m10s
CI / cleanup-branch (push) Has been skipped
CI / docker (push) Has been skipped
CI / deploy (push) Has been skipped
CI / deploy-feature (push) Has been skipped
Add application_config table (V4 migration), repository, and REST
controller. GET /api/v1/config/{app} returns config, PUT saves and
pushes CONFIG_UPDATE to all LIVE agents via SSE. UI tracing toggle
now uses config API instead of direct SET_TRACED_PROCESSORS command.
Tracing store syncs with server config on load.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-25 07:42:55 +01:00
2887fe9599 feat: add V3 migration for engine_level and route-level snapshot columns
Some checks failed
CI / build (push) Failing after 51s
CI / cleanup-branch (push) Has been skipped
CI / build (pull_request) Failing after 52s
CI / cleanup-branch (pull_request) Has been skipped
CI / docker (push) Has been skipped
CI / docker (pull_request) Has been skipped
CI / deploy (push) Has been skipped
CI / deploy-feature (push) Has been skipped
CI / deploy (pull_request) Has been skipped
CI / deploy-feature (pull_request) Has been skipped
2026-03-24 16:13:11 +01:00
hsiegeln
ea56bcf2d7 fix: split Flyway migration — DDL in V1, policies in V2
All checks were successful
CI / build (push) Successful in 1m20s
CI / cleanup-branch (push) Has been skipped
CI / docker (push) Successful in 43s
CI / deploy (push) Successful in 1m16s
CI / deploy-feature (push) Has been skipped
TimescaleDB add_continuous_aggregate_policy and add_compression_policy
cannot run inside a transaction block. Move all policy calls to V2
with flyway:executeInTransaction=false directive.

Also fix stats_1m_processor_detail: add WITH NO DATA and
materialized_only = false.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-23 21:34:35 +01:00
hsiegeln
6a5dba4eba refactor: rename group_name→application_name in DB, OpenSearch, SQL
Some checks failed
CI / build (push) Failing after 41s
CI / cleanup-branch (push) Has been skipped
CI / docker (push) Has been skipped
CI / deploy (push) Has been skipped
CI / deploy-feature (push) Has been skipped
Consolidate V1-V7 Flyway migrations into single V1__init.sql with
all columns renamed from group_name to application_name. Requires
fresh database (wipe flyway_schema_history, all data).

- DB columns: executions.group_name → application_name,
  processor_executions.group_name → application_name
- Continuous aggregates: all views updated to use application_name
- OpenSearch field: group_name → application_name in index/query
- All Java SQL strings updated to match new column names
- Delete V2-V7 migration files (folded into V1)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-23 21:24:19 +01:00
hsiegeln
31b60c4e24 feat: add V7 migration for per-processor-id continuous aggregate 2026-03-23 18:09:24 +01:00
hsiegeln
2b111c603c feat: migrate UI to @cameleer/design-system, add backend endpoints
Some checks failed
CI / build (push) Failing after 47s
CI / cleanup-branch (push) Has been skipped
CI / docker (push) Has been skipped
CI / deploy (push) Has been skipped
CI / deploy-feature (push) Has been skipped
Backend:
- Add agent_events table (V5) and lifecycle event recording
- Add route catalog endpoint (GET /routes/catalog)
- Add route metrics endpoint (GET /routes/metrics)
- Add agent events endpoint (GET /agents/events-log)
- Enrich AgentInstanceResponse with tps, errorRate, activeRoutes, uptimeSeconds
- Add TimescaleDB retention/compression policies (V6)

Frontend:
- Replace custom Mission Control UI with @cameleer/design-system components
- Rebuild all pages: Dashboard, ExchangeDetail, RoutesMetrics, AgentHealth,
  AgentInstance, RBAC, AuditLog, OIDC, DatabaseAdmin, OpenSearchAdmin, Swagger
- New LayoutShell with design system AppShell, Sidebar, TopBar, CommandPalette
- Consume design system from Gitea npm registry (@cameleer/design-system@0.0.1)
- Add .npmrc for scoped registry, update Dockerfile with REGISTRY_TOKEN arg

CI:
- Pass REGISTRY_TOKEN build-arg to UI Docker build step

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-19 17:38:39 +01:00
hsiegeln
0fcbe83cc2 refactor: consolidate oidc_config and admin_thresholds into generic server_config table
All checks were successful
CI / build (push) Successful in 1m19s
CI / cleanup-branch (push) Has been skipped
CI / docker (push) Successful in 42s
CI / deploy (push) Has been skipped
CI / deploy-feature (push) Successful in 34s
CI / build (pull_request) Successful in 1m23s
CI / cleanup-branch (pull_request) Has been skipped
CI / docker (pull_request) Has been skipped
CI / deploy (pull_request) Has been skipped
CI / deploy-feature (pull_request) Has been skipped
Single JSONB key-value table replaces two singleton config tables, making
future config types trivial to add. Also fixes pre-existing IT failures:
Flyway URL not overridden by Testcontainers, threshold test ordering.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-18 11:16:31 +01:00
hsiegeln
6f5b5b8655 feat: add password support for local user creation and per-user login
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-17 19:08:19 +01:00
hsiegeln
4842507ff3 feat: seed built-in Admins group and assign admin users on login
- Add V2 Flyway migration to create built-in Admins group (id: ...0010) with ADMIN role
- Add ADMINS_GROUP_ID constant to SystemRole
- Add user to Admins group on successful local login alongside role assignment
2026-03-17 18:30:16 +01:00
hsiegeln
b06b3f52a8 refactor: consolidate V1-V10 Flyway migrations into single V1__init.sql
Add RBAC tables (roles, groups, group_roles, user_groups, user_roles)
with system role seeds and join indexes. Drop users.roles TEXT[] column.
2026-03-17 17:34:15 +01:00
hsiegeln
fa3bc592d1 feat: add Flyway V9 (thresholds) and V10 (audit_log) migrations 2026-03-17 15:32:20 +01:00
hsiegeln
af03ecdf42 fix: use WITH NO DATA for continuous aggregates to avoid transaction block error 2026-03-16 19:32:54 +01:00
hsiegeln
0723f48e5b fix: disable Flyway transaction for continuous aggregate migration 2026-03-16 19:24:12 +01:00
hsiegeln
8a637df65c feat: add Flyway migrations for PostgreSQL/TimescaleDB schema
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-16 18:13:53 +01:00