Commit Graph

122 Commits

Author SHA1 Message Date
hsiegeln
8ad0016a8e refactor: rename group/groupName to application/applicationName
Some checks failed
CI / build (push) Failing after 40s
CI / cleanup-branch (push) Has been skipped
CI / docker (push) Has been skipped
CI / deploy (push) Has been skipped
CI / deploy-feature (push) Has been skipped
The execution-related "group" concept actually represents the
application name. Rename all Java fields, API parameters, and frontend
types from groupName→applicationName and group→application for clarity.

- Java records: ExecutionSummary, ExecutionDetail, ExecutionDocument,
  ExecutionRecord, ProcessorRecord
- API params: SearchRequest.group→application, SearchController
  @RequestParam group→application
- Services: IngestionService, DetailService, SearchIndexer, StatsStore
- Frontend: schema.d.ts, Dashboard, ExchangeDetail, RouteDetail,
  executions query hooks

Database column names (group_name) and OpenSearch field names are
unchanged — only the API-facing Java/TS field names are renamed.

RBAC group references (groups table, GroupRepository, GroupsTab) are
a separate domain concept and are NOT affected by this change.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-23 21:21:38 +01:00
hsiegeln
a72b0954db fix: add groupName to ExecutionSummary, locale format stat values, inspect column, fix duplicate keys
Some checks failed
CI / build (push) Failing after 40s
CI / cleanup-branch (push) Has been skipped
CI / docker (push) Has been skipped
CI / deploy (push) Has been skipped
CI / deploy-feature (push) Has been skipped
- Added groupName field to ExecutionSummary Java record and OpenSearch mapper
- Dashboard stat cards use locale-formatted numbers (en-US)
- Added inspect column (↗) linking directly to exchange detail page
- Fixed duplicate React key warning from two columns sharing executionId key

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-23 20:41:46 +01:00
hsiegeln
f4d2693561 feat: enrich AgentInstanceResponse with version/capabilities, add password reset endpoint
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-23 18:13:37 +01:00
hsiegeln
2051572ee2 feat: add GET /agents/{id}/metrics endpoint for JVM metrics
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-23 18:11:22 +01:00
hsiegeln
cc433b4215 feat: add GET /routes/metrics/processors endpoint
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-23 18:10:54 +01:00
hsiegeln
31b60c4e24 feat: add V7 migration for per-processor-id continuous aggregate 2026-03-23 18:09:24 +01:00
hsiegeln
2b111c603c feat: migrate UI to @cameleer/design-system, add backend endpoints
Some checks failed
CI / build (push) Failing after 47s
CI / cleanup-branch (push) Has been skipped
CI / docker (push) Has been skipped
CI / deploy (push) Has been skipped
CI / deploy-feature (push) Has been skipped
Backend:
- Add agent_events table (V5) and lifecycle event recording
- Add route catalog endpoint (GET /routes/catalog)
- Add route metrics endpoint (GET /routes/metrics)
- Add agent events endpoint (GET /agents/events-log)
- Enrich AgentInstanceResponse with tps, errorRate, activeRoutes, uptimeSeconds
- Add TimescaleDB retention/compression policies (V6)

Frontend:
- Replace custom Mission Control UI with @cameleer/design-system components
- Rebuild all pages: Dashboard, ExchangeDetail, RoutesMetrics, AgentHealth,
  AgentInstance, RBAC, AuditLog, OIDC, DatabaseAdmin, OpenSearchAdmin, Swagger
- New LayoutShell with design system AppShell, Sidebar, TopBar, CommandPalette
- Consume design system from Gitea npm registry (@cameleer/design-system@0.0.1)
- Add .npmrc for scoped registry, update Dockerfile with REGISTRY_TOKEN arg

CI:
- Pass REGISTRY_TOKEN build-arg to UI Docker build step

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-19 17:38:39 +01:00
hsiegeln
82124c3145 fix: remove RBAC user_roles insert from agent registration
All checks were successful
CI / build (push) Successful in 1m22s
CI / cleanup-branch (push) Has been skipped
CI / docker (push) Successful in 42s
CI / deploy (push) Successful in 44s
CI / deploy-feature (push) Has been skipped
Agents are transient and should not be persisted in the users table.
The assignRoleToUser call caused a FK violation (user_roles → users),
resulting in HTTP 500 on registration. The AGENT role is already
embedded directly in the JWT claims.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-18 22:10:48 +01:00
hsiegeln
17ef48e392 fix: return rotated refresh token from agent token refresh endpoint
All checks were successful
CI / build (push) Successful in 1m22s
CI / cleanup-branch (push) Has been skipped
CI / docker (push) Successful in 56s
CI / deploy (push) Successful in 47s
CI / deploy-feature (push) Has been skipped
Previously the refresh endpoint only returned a new accessToken, causing
agents to lose their refreshToken after the first refresh cycle and
forcing a full re-registration every ~2 hours.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-18 16:44:16 +01:00
hsiegeln
0fcbe83cc2 refactor: consolidate oidc_config and admin_thresholds into generic server_config table
All checks were successful
CI / build (push) Successful in 1m19s
CI / cleanup-branch (push) Has been skipped
CI / docker (push) Successful in 42s
CI / deploy (push) Has been skipped
CI / deploy-feature (push) Successful in 34s
CI / build (pull_request) Successful in 1m23s
CI / cleanup-branch (pull_request) Has been skipped
CI / docker (pull_request) Has been skipped
CI / deploy (pull_request) Has been skipped
CI / deploy-feature (pull_request) Has been skipped
Single JSONB key-value table replaces two singleton config tables, making
future config types trivial to add. Also fixes pre-existing IT failures:
Flyway URL not overridden by Testcontainers, threshold test ordering.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-18 11:16:31 +01:00
hsiegeln
5a0a915cc6 fix: scope admin infra pages to current tenant's tables and indices
All checks were successful
CI / build (push) Successful in 1m14s
CI / cleanup-branch (push) Has been skipped
CI / docker (push) Successful in 44s
CI / deploy (push) Has been skipped
CI / deploy-feature (push) Successful in 35s
Database tables filtered to current_schema(), active queries to
current_database(), OpenSearch indices to configured index-prefix.
Delete endpoint rejects indices outside application scope.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-18 09:29:06 +01:00
hsiegeln
6f5b5b8655 feat: add password support for local user creation and per-user login
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-17 19:08:19 +01:00
hsiegeln
f42e6279e6 fix: null safety in role/group creation, add user create/update endpoints
- RoleAdminController.createRole: default null description to "" and null scope to "custom"
- RoleAdminController.updateRole: pass null audit details to avoid NPE when name is null
- GroupAdminController.updateGroup: pass null audit details to avoid NPE when name is null
- UserAdminController: add POST / createUser endpoint with default VIEWER role assignment
- UserAdminController: add PUT /{userId} updateUser endpoint for displayName/email updates
2026-03-17 18:49:34 +01:00
hsiegeln
4842507ff3 feat: seed built-in Admins group and assign admin users on login
- Add V2 Flyway migration to create built-in Admins group (id: ...0010) with ADMIN role
- Add ADMINS_GROUP_ID constant to SystemRole
- Add user to Admins group on successful local login alongside role assignment
2026-03-17 18:30:16 +01:00
hsiegeln
01295c84d8 feat: add Group, Role, and RBAC stats admin controllers
GroupAdminController with cycle detection, RoleAdminController
with system role protection, RbacStatsController for dashboard.
Rewrite UserAdminController to use RbacService.
2026-03-17 17:47:26 +01:00
hsiegeln
eb0cc8c141 feat: replace flat users.roles with relational RBAC model
New package com.cameleer3.server.core.rbac with SystemRole constants,
detail/summary records, GroupRepository, RoleRepository, RbacService.
Remove roles field from UserInfo. Implement PostgresGroupRepository,
PostgresRoleRepository, RbacServiceImpl with inheritance computation.
Update UiAuthController, OidcAuthController, AgentRegistrationController
to assign roles via user_roles table. JWT populated from effective system roles.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-17 17:44:32 +01:00
hsiegeln
b06b3f52a8 refactor: consolidate V1-V10 Flyway migrations into single V1__init.sql
Add RBAC tables (roles, groups, group_roles, user_groups, user_roles)
with system role seeds and join indexes. Drop users.roles TEXT[] column.
2026-03-17 17:34:15 +01:00
hsiegeln
321b8808cc feat: add ThresholdAdminController and AuditLogController with integration tests
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-17 15:57:23 +01:00
hsiegeln
c6da858c2f feat: add OpenSearchAdminController with status, pipeline, indices, performance, and delete endpoints
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-17 15:57:18 +01:00
hsiegeln
c6b2f7c331 feat: add DatabaseAdminController with status, pool, tables, queries, and kill endpoints
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-17 15:57:14 +01:00
hsiegeln
0cea8af6bc feat: add response/request DTOs for admin infrastructure endpoints
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-17 15:51:31 +01:00
hsiegeln
1d6ae00b1c feat: wire AuditService, enable method security, retrofit audit logging into existing controllers
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-17 15:51:22 +01:00
hsiegeln
e8842e3bdc feat: add Postgres implementations for AuditRepository and ThresholdRepository
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-17 15:51:13 +01:00
hsiegeln
a0944a1c72 feat: add audit domain model, repository interface, AuditService, and unit test
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-17 15:36:21 +01:00
hsiegeln
fa3bc592d1 feat: add Flyway V9 (thresholds) and V10 (audit_log) migrations 2026-03-17 15:32:20 +01:00
hsiegeln
82117deaab fix: pass credentials to Flyway when using separate datasource URL
All checks were successful
CI / build (push) Successful in 1m6s
CI / cleanup-branch (push) Has been skipped
CI / docker (push) Successful in 42s
CI / deploy (push) Successful in 40s
CI / deploy-feature (push) Has been skipped
When spring.flyway.url is set independently, Spring Boot does not
inherit credentials from spring.datasource. Add explicit user/password
to both application.yml and K8s deployment to prevent "no password"
failures on feature branch deployments.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-17 13:34:41 +01:00
hsiegeln
247fdb01c0 fix: separate Flyway and app datasource search paths for schema isolation
Some checks failed
CI / build (push) Successful in 1m6s
CI / cleanup-branch (push) Has been skipped
CI / docker (push) Successful in 41s
CI / deploy (push) Failing after 2m19s
CI / deploy-feature (push) Has been skipped
Flyway needs public in the search_path to access TimescaleDB extension
functions (create_hypertable). The app datasource must NOT include public
to prevent accidental cross-schema reads from production data.

- spring.flyway.url: currentSchema=<branch>,public (extensions accessible)
- spring.datasource.url: currentSchema=<branch> (strict isolation)
- SPRING_FLYWAY_URL env var added to K8s base manifest

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-17 13:26:01 +01:00
hsiegeln
b393d262cb refactor: remove OIDC env var config and seeder
All checks were successful
CI / build (push) Successful in 1m7s
CI / cleanup-branch (push) Has been skipped
CI / docker (push) Successful in 41s
CI / deploy (push) Successful in 39s
CI / deploy-feature (push) Has been skipped
OIDC configuration is already fully database-backed (oidc_config table,
admin API, OidcConfigRepository). Remove the redundant env var binding
(SecurityProperties.Oidc), the env-to-DB seeder (oidcConfigSeeder), and
the OIDC section from application.yml.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-17 13:20:35 +01:00
hsiegeln
15f20d22ad feat: add feature branch deployments with per-branch isolation
Some checks failed
CI / build (push) Successful in 1m8s
CI / cleanup-branch (push) Has been skipped
CI / docker (push) Successful in 42s
CI / deploy (push) Failing after 5s
CI / deploy-feature (push) Has been skipped
Enable deploying feature branches into isolated environments on the same
k3s cluster. Each branch gets its own namespace (cam-<slug>), PostgreSQL
schema, and OpenSearch index prefix for data isolation while sharing the
underlying infrastructure.

- Make OpenSearch index prefix and DB schema configurable via env vars
  (defaults preserve existing behavior)
- Restructure deploy/ into Kustomize base + overlays (main/feature)
- Extend CI to build Docker images for all branches, not just main
- Add deploy-feature job with namespace creation, secret copying,
  Traefik Ingress routing (<slug>-api/ui.cameleer.siegeln.net)
- Add cleanup-branch job to remove namespace, PG schema, OS indices
  on branch deletion
- Install required tools (git, jq, curl) in CI deploy containers

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-17 11:35:07 +01:00
hsiegeln
672544660f fix: enable trackTotalHits for accurate OpenSearch result counts
All checks were successful
CI / build (push) Successful in 1m16s
CI / docker (push) Successful in 3m41s
CI / deploy (push) Successful in 44s
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-17 09:54:50 +01:00
hsiegeln
c316e80d7f chore: update docs and config for PostgreSQL/OpenSearch storage layer
All checks were successful
CI / build (pull_request) Successful in 1m20s
CI / docker (pull_request) Has been skipped
CI / deploy (pull_request) Has been skipped
- Set failsafe reuseForks=true to reuse JVM across IT classes (faster test suite)
- Replace ClickHouse with PostgreSQL+OpenSearch in docker-compose.yml
- Remove redundant docker-compose.dev.yml
- Update CLAUDE.md and HOWTO.md to reflect new storage stack

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-17 00:26:50 +01:00
hsiegeln
796be06a09 fix: resolve all integration test failures after storage layer refactor
- Use singleton container pattern for PostgreSQL + OpenSearch testcontainers
  (fixes container lifecycle issues with @TestInstance(PER_CLASS))
- Fix table name route_executions → executions in DetailControllerIT and
  ExecutionControllerIT
- Serialize processor headers as JSON (ObjectMapper) instead of Map.toString()
  for JSONB column compatibility
- Add nested mapping for processors field in OpenSearch index template
- Use .keyword sub-field for term queries on dynamically mapped text fields
- Add wildcard fallback queries for all text searches (substring matching)
- Isolate stats tests with unique route names to prevent data contamination
- Wait for OpenSearch indexing in SearchControllerIT with targeted Awaitility
- Reduce OpenSearch debounce to 100ms in test profile

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-17 00:02:19 +01:00
hsiegeln
26f5a2ce3b fix: update remaining ITs for synchronous ingestion and PostgreSQL storage
- SearchControllerIT: remove @TestInstance(PER_CLASS), use @BeforeEach with
  static guard, fix table name (route_executions -> executions), remove
  Awaitility polling
- OpenSearchIndexIT: replace Thread.sleep with explicit index refresh via
  OpenSearchClient
- DiagramLinkingIT: fix table name, remove Awaitility awaits (writes are
  synchronous)
- IngestionSchemaIT: rewrite queries for PostgreSQL relational model
  (processor_executions table instead of ClickHouse array columns)
- PostgresStatsStoreIT: use explicit time bounds in
  refresh_continuous_aggregate calls
- IngestionService: populate diagramContentHash during execution ingestion
  by looking up the latest diagram for the route+agent

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-16 22:03:29 +01:00
hsiegeln
d23b899f00 fix: prefix user tokens with 'user:' for JwtAuthenticationFilter routing 2026-03-16 21:01:57 +01:00
hsiegeln
9f74e47ecf fix: use correct role-based JWT tokens in all integration tests
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-16 20:03:38 +01:00
hsiegeln
39f9925e71 fix: restore test config (bootstrap token, ingestion, agent-registry) and add @ActiveProfiles 2026-03-16 19:41:05 +01:00
hsiegeln
af03ecdf42 fix: use WITH NO DATA for continuous aggregates to avoid transaction block error 2026-03-16 19:32:54 +01:00
hsiegeln
0723f48e5b fix: disable Flyway transaction for continuous aggregate migration 2026-03-16 19:24:12 +01:00
hsiegeln
3c0e615fb7 fix: use timescaledb-ha image which includes toolkit extension 2026-03-16 19:13:47 +01:00
hsiegeln
589da1b6d6 fix: use asCompatibleSubstituteFor for TimescaleDB Testcontainer image 2026-03-16 19:06:54 +01:00
hsiegeln
41e2038190 fix: use ChronoUnit for Instant arithmetic in PostgresStatsStoreIT 2026-03-16 19:04:42 +01:00
hsiegeln
565b548ac1 refactor: remove all ClickHouse code, old interfaces, and SQL migrations
- Delete all ClickHouse storage implementations and config
- Delete old core interfaces (ExecutionRepository, DiagramRepository, MetricsRepository, SearchEngine, RawExecutionRow)
- Delete ClickHouse SQL migration files
- Delete AbstractClickHouseIT
- Update controllers to use new store interfaces (DiagramStore, ExecutionStore)
- Fix IngestionService calls in controllers for new synchronous API
- Migrate all ITs from AbstractClickHouseIT to AbstractPostgresIT
- Fix count() syntax and remove ClickHouse-specific test assertions
- Update TreeReconstructionTest for new buildTree() method

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-16 18:56:13 +01:00
hsiegeln
7dbfaf0932 feat: wire new storage beans, add MetricsFlushScheduler and RetentionScheduler
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-16 18:27:58 +01:00
hsiegeln
f7d7302694 feat: implement OpenSearchIndex with full-text and wildcard search
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-16 18:25:55 +01:00
hsiegeln
5932b5d969 feat: implement PostgresDiagramStore, PostgresUserRepository, PostgresOidcConfigRepository, PostgresMetricsStore
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-16 18:23:21 +01:00
hsiegeln
527e2cf017 feat: implement PostgresStatsStore querying continuous aggregates 2026-03-16 18:22:44 +01:00
hsiegeln
9fd02c4edb feat: implement PostgresExecutionStore with upsert and dedup
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-16 18:20:57 +01:00
hsiegeln
41a9a975fd config: switch datasource to PostgreSQL, add OpenSearch and Flyway config
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-16 18:15:33 +01:00
hsiegeln
0eeae70369 test: add TimescaleDB test base class and Flyway migration smoke test 2026-03-16 18:15:32 +01:00
hsiegeln
8a637df65c feat: add Flyway migrations for PostgreSQL/TimescaleDB schema
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-16 18:13:53 +01:00