Commit Graph

43 Commits

Author SHA1 Message Date
hsiegeln
7a63135d26 fix: scope pg_stat_activity queries by ApplicationName for tenant isolation
Some checks failed
CI / cleanup-branch (push) Has been skipped
CI / build (push) Failing after 36s
CI / docker (push) Has been skipped
CI / deploy (push) Has been skipped
CI / deploy-feature (push) Has been skipped
DatabaseAdminController's active-queries and kill-query endpoints could
expose SQL text from other tenants sharing the same PostgreSQL instance.
Added ApplicationName=tenant_{id} to the JDBC URL and filter
pg_stat_activity by application_name so each tenant only sees its own
connections.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 23:51:13 +02:00
hsiegeln
6bf7175a6c feat: add Micrometer Prometheus metrics to server
Some checks failed
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 2m36s
CI / deploy (push) Has been cancelled
CI / docker (push) Has been cancelled
CI / deploy-feature (push) Has been cancelled
Adds micrometer-registry-prometheus and exposes /api/v1/prometheus
endpoint (unauthenticated for scraping). ServerMetrics component
provides business metrics beyond default JVM/HTTP:

Gauges: agents by state, SSE connections, buffer depths (execution,
processor, log, metrics), accumulator pending exchanges.

Counters: ingestion drops (buffer_full, no_agent, no_identity),
agent transitions (went_stale, went_dead, recovered), deployment
outcomes (running, failed, degraded), auth failures (invalid_token,
revoked, oidc_rejected).

Timers: ClickHouse flush duration by type, deployment duration.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 18:23:27 +02:00
hsiegeln
90c82238a0 feat: add orphaned app cleanup — auto-filter stale discovered apps, manual dismiss with data purge
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 16:19:59 +02:00
hsiegeln
293d11e52b feat: add infrastructureendpoints flag with conditional DB/CH controllers
Add cameleer.server.security.infrastructureendpoints property (default true) and
@ConditionalOnProperty to DatabaseAdminController and ClickHouseAdminController so
the SaaS provisioner can set CAMELEER_SERVER_SECURITY_INFRASTRUCTUREENDPOINTS=false
to suppress these endpoints (404) on tenant server containers.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 23:09:28 +02:00
hsiegeln
350e769948 Group container settings under cameleer.server.runtime.container.*
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m21s
CI / docker (push) Successful in 1m2s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 37s
Move container resource defaults into their own sub-namespace for
future extensibility:

  cameleer.server.runtime.container.memorylimit → CAMELEER_SERVER_RUNTIME_CONTAINER_MEMORYLIMIT
  cameleer.server.runtime.container.cpushares   → CAMELEER_SERVER_RUNTIME_CONTAINER_CPUSHARES

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 21:33:07 +02:00
hsiegeln
534e936cd4 Group OIDC settings under cameleer.server.security.oidc.*
Some checks failed
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m59s
CI / deploy (push) Has been cancelled
CI / deploy-feature (push) Has been cancelled
CI / docker (push) Has been cancelled
Move OIDC properties into a nested Oidc class within SecurityProperties
for clearer grouping. Env vars gain an extra separator:

  cameleer.server.security.oidc.issueruri     → CAMELEER_SERVER_SECURITY_OIDC_ISSUERURI
  cameleer.server.security.oidc.jwkseturi     → CAMELEER_SERVER_SECURITY_OIDC_JWKSETURI
  cameleer.server.security.oidc.audience      → CAMELEER_SERVER_SECURITY_OIDC_AUDIENCE
  cameleer.server.security.oidc.tlsskipverify → CAMELEER_SERVER_SECURITY_OIDC_TLSSKIPVERIFY

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 21:30:33 +02:00
hsiegeln
60fb5fe21a Remove vestigial clickhouse.enabled flag
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m19s
CI / docker (push) Successful in 1m4s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 37s
ClickHouse is the only storage backend — there is no alternative.
The enabled flag created a false sense of optionality: setting it to
false would crash on startup because most beans unconditionally depend
on the ClickHouse JdbcTemplate.

Remove all @ConditionalOnProperty annotations gating ClickHouse beans,
the enabled property from application.yml, and the K8s manifest entry.
Also fix old property names in AbstractPostgresIT test config.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 21:27:10 +02:00
hsiegeln
8fe48bbf02 Migrate config to cameleer.server.* naming convention
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m52s
CI / docker (push) Successful in 1m30s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 37s
Move all configuration properties under the cameleer.server.* namespace
with all-lowercase dot-separated names and mechanical env var mapping
(dots→underscores, uppercase). This aligns with the agent's convention
(cameleer.agent.*) and establishes a predictable pattern across all
components.

Changes:
- Move 6 config prefixes under cameleer.server.*: agent-registry,
  ingestion, security, license, clickhouse, and cameleer.tenant/runtime/indexer
- Rename all kebab-case properties to concatenated lowercase
  (e.g., bootstrap-token → bootstraptoken, jar-storage-path → jarstoragepath)
- Update all env vars to CAMELEER_SERVER_* mechanical mapping
- Fix container-cpu-request/container-cpu-shares mismatch bug
- Remove displayName from AgentRegistrationRequest (redundant with instanceId)
- Update agent container env vars to CAMELEER_AGENT_* convention
- Update K8s manifests and CI workflow for new env var names
- Update CLAUDE.md, HOWTO.md, SERVER-CAPABILITIES.md documentation

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 18:10:51 +02:00
hsiegeln
20f3dfe59d feat: support Docker volume-based JAR mounting for Docker-in-Docker
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m27s
CI / docker (push) Successful in 1m8s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 36s
When CAMELEER_JAR_DOCKER_VOLUME is set, the orchestrator mounts the
named volume at the jar storage path instead of using a host bind mount.
This solves the path translation issue in Docker-in-Docker setups where
the server runs inside a container and manages sibling containers.

The entrypoint is overridden to use the volume-mounted JAR path via
the CAMELEER_APP_JAR env var.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-08 21:38:34 +02:00
hsiegeln
6e444a414d feat: add CAMELEER_SERVER_URL config property
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-08 20:28:44 +02:00
hsiegeln
0b2d231b6b feat: split config into 4 tabs and fix JAR upload 413
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m19s
CI / docker (push) Successful in 1m5s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 37s
Config sub-tabs are now: Monitoring | Traces & Taps | Route Recording | Resources
(renamed from Agent/Infrastructure, with traces and recording as their own tabs).

Also increase Spring multipart max-file-size and max-request-size to 200MB
to fix HTTP 413 on JAR uploads.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-08 18:22:39 +02:00
hsiegeln
3d20d7a0cb feat: add runtime management configuration properties
- JAR storage path, base image, Docker network
- Container memory/CPU limits, health check timeout
- Routing mode and domain for Traefik integration

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 23:47:43 +02:00
hsiegeln
b969075007 feat: add license loading at startup from env var or file
- LicenseBeanConfig wires LicenseGate bean with startup validation
- Supports token from CAMELEER_LICENSE_TOKEN env var or CAMELEER_LICENSE_FILE path
- Falls back to open mode when no license or no public key configured
- Add license config properties to application.yml
2026-04-07 23:11:02 +02:00
hsiegeln
083cb8b9ec feat: add CAMELEER_CORS_ALLOWED_ORIGINS for multi-origin CORS support
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m7s
CI / docker (push) Successful in 41s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 37s
Behind a reverse proxy the browser sends Origin matching the proxy's
public URL, which the single-origin CAMELEER_UI_ORIGIN rejects.
New env var accepts comma-separated origins and takes priority over
UI_ORIGIN, which remains as a backwards-compatible fallback.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 00:41:00 +02:00
hsiegeln
ca92b3ce7d feat: add CAMELEER_OIDC_TLS_SKIP_VERIFY to bypass cert verification for OIDC
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m7s
CI / docker (push) Successful in 43s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 36s
Self-signed CA certs on the OIDC provider (e.g. Logto behind a reverse
proxy) cause the login flow to fail because Java's truststore rejects
the connection. This adds an opt-in env var that creates a trust-all
SSLContext scoped to OIDC HTTP calls only (discovery, token exchange,
JWKS fetch) without affecting system-wide TLS.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 00:26:40 +02:00
hsiegeln
3c70313d78 feat: add CAMELEER_OIDC_JWK_SET_URI for direct JWKS fetching
Some checks failed
CI / cleanup-branch (push) Has been skipped
CI / docker (push) Has been cancelled
CI / deploy (push) Has been cancelled
CI / deploy-feature (push) Has been cancelled
CI / build (push) Has been cancelled
When set, fetches JWKs from this URL directly instead of discovering
from the OIDC well-known endpoint. Needed when the public issuer URL
(e.g., https://domain.com/oidc) isn't reachable from inside containers
but the internal URL (http://logto:3001/oidc/jwks) is.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 21:02:51 +02:00
hsiegeln
a5c4e0cead feat: add spring-boot-starter-oauth2-resource-server and OIDC properties
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-05 13:06:53 +02:00
hsiegeln
de85cdf5a2 fix: let SPRING_DATASOURCE_URL fully control datasource connection
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m5s
CI / docker (push) Successful in 41s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 37s
SonarQube / sonarqube (push) Successful in 3m26s
Explicit spring.datasource.url in YAML takes precedence over the env var,
causing deployed containers to connect to localhost instead of the postgres
service. Now the YAML uses ${SPRING_DATASOURCE_URL:...} so the env var
wins when set. Flyway inherits from the datasource (no separate URL).
Removed CAMELEER_DB_SCHEMA — schema is part of the datasource URL.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-04 23:24:22 +02:00
hsiegeln
ac87aa6eb2 fix: derive PG schema from tenant ID instead of defaulting to public
Some checks failed
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m6s
CI / docker (push) Successful in 43s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Failing after 2m17s
Schema now defaults to tenant_${cameleer.tenant.id} (e.g. tenant_default,
tenant_acme) instead of public. Flyway create-schemas: true ensures the
schema is auto-created on first startup. CAMELEER_DB_SCHEMA env var still
available as override for feature branch isolation. Removed hardcoded
public schema from K8s base and main overlay.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-04 21:46:57 +02:00
hsiegeln
a188308ec5 feat: implement multitenancy with tenant isolation + environment support
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m8s
CI / docker (push) Successful in 42s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 1m25s
Adds configurable tenant ID (CAMELEER_TENANT_ID env var, default:
"default") and environment as a first-class concept. Each server
instance serves one tenant with multiple environments.

Changes across 36 files:
- TenantProperties config bean for tenant ID injection
- AgentInfo: added environmentId field
- AgentRegistrationRequest: added environmentId field
- All 9 ClickHouse stores: inject tenant ID, replace hardcoded
  "default" constant, add environment to writes/reads
- ChunkAccumulator: configurable tenant ID + environment resolver
- MergedExecution/ProcessorBatch/BufferedLogEntry: added environment
- ClickHouse init.sql: added environment column to all tables,
  updated ORDER BY (tenant→time→env→app), added tenant_id to
  usage_events, updated all MV GROUP BY clauses
- Controllers: pass environmentId through registration/auto-heal
- K8s deploy: added CAMELEER_TENANT_ID env var
- All tests updated for new signatures

Closes #123

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-04 15:00:18 +02:00
hsiegeln
ac94a67a49 fix: reduce ClickHouse CPU by increasing flush interval, rename LIVE→AUTO labels
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m7s
CI / docker (push) Successful in 1m24s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 41s
- Increase ingestion flush interval from 500ms to 5000ms to reduce MV merge storms
- Reduce ClickHouse background_schedule_pool_size from 8 to 4
- Rename LIVE/PAUSED badge labels to AUTO/MANUAL across all pages
- Update design system to v0.1.29

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-03 22:05:29 +02:00
hsiegeln
6f00ff2e28 fix: reduce ClickHouse log noise, admin query spam, and diagram scan perf
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m7s
CI / docker (push) Successful in 1m25s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 40s
- Set com.clickhouse log level to INFO and org.apache.hc.client5 to WARN
- Admin hooks (useUsers/useGroups/useRoles) now only fetch on admin pages,
  eliminating AUDIT view_users entries on every UI click
- Add ClickHouse projection on route_diagrams for (tenant_id, route_id,
  instance_id, created_at) to avoid full table scans on diagram lookups
- Bump @cameleer/design-system to v0.1.28 (PAUSED mode time range fix,
  refreshTimeRange API)
- Call refreshTimeRange before invalidateQueries in PAUSED mode manual
  refresh so sidebar clicks use current time window

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-03 14:48:30 +02:00
hsiegeln
e495b80432 fix: increase ClickHouse pool size and reduce flush interval
All checks were successful
CI / build (push) Successful in 1m49s
CI / cleanup-branch (push) Has been skipped
CI / docker (push) Successful in 2m10s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 43s
Pool was hardcoded to 10 connections serving 7 concurrent write
streams + UI reads, causing "too many simultaneous queries" and
WriteBuffer overflow. Pool now defaults to 50 (configurable via
clickhouse.pool-size), flush interval reduced from 1000ms to 500ms.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 22:11:15 +02:00
hsiegeln
188810e54b feat: remove TimescaleDB, dead PG stores, and storage feature flags
Some checks failed
CI / cleanup-branch (push) Has been skipped
CI / build (push) Failing after 32s
CI / docker (push) Has been skipped
CI / deploy (push) Has been skipped
CI / deploy-feature (push) Has been skipped
Complete the ClickHouse migration by removing all PostgreSQL analytics
code. PostgreSQL now serves only RBAC, config, and audit — all
observability data is exclusively in ClickHouse.

- Delete 6 dead PostgreSQL store classes (executions, stats, diagrams,
  events, metrics, metrics-query) and 2 integration tests
- Delete RetentionScheduler (ClickHouse TTL handles retention)
- Remove all 7 cameleer.storage.* feature flags from application.yml
- Remove all @ConditionalOnProperty from ClickHouse beans in StorageBeanConfig
- Consolidate 14 Flyway migrations (V1-V14) into single clean V1 with
  only RBAC/config/audit tables (no TimescaleDB, no analytics tables)
- Switch from timescale/timescaledb-ha:pg16 to postgres:16 everywhere
  (docker-compose, deploy/postgres.yaml, test containers)
- Remove TimescaleDB check and /metrics-pipeline from DatabaseAdminController
- Set clickhouse.enabled default to true

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-01 20:10:58 +02:00
hsiegeln
283e38a20d feat: remove OpenSearch, add ClickHouse admin page
Some checks failed
CI / cleanup-branch (push) Has been skipped
CI / build (push) Failing after 33s
CI / docker (push) Has been skipped
CI / deploy (push) Has been skipped
CI / deploy-feature (push) Has been skipped
Remove all OpenSearch code, dependencies, configuration, deployment
manifests, and CI/CD references. Replace the OpenSearch admin page
with a ClickHouse admin page showing cluster status, table sizes,
performance metrics, and indexer pipeline stats.

- Delete 11 OpenSearch Java files (config, search impl, admin controller, DTOs, tests)
- Delete 3 OpenSearch frontend files (admin page, CSS, query hooks)
- Delete deploy/opensearch.yaml K8s manifest
- Remove opensearch Maven dependencies from pom.xml
- Remove opensearch config from application.yml, Dockerfile, docker-compose
- Remove opensearch from CI workflow (secrets, deploy, cleanup steps)
- Simplify ThresholdConfig (remove OpenSearch thresholds, database-only)
- Change default search backend from opensearch to clickhouse
- Add ClickHouseAdminController with /status, /tables, /performance, /pipeline
- Add ClickHouseAdminPage with StatCards, pipeline ProgressBar, tables DataTable
- Update CLAUDE.md, HOWTO.md, and source comments

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-01 18:56:06 +02:00
hsiegeln
95b9dea5c4 feat(clickhouse): wire ClickHouseExecutionStore as active ExecutionStore
Add cameleer.storage.executions feature flag (default: clickhouse).
PostgresExecutionStore activates only when explicitly set to postgres.
Add by-seq snapshot endpoint for iteration-aware processor lookup.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-01 00:09:14 +02:00
hsiegeln
968117c41a feat(clickhouse): wire Phase 4 stores with feature flags
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m7s
CI / docker (push) Successful in 43s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 44s
Add conditional beans for ClickHouseDiagramStore, ClickHouseAgentEventRepository,
and ClickHouseLogStore. All default to ClickHouse (matchIfMissing=true).
PG/OS stores activate only when explicitly configured.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-31 23:44:10 +02:00
hsiegeln
9df00fdde0 feat(clickhouse): wire ClickHouseStatsStore with cameleer.storage.stats feature flag (default: clickhouse)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-31 21:51:45 +02:00
hsiegeln
31f7113b3f feat(clickhouse): wire ChunkAccumulator, flush scheduler, and search feature flag
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-31 19:21:19 +02:00
hsiegeln
8eeaecf6f3 fix: remove unsupported async_insert params from ClickHouse JDBC URL
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m6s
CI / docker (push) Successful in 55s
CI / cleanup-branch (pull_request) Has been skipped
CI / build (pull_request) Successful in 1m39s
CI / deploy (push) Has been skipped
CI / docker (pull_request) Has been skipped
CI / deploy (pull_request) Has been skipped
CI / deploy-feature (push) Successful in 51s
CI / deploy-feature (pull_request) Has been skipped
clickhouse-jdbc 0.9.7 rejects async_insert and wait_for_async_insert as
unknown URL parameters. These are server-side settings, not driver config.
Can be set per-query later if needed via custom_settings.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-31 18:02:53 +02:00
hsiegeln
23f901279a feat: add ClickHouse DataSource and JdbcTemplate configuration
Adds ClickHouseProperties (bound to clickhouse.*), ClickHouseConfig
(conditional HikariDataSource + JdbcTemplate beans), and extends
application.yml with clickhouse.enabled/url/username/password and
cameleer.storage.metrics properties.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-31 16:51:14 +02:00
hsiegeln
7423e2ca14 feat: add application log ingestion with OpenSearch storage
Some checks failed
CI / cleanup-branch (push) Has been skipped
CI / build (push) Failing after 59s
CI / docker (push) Has been skipped
CI / deploy (push) Has been skipped
CI / deploy-feature (push) Has been skipped
Agents can now send application log entries in batches via POST /api/v1/data/logs.
Logs are indexed directly into OpenSearch daily indices (logs-{yyyy-MM-dd}) using
the bulk API. Index template defines explicit mappings for full-text search readiness.

New DTOs (LogEntry, LogBatch) added to cameleer3-common in the agent repo.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-25 11:53:27 +01:00
hsiegeln
82117deaab fix: pass credentials to Flyway when using separate datasource URL
All checks were successful
CI / build (push) Successful in 1m6s
CI / cleanup-branch (push) Has been skipped
CI / docker (push) Successful in 42s
CI / deploy (push) Successful in 40s
CI / deploy-feature (push) Has been skipped
When spring.flyway.url is set independently, Spring Boot does not
inherit credentials from spring.datasource. Add explicit user/password
to both application.yml and K8s deployment to prevent "no password"
failures on feature branch deployments.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-17 13:34:41 +01:00
hsiegeln
247fdb01c0 fix: separate Flyway and app datasource search paths for schema isolation
Some checks failed
CI / build (push) Successful in 1m6s
CI / cleanup-branch (push) Has been skipped
CI / docker (push) Successful in 41s
CI / deploy (push) Failing after 2m19s
CI / deploy-feature (push) Has been skipped
Flyway needs public in the search_path to access TimescaleDB extension
functions (create_hypertable). The app datasource must NOT include public
to prevent accidental cross-schema reads from production data.

- spring.flyway.url: currentSchema=<branch>,public (extensions accessible)
- spring.datasource.url: currentSchema=<branch> (strict isolation)
- SPRING_FLYWAY_URL env var added to K8s base manifest

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-17 13:26:01 +01:00
hsiegeln
b393d262cb refactor: remove OIDC env var config and seeder
All checks were successful
CI / build (push) Successful in 1m7s
CI / cleanup-branch (push) Has been skipped
CI / docker (push) Successful in 41s
CI / deploy (push) Successful in 39s
CI / deploy-feature (push) Has been skipped
OIDC configuration is already fully database-backed (oidc_config table,
admin API, OidcConfigRepository). Remove the redundant env var binding
(SecurityProperties.Oidc), the env-to-DB seeder (oidcConfigSeeder), and
the OIDC section from application.yml.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-17 13:20:35 +01:00
hsiegeln
15f20d22ad feat: add feature branch deployments with per-branch isolation
Some checks failed
CI / build (push) Successful in 1m8s
CI / cleanup-branch (push) Has been skipped
CI / docker (push) Successful in 42s
CI / deploy (push) Failing after 5s
CI / deploy-feature (push) Has been skipped
Enable deploying feature branches into isolated environments on the same
k3s cluster. Each branch gets its own namespace (cam-<slug>), PostgreSQL
schema, and OpenSearch index prefix for data isolation while sharing the
underlying infrastructure.

- Make OpenSearch index prefix and DB schema configurable via env vars
  (defaults preserve existing behavior)
- Restructure deploy/ into Kustomize base + overlays (main/feature)
- Extend CI to build Docker images for all branches, not just main
- Add deploy-feature job with namespace creation, secret copying,
  Traefik Ingress routing (<slug>-api/ui.cameleer.siegeln.net)
- Add cleanup-branch job to remove namespace, PG schema, OS indices
  on branch deletion
- Install required tools (git, jq, curl) in CI deploy containers

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-17 11:35:07 +01:00
hsiegeln
41a9a975fd config: switch datasource to PostgreSQL, add OpenSearch and Flyway config
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-16 18:15:33 +01:00
hsiegeln
a4de2a7b79 Add RBAC with role-based endpoint authorization and OIDC support
Some checks failed
CI / build (push) Successful in 1m19s
CI / docker (push) Successful in 1m38s
CI / deploy (push) Has been cancelled
Implement three-phase security upgrade:

Phase 1 - RBAC: Extend JWT with roles claim, populate Spring
GrantedAuthority in filter, enforce role-based access (AGENT for
data/heartbeat/SSE, VIEWER+ for search/diagrams, OPERATOR+ for
commands, ADMIN for user management). Configurable JWT secret via
CAMELEER_JWT_SECRET env var for token persistence across restarts.

Phase 2 - User persistence: ClickHouse users table with
ReplacingMergeTree, UserRepository interface + ClickHouse impl,
UserAdminController for CRUD at /api/v1/admin/users. Local login
upserts user on each authentication.

Phase 3 - OIDC: Token exchange flow where SPA sends auth code,
server exchanges it server-side (keeping client_secret secure),
validates id_token via JWKS, resolves roles (DB override > OIDC
claim > default), issues internal JWT. Conditional on
CAMELEER_OIDC_ENABLED=true. Uses oauth2-oidc-sdk for standards
compliance.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-14 12:35:45 +01:00
hsiegeln
3eb83f97d3 Add React UI with Execution Explorer, auth, and standalone deployment
Some checks failed
CI / build (push) Failing after 1m53s
CI / docker (push) Has been skipped
CI / deploy (push) Has been skipped
- Scaffold Vite + React + TypeScript frontend in ui/ with full design
  system (dark/light themes) matching the HTML mockups
- Implement Execution Explorer page: search filters, results table with
  expandable processor tree and exchange detail sidebar, pagination
- Add UI authentication: UiAuthController (login/refresh endpoints),
  JWT filter handles ui: subject prefix, CORS configuration
- Shared components: StatusPill, DurationBar, StatCard, AppBadge,
  FilterChip, Pagination — all using CSS Modules with design tokens
- API client layer: openapi-fetch with auth middleware, TanStack Query
  hooks for search/detail/snapshot queries, Zustand for state
- Standalone deployment: Nginx Dockerfile, K8s Deployment + ConfigMap +
  NodePort (30080), runtime config.js for API base URL
- Embedded mode: maven-resources-plugin copies ui/dist into JAR static
  resources, SPA forward controller for client-side routing
- CI/CD: UI build step, Docker build/push for server-ui image, K8s
  deploy step for UI, UI credential secrets

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-13 13:59:22 +01:00
hsiegeln
ac9e8ae4dd feat(04-01): implement security service foundation
- JwtServiceImpl: HMAC-SHA256 via Nimbus JOSE+JWT with ephemeral 256-bit secret
- Ed25519SigningServiceImpl: JDK 17 KeyPairGenerator with ephemeral keypair
- BootstrapTokenValidator: constant-time comparison with dual-token rotation
- SecurityBeanConfig: bean wiring with fail-fast validation for CAMELEER_AUTH_TOKEN
- SecurityProperties: config binding for token expiry and bootstrap tokens
- TestSecurityConfig: permit-all filter chain to keep existing tests green
- application.yml: security config with env var mapping
- All 18 security unit tests pass, all 71 tests pass in full verify

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-11 20:08:30 +01:00
hsiegeln
0372be2334 feat(03-01): add agent registration controller, config, lifecycle monitor
- AgentRegistryConfig: heartbeat, stale, dead, ping, command expiry settings
- AgentRegistryBeanConfig: wires AgentRegistryService as Spring bean
- AgentLifecycleMonitor: @Scheduled lifecycle check + command expiry sweep
- AgentRegistrationController: POST /register, POST /{id}/heartbeat, GET /agents
- Updated Cameleer3ServerApplication with AgentRegistryConfig
- Updated application.yml with agent-registry section and async timeout
- 7 integration tests: register, re-register, heartbeat, list, filter, invalid status

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-11 18:40:57 +01:00
hsiegeln
96c52b88d1 feat(01-01): add ClickHouse dependencies, Docker Compose, schema, and app config
- Add clickhouse-jdbc, springdoc-openapi, actuator, testcontainers deps
- Add slf4j-api to core module
- Create Docker Compose with ClickHouse service on ports 8123/9000
- Create ClickHouse DDL: route_executions, route_diagrams, agent_metrics
- Configure application.yml with datasource, ingestion buffer, springdoc

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-11 11:47:20 +01:00
hsiegeln
db17f02fcc Scaffold cameleer3-server project structure
Some checks failed
CI / build (push) Failing after 3s
Multi-module Maven project (server-core + server-app) with Spring Boot 3.4.3,
Gitea CI workflow, and dependency on cameleer3-common from Gitea Maven registry.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-11 10:06:17 +01:00