Commit Graph

34 Commits

Author SHA1 Message Date
hsiegeln
95b9dea5c4 feat(clickhouse): wire ClickHouseExecutionStore as active ExecutionStore
Add cameleer.storage.executions feature flag (default: clickhouse).
PostgresExecutionStore activates only when explicitly set to postgres.
Add by-seq snapshot endpoint for iteration-aware processor lookup.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-01 00:09:14 +02:00
hsiegeln
968117c41a feat(clickhouse): wire Phase 4 stores with feature flags
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m7s
CI / docker (push) Successful in 43s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 44s
Add conditional beans for ClickHouseDiagramStore, ClickHouseAgentEventRepository,
and ClickHouseLogStore. All default to ClickHouse (matchIfMissing=true).
PG/OS stores activate only when explicitly configured.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-31 23:44:10 +02:00
hsiegeln
af080337f5 feat: comprehensive ClickHouse low-memory tuning and switch all storage to ClickHouse
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m9s
CI / docker (push) Successful in 42s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 58s
Replace partial memory config with full Altinity low-memory guide
settings. Revert container limit from 6Gi back to 4Gi — proper
tuning (mlock=false, reduced caches/pools/threads, disk spill for
aggregations) makes the original budget sufficient.

Switch all storage feature flags to ClickHouse:
- CAMELEER_STORAGE_SEARCH: opensearch → clickhouse
- CAMELEER_STORAGE_METRICS: postgres → clickhouse
- CAMELEER_STORAGE_STATS: already clickhouse

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-31 23:27:10 +02:00
hsiegeln
a669df08bd fix(clickhouse): tune memory settings to prevent OOM on insert
Some checks failed
CI / cleanup-branch (push) Has been skipped
CI / build (push) Failing after 40s
CI / docker (push) Has been skipped
CI / deploy (push) Has been skipped
CI / deploy-feature (push) Has been skipped
ClickHouse 24.12 auto-sizes caches from the cgroup limit, leaving
insufficient headroom for MV processing and background merges.
Adds a custom config that shrinks mark/index/expression caches and
caps per-query memory at 2 GiB. Bumps container limit 4Gi → 6Gi.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-31 22:54:43 +02:00
hsiegeln
9df00fdde0 feat(clickhouse): wire ClickHouseStatsStore with cameleer.storage.stats feature flag (default: clickhouse)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-31 21:51:45 +02:00
hsiegeln
31f7113b3f feat(clickhouse): wire ChunkAccumulator, flush scheduler, and search feature flag
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-31 19:21:19 +02:00
hsiegeln
8eeaecf6f3 fix: remove unsupported async_insert params from ClickHouse JDBC URL
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m6s
CI / docker (push) Successful in 55s
CI / cleanup-branch (pull_request) Has been skipped
CI / build (pull_request) Successful in 1m39s
CI / deploy (push) Has been skipped
CI / docker (pull_request) Has been skipped
CI / deploy (pull_request) Has been skipped
CI / deploy-feature (push) Successful in 51s
CI / deploy-feature (pull_request) Has been skipped
clickhouse-jdbc 0.9.7 rejects async_insert and wait_for_async_insert as
unknown URL parameters. These are server-side settings, not driver config.
Can be set per-query later if needed via custom_settings.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-31 18:02:53 +02:00
hsiegeln
f8505401d7 fix: ClickHouse auth credentials and non-fatal schema init
Some checks failed
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m5s
CI / docker (push) Successful in 43s
CI / deploy (push) Has been skipped
CI / deploy-feature (push) Failing after 13s
CI / cleanup-branch (pull_request) Has been skipped
CI / build (pull_request) Successful in 1m47s
CI / docker (pull_request) Has been skipped
CI / deploy (pull_request) Has been skipped
CI / deploy-feature (pull_request) Has been skipped
- Set CLICKHOUSE_USER/PASSWORD via k8s secret (fixes "disabling network
  access for user 'default'" when no password is set)
- Add clickhouse-credentials secret to CI deploy + feature branch copy
- Pass CLICKHOUSE_USERNAME/PASSWORD env vars to server pod
- Make schema initializer non-fatal so server starts even if CH is
  temporarily unavailable

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-31 17:54:44 +02:00
hsiegeln
59dd629b0e fix: create cameleer database on ClickHouse startup
Some checks failed
CI / cleanup-branch (push) Has been skipped
CI / build (pull_request) Successful in 1m49s
CI / cleanup-branch (pull_request) Has been skipped
CI / docker (pull_request) Has been skipped
CI / deploy (pull_request) Has been skipped
CI / deploy-feature (pull_request) Has been skipped
CI / build (push) Successful in 1m7s
CI / docker (push) Successful in 10s
CI / deploy (push) Has been skipped
CI / deploy-feature (push) Has been cancelled
ClickHouse only has the 'default' database out of the box. The JDBC URL
connects to 'cameleer', so the database must exist before the server starts.
Uses /docker-entrypoint-initdb.d/ init script via ConfigMap.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-31 17:31:17 +02:00
hsiegeln
1b991f99a3 deploy: add ClickHouse StatefulSet and server env vars
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-31 17:08:42 +02:00
hsiegeln
fc412f7251 fix: use relative API URL in feature branch UI to eliminate CORS errors
All checks were successful
CI / build (push) Successful in 1m4s
CI / cleanup-branch (push) Has been skipped
CI / docker (push) Successful in 13s
CI / deploy (push) Has been skipped
CI / deploy-feature (push) Successful in 34s
Browser requests now go to the UI origin and nginx proxies them to the
backend within the cluster. Removes the separate API Ingress host rule
since API traffic no longer needs its own subdomain.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-17 13:49:01 +01:00
hsiegeln
82117deaab fix: pass credentials to Flyway when using separate datasource URL
All checks were successful
CI / build (push) Successful in 1m6s
CI / cleanup-branch (push) Has been skipped
CI / docker (push) Successful in 42s
CI / deploy (push) Successful in 40s
CI / deploy-feature (push) Has been skipped
When spring.flyway.url is set independently, Spring Boot does not
inherit credentials from spring.datasource. Add explicit user/password
to both application.yml and K8s deployment to prevent "no password"
failures on feature branch deployments.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-17 13:34:41 +01:00
hsiegeln
247fdb01c0 fix: separate Flyway and app datasource search paths for schema isolation
Some checks failed
CI / build (push) Successful in 1m6s
CI / cleanup-branch (push) Has been skipped
CI / docker (push) Successful in 41s
CI / deploy (push) Failing after 2m19s
CI / deploy-feature (push) Has been skipped
Flyway needs public in the search_path to access TimescaleDB extension
functions (create_hypertable). The app datasource must NOT include public
to prevent accidental cross-schema reads from production data.

- spring.flyway.url: currentSchema=<branch>,public (extensions accessible)
- spring.datasource.url: currentSchema=<branch> (strict isolation)
- SPRING_FLYWAY_URL env var added to K8s base manifest

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-17 13:26:01 +01:00
hsiegeln
ff3a046f5a refactor: remove OIDC config from K8s manifests
All checks were successful
CI / build (push) Successful in 1m8s
CI / cleanup-branch (push) Has been skipped
CI / docker (push) Successful in 41s
CI / deploy (push) Successful in 37s
CI / deploy-feature (push) Has been skipped
OIDC configuration should be managed by the server itself (database-backed),
not injected via K8s secrets. Remove all CAMELEER_OIDC_* env vars from
deployment manifests and the cameleer-oidc secret from CI. The server
defaults to OIDC disabled via application.yml.

This also fixes the Kustomize strategic merge conflict where the feature
overlay tried to set value on an env var that had valueFrom in the base.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-17 13:12:41 +01:00
hsiegeln
c1cf8ae260 chore: remove old flat deploy manifests superseded by Kustomize
Some checks failed
CI / build (push) Successful in 1m7s
CI / cleanup-branch (push) Has been skipped
CI / docker (push) Successful in 14s
CI / deploy (push) Has been skipped
CI / deploy-feature (push) Failing after 19s
deploy/server.yaml and deploy/ui.yaml are no longer referenced by CI,
which now uses deploy/base/ + deploy/overlays/main/ via kubectl apply -k.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-17 11:52:58 +01:00
hsiegeln
15f20d22ad feat: add feature branch deployments with per-branch isolation
Some checks failed
CI / build (push) Successful in 1m8s
CI / cleanup-branch (push) Has been skipped
CI / docker (push) Successful in 42s
CI / deploy (push) Failing after 5s
CI / deploy-feature (push) Has been skipped
Enable deploying feature branches into isolated environments on the same
k3s cluster. Each branch gets its own namespace (cam-<slug>), PostgreSQL
schema, and OpenSearch index prefix for data isolation while sharing the
underlying infrastructure.

- Make OpenSearch index prefix and DB schema configurable via env vars
  (defaults preserve existing behavior)
- Restructure deploy/ into Kustomize base + overlays (main/feature)
- Extend CI to build Docker images for all branches, not just main
- Add deploy-feature job with namespace creation, secret copying,
  Traefik Ingress routing (<slug>-api/ui.cameleer.siegeln.net)
- Add cleanup-branch job to remove namespace, PG schema, OS indices
  on branch deletion
- Install required tools (git, jq, curl) in CI deploy containers

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-17 11:35:07 +01:00
hsiegeln
c346babe33 fix: correct PostgreSQL mountPath and add external NodePort services
All checks were successful
CI / build (pull_request) Successful in 1m9s
CI / docker (pull_request) Has been skipped
CI / deploy (pull_request) Has been skipped
- Fix postgres.yaml mountPath to /home/postgres/pgdata matching timescaledb-ha PGDATA
- Add NodePort services for external access: PostgreSQL (30432), OpenSearch (30920)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-17 01:00:20 +01:00
hsiegeln
2634f60e59 fix: use timescaledb-ha image in K8s manifest for toolkit support 2026-03-16 19:14:15 +01:00
hsiegeln
ea687a342c deploy: remove obsolete ClickHouse K8s manifest 2026-03-16 19:01:26 +01:00
hsiegeln
a344be3a49 deploy: replace ClickHouse with PostgreSQL/TimescaleDB + OpenSearch in K8s manifests
- Dockerfile: update default SPRING_DATASOURCE_URL to jdbc:postgresql, add OPENSEARCH_URL default env
- deploy/postgres.yaml: new TimescaleDB StatefulSet + headless Service (10Gi PVC, pg_isready probes)
- deploy/opensearch.yaml: new OpenSearch 2.19.0 StatefulSet + headless Service (10Gi PVC, single-node, security disabled)
- deploy/server.yaml: switch datasource env from clickhouse-credentials to postgres-credentials, add OPENSEARCH_URL

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-16 18:58:35 +01:00
a1e1c8f6ff deploy/authentik.yaml aktualisiert
Some checks failed
CI / build (push) Successful in 1m8s
CI / docker (push) Successful in 42s
CI / deploy (push) Failing after 2m9s
change authentik UI port to 30950
2026-03-14 12:52:13 +01:00
hsiegeln
554d6822c0 Add Authentik OIDC provider K8s manifests and wire deployment
Some checks failed
CI / build (push) Successful in 1m11s
CI / docker (push) Successful in 40s
CI / deploy (push) Failing after 8s
- deploy/authentik.yaml: PostgreSQL StatefulSet, Redis, Authentik
  server (NodePort 30900) and worker, all in cameleer namespace
- deploy/server.yaml: Add CAMELEER_JWT_SECRET and CAMELEER_OIDC_*
  env vars from secrets (all optional for backward compat)
- ci.yml: Create authentik-credentials and cameleer-oidc secrets,
  deploy Authentik before the server
- HOWTO.md: Authentik setup instructions, updated architecture
  diagram and Gitea secrets list

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-14 12:45:02 +01:00
hsiegeln
4cdf2ac012 Fix ClickHouse OOM on batch insert: reduce batch size, increase memory
All checks were successful
CI / build (push) Successful in 1m3s
CI / docker (push) Successful in 41s
CI / deploy (push) Successful in 46s
Execution rows are wide (29 cols with serialized arrays/JSON), so 500
rows can exceed ClickHouse's memory limit. Reduce default batch size
from 500 to 100 and bump ClickHouse memory limit from 2Gi to 4Gi.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-13 22:44:03 +01:00
hsiegeln
fc2daddf54 Change UI NodePort from 30080 to 30090
All checks were successful
CI / build (push) Successful in 58s
CI / docker (push) Successful in 39s
CI / deploy (push) Successful in 27s
Port 30080 is already allocated. Updated deploy manifests,
CORS origin, and HOWTO.md references.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-13 14:11:50 +01:00
hsiegeln
3eb83f97d3 Add React UI with Execution Explorer, auth, and standalone deployment
Some checks failed
CI / build (push) Failing after 1m53s
CI / docker (push) Has been skipped
CI / deploy (push) Has been skipped
- Scaffold Vite + React + TypeScript frontend in ui/ with full design
  system (dark/light themes) matching the HTML mockups
- Implement Execution Explorer page: search filters, results table with
  expandable processor tree and exchange detail sidebar, pagination
- Add UI authentication: UiAuthController (login/refresh endpoints),
  JWT filter handles ui: subject prefix, CORS configuration
- Shared components: StatusPill, DurationBar, StatCard, AppBadge,
  FilterChip, Pagination — all using CSS Modules with design tokens
- API client layer: openapi-fetch with auth middleware, TanStack Query
  hooks for search/detail/snapshot queries, Zustand for state
- Standalone deployment: Nginx Dockerfile, K8s Deployment + ConfigMap +
  NodePort (30080), runtime config.js for API base URL
- Embedded mode: maven-resources-plugin copies ui/dist into JAR static
  resources, SPA forward controller for client-side routing
- CI/CD: UI build step, Docker build/push for server-ui image, K8s
  deploy step for UI, UI credential secrets

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-13 13:59:22 +01:00
hsiegeln
9c2391e5d4 Move ClickHouse credentials to K8s Secret and add health probes
All checks were successful
CI / build (push) Successful in 41s
CI / docker (push) Successful in 13s
CI / deploy (push) Successful in 38s
- ClickHouse user/password now injected via `clickhouse-credentials` Secret
  instead of hardcoded plaintext in deploy manifests (#33)
- CI deploy step creates the secret idempotently from Gitea CI secrets
- Added liveness/readiness probes: server uses /api/v1/health, ClickHouse
  uses /ping (#35)
- Updated HOWTO.md and CLAUDE.md with new secrets and probe details

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-13 10:59:15 +01:00
hsiegeln
88da1a0dd8 Fix ClickHouse OOM during Proxmox nightly backups
All checks were successful
CI / build (push) Successful in 48s
CI / docker (push) Successful in 1m36s
CI / deploy (push) Successful in 20s
Increase ClickHouse memory limit from 1Gi to 2Gi and reduce default
batch size from 5000 to 500. During VM backup snapshots, I/O contention
prevents ClickHouse from flushing writes fast enough, causing buffer
accumulation that exceeds the 1Gi container limit.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-13 09:55:46 +01:00
hsiegeln
48bdb46760 Server fully owns ClickHouse schema — create database + tables on startup
All checks were successful
CI / build (push) Successful in 48s
CI / docker (push) Successful in 39s
CI / deploy (push) Successful in 9s
ClickHouseConfig.ensureDatabaseExists() connects without the database path
to run CREATE DATABASE IF NOT EXISTS before the main DataSource is used.
Removes the ConfigMap-based init scripts from the K8s manifest — the server
is now the single owner of all ClickHouse schema management.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 21:41:35 +01:00
hsiegeln
f7ed91ef9c Use fully qualified cameleer3.* table names in ClickHouse init schema
All checks were successful
CI / build (push) Successful in 47s
CI / docker (push) Successful in 38s
CI / deploy (push) Successful in 10s
Init scripts run against the default database, not CLICKHOUSE_DB.
Prefix all table references with cameleer3.* and add CREATE DATABASE.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 21:38:09 +01:00
hsiegeln
5576b50a3a Add ClickHouse schema init via ConfigMap + docker-entrypoint-initdb.d
All checks were successful
CI / build (push) Successful in 47s
CI / docker (push) Successful in 37s
CI / deploy (push) Successful in 8s
Mounts the schema SQL files as a ConfigMap into ClickHouse's init
directory so tables are created automatically on fresh starts. All
statements use IF NOT EXISTS so they're safe to re-run. This ensures
the schema exists even if the PVC is lost or the pod is recreated.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 21:29:58 +01:00
hsiegeln
a1280609f6 Add NodePort service to expose ClickHouse externally
All checks were successful
CI / build (push) Successful in 1m21s
CI / docker (push) Successful in 38s
CI / deploy (push) Successful in 6s
HTTP on port 30123, native protocol on port 30900. Keeps the existing
headless service for internal StatefulSet DNS.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 21:16:45 +01:00
hsiegeln
9dffa9ea81 Move schema initialization from ClickHouse init scripts to server startup
All checks were successful
CI / build (push) Successful in 49s
CI / docker (push) Successful in 43s
CI / deploy (push) Successful in 15s
Server now applies schema via @PostConstruct using classpath SQL files.
All statements use IF NOT EXISTS/IF NOT EXISTS so it's idempotent and
safe to run on every startup. Removes ConfigMap and init script mount
from K8s manifest since ClickHouse no longer needs to manage the schema.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 19:59:33 +01:00
hsiegeln
129b97183a Use fully qualified table names in ClickHouse init scripts
All checks were successful
CI / build (push) Successful in 49s
CI / docker (push) Successful in 40s
CI / deploy (push) Successful in 13s
ClickHouse Docker entrypoint runs init scripts against the default
database, not the one specified by CLICKHOUSE_DB. Prefix all table
names with cameleer3. to ensure they're created in the right database.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 19:54:45 +01:00
hsiegeln
c228c3201b Add Docker build, K8s manifests, and CI/CD deploy pipeline
Some checks failed
CI / docker (push) Has been cancelled
CI / deploy (push) Has been cancelled
CI / build (push) Has been cancelled
- Dockerfile: multi-stage build with $BUILDPLATFORM for native Maven
  builds on ARM64 runners, amd64 runtime target. Passes REGISTRY_TOKEN
  build arg for cameleer3-common dependency resolution.
- K8s manifests: ClickHouse StatefulSet with init scripts ConfigMap,
  server Deployment + NodePort (30081)
- CI: docker job (QEMU + buildx cross-compile, registry cache,
  provenance=false, old image cleanup) + deploy job (kubectl)
- .dockerignore for build context optimization

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 19:01:39 +01:00