cameleer-server

Author	SHA1	Message	Date
hsiegeln	e2f784bf82	fix: deduplicate processor stats using uniq(execution_id) Processor execution counts were inflated by duplicate inserts into the plain MergeTree processor_executions table (chunk retries, reconnects). Replace count()/countIf() with uniq(execution_id)/uniqIf() in both stats_1m_processor and stats_1m_processor_detail MVs so each exchange is counted once per processor regardless of duplicates. Tables are dropped and rebuilt from raw data on startup. MV created after backfill to avoid double-counting. Also adds stats_1m_processor_detail to the catalog purge list (was missing). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-12 23:12:00 +02:00
hsiegeln	6bf7175a6c	feat: add Micrometer Prometheus metrics to server Adds micrometer-registry-prometheus and exposes /api/v1/prometheus endpoint (unauthenticated for scraping). ServerMetrics component provides business metrics beyond default JVM/HTTP: Gauges: agents by state, SSE connections, buffer depths (execution, processor, log, metrics), accumulator pending exchanges. Counters: ingestion drops (buffer_full, no_agent, no_identity), agent transitions (went_stale, went_dead, recovered), deployment outcomes (running, failed, degraded), auth failures (invalid_token, revoked, oidc_rejected). Timers: ClickHouse flush duration by type, deployment duration. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-12 18:23:27 +02:00
hsiegeln	90c82238a0	feat: add orphaned app cleanup — auto-filter stale discovered apps, manual dismiss with data purge Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-12 16:19:59 +02:00
hsiegeln	f4bbc1f65f	feat: add detected_runtime_type and detected_main_class to app_versions Flyway V10 migration adds the two nullable columns. AppVersion record, AppVersionRepository interface, and PostgresAppVersionRepository are updated to carry and persist detected runtime information. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-12 13:01:24 +02:00
hsiegeln	b03dfee4f3	feat: log forwarding v2 — accept List<LogEntry>, add source field Replace LogBatch wrapper with raw List<LogEntry> on the ingestion endpoint. Add source column to ClickHouse logs table and propagate it through the storage, search, and HTTP layers (LogSearchRequest, LogEntryResult, LogEntryResponse, LogQueryController). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-12 10:25:46 +02:00
hsiegeln	293d11e52b	feat: add infrastructureendpoints flag with conditional DB/CH controllers Add cameleer.server.security.infrastructureendpoints property (default true) and @ConditionalOnProperty to DatabaseAdminController and ClickHouseAdminController so the SaaS provisioner can set CAMELEER_SERVER_SECURITY_INFRASTRUCTUREENDPOINTS=false to suppress these endpoints (404) on tenant server containers. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-11 23:09:28 +02:00
hsiegeln	350e769948	Group container settings under cameleer.server.runtime.container.* Move container resource defaults into their own sub-namespace for future extensibility: cameleer.server.runtime.container.memorylimit → CAMELEER_SERVER_RUNTIME_CONTAINER_MEMORYLIMIT cameleer.server.runtime.container.cpushares → CAMELEER_SERVER_RUNTIME_CONTAINER_CPUSHARES Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-11 21:33:07 +02:00
hsiegeln	534e936cd4	Group OIDC settings under cameleer.server.security.oidc.* Move OIDC properties into a nested Oidc class within SecurityProperties for clearer grouping. Env vars gain an extra separator: cameleer.server.security.oidc.issueruri → CAMELEER_SERVER_SECURITY_OIDC_ISSUERURI cameleer.server.security.oidc.jwkseturi → CAMELEER_SERVER_SECURITY_OIDC_JWKSETURI cameleer.server.security.oidc.audience → CAMELEER_SERVER_SECURITY_OIDC_AUDIENCE cameleer.server.security.oidc.tlsskipverify → CAMELEER_SERVER_SECURITY_OIDC_TLSSKIPVERIFY Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-11 21:30:33 +02:00
hsiegeln	60fb5fe21a	Remove vestigial clickhouse.enabled flag ClickHouse is the only storage backend — there is no alternative. The enabled flag created a false sense of optionality: setting it to false would crash on startup because most beans unconditionally depend on the ClickHouse JdbcTemplate. Remove all @ConditionalOnProperty annotations gating ClickHouse beans, the enabled property from application.yml, and the K8s manifest entry. Also fix old property names in AbstractPostgresIT test config. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-11 21:27:10 +02:00
hsiegeln	8fe48bbf02	Migrate config to cameleer.server.* naming convention Move all configuration properties under the cameleer.server.* namespace with all-lowercase dot-separated names and mechanical env var mapping (dots→underscores, uppercase). This aligns with the agent's convention (cameleer.agent.) and establishes a predictable pattern across all components. Changes: - Move 6 config prefixes under cameleer.server.: agent-registry, ingestion, security, license, clickhouse, and cameleer.tenant/runtime/indexer - Rename all kebab-case properties to concatenated lowercase (e.g., bootstrap-token → bootstraptoken, jar-storage-path → jarstoragepath) - Update all env vars to CAMELEER_SERVER_* mechanical mapping - Fix container-cpu-request/container-cpu-shares mismatch bug - Remove displayName from AgentRegistrationRequest (redundant with instanceId) - Update agent container env vars to CAMELEER_AGENT_* convention - Update K8s manifests and CI workflow for new env var names - Update CLAUDE.md, HOWTO.md, SERVER-CAPABILITIES.md documentation Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-11 18:10:51 +02:00
hsiegeln	827ba3c798	feat: last-ADMIN guard and password hardening (#87, #89) - Prevent removal of last ADMIN role via role unassign, user delete, or group role removal (returns 409 Conflict) - Add password policy: min 12 chars, 3/4 character classes, no username - Add brute-force protection: 5 attempts then 15min lockout, IP rate limit - Add token revocation on password change via token_revoked_before column - V9 migration adds failed_login_attempts, locked_until, token_revoked_before Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-09 08:58:03 +02:00
hsiegeln	2df5e0d7ba	feat: active config snapshot, composite StatusDot with tooltip Part 1 — Config snapshot: - V8 migration adds resolved_config JSONB to deployments table - DeploymentExecutor saves the full resolved config at deploy time - Deployment record includes resolvedConfig for auditability Part 2 — Composite health StatusDot: - CatalogController computes composite health from deployment status + agent health (green only when RUNNING AND agent live) - CatalogApp includes healthTooltip (e.g. "Deployment: RUNNING, Agents: live (1 connected)") - StatusDot added to app detail header with deployment status Badge - StatusDot added to deployment table rows - Sidebar passes composite health + tooltip through to tree nodes Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-09 08:00:54 +02:00
hsiegeln	20f3dfe59d	feat: support Docker volume-based JAR mounting for Docker-in-Docker When CAMELEER_JAR_DOCKER_VOLUME is set, the orchestrator mounts the named volume at the jar storage path instead of using a host bind mount. This solves the path translation issue in Docker-in-Docker setups where the server runs inside a container and manages sibling containers. The entrypoint is overridden to use the volume-mounted JAR path via the CAMELEER_APP_JAR env var. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-08 21:38:34 +02:00
hsiegeln	6e444a414d	feat: add CAMELEER_SERVER_URL config property Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-08 20:28:44 +02:00
hsiegeln	0fccdb636f	feat(db): add V7 deployment orchestration migration Adds target_state, deployment_strategy, replica_states (JSONB), and deploy_stage columns to the deployments table with backfill logic.	2026-04-08 20:15:01 +02:00
hsiegeln	7e47f1628d	feat: JAR retention policy with nightly cleanup job Per-environment "keep last N versions" setting (default 5, null for unlimited). Nightly scheduled job at 03:00 deletes old versions from both database and disk, skipping any version that is currently deployed. Full stack: - V6 migration: adds jar_retention_count column to environments - Environment record, repository, service, admin controller endpoint - JarRetentionJob: @Scheduled nightly, iterates environments and apps - UI: retention policy editor on admin Environments page with toggle between limited/unlimited and version count input - AppVersionRepository.delete() for version cleanup Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-08 19:06:28 +02:00
hsiegeln	0b2d231b6b	feat: split config into 4 tabs and fix JAR upload 413 Config sub-tabs are now: Monitoring | Traces & Taps | Route Recording | Resources (renamed from Agent/Infrastructure, with traces and recording as their own tabs). Also increase Spring multipart max-file-size and max-request-size to 200MB to fix HTTP 413 on JAR uploads. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-08 18:22:39 +02:00
hsiegeln	b1b7e142bb	fix: remove duplicate updated_at column from V5 migration apps.updated_at already exists from V3. The duplicate ALTER caused Flyway to fail on startup. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-08 16:31:06 +02:00
hsiegeln	875062e59a	feat: add container config to apps and default config to environments - V5 migration: container_config JSONB + updated_at on apps, default_container_config JSONB on environments - App/Environment records updated with new fields - PUT /apps/{id}/container-config endpoint for per-app config - PUT /admin/environments/{id}/default-container-config for env defaults - GET /apps now supports optional environmentId (lists all when omitted) - AppRepository.findAll() for cross-environment app listing Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-08 16:18:08 +02:00
hsiegeln	2e006051bc	feat: add production/enabled flags to environments, drop status enum Environments now have: - production (bool): prod vs non-prod resource allocation - enabled (bool): disabled blocks new deployments Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-08 11:16:09 +02:00
hsiegeln	3d20d7a0cb	feat: add runtime management configuration properties - JAR storage path, base image, Docker network - Container memory/CPU limits, health check timeout - Routing mode and domain for Traefik integration Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-07 23:47:43 +02:00
hsiegeln	85530d5ea3	feat: add runtime management database schema (environments, apps, versions, deployments) - environments, apps, app_versions, deployments tables - Default environment seeded on migration - Foreign keys with CASCADE delete Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-07 23:40:18 +02:00
hsiegeln	b969075007	feat: add license loading at startup from env var or file - LicenseBeanConfig wires LicenseGate bean with startup validation - Supports token from CAMELEER_LICENSE_TOKEN env var or CAMELEER_LICENSE_FILE path - Falls back to open mode when no license or no public key configured - Add license config properties to application.yml	2026-04-07 23:11:02 +02:00
hsiegeln	bd78207060	feat: add claim mapping rules table and origin tracking to RBAC assignments - Add origin and mapping_id columns to user_roles and user_groups - Create claim_mapping_rules table with match_type and action constraints - Update primary keys to include origin column - Add indexes for fast managed assignment cleanup Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-07 23:07:30 +02:00
hsiegeln	083cb8b9ec	feat: add CAMELEER_CORS_ALLOWED_ORIGINS for multi-origin CORS support Behind a reverse proxy the browser sends Origin matching the proxy's public URL, which the single-origin CAMELEER_UI_ORIGIN rejects. New env var accepts comma-separated origins and takes priority over UI_ORIGIN, which remains as a backwards-compatible fallback. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-06 00:41:00 +02:00
hsiegeln	ca92b3ce7d	feat: add CAMELEER_OIDC_TLS_SKIP_VERIFY to bypass cert verification for OIDC Self-signed CA certs on the OIDC provider (e.g. Logto behind a reverse proxy) cause the login flow to fail because Java's truststore rejects the connection. This adds an opt-in env var that creates a trust-all SSLContext scoped to OIDC HTTP calls only (discovery, token exchange, JWKS fetch) without affecting system-wide TLS. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-06 00:26:40 +02:00
hsiegeln	3c70313d78	feat: add CAMELEER_OIDC_JWK_SET_URI for direct JWKS fetching When set, fetches JWKs from this URL directly instead of discovering from the OIDC well-known endpoint. Needed when the public issuer URL (e.g., https://domain.com/oidc) isn't reachable from inside containers but the internal URL (http://logto:3001/oidc/jwks) is. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-05 21:02:51 +02:00
hsiegeln	a5c4e0cead	feat: add spring-boot-starter-oauth2-resource-server and OIDC properties Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-05 13:06:53 +02:00
hsiegeln	de85cdf5a2	fix: let SPRING_DATASOURCE_URL fully control datasource connection Explicit spring.datasource.url in YAML takes precedence over the env var, causing deployed containers to connect to localhost instead of the postgres service. Now the YAML uses ${SPRING_DATASOURCE_URL:...} so the env var wins when set. Flyway inherits from the datasource (no separate URL). Removed CAMELEER_DB_SCHEMA — schema is part of the datasource URL. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-04 23:24:22 +02:00
hsiegeln	ac87aa6eb2	fix: derive PG schema from tenant ID instead of defaulting to public Schema now defaults to tenant_${cameleer.tenant.id} (e.g. tenant_default, tenant_acme) instead of public. Flyway create-schemas: true ensures the schema is auto-created on first startup. CAMELEER_DB_SCHEMA env var still available as override for feature branch isolation. Removed hardcoded public schema from K8s base and main overlay. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-04 21:46:57 +02:00
hsiegeln	a188308ec5	feat: implement multitenancy with tenant isolation + environment support Adds configurable tenant ID (CAMELEER_TENANT_ID env var, default: "default") and environment as a first-class concept. Each server instance serves one tenant with multiple environments. Changes across 36 files: - TenantProperties config bean for tenant ID injection - AgentInfo: added environmentId field - AgentRegistrationRequest: added environmentId field - All 9 ClickHouse stores: inject tenant ID, replace hardcoded "default" constant, add environment to writes/reads - ChunkAccumulator: configurable tenant ID + environment resolver - MergedExecution/ProcessorBatch/BufferedLogEntry: added environment - ClickHouse init.sql: added environment column to all tables, updated ORDER BY (tenant→time→env→app), added tenant_id to usage_events, updated all MV GROUP BY clauses - Controllers: pass environmentId through registration/auto-heal - K8s deploy: added CAMELEER_TENANT_ID env var - All tests updated for new signatures Closes #123 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-04 15:00:18 +02:00
hsiegeln	ac94a67a49	fix: reduce ClickHouse CPU by increasing flush interval, rename LIVE→AUTO labels - Increase ingestion flush interval from 500ms to 5000ms to reduce MV merge storms - Reduce ClickHouse background_schedule_pool_size from 8 to 4 - Rename LIVE/PAUSED badge labels to AUTO/MANUAL across all pages - Update design system to v0.1.29 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-03 22:05:29 +02:00
hsiegeln	d4327af6a4	refactor: consolidate ClickHouse schema into single init.sql, cache diagrams - Merge all V1-V11 migration scripts into one idempotent init.sql - Simplify ClickHouseSchemaInitializer to load single file - Replace route_diagrams projection with in-memory caches: hashCache (routeId+instanceId → contentHash) warm-loaded on startup, graphCache (contentHash → RouteGraph) lazy-populated on access - Eliminates 9M+ row scans on diagram lookups Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-03 15:24:53 +02:00
hsiegeln	bb3e1e2bc3	fix: set deduplicate_merge_projection_mode for ReplacingMergeTree projection ClickHouse 24.12 requires this setting before adding projections to ReplacingMergeTree tables. Using 'drop' mode which discards the projection during deduplication merges and rebuilds it afterward. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-03 15:14:56 +02:00
hsiegeln	6f00ff2e28	fix: reduce ClickHouse log noise, admin query spam, and diagram scan perf - Set com.clickhouse log level to INFO and org.apache.hc.client5 to WARN - Admin hooks (useUsers/useGroups/useRoles) now only fetch on admin pages, eliminating AUDIT view_users entries on every UI click - Add ClickHouse projection on route_diagrams for (tenant_id, route_id, instance_id, created_at) to avoid full table scans on diagram lookups - Bump @cameleer/design-system to v0.1.28 (PAUSED mode time range fix, refreshTimeRange API) - Call refreshTimeRange before invalidateQueries in PAUSED mode manual refresh so sidebar clicks use current time window Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-03 14:48:30 +02:00
hsiegeln	e495b80432	fix: increase ClickHouse pool size and reduce flush interval Pool was hardcoded to 10 connections serving 7 concurrent write streams + UI reads, causing "too many simultaneous queries" and WriteBuffer overflow. Pool now defaults to 50 (configurable via clickhouse.pool-size), flush interval reduced from 1000ms to 500ms. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-02 22:11:15 +02:00
hsiegeln	805e6d51cb	fix: add processor_type to stats_1m_processor_detail MV The table and materialized view were missing the processor_type column, causing the RouteMetricsController query to fail and the dashboard processor metrics table to render empty. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-01 22:00:23 +02:00
hsiegeln	9781fe0d7c	fix: include execution/correlation/exchange IDs in full-text search The _search_text materialized column only contained error messages, bodies, and headers — not execution_id, correlation_id, exchange_id, or route_id. Searching by ID via cmd-k returned no results. - Add ID fields to _search_text in ClickHouse DDL (covered by ngram bloom filter index) - Add direct LIKE matches on execution_id, correlation_id, exchange_id in the text search WHERE clause for faster exact ID lookups Requires ClickHouse table recreation (fresh install). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-01 21:12:15 +02:00
hsiegeln	188810e54b	feat: remove TimescaleDB, dead PG stores, and storage feature flags Complete the ClickHouse migration by removing all PostgreSQL analytics code. PostgreSQL now serves only RBAC, config, and audit — all observability data is exclusively in ClickHouse. - Delete 6 dead PostgreSQL store classes (executions, stats, diagrams, events, metrics, metrics-query) and 2 integration tests - Delete RetentionScheduler (ClickHouse TTL handles retention) - Remove all 7 cameleer.storage.* feature flags from application.yml - Remove all @ConditionalOnProperty from ClickHouse beans in StorageBeanConfig - Consolidate 14 Flyway migrations (V1-V14) into single clean V1 with only RBAC/config/audit tables (no TimescaleDB, no analytics tables) - Switch from timescale/timescaledb-ha:pg16 to postgres:16 everywhere (docker-compose, deploy/postgres.yaml, test containers) - Remove TimescaleDB check and /metrics-pipeline from DatabaseAdminController - Set clickhouse.enabled default to true Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-01 20:10:58 +02:00
hsiegeln	283e38a20d	feat: remove OpenSearch, add ClickHouse admin page Remove all OpenSearch code, dependencies, configuration, deployment manifests, and CI/CD references. Replace the OpenSearch admin page with a ClickHouse admin page showing cluster status, table sizes, performance metrics, and indexer pipeline stats. - Delete 11 OpenSearch Java files (config, search impl, admin controller, DTOs, tests) - Delete 3 OpenSearch frontend files (admin page, CSS, query hooks) - Delete deploy/opensearch.yaml K8s manifest - Remove opensearch Maven dependencies from pom.xml - Remove opensearch config from application.yml, Dockerfile, docker-compose - Remove opensearch from CI workflow (secrets, deploy, cleanup steps) - Simplify ThresholdConfig (remove OpenSearch thresholds, database-only) - Change default search backend from opensearch to clickhouse - Add ClickHouseAdminController with /status, /tables, /performance, /pipeline - Add ClickHouseAdminPage with StatCards, pipeline ProgressBar, tables DataTable - Update CLAUDE.md, HOWTO.md, and source comments Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-01 18:56:06 +02:00
hsiegeln	aa2d203f4e	feat: add UI usage analytics tracking Tracks authenticated UI user requests to understand usage patterns: - New ClickHouse usage_events table with 90-day TTL - UsageTrackingInterceptor captures method, path, duration, user - Path normalization groups dynamic segments ({id}, {hash}) - Buffered writes via WriteBuffer + periodic flush - Admin endpoint GET /api/v1/admin/usage with groupBy=endpoint|user|hour - Skips agent requests, health checks, and data ingestion Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-01 17:53:32 +02:00
hsiegeln	d739094a56	fix: update ClickHouse DDL files with new column names instead of ALTER RENAME ClickHouse can't rename columns that are part of ORDER BY keys. Updated V1-V8 DDL files directly with new column names (instance_id, application_id) and removed V9 migration. Wipe ClickHouse and restart. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-01 12:40:54 +02:00
hsiegeln	91400defe9	fix: add missing V9 (ClickHouse) and V14 (PostgreSQL) identity column rename migrations Migration files were lost during worktree merge — recreated. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-01 12:33:02 +02:00
hsiegeln	95b9dea5c4	feat(clickhouse): wire ClickHouseExecutionStore as active ExecutionStore Add cameleer.storage.executions feature flag (default: clickhouse). PostgresExecutionStore activates only when explicitly set to postgres. Add by-seq snapshot endpoint for iteration-aware processor lookup. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-01 00:09:14 +02:00
hsiegeln	968117c41a	feat(clickhouse): wire Phase 4 stores with feature flags Add conditional beans for ClickHouseDiagramStore, ClickHouseAgentEventRepository, and ClickHouseLogStore. All default to ClickHouse (matchIfMissing=true). PG/OS stores activate only when explicitly configured. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-31 23:44:10 +02:00
hsiegeln	f7daadaaa9	feat(clickhouse): add DDL for route_diagrams, agent_events, and logs tables Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-31 23:30:38 +02:00
hsiegeln	606f81a970	fix: align server with protocol v2 chunked transport spec - ChunkIngestionController: /data/chunks → /data/executions (matches PROTOCOL.md endpoint the agent actually posts to) - ExecutionController: conditional on ClickHouse being disabled to avoid mapping conflict - Persist originalExchangeId and replayExchangeId from ExecutionChunk envelope through to ClickHouse (was silently dropped) - V5 migration adds the two new columns to executions table Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-31 23:18:35 +02:00
hsiegeln	1a00eed389	fix: schema initializer skips comment-only SQL segments The V4 DDL had a semicolon inside a comment which caused the split-on-semicolon logic to produce a comment-only segment that ClickHouse rejected as empty query. Fixed the comment and made the initializer strip comment-only segments before execution. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-31 22:06:31 +02:00
hsiegeln	9df00fdde0	feat(clickhouse): wire ClickHouseStatsStore with cameleer.storage.stats feature flag (default: clickhouse) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-31 21:51:45 +02:00
hsiegeln	052990bb59	feat(clickhouse): add ClickHouseStatsStore with -Merge aggregate queries Implements StatsStore interface for ClickHouse using AggregatingMergeTree tables with -Merge combinators (countMerge, countIfMerge, sumMerge, quantileMerge). Uses literal SQL for aggregate table queries to avoid ClickHouse JDBC driver PreparedStatement issues with AggregateFunction columns. Raw table queries (SLA, topErrors, activeErrorTypes) use normal prepared statements. Includes 13 integration tests covering stats, timeseries, grouped timeseries, SLA compliance, SLA counts by app/route, top errors, active error types, punchcard, and processor stats. Also fixes AggregateFunction type signatures in V4 DDL (count() takes no args, countIf takes UInt8). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-31 21:49:22 +02:00

102 Commits