cameleer-server

Author	SHA1	Message	Date
hsiegeln	1ddae94930	feat(runtime): init-container loader pattern + withUsernsMode (#152 hardening close) Tasks 9+10+11 of the init-container-jar-fetch plan, landed atomically because 9 alone leaves the orchestrator+executor referencing removed ContainerRequest fields. ContainerRequest (core) drops jarPath/jarVolumeName/jarVolumeMountPath; adds appVersionId, artifactDownloadUrl, artifactExpectedSize, loaderImage. DockerRuntimeOrchestrator (app): - per-replica named volume "cameleer-jars-{containerName}" - phase 1: loader container with the volume mounted RW at /app/jars, ARTIFACT_URL + ARTIFACT_EXPECTED_SIZE env, full hardening contract - block on waitContainerCmd().awaitStatusCode(120s); on non-zero exit remove the loader, remove the volume, propagate RuntimeException so DeploymentExecutor marks the deployment FAILED. main is never created. - phase 2: main container with the same volume mounted RO at /app/jars - withUsernsMode("host:1000:65536") on BOTH containers — closes the last open hardening gap from issue #152 - main entrypoint paths point at /app/jars/app.jar - extracted baseHardenedHostConfig() so loader and main share the cap_drop / security_opt / readonly / pids / tmpfs contract - removeContainer() also removes the per-replica volume so blue/green doesn't leak volumes DeploymentExecutor (app): - injects ArtifactDownloadTokenSigner; new @Value props loaderimage, artifacttokenttlseconds, artifactbaseurl - replaces the temporary getVersion(...).jarPath() bridge with a signed URL ${artifactBaseUrl}/api/v1/artifacts/{id}?exp&sig - drops the Files.exists pre-flight check; AppVersion.jarSizeBytes is the size-of-record check now - drops jarDockerVolume / jarStoragePath @Value fields and the volume plumbing in startReplica - DeployCtx carries appVersionId / artifactUrl / artifactExpectedSize in place of jarPath Tests: - DockerRuntimeOrchestratorHardeningTest updated for the new shape; captures HostConfig on the MAIN container and asserts cap_drop ALL + no-new-privileges + apparmor + readonly + pids + tmpfs + the new withUsernsMode("host:1000:65536") - DockerRuntimeOrchestratorLoaderTest (new): verifies volume create → loader create with RW bind → loader started → awaited → loader removed → main create with RO bind → main started; verifies abort + cleanup on loader exit != 0 (loader removed, volume removed, main NEVER created); verifies userns_mode applied to both containers. Config: - application.yml replaces jardockervolume with loaderimage, artifacttokenttlseconds, artifactbaseurl Rules updated: .claude/rules/docker-orchestration.md (loader pattern, userns, no more bind-mount); .claude/rules/core-classes.md (ContainerRequest field map). Test counts after change: - cameleer-server-core: 116/116 unit tests pass - cameleer-server-app: 273/273 unit tests pass Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-27 16:06:56 +02:00
hsiegeln	20aefd5bf6	feat(license): Flyway V5 — license table + environments retention columns Per-tenant license row stores the signed token, licenseId for audit, installed/expires/last_validated timestamps. environments gains three INTEGER NOT NULL DEFAULT 1 retention columns (execution, log, metric) so existing rows land inside the default-tier cap. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-26 11:02:44 +02:00
hsiegeln	f6b76b2d5e	docs(runtime): document hardening contract and runtime override (#152) Surfaces the multi-tenant container hardening contract introduced in the prior commit so operators and integrators know what is enforced and why. - application.yml: declare `cameleer.server.runtime.dockerruntime` alongside the other runtime properties (empty = auto-detect runsc). - HOWTO.md: add the override row to the Runtime config table. - SERVER-CAPABILITIES.md: new "Multi-Tenant Runtime Sandboxing" section describing the cap_drop, no-new-privileges, AppArmor, read-only rootfs, pids_limit, /tmp tmpfs, and runsc auto-detect contract — plus the on-disk state caveat that motivates issue #153. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-25 21:06:10 +02:00
hsiegeln	48ce75bf38	feat(server): persist server self-metrics into ClickHouse Snapshot the full Micrometer registry (cameleer business metrics, alerting metrics, and Spring Boot Actuator defaults) every 60s into a new server_metrics table so server health survives restarts without an external Prometheus. Includes a dashboard-builder reference for the SaaS team. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-23 23:20:45 +02:00
hsiegeln	21db92ff00	fix(traefik): make TLS cert resolver configurable, omit when unset Previously `TraefikLabelBuilder` hardcoded `tls.certresolver=default` on every router. That assumes a resolver literally named `default` exists in the Traefik static config — true for ACME-backed installs, false for dev/local installs that use a file-based TLS store. Traefik logs "Router uses a nonexistent certificate resolver" for the bogus resolver on every managed app, and any future attempt to define a differently-named real resolver would silently skip these routers. Server-wide setting via `CAMELEER_SERVER_RUNTIME_CERTRESOLVER` (empty by default) flows through `ConfigMerger.GlobalRuntimeDefaults.certResolver` into `ResolvedContainerConfig.certResolver`. When blank the `tls.certresolver` label is omitted entirely; `tls=true` is still emitted so Traefik serves the default TLS-store cert. When set, the label is emitted with the configured resolver name. Not per-app/per-env configurable: there is one Traefik per server instance and one resolver config; app-level override would only let users break their own routers. TDD: TraefikLabelBuilderTest gains 3 cases (resolver set, null, blank). Full unit suite 211/0/0. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-23 18:18:47 +02:00
hsiegeln	35748ea7a1	feat(deploy): V4 migration — add created_by to deployments	2026-04-23 11:44:05 +02:00
hsiegeln	ff95187707	db(deploy): add deployments.deployed_config_snapshot column (V3) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-22 21:23:46 +02:00
hsiegeln	c2eab71a31	env(admin): per-environment color field + V2 migration - V2__add_environment_color.sql adds a CHECK-constrained VARCHAR color column (default 'slate'); existing rows backfill to slate. - Environment record + EnvironmentColor constants (8 preset values) flow through repository, service, and admin API. - UpdateEnvironmentRequest.color nullable: null preserves existing; unknown values → 400. - ITs cover valid / invalid / null-preserves behaviour; existing Environment constructor call-sites updated with the new color arg. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-22 19:24:30 +02:00
hsiegeln	e6dcad1e07	config(app): silence MustacheAutoConfiguration templates-dir warning jmustache on the classpath (for alert notification templates) triggers Spring Boot's MustacheAutoConfiguration, which warns about the missing classpath:/templates/ folder we don't use. Disable its check. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-22 18:47:46 +02:00
hsiegeln	eda74b7339	docs(alerting): PER_EXCHANGE exactly-once — fireMode reference + deploy-backlog-cap Fix stale `AGGREGATE` label (actual enum: `COUNT_IN_WINDOW`). Expand EXCHANGE_MATCH section with both fire modes, PER_EXCHANGE config-surface restrictions (0 for reNotifyMinutes/forDurationSeconds, at-least-one-sink rule), exactly-once guarantee scope, and the first-run backlog-cap knob. Surface the new config in application.yml with the 24h default and the opt-out-to-0 semantics. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-22 18:39:49 +02:00
hsiegeln	a9a6b465d4	fix(stats): close 8 ClickHouseStatsStoreIT TZ failures (bucket DateTime('UTC') + JVM UTC pin) Two-layer fix for the TZ drift that caused stats reads to miss every row when the JVM default TZ and CH session TZ disagreed: - Insert side: ClickHouse JDBC 0.9.7 formats java.sql.Timestamp via Timestamp.toString(), which uses JVM default TZ. A CEST JVM shipping to a UTC CH server stored Unix timestamps off by the TZ offset (the triage report's original symptom). Pinned JVM default to UTC in CameleerServerApplication.main() — standard practice for observability servers that push to time-series stores. - Read side: stats_1m_* tables now declare bucket as DateTime('UTC'), MV SELECTs wrap toStartOfMinute(start_time) in toDateTime(..., 'UTC') so projections match column type, and ClickHouseStatsStore.lit(Instant) emits toDateTime('...', 'UTC') rather than a bare literal — defence in depth against future refactors. Test class pins its own JVM TZ (the store IT builds its own HikariDataSource, bypassing the main() path). Debug scaffolding from the triage investigation removed. Greenfield CH — no migration needed. Verified: 14/14 ClickHouseStatsStoreIT green, plus 84/84 across all ClickHouse IT classes (no regression from the JVM TZ default change). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-21 23:25:22 +02:00
hsiegeln	90083f886a	refactor(schema): collapse V1..V18 into single V1__init.sql baseline The project is still greenfield (no production deployment) so this is the last safe moment to flatten the migration archaeology before the checksum history starts mattering for real. Schema changes - 18 migration files (531 lines) → one V1__init.sql (~380 lines) declaring the final end-state: RBAC + claim mappings + runtime management + config + audit + outbound + alerting, plus seed data (system roles, Admins group, default environment). - Drops the data-repair statements from V14 (firemode backfill), V16 (subjectFingerprint migration), V17 (ACKNOWLEDGED → FIRING coercion) — they were no-ops on any DB that starts at V1. - Declares condition_kind_enum with AGENT_LIFECYCLE from the start (was added retroactively by V18). - Declares alert_state_enum with three values only (was five, then swapped in V17) and alert_instances with read_at / deleted_at columns from day one (was added by V17). - alert_reads table never created (V12 created, V17 dropped). - alert_instances_open_rule_uq built with the V17 predicate from the start. Test changes - Replace V12MigrationIT / V17MigrationIT / V18MigrationIT with one SchemaBootstrapIT that asserts the combined invariants: tables present, alert_reads absent, enum value sets, alert_instances has read_at + deleted_at, open_rule_uq exists and is unique, env-delete cascade fires. Verification - pg_dump of the new V1 matches the pg_dump of V1..V18 applied in sequence (bytewise modulo column order and Postgres-auto FK names). - Full alerting IT suite (53 tests across 6 classes) green against the new schema. - The 47 pre-existing test failures on main (AgentRegistrationIT, SearchControllerIT, ClickHouseStatsStoreIT, …) are unrelated and fail identically without this change. Developer impact - Existing local DBs will fail checksum validation on boot. Wipe: docker compose down -v (or drop the tenant_default schema). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-21 20:52:22 +02:00
hsiegeln	b7d201d743	fix(alerts): add AGENT_LIFECYCLE to condition_kind_enum + readable error toasts Backend - V18 migration adds AGENT_LIFECYCLE to condition_kind_enum. Java ConditionKind enum shipped with this value but no Postgres migration extended the type, so any AGENT_LIFECYCLE rule insert failed with "invalid input value for enum condition_kind_enum". - ALTER TYPE ... ADD VALUE lives alone in its migration per Postgres constraint that the new value cannot be referenced in the same tx. - V18MigrationIT asserts the enum now contains all 7 kinds. Frontend - Add describeApiError(e) helper to unwrap openapi-fetch error bodies (Spring error JSON) into readable strings. String(e) on a plain object rendered "[object Object]" in toasts — the actual failure reason was hidden from the user. - Replace String(e) in all 13 toast descriptions across the alerting and outbound-connection mutation paths. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-21 20:23:14 +02:00
hsiegeln	e95c21d0cb	feat(alerts): V17 migration — drop ACKNOWLEDGED, add read_at + deleted_at Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-21 17:04:09 +02:00
hsiegeln	414f7204bf	feat(alerting): AGENT_LIFECYCLE condition kind with per-subject fire mode Allows alert rules to fire on agent-lifecycle events — REGISTERED, RE_REGISTERED, DEREGISTERED, WENT_STALE, WENT_DEAD, RECOVERED — rather than only on current state. Each matching `(agent, eventType, timestamp)` becomes its own ackable AlertInstance, so outages on distinct agents are independently routable. Core: - New `ConditionKind.AGENT_LIFECYCLE` + `AgentLifecycleCondition` record (scope, eventTypes, withinSeconds). Compact ctor rejects empty eventTypes and withinSeconds<1. - Strict allowlist enum `AgentLifecycleEventType` (six entries matching the server-emitted types in `AgentRegistrationController` and `AgentLifecycleMonitor`). Custom agent-emitted event types tracked in backlog issue #145. - `AgentEventRepository.findInWindow(env, appSlug, agentId, eventTypes, from, to, limit)` — new read path ordered `(timestamp ASC, insert_id ASC)` used by the evaluator. Implemented on `ClickHouseAgentEventRepository` with tenant + env filter mandatory. App: - `AgentLifecycleEvaluator` queries events in the last `withinSeconds` window and returns `EvalResult.Batch` with one `Firing` per row. Every Firing carries a canonical `_subjectFingerprint` of `"<agentId>:<eventType>:<tsMillis>"` in context plus `agent` / `event` subtrees for Mustache templating. - `NotificationContextBuilder` gains an `AGENT_LIFECYCLE` branch that exposes `{{agent.id}}`, `{{agent.app}}`, `{{event.type}}`, `{{event.timestamp}}`, `{{event.detail}}`. - Validation is delegated to the record compact ctor + enum at Jackson deserialization time — matches the existing policy of keeping controller validators focused on env-scoped / SQL-injection concerns. Schema: - V16 migration generalises the V15 per-exchange discriminator on `alert_instances_open_rule_uq` to prefer `_subjectFingerprint` with a fallback to the legacy `exchange.id` expression. Scalar kinds still resolve to `''` and keep one-open-per-rule. Duplicate-key path in `PostgresAlertInstanceRepository.save` is unchanged — the index is the deduper. UI: - New `AgentLifecycleForm.tsx` wizard form with multi-select chips for the six allowed event types + `withinSeconds` input. Wired into `ConditionStep`, `form-state` (validation + defaults: WENT_DEAD, 300 s), and `enums.ts` options. Tests in `enums.test.ts` pin the new option array. - `alert-variables.ts` registers `{{agent.app}}`, `{{event.type}}`, `{{event.timestamp}}`, `{{event.detail}}` leaves for the new kind, and extends `agent.id`'s availability list to include `AGENT_LIFECYCLE`. Tests (all passing): - 5 new JSON-roundtrip cases on `AlertConditionJsonTest` (positive + empty/zero/unknown-type rejection). - 5 new evaluator unit tests on `AgentLifecycleEvaluatorTest` (empty window, multi-agent fingerprint shape, scope forwarding, missing env). - `NotificationContextBuilderTest` switch now covers the new kind. - 119 alerting unit tests + 71 UI tests green. Docs: `.claude/rules/{core,app,ui}` and CLAUDE.md migration list updated.	2026-04-21 14:52:08 +02:00
hsiegeln	037a27d405	fix(alerting): allow multiple open alert_instances per rule for PER_EXCHANGE V13 added a partial unique index on alert_instances(rule_id) WHERE state IN (PENDING,FIRING,ACKNOWLEDGED). Correct for scalar condition kinds (ROUTE_METRIC / AGENT_STATE / DEPLOYMENT_STATE / LOG_PATTERN / JVM_METRIC / EXCHANGE_MATCH in COUNT_IN_WINDOW) but wrong for EXCHANGE_MATCH / PER_EXCHANGE, which by design emits one alert_instance per matching exchange. Under V13 every PER_EXCHANGE tick with >1 match logged "Skipped duplicate open alert_instance for rule …" at evaluator cadence and silently lost alert fidelity — only the first matching exchange per tick got an AlertInstance + webhook dispatch. V15 drops the rule_id-only constraint and recreates it with a discriminator on context->'exchange'->>'id'. Scalar kinds emit Map.of() as context, so their expression resolves to '' — "one open per rule" preserved. ExchangeMatchEvaluator.evaluatePerExchange always populates exchange.id, so per-exchange instances coexist cleanly. Two new PostgresAlertInstanceRepositoryIT tests: - multiple open instances for same rule + distinct exchanges all land - second open for identical (rule, exchange) still dedups via the DuplicateKeyException fallback in save() — defense-in-depth kept Also fixes pre-existing PostgresAlertReadRepositoryIT brokenness: its setup() inserted 3 open instances sharing one rule_id, which V13 blocked on arrival. Migrate to one rule_id per instance (pattern already used across other storage ITs). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-20 22:26:19 +02:00
hsiegeln	efa8390108	fix(alerting): reject null fireMode on ExchangeMatchCondition + repair in-flight rows The rule editor wizard reset the condition payload on kind-change without seeding a fireMode default; the ExchangeMatchCondition ctor allowed null to pass through; AlertEvaluatorJob then NPE-looped every tick on a saved rule. - core: compact ctor now rejects null fireMode (Jackson-deser path only — all production callers already pass a value). - V14: repair existing EXCHANGE_MATCH rows with fireMode=null to PER_EXCHANGE + perExchangeLingerSeconds=300 (default matches the wizard). - ui: ConditionStep.onKindChange seeds EXCHANGE_MATCH defaults so the Select's displayed fallback ("Per exchange") is actually in form state. - ui: validateStep('condition', ...) now enforces fireMode presence + the mode-specific fields before the user reaches Review. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-20 20:05:55 +02:00
hsiegeln	7e79ff4d98	fix(alerting/I-2): add unique partial index on alert_instances(rule_id) for open states V13 migration creates alert_instances_open_rule_uq — a partial unique index on (rule_id) WHERE state IN ('PENDING','FIRING','ACKNOWLEDGED'), preventing duplicate open instances per rule. PostgresAlertInstanceRepository.save() catches DuplicateKeyException and returns the existing open instance instead of failing. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-20 08:26:07 +02:00
hsiegeln	8bf45d5456	fix(alerting): use ALTER TABLE MODIFY SETTING to enable projections on executions ReplacingMergeTree Investigated three approaches for CH 24.12: - Inline SETTINGS on ADD PROJECTION: rejected (UNKNOWN_SETTING — not a query-level setting). - ALTER TABLE MODIFY SETTING deduplicate_merge_projection_mode='rebuild': works; persists in table metadata across connection restarts; runs before ADD PROJECTION in the SQL script. - Session-level JDBC URL param: not pursued (MODIFY SETTING is strictly better). alerting_projections.sql now runs MODIFY SETTING before the two executions ADD PROJECTIONs. AlertingProjectionsIT strengthened to assert all four projections (including alerting_app_status and alerting_route_status on executions) exist after schema init. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-20 07:36:55 +02:00
hsiegeln	63669bd1d7	docs(alerting): default config + admin guide Adds alerting stanza to application.yml with all AlertingProperties fields backed by env-var overrides. Creates docs/alerting.md covering six condition kinds (with example JSON), template variables, webhook setup (Slack/PagerDuty examples), silence patterns, circuit-breaker and retention troubleshooting, and Prometheus metrics reference. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-19 22:16:38 +02:00
hsiegeln	7c0e94a425	feat(alerting): ClickHouse projections for alerting read paths Adds alerting_projections.sql with four projections (alerting_app_status, alerting_route_status on executions; alerting_app_level on logs; alerting_instance_metric on agent_metrics). ClickHouseSchemaInitializer now runs both init.sql and alerting_projections.sql, with ADD PROJECTION and MATERIALIZE treated as non-fatal — executions (ReplacingMergeTree) requires deduplicate_merge_projection_mode=rebuild which is unavailable via JDBC pool. MergeTree projections (logs, agent_metrics) always succeed and are asserted in IT. Column names confirmed from init.sql: logs uses 'application' (not application_id), agent_metrics uses 'collected_at' (not timestamp). All column names match the plan. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-19 19:18:58 +02:00
hsiegeln	59e76bdfb6	feat(alerting): V12 flyway migration for alerting tables	2026-04-19 18:28:09 +02:00
hsiegeln	380ccb102b	fix(outbound): align user FK with users(user_id) TEXT schema V11 migration referenced users(id) as uuid, but V1 users table has user_id as TEXT primary key. Amending V11 and the OutboundConnection record before Task 7's integration tests catch this at Flyway startup. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-19 16:18:12 +02:00
hsiegeln	000e9d2847	feat(http): ApacheOutboundHttpClientFactory with memoization and startup validation Adds ApacheOutboundHttpClientFactory (Apache HttpClient 5) that memoizes CloseableHttpClient instances keyed on effective TLS + timeout config, and OutboundHttpConfig (@ConfigurationProperties) that validates trusted CA paths at startup and exposes OutboundHttpClientFactory as a Spring bean. TRUST_ALL mode disables both cert validation (TrustAllManager in SslContextBuilder) and hostname verification (NoopHostnameVerifier on SSLConnectionSocketFactoryBuilder). WireMock HTTPS integration test covers trust-all bypass, system-default PKIX rejection, and client memoization. OIDC audit: OidcProviderHelper and OidcTokenExchanger use Nimbus SDK's own HTTP layer (DefaultResourceRetriever for JWKS, HTTPRequest.send() for token exchange) plus the bespoke InsecureTlsHelper for TLS skip-verify; neither uses OutboundHttpClientFactory. Retrofit deferred to a separate follow-up per plan §20. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-19 16:03:56 +02:00
hsiegeln	ffdfd6cd9a	feat(outbound): add HTTPS CHECK constraint on outbound_connections.url Defense-in-depth per code review. DTO layer already validates HTTPS at save time; this DB-level check guards against future code paths that might bypass the DTO validator. Mustache template variables in the URL (e.g., {{env.slug}}) remain valid since only the scheme prefix is constrained.	2026-04-19 15:37:35 +02:00
hsiegeln	116038262a	feat(outbound): V11 flyway migration for outbound_connections table	2026-04-19 15:33:39 +02:00
hsiegeln	89c9b53edd	fix(pagination): add insert_id UUID tiebreak to cursor keyset Same-millisecond rows were silently skipped between pages because the log cursor had no tiebreak and the events cursor tied by instance_id (which also collides when one instance emits multiple events within a millisecond). Add an insert_id UUID (DEFAULT generateUUIDv4()) column to both logs and agent_events, order by (timestamp, insert_id) consistently, and encode the cursor as 'timestamp|insert_id'. Existing data is materialized via ALTER TABLE MATERIALIZE COLUMN (one-time background mutation). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-17 14:25:36 +02:00
hsiegeln	9b1ef51d77	feat!: scope per-app config and settings by environment BREAKING: wipe dev PostgreSQL before deploying — V1 checksum changes. Agents must now send environmentId on registration (400 if missing). Two tables previously keyed on app name alone caused cross-environment data bleed: writing config for (app=X, env=dev) would overwrite the row used by (app=X, env=prod) agents, and agent startup fetches ignored env entirely. - V1 schema: application_config and app_settings are now PK (app, env). - Repositories: env-keyed finders/saves; env is the authoritative column, stamped on the stored JSON so the row agrees with itself. - ApplicationConfigController.getConfig is dual-mode — AGENT role uses JWT env claim (agents cannot spoof env); non-agent callers provide env via ?environment= query param. - AppSettingsController endpoints now require ?environment=. - SensitiveKeysAdminController fan-out iterates (app, env) slices so each env gets its own merged keys. - DiagramController ingestion stamps env on TaggedDiagram; ClickHouse route_diagrams INSERT + findProcessorRouteMapping are env-scoped. - AgentRegistrationController: environmentId is required on register; removed all "default" fallbacks from register/refresh/heartbeat auto-heal. - UI hooks (useApplicationConfig, useProcessorRouteMapping, useAppSettings, useAllAppSettings, useUpdateAppSettings) take env, wired to useEnvironmentStore at all call sites. - New ConfigEnvIsolationIT covers env-isolation for both repositories. Plan in docs/superpowers/plans/2026-04-16-environment-scoping.md. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-16 22:25:21 +02:00
hsiegeln	04da0af4bc	feat: add route_catalog table to ClickHouse schema	2026-04-16 18:45:38 +02:00
hsiegeln	cb3ebfea7c	chore: rename cameleer3 to cameleer Rename Java packages from com.cameleer3 to com.cameleer, module directories from cameleer3-* to cameleer-*, and all references throughout workflows, Dockerfiles, docs, migrations, and pom.xml. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-15 15:28:42 +02:00

30 Commits
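
Illustration for commit 21db92ff00 (not part of the repository): the rule described there, that `tls=true` is always emitted while `tls.certresolver` is emitted only when a resolver name is configured, might be sketched roughly as below. This is a minimal sketch under assumptions: the class `TlsLabelSketch`, the method `tlsLabels`, and the `routerName` parameter are hypothetical, and the real `TraefikLabelBuilder` code is not shown in this log. The label keys follow Traefik's documented `traefik.http.routers.<name>.tls` naming.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration only; not the project's TraefikLabelBuilder.
final class TlsLabelSketch {
    private TlsLabelSketch() {}

    // Builds the TLS-related router labels for one router.
    // certResolver may be null or blank, meaning "no resolver configured".
    static Map<String, String> tlsLabels(String routerName, String certResolver) {
        Map<String, String> labels = new LinkedHashMap<>();
        String prefix = "traefik.http.routers." + routerName + ".tls";

        // Always enable TLS so Traefik can serve the default TLS-store certificate.
        labels.put(prefix, "true");

        // Emit the resolver label only when a non-blank name is configured.
        if (certResolver != null && !certResolver.isBlank()) {
            labels.put(prefix + ".certresolver", certResolver);
        }
        return labels;
    }
}
```

The three inputs named in the commit's test note (resolver set, null, blank) map onto the single `if` above: only the first adds the `certresolver` label.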