SilenceMatcherService.matches() evaluates AND semantics across ruleId,
severity, appSlug, routeId, agentId constraints. Null fields are wildcards.
Scope-based constraints (appSlug/routeId/agentId) return false when rule is
null (deleted rule — scope cannot be verified). 17 unit tests.
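A minimal sketch of the matching semantics described above, assuming simplified shapes. SilenceSpec, AlertRef, and RuleScope are hypothetical stand-ins for illustration, not the real SilenceMatcherService types:

```java
import java.util.Objects;

// Hypothetical, simplified shapes for illustration only.
record AlertRef(String ruleId, String severity) {}
record RuleScope(String appSlug, String routeId, String agentId) {}

record SilenceSpec(String ruleId, String severity,
                   String appSlug, String routeId, String agentId) {

    // A null constraint is a wildcard; otherwise the values must be equal.
    private static boolean ok(String constraint, String actual) {
        return constraint == null || Objects.equals(constraint, actual);
    }

    private boolean hasScopeConstraint() {
        return appSlug != null || routeId != null || agentId != null;
    }

    // rule is null when the alert's rule has been deleted.
    boolean matches(AlertRef alert, RuleScope rule) {
        if (!ok(ruleId, alert.ruleId()) || !ok(severity, alert.severity())) return false;
        if (!hasScopeConstraint()) return true;
        if (rule == null) return false; // scope cannot be verified
        return ok(appSlug, rule.appSlug())
            && ok(routeId, rule.routeId())
            && ok(agentId, rule.agentId());
    }
}
```

An all-null SilenceSpec matches every alert, including alerts whose rule was deleted, because no scope constraint needs verifying.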
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Sentinel-substitution approach: unresolved {{x.y.z}} tokens are replaced
with a unique NUL-delimited sentinel before Mustache compilation, rendered
as opaque text, then post-replaced with the original {{x.y.z}} literal.
Errors from malformed templates (unclosed {{) are caught and the raw template is returned.
Never throws. 9 unit tests.
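The three steps above can be sketched as follows. This is an illustration under stated assumptions: engineRender is a tiny regex-based stand-in for the Mustache compile/render step, and SentinelRenderer and the index-based sentinel format are hypothetical names and choices, not the actual implementation:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class SentinelRenderer {
    private static final Pattern TOKEN = Pattern.compile("\\{\\{\\s*([\\w.]+)\\s*\\}\\}");

    static String render(String template, Map<String, String> vars) {
        try {
            // 1. Swap unresolved tokens for NUL-delimited sentinels.
            List<String> kept = new ArrayList<>();
            Matcher m = TOKEN.matcher(template);
            StringBuilder guarded = new StringBuilder();
            while (m.find()) {
                String replacement = m.group();
                if (!vars.containsKey(m.group(1))) {
                    replacement = "\0" + kept.size() + "\0";
                    kept.add(m.group());
                }
                m.appendReplacement(guarded, Matcher.quoteReplacement(replacement));
            }
            m.appendTail(guarded);
            // 2. Render; the sentinels pass through as opaque text.
            String out = engineRender(guarded.toString(), vars);
            // 3. Put the original {{x.y.z}} literals back.
            for (int i = 0; i < kept.size(); i++) {
                out = out.replace("\0" + i + "\0", kept.get(i));
            }
            return out;
        } catch (RuntimeException e) {
            return template; // never throws: fall back to the raw template
        }
    }

    // Stand-in for the template engine (assumption): rejects an unclosed "{{".
    static String engineRender(String template, Map<String, String> vars) {
        if (TOKEN.matcher(template).replaceAll("").contains("{{")) {
            throw new IllegalArgumentException("unclosed tag");
        }
        return TOKEN.matcher(template)
                .replaceAll(r -> Matcher.quoteReplacement(vars.get(r.group(1))));
    }
}
```

With vars containing only "name", the template "Hi {{name}}, see {{x.y.z}}" keeps the {{x.y.z}} token verbatim, and "Hi {{name" comes back unchanged.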
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Declared in cameleer-server-core pom (canonical location for unit-testable
rendering without Spring) and mirrored in cameleer-server-app pom so the
app module compiles standalone without a full reactor install.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adds alerting_projections.sql with four projections (alerting_app_status,
alerting_route_status on executions; alerting_app_level on logs;
alerting_instance_metric on agent_metrics). ClickHouseSchemaInitializer now
runs both init.sql and alerting_projections.sql, with ADD PROJECTION and
MATERIALIZE treated as non-fatal — executions (ReplacingMergeTree) requires
deduplicate_merge_projection_mode=rebuild which is unavailable via JDBC pool.
MergeTree projections (logs, agent_metrics) always succeed and are asserted in IT.
Column names confirmed from init.sql: logs uses 'application' (not application_id),
agent_metrics uses 'collected_at' (not timestamp). All column names match the plan.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adds AlertMatchSpec record (core) and ClickHouseSearchIndex.countExecutionsForAlerting —
no FINAL, no text subqueries. Filters by tenant, env, app, route, status, time window,
and optional after-cursor. Attributes (JSON string column) use inlined JSONExtractString
key literals since ClickHouse JDBC does not bind ? placeholders inside JSON functions.
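Inlining a key literal into SQL needs a guard against injection. A minimal sketch of one way to do that, assuming the column is named attributes, that keys can be restricted to a conservative character set, and that the compared value is still bound outside the JSON function (AttributeFilterSql and the allowed pattern are hypothetical, not taken from the implementation):

```java
import java.util.regex.Pattern;

final class AttributeFilterSql {
    // Assumption: attribute keys only need letters, digits, '_', '.', and '-'.
    private static final Pattern SAFE_KEY = Pattern.compile("[A-Za-z0-9_.-]{1,128}");

    // Builds a predicate with the key inlined and the value left as a bind parameter.
    static String predicate(String key) {
        if (key == null || !SAFE_KEY.matcher(key).matches()) {
            throw new IllegalArgumentException("invalid attribute key");
        }
        return "JSONExtractString(attributes, '" + key + "') = ?";
    }
}
```

Rejecting anything outside the allowlist avoids having to reason about quote escaping for the inlined literal.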
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adds countLogs(LogSearchRequest) — no FINAL, no cursor/sort/limit —
reusing the same WHERE-clause logic as search() for tenant, env, app,
level, q, logger, source, exchangeId, and time-range filters.
Also extends ClickHouseTestHelper with executeInitSqlWithProjections()
and makes the script runner non-fatal for ADD/MATERIALIZE PROJECTION.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Implements AlertInstanceRepository: save (upsert), findById, findOpenForRule,
listForInbox (3-way OR: user/group/role via && array-overlap + ANY), countUnreadForUser
(LEFT JOIN alert_reads), ack, resolve, markSilenced, deleteResolvedBefore.
Integration test covers all 9 scenarios including inbox fan-out across all
three target types. Also adds @JsonIgnoreProperties(ignoreUnknown=true) to
SilenceMatcher so the property Jackson derives from isWildcard() is ignored on
deserialization instead of breaking the JSON round trip.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Replaces the Plan 01 stub that returned [] with a real call through
AlertRuleRepository.findRuleIdsByOutboundConnectionId. Adds AlertingBeanConfig
exposing the AlertRuleRepository bean; widens OutboundBeanConfig constructor
to inject it. Delete and narrow-envs guards now correctly block when rules
reference a connection.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Implements AlertRuleRepository with JSONB condition/webhooks/eval_state
serialization via ObjectMapper, UPSERT on conflict, JSONB containment
query for findRuleIdsByOutboundConnectionId, and FOR UPDATE SKIP LOCKED
claim-polling for horizontal scale.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Replace hard-coded 'u1' user_id with per-test UUID to prevent PK collision on re-runs
- Add @AfterEach null-safe cleanup for environments and users rows
- Use containsExactlyInAnyOrder for enum assertions to catch misspelled names
- Add a slug suffix on environment insert to avoid slug uniqueness conflicts on re-runs
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- OutboundConnectionRequest compact ctor: avoid NPE if tlsTrustMode is null
(defense-in-depth alongside @NotNull Bean Validation).
- Add operatorCannotTest IT case to lock the ADMIN-only contract on
POST /{id}/test — was previously untested.
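A minimal sketch of a null-tolerant compact constructor. TlsTrustModeSketch, ConnectionRequestSketch, and the choice to default to SYSTEM_DEFAULT are assumptions for illustration; the real OutboundConnectionRequest may handle null differently:

```java
// Hypothetical enum; only TRUST_ALL and a system-default mode are implied by the log.
enum TlsTrustModeSketch { SYSTEM_DEFAULT, TRUST_ALL }

record ConnectionRequestSketch(String url, TlsTrustModeSketch tlsTrustMode) {
    ConnectionRequestSketch {
        // Assumption: normalize null here so later code never dereferences it,
        // even if Bean Validation has not run on this instance.
        if (tlsTrustMode == null) {
            tlsTrustMode = TlsTrustModeSketch.SYSTEM_DEFAULT;
        }
    }
}
```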
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Shared Spring test context meant seeded test-admin/test-operator/test-viewer/test-alice
users persisted across IT classes, breaking FlywayMigrationIT's "users is empty" assertion.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
POST /{id}/test issues a synthetic probe against the connection URL.
TLS protocol/cipher/peer-cert details stubbed for now (Plan 02 follow-up).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
New audit categories: OUTBOUND_CONNECTION_CHANGE, OUTBOUND_HTTP_TRUST_CHANGE.
Controller-level @PreAuthorize defaults to ADMIN; GETs relaxed to ADMIN|OPERATOR.
SecurityConfig permits OPERATOR GETs on /api/v1/admin/outbound-connections/**.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
V11 migration referenced users(id) as uuid, but the V1 users table has
user_id as a TEXT primary key. Amends V11 and the OutboundConnection
record now, before Task 7's integration tests would hit the mismatch at Flyway startup.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds ApacheOutboundHttpClientFactory (Apache HttpClient 5) that memoizes
CloseableHttpClient instances keyed on effective TLS + timeout config, and
OutboundHttpConfig (@ConfigurationProperties) that validates trusted CA paths
at startup and exposes OutboundHttpClientFactory as a Spring bean.
TRUST_ALL mode disables both cert validation (TrustAllManager in SslContextBuilder)
and hostname verification (NoopHostnameVerifier on SSLConnectionSocketFactoryBuilder).
WireMock HTTPS integration test covers trust-all bypass, system-default PKIX rejection,
and client memoization.
OIDC audit: OidcProviderHelper and OidcTokenExchanger use Nimbus SDK's own HTTP layer
(DefaultResourceRetriever for JWKS, HTTPRequest.send() for token exchange) plus the
bespoke InsecureTlsHelper for TLS skip-verify; neither uses OutboundHttpClientFactory.
Retrofit deferred to a separate follow-up per plan §20.
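The memoization described above can be sketched with a value-based key and computeIfAbsent. HttpConfigKey and MemoizingClientFactory are hypothetical names, and the client type is left generic so the sketch does not depend on Apache HttpClient:

```java
import java.time.Duration;
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical key: a record gives value-based equals/hashCode for free,
// so two requests with the same effective config map to the same entry.
record HttpConfigKey(String trustMode, List<String> caPaths,
                     Duration connectTimeout, Duration readTimeout) {}

final class MemoizingClientFactory<C> {
    private final ConcurrentHashMap<HttpConfigKey, C> cache = new ConcurrentHashMap<>();
    private final Function<HttpConfigKey, C> builder;

    MemoizingClientFactory(Function<HttpConfigKey, C> builder) {
        this.builder = builder;
    }

    // Builds a client at most once per distinct key, also under concurrent access.
    C clientFor(HttpConfigKey key) {
        return cache.computeIfAbsent(key, builder);
    }
}
```

Keying on the effective configuration rather than on the connection id lets connections with identical TLS and timeout settings share one client.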
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Defense-in-depth per code review. DTO layer already validates HTTPS at save
time; this DB-level check guards against future code paths that might bypass
the DTO validator. Mustache template variables in the URL (e.g., {{env.slug}})
remain valid since only the scheme prefix is constrained.
Same-millisecond rows were silently skipped between pages because the
log cursor had no tiebreak and the events cursor tied by instance_id
(which also collides when one instance emits multiple events within a
millisecond). Add an insert_id UUID (DEFAULT generateUUIDv4()) column
to both logs and agent_events, order by (timestamp, insert_id)
consistently, and encode the cursor as 'timestamp|insert_id'. Existing
data is materialized via ALTER TABLE MATERIALIZE COLUMN (one-time
background mutation).
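To illustrate why the tiebreak matters, here is an in-memory sketch of keyset paging over (timestamp, insert_id). LogRow and KeysetPager are hypothetical, ascending order is assumed, and the real query runs in ClickHouse rather than over a Java list:

```java
import java.util.Comparator;
import java.util.List;

record LogRow(long timestampMillis, String insertId) {
    // Total order: timestamp first, insert_id breaks same-millisecond ties.
    static final Comparator<LogRow> ORDER =
            Comparator.comparingLong(LogRow::timestampMillis).thenComparing(LogRow::insertId);

    String cursor() {
        return timestampMillis + "|" + insertId;
    }
}

final class KeysetPager {
    // Returns up to limit rows strictly after the cursor tuple (null cursor = first page).
    static List<LogRow> page(List<LogRow> all, String cursor, int limit) {
        LogRow after = cursor == null ? null : parse(cursor);
        return all.stream()
                .sorted(LogRow.ORDER)
                .filter(r -> after == null || LogRow.ORDER.compare(r, after) > 0)
                .limit(limit)
                .toList();
    }

    static LogRow parse(String cursor) {
        int sep = cursor.indexOf('|');
        return new LogRow(Long.parseLong(cursor.substring(0, sep)), cursor.substring(sep + 1));
    }
}
```

With three rows sharing one timestamp and a page size of two, the second page still returns the third row; a cursor on the timestamp alone would skip it.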
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
AgentEventService.queryEvents, AgentEventRepository.query, and the
ClickHouse implementation have had no callers since /agents/events
became cursor-paginated. Remove them along with their dedicated IT
tests. queryPage and its tests remain as the single query path.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Extends the existing ApiExceptionHandler @RestControllerAdvice to map
DateTimeParseException and IllegalArgumentException to 400 Bad Request.
Logs and agent-events endpoints both parse ISO-8601 query params and
previously leaked parse failures as internal server errors. All
IllegalArgumentException throw sites in production code are
input-validation usages (slug validation, containerConfig validation,
cursor decoding), so mapping to 400 is correct across the board.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
JDBC Timestamp binding shifted timestamps by the JVM local timezone
offset on both insert and query, producing asymmetric UTC offsets that
broke time-range filtering and cursor pagination. Switching inserts
(indexBatch, insertBufferedBatch) and all WHERE predicates to ISO-8601
strings via parseDateTime64BestEffort, and reading timestamps back as
epoch-millis via toUnixTimestamp64Milli, pins everything to UTC and
fixes the time-range filter test plus cursor pagination.
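A minimal sketch of the two conversions on the Java side, assuming the string is bound as a parameter to parseDateTime64BestEffort and the long comes from toUnixTimestamp64Milli (UtcTimestamps is a hypothetical helper name):

```java
import java.time.Instant;
import java.time.format.DateTimeFormatter;

final class UtcTimestamps {
    // ISO-8601 in UTC (trailing 'Z'); independent of the JVM default time zone.
    static String toSqlParam(Instant t) {
        return DateTimeFormatter.ISO_INSTANT.format(t);
    }

    // Epoch millis carry no zone, so no offset can be applied on the way back.
    static Instant fromEpochMillis(long millis) {
        return Instant.ofEpochMilli(millis);
    }
}
```

Neither direction goes through java.sql.Timestamp, so the JVM's local offset never enters the conversion.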
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Wraps DateTimeParseException from Instant.parse in IllegalArgumentException
so the controller maps it to 400. Also rejects cursors with empty
instance_id (trailing '|') which would otherwise produce a vacuous
keyset predicate.
Orders by (timestamp DESC, instance_id ASC). Cursor is
base64url('timestampIso|instanceId') with a tuple keyset predicate
for stable paging across ties.
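A sketch of a cursor codec for this format, including the two rejections described above. EventCursor is a hypothetical name, and unpadded base64url output is an assumption:

```java
import java.nio.charset.StandardCharsets;
import java.time.Instant;
import java.time.format.DateTimeParseException;
import java.util.Base64;

record EventCursor(Instant timestamp, String instanceId) {

    String encode() {
        String raw = timestamp + "|" + instanceId;
        return Base64.getUrlEncoder().withoutPadding()
                .encodeToString(raw.getBytes(StandardCharsets.UTF_8));
    }

    // Throws IllegalArgumentException for every malformed input, so a
    // controller advice can map all of them to 400.
    static EventCursor decode(String cursor) {
        // Base64 decoding already throws IllegalArgumentException on bad input.
        String raw = new String(Base64.getUrlDecoder().decode(cursor), StandardCharsets.UTF_8);
        int sep = raw.indexOf('|');
        if (sep < 0 || sep == raw.length() - 1) {
            // A missing or empty instance id would make the keyset predicate vacuous.
            throw new IllegalArgumentException("Malformed cursor");
        }
        try {
            return new EventCursor(Instant.parse(raw.substring(0, sep)), raw.substring(sep + 1));
        } catch (DateTimeParseException e) {
            throw new IllegalArgumentException("Malformed cursor timestamp", e);
        }
    }
}
```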
Replaces LogSearchRequest.source (String) with sources (List<String>)
and emits 'source IN (...)' when non-empty. LogQueryController parses
?source=a,b,c the same way it parses ?level=a,b,c.
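A minimal sketch of the parse-and-emit step, assuming values are trimmed, blanks are dropped, and each source is bound as its own placeholder (SourceFilter is a hypothetical helper, not the controller code):

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

final class SourceFilter {
    // "?source=a,b,c" -> [a, b, c]; null or blank -> empty list.
    static List<String> parseCsv(String param) {
        if (param == null || param.isBlank()) return List.of();
        return Arrays.stream(param.split(","))
                .map(String::trim)
                .filter(s -> !s.isEmpty())
                .toList();
    }

    // Emits one '?' per value, or nothing when the filter is not set.
    static String inClause(List<String> sources) {
        if (sources.isEmpty()) return "";
        return " AND source IN ("
                + String.join(", ", Collections.nCopies(sources.size(), "?")) + ")";
    }
}
```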
The agent_events table has an `environment` column and AgentEventsController
filters on it, but the INSERT never populated it — every row got the
column default ('default'). Result: Timeline on the Application Runtime
page was empty whenever the user's selected env was anything other than
'default'.
Thread env through the write path:
- AgentEventRepository.insert + AgentEventService.recordEvent gain an
`environment` param; delete the no-env query overload (unused).
- ClickHouseAgentEventRepository.insert writes the column (falls back to
'default' on null to match column DEFAULT).
- All 5 callers source env from the agent registry (AgentInfo.environmentId)
or the registration request body: AgentLifecycleMonitor, deregister,
command ack, event ingestion, register/re-register.
- Integration test updated for the new signatures.
Pre-existing rows in deployed CH will still report environment='default'.
New events from this build forward will carry the correct env. Backfill
(UPDATE ... FROM apps) is left as a manual DB step if historical timeline
is needed for non-default envs.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Correlated exchanges always share the env of the one being viewed —
using the globally-selected env from the picker was wrong if the user
switched envs after opening a detail view (or arrived via permalink).
Thread `environment` through:
- `ExecutionStore.ExecutionRecord` gains `environment` field; the
ClickHouse `executions` table already stores this, just not read back.
- `ClickHouseExecutionStore.findById` SELECT adds the column; mapper
populates it.
- `ExecutionDetail` gains `environment`; `DetailService` passes through.
- `IngestionService.toExecutionRecord` passes null — this legacy PG
ingestion path isn't active when ClickHouse is enabled, and the
read-side is what drives the correlation UI.
- UI `ExchangeHeader` reads `detail.environment ?? storeEnv` and
extends the TS type locally (schema.d.ts catches up on next regen).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Picks up the URL moves from P2/P3A/P3B/P3C. Also fixes a latent bug in
AppControllerIT.uploadJar_asOperator_returns201 / DeploymentControllerIT
setUp: the tests were passing the app's UUID as the {appSlug} path
variable (via `path("id").asText()`); the old AppController looked up
apps via getBySlug(), so the legacy URL call would 404 when the slug
literal was a UUID. Now the test tracks the known slug string and uses
it for every /apps/{appSlug}/... path.
Test URL updates:
- SearchControllerIT: /api/v1/search/executions →
/api/v1/environments/default/executions (GET) and
/api/v1/environments/default/executions/search (POST).
- AppControllerIT: /api/v1/apps → /api/v1/environments/default/apps.
Request bodies drop environmentId (it's in the path).
- DeploymentControllerIT: /api/v1/apps/{appId}/deployments →
/api/v1/environments/default/apps/{appSlug}/deployments. DeployRequest
body drops environmentId.
- JwtRefreshIT + RegistrationSecurityIT: smoke-test protected endpoint
call updated to the new /environments/default/executions shape.
All tests compile clean. Running them requires a full stack
(Postgres + ClickHouse + Docker), so validating the integration tests is a
pre-merge step for the feature branch.
Remaining pre-merge items (not blocked by code):
1. Regenerate ui/src/api/schema.d.ts + openapi.json by running
`cd ui && npm run generate-api:live` against a running backend.
SearchController, DeploymentController, etc. DTO signatures have
changed; schema.d.ts is frozen at the pre-migration shape.
Raw-fetch call sites introduced in P3A/P3C work at runtime without
the schema; the regen only sharpens TypeScript coverage.
2. Smoke test locally: boot server, verify EnvironmentsPage,
AppsTab, Exchanges, Dashboard, Runtime pages all function.
3. Run `mvn verify` end-to-end (Testcontainers + Docker required).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
P3C — the last data/query wave of the taxonomy migration. Every user-
facing read endpoint that was keyed on env-as-query-param is now under
the env-scoped URL, making env impossible to omit and unambiguous in
server-side tenant+env filtering.
Server:
- SearchController: /api/v1/search/** → /api/v1/environments/{envSlug}/...
Endpoints: /executions (GET), /executions/search (POST), /stats,
/stats/timeseries, /stats/timeseries/by-app, /stats/timeseries/by-route,
/stats/punchcard, /attributes/keys, /errors/top. Env comes from path.
- LogQueryController: /api/v1/logs → /api/v1/environments/{envSlug}/logs.
- RouteCatalogController: /api/v1/routes/catalog → /api/v1/environments/
{envSlug}/routes. Env filter unconditional (path).
- RouteMetricsController: /api/v1/routes/metrics →
/api/v1/environments/{envSlug}/routes/metrics (and /metrics/processors).
- DiagramRenderController: /{contentHash}/render stays flat (hashes are
globally unique). Find-by-route moved to /api/v1/environments/{envSlug}/
apps/{appSlug}/routes/{routeId}/diagram — the old GET /diagrams?...
handler is removed.
- Agent views split cleanly:
- AgentListController (new): /api/v1/environments/{envSlug}/agents
- AgentEventsController: /api/v1/environments/{envSlug}/agents/events
- AgentMetricsController: /api/v1/environments/{envSlug}/agents/
{agentId}/metrics — now also rejects cross-env agents (404) as a
defense-in-depth check, fulfilling B3.
Agent self-service endpoints (register/refresh/heartbeat/deregister)
remain flat at /api/v1/agents/** — JWT-authoritative.
SPA:
- queries/agents.ts, agent-metrics.ts, logs.ts, catalog.ts (route
metrics only; /catalog stays flat), processor-metrics.ts,
executions.ts (attributes/keys, stats, timeseries, search),
dashboard.ts (all stats/errors/punchcard), correlation.ts,
diagrams.ts (by-route) — all rewritten to env-scoped URLs.
- Hooks now either read env from useEnvironmentStore internally or
require it as an argument. Query keys include env so switching env
invalidates caches.
- useAgents/useAgentEvents signature simplified — env is no longer a
parameter; it's read from the store. Callers (LayoutShell,
AgentHealth, AgentInstance) updated accordingly.
- LogTab and useStartupLogs thread env through to useLogs.
- envFetch helper introduced in executions.ts for env-prefixed raw
fetch until schema.d.ts is regenerated against the new backend.
BREAKING CHANGE: All these flat paths are removed:
/api/v1/search/**, /api/v1/logs, /api/v1/routes/catalog,
/api/v1/routes/metrics (and /processors), /api/v1/diagrams
(lookup), /api/v1/agents (list), /api/v1/agents/events-log,
/api/v1/agents/{id}/metrics, /api/v1/agent-events.
Clients must use the /api/v1/environments/{envSlug}/... equivalents.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
P3B of the taxonomy migration. App and deployment routes are now
env-scoped in the URL itself, making the (env, app_slug) uniqueness
key explicit. Previously /api/v1/apps/{appSlug} was ambiguous: with
the same app deployed to multiple environments (dev/staging/prod),
the handler called AppService.getBySlug(slug), which returned the
first row matching slug regardless of env.
Server:
- AppController: @RequestMapping("/api/v1/environments/{envSlug}/
apps"). Every handler now calls
appService.getByEnvironmentAndSlug(env.id(), appSlug) — 404 if the
app doesn't exist in *this* env. CreateAppRequest body drops
environmentId (it's in the path).
- DeploymentController: @RequestMapping("/api/v1/environments/
{envSlug}/apps/{appSlug}/deployments"). DeployRequest body drops
environmentId. PromoteRequest body switches from
targetEnvironmentId (UUID) to targetEnvironment (slug);
promote handler resolves the target env by slug and looks up the
app with the same slug in the target env (fails with 404 if the
target app doesn't exist yet — apps must exist in both source
and target before promote).
- AppService: added getByEnvironmentAndSlug helper; createApp now
validates slug against ^[a-z0-9][a-z0-9-]{0,63}$ (400 on
invalid).
SPA:
- queries/admin/apps.ts: rewritten. Hooks take envSlug where
env-scoped. Removed useAllApps (no flat endpoint). Renamed the path
param appId → appSlug throughout. Added
usePromoteDeployment. Query keys include envSlug so cache is
env-scoped.
- AppsTab.tsx: call sites updated. When no environment is selected,
the managed-app list is empty — cross-env discovery lives in the
Runtime tab (catalog). handleDeploy/handleStop/etc. pass envSlug
to the new hook signatures.
BREAKING CHANGE: /api/v1/apps/** paths removed. Clients must use
/api/v1/environments/{envSlug}/apps/{appSlug}/**. Request bodies
for POST /apps and POST /apps/{slug}/deployments no longer accept
environmentId (use the URL path instead). Promote body uses slug
not UUID.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
P3A of the taxonomy migration. Env-scoped config and settings endpoints
now live under the env-prefixed URL shape, making env a first-class
path segment instead of a query param. Agent-authoritative config is
split off into a dedicated endpoint so agent env comes from the JWT
only — never spoofable via URL.
Server:
- ApplicationConfigController: @RequestMapping("/api/v1/environments/
{envSlug}"). Handlers use @EnvPath Environment env, appSlug as
@PathVariable. Removed the dual-mode resolveEnvironmentForRead —
user flow only; agent flow moved to AgentConfigController.
- AgentConfigController (new): GET /api/v1/agents/config. Reads
instanceId from JWT subject, resolves (app, env) from registry,
returns AppConfigResponse. Registry miss → falls back to JWT env
claim for environment, but 404s if application cannot be derived
(no other source without registry).
- AppSettingsController: @RequestMapping("/api/v1/environments/
{envSlug}"). List at /app-settings, per-app at /apps/{appSlug}/
settings. Class-wide @PreAuthorize preserved (ADMIN/OPERATOR).
SPA:
- commands.ts: useAllApplicationConfigs, useApplicationConfig,
useUpdateApplicationConfig, useProcessorRouteMapping,
useTestExpression — rewritten URLs to /environments/{env}/apps/
{app}/... shape. environment now required on every call. Query
keys include environment so cache is env-scoped.
- dashboard.ts: useAppSettings, useAllAppSettings, useUpdateAppSettings
rewritten.
- TapConfigModal: new required environment prop; callers updated.
- RouteDetail, ExchangesPage: thread selectedEnv into test-expression
and modal.
Config changes in SecurityConfig for the new shape landed earlier in
P0.2; no security rule changes needed in this commit.
BREAKING CHANGE: /api/v1/config/** and /api/v1/admin/app-settings/**
paths removed. Agents must use /api/v1/agents/config instead of
GET /api/v1/config/{app}; users must use /api/v1/environments/{env}/
apps/{app}/config and /api/v1/environments/{env}/apps/{app}/settings.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
UUID-based admin paths were the only remaining UUID-in-URL pattern in
the API. Migrates /api/v1/admin/environments/{id} to /{envSlug} so
slugs are the single environment identifier in every URL. UUIDs stay
internal to the database.
- Controller: @PathVariable UUID id → @PathVariable String envSlug on
get/update/delete and the two nested endpoints (default-container-
config, jar-retention). Handlers resolve slug → Environment via
EnvironmentService.getBySlug, then delegate to existing UUID-based
service methods.
- Service: create() now validates slug against ^[a-z0-9][a-z0-9-]{0,63}$
and returns 400 on invalid slugs. Rationale documented in the class:
slugs are immutable after creation because they appear in URLs,
Docker network names, container names, and ClickHouse partition keys.
- UpdateEnvironmentRequest has no slug field and Jackson's default
ignore-unknown behavior drops any slug supplied in a PUT body;
regression test (updateEnvironment_withSlugInBody_ignoresSlug)
documents this invariant.
- SPA: mutation args change from { id } to { slug }. EnvironmentsPage
still uses env.id for local selection state (UUID from DB) but
passes env.slug to every mutation.
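The slug rule enforced by create() can be sketched as below. SlugValidator is a hypothetical helper; throwing IllegalArgumentException is an assumption about how the 400 is produced:

```java
import java.util.regex.Pattern;

final class SlugValidator {
    // One leading [a-z0-9], then up to 63 of [a-z0-9-]: 1 to 64 characters total.
    private static final Pattern SLUG = Pattern.compile("^[a-z0-9][a-z0-9-]{0,63}$");

    static String requireValid(String slug) {
        if (slug == null || !SLUG.matcher(slug).matches()) {
            throw new IllegalArgumentException("Invalid slug: " + slug);
        }
        return slug;
    }
}
```

The pattern rejects uppercase letters, a leading hyphen, and anything longer than 64 characters.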
BREAKING CHANGE: /api/v1/admin/environments/{id:UUID}/... paths removed.
Clients must use /{envSlug}/... (slug from the environments list).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Closes two cross-env data leakage paths. Both endpoints previously
returned data aggregated across all environments, so a diagram or
attribute key from dev could appear in a prod UI query (and vice versa).
B1: GET /api/v1/diagrams?application=&routeId= now requires
?environment= and resolves agents via
registryService.findByApplicationAndEnvironment instead of
findByApplication. Prevents serving a dev diagram for a prod route.
B2: GET /api/v1/search/attributes/keys now requires ?environment=.
SearchIndex.distinctAttributeKeys gains an environment parameter and
the ClickHouse query adds the env filter alongside the existing
tenant_id filter. Prevents prod attribute names leaking into dev
autocompletion (and vice versa).
SPA hooks updated to thread environment through from
useEnvironmentStore; query keys include environment so React Query
re-fetches on env switch. No call-site changes needed — hook
signatures unchanged.
B3 (AgentMetricsController env scope) deferred to P3C: agent-env is
effectively 1:1 today via the instance_id naming
({envSlug}-{appSlug}-{replicaIndex}), and the URL migration in P3C
to /api/v1/environments/{env}/agents/{agentId}/metrics naturally
introduces env from path. A minimal P1 fix would regress the "view
metrics of a killed agent" case.
BREAKING CHANGE: Both endpoints now require ?environment= (slug).
Clients omitting the parameter receive 400.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Groundwork for the REST API taxonomy migration. Introduces the
infrastructure that future waves use to move data/query endpoints under
/api/v1/environments/{envSlug}/... without per-handler boilerplate.
- Add @EnvPath annotation + EnvironmentPathResolver: injects the
Environment identified by the {envSlug} path variable, 404 on unknown
slug, registered via WebConfig.addArgumentResolvers.
- Add env-scoped URL matchers to SecurityConfig (config, settings,
executions/stats/logs/routes/agents/apps/deployments under
/environments/*/**). Legacy flat matchers kept in place and will be
removed per-wave as controllers migrate. New agent-authoritative
/api/v1/agents/config matcher prepared for the agent/user split.
- Document OpenAPI schema regen workflow in CLAUDE.md so future API
changes cover schema.d.ts regeneration as part of the change.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>