Environments now have:
- production (bool): prod vs non-prod resource allocation
- enabled (bool): disabled blocks new deployments
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- EnvironmentService: CRUD with slug uniqueness, default env protection
- AppService: CRUD, JAR upload with SHA-256 checksumming
- DeploymentService: create, promote, status transitions
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- EnvironmentRepository, AppRepository, AppVersionRepository, DeploymentRepository
- RuntimeOrchestrator interface with ContainerRequest and ContainerStatus
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Add clearManagedAssignments, assignManagedRole, addUserToManagedGroup to interface
- Update assignRoleToUser and addUserToGroup to explicitly set origin='direct'
- Update getDirectRolesForUser to filter by origin='direct'
- Implement managed assignment methods with ON CONFLICT upsert
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Thread-safe AtomicReference-based license holder
- Defaults to open mode (all features enabled) when no license loaded
- Runtime license loading with feature/limit queries
- Unit tests for open mode and licensed mode
- Evaluates JWT claims against mapping rules
- Supports equals, contains (list + space-separated), regex match types
- Results sorted by priority
- 7 unit tests covering all match types and edge cases
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Validates payload.signature license tokens using Ed25519 public key
- Parses tier, features, limits, timestamps from JSON payload
- Rejects expired and tampered tokens
- Unit tests for valid, expired, and tampered license scenarios
- AssignmentOrigin enum (direct/managed)
- ClaimMappingRule record with match type and action enums
- ClaimMappingRepository interface for CRUD operations
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Expose getDirectRolesForUser on RbacService interface so syncOidcRoles
compares against directly-assigned roles only, not group-inherited ones
- Remove early-return that preserved existing roles when OIDC returned
none — now always applies defaultRoles as fallback
- Update CLAUDE.md and SERVER-CAPABILITIES.md to reflect changes
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The OIDC login flow now reads roles from the access_token (JWT) in
addition to the id_token. This fixes role extraction with providers
like Logto that put scopes/roles in access tokens rather than id_tokens.
- Add audience and additionalScopes to OidcConfig for RFC 8707 resource
indicator support and configurable extra scopes
- OidcTokenExchanger decodes access_token with at+jwt-compatible processor,
falls back to id_token if access_token is opaque or has no roles
- syncOidcRoles preserves existing local roles when OIDC returns none
- SPA includes resource and additionalScopes in authorization requests
- Admin UI exposes new config fields
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Extract OidcProviderHelper for shared discovery + JWK source construction
- Add SystemRole.normalizeScope() to centralize role normalization
- Merge duplicate claim extraction in OidcTokenExchanger
- Add PKCE (S256) to OIDC authorization flow (frontend + backend)
- Add SecurityContext (runAsNonRoot) to all K8s deployments
- Fix postgres probe to use $POSTGRES_USER instead of hardcoded username
- Remove default credentials from Dockerfile
- Extract sanitize_branch() to shared .gitea/sanitize-branch.sh
- Fix sidebar to use /exchanges/ paths directly, remove legacy redirects
- Centralize basePath computation in router.tsx via config module
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The OIDC user login ID is now configurable via the admin OIDC setup
dialog (userIdClaim field). Supports dot-separated claim paths (e.g.
'email', 'preferred_username', 'custom.user_id'). Defaults to 'sub'
for backwards compatibility. Throws if the configured claim is missing
from the id_token.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add 'env' claim to agent JWTs (set at registration, carried through
refresh). Auto-heal on heartbeat/SSE now reads environment from the
JWT instead of hardcoding 'default', so agents retain their correct
environment after server restart.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Backend: Added optional `environment` query parameter to catalog,
search, stats, timeseries, punchcard, top-errors, logs, and agents
endpoints. ClickHouse queries filter by environment when specified
(literal SQL for AggregatingMergeTree, ? binds for raw tables).
StatsStore interface methods all accept environment parameter.
UI: Added EnvironmentSelector component (compact native select).
LayoutShell extracts distinct environments from agent data and
passes selected environment to catalog and agent queries via URL
search param (?env=). TopBar shows current environment label.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Adds configurable tenant ID (CAMELEER_TENANT_ID env var, default:
"default") and environment as a first-class concept. Each server
instance serves one tenant with multiple environments.
Changes across 36 files:
- TenantProperties config bean for tenant ID injection
- AgentInfo: added environmentId field
- AgentRegistrationRequest: added environmentId field
- All 9 ClickHouse stores: inject tenant ID, replace hardcoded
"default" constant, add environment to writes/reads
- ChunkAccumulator: configurable tenant ID + environment resolver
- MergedExecution/ProcessorBatch/BufferedLogEntry: added environment
- ClickHouse init.sql: added environment column to all tables,
updated ORDER BY (tenant→time→env→app), added tenant_id to
usage_events, updated all MV GROUP BY clauses
- Controllers: pass environmentId through registration/auto-heal
- K8s deploy: added CAMELEER_TENANT_ID env var
- All tests updated for new signatures
Closes#123
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The heartbeat now carries capabilities (per protocol v2 update).
On each heartbeat, capabilities are updated in the agent registry.
On auto-heal (server restart), capabilities from the heartbeat
are used instead of empty Map.of(), so the agent's feature flags
(replay, routeControl, logForwarding, etc.) are restored
immediately on the first heartbeat.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Diagnostics showed ~3,200 tiny inserts per 5 minutes:
- processor_executions: 2,376 inserts (14 rows avg) — one per chunk
- logs: 803 inserts (5 rows avg) — synchronous in HTTP handler
Fix 1: Consolidate processor inserts — new insertProcessorBatches() method
flattens all ProcessorBatch records into a single INSERT per flush cycle.
Fix 2: Buffer log inserts — route through WriteBuffer<BufferedLogEntry>,
flushed on the same 5s interval as executions. LogIngestionController now
pushes to buffer instead of inserting directly.
Also reverts async_insert config (doesn't work with JDBC inline VALUES).
Expected: ~3,200 inserts/5min → ~160 (20x reduction in part creation,
MV triggers, and background merge work).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- ChunkAccumulator now extracts inputBody/outputBody/inputHeaders/outputHeaders
from ExecutionChunk.inputSnapshot/outputSnapshot instead of storing empty strings
- Set ClickHouse server log level to warning (was trace by default)
- Update CLAUDE.md to document Ed25519 key derivation
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replace ACK-based route state inference with agent-reported state.
Heartbeats now carry optional routeStates map, and ROUTE_STATE_CHANGED
events update the registry immediately.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
In-memory registry that infers route state (started/stopped/suspended)
from successful route-control command ACKs. Updates state only when all
agents in a group confirm success.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add addGroupCommandWithReplies() to AgentRegistryService that sends commands
to all LIVE agents in a group and returns CompletableFuture per agent for
collecting replies. Update sendGroupCommand() and pushConfigToAgents() to
wait with a shared 10-second deadline, returning CommandGroupResponse with
per-agent status, timeouts, and overall success. Config update endpoint now
returns ConfigUpdateResponse wrapping both the saved config and push result.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add GET /search/attributes/keys endpoint that queries distinct
attribute key names from ClickHouse using JSONExtractKeys. Attribute
keys appear in the cmd-k Attributes tab alongside attribute value
matches from exchange results.
- SearchIndex.distinctAttributeKeys() interface method
- ClickHouseSearchIndex implementation using arrayJoin(JSONExtractKeys)
- SearchController /attributes/keys endpoint
- useAttributeKeys() React Query hook
- buildSearchData includes attribute keys as 'attribute' category items
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The chunked ingestion path hardcoded hasTraceData=false because the
execution envelope doesn't carry processor bodies. But the processor
records DO have inputBody/outputBody — we just need to check them.
Track hasTraceData across chunks in PendingExchange and pass it to
MergedExecution when the final chunk arrives or on stale sweep.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Tracks authenticated UI user requests to understand usage patterns:
- New ClickHouse usage_events table with 90-day TTL
- UsageTrackingInterceptor captures method, path, duration, user
- Path normalization groups dynamic segments ({id}, {hash})
- Buffered writes via WriteBuffer + periodic flush
- Admin endpoint GET /api/v1/admin/usage with groupBy=endpoint|user|hour
- Skips agent requests, health checks, and data ingestion
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replace synthetic wrapper node approach with direct iteration fields:
- ProcessorNode gains iteration (child's index) and iterationSize
(container's total) fields, populated from ClickHouse flat records
- Frontend hooks detect iteration containers from iterationSize != null
instead of scanning for wrapper processorTypes
- useExecutionOverlay filters children by iteration field instead of
wrapper nodes, eliminating ITERATION_WRAPPER_TYPES entirely
- Cleaner data contract: API returns exactly what the DB stores
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Align all internal naming with the agent team's protocol v2 identity rename:
- agentId → instanceId (unique per-JVM identifier)
- applicationName → applicationId (shared app identifier)
- AgentInfo: id → instanceId, name → displayName, application → applicationId
Add SHUTDOWN lifecycle state for graceful agent shutdowns:
- New POST /data/events endpoint receives agent lifecycle events
- AGENT_STOPPED event transitions agent to SHUTDOWN (skips STALE/DEAD)
- New POST /{id}/deregister endpoint removes agent from registry
- Server now distinguishes graceful shutdown from crash (heartbeat timeout)
Includes ClickHouse V9 and PostgreSQL V14 migrations for column renames.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- ElkDiagramRenderer.getElkRoot(): add null guard to prevent NPE
when node is null (SQ java:S2259)
- WriteBuffer: add offerOrWarn() that logs when buffer is full instead
of silently dropping data. ChunkAccumulator now uses this method
so ingestion backpressure is visible in logs (SQ java:S899)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
ChunkAccumulator now injects DiagramStore and looks up the content hash
when converting to MergedExecution. Without this, the detail page had
no diagram hash, so the overlay couldn't find the route diagram.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Dual-mode buildTree: detects seq presence and uses seq/parentSeq linkage
instead of processorId map. Handles duplicate processorIds across
iterations correctly. Old processorId-based mode kept for PG compat.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Extract LogIndex interface from OpenSearchLogIndex. Both ClickHouseLogStore
and OpenSearchLogIndex implement it. Controllers now inject LogIndex.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- ChunkIngestionController: /data/chunks → /data/executions (matches
PROTOCOL.md endpoint the agent actually posts to)
- ExecutionController: conditional on ClickHouse being disabled to
avoid mapping conflict
- Persist originalExchangeId and replayExchangeId from ExecutionChunk
envelope through to ClickHouse (was silently dropped)
- V5 migration adds the two new columns to executions table
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
cameleer3-common removed children, loopIndex, splitIndex,
multicastIndex from ProcessorExecution (flat model only now).
Iteration context lives on synthetic wrapper nodes via processorType.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The X-Cameleer-Replay header is only available when inputSnapshot is
captured (DETAILED/DEEP engine level). The agent always sets
replayExchangeId on RouteExecution, so check that first.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- ElkDiagramRenderer: guard against null containingNode before getElkRoot()
- OpenSearchAdminController: return 503/502 instead of 200 on errors
- DatabaseAdminController: return 503 instead of 200 on connection failure
- SpaForwardController: replace unbound {path} variables with /** wildcards
- WriteBuffer: check offer() return value and log on unexpected rejection
- ApiExceptionHandler: extract getReason() to local var for null safety
- Admin UI pages: handle isError state for disconnected service display
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Detect replayed exchanges via X-Cameleer-Replay header during ingestion,
persist the flag through PostgreSQL and OpenSearch, and surface it in
the dashboard (amber replay icon) and exchange detail chain view.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add ROUTE_CONTROL command type and route-control mapping in
AgentCommandController. New RouteControlBar component in the exchange
header shows Start/Stop/Suspend/Resume actions (grouped pill bar) and
a Replay button, gated by agent capabilities and OPERATOR/ADMIN role.
Fix useReplayExchange hook to match protocol section 16: payload now
uses { routeId, exchange: { body, headers }, originalExchangeId, nonce }
instead of the flat { headers, body } format.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
When a filter processor rejects a message (filterMatched=false) or an
idempotent consumer detects a duplicate (duplicateMessage=true), the
compound container turns amber (header, border, body tint).
Also adds red pulsing rings on the failed processor badge (same SMIL
pattern as the teal hasTraceData pulse).
Backend: ProcessorNode gains filterMatched/duplicateMessage fields,
threaded from ProcessorExecution JSON path.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>