cameleer-server/.claude/rules/core-classes.md
hsiegeln 1ddae94930 feat(runtime): init-container loader pattern + withUsernsMode (#152 hardening close)
Tasks 9+10+11 of the init-container-jar-fetch plan, landed atomically because
9 alone leaves the orchestrator+executor referencing removed ContainerRequest
fields.

ContainerRequest (core) drops jarPath/jarVolumeName/jarVolumeMountPath; adds
appVersionId, artifactDownloadUrl, artifactExpectedSize, loaderImage.

DockerRuntimeOrchestrator (app):
  - per-replica named volume "cameleer-jars-{containerName}"
  - phase 1: loader container with the volume mounted RW at /app/jars,
    ARTIFACT_URL + ARTIFACT_EXPECTED_SIZE env, full hardening contract
  - block on waitContainerCmd().awaitStatusCode(120s); on non-zero exit
    remove the loader, remove the volume, propagate RuntimeException so
    DeploymentExecutor marks the deployment FAILED. main is never created.
  - phase 2: main container with the same volume mounted RO at /app/jars
  - withUsernsMode("host:1000:65536") on BOTH containers — closes the last
    open hardening gap from issue #152
  - main entrypoint paths point at /app/jars/app.jar
  - extracted baseHardenedHostConfig() so loader and main share the
    cap_drop / security_opt / readonly / pids / tmpfs contract
  - removeContainer() also removes the per-replica volume so blue/green
    doesn't leak volumes

DeploymentExecutor (app):
  - injects ArtifactDownloadTokenSigner; new @Value props loaderimage,
    artifacttokenttlseconds, artifactbaseurl
  - replaces the temporary getVersion(...).jarPath() bridge with a signed
    URL ${artifactBaseUrl}/api/v1/artifacts/{id}?exp&sig
  - drops the Files.exists pre-flight check; AppVersion.jarSizeBytes is
    the size-of-record check now
  - drops jarDockerVolume / jarStoragePath @Value fields and the volume
    plumbing in startReplica
  - DeployCtx carries appVersionId / artifactUrl / artifactExpectedSize
    in place of jarPath

Tests:
  - DockerRuntimeOrchestratorHardeningTest updated for the new shape;
    captures HostConfig on the MAIN container and asserts cap_drop ALL
    + no-new-privileges + apparmor + readonly + pids + tmpfs + the new
    withUsernsMode("host:1000:65536")
  - DockerRuntimeOrchestratorLoaderTest (new): verifies volume create →
    loader create with RW bind → loader started → awaited → loader
    removed → main create with RO bind → main started; verifies abort
    + cleanup on loader exit != 0 (loader removed, volume removed, main
    NEVER created); verifies userns_mode applied to both containers.

Config:
  - application.yml replaces jardockervolume with loaderimage,
    artifacttokenttlseconds, artifactbaseurl

Rules updated: .claude/rules/docker-orchestration.md (loader pattern,
userns, no more bind-mount); .claude/rules/core-classes.md
(ContainerRequest field map).

Test counts after change:
  - cameleer-server-core: 116/116 unit tests pass
  - cameleer-server-app: 273/273 unit tests pass

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-27 16:06:56 +02:00

paths
cameleer-server-core/**

Core Module Key Classes

cameleer-server-core/src/main/java/com/cameleer/server/core/

agent/ — Agent lifecycle and commands

  • AgentRegistryService — in-memory registry (ConcurrentHashMap), register/heartbeat/lifecycle
  • AgentInfo — record: id, name, application, environmentId, version, routeIds, capabilities, state
  • AgentCommand — record: id, type, targetAgent, payload, createdAt, expiresAt
  • AgentEventService — records agent state changes, heartbeats
  • AgentState — enum: LIVE, STALE, DEAD, SHUTDOWN
  • CommandType — enum for command types (config-update, deep-trace, replay, route-control, etc.)
  • CommandStatus — enum for command acknowledgement states
  • CommandReply — record: command execution result from agent
  • AgentEventRecord, AgentEventRepository — event persistence. AgentEventRepository.queryPage(...) is cursor-paginated (AgentEventPage{data, nextCursor, hasMore}); the legacy non-paginated query(...) path is gone. AgentEventRepository.findInWindow(env, appSlug, agentId, eventTypes, from, to, limit) returns matching events ordered by (timestamp ASC, insert_id ASC) — consumed by AgentLifecycleEvaluator.
  • AgentEventPage — record: (List<AgentEventRecord> data, String nextCursor, boolean hasMore) returned by AgentEventRepository.queryPage
  • AgentEventListener — callback interface for agent events
  • RouteStateRegistry — tracks per-agent route states
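
The cursor-paginated contract of AgentEventRepository.queryPage(...) can be consumed by following nextCursor until hasMore is false. The following is a minimal sketch under stated assumptions: the (cursor, limit) function is a hypothetical stand-in for queryPage (its real filter parameters are not listed here), a null cursor is assumed to mean "first page", and the element type is simplified from AgentEventRecord to String.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiFunction;

// Mirrors the documented page shape (data, nextCursor, hasMore).
// The element type is simplified to String for this sketch.
record EventPage(List<String> data, String nextCursor, boolean hasMore) { }

final class EventPageDrainer {
    private EventPageDrainer() { }

    // Follows nextCursor until hasMore is false and returns every event seen.
    // queryPage is a hypothetical (cursor, limit) -> page function, not the real signature.
    static List<String> drain(BiFunction<String, Integer, EventPage> queryPage, int limit) {
        List<String> all = new ArrayList<>();
        String cursor = null; // assumption: null requests the first page
        while (true) {
            EventPage page = queryPage.apply(cursor, limit);
            all.addAll(page.data());
            if (!page.hasMore()) {
                return all;
            }
            cursor = page.nextCursor();
        }
    }
}
```

Driving the loop off hasMore rather than an empty data list avoids an extra round trip when the last page happens to be full.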

runtime/ — App/Environment/Deployment domain

  • App — record: id, environmentId, slug, displayName, containerConfig (JSONB)
  • AppVersion — record: id, appId, version, jarPath, detectedRuntimeType, detectedMainClass
  • Environment — record: id, slug, displayName, production, enabled, defaultContainerConfig, jarRetentionCount, color, createdAt, executionRetentionDays, logRetentionDays, metricRetentionDays. color is one of the 8 preset palette values validated by EnvironmentColor.VALUES and CHECK-constrained in PostgreSQL (V2 migration). The 3 retention day fields (V5) are int-typed (not nullable, since unlimited has no use-case), default to 1 day per the V5 NOT NULL DEFAULT 1, validated >= 1 in the canonical constructor.
  • EnvironmentColor — constants: DEFAULT = "slate", VALUES = {slate,red,amber,green,teal,blue,purple,pink}, isValid(String).
  • Deployment — record: id, appId, appVersionId, environmentId, status, targetState, deploymentStrategy, replicaStates (JSONB), deployStage, containerId, containerName, createdBy (String, user_id reference; nullable for pre-V4 historical rows)
  • DeploymentStatus — enum: STOPPED, STARTING, RUNNING, DEGRADED, STOPPING, FAILED. DEGRADED is reserved for post-deploy drift (a replica died after RUNNING); DeploymentExecutor now marks partial-healthy deploys FAILED, not DEGRADED.
  • DeployStage — enum: PRE_FLIGHT, PULL_IMAGE, CREATE_NETWORK, START_REPLICAS, HEALTH_CHECK, SWAP_TRAFFIC, COMPLETE
  • DeploymentStrategy — enum: BLUE_GREEN, ROLLING. Stored on ResolvedContainerConfig.deploymentStrategy as kebab-case string ("blue-green" / "rolling"). fromWire(String) is the only conversion entry point; unknown/null inputs fall back to BLUE_GREEN so the executor dispatch site never null-checks or throws.
  • DeploymentService — createDeployment (calls deleteFailedByAppAndEnvironment first so FAILED rows don't pile up; STOPPED rows are preserved as restorable checkpoints), markRunning, markFailed, markStopped
  • RuntimeType — enum: AUTO, SPRING_BOOT, QUARKUS, PLAIN_JAVA, NATIVE
  • RuntimeDetector — probes JAR files at upload time: detects runtime from manifest Main-Class (Spring Boot loader, Quarkus entry point, plain Java) or native binary (non-ZIP magic bytes)
  • ContainerRequest — record: 21 fields for Docker container creation. Replaces the legacy jarPath/jarVolumeName/jarVolumeMountPath triple with appVersionId (UUID), artifactDownloadUrl (signed), artifactExpectedSize (bytes), and loaderImage. The orchestrator's loader init-container fetches the JAR from the URL into a per-replica named volume; the main container reads it from /app/jars/app.jar.
  • ContainerStatus — record: state, running, exitCode, error
  • ResolvedContainerConfig — record: typed config with memoryLimitMb, memoryReserveMb, cpuRequest, cpuLimit, appPort, exposedPorts, customEnvVars, stripPathPrefix, sslOffloading, routingMode, routingDomain, serverUrl, replicas, deploymentStrategy, routeControlEnabled, replayEnabled, runtimeType, customArgs, extraNetworks, externalRouting (default true; when false, TraefikLabelBuilder strips all traefik.* labels so the container is not publicly routed), certResolver (server-wide, sourced from CAMELEER_SERVER_RUNTIME_CERTRESOLVER; when blank the tls.certresolver label is omitted — use for dev installs with a static TLS store)
  • RoutingMode — enum for routing strategies
  • ConfigMerger — pure function: resolve(globalDefaults, envConfig, appConfig) -> ResolvedContainerConfig
  • RuntimeOrchestrator — interface: startContainer, stopContainer, getContainerStatus, getLogs, startLogCapture, stopLogCapture
  • AppRepository, AppVersionRepository, EnvironmentRepository, DeploymentRepository — repository interfaces
  • AppService, EnvironmentService — domain services
  • CreateGuard — @FunctionalInterface. void check(long current) — implementations throw to abort creation. NOOP constant is the default. Consulted by EnvironmentService.create, AppService.createApp, and AgentRegistryService.register so license caps can be enforced from the app module without leaking Spring or app-only types into core. Wired in LicenseBeanConfig to a LicenseEnforcer.assertWithinCap(...) call per limit key.
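
Two of the contracts above are small enough to sketch: the DeploymentStrategy.fromWire fallback and the CreateGuard hook. This is an illustration, not the actual source. Exact case-sensitive matching in fromWire, the "current >= cap" comparison, and the CapGuards.atMost helper (a stand-in for the LicenseEnforcer.assertWithinCap(...) wiring) are all assumptions.

```java
// Sketch only: the real types live in com.cameleer.server.core.runtime.
enum DeploymentStrategy {
    BLUE_GREEN("blue-green"),
    ROLLING("rolling");

    private final String wire;

    DeploymentStrategy(String wire) {
        this.wire = wire;
    }

    String wire() {
        return wire;
    }

    // Unknown or null input falls back to BLUE_GREEN, so callers never null-check.
    // Exact, case-sensitive matching is an assumption of this sketch.
    static DeploymentStrategy fromWire(String value) {
        for (DeploymentStrategy s : values()) {
            if (s.wire.equals(value)) {
                return s;
            }
        }
        return BLUE_GREEN;
    }
}

@FunctionalInterface
interface CreateGuard {
    // Default guard: never aborts creation.
    CreateGuard NOOP = current -> { };

    // Implementations throw to abort creation.
    void check(long current);
}

final class CapGuards {
    private CapGuards() { }

    // Hypothetical helper: rejects creation once the current count has reached the cap.
    static CreateGuard atMost(String limitKey, long cap) {
        return current -> {
            if (current >= cap) {
                throw new IllegalStateException(
                        limitKey + " cap reached (" + current + "/" + cap + ")");
            }
        };
    }
}
```

Because CreateGuard is a plain functional interface, the core services can accept it without depending on Spring or on license types.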

license/ — License domain (signed-token tier system)

The pure license contract types live in the separate cameleer-license-api module under package com.cameleer.license (no Spring, no server-runtime deps) so consumers like cameleer-license-minter and cameleer-saas can use them without inheriting server internals. Server-core only contains the runtime state holder (LicenseGate).

Contract types in cameleer-license-api (package com.cameleer.license):

  • LicenseInfo — record: (UUID licenseId, String tenantId, String label, Map<String,Integer> limits, Instant issuedAt, Instant expiresAt, int gracePeriodDays). isExpired() true once now > expiresAt + gracePeriodDays; isAfterRawExpiry() true once now > expiresAt. Constructed via LicenseValidator; canonical ctor null-checks all required fields and rejects blank tenantId / negative grace.
  • LicenseLimits — typed limits container backed by Map<String,Integer>. defaultsOnly() returns the DefaultTierLimits.DEFAULTS view; mergeOverDefaults(overrides) produces the license-overrides UNION default tier. get(String key) returns the cap; throws IllegalArgumentException for unknown keys (programmer error). isDefaultSourced(key, license) reports whether a key fell through to the default tier.
  • DefaultTierLimits — immutable LinkedHashMap of constants for the no-license fallback tier: max_environments=1, max_apps=3, max_agents=5, max_users=3, max_outbound_connections=1, max_alert_rules=2, max_total_cpu_millis=2000, max_total_memory_mb=2048, max_total_replicas=5, max_execution_retention_days=1, max_log_retention_days=1, max_metric_retention_days=1, max_jar_retention_count=3.
  • LicenseValidator — verifies signed token. Constructor (String publicKeyBase64, String expectedTenantId) decodes an X.509 Ed25519 public key. validate(String token) splits payload.signature, verifies the Ed25519 signature, parses the JSON payload, enforces tenantId == expectedTenantId, and returns LicenseInfo. Throws SecurityException on signature mismatch / IllegalArgumentException on parse failure / expired payload.
  • LicenseStateMachine — pure classifier. classify(LicenseInfo, String invalidReason) returns INVALID if a reason is set, ABSENT if no license, ACTIVE if now <= expiresAt, GRACE if expired but within grace window, EXPIRED otherwise.
  • LicenseState — enum: ABSENT, ACTIVE, GRACE, EXPIRED, INVALID.
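
The classification order described for LicenseStateMachine can be written out as a pure function. This is a minimal sketch, assuming an explicit `now` parameter (the documented signature takes only the license and the invalid reason) and a reduced LicenseWindow record in place of LicenseInfo. The enum is named LicenseStateSketch to make clear it is not the real type.

```java
import java.time.Duration;
import java.time.Instant;

enum LicenseStateSketch { ABSENT, ACTIVE, GRACE, EXPIRED, INVALID }

// Reduced stand-in for LicenseInfo: only the fields the classifier needs.
record LicenseWindow(Instant expiresAt, int gracePeriodDays) { }

final class LicenseClassifier {
    private LicenseClassifier() { }

    // Order matters: an invalid reason wins over everything, then absence,
    // then the expiry and grace-window checks.
    static LicenseStateSketch classify(LicenseWindow license, String invalidReason, Instant now) {
        if (invalidReason != null) {
            return LicenseStateSketch.INVALID;
        }
        if (license == null) {
            return LicenseStateSketch.ABSENT;
        }
        if (!now.isAfter(license.expiresAt())) {
            return LicenseStateSketch.ACTIVE; // now <= expiresAt
        }
        Instant graceEnd = license.expiresAt().plus(Duration.ofDays(license.gracePeriodDays()));
        // EXPIRED only once now > expiresAt + gracePeriodDays, matching isExpired().
        return now.isAfter(graceEnd) ? LicenseStateSketch.EXPIRED : LicenseStateSketch.GRACE;
    }
}
```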

Runtime state holder in server-core (package com.cameleer.server.core.license):

  • LicenseGate — runtime state holder (thread-safe via AtomicReference<Snapshot>). getCurrent() returns the current LicenseInfo (null when ABSENT/INVALID); getState() delegates to LicenseStateMachine.classify(...); getEffectiveLimits() returns license-overrides UNION defaults in ACTIVE/GRACE, defaults-only otherwise. getInvalidReason(), load(LicenseInfo), markInvalid(String reason), clear() are the mutators. getLimit(key, defaultValue) shorthand swallows unknown-key errors.
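
The "license-overrides UNION defaults" rule behind getEffectiveLimits() can be illustrated with plain maps. This is a sketch under assumptions: it uses only three of the default-tier keys, models the ACTIVE/GRACE condition as a boolean, and the class and method names are illustrative rather than the real LicenseLimits API.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

final class EffectiveLimits {
    // Subset of the documented default tier, kept short for the example.
    static final Map<String, Integer> DEFAULTS;

    static {
        Map<String, Integer> m = new LinkedHashMap<>();
        m.put("max_environments", 1);
        m.put("max_apps", 3);
        m.put("max_agents", 5);
        DEFAULTS = Collections.unmodifiableMap(m);
    }

    private EffectiveLimits() { }

    // Overrides win per key; keys the license does not mention fall through to the defaults.
    static Map<String, Integer> mergeOverDefaults(Map<String, Integer> overrides) {
        Map<String, Integer> merged = new LinkedHashMap<>(DEFAULTS);
        merged.putAll(overrides);
        return Collections.unmodifiableMap(merged);
    }

    // Merged view while the license is ACTIVE or in GRACE, defaults-only otherwise.
    static Map<String, Integer> effective(boolean activeOrGrace, Map<String, Integer> overrides) {
        return activeOrGrace ? mergeOverDefaults(overrides) : DEFAULTS;
    }

    // Unknown keys are treated as programmer error.
    static int get(Map<String, Integer> limits, String key) {
        Integer cap = limits.get(key);
        if (cap == null) {
            throw new IllegalArgumentException("unknown limit key: " + key);
        }
        return cap;
    }
}
```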

search/ — Execution search and stats

  • SearchService — search, count, stats, statsForApp, statsForRoute, timeseries, timeseriesForApp, timeseriesForRoute, timeseriesGroupedByApp, timeseriesGroupedByRoute, slaCompliance, slaCountsByApp, slaCountsByRoute, topErrors, activeErrorTypes, punchcard, distinctAttributeKeys. statsForRoute/timeseriesForRoute take (routeId, applicationId) — app filter is applied to stats_1m_route.
  • SearchRequest / SearchResult — search DTOs. SearchRequest.attributeFilters: List<AttributeFilter> carries structured facet filters for execution attributes — key-only (exists), exact (key=value), or wildcard (* in value). The 21-arg legacy ctor is preserved to avoid call-site churn; the compact ctor normalises null → List.of().
  • AttributeFilter(key, value) — record with key regex ^[a-zA-Z0-9._-]+$ (inlined into SQL, same constraint as alerting), value == null means key-exists, value containing * becomes a SQL LIKE pattern via toLikePattern().
  • ExecutionStats, ExecutionSummary — stats aggregation records
  • StatsTimeseries, TopError — timeseries and error DTOs
  • LogSearchRequest / LogSearchResponse — log search DTOs. LogSearchRequest.sources / levels are List<String> (null-normalized, multi-value OR); cursor + limit + sort drive keyset pagination. Response carries nextCursor + hasMore + per-level levelCounts.
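
The AttributeFilter rules (validated key, null value meaning key-exists, * turned into a LIKE wildcard) can be sketched as a record. The key regex is taken from the text above. The escaping of %, _ and backslash in toLikePattern() and the isKeyOnly()/isWildcard() helpers are assumptions of this sketch, not confirmed behaviour of the real class.

```java
import java.util.regex.Pattern;

record AttributeFilter(String key, String value) {
    // Same key constraint as documented; keys are inlined into SQL, so they must be validated.
    private static final Pattern KEY = Pattern.compile("^[a-zA-Z0-9._-]+$");

    AttributeFilter {
        if (key == null || !KEY.matcher(key).matches()) {
            throw new IllegalArgumentException("invalid attribute key: " + key);
        }
    }

    // value == null means "the key exists", regardless of its value.
    boolean isKeyOnly() {
        return value == null;
    }

    boolean isWildcard() {
        return value != null && value.contains("*");
    }

    // Maps * to the LIKE wildcard %. Escaping literal %, _ and \ is an assumption here.
    String toLikePattern() {
        if (value == null) {
            throw new IllegalStateException("key-only filter has no LIKE pattern");
        }
        StringBuilder sb = new StringBuilder();
        for (char c : value.toCharArray()) {
            switch (c) {
                case '\\', '%', '_' -> sb.append('\\').append(c);
                case '*' -> sb.append('%');
                default -> sb.append(c);
            }
        }
        return sb.toString();
    }
}
```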

storage/ — Storage abstractions

  • ExecutionStore, MetricsStore, MetricsQueryStore, StatsStore, DiagramStore, RouteCatalogStore, SearchIndex, LogIndex — interfaces. DiagramStore.findLatestContentHashForAppRoute(appId, routeId, env) resolves the latest diagram by (app, env, route) without consulting the agent registry, so routes whose publishing agents were removed between app versions still resolve. findContentHashForRoute(route, instance) is retained for the ingestion path that stamps a per-execution diagramContentHash at ingest time (point-in-time link from ExecutionDetail/ExecutionSummary).
  • RouteCatalogEntry — record: applicationId, routeId, environment, firstSeen, lastSeen
  • LogEntryResult — log query result record
  • model/ExecutionDocument, MetricTimeSeries, MetricsSnapshot

rbac/ — Role-based access control

  • RbacService — interface: role/group CRUD, assignRoleToUser, removeRoleFromUser, addUserToGroup, removeUserFromGroup, getDirectRolesForUser, getEffectiveRolesForUser, clearManagedAssignments, assignManagedRole, addUserToManagedGroup, getStats, listUsers
  • SystemRole — enum: AGENT, VIEWER, OPERATOR, ADMIN; normalizeScope() maps scopes
  • UserDetail, RoleDetail, GroupDetail — records
  • UserSummary, RoleSummary, GroupSummary — lightweight list records
  • RbacStats — aggregate stats record
  • AssignmentOrigin — enum: DIRECT, CLAIM_MAPPING (tracks how roles were assigned)
  • ClaimMappingRule — record: OIDC claim-to-role mapping rule
  • ClaimMappingService — interface: CRUD for claim mapping rules
  • ClaimMappingRepository — persistence interface
  • RoleRepository, GroupRepository — persistence interfaces

admin/ — Server-wide admin config

  • SensitiveKeysConfig — record: keys (List, immutable)
  • SensitiveKeysRepository — interface: find(), save()
  • SensitiveKeysMerger — pure function: merge(global, perApp) -> union with case-insensitive dedup, preserves first-seen casing. Returns null when both inputs null.
  • AppSettings, AppSettingsRepository — per-app-per-env settings config and persistence. Record carries (applicationId, environment, …); repository methods are findByApplicationAndEnvironment, findByEnvironment, save, delete(appId, env). AppSettings.defaults(appId, env) produces a default instance scoped to an environment.
  • ThresholdConfig, ThresholdRepository — alerting threshold config and persistence
  • AuditService — audit logging facade
  • AuditRecord, AuditResult, AuditCategory (enum: INFRA, AUTH, USER_MGMT, CONFIG, RBAC, AGENT, OUTBOUND_CONNECTION_CHANGE, OUTBOUND_HTTP_TRUST_CHANGE, ALERT_RULE_CHANGE, ALERT_SILENCE_CHANGE, DEPLOYMENT, LICENSE), AuditRepository — audit trail records and persistence
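
The SensitiveKeysMerger contract (union, case-insensitive dedup, first-seen casing, null when both inputs are null) fits in a few lines. This is a minimal sketch, assuming plain List<String> inputs rather than SensitiveKeysConfig, global keys processed before per-app keys, and no null elements inside the lists.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

final class SensitiveKeysMerger {
    private SensitiveKeysMerger() { }

    static List<String> merge(List<String> global, List<String> perApp) {
        if (global == null && perApp == null) {
            return null; // documented: null only when both inputs are null
        }
        List<String> merged = new ArrayList<>();
        Set<String> seen = new HashSet<>();
        // Arrays.asList tolerates a null entry, unlike List.of.
        for (List<String> source : Arrays.asList(global, perApp)) {
            if (source == null) {
                continue;
            }
            for (String key : source) {
                // Dedup on the lower-cased form, but keep the first-seen casing.
                if (seen.add(key.toLowerCase(Locale.ROOT))) {
                    merged.add(key);
                }
            }
        }
        return List.copyOf(merged);
    }
}
```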

http/ — Outbound HTTP primitives (cross-cutting)

  • OutboundHttpClientFactory — interface: clientFor(context) returns memoized CloseableHttpClient
  • OutboundHttpProperties — record: trustAll, trustedCaPemPaths, defaultConnectTimeout, defaultReadTimeout, proxyUrl, proxyUsername, proxyPassword
  • OutboundHttpRequestContext — record of per-call TLS/timeout overrides; systemDefault() static factory
  • TrustMode — enum: SYSTEM_DEFAULT | TRUST_ALL | TRUST_PATHS

outbound/ — Admin-managed outbound connections

  • OutboundConnection — record: id, tenantId, name, description, url, method, defaultHeaders, defaultBodyTmpl, tlsTrustMode, tlsCaPemPaths, hmacSecretCiphertext, auth, allowedEnvironmentIds, createdAt, createdBy (String user_id), updatedAt, updatedBy (String user_id). isAllowedInEnvironment(envId) returns true when allowed-envs list is empty OR contains the env.
  • OutboundAuth — sealed interface + records: None | Bearer(tokenCiphertext) | Basic(username, passwordCiphertext). Jackson @JsonTypeInfo(use = DEDUCTION) — wire shape has no discriminator, subtype inferred from fields.
  • OutboundAuthKind, OutboundMethod — enums
  • OutboundConnectionRepository — CRUD by (tenantId, id): save/findById/findByName/listByTenant/delete
  • OutboundConnectionService — create/update/delete/get/list with uniqueness + narrow-envs + delete-if-referenced guards. rulesReferencing(id) stubbed in Plan 01 (returns []); populated in Plan 02 against AlertRuleRepository.

security/ — Auth

  • JwtService — interface: createAccessToken, createRefreshToken, validateAccessToken, validateRefreshToken
  • Ed25519SigningService — interface: sign, getPublicKeyBase64 (config signing)
  • OidcConfig — record: enabled, issuerUri, clientId, clientSecret, rolesClaim, defaultRoles, autoSignup, displayNameClaim, userIdClaim, audience, additionalScopes
  • OidcConfigRepository — persistence interface
  • PasswordPolicyValidator — min 12 chars, 3-of-4 character classes, no username match
  • UserInfo, UserRepository — user identity records and persistence
  • InvalidTokenException — thrown on revoked/expired tokens
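
The PasswordPolicyValidator rules can be read as three independent checks. The sketch below is an interpretation, not the actual implementation: it assumes the four character classes are lowercase, uppercase, digit and "anything else", and that "no username match" means a case-insensitive equality check. The class name PasswordPolicySketch is illustrative.

```java
final class PasswordPolicySketch {
    private static final int MIN_LENGTH = 12;
    private static final int MIN_CLASSES = 3;

    private PasswordPolicySketch() { }

    static boolean isValid(String username, String password) {
        if (password == null || password.length() < MIN_LENGTH) {
            return false;
        }
        // Assumption: "no username match" is a case-insensitive equality check.
        if (username != null && password.equalsIgnoreCase(username)) {
            return false;
        }
        boolean lower = false;
        boolean upper = false;
        boolean digit = false;
        boolean other = false;
        for (char c : password.toCharArray()) {
            if (Character.isLowerCase(c)) {
                lower = true;
            } else if (Character.isUpperCase(c)) {
                upper = true;
            } else if (Character.isDigit(c)) {
                digit = true;
            } else {
                other = true;
            }
        }
        int classes = (lower ? 1 : 0) + (upper ? 1 : 0) + (digit ? 1 : 0) + (other ? 1 : 0);
        return classes >= MIN_CLASSES;
    }
}
```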

ingestion/ — Buffered data pipeline

  • IngestionService — diagram + metrics facade (ingestDiagram, acceptMetrics, getMetricsBuffer). Execution ingestion went through here via the legacy RouteExecution shape until ChunkAccumulator took over writes from the chunked pipeline — the ingestExecution path plus its ExecutionStore.upsert / upsertProcessors dependencies were removed.
  • ChunkAccumulator — batches data for efficient flush; owns the execution write path (chunks → buffers → flush scheduler → ClickHouseExecutionStore.insertExecutionBatch).
  • WriteBuffer — bounded ring buffer for async flush
  • BufferedLogEntry — log entry wrapper with metadata
  • MergedExecution, TaggedDiagram — tagged ingestion records. TaggedDiagram carries (instanceId, applicationId, environment, graph) — env is resolved from the agent registry in the controller and stamped on the ClickHouse route_diagrams row.