Commit Graph

257 Commits

Author SHA1 Message Date
hsiegeln
f772e868e6 docs: correct loader-network reachability claim; refresh HOWTO env vars
Final-review must-fixes:
- HOWTO.md: drop CAMELEER_SERVER_RUNTIME_JARDOCKERVOLUME; add the three new
  artifact env vars (loaderimage / artifacttokenttlseconds / artifactbaseurl).
- DeploymentExecutor @PostConstruct WARN, handoff doc, and docker-orchestration
  rule no longer claim the loader uses cameleer-traefik. The loader runs on
  the PRIMARY Docker network only — additional networks are attached after
  startContainer returns, by which time the loader has exited. SaaS still
  works because the tenant's primary network hosts the tenant server.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-27 17:13:56 +02:00
hsiegeln
cc076b1923 fix(runtime): pre-pull loader image, plug volume-leak windows, document network dep
Pre-pull the loader image at PULL_IMAGE so the implicit pull on first
createContainerCmd doesn't bypass the 120s loader-wait timeout.

Wrap createAndStartLoader in try/catch so a create/start failure cleans
up the just-created volume; same guard around createAndStartMain on
phase-2 failures. Folds the wait-error message into the rethrown
RuntimeException so the cause chain is visible.

Add a @PostConstruct WARN when neither artifactbaseurl nor serverurl is
set so the implicit cameleer-server DNS dependency is loud at boot, and
document the loader-to-server reachability contract in
.claude/rules/docker-orchestration.md.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-27 16:26:35 +02:00
hsiegeln
1ddae94930 feat(runtime): init-container loader pattern + withUsernsMode (#152 hardening close)
Tasks 9+10+11 of the init-container-jar-fetch plan, landed atomically because
9 alone leaves the orchestrator+executor referencing removed ContainerRequest
fields.

ContainerRequest (core) drops jarPath/jarVolumeName/jarVolumeMountPath; adds
appVersionId, artifactDownloadUrl, artifactExpectedSize, loaderImage.

DockerRuntimeOrchestrator (app):
  - per-replica named volume "cameleer-jars-{containerName}"
  - phase 1: loader container with the volume mounted RW at /app/jars,
    ARTIFACT_URL + ARTIFACT_EXPECTED_SIZE env, full hardening contract
  - block on waitContainerCmd().awaitStatusCode(120s); on non-zero exit
    remove the loader, remove the volume, propagate RuntimeException so
    DeploymentExecutor marks the deployment FAILED. main is never created.
  - phase 2: main container with the same volume mounted RO at /app/jars
  - withUsernsMode("host:1000:65536") on BOTH containers — closes the last
    open hardening gap from issue #152
  - main entrypoint paths point at /app/jars/app.jar
  - extracted baseHardenedHostConfig() so loader and main share the
    cap_drop / security_opt / readonly / pids / tmpfs contract
  - removeContainer() also removes the per-replica volume so blue/green
    doesn't leak volumes

DeploymentExecutor (app):
  - injects ArtifactDownloadTokenSigner; new @Value props loaderimage,
    artifacttokenttlseconds, artifactbaseurl
  - replaces the temporary getVersion(...).jarPath() bridge with a signed
    URL ${artifactBaseUrl}/api/v1/artifacts/{id}?exp&sig
  - drops the Files.exists pre-flight check; AppVersion.jarSizeBytes is
    the size-of-record check now
  - drops jarDockerVolume / jarStoragePath @Value fields and the volume
    plumbing in startReplica
  - DeployCtx carries appVersionId / artifactUrl / artifactExpectedSize
    in place of jarPath

Tests:
  - DockerRuntimeOrchestratorHardeningTest updated for the new shape;
    captures HostConfig on the MAIN container and asserts cap_drop ALL
    + no-new-privileges + apparmor + readonly + pids + tmpfs + the new
    withUsernsMode("host:1000:65536")
  - DockerRuntimeOrchestratorLoaderTest (new): verifies volume create →
    loader create with RW bind → loader started → awaited → loader
    removed → main create with RO bind → main started; verifies abort
    + cleanup on loader exit != 0 (loader removed, volume removed, main
    NEVER created); verifies userns_mode applied to both containers.

Config:
  - application.yml replaces jardockervolume with loaderimage,
    artifacttokenttlseconds, artifactbaseurl

Rules updated: .claude/rules/docker-orchestration.md (loader pattern,
userns, no more bind-mount); .claude/rules/core-classes.md
(ContainerRequest field map).

Test counts after change:
  - cameleer-server-core: 116/116 unit tests pass
  - cameleer-server-app: 273/273 unit tests pass

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-27 16:06:56 +02:00
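The two-phase start described in 1ddae94930 can be sketched as follows. This is a minimal illustration, not the project's DockerRuntimeOrchestrator: RuntimeSketch, RecordingRuntime, the "-loader" name suffix, and all method names are hypothetical stand-ins for the Docker client calls. Only the volume name pattern, the RW/RO mounts, the 120s wait, and the cleanup-on-failure order come from the commit message.

```java
import java.util.ArrayList;
import java.util.List;

final class TwoPhaseStartSketch {
    /** Returns the main container id; throws after cleanup if the loader fails. */
    static String start(RuntimeSketch rt, String containerName) {
        String volume = "cameleer-jars-" + containerName; // per-replica named volume
        rt.createVolume(volume);
        // Phase 1: loader writes the JAR into the volume (mounted read-write).
        String loader = rt.createAndStart(containerName + "-loader", volume, false);
        int exit = rt.awaitExit(loader, 120);
        rt.remove(loader);
        if (exit != 0) {
            // Loader failed: drop the volume and never create the main container.
            rt.removeVolume(volume);
            throw new RuntimeException("loader exited with status " + exit);
        }
        // Phase 2: main container sees the same volume read-only.
        return rt.createAndStart(containerName, volume, true);
    }

    public static void main(String[] args) {
        RecordingRuntime rt = new RecordingRuntime(0);
        start(rt, "app-0");
        rt.calls.forEach(System.out::println);
    }
}

/** Hypothetical stand-in for the Docker client; not the docker-java API. */
interface RuntimeSketch {
    void createVolume(String volume);
    void removeVolume(String volume);
    String createAndStart(String name, String volume, boolean readOnly);
    int awaitExit(String containerId, int timeoutSeconds);
    void remove(String containerId);
}

/** In-memory fake that records the call order, for illustration only. */
final class RecordingRuntime implements RuntimeSketch {
    final List<String> calls = new ArrayList<>();
    private final int loaderExit;

    RecordingRuntime(int loaderExit) { this.loaderExit = loaderExit; }

    public void createVolume(String v) { calls.add("createVolume " + v); }
    public void removeVolume(String v) { calls.add("removeVolume " + v); }
    public String createAndStart(String name, String v, boolean readOnly) {
        calls.add("start " + name + (readOnly ? " ro" : " rw"));
        return name; // the fake uses the name as the container id
    }
    public int awaitExit(String id, int timeoutSeconds) {
        calls.add("await " + id);
        return loaderExit;
    }
    public void remove(String id) { calls.add("remove " + id); }
}
```

Constructing RecordingRuntime with a non-zero exit code exercises the abort path: the loader and volume are removed and the main container is never started.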
hsiegeln
940bf18aba refactor(web): authoritative Content-Length, typed Optional<AppVersion> in controller 2026-04-27 15:47:37 +02:00
hsiegeln
433155ae0c feat(web): add ArtifactDownloadController with HMAC URL auth
New permitAll endpoint GET /api/v1/artifacts/{appVersionId}?exp&sig that
the cameleer-runtime-loader init container hits to stream the deployed
JAR. Auth is the HMAC-signed URL (sig + exp) — no JWT, no bootstrap
token — so SecurityConfig permits the path and the controller does the
verification itself.

Also hardens ArtifactDownloadTokenSigner to reject null/blank jwtSecret
at construction (Task 6 review feedback I-3).

Wires the ArtifactDownloadTokenSigner bean in SecurityBeanConfig from
${cameleer.server.security.jwtsecret}, the same property the rest of
the security stack uses.

Test coverage: 200/401/404 paths via standalone-MockMvc unit test
(avoids dragging in WebConfig's audit + usage interceptors that pull
the full bean graph) plus the existing signer suite extended with a
null/blank-secret guard test.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-27 15:36:28 +02:00
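The HMAC URL auth introduced in 433155ae0c can be illustrated with a small signer. This is a sketch under stated assumptions, not the project's ArtifactDownloadTokenSigner: the class name, the "id:exp" canonical string, HmacSHA256, the base64url encoding, and the exp=/sig= query format are all assumptions. Only the blank-secret rejection, the exp + sig pair, and the use of MessageDigest.isEqual are taken from the commit messages.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import java.util.Base64;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

final class SignedUrlSketch {
    private final byte[] key;

    SignedUrlSketch(String secret) {
        // Reject null/blank secrets at construction, as the commit describes.
        if (secret == null || secret.isBlank()) {
            throw new IllegalArgumentException("secret must not be null or blank");
        }
        this.key = secret.getBytes(StandardCharsets.UTF_8);
    }

    /** Signs "appVersionId:exp" (assumed canonical form) and returns a base64url string. */
    String sign(String appVersionId, long expEpochSeconds) {
        try {
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(new SecretKeySpec(key, "HmacSHA256"));
            byte[] raw = mac.doFinal(
                    (appVersionId + ":" + expEpochSeconds).getBytes(StandardCharsets.UTF_8));
            return Base64.getUrlEncoder().withoutPadding().encodeToString(raw);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** True only if the URL has not expired and the signature matches. */
    boolean verify(String appVersionId, long expEpochSeconds, String sig, long nowEpochSeconds) {
        if (sig == null || nowEpochSeconds > expEpochSeconds) {
            return false;
        }
        byte[] expected = sign(appVersionId, expEpochSeconds).getBytes(StandardCharsets.UTF_8);
        return MessageDigest.isEqual(expected, sig.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        SignedUrlSketch signer = new SignedUrlSketch("example-secret");
        long exp = 2_000L;
        String sig = signer.sign("42", exp);
        // Assumed query format; the source only shows "?exp&sig".
        System.out.println("/api/v1/artifacts/42?exp=" + exp + "&sig=" + sig);
        System.out.println(signer.verify("42", exp, sig, 1_000L));
    }
}
```

Because the signature covers both the id and the expiry, changing either query parameter invalidates the URL without any server-side token store.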
hsiegeln
73e06d8164 test(web): cover constant-time compare path in HMAC verify
Existing rejectsTamperedSignature uses len+1 sig — short-circuits in
MessageDigest.isEqual on length mismatch. Same-length tamper test
forces the byte-by-byte compare so the constant-time branch is
exercised.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-27 15:30:13 +02:00
hsiegeln
25bbd759d0 feat(web): add HMAC token signer for artifact downloads 2026-04-27 15:25:57 +02:00
hsiegeln
d90cd5ef2d test(retention): cover deployed-version-skip; preserve stack on delete failure 2026-04-27 15:23:07 +02:00
hsiegeln
4abcc610d5 refactor(retention): JarRetentionJob deletes via ArtifactStore 2026-04-27 15:17:17 +02:00
hsiegeln
6b7b5ae1ff docs(runtime): mark DeploymentExecutor jarPath as Task-11 bridge
Tactical filesystem-path read of the AppVersion locator survives until the
loader init-container lands — flagged inline so future readers don't read
the staging step as steady state.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-27 15:11:27 +02:00
hsiegeln
07a2fd6090 refactor(core): AppService writes via ArtifactStore; remove resolveJarPath
Task 4 of the init-container JAR fetch plan: migrate AppService.uploadJar
off direct filesystem writes onto the ArtifactStore abstraction so future
backends (OCI/Zot, S3) can swap in without touching service or controller
code.

- AppService constructor now takes (AppRepository, AppVersionRepository,
  ArtifactStore, tenantId[, CreateGuard]). The store owns layout and the
  locator string written into app_versions.jar_path.
- uploadJar buffers the request body once for hashing + storage, then
  writes a scratch temp file solely for RuntimeDetector (which still
  takes a Path); scratch is unconditionally deleted in finally.
- Add coordinatesFor(AppVersion) helper so downstream callers (Task 5+)
  can derive ArtifactCoordinates without knowing the tenant binding.
- Remove resolveJarPath. DeploymentExecutor now reads jarPath directly
  off the AppVersion record; the clean cut to download-URL delivery
  lands in Task 11.
- RuntimeBeanConfig wires a FilesystemArtifactStore bean rooted at
  cameleer.server.runtime.jarstoragepath and threads tenantId into the
  AppService bean.
2026-04-27 15:05:40 +02:00
hsiegeln
5238c58dd5 refactor(storage): clean up tmp on put failure; promote DirectoryNotEmptyException import
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-27 14:59:25 +02:00
hsiegeln
5eb07f5047 fix(storage): atomic put + tolerate DirectoryNotEmptyException in delete 2026-04-27 14:55:38 +02:00
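One common way to get the behaviour named in 5eb07f5047 is write-to-temp followed by an atomic rename. The sketch below assumes that approach; the commit does not show how FilesystemArtifactStore actually does it. The class and method names are hypothetical, and the idea that delete prunes the parent directory is an assumption used to show where a DirectoryNotEmptyException could be tolerated.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.DirectoryNotEmptyException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

final class AtomicPutSketch {
    /** Writes to a temp file in the target directory, then renames it into place. */
    static void put(Path target, byte[] content) throws IOException {
        Files.createDirectories(target.getParent());
        Path tmp = Files.createTempFile(target.getParent(), "put-", ".tmp");
        try {
            Files.write(tmp, content);
            // Readers see either no file or the complete file, never a partial write.
            Files.move(tmp, target, StandardCopyOption.ATOMIC_MOVE);
        } finally {
            // No-op after a successful move; removes the temp file on failure.
            Files.deleteIfExists(tmp);
        }
    }

    /** Deletes the file and prunes the parent directory if it is now empty (assumed layout). */
    static void delete(Path target) throws IOException {
        Files.deleteIfExists(target);
        try {
            Files.deleteIfExists(target.getParent());
        } catch (DirectoryNotEmptyException ignored) {
            // Sibling artifacts still live here; leaving the directory is fine.
        }
    }

    /** Small self-contained demo: put, read back, delete. */
    static String roundTrip(String text) {
        try {
            Path dir = Files.createTempDirectory("artifact-sketch");
            Path target = dir.resolve("v1").resolve("app.jar");
            put(target, text.getBytes(StandardCharsets.UTF_8));
            String read = new String(Files.readAllBytes(target), StandardCharsets.UTF_8);
            delete(target);
            Files.deleteIfExists(dir);
            return read;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("hello"));
    }
}
```

Creating the temp file in the same directory as the target keeps the rename on one filesystem, which is what makes ATOMIC_MOVE possible.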
hsiegeln
bc8bd590a6 feat(storage): add FilesystemArtifactStore (one impl of ArtifactStore) 2026-04-27 14:48:42 +02:00
hsiegeln
83a10de497 fix(auth): close same-ms revocation race + tidy audit cleanup
Bumps token_revoked_before by 1ms so a JWT issued in the same millisecond
as a logout call (Date.from(Instant.now()) quantises iat to ms) does not
survive the filter's strict isBefore check.

Also extends LogoutControllerIT @AfterEach to delete the audit_log row,
keeping reused Postgres containers clean for downstream ITs.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-27 09:26:05 +02:00
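The race closed by 83a10de497 can be reproduced in a few lines. This sketch assumes the filter accepts a token unless its iat is strictly before the stored cutoff, and that both values end up at millisecond resolution; the class and method names are hypothetical.

```java
import java.time.Instant;
import java.time.temporal.ChronoUnit;

final class RevocationRaceSketch {
    /** Strict check as described in the commit: revoked only if iat is before the cutoff. */
    static boolean isRevoked(Instant issuedAt, Instant revokedBefore) {
        return revokedBefore != null && issuedAt.isBefore(revokedBefore);
    }

    /** Cutoff without the fix: the logout instant at millisecond resolution. */
    static Instant naiveCutoff(Instant logoutTime) {
        return logoutTime.truncatedTo(ChronoUnit.MILLIS);
    }

    /** Cutoff with the fix: bumped by 1 ms so a same-millisecond token falls before it. */
    static Instant bumpedCutoff(Instant logoutTime) {
        return naiveCutoff(logoutTime).plusMillis(1);
    }

    public static void main(String[] args) {
        Instant sameMs = Instant.ofEpochMilli(1_000L); // token iat and logout share this ms
        System.out.println("naive cutoff revokes:  " + isRevoked(sameMs, naiveCutoff(sameMs)));
        System.out.println("bumped cutoff revokes: " + isRevoked(sameMs, bumpedCutoff(sameMs)));
    }
}
```

With equal timestamps the strict comparison is false, so the token survives; adding one millisecond to the cutoff turns the same comparison true without loosening the check to isBefore-or-equal.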
hsiegeln
9031533077 feat(auth): add POST /auth/logout that revokes all user tokens
Bumps users.token_revoked_before = now() for the calling user, audited
under AuditCategory.AUTH. Best-effort: returns 204 even when the request
is unauthenticated, so the SPA can call it on every logout regardless of
token state. Token-rejection is enforced by the existing
JwtAuthenticationFilter revocation check (fixed in 7066795c).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-27 09:21:47 +02:00
hsiegeln
b4c6e45d35 test(auth): JwtRevocationIT cleanup + unrevoked-token coverage
Adds @AfterEach to delete the test users so Testcontainers reuse does
not leak an authenticated user with a future token_revoked_before into
the shared schema (visible to LicenseUsageReader.snapshot, user-admin
listing tests, etc.). Adds unrevokedUserTokenIsAccepted to pin the
revoked == null no-op branch as a first-class assertion.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-27 09:18:10 +02:00
hsiegeln
7066795c3c fix(auth): strip user: prefix before token-revocation lookup
JwtAuthenticationFilter compared the JWT subject (user:alice) against
users.user_id (bare alice), so token_revoked_before was never read for
any user. Strips the prefix to match the convention documented in
CLAUDE.md. Adds JwtRevocationIT as a regression.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-27 09:11:55 +02:00
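The mismatch fixed in 7066795c3c comes down to one normalisation step before the lookup. A minimal sketch, with a hypothetical helper name and assuming subjects without the prefix should pass through unchanged:

```java
final class SubjectSketch {
    private static final String USER_PREFIX = "user:";

    /** Maps a JWT subject such as "user:alice" to the bare user id "alice". */
    static String userIdFromSubject(String subject) {
        if (subject != null && subject.startsWith(USER_PREFIX)) {
            return subject.substring(USER_PREFIX.length());
        }
        return subject; // assumption: unprefixed subjects are returned as-is
    }

    public static void main(String[] args) {
        System.out.println(userIdFromSubject("user:alice"));
    }
}
```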
hsiegeln
858975f03f refactor(license): extract cameleer-license-api module from server-core
Splits the pure license contract types (LicenseInfo, LicenseValidator,
LicenseState, LicenseStateMachine, LicenseLimits, DefaultTierLimits) into a
new cameleer-license-api module under package com.cameleer.license.

Why: cameleer-license-minter previously depended on cameleer-server-core for
these types, dragging cameleer-server-core + cameleer-common onto the
classpath of every minter consumer (notably cameleer-saas). The SaaS
management plane has no business carrying server-runtime types — it only
needs the license contract to mint and verify tokens.

After:
  cameleer-license-minter -> cameleer-license-api  (no server internals)
  cameleer-server-core    -> cameleer-license-api
  cameleer-saas           -> cameleer-license-minter -> cameleer-license-api

Verified: mvn -pl cameleer-license-minter dependency:tree shows the minter
no longer pulls cameleer-server-core or cameleer-common. Full reactor
verify (-DskipITs) green: 371 tests pass.

LicenseGate stays in server-core (server-runtime state holder, not contract).

Closes cameleer/cameleer-server#156

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-26 20:06:52 +02:00
hsiegeln
45b5f473c9 refactor(auth): post-review tidy — drop @NotNull, refresh e2e comment, use oidc.primary
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-26 19:48:20 +02:00
hsiegeln
af53eca7f6 test(auth): tighten AuthCapabilitiesControllerIT — drop redundant stub, add coverage gaps 2026-04-26 19:17:05 +02:00
hsiegeln
4f6e7ea4dc feat(auth): AuthCapabilitiesController — GET /api/v1/auth/capabilities
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-26 19:10:17 +02:00
hsiegeln
2f7c6aa005 fix(auth): @NotNull on AuthCapabilitiesResponse.Oidc.providerName 2026-04-26 18:59:20 +02:00
hsiegeln
f945d10d48 feat(auth): AuthCapabilitiesResponse DTO 2026-04-26 18:57:09 +02:00
hsiegeln
ddb18c4f17 feat(auth): OidcProviderNameDeriver — issuer URI → display label
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-26 18:53:31 +02:00
hsiegeln
581dc1ad13 test(license): SchemaBootstrapIT — assert V5 license + retention columns
Two new assertions: license table has tenant_id/license_id/token/
installed_at/installed_by/expires_at/last_validated_at columns with
expected types + NOT NULL constraints, PK on tenant_id; environments
has execution_retention_days/log_retention_days/metric_retention_days
all integer NOT NULL DEFAULT 1.

Note: V5 migration does not include an installed_via column; the
plan's spec was aspirational. Test asserts what the migration
actually creates (and what PostgresLicenseRepository reads/writes).

OpenAPI regen (Step 35.2) deferred to session end — requires running
backend + UI dev server.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-26 16:13:50 +02:00
hsiegeln
e198c13e8a test(license): RetentionRuntimeRecomputeIT — TTL recompute on license change
Install license with max_log_retention_days=30, env.configured=60 →
effective=30; verify ClickHouse logs table reflects toIntervalDay(30).
Replace with max=7 → effective=7; verify TTL recomputed. Polls
system.tables.create_table_query up to 5s for the @Async listener
to apply.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-26 16:08:28 +02:00
hsiegeln
1e78439ddd test(license): LicenseEnforcementIT — cross-cap smoke regression net
Five @Nested cap surfaces (envs, apps, outbound, alert rules, users)
share a single synthetic license with cap=1 each. Each test pushes
just past the cap and verifies the standard 403 envelope plus a
cap_exceeded audit row. Per-limit ITs cover full per-cap behavior;
this IT catches accidental wire-rip regressions across all caps.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-26 16:00:50 +02:00
hsiegeln
1a307da6b2 test(license): LicenseLifecycleIT — install/persist/revalidate/reject
End-to-end IT covering the full lifecycle: mint a token via
cameleer-license-minter (test-scope), POST it via /api/v1/admin/license,
verify state=ACTIVE, clear gate, revalidate from PG, verify state restored.
Plus: tampered signature -> 400 + LICENSE/FAILURE audit row, gate not
mutated to ACTIVE.

Adds cameleer-license-minter as a test-scope dep on cameleer-server-app
(verified absent from runtime/compile classpaths). Also disables the
default spring-boot:repackage execution on the minter pom so the main
artifact stays as a plain library JAR consumable as a Maven dependency
(the cli classifier still produces the executable jar).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-26 15:56:01 +02:00
hsiegeln
885f2be16b feat(license): Prometheus gauges for state + days remaining
cameleer_license_state{state=...} (one-hot per LicenseState),
cameleer_license_days_remaining (negative when ABSENT/INVALID),
cameleer_license_last_validated_age_seconds. Refreshed on
LicenseChangedEvent and every 60s.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-26 15:43:54 +02:00
hsiegeln
945ecd78cf feat(license): LicenseUsageController GET /api/v1/admin/license/usage
Returns state, expiresAt/daysRemaining, lastValidatedAt, message
(LicenseMessageRenderer.forState), and a limits[] array where each
entry carries key/current/cap/source ("license" vs "default"). Adds
public AgentRegistryService.liveCount() so max_agents can be reported
from the in-memory registry.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-26 15:42:39 +02:00
hsiegeln
3f69e546e4 refactor(license): LicenseAdminController delegates to LicenseService
GET returns {state, invalidReason, envelope, lastValidatedAt}. POST
delegates to licenseService.install(token, userId, "api") so install
goes through audit + persistence + event publish. Removes the inline
LicenseValidator construction from the controller.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-26 15:34:07 +02:00
hsiegeln
340d954fed feat(license): LicenseRevalidationJob — daily cron + 60s post-startup
@Scheduled(cron = "0 0 3 * * *") triggers svc.revalidate() daily.
@EventListener(ApplicationReadyEvent.class) @Async fires once 60s
after boot to catch ABSENT->ACTIVE transitions if the license was
written to PG between server starts. Exceptions are logged but never
propagate.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-26 15:32:33 +02:00
hsiegeln
484a55f4f4 feat(license): RetentionPolicyApplier listens on LicenseChangedEvent
@EventListener fires on every license install/replace/expire. For each
environment, computes effective TTL = min(licenseCap, env.configured)
and emits one ALTER TABLE ... MODIFY TTL ... per (table, env). Tables
covered: executions, processor_executions, logs, agent_metrics,
agent_events. ClickHouse failures are logged but do not propagate
(listener is async-tolerant).

route_diagrams is intentionally excluded -- it has no TTL clause in
init.sql (ReplacingMergeTree keyed on content_hash, not time-series).
server_metrics is also excluded -- it has no environment column
(server straddles environments).

Per-environment TTL via WHERE requires ClickHouse 22.3+; the project's
current image (clickhouse/clickhouse-server:24.12) is well above that
floor.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-26 15:28:42 +02:00
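The effective-TTL rule from 484a55f4f4 is a simple minimum, applied once per covered table. The sketch below shows only that computation; the class and method names are hypothetical, the ALTER TABLE statement is deliberately left out because the commit elides it, and applying one cap to every table is a simplification for illustration.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class EffectiveTtlSketch {
    /** Tables the commit lists as covered by the retention applier. */
    static final List<String> TABLES = List.of(
            "executions", "processor_executions", "logs", "agent_metrics", "agent_events");

    /** effective TTL = min(licenseCap, env.configured), in days. */
    static int effectiveDays(int licenseCapDays, int configuredDays) {
        return Math.min(licenseCapDays, configuredDays);
    }

    /** One effective value per covered table (same cap for all tables: a simplification). */
    static Map<String, Integer> plan(int licenseCapDays, int configuredDays) {
        Map<String, Integer> result = new LinkedHashMap<>();
        for (String table : TABLES) {
            result.put(table, effectiveDays(licenseCapDays, configuredDays));
        }
        return result;
    }

    public static void main(String[] args) {
        // Values from the RetentionRuntimeRecomputeIT description: cap 30 vs configured 60.
        System.out.println(plan(30, 60));
    }
}
```

The license can only shorten retention: an environment configured below the cap keeps its own value.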
hsiegeln
cc5d88d708 feat(license): surface execution/log/metric retention days on Environment
Adds three int fields to the Environment record + repository row mapper,
matching the columns added in V5. Default value is 1 per the V5 NOT NULL
DEFAULT 1. Read DTO surfaces the fields via Jackson record serialization;
setter endpoint deferred to a follow-up that wires the corresponding
license cap checks.

The canonical constructor enforces >= 1 for each retention field — V5
guarantees this at the DB level, but the runtime guard catches in-memory
construction errors (e.g., test sites that pass 0).

Test sites updated to the 12-arg signature with retention defaults of 1.
EnvironmentAdminControllerIT gains a regression test asserting the wire
shape exposes all three fields.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-26 15:22:40 +02:00
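The runtime guard described in cc5d88d708 maps naturally onto a record's compact canonical constructor. The sketch below uses a hypothetical three-field record rather than the real 12-argument Environment record; only the three retention fields and the >= 1 rule come from the commit.

```java
record EnvironmentRetentionSketch(int executionRetentionDays,
                                  int logRetentionDays,
                                  int metricRetentionDays) {

    /** Compact canonical constructor: runs before the fields are assigned. */
    EnvironmentRetentionSketch {
        requireAtLeastOne("executionRetentionDays", executionRetentionDays);
        requireAtLeastOne("logRetentionDays", logRetentionDays);
        requireAtLeastOne("metricRetentionDays", metricRetentionDays);
    }

    private static void requireAtLeastOne(String name, int value) {
        if (value < 1) {
            throw new IllegalArgumentException(name + " must be >= 1, got " + value);
        }
    }

    public static void main(String[] args) {
        System.out.println(new EnvironmentRetentionSketch(1, 1, 1));
    }
}
```

Since every construction path of a record goes through the canonical constructor, a test site that passes 0 fails immediately instead of producing an in-memory value the database would never allow.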
hsiegeln
046f08fe87 feat(license): enforce max_jar_retention_count at PUT jar-retention
Returns 422 UNPROCESSABLE_ENTITY when jarRetentionCount exceeds
license cap. Default tier cap = 3. The other three retention caps
(execution/log/metric retention days) are deferred to T26+ where
the corresponding fields are added to Environment.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-26 15:16:04 +02:00
hsiegeln
56bddcc747 feat(license): enforce compute caps at DeploymentExecutor PRE_FLIGHT
Adds ComputeUsage record + computeUsage() helper to LicenseUsageReader
that aggregates from PG. DeploymentExecutor.executeAsync runs three
assertWithinCap checks (max_total_cpu_millis, max_total_memory_mb,
max_total_replicas) right after config resolution. The existing
executor try/catch turns a LicenseCapExceededException into a FAILED
deployment with the cap message in the failure reason.

Adds ComputeCapEnforcementIT (HTTP-driven; @MockBean RuntimeOrchestrator,
since cap rejection short-circuits before any orchestrator call) plus
defensive license lifts in BlueGreenStrategyIT, RollingStrategyIT,
DeploymentSnapshotIT, and DeploymentControllerAuditIT so sequential
deploys under testcontainer reuse don't trip the new caps.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-26 15:09:39 +02:00
hsiegeln
71f3b70b86 feat(license): enforce max_alert_rules at AlertRuleController.create
Adds AlertRuleRepository.count() and a LicenseEnforcer.assertWithinCap
call at the top of the POST handler. Default cap = 2; the 3rd rule
gets the standard 403 envelope. Sibling alert ITs that legitimately
need more than 2 rules get the cap lifted via the test-license helper.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-26 14:50:59 +02:00
hsiegeln
5a579415a1 feat(license): enforce max_outbound_connections at OutboundConnectionServiceImpl.create
Adds LicenseEnforcer.assertWithinCap call at the top of create() using
repo.listByTenant(tenantId).size() as the current count. Lifts the cap
in OutboundConnectionAdminControllerIT (duplicateNameReturns409 needs
2 creates in one test). LicenseExceptionAdvice maps the rejection to
the standard 403 envelope; cap_exceeded audit row emitted via the
LicenseEnforcer 3-arg ctor.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-26 14:40:12 +02:00
hsiegeln
1ff30905f7 feat(license): enforce max_users at user creation paths
Wires LicenseEnforcer into UserAdminController.createUser and
OidcAuthController auto-signup. Cap fires before any validation so
over-cap creates short-circuit cheaply. Audit emission already
present (LicenseEnforcer 3-arg ctor from T16 emits cap_exceeded
under AuditCategory.LICENSE).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-26 14:29:54 +02:00
hsiegeln
afdaee628b feat(license): enforce max_agents at AgentRegistryService.register
Adds a CreateGuard to AgentRegistryService that fires only on NEW
registrations: re-registers of an existing agent bypass the cap (they
don't grow the registry, and rejecting them would orphan an agent that
already counts against the cap). Live-only count for cap enforcement —
STALE/DEAD/SHUTDOWN agents are excluded so the cap reflects the working
fleet, not historical residue.

Reuses the CreateGuard pattern from T18-T19. The global
LicenseExceptionAdvice maps the resulting LicenseCapExceededException to
403 with the structured envelope — no AgentRegistrationController
changes needed.

AgentCapEnforcementIT exercises the HTTP path end-to-end: two registers
succeed at cap=2, a third returns 403 with the expected envelope, and a
re-register of an already-registered agent succeeds at-cap.

Sibling agent-registering ITs (Agent*ControllerIT, Diagram*IT,
Execution*IT, Search*IT, Protocol*IT, Backpressure*IT, JwtRefresh*IT,
Registration*IT, Security*IT, SseSigning*IT, IngestionSchemaIT) lift
max_agents in @BeforeEach and clear the synthetic license in @AfterEach
— the in-memory registry is shared across @SpringBootTest reuse
boundaries, so without the lift the default-tier max_agents=5 would be
exhausted by accumulated test residue.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-26 14:19:08 +02:00
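The registration rule from afdaee628b (guard new registrations only, count live agents only) can be sketched with an in-memory map. The CreateGuard signature, the registry class, and the IllegalStateException are hypothetical; the real code throws LicenseCapExceededException through LicenseEnforcer.

```java
import java.util.HashMap;
import java.util.Map;

final class AgentRegistrySketch {
    private final Map<String, Boolean> agents = new HashMap<>(); // agentId -> is live
    private final CreateGuard guard;

    AgentRegistrySketch(CreateGuard guard) { this.guard = guard; }

    /** Guard fires only for ids not yet in the registry; re-registers bypass it. */
    void register(String agentId) {
        if (!agents.containsKey(agentId)) {
            guard.check(liveCount());
        }
        agents.put(agentId, true);
    }

    /** Marks an agent as no longer live, so it stops counting against the cap. */
    void markStale(String agentId) { agents.replace(agentId, false); }

    long liveCount() {
        return agents.values().stream().filter(live -> live).count();
    }

    /** Hypothetical guard: rejects when one more agent would exceed the cap. */
    static CreateGuard capGuard(long cap) {
        return current -> {
            if (current + 1 > cap) {
                throw new IllegalStateException("max_agents exceeded: cap=" + cap);
            }
        };
    }

    public static void main(String[] args) {
        AgentRegistrySketch registry = new AgentRegistrySketch(capGuard(2));
        registry.register("a");
        registry.register("b");
        registry.register("a"); // re-register at cap: allowed
        System.out.println(registry.liveCount());
    }
}

/** Hypothetical shape of the guard hook; the real interface lives in core. */
@FunctionalInterface
interface CreateGuard {
    void check(long currentCount);
}
```

Keeping the guard as a plain functional interface lets the registry stay free of license types while the wiring code decides which cap applies.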
hsiegeln
80dafe685b feat(license): enforce max_apps at AppService.createApp
Adds CreateGuard hook to AppService.createApp using the same pattern
as T18 (EnvironmentService). AppRepository.count() added; the bean
wires LicenseEnforcer.assertWithinCap("max_apps", current, 1).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-26 13:36:34 +02:00
hsiegeln
198811b752 refactor(license-test): rename installTestLicenseWithCaps -> installSyntheticUnsignedLicense
Makes the signature-bypass loud at every call site since T19-T25 will
copy this pattern 5+ more times. The helper still loads via
LicenseGate.load() directly (no signature check) — the new name
ensures any future caller has to acknowledge that.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-26 13:24:58 +02:00
hsiegeln
8a64a9e04c feat(license): enforce max_environments at EnvironmentService.create
Adds CreateGuard functional interface to core (preserves the no-Spring
boundary between core and app) and wires LicenseEnforcer into the
EnvironmentService bean in RuntimeBeanConfig so POST
/api/v1/admin/environments rejects with the structured 403 envelope
(error/limit/cap/state/message) once the cap is reached. Default tier
max_environments=1; the V1 baseline seeds the default env, so the very
next create through the API is rejected unless a license lifts the cap.

Also adds EnvironmentRepository.count() (with PostgresEnvironmentRepository
impl), TestSecurityHelper.installTestLicenseWithCaps(...) so existing ITs
that POST envs keep working, and a defensive cleanup in
LicenseUsageReaderIT/EnvironmentAdminControllerIT to stay
order-independent under Testcontainer reuse (deletes deployments+apps
before envs to avoid FK violations).

Test: EnvironmentCapEnforcementIT (new) drives the rejection path
end-to-end and asserts the 403 body shape produced by
LicenseExceptionAdvice.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-26 13:16:41 +02:00
hsiegeln
f291d7c24d feat(license): LicenseUsageReader aggregates current usage
One COUNT per entity table; one SUM-grouped query over non-stopped
deployments for compute caps. SQL traverses
deployed_config_snapshot->'containerConfig' (corrected from the
plan's top-level path; the snapshot record nests containerConfig
under that key). agentCount is fed in by the controller since it's
an in-memory registry value, not a DB row.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-26 12:47:59 +02:00
hsiegeln
9b9b56043c fix(license): explicit @Autowired ctor + tolerate audit failures
Three follow-ups to LicenseEnforcer review:
- Add @Autowired to the 3-arg ctor so Spring picks it unambiguously
  (the 2-arg test ctor is otherwise an equally-greedy candidate).
- Wrap audit.log() in try/catch + log.warn so a degraded audit DB
  cannot mask a cap rejection: callers still see HTTP 403 even when
  audit storage is unhealthy.
- Extract counter name to private static final.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-26 12:43:27 +02:00
hsiegeln
4985348827 feat(license): LicenseEnforcer single entry point
assertWithinCap consults LicenseGate.getEffectiveLimits, throws
LicenseCapExceededException on overflow, increments
cameleer_license_cap_rejections_total{limit=...} for telemetry, and
emits an AuditCategory.LICENSE cap_exceeded audit row when an
AuditService is wired (3-arg ctor; the test-only 2-arg ctor passes
null and the audit call short-circuits). Unknown limit keys are
programmer errors (IllegalArgumentException), not 403s.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-26 12:36:58 +02:00
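The contract described in 4985348827 (reject on overflow, count rejections, treat unknown keys as programmer errors) can be sketched without Spring or Micrometer. Everything here is a hypothetical stand-in: the class names, the plain map used in place of LicenseGate.getEffectiveLimits, the map-based counter, and the reading of the three arguments as (limit key, current count, requested increase), which is inferred from later call sites in this log.

```java
import java.util.HashMap;
import java.util.Map;

final class EnforcerSketch {
    private final Map<String, Integer> limits;                        // stand-in for effective limits
    private final Map<String, Integer> rejections = new HashMap<>();  // stand-in for the counter

    EnforcerSketch(Map<String, Integer> limits) { this.limits = Map.copyOf(limits); }

    /** Throws when current + delta would exceed the cap for the given limit key. */
    void assertWithinCap(String limit, long current, long delta) {
        Integer cap = limits.get(limit);
        if (cap == null) {
            // Unknown keys are bugs in the caller, not license rejections.
            throw new IllegalArgumentException("unknown limit: " + limit);
        }
        if (current + delta > cap) {
            rejections.merge(limit, 1, Integer::sum);
            throw new CapExceededSketch(limit, current, cap);
        }
    }

    int rejectionCount(String limit) { return rejections.getOrDefault(limit, 0); }

    public static void main(String[] args) {
        EnforcerSketch enforcer = new EnforcerSketch(Map.of("max_alert_rules", 2));
        enforcer.assertWithinCap("max_alert_rules", 1, 1); // second rule: allowed
        try {
            enforcer.assertWithinCap("max_alert_rules", 2, 1); // third rule: rejected
        } catch (CapExceededSketch e) {
            System.out.println(e.getMessage());
        }
    }
}

/** Hypothetical exception carrying the fields the 403 envelope would need. */
final class CapExceededSketch extends RuntimeException {
    final String limit;
    final long current;
    final int cap;

    CapExceededSketch(String limit, long current, int cap) {
        super(limit + " exceeded: current=" + current + ", cap=" + cap);
        this.limit = limit;
        this.current = current;
        this.cap = cap;
    }
}
```

Separating the two exception types keeps a typo in a limit key from ever surfacing to users as a 403.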
hsiegeln
2bad9c3e48 feat(license): cap-exceeded exception + state-aware message renderer
LicenseCapExceededException + @ControllerAdvice mapping to 403 with a
body that includes state, limit, current, cap, and a per-state human
message templated by LicenseMessageRenderer (covers ABSENT/ACTIVE/
GRACE/EXPIRED/INVALID with day counts and reason). Adds the forState()
overload now (used by the /usage endpoint in Task 30) so both surfaces
share identical phrasing.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-26 12:26:39 +02:00
hsiegeln
b95e80a24a feat(license): wire LicenseService into boot order (env > file > DB)
LicenseBootLoader @PostConstruct calls LicenseService.loadInitial,
which delegates to install() so env-var/file/DB paths share a single
audit + event-publish code path. A missing public key now produces
an always-failing validator (constructed with a throwaway keypair so
the parent ctor accepts it) so loaded tokens route to INVALID
instead of being silently ignored.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-26 11:16:49 +02:00
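The env > file > DB precedence from b95e80a24a amounts to taking the first source that yields a token. A minimal sketch, with hypothetical names and suppliers standing in for the real environment, file, and PostgreSQL lookups:

```java
import java.util.List;
import java.util.Optional;
import java.util.function.Supplier;

final class LicenseSourceSketch {
    /** Returns the token from the first source that has one, in list order. */
    static Optional<String> firstPresent(List<Supplier<Optional<String>>> sources) {
        for (Supplier<Optional<String>> source : sources) {
            Optional<String> token = source.get();
            if (token.isPresent()) {
                return token; // later sources are never consulted
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        Optional<String> token = firstPresent(List.of(
                () -> Optional.empty(),             // env var not set
                () -> Optional.of("file-token"),    // license file present
                () -> Optional.of("db-token")));    // stored row, shadowed by the file
        System.out.println(token.orElse("(none)"));
    }
}
```

Using suppliers keeps the lookups lazy, so a database read happens only when neither the environment variable nor the file provides a token.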
hsiegeln
6fbcf10ee4 feat(license): LicenseService + LicenseChangedEvent
Single mediation point for token install/replace/revalidate. Audits
under AuditCategory.LICENSE, persists to PG, mutates the LicenseGate,
and publishes LicenseChangedEvent so downstream listeners
(RetentionPolicyApplier, LicenseMetrics) react uniformly.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-26 11:11:48 +02:00