Two new assertions: license table has tenant_id/license_id/token/
installed_at/installed_by/expires_at/last_validated_at columns with
expected types + NOT NULL constraints, PK on tenant_id; environments
has execution_retention_days/log_retention_days/metric_retention_days
all integer NOT NULL DEFAULT 1.
Note: V5 migration does not include an installed_via column; the
plan's spec was aspirational. Test asserts what the migration
actually creates (and what PostgresLicenseRepository reads/writes).
OpenAPI regen (Step 35.2) deferred to session end — requires running
backend + UI dev server.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Install license with max_log_retention_days=30, env.configured=60 →
effective=30; verify ClickHouse logs table reflects toIntervalDay(30).
Replace with max=7 → effective=7; verify TTL recomputed. Polls
system.tables.create_table_query up to 5s for the @Async listener
to apply.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Five @Nested cap surfaces (envs, apps, outbound, alert rules, users)
share a single synthetic license with cap=1 each. Each test pushes
just past the cap and verifies the standard 403 envelope plus a
cap_exceeded audit row. Per-limit ITs cover full per-cap behavior;
this IT catches accidental wire-rip regressions across all caps.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
End-to-end IT covering the full lifecycle: mint a token via
cameleer-license-minter (test-scope), POST it via /api/v1/admin/license,
verify state=ACTIVE, clear gate, revalidate from PG, verify state restored.
Plus: tampered signature -> 400 + LICENSE/FAILURE audit row, gate not
mutated to ACTIVE.
Adds cameleer-license-minter as a test-scope dep on cameleer-server-app
(verified absent from runtime/compile classpaths). Also disables the
default spring-boot:repackage execution on the minter pom so the main
artifact stays as a plain library JAR consumable as a Maven dependency
(the cli classifier still produces the executable jar).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
cameleer_license_state{state=...} (one-hot per LicenseState),
cameleer_license_days_remaining (negative when ABSENT/INVALID),
cameleer_license_last_validated_age_seconds. Refreshed on
LicenseChangedEvent and every 60s.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Returns state, expiresAt/daysRemaining, lastValidatedAt, message
(LicenseMessageRenderer.forState), and a limits[] array where each
entry carries key/current/cap/source ("license" vs "default"). Adds
public AgentRegistryService.liveCount() so max_agents can be reported
from the in-memory registry.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
GET returns {state, invalidReason, envelope, lastValidatedAt}. POST
delegates to licenseService.install(token, userId, "api") so install
goes through audit + persistence + event publish. Removes the inline
LicenseValidator construction from the controller.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@Scheduled(cron = "0 0 3 * * *") triggers svc.revalidate() daily.
@EventListener(ApplicationReadyEvent.class) @Async fires once 60s
after boot to catch ABSENT->ACTIVE transitions if the license was
written to PG between server starts. Exceptions are logged but never
propagate.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@EventListener fires on every license install/replace/expire. For each
environment, computes effective TTL = min(licenseCap, env.configured)
and emits one ALTER TABLE ... MODIFY TTL ... per (table, env). Tables
covered: executions, processor_executions, logs, agent_metrics,
agent_events. ClickHouse failures are logged but do not propagate
(listener is async-tolerant).
route_diagrams is intentionally excluded -- it has no TTL clause in
init.sql (ReplacingMergeTree keyed on content_hash, not time-series).
server_metrics is also excluded -- it has no environment column
(server straddles environments).
Per-environment TTL via WHERE requires ClickHouse 22.3+; the project's
current image (clickhouse/clickhouse-server:24.12) is well above that
floor.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds three int fields to the Environment record + repository row mapper,
matching the columns added in V5. Default value is 1 per the V5 NOT NULL
DEFAULT 1. Read DTO surfaces the fields via Jackson record serialization;
setter endpoint deferred to a follow-up that wires the corresponding
license cap checks.
The canonical constructor enforces >= 1 for each retention field — V5
guarantees this at the DB level, but the runtime guard catches in-memory
construction errors (e.g., test sites that pass 0).
Test sites updated to the 12-arg signature with retention defaults of 1.
EnvironmentAdminControllerIT gains a regression test asserting the wire
shape exposes all three fields.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Returns 422 UNPROCESSABLE_ENTITY when jarRetentionCount exceeds
license cap. Default tier cap = 3. The other three retention caps
(execution/log/metric retention days) are deferred to T26+ where
the corresponding fields are added to Environment.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds ComputeUsage record + computeUsage() helper to LicenseUsageReader
that aggregates from PG. DeploymentExecutor.executeAsync runs three
assertWithinCap checks (max_total_cpu_millis, max_total_memory_mb,
max_total_replicas) right after config resolution. The existing
executor try/catch turns a LicenseCapExceededException into a FAILED
deployment with the cap message in the failure reason.
Adds ComputeCapEnforcementIT (HTTP-driven; @MockBean RuntimeOrchestrator,
since cap rejection short-circuits before any orchestrator call) plus
defensive license lifts in BlueGreenStrategyIT, RollingStrategyIT,
DeploymentSnapshotIT, and DeploymentControllerAuditIT so sequential
deploys under testcontainer reuse don't trip the new caps.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds AlertRuleRepository.count() and a LicenseEnforcer.assertWithinCap
call at the top of the POST handler. Default cap = 2; the 3rd rule
gets the standard 403 envelope. Sibling alert ITs that legitimately
need more than 2 rules get the cap lifted via the test-license helper.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds LicenseEnforcer.assertWithinCap call at the top of create() using
repo.listByTenant(tenantId).size() as the current count. Lifts the cap
in OutboundConnectionAdminControllerIT (duplicateNameReturns409 needs
2 creates in one test). LicenseExceptionAdvice maps the rejection to
the standard 403 envelope; cap_exceeded audit row emitted via the
LicenseEnforcer 3-arg ctor.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Wires LicenseEnforcer into UserAdminController.createUser and
OidcAuthController auto-signup. Cap fires before any validation so
over-cap creates short-circuit cheaply. Audit emission already
present (LicenseEnforcer 3-arg ctor from T16 emits cap_exceeded
under AuditCategory.LICENSE).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds a CreateGuard to AgentRegistryService that fires only on NEW
registrations: re-registers of an existing agent bypass the cap (they
don't grow the registry, and rejecting them would orphan an agent that
already counts against the cap). Live-only count for cap enforcement —
STALE/DEAD/SHUTDOWN agents are excluded so the cap reflects the working
fleet, not historical residue.
Reuses the CreateGuard pattern from T18-T19. The global
LicenseExceptionAdvice maps the resulting LicenseCapExceededException to
403 with the structured envelope — no AgentRegistrationController
changes needed.
AgentCapEnforcementIT exercises the HTTP path end-to-end: two registers
succeed at cap=2, a third returns 403 with the expected envelope, and a
re-register of an already-registered agent succeeds at-cap.
Sibling agent-registering ITs (Agent*ControllerIT, Diagram*IT,
Execution*IT, Search*IT, Protocol*IT, Backpressure*IT, JwtRefresh*IT,
Registration*IT, Security*IT, SseSigning*IT, IngestionSchemaIT) lift
max_agents in @BeforeEach and clear the synthetic license in @AfterEach
— the in-memory registry is shared across @SpringBootTest reuse
boundaries, so without the lift the default-tier max_agents=5 would be
exhausted by accumulated test residue.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds CreateGuard hook to AppService.createApp using the same pattern
as T18 (EnvironmentService). AppRepository.count() added; the bean
wires LicenseEnforcer.assertWithinCap("max_apps", current, 1).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Makes the signature-bypass loud at every call site since T19-T25 will
copy this pattern 5+ more times. The helper still loads via
LicenseGate.load() directly (no signature check) — the new name
ensures any future caller has to acknowledge that.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds CreateGuard functional interface to core (preserves the no-Spring
boundary between core and app) and wires LicenseEnforcer into the
EnvironmentService bean in RuntimeBeanConfig so POST
/api/v1/admin/environments rejects with the structured 403 envelope
(error/limit/cap/state/message) once the cap is reached. Default tier
max_environments=1; the V1 baseline seeds the default env, so the very
next create through the API is rejected unless a license lifts the cap.
Also adds EnvironmentRepository.count() (with PostgresEnvironmentRepository
impl), TestSecurityHelper.installTestLicenseWithCaps(...) so existing ITs
that POST envs keep working, and a defensive cleanup in
LicenseUsageReaderIT/EnvironmentAdminControllerIT to stay
order-independent under Testcontainer reuse (deletes deployments+apps
before envs to avoid FK violations).
Test: EnvironmentCapEnforcementIT (new) drives the rejection path
end-to-end and asserts the 403 body shape produced by
LicenseExceptionAdvice.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
One COUNT per entity table; one SUM-grouped query over non-stopped
deployments for compute caps. SQL traverses
deployed_config_snapshot->'containerConfig' (corrected from the
plan's top-level path; the snapshot record nests containerConfig
under that key). agentCount is fed in by the controller since it's
an in-memory registry value, not a DB row.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two follow-ups to LicenseEnforcer review:
- Add @Autowired to the 3-arg ctor so Spring picks it unambiguously
(the 2-arg test ctor is otherwise an equally-greedy candidate).
- Wrap audit.log() in try/catch + log.warn so a degraded audit DB
cannot mask a cap rejection: callers still see HTTP 403 even when
audit storage is unhealthy.
- Extract counter name to private static final.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
assertWithinCap consults LicenseGate.getEffectiveLimits, throws
LicenseCapExceededException on overflow, increments
cameleer_license_cap_rejections_total{limit=...} for telemetry, and
emits an AuditCategory.LICENSE cap_exceeded audit row when an
AuditService is wired (3-arg ctor; the test-only 2-arg ctor passes
null and the audit call short-circuits). Unknown limit keys are
programmer errors (IllegalArgumentException), not 403s.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
When username is empty (e.g. email-registered OIDC users with no display
name), the badge was hidden entirely, making logout inaccessible. Always
render the badge with fallback text "Account".
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
LicenseCapExceededException + @ControllerAdvice mapping to 403 with a
body that includes state, limit, current, cap, and a per-state human
message templated by LicenseMessageRenderer (covers ABSENT/ACTIVE/
GRACE/EXPIRED/INVALID with day counts and reason). Adds the forState()
overload now (used by the /usage endpoint in Task 30) so both surfaces
share identical phrasing.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Resume point for the next session executing the License Enforcement
plan. Captures: 14 done commit SHAs, what works/doesn't end-to-end,
critical plan deviations (AuditService.log API; LicenseInfo.label
not tier; throwaway-keypair fallback validator; ClickHouse TTL WHERE
caveat for T27), batching strategy, and suggested next-task order.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
LicenseBootLoader @PostConstruct calls LicenseService.loadInitial,
which delegates to install() so env-var/file/DB paths share a single
audit + event-publish code path. A missing public key now produces
an always-failing validator (constructed with a throwaway keypair so
the parent ctor accepts it) so loaded tokens route to INVALID
instead of being silently ignored.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Single mediation point for token install/replace/revalidate. Audits
under AuditCategory.LICENSE, persists to PG, mutates the LicenseGate,
and publishes LicenseChangedEvent so downstream listeners
(RetentionPolicyApplier, LicenseMetrics) react uniformly.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
JdbcTemplate-backed repo; upsert is ON CONFLICT (tenant_id), touch
updates only last_validated_at, delete is provided for future
operator-clear flow (not exposed as REST in v1).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Per-tenant license row stores the signed token, licenseId for audit,
installed/expires/last_validated timestamps. environments gains three
INTEGER NOT NULL DEFAULT 1 retention columns (execution, log, metric)
so existing rows land inside the default-tier cap.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds --verify (requires --public-key) to LicenseMinterCli. After
writing the output file the CLI parses the freshly-minted token
through LicenseValidator against the supplied public key. On
verify failure the output file is deleted (so the bad token is
not accidentally shipped) and the CLI exits 3.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Reads PEM or base64 PKCS#8 Ed25519 private key, maps --max-foo-bar
flags to max_foo_bar limit keys, parses --expires as a UTC date,
defaults --grace-days to 0. Unknown flags fail fast with exit 2.
--verify path is added in the next task.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Pure signing primitive: serialises LicenseInfo to canonical JSON
(sorted top-level keys via ORDER_MAP_ENTRIES_BY_KEYS plus a TreeMap
for the limits sub-object) then signs with Ed25519. Round-trips
through LicenseValidator and is byte-stable across runs for
identical inputs.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Top-level module sibling to cameleer-server-core/-app. Depends on
cameleer-server-core for the LicenseInfo schema. Spring Boot
repackage produces a runnable -cli classifier for the vendor.
Not added as a dependency from cameleer-server-app — runtime tree
must not carry signing primitives.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
LicenseGate now exposes getState() (delegates to LicenseStateMachine),
getEffectiveLimits() (merged over DefaultTierLimits in ACTIVE/GRACE,
defaults-only in ABSENT/EXPIRED/INVALID), markInvalid(reason), and
clear(). Internal snapshot is an immutable record-like class swapped
atomically so concurrent reads see a consistent license+reason pair.
Removes the transient openSentinel() and getTier() introduced by
earlier tasks (no production consumers).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Pure-domain FSM (ABSENT/ACTIVE/GRACE/EXPIRED/INVALID) and the
default-tier constants per spec §3. invalidReason wins over any
loaded license so signature failures surface as INVALID rather
than masking as ABSENT.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Spec §2.1 — both fields are required and the validator rejects a
token whose tenantId does not match the server's configured tenant
(CAMELEER_SERVER_TENANT_ID). Self-hosted customers cannot strip
tenantId because the field is in the signed payload.
LicenseBeanConfig and LicenseAdminController updated to pass the
expected tenant to the validator constructor. The transient
placeholder/TODO from Task 2 is removed.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Required fields per spec §2.1. tenantId is non-blank; gracePeriodDays
defines the post-exp window during which limits keep applying.
isExpired() now honours the grace; isAfterRawExpiry() distinguishes
ACTIVE from GRACE for the state machine in Task 4.
Validator and gate use placeholder values temporarily; Task 3 wires
the validator to read the new fields, Task 5 rewrites the gate.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Spec §9 — feature flags are out of scope for license enforcement.
Drops Feature.java, LicenseGate.isEnabled, LicenseInfo.hasFeature,
and the corresponding test cases. LicenseValidator now silently
ignores any features array on the wire (no error).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Add INVALID state to FSM (signature/tenant/parse failure ≠ ABSENT)
with loud UI/audit/metric severity; ABSENT stays a calm state.
- Make tenantId required in the license envelope (it's already inside
the signed payload, so a self-hosted customer cannot strip it).
- Move ClickHouse TTL recompute from boot-only to a
RetentionPolicyApplier @EventListener(LicenseChangedEvent), so a
long-running server that lands in EXPIRED tightens TTL automatically.
- Add LicenseRevalidationJob (daily) that re-runs signature check
against the DB row and updates last_validated_at; transitions to
INVALID on failure (catches public-key rotation drift).
- Add last_validated_at column to the license table, surfaced on the
/usage endpoint and as cameleer_license_last_validated_age_seconds.
- Enrich enforcement-failure responses and the /usage endpoint with a
per-state human-readable message so 403s and the UI both explain
WHY caps changed.
- Add --verify (with --public-key) to the minter CLI to round-trip a
freshly-minted token through LicenseValidator before shipping it,
deleting the output file on verify failure.
- Add corresponding tests, telemetry gauge, and a runtime-recompute IT.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Captures the agreed design for enforcing licensing on cameleer-server:
- Default tier with hard caps when no license is configured
- Arbitrary per-customer limits in signed Ed25519 license tokens
- Standalone cameleer-license-minter module (vendor-only)
- DB-persisted license with env/file override paths
- ABSENT/ACTIVE/GRACE/EXPIRED state machine; offline expiry only
- Removes the dead Feature enum scaffolding
Pending writing-plans.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Surfaces the multi-tenant container hardening contract introduced in the
prior commit so operators and integrators know what is enforced and why.
- application.yml: declare `cameleer.server.runtime.dockerruntime`
alongside the other runtime properties (empty = auto-detect runsc).
- HOWTO.md: add the override row to the Runtime config table.
- SERVER-CAPABILITIES.md: new "Multi-Tenant Runtime Sandboxing" section
describing the cap_drop, no-new-privileges, AppArmor, read-only rootfs,
pids_limit, /tmp tmpfs, and runsc auto-detect contract — plus the
on-disk state caveat that motivates issue #153.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Tenant JARs are arbitrary user code: Camel ships components (camel-exec,
camel-bean, MVEL/Groovy templating) that turn a header into shell, and
Java 17 has no SecurityManager — the JVM is not a security boundary.
This applies an unconditional hardening contract to every tenant
container so a single runc CVE no longer equals host takeover.
DockerRuntimeOrchestrator.startContainer now sets:
- cap_drop ALL (Capability.values() — docker-java has no ALL constant)
- security_opt: no-new-privileges, apparmor=docker-default
(default seccomp profile applies implicitly)
- read_only rootfs, pids_limit=512
- /tmp tmpfs rw,nosuid,size=256m — no noexec, since Netty/Snappy/LZ4/Zstd
dlopen native libs from /tmp via mmap(PROT_EXEC) which noexec blocks
The orchestrator also probes `docker info` at construction and uses runsc
(gVisor) automatically when the daemon has it registered. Override via
cameleer.server.runtime.dockerruntime (e.g. "kata"); empty = auto.
Outbound TCP, DNS, and TLS are unaffected — caps/seccomp don't gate
those — so vanilla Camel-Kafka producers/consumers and REST integrations
keep working unchanged. Stateful tenants (Kafka Streams with on-disk
state stores, apps writing to /var/log/...) need explicit writeable
volumes; that's tracked in #153 as the natural follow-up.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Live-pushed config fields (taps, tapVersion, tracedProcessors,
routeRecording) apply via SSE CONFIG_UPDATE — they take effect on
running agents without a redeploy and are fetched on agent restart
from application_config. They must not contribute to the
"pending deploy" diff against the last-successful-deployment snapshot.
Before this fix, applying a tap from the process diagram correctly
rolled out in real time but then marked the app "Pending Deploy (1)"
because DirtyStateCalculator compared every agentConfig field. This
also contradicted the UI rule (ui.md) that the live tabs "never mark
dirty".
Adds taps, tapVersion, tracedProcessors, routeRecording to
AGENT_CONFIG_IGNORED_KEYS. Updates the nested-path test to use a
staged field (sensitiveKeys) and adds a new test asserting that
divergent live-push fields keep dirty=false.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Add a StatusDot + colored Badge next to the app name in the deployment
page header, showing the latest deployment's status (RUNNING / STARTING
/ FAILED / STOPPED / DEGRADED / STOPPING). The existing "Pending
deploy" badge now carries a tooltip explaining *why*: either a list of
local unsaved edits, or a per-field diff against the last successful
deploy's snapshot (field, staged vs deployed values). When server-side
differences exist, the badge shows the count.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>