cameleer-server/docs/superpowers/specs/2026-04-25-license-enforcement-design.md
hsiegeln e0be6a069f docs(license): apply review feedback to enforcement design
- Add INVALID state to FSM (signature/tenant/parse failure ≠ ABSENT)
  with loud UI/audit/metric severity; ABSENT stays a calm state.
- Make tenantId required in the license envelope (it's already inside
  the signed payload, so a self-hosted customer cannot strip it).
- Move ClickHouse TTL recompute from boot-only to a
  RetentionPolicyApplier @EventListener(LicenseChangedEvent), so a
  long-running server that lands in EXPIRED tightens TTL automatically.
- Add LicenseRevalidationJob (daily) that re-runs signature check
  against the DB row and updates last_validated_at; transitions to
  INVALID on failure (catches public-key rotation drift).
- Add last_validated_at column to the license table, surfaced on the
  /usage endpoint and as cameleer_license_last_validated_age_seconds.
- Enrich enforcement-failure responses and the /usage endpoint with a
  per-state human-readable message so 403s and the UI both explain
  WHY caps changed.
- Add --verify (with --public-key) to the minter CLI to round-trip a
  freshly-minted token through LicenseValidator before shipping it,
  deleting the output file on verify failure.
- Add corresponding tests, telemetry gauge, and a runtime-recompute IT.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-26 09:42:16 +02:00


License Enforcement — Design

Date: 2026-04-25
Status: Approved (brainstorm); pending writing-plans
Related: cameleer-saas#7 (Epic: License & Feature Gating), cameleer-saas#42 (vendor minting), cameleer-saas#50 (customer license view)

Problem

cameleer-server ships a license skeleton (LicenseValidator, LicenseGate, admin endpoint) but nothing enforces anything. Open mode (no license configured) currently grants all features and no limits — the opposite of what we want for a self-hosted distribution that needs to gate scale behind a paid license.

We want:

  1. A self-hosted server with no license to operate within a small, hard-coded "default tier" that is enough to evaluate the product but not enough to run it in production.
  2. Licenses to express arbitrary per-customer limits (no fixed tiers) on a vendor-defined set of resources: entity counts, compute footprint, retention.
  3. A standalone minter owned by the vendor that signs licenses with an Ed25519 private key the customer never sees.
  4. Licenses to be persisted on the server, installable via env var, file, or admin POST, and renewable by replacement.
  5. Revocation handled out of band (vendor suspends the SaaS tenant, or issues short-exp licenses) — no online revocation callback in v1.

Non-goals

  • Feature flags. The current Feature enum (topology/lineage/correlation/debugger/replay) is dead scaffolding and gets removed; this design is about quantitative limits only.
  • Ingestion-rate limits (executions/minute, logs/minute). Defer to a follow-up.
  • Online revocation. Vendor uses shorter exp + reissue; SaaS suspension is independent.
  • Auto-deletion of resources when caps are lowered. Existing rows stay; only new creates reject.
  • Minter keypair generation tooling. Vendor uses standard openssl genpkey -algorithm ed25519 out of band.

1. Architecture

1.1 Module layout

cameleer-server-core/                    (existing — pure domain, no Spring)
└── license/
    ├── LicenseInfo                      (record — see §2)
    ├── LicenseLimits                    (typed wrapper over the limits map)
    ├── LicenseValidator                 (existing, payload schema updated)
    ├── LicenseGate                      (existing, gutted: no Feature; getLimits() only)
    ├── LicenseStateMachine              (NEW — pure FSM: ABSENT / ACTIVE / GRACE / EXPIRED / INVALID)
    └── DefaultTierLimits                (constant — §3.2 numbers)

cameleer-server-app/                     (existing — Spring, web, persistence)
├── license/
│   ├── LicenseRepository                (NEW — PostgreSQL persistence)
│   ├── LicenseService                   (NEW — load/save/replace; publishes LicenseChangedEvent)
│   ├── LicenseEnforcer                  (NEW — assertWithinCap entry point)
│   ├── LicenseUsageReader               (NEW — counts current usage for /usage endpoint)
│   ├── LicenseCapExceededException      (NEW — mapped to 403 by ControllerAdvice)
│   ├── LicenseRevalidationJob           (NEW — @Scheduled daily; updates last_validated_at)
│   ├── RetentionPolicyApplier           (NEW — @EventListener(LicenseChangedEvent); recomputes ClickHouse TTL + per-env caps)
│   └── LicenseMetrics                   (NEW — Prometheus gauges)
├── controller/
│   ├── LicenseAdminController           (existing — extended; persists, audited)
│   └── LicenseUsageController           (NEW — GET /admin/license/usage)
└── config/
    └── LicenseBeanConfig                (existing — extended for DB load order)

cameleer-license-minter/                 (NEW — top-level Maven module)
├── pom.xml                              (depends on cameleer-server-core)
├── LicenseMinter                        (signing primitive; takes private key + LicenseInfo)
└── cli/LicenseMinterCli                 (CLI main class, supports --verify)

1.2 Why a separate cameleer-license-minter module

Not shipped in the runtime JAR. Vendor distributes it independently or builds it from source on a trusted machine. Customers never receive it.

This is module hygiene plus a smaller runtime attack surface, not a cryptographic protection: forging a license requires the vendor's private key, and the public key in the server is enough to reject forged tokens regardless of where the minter code lives.

1.3 Dependency graph

cameleer-license-minter ──▶ cameleer-server-core (LicenseInfo schema; LicenseValidator for --verify)
cameleer-server-app     ──▶ cameleer-server-core (validator, gate, FSM, defaults)
cameleer-saas           ──▶ cameleer-license-minter (for SaaS-mode minting)
cameleer-saas           ──▶ cameleer-server-core   (transitive)

cameleer-server-app has no dependency on cameleer-license-minter.


2. License envelope

Wire format unchanged: base64(payload).base64(ed25519_signature). Payload schema:

{
  "licenseId": "550e8400-e29b-41d4-a716-446655440000",
  "tenantId": "acme-corp",
  "label": "ACME prod 2026 — site:hamburg",
  "iat": 1745539200,
  "exp": 1777075200,
  "gracePeriodDays": 30,
  "limits": {
    "max_environments": 5,
    "max_apps": 50,
    "max_agents": 100,
    "max_users": 25,
    "max_outbound_connections": 10,
    "max_alert_rules": 200,
    "max_total_cpu_millis": 32000,
    "max_total_memory_mb": 65536,
    "max_total_replicas": 100,
    "max_execution_retention_days": 90,
    "max_log_retention_days": 30,
    "max_metric_retention_days": 365,
    "max_jar_retention_count": 10
  }
}

2.1 Field rules

| Field | Required | Notes |
|---|---|---|
| licenseId | yes | UUID. Used in audit + future revocation. |
| tenantId | yes | Must match CAMELEER_SERVER_TENANT_ID. Mismatch = INVALID state (see §3). The field is inside the signed payload, so a self-hosted customer cannot strip it to make a license portable across tenants: any edit invalidates the signature. Air-gapped customers receive a license bound to a vendor-issued tenant id (not necessarily a UUID; any non-empty slug). |
| label | optional | Free-form human description. Surfaced in UI. |
| iat | yes | Unix seconds. |
| exp | yes | Unix seconds. |
| gracePeriodDays | optional, default 0 | Days exp may be in the past while limits still apply. |
| limits.* | each optional | Missing key inherits from DefaultTierLimits. A license can lift any subset. |

2.2 Removed from the current envelope

  • tier (string) — was a non-functional label. Folded into label.
  • features (array) — out of scope. Feature enum deleted.

3. License state machine

                                                      exp + grace passes
       ┌─────────┐  install valid  ┌────────┐  exp  ┌────────┐ ────────► ┌─────────┐
       │ ABSENT  │ ───────────────▶│ ACTIVE │──────▶│ GRACE  │           │ EXPIRED │
       └─────────┘                  └────────┘       └────────┘           └─────────┘
            ▲                            │              │ ▲                   │
            │                            │ replace      │ │ replace valid     │ replace
            │                            ▼              │ │                   ▼
            │  ┌─────────┐               └──────────────┴─┴───────────────────┘
            └──┤ INVALID │ ──── replace valid ────────────────────────────────▶ ACTIVE
               └─────────┘
                   ▲
                   │ install fails (signature / tenant / parse / public-key-missing)
                                              all transitions persist + audit-log

3.1 State semantics

| State | Effective limits | Trigger | Severity |
|---|---|---|---|
| ABSENT | DefaultTierLimits | No DB row. Clean install with no license configured. | INFO |
| ACTIVE | merge(default, license.limits) | License loaded, now < exp. | INFO |
| GRACE | Same as ACTIVE | exp ≤ now < exp + gracePeriodDays. UI warning banner. | WARN |
| EXPIRED | DefaultTierLimits | now ≥ exp + gracePeriodDays. UI label distinct from ABSENT. | ERROR |
| INVALID | DefaultTierLimits | Signature failure, tenant mismatch, parse error, or public key not configured but a token is present. | ERROR (loud) |

ABSENT and INVALID produce the same enforcement (default tier) but are surfaced very differently:

  • ABSENT is a clean state — fresh install, no license yet. UI shows a calm "Install a license to lift the default-tier caps" call to action. No audit row beyond the boot log line.
  • INVALID is an active error — tampering, wrong public key, or a paste that lost characters. UI shows a red banner with the validator's error message (e.g. "License signature verification failed", "License tenantId 'acme-corp' does not match server tenant 'beta-corp'"). Audit row written under AuditCategory.LICENSE action reject_license. Prometheus cameleer_license_state{state="INVALID"} = 1 so an alert can fire.

State is recomputed on every limit check (clock comparison only against parsed in-memory LicenseInfo) — no scheduler needed for ACTIVE → GRACE → EXPIRED transitions. A separate daily revalidation job (§6.6) re-runs the signature check against the DB row to catch slow failures like public-key rotation drift.
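The clock-only part of that recomputation can be sketched as a pure function. This is a minimal illustration, not the actual LicenseStateMachine: the class name, the two boolean inputs, and the method signature are assumptions made for the example.

```java
import java.time.Duration;
import java.time.Instant;

/** Illustrative sketch only; names and signature are assumptions, not the real FSM API. */
final class LicenseStateSketch {
    enum State { ABSENT, ACTIVE, GRACE, EXPIRED, INVALID }

    /**
     * @param licensePresent false when there is no DB row (ABSENT)
     * @param validationOk   false when signature/tenant/parse checks failed (INVALID)
     * @param exp            expiry instant from the payload; only read for a valid license
     */
    static State stateAt(Instant now, boolean licensePresent, boolean validationOk,
                         Instant exp, int gracePeriodDays) {
        if (!licensePresent) return State.ABSENT;
        if (!validationOk) return State.INVALID;
        if (now.isBefore(exp)) return State.ACTIVE;            // now < exp
        Instant graceEnd = exp.plus(Duration.ofDays(gracePeriodDays));
        if (now.isBefore(graceEnd)) return State.GRACE;        // exp <= now < exp + grace
        return State.EXPIRED;                                  // now >= exp + grace
    }
}
```

With gracePeriodDays = 0 the GRACE window is empty, so a license goes straight from ACTIVE to EXPIRED at exp.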

3.2 Default tier (the "no license" caps)

| Limit | Default |
|---|---|
| max_environments | 1 |
| max_apps | 3 |
| max_agents | 5 |
| max_users | 3 |
| max_outbound_connections | 1 |
| max_alert_rules | 2 |
| max_total_cpu_millis | 2000 (2 cores) |
| max_total_memory_mb | 2048 (2 GB) |
| max_total_replicas | 5 |
| max_execution_retention_days | 1 |
| max_log_retention_days | 1 |
| max_metric_retention_days | 1 |
| max_jar_retention_count | 3 |

Encoded as public static final Map<String, Integer> DEFAULTS in DefaultTierLimits. Keys match the license payload exactly.
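A minimal sketch of how merge(default, license.limits) from §3.1 could work over such a map. Only three of the keys are listed for brevity. The class name is hypothetical, and ignoring keys that have no default is an assumption of this example rather than something the design specifies.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustrative sketch only; not the real DefaultTierLimits or LicenseLimits. */
final class EffectiveLimitsSketch {
    // Subset of the default tier, for brevity.
    static final Map<String, Integer> DEFAULTS = Map.of(
            "max_environments", 1,
            "max_apps", 3,
            "max_agents", 5);

    /** A license value wins where present; every other key keeps its default. */
    static Map<String, Integer> merge(Map<String, Integer> licenseLimits) {
        Map<String, Integer> effective = new LinkedHashMap<>(DEFAULTS);
        for (Map.Entry<String, Integer> e : licenseLimits.entrySet()) {
            // Assumption: keys without a documented default are ignored.
            if (DEFAULTS.containsKey(e.getKey())) {
                effective.put(e.getKey(), e.getValue());
            }
        }
        return effective;
    }
}
```

A license carrying only max_apps = 50 would therefore yield max_apps = 50 while max_environments and max_agents stay at 1 and 5.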


4. Enforcement map

Every limit check goes through one method on LicenseEnforcer:

void assertWithinCap(String limitKey, long currentUsage, long requestedDelta);

Throws LicenseCapExceededException(limitKey, current, cap) when currentUsage + requestedDelta > cap. A @ControllerAdvice maps it to 403 with a body that explains the "why" so operators can act without grepping logs:

{
  "error": "license cap reached",
  "limit": "max_apps",
  "current": 3,
  "cap": 3,
  "state": "EXPIRED",
  "message": "License expired 5 days ago: system reverted to default tier (3 apps). Current usage is 3. Install or renew the license to create more apps."
}

The message field is rendered server-side from a small template per state:

| State | Message template |
|---|---|
| ABSENT | "No license installed: default tier applies (cap = N for {limit}). Install a license to raise this." |
| ACTIVE | "License cap reached: {limit} = {cap}. Current usage is {current}. Contact your vendor to raise the cap." |
| GRACE | "License expired {n} day(s) ago and is in its grace period (ends in {m} days). Cap unchanged at {cap}. Renew before grace ends." |
| EXPIRED | "License expired {n} days ago: system reverted to default tier (cap = N for {limit}). Current usage is {current}. Renew the license to lift the cap." |
| INVALID | "License rejected ({reason}): default tier applies (cap = N for {limit}). Fix the license to raise this." |
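The cap check itself is small. The sketch below assumes the enforcer is handed the already-merged effective limits as a map; the constructor and the handling of unknown keys are assumptions for illustration, and the 403 mapping and message rendering are left out.

```java
import java.util.Map;

/** Illustrative stand-in for the exception described above. */
final class LicenseCapExceededException extends RuntimeException {
    final String limitKey;
    final long current;
    final long cap;

    LicenseCapExceededException(String limitKey, long current, long cap) {
        super("license cap reached: " + limitKey + " (current=" + current + ", cap=" + cap + ")");
        this.limitKey = limitKey;
        this.current = current;
        this.cap = cap;
    }
}

/** Illustrative sketch only; the real LicenseEnforcer is a Spring bean with more context. */
final class LicenseEnforcerSketch {
    private final Map<String, Integer> effectiveLimits;

    LicenseEnforcerSketch(Map<String, Integer> effectiveLimits) {
        this.effectiveLimits = effectiveLimits;
    }

    void assertWithinCap(String limitKey, long currentUsage, long requestedDelta) {
        Integer configured = effectiveLimits.get(limitKey);
        if (configured == null) {
            // Assumption: an unknown key is a programming error, not a license problem.
            throw new IllegalArgumentException("unknown limit key: " + limitKey);
        }
        long cap = configured;
        if (currentUsage + requestedDelta > cap) {
            throw new LicenseCapExceededException(limitKey, currentUsage, cap);
        }
    }
}
```

Note that landing exactly on the cap is allowed: with a cap of 3, going from 2 to 3 apps passes and going from 3 to 4 is rejected.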

4.1 Per-limit call sites

| Limit | Call site | Failure response |
|---|---|---|
| max_environments | EnvironmentService.create (start) | 403 |
| max_apps | AppService.createApp | 403 |
| max_agents | AgentRegistryService.register | 403; agent treated as unregistered (no SSE, no commands) |
| max_users | UserAdminController.createUser and OidcAuthController.callback (auto-signup) | 403 / OIDC login failure |
| max_outbound_connections | OutboundConnectionServiceImpl.create | 403 |
| max_alert_rules | AlertRuleController.create | 403 |
| max_total_cpu_millis | DeploymentExecutor.PRE_FLIGHT (sum across non-stopped deploys + new) | Deploy fails fast at PRE_FLIGHT, status FAILED, audit row |
| max_total_memory_mb | same | same |
| max_total_replicas | same | same |
| max_execution_retention_days | EnvironmentService.update (per-env field, see §4.2) + RetentionPolicyApplier (see §4.3) | 422 on update; ClickHouse TTL recomputed on every license change |
| max_log_retention_days | same | same |
| max_metric_retention_days | same | same |
| max_jar_retention_count | EnvironmentAdminController.PUT /jar-retention | 422 |

4.2 Per-environment retention fields

Three new columns on environments (Flyway V2):

ALTER TABLE environments
  ADD COLUMN execution_retention_days INTEGER NOT NULL DEFAULT 1,
  ADD COLUMN log_retention_days       INTEGER NOT NULL DEFAULT 1,
  ADD COLUMN metric_retention_days    INTEGER NOT NULL DEFAULT 1;

These are the configured per-env values. The effective ClickHouse TTL is min(licenseCap, configured). Admin UI surfaces the configured values; EnvironmentService.update rejects values above the license cap with 422.
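As a small worked sketch of the two rules in this section: the effective value is the minimum of cap and configured value, and an update above the cap is refused. The class and method names are hypothetical, and the lower bound of one day is an assumption of the example.

```java
/** Illustrative sketch only; names are not taken from the codebase. */
final class RetentionSketch {
    /** Effective ClickHouse TTL in days: min(licenseCap, configured). */
    static int effectiveDays(int licenseCapDays, int configuredDays) {
        return Math.min(licenseCapDays, configuredDays);
    }

    /**
     * Update-time check: a requested value above the license cap is rejected
     * (the caller would map a false result to 422).
     * Assumption: values below one day are rejected as well.
     */
    static boolean isAllowed(int licenseCapDays, int requestedDays) {
        return requestedDays >= 1 && requestedDays <= licenseCapDays;
    }
}
```

For instance, an environment configured for 30 days under a 90-day cap has an effective TTL of 30 days; if the license expires and the cap falls back to the default of 1, the effective TTL becomes 1 day while the configured value stays at 30.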

4.3 Runtime retention recompute

RetentionPolicyApplier is @EventListener(LicenseChangedEvent):

  • Triggered on every LicenseService.replace(...) (boot install, env-var override, file override, POST /admin/license) and on every state transition the revalidation job detects (e.g. license becomes EXPIRED, caps drop to default).
  • Recomputes the effective TTL per env (min(licenseCap, configured)), then issues ALTER TABLE … MODIFY TTL … on the affected ClickHouse tables (executions, processors, logs, metrics, route_diagrams, agent_events). One ALTER per table per affected env.
  • Errors are logged WARN; a failed ALTER does not block the license install — the operator can retry by reposting the license. The previous TTL keeps applying until the next successful ALTER.
  • At boot, LicenseService.loadInitial(...) publishes one LicenseChangedEvent after the load order in §6.2 settles, so the boot path goes through the same applier as runtime changes.

Result: a server that stays up for months and lands in EXPIRED will see ClickHouse TTLs collapse to default-tier values automatically — no restart needed.

4.4 Boot-time invariant

If a license is added that lowers a cap below current usage (10 apps, license now allows 5), the server logs one WARN per limit at boot. No deletion. New creates reject; existing resources keep working.


5. Usage endpoint

GET /api/v1/admin/license/usage (ADMIN only):

{
  "state": "ACTIVE",
  "expiresAt": "2027-04-25T00:00:00Z",
  "daysRemaining": 365,
  "gracePeriodDays": 30,
  "tenantId": "acme-corp",
  "label": "ACME prod 2026",
  "lastValidatedAt": "2026-04-26T03:14:07Z",
  "message": "License active. 365 days remaining.",
  "limits": [
    {"key": "max_apps", "current": 7, "cap": 50, "source": "license"},
    {"key": "max_agents", "current": 12, "cap": 100, "source": "license"},
    {"key": "max_total_cpu_millis", "current": 8500, "cap": 32000, "source": "license"},
    {"key": "max_outbound_connections", "current": 0, "cap": 1, "source": "default"}
  ]
}

source is "default" when the cap comes from DefaultTierLimits (i.e. the license omits this key, or there is no license), and "license" when the cap is explicit in the license. Drives the SaaS UI's "free tier" badge.

message carries the same human-readable explanation that the 403 body uses, varying by state:

  • ABSENT — "No license installed. Default tier applies."
  • ACTIVE — "License active. {n} days remaining."
  • GRACE — "License expired {n} days ago. Grace period ends in {m} days. Renew now to avoid degradation."
  • EXPIRED— "License expired {n} days ago. System reverted to default tier."
  • INVALID— "License rejected: {reason}. Default tier applies. Fix the license to recover."

LicenseUsageReader issues one cheap aggregate per limit (SELECT COUNT(*) per entity table; a single grouped SELECT SUM(replicas * cpuMillis), SUM(replicas * memoryMb), SUM(replicas) over non-stopped deployments).
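The grouped compute aggregate can be illustrated in memory. This sketch mirrors the SUM expressions above over a list instead of SQL; the Deployment record shape and the "STOPPED" status string are assumptions made for the example.

```java
import java.util.List;

/** Illustrative sketch only; the real reader runs this as a single SQL aggregate. */
final class ComputeUsageSketch {
    // Hypothetical shapes; field names follow the SUM expressions in the text.
    record Deployment(String status, int replicas, int cpuMillis, int memoryMb) {}
    record Totals(long cpuMillis, long memoryMb, long replicas) {}

    static Totals sumNonStopped(List<Deployment> deployments) {
        long cpu = 0, mem = 0, reps = 0;
        for (Deployment d : deployments) {
            if ("STOPPED".equals(d.status())) continue;   // assumption: status literal
            cpu += (long) d.replicas() * d.cpuMillis();
            mem += (long) d.replicas() * d.memoryMb();
            reps += d.replicas();
        }
        return new Totals(cpu, mem, reps);
    }
}
```

The same totals feed both the /usage response and the PRE_FLIGHT check, where the new deployment's footprint is the requested delta.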

GET /api/v1/admin/license (existing) is extended to return {state, envelope, lastValidatedAt} with the raw token omitted from the response.


6. Lifecycle, persistence, install paths

6.1 Storage

Flyway V2 migration:

CREATE TABLE license (
  tenant_id         TEXT PRIMARY KEY,        -- one row per server (= one tenant)
  token             TEXT NOT NULL,           -- full signed token
  license_id        UUID NOT NULL,
  installed_at      TIMESTAMPTZ NOT NULL,
  installed_by      TEXT NOT NULL,           -- users.user_id (bare) or 'system' for env/file boot
  expires_at        TIMESTAMPTZ NOT NULL,
  last_validated_at TIMESTAMPTZ NOT NULL     -- updated by boot, install, and revalidation job
);

last_validated_at is the timestamp of the most recent successful signature/parse round-trip against the current public key. Useful for troubleshooting "why did my license stop working" — a stale last_validated_at next to a recent now is a strong signal that revalidation is failing and the operator should check the public key.

6.2 Boot order

LicenseBeanConfig:

  1. If CAMELEER_SERVER_LICENSE_TOKEN env var is set → validate → write to DB (overwrite, sets last_validated_at = now) → load.
  2. Else if CAMELEER_SERVER_LICENSE_FILE is set → read file → validate → write to DB → load.
  3. Else read license row from DB → validate → on success update last_validated_at = now → load.
  4. Else ABSENT.

After steps 1 to 3 the service publishes one LicenseChangedEvent so the retention applier and metrics gauges initialise off the same code path as runtime changes.

Env-var / file act as idempotent overrides — they always win and replace the DB row, so the operator's last action survives reboots.
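The precedence between the three sources can be sketched as a pure function. The types here are hypothetical, and the sketch deliberately leaves out validation, the DB write, and what happens when the winning token fails validation.

```java
import java.util.Optional;

/** Illustrative sketch of source precedence only; not the real LicenseBeanConfig. */
final class BootOrderSketch {
    enum Source { ENV, FILE, DB, NONE }

    // token is null when no source provided one (ABSENT).
    record Resolved(Source source, String token) {}

    static Resolved resolve(Optional<String> envToken,
                            Optional<String> fileToken,
                            Optional<String> dbToken) {
        if (envToken.isPresent()) return new Resolved(Source.ENV, envToken.get());
        if (fileToken.isPresent()) return new Resolved(Source.FILE, fileToken.get());
        if (dbToken.isPresent()) return new Resolved(Source.DB, dbToken.get());
        return new Resolved(Source.NONE, null);
    }
}
```

Because the env var and file are checked first, a token set there replaces whatever an earlier POST stored in the DB on the next boot.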

6.3 Runtime install

POST /api/v1/admin/license { "token": "..." } (existing):

  • Validates against the configured public key.
  • On success, persists to license table (installed_by = user_id, last_validated_at = now), updates the in-memory LicenseGate, publishes LicenseChangedEvent, audits.
  • On failure, returns 400 with the validator error message and audits the rejection. Server transitions to INVALID state if a previously-loaded license was replaced; otherwise remains in its prior state (the rejected token is not written to DB).

6.4 Public key custody

CAMELEER_SERVER_LICENSE_PUBLICKEY (existing) remains the only verification key. Build- / deploy-time secret bound to the vendor distribution. Not stored in DB. If unset and a license is present → reject all licenses (existing behaviour) → INVALID state.

6.5 Audit trail

New AuditCategory.LICENSE. Actions:

| Action | When | Payload |
|---|---|---|
| install_license | First successful install in an empty state | {licenseId, expiresAt, installedBy, source} (source = env/file/api) |
| replace_license | Successful install over an existing license | same + previousLicenseId |
| reject_license | Validation failed (signature, tenant, parse, public key missing) | {reason, source} |
| revalidate_license | Daily job result, on failure only | {licenseId, reason} |
| cap_exceeded | Any LicenseCapExceededException | {limit, current, cap, requestedBy, state} |

6.6 Daily revalidation job

LicenseRevalidationJob:

  • @Scheduled(cron = "0 0 3 * * *") (03:00 server local time) plus an initial run 60s after boot.
  • Reads the DB token, re-runs LicenseValidator.validate(token) against the current public key.
  • On success: UPDATE license SET last_validated_at = now WHERE tenant_id = ?.
  • On failure (e.g. operator rotated the public key without reinstalling the license, or DB row was tampered with directly): transition state to INVALID, publish LicenseChangedEvent (so retention recomputes too), audit revalidate_license with the reason, log ERROR.
  • Cheap (no I/O beyond one DB read + one DB write); safe to run frequently. 03:00 is chosen as an off-peak hour so any failure logging lands when humans aren't deploying.
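The core of one revalidation run, stripped of Spring, persistence, and auditing, might look like the sketch below. The Predicate stands in for LicenseValidator.validate(token), and all names are assumptions. It also shows why the age gauge grows when revalidation fails: last_validated_at is only touched on success.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.function.Predicate;

/** Illustrative sketch only; not the real LicenseRevalidationJob. */
final class RevalidationSketch {
    private final Predicate<String> validator;  // stand-in for the signature/parse check
    private Instant lastValidatedAt;
    private boolean invalid;

    RevalidationSketch(Predicate<String> validator, Instant lastValidatedAt) {
        this.validator = validator;
        this.lastValidatedAt = lastValidatedAt;
    }

    void runOnce(String dbToken, Instant now) {
        if (validator.test(dbToken)) {
            lastValidatedAt = now;   // success: refresh the timestamp
        } else {
            invalid = true;          // failure: flag INVALID, leave the timestamp stale
        }
    }

    boolean isInvalid() { return invalid; }

    /** Value behind cameleer_license_last_validated_age_seconds. */
    long lastValidatedAgeSeconds(Instant now) {
        return Duration.between(lastValidatedAt, now).getSeconds();
    }
}
```

Recovery from INVALID is intentionally not modelled here; in the design that happens by installing a valid license.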

7. Minter

7.1 LicenseMinter (library)

Pure function, packaged in cameleer-license-minter:

public final class LicenseMinter {
    public static String mint(LicenseInfo info, PrivateKey ed25519PrivateKey);
}

Serializes LicenseInfo to canonical JSON (sorted keys), signs the bytes with Ed25519, returns base64(payload).base64(signature). cameleer-saas calls this directly to mint per-tenant tokens.
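A minimal sketch of the sign and verify round trip using the JDK's built-in Ed25519 support (Java 15+). It is not the real LicenseMinter: it signs a pre-serialized payload string rather than a LicenseInfo, and the canonicalJson helper only handles a flat map of integer values, which is enough to show the sorted-keys idea. All names are illustrative.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PrivateKey;
import java.security.PublicKey;
import java.security.Signature;
import java.util.Base64;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

/** Illustrative sketch only; not the real LicenseMinter or LicenseValidator. */
final class MintSketch {
    /** Sorted keys, no whitespace. Assumes keys need no JSON escaping. */
    static String canonicalJson(Map<String, Integer> fields) {
        return new TreeMap<>(fields).entrySet().stream()
                .map(e -> "\"" + e.getKey() + "\":" + e.getValue())
                .collect(Collectors.joining(",", "{", "}"));
    }

    /** Returns base64(payload).base64(signature). */
    static String mint(String payloadJson, PrivateKey key) {
        try {
            byte[] payload = payloadJson.getBytes(StandardCharsets.UTF_8);
            Signature signer = Signature.getInstance("Ed25519");
            signer.initSign(key);
            signer.update(payload);
            Base64.Encoder b64 = Base64.getEncoder();
            return b64.encodeToString(payload) + "." + b64.encodeToString(signer.sign());
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    static boolean verify(String token, PublicKey key) {
        String[] parts = token.split("\\.", -1);
        if (parts.length != 2) return false;
        try {
            Signature verifier = Signature.getInstance("Ed25519");
            verifier.initVerify(key);
            verifier.update(Base64.getDecoder().decode(parts[0]));
            return verifier.verify(Base64.getDecoder().decode(parts[1]));
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Throwaway keypair, as a unit test would use. */
    static KeyPair throwawayKeyPair() {
        try {
            return KeyPairGenerator.getInstance("Ed25519").generateKeyPair();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

The "." separator is safe because the standard Base64 alphabet does not contain a dot. Sorting keys matters because the signature covers the exact payload bytes, so any reordering would change what is signed.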

7.2 LicenseMinterCli (CLI)

java -jar cameleer-license-minter-1.0-SNAPSHOT.jar \
     --private-key=/secure/vendor.key \
     --public-key=/secure/vendor.pub \
     --tenant=acme-corp \
     --label="ACME prod 2026" \
     --expires=2027-04-25 \
     --grace-days=30 \
     --max-apps=50 \
     --max-agents=100 \
     --max-total-cpu-millis=32000 \
     --max-total-memory-mb=65536 \
     --max-execution-retention-days=90 \
     --output=acme-license.tok \
     --verify
  • --private-key reads a PEM-encoded Ed25519 private key (output of openssl genpkey -algorithm ed25519).
  • --public-key (used only with --verify) reads the matching public key. Required when --verify is set; ignored otherwise.
  • Unspecified --max-* flags are omitted from the payload — the license inherits the default for that key.
  • Unknown flags fail fast.
  • --output writes the token; if omitted, prints to stdout.
  • --verify round-trips the freshly-minted token through LicenseValidator against --public-key after writing the output file. This catches:
    • corruption between String → file write,
    • wrong-key pairing (vendor accidentally pointed --public-key at a different keypair's public half),
    • signature mismatch from a buggy build of the minter.

    On verify failure the CLI exits non-zero, prints the validator error, and (if --output was written) deletes the output file so the bad token does not get shipped.
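The write, verify, delete sequence behind --verify could be sketched as follows. The Predicate stands in for the LicenseValidator call against --public-key; the class and method names are hypothetical, and exit-code handling is left to the caller.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.function.Predicate;

/** Illustrative sketch only; not the real LicenseMinterCli. */
final class VerifyOutputSketch {
    /**
     * Writes the token, reads it back from disk, and runs the verifier on what
     * was actually written. Returns true on success. On verify failure the file
     * is deleted and false is returned (the CLI would then exit non-zero).
     */
    static boolean writeAndVerify(Path output, String token, Predicate<String> verifier) {
        try {
            Files.writeString(output, token, StandardCharsets.UTF_8);
            String roundTripped = Files.readString(output, StandardCharsets.UTF_8);
            if (verifier.test(roundTripped)) {
                return true;
            }
            Files.deleteIfExists(output);   // do not leave a bad token behind
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Verifying the bytes read back from the file, rather than the in-memory string, is what lets the check catch corruption in the write path.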

Keypair generation is out of band — vendor uses openssl and stores both halves in their secret manager. We deliberately do not ship a --gen-keypair subcommand to keep the boundary clean.


8. Telemetry

Prometheus gauges scraped via /api/v1/prometheus:

| Metric | Labels | Notes |
|---|---|---|
| cameleer_license_state | state = one of ABSENT, ACTIVE, GRACE, EXPIRED, INVALID | 1 for the current state (see §3.1). |
| cameleer_license_days_remaining | (none) | Negative in GRACE/EXPIRED. |
| cameleer_license_limit_utilisation | limit="max_apps" etc. | current / cap, in [0, 1+]. |
| cameleer_license_cap_rejections_total | limit="..." | Counter. |
| cameleer_license_last_validated_age_seconds | (none) | now - last_validated_at. Spikes if the daily revalidation job is failing. |

State-transition log lines: INFO on install/ACTIVE, WARN on GRACE, ERROR on EXPIRED, ERROR on INVALID, WARN on cap reject (sampled to avoid log spam).

Recommended alert (in cameleer-saas Grafana, not shipped with the server): page on cameleer_license_state{state="INVALID"} == 1 for > 5 minutes.


9. Dead-code removal

Performed in the first commit of the implementation. Per the project's "no backwards compatibility shims" preference, no deprecated path or feature flag.

  • Delete Feature.java.
  • Delete LicenseGate.isEnabled(Feature).
  • Delete LicenseInfo.features field, LicenseInfo.hasFeature(Feature).
  • Delete LicenseGateTest.withLicense_onlyLicensedFeaturesEnabled and LicenseInfo.open()'s Set.of(Feature.values()) assertion.
  • Update LicenseValidator to ignore features if present in old tokens (silently dropped, not an error).

10. Testing

| Layer | Tests |
|---|---|
| Core unit | LicenseValidatorTest: signature, expiry, tenant mismatch, missing required fields (tenantId, licenseId, iat, exp), unknown extra fields. |
| Core unit | LicenseStateMachineTest: all five transitions including grace boundary, replace from any state, invalid install routes to INVALID, valid install from INVALID recovers to ACTIVE. |
| Core unit | DefaultTierLimitsTest: every documented key has a default. |
| Minter unit | LicenseMinterTest: round-trip with a throwaway Ed25519 keypair. Canonical JSON is stable across runs. |
| Minter CLI | LicenseMinterCliTest: invokes main with --private-key=tmp and checks output token validates; --verify happy path; --verify failure path deletes the output file and exits non-zero. |
| App unit | LicenseEnforcerTest: for each limit: cap-reached, under-cap, default-tier with no license, missing-cap-inherits-default, message text varies per state. |
| App unit | RetentionPolicyApplierTest: license-changed event recomputes effective TTL per env; failed ALTER logs WARN and does not throw. |
| App integration | LicenseLifecycleIT: install via env, replace via POST, restart restores from DB, public-key removal at runtime transitions to INVALID, daily revalidation job updates last_validated_at. Driven through REST. |
| App integration | LicenseEnforcementIT: REST-driven, hit each cap end-to-end (per the project's "REST-API-driven ITs" preference). Includes cap_exceeded audit row check and verifies the 403 body's message field matches the state. |
| App integration | RetentionRuntimeRecomputeIT: install license with max_log_retention_days=30, observe logs TTL ALTER fires; replace with max_log_retention_days=7, observe TTL drops to 7 without restart. |
| Boot | SchemaBootstrapIT extension: license table exists with last_validated_at, environments retention columns exist, retention pinning honoured at boot. |

No raw-SQL seeding of caps in ITs. All caps installed via the REST endpoint or env var.


11. Open follow-ups (deliberately deferred)

  • Ingestion-rate limits (max_executions_per_minute, max_logs_per_minute).
  • Online revocation callback (the revocation_check_url envelope field).
  • Concurrent debug session limit (max_concurrent_debug_sessions from the SaaS epic).
  • A "license usage history" report for vendors to see growth over time.
  • Open a tracking issue on cameleer/cameleer-server (Gitea) — none exists today.

12. Risk register

| Risk | Mitigation |
|---|---|
| Default tier so tight that an honest evaluator cannot try the product. | Defaults documented; vendor can ship a longer-exp "trial" license at install time if needed. |
| Customer lowers gracePeriodDays field by editing token. | Token is signed; any edit invalidates the signature. |
| License removed from DB out of band, server lands in ABSENT and rejects new resources but old ones are above default tier. | Boot-time WARN per over-cap limit. UI banner in the admin license page. No auto-deletion. |
| Public key rotation. | Out of scope for v1; documented as "redeploy with new key". Vendors are expected to rotate via redeployment. Daily revalidation job catches a rotation that wasn't paired with a reinstall (state → INVALID, alertable). |
| Compute cap arithmetic relies on cpuLimit and memoryLimitMb being set on every container. | Existing ResolvedContainerConfig already enforces these; DeploymentExecutor.PRE_FLIGHT rejects deploys with unset compute fields. |
| Per-env retention column added but old ClickHouse partitions retain longer. | Documented: TTL change is honoured by ClickHouse on its next merge cycle. New rows inserted always honour the new TTL. |
| RetentionPolicyApplier issues blocking ALTERs from the event listener thread. | Applier runs ALTERs serialised but on a separate executor (not the publisher thread) so a slow ClickHouse does not stall the install API call. License install API returns immediately with the new state; retention recompute completes asynchronously and is observable via metrics. |