docs(license): apply review feedback to enforcement design

- Add INVALID state to FSM (signature/tenant/parse failure ≠ ABSENT) with loud UI/audit/metric severity; ABSENT stays a calm state.
- Make tenantId required in the license envelope (it's already inside the signed payload, so a self-hosted customer cannot strip it).
- Move ClickHouse TTL recompute from boot-only to a RetentionPolicyApplier @EventListener(LicenseChangedEvent), so a long-running server that lands in EXPIRED tightens TTL automatically.
- Add LicenseRevalidationJob (daily) that re-runs signature check against the DB row and updates last_validated_at; transitions to INVALID on failure (catches public-key rotation drift).
- Add last_validated_at column to the license table, surfaced on the /usage endpoint and as cameleer_license_last_validated_age_seconds.
- Enrich enforcement-failure responses and the /usage endpoint with a per-state human-readable message so 403s and the UI both explain WHY caps changed.
- Add --verify (with --public-key) to the minter CLI to round-trip a freshly-minted token through LicenseValidator before shipping it, deleting the output file on verify failure.
- Add corresponding tests, telemetry gauge, and a runtime-recompute IT.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@@ -47,16 +47,18 @@ cameleer-server-core/ (existing — pure domain, no Spring)
├── LicenseLimits (typed wrapper over the limits map)
├── LicenseValidator (existing, payload schema updated)
├── LicenseGate (existing, gutted: no Feature; getLimits() only)
├── LicenseStateMachine (NEW — pure FSM: ABSENT / ACTIVE / GRACE / EXPIRED)
└── DefaultTierLimits (constant — §5 numbers)
├── LicenseStateMachine (NEW — pure FSM: ABSENT / ACTIVE / GRACE / EXPIRED / INVALID)
└── DefaultTierLimits (constant — §3.2 numbers)

cameleer-server-app/ (existing — Spring, web, persistence)
├── license/
│   ├── LicenseRepository (NEW — PostgreSQL persistence)
│   ├── LicenseService (NEW — load/save/replace; emits state events)
│   ├── LicenseService (NEW — load/save/replace; publishes LicenseChangedEvent)
│   ├── LicenseEnforcer (NEW — assertWithinCap entry point)
│   ├── LicenseUsageReader (NEW — counts current usage for /usage endpoint)
│   ├── LicenseCapExceededException (NEW — mapped to 403 by ControllerAdvice)
│   ├── LicenseRevalidationJob (NEW — @Scheduled daily; updates last_validated_at)
│   ├── RetentionPolicyApplier (NEW — @EventListener(LicenseChangedEvent); recomputes ClickHouse TTL + per-env caps)
│   └── LicenseMetrics (NEW — Prometheus gauges)
├── controller/
│   ├── LicenseAdminController (existing — extended; persists, audited)
@@ -67,7 +69,7 @@ cameleer-server-app/ (existing — Spring, web, persistence)
cameleer-license-minter/ (NEW — top-level Maven module)
├── pom.xml (depends on cameleer-server-core)
├── LicenseMinter (signing primitive; takes private key + LicenseInfo)
└── cli/LicenseMinterCli (CLI main class)
└── cli/LicenseMinterCli (CLI main class, supports --verify)
```

### 1.2 Why a separate `cameleer-license-minter` module
@@ -100,7 +102,7 @@ Wire format unchanged: `base64(payload).base64(ed25519_signature)`. Payload sche
{
  "licenseId": "550e8400-e29b-41d4-a716-446655440000",
  "tenantId": "acme-corp",
  "label": "ACME prod 2026",
  "label": "ACME prod 2026 — site:hamburg",
  "iat": 1745539200,
  "exp": 1777075200,
  "gracePeriodDays": 30,
@@ -127,7 +129,7 @@ Wire format unchanged: `base64(payload).base64(ed25519_signature)`. Payload sche
| Field | Required | Notes |
|---|---|---|
| `licenseId` | yes | UUID. Used in audit + future revocation. |
| `tenantId` | optional | If present and `CAMELEER_SERVER_TENANT_ID` differs, treat as no license + log error. Air-gapped customers may omit. |
| `tenantId` | **yes** | Must match `CAMELEER_SERVER_TENANT_ID`. Mismatch = `INVALID` state (see §3). The field is inside the signed payload, so a self-hosted customer cannot strip it to make a license portable across tenants — any edit invalidates the signature. Air-gapped customers receive a license bound to a vendor-issued tenant id (not necessarily a UUID — any non-empty slug). |
| `label` | optional | Free-form human description. Surfaced in UI. |
| `iat` | yes | Unix seconds. |
| `exp` | yes | Unix seconds. |
@@ -149,23 +151,42 @@ Wire format unchanged: `base64(payload).base64(ed25519_signature)`. Payload sche
│ ABSENT │ ───────────────▶│ ACTIVE │──────▶│ GRACE │ │ EXPIRED │
└─────────┘ └────────┘ └────────┘ └─────────┘
▲ │ │ ▲ │
│ install invalid │ replace │ │ replace valid │ replace
│ (sig/tenant/parse) ▼ │ │ ▼
└────────────────────────────┴──────────────┴─┴───────────────────┘
│ │ replace │ │ replace valid │ replace
│ ▼ │ │ ▼
│ ┌─────────┐ └──────────────┴─┴───────────────────┘
└──┤ INVALID │ ──── replace valid ────────────────────────────────▶ ACTIVE
└─────────┘
▲
│ install fails (signature / tenant / parse / public-key-missing)
all transitions persist + audit-log
```

### 3.1 State semantics

| State | Effective limits | Trigger |
|---|---|---|
| `ABSENT` | `DefaultTierLimits` | No DB row, or signature/tenant/parse failure. |
| `ACTIVE` | `merge(default, license.limits)` | License loaded, `now < exp`. |
| `GRACE` | Same as `ACTIVE` | `exp ≤ now < exp + gracePeriodDays`. UI banner. |
| `EXPIRED` | `DefaultTierLimits` | `now ≥ exp + gracePeriodDays`. Distinct UI label vs ABSENT. |
| State | Effective limits | Trigger | Severity |
|--- |--- |--- |--- |
| `ABSENT` | `DefaultTierLimits` | No DB row. Clean install with no license configured. | INFO |
| `ACTIVE` | `merge(default, license.limits)` | License loaded, `now < exp`. | INFO |
| `GRACE` | Same as `ACTIVE` | `exp ≤ now < exp + gracePeriodDays`. UI warning banner. | WARN |
| `EXPIRED` | `DefaultTierLimits` | `now ≥ exp + gracePeriodDays`. UI label distinct from ABSENT. | ERROR |
| `INVALID` | `DefaultTierLimits` | Signature failure, tenant mismatch, parse error, or public key not configured but a token is present. | **ERROR — loud** |

State is recomputed on every limit check (clock comparison only) — no scheduler needed for
transitions. The only "background" behaviour is the Prometheus gauge refresh.
`ABSENT` and `INVALID` produce the same enforcement (default tier) but are surfaced very
differently:

- **`ABSENT`** is a clean state — fresh install, no license yet. UI shows a calm "Install a
  license to lift the default-tier caps" call to action. No audit row beyond the boot log line.
- **`INVALID`** is an active error — tampering, wrong public key, or a paste that lost
  characters. UI shows a red banner with the validator's error message
  (e.g. "License signature verification failed", "License tenantId 'acme-corp' does not match
  server tenant 'beta-corp'"). Audit row written under
  `AuditCategory.LICENSE` action `reject_license`. Prometheus
  `cameleer_license_state{state="INVALID"} = 1` so an alert can fire.

State is recomputed on every limit check (clock comparison only against parsed in-memory
`LicenseInfo`) — no scheduler needed for `ACTIVE → GRACE → EXPIRED` transitions. A separate
**daily revalidation job** (§6.6) re-runs the signature check against the DB row to catch slow
failures like public-key rotation drift.
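
The state table above can be read as a pure function of the parsed license, the validator outcome, and the clock. The following is a minimal sketch under that reading, not the actual `LicenseStateMachine`: the class `LicenseStateSketch`, the record `ParsedLicense`, and the `compute` signature are hypothetical names chosen for illustration.

```java
import java.time.Duration;
import java.time.Instant;

/** Hypothetical sketch of the pure state computation; all names are illustrative. */
final class LicenseStateSketch {

    enum State { ABSENT, ACTIVE, GRACE, EXPIRED, INVALID }

    /** Minimal stand-in for the parsed license: only the fields the clock check needs. */
    record ParsedLicense(Instant exp, int gracePeriodDays) {}

    /**
     * @param license         parsed in-memory license, or null when no DB row exists
     * @param validationError validator error message, or null when validation passed
     * @param now             current time, injected so the function stays pure
     */
    static State compute(ParsedLicense license, String validationError, Instant now) {
        if (validationError != null) {
            return State.INVALID;   // signature, tenant, parse, or missing public key
        }
        if (license == null) {
            return State.ABSENT;    // clean install, nothing configured
        }
        if (now.isBefore(license.exp())) {
            return State.ACTIVE;    // now < exp
        }
        Instant graceEnd = license.exp().plus(Duration.ofDays(license.gracePeriodDays()));
        // exp <= now < exp + gracePeriodDays is GRACE, anything later is EXPIRED
        return now.isBefore(graceEnd) ? State.GRACE : State.EXPIRED;
    }
}
```

Passing `now` in explicitly keeps the function testable at the grace boundary without a scheduler, which matches the "recomputed on every limit check" wording.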

### 3.2 Default tier (the "no license" caps)

@@ -199,8 +220,31 @@ void assertWithinCap(String limitKey, long currentUsage, long requestedDelta);
```

Throws `LicenseCapExceededException(limitKey, current, cap)` when `currentUsage + requestedDelta > cap`.
A `@ControllerAdvice` maps it to `403` with body
`{"error":"license cap reached","limit":"max_apps","current":3,"cap":3}`.
A `@ControllerAdvice` maps it to `403` with a body that explains the "why" so operators can act
without grepping logs:

```json
{
  "error": "license cap reached",
  "limit": "max_apps",
  "current": 3,
  "cap": 3,
  "state": "EXPIRED",
  "message": "License expired 5 days ago: system reverted to default tier (3 apps). Current usage is 3. Install or renew the license to create more apps."
}
```

The `message` field is rendered server-side from a small template per state:

| State | Message template |
|--- |---|
| `ABSENT` | "No license installed: default tier applies (cap = N for {limit}). Install a license to raise this." |
| `ACTIVE` | "License cap reached: {limit} = {cap}. Current usage is {current}. Contact your vendor to raise the cap." |
| `GRACE` | "License expired {n} day(s) ago and is in its grace period (ends in {m} days). Cap unchanged at {cap}. Renew before grace ends." |
| `EXPIRED`| "License expired {n} days ago: system reverted to default tier (cap = N for {limit}). Current usage is {current}. Renew the license to lift the cap." |
| `INVALID`| "License rejected ({reason}): default tier applies (cap = N for {limit}). Fix the license to raise this." |
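
As a rough illustration of how the cap check and the per-state message could fit together, here is a self-contained sketch. It is not the real `LicenseEnforcer`: the class `CapCheckSketch`, the extra `state` and `cap` parameters, and the fallback message are assumptions made so the example runs on its own. Only the `ABSENT` and `ACTIVE` templates are rendered, because the other states need day counts or a rejection reason.

```java
/** Hypothetical sketch of the cap check; class and parameter names are illustrative. */
final class CapCheckSketch {

    /** Carries the fields the 403 body needs. */
    static final class LicenseCapExceededException extends RuntimeException {
        final String limitKey;
        final long current;
        final long cap;

        LicenseCapExceededException(String limitKey, long current, long cap, String message) {
            super(message);
            this.limitKey = limitKey;
            this.current = current;
            this.cap = cap;
        }
    }

    /** Throws when currentUsage + requestedDelta would exceed the cap. */
    static void assertWithinCap(String state, String limitKey,
                                long currentUsage, long requestedDelta, long cap) {
        // addExact turns an arithmetic overflow into an exception instead of a silent pass.
        if (Math.addExact(currentUsage, requestedDelta) > cap) {
            throw new LicenseCapExceededException(
                    limitKey, currentUsage, cap, renderMessage(state, limitKey, currentUsage, cap));
        }
    }

    /** Renders two of the per-state templates from the table above. */
    static String renderMessage(String state, String limit, long current, long cap) {
        switch (state) {
            case "ABSENT":
                return String.format(
                        "No license installed: default tier applies (cap = %d for %s). "
                                + "Install a license to raise this.", cap, limit);
            case "ACTIVE":
                return String.format(
                        "License cap reached: %s = %d. Current usage is %d. "
                                + "Contact your vendor to raise the cap.", limit, cap, current);
            default:
                // GRACE, EXPIRED and INVALID also need day counts or a rejection reason;
                // this fallback text is a placeholder, not one of the documented templates.
                return String.format("License cap reached: %s = %d (state %s).", limit, cap, state);
        }
    }
}
```

A call such as `CapCheckSketch.assertWithinCap("ACTIVE", "max_apps", 3, 1, 3)` throws, and the exception message is the string a `@ControllerAdvice` could copy into the `message` field.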

### 4.1 Per-limit call sites

| Limit | Call site | Failure response |
|---|---|---|
@@ -213,12 +257,12 @@ A `@ControllerAdvice` maps it to `403` with body
| `max_total_cpu_millis` | `DeploymentExecutor.PRE_FLIGHT` (sum across non-stopped deploys + new) | Deploy fails fast at PRE_FLIGHT, status FAILED, audit row |
| `max_total_memory_mb` | same | same |
| `max_total_replicas` | same | same |
| `max_execution_retention_days` | `EnvironmentService.update` (per-env field, see §4.1) + `ClickHouseSchemaInitializer.applyRetention()` at boot | 422 on update; boot pins effective TTL = `min(licenseCap, configured)` |
| `max_execution_retention_days` | `EnvironmentService.update` (per-env field, see §4.2) + `RetentionPolicyApplier` (see §4.3) | 422 on update; ClickHouse TTL recomputed on every license change |
| `max_log_retention_days` | same | same |
| `max_metric_retention_days` | same | same |
| `max_jar_retention_count` | `EnvironmentAdminController.PUT /jar-retention` | 422 |

### 4.1 Per-environment retention fields
### 4.2 Per-environment retention fields

Three new columns on `environments` (Flyway V2):

@@ -230,11 +274,30 @@ ALTER TABLE environments
```

These are the configured per-env values. The effective ClickHouse TTL is
`min(licenseCap, configured)`, applied at startup by `ClickHouseSchemaInitializer`. Admin UI
surfaces the configured values; `EnvironmentService.update` rejects values above the license cap
with 422.
`min(licenseCap, configured)`. Admin UI surfaces the configured values;
`EnvironmentService.update` rejects values above the license cap with 422.

### 4.2 Boot-time invariant
### 4.3 Runtime retention recompute

`RetentionPolicyApplier` is `@EventListener(LicenseChangedEvent)`:

- Triggered on every `LicenseService.replace(...)` (boot install, env-var override, file
  override, POST `/admin/license`) **and** on every state transition the revalidation job
  detects (e.g. license becomes `EXPIRED`, caps drop to default).
- Recomputes the effective TTL per env (`min(licenseCap, configured)`), then issues
  `ALTER TABLE … MODIFY TTL …` on the affected ClickHouse tables (executions, processors,
  logs, metrics, route_diagrams, agent_events). One ALTER per table per affected env.
- Errors are logged WARN; a failed ALTER does not block the license install — the operator can
  retry by reposting the license. The previous TTL keeps applying until the next successful
  ALTER.
- At boot, `LicenseService.loadInitial(...)` publishes one `LicenseChangedEvent` after the
  load order in §6.2 settles, so the boot path goes through the same applier as runtime
  changes.

Result: a server that stays up for months and lands in `EXPIRED` will see ClickHouse TTLs
collapse to default-tier values automatically — no restart needed.
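
The recompute step reduces to `min(licenseCap, configured)` per environment, plus a filter for the environments whose effective TTL actually moves. The sketch below shows only that arithmetic. `RetentionRecomputeSketch` and its method names are hypothetical, and it deliberately stops short of building the `ALTER TABLE` statement because the ClickHouse column layout is not shown in this document.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch of the effective-TTL arithmetic; names are illustrative. */
final class RetentionRecomputeSketch {

    /** Effective TTL is the stricter of the license cap and the configured value. */
    static int effectiveDays(int licenseCapDays, int configuredDays) {
        return Math.min(licenseCapDays, configuredDays);
    }

    /**
     * Returns env name -> new effective TTL, restricted to environments whose
     * effective TTL changes when the cap moves from previousCapDays to newCapDays.
     */
    static Map<String, Integer> affectedEnvs(Map<String, Integer> configuredDaysByEnv,
                                             int previousCapDays, int newCapDays) {
        Map<String, Integer> changed = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> e : configuredDaysByEnv.entrySet()) {
            int before = effectiveDays(previousCapDays, e.getValue());
            int after = effectiveDays(newCapDays, e.getValue());
            if (before != after) {
                changed.put(e.getKey(), after);   // only these envs would need an ALTER
            }
        }
        return changed;
    }
}
```

With a cap dropping from 30 to 7 days, an environment configured at 30 days is affected (new TTL 7), while one configured at 5 days is untouched, which is one way to read "one ALTER per table per affected env".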

### 4.4 Boot-time invariant

If a license is added that *lowers* a cap below current usage (10 apps, license now allows 5), the
server logs one WARN per limit at boot. **No deletion**. New creates reject; existing resources
@@ -254,6 +317,8 @@ keep working.
  "gracePeriodDays": 30,
  "tenantId": "acme-corp",
  "label": "ACME prod 2026",
  "lastValidatedAt": "2026-04-26T03:14:07Z",
  "message": "License active. 365 days remaining.",
  "limits": [
    {"key": "max_apps", "current": 7, "cap": 50, "source": "license"},
    {"key": "max_agents", "current": 12, "cap": 100, "source": "license"},
@@ -267,12 +332,20 @@ keep working.
key, or there is no license), and `"license"` when the cap is explicit in the license. Drives the
SaaS UI's "free tier" badge.

`message` carries the same human-readable explanation that the 403 body uses, varying by state:

- `ABSENT` — "No license installed. Default tier applies."
- `ACTIVE` — "License active. {n} days remaining."
- `GRACE` — "License expired {n} days ago. Grace period ends in {m} days. Renew now to avoid degradation."
- `EXPIRED`— "License expired {n} days ago. System reverted to default tier."
- `INVALID`— "License rejected: {reason}. Default tier applies. Fix the license to recover."

`LicenseUsageReader` issues one cheap aggregate per limit (`SELECT COUNT(*)` per entity table; a
single grouped `SELECT SUM(replicas * cpuMillis), SUM(replicas * memoryMb), SUM(replicas)` over
non-stopped deployments).

`GET /api/v1/admin/license` (existing) is extended to return `{state, envelope}` with the raw token
omitted from the response.
`GET /api/v1/admin/license` (existing) is extended to return `{state, envelope, lastValidatedAt}`
with the raw token omitted from the response.

---

@@ -289,20 +362,30 @@ CREATE TABLE license (
  license_id UUID NOT NULL,
  installed_at TIMESTAMPTZ NOT NULL,
  installed_by TEXT NOT NULL, -- users.user_id (bare) or 'system' for env/file boot
  expires_at TIMESTAMPTZ NOT NULL
  expires_at TIMESTAMPTZ NOT NULL,
  last_validated_at TIMESTAMPTZ NOT NULL -- updated by boot, install, and revalidation job
);
```

`last_validated_at` is the timestamp of the most recent **successful** signature/parse round-trip
against the current public key. Useful for troubleshooting "why did my license stop working" — a
stale `last_validated_at` next to a recent `now` is a strong signal that revalidation is failing
and the operator should check the public key.

### 6.2 Boot order

`LicenseBeanConfig`:

1. If `CAMELEER_SERVER_LICENSE_TOKEN` env var is set → validate → write to DB (overwrite) →
   load.
1. If `CAMELEER_SERVER_LICENSE_TOKEN` env var is set → validate → write to DB (overwrite,
   sets `last_validated_at = now`) → load.
2. Else if `CAMELEER_SERVER_LICENSE_FILE` is set → read file → validate → write to DB → load.
3. Else read `license` row from DB → validate → load.
3. Else read `license` row from DB → validate → on success update `last_validated_at = now` →
   load.
4. Else `ABSENT`.

After steps 1–3 the service publishes one `LicenseChangedEvent` so the retention applier and
metrics gauges initialise off the same code path as runtime changes.

Env-var / file act as **idempotent overrides** — they always win and replace the DB row, so the
operator's last action survives reboots.
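
The precedence in the boot order can be sketched as a small resolver. This is an illustration only: `BootOrderSketch`, `Resolved`, and the three nullable string parameters are hypothetical, and validation, persistence, and event publishing are left out so that only the "env beats file beats DB" ordering is shown.

```java
/** Hypothetical sketch of the boot-time source precedence; names are illustrative. */
final class BootOrderSketch {

    enum Source { ENV, FILE, DB, NONE }

    /** The chosen token and where it came from; token is null for NONE. */
    record Resolved(Source source, String token) {}

    /**
     * @param envToken  value of the token env var, or null when unset
     * @param fileToken content of the license file, or null when no file is configured
     * @param dbToken   token stored in the license row, or null when no row exists
     */
    static Resolved resolve(String envToken, String fileToken, String dbToken) {
        if (isSet(envToken)) {
            return new Resolved(Source.ENV, envToken);    // step 1: env var always wins
        }
        if (isSet(fileToken)) {
            return new Resolved(Source.FILE, fileToken);  // step 2: file override
        }
        if (isSet(dbToken)) {
            return new Resolved(Source.DB, dbToken);      // step 3: previously persisted row
        }
        return new Resolved(Source.NONE, null);           // step 4: ABSENT
    }

    /** Env and file sources replace the DB row; a DB-sourced token does not. */
    static boolean overwritesDbRow(Source source) {
        return source == Source.ENV || source == Source.FILE;
    }

    private static boolean isSet(String value) {
        return value != null && !value.isBlank();
    }
}
```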
@@ -310,15 +393,17 @@ operator's last action survives reboots.

`POST /api/v1/admin/license { "token": "..." }` (existing):
- Validates against the configured public key.
- On success, persists to `license` table (`installed_by = user_id`), updates the in-memory
  `LicenseGate`, audits.
- On success, persists to `license` table (`installed_by = user_id`, `last_validated_at = now`),
  updates the in-memory `LicenseGate`, publishes `LicenseChangedEvent`, audits.
- On failure, returns 400 with the validator error message and audits the rejection.
  Server transitions to `INVALID` state if a previously-loaded license was replaced; otherwise
  remains in its prior state (the rejected token is *not* written to DB).

### 6.4 Public key custody

`CAMELEER_SERVER_LICENSE_PUBLICKEY` (existing) remains the only verification key. Build- /
deploy-time secret bound to the vendor distribution. **Not stored in DB.** If unset *and* a
license is present → reject all licenses (existing behaviour).
license is present → reject all licenses (existing behaviour) → `INVALID` state.

### 6.5 Audit trail

@@ -329,7 +414,23 @@ New `AuditCategory.LICENSE`. Actions:
| `install_license` | First successful install in an empty state | `{licenseId, expiresAt, installedBy, source}` (`source` = `env`/`file`/`api`) |
| `replace_license` | Successful install over an existing license | same + `previousLicenseId` |
| `reject_license` | Validation failed (signature, tenant, parse, public key missing) | `{reason, source}` |
| `cap_exceeded` | Any `LicenseCapExceededException` | `{limit, current, cap, requestedBy}` |
| `revalidate_license` | Daily job result, on **failure only** | `{licenseId, reason}` |
| `cap_exceeded` | Any `LicenseCapExceededException` | `{limit, current, cap, requestedBy, state}` |

### 6.6 Daily revalidation job

`LicenseRevalidationJob`:
- `@Scheduled(cron = "0 0 3 * * *")` (03:00 server local time) plus a one-off run 60s
  after boot.
- Reads the DB token, re-runs `LicenseValidator.validate(token)` against the current public
  key.
- On success: `UPDATE license SET last_validated_at = now() WHERE tenant_id = ?`.
- On failure (e.g. operator rotated the public key without reinstalling the license, or DB
  row was tampered with directly): transition state to `INVALID`, publish
  `LicenseChangedEvent` (so retention recomputes too), audit `revalidate_license` with the
  reason, log `ERROR`.
- Cheap (no I/O beyond one DB read + one DB write); safe to run frequently. 03:00 is chosen
  to coincide with off-peak so the `ERROR` noise from a failed run lands when humans aren't deploying.
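
One pass of the job can be modelled as a function from the stored token and a validator to an outcome. The sketch below is an assumption-heavy illustration: `RevalidationSketch`, `Outcome`, and the convention that the validator returns `null` on success and an error message on failure are all hypothetical, and the DB update, event publishing, and audit write are omitted.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.function.Function;

/** Hypothetical sketch of one revalidation pass; names and conventions are illustrative. */
final class RevalidationSketch {

    /** Result of one pass: reason is null when the token validated. */
    record Outcome(boolean valid, Instant lastValidatedAt, String reason) {}

    /**
     * @param validator returns null on success or an error message on failure
     *                  (an assumed convention for this sketch)
     * @param previous  last_validated_at before this pass
     */
    static Outcome revalidate(String token, Function<String, String> validator,
                              Instant previous, Instant now) {
        String error = validator.apply(token);
        if (error == null) {
            return new Outcome(true, now, null);   // success bumps last_validated_at
        }
        // Failure keeps the stale timestamp, so the age gauge keeps growing.
        return new Outcome(false, previous, error);
    }

    /** Value for a gauge defined as now - last_validated_at, in seconds. */
    static long ageSeconds(Outcome outcome, Instant now) {
        return Duration.between(outcome.lastValidatedAt(), now).getSeconds();
    }
}
```

Keeping the old timestamp on failure is what would make `cameleer_license_last_validated_age_seconds` climb when the public key no longer matches.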

---

@@ -353,6 +454,7 @@ Serializes `LicenseInfo` to canonical JSON (sorted keys), signs the bytes with E
```bash
java -jar cameleer-license-minter-1.0-SNAPSHOT.jar \
  --private-key=/secure/vendor.key \
  --public-key=/secure/vendor.pub \
  --tenant=acme-corp \
  --label="ACME prod 2026" \
  --expires=2027-04-25 \
@@ -362,15 +464,26 @@ java -jar cameleer-license-minter-1.0-SNAPSHOT.jar \
  --max-total-cpu-millis=32000 \
  --max-total-memory-mb=65536 \
  --max-execution-retention-days=90 \
  --output=acme-license.tok
  --output=acme-license.tok \
  --verify
```

- `--private-key` reads a PEM-encoded Ed25519 private key (output of
  `openssl genpkey -algorithm ed25519`).
- `--public-key` *(used only with `--verify`)* reads the matching public key. Required when
  `--verify` is set; ignored otherwise.
- Unspecified `--max-*` flags are omitted from the payload — the license inherits the default for
  that key.
- Unknown flags fail fast.
- `--output` writes the token; if omitted, prints to stdout.
- `--verify` round-trips the freshly-minted token through `LicenseValidator` against
  `--public-key` *after* writing the output file. This catches:
  - corruption during the `String → file` write,
  - wrong-key pairing (vendor accidentally pointed `--public-key` at a different keypair's
    public half),
  - signature mismatch from a buggy build of the minter.
  On verify failure the CLI exits non-zero, prints the validator error, and (if `--output` was
  written) deletes the output file so the bad token does not get shipped.
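
The verify-then-delete behaviour can be sketched with the JDK's built-in Ed25519 support (Java 15 and later). This is not the real `LicenseMinter` or `LicenseValidator`: `MintVerifySketch` and its methods are hypothetical, the standard (not URL-safe) Base64 alphabet is an assumption about the wire format, and the validator's field checks (tenant, expiry) are left out so only the signature round-trip is shown.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PrivateKey;
import java.security.PublicKey;
import java.security.Signature;
import java.security.SignatureException;
import java.util.Base64;

/** Hypothetical sketch of mint + verify-after-write; names are illustrative. */
final class MintVerifySketch {

    /** Throwaway keypair for experiments; the doc keeps real keypair generation out of band. */
    static KeyPair newKeyPair() {
        try {
            return KeyPairGenerator.getInstance("Ed25519").generateKeyPair();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Produces base64(payload).base64(signature); standard Base64 is an assumption. */
    static String mint(byte[] payload, PrivateKey key) {
        try {
            Signature signer = Signature.getInstance("Ed25519");
            signer.initSign(key);
            signer.update(payload);
            Base64.Encoder b64 = Base64.getEncoder();
            return b64.encodeToString(payload) + "." + b64.encodeToString(signer.sign());
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Signature check only; returns false for malformed tokens instead of throwing. */
    static boolean verify(String token, PublicKey key) {
        String[] parts = token.split("\\.");
        if (parts.length != 2) {
            return false;
        }
        try {
            byte[] payload = Base64.getDecoder().decode(parts[0]);
            byte[] signature = Base64.getDecoder().decode(parts[1]);
            Signature verifier = Signature.getInstance("Ed25519");
            verifier.initVerify(key);
            verifier.update(payload);
            return verifier.verify(signature);
        } catch (IllegalArgumentException | SignatureException e) {
            return false;   // bad Base64 or malformed signature bytes
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Writes the token, re-reads it from disk, verifies, and deletes the file on failure. */
    static boolean writeAndVerify(String token, Path output, PublicKey key) {
        try {
            Files.writeString(output, token);
            boolean ok = verify(Files.readString(output).trim(), key);
            if (!ok) {
                Files.deleteIfExists(output);   // do not leave a bad token behind
            }
            return ok;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Re-reading the file before verifying is what lets this catch a corrupted write as well as a mismatched keypair; verifying the in-memory string would only catch the latter.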

Keypair generation is **out of band** — vendor uses `openssl` and stores both halves in their
secret manager. We deliberately do not ship a `--gen-keypair` subcommand to keep the boundary
@@ -384,13 +497,17 @@ Prometheus gauges scraped via `/api/v1/prometheus`:

| Metric | Labels | Notes |
|---|---|---|
| `cameleer_license_state` | `state="ABSENT|ACTIVE|GRACE|EXPIRED"` | Boolean — exactly one is 1. |
| `cameleer_license_state` | `state="ABSENT|ACTIVE|GRACE|EXPIRED|INVALID"` | Boolean — exactly one is 1. |
| `cameleer_license_days_remaining` | (none) | Negative in GRACE/EXPIRED. |
| `cameleer_license_limit_utilisation`| `limit="max_apps"` etc. | `current / cap`, in `[0, 1+]`. |
| `cameleer_license_cap_rejections_total` | `limit="..."` | Counter. |
| `cameleer_license_last_validated_age_seconds` | (none) | `now - last_validated_at`. Spikes if the daily revalidation job is failing. |

State-transition log lines: `INFO` on install/ACTIVE, `WARN` on GRACE, `ERROR` on EXPIRED, `WARN`
on cap reject (sampled to avoid log spam).
State-transition log lines: `INFO` on install/ACTIVE, `WARN` on GRACE, `ERROR` on EXPIRED,
`ERROR` on INVALID, `WARN` on cap reject (sampled to avoid log spam).

Recommended alert (in cameleer-saas Grafana, not shipped with the server): page on
`cameleer_license_state{state="INVALID"} == 1` for > 5 minutes.
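
The gauge semantics in the metrics table can be sketched as three small pure functions. This is an illustration under assumptions, not the real `LicenseMetrics`: `LicenseGaugeSketch` and its method names are hypothetical, no Prometheus client is involved, and rejecting a non-positive cap is a choice made for the sketch rather than documented behaviour.

```java
import java.time.Instant;
import java.time.temporal.ChronoUnit;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of the gauge values; names are illustrative. */
final class LicenseGaugeSketch {

    static final List<String> STATES = List.of("ABSENT", "ACTIVE", "GRACE", "EXPIRED", "INVALID");

    /** One sample per state label; exactly one of them is 1. */
    static Map<String, Integer> stateGauge(String currentState) {
        if (!STATES.contains(currentState)) {
            throw new IllegalArgumentException("unknown state: " + currentState);
        }
        Map<String, Integer> samples = new LinkedHashMap<>();
        for (String state : STATES) {
            samples.put(state, state.equals(currentState) ? 1 : 0);
        }
        return samples;
    }

    /** Whole days until exp; negative once exp has passed (GRACE or EXPIRED). */
    static long daysRemaining(Instant exp, Instant now) {
        return ChronoUnit.DAYS.between(now, exp);
    }

    /** current / cap; may exceed 1 when usage is above the cap. */
    static double utilisation(long current, long cap) {
        if (cap <= 0) {
            throw new IllegalArgumentException("cap must be positive in this sketch");
        }
        return (double) current / cap;
    }
}
```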

---

@@ -413,15 +530,17 @@ compatibility shims" preference, no deprecated path or feature flag.

| Layer | Tests |
|---|---|
| Core unit | `LicenseValidatorTest` — signature, expiry, tenant mismatch, missing required fields, unknown extra fields. |
| Core unit | `LicenseStateMachineTest` — all four transitions including grace boundary, replace from any state, invalid install. |
| Core unit | `LicenseValidatorTest` — signature, expiry, tenant mismatch, missing required fields (`tenantId`, `licenseId`, `iat`, `exp`), unknown extra fields. |
| Core unit | `LicenseStateMachineTest` — all five states and the transitions between them, including grace boundary, replace from any state, invalid install routes to `INVALID`, valid install from `INVALID` recovers to `ACTIVE`. |
| Core unit | `DefaultTierLimitsTest` — every documented key has a default. |
| Minter unit | `LicenseMinterTest` — round-trip with a throwaway Ed25519 keypair. Canonical JSON is stable across runs. |
| Minter CLI | `LicenseMinterCliTest` — invokes `main` with `--private-key=tmp` and checks output token validates. |
| App unit | `LicenseEnforcerTest` — for each limit: cap-reached, under-cap, default-tier with no license, missing-cap-inherits-default. |
| App integration | `LicenseLifecycleIT` — install via env, replace via POST, restart restores from DB. Driven through REST. |
| App integration | `LicenseEnforcementIT` — REST-driven, hit each cap end-to-end (per the project's "REST-API-driven ITs" preference). Includes `cap_exceeded` audit row check. |
| Boot | `SchemaBootstrapIT` extension — `license` table exists, `environments` retention columns exist, retention pinning honoured at boot. |
| Minter CLI | `LicenseMinterCliTest` — invokes `main` with `--private-key=tmp` and checks output token validates; `--verify` happy path; `--verify` failure path deletes the output file and exits non-zero. |
| App unit | `LicenseEnforcerTest` — for each limit: cap-reached, under-cap, default-tier with no license, missing-cap-inherits-default, message text varies per state. |
| App unit | `RetentionPolicyApplierTest` — license-changed event recomputes effective TTL per env; failed ALTER logs WARN and does not throw. |
| App integration | `LicenseLifecycleIT` — install via env, replace via POST, restart restores from DB, public-key removal at runtime transitions to `INVALID`, daily revalidation job updates `last_validated_at`. Driven through REST. |
| App integration | `LicenseEnforcementIT` — REST-driven, hit each cap end-to-end (per the project's "REST-API-driven ITs" preference). Includes `cap_exceeded` audit row check and verifies the 403 body's `message` field matches the state. |
| App integration | `RetentionRuntimeRecomputeIT` — install license with `max_log_retention_days=30`, observe `logs` TTL ALTER fires; replace with `max_log_retention_days=7`, observe TTL drops to 7 without restart. |
| Boot | `SchemaBootstrapIT` extension — `license` table exists with `last_validated_at`, `environments` retention columns exist, retention pinning honoured at boot. |

No raw-SQL seeding of caps in ITs. All caps installed via the REST endpoint or env var.

@@ -444,6 +563,7 @@ No raw-SQL seeding of caps in ITs. All caps installed via the REST endpoint or e
| Default tier so tight that an honest evaluator cannot try the product. | Defaults documented; vendor can ship a longer-`exp` "trial" license at install time if needed. |
| Customer lowers `gracePeriodDays` field by editing token. | Token is signed; any edit invalidates the signature. |
| License removed from DB out of band, server lands in ABSENT and rejects new resources but old ones are above default tier. | Boot-time WARN per over-cap limit. UI banner in the admin license page. No auto-deletion. |
| Public key rotation. | Out of scope for v1; documented as "redeploy with new key" — vendors are expected to rotate via redeployment. |
| Public key rotation. | Out of scope for v1; documented as "redeploy with new key" — vendors are expected to rotate via redeployment. Daily revalidation job catches a rotation that wasn't paired with a reinstall (state → `INVALID`, alertable). |
| Compute cap arithmetic relies on `cpuLimit` and `memoryLimitMb` being set on every container. | Existing `ResolvedContainerConfig` already enforces these; `DeploymentExecutor.PRE_FLIGHT` rejects deploys with unset compute fields. |
| Per-env retention column added but old ClickHouse partitions retain longer. | Documented: TTL change is honoured by ClickHouse on its next merge cycle. New rows inserted always honour the new TTL. |
| `RetentionPolicyApplier` issues blocking ALTERs from the event listener thread. | Applier runs ALTERs serialised but on a separate executor (not the publisher thread) so a slow ClickHouse does not stall the install API call. License install API returns immediately with the new state; retention recompute completes asynchronously and is observable via metrics. |