# License Enforcement

Operator documentation for the cameleer-server license subsystem. Audience: operators running their own cameleer-server instance who need to install, monitor, or troubleshoot a license.

For *issuing* licenses, see `cameleer-license-minter/README.md`. For SaaS-team operational playbooks, see `docs/handoff/2026-04-26-license-saas-handoff.md`.

## Table of contents

- [Overview](#overview)
- [What gets enforced](#what-gets-enforced)
- [Install paths and priority](#install-paths-and-priority)
- [Public-key configuration](#public-key-configuration)
- [REST API](#rest-api)
- [License state machine](#license-state-machine)
- [Default tier caps](#default-tier-caps)
- [Cap-exceeded behavior](#cap-exceeded-behavior)
- [Retention semantics](#retention-semantics)
- [Daily revalidation](#daily-revalidation)
- [Audit categories](#audit-categories)
- [Prometheus metrics](#prometheus-metrics)
- [Troubleshooting](#troubleshooting)

---

## Overview

cameleer-server can run in one of two postures:

- **Default tier (no license installed).** A small fixed cap-set applies (1 environment, 3 apps, 5 agents, 1 day retention, etc.). Suitable for evaluation and self-host single-instance use. The default tier engages automatically when no license is configured.
- **Licensed (token installed).** Caps from the signed token override the default tier on a per-key basis. Any limit key the token does not specify falls through to the default value, so a partial license that only raises `max_environments` and `max_apps` keeps default retention.

A signed Ed25519 license token carries the customer's `tenantId`, an `expiresAt` timestamp, an optional `gracePeriodDays`, and a `limits` map. The server's `LicenseValidator` (`cameleer-server-core/src/main/java/com/cameleer/server/core/license/LicenseValidator.java`) checks the signature against `CAMELEER_SERVER_LICENSE_PUBLICKEY`, verifies the tenant matches `CAMELEER_SERVER_TENANT_ID`, and rejects expired tokens (past `expiresAt + gracePeriodDays`).

The license posture is summarized as a `LicenseState`:

- `ABSENT`: no license configured. Default-tier caps apply.
- `ACTIVE`: valid token, current time is at or before `expiresAt`. License caps apply.
- `GRACE`: past `expiresAt` but within `gracePeriodDays`.
  License caps still apply; the operator should renew.
- `EXPIRED`: past `expiresAt + gracePeriodDays`. Default-tier caps apply.
- `INVALID`: signature, tenant, or schema validation failed. Default-tier caps apply.

## What gets enforced

License caps are enforced through a single component, `LicenseEnforcer.assertWithinCap(limitKey, currentUsage, requestedDelta)`, called from each creation path.

| Limit key | Enforcement point | Effect when exceeded |
|---|---|---|
| `max_environments` | `EnvironmentService.create(...)` | HTTP 403 from `EnvironmentAdminController.create`. |
| `max_apps` | `AppService.createApp(...)` | HTTP 403 from `AppController.create`. |
| `max_agents` | `AgentRegistryService.register(...)` | HTTP 403 from `AgentRegistrationController.register`. Counted against the in-memory live agent registry. |
| `max_users` | User creation paths in `UserAdminController`, `UiAuthController`, `OidcAuthController` | HTTP 403 (REST) or rejection during OIDC first-login. |
| `max_outbound_connections` | `OutboundConnectionServiceImpl.create(...)` | HTTP 403. |
| `max_alert_rules` | `AlertRuleController.create(...)` | HTTP 403. |
| `max_total_cpu_millis` | `DeploymentExecutor` `PRE_FLIGHT` stage | Deployment fails before pulling images; row is marked FAILED with the cap message in `deployments.error_message`. |
| `max_total_memory_mb` | same | same |
| `max_total_replicas` | same | same |
| `max_jar_retention_count` | `EnvironmentAdminController` PUT `/{envSlug}/jar-retention` | HTTP 403 if requested value > cap. The daily `JarRetentionJob` is also bounded by this cap. |
| `max_execution_retention_days`, `max_log_retention_days`, `max_metric_retention_days` | Not a creation cap; see [Retention semantics](#retention-semantics). | Clamps ClickHouse TTL to `min(cap, env.configured)`. |

Note that the three compute caps are checked together at deploy time, after `ConfigMerger.resolve(...)` produces the final `ResolvedContainerConfig` but before the image is pulled.
The current usage figure is computed by `LicenseUsageReader.computeUsage()` over non-stopped deployments.

## Install paths and priority

Tokens can be installed from three sources; resolution at boot is highest-priority-first, with a fourth fallback when none of them is configured:

1. **`CAMELEER_SERVER_LICENSE_TOKEN` environment variable.** Highest priority. The raw token is read on `@PostConstruct` from `LicenseBeanConfig.LicenseBootLoader`.
2. **`cameleer.server.license.file` Spring property** (or `CAMELEER_SERVER_LICENSE_FILE`). Path to a file containing the token. Read at boot if no env-var token is present.
3. **PostgreSQL `license` table.** Set via the admin REST POST. Loaded at boot if the env var and file both miss.
4. **None of the above.** State is `ABSENT`, default-tier caps apply, and the boot loader publishes a `LicenseChangedEvent(ABSENT, null)` so listeners (Prometheus gauges, retention applier) settle on default values.

If a higher-priority source rejects (signature failure, tenant mismatch, expired), the loader logs the reason and **does not** fall through to a lower-priority source. This is deliberate: an operator who set `CAMELEER_SERVER_LICENSE_TOKEN` expects that token to be the active one, not a silently-stale DB row.

Any token loaded at boot also flows through `LicenseService.install(...)` so audit, persistence, and `LicenseChangedEvent` publishing are uniform across paths.

## Public-key configuration

```bash
export CAMELEER_SERVER_LICENSE_PUBLICKEY="$(cat cameleer-license-pub.b64)"
```

The value is the base64 encoding of the Ed25519 public key in X.509 SubjectPublicKeyInfo form (see `cameleer-license-minter/README.md` for generation).

When `CAMELEER_SERVER_LICENSE_PUBLICKEY` is **unset**:

- `LicenseBeanConfig.licenseValidator()` (line 62) logs a WARN: `CAMELEER_SERVER_LICENSE_PUBLICKEY not set — all licenses will be rejected as INVALID`.
- The bean is constructed against a throwaway public key whose private counterpart no one holds.
  The override's `validate(...)` always throws `IllegalStateException("license public key not configured")`.
- Any token loaded from any source routes through `LicenseService.install(...)`, fails validation, marks the gate `INVALID`, and writes a `reject_license` audit row with the failure reason.
- The state will be `INVALID`, default-tier caps apply, and the operator must set the variable and restart (or hot-install via POST after restart).

## REST API

All endpoints require an ADMIN-role JWT. Source-of-truth controllers: `cameleer-server-app/src/main/java/com/cameleer/server/app/controller/LicenseAdminController.java`, `LicenseUsageController.java`.

### `GET /api/v1/admin/license`

```json
{
  "state": "ACTIVE",
  "invalidReason": null,
  "envelope": {
    "licenseId": "fd3a8f2a-1c44-4eac-aa07-1a5d1ce9c4a4",
    "tenantId": "acme-prod",
    "label": "Acme Production",
    "limits": { "max_apps": 25, "max_environments": 3 },
    "issuedAt": "2026-04-26T10:00:00Z",
    "expiresAt": "2027-01-01T00:00:00Z",
    "gracePeriodDays": 14
  },
  "lastValidatedAt": "2026-04-26T03:00:00Z"
}
```

The raw token string is **deliberately not** returned; only the parsed envelope is. `lastValidatedAt` is omitted when no DB row exists yet (env-var or file source on first boot before the next revalidation tick).

### `POST /api/v1/admin/license`

```bash
curl -X POST https://server.example.com/api/v1/admin/license \
  -H "Authorization: Bearer ${ADMIN_JWT}" \
  -H "Content-Type: application/json" \
  -d '{"token": "eyJ...long.base64.string..."}'
```

Body shape: `{"token": ""}`. On success returns `{"state": "ACTIVE", "envelope": {...}}`. On failure returns HTTP 400 with `{"error": ""}`.

The handler delegates to `LicenseService.install(token, userId, "api")`. The acting `userId` comes from the authenticated principal, stripped of the `user:` prefix (see `app-classes.md` user-id convention).

This endpoint installs *or replaces*: there is one row per tenant in the `license` table, so a successful POST upserts and supersedes any prior token.
The previous license id is captured in the `replace_license` audit detail.

### `GET /api/v1/admin/license/usage`

```json
{
  "state": "ACTIVE",
  "expiresAt": "2027-01-01T00:00:00Z",
  "daysRemaining": 250,
  "gracePeriodDays": 14,
  "tenantId": "acme-prod",
  "label": "Acme Production",
  "lastValidatedAt": "2026-04-26T03:00:00Z",
  "message": "License active. 250 days remaining.",
  "limits": [
    {"key": "max_environments", "current": 2, "cap": 3, "source": "license"},
    {"key": "max_apps", "current": 12, "cap": 25, "source": "license"},
    {"key": "max_agents", "current": 38, "cap": 50, "source": "license"},
    {"key": "max_users", "current": 4, "cap": 3, "source": "default"}
  ]
}
```

For each effective-limits key:

- `current`: current usage. `max_agents` is read from the in-memory `AgentRegistryService.liveCount()`; everything else comes from `LicenseUsageReader.snapshot()` (PostgreSQL counts, plus deployment compute aggregates from `deployed_config_snapshot`). Limits the server does not measure return `0`.
- `cap`: effective cap (license override or default-tier value).
- `source`: `"license"` if the cap came from the token's `limits` map, `"default"` if it fell through.

## License state machine

```
        +---------------+
        |    ABSENT     |   (no token configured)
        +-------+-------+
                |
                |  install via env / file / DB / POST
                v
        +-------+-------+
        |    ACTIVE     |------------------------------+
        +-------+-------+                              |
                |                                      |  revalidate fails
                |  now > expiresAt                     |  sig/tenant/parse
                v                                      |
        +-------+-------+                              |
        |     GRACE     |                              |
        +-------+-------+                              |
                |                                      |
                |  now > exp + gracePeriodDays         |
                v                                      |
        +-------+-------+                              |
        |    EXPIRED    |                              |
        +-------+-------+                              v
                                               +-------+-------+
                                               |    INVALID    |
                                               +---------------+
        (INVALID: signature mismatch, tenant mismatch,
         missing public key, malformed payload)
```

Classification logic: `LicenseStateMachine.classify(license, invalidReason)` (`cameleer-server-core/src/main/java/com/cameleer/server/core/license/LicenseStateMachine.java`).

- `INVALID` and `EXPIRED` revert to **default-tier caps**.
  The license envelope is dropped from the gate (`getCurrent()` returns null in `INVALID`; the gate retains the parsed info in `EXPIRED` but `getEffectiveLimits()` returns defaults-only).
- `GRACE` keeps **license caps**. This is the only state in which the server keeps running on license caps while the operator should be actively working on renewal.

## Default tier caps

Source: `cameleer-server-core/src/main/java/com/cameleer/server/core/license/DefaultTierLimits.java`.

| Key | Default | Semantics |
|---|---|---|
| `max_environments` | 1 | Total environments across the tenant. |
| `max_apps` | 3 | Total apps across all environments. |
| `max_agents` | 5 | Live agents in the in-memory registry (LIVE state). |
| `max_users` | 3 | Local + OIDC users in the `users` table. |
| `max_outbound_connections` | 1 | Rows in `outbound_connections`. |
| `max_alert_rules` | 2 | Rows in `alert_rules`. |
| `max_total_cpu_millis` | 2000 | Sum of `replicas * cpuLimit` over non-stopped deployments. `cpuLimit` is in millicores; 1000 = one core. |
| `max_total_memory_mb` | 2048 | Sum of `replicas * memoryLimitMb` over non-stopped deployments. |
| `max_total_replicas` | 5 | Sum of `replicas` over non-stopped deployments. |
| `max_execution_retention_days` | 1 | Cap on TTL applied to `executions` and `processor_executions`. |
| `max_log_retention_days` | 1 | Cap on TTL applied to `logs`. |
| `max_metric_retention_days` | 1 | Cap on TTL applied to `agent_metrics` and `agent_events`. |
| `max_jar_retention_count` | 3 | Maximum JAR retention count per environment. |

The default tier is intentionally restrictive: it is sized for evaluation, single-developer demos, and "I forgot to install my license" recovery, not production. New customers should install a license at first onboarding.

## Cap-exceeded behavior

When a creation path exceeds its cap, `LicenseEnforcer.assertWithinCap(...)` throws `LicenseCapExceededException(limitKey, current, cap)`.
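
As a rough illustration of that check, the sketch below models `assertWithinCap(limitKey, currentUsage, requestedDelta)` in Java. The `CapSketch` wrapper, the method bodies, the comparison `currentUsage + requestedDelta > cap`, and the handling of unknown keys are all assumptions made for illustration; only the method and exception names come from this document, and the real implementation may differ.

```java
import java.util.Map;

public class CapSketch {

    /** Hypothetical stand-in for the server's LicenseCapExceededException. */
    public static class LicenseCapExceededException extends RuntimeException {
        public final String limitKey;
        public final long current;
        public final long cap;

        public LicenseCapExceededException(String limitKey, long current, long cap) {
            super("Cap reached for " + limitKey + " (" + current + " of " + cap + " used)");
            this.limitKey = limitKey;
            this.current = current;
            this.cap = cap;
        }
    }

    // Effective limits: license overrides merged over default-tier values.
    private final Map<String, Long> effectiveLimits;

    public CapSketch(Map<String, Long> effectiveLimits) {
        this.effectiveLimits = effectiveLimits;
    }

    /** Throws if creating requestedDelta more items would push usage past the cap. */
    public void assertWithinCap(String limitKey, long currentUsage, long requestedDelta) {
        Long cap = effectiveLimits.get(limitKey);
        if (cap == null) {
            // Assumption: every key has at least a default, so a miss is a caller bug.
            throw new IllegalArgumentException("unknown limit key: " + limitKey);
        }
        // Assumption: the request is rejected when the projected usage exceeds the cap.
        if (currentUsage + requestedDelta > cap) {
            throw new LicenseCapExceededException(limitKey, currentUsage, cap);
        }
    }

    public static void main(String[] args) {
        CapSketch enforcer = new CapSketch(Map.of("max_apps", 3L));
        enforcer.assertWithinCap("max_apps", 2, 1); // third app: allowed
        try {
            enforcer.assertWithinCap("max_apps", 3, 1); // fourth app: rejected
        } catch (LicenseCapExceededException e) {
            System.out.println(e.getMessage()); // Cap reached for max_apps (3 of 3 used)
        }
    }
}
```

Under these assumptions a creation path would call the check before persisting anything, so a rejected request leaves usage unchanged.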
`LicenseExceptionAdvice` (`@ControllerAdvice`) maps the exception to:

```http
HTTP/1.1 403 Forbidden
Content-Type: application/json

{
  "error": "license cap reached",
  "limit": "max_apps",
  "current": 3,
  "cap": 3,
  "state": "ABSENT",
  "message": "License absent. Default tier limits apply. Cap reached for max_apps (3 of 3 used)."
}
```

At the same time:

- The Prometheus counter `cameleer_license_cap_rejections_total{limit=...}` increments.
- An audit row is written: `category=LICENSE`, `action=cap_exceeded`, `target=`, `result=FAILURE`, `detail` carries `{limit, current, requested, cap, state}`. If audit storage fails, the 403 still surfaces (audit is best-effort here).

The `message` field is rendered by `LicenseMessageRenderer.forCap(...)` and varies per state: under `EXPIRED` it nudges the operator to renew; under `INVALID` it cites `invalidReason`.

## Retention semantics

The license caps `max_execution_retention_days`, `max_log_retention_days`, `max_metric_retention_days`, and `max_jar_retention_count` define **maximums**. Per-environment configuration (`environments.execution_retention_days`, `log_retention_days`, `metric_retention_days`, `jar_retention_count`) defines the **operator preference**. The effective TTL applied to ClickHouse tables is:

```
effective = min(licenseCap, env.configuredRetentionDays)
```

When `LicenseChangedEvent` fires (any install/replace/revalidate/boot transition), `RetentionPolicyApplier` (`@EventListener @Async`) recomputes TTL for every (table, env) pair using:

```sql
ALTER TABLE MODIFY TTL toDateTime() + INTERVAL DAY DELETE WHERE environment = ''
```

Tables affected: `executions`, `processor_executions`, `logs`, `agent_metrics`, `agent_events`.

Excluded:

- `route_diagrams`: content-addressed `ReplacingMergeTree`, no time-based TTL.
- `server_metrics`: server-wide, no `environment` column. Its 90-day cap is fixed in the schema.

ClickHouse failures are logged (WARN) but do not fail the originating license install; TTL recompute is best-effort.

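
The `min(licenseCap, env.configuredRetentionDays)` rule from this section can be sketched in a few lines of Java. The class and method names (`RetentionClampSketch`, `effectiveDays`) are hypothetical and exist only for this illustration; they are not taken from the server codebase.

```java
public class RetentionClampSketch {

    /** effective = min(licenseCap, env.configuredRetentionDays) */
    public static int effectiveDays(int licenseCapDays, int configuredDays) {
        return Math.min(licenseCapDays, configuredDays);
    }

    public static void main(String[] args) {
        // License cap of 30 days, environment configured for 90: the cap wins.
        System.out.println(effectiveDays(30, 90)); // 30
        // License cap of 30 days, environment configured for 7: the operator preference wins.
        System.out.println(effectiveDays(30, 7));  // 7
        // Default tier (cap of 1 day), environment still configured for 90.
        System.out.println(effectiveDays(1, 90));  // 1
    }
}
```

The last case reflects the rules stated in this document: when the state falls back to default-tier caps, the retention cap is 1 day, so the effective TTL shrinks even though the environment configuration is unchanged.
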
## Daily revalidation

`LicenseRevalidationJob` (`@Scheduled(cron = "0 0 3 * * *")`) re-runs `LicenseService.revalidate()` against the persisted token at 03:00 server-local time. It also fires once 60 seconds after `ApplicationReadyEvent` to catch the case where a license was installed via SQL between server starts.

Each revalidation:

- Re-reads the token from the `license` table.
- Runs `LicenseValidator.validate(...)` again, with the same checks as install (signature, tenant, expiry).
- On success: bumps `last_validated_at`, reloads the gate, publishes `LicenseChangedEvent`.
- On failure: marks the gate `INVALID`, writes an audit row `revalidate_license` / `FAILURE`, publishes `LicenseChangedEvent(INVALID, null)`.

A token transitioning `ACTIVE → GRACE → EXPIRED` will surface as a state change at the next revalidation tick (or on the next license-touching admin action).

## Audit categories

All license lifecycle events use `AuditCategory.LICENSE`. Action codes:

| Action | Result | Detail keys |
|---|---|---|
| `install_license` | SUCCESS | `licenseId, expiresAt, installedBy, source` |
| `replace_license` | SUCCESS | same plus `previousLicenseId` |
| `reject_license` | FAILURE | `reason, source` |
| `revalidate_license` | FAILURE | `licenseId, reason` |
| `cap_exceeded` | FAILURE | `limit, current, requested, cap, state` |

The `source` value is one of `env`, `file`, `db`, `api` and corresponds to the install path.

## Prometheus metrics

Scraped at `/api/v1/prometheus`. Source: `LicenseMetrics` (`cameleer-server-app/src/main/java/com/cameleer/server/app/license/LicenseMetrics.java`).

| Metric | Type | Labels | Semantics |
|---|---|---|---|
| `cameleer_license_state` | gauge | `state=` | One-hot per state: exactly one tag value carries `1.0` at any time, others are `0.0`. |
| `cameleer_license_days_remaining` | gauge | (none) | Whole days until `expiresAt`. `-1.0` when no license is loaded (ABSENT/INVALID). Suitable alert thresholds: warn at 30, page at 7. |
| `cameleer_license_last_validated_age_seconds` | gauge | (none) | Seconds since the persisted `last_validated_at`. `0` when there is no DB row. Alerts at >86400 (revalidation hasn't run for >24h) detect a stuck scheduler or a misconfigured server. |
| `cameleer_license_cap_rejections_total` | counter | `limit=` | Incremented every time `LicenseEnforcer` rejects a creation due to a cap. A non-zero rate indicates customers hitting their plan ceiling. |

Gauges refresh on every `LicenseChangedEvent` and on a 60-second `@Scheduled(fixedDelay)` so values stay current even without state changes.

## Troubleshooting

### My license shows `INVALID` — why?

Check `invalidReason` from `GET /api/v1/admin/license`. Common causes:

| `invalidReason` substring | Cause | Fix |
|---|---|---|
| `License signature verification failed` | Public key on the server does not match the private key the token was signed with. | Confirm `CAMELEER_SERVER_LICENSE_PUBLICKEY` matches the keypair used to mint the token. |
| `License tenantId 'X' does not match server tenant 'Y'` | Token minted for a different `tenantId`. | Re-mint with `--tenant=` matching `CAMELEER_SERVER_TENANT_ID`. |
| `licenseId is required` / `tenantId is required` / `exp is required` | Malformed token (missing required field). | Re-mint via the supported minter; these fields are mandatory. |
| `License expired at <...>` | Past `expiresAt + gracePeriodDays`. | Issue a renewal license. |
| `license public key not configured` | `CAMELEER_SERVER_LICENSE_PUBLICKEY` is unset. | Set the env var and either restart or POST the token again. |

### I'm getting 403s on creates — which cap is biting?

```bash
curl https://server.example.com/api/v1/admin/license/usage \
  -H "Authorization: Bearer ${ADMIN_JWT}"
```

The `limits[]` array shows current/cap per limit key. Any row with `current >= cap` is a candidate.

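
To make the `current >= cap` check concrete, the sketch below filters rows shaped like the `limits[]` entries of the usage response. The `LimitRow` record and the `atOrOverCap` helper are hypothetical names used only for illustration; the sample values are copied from the usage example earlier in this document, and JSON parsing is deliberately left out.

```java
import java.util.List;
import java.util.stream.Collectors;

public class UsageCheckSketch {

    /** Hypothetical mirror of one entry in the limits[] array. */
    public record LimitRow(String key, long current, long cap, String source) {}

    /** Returns the keys of all rows where current >= cap. */
    public static List<String> atOrOverCap(List<LimitRow> rows) {
        return rows.stream()
                .filter(row -> row.current() >= row.cap())
                .map(LimitRow::key)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<LimitRow> rows = List.of(
                new LimitRow("max_environments", 2, 3, "license"),
                new LimitRow("max_apps", 12, 25, "license"),
                new LimitRow("max_agents", 38, 50, "license"),
                new LimitRow("max_users", 4, 3, "default"));
        System.out.println(atOrOverCap(rows)); // [max_users]
    }
}
```

In that sample only `max_users` qualifies, and its `source` of `"default"` suggests the installed token does not raise that limit.
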
The 403 response body itself names the limit:

```json
{"error":"license cap reached","limit":"max_apps","current":3,"cap":3,"state":"ABSENT", ...}
```

If `state` is `ABSENT`, `EXPIRED`, or `INVALID`, the fix is to install a license. If `state` is `ACTIVE` and you are at the license cap, you need a higher-tier license re-issued.

### My new license didn't take effect

1. Check the audit log:

   ```bash
   curl 'https://server.example.com/api/v1/admin/audit?category=LICENSE&limit=10' \
     -H "Authorization: Bearer ${ADMIN_JWT}"
   ```

   You should see an `install_license` or `replace_license` row at `SUCCESS`. A `reject_license` `FAILURE` row carries the reason.
2. Confirm the public key matches the private key used to mint:
   - Vendor side: `openssl pkey -in -pubout -outform DER | base64 -w0`
   - Server side: `echo $CAMELEER_SERVER_LICENSE_PUBLICKEY`
   - These must be byte-identical.
3. Confirm `CAMELEER_SERVER_TENANT_ID` matches the `tenantId` in the token envelope (`GET /api/v1/admin/license`).
4. If the env var token disagrees with what's in the DB (e.g. you POSTed but a stale env var remains), the env var wins on next boot. Either remove the env var or update it before restarting.

### Cap rejections spiking but no licensed customer should be hitting the cap

Inspect `cameleer_license_cap_rejections_total{limit=...}`. If a tenant is on default tier (state = `ABSENT`/`EXPIRED`/`INVALID`), the very low default caps will trip immediately on routine activity. Install a license to restore expected behavior.

### Retention TTL didn't change after installing a license

`RetentionPolicyApplier` runs on `LicenseChangedEvent` asynchronously (`@Async`). Look for the log lines:

```
License changed (state=ACTIVE) — recomputing TTL across N environment(s) and 5 table(s)
Applied TTL: table=executions env=prod days=30 (cap=30, configured=90)
```

If the log shows `Failed to apply TTL` warnings, ClickHouse rejected the `ALTER TABLE ... MODIFY TTL` statement, most often because of a permissions issue or a ClickHouse version below 22.3. The license install itself still succeeded; the TTL change just didn't land.