# Certificate Management Design

## Problem

The platform currently generates a self-signed TLS certificate at bootstrap time via an Alpine init container. There is no way to supply a real certificate at bootstrap, replace it at runtime, or manage CA trust bundles for tenant enterprise SSO providers. Internal services bypass TLS verification with hardcoded flags (`CAMELEER_OIDC_TLS_SKIP_VERIFY=true`, `NODE_TLS_REJECT_UNAUTHORIZED=0`).

## Goals

1. Supply a cert+key at bootstrap time (env vars pointing to files)
2. Replace the platform TLS certificate at runtime via vendor UI
3. Manage a CA trust bundle (`ca.pem`) aggregating platform CA + tenant enterprise CAs
4. Stage certificates before activation (shadow certs)
5. Roll back to the previous certificate if activation causes issues
6. Flag tenants that need restart after CA bundle changes
7. Provider-based architecture: Docker now, K8s later

## Non-Goals

- ACME/Let's Encrypt integration (separate future work)
- Per-tenant TLS certificates (all tenants share the platform cert via Traefik)
- Client certificate authentication (mTLS)

## Architecture

### Provider Interface

```java
package net.siegeln.cameleer.saas.certificate;

public interface CertificateManager {
    boolean isAvailable();
    CertificateInfo getActive();
    CertificateInfo getStaged();
    CertificateInfo getArchived();
    CertValidationResult stage(byte[] certPem, byte[] keyPem, byte[] caBundlePem);
    void activate();
    void restore();
    void discardStaged();
    void generateSelfSigned(String hostname);
    byte[] getCaBundle();
}
```

Lives in `net.siegeln.cameleer.saas.certificate`. Implementation in `net.siegeln.cameleer.saas.provisioning` alongside `DockerTenantProvisioner`.

`DockerCertificateManager` writes to the Docker `certs` volume. Future `K8sCertificateManager` would manage K8s TLS Secrets + cert-manager CRDs.

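To illustrate how a Docker-backed implementation might write files to the `certs` volume without readers ever seeing a half-written file, here is a minimal sketch of a write, validate, rename helper. `AtomicFileWriter` and its `validator` parameter are hypothetical names, not part of this design, and the sketch assumes the `.wip` file and its target live on the same filesystem.

```java
import java.io.IOException;
import java.nio.file.AtomicMoveNotSupportedException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.function.Predicate;

// Hypothetical helper, not part of the design: write "<name>.wip", validate, rename.
public final class AtomicFileWriter {
    private AtomicFileWriter() {}

    public static void write(Path target, byte[] data, Predicate<byte[]> validator)
            throws IOException {
        // The temporary file sits next to the target so the rename stays on one filesystem.
        Path wip = target.resolveSibling(target.getFileName() + ".wip");
        Files.write(wip, data);

        // Validate what actually landed on disk before exposing it under the final name.
        if (!validator.test(Files.readAllBytes(wip))) {
            Files.deleteIfExists(wip);
            throw new IOException("Validation failed, keeping existing " + target);
        }

        try {
            // On POSIX filesystems this maps to rename(2), which replaces the target in one step.
            Files.move(wip, target, StandardCopyOption.ATOMIC_MOVE);
        } catch (AtomicMoveNotSupportedException e) {
            // Assumed fallback: a plain replace, which is no longer guaranteed to be atomic.
            Files.move(wip, target, StandardCopyOption.REPLACE_EXISTING);
        }
    }
}
```

A caller could pass a PEM check as the validator, for example `AtomicFileWriter.write(certsDir.resolve("ca.pem"), bundle, bytes -> bytes.length > 0)`. When validation fails, the previous file is left untouched.
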
### Records

```java
import java.time.Instant;
import java.util.List;

public record CertificateInfo(
    String subject, String issuer, Instant notBefore, Instant notAfter,
    boolean hasCaBundle, boolean selfSigned, String fingerprint
) {}

public record CertValidationResult(
    boolean valid, List<String> errors, CertificateInfo info
) {}
```

### File Layout (Docker Volume)

```
/certs/
  cert.pem        <- ACTIVE platform cert (Traefik reads)
  key.pem         <- ACTIVE private key
  ca.pem          <- aggregated CA bundle (platform CA + tenant CAs)
  meta.json       <- bootstrap metadata for DB seeding
  staged/
    cert.pem      <- STAGED cert
    key.pem       <- STAGED key
    ca.pem        <- STAGED CA bundle
  prev/
    cert.pem      <- ARCHIVED (one previous)
    key.pem
    ca.pem
```

Atomic swap pattern: write to `*.wip`, validate, rename to final path.

### Database

```sql
-- V011__certificates.sql
CREATE TABLE certificates (
    id           UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    status       VARCHAR(10) NOT NULL CHECK (status IN ('ACTIVE', 'STAGED', 'ARCHIVED')),
    subject      VARCHAR(500),
    issuer       VARCHAR(500),
    not_before   TIMESTAMPTZ,
    not_after    TIMESTAMPTZ,
    fingerprint  VARCHAR(128),
    has_ca       BOOLEAN NOT NULL DEFAULT FALSE,
    self_signed  BOOLEAN NOT NULL DEFAULT FALSE,
    uploaded_by  UUID,
    created_at   TIMESTAMPTZ NOT NULL DEFAULT now(),
    activated_at TIMESTAMPTZ,
    archived_at  TIMESTAMPTZ
);
```

At most 3 rows: one per status. On activate: delete ARCHIVED -> ACTIVE becomes ARCHIVED -> STAGED becomes ACTIVE.

Tenant staleness tracked via `ca_applied_at` column on `tenants` table:

```sql
-- in same migration
ALTER TABLE tenants ADD COLUMN ca_applied_at TIMESTAMPTZ;
```

Tenants with `ca_applied_at < (active cert's activated_at)` are stale.

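The activation rotation and the staleness predicate can be sketched in memory as follows. This is an illustrative model only: `CertificateRotation` is a hypothetical class, a fingerprint string stands in for a full `certificates` row, and treating a missing `ca_applied_at` as "not stale" simply mirrors how the SQL comparison behaves with `NULL`; the design does not say how never-applied tenants should be handled.

```java
import java.time.Instant;
import java.util.EnumMap;
import java.util.Map;

// Hypothetical in-memory model of the "one row per status" rule.
public final class CertificateRotation {
    public enum Status { ACTIVE, STAGED, ARCHIVED }

    // One slot per status; the String is a stand-in for a full certificates row.
    private final Map<Status, String> slots = new EnumMap<>(Status.class);
    private Instant activatedAt;

    public void stage(String fingerprint) {
        slots.put(Status.STAGED, fingerprint);
    }

    public String get(Status status) {
        return slots.get(status);
    }

    // delete ARCHIVED -> ACTIVE becomes ARCHIVED -> STAGED becomes ACTIVE
    public void activate(Instant now) {
        String staged = slots.remove(Status.STAGED);
        if (staged == null) {
            throw new IllegalStateException("No staged certificate to activate");
        }
        slots.remove(Status.ARCHIVED);
        String previousActive = slots.remove(Status.ACTIVE);
        if (previousActive != null) {
            slots.put(Status.ARCHIVED, previousActive);
        }
        slots.put(Status.ACTIVE, staged);
        activatedAt = now;
    }

    // Mirrors: ca_applied_at < (active cert's activated_at).
    // Assumption: a null on either side is "not stale", as in SQL NULL comparison.
    public boolean isStale(Instant caAppliedAt) {
        return caAppliedAt != null
                && activatedAt != null
                && caAppliedAt.isBefore(activatedAt);
    }
}
```

In the real service the same three steps would presumably run inside a single database transaction, so that the table never holds two rows with the same status.
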
### State Transitions

```
Upload -> STAGED -> activate -> ACTIVE -> (next activate) -> ARCHIVED
                                ^                              |
                                +------ restore ---------------+
```

- **Activate staged**: delete ARCHIVED row+files, ACTIVE -> ARCHIVED (move files to prev/), STAGED -> ACTIVE (move files to root)
- **Restore archived**: swap ACTIVE <-> ARCHIVED (swap files and DB statuses)
- **Discard staged**: delete STAGED row + staged/ files

### Bootstrap Flow

The `traefik-certs` init container gains env var support:

```
1. cert.pem + key.pem exist in volume?
   -> Yes: skip (idempotent)
   -> No: continue
2. CERT_FILE + KEY_FILE env vars set?
   -> Yes: copy to volume, validate (PEM parseable, key matches cert)
           If CA_FILE set, copy as ca.pem
   -> No: generate self-signed (current behavior)
3. Write /certs/meta.json with subject, fingerprint, self_signed flag
```

SaaS app reads `meta.json` on startup to seed the certificates DB table if no ACTIVE row exists.

### REST API

All under `platform:admin` scope:

| Method | Path | Description |
|--------|------|-------------|
| GET | `/api/vendor/certificates` | List active, staged, archived |
| POST | `/api/vendor/certificates/stage` | Upload cert+key+ca (multipart) |
| POST | `/api/vendor/certificates/activate` | Promote staged -> active |
| POST | `/api/vendor/certificates/restore` | Swap archived <-> active |
| DELETE | `/api/vendor/certificates/staged` | Discard staged |
| GET | `/api/vendor/certificates/stale-tenants` | Tenants needing restart for CA |

### Service Layer

`CertificateService` orchestrates:

- Validation (PEM parsing, key-cert match, chain building, expiry check)
- Delegates file operations to `CertificateManager` (provider)
- Manages DB metadata
- Computes tenant CA staleness

### CA Bundle Management

`ca.pem` is a concatenation of:

- Platform cert's CA (if from a private CA, supplied at bootstrap or upload)
- Tenant-supplied CAs (for enterprise SSO with private IdPs)

On any CA change (platform cert upload with CA, tenant CA add/remove):

1. Rebuild: concatenate all CAs into `ca.wip`
2. Validate: parse all PEM entries, verify structure
3. Atomic swap: `mv ca.wip ca.pem`
4. Update `activated_at` on ACTIVE cert row
5. Flag tenants as stale

### Tenant CA Distribution

At provisioning time (`DockerTenantProvisioner`):

- Mount `certs` volume read-only at `/certs` in tenant containers
- Java servers: JVM truststore import at entrypoint or `JAVA_OPTS` with custom truststore
- Node containers: `NODE_EXTRA_CA_CERTS=/certs/ca.pem`
- Set `ca_applied_at = now()` on tenant record
- Remove TLS skip flags when `ca.pem` exists

On tenant restart (manual, after CA change):

- Container picks up current `ca.pem` from volume mount
- Update `ca_applied_at` on tenant

### Vendor UI

New "Certificates" page in vendor sidebar:

- **Active cert card**: subject, issuer, expiry, fingerprint, self-signed badge, activated date
- **Staged cert card** (conditional): same metadata + Activate / Discard buttons, validation errors if any
- **Archived cert card** (conditional): same metadata + Restore button (disabled if expired)
- **Upload area**: file inputs for cert.pem (required), key.pem (required), ca.pem (optional)
- **Stale tenants banner**: "CA bundle updated - N tenants need restart" with restart action

### React Hooks

```typescript
useVendorCertificates()    // GET /vendor/certificates
useStageCertificate()      // POST multipart
useActivateCertificate()   // POST activate
useRestoreCertificate()    // POST restore
useDiscardStaged()         // DELETE staged
useStaleTenants()          // GET stale-tenants
```

## File Inventory

### New Files

| File | Description |
|------|-------------|
| `src/.../certificate/CertificateManager.java` | Provider interface |
| `src/.../certificate/CertificateInfo.java` | Cert metadata record |
| `src/.../certificate/CertValidationResult.java` | Validation result record |
| `src/.../certificate/CertificateEntity.java` | JPA entity |
| `src/.../certificate/CertificateRepository.java` | Spring Data repo |
| `src/.../certificate/CertificateService.java` | Business logic |
| `src/.../certificate/CertificateController.java` | REST endpoints |
| `src/.../provisioning/DockerCertificateManager.java` | Docker volume implementation |
| `src/main/resources/db/migration/V011__certificates.sql` | Migration |
| `ui/src/api/certificate-hooks.ts` | React Query hooks |
| `ui/src/pages/vendor/CertificatesPage.tsx` | Vendor UI page |

### Modified Files

| File | Change |
|------|--------|
| `docker-compose.yml` | Add CERT_FILE/KEY_FILE/CA_FILE env vars to init container |
| `traefik.yml` | No change (already reads from /certs/) |
| `src/.../provisioning/DockerTenantProvisioner.java` | Mount certs volume, set CA env vars, remove TLS skip flags |
| `ui/src/components/Layout.tsx` | Add Certificates sidebar item |
| `ui/src/router.tsx` | Add certificates route |
| `ui/src/api/vendor-hooks.ts` | Or new file for cert hooks |
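
As a supplementary sketch for the rebuild and validate steps listed under CA Bundle Management, the following shows one way the concatenation and a purely structural PEM check could look. `CaBundleBuilder` is a hypothetical name, and the check only confirms the `CERTIFICATE` markers and Base64 bodies; it does not parse X.509. A real implementation would more likely hand the bytes to `java.security.cert.CertificateFactory`.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Base64;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: concatenates CA PEMs after a structural check of every entry.
public final class CaBundleBuilder {
    private static final String BEGIN = "-----BEGIN CERTIFICATE-----";
    private static final String END = "-----END CERTIFICATE-----";
    private static final Pattern PEM_CERT = Pattern.compile(
            BEGIN + "\\s*([A-Za-z0-9+/=\\s]+?)\\s*" + END);

    private CaBundleBuilder() {}

    // Returns the Base64 body of each CERTIFICATE block; rejects stray text around blocks.
    static List<String> parseEntries(String pem) {
        List<String> bodies = new ArrayList<>();
        Matcher m = PEM_CERT.matcher(pem);
        int consumed = 0;
        while (m.find()) {
            if (!pem.substring(consumed, m.start()).isBlank()) {
                throw new IllegalArgumentException("Unexpected text before PEM entry");
            }
            String body = m.group(1).replaceAll("\\s", "");
            if (body.isEmpty()) {
                throw new IllegalArgumentException("Empty PEM entry");
            }
            Base64.getDecoder().decode(body); // throws IllegalArgumentException on bad Base64
            bodies.add(body);
            consumed = m.end();
        }
        if (!pem.substring(consumed).isBlank()) {
            throw new IllegalArgumentException("Unexpected text after last PEM entry");
        }
        return bodies;
    }

    // Builds the bytes that would be written to ca.wip before the rename to ca.pem.
    public static byte[] build(List<String> caPems) {
        StringBuilder out = new StringBuilder();
        for (String pem : caPems) {
            for (String body : parseEntries(pem)) {
                out.append(BEGIN).append('\n');
                // Re-wrap the Base64 body at 64 characters per line.
                for (int i = 0; i < body.length(); i += 64) {
                    out.append(body, i, Math.min(i + 64, body.length())).append('\n');
                }
                out.append(END).append('\n');
            }
        }
        return out.toString().getBytes(StandardCharsets.US_ASCII);
    }
}
```

Normalizing each entry on the way out keeps the bundle well-formed even when an uploaded CA file lacks a trailing newline, which would otherwise fuse two PEM markers onto one line.
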