# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

## Project

Cameleer SaaS — **vendor management plane** for the Cameleer observability stack. Two personas: **vendor** (platform:admin) manages the platform and provisions tenants; **tenant admin** (tenant:manage) manages their observability instance. The vendor creates tenants, which provisions per-tenant cameleer-server + UI instances via Docker API. No example tenant — clean slate bootstrap, vendor creates everything.

## Ecosystem

This repo is the SaaS layer on top of two proven components:

- **cameleer** (sibling repo) — Java agent using ByteBuddy for zero-code instrumentation of Camel apps. Captures route executions, processor traces, payloads, metrics, and route graph topology. Deploys as `-javaagent` JAR.
- **cameleer-server** (sibling repo) — Spring Boot observability backend. Receives agent data via HTTP, pushes config/commands via SSE. PostgreSQL + ClickHouse storage. React SPA dashboard. JWT auth with Ed25519 config signing. Docker container orchestration for app deployments.
- **cameleer-website** — Marketing site (Astro 5)
- **design-system** — Shared React component library (`@cameleer/design-system` on Gitea npm registry)

Agent-server protocol is defined in `cameleer/cameleer-common/PROTOCOL.md`. The agent and server are mature, proven components — this repo wraps them with multi-tenancy, billing, and self-service onboarding.

## Key Classes

### Java Backend (`src/main/java/net/siegeln/cameleer/saas/`)

**config/** — Security, tenant isolation, web config
- `SecurityConfig.java` — OAuth2 JWT decoder (ES384, issuer/audience validation, scope extraction)
- `TenantIsolationInterceptor.java` — HandlerInterceptor on `/api/**`; JWT org_id -> TenantContext, path variable validation, fail-closed
- `TenantContext.java` — ThreadLocal tenant ID storage
- `WebConfig.java` — registers TenantIsolationInterceptor
- `PublicConfigController.java` — GET /api/config (Logto endpoint, SPA client ID, scopes)
- `MeController.java` — GET /api/me (authenticated user, tenant list)

**tenant/** — Tenant data model
- `TenantEntity.java` — JPA entity (id, name, slug, tier, status, logto_org_id, stripe IDs, settings JSONB, db_password)

**vendor/** — Vendor console (platform:admin only)
- `VendorTenantService.java` — orchestrates tenant creation (sync: DB + Logto + license, async: Docker provisioning + config push), suspend/activate, delete, restart server, upgrade server (force-pull + re-provision), license renewal
- `VendorTenantController.java` — REST at `/api/vendor/tenants` (platform:admin required). List endpoint returns `VendorTenantSummary` with fleet health data (agentCount, environmentCount, agentLimit) fetched in parallel via `CompletableFuture`.
- `InfrastructureService.java` — raw JDBC queries against shared PostgreSQL and ClickHouse for per-tenant infrastructure monitoring (schema sizes, table stats, row counts, disk usage)
- `InfrastructureController.java` — REST at `/api/vendor/infrastructure` (platform:admin required). PostgreSQL and ClickHouse overview with per-tenant breakdown.

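
The parallel fleet-health fetch mentioned for `VendorTenantController` could look roughly like the sketch below. It is an illustration only: `FleetHealthSketch`, `Summary`, and the two counter functions are hypothetical stand-ins, not the repository's `VendorTenantSummary` or `ServerApiClient` signatures.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.function.ToIntFunction;

/** Hypothetical sketch; names are illustrative, not the repository's API. */
final class FleetHealthSketch {

    /** Minimal stand-in for one row of the tenant list response. */
    static final class Summary {
        final String slug;
        final int agentCount;
        final int environmentCount;

        Summary(String slug, int agentCount, int environmentCount) {
            this.slug = slug;
            this.agentCount = agentCount;
            this.environmentCount = environmentCount;
        }
    }

    /**
     * Submits both lookups for every tenant before waiting on any result,
     * so the remote calls can overlap instead of running one after another.
     */
    static List<Summary> summarize(List<String> slugs,
                                   ToIntFunction<String> agentCounter,
                                   ToIntFunction<String> environmentCounter) {
        List<CompletableFuture<Summary>> pending = new ArrayList<>();
        for (String slug : slugs) {
            CompletableFuture<Integer> agents =
                    CompletableFuture.supplyAsync(() -> agentCounter.applyAsInt(slug));
            CompletableFuture<Integer> envs =
                    CompletableFuture.supplyAsync(() -> environmentCounter.applyAsInt(slug));
            // Combine the two independent results once both have completed.
            pending.add(agents.thenCombine(envs, (a, e) -> new Summary(slug, a, e)));
        }
        List<Summary> result = new ArrayList<>();
        for (CompletableFuture<Summary> future : pending) {
            result.add(future.join()); // joining in submission order keeps the input order
        }
        return result;
    }
}
```

A real implementation would probably also need a timeout and a fallback value per tenant, since one unreachable server should not fail the whole list.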

**portal/** — Tenant admin portal (org-scoped)
- `TenantPortalService.java` — customer-facing: dashboard (health + agent/env counts from server via M2M), license, SSO connectors, team, settings (public endpoint URL), server restart/upgrade, password management (own + team + server admin)
- `TenantPortalController.java` — REST at `/api/tenant/*` (org-scoped, includes CA cert management at `/api/tenant/ca`, password endpoints at `/api/tenant/password` and `/api/tenant/server/admin-password`)

**provisioning/** — Pluggable tenant provisioning
- `TenantProvisioner.java` — pluggable interface (like server's RuntimeOrchestrator)
- `DockerTenantProvisioner.java` — Docker implementation, creates per-tenant server + UI containers with per-tenant JDBC credentials (`currentSchema=tenant_{slug}&ApplicationName=tenant_{slug}`). `upgrade(slug)` force-pulls latest images and removes server+UI containers (preserves app containers, volumes, networks) for re-provisioning. `remove(slug)` does full cleanup: label-based container removal, env networks, tenant network, JAR volume.
- `TenantDatabaseService.java` — creates/drops per-tenant PostgreSQL users (`tenant_{slug}`) and schemas; used during provisioning and delete
- `TenantDataCleanupService.java` — GDPR data erasure on tenant delete: deletes ClickHouse data across all tables with `tenant_id` column (PostgreSQL cleanup handled by `TenantDatabaseService`)
- `TenantProvisionerAutoConfig.java` — auto-detects Docker socket
- `DockerCertificateManager.java` — file-based cert management with atomic `.wip` swap (Docker volume)
- `DisabledCertificateManager.java` — no-op when certs dir unavailable
- `CertificateManagerAutoConfig.java` — auto-detects `/certs` directory

**certificate/** — TLS certificate lifecycle management
- `CertificateManager.java` — provider interface (Docker now, K8s later)
- `CertificateService.java` — orchestrates stage/activate/restore/discard, DB metadata, tenant CA staleness
- `CertificateController.java` — REST at `/api/vendor/certificates` (platform:admin required)
- `CertificateEntity.java` — JPA entity (status: ACTIVE/STAGED/ARCHIVED, subject, fingerprint, etc.)
- `CertificateStartupListener.java` — seeds DB from filesystem on boot (for bootstrap-generated certs)
- `TenantCaCertEntity.java` — JPA entity for per-tenant CA certs (PEM stored in DB, multiple per tenant)
- `TenantCaCertRepository.java` — queries by tenant, status, all active across tenants
- `TenantCaCertService.java` — stage/activate/delete tenant CAs, rebuilds aggregated `ca.pem` on changes

**license/** — License management
- `LicenseEntity.java` — JPA entity (id, tenant_id, tier, features JSONB, limits JSONB, expires_at)
- `LicenseService.java` — generation, validation, feature/limit lookups
- `LicenseController.java` — POST issue, GET verify, DELETE revoke

**identity/** — Logto & server integration
- `LogtoConfig.java` — Logto endpoint, M2M credentials (reads from bootstrap file)
- `LogtoManagementClient.java` — Logto Management API calls (create org, create user, add to org, get user, SSO connectors, JIT provisioning, password updates via `PATCH /api/users/{id}/password`)
- `ServerApiClient.java` — M2M client for cameleer-server API (Logto M2M token, `X-Cameleer-Protocol-Version: 1` header). Health checks, license/OIDC push, agent count, environment count, server admin password reset per tenant server.

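
As a rough illustration of the request shape `ServerApiClient` is described as sending, the sketch below builds a health-check request with the JDK's `java.net.http` API. The class name, the choice of HTTP client, the five-second timeout, and sending the bearer token on the health call are all assumptions; only the header name, the `cameleer-server-{slug}:8081` host pattern, and the `/api/v1/health` path come from this file.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.time.Duration;

/** Hypothetical sketch; the real ServerApiClient may use a different HTTP client. */
final class ServerRequestSketch {

    static final String PROTOCOL_HEADER = "X-Cameleer-Protocol-Version";

    private ServerRequestSketch() {
    }

    /** Builds (but does not send) a health-check request for one tenant server. */
    static HttpRequest healthCheck(String slug, String m2mToken) {
        // Assumption: the SaaS app reaches the server by its per-tenant DNS alias.
        URI uri = URI.create("http://cameleer-server-" + slug + ":8081/api/v1/health");
        return HttpRequest.newBuilder(uri)
                .header(PROTOCOL_HEADER, "1")
                .header("Authorization", "Bearer " + m2mToken) // assumed; health may not need it
                .timeout(Duration.ofSeconds(5))                // assumed value
                .GET()
                .build();
    }
}
```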

**audit/** — Audit logging
- `AuditEntity.java` — JPA entity (actor_id, actor_email, tenant_id, action, resource, status)
- `AuditService.java` — log audit events (TENANT_CREATE, TENANT_UPDATE, etc.); auto-resolves actor name from Logto when actorEmail is null (cached in-memory)

### React Frontend (`ui/src/`)

- `main.tsx` — React 19 root
- `router.tsx` — `/vendor/*` + `/tenant/*` with `RequireScope` guards and `LandingRedirect` that waits for scopes
- `Layout.tsx` — persona-aware sidebar: vendor sees expandable "Vendor" section (Tenants, Audit Log, Certificates, Infrastructure, Identity/Logto), tenant admin sees Dashboard/License/SSO/Team/Audit/Settings
- `OrgResolver.tsx` — merges global + org-scoped token scopes (vendor's platform:admin is global)
- `config.ts` — fetch Logto config from /platform/api/config
- `auth/useAuth.ts` — auth hook (isAuthenticated, logout, signIn)
- `auth/useOrganization.ts` — Zustand store for current tenant
- `auth/useScopes.ts` — decode JWT scopes, hasScope()
- `auth/ProtectedRoute.tsx` — guard (redirects to /login)
- **Vendor pages**: `VendorTenantsPage.tsx`, `CreateTenantPage.tsx`, `TenantDetailPage.tsx`, `VendorAuditPage.tsx`, `CertificatesPage.tsx`
- **Tenant pages**: `TenantDashboardPage.tsx` (restart + upgrade server), `TenantLicensePage.tsx`, `SsoPage.tsx`, `TeamPage.tsx` (reset member passwords), `TenantAuditPage.tsx`, `SettingsPage.tsx` (change own password, reset server admin password)

### Custom Sign-in UI (`ui/sign-in/src/`)

- `SignInPage.tsx` — form with @cameleer/design-system components
- `experience-api.ts` — Logto Experience API client (4-step: init -> verify -> identify -> submit)

## Architecture Context

The SaaS platform is a **vendor management plane**. It does not proxy requests to servers — instead it provisions dedicated per-tenant cameleer-server instances via Docker API. Each tenant gets isolated server + UI containers with their own database schemas, networks, and Traefik routing.

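
Per-tenant isolation rests on a handful of names derived from the tenant slug. The sketch below collects the patterns documented in this file (`tenant_{slug}`, `cameleer-tenant-{slug}`, `cameleer-jars-{slug}`, `/t/{slug}`, and the per-tenant JDBC URL) in one place. The class `TenantNamingSketch` and its method names are hypothetical; the repository may build these strings inline in `DockerTenantProvisioner` or elsewhere.

```java
/** Hypothetical helper; shown only to make the per-tenant naming patterns concrete. */
final class TenantNamingSketch {

    private TenantNamingSketch() {
    }

    /** PostgreSQL schema name, which is also the per-tenant PostgreSQL user name. */
    static String pgSchema(String slug) {
        return "tenant_" + slug;
    }

    /** Internal bridge network that isolates the tenant's server and apps. */
    static String tenantNetwork(String slug) {
        return "cameleer-tenant-" + slug;
    }

    /** Docker volume shared between the server and its deployed app containers. */
    static String jarVolume(String slug) {
        return "cameleer-jars-" + slug;
    }

    /** Path prefix that Traefik routes to the tenant's UI container. */
    static String routePrefix(String slug) {
        return "/t/" + slug;
    }

    /** JDBC URL scoped to the tenant schema; ApplicationName tags the connection for diagnostics. */
    static String jdbcUrl(String slug) {
        String schema = pgSchema(slug);
        return "jdbc:postgresql://cameleer-postgres:5432/cameleer"
                + "?currentSchema=" + schema
                + "&ApplicationName=" + schema;
    }
}
```

The sketch does no slug validation. Because the slug ends up in SQL identifiers, Docker names, and URL paths, whatever creates it presumably restricts the allowed characters.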
### Routing (single-domain, path-based via Traefik)

All services on one hostname. Infrastructure containers (Traefik, Logto) use `PUBLIC_HOST` + `PUBLIC_PROTOCOL` env vars directly. The SaaS app reads these via `CAMELEER_SAAS_PROVISIONING_PUBLICHOST` / `CAMELEER_SAAS_PROVISIONING_PUBLICPROTOCOL` (Spring Boot properties `cameleer.saas.provisioning.publichost` / `cameleer.saas.provisioning.publicprotocol`).

| Path | Target | Notes |
|------|--------|-------|
| `/platform/*` | cameleer-saas:8080 | SPA + API (`server.servlet.context-path: /platform`) |
| `/platform/vendor/*` | (SPA routes) | Vendor console (platform:admin) |
| `/platform/tenant/*` | (SPA routes) | Tenant admin portal (org-scoped) |
| `/t/{slug}/*` | per-tenant server-ui | Provisioned tenant UI containers (Traefik labels) |
| `/` | redirect -> `/platform/` | Via `docker/traefik-dynamic.yml` |
| `/*` (catch-all) | cameleer-logto:3001 (priority=1) | Custom sign-in UI, OIDC, interaction |

- SPA assets at `/_app/` (Vite `assetsDir: '_app'`) to avoid conflict with Logto's `/assets/`
- Logto `ENDPOINT` = `${PUBLIC_PROTOCOL}://${PUBLIC_HOST}` (same domain, same origin)
- TLS: `traefik-certs` init container generates self-signed cert (dev) or copies user-supplied cert via `CERT_FILE`/`KEY_FILE`/`CA_FILE` env vars. Default cert configured in `docker/traefik-dynamic.yml` (NOT static `traefik.yml` — Traefik v3 ignores `tls.stores.default` in static config). Runtime cert replacement via vendor UI (stage/activate/restore). ACME for production (future). Server containers import `/certs/ca.pem` into JVM truststore at startup via `docker-entrypoint.sh` for OIDC trust.
- Root `/` -> `/platform/` redirect via Traefik file provider (`docker/traefik-dynamic.yml`)
- LoginPage auto-redirects to Logto OIDC (no intermediate button)
- Per-tenant server containers get Traefik labels for `/t/{slug}/*` routing at provisioning time

### Docker Networks

Compose-defined networks:

| Network | Name on Host | Purpose |
|---------|-------------|---------|
| `cameleer` | `cameleer-saas_cameleer` | Compose default — shared services (DB, Logto, SaaS) |
| `cameleer-traefik` | `cameleer-traefik` (fixed `name:`) | Traefik + provisioned tenant containers |

Per-tenant networks (created dynamically by `DockerTenantProvisioner`):

| Network | Name Pattern | Purpose |
|---------|-------------|---------|
| Tenant network | `cameleer-tenant-{slug}` | Internal bridge, no internet — isolates tenant server + apps |
| Environment network | `cameleer-env-{tenantId}-{envSlug}` | Tenant-scoped (includes tenantId to prevent slug collision across tenants) |

Server containers join three networks: tenant network (primary), shared services network (`cameleer`), and traefik network. Apps deployed by the server use the tenant network as primary.

**IMPORTANT:** Dynamically-created containers MUST have `traefik.docker.network=cameleer-traefik` label. Traefik's Docker provider defaults to `network: cameleer` (compose-internal name) for IP resolution, which doesn't match dynamically-created containers connected via Docker API using the host network name (`cameleer-saas_cameleer`). Without this label, Traefik returns 504 Gateway Timeout for `/t/{slug}/api/*` paths.

### Custom sign-in UI (`ui/sign-in/`)

Separate Vite+React SPA replacing Logto's default sign-in page. Visually matches cameleer-server LoginPage.

- Built as custom Logto Docker image (`cameleer-logto`): `ui/sign-in/Dockerfile` = node build stage + `FROM ghcr.io/logto-io/logto:latest` + COPY dist over `/etc/logto/packages/experience/dist/`
- Uses `@cameleer/design-system` components (Card, Input, Button, FormField, Alert)
- Authenticates via Logto Experience API (4-step: init -> verify password -> identify -> submit -> redirect)
- `CUSTOM_UI_PATH` env var does NOT work for Logto OSS — must volume-mount or replace the experience dist directory
- Favicon bundled in `ui/sign-in/public/favicon.svg` (served by Logto, not SaaS)

### Auth enforcement

- All API endpoints enforce OAuth2 scopes via `@PreAuthorize("hasAuthority('SCOPE_xxx')")` annotations
- Tenant isolation enforced by `TenantIsolationInterceptor` (a single `HandlerInterceptor` on `/api/**` that resolves JWT org_id to TenantContext and validates `{tenantId}`, `{environmentId}`, `{appId}` path variables; fail-closed, platform admins bypass)
- 13 OAuth2 scopes on the Logto API resource (`https://api.cameleer.local`): 10 platform scopes + 3 server scopes (`server:admin`, `server:operator`, `server:viewer`), served to the frontend from `GET /platform/api/config`
- Server scopes map to server RBAC roles via JWT `scope` claim (SaaS platform path) or `roles` claim (server-ui OIDC login path)
- Org roles: `owner` -> `server:admin` + `tenant:manage`, `operator` -> `server:operator`, `viewer` -> `server:viewer`
- `saas-vendor` global role created by bootstrap Phase 12 and always assigned to the admin user — has `platform:admin` + all tenant scopes
- Custom `JwtDecoder` in `SecurityConfig.java` — ES384 algorithm, `at+jwt` token type, split issuer-uri (string validation) / jwk-set-uri (Docker-internal fetch), audience validation (`https://api.cameleer.local`)
- Logto Custom JWT (Phase 7b in bootstrap) injects a `roles` claim into access tokens based on org roles and global roles — this makes role data available to the server without Logto-specific code

### Auth routing by persona

| Persona | Logto role | Key scope | Landing route |
|---------|-----------|-----------|---------------|
| SaaS admin | `saas-vendor` (global) | `platform:admin` | `/vendor/tenants` |
| Tenant admin | org `owner` | `tenant:manage` | `/tenant` (dashboard) |
| Regular user (operator/viewer) | org member | `server:operator` or `server:viewer` | Redirected to server dashboard directly |

- `LandingRedirect` component waits for scopes to load, then routes to the correct persona landing page
- `RequireScope` guard on route groups enforces scope requirements
- SSO bridge: Logto session carries over to provisioned server's OIDC flow (Traditional Web App per tenant)

### Per-tenant server env vars (set by DockerTenantProvisioner)

These env vars are injected into provisioned per-tenant server containers:

| Env var | Value | Purpose |
|---------|-------|---------|
| `SPRING_DATASOURCE_URL` | `jdbc:postgresql://cameleer-postgres:5432/cameleer?currentSchema=tenant_{slug}&ApplicationName=tenant_{slug}` | Per-tenant schema isolation + diagnostic query scoping |
| `SPRING_DATASOURCE_USERNAME` | `tenant_{slug}` | Per-tenant PG user (owns only its schema) |
| `SPRING_DATASOURCE_PASSWORD` | (generated, stored in `TenantEntity.dbPassword`) | Per-tenant PG password |
| `CAMELEER_SERVER_SECURITY_OIDCISSUERURI` | `${PUBLIC_PROTOCOL}://${PUBLIC_HOST}/oidc` | Token issuer claim validation |
| `CAMELEER_SERVER_SECURITY_OIDCJWKSETURI` | `http://cameleer-logto:3001/oidc/jwks` | Docker-internal JWK fetch |
| `CAMELEER_SERVER_SECURITY_OIDCTLSSKIPVERIFY` | `true` (conditional) | Skip cert verify for OIDC discovery; only set when no `/certs/ca.pem` exists. When ca.pem exists, the server's `docker-entrypoint.sh` imports it into the JVM truststore instead. |
| `CAMELEER_SERVER_SECURITY_OIDCAUDIENCE` | `https://api.cameleer.local` | JWT audience validation for OIDC tokens |
| `CAMELEER_SERVER_SECURITY_CORSALLOWEDORIGINS` | `${PUBLIC_PROTOCOL}://${PUBLIC_HOST}` | Allow browser requests through Traefik |
| `CAMELEER_SERVER_SECURITY_BOOTSTRAPTOKEN` | (generated) | Bootstrap auth token for M2M communication |
| `CAMELEER_SERVER_RUNTIME_ENABLED` | `true` | Enable Docker orchestration |
| `CAMELEER_SERVER_RUNTIME_SERVERURL` | `http://cameleer-server-{slug}:8081` | Per-tenant server URL (DNS alias on tenant network) |
| `CAMELEER_SERVER_RUNTIME_ROUTINGDOMAIN` | `${PUBLIC_HOST}` | Domain for Traefik routing labels |
| `CAMELEER_SERVER_RUNTIME_ROUTINGMODE` | `path` | `path` or `subdomain` routing |
| `CAMELEER_SERVER_RUNTIME_JARSTORAGEPATH` | `/data/jars` | Directory for uploaded JARs |
| `CAMELEER_SERVER_RUNTIME_DOCKERNETWORK` | `cameleer-tenant-{slug}` | Primary network for deployed app containers |
| `CAMELEER_SERVER_RUNTIME_JARDOCKERVOLUME` | `cameleer-jars-{slug}` | Docker volume name for JAR sharing between server and deployed containers |
| `CAMELEER_SERVER_TENANT_ID` | (tenant UUID) | Tenant identifier for data isolation |
| `CAMELEER_SERVER_SECURITY_INFRASTRUCTUREENDPOINTS` | `false` | Hides Database/ClickHouse admin from tenant admins |
| `BASE_PATH` (server-ui) | `/t/{slug}` | React Router basename + `<base>` tag |
| `CAMELEER_API_URL` (server-ui) | `http://cameleer-server-{slug}:8081` | Nginx upstream proxy target (NOT `API_URL` — image uses `${CAMELEER_API_URL}`) |

### Per-tenant volume mounts (set by DockerTenantProvisioner)

| Mount | Container path | Purpose |
|-------|---------------|---------|
| `/var/run/docker.sock` | `/var/run/docker.sock` | Docker socket for app deployment orchestration |
| `cameleer-jars-{slug}` (volume, via `CAMELEER_SERVER_RUNTIME_JARDOCKERVOLUME`) | `/data/jars` | Shared JAR storage — server writes, deployed app containers read |
| `cameleer-saas_certs` (volume, ro) | `/certs` | Platform TLS certs + CA bundle for OIDC trust |

### SaaS app configuration (env vars for cameleer-saas itself)

SaaS properties use the `cameleer.saas.*` prefix (env vars: `CAMELEER_SAAS_*`). Two groups:

**Identity** (`cameleer.saas.identity.*` / `CAMELEER_SAAS_IDENTITY_*`):
- Logto endpoint, M2M credentials, bootstrap file path — used by `LogtoConfig.java`

**Provisioning** (`cameleer.saas.provisioning.*` / `CAMELEER_SAAS_PROVISIONING_*`):

| Env var | Spring property | Purpose |
|---------|----------------|---------|
| `CAMELEER_SAAS_PROVISIONING_SERVERIMAGE` | `cameleer.saas.provisioning.serverimage` | Docker image for per-tenant server containers |
| `CAMELEER_SAAS_PROVISIONING_SERVERUIIMAGE` | `cameleer.saas.provisioning.serveruiimage` | Docker image for per-tenant UI containers |
| `CAMELEER_SAAS_PROVISIONING_NETWORKNAME` | `cameleer.saas.provisioning.networkname` | Shared services Docker network (compose default) |
| `CAMELEER_SAAS_PROVISIONING_TRAEFIKNETWORK` | `cameleer.saas.provisioning.traefiknetwork` | Traefik Docker network for routing |
| `CAMELEER_SAAS_PROVISIONING_PUBLICHOST` | `cameleer.saas.provisioning.publichost` | Public hostname (same value as infrastructure `PUBLIC_HOST`) |
| `CAMELEER_SAAS_PROVISIONING_PUBLICPROTOCOL` | `cameleer.saas.provisioning.publicprotocol` | Public protocol (same value as infrastructure `PUBLIC_PROTOCOL`) |

**Note:** `PUBLIC_HOST` and `PUBLIC_PROTOCOL` remain as infrastructure env vars for Traefik and Logto containers. The SaaS app reads its own copies via the `CAMELEER_SAAS_PROVISIONING_*` prefix. `LOGTO_ENDPOINT` and `LOGTO_DB_PASSWORD` are infrastructure env vars for the Logto service and are unchanged.

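
The env var and Spring property columns above are related by Spring Boot's relaxed binding convention: replace periods with underscores, remove dashes, and uppercase the result. A small sketch of that conversion follows; the class `EnvNameSketch` is hypothetical and exists only to illustrate the rule, since Spring performs this mapping itself.

```java
import java.util.Locale;

/** Hypothetical helper illustrating Spring Boot's property-to-env-var naming rule. */
final class EnvNameSketch {

    private EnvNameSketch() {
    }

    /** Converts a canonical property name to the env var name Spring Boot binds from. */
    static String toEnvVar(String property) {
        return property
                .replace('.', '_')          // periods become underscores
                .replace("-", "")           // dashes are dropped, not converted
                .toUpperCase(Locale.ROOT);  // locale-independent uppercasing
    }
}
```

For example, `EnvNameSketch.toEnvVar("cameleer.saas.provisioning.publichost")` yields `CAMELEER_SAAS_PROVISIONING_PUBLICHOST`, matching the table.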
### Server OIDC role extraction (two paths)

| Path | Token type | Role source | How it works |
|------|-----------|-------------|--------------|
| SaaS platform -> server API | Logto org-scoped access token | `scope` claim | `JwtAuthenticationFilter.extractRolesFromScopes()` reads `server:admin` from scope |
| Server-ui SSO login | Logto JWT access token (via Traditional Web App) | `roles` claim | `OidcTokenExchanger` decodes access_token, reads `roles` injected by Custom JWT |

The server's OIDC config (`OidcConfig`) includes `audience` (RFC 8707 resource indicator) and `additionalScopes`. The `audience` is sent as `resource` in both the authorization request and token exchange, which makes Logto return a JWT access token instead of opaque. The Custom JWT script maps org roles to `roles: ["server:admin"]`.

**CRITICAL:** `additionalScopes` MUST include `urn:logto:scope:organizations` and `urn:logto:scope:organization_roles` — without these, Logto doesn't populate `context.user.organizationRoles` in the Custom JWT script, so the `roles` claim is empty and all users get `defaultRoles` (VIEWER).

The server's `OidcAuthController.applyClaimMappings()` uses OIDC token roles (from Custom JWT) as fallback when no DB claim mapping rules exist: claim mapping rules > OIDC token roles > defaultRoles.

### Deployment pipeline

App deployment is handled by the cameleer-server's `DeploymentExecutor` (7-stage async flow):

1. PRE_FLIGHT — validate config, check JAR exists
2. PULL_IMAGE — pull base image if missing
3. CREATE_NETWORK — ensure `cameleer-traefik` and `cameleer-env-{tenantId}-{envSlug}` networks
4. START_REPLICAS — create N containers with Traefik labels
5. HEALTH_CHECK — poll `/cameleer/health` on agent port 9464
6. SWAP_TRAFFIC — stop old deployment (blue/green)
7. COMPLETE — mark RUNNING or DEGRADED

Key files:
- `DeploymentExecutor.java` (in cameleer-server) — async staged deployment
- `DockerRuntimeOrchestrator.java` (in cameleer-server) — Docker client, container lifecycle
- `docker/runtime-base/Dockerfile` — base image with agent JAR, maps env vars to `-D` system properties
- `ServerApiClient.java` — M2M token acquisition for SaaS->server API calls (agent status). Uses `X-Cameleer-Protocol-Version: 1` header
- Docker socket access: `group_add: ["0"]` in docker-compose.dev.yml (not root group membership in Dockerfile)
- Network: deployed containers join `cameleer-tenant-{slug}` (primary, isolation) + `cameleer-traefik` (routing) + `cameleer-env-{tenantId}-{envSlug}` (environment isolation)

### Bootstrap (`docker/logto-bootstrap.sh`)

Idempotent script run inside the Logto container entrypoint. **Clean slate** — no example tenant, no viewer user, no server configuration. Phases:

1. Wait for Logto health (no server to wait for — servers are provisioned per-tenant)
2. Get Management API token (reads `m-default` secret from DB)
3. Create Logto apps (SPA, Traditional Web App with `skipConsent`, M2M with Management API role + server API role)
3b. Create API resource scopes (10 platform + 3 server scopes)
4. Create org roles (owner, operator, viewer with API resource scope assignments) + M2M server role (`cameleer-m2m-server` with `server:admin` scope)
5. Create admin user (SaaS admin with Logto console access)
7b. Configure Logto Custom JWT for access tokens (maps org roles -> `roles` claim: owner->server:admin, operator->server:operator, viewer->server:viewer; saas-vendor global role -> server:admin)
8. Configure Logto sign-in branding (Cameleer colors `#C6820E`/`#D4941E`, logo from `/platform/logo.svg`)
9. Cleanup seeded Logto apps
10. Write bootstrap results to `/data/logto-bootstrap.json`
12. Create `saas-vendor` global role with all API scopes and assign to admin user (always runs — admin IS the platform admin).

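
The role mapping configured in Phase 7b can be summarized as a small lookup. The real mapping lives in a Logto Custom JWT script, which is not Java; the sketch below only restates the documented table (owner, operator, viewer, plus the `saas-vendor` global role) in Java for readers of the backend code. `RoleMappingSketch` and `serverRoles` are hypothetical names.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical restatement of the documented org-role to server-role mapping. */
final class RoleMappingSketch {

    private RoleMappingSketch() {
    }

    /** Returns the values that would populate the `roles` claim for one user. */
    static Set<String> serverRoles(List<String> orgRoles, List<String> globalRoles) {
        Set<String> roles = new LinkedHashSet<>();
        for (String orgRole : orgRoles) {
            switch (orgRole) {
                case "owner":
                    roles.add("server:admin");
                    break;
                case "operator":
                    roles.add("server:operator");
                    break;
                case "viewer":
                    roles.add("server:viewer");
                    break;
                default:
                    break; // unknown org roles contribute nothing
            }
        }
        // The saas-vendor global role maps to server:admin regardless of org membership.
        if (globalRoles.contains("saas-vendor")) {
            roles.add("server:admin");
        }
        return roles;
    }
}
```

An empty result corresponds to the case described above where the server falls back to `defaultRoles`.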

The multi-tenant compose stack is: Traefik + PostgreSQL + ClickHouse + Logto (with bootstrap entrypoint) + cameleer-saas. No `cameleer-server` or `cameleer-server-ui` in compose — those are provisioned per-tenant by `DockerTenantProvisioner`.

### Deployment Modes (installer)

The installer (`installer/install.sh`) supports two deployment modes:

| | Multi-tenant SaaS (`DEPLOYMENT_MODE=saas`) | Standalone (`DEPLOYMENT_MODE=standalone`) |
|---|---|---|
| **Containers** | traefik, postgres, clickhouse, logto, cameleer-saas | traefik, postgres, clickhouse, server, server-ui |
| **Auth** | Logto OIDC (SaaS admin + tenant users) | Local auth (built-in admin, no identity provider) |
| **Tenant management** | SaaS admin creates/manages tenants via UI | Single server instance, no fleet management |
| **PostgreSQL** | `cameleer-postgres` image (multi-DB init) | Stock `postgres:16-alpine` (server creates schema via Flyway) |
| **Use case** | Platform vendor managing multiple customers | Single customer running the product directly |

Standalone mode generates a simpler compose with the server running directly. No Logto, no SaaS management plane, no bootstrap. The admin logs in with local credentials at `/`.

The installer uses static docker-compose templates in `installer/templates/`. Templates are copied to the install directory and composed via `COMPOSE_FILE` in `.env`:
- `docker-compose.yml` — shared infrastructure (traefik, postgres, clickhouse)
- `docker-compose.saas.yml` — SaaS mode (logto, cameleer-saas)
- `docker-compose.server.yml` — standalone mode (server, server-ui)
- `docker-compose.tls.yml` — overlay: custom TLS cert volume
- `docker-compose.monitoring.yml` — overlay: external monitoring network

### Tenant Provisioning Flow

When the SaaS admin creates a tenant via `VendorTenantService`:

**Synchronous (in `createAndProvision`):**
1. Create `TenantEntity` (status=PROVISIONING) + Logto organization
2. Create admin user in Logto with owner org role (if credentials provided)
3. Register OIDC redirect URIs for `/t/{slug}/oidc/callback` on Logto Traditional Web App
4. Generate license (tier-appropriate, 365 days)
5. Return immediately — UI shows provisioning spinner, polls via `refetchInterval`

**Asynchronous (in `provisionAsync`, `@Async`):**
6. Create per-tenant PostgreSQL user + schema via `TenantDatabaseService.createTenantDatabase(slug, password)`, store `dbPassword` on entity
7. Create tenant-isolated Docker network (`cameleer-tenant-{slug}`)
8. Create server container with per-tenant JDBC URL (`currentSchema=tenant_{slug}&ApplicationName=tenant_{slug}`), Traefik labels (`traefik.docker.network`), health check, Docker socket bind, JAR volume, certs volume (ro)
9. Create UI container with `CAMELEER_API_URL`, `BASE_PATH`, Traefik strip-prefix labels
10. Wait for health check (`/api/v1/health`, not `/actuator/health` which requires auth)
11. Push license token to server via M2M API
12. Push OIDC config (Traditional Web App credentials + `additionalScopes: [urn:logto:scope:organizations, urn:logto:scope:organization_roles]`) to server for SSO
13. Update tenant status -> ACTIVE (or set `provisionError` on failure)

**Server restart** (available to SaaS admin + tenant admin):
- `POST /api/vendor/tenants/{id}/restart` (SaaS admin) and `POST /api/tenant/server/restart` (tenant)
- Calls `TenantProvisioner.stop(slug)` then `start(slug)` — restarts server + UI containers only (same image)

**Server upgrade** (available to SaaS admin + tenant admin):
- `POST /api/vendor/tenants/{id}/upgrade` (SaaS admin) and `POST /api/tenant/server/upgrade` (tenant)
- Calls `TenantProvisioner.upgrade(slug)` — removes server + UI containers, force-pulls latest images (preserves app containers, volumes, networks), then `provisionAsync()` re-creates containers with the new image + pushes license + OIDC config

**Tenant delete** cleanup:
- `DockerTenantProvisioner.remove(slug)` — label-based container removal (`cameleer.tenant={slug}`), env network cleanup, tenant network removal, JAR volume removal
- `TenantDatabaseService.dropTenantDatabase(slug)` — drops PostgreSQL `tenant_{slug}` schema + `tenant_{slug}` user
- `TenantDataCleanupService.cleanupClickHouse(slug)` — deletes ClickHouse data across all tables with `tenant_id` column (GDPR)

**Password management** (tenant portal):
- `POST /api/tenant/password` — tenant admin changes own Logto password (via `@AuthenticationPrincipal` JWT subject)
- `POST /api/tenant/team/{userId}/password` — tenant admin resets a team member's Logto password (validates org membership first)
- `POST /api/tenant/server/admin-password` — tenant admin resets the server's built-in local admin password (via M2M API to `POST /api/v1/admin/users/user:admin/password`)

## Database Migrations

PostgreSQL (Flyway): `src/main/resources/db/migration/`
- V001 — consolidated baseline: tenants (with db_password, server_endpoint, provision_error, ca_applied_at), licenses, audit_log, certificates, tenant_ca_certs

## Related Conventions

- Gitea-hosted: `gitea.siegeln.net/cameleer/`
- CI: `.gitea/workflows/` — Gitea Actions
- K8s target: k3s cluster at 192.168.50.86
- Docker images: CI builds and pushes all images — Dockerfiles use multi-stage builds, no local builds needed
  - `cameleer-saas` — SaaS vendor management plane (frontend + JAR baked in)
  - `cameleer-logto` — custom Logto with sign-in UI baked in
  - `cameleer-server` / `cameleer-server-ui` — provisioned per-tenant (not in compose, created by `DockerTenantProvisioner`)
  - `cameleer-runtime-base` — base image for deployed apps (agent JAR + JRE). CI downloads latest agent SNAPSHOT from Gitea Maven registry. Uses `CAMELEER_SERVER_RUNTIME_SERVERURL` env var (not CAMELEER_EXPORT_ENDPOINT).
- Docker builds: `--no-cache`, `--provenance=false` for Gitea compatibility
- `docker-compose.dev.yml` — exposes ports for direct access, sets `SPRING_PROFILES_ACTIVE: dev`. Volume-mounts `./ui/dist` into the container so local UI builds are served without rebuilding the Docker image (`SPRING_WEB_RESOURCES_STATIC_LOCATIONS` overrides classpath). Adds Docker socket mount for tenant provisioning.
- Design system: import from `@cameleer/design-system` (Gitea npm registry)

## Disabled Skills

- Do NOT use any `gsd:*` skills in this project. This includes all `/gsd:` prefixed commands.

# GitNexus — Code Intelligence

This project is indexed by GitNexus as **cameleer-saas** (2676 symbols, 5768 relationships, 224 execution flows). Use the GitNexus MCP tools to understand code, assess impact, and navigate safely.

> If any GitNexus tool warns the index is stale, run `npx gitnexus analyze` in terminal first.

## Always Do

- **MUST run impact analysis before editing any symbol.** Before modifying a function, class, or method, run `gitnexus_impact({target: "symbolName", direction: "upstream"})` and report the blast radius (direct callers, affected processes, risk level) to the user.
- **MUST run `gitnexus_detect_changes()` before committing** to verify your changes only affect expected symbols and execution flows.
- **MUST warn the user** if impact analysis returns HIGH or CRITICAL risk before proceeding with edits.
- When exploring unfamiliar code, use `gitnexus_query({query: "concept"})` to find execution flows instead of grepping. It returns process-grouped results ranked by relevance.
- When you need full context on a specific symbol — callers, callees, which execution flows it participates in — use `gitnexus_context({name: "symbolName"})`.

## When Debugging

1. `gitnexus_query({query: ""})` — find execution flows related to the issue
2. `gitnexus_context({name: ""})` — see all callers, callees, and process participation
3. `READ gitnexus://repo/cameleer-saas/process/{processName}` — trace the full execution flow step by step
4. For regressions: `gitnexus_detect_changes({scope: "compare", base_ref: "main"})` — see what your branch changed

## When Refactoring

- **Renaming**: MUST use `gitnexus_rename({symbol_name: "old", new_name: "new", dry_run: true})` first. Review the preview — graph edits are safe, text_search edits need manual review. Then run with `dry_run: false`.
- **Extracting/Splitting**: MUST run `gitnexus_context({name: "target"})` to see all incoming/outgoing refs, then `gitnexus_impact({target: "target", direction: "upstream"})` to find all external callers before moving code.
- After any refactor: run `gitnexus_detect_changes({scope: "all"})` to verify only expected files changed.

## Never Do

- NEVER edit a function, class, or method without first running `gitnexus_impact` on it.
- NEVER ignore HIGH or CRITICAL risk warnings from impact analysis.
- NEVER rename symbols with find-and-replace — use `gitnexus_rename` which understands the call graph.
- NEVER commit changes without running `gitnexus_detect_changes()` to check affected scope.

## Tools Quick Reference

| Tool | When to use | Command |
|------|-------------|---------|
| `query` | Find code by concept | `gitnexus_query({query: "auth validation"})` |
| `context` | 360-degree view of one symbol | `gitnexus_context({name: "validateUser"})` |
| `impact` | Blast radius before editing | `gitnexus_impact({target: "X", direction: "upstream"})` |
| `detect_changes` | Pre-commit scope check | `gitnexus_detect_changes({scope: "staged"})` |
| `rename` | Safe multi-file rename | `gitnexus_rename({symbol_name: "old", new_name: "new", dry_run: true})` |
| `cypher` | Custom graph queries | `gitnexus_cypher({query: "MATCH ..."})` |

## Impact Risk Levels

| Depth | Meaning | Action |
|-------|---------|--------|
| d=1 | WILL BREAK — direct callers/importers | MUST update these |
| d=2 | LIKELY AFFECTED — indirect deps | Should test |
| d=3 | MAY NEED TESTING — transitive | Test if critical path |

## Resources

| Resource | Use for |
|----------|---------|
| `gitnexus://repo/cameleer-saas/context` | Codebase overview, check index freshness |
| `gitnexus://repo/cameleer-saas/clusters` | All functional areas |
| `gitnexus://repo/cameleer-saas/processes` | All execution flows |
| `gitnexus://repo/cameleer-saas/process/{name}` | Step-by-step execution trace |

## Self-Check Before Finishing

Before completing any code modification task, verify:

1. `gitnexus_impact` was run for all modified symbols
2. No HIGH/CRITICAL risk warnings were ignored
3. `gitnexus_detect_changes()` confirms changes match expected scope
4. All d=1 (WILL BREAK) dependents were updated

## Keeping the Index Fresh

After committing code changes, the GitNexus index becomes stale. Re-run analyze to update it:

```bash
npx gitnexus analyze
```

If the index previously included embeddings, preserve them by adding `--embeddings`:

```bash
npx gitnexus analyze --embeddings
```

To check whether embeddings exist, inspect `.gitnexus/meta.json` — the `stats.embeddings` field shows the count (0 means no embeddings). **Running analyze without `--embeddings` will delete any previously generated embeddings.**

> Claude Code users: A PostToolUse hook handles this automatically after `git commit` and `git merge`.

## CLI

| Task | Read this skill file |
|------|---------------------|
| Understand architecture / "How does X work?" | `.claude/skills/gitnexus/gitnexus-exploring/SKILL.md` |
| Blast radius / "What breaks if I change X?" | `.claude/skills/gitnexus/gitnexus-impact-analysis/SKILL.md` |
| Trace bugs / "Why is X failing?" | `.claude/skills/gitnexus/gitnexus-debugging/SKILL.md` |
| Rename / extract / split / refactor | `.claude/skills/gitnexus/gitnexus-refactoring/SKILL.md` |
| Tools, resources, schema reference | `.claude/skills/gitnexus/gitnexus-guide/SKILL.md` |
| Index, status, clean, wiki CLI commands | `.claude/skills/gitnexus/gitnexus-cli/SKILL.md` |