463 Commits

Author SHA1 Message Date
hsiegeln
5ca118dc93 docs: update CLAUDE.md for installer submodule structure
All checks were successful
CI / build (push) Successful in 1m17s
CI / docker (push) Successful in 14s
Reflect that installer/ is now a git submodule pointing to the public
cameleer-saas-installer repo, and that docker-compose.yml is a thin
dev overlay chained via COMPOSE_FILE.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-25 13:05:20 +02:00
hsiegeln
0b8cdf6dd9 refactor: move installer to dedicated repo, wire as git submodule
All checks were successful
CI / build (push) Successful in 1m19s
CI / docker (push) Successful in 20s
The installer (install.sh, templates/, bootstrap scripts) now lives in
cameleer/cameleer-saas-installer (public repo). Added as a git submodule
at installer/ so compose templates remain the single source of truth.

Dev compose is now a thin overlay (ports + volume mount + dev env vars).
Production templates are chained via COMPOSE_FILE in .env:
  installer/templates/docker-compose.yml
  installer/templates/docker-compose.saas.yml
  docker-compose.yml (dev overrides)

No code duplication — fixes to compose templates go to the installer
repo and propagate to both production deployments and dev via submodule.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-25 12:59:44 +02:00
hsiegeln
cafd7e9369 feat: add bootstrap scripts for one-line installer download
All checks were successful
CI / build (push) Successful in 1m39s
CI / docker (push) Successful in 18s
get-cameleer.sh and get-cameleer.ps1 download just the installer
files from Gitea into a local ./installer directory. Usage:

  curl -fsSL https://gitea.siegeln.net/.../get-cameleer.sh | bash
  irm https://gitea.siegeln.net/.../get-cameleer.ps1 | iex

Supports --version=v1.2.0 to pin a specific tag, defaults to main.
Pass --run to auto-execute the installer after download.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-25 12:48:23 +02:00
hsiegeln
b5068250f9 fix(ui): improve onboarding page styling to match sign-in page
All checks were successful
CI / build (push) Successful in 1m51s
CI / docker (push) Successful in 1m16s
Add 32px card padding and reduce max-width to 420px for consistency
with the sign-in page. Add noValidate to prevent browser-native
required validation outlines on inputs.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-25 12:43:11 +02:00
hsiegeln
0cfa359fc5 fix(sign-in): detect register mode from URL path, not just query param
All checks were successful
CI / build (push) Successful in 1m19s
CI / docker (push) Successful in 44s
Newer Logto versions redirect to /register?app_id=... instead of
/sign-in?first_screen=register. Check the pathname in addition to
the query param so the registration form shows correctly.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-25 10:10:57 +02:00
hsiegeln
5cc9f8c9ef fix(spa): add /register and /onboarding to SPA forward routes
All checks were successful
CI / build (push) Successful in 1m18s
CI / docker (push) Successful in 43s
These routes were missing from SpaController, so requests to
/platform/register and /platform/onboarding had no handler. Spring
forwarded to /error, which isn't in the permitAll() list, resulting
in a 401 instead of serving the SPA.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-25 10:05:56 +02:00
hsiegeln
b066d1abe7 fix(sign-in): validate email format before registration attempt
All checks were successful
CI / build (push) Successful in 1m17s
CI / docker (push) Successful in 46s
Show "Please enter a valid email address" when the user enters a
username instead of an email in the sign-up form, rather than letting
it hit Logto's API and returning a cryptic 400.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-25 09:41:41 +02:00
hsiegeln
ae1d9fa4db fix(docker): add extra_hosts so Logto can reach itself via public hostname
All checks were successful
CI / build (push) Successful in 1m14s
CI / docker (push) Successful in 18s
Logto validates M2M tokens by fetching its own JWKS from the ENDPOINT
URL (e.g. https://app.cameleer.io/oidc/jwks). Behind a Cloudflare
tunnel, that hostname resolves to Cloudflare's IP and the container
can't route back through the tunnel — the fetch times out (ETIMEDOUT),
causing all Management API calls to return 500.

Adding extra_hosts maps AUTH_HOST to host-gateway so the request goes
to the Docker host, which has Traefik on :443, which routes back to
Logto internally. This hairpin works because NODE_TLS_REJECT=0 accepts
the self-signed cert.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-25 09:13:39 +02:00
hsiegeln
6fe10432e6 fix(installer): remove duplicate config load that kills upgrade silently
All checks were successful
CI / build (push) Successful in 1m18s
CI / docker (push) Successful in 15s
The upgrade path in handle_rerun called load_config_file a second time
(already called by detect_existing_install). On the second pass, every
variable is already set, so [ -z "$VAR" ] && VAR="$value" returns
exit code 1 (test fails, && short-circuits). With set -e, the non-zero
exit from the case clause kills the script silently after printing
"[INFO] Upgrading installation..." — no error, no further output.

Removed the redundant load_config_file and load_env_overrides calls.
Both were already executed in main() before handle_rerun is reached.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-25 09:03:07 +02:00
hsiegeln
9f3faf4816 fix(traefik): set Logto router priority=1 to prevent route hijacking
All checks were successful
CI / build (push) Successful in 1m17s
CI / docker (push) Successful in 18s
Traefik auto-calculates router priority from rule string length. When
deployed with a domain longer than 23 chars (e.g. app.cameleer.io),
Host(`app.cameleer.io`) (25 chars) outranks PathPrefix(`/platform`)
(23 chars), causing ALL requests — including /platform/* — to route
to Logto instead of the SaaS app. This breaks login because the sign-in
UI loads without an OIDC interaction session.

Setting priority=1 makes Logto a true catch-all, matching the intent
documented in docker/CLAUDE.md.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-25 08:50:16 +02:00
hsiegeln
a60095608e fix(installer): send correct Host header in Traefik routing check
All checks were successful
CI / build (push) Successful in 1m18s
CI / docker (push) Successful in 19s
The root redirect rule matches Host(`PUBLIC_HOST`), not localhost.
Curl with --resolve (bash) and Host header (PS1) so the health
check sends the right hostname when verifying Traefik routing.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-25 08:15:18 +02:00
hsiegeln
9f9112c6a5 feat(installer): add interactive registry prompts
Some checks failed
CI / build (push) Successful in 1m19s
CI / docker (push) Successful in 16s
SonarQube Analysis / sonarqube (push) Failing after 1m47s
Both simple and expert modes now ask "Pull images from a private
registry?" with follow-up prompts for URL, username, and token.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-25 02:17:52 +02:00
hsiegeln
e1a9f6d225 feat(installer): add --registry, --registry-user, --registry-token
All checks were successful
CI / build (push) Successful in 1m21s
CI / docker (push) Successful in 15s
Both installers (bash + PS1) now support pulling images from a
custom Docker registry. Writes *_IMAGE env vars to .env so compose
templates use the configured registry. Runs docker login before
pull when credentials are provided. Persisted in cameleer.conf
for upgrades/reconfigure.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-25 02:10:48 +02:00
hsiegeln
180644f0df fix(installer): SIGPIPE crash in generate_password with pipefail
All checks were successful
CI / build (push) Successful in 1m15s
CI / docker (push) Successful in 18s
`tr | head -c 32` causes tr to receive SIGPIPE when head exits early.
With `set -eo pipefail`, exit code 141 kills the script right after
"Configuration validated" before any passwords are generated.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-25 01:41:47 +02:00
hsiegeln
62b74d2d06 ci: remove sync-images workflow
All checks were successful
CI / build (push) Successful in 1m16s
CI / docker (push) Successful in 16s
Remote server will pull directly from the Gitea registry instead.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-25 01:25:56 +02:00
hsiegeln
3e2f035d97 fix(ci): use POSIX-compatible loop instead of bash arrays
All checks were successful
CI / build (push) Successful in 1m18s
CI / docker (push) Successful in 18s
The docker-builder container runs ash/sh, not bash — arrays with ()
are not supported. Use a simple for-in loop instead.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-25 00:32:12 +02:00
hsiegeln
9962ee99d9 fix(ci): drop ssh-keyscan, use StrictHostKeyChecking=accept-new instead
All checks were successful
CI / build (push) Successful in 1m16s
CI / docker (push) Successful in 17s
ssh-keyscan fails when the runner can't reach the host on port 22
during that step. Using accept-new on the ssh command itself is
equivalent for an ephemeral CI runner.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-25 00:29:52 +02:00
hsiegeln
b53840b77b ci: add manual workflow to sync Docker images to remote server
Some checks failed
CI / docker (push) Has been cancelled
CI / build (push) Has been cancelled
Pulls all :latest images from the Gitea registry and pipes them
via `docker save | ssh docker load` to the APP_HOST server.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-25 00:28:39 +02:00
hsiegeln
9ed2cedc98 feat: self-service sign-up with email verification and onboarding
All checks were successful
CI / build (push) Successful in 1m14s
CI / docker (push) Successful in 1m15s
Complete sign-up pipeline: email registration via Logto Experience API,
SMTP email verification, and self-service trial tenant creation.

Layer 1 — Logto config:
- Bootstrap Phase 8b: SMTP email connector with branded HTML templates
- Bootstrap Phase 8c: enable SignInAndRegister (email+password sign-up)
- Dockerfile installs official Logto connectors (ensures SMTP available)
- SMTP env vars in docker-compose, installer templates, .env.example

Layer 2 — Experience API (ui/sign-in/experience-api.ts):
- Registration flow: initRegistration → sendVerificationCode → verifyCode
  → addProfile (password) → identifyUser → submit
- Sign-in auto-detects email vs username identifier

Layer 3 — Custom sign-in UI (ui/sign-in/SignInPage.tsx):
- Three-mode state machine: signIn / register / verifyCode
- Reads first_screen=register from URL query params
- Toggle links between sign-in and register views

Layer 4 — Post-registration onboarding:
- OnboardingService: reuses VendorTenantService.createAndProvision(),
  adds calling user to Logto org as owner, enforces one trial per user
- OnboardingController: POST /api/onboarding/tenant (authenticated only)
- OnboardingPage.tsx: org name + auto-slug form
- LandingRedirect: detects zero orgs → redirects to /onboarding
- RegisterPage.tsx: /platform/register initiates OIDC with firstScreen

Installers (install.sh + install.ps1):
- Both prompt for SMTP config in SaaS mode
- CLI args, env var capture, cameleer.conf persistence

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-25 00:21:07 +02:00
hsiegeln
dc7ac3a1ec feat: split auth domain — Logto gets dedicated AUTH_HOST
All checks were successful
CI / build (push) Successful in 1m22s
CI / docker (push) Successful in 48s
Support separate auth domain (e.g. auth.cameleer.io) for Logto while
keeping the SaaS app on PUBLIC_HOST (e.g. app.cameleer.io). AUTH_HOST
defaults to PUBLIC_HOST for backward-compatible single-domain setups.

- Logto routing: Host(AUTH_HOST) replaces PathPrefix('/') catch-all
- Root redirect moved from traefik-dynamic.yml to Docker labels with
  Host(PUBLIC_HOST) scope so it doesn't intercept auth domain
- Self-signed cert generates SANs for both domains
- Bootstrap Host header uses AUTH_HOST for Logto endpoint validation
- Spring issuer-uri and oidcissueruri use new authhost property
- Both installers (sh + ps1) prompt for AUTH_HOST in expert mode

Local dev: AUTH_HOST=auth.localhost (resolves to 127.0.0.1, no hosts file)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-24 18:11:47 +02:00
hsiegeln
1fbafbb16d feat: add vendor tenant metrics dashboard
All checks were successful
CI / build (push) Successful in 1m24s
CI / docker (push) Successful in 1m0s
Fleet overview page at /vendor/metrics showing per-tenant operational
metrics (agents, CPU, heap, HTTP requests, ingestion drops, uptime).
Queries each tenant's server via the new POST /api/v1/admin/server-metrics/query
REST API instead of direct ClickHouse access, supporting future per-tenant
CH instances.

Backend: TenantMetricsService fires 11 metric queries per tenant
concurrently over a 5-minute window, assembles into a summary snapshot.
ServerApiClient.queryServerMetrics() handles the M2M authenticated POST.

Frontend: VendorMetricsPage with KPI strip (fleet totals) and per-tenant
table with color-coded badges and heap usage bars. Auto-refreshes every 60s.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-24 14:02:57 +02:00
hsiegeln
6c1241ed89 docs(docker): replace obsolete 504 workaround note with the real wiring
Some checks failed
CI / build (push) Successful in 1m23s
CI / docker (push) Successful in 18s
SonarQube Analysis / sonarqube (push) Failing after 1m20s
Pre-fix the paragraph claimed every dynamically-created container MUST
carry `traefik.docker.network=cameleer-traefik` to avoid a 504, because
Traefik's Docker provider pointed at `network: cameleer` (a literal
name that never matched any real network). After the one-line static
config fix (df64573), Traefik's provider targets `cameleer-traefik`
directly — the network every managed container already joins — so the
per-container label is just defense-in-depth, not required.

Rewritten to describe current behaviour and keep a short note about the
pre-fix 504 for operators who roll back to an old image.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-23 18:22:32 +02:00
hsiegeln
df64573bfb fix(traefik): point docker provider network at cameleer-traefik
All checks were successful
CI / build (push) Successful in 1m22s
CI / docker (push) Successful in 16s
The static config set `provider.docker.network: cameleer`, but no network
by that literal name exists. The `cameleer` network defined in the
compose file gets namespaced by compose to `cameleer_cameleer`, and
managed app containers created at runtime only ever attach to
`cameleer-traefik` (per `DockerNetworkManager.TRAEFIK_NETWORK`).

Symptom: when the Docker provider's preferred network doesn't match any
network on a container, Traefik picks an arbitrary container IP and may
route to one on a bridge Traefik itself isn't attached to — requests
hang until Traefik's upstream timeout fires (504 Gateway Timeout).

Fix is one line: match the network that `cameleer-server` actually
attaches its managed containers to.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-23 18:15:22 +02:00
hsiegeln
4526d97bda fix: generate CAMELEER_SERVER_SECURITY_JWTSECRET in installer and wire into containers
All checks were successful
CI / build (push) Successful in 1m16s
CI / docker (push) Successful in 59s
The server now requires a non-empty JWT secret. The installer (bash + ps1)
generates a random value for both SaaS and standalone modes, and the compose
templates map it into the respective containers. Also fixes container names
in generated INSTALL.md docs to use the cameleer- prefix consistently.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-23 09:30:11 +02:00
hsiegeln
132143c083 refactor: decompose CLAUDE.md into directory-scoped files
Some checks failed
CI / build (push) Successful in 1m59s
CI / docker (push) Successful in 1m24s
SonarQube Analysis / sonarqube (push) Failing after 2m4s
Root CLAUDE.md reduced from 475 to 175 lines (75 excl. GitNexus).
Detailed context now loads automatically only when editing code in
the relevant directory:

- provisioning/CLAUDE.md — env vars, provisioning flow, lifecycle
- config/CLAUDE.md — auth, scopes, JWT, OIDC role extraction
- docker/CLAUDE.md — routing, networks, bootstrap, deployment pipeline
- installer/CLAUDE.md — deployment modes, compose templates, env naming
- ui/CLAUDE.md — frontend files, sign-in UI

No information lost — everything moved, nothing deleted.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 09:30:21 +02:00
hsiegeln
b824942408 docs: fix scope breakdown and add missing InfrastructurePage
All checks were successful
CI / build (push) Successful in 2m12s
CI / docker (push) Successful in 19s
- OAuth2 scopes: 1 platform + 9 tenant + 3 server (not "10 platform")
- Add InfrastructurePage.tsx to vendor pages list

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 09:21:28 +02:00
hsiegeln
31e8dd05f0 docs: update CLAUDE.md for runtime base image, env var accuracy
All checks were successful
CI / build (push) Successful in 1m21s
CI / docker (push) Successful in 3m2s
- Fix OIDC env var names (OIDC_ISSUERURI not OIDCISSUERURI)
- Fix CAMELEER_SERVER_TENANT_ID value (slug, not UUID)
- Add missing env vars (ClickHouse, JWT secret, license token, base image)
- Complete provisioning properties table (was 6/16, now all listed)
- Add semantic note: CAMELEER_SAAS_PROVISIONING_* = "forwarded to tenant"
- Update runtime-base description (log appender JAR, entrypoint override,
  runtime type detection, PropertiesLauncher version handling)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 09:16:55 +02:00
hsiegeln
eba9f560ac fix: name JAR volume explicitly to match JARDOCKERVOLUME env var
Some checks failed
CI / build (push) Successful in 1m17s
CI / docker (push) Successful in 19s
SonarQube Analysis / sonarqube (push) Failing after 1m23s
The compose volume `jars` gets created as `<project>_jars` by Docker
Compose, but JARDOCKERVOLUME tells the server to mount `cameleer-jars`
on deployed app containers. These are different Docker volumes, so
the app JAR was never visible inside the app container — causing
ClassNotFoundException on startup.

Fix: add `name: cameleer-jars` to the volume definition so both the
server and deployed app containers share the same named volume.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 00:03:48 +02:00
hsiegeln
3c2bf4a9b1 fix: pass self-reference in VendorTenantServiceTest for async proxy
All checks were successful
CI / build (push) Successful in 1m17s
CI / docker (push) Successful in 44s
The @Lazy self-proxy pattern requires a non-null reference in tests.
Construct the instance then re-create with itself as the self param.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 23:33:07 +02:00
hsiegeln
97b2235914 fix: update tests for ProvisioningProperties runtimeBaseImage field
Some checks failed
CI / build (push) Failing after 1m22s
CI / docker (push) Has been skipped
Add missing runtimeBaseImage arg to ProvisioningProperties constructor
calls in tests. Also add missing self-proxy arg to VendorTenantService
constructor (pre-existing from async provisioning commit).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 23:27:53 +02:00
hsiegeln
338db5dcda fix: forward runtime base image to provisioned tenant servers
Some checks failed
CI / build (push) Failing after 59s
CI / docker (push) Has been skipped
CAMELEER_SERVER_RUNTIME_BASEIMAGE was never set on provisioned
per-tenant server containers, causing them to fall back to the
server's hardcoded default. Added CAMELEER_SAAS_PROVISIONING_RUNTIMEBASEIMAGE
as a configurable property that gets forwarded during provisioning.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 23:20:46 +02:00
hsiegeln
fd50a147a2 fix: make tenant provisioning truly async via self-proxy
Some checks failed
CI / build (push) Failing after 41s
CI / docker (push) Has been skipped
@Async on provisionAsync() was bypassed because all call sites were
internal (this.provisionAsync), skipping the Spring proxy. Inject self
via @Lazy to route through the proxy so provisioning runs in a
background thread and the API returns immediately.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 22:32:05 +02:00
hsiegeln
0dd52624b7 fix: use semicolon as COMPOSE_FILE separator on Windows
All checks were successful
CI / build (push) Successful in 1m59s
CI / docker (push) Successful in 46s
Windows Docker Compose uses ; not : as the path separator in COMPOSE_FILE.
The colon was being interpreted as part of the filename, causing CreateFile errors.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 22:11:34 +02:00
hsiegeln
1ce0ea411d chore: update design-system to 0.1.54
Some checks failed
CI / build (push) Successful in 1m25s
CI / docker (push) Has been cancelled
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 22:08:17 +02:00
hsiegeln
81be25198c chore: update design-system to 0.1.53
All checks were successful
CI / build (push) Successful in 1m16s
CI / docker (push) Successful in 1m32s
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 21:56:14 +02:00
hsiegeln
dc4ea33c9b feat: externalize docker-compose templates from installer scripts
All checks were successful
CI / build (push) Successful in 1m16s
CI / docker (push) Successful in 20s
Replace inline heredoc compose generation with static template files.
Templates are copied to the install dir and composed via COMPOSE_FILE in .env.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 21:53:26 +02:00
hsiegeln
186f7639ad docs: update CLAUDE.md with template-based installer architecture
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 21:11:04 +02:00
hsiegeln
6c7895b0d6 chore(installer): remove generated install output, add to gitignore
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 21:09:30 +02:00
hsiegeln
6170f61eeb refactor(installer): replace ps1 compose generation with template copying
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 21:08:34 +02:00
hsiegeln
2ed527ac74 refactor(installer): replace sh compose generation with template copying
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 21:03:01 +02:00
hsiegeln
cb1f6b8ccf feat(installer): add .env.example with documented variables
Reference .env file documenting all configuration variables across both
deployment modes, with section headers for compose assembly, public access,
credentials, TLS, Docker, provisioning, and monitoring.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 20:59:15 +02:00
hsiegeln
758585cc9a feat(installer): add TLS and monitoring overlay templates
Optional compose overlays: TLS overlay mounts user-supplied certs into
traefik, monitoring overlay replaces the noop bridge with an external
Docker network for Prometheus scraping.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 20:59:10 +02:00
hsiegeln
141b44048c feat(installer): add standalone docker-compose and traefik templates
Standalone mode: server + server-ui services with postgres image override
to stock postgres:16-alpine. Includes traefik-dynamic.yml for default TLS
certificate store configuration.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 20:59:05 +02:00
hsiegeln
3c343f9441 feat(installer): add SaaS docker-compose template
Logto identity provider and cameleer-saas management plane services.
Includes Traefik labels, CORS config, bootstrap healthcheck, and all
provisioning env vars parameterized from .env.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 20:59:00 +02:00
hsiegeln
bdb24f8de6 feat(installer): add infra base docker-compose template
Shared infrastructure base (traefik, postgres, clickhouse) always loaded
regardless of deployment mode. Uses parameterized images, fail-if-unset
password variables, and a noop monitoring network bridge.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 20:58:54 +02:00
hsiegeln
933b56f68f docs: add implementation plan for externalizing compose templates
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 20:54:31 +02:00
hsiegeln
19c463051a docs: add design spec for externalizing docker compose templates
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 20:47:14 +02:00
hsiegeln
41052d01e8 fix: replace admin password fallback defaults with fail-if-unset
All checks were successful
CI / build (push) Successful in 1m15s
CI / docker (push) Successful in 16s
Docker compose templates defaulted to admin/admin when .env was missing.
Now uses :? to fail with a clear error instead of silently using weak creds.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 20:17:46 +02:00
hsiegeln
99e75b0a4e fix: update sign-in UI to design-system 0.1.51
All checks were successful
CI / build (push) Successful in 1m15s
CI / docker (push) Successful in 1m31s
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 16:33:00 +02:00
hsiegeln
eb6897bf10 chore: update design-system to 0.1.51 (renamed assets)
Some checks failed
CI / build (push) Failing after 1m9s
CI / docker (push) Has been skipped
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 16:30:50 +02:00
hsiegeln
63c194dab7 chore: rename cameleer3 to cameleer
Some checks failed
CI / build (push) Failing after 18s
CI / docker (push) Has been skipped
Rename Java packages from net.siegeln.cameleer3 to net.siegeln.cameleer,
update all references in workflows, Docker configs, docs, and bootstrap.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 15:28:44 +02:00
hsiegeln
44a0e413e9 fix: include cameleer3-log-appender.jar in runtime base image
All checks were successful
CI / build (push) Successful in 1m20s
CI / docker (push) Successful in 13s
The log appender JAR was missing from the cameleer-runtime-base Docker
image, causing agent log forwarding to silently fail with "No supported
logging framework found, log forwarding disabled". This meant only
container stdout logs (source=container) were captured — no application
or agent logs reached ClickHouse.

CI now downloads the appender JAR from the Maven registry alongside the
agent JAR, and the Dockerfile COPYs it to /app/cameleer3-log-appender.jar
where the server's Docker entrypoint expects it (-Dloader.path for
Spring Boot, -cp for plain Java).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 11:11:04 +02:00
hsiegeln
15306dddc0 fix: force-pull images on install and fix provisioning test assertions
All checks were successful
CI / build (push) Successful in 1m11s
CI / docker (push) Successful in 47s
Installers now use `--pull always --force-recreate` on `docker compose up`
to ensure fresh images are used on every install/reinstall, preventing
stale containers from missing schema changes like db_password.

Fix VendorTenantServiceTest to expect two repository saves in provisioning
tests (one for dbPassword, one for final status).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 08:50:40 +02:00
hsiegeln
6eb848f353 fix: add missing TenantDatabaseService mock to VendorTenantServiceTest
Some checks failed
CI / build (push) Failing after 58s
CI / docker (push) Has been skipped
Constructor gained an 11th parameter (TenantDatabaseService) but the
test was not updated, breaking CI compilation.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 08:32:14 +02:00
hsiegeln
d53afe43cc docs: update CLAUDE.md for per-tenant PG isolation and consolidated migrations
Some checks failed
CI / build (push) Failing after 42s
CI / docker (push) Has been skipped
SonarQube Analysis / sonarqube (push) Failing after 33s
- TenantDatabaseService added to key classes
- TenantDataCleanupService now ClickHouse-only
- Per-tenant JDBC URL with currentSchema/ApplicationName in env vars table
- Provisioning flow updated with DB creation step
- Delete flow updated with schema+user drop
- Database migrations section reflects consolidated V001 baseline

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 00:26:36 +02:00
hsiegeln
24a443ef30 refactor: consolidate Flyway migrations into single V001 baseline
Some checks failed
CI / build (push) Failing after 51s
CI / docker (push) Has been skipped
Replace 14 incremental migrations (V001-V015) with a single V001__init.sql
representing the final schema. Tables that were created and later dropped
(environments, api_keys, apps, deployments) are excluded.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 00:24:25 +02:00
hsiegeln
d7eb700860 refactor: move PG cleanup to TenantDatabaseService, keep only ClickHouse
TenantDataCleanupService now handles only ClickHouse GDPR erasure;
the dropPostgresSchema private method is removed and the public method
renamed cleanupClickHouse(). VendorTenantService updated accordingly
with the TODO comment removed.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-15 00:17:00 +02:00
hsiegeln
c1458e4995 feat: create per-tenant PG database during provisioning, drop on delete
Inject TenantDatabaseService; call createTenantDatabase() at the start
of provisionAsync() (stores generated password on TenantEntity), and
dropTenantDatabase() in delete() before GDPR data erasure.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-15 00:16:06 +02:00
hsiegeln
b79a7fe405 feat: construct per-tenant JDBC URL with currentSchema and ApplicationName
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-15 00:14:35 +02:00
hsiegeln
6d6c1f3562 feat: add TenantDatabaseService for per-tenant PG user+schema
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-15 00:13:34 +02:00
hsiegeln
0e3f383cf4 feat: add dbPassword to TenantProvisionRequest 2026-04-15 00:13:27 +02:00
hsiegeln
cd6dd1e5af feat: add dbPassword field to TenantEntity 2026-04-15 00:13:12 +02:00
hsiegeln
dfa2a6bfa2 feat: add db_password column to tenants table (V015) 2026-04-15 00:13:11 +02:00
hsiegeln
a7196ff4c1 docs: per-tenant PostgreSQL isolation implementation plan
8-task plan covering migration, entity change, TenantDatabaseService,
provisioner JDBC URL construction, VendorTenantService integration,
and TenantDataCleanupService refactor.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 00:11:34 +02:00
hsiegeln
17c6723f7e docs: per-tenant PostgreSQL isolation design spec
Per-tenant PG users and schemas for DB-level data isolation.
Each tenant server gets its own credentials and currentSchema/ApplicationName
JDBC parameters, aligned with server team's commit 7a63135.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 00:08:35 +02:00
hsiegeln
91e93696ed fix: improve Infrastructure page readability
All checks were successful
CI / build (push) Successful in 1m11s
CI / docker (push) Successful in 54s
Use Card and KpiStrip design system components, add database icons to
section headers, right-align numeric columns, replace text toggles with
chevron icons, and constrain max width to prevent ultra-wide stretching.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 23:26:43 +02:00
hsiegeln
57e41e407c fix: remove "Open Server Dashboard" link from tenant sidebar
All checks were successful
CI / build (push) Successful in 1m11s
CI / docker (push) Successful in 55s
The server dashboard link in the sidebar footer is premature — tenant
servers may not be provisioned yet and the link target depends on org
context that isn't always available.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 23:16:39 +02:00
hsiegeln
bc46af5cea fix: use configured credentials for tenant schema cleanup
All checks were successful
CI / build (push) Successful in 1m9s
CI / docker (push) Successful in 39s
Same hardcoded dev credentials bug as InfrastructureService —
TenantDataCleanupService.dropPostgresSchema() used "cameleer"/"cameleer_dev"
instead of the provisioning properties, causing schema DROP to fail on
production installs during tenant deletion.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 23:11:16 +02:00
hsiegeln
03fb414981 fix: use configured credentials for infrastructure PostgreSQL queries
All checks were successful
CI / build (push) Successful in 1m8s
CI / docker (push) Successful in 43s
pgConnection() had hardcoded dev credentials ("cameleer"/"cameleer_dev")
instead of using the provisioning properties, causing "password
authentication failed" on production installs where the password is
generated.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 23:01:00 +02:00
hsiegeln
553ecc1490 fix: PowerShell installer fixes for Windows and Logto console login
All checks were successful
CI / build (push) Successful in 1m7s
CI / docker (push) Successful in 18s
Three issues fixed:

1. Docker socket: use /var/run/docker.sock instead of Windows named pipe
   (//./pipe/docker_engine) — Linux containers can't use named pipes.

2. FQDN detection: reverse-DNS lookup on host IPs to find the FQDN
   instead of relying on GetHostEntry which returns bare hostname on
   Windows machines with DNS-registered domain suffixes.

3. Reinstall path duplication: Push-Location/Pop-Location in the
   reinstall handler used try/catch without finally, so Pop-Location
   was skipped when docker compose wrote to stderr under
   ErrorActionPreference=Stop. CWD stayed in the install dir, causing
   the relative ./cameleer default to resolve to cameleer/cameleer.

4. Logto bootstrap: register admin-console redirect URIs and add the
   admin user to Logto's internal organizations (t-default, t-admin)
   with the admin role — both required for console login to work.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 22:46:05 +02:00
hsiegeln
dec1c53d30 docs: track AGENTS.md for GitNexus code intelligence instructions
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 20:04:06 +02:00
hsiegeln
ace6ad0cf2 fix: remove openssl dependency for password generation
All checks were successful
CI / build (push) Successful in 1m18s
CI / docker (push) Successful in 19s
Use /dev/urandom instead of openssl rand for generating random
passwords. Available on all Linux/macOS systems without requiring
openssl to be installed.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 19:58:11 +02:00
hsiegeln
4a67677158 fix: use correct compose service names in health checks
All checks were successful
CI / build (push) Successful in 1m19s
CI / docker (push) Successful in 33s
The verify_health functions passed short service names (postgres,
clickhouse, server, logto) but the actual compose services are
prefixed with cameleer-. This caused docker compose ps -q to return
empty, so health was never read and checks always timed out.

Also renamed server/server-ui service definitions to
cameleer-server/cameleer-server-ui for consistency.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 19:17:41 +02:00
hsiegeln
27c3f4d136 refactor: prefix all third-party service names with cameleer-
Some checks failed
CI / build (push) Successful in 1m59s
CI / docker (push) Successful in 1m6s
SonarQube Analysis / sonarqube (push) Failing after 1m55s
Rename all Docker Compose service names, DNS hostnames, volumes,
and Traefik labels to use the cameleer- prefix for clear ownership.

Services renamed:
- postgres → cameleer-postgres
- clickhouse → cameleer-clickhouse
- logto → cameleer-logto
- traefik → cameleer-traefik

Volumes renamed:
- pgdata → cameleer-pgdata
- chdata → cameleer-chdata
- certs → cameleer-certs
- bootstrapdata → cameleer-bootstrapdata

Updated across:
- docker-compose.yml, docker-compose.dev.yml
- installer/cameleer/docker-compose.yml
- installer/install.sh, installer/install.ps1
- application.yml defaults
- DockerTenantProvisioner.java hardcoded URL
- logto-bootstrap.sh defaults
- VendorTenantServiceTest.java
- CLAUDE.md, docs/architecture.md, docs/user-manual.md

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 22:51:33 +02:00
hsiegeln
fe6682e520 docs: update CLAUDE.md for deployment modes and admin merge
All checks were successful
CI / build (push) Successful in 1m59s
CI / docker (push) Successful in 23s
- Document standalone vs multi-tenant deployment modes
- Replace vendor references with SaaS admin
- Update bootstrap phases (Phase 12 always runs, no VENDOR_SEED_ENABLED)
- Update provisioning flow (no separate vendor user)
- Remove VENDOR_SEED_ENABLED from dev compose reference

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 21:04:57 +02:00
hsiegeln
012c866594 refactor: merge vendor user into saas-admin
All checks were successful
CI / build (push) Successful in 1m8s
CI / docker (push) Successful in 17s
The admin user IS the platform admin — no separate vendor user needed.
The saas-vendor role is now always assigned to the admin user during
bootstrap. Removes VENDOR_ENABLED, VENDOR_USER, VENDOR_PASS from all
config, prompts, compose templates, and bootstrap script.

In multi-tenant mode: admin logs in with saas-admin credentials, gets
platform:admin scope via saas-vendor role, manages tenants directly.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 20:36:52 +02:00
hsiegeln
4e553a6c42 fix: add BOOTSTRAP_TOKEN to standalone server env
All checks were successful
CI / build (push) Successful in 1m9s
CI / docker (push) Successful in 15s
The cameleer3-server requires CAMELEER_SERVER_SECURITY_BOOTSTRAPTOKEN
at startup. In standalone mode nothing uses it externally, but the
server's SecurityBeanConfig validates it exists. Generate a random
token in the .env and pass it through.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 20:16:10 +02:00
hsiegeln
f254f2700f feat: standalone single-tenant deployment mode
All checks were successful
CI / build (push) Successful in 1m12s
CI / docker (push) Successful in 14s
Single-tenant installations now run the server directly without Logto
or the SaaS management plane. The installer generates a simpler compose
with 5 services: traefik, postgres, clickhouse, cameleer3-server, and
cameleer3-server-ui. Uses local auth (built-in admin), no OIDC.

Multi-tenant (vendor) mode is unchanged — full SaaS stack with Logto.

Changes:
- New DEPLOYMENT_MODE variable (standalone/saas) replaces TENANT_ORG_NAME
- generate_compose_file_standalone() for the 5-service compose
- Standalone traefik-dynamic.yml (no /platform/ redirect)
- Stock postgres:16-alpine (server creates schema via Flyway)
- Standalone health checks (server + UI instead of Logto + SaaS)
- Standalone credentials/docs generation
- Remove Phase 12b from bootstrap (no longer needed)
- Remove setup_single_tenant_record (no longer needed)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 20:12:02 +02:00
hsiegeln
17d8d98d5f fix: move single-tenant DB record creation from bootstrap to installer
All checks were successful
CI / build (push) Successful in 1m11s
CI / docker (push) Successful in 17s
The bootstrap script runs before the SaaS app starts, but the tenants
table only exists after Flyway migrations run in the SaaS app. This
circular dependency caused Phase 12b's psql commands to fail under
set -e, crashing the Logto container on first install in single-tenant
mode.

Now the bootstrap only handles Logto-side setup (org, user roles, OIDC
redirect URIs), and the installer creates the tenant DB record after
verify_health confirms the SaaS app is up. Also makes docker_compose_up
tolerant of transient startup errors since verify_health is the real
health gate.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 19:31:23 +02:00
hsiegeln
bfb26d9aa5 fix: guard logto entrypoint kill with || true to prevent set -e exit
All checks were successful
CI / build (push) Successful in 1m12s
CI / docker (push) Successful in 17s
When the background Logto process exits during bootstrap, `kill $LOGTO_PID`
returns non-zero. Under `set -e`, this terminates the entrypoint before
reaching the production-mode restart, causing the container to error on
first startup and only recover via restart policy.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 19:12:22 +02:00
hsiegeln
cd4266ffc6 chore: remove redundant DOCKER_HOST env var from SaaS service
All checks were successful
CI / build (push) Successful in 1m11s
CI / docker (push) Successful in 13s
TenantProvisionerAutoConfig already hardcodes the socket path via
.withDockerHost("unix:///var/run/docker.sock"). The env var was
redundant and not read by the Java Docker client.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 19:02:42 +02:00
hsiegeln
74a1e02cb8 fix: replace env_file with explicit env vars for cameleer-saas
Some checks failed
CI / build (push) Failing after 2s
CI / docker (push) Has been skipped
Revert env_file approach — only pass the specific env vars the SaaS
app needs for its own database, identity, and tenant provisioning.
Organized into clear groups: Docker, SaaS database, Identity, and
Provisioning (passed to per-tenant servers).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 19:01:26 +02:00
hsiegeln
b3a19098c5 fix: pass all .env vars to cameleer-saas via env_file
Some checks failed
CI / build (push) Failing after 11s
CI / docker (push) Has been skipped
Instead of explicitly listing every env var the SaaS container needs,
use env_file to pass the entire .env. This ensures all installer-
configured values (passwords, hosts, ports, etc.) are available for
current and future use by the SaaS app and its provisioning config.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 18:58:04 +02:00
hsiegeln
6b1dcba876 fix: pass ClickHouse password to SaaS provisioning config
All checks were successful
CI / build (push) Successful in 1m30s
CI / docker (push) Successful in 39s
The CLICKHOUSE_PASSWORD env var was set on the clickhouse container
but not passed to cameleer-saas. The provisioning properties defaulted
to 'cameleer_ch' instead of the installer-generated password, causing
tenant servers to fail ClickHouse authentication.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 18:55:36 +02:00
hsiegeln
38125f9ecc fix: update tests for new ProvisioningProperties constructor args
All checks were successful
CI / build (push) Successful in 1m10s
CI / docker (push) Successful in 41s
Add datasourceUsername and datasourcePassword to test constructors
to match the updated record definition.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 18:48:35 +02:00
hsiegeln
6b95cf78ea fix: add datasource username/password defaults to application.yml
Some checks failed
CI / build (push) Failing after 37s
CI / docker (push) Has been skipped
The new ProvisioningProperties record fields need defaults in
application.yml or Spring Boot fails to bind the configuration.
Defaults to POSTGRES_USER/POSTGRES_PASSWORD env vars with
fallback to cameleer/cameleer_dev.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 18:46:38 +02:00
hsiegeln
b70d95cbb9 fix: pass database credentials to per-tenant servers via config
Some checks failed
CI / build (push) Failing after 38s
CI / docker (push) Has been skipped
The DockerTenantProvisioner hardcoded SPRING_DATASOURCE_USERNAME
and SPRING_DATASOURCE_PASSWORD as "cameleer" / "cameleer_dev".
With the installer generating random passwords, tenant servers
failed to connect to PostgreSQL.

Add datasourceUsername and datasourcePassword to ProvisioningProperties,
pass them from the compose env vars, and use them in the provisioner.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 18:44:32 +02:00
hsiegeln
8b9045b0e2 fix: detect Docker socket GID for container permissions
All checks were successful
CI / build (push) Successful in 1m13s
CI / docker (push) Successful in 12s
The Docker socket group varies by host (e.g., GID 1001 on WSL2).
Hardcoding group_add: ["0"] doesn't work when the socket is owned
by a different group. The installer now detects the socket GID at
install time via stat. The main docker-compose.yml uses a
configurable DOCKER_GID env var (defaults to 0).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 18:39:20 +02:00
hsiegeln
4fe642b91d fix: add Docker socket mount and DOCKER_HOST to SaaS service
All checks were successful
CI / build (push) Successful in 1m12s
CI / docker (push) Successful in 15s
The cameleer-saas service needs Docker socket access for tenant
provisioning. Add the socket bind mount, group_add for permissions,
and explicit DOCKER_HOST=unix:///var/run/docker.sock to prevent
the Java Docker client from falling back to TCP (which happens on
WSL2 + Docker Desktop when DOCKER_HOST leaks from the host env).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 18:30:55 +02:00
hsiegeln
7e13b4ee5d fix(installer): use Docker health status instead of exec for verification
All checks were successful
CI / build (push) Successful in 1m10s
CI / docker (push) Successful in 16s
Replace docker compose exec commands with Docker's built-in health
status checks. The exec-based ClickHouse check hung because
clickhouse-client waits for the server during initialization.
Docker's healthcheck status is already configured in compose and
is more reliable. Logto + Bootstrap merged into one check since
the healthcheck includes the bootstrap.json file test.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 18:28:04 +02:00
hsiegeln
85eabd86ef feat: add deployment mode — vendor (multi-tenant) or single-tenant
All checks were successful
CI / build (push) Successful in 1m11s
CI / docker (push) Successful in 17s
Installer now asks deployment mode in simple mode:
- Multi-tenant vendor: creates saas-vendor role + assigns to admin
- Single tenant: asks for org name, creates Logto org + tenant record,
  assigns admin as org owner

Reverts always-create-vendor-role — role is only created when vendor
mode is selected. TENANT_ORG_NAME env var passed to bootstrap for
single-tenant org creation.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 18:18:25 +02:00
hsiegeln
b44f6338f8 fix: always assign saas-vendor role to admin user
All checks were successful
CI / build (push) Successful in 1m11s
CI / docker (push) Successful in 13s
The admin user needs platform:admin to create tenants via the vendor
console. Previously the saas-vendor role was only created when
VENDOR_SEED_ENABLED=true (for a separate vendor user). Now the role
is always created and assigned to the admin user. VENDOR_SEED_ENABLED
only controls creating the separate vendor user.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 18:09:10 +02:00
hsiegeln
4ff04c386e fix(installer): force lowercase hostname in merge_config
All checks were successful
CI / build (push) Successful in 1m11s
CI / docker (push) Successful in 15s
Moves hostname normalization into merge_config() so it applies
regardless of source (CLI flag, env var, config file, prompt,
auto-detect). Logto normalizes hostnames internally — case mismatch
causes JWT issuer validation failure (401).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 18:04:30 +02:00
hsiegeln
b38f02eae3 fix(installer): fix ClickHouse health check and normalize hostname
All checks were successful
CI / build (push) Successful in 1m10s
CI / docker (push) Successful in 19s
- ClickHouse health check: use $CLICKHOUSE_PASSWORD directly instead
  of extracting from .env via grep (nested quoting broke in eval)
- Normalize auto-detected hostname to lowercase (Windows returns
  uppercase which causes OIDC issuer case mismatches)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 17:58:32 +02:00
hsiegeln
8c504b714d fix: use BOOTSTRAP_LOCAL flag to skip Host headers in bootstrap
All checks were successful
CI / build (push) Successful in 1m10s
CI / docker (push) Successful in 16s
When running inside the Logto container (BOOTSTRAP_LOCAL=true), the
bootstrap script skips Host and X-Forwarded-Proto headers on all curl
calls. This avoids issuer mismatches when Logto runs with localhost
endpoints during bootstrap mode. PUBLIC_HOST/PUBLIC_PROTOCOL remain
unchanged so redirect URIs are generated with the correct public values.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 17:44:02 +02:00
hsiegeln
83801d2499 fix: use localhost for bootstrap, restart Logto with public endpoints
All checks were successful
CI / build (push) Successful in 1m11s
CI / docker (push) Successful in 13s
Start Logto with localhost endpoints so bootstrap can reach the
Management API without going through Traefik. After bootstrap
completes, restart Logto with the real public endpoints for
production use. This eliminates the Traefik race condition entirely.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 17:28:19 +02:00
hsiegeln
9042356e81 fix: wait for Traefik to discover routes before bootstrap
All checks were successful
CI / build (push) Successful in 1m8s
CI / docker (push) Successful in 16s
The Management API requires the admin OIDC endpoint (ADMIN_ENDPOINT)
to be reachable. Since bootstrap now runs inside the Logto container
(not a separate container), Traefik may not have discovered the labels
yet. Wait for the admin endpoint to be routable before running bootstrap.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 17:22:05 +02:00
hsiegeln
f97e951d87 fix: add db alteration deploy step to Logto entrypoint
All checks were successful
CI / build (push) Successful in 1m9s
CI / docker (push) Successful in 14s
Newer Logto versions require `npm run cli db alteration deploy` after
seeding to apply schema migrations. Without this, Logto fails with
"relation systems does not exist".

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 17:12:06 +02:00
hsiegeln
fa6bca0add fix: use apk instead of apt-get in Logto Dockerfile
All checks were successful
CI / build (push) Successful in 1m7s
CI / docker (push) Successful in 1m12s
The Logto base image (ghcr.io/logto-io/logto:latest) is Alpine-based,
not Debian. Switch from apt-get to apk for installing bootstrap deps.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 16:47:05 +02:00
hsiegeln
11dd6a354f feat(installer): add PowerShell installer for Windows
Some checks failed
CI / build (push) Successful in 1m24s
CI / docker (push) Failing after 25s
Mirrors install.sh structure and produces identical output files.
Uses native PowerShell idioms for parameters, prompts, and crypto.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 16:39:24 +02:00
hsiegeln
7f15177310 feat(installer): add main function and complete install.sh
Appends the main() entry point that wires together all installer phases:
arg parsing, config loading, rerun detection, prerequisites, auto-detect,
interactive prompts, config merge/validate, password generation, file
generation, docker pull/up, health verification, and output printing.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-13 16:33:15 +02:00
hsiegeln
b01f6e5109 feat(installer): add re-run, upgrade, and reinstall logic 2026-04-13 16:32:02 +02:00
hsiegeln
8146f072df feat(installer): add output file generation (credentials, INSTALL.md, config) 2026-04-13 16:31:38 +02:00
hsiegeln
f13fd3faf0 feat(installer): add docker operations and health verification 2026-04-13 16:30:53 +02:00
hsiegeln
5e5bc97bf5 feat(installer): add .env and docker-compose.yml generation 2026-04-13 16:30:32 +02:00
hsiegeln
7fc80cad58 feat(installer): add config merge, validation, and password generation 2026-04-13 16:25:34 +02:00
hsiegeln
6eabd0cf2e feat(installer): add interactive prompts for simple and expert modes 2026-04-13 16:25:16 +02:00
hsiegeln
4debee966a feat(installer): add prerequisite checks and auto-detection 2026-04-13 16:24:55 +02:00
hsiegeln
1e348eb8ca feat(installer): add argument parsing and config file handling 2026-04-13 16:24:35 +02:00
hsiegeln
f136502a35 feat(installer): scaffold install.sh with constants and utilities
Creates the installer skeleton (Phase 2, Task 8) with version/registry
constants, color codes, default values, _ENV_* variable capture pattern,
config/state variable declarations, and utility functions (log_*, print_banner,
prompt, prompt_password, prompt_yesno, generate_password).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-13 16:22:21 +02:00
hsiegeln
bf367b1db7 ci: add builds for cameleer-postgres, cameleer-clickhouse, cameleer-traefik
Update Logto build to use repo root context for bootstrap script access.
2026-04-13 16:20:37 +02:00
hsiegeln
f5165add13 feat: consolidate docker-compose.yml for baked-in images
Remove all bind-mounted config files and init containers. Services
reduced from 7 to 5. All configuration via environment variables.
2026-04-13 16:19:29 +02:00
hsiegeln
ec38d0b1c2 feat: merge bootstrap into cameleer-logto image
Adds logto-entrypoint.sh that seeds DB, starts Logto, waits for health,
runs bootstrap, then keeps Logto running. Eliminates the separate
logto-bootstrap init container.
2026-04-13 16:17:13 +02:00
hsiegeln
6cd82de5f9 fix: update traefik-dynamic.yml cert paths to /certs/
The entrypoint writes certs to /certs/ but the dynamic config
referenced /etc/traefik/certs/. Since both are baked into the image,
align the paths so only one volume mount is needed.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 16:15:39 +02:00
hsiegeln
0a0898b2f7 feat: create cameleer-traefik image with cert generation and config baked in
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-13 16:14:47 +02:00
hsiegeln
6864081550 feat: create cameleer-clickhouse image with init and config baked in
Bakes init.sql, users.xml (with from_env password), and prometheus.xml
into a custom ClickHouse image to eliminate 3 bind-mounted config files.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-13 16:13:06 +02:00
hsiegeln
fe5838b40f feat: create cameleer-postgres image with init script baked in
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-13 16:12:02 +02:00
hsiegeln
1b57f03973 Add install script implementation plan
18 tasks across 3 phases:
- Phase 1 (Tasks 1-7): Platform image consolidation — bake init
  scripts into cameleer-postgres, cameleer-clickhouse, cameleer-traefik,
  merge bootstrap into cameleer-logto, update compose and CI
- Phase 2 (Tasks 8-17): Bash installer with simple/expert/silent modes,
  config precedence, health verification, idempotent re-run
- Phase 3 (Task 18): PowerShell port for Windows

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 16:08:50 +02:00
hsiegeln
0a06615ae2 Fix spec self-review issues in install script design
Resolve TBD placeholder (Docker minimum versions), clarify TLS cert
flow after traefik-certs init container merge, note Traefik env var
substitution for dynamic config, and document Docker socket path
differences between Linux and Windows.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 15:38:59 +02:00
hsiegeln
16a2ff3174 Add install script design spec
Defines a professional installer for the Cameleer SaaS platform with
dual native scripts (bash + PowerShell), three installation modes
(simple/expert/silent), and a platform simplification that consolidates
7 services into 5 by baking all init logic into Docker images.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 15:37:23 +02:00
hsiegeln
c2ccf9d233 feat: enable Prometheus metrics for ClickHouse and tenant servers
Some checks failed
CI / build (push) Successful in 1m46s
CI / docker (push) Successful in 55s
SonarQube Analysis / sonarqube (push) Failing after 1m19s
ClickHouse: enable built-in Prometheus exporter at :9363/metrics via
config.d/prometheus.xml with metrics, events, and async_metrics.
Docker labels added for docker_sd_configs auto-discovery.

Tenant servers: add prometheus.scrape/path/port labels to provisioned
server containers pointing to /api/v1/prometheus:8081.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 18:24:08 +02:00
hsiegeln
06c85edd8e chore: update design system to 0.1.45 (sidebar version styling)
All checks were successful
CI / build (push) Successful in 1m54s
CI / docker (push) Successful in 1m29s
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 17:28:49 +02:00
hsiegeln
9514ab69c8 fix: update test constructors for ProvisioningProperties arity change
All checks were successful
CI / build (push) Successful in 1m17s
CI / docker (push) Successful in 41s
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 14:48:53 +02:00
hsiegeln
d3a9be8f2e fix: remove vendor-to-tenant-org addition on tenant creation
Some checks failed
CI / build (push) Failing after 50s
CI / docker (push) Has been skipped
Vendor has platform:admin scope globally and manages tenants through the
SaaS console — no need to be a member of each tenant's Logto org.
Removes the step that failed with Logto's varchar(21) user ID limit.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 14:30:56 +02:00
hsiegeln
85e0d6156a fix: remove :ro from clickhouse-users.xml mount
Some checks failed
CI / build (push) Failing after 58s
CI / docker (push) Has been skipped
ClickHouse entrypoint needs write access to resolve from_env attribute
and apply CLICKHOUSE_PASSWORD to the default user config.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 14:27:35 +02:00
hsiegeln
96aa6579b0 fix: use separate CH credentials, remove dead bootstrap code
Some checks failed
CI / build (push) Failing after 41s
CI / docker (push) Has been skipped
- ClickHouse: pass user/password via ProvisioningProperties instead of
  baking into JDBC URLs. All consumers (InfrastructureService,
  TenantDataCleanupService, DockerTenantProvisioner) use the same source.
- Bootstrap: remove dead tenant config (CAMELEER_AUTH_TOKEN, t-default
  org, example tenant vars) — tenants are created dynamically by vendor.
- Bootstrap JSON: remove unused fields (tenantName, tenantSlug,
  bootstrapToken, tenantAdminUser, organizationId).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 14:12:42 +02:00
hsiegeln
da4a263cd7 fix: add ClickHouse password authentication
All checks were successful
CI / build (push) Successful in 1m14s
CI / docker (push) Successful in 42s
ClickHouse default user had no password, causing auth failures on recent
CH versions. Set password via from_env in clickhouse-users.xml, pass
credentials in JDBC URLs to SaaS services and tenant server containers.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 13:59:59 +02:00
hsiegeln
879accfc7f fix: allow tenant slug reuse after soft-delete
All checks were successful
CI / build (push) Successful in 1m24s
CI / docker (push) Successful in 44s
Replace absolute UNIQUE constraint on tenants.slug with a partial unique
index that excludes DELETED rows. This allows re-creating a tenant with
the same slug after deletion.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 13:39:13 +02:00
hsiegeln
35a62463b3 docs: document vendor Infrastructure page and env var
Some checks failed
CI / build (push) Successful in 1m58s
CI / docker (push) Successful in 1m21s
SonarQube Analysis / sonarqube (push) Failing after 1m51s
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-11 23:20:06 +02:00
hsiegeln
92503a1061 feat: add vendor infrastructure page with PG/CH per-tenant view
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 23:18:59 +02:00
hsiegeln
95a92ae9e5 feat: add vendor InfrastructureController for platform:admin
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-11 23:16:18 +02:00
hsiegeln
5aa8586940 feat: add InfrastructureService with PG and CH queries
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-11 23:15:18 +02:00
hsiegeln
776a01d87b feat: set INFRASTRUCTUREENDPOINTS=false on tenant server containers
Adds CAMELEER_SERVER_SECURITY_INFRASTRUCTUREENDPOINTS=false to the env
var list injected into provisioned tenant server containers, disabling
the Database and ClickHouse admin endpoints (returns 404) on SaaS-
managed instances. The server defaults to true (standalone mode).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-11 23:13:28 +02:00
hsiegeln
0b736a92f9 docs: update env var references to new naming convention
All checks were successful
CI / build (push) Successful in 1m51s
CI / docker (push) Successful in 19s
architecture.md runtime/deployment section rewritten with correct
CAMELEER_SAAS_PROVISIONING_* and CAMELEER_SERVER_* env vars.
user-manual.md updated container resource env vars and removed
stale CAMELEER_TENANT_SLUG reference. HOWTO.md cleaned up.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 21:56:21 +02:00
hsiegeln
df90814cc3 Update OIDC env vars for server's nested oidc.* grouping
All checks were successful
CI / build (push) Successful in 1m47s
CI / docker (push) Successful in 1m2s
Align DockerTenantProvisioner env vars with the server's new
cameleer.server.security.oidc.* namespace:
  CAMELEER_SERVER_SECURITY_OIDC_ISSUERURI
  CAMELEER_SERVER_SECURITY_OIDC_JWKSETURI
  CAMELEER_SERVER_SECURITY_OIDC_AUDIENCE
  CAMELEER_SERVER_SECURITY_OIDC_TLSSKIPVERIFY

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 21:30:41 +02:00
hsiegeln
8cf44f6e2c Migrate config to cameleer.saas.* naming convention
All checks were successful
CI / build (push) Successful in 1m49s
CI / docker (push) Successful in 55s
Move all SaaS configuration properties under the cameleer.saas.*
namespace with all-lowercase dot-separated names and mechanical env var
mapping. Aligns with the server (cameleer.server.*) and agent
(cameleer.agent.*) conventions.

Changes:
- Move cameleer.identity.* → cameleer.saas.identity.*
- Move cameleer.provisioning.* → cameleer.saas.provisioning.*
- Move cameleer.certs.* → cameleer.saas.certs.*
- Rename kebab-case properties to concatenated lowercase
- Update all env vars to CAMELEER_SAAS_* mechanical mapping
- Update DockerTenantProvisioner to pass CAMELEER_SERVER_* env vars
  to provisioned server containers (matching server's new convention)
- Spring JWT config now derives from SaaS properties via cross-reference
- Clean up orphaned properties in application-local.yml
- Update docker-compose.yml, docker-compose.dev.yml, .env.example
- Update CLAUDE.md, HOWTO.md, architecture.md, user-manual.md

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 18:11:21 +02:00
hsiegeln
5e69628a51 docs: update CLAUDE.md with upgrade, password mgmt, TLS, cleanup
All checks were successful
CI / build (push) Successful in 1m50s
CI / docker (push) Successful in 19s
- VendorTenantService: upgrade server (force-pull + re-provision)
- TenantPortalService: password management, server upgrade
- DockerTenantProvisioner: upgrade(), full cleanup in remove(), GDPR
- Traefik TLS: default cert in dynamic config (v3 compatibility)
- CA trust: server entrypoint imports ca.pem into JVM truststore
- LogtoManagementClient: password updates via correct endpoint
- ServerApiClient: server admin password reset
- UI: tenant dashboard/settings password and upgrade controls

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 12:07:41 +02:00
hsiegeln
9163f919c8 fix: move TLS default cert config to Traefik dynamic config
All checks were successful
CI / build (push) Successful in 1m17s
CI / docker (push) Successful in 13s
Traefik v3 ignores tls.stores.default in the static config, causing it
to serve its auto-generated fallback cert instead of the platform cert.
Moving the default certificate store to the dynamic config (file
provider) fixes this — Traefik now serves the correct cert and also
picks up cert rotations without a restart.

This was the root cause of OIDC PKIX failures: the server imported the
CA into its JVM truststore, but Traefik was serving a different cert
entirely.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 11:45:02 +02:00
hsiegeln
3b8b76d53e chore: update @cameleer/design-system to 0.1.44
All checks were successful
CI / build (push) Successful in 1m10s
CI / docker (push) Successful in 55s
Error toasts now persist until manually dismissed.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 11:01:44 +02:00
hsiegeln
e5523c969e fix: use correct Logto endpoint for password updates
All checks were successful
CI / build (push) Successful in 1m12s
CI / docker (push) Successful in 43s
PATCH /api/users/{id}/password, not /api/users/{id}. The general user
update endpoint rejected the password field with 422.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 10:57:28 +02:00
hsiegeln
e2e5c794a2 feat: add server upgrade action — force-pull latest images and re-provision
All checks were successful
CI / build (push) Successful in 1m19s
CI / docker (push) Successful in 48s
Restart only stops/starts existing containers with the same image. The new
upgrade action removes server + UI containers, force-pulls the latest
Docker images, then re-provisions (preserving app containers, volumes, and
networks). Available to both vendor (tenant detail) and tenant admin
(dashboard).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 10:45:45 +02:00
hsiegeln
d5eead888d feat: server admin password reset via tenant portal
All checks were successful
CI / build (push) Successful in 2m23s
CI / docker (push) Successful in 1m8s
- POST /api/tenant/server/admin-password — resets server's built-in
  admin password via M2M API call to the tenant's server
- Settings page: "Server Admin Password" card
- ServerApiClient.resetServerAdminPassword() calls server's password
  reset endpoint with M2M token

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 09:46:30 +02:00
hsiegeln
4121bd64b2 feat: password management for tenant portal
All checks were successful
CI / build (push) Successful in 1m15s
CI / docker (push) Successful in 47s
- POST /api/tenant/password — change own Logto password
- POST /api/tenant/team/{userId}/password — reset team member password
- Settings page: "Change Password" card with confirm field
- Team page: "Reset Password" button per member with inline form
- LogtoManagementClient.updateUserPassword() via Logto Management API

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 09:19:48 +02:00
hsiegeln
dd8553a8b4 feat: full tenant cleanup on delete — Docker resources, PG schema, CH data (#55)
All checks were successful
CI / build (push) Successful in 2m23s
CI / docker (push) Successful in 1m6s
DockerTenantProvisioner.remove() now cleans up all tenant Docker resources:
containers (by cameleer.tenant label), env networks, tenant network, JAR volume.
TenantDataCleanupService drops the tenant's PostgreSQL schema and deletes all
ClickHouse data for GDPR compliance.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 09:10:47 +02:00
hsiegeln
3284304c1f fix: remove dead /server/ fallback redirect
All checks were successful
CI / build (push) Successful in 1m14s
CI / docker (push) Successful in 46s
When no org is resolved, redirect to /tenant instead of the
non-existent /server/ path. Fixes login redirect loop.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 08:39:14 +02:00
hsiegeln
6f8b84fb1a fix: re-provision containers when restart finds them missing
All checks were successful
CI / build (push) Successful in 1m22s
CI / docker (push) Successful in 39s
When Docker containers have been removed (e.g. manual cleanup or image
update), restart now falls back to full re-provisioning instead of
failing with 404. Applies to both vendor and tenant portal restart.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 08:37:04 +02:00
hsiegeln
d2caa737b9 chore: update @cameleer/design-system to v0.1.43 (FileInput)
All checks were successful
CI / build (push) Successful in 1m10s
CI / docker (push) Successful in 29s
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 08:07:39 +02:00
hsiegeln
875b07fb3a feat: use FileInput DS component for file uploads, fix certs volume perms
All checks were successful
CI / build (push) Successful in 1m24s
CI / docker (push) Successful in 1m12s
- Replace inline FileField and native <input type="file"> with
  FileInput from @cameleer/design-system (drag-and-drop, icons, clear)
- Update CertificatesPage and SsoPage to use FileInput + FormField
- Fix /certs volume permissions (chmod 775) so cameleer user can write

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 08:04:47 +02:00
hsiegeln
4fdf171912 fix: don't show stale CA banner when no CA bundle exists
Some checks failed
CI / build (push) Successful in 1m39s
CI / docker (push) Successful in 37s
SonarQube Analysis / sonarqube (push) Failing after 1m44s
The self-signed bootstrap cert has no CA bundle, so newly created tenants
with ca_applied_at=NULL are not actually stale. Skip the count when the
active cert has hasCa=false.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 22:21:26 +02:00
hsiegeln
2239d3d980 docs: update CLAUDE.md and HOWTO.md for fleet health and recent changes
All checks were successful
CI / build (push) Successful in 2m13s
CI / docker (push) Successful in 11s
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 22:10:11 +02:00
hsiegeln
8eef7e170b feat: show agent/env counts in vendor tenant list
All checks were successful
CI / build (push) Successful in 1m20s
CI / docker (push) Successful in 48s
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 22:02:46 +02:00
hsiegeln
d7ce0aaf8c feat: add agent/env counts to vendor tenant list endpoint
Extend VendorTenantSummary with agentCount, environmentCount, and
agentLimit fields. Fetch counts in parallel using CompletableFuture
per tenant, only calling server API for ACTIVE tenants with RUNNING
servers. Agent limit extracted from license limits JSONB.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 22:01:02 +02:00
hsiegeln
a0c12b8ee6 chore: update DS to v0.1.42
All checks were successful
CI / build (push) Successful in 2m3s
CI / docker (push) Successful in 1m14s
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 21:02:32 +02:00
hsiegeln
a5445e332e fix: fetch actual agent/environment counts from server for tenant dashboard
All checks were successful
CI / build (push) Successful in 1m8s
CI / docker (push) Successful in 43s
The dashboard was showing hardcoded zeroes for agent and environment usage.
Now fetches real counts via M2M API from the tenant's server.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 20:35:24 +02:00
hsiegeln
cab6e409b9 fix: show public endpoint instead of internal Docker URL in tenant settings
All checks were successful
CI / build (push) Successful in 1m21s
CI / docker (push) Successful in 38s
Closes #51

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 20:29:54 +02:00
hsiegeln
0fe084bcb2 fix: restrict key.pem file permissions to 0600 (owner-only)
All checks were successful
CI / build (push) Successful in 1m6s
CI / docker (push) Successful in 34s
All private key writes now use writeAtomicRestricted which sets POSIX
owner-read/write permissions after writing. Gracefully skips on
non-POSIX filesystems (Windows dev).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 19:49:07 +02:00
hsiegeln
3ae8fa18cd feat: support password-protected private keys
All checks were successful
CI / build (push) Successful in 1m8s
CI / docker (push) Successful in 42s
Encrypted PKCS#8 private keys are decrypted during staging using the
provided password. The decrypted key is stored for Traefik (which needs
cleartext PEM). Unencrypted keys continue to work without a password.

- CertificateManager.stage() accepts optional keyPassword
- DockerCertificateManager handles EncryptedPrivateKeyInfo decryption
- UI: password field in upload form (vendor CertificatesPage)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 19:44:09 +02:00
hsiegeln
82f62ca0ff docs: add tenant CA cert management to CLAUDE.md and HOWTO.md
All checks were successful
CI / build (push) Successful in 1m8s
CI / docker (push) Successful in 12s
- TenantCaCertEntity, TenantCaCertRepository, TenantCaCertService
- TenantPortalController CA endpoints
- V013 migration
- Tenant portal API reference updated

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 19:36:51 +02:00
hsiegeln
dd30ee77d4 feat: tenant CA certificate management with staging
Some checks failed
CI / build (push) Successful in 1m7s
CI / docker (push) Has been cancelled
Tenants can upload multiple CA certificates for enterprise SSO providers
that use private certificate authorities.

- New tenant_ca_certs table (V013) with PEM storage in DB
- Stage/activate/delete lifecycle per CA cert
- Aggregated ca.pem rebuild on activate/delete (atomic .wip swap)
- REST API: GET/POST/DELETE on /api/tenant/ca
- UI: CA Certificates section on SSO page with upload, activate, remove

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 19:35:04 +02:00
hsiegeln
a3a6f99958 fix: prevent vendor redirect to /tenant on hard refresh
All checks were successful
CI / build (push) Successful in 1m6s
CI / docker (push) Successful in 42s
RequireScope and LandingRedirect now wait for scopesReady flag before
evaluating, preventing the race where org-scoped tokens load before
global tokens and the vendor gets incorrectly redirected.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 19:16:46 +02:00
hsiegeln
22752ffcb1 fix: polish CertificatesPage layout
All checks were successful
CI / build (push) Successful in 1m7s
CI / docker (push) Successful in 44s
- Truncate fingerprint with hover tooltip
- Remove duplicate warning icon in stale banner
- Style file inputs to match design system
- Bump grid min-width for better card spacing

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 19:04:16 +02:00
hsiegeln
a48c4bfd08 docs: update CLAUDE.md and HOWTO.md for all session changes
All checks were successful
CI / build (push) Successful in 1m5s
CI / docker (push) Successful in 9s
- Certificate management (provider interface, lifecycle, bootstrap, UI)
- Async tenant provisioning with polling UX
- Server restart capability (vendor + tenant)
- Audit log actor name resolution from Logto
- SSO connector management, vendor audit page
- Updated API reference with all current endpoints
- Fixed architecture table (per-tenant containers are dynamic)
- Updated migration list through V012

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 18:41:41 +02:00
hsiegeln
45bcc954ac feat: certificate management with stage/activate/restore lifecycle
All checks were successful
CI / build (push) Successful in 1m8s
CI / docker (push) Successful in 45s
Provider-based architecture (Docker now, K8s later):
- CertificateManager interface + DockerCertificateManager (file-based)
- Atomic swap via .wip files for safe cert replacement
- Stage -> Activate -> Archive lifecycle with one-deep rollback
- Bootstrap supports user-supplied certs via CERT_FILE/KEY_FILE/CA_FILE
- CA bundle aggregates platform + tenant CAs, distributed to containers
- Vendor UI: Certificates page with upload, activate, restore, discard
- Stale tenant tracking (ca_applied_at) with restart banner
- Conditional TLS skip removal when CA bundle exists

Includes design spec, migration V012, service + controller tests.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 18:29:02 +02:00
hsiegeln
51a1aef10e fix: hide server dashboard link for vendor, remove fingerprint icon
All checks were successful
CI / build (push) Successful in 52s
CI / docker (push) Successful in 43s
Vendor persona doesn't need "Open Server Dashboard" in sidebar footer.
Removed inline Fingerprint icon from Identity (Logto) menu item.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 17:49:40 +02:00
hsiegeln
2607ef5dbe fix: resolve actor name from Logto for audit log entries
All checks were successful
CI / build (push) Successful in 50s
CI / docker (push) Successful in 32s
AuditService now looks up username/name/email from Logto Management API
when actorEmail is null, with an in-memory cache to avoid repeated calls.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 17:47:43 +02:00
hsiegeln
0a1e848ef7 fix: return 204 No Content from restart endpoints
All checks were successful
CI / build (push) Successful in 52s
CI / docker (push) Successful in 34s
Empty 200 responses caused JSON parse errors in the API client.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 17:41:17 +02:00
hsiegeln
6dc5e558a3 chore: bump @cameleer/design-system to 0.1.41
All checks were successful
CI / build (push) Successful in 51s
CI / docker (push) Successful in 52s
Picks up Spinner animation fix (missing @keyframes in CSS module).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 17:38:40 +02:00
hsiegeln
a3a1643b37 fix: update VendorTenantServiceTest for async provisioning
All checks were successful
CI / build (push) Successful in 50s
CI / docker (push) Successful in 45s
Tests now mock tenantRepository.findById() since provisionAsync re-loads
the tenant entity, and assert on the entity directly rather than the
return value of createAndProvision().

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 17:28:51 +02:00
hsiegeln
4447d79c92 fix: add missing TenantProvisioner mock to TenantPortalServiceTest
Some checks failed
CI / build (push) Failing after 40s
CI / docker (push) Has been skipped
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 17:25:37 +02:00
hsiegeln
7e7a07470b feat: add restart server action for vendor and tenant
Some checks failed
CI / build (push) Failing after 36s
CI / docker (push) Has been skipped
Vendor: POST /api/vendor/tenants/{id}/restart (platform:admin scope)
Tenant: POST /api/tenant/server/restart (tenant:manage scope)

Both call TenantProvisioner.stop() then start() on the server + UI
containers. Restart button on vendor TenantDetailPage (Actions card)
and tenant TenantDashboardPage (Server card). Allowed in any status
including PROVISIONING.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 17:21:14 +02:00
hsiegeln
252c18bcff feat: async tenant provisioning with polling UX
Some checks failed
CI / build (push) Failing after 39s
CI / docker (push) Has been skipped
Backend: extract Docker provisioning into @Async method so the API
returns immediately with status=PROVISIONING. The tenant record, Logto
org, admin user, and license are created synchronously; container
provisioning, health check, license push, and OIDC config happen in a
background thread.

Frontend: navigate to tenant detail page immediately after creation.
Detail page polls every 3s while status=PROVISIONING and shows a
spinner indicator. Toast notification when provisioning completes.
Fixes #52.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 17:14:26 +02:00
hsiegeln
269c679e9c fix: remove server-specific providers now that TopBar is decomposed
All checks were successful
CI / build (push) Successful in 1m9s
CI / docker (push) Successful in 49s
Update to @cameleer/design-system@0.1.40 where TopBar no longer depends
on GlobalFilterProvider or CommandPaletteProvider. Remove these
unnecessary provider wrappers from main.tsx. Fixes #53.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 17:06:16 +02:00
hsiegeln
e559267f1e feat: replace tenant OIDC page with Enterprise SSO connector management
All checks were successful
CI / build (push) Successful in 1m3s
CI / docker (push) Successful in 46s
- Add LogtoManagementClient methods for SSO connector CRUD + org JIT
- Add TenantSsoService with tenant isolation (validates connector-org link)
- Add TenantSsoController at /api/tenant/sso with test endpoint
- Create SsoPage with provider selection, dynamic config form, test button
- Remove old OIDC config endpoints from tenant portal (server OIDC is
  now platform-managed, set during provisioning)
- Sidebar: OIDC -> SSO with Shield icon

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 15:48:51 +02:00
hsiegeln
4341656a5e refactor: remove additionalScopes from OIDC config push
All checks were successful
CI / build (push) Successful in 1m34s
CI / docker (push) Successful in 56s
Server now hardcodes Logto org scopes in the auth flow, so the
provisioner no longer needs to push them via OIDC config.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 15:37:53 +02:00
hsiegeln
2cda065c06 fix: register /platform/ as post-logout redirect URI, improve sidebar contrast
All checks were successful
CI / build (push) Successful in 52s
CI / docker (push) Successful in 44s
- Add /platform/ to SPA postLogoutRedirectUris in bootstrap (fixes #54)
- Use amber color + bold for active vendor sidebar items

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 13:41:18 +02:00
hsiegeln
bcad83cc40 fix: use JdbcTemplate for audit queries (match server pattern)
All checks were successful
CI / build (push) Successful in 53s
CI / docker (push) Successful in 34s
Replace JPQL @Query with dynamic SQL via JdbcTemplate to avoid
Hibernate null parameter type issues (bytea vs text). Conditionally
appends WHERE clauses only for non-null filters, matching the proven
pattern from cameleer3-server's PostgresAuditRepository.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 13:31:02 +02:00
hsiegeln
0d47c2ec7c fix: avoid null bytea in audit search JPQL
All checks were successful
CI / build (push) Successful in 51s
CI / docker (push) Successful in 32s
Hibernate binds null String params as bytea, causing PostgreSQL
lower(bytea) error. Convert null search to empty string in service
layer, use empty-string check in JPQL instead of IS NULL.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 13:26:18 +02:00
hsiegeln
247ec030e5 fix: use COALESCE in audit JPQL to prevent lower(bytea) error
All checks were successful
CI / build (push) Successful in 51s
CI / docker (push) Successful in 31s
Hibernate passes null search param as bytea type, causing PostgreSQL
to fail on LOWER(bytea). COALESCE converts null to empty string.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 13:21:20 +02:00
hsiegeln
a1acc0bc62 fix: permit SPA routes /vendor/** and /tenant/** for direct navigation
All checks were successful
CI / build (push) Successful in 49s
CI / docker (push) Successful in 32s
Without this, hard refresh on SPA routes returns 401 because Spring
Security intercepts before SpaController can forward to index.html.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 13:15:08 +02:00
hsiegeln
8b94937d38 feat: add audit log viewing for vendor and tenant personas
All checks were successful
CI / build (push) Successful in 52s
CI / docker (push) Successful in 40s
Vendor sees all audit events with tenant filter at /vendor/audit.
Tenant admin sees only their own events at /tenant/audit.
Both support pagination, action/result filters, and text search.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 13:07:18 +02:00
hsiegeln
1750fe64a2 docs: update CLAUDE.md with provisioning fixes and OIDC role flow
All checks were successful
CI / build (push) Successful in 52s
CI / docker (push) Successful in 9s
Documents traefik.docker.network label requirement, JAR volume mount,
CAMELEER_API_URL env var, additionalScopes for org roles, and the
OIDC role fallback priority (claim mapping > token roles > defaults).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 12:43:45 +02:00
hsiegeln
4572a4bb57 fix: mount JAR volume on provisioned server containers
All checks were successful
CI / build (push) Successful in 1m3s
CI / docker (push) Successful in 34s
The server needs a shared Docker volume at /data/jars to store
uploaded JARs that deployed app containers can access. Without this
mount, deployed containers fail with "Unable to access jarfile".

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 12:36:09 +02:00
hsiegeln
9824d06824 fix: include Logto org scopes in OIDC config pushed to servers
All checks were successful
CI / build (push) Successful in 1m1s
CI / docker (push) Successful in 38s
Without urn:logto:scope:organizations and
urn:logto:scope:organization_roles in the additionalScopes, Logto
doesn't include organization role data in the Custom JWT context.
This caused the roles claim to be empty, so all OIDC users got
defaultRoles (VIEWER) instead of their org role (e.g. owner →
server:admin).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 12:10:56 +02:00
hsiegeln
e24c6da025 feat: grant vendor user Logto admin console access during bootstrap
All checks were successful
CI / build (push) Successful in 1m1s
CI / docker (push) Successful in 11s
When VENDOR_SEED_ENABLED=true, the vendor user is now also created
in the Logto admin tenant with user + default:admin roles, giving
them access to the Logto admin console at port 3002.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 11:54:57 +02:00
hsiegeln
6bdcbf840b fix: correct Logto admin console link to use port 3002
All checks were successful
CI / build (push) Successful in 58s
CI / docker (push) Successful in 47s
The Identity (Logto) link in the vendor sidebar pointed to /console
which doesn't exist. The Logto admin console is served on port 3002
via a dedicated Traefik entrypoint.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 11:52:45 +02:00
hsiegeln
4699db5465 docs: document traefik.docker.network and CAMELEER_API_URL in CLAUDE.md
All checks were successful
CI / build (push) Successful in 57s
CI / docker (push) Successful in 9s
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 09:55:31 +02:00
hsiegeln
d911fd2201 fix: add traefik.docker.network label to provisioned containers
All checks were successful
CI / build (push) Successful in 53s
CI / docker (push) Successful in 34s
Traefik's Docker provider resolves container IPs using the configured
default network ('cameleer'). For dynamically-created containers not
managed by compose, this network name doesn't match. Adding the
traefik.docker.network label explicitly tells Traefik to use the
cameleer-traefik network for routing, fixing 504 Gateway Timeouts
on /t/{slug}/api/* paths.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 09:40:59 +02:00
hsiegeln
b4f9277220 fix: use CAMELEER_API_URL env var for server-ui container
All checks were successful
CI / build (push) Successful in 53s
CI / docker (push) Successful in 33s
The nginx template in cameleer3-server-ui uses ${CAMELEER_API_URL} for
the upstream proxy target, not API_URL. The wrong env var name caused
the baked-in default (http://cameleer3-server:8081) to be used, which
doesn't resolve in per-tenant networks where the server is named
cameleer-server-{slug}.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 09:16:42 +02:00
hsiegeln
eaf109549d fix: use /api/v1/admin/oidc for server OIDC config push (not /api/admin)
All checks were successful
CI / build (push) Successful in 51s
CI / docker (push) Successful in 36s
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 08:58:34 +02:00
hsiegeln
3a6b94c1eb fix: remove server health wait from bootstrap (no compose server)
All checks were successful
CI / build (push) Successful in 1m3s
CI / docker (push) Successful in 10s
Bootstrap was stuck waiting for cameleer3-server which no longer exists
in docker-compose. Removed server wait loop and SERVER_ENDPOINT config.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 08:54:37 +02:00
hsiegeln
b727bc771d docs: update CLAUDE.md to reflect platform redesign
All checks were successful
CI / build (push) Successful in 1m27s
CI / docker (push) Successful in 14s
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 08:44:23 +02:00
hsiegeln
7ee2985626 feat: push OIDC config to provisioned server for SSO login
All checks were successful
CI / build (push) Successful in 52s
CI / docker (push) Successful in 33s
After provisioning a server, pushes Logto Traditional Web App
credentials (client ID + secret) via the server's OIDC admin API.
This enables SSO: users authenticated via Logto can access the
server dashboard without a separate login.

Reads tradAppSecret from bootstrap JSON via LogtoConfig.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 08:37:01 +02:00
hsiegeln
3efae43879 feat: clean control plane — remove all example tenant resources
All checks were successful
CI / build (push) Successful in 58s
CI / docker (push) Successful in 11s
- Removed cameleer3-server and cameleer3-server-ui from docker-compose
  (tenants provision their own server instances via the vendor console)
- Removed viewer/camel user from bootstrap (tenant users created during
  provisioning)
- Removed Phase 7 server OIDC configuration (provisioned servers get
  OIDC config from env vars, claim mappings via Logto Custom JWT)
- Removed server-related env vars from bootstrap (SERVER_ENDPOINT, etc.)
- Removed jardata volume from dev overlay

Clean slate: docker compose up gives you Traefik + PostgreSQL +
ClickHouse + Logto + SaaS platform + vendor seed. Everything else
(servers, tenants, users) created through the vendor console.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 08:24:28 +02:00
hsiegeln
aa663a9c9e feat: vendor sidebar section, remove example tenant, add Logto link
All checks were successful
CI / build (push) Successful in 1m4s
CI / docker (push) Successful in 48s
- Sidebar: Tenants moved into expandable "Vendor" section with
  sub-items for Tenants and Identity (Logto console link)
- Bootstrap: removed example organization creation (Phase 6 org)
  — tenants are now created exclusively via the vendor console
- Removed BootstrapDataSeeder (no auto-seeded tenant/license)
- Bootstrap log updated to reflect clean-slate approach

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 08:19:46 +02:00
hsiegeln
f5ef8e6488 feat: per-tenant network isolation
All checks were successful
CI / build (push) Successful in 52s
CI / docker (push) Successful in 33s
Each tenant gets an isolated Docker bridge network (cameleer-tenant-{slug}).
Server + UI containers use the tenant network as primary, with additional
connections to the shared services network (postgres/clickhouse/logto) and
Traefik network (routing). Tenant networks are internal (no internet) and
isolated from each other. Apps deployed by the tenant server also join
the tenant network. Network is removed on tenant delete.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 08:04:11 +02:00
hsiegeln
0a43a7dcd1 feat: register OIDC redirect URIs for provisioned tenant servers
All checks were successful
CI / build (push) Successful in 53s
CI / docker (push) Successful in 34s
During tenant provisioning, adds /t/{slug}/oidc/callback to the Logto
Traditional Web App's registered redirect URIs. This enables the
server's OIDC login flow to work when accessed via Traefik routing.

Also reads tradAppId from bootstrap JSON via LogtoConfig.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 07:50:38 +02:00
hsiegeln
3b345881c6 fix: strip non-alphanumeric chars from admin username input
All checks were successful
CI / build (push) Successful in 53s
CI / docker (push) Successful in 41s
Logto usernames must match alphanumeric regex. The form now strips
invalid characters on input and shows a hint about the constraint.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 07:44:59 +02:00
hsiegeln
2dc75c4361 feat: create initial admin user + add vendor to new tenant orgs
All checks were successful
CI / build (push) Successful in 50s
CI / docker (push) Successful in 41s
When creating a tenant, the vendor can specify adminUsername +
adminPassword. The backend creates the user in Logto and assigns them
the owner org role. The vendor user is also auto-added to every new
org for support access.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 07:35:17 +02:00
hsiegeln
b7a0530466 fix: exclude DELETED tenants from vendor tenant list
All checks were successful
CI / build (push) Successful in 50s
CI / docker (push) Successful in 33s
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 07:28:56 +02:00
hsiegeln
ebdb4f9450 fix: allow slug reuse after tenant soft-delete
All checks were successful
CI / build (push) Successful in 59s
CI / docker (push) Successful in 39s
existsBySlug found DELETED records, blocking slug reuse. Changed to
existsBySlugAndStatusNot(slug, DELETED) so deleted tenant slugs can
be reclaimed.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 07:25:49 +02:00
hsiegeln
5ed33807d8 fix: use /api/v1/health for server health checks (not /actuator/health)
Some checks failed
CI / build (push) Successful in 49s
CI / docker (push) Successful in 30s
SonarQube Analysis / sonarqube (push) Failing after 1m24s
The server's /actuator/health requires auth. The public health endpoint
is /api/v1/health (same as compose-managed server's Docker HEALTHCHECK).
Also increased health check retries/timeout and added startPeriod.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-09 23:43:46 +02:00
hsiegeln
00476c974f fix: vendor scoping, sidebar visibility, and landing redirect
All checks were successful
CI / build (push) Successful in 50s
CI / docker (push) Successful in 41s
- OrgResolver merges global + org-scoped token scopes so vendor's
  platform:admin (from global saas-vendor role) is always visible
- LandingRedirect waits for scopes to load before redirecting
  (prevents premature redirect to server dashboard)
- Layout hides tenant portal sidebar items when vendor is on
  /vendor/* routes; shows them when navigating to tenant context

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-09 23:39:12 +02:00
hsiegeln
c674785c82 fix: merge global + org-scoped token scopes in OrgResolver
All checks were successful
CI / build (push) Successful in 50s
CI / docker (push) Successful in 40s
Vendor's platform:admin scope comes from a global Logto role, which is
only present in the non-org-scoped token. OrgResolver now fetches both
the global token and the org-scoped token, merging their scopes. This
ensures vendor users see platform:admin and land on the vendor console.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-09 23:33:30 +02:00
hsiegeln
4087ce8f29 fix: provisioned server containers — strip-prefix, Docker socket, env
All checks were successful
CI / build (push) Successful in 1m14s
CI / docker (push) Successful in 34s
- Add Traefik strip-prefix middleware so /t/{slug}/api -> /api on server
- Add priority to routers (server API=10, UI=5) to prevent conflicts
- Mount Docker socket + group_add in server containers for app deployment
- Add JAR storage, Docker network, volume env vars for runtime orchestrator
- Use HashMap for labels (>10 entries exceeds Map.of limit)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-09 23:26:48 +02:00
hsiegeln
39c3b39711 feat: role-based sidebar visibility and landing redirect
All checks were successful
CI / build (push) Successful in 51s
CI / docker (push) Successful in 42s
- Vendor (platform:admin): sees only TENANTS in sidebar
- Tenant admin (tenant:manage): sees Dashboard, License, OIDC, Team, Settings
- Regular user (operator/viewer): redirected to server dashboard directly
- LandingRedirect checks scopes in priority order: vendor > admin > server

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-09 23:24:22 +02:00
hsiegeln
cdd495d985 ci: exclude new integration tests from CI (no Testcontainers in CI)
All checks were successful
CI / build (push) Successful in 50s
CI / docker (push) Successful in 33s
VendorTenantControllerTest and TenantPortalControllerTest use
@SpringBootTest + Testcontainers PostgreSQL, same as the existing
controller tests that are already excluded.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-09 23:11:45 +02:00
hsiegeln
17fbe73e60 test: add 25 tests for vendor + portal services and controllers
Some checks failed
CI / build (push) Failing after 1m16s
CI / docker (push) Has been skipped
VendorTenantServiceTest (8): create/provision, suspend, delete, renew
VendorTenantControllerTest (7): CRUD, auth, conflict handling
TenantPortalServiceTest (5): dashboard, license, settings
TenantPortalControllerTest (5): dashboard, license, settings, auth

Fix TenantIsolationInterceptor bugs found by tests:
- org_id resolution now runs before portal path check
- path matching uses URI minus context path (not getServletPath)
- portal path returns 403 sendError instead of empty 200

Total: 50 tests, 0 failures.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-09 23:08:47 +02:00
hsiegeln
faac0048c3 fix: add missing server env vars to DockerTenantProvisioner
All checks were successful
CI / build (push) Successful in 50s
CI / docker (push) Successful in 43s
Adds CAMELEER_AUTH_TOKEN, CAMELEER_JWT_SECRET, CAMELEER_OIDC_AUDIENCE,
CLICKHOUSE_URL to provisioned server containers. Also passes PUBLIC_HOST
and PUBLIC_PROTOCOL to SaaS container in dev overlay so provisioner
resolves the correct hostname instead of defaulting to localhost.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-09 22:53:21 +02:00
hsiegeln
e6f2f17fa1 fix: use correct tier enum values (LOW/MID/HIGH/BUSINESS) in create form
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-09 22:39:23 +02:00
hsiegeln
28d044efbc fix: vendor user now lands on /vendor/tenants after login
LandingRedirect component checks scopes — platform:admin goes to
vendor console, others go to tenant dashboard.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-09 22:37:28 +02:00
hsiegeln
6a81053d37 feat: integrate vendor seed into bootstrap with VENDOR_SEED_ENABLED switch
All checks were successful
CI / build (push) Successful in 1m13s
CI / docker (push) Successful in 49s
Phase 12 in logto-bootstrap.sh creates saas-vendor global role + vendor
user when VENDOR_SEED_ENABLED=true. Enabled by default in dev overlay.
Also restores GlobalFilterProvider + CommandPaletteProvider (required by
DS TopBar internally).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-09 22:30:06 +02:00
hsiegeln
fd41a056eb feat: Docker socket mount for tenant provisioning
Add Docker socket volume, group_add: ["0"], and provisioning env vars
(CAMELEER_SERVER_IMAGE, CAMELEER_SERVER_UI_IMAGE, CAMELEER_NETWORK,
CAMELEER_TRAEFIK_NETWORK) to the cameleer-saas service in docker-compose.dev.yml.
2026-04-09 22:30:06 +02:00
hsiegeln
9ecaf22f09 feat: tenant portal — all 5 pages (dashboard, license, OIDC, team, settings)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-09 22:30:06 +02:00
hsiegeln
d2f6b02a5f feat: vendor console — tenant list, create wizard, detail page
Implements Task 9: shared components (ServerStatusBadge, UsageIndicator,
platform.module.css, tierColor utility) and full vendor console pages
(VendorTenantsPage, CreateTenantPage, TenantDetailPage). Build passes cleanly.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-09 22:30:06 +02:00
hsiegeln
bf3aa57274 feat: restructure frontend routes — vendor/tenant persona split
Splits the flat 3-page UI into /vendor/* (platform:admin) and /tenant/*
(all authenticated users) route trees, with stub pages, new API hooks,
updated Layout with persona-aware sidebar, and SpaController forwarding.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-09 22:29:59 +02:00
hsiegeln
e56e3fca8a feat: tenant portal API (dashboard, license, OIDC, team, settings)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-09 22:29:01 +02:00
hsiegeln
127834ce4d feat: vendor tenant API with provisioning, suspend, delete
Adds VendorTenantService orchestrating full tenant lifecycle (create,
provision, license push, activate, suspend, delete, renew license),
VendorTenantController at /api/vendor/tenants with platform:admin guard,
LicenseResponse.from() factory, SecurityConfig vendor/tenant path rules,
and TenantIsolationInterceptor bypasses for vendor and tenant portal paths.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-09 22:29:01 +02:00
hsiegeln
6bdb02ff5a feat: add per-tenant health, OIDC, team methods to API clients
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-09 22:29:01 +02:00
hsiegeln
96a5b1d9f1 feat: implement DockerTenantProvisioner with container lifecycle
Replace stub with full Docker implementation using docker-java. Manages
per-tenant server and UI containers with Traefik labels, health checks,
image pull, network attachment, and full lifecycle (provision/start/stop/remove/status).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-09 22:29:01 +02:00
hsiegeln
771e9d1081 feat: add TenantProvisioner interface with auto-detection
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-09 22:29:01 +02:00
hsiegeln
ebba021448 feat: add provisioning fields to tenants + license revoke
Adds server_endpoint and provision_error columns to tenants table (V011 migration),
updates TenantEntity and TenantResponse with new fields and a from() factory,
adds revokeLicense() to LicenseService, and updates TenantController to use the factory.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-09 22:29:01 +02:00
hsiegeln
81d570fd63 docs: add platform redesign implementation plan (12 tasks)
Backend: TenantProvisioner interface, DockerTenantProvisioner,
vendor API (create/provision/suspend/delete), tenant portal API
(dashboard/license/OIDC/team/settings). Frontend: route restructure
(/vendor/*, /tenant/*), persona-aware Layout, 8 new pages.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-09 22:29:01 +02:00
hsiegeln
7b92de4017 docs: add platform redesign spec with user stories
Redesign SaaS platform from read-only viewer into vendor management
plane with tenant provisioning, license management, and customer
self-service. Two personas (vendor/customer), pluggable provisioning
interface (Docker first, K8s later), per-tenant server instances.

User stories tracked as Gitea issues #40-#51. Closes #37.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-09 22:29:01 +02:00
0ba896ada4 Merge pull request 'SaaS platform UX polish: layout, navigation, error handling' (#39) from feature/saas-ux-polish into main
All checks were successful
CI / build (push) Successful in 53s
CI / docker (push) Successful in 11s
Reviewed-on: #39
2026-04-09 19:56:24 +02:00
hsiegeln
af7abc3eac fix: add confirmation dialog before tenant context switch
All checks were successful
CI / build (push) Successful in 1m18s
CI / build (pull_request) Successful in 1m19s
CI / docker (pull_request) Has been skipped
CI / docker (push) Successful in 1m1s
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-09 19:51:59 +02:00
hsiegeln
ce1655bba6 fix: add error states to OrgResolver, DashboardPage, AdminTenantsPage
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-09 19:51:28 +02:00
hsiegeln
798ec4850d fix: replace raw button with DS Button, add token copy-to-clipboard
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-09 19:51:03 +02:00
hsiegeln
7d4126ad4e fix: unify tier color mapping, fix feature badge colors
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-09 19:50:28 +02:00
hsiegeln
e3d9a3bd18 fix: sidebar active state, breadcrumbs, collapse, username fallback, lucide icons
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-09 19:49:42 +02:00
hsiegeln
7c7d574aa7 fix: replace hardcoded text-white with DS variables, fix label/value layout
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-09 19:49:32 +02:00
hsiegeln
f9b1628e14 fix: add password visibility toggle and fix branding to 'Cameleer'
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-09 19:49:14 +02:00
hsiegeln
e84e53f835 Add SaaS platform UX polish implementation plan (8 tasks)
Detailed step-by-step plan covering layout fixes (label/value collision,
DS variable adoption), header/navigation (sidebar active state,
breadcrumbs, collapse), error handling, DS component adoption, sign-in
improvements, and polish (tier colors, badges, confirmations).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-09 19:46:44 +02:00
hsiegeln
1133763520 Add SaaS platform UX polish design spec with audit findings
Playwright audit (22 screenshots) + source code audit covering all
platform pages. Spec defines 4 batches: layout fixes (label/value
collision, hardcoded colors), header/navigation (hide server controls,
sidebar active state), error handling & components (DS adoption,
copy-to-clipboard, error states), and polish (tier colors, badges).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-09 19:39:33 +02:00
hsiegeln
5c4a84e64c fix: platform label/value spacing and neutral license badge colors
Disabled features on the license page now show 'Not included' with a
neutral (auto) badge color instead of 'Disabled' in error red, which
looked like an actionable error rather than a plan tier indicator.

Label/value spacing on DashboardPage already used flex justify-between
correctly — no change needed there.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-09 18:46:09 +02:00
hsiegeln
538591989c docs: mark Plan 3 (runtime management port) as completed
All checks were successful
CI / build (push) Successful in 1m25s
CI / docker (push) Successful in 11s
Verified 2026-04-09: all runtime management fully ported to
cameleer3-server with enhancements beyond the original plan.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-09 08:58:15 +02:00
hsiegeln
63e6c6b1b5 docs: update CLAUDE.md with key classes, network topology, and runtime env vars
All checks were successful
CI / build (push) Successful in 1m18s
CI / docker (push) Successful in 19s
SonarQube Analysis / sonarqube (push) Successful in 1m12s
Add key class locations for Java backend and React frontend, document
cameleer-traefik network topology with DNS alias, add server runtime
env vars table, update deployment pipeline to 7-stage flow, add
database migration reference.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-08 23:11:03 +02:00
hsiegeln
4a7351d48e fix: add cameleer-traefik network so deployed apps can reach server
All checks were successful
CI / build (push) Successful in 53s
CI / docker (push) Successful in 10s
Deployed app containers are put on the cameleer-traefik network by the
orchestrator, but the server and Traefik were only on the compose-internal
network. This caused UnresolvedAddressException when apps tried to connect
to cameleer3-server:8081 for agent registration and SSE.

- Add cameleer-traefik network with fixed name (no compose project prefix)
- Attach server to cameleer-traefik with DNS alias "cameleer3-server"
- Attach Traefik to cameleer-traefik for routing to deployed apps
- Add dev overrides for Docker orchestration (socket, volumes, env vars)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-08 22:37:51 +02:00
hsiegeln
1d6c0cf451 docs: update documentation for Docker orchestration and env var rename
All checks were successful
CI / build (push) Successful in 1m0s
CI / docker (push) Successful in 18s
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-08 22:09:19 +02:00
hsiegeln
cc792ae336 refactor: rename CAMELEER_EXPORT_ENDPOINT to CAMELEER_SERVER_URL in runtime-base
Some checks failed
CI / build (push) Successful in 1m7s
CI / docker (push) Has been cancelled
Standardize env var naming. The agent reads CAMELEER_SERVER_URL
to configure -Dcameleer.export.endpoint.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-08 22:06:48 +02:00
hsiegeln
bb8c68a5ca feat: seed claim mapping rules in bootstrap after OIDC config
All checks were successful
CI / build (push) Successful in 53s
CI / docker (push) Successful in 14s
After configuring the server's OIDC settings, the bootstrap now seeds
claim mapping rules so Logto roles (server:admin, server:operator) map
to server RBAC roles (ADMIN, OPERATOR) automatically. Rules are
idempotent — existing mappings are checked by matchValue before creating.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-08 12:21:28 +02:00
hsiegeln
cfc7842e18 ci: retry build (transient npm network error)
All checks were successful
CI / build (push) Successful in 49s
CI / docker (push) Successful in 1m16s
2026-04-08 08:56:44 +02:00
hsiegeln
3fa062b92c docs: add architecture review spec and implementation plans
Some checks failed
CI / build (push) Failing after 25s
CI / docker (push) Has been skipped
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-08 08:53:22 +02:00
hsiegeln
5938643632 feat: strip SaaS UI to vendor management dashboard
- Delete EnvironmentsPage, EnvironmentDetailPage, AppDetailPage
- Delete EnvironmentTree and DeploymentStatusBadge components
- Simplify DashboardPage to show tenant info, license status, server link
- Remove environment/app/deployment routes from router
- Remove environment section from sidebar, keep dashboard/license/platform
- Strip API hooks to tenant/license/me only
- Remove environment/app/deployment/observability types

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-08 00:03:01 +02:00
hsiegeln
de5821dddb feat: remove Docker socket dependency from SaaS layer
- Remove docker-java-core and docker-java-transport-zerodep from pom.xml
- Remove Docker socket mount, group_add, jardata volume from docker-compose.yml
- Remove CAMELEER_DOCKER_NETWORK and CLICKHOUSE_URL env vars from SaaS service
- Remove jardata volume definition

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-08 00:01:15 +02:00
hsiegeln
bad78e26a1 feat: add migration to drop migrated tables from SaaS database
- V010: drop deployments, apps, environments, api_keys tables
- Tables have been migrated to cameleer3-server

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-08 00:00:32 +02:00
hsiegeln
c254fbf723 feat: remove migrated environment/app/deployment/runtime code from SaaS
- Delete environment/, app/, deployment/, runtime/ packages (source + tests)
- Delete apikey/ package (tied to environments, table will be dropped)
- Strip AsyncConfig to empty @EnableAsync (no more deploymentExecutor bean)
- Remove EnvironmentService dependency from TenantService
- Remove environment/app isolation from TenantIsolationInterceptor
- Remove environment seeding from BootstrapDataSeeder
- Refactor ServerApiClient to use LogtoConfig instead of RuntimeConfig
- Add server-endpoint property to LogtoConfig (was in RuntimeConfig)
- Remove runtime config section and multipart config from application.yml

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 23:59:53 +02:00
hsiegeln
160a989f9f feat: remove all ClickHouse dependencies from SaaS layer
- Delete log/ package (ClickHouseConfig, ContainerLogService, LogController)
- Delete observability/ package (AgentStatusService, AgentStatusController)
- Remove clickhouse-jdbc dependency from pom.xml
- Remove cameleer.clickhouse config section from application.yml
- Delete associated test files

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 23:56:21 +02:00
hsiegeln
30aaacb5b5 fix: correct protocol version header, disable SQL logging, document deployment pipeline
All checks were successful
CI / build (push) Successful in 1m9s
CI / docker (push) Successful in 1m43s
SonarQube Analysis / sonarqube (push) Successful in 1m20s
- ServerApiClient: use X-Cameleer-Protocol-Version: 1 (server expects "1", not "2")
- Disable Hibernate show-sql in dev profile (too verbose)
- CLAUDE.md: document deployment pipeline architecture, M2M server role in bootstrap,
  runtime-base image in CI

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 18:58:27 +02:00
hsiegeln
617785baa7 fix: use full registry path for runtime base image default
All checks were successful
CI / build (push) Successful in 50s
CI / docker (push) Successful in 35s
The default cameleer-runtime-base:latest has no registry prefix, so
Docker can't pull it. Use the full gitea.siegeln.net/cameleer/ path.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 18:23:26 +02:00
hsiegeln
f14affcc1e fix: verify Management API readiness before proceeding in bootstrap
All checks were successful
CI / build (push) Successful in 48s
CI / docker (push) Successful in 14s
Logto's OIDC endpoint may respond before the Management API is fully
initialized. Add a retry loop that checks GET /api/roles returns valid
JSON before making any API calls. Fixes intermittent bootstrap failure
on cold starts with 'Cannot index string with string "name"'.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 18:18:08 +02:00
hsiegeln
d6f488199c fix: rename async executor bean to avoid clash with DeploymentExecutor
All checks were successful
CI / build (push) Successful in 49s
CI / docker (push) Successful in 36s
The @Bean named 'deploymentExecutor' (ThreadPoolTaskExecutor) collided
with the @Service class DeploymentExecutor. Rename the bean to
'deploymentTaskExecutor'.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 18:06:10 +02:00
hsiegeln
dade9cefe2 fix: use sed instead of grep -P for BusyBox compatibility in CI
All checks were successful
CI / build (push) Successful in 1m7s
CI / docker (push) Successful in 10s
The Alpine-based docker builder uses BusyBox grep which doesn't
support Perl regex (-P). Switch to sed for extracting the agent
version from Maven metadata XML.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 17:57:43 +02:00
hsiegeln
3f0a27c96e fix: add lenient mock strictness to AgentStatusServiceTest
Some checks failed
CI / build (push) Successful in 1m46s
CI / docker (push) Failing after 45s
The serverApiClient.isAvailable() stubbing in setUp() is unused by
the observability test, causing UnnecessaryStubbingException in CI.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 17:48:08 +02:00
hsiegeln
5d04a154f9 refactor: deployment infrastructure cleanup (4 fixes)
Some checks failed
CI / build (push) Failing after 46s
CI / docker (push) Has been skipped
1. Docker socket security: remove root group from Dockerfile, use
   group_add in docker-compose.yml for runtime-only socket access

2. M2M server communication: create ServerApiClient using Logto
   client_credentials grant with API resource scope. Add M2M server
   role in bootstrap. Replace hacky admin/admin login in
   AgentStatusService.

3. Async deployment: extract DeploymentExecutor as separate @Service
   so Spring's @Async proxy works (self-invocation bypasses proxy).
   Deploy now returns immediately, health check runs in background.

4. Bootstrap: M2M server role (cameleer-m2m-server) with server:admin
   scope, idempotent creation outside the M2M app creation block.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 17:08:37 +02:00
hsiegeln
8407d8b3c0 fix: deployment health check, container cleanup, and status reporting
Three fixes for the deployment pipeline:
1. Health check path: /health -> /cameleer/health (matches agent)
2. Container cleanup: stop AND remove old container before starting
   new one, plus orphan cleanup by container name to prevent conflicts
3. Container status: read health.status instead of state.status so
   waitForHealthy correctly detects the "healthy" state

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 16:20:33 +02:00
hsiegeln
35276f66e9 fix: use compose-prefixed Docker network name for deployments
Docker Compose prefixes network names with the project name, so the
actual network is cameleer-saas_cameleer, not just cameleer. Pass
CAMELEER_DOCKER_NETWORK env var using COMPOSE_PROJECT_NAME.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 15:56:01 +02:00
hsiegeln
ea04eeb6dc fix: switch to zerodep Docker transport for Unix socket support
The httpclient5 transport needs junixsocket for Unix domain sockets.
Switch to docker-java-transport-zerodep which has built-in Unix socket
support with zero external dependencies.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 15:53:39 +02:00
hsiegeln
ca6e8ce35a fix: add cameleer user to root group for Docker socket access
The mounted /var/run/docker.sock is owned by root:root with rw-rw----
permissions. The cameleer user needs to be in the root group to
read/write the socket for building images and managing containers.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 15:48:22 +02:00
hsiegeln
9c6ab77b72 fix: configure Docker client to use Unix socket
Default docker-java config resolved to localhost:2375 (TCP) inside the
container. Explicitly set docker host to unix:///var/run/docker.sock
which is volume-mounted from the host.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 15:40:57 +02:00
hsiegeln
a5c881a4d0 fix: skip bootstrap on subsequent restarts if already complete
Check for spaClientId and m2mClientSecret in the cached bootstrap
file. If both exist, exit immediately instead of re-running all
phases. Delete /data/logto-bootstrap.json to force a re-run.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 15:36:43 +02:00
hsiegeln
00a3f2fd3f feat: runtime base image CI, bootstrap token, and deploy plumbing
Add CI step to build cameleer-runtime-base image by downloading the
agent shaded JAR from Gitea Maven registry and pushing the image.
Wire CAMELEER_AUTH_TOKEN from docker-compose into RuntimeConfig so
deployed containers authenticate with cameleer3-server. Add agent.jar
to gitignore for local builds.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 15:32:42 +02:00
hsiegeln
1a0f1e07be fix: JAR upload — increase multipart limit and fix storage permissions
Spring Boot defaults to 1MB max file size which rejected all JAR
uploads. Set to 200MB to match the configured max-jar-size. Also
create /data/jars with cameleer user ownership in the Dockerfile
so the non-root process can write uploaded JARs.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 15:10:35 +02:00
hsiegeln
8febdba533 feat: per-app resource limits, auto-slug, and polished create dialogs
Add per-app memory limit and CPU shares (stored on AppEntity, used by
DeploymentService with fallback to global defaults). JAR upload is now
optional at creation time. Both create modals show the computed slug in
the dialog title and use consistent Cancel-left/Action-right button
layout with inline styles to avoid Modal CSS conflicts.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 14:53:57 +02:00
hsiegeln
3d41d4a3da feat: 4-role model — owner, operator, viewer + vendor-seed
All checks were successful
CI / build (push) Successful in 57s
CI / docker (push) Successful in 47s
Redesign the role model from 3 roles (platform-admin, admin, member)
to 4 clear personas:

- owner (org role): full tenant control — billing, team, apps, deploy
- operator (org role): app lifecycle + observability, no billing/team
- viewer (org role): read-only observability
- saas-vendor (global role, hosted only): cross-tenant platform admin

Bootstrap changes:
- Rename org roles: admin→owner, member→operator, add viewer
- Remove platform-admin global role (moved to vendor-seed)
- admin user gets owner role, camel user gets viewer role
- Custom JWT maps: owner→server:admin, operator→server:operator,
  viewer→server:viewer, saas-vendor→server:admin

New docker/vendor-seed.sh for hosted SaaS environments only.
Remove sidebar user/logout link (TopBar handles logout).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 13:49:16 +02:00
hsiegeln
c96faa4f3f fix: display username in UI, fix license limits key mismatch
All checks were successful
CI / build (push) Successful in 1m0s
CI / docker (push) Successful in 55s
- Read user profile from Logto ID token in OrgResolver, store in
  Zustand org store, display in sidebar footer and TopBar avatar
- Fix license limits showing "—" by aligning frontend LIMIT_LABELS
  keys with backend snake_case convention (max_agents, retention_days,
  max_environments)
- Bump @cameleer/design-system to v0.1.38 (font-size floor)
- Add dev volume mount for local UI hot-reload without image rebuild

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 12:20:40 +02:00
hsiegeln
bab9714efc docs: document Custom JWT, server OIDC role paths, and bootstrap Phase 7b
All checks were successful
CI / build (push) Successful in 1m40s
CI / docker (push) Successful in 19s
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 10:55:02 +02:00
hsiegeln
67b35a25d6 feat: configure Logto Custom JWT and update server OIDC rolesClaim
All checks were successful
CI / build (push) Successful in 1m17s
CI / docker (push) Successful in 14s
- Change rolesClaim from "scope" to "roles" to match the custom claim
  injected by the Logto Custom JWT script
- Add Phase 7b: configure Logto Custom JWT for access tokens that maps
  org roles (admin→server:admin, member→server:viewer) and global roles
  (platform-admin→server:admin) into a standard "roles" claim
- Add additionalScopes field to OIDC config

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 10:17:04 +02:00
hsiegeln
b7aed1afb1 fix: explicitly set service=logto on default tenant router
All checks were successful
CI / build (push) Successful in 48s
CI / docker (push) Successful in 7s
SonarQube Analysis / sonarqube (push) Successful in 1m16s
Traefik couldn't auto-link the logto router when two services
(logto, logto-console) exist on the same container. This broke
ALL default tenant routing (sign-in, OIDC, API).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 00:56:10 +02:00
hsiegeln
6f57e19c2a fix: add CORS middleware for admin console origin on default tenant
All checks were successful
CI / build (push) Successful in 47s
CI / docker (push) Successful in 9s
The admin console (port 3002) calls the Management API on the
default tenant (port 443). Add Traefik CORS headers to allow
cross-origin requests from the admin console origin.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 00:54:09 +02:00
hsiegeln
c32a606a91 fix: add X-Forwarded-Proto to admin-tenant bootstrap calls only
All checks were successful
CI / build (push) Successful in 48s
CI / docker (push) Successful in 8s
ADMIN_ENDPOINT is now HTTPS so admin-tenant calls need the
forwarded proto header. Default-tenant calls stay unchanged.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 00:48:11 +02:00
hsiegeln
e0e65bb62c feat: HTTPS admin console via Traefik with NODE_TLS_REJECT_UNAUTHORIZED
All checks were successful
CI / build (push) Successful in 47s
CI / docker (push) Successful in 8s
ADMIN_ENDPOINT set to HTTPS so OIDC issuer matches browser URL.
NODE_TLS_REJECT_UNAUTHORIZED=0 lets Logto's internal ky-based
OIDC self-discovery accept the self-signed cert through Traefik.
Remove in production with real certs.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 00:44:33 +02:00
hsiegeln
0e5016cdcc revert: restore to working state (774db7b)
All checks were successful
CI / build (push) Successful in 46s
CI / docker (push) Successful in 8s
Admin console HTTPS via Traefik conflicts with Logto's
ADMIN_ENDPOINT self-discovery. Parking this for now.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 00:26:33 +02:00
hsiegeln
49fda95f15 fix: use localhost for ADMIN_ENDPOINT, rely on TRUST_PROXY_HEADER
All checks were successful
CI / build (push) Successful in 46s
CI / docker (push) Successful in 7s
ADMIN_ENDPOINT=http://localhost:3002 for Logto self-calls.
TRUST_PROXY_HEADER makes Logto use X-Forwarded-Proto from Traefik
to generate HTTPS URLs for browser-facing OIDC flows.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 00:25:18 +02:00
hsiegeln
ca40536fd3 fix: add Docker network alias for Logto self-discovery with TLS
All checks were successful
CI / build (push) Successful in 48s
CI / docker (push) Successful in 9s
Add PUBLIC_HOST as network alias on the logto container so its
internal ADMIN_ENDPOINT calls (http://PUBLIC_HOST:3002) resolve
inside Docker directly, bypassing Traefik. Browser traffic goes
through Traefik on host port 3002 with TLS termination.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 00:22:51 +02:00
hsiegeln
fdca4911ae fix: admin console via Traefik port 3002 without forced TLS
All checks were successful
CI / build (push) Successful in 47s
CI / docker (push) Successful in 9s
Remove tls=true from the logto-console router so the entrypoint
accepts plain HTTP. Logto's internal self-calls via ADMIN_ENDPOINT
use HTTP and pass through Traefik transparently. Browsers can
access via HTTP on port 3002.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 00:21:12 +02:00
hsiegeln
6497b59c55 feat: HTTPS admin console on port 3443 via Traefik
All checks were successful
CI / build (push) Successful in 48s
CI / docker (push) Successful in 7s
Use separate port 3443 for TLS-terminated admin console access.
Port 3002 stays directly mapped from logto in dev for Logto's
internal OIDC self-discovery via ADMIN_ENDPOINT.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 00:18:16 +02:00
hsiegeln
04a2b41326 feat: expose admin console on HTTPS via Traefik port 3002
All checks were successful
CI / build (push) Successful in 48s
CI / docker (push) Successful in 9s
Traefik-only change: new entrypoint + router for TLS termination.
No changes to Logto ADMIN_ENDPOINT or bootstrap script.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 00:09:42 +02:00
hsiegeln
774db7ba53 revert: restore to last working state (b3ac8a6)
All checks were successful
CI / build (push) Successful in 47s
CI / docker (push) Successful in 9s
Revert all Traefik port 3002 and ADMIN_ENDPOINT changes that broke
bootstrap. Admin console HTTPS access needs a different approach.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 00:07:17 +02:00
hsiegeln
a2119b8bfd fix: remove Host header from admin tenant bootstrap calls
All checks were successful
CI / build (push) Successful in 48s
CI / docker (push) Successful in 9s
ADMIN_ENDPOINT is http://localhost:3002, but bootstrap sent
Host: PUBLIC_HOST:3002 which didn't match. Let curl use the
default Host from LOGTO_ADMIN_ENDPOINT (logto:3002) which Logto
resolves to the admin tenant internally.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 00:05:33 +02:00
hsiegeln
1dfa4d9f32 fix: use localhost for Logto ADMIN_ENDPOINT
All checks were successful
CI / build (push) Successful in 47s
CI / docker (push) Successful in 7s
Logto calls ADMIN_ENDPOINT internally for OIDC discovery. Using
PUBLIC_HOST resolved to the host machine where Traefik now owns
port 3002, causing a routing loop. localhost resolves inside the
container directly.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 00:02:51 +02:00
hsiegeln
f276953b03 fix: revert ADMIN_ENDPOINT to HTTP, remove X-Forwarded-Proto
All checks were successful
CI / build (push) Successful in 50s
CI / docker (push) Successful in 25s
Internal Docker traffic is HTTP. Traefik handles TLS termination
for external access. TRUST_PROXY_HEADER lets Logto detect HTTPS
from Traefik's forwarded headers automatically.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 23:59:49 +02:00
hsiegeln
c8ec1da328 fix: only use X-Forwarded-Proto on admin tenant calls (port 3002)
All checks were successful
CI / build (push) Successful in 48s
CI / docker (push) Successful in 9s
Default tenant (port 3001) works without it — adding it caused
Internal server error. Only the admin tenant needs it because
ADMIN_ENDPOINT changed to HTTPS.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 23:58:16 +02:00
hsiegeln
a3af667f76 debug: log api_get response for bootstrap troubleshooting
All checks were successful
CI / build (push) Successful in 48s
CI / docker (push) Successful in 7s
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 23:56:08 +02:00
hsiegeln
251d8eb8e1 fix: add X-Forwarded-Proto to all bootstrap API helpers
All checks were successful
CI / build (push) Successful in 48s
CI / docker (push) Successful in 9s
All Logto endpoints are configured with HTTPS but bootstrap calls
internal HTTP. Every curl call needs the forwarded proto header.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 23:54:51 +02:00
hsiegeln
5f560e9f33 fix: add X-Forwarded-Proto to bootstrap admin endpoint calls
All checks were successful
CI / build (push) Successful in 46s
CI / docker (push) Successful in 9s
Logto's ADMIN_ENDPOINT is now HTTPS but bootstrap calls the internal
HTTP endpoint directly. TRUST_PROXY_HEADER needs X-Forwarded-Proto
to resolve the correct scheme.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 23:53:29 +02:00
hsiegeln
73388e15e2 feat: expose Logto admin console on HTTPS via Traefik port 3002
All checks were successful
CI / build (push) Successful in 47s
CI / docker (push) Successful in 9s
Add admin-console entrypoint to Traefik with TLS termination.
Route port 3002 through Traefik to logto:3002. Update Logto
ADMIN_ENDPOINT to use HTTPS.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 23:49:39 +02:00
hsiegeln
b3ac8a6bcc fix: set admin tenant sign-in mode to SignIn after user creation
All checks were successful
CI / build (push) Successful in 47s
CI / docker (push) Successful in 8s
Admin tenant defaults to Register mode (onboarding flow). Since we
create the admin user via API, we need to switch to SignIn mode so
the custom sign-in UI can authenticate against the admin console.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 23:46:36 +02:00
hsiegeln
c354d2e74f fix: assign 'user' base role for admin console access
All checks were successful
CI / build (push) Successful in 58s
CI / docker (push) Successful in 11s
The admin tenant requires both the 'user' role (base access) and
'default:admin' role (Management API). Missing the 'user' role
causes a 403 at the identification step.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 23:43:07 +02:00
hsiegeln
9dbdda62ce fix: use m-admin token for admin tenant console user creation
All checks were successful
CI / build (push) Successful in 47s
CI / docker (push) Successful in 16s
The m-default token has audience https://default.logto.app/api which
is rejected by port 3002's admin tenant API. Use m-admin client with
audience https://admin.logto.app/api instead.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 23:37:51 +02:00
hsiegeln
65d2c7c764 debug: log admin tenant API response for bootstrap troubleshooting
All checks were successful
CI / build (push) Successful in 48s
CI / docker (push) Successful in 1m3s
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 23:35:26 +02:00
hsiegeln
8adf5daab9 chore: bump @cameleer/design-system to 0.1.37
Some checks failed
CI / build (push) Successful in 49s
CI / docker (push) Has been cancelled
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 23:34:35 +02:00
hsiegeln
bc42fa7172 fix: create admin console user on Logto admin tenant (port 3002)
All checks were successful
CI / build (push) Successful in 58s
CI / docker (push) Successful in 9s
The admin console runs on a separate tenant with its own user store.
Previous approach tried to assign a non-existent 'admin:admin' role
on the default tenant. Now creates the user on the admin tenant via
port 3002, assigns 'default:admin' role for Management API access,
and adds to t-default organization.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 23:28:40 +02:00
hsiegeln
e478427a29 fix: restore registry-based Docker layer caching in CI
All checks were successful
CI / build (push) Successful in 58s
CI / docker (push) Successful in 26s
Replace --no-cache with --cache-from/--cache-to registry caching,
matching the cameleer3-server CI pattern. The ephemeral CI runner
destroys BuildKit local cache after each job, so only registry
caching persists between runs.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 23:10:55 +02:00
hsiegeln
2f7d4bd71c feat: use cameleer3-logo.svg from design-system v0.1.36 everywhere
All checks were successful
CI / build (push) Successful in 1m7s
CI / docker (push) Successful in 3m34s
- Sidebar, sign-in page, and favicons all use the single SVG
- Postinstall copies SVG for SaaS HTML favicon (gitignored)
- Sign-in favicon committed (baked into Logto Docker image)
- Remove old PNG favicon references

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 23:03:18 +02:00
hsiegeln
93a2f7d900 fix: skip postinstall favicon copy when public/ absent (Docker build)
All checks were successful
CI / build (push) Successful in 59s
CI / docker (push) Successful in 2m50s
The Dockerfile copies package.json before ui/ contents, so public/
doesn't exist during npm ci. Skip the copy gracefully.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 22:50:44 +02:00
hsiegeln
c9ecebdd92 chore: bump @cameleer/design-system to 0.1.34
Some checks failed
CI / build (push) Successful in 58s
CI / docker (push) Failing after 17s
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 22:46:19 +02:00
hsiegeln
2e87667734 refactor: import brand icons directly from design-system package
Some checks failed
CI / build (push) Failing after 20s
CI / docker (push) Has been skipped
- Sidebar and sign-in logos use Vite import from @cameleer/design-system
- HTML favicons copied by postinstall script (gitignored)
- Remove manually copied PNGs from repo
- Clean up SecurityConfig permitAll (bundled assets under /_app/**)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 22:39:29 +02:00
hsiegeln
1ca0e960fb fix: update brand icons to transparent-background versions from v0.1.33
All checks were successful
CI / build (push) Successful in 48s
CI / docker (push) Successful in 2m52s
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 22:35:42 +02:00
hsiegeln
3a33324b2a fix: permit cameleer-logo-48.png without auth
Some checks failed
CI / build (push) Successful in 48s
CI / docker (push) Has been cancelled
Browser img tags don't send Bearer tokens, so the sidebar logo
needs to be in the permitAll list.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 22:32:16 +02:00
hsiegeln
3ca13b6b88 perf: add BuildKit cache mounts for Maven and npm in Docker builds
Some checks failed
CI / build (push) Successful in 49s
CI / docker (push) Has been cancelled
Maven .m2 and npm caches persist across --no-cache builds, avoiding
full dependency re-downloads on every CI run.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 22:29:14 +02:00
hsiegeln
ea3723958e fix: bootstrap file permission denied, use PNG favicon
All checks were successful
CI / build (push) Successful in 47s
CI / docker (push) Successful in 2m45s
- Change chmod 600 to 644 on bootstrap JSON (cameleer user needs read)
- Use PNG favicon instead of SVG (currentColor invisible in browser tab)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 22:25:15 +02:00
hsiegeln
d8b9ca6cfe chore: bump @cameleer/design-system to 0.1.33
All checks were successful
CI / build (push) Successful in 1m13s
CI / docker (push) Successful in 3m11s
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 22:17:55 +02:00
hsiegeln
29daf51ee3 feat: replace icons with brand assets from design-system v0.1.32
All checks were successful
CI / build (push) Successful in 1m29s
CI / docker (push) Successful in 3m22s
- Replace favicon SVG with official camel-logo.svg from design-system
- Add PNG favicons (32px, 192px) with proper link tags in index.html
- Replace sidebar logo with 48px brand icon (cameleer-logo-48.png)
- Replace sign-in page logo with 48px brand icon
- Permit favicon PNGs in SecurityConfig

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 22:09:13 +02:00
hsiegeln
3dedfb1eb7 chore: bump @cameleer/design-system to 0.1.32
All checks were successful
CI / build (push) Successful in 1m0s
CI / docker (push) Successful in 2m49s
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 22:00:57 +02:00
hsiegeln
f81cd740b7 fix: security hardening — remove dead routes, add JWT audience validation
All checks were successful
CI / build (push) Successful in 48s
CI / docker (push) Successful in 2m49s
- Remove broken observe/dashboard Traefik routes (server accessed via /server only)
- Remove unused acme volume
- Add JWT audience claim validation (https://api.cameleer.local) in SecurityConfig
- Secure bootstrap output file with chmod 600
- Add dev-only comments on TLS_SKIP_VERIFY and credential logging

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 19:15:03 +02:00
hsiegeln
7d6e78afa3 fix: add /server/login?local to Traditional app post-logout redirect URIs
All checks were successful
CI / build (push) Successful in 54s
CI / docker (push) Successful in 2m48s
The server-ui logout redirects to /server/login?local but this URI was
not whitelisted in Logto, causing the post-logout redirect to fail.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 17:47:12 +02:00
hsiegeln
edbb66b056 docs: update architecture for custom sign-in UI and CI pipeline
All checks were successful
CI / build (push) Successful in 1m15s
CI / docker (push) Successful in 2m52s
- CLAUDE.md: add custom sign-in UI section, update routing table,
  document auto-redirect, CI-built images, no local builds, dev
  override without volume mounts
- Design spec: reflect final implementation — custom Logto image,
  no CUSTOM_UI_PATH, no init containers, bundled favicon

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 16:29:37 +02:00
hsiegeln
194004f8f9 fix: remove local bind mounts from dev override
All checks were successful
CI / build (push) Successful in 52s
CI / docker (push) Successful in 2m46s
The dev override was mounting local ui/dist and target/*.jar over
the image contents, serving stale local builds instead of the
CI-built artifacts. Remove these mounts — the image has everything.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 16:18:39 +02:00
hsiegeln
82163144e7 ci: use --no-cache for Docker builds, remove registry cache
All checks were successful
CI / build (push) Successful in 57s
CI / docker (push) Successful in 2m49s
Local registry makes cache overhead unnecessary. Ensures fresh
builds with no stale layer reuse.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 15:59:29 +02:00
hsiegeln
3fcbc431fb fix: restore multi-stage Dockerfiles, use cameleer-docker-builder
All checks were successful
CI / build (push) Successful in 1m8s
CI / docker (push) Successful in 39s
Follow cameleer3-server CI pattern: docker job uses
cameleer-docker-builder:1 (has Docker CLI), Dockerfiles contain
multi-stage builds (self-contained, no external toolchain needed).

- Dockerfile: restore frontend + maven + runtime stages
- ui/sign-in/Dockerfile: add node build stage + Logto base
- ci.yml: docker job reverts to cameleer-docker-builder:1,
  passes REGISTRY_TOKEN as build-arg, adds build cache

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 15:51:59 +02:00
hsiegeln
ad97a552f6 refactor: no builds in Dockerfiles, CI builds all artifacts
Some checks failed
CI / build (push) Successful in 59s
CI / docker (push) Failing after 11s
Dockerfiles now only COPY pre-built artifacts:
- Dockerfile (SaaS): just COPY target/*.jar, no multi-stage build
- ui/sign-in/Dockerfile (Logto): just FROM logto + COPY dist/
- Removed docker/logto.Dockerfile (had node build stage)

CI pipeline builds everything:
- docker job: builds frontend, JAR, sign-in UI, then packages
  into images using the simple Dockerfiles
- Uses cameleer-build:1 (has node + maven + docker)
- build job: also builds sign-in UI for testing

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 15:39:19 +02:00
hsiegeln
983b861d20 fix: bundle favicon.svg in sign-in UI instead of cross-service fetch
All checks were successful
CI / build (push) Successful in 40s
CI / docker (push) Successful in 22s
The sign-in page is served by Logto, not the SaaS app. Referencing
/platform/favicon.svg required SecurityConfig permitAll and cross-service
routing. Instead, bundle favicon.svg directly in the sign-in UI dist
so Logto serves it at /favicon.svg.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 15:35:23 +02:00
hsiegeln
2375cb9111 ci: build and push custom Logto image in CI pipeline
All checks were successful
CI / build (push) Successful in 39s
CI / docker (push) Successful in 41s
- Add "Build and push Logto image" step to docker job
- Remove build: directive from logto service in docker-compose
- docker-compose now only pulls pre-built images, no local builds

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 15:17:55 +02:00
hsiegeln
972f9b5f38 feat: custom Logto image + auto-redirect to sign-in
All checks were successful
CI / build (push) Successful in 39s
CI / docker (push) Successful in 40s
- Add docker/logto.Dockerfile: builds custom Logto image with sign-in
  UI baked into /etc/logto/packages/experience/dist/
- Remove sign-in-ui init container, signinui volume, CUSTOM_UI_PATH
  (CUSTOM_UI_PATH is Logto Cloud only, not available in OSS)
- Remove sign-in build stage from SaaS Dockerfile (now in logto.Dockerfile)
- Remove docker/saas-entrypoint.sh (no longer needed)
- LoginPage auto-redirects to Logto OIDC on mount instead of showing
  "Sign in with Logto" button — seamless sign-in experience

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 15:12:11 +02:00
hsiegeln
9013740b83 fix: mount custom sign-in UI over Logto experience dist
All checks were successful
CI / build (push) Successful in 39s
CI / docker (push) Successful in 33s
CUSTOM_UI_PATH is a Logto Cloud feature, not available in OSS.
The correct approach for self-hosted Logto is to volume-mount
over /etc/logto/packages/experience/dist/.

- Use init container (sign-in-ui) to copy dist to shared volume
  as root (fixes permission denied with cameleer user)
- Logto mounts signinui volume at experience/dist path
- Logto depends on sign-in-ui init container completion
- Remove saas-entrypoint.sh approach (no longer needed)
- Revert Dockerfile entrypoint to direct java -jar
- Permit /favicon.svg in SecurityConfig for sign-in page logo

Tested: full OIDC flow works end-to-end via Playwright.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 14:24:33 +02:00
hsiegeln
df220bc5f3 feat: custom Logto sign-in UI with Cameleer branding
All checks were successful
CI / build (push) Successful in 40s
CI / docker (push) Successful in 50s
Replace Logto's default sign-in page with a custom React SPA that
matches the cameleer3-server login page using @cameleer/design-system.

- New Vite+React app at ui/sign-in/ with Experience API integration
- 4-step auth flow: init → verify password → identify → submit
- Design-system components: Card, Input, Button, FormField, Alert
- Same witty random subtitles as cameleer3-server LoginPage
- Dockerfile: add sign-in-frontend build stage, copy dist to image
- docker-compose: CUSTOM_UI_PATH on Logto, shared signinui volume
- SaaS entrypoint copies sign-in dist to shared volume on startup
- Add .gitattributes for LF line endings

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 11:43:22 +02:00
hsiegeln
b1c2832245 docs: update architecture with bootstrap phases, scopes, branding
All checks were successful
CI / build (push) Successful in 40s
CI / docker (push) Successful in 11s
- CLAUDE.md: add bootstrap phase listing, document 13 scopes (10
  platform + 3 server), server role mapping via scope claim, admin
  console access, sign-in branding
- Mark server-role-mapping and logto-admin-branding specs as implemented

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 10:46:39 +02:00
hsiegeln
51cdca95c4 feat: server role mapping, Logto admin access, sign-in branding
Some checks failed
CI / build (push) Successful in 38s
CI / docker (push) Has been cancelled
- Add server:admin/operator/viewer scopes to bootstrap and org roles
- Grant SaaS admin Logto console access via admin:admin role
- Configure sign-in experience with Cameleer branding (colors + logos)
- Add rolesClaim and audience to server OIDC config
- Add server scopes to PublicConfigController for token inclusion
- Permit logo SVGs in SecurityConfig (fix 401 on /platform/logo.svg)
- Add cameleer3 logo SVGs (light + dark) to ui/public/

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 10:45:19 +02:00
hsiegeln
edd1d45a1a docs: Logto admin credentials + branding design spec
All checks were successful
CI / build (push) Successful in 47s
CI / docker (push) Successful in 8s
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 10:24:52 +02:00
hsiegeln
574c719148 docs: server role mapping design spec
All checks were successful
CI / build (push) Successful in 1m6s
CI / docker (push) Successful in 10s
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 10:05:12 +02:00
hsiegeln
0082576063 docs: update architecture docs for single-domain /platform routing
All checks were successful
CI / build (push) Successful in 38s
CI / docker (push) Successful in 10s
Reflects current state: path-based routing, SaaS at /platform,
Logto catch-all, TLS init container, server integration env vars,
custom JwtDecoder for ES384, skip consent for SSO.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 09:43:14 +02:00
hsiegeln
5a8d38a946 fix: enable skip consent on Traditional app for first-party SSO
All checks were successful
CI / build (push) Successful in 1m3s
CI / docker (push) Successful in 8s
SonarQube Analysis / sonarqube (push) Successful in 1m24s
Without this, Logto returns consent_required when the server tries
SSO because the scopes were never explicitly granted.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 01:30:25 +02:00
hsiegeln
d74aafc7b3 fix: update Traditional app redirect URIs for Traefik routing
All checks were successful
CI / build (push) Successful in 46s
CI / docker (push) Successful in 8s
Server OIDC callback is at /oidc/callback (without /server/ prefix due
to strip-prefix). Register both variants until server reads
X-Forwarded-Prefix.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 01:10:40 +02:00
hsiegeln
329f5b80df feat: add CORS allowed origins for server behind reverse proxy
All checks were successful
CI / build (push) Successful in 45s
CI / docker (push) Successful in 7s
Browser sends Origin header on fetch calls even same-origin. Server
needs the public host in its CORS allowlist. Derived from PUBLIC_HOST.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 00:40:00 +02:00
hsiegeln
e16094d83f feat: enable OIDC TLS skip-verify for server in Docker dev
All checks were successful
CI / build (push) Successful in 46s
CI / docker (push) Successful in 7s
Self-signed certs cause PKIX errors when the server fetches OIDC
discovery. CAMELEER_OIDC_TLS_SKIP_VERIFY=true disables cert
verification for OIDC calls only (server-team feature, pending build).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 00:24:22 +02:00
hsiegeln
730ead38a0 fix: add strip-prefix back for server-ui asset serving
All checks were successful
CI / build (push) Successful in 46s
CI / docker (push) Successful in 6s
Nginx needs to see /assets/... not /server/assets/... to find the files.
Strip-prefix + BASE_PATH=/server now works correctly with the fixed image.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 23:59:16 +02:00
hsiegeln
5ded08cace fix: remove patched entrypoint, use server team's fixed image
Some checks failed
CI / build (push) Successful in 50s
CI / docker (push) Has been cancelled
Server team fixed the BASE_PATH sed ordering bug. Remove our entrypoint
override and let the image's own entrypoint handle it.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 23:58:20 +02:00
hsiegeln
5981a3db71 fix: patch server-ui entrypoint to fix sed ordering bug
All checks were successful
CI / build (push) Successful in 1m19s
CI / docker (push) Successful in 9s
The server-ui's entrypoint inserts <base href> THEN rewrites all
href="/" — including the just-inserted base tag, causing doubling.
Patched entrypoint rewrites asset paths first, then inserts <base>.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 23:53:03 +02:00
hsiegeln
4c6625efaa fix: restore BASE_PATH=/server, remove strip-prefix
Server-ui needs BASE_PATH for React Router basename. Without strip-prefix,
no X-Forwarded-Prefix doubling. Server-ui handles full /server/ path itself.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 23:40:11 +02:00
hsiegeln
9bd8ddfad5 fix: remove BASE_PATH, let X-Forwarded-Prefix from strip-prefix handle it
All checks were successful
CI / build (push) Successful in 41s
CI / docker (push) Successful in 6s
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 23:38:20 +02:00
hsiegeln
a700d3a8ed fix: add strip-prefix back to server-ui route
All checks were successful
CI / build (push) Successful in 41s
CI / docker (push) Successful in 6s
Server-ui injects BASE_PATH=/server/ into <base href>. Without strip-prefix,
Traefik forwards /server/ path AND server-ui adds /server/ again = double prefix.
Strip /server before forwarding so server-ui sees / and produces correct <base href="/server/">.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 23:33:57 +02:00
hsiegeln
1b2c962261 fix: root → /platform/ redirect via Traefik file config
All checks were successful
CI / build (push) Successful in 40s
CI / docker (push) Successful in 5s
Docker-compose label escaping mangles regex patterns. Use a separate
Traefik dynamic config file instead — clean regex, proper 302 redirect.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 23:30:38 +02:00
hsiegeln
43967dcf2e fix: add /platform prefix to signIn/signOut redirect URIs
All checks were successful
CI / build (push) Successful in 42s
CI / docker (push) Successful in 41s
LoginPage and useAuth used window.location.origin without the /platform
base path, causing redirect_uri mismatch with Logto's registered URIs.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 23:25:53 +02:00
hsiegeln
5a847e075c fix: remove root redirect, /platform/ is the entry point
All checks were successful
CI / build (push) Successful in 42s
CI / docker (push) Successful in 5s
Server-side path rewrite breaks React Router (browser URL stays at /
but basename is /platform). The SPA entry point is /platform/ — users
bookmark that. Root / goes to Logto catch-all which is correct for
direct OIDC flows.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 23:21:53 +02:00
hsiegeln
bbace4698f fix: use replacepathregex (path-only) instead of redirectregex (full URL)
All checks were successful
CI / build (push) Successful in 41s
CI / docker (push) Successful in 5s
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 23:19:28 +02:00
hsiegeln
e5836bb9d5 fix: Go regexp replacement syntax ($1 not ${1})
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 23:18:04 +02:00
hsiegeln
8a59c23266 fix: capture group in redirectregex for root redirect
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 23:17:14 +02:00
hsiegeln
4f4d9777ce fix: use replacepath middleware for root → /platform/ rewrite
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 23:16:26 +02:00
hsiegeln
e3921576e5 fix: add explicit priority and broader regex for root redirect
All checks were successful
CI / build (push) Successful in 42s
CI / docker (push) Successful in 5s
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 23:15:02 +02:00
hsiegeln
d32a03bb7b fix: redirect root / to /platform/ for better UX
All checks were successful
CI / build (push) Successful in 42s
CI / docker (push) Successful in 6s
Users hitting the root URL now get redirected to the SaaS app instead
of seeing Logto's unknown-session page.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 23:13:20 +02:00
hsiegeln
4997f7a6a9 feat: move SaaS app to /platform base path, Logto becomes catch-all
All checks were successful
CI / build (push) Successful in 48s
CI / docker (push) Successful in 41s
Eliminates all Logto path enumeration in Traefik. Routing is now:
- /platform/* → cameleer-saas (SPA + API)
- /server/* → server-ui
- /* (catch-all) → Logto (sign-in, OIDC, assets, everything)

Spring context-path handles backend prefix transparently. No changes
needed in controllers, SecurityConfig, or interceptors.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 23:06:41 +02:00
hsiegeln
4ab72425ae fix: route Logto experience API paths (/api/interaction, /api/experience)
All checks were successful
CI / build (push) Successful in 41s
CI / docker (push) Successful in 6s
Logto's sign-in form calls /api/interaction/* and /api/experience/* to
submit credentials, but these were routed to cameleer-saas by the /api
catch-all. Added explicit Logto API paths with higher Traefik priority.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 22:52:12 +02:00
hsiegeln
191be6ab40 fix: route Logto sign-in experience paths through Traefik
All checks were successful
CI / build (push) Successful in 40s
CI / docker (push) Successful in 6s
Logto uses root-level paths for its sign-in UI (/sign-in, /register,
/consent, /single-sign-on, /social, /unknown-session) that were falling
through to the SPA catch-all and getting 401.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 22:47:39 +02:00
hsiegeln
bc384a6d2d fix: permit /_app/** static assets in SecurityConfig
All checks were successful
CI / build (push) Successful in 40s
CI / docker (push) Successful in 31s
SPA assets moved from /assets/ to /_app/ for single-domain routing,
but SecurityConfig still permitted the old path.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 22:40:41 +02:00
hsiegeln
28a90f5fc7 fix: add BASE_PATH=/server/ to server-ui, remove strip prefix
All checks were successful
CI / build (push) Successful in 41s
CI / docker (push) Successful in 6s
Server-ui now handles base path natively via BASE_PATH env var.
Traefik forwards full path without stripping.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 22:38:37 +02:00
hsiegeln
9568e7f127 feat: single-domain path-based routing (no subdomains required)
All checks were successful
CI / build (push) Successful in 46s
CI / docker (push) Successful in 41s
Move SPA assets from /assets/ to /_app/ (Vite assetsDir config) so
Traefik can route /assets/* to Logto without conflict. All services
on one hostname with path-based routing:

- /oidc/*, /interaction/*, /assets/* → Logto
- /server/* → server-ui (prefix stripped)
- /api/* → cameleer-saas
- /* (catch-all) → cameleer-saas SPA

Customer needs only 1 DNS record. Server gets OIDC_JWK_SET_URI for
Docker-internal JWK fetch (standard Spring split config).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 21:10:03 +02:00
hsiegeln
9a8881c4cc docs: single-domain routing design spec
Path-based routing on one hostname. SPA assets move to /_app/,
Logto gets /assets/ + /oidc/ + /interaction/. Server-ui at /server/.
Includes requirements for server team (split JWK/issuer, BASE_PATH).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 20:46:00 +02:00
hsiegeln
e167d5475e feat: production-ready TLS with self-signed cert init container
All checks were successful
CI / build (push) Successful in 39s
CI / docker (push) Successful in 40s
Standard OIDC architecture: subdomain routing (auth.HOST, server.HOST),
TLS via Traefik, self-signed cert auto-generated on first boot.

- Add traefik-certs init container (generates wildcard self-signed cert)
- Enable TLS on all Traefik routers (websecure entrypoint)
- HTTP→HTTPS redirect in traefik.yml
- Host-based routing for all services (no more path conflicts)
- PUBLIC_PROTOCOL env var (https default, configurable)
- Protocol-aware redirect URIs in bootstrap
- Protocol-aware UI fallbacks

Customer bootstrap: set PUBLIC_HOST + DNS records + docker compose up.
For production TLS, configure Traefik ACME (Let's Encrypt).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 18:14:25 +02:00
hsiegeln
3694d4a7d6 fix: use server.localhost subdomain for server-ui (same /assets conflict)
All checks were successful
CI / build (push) Successful in 37s
CI / docker (push) Successful in 39s
Server UI assets also use absolute /assets/* paths that conflict with
the SPA catch-all. Same fix as Logto: Host-based routing at
server.localhost gives it its own namespace.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 17:48:11 +02:00
hsiegeln
0472528cd6 fix: move Logto to auth.localhost subdomain to avoid /assets path conflict
All checks were successful
CI / build (push) Successful in 39s
CI / docker (push) Successful in 38s
Logto's login page references /assets/* which conflicts with the SPA's
assets at the same path. Using Host-based routing (auth.localhost) gives
Logto its own namespace - all paths on that subdomain go to Logto,
eliminating the conflict. *.localhost resolves to 127.0.0.1 and is a
secure context.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 17:42:17 +02:00
hsiegeln
c58ca34b2c fix: route all public traffic through Traefik at localhost:80
All checks were successful
CI / build (push) Successful in 38s
CI / docker (push) Successful in 39s
Logto ENDPOINT now points at Traefik (http://localhost) instead of
directly at port 3001. All services share the same base URL, eliminating
OIDC issuer mismatches and crypto.subtle secure context issues.

- Remove :3001 from all public-facing Logto URLs
- Add cameleer3-server-ui to Traefik at /server/ with prefix strip
- Dashboard link uses /server/ path instead of port 8082
- Bootstrap Host headers match Logto ENDPOINT (no port)
- Redirect URIs simplified (Traefik handles port 80)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 17:32:36 +02:00
hsiegeln
3a93b68ea5 fix: split JWK fetch (Docker-internal) from issuer validation (localhost)
All checks were successful
CI / build (push) Successful in 39s
CI / docker (push) Successful in 7s
crypto.subtle requires a secure context, so the browser must access
everything via localhost. The custom JwtDecoder already supports this
split: jwk-set-uri uses Docker-internal logto:3001 for network fetch,
while issuer-uri uses localhost:3001 for string-only claim validation.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 17:16:04 +02:00
hsiegeln
e90ca29920 fix: centralize public hostname into single PUBLIC_HOST env var
All checks were successful
CI / build (push) Successful in 39s
CI / docker (push) Successful in 36s
All public-facing URLs (Logto OIDC, redirect URIs, dashboard links) now
derive from PUBLIC_HOST in .env instead of scattered localhost references.
Resolves Docker networking ambiguity where localhost inside containers
doesn't reach the host machine.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 17:07:20 +02:00
hsiegeln
423803b303 fix: use Docker-internal URL for server OIDC issuer in bootstrap
All checks were successful
CI / build (push) Successful in 38s
CI / docker (push) Successful in 5s
Bootstrap was sending LOGTO_PUBLIC_ENDPOINT (http://localhost:3001)
as the OIDC issuer URI to the server. Inside Docker, localhost is
unreachable. Changed to LOGTO_ENDPOINT (http://logto:3001).

Also: .env must set LOGTO_ISSUER_URI=http://logto:3001/oidc (not
localhost) since this env var feeds cameleer3-server's OIDC decoder.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 16:38:02 +02:00
hsiegeln
cfb16d5048 fix: bootstrap OIDC config — add retry and null guard
All checks were successful
CI / build (push) Successful in 37s
CI / docker (push) Successful in 7s
Phase 7 server health check failed intermittently due to timing.
Add 3-attempt retry loop with 2s sleep. Guard against jq returning
literal "null" string for TRAD_SECRET. Add debug logging for both
preconditions.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 16:31:41 +02:00
hsiegeln
45b60a0aee feat: add cameleer3-server-ui container to Docker Compose
All checks were successful
CI / build (push) Successful in 39s
CI / docker (push) Successful in 38s
The cameleer3-server deploys backend and UI as separate containers.
Add the cameleer3-server-ui image (nginx SPA + API reverse proxy)
to the Compose stack, exposed on port 8082 in dev. Update sidebar
"View Dashboard" link to point to the UI container.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 16:23:48 +02:00
hsiegeln
9b77f810c1 fix: use correct health endpoint and HTTP method for server integration
All checks were successful
CI / build (push) Successful in 40s
CI / docker (push) Successful in 31s
ConnectivityHealthCheck: /actuator/health → /api/v1/health (actuator
now requires auth on cameleer3-server after OIDC was added).

Bootstrap: POST → PUT for /api/v1/admin/oidc (server expects PUT,
POST returned 405 causing OIDC config to silently fail).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 15:59:24 +02:00
hsiegeln
1ef8c9dceb refactor: merge tenant isolation into single HandlerInterceptor
All checks were successful
CI / build (push) Successful in 38s
CI / docker (push) Successful in 37s
Replace TenantResolutionFilter + TenantOwnershipValidator (15 manual
calls across 5 controllers) with a single TenantIsolationInterceptor
that uses Spring HandlerMapping path variables for fail-closed tenant
isolation. New endpoints with {tenantId}, {environmentId}, or {appId}
path variables are automatically isolated without manual code.

Simplify OrgResolver from dual-token fetch to single token — Logto
merges all scopes into either token type.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 15:48:04 +02:00
hsiegeln
051f7fdae9 feat: auth hardening — scope enforcement, tenant isolation, and docs
All checks were successful
CI / build (push) Successful in 38s
CI / docker (push) Successful in 39s
Add @PreAuthorize annotations to all API controllers (14 endpoints
across 6 controllers) enforcing OAuth2 scopes: apps:manage, apps:deploy,
billing:manage, observe:read, platform:admin.

Enforce tenant isolation: TenantResolutionFilter now rejects cross-tenant
access on /api/tenants/{id}/* paths. New TenantOwnershipValidator checks
environment/app ownership for paths without tenantId. Platform admins
bypass both layers.

Fix frontend: OrgResolver split into two useEffect hooks so scopes
refresh on org switch. Scopes now served from /api/config (single source
of truth). Bootstrap cleaned — standalone org permissions removed.

Update docs/architecture.md, docs/user-manual.md, and CLAUDE.md to
reflect all auth hardening changes.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 15:32:53 +02:00
hsiegeln
b459a69083 docs: add architecture document
All checks were successful
CI / build (push) Successful in 38s
CI / docker (push) Successful in 7s
Comprehensive technical reference covering system topology, auth model
(Logto OIDC, scopes, token types, Spring Security pipeline), data model
(7 tables from Flyway migrations), deployment flow, agent-server protocol,
API endpoints, security boundaries, frontend architecture, and full
configuration reference. All class names, paths, and properties verified
against the codebase.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 14:19:05 +02:00
hsiegeln
c5596d8ea4 docs: add user manual
Task-oriented guide for SaaS customers and self-hosted operators
covering login, environments, applications, deployments, observability,
licenses, platform admin, roles, self-hosted setup, and troubleshooting.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 14:15:40 +02:00
hsiegeln
e3baaeee84 fix: replace stale permission prop with scope in EnvironmentDetailPage
All checks were successful
CI / build (push) Successful in 39s
CI / docker (push) Successful in 38s
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 14:13:37 +02:00
hsiegeln
298f6e3e71 feat: scope-based authorization — read standard scope claim, remove custom roles extraction
Some checks failed
CI / build (push) Failing after 18s
CI / docker (push) Has been skipped
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-05 14:04:16 +02:00
hsiegeln
9c2a1d27b7 feat: replace hardcoded permission map with direct OAuth2 scope checks
Remove role-to-permission mapping (usePermissions, RequirePermission) and replace
with direct scope reads from the Logto access token JWT. OrgResolver decodes the
scope claim after /api/me resolves and stores scopes in Zustand. RequireScope and
useScopes replace the old hooks/components across all pages.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-05 14:04:06 +02:00
hsiegeln
277d5ea638 feat: define OAuth2 scopes on API resource and assign to Logto roles
Creates 10 API resource scopes (platform:admin + 9 tenant-level) on the
Cameleer SaaS API resource, assigns all to platform-admin role, creates
matching organization scopes, and wires them declaratively to org roles
(admin gets all, member gets deploy/observe). All operations are idempotent.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-05 14:01:43 +02:00
hsiegeln
6ccf7f3fcb fix: ProtectedRoute spinner fix, TokenSync cleanup, dev hot-reload
All checks were successful
CI / build (push) Successful in 37s
CI / docker (push) Successful in 38s
- ProtectedRoute: only gate on initial auth load, not every async op
- TokenSync: OrgResolver is sole source of org data, remove fetchUserInfo
- docker-compose.dev: mount ui/dist for hot-reload

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 13:11:44 +02:00
hsiegeln
cfa989bd5e fix: allow JwtDecoder bean override in test context
- Add @Primary + @ConditionalOnMissingBean so TestSecurityConfig.jwtDecoder()
  wins over SecurityConfig.jwtDecoder() without needing a real OIDC endpoint
- Add spring.main.allow-bean-definition-overriding=true and
  cameleer.clickhouse.enabled=false to src/test/resources/application-test.yml
  so Testcontainers @ServiceConnection can supply the datasource
- Disable ClickHouse in test profile (src/main/resources/application-test.yml)
  so the explicit ClickHouseConfig DataSource bean is not created, allowing
  @ServiceConnection to wire the Testcontainers Postgres datasource
- Fix TenantControllerTest and LicenseControllerTest to explicitly grant
  ROLE_platform-admin authority via .authorities() on the test JWT, since
  spring-security-test does not run the custom JwtAuthenticationConverter
- Fix EnvironmentService.createDefaultForTenant() to use an internal
  bootstrap path that skips license enforcement (chicken-and-egg: no license
  exists at tenant creation time yet)
- Remove now-unnecessary license stub from EnvironmentServiceTest

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-05 13:06:31 +02:00
hsiegeln
4da9cf23cb infra: add OIDC config to bootstrap output, stop reading Logto DB for secrets
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-05 12:44:27 +02:00
hsiegeln
9e6440d97c infra: remove ForwardAuth, keys mount, add OIDC env vars for server
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-05 12:44:04 +02:00
hsiegeln
5326102443 feat: remove bootstrap_token from EnvironmentEntity — API keys managed separately
Remove bootstrapToken field/getter/setter from EnvironmentEntity and drop
the RuntimeConfig dependency from EnvironmentService. DeploymentService and
AgentStatusService now use a TODO-api-key placeholder until the ApiKeyService
wiring is complete. All test references to setBootstrapToken removed.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-05 12:42:47 +02:00
hsiegeln
ec1ec2e65f feat: rewrite frontend auth — roles from org store, Logto org role names
Replace ID-token claim reads with org store lookups in useAuth and
usePermissions; add currentOrgRoles to useOrgStore; update role names
to Logto org role conventions (admin/member); remove username from
Layout (no longer derived from token claims).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-05 12:42:26 +02:00
hsiegeln
5f43394b00 refactor: remove getUserRoles from LogtoManagementClient — roles come from JWT 2026-04-05 12:40:58 +02:00
hsiegeln
bd2a6a601b test: update TestSecurityConfig with org and role claims for Logto tokens 2026-04-05 12:40:49 +02:00
hsiegeln
4b5a1cf2a2 feat: add API key entity, repository, and service with SHA-256 hashing
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-05 12:39:50 +02:00
hsiegeln
b8b0c686e8 feat: replace manual Logto role check with @PreAuthorize in TenantController
Remove LogtoManagementClient dependency from TenantController; gate
listAll and create with @PreAuthorize("hasRole('platform-admin')"),
relying on the JWT roles claim already mapped by JwtAuthenticationConverter.
Update TenantControllerTest to supply the platform-admin role via jwt()
on all POST requests that expect 201/409.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-05 12:39:40 +02:00
hsiegeln
d4408634a6 feat: rewrite MeController — read from JWT claims, Management API only for cold start
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-05 12:38:39 +02:00
hsiegeln
48a5035a2c fix: remove Ed25519 license signing — replace with UUID token placeholder
Drop JwtConfig dependency from LicenseService; generate license tokens as
random UUIDs instead. Add findByToken to LicenseRepository and update
verifyLicenseToken to do a DB lookup. Update LicenseServiceTest to match.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-05 12:37:32 +02:00
hsiegeln
396c00749e feat: rewrite SecurityConfig — single filter chain, Logto OAuth2 Resource Server
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 12:35:45 +02:00
hsiegeln
f89be09e04 chore: greenfield migrations — remove user/role tables, add api_keys, drop bootstrap_token
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-05 12:33:52 +02:00
hsiegeln
3929bbb95e chore: delete dead auth code — users/roles/JWTs/ForwardAuth live in Logto now
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-05 12:32:18 +02:00
hsiegeln
1397267be5 docs: add auth overhaul implementation plan
16 tasks across 3 phases: server OIDC support, SaaS auth rewrite,
infrastructure updates. TDD, complete code, greenfield migrations.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 12:26:47 +02:00
hsiegeln
c61c59a441 docs: update auth spec for greenfield approach
Remove migration/backward-compat hedging. Delete legacy user/role/permission
tables entirely, remove bootstrap_token column in favor of api_keys table.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 12:19:09 +02:00
hsiegeln
fc4c1f94cd docs: add auth overhaul design spec
Comprehensive design for replacing the incoherent three-system auth
with Logto-centric architecture: OAuth2 Resource Server for humans,
API keys for agents, zero trust (no header identity), server-per-tenant.
Covers cameleer-saas (large), cameleer3-server (small), agent (none).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 12:13:19 +02:00
hsiegeln
1b42bd585d fix: re-resolve tenantId when org list is enriched by OrgResolver
All checks were successful
CI / build (push) Successful in 38s
CI / docker (push) Successful in 37s
TokenSync sets currentOrgId from fetchUserInfo (no tenantId).
OrgResolver later calls setOrganizations with enriched data including
tenantId. Now setOrganizations auto-resolves currentTenantId when
the org is already selected.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 10:29:03 +02:00
hsiegeln
51c73d64a4 fix: read M2M credentials from bootstrap JSON when env vars empty
All checks were successful
CI / build (push) Successful in 39s
CI / docker (push) Successful in 31s
The bootstrap dynamically creates the M2M app and writes credentials
to the JSON file. LogtoConfig now falls back to the bootstrap file
when LOGTO_M2M_CLIENT_ID/SECRET env vars are not set.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 10:23:02 +02:00
hsiegeln
34aadd1e25 fix: accept Logto at+jwt token type in Spring Security
All checks were successful
CI / build (push) Successful in 39s
CI / docker (push) Successful in 30s
Logto issues access tokens with typ "at+jwt" (RFC 9068) but Spring
Security's default NimbusJwtDecoder only allows "JWT". Custom decoder
accepts any type. Also removed hard 401 redirect from API client.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 10:17:25 +02:00
hsiegeln
1abf0f827b fix: remove 401 hard redirect, let React Query retry
All checks were successful
CI / build (push) Successful in 40s
CI / docker (push) Successful in 41s
The /api/me call races with TokenSync — fires before the token
provider is set. Removed the hard window.location redirect on 401
from the API client. React Query retries with backoff instead.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 03:02:32 +02:00
hsiegeln
00ee8876c1 fix: move DB seeding from bootstrap script to Java ApplicationRunner
All checks were successful
CI / build (push) Successful in 40s
CI / docker (push) Successful in 33s
SonarQube Analysis / sonarqube (push) Successful in 1m21s
The bootstrap script runs before cameleer-saas (Flyway), so tenant
tables don't exist yet. Moved DB seeding to BootstrapDataSeeder
ApplicationRunner which runs after Flyway migrations complete.
Reads bootstrap JSON and creates tenant/environment/license if missing.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 02:55:43 +02:00
hsiegeln
827e388349 feat: bootstrap 2 users, tenant, org-scoped tokens, platform admin UI
All checks were successful
CI / build (push) Successful in 40s
CI / docker (push) Successful in 39s
Bootstrap script now creates:
- SaaS Owner (admin/admin) with platform-admin role
- Tenant Admin (camel/camel) in Example Tenant org
- Traditional Web App for cameleer3-server OIDC
- DB records: tenant, default environment, license
- Configures cameleer3-server OIDC via its admin API
All credentials configurable via env vars.

Backend:
- Fix LogtoManagementClient resource URL (https://default.logto.app/api)
- Add getUserRoles/getUserOrganizations to LogtoManagementClient
- Add GET /api/me endpoint (user info, platform admin status, tenants)
- Add GET /api/tenants list-all for platform admins
- Remove insecure X-header forwarding from Traefik

Frontend:
- Org-scoped tokens: getAccessToken(resource, orgId) for tenant context
- OrgResolver component populates org store from /api/me
- useOrganization Zustand store (currentOrgId + currentTenantId)
- Platform admin sidebar section + AdminTenantsPage
- View Dashboard link points to cameleer3-server on port 8081

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 02:50:51 +02:00
hsiegeln
b83cfdcd49 fix: add all design-system providers (GlobalFilter + CommandPalette)
All checks were successful
CI / build (push) Successful in 41s
CI / docker (push) Successful in 38s
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 01:30:18 +02:00
hsiegeln
a7dd026225 fix: add GlobalFilterProvider required by design-system DataTable
All checks were successful
CI / build (push) Successful in 39s
CI / docker (push) Successful in 41s
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 01:23:36 +02:00
hsiegeln
0843a33383 refactor: replace hand-rolled OIDC with @logto/react SDK
All checks were successful
CI / build (push) Successful in 38s
CI / docker (push) Successful in 48s
The hand-rolled OIDC flow (manual PKCE, token exchange, URL
construction) was fragile and accumulated multiple bugs. Replaced
with the official @logto/react SDK which handles PKCE, token
exchange, storage, and refresh automatically.

- Add @logto/react SDK dependency
- Add LogtoProvider with runtime config in main.tsx
- Add TokenSync component bridging SDK tokens to API client
- Add useAuth hook replacing Zustand auth store
- Simplify LoginPage to signIn(), CallbackPage to useHandleSignInCallback()
- Delete pkce.ts and auth-store.ts (replaced by SDK)
- Fix react-router-dom → react-router imports in page files
- All 17 React Query hooks unchanged (token provider pattern)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 01:17:47 +02:00
hsiegeln
84667170f1 fix: register API resource in Logto for JWT access tokens
All checks were successful
CI / build (push) Successful in 38s
CI / docker (push) Successful in 39s
Logto returns opaque access tokens when no resource is specified.
Added API resource creation to bootstrap, included resource indicator
in /api/config, and SPA now passes resource parameter in auth request.
Also fixed issuer-uri to match Logto's public endpoint.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 01:01:32 +02:00
hsiegeln
6764f981d2 fix: add PKCE support for Logto auth and fix Traefik routing
All checks were successful
CI / build (push) Successful in 39s
CI / docker (push) Successful in 39s
Logto requires PKCE (Proof Key for Code Exchange) for SPA auth.
Added code_challenge/code_verifier to login and callback flow.

Also fixed Traefik router-service linking — when a container defines
multiple routers, each needs an explicit service binding or Traefik
v3 refuses to auto-link them.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 00:48:21 +02:00
hsiegeln
537c2bbaf2 fix: use LOGTO_PUBLIC_ENDPOINT for Logto ENDPOINT config
All checks were successful
CI / build (push) Successful in 39s
CI / docker (push) Successful in 6s
Logto's ENDPOINT must be the browser-accessible URL (not Docker
internal). When .env sets LOGTO_ENDPOINT=http://logto:3001, it was
overriding the default and causing Logto to redirect browsers to
an unreachable hostname.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 00:44:24 +02:00
hsiegeln
beb3442c07 fix: return browser-accessible Logto URL from /api/config
All checks were successful
CI / build (push) Successful in 39s
CI / docker (push) Successful in 31s
Separate LOGTO_PUBLIC_ENDPOINT (browser-facing, defaults to
http://localhost:3001) from LOGTO_ENDPOINT (Docker-internal).
Also fix bootstrap M2M verification by using correct Host header
for default tenant token endpoint.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 00:33:43 +02:00
hsiegeln
a20d36df38 fix: bootstrap script use curl with Host header for Logto tenant routing
All checks were successful
CI / build (push) Successful in 38s
CI / docker (push) Successful in 6s
Logto routes requests by Host header to determine tenant. Inside Docker,
requests to logto:3001/3002 need Host: localhost:3001/3002 to match the
configured ENDPOINT/ADMIN_ENDPOINT.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 00:28:23 +02:00
hsiegeln
021b056bce feat: zero-config first-run experience with Logto bootstrap
All checks were successful
CI / build (push) Successful in 39s
CI / docker (push) Successful in 37s
- logto-bootstrap.sh: API-driven init script that creates SPA app,
  M2M app, and default user (camel/camel) via Logto Management API.
  Reads m-default secret from DB, then removes seeded apps with
  known secrets (security hardening). Idempotent.
- PublicConfigController: /api/config public endpoint serves Logto
  client ID from bootstrap output file (runtime, not build-time)
- Frontend: LoginPage + CallbackPage fetch config from /api/config
  instead of import.meta.env (fixes Vite build-time baking issue)
- Docker Compose: logto-bootstrap init service with health-gated
  dependency chain, shared volume for bootstrap config
- SecurityConfig: permit /api/config without auth

Flow: docker compose up → bootstrap creates apps/user → SPA fetches
config → login page shows → sign in with Logto → camel/camel

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-05 00:22:22 +02:00
hsiegeln
cda7dfbaa7 fix: permit SPA routes and static assets in Spring Security
All checks were successful
CI / build (push) Successful in 38s
CI / docker (push) Successful in 32s
The SPA (index.html, /login, /callback, /assets/*) must be accessible
without authentication. API routes remain protected.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-04 23:49:43 +02:00
hsiegeln
ad6805e447 fix: use standard dist/ output for Vite, copy to static/ explicitly
All checks were successful
CI / build (push) Successful in 39s
CI / docker (push) Successful in 37s
The relative outDir '../src/main/resources/static' resolved
unpredictably in Docker. Use standard 'dist/' output, then:
- Dockerfile: COPY --from=frontend /ui/dist/ to static/
- CI: cp -r dist/ to src/main/resources/static/

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-04 23:44:50 +02:00
hsiegeln
e5e14fbe32 fix: add CAMELEER_JWT_SECRET for cameleer3-server
All checks were successful
CI / build (push) Successful in 39s
CI / docker (push) Successful in 7s
The server needs this to derive its Ed25519 signing key. Without it,
startup fails with 'Empty key'.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-04 23:42:37 +02:00
hsiegeln
e10f80c298 fix: allow ClickHouse connections from Docker network
All checks were successful
CI / build (push) Successful in 39s
CI / docker (push) Successful in 6s
The default ClickHouse image restricts the 'default' user to localhost
only. Override with clickhouse-users.xml to allow connections from any
IP (needed for inter-container communication on the Docker network).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-04 23:41:14 +02:00
hsiegeln
16acd145a3 fix: pg_isready healthcheck must specify database name
All checks were successful
CI / build (push) Successful in 38s
CI / docker (push) Successful in 5s
Without -d, pg_isready connects to database matching the username
('cameleer'), which doesn't exist. Specify $POSTGRES_DB explicitly.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-04 23:38:02 +02:00
hsiegeln
d0fd2c49be fix: Docker Compose database initialization
Some checks failed
CI / build (push) Successful in 38s
CI / docker (push) Has been cancelled
- init-databases.sh: create cameleer3 DB for cameleer3-server, connect
  to $POSTGRES_DB explicitly (avoids 'database cameleer does not exist')
- clickhouse-init.sql: auto-create cameleer database on first start
- docker-compose.yml: fix cameleer3-server datasource to cameleer3 DB,
  add ClickHouse init script volume mount, pass credentials

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-04 23:37:19 +02:00
hsiegeln
567d92ca34 fix: let Flyway inherit datasource connection instead of separate URL
All checks were successful
CI / build (push) Successful in 38s
CI / docker (push) Successful in 31s
Explicit spring.flyway.url/user/password used SPRING_DATASOURCE_URL env
var but Flyway resolves its own defaults independently, falling back to
localhost when the env var mapping doesn't match. Removing the explicit
Flyway connection config lets it inherit from the datasource, which is
correctly configured.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-04 23:20:04 +02:00
hsiegeln
fb4e1f57e5 docs: add Phase 9 Frontend React Shell implementation plan
All checks were successful
CI / build (push) Successful in 40s
CI / docker (push) Successful in 5s
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-04 23:10:25 +02:00
hsiegeln
032db410c7 fix: refactor ClickHouse config to match cameleer3-server pattern
All checks were successful
CI / build (push) Successful in 38s
CI / docker (push) Successful in 32s
- Add ClickHouseProperties with @ConfigurationProperties
- @ConditionalOnProperty to toggle ClickHouse
- @Primary DataSource + JdbcTemplate for PostgreSQL (prevents Spring
  Boot from routing JPA/Flyway to ClickHouse)
- HikariDataSource for ClickHouse with explicit credentials
- Remove separate DataSourceConfig.java (merged into ClickHouseConfig)
- Remove database-platform override (no longer needed with @Primary)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-04 23:09:01 +02:00
hsiegeln
be4c882ef8 fix: configure Flyway to use explicit PostgreSQL datasource
All checks were successful
CI / build (push) Successful in 39s
CI / docker (push) Successful in 31s
With two DataSource beans (PostgreSQL + ClickHouse), Flyway was picking
up the ClickHouse DataSource and failing with auth errors. Explicitly
configure Flyway's url/user/password to target PostgreSQL.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-04 22:58:54 +02:00
hsiegeln
64a5edac78 fix: add explicit datasource defaults to application.yml
All checks were successful
CI / build (push) Successful in 39s
CI / docker (push) Successful in 31s
Without an explicit spring.datasource.url, Spring Boot falls back to
jdbc:postgresql://localhost:5432 when the SPRING_DATASOURCE_URL env var
is missing or not picked up. Default now points to the docker-compose
service name (postgres:5432/cameleer_saas).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-04 22:54:18 +02:00
hsiegeln
806895fbd0 fix: separate dev and local profiles for Docker vs bare-metal
All checks were successful
CI / build (push) Successful in 37s
CI / docker (push) Successful in 29s
dev profile: activated inside Docker containers (show-sql only)
local profile: for running Maven outside Docker (localhost overrides)

Usage: mvn spring-boot:run -Dspring-boot.run.profiles=dev,local

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-04 22:45:07 +02:00
hsiegeln
c0e189a5c8 fix: add ClickHouse and cameleer3-server localhost overrides to dev profile
All checks were successful
CI / build (push) Successful in 39s
CI / docker (push) Successful in 3m1s
Without these, `mvn spring-boot:run -Dspring-boot.run.profiles=dev` fails
because ClickHouse URL defaults to docker hostname 'clickhouse'.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-04 22:34:02 +02:00
hsiegeln
aaa4af40c5 fix: use BUILDPLATFORM for native cross-compilation, remove broken cache mounts
Some checks failed
CI / build (push) Successful in 42s
CI / docker (push) Has been cancelled
Build and frontend stages now use --platform=$BUILDPLATFORM so Maven and
Node run natively on the ARM64 CI runner instead of under QEMU emulation.
Only the final JRE runtime stage targets amd64. Removed --mount=type=cache
which doesn't persist across CI runs with buildx --push; the registry layer
cache (--cache-from/--cache-to in CI) handles caching the dependency layer.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-04 22:31:22 +02:00
hsiegeln
c4a4c9d2fc fix: cross-compile Docker image for amd64 and add npm registry auth
Some checks failed
CI / build (push) Successful in 40s
CI / docker (push) Has been cancelled
- CI docker job: QEMU + buildx + --platform linux/amd64 (runners are arm64)
- Dockerfile: REGISTRY_TOKEN build arg for @cameleer/design-system npm auth
- CI build job: npm auth token for frontend build step
- Registry cache for faster builds

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-04 22:21:44 +02:00
050ff61e7a Merge pull request 'feat: Phase 9 — Frontend React Shell' (#35) from feat/phase-9-frontend-react-shell into main
All checks were successful
CI / build (push) Successful in 41s
CI / docker (push) Successful in 4s
Reviewed-on: #35
2026-04-04 22:12:53 +02:00
hsiegeln
e325c4d2c0 fix: correct Dockerfile frontend build output path
All checks were successful
CI / build (push) Successful in 1m10s
CI / build (pull_request) Successful in 1m9s
CI / docker (pull_request) Has been skipped
CI / docker (push) Successful in 23s
Vite's outDir is '../src/main/resources/static' (relative to ui/),
which resolves to /src/main/resources/static/ in the Docker build.
The COPY was looking at /ui/dist/ which doesn't exist.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-04 22:10:42 +02:00
hsiegeln
4c8c8efbe5 feat: add SPA controller, Traefik route, CI frontend build, and HOWTO update
Some checks failed
CI / build (push) Successful in 49s
CI / docker (push) Failing after 38s
CI / build (pull_request) Successful in 1m2s
CI / docker (pull_request) Has been skipped
- SpaController catch-all forwards non-API routes to index.html
- Traefik SPA route at priority=1 catches all unmatched paths
- CI pipeline builds frontend before Maven
- Dockerfile adds multi-stage frontend build
- HOWTO.md documents frontend development workflow

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-04 22:06:36 +02:00
hsiegeln
f6d3627abc feat: add license page with tier features and limits
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-04 22:04:55 +02:00
hsiegeln
fe786790e1 feat: add app detail page with deploy, logs, and status
Full AppDetailPage with tabbed layout (Overview / Deployments / Logs),
current deployment status with auto-poll, action bar (deploy/stop/restart/re-upload/delete),
agent status card, routing card with edit modal, deployment history table,
and container log viewer with stream filter.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-04 22:03:29 +02:00
hsiegeln
5eac48ad72 feat: add environments list and environment detail pages
Implements EnvironmentsPage with DataTable, create modal, and row navigation,
and EnvironmentDetailPage with app list, inline rename, new app form with JAR
upload, and delete confirmation — all gated by RBAC permissions.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-04 22:00:14 +02:00
hsiegeln
02019e9347 feat: add dashboard page with tenant overview and KPI stats
Replaces placeholder DashboardPage with a full implementation: tenant
name + tier badge, KPI strip (environments, total apps, running, stopped),
environment list with per-env app counts, and a recent deployments
placeholder. Uses EnvApps helper component to fetch per-environment app
data without violating hook rules.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-04 21:57:53 +02:00
hsiegeln
91a4235223 feat: add sidebar layout, environment tree, and router
Wires up AppShell + Sidebar compound component, a per-environment
SidebarTree that lazy-fetches apps, React Router nested routes, and
provider-wrapped main.tsx with ThemeProvider/ToastProvider/BreadcrumbProvider.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-04 21:55:21 +02:00
hsiegeln
e725669aef feat: add RBAC hooks and permission-gated components
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-04 21:51:38 +02:00
hsiegeln
d572926010 feat: add API client with auth middleware and React Query hooks
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-04 21:50:19 +02:00
hsiegeln
e33818cc74 feat: add auth store, login, callback, and protected route
Adds Zustand auth store with JWT parsing (sub, roles, organization_id),
Logto OIDC login page, authorization code callback handler, and
ProtectedRoute guard. Also adds vite-env.d.ts for import.meta.env types.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-04 21:48:56 +02:00
hsiegeln
146dbccc6e feat: scaffold React SPA with Vite, design system, and TypeScript types
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-04 21:47:01 +02:00
hsiegeln
600985c913 docs: add Phase 9 Frontend React Shell spec
All checks were successful
CI / build (push) Successful in 28s
CI / docker (push) Successful in 4s
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-04 21:36:45 +02:00
7aa331d73c Merge pull request 'feat: Phase 4 — Observability Pipeline + Inbound Routing' (#34) from feat/phase-4-observability-pipeline into main
All checks were successful
CI / build (push) Successful in 28s
CI / docker (push) Successful in 3s
Reviewed-on: #34
2026-04-04 21:20:46 +02:00
hsiegeln
9b1643c1ee docs: update HOWTO with observability dashboard, routing, and agent status
All checks were successful
CI / build (push) Successful in 29s
CI / build (pull_request) Successful in 48s
CI / docker (push) Successful in 47s
CI / docker (pull_request) Has been skipped
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-04 21:06:05 +02:00
hsiegeln
9f8d0f43ab feat: add dashboard Traefik route and CAMELEER_TENANT_ID config
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-04 21:04:57 +02:00
hsiegeln
43cd2d012f feat: add cameleer3-server startup connectivity check 2026-04-04 21:03:41 +02:00
hsiegeln
210da55e7a feat: add Traefik routing labels for customer apps with exposed ports
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-04 21:03:04 +02:00
hsiegeln
08b87edd6e feat: add agent status and observability status endpoints
Implements AgentStatusService (TDD) that proxies cameleer3-server agent
registry API and queries ClickHouse for trace counts. Gracefully degrades
to UNKNOWN state when server is unreachable or DataSource is absent.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-04 21:01:43 +02:00
hsiegeln
024780c01e feat: add exposed port routing and route URL to app API
Adds domain config to RuntimeConfig/application.yml, expands AppResponse
with exposedPort and computed routeUrl, adds updateRouting to AppService,
and adds PATCH /{appId}/routing endpoint to AppController.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-04 20:57:37 +02:00
hsiegeln
d25849d665 feat: add labels support to StartContainerRequest and DockerRuntimeOrchestrator
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-04 20:55:16 +02:00
hsiegeln
b0275bcf64 feat: add exposed_port column to apps table 2026-04-04 20:53:56 +02:00
hsiegeln
f8d80eaf79 docs: add Phase 4 Observability Pipeline implementation plan
All checks were successful
CI / build (push) Successful in 28s
CI / docker (push) Successful in 4s
8 tasks: migration, labels support, routing API, agent/observability
status endpoints, Traefik routing labels, connectivity check,
Docker Compose + env, HOWTO update.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-04 20:52:17 +02:00
hsiegeln
41629f3290 docs: add Phase 4 Observability Pipeline + Inbound Routing spec
All checks were successful
CI / build (push) Successful in 27s
CI / docker (push) Successful in 4s
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-04 20:47:51 +02:00
hsiegeln
b78dfa9a7b docs: add HOWTO.md with install, start, and bootstrap instructions
All checks were successful
CI / build (push) Successful in 27s
CI / docker (push) Successful in 4s
Quick start, full installation guide, Logto setup, first tenant
creation, app deployment walkthrough, API reference, tier limits,
development commands, and troubleshooting.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-04 19:31:56 +02:00
hsiegeln
d81ce2b697 ci: revert artifact approach, use BuildKit cache for Maven deps
All checks were successful
CI / build (push) Successful in 29s
CI / docker (push) Successful in 2m31s
Gitea Actions doesn't support upload/download-artifact v4.
Reverted to two-job approach (git clone + docker build).
Added BuildKit cache mount (--mount=type=cache,target=/root/.m2)
to Dockerfile so Maven deps persist across Docker builds on the
same runner. First build downloads, subsequent builds are cached.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-04 19:27:08 +02:00
hsiegeln
cbf7d5c60f ci: pass pre-built JAR to docker job via artifact
Some checks failed
CI / build (push) Failing after 51s
CI / docker (push) Has been skipped
Build job uploads the JAR, docker job downloads it and builds a
runtime-only image. Eliminates duplicate Maven dependency download
(~2min saving). The repo Dockerfile is kept for local builds.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-04 18:15:12 +02:00
956eb13dd6 Merge pull request 'feat: Phase 3 — Runtime Orchestration + Environments' (#33) from feat/phase-3-runtime-orchestration into main
All checks were successful
CI / build (push) Successful in 29s
CI / docker (push) Successful in 45s
Reviewed-on: #33
2026-04-04 18:10:42 +02:00
hsiegeln
af04f7b4a1 ci: add nightly SonarQube analysis workflow
All checks were successful
CI / build (push) Successful in 45s
CI / build (pull_request) Successful in 46s
CI / docker (pull_request) Has been skipped
CI / docker (push) Successful in 2m29s
Runs at 02:00 UTC daily (same schedule as cameleer3 and cameleer3-server).
Uses cameleer-build:1 image, excludes TestContainers integration tests.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-04 18:08:35 +02:00
hsiegeln
abc06f57da feat: update Docker Compose, CI, and add runtime-base Dockerfile
Some checks failed
CI / build (push) Successful in 57s
CI / build (pull_request) Successful in 54s
CI / docker (pull_request) Has been skipped
CI / docker (push) Has been cancelled
Add jardata volume, CAMELEER_AUTH_TOKEN/CAMELEER3_SERVER_ENDPOINT/CLICKHOUSE_URL env vars to cameleer-saas, CAMELEER_AUTH_TOKEN to cameleer3-server, runtime-base Dockerfile for agent-instrumented customer apps, and expand CI surefire excludes for new integration test classes.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-04 18:04:42 +02:00
hsiegeln
0bd54f2a95 feat: add container log service with ClickHouse storage and log API
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-04 18:02:42 +02:00
hsiegeln
fc34626a88 feat: add deployment controller with deploy/stop/restart endpoints
Add DeploymentResponse DTO, DeploymentController at /api/apps/{appId} with POST /deploy (202), GET /deployments, GET /deployments/{id}, POST /stop, POST /restart (202), and integration tests covering empty list, 404, and 401 cases.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-04 18:00:23 +02:00
hsiegeln
59df59f406 feat: add deployment service with async pipeline
Implements DeploymentService with TDD: builds Docker images, starts containers with Cameleer env vars, polls for health, and handles stop/restart lifecycle. All 3 unit tests pass.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-04 17:57:09 +02:00
hsiegeln
23a474fbf3 feat: add deployment entity, repository, and status enums 2026-04-04 17:54:08 +02:00
hsiegeln
d2ea256cd8 feat: add app controller with multipart JAR upload
Adds AppController at /api/environments/{environmentId}/apps with POST (multipart
metadata+JAR), GET list, GET by ID, PUT jar reupload, and DELETE endpoints.
Also adds CreateAppRequest and AppResponse DTOs, integration tests (AppControllerTest),
and fixes ClickHouseConfig to be excluded in test profile via @Profile("!test").

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-04 17:53:10 +02:00
hsiegeln
51f5822364 feat: add app service with JAR upload and tier enforcement
Implements AppService with JAR file storage, SHA-256 checksum computation,
tier-based app limit enforcement via LicenseDefaults, and audit logging.
Four TDD tests all pass covering creation, JAR validation, duplicate slug
rejection, and JAR re-upload.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-04 17:47:05 +02:00
hsiegeln
2151801d40 feat: add DockerRuntimeOrchestrator with docker-java
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-04 17:44:34 +02:00
hsiegeln
90c1e36cb7 feat: add RuntimeOrchestrator interface and request/response types 2026-04-04 17:42:56 +02:00
hsiegeln
731690191b feat: add app entity and repository 2026-04-04 17:42:08 +02:00
hsiegeln
36069bae07 feat: auto-create default environment on tenant provisioning 2026-04-04 17:41:23 +02:00
hsiegeln
785bdab3d1 feat: add environment controller with CRUD endpoints
Implements POST/GET/PATCH/DELETE endpoints at /api/tenants/{tenantId}/environments
with DTOs, mapping helpers, and a Spring Boot integration test (TestContainers).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-04 17:40:23 +02:00
hsiegeln
34e98ab176 feat: add environment service with tier enforcement and audit logging
Implements EnvironmentService with full CRUD, duplicate slug rejection,
tier-based environment count limits, and audit logging for create/update/delete.
Adds ENVIRONMENT_CREATE, ENVIRONMENT_UPDATE, ENVIRONMENT_DELETE to AuditAction.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-04 17:36:09 +02:00
hsiegeln
8511d10343 feat: add environment entity, repository, and status enum 2026-04-04 17:33:43 +02:00
hsiegeln
4cb15c9bea feat: add database migrations for environments, apps, deployments 2026-04-04 17:32:51 +02:00
hsiegeln
bd8dfcf147 fix: use concrete ClickHouseDataSource return type to avoid bean ambiguity 2026-04-04 17:32:09 +02:00
hsiegeln
803b8c9876 feat: add Phase 3 dependencies and configuration
Add docker-java and ClickHouse JDBC dependencies, RuntimeConfig and
ClickHouseConfig Spring components, AsyncConfig with deployment thread
pool, and runtime/clickhouse config sections in application.yml.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-04 17:29:06 +02:00
hsiegeln
c0fce36d4a chore: add .worktrees to .gitignore 2026-04-04 17:26:22 +02:00
hsiegeln
fa7853b02d docs: add Phase 3 Runtime Orchestration implementation plan
16-task plan covering environments, apps, deployments, Docker
runtime orchestrator, ClickHouse log ingestion, and CI updates.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-04 17:24:20 +02:00
hsiegeln
0326dc6cce docs: add Phase 3 Runtime Orchestration spec
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-04 17:13:08 +02:00
5d14f78b9d Merge pull request 'Phase 2: Tenants + Identity + Licensing' (#32) from feature/phase-2-tenants-identity-licensing into main
All checks were successful
CI / build (push) Successful in 25s
CI / docker (push) Successful in 31s
Reviewed-on: #32
2026-04-04 15:58:07 +02:00
292 changed files with 50245 additions and 1328 deletions

View File

@@ -1,25 +1,58 @@
# Cameleer SaaS Environment Variables # Cameleer SaaS Environment Configuration
# Copy to .env and fill in values # Copy to .env and fill in values for production
# Application version # Image version
VERSION=latest VERSION=latest
# Public access
PUBLIC_HOST=localhost
PUBLIC_PROTOCOL=https
# Auth domain (Logto). Defaults to PUBLIC_HOST for single-domain setups.
# Set to a separate subdomain (e.g. auth.cameleer.io) to split auth from the app.
# AUTH_HOST=localhost
# Ports
HTTP_PORT=80
HTTPS_PORT=443
LOGTO_CONSOLE_PORT=3002
# PostgreSQL # PostgreSQL
POSTGRES_USER=cameleer POSTGRES_USER=cameleer
POSTGRES_PASSWORD=change_me_in_production POSTGRES_PASSWORD=change_me_in_production
POSTGRES_DB=cameleer_saas POSTGRES_DB=cameleer_saas
# Logto Identity Provider # ClickHouse
LOGTO_ENDPOINT=http://logto:3001 CLICKHOUSE_PASSWORD=change_me_in_production
LOGTO_ISSUER_URI=http://logto:3001/oidc
LOGTO_JWK_SET_URI=http://logto:3001/oidc/jwks
LOGTO_DB_PASSWORD=change_me_in_production
LOGTO_M2M_CLIENT_ID=
LOGTO_M2M_CLIENT_SECRET=
# Ed25519 Keys (mount PEM files) # Admin user (created by bootstrap)
CAMELEER_JWT_PRIVATE_KEY_PATH=/etc/cameleer/keys/ed25519.key SAAS_ADMIN_USER=admin
CAMELEER_JWT_PUBLIC_KEY_PATH=/etc/cameleer/keys/ed25519.pub SAAS_ADMIN_PASS=change_me_in_production
# Domain (for Traefik TLS) # SMTP (for email verification during registration)
DOMAIN=localhost # Required for self-service sign-up. Without SMTP, only admin-created users can sign in.
SMTP_HOST=
SMTP_PORT=587
SMTP_USER=
SMTP_PASS=
SMTP_FROM_EMAIL=noreply@cameleer.io
# TLS (leave empty for self-signed)
# NODE_TLS_REJECT=0 # Set to 1 when using real certificates
# CERT_FILE=
# KEY_FILE=
# CA_FILE=
# Vendor account (optional)
VENDOR_SEED_ENABLED=false
# VENDOR_USER=vendor
# VENDOR_PASS=change_me
# Docker socket GID (run: stat -c '%g' /var/run/docker.sock)
# DOCKER_GID=0
# Docker images (override for custom registries)
# TRAEFIK_IMAGE=gitea.siegeln.net/cameleer/cameleer-traefik
# POSTGRES_IMAGE=gitea.siegeln.net/cameleer/cameleer-postgres
# CLICKHOUSE_IMAGE=gitea.siegeln.net/cameleer/cameleer-clickhouse
# LOGTO_IMAGE=gitea.siegeln.net/cameleer/cameleer-logto
# CAMELEER_IMAGE=gitea.siegeln.net/cameleer/cameleer-saas

View File

@@ -27,10 +27,29 @@ jobs:
key: ${{ runner.os }}-maven-${{ hashFiles('**/pom.xml') }} key: ${{ runner.os }}-maven-${{ hashFiles('**/pom.xml') }}
restore-keys: ${{ runner.os }}-maven- restore-keys: ${{ runner.os }}-maven-
- name: Build SaaS frontend
run: |
cd ui
echo "//gitea.siegeln.net/api/packages/cameleer/npm/:_authToken=${REGISTRY_TOKEN}" >> .npmrc
npm ci
npm run build
cp -r dist/ ../src/main/resources/static/
env:
REGISTRY_TOKEN: ${{ secrets.REGISTRY_TOKEN }}
- name: Build and Test (unit tests only) - name: Build and Test (unit tests only)
run: >- run: >-
mvn clean verify -B mvn clean verify -B
-Dsurefire.excludes="**/AuthControllerTest.java,**/TenantControllerTest.java,**/LicenseControllerTest.java,**/AuditRepositoryTest.java,**/CameleerSaasApplicationTest.java" -Dsurefire.excludes="**/AuthControllerTest.java,**/TenantControllerTest.java,**/LicenseControllerTest.java,**/AuditRepositoryTest.java,**/CameleerSaasApplicationTest.java,**/EnvironmentControllerTest.java,**/AppControllerTest.java,**/DeploymentControllerTest.java,**/AgentStatusControllerTest.java,**/VendorTenantControllerTest.java,**/TenantPortalControllerTest.java"
- name: Build sign-in UI
run: |
cd ui/sign-in
echo "//gitea.siegeln.net/api/packages/cameleer/npm/:_authToken=${REGISTRY_TOKEN}" >> .npmrc
npm ci
npm run build
env:
REGISTRY_TOKEN: ${{ secrets.REGISTRY_TOKEN }}
docker: docker:
needs: build needs: build
@@ -70,15 +89,94 @@ jobs:
echo "IMAGE_TAGS=branch-$SLUG" >> "$GITHUB_ENV" echo "IMAGE_TAGS=branch-$SLUG" >> "$GITHUB_ENV"
fi fi
- name: Build and push - name: Set up QEMU for cross-platform builds
run: docker run --rm --privileged gitea.siegeln.net/cameleer/binfmt:1 --install all
- name: Build and push SaaS image
run: | run: |
docker buildx create --use --name cibuilder
TAGS="-t gitea.siegeln.net/cameleer/cameleer-saas:${{ github.sha }}" TAGS="-t gitea.siegeln.net/cameleer/cameleer-saas:${{ github.sha }}"
for TAG in $IMAGE_TAGS; do for TAG in $IMAGE_TAGS; do
TAGS="$TAGS -t gitea.siegeln.net/cameleer/cameleer-saas:$TAG" TAGS="$TAGS -t gitea.siegeln.net/cameleer/cameleer-saas:$TAG"
done done
docker build $TAGS --provenance=false . docker buildx build --platform linux/amd64 \
for TAG in $IMAGE_TAGS ${{ github.sha }}; do --build-arg REGISTRY_TOKEN="$REGISTRY_TOKEN" \
docker push gitea.siegeln.net/cameleer/cameleer-saas:$TAG $TAGS \
done --cache-from type=registry,ref=gitea.siegeln.net/cameleer/cameleer-saas:buildcache \
--cache-to type=registry,ref=gitea.siegeln.net/cameleer/cameleer-saas:buildcache,mode=max \
--provenance=false \
--push .
env: env:
REGISTRY_TOKEN: ${{ secrets.REGISTRY_TOKEN }} REGISTRY_TOKEN: ${{ secrets.REGISTRY_TOKEN }}
- name: Build and push runtime base image
run: |
AGENT_VERSION=$(curl -sf "https://gitea.siegeln.net/api/packages/cameleer/maven/com/cameleer/cameleer-agent/1.0-SNAPSHOT/maven-metadata.xml" \
| sed -n 's/.*<value>\([^<]*\)<\/value>.*/\1/p' | tail -1)
echo "Agent version: $AGENT_VERSION"
curl -sf -o docker/runtime-base/agent.jar \
"https://gitea.siegeln.net/api/packages/cameleer/maven/com/cameleer/cameleer-agent/1.0-SNAPSHOT/cameleer-agent-${AGENT_VERSION}-shaded.jar"
APPENDER_VERSION=$(curl -sf "https://gitea.siegeln.net/api/packages/cameleer/maven/com/cameleer/cameleer-log-appender/1.0-SNAPSHOT/maven-metadata.xml" \
| sed -n 's/.*<value>\([^<]*\)<\/value>.*/\1/p' | tail -1)
echo "Log appender version: $APPENDER_VERSION"
curl -sf -o docker/runtime-base/cameleer-log-appender.jar \
"https://gitea.siegeln.net/api/packages/cameleer/maven/com/cameleer/cameleer-log-appender/1.0-SNAPSHOT/cameleer-log-appender-${APPENDER_VERSION}.jar"
ls -la docker/runtime-base/agent.jar docker/runtime-base/cameleer-log-appender.jar
TAGS="-t gitea.siegeln.net/cameleer/cameleer-runtime-base:${{ github.sha }}"
for TAG in $IMAGE_TAGS; do
TAGS="$TAGS -t gitea.siegeln.net/cameleer/cameleer-runtime-base:$TAG"
done
docker buildx build --platform linux/amd64 \
$TAGS \
--provenance=false \
--push docker/runtime-base/
- name: Build and push Logto image
run: |
TAGS="-t gitea.siegeln.net/cameleer/cameleer-logto:${{ github.sha }}"
for TAG in $IMAGE_TAGS; do
TAGS="$TAGS -t gitea.siegeln.net/cameleer/cameleer-logto:$TAG"
done
docker buildx build --platform linux/amd64 \
--build-arg REGISTRY_TOKEN="$REGISTRY_TOKEN" \
-f ui/sign-in/Dockerfile \
$TAGS \
--cache-from type=registry,ref=gitea.siegeln.net/cameleer/cameleer-logto:buildcache \
--cache-to type=registry,ref=gitea.siegeln.net/cameleer/cameleer-logto:buildcache,mode=max \
--provenance=false \
--push .
env:
REGISTRY_TOKEN: ${{ secrets.REGISTRY_TOKEN }}
- name: Build and push PostgreSQL image
run: |
TAGS="-t gitea.siegeln.net/cameleer/cameleer-postgres:${{ github.sha }}"
for TAG in $IMAGE_TAGS; do
TAGS="$TAGS -t gitea.siegeln.net/cameleer/cameleer-postgres:$TAG"
done
docker buildx build --platform linux/amd64 \
$TAGS \
--provenance=false \
--push docker/cameleer-postgres/
- name: Build and push ClickHouse image
run: |
TAGS="-t gitea.siegeln.net/cameleer/cameleer-clickhouse:${{ github.sha }}"
for TAG in $IMAGE_TAGS; do
TAGS="$TAGS -t gitea.siegeln.net/cameleer/cameleer-clickhouse:$TAG"
done
docker buildx build --platform linux/amd64 \
$TAGS \
--provenance=false \
--push docker/cameleer-clickhouse/
- name: Build and push Traefik image
run: |
TAGS="-t gitea.siegeln.net/cameleer/cameleer-traefik:${{ github.sha }}"
for TAG in $IMAGE_TAGS; do
TAGS="$TAGS -t gitea.siegeln.net/cameleer/cameleer-traefik:$TAG"
done
docker buildx build --platform linux/amd64 \
$TAGS \
--provenance=false \
--push docker/cameleer-traefik/

View File

@@ -0,0 +1,35 @@
name: SonarQube Analysis
on:
schedule:
- cron: '0 2 * * *' # Nightly at 02:00 UTC
workflow_dispatch: # Allow manual trigger
jobs:
sonarqube:
runs-on: ubuntu-latest
container:
image: gitea.siegeln.net/cameleer/cameleer-build:1
credentials:
username: cameleer
password: ${{ secrets.REGISTRY_TOKEN }}
steps:
- uses: actions/checkout@v4
with:
fetch-depth: 0 # Full history for blame data
- name: Cache Maven dependencies
uses: actions/cache@v4
with:
path: ~/.m2/repository
key: ${{ runner.os }}-maven-${{ hashFiles('**/pom.xml') }}
restore-keys: ${{ runner.os }}-maven-
- name: Build, Test and Analyze
run: >-
mvn clean verify sonar:sonar --batch-mode
-Dsurefire.excludes="**/AuthControllerTest.java,**/TenantControllerTest.java,**/LicenseControllerTest.java,**/AuditRepositoryTest.java,**/CameleerSaasApplicationTest.java,**/EnvironmentControllerTest.java,**/AppControllerTest.java,**/DeploymentControllerTest.java,**/AgentStatusControllerTest.java"
-Dsonar.host.url=${{ secrets.SONAR_HOST_URL }}
-Dsonar.token=${{ secrets.SONAR_TOKEN }}
-Dsonar.projectKey=cameleer-saas
-Dsonar.projectName="Cameleer SaaS"

16
.gitignore vendored
View File

@@ -18,3 +18,19 @@ Thumbs.db
# Environment # Environment
.env .env
*.env.local *.env.local
# Worktrees
.worktrees/
# Claude
.claude/
.superpowers/
.playwright-mcp/
.gitnexus
# Installer output (generated by install.sh / install.ps1)
installer/cameleer/
# Generated by postinstall from @cameleer/design-system
ui/public/favicon.svg
docker/runtime-base/agent.jar

3
.gitmodules vendored Normal file
View File

@@ -0,0 +1,3 @@
[submodule "installer"]
path = installer
url = https://gitea.siegeln.net/cameleer/cameleer-saas-installer.git

101
AGENTS.md Normal file
View File

@@ -0,0 +1,101 @@
<!-- gitnexus:start -->
# GitNexus — Code Intelligence
This project is indexed by GitNexus as **cameleer-saas** (2838 symbols, 6037 relationships, 239 execution flows). Use the GitNexus MCP tools to understand code, assess impact, and navigate safely.
> If any GitNexus tool warns the index is stale, run `npx gitnexus analyze` in terminal first.
## Always Do
- **MUST run impact analysis before editing any symbol.** Before modifying a function, class, or method, run `gitnexus_impact({target: "symbolName", direction: "upstream"})` and report the blast radius (direct callers, affected processes, risk level) to the user.
- **MUST run `gitnexus_detect_changes()` before committing** to verify your changes only affect expected symbols and execution flows.
- **MUST warn the user** if impact analysis returns HIGH or CRITICAL risk before proceeding with edits.
- When exploring unfamiliar code, use `gitnexus_query({query: "concept"})` to find execution flows instead of grepping. It returns process-grouped results ranked by relevance.
- When you need full context on a specific symbol — callers, callees, which execution flows it participates in — use `gitnexus_context({name: "symbolName"})`.
## When Debugging
1. `gitnexus_query({query: "<error or symptom>"})` — find execution flows related to the issue
2. `gitnexus_context({name: "<suspect function>"})` — see all callers, callees, and process participation
3. `READ gitnexus://repo/cameleer-saas/process/{processName}` — trace the full execution flow step by step
4. For regressions: `gitnexus_detect_changes({scope: "compare", base_ref: "main"})` — see what your branch changed
## When Refactoring
- **Renaming**: MUST use `gitnexus_rename({symbol_name: "old", new_name: "new", dry_run: true})` first. Review the preview — graph edits are safe, text_search edits need manual review. Then run with `dry_run: false`.
- **Extracting/Splitting**: MUST run `gitnexus_context({name: "target"})` to see all incoming/outgoing refs, then `gitnexus_impact({target: "target", direction: "upstream"})` to find all external callers before moving code.
- After any refactor: run `gitnexus_detect_changes({scope: "all"})` to verify only expected files changed.
## Never Do
- NEVER edit a function, class, or method without first running `gitnexus_impact` on it.
- NEVER ignore HIGH or CRITICAL risk warnings from impact analysis.
- NEVER rename symbols with find-and-replace — use `gitnexus_rename` which understands the call graph.
- NEVER commit changes without running `gitnexus_detect_changes()` to check affected scope.
## Tools Quick Reference
| Tool | When to use | Command |
|------|-------------|---------|
| `query` | Find code by concept | `gitnexus_query({query: "auth validation"})` |
| `context` | 360-degree view of one symbol | `gitnexus_context({name: "validateUser"})` |
| `impact` | Blast radius before editing | `gitnexus_impact({target: "X", direction: "upstream"})` |
| `detect_changes` | Pre-commit scope check | `gitnexus_detect_changes({scope: "staged"})` |
| `rename` | Safe multi-file rename | `gitnexus_rename({symbol_name: "old", new_name: "new", dry_run: true})` |
| `cypher` | Custom graph queries | `gitnexus_cypher({query: "MATCH ..."})` |
## Impact Risk Levels
| Depth | Meaning | Action |
|-------|---------|--------|
| d=1 | WILL BREAK — direct callers/importers | MUST update these |
| d=2 | LIKELY AFFECTED — indirect deps | Should test |
| d=3 | MAY NEED TESTING — transitive | Test if critical path |
## Resources
| Resource | Use for |
|----------|---------|
| `gitnexus://repo/cameleer-saas/context` | Codebase overview, check index freshness |
| `gitnexus://repo/cameleer-saas/clusters` | All functional areas |
| `gitnexus://repo/cameleer-saas/processes` | All execution flows |
| `gitnexus://repo/cameleer-saas/process/{name}` | Step-by-step execution trace |
## Self-Check Before Finishing
Before completing any code modification task, verify:
1. `gitnexus_impact` was run for all modified symbols
2. No HIGH/CRITICAL risk warnings were ignored
3. `gitnexus_detect_changes()` confirms changes match expected scope
4. All d=1 (WILL BREAK) dependents were updated
## Keeping the Index Fresh
After committing code changes, the GitNexus index becomes stale. Re-run analyze to update it:
```bash
npx gitnexus analyze
```
If the index previously included embeddings, preserve them by adding `--embeddings`:
```bash
npx gitnexus analyze --embeddings
```
To check whether embeddings exist, inspect `.gitnexus/meta.json` — the `stats.embeddings` field shows the count (0 means no embeddings). **Running analyze without `--embeddings` will delete any previously generated embeddings.**
> Claude Code users: A PostToolUse hook handles this automatically after `git commit` and `git merge`.
## CLI
| Task | Read this skill file |
|------|---------------------|
| Understand architecture / "How does X work?" | `.claude/skills/gitnexus/gitnexus-exploring/SKILL.md` |
| Blast radius / "What breaks if I change X?" | `.claude/skills/gitnexus/gitnexus-impact-analysis/SKILL.md` |
| Trace bugs / "Why is X failing?" | `.claude/skills/gitnexus/gitnexus-debugging/SKILL.md` |
| Rename / extract / split / refactor | `.claude/skills/gitnexus/gitnexus-refactoring/SKILL.md` |
| Tools, resources, schema reference | `.claude/skills/gitnexus/gitnexus-guide/SKILL.md` |
| Index, status, clean, wiki CLI commands | `.claude/skills/gitnexus/gitnexus-cli/SKILL.md` |
<!-- gitnexus:end -->

160
CLAUDE.md
View File

@@ -4,36 +4,174 @@ This file provides guidance to Claude Code (claude.ai/code) when working with co
## Project ## Project
Cameleer SaaS — multi-tenant SaaS platform wrapping the Cameleer observability stack (Java agent + server) for Apache Camel applications. Customers get managed observability for their Camel integrations without running infrastructure. Cameleer SaaS — **vendor management plane** for the Cameleer observability stack. Three personas: **vendor** (platform:admin) manages the platform and provisions tenants; **tenant admin** (tenant:manage) manages their observability instance; **new user** (authenticated, no scopes) goes through self-service onboarding. Tenants can be created by the vendor OR via self-service sign-up (email registration + onboarding wizard). Each tenant gets per-tenant cameleer-server + UI instances via Docker API.
## Ecosystem ## Ecosystem
This repo is the SaaS layer on top of two proven components: This repo is the SaaS layer on top of two proven components:
- **cameleer3** (sibling repo) — Java agent using ByteBuddy for zero-code instrumentation of Camel apps. Captures route executions, processor traces, payloads, metrics, and route graph topology. Deploys as `-javaagent` JAR. - **cameleer** (sibling repo) — Java agent using ByteBuddy for zero-code instrumentation of Camel apps. Captures route executions, processor traces, payloads, metrics, and route graph topology. Deploys as `-javaagent` JAR.
- **cameleer3-server** (sibling repo) — Spring Boot observability backend. Receives agent data via HTTP, pushes config/commands via SSE. PostgreSQL + OpenSearch storage. React SPA dashboard. JWT auth with Ed25519 config signing. - **cameleer-server** (sibling repo) — Spring Boot observability backend. Receives agent data via HTTP, pushes config/commands via SSE. PostgreSQL + ClickHouse storage. React SPA dashboard. JWT auth with Ed25519 config signing. Docker container orchestration for app deployments.
- **cameleer-website** — Marketing site (Astro 5) - **cameleer-website** — Marketing site (Astro 5)
- **design-system** — Shared React component library (`@cameleer/design-system` on Gitea npm registry) - **design-system** — Shared React component library (`@cameleer/design-system` on Gitea npm registry)
Agent-server protocol is defined in `cameleer3/cameleer3-common/PROTOCOL.md`. The agent and server are mature, proven components — this repo wraps them with multi-tenancy, billing, and self-service onboarding. Agent-server protocol is defined in `cameleer/cameleer-common/PROTOCOL.md`. The agent and server are mature, proven components — this repo wraps them with multi-tenancy, billing, and self-service onboarding.
## Key Packages
### Java Backend (`src/main/java/net/siegeln/cameleer/saas/`)
| Package | Purpose | Key classes |
|---------|---------|-------------|
| `config/` | Security, tenant isolation, web config | `SecurityConfig`, `TenantIsolationInterceptor`, `TenantContext`, `PublicConfigController`, `MeController` |
| `tenant/` | Tenant data model | `TenantEntity` (JPA: id, name, slug, tier, status, logto_org_id, db_password) |
| `vendor/` | Vendor console (platform:admin) | `VendorTenantService`, `VendorTenantController`, `InfrastructureService` |
| `onboarding/` | Self-service sign-up onboarding | `OnboardingController`, `OnboardingService` |
| `portal/` | Tenant admin portal (org-scoped) | `TenantPortalService`, `TenantPortalController` |
| `provisioning/` | Pluggable tenant provisioning | `DockerTenantProvisioner`, `TenantDatabaseService`, `TenantDataCleanupService` |
| `certificate/` | TLS certificate lifecycle | `CertificateService`, `CertificateController`, `TenantCaCertService` |
| `license/` | License management | `LicenseService`, `LicenseController` |
| `identity/` | Logto & server integration | `LogtoManagementClient`, `ServerApiClient` |
| `audit/` | Audit logging | `AuditService` |
### Frontend
- **`ui/src/`** — React 19 SPA at `/platform/*` (vendor + tenant admin pages)
- **`ui/sign-in/`** — Custom Logto sign-in UI (built into `cameleer-logto` Docker image)
## Architecture Context ## Architecture Context
The existing cameleer3-server already has single-tenant auth (JWT, RBAC, bootstrap tokens, OIDC). The SaaS layer must: The SaaS platform is a **vendor management plane**. It does not proxy requests to servers — instead it provisions dedicated per-tenant cameleer-server instances via Docker API. Each tenant gets isolated server + UI containers with their own database schemas, networks, and Traefik routing.
- Add multi-tenancy (tenant isolation of agent data, diagrams, configs)
- Provide self-service signup, billing, and team management For detailed architecture docs, see the directory-scoped CLAUDE.md files (loaded automatically when editing code in that directory):
- Generate per-tenant bootstrap tokens for agent registration - **Provisioning flow, env vars, lifecycle** → `src/.../provisioning/CLAUDE.md`
- Proxy or federate access to tenant-specific cameleer3-server instances - **Auth, scopes, JWT, OIDC** → `src/.../config/CLAUDE.md`
- Enforce usage quotas and metered billing - **Docker, routing, networks, bootstrap, deployment pipeline** → `docker/CLAUDE.md`
- **Installer, deployment modes, compose templates** → `installer/CLAUDE.md` (git submodule: `cameleer-saas-installer`)
- **Frontend, sign-in UI** → `ui/CLAUDE.md`
## Database Migrations
PostgreSQL (Flyway): `src/main/resources/db/migration/`
- V001 — consolidated baseline: tenants (with db_password, server_endpoint, provision_error, ca_applied_at), licenses, audit_log, certificates, tenant_ca_certs
## Related Conventions ## Related Conventions
- Gitea-hosted: `gitea.siegeln.net/cameleer/` - Gitea-hosted: `gitea.siegeln.net/cameleer/`
- CI: `.gitea/workflows/` — Gitea Actions - CI: `.gitea/workflows/` — Gitea Actions
- K8s target: k3s cluster at 192.168.50.86 - K8s target: k3s cluster at 192.168.50.86
- Docker builds: multi-stage, buildx with registry cache, `--provenance=false` for Gitea compatibility - Docker images: CI builds and pushes all images — Dockerfiles use multi-stage builds, no local builds needed
- `cameleer-saas` — SaaS vendor management plane (frontend + JAR baked in)
- `cameleer-logto` — custom Logto with sign-in UI baked in
- `cameleer-server` / `cameleer-server-ui` — provisioned per-tenant (not in compose, created by `DockerTenantProvisioner`)
- `cameleer-runtime-base` — base image for deployed apps (agent JAR + `cameleer-log-appender.jar` + JRE). CI downloads latest agent and log appender SNAPSHOTs from Gitea Maven registry. The Dockerfile ENTRYPOINT is overridden by `DockerRuntimeOrchestrator` at container creation; agent config uses `CAMELEER_AGENT_*` env vars set by `DeploymentExecutor`.
- Docker builds: `--no-cache`, `--provenance=false` for Gitea compatibility
- `docker-compose.yml` (root) — thin dev overlay (ports, volume mounts, `SPRING_PROFILES_ACTIVE: dev`). Chained on top of production templates from the installer submodule via `COMPOSE_FILE` in `.env`.
- Installer is a **git submodule** at `installer/` pointing to `cameleer/cameleer-saas-installer` (public repo). Compose templates live there — single source of truth, no duplication. Run `git submodule update --remote installer` to pull template updates.
- Design system: import from `@cameleer/design-system` (Gitea npm registry) - Design system: import from `@cameleer/design-system` (Gitea npm registry)
## Disabled Skills ## Disabled Skills
- Do NOT use any `gsd:*` skills in this project. This includes all `/gsd:` prefixed commands. - Do NOT use any `gsd:*` skills in this project. This includes all `/gsd:` prefixed commands.
<!-- gitnexus:start -->
# GitNexus — Code Intelligence
This project is indexed by GitNexus as **cameleer-saas** (2881 symbols, 6138 relationships, 243 execution flows). Use the GitNexus MCP tools to understand code, assess impact, and navigate safely.
> If any GitNexus tool warns the index is stale, run `npx gitnexus analyze` in terminal first.
## Always Do
- **MUST run impact analysis before editing any symbol.** Before modifying a function, class, or method, run `gitnexus_impact({target: "symbolName", direction: "upstream"})` and report the blast radius (direct callers, affected processes, risk level) to the user.
- **MUST run `gitnexus_detect_changes()` before committing** to verify your changes only affect expected symbols and execution flows.
- **MUST warn the user** if impact analysis returns HIGH or CRITICAL risk before proceeding with edits.
- When exploring unfamiliar code, use `gitnexus_query({query: "concept"})` to find execution flows instead of grepping. It returns process-grouped results ranked by relevance.
- When you need full context on a specific symbol — callers, callees, which execution flows it participates in — use `gitnexus_context({name: "symbolName"})`.
## When Debugging
1. `gitnexus_query({query: "<error or symptom>"})` — find execution flows related to the issue
2. `gitnexus_context({name: "<suspect function>"})` — see all callers, callees, and process participation
3. `READ gitnexus://repo/cameleer-saas/process/{processName}` — trace the full execution flow step by step
4. For regressions: `gitnexus_detect_changes({scope: "compare", base_ref: "main"})` — see what your branch changed
## When Refactoring
- **Renaming**: MUST use `gitnexus_rename({symbol_name: "old", new_name: "new", dry_run: true})` first. Review the preview — graph edits are safe, text_search edits need manual review. Then run with `dry_run: false`.
- **Extracting/Splitting**: MUST run `gitnexus_context({name: "target"})` to see all incoming/outgoing refs, then `gitnexus_impact({target: "target", direction: "upstream"})` to find all external callers before moving code.
- After any refactor: run `gitnexus_detect_changes({scope: "all"})` to verify only expected files changed.
## Never Do
- NEVER edit a function, class, or method without first running `gitnexus_impact` on it.
- NEVER ignore HIGH or CRITICAL risk warnings from impact analysis.
- NEVER rename symbols with find-and-replace — use `gitnexus_rename` which understands the call graph.
- NEVER commit changes without running `gitnexus_detect_changes()` to check affected scope.
## Tools Quick Reference
| Tool | When to use | Command |
|------|-------------|---------|
| `query` | Find code by concept | `gitnexus_query({query: "auth validation"})` |
| `context` | 360-degree view of one symbol | `gitnexus_context({name: "validateUser"})` |
| `impact` | Blast radius before editing | `gitnexus_impact({target: "X", direction: "upstream"})` |
| `detect_changes` | Pre-commit scope check | `gitnexus_detect_changes({scope: "staged"})` |
| `rename` | Safe multi-file rename | `gitnexus_rename({symbol_name: "old", new_name: "new", dry_run: true})` |
| `cypher` | Custom graph queries | `gitnexus_cypher({query: "MATCH ..."})` |
## Impact Risk Levels
| Depth | Meaning | Action |
|-------|---------|--------|
| d=1 | WILL BREAK — direct callers/importers | MUST update these |
| d=2 | LIKELY AFFECTED — indirect deps | Should test |
| d=3 | MAY NEED TESTING — transitive | Test if critical path |
## Resources
| Resource | Use for |
|----------|---------|
| `gitnexus://repo/cameleer-saas/context` | Codebase overview, check index freshness |
| `gitnexus://repo/cameleer-saas/clusters` | All functional areas |
| `gitnexus://repo/cameleer-saas/processes` | All execution flows |
| `gitnexus://repo/cameleer-saas/process/{name}` | Step-by-step execution trace |
## Self-Check Before Finishing
Before completing any code modification task, verify:
1. `gitnexus_impact` was run for all modified symbols
2. No HIGH/CRITICAL risk warnings were ignored
3. `gitnexus_detect_changes()` confirms changes match expected scope
4. All d=1 (WILL BREAK) dependents were updated
## Keeping the Index Fresh
After committing code changes, the GitNexus index becomes stale. Re-run analyze to update it:
```bash
npx gitnexus analyze
```
If the index previously included embeddings, preserve them by adding `--embeddings`:
```bash
npx gitnexus analyze --embeddings
```
To check whether embeddings exist, inspect `.gitnexus/meta.json` — the `stats.embeddings` field shows the count (0 means no embeddings). **Running analyze without `--embeddings` will delete any previously generated embeddings.**
> Claude Code users: A PostToolUse hook handles this automatically after `git commit` and `git merge`.
## CLI
| Task | Read this skill file |
|------|---------------------|
| Understand architecture / "How does X work?" | `.claude/skills/gitnexus/gitnexus-exploring/SKILL.md` |
| Blast radius / "What breaks if I change X?" | `.claude/skills/gitnexus/gitnexus-impact-analysis/SKILL.md` |
| Trace bugs / "Why is X failing?" | `.claude/skills/gitnexus/gitnexus-debugging/SKILL.md` |
| Rename / extract / split / refactor | `.claude/skills/gitnexus/gitnexus-refactoring/SKILL.md` |
| Tools, resources, schema reference | `.claude/skills/gitnexus/gitnexus-guide/SKILL.md` |
| Index, status, clean, wiki CLI commands | `.claude/skills/gitnexus/gitnexus-cli/SKILL.md` |
<!-- gitnexus:end -->

View File

@@ -1,15 +1,30 @@
# Dockerfile # syntax=docker/dockerfile:1
FROM eclipse-temurin:21-jdk-alpine AS build
# Frontend: runs natively on build host
FROM --platform=$BUILDPLATFORM node:22-alpine AS frontend
ARG REGISTRY_TOKEN
WORKDIR /ui
COPY ui/package.json ui/package-lock.json ui/.npmrc ./
RUN --mount=type=cache,target=/root/.npm echo "//gitea.siegeln.net/api/packages/cameleer/npm/:_authToken=${REGISTRY_TOKEN}" >> .npmrc && npm ci
COPY ui/ .
RUN npm run build
# Maven build: runs natively on build host (no QEMU emulation)
FROM --platform=$BUILDPLATFORM eclipse-temurin:21-jdk-alpine AS build
WORKDIR /build WORKDIR /build
COPY .mvn/ .mvn/ COPY .mvn/ .mvn/
COPY mvnw pom.xml ./ COPY mvnw pom.xml ./
RUN ./mvnw dependency:go-offline -B # Cache deps — BuildKit cache mount persists across --no-cache builds
RUN --mount=type=cache,target=/root/.m2/repository ./mvnw dependency:go-offline -B || true
COPY src/ src/ COPY src/ src/
RUN ./mvnw package -DskipTests -B COPY --from=frontend /ui/dist/ src/main/resources/static/
RUN --mount=type=cache,target=/root/.m2/repository ./mvnw package -DskipTests -B
# Runtime: target platform (amd64)
FROM eclipse-temurin:21-jre-alpine FROM eclipse-temurin:21-jre-alpine
WORKDIR /app WORKDIR /app
RUN addgroup -S cameleer && adduser -S cameleer -G cameleer RUN addgroup -S cameleer && adduser -S cameleer -G cameleer \
&& mkdir -p /data/jars && chown -R cameleer:cameleer /data
COPY --from=build /build/target/*.jar app.jar COPY --from=build /build/target/*.jar app.jar
USER cameleer USER cameleer
EXPOSE 8080 EXPOSE 8080

447
HOWTO.md Normal file
View File

@@ -0,0 +1,447 @@
# Cameleer SaaS -- How to Install, Start & Bootstrap
## Quick Start (Development)
```bash
# 1. Clone
git clone https://gitea.siegeln.net/cameleer/cameleer-saas.git
cd cameleer-saas
# 2. Create environment file
cp .env.example .env
# 3. Generate Ed25519 key pair
mkdir -p keys
ssh-keygen -t ed25519 -f keys/ed25519 -N ""
mv keys/ed25519 keys/ed25519.key
# 4. Start the stack
docker compose -f docker-compose.yml -f docker-compose.dev.yml up -d
# 5. Wait for services to be ready (~30s)
docker compose logs -f cameleer-saas --since 10s
# Look for: "Started CameleerSaasApplication"
# 6. Verify
curl http://localhost:8080/actuator/health
# {"status":"UP"}
```
## Prerequisites
- Docker Desktop (Windows/Mac) or Docker Engine 24+ (Linux)
- Git
- `curl` or any HTTP client (for testing)
## Architecture
The platform runs as a Docker Compose stack:
| Service | Image | Port | Purpose |
|---------|-------|------|---------|
| **traefik-certs** | alpine:latest | — | Init container: generates self-signed cert or copies user-supplied cert |
| **traefik** | traefik:v3 | 80, 443, 3002 | Reverse proxy, TLS termination, routing |
| **postgres** | postgres:16-alpine | 5432* | Platform database + Logto database |
| **logto** | ghcr.io/logto-io/logto | 3001*, 3002* | Identity provider (OIDC) |
| **cameleer-saas** | cameleer-saas:latest | 8080* | SaaS API server + vendor UI |
| **clickhouse** | clickhouse-server:latest | 8123* | Trace/metrics/log storage |
*Ports exposed to host only with `docker-compose.dev.yml` overlay.
Per-tenant `cameleer-server` and `cameleer-server-ui` containers are provisioned dynamically by `DockerTenantProvisioner` — they are NOT part of the compose stack.
## Installation
### 1. Environment Configuration
```bash
cp .env.example .env
```
Edit `.env` and set at minimum:
```bash
# Change in production
POSTGRES_PASSWORD=<strong-password>
# Logto M2M credentials (auto-provisioned by bootstrap, or get from Logto admin console)
CAMELEER_SAAS_IDENTITY_M2MCLIENTID=
CAMELEER_SAAS_IDENTITY_M2MCLIENTSECRET=
```
### 2. Ed25519 Keys
The platform uses Ed25519 keys for license signing and machine token verification.
```bash
mkdir -p keys
ssh-keygen -t ed25519 -f keys/ed25519 -N ""
mv keys/ed25519 keys/ed25519.key
```
This creates `keys/ed25519.key` (private) and `keys/ed25519.pub` (public). The keys directory is mounted read-only into the cameleer-saas container.
If no key files are configured, the platform generates ephemeral keys on startup (suitable for development only -- keys change on every restart).
### 3. TLS Certificate (Optional)
By default, the `traefik-certs` init container generates a self-signed certificate for `PUBLIC_HOST`. To supply your own certificate at bootstrap time, set these env vars in `.env`:
```bash
CERT_FILE=/path/to/cert.pem # PEM-encoded certificate
KEY_FILE=/path/to/key.pem # PEM-encoded private key
CA_FILE=/path/to/ca.pem # Optional: CA bundle (for private CA trust)
```
The init container validates that the key matches the certificate before accepting. If validation fails, the container exits with an error.
**Runtime certificate replacement** is available via the vendor UI at `/vendor/certificates`:
- Upload a new cert+key+CA bundle (staged, not yet active)
- Validate and activate (atomic swap, Traefik hot-reloads)
- Roll back to the previous certificate if needed
- Track which tenants need a restart to pick up CA bundle changes
### 4. Start the Stack
**Development** (ports exposed for direct access):
```bash
docker compose -f docker-compose.yml -f docker-compose.dev.yml up -d
```
**Production** (traffic routed through Traefik only):
```bash
docker compose up -d
```
### 5. Verify Services
```bash
# Health check
curl http://localhost:8080/actuator/health
# Check all containers are running
docker compose ps
```
## Bootstrapping
### First-Time Logto Setup
On first boot, Logto seeds its database automatically. Access the admin console to configure it:
1. Open http://localhost:3002 (Logto admin console)
2. Complete the initial setup wizard
3. Create a **Machine-to-Machine** application:
- Go to Applications > Create Application > Machine-to-Machine
- Note the **App ID** and **App Secret**
- Assign the **Logto Management API** resource with all scopes
4. Update `.env`:
```
CAMELEER_SAAS_IDENTITY_M2MCLIENTID=<app-id>
CAMELEER_SAAS_IDENTITY_M2MCLIENTSECRET=<app-secret>
```
5. Restart cameleer-saas: `docker compose restart cameleer-saas`
### Create Your First Tenant
With a Logto user token (obtained via OIDC login flow):
```bash
TOKEN="<your-logto-jwt>"
# Create tenant
curl -X POST http://localhost:8080/api/tenants \
-H "Authorization: Bearer $TOKEN" \
-H "Content-Type: application/json" \
-d '{"name": "My Company", "slug": "my-company", "tier": "MID"}'
# A "default" environment is auto-created with the tenant
```
### Generate a License
```bash
TENANT_ID="<uuid-from-above>"
curl -X POST "http://localhost:8080/api/tenants/$TENANT_ID/license" \
-H "Authorization: Bearer $TOKEN"
```
### Deploy a Camel Application
```bash
# List environments
curl "http://localhost:8080/api/tenants/$TENANT_ID/environments" \
-H "Authorization: Bearer $TOKEN"
ENV_ID="<default-environment-uuid>"
# Upload JAR and create app
curl -X POST "http://localhost:8080/api/environments/$ENV_ID/apps" \
-H "Authorization: Bearer $TOKEN" \
-F 'metadata={"slug":"order-service","displayName":"Order Service"};type=application/json' \
-F "file=@/path/to/your-camel-app.jar"
APP_ID="<app-uuid-from-response>"
# Deploy (async -- returns 202 with deployment ID)
curl -X POST "http://localhost:8080/api/apps/$APP_ID/deploy" \
-H "Authorization: Bearer $TOKEN"
DEPLOYMENT_ID="<deployment-uuid>"
# Poll deployment status
curl "http://localhost:8080/api/apps/$APP_ID/deployments/$DEPLOYMENT_ID" \
-H "Authorization: Bearer $TOKEN"
# Status transitions: BUILDING -> STARTING -> RUNNING (or FAILED)
# View container logs
curl "http://localhost:8080/api/apps/$APP_ID/logs?limit=50" \
-H "Authorization: Bearer $TOKEN"
# Stop the app
curl -X POST "http://localhost:8080/api/apps/$APP_ID/stop" \
-H "Authorization: Bearer $TOKEN"
```
### Enable Inbound HTTP Routing
If your Camel app exposes a REST endpoint, you can make it reachable from outside the stack:
```bash
# Set the port your app listens on (e.g., 8080 for Spring Boot)
curl -X PATCH "http://localhost:8080/api/environments/$ENV_ID/apps/$APP_ID/routing" \
-H "Authorization: Bearer $TOKEN" \
-H "Content-Type: application/json" \
-d '{"exposedPort": 8080}'
```
Your app is now reachable at `http://{app-slug}.{env-slug}.{tenant-slug}.{domain}` (e.g., `http://order-service.default.my-company.localhost`). Traefik routes traffic automatically.
To disable routing, set `exposedPort` to `null`.
### View the Observability Dashboard
The cameleer-server React SPA dashboard is available at:
```
http://localhost/dashboard
```
This shows execution traces, route topology graphs, metrics, and logs for all deployed apps. Authentication is required (Logto OIDC token via forward-auth).
### Check Agent & Observability Status
```bash
# Is the agent registered with cameleer-server?
curl "http://localhost:8080/api/apps/$APP_ID/agent-status" \
-H "Authorization: Bearer $TOKEN"
# Returns: registered, state (ACTIVE/STALE/DEAD/UNKNOWN), routeIds
# Is the app producing observability data?
curl "http://localhost:8080/api/apps/$APP_ID/observability-status" \
-H "Authorization: Bearer $TOKEN"
# Returns: hasTraces, lastTraceAt, traceCount24h
```
## API Reference
### Tenants
| Method | Path | Description |
|--------|------|-------------|
| POST | `/api/tenants` | Create tenant |
| GET | `/api/tenants/{id}` | Get tenant |
| GET | `/api/tenants/by-slug/{slug}` | Get tenant by slug |
### Licensing
| Method | Path | Description |
|--------|------|-------------|
| POST | `/api/tenants/{tid}/license` | Generate license |
| GET | `/api/tenants/{tid}/license` | Get active license |
### Environments
| Method | Path | Description |
|--------|------|-------------|
| POST | `/api/tenants/{tid}/environments` | Create environment |
| GET | `/api/tenants/{tid}/environments` | List environments |
| GET | `/api/tenants/{tid}/environments/{eid}` | Get environment |
| PATCH | `/api/tenants/{tid}/environments/{eid}` | Rename environment |
| DELETE | `/api/tenants/{tid}/environments/{eid}` | Delete environment |
### Apps
| Method | Path | Description |
|--------|------|-------------|
| POST | `/api/environments/{eid}/apps` | Create app + upload JAR |
| GET | `/api/environments/{eid}/apps` | List apps |
| GET | `/api/environments/{eid}/apps/{aid}` | Get app |
| PUT | `/api/environments/{eid}/apps/{aid}/jar` | Re-upload JAR |
| PATCH | `/api/environments/{eid}/apps/{aid}/routing` | Set/clear exposed port |
| DELETE | `/api/environments/{eid}/apps/{aid}` | Delete app |
### Deployments
| Method | Path | Description |
|--------|------|-------------|
| POST | `/api/apps/{aid}/deploy` | Deploy app (async, 202) |
| GET | `/api/apps/{aid}/deployments` | Deployment history |
| GET | `/api/apps/{aid}/deployments/{did}` | Get deployment status |
| POST | `/api/apps/{aid}/stop` | Stop current deployment |
| POST | `/api/apps/{aid}/restart` | Restart app |
### Logs
| Method | Path | Description |
|--------|------|-------------|
| GET | `/api/apps/{aid}/logs` | Query container logs |
Query params: `since`, `until` (ISO timestamps), `limit` (default 500), `stream` (stdout/stderr/both)
### Observability
| Method | Path | Description |
|--------|------|-------------|
| GET | `/api/apps/{aid}/agent-status` | Agent registration status |
| GET | `/api/apps/{aid}/observability-status` | Trace/metrics data health |
### Dashboard
| Path | Description |
|------|-------------|
| `/dashboard` | cameleer-server observability dashboard (forward-auth protected) |
### Vendor: Certificates (platform:admin)
| Method | Path | Description |
|--------|------|-------------|
| GET | `/api/vendor/certificates` | Overview (active, staged, archived, stale count) |
| POST | `/api/vendor/certificates/stage` | Upload cert+key+CA (multipart) |
| POST | `/api/vendor/certificates/activate` | Promote staged -> active |
| POST | `/api/vendor/certificates/restore` | Swap archived <-> active |
| DELETE | `/api/vendor/certificates/staged` | Discard staged cert |
| GET | `/api/vendor/certificates/stale-tenants` | Count tenants needing CA restart |
### Vendor: Tenants (platform:admin)
| Method | Path | Description |
|--------|------|-------------|
| GET | `/api/vendor/tenants` | List all tenants (includes fleet health: agentCount, environmentCount, agentLimit) |
| POST | `/api/vendor/tenants` | Create tenant (async provisioning) |
| GET | `/api/vendor/tenants/{id}` | Tenant detail + server state |
| POST | `/api/vendor/tenants/{id}/restart` | Restart server containers |
| POST | `/api/vendor/tenants/{id}/suspend` | Suspend tenant |
| POST | `/api/vendor/tenants/{id}/activate` | Activate tenant |
| DELETE | `/api/vendor/tenants/{id}` | Delete tenant |
| POST | `/api/vendor/tenants/{id}/license` | Renew license |
### Tenant Portal (org-scoped)
| Method | Path | Description |
|--------|------|-------------|
| GET | `/api/tenant/dashboard` | Tenant dashboard data |
| GET | `/api/tenant/license` | License details |
| POST | `/api/tenant/server/restart` | Restart server |
| GET | `/api/tenant/team` | List team members |
| POST | `/api/tenant/team/invite` | Invite team member |
| DELETE | `/api/tenant/team/{userId}` | Remove team member |
| GET | `/api/tenant/settings` | Tenant settings |
| GET | `/api/tenant/sso` | List SSO connectors |
| POST | `/api/tenant/sso` | Create SSO connector |
| GET | `/api/tenant/ca` | List tenant CA certificates |
| POST | `/api/tenant/ca` | Upload CA cert (staged) |
| POST | `/api/tenant/ca/{id}/activate` | Activate staged CA cert |
| DELETE | `/api/tenant/ca/{id}` | Remove CA cert |
| GET | `/api/tenant/audit` | Tenant audit log |
### Health
| Method | Path | Description |
|--------|------|-------------|
| GET | `/actuator/health` | Health check (public) |
| GET | `/api/health/secured` | Authenticated health check |
## Tier Limits
| Tier | Environments | Apps | Retention | Features |
|------|-------------|------|-----------|----------|
| LOW | 1 | 3 | 7 days | Topology |
| MID | 2 | 10 | 30 days | + Lineage, Correlation |
| HIGH | Unlimited | 50 | 90 days | + Debugger, Replay |
| BUSINESS | Unlimited | Unlimited | 365 days | All features |
## Frontend Development
The SaaS management UI is a React SPA in the `ui/` directory.
### Setup
```bash
cd ui
npm install
```
### Dev Server
```bash
cd ui
npm run dev
```
The Vite dev server starts on http://localhost:5173 and proxies `/api` to `http://localhost:8080` (the Spring Boot backend). Run the backend in another terminal with `mvn spring-boot:run` or via Docker Compose.
### Environment Variables
| Variable | Purpose | Default |
|----------|---------|---------|
| `VITE_LOGTO_ENDPOINT` | Logto OIDC endpoint | `http://localhost:3001` |
| `VITE_LOGTO_CLIENT_ID` | Logto application client ID | (empty) |
Create a `ui/.env.local` file for local overrides:
```bash
VITE_LOGTO_ENDPOINT=http://localhost:3001
VITE_LOGTO_CLIENT_ID=your-client-id
```
### Production Build
```bash
cd ui
npm run build
```
Output goes to `src/main/resources/static/` (configured in `vite.config.ts`). The subsequent `mvn package` bundles the SPA into the JAR. In Docker builds, the Dockerfile handles this automatically via a multi-stage build.
### SPA Routing
Spring Boot serves `index.html` for all non-API routes via `SpaController.java`. React Router handles client-side routing. The SPA lives at `/`, while the observability dashboard (cameleer-server) is at `/dashboard`.
## Development
### Running Tests
```bash
# Unit tests only (no Docker required)
mvn test -B -Dsurefire.excludes="**/*ControllerTest.java,**/AuditRepositoryTest.java,**/CameleerSaasApplicationTest.java"
# Integration tests (requires Docker Desktop)
mvn test -B -Dtest="EnvironmentControllerTest,AppControllerTest,DeploymentControllerTest"
# All tests
mvn verify -B
```
### Building Locally
```bash
# Build JAR
mvn clean package -DskipTests -B
# Build Docker image
docker build -t cameleer-saas:local .
# Use local image
VERSION=local docker compose -f docker-compose.yml -f docker-compose.dev.yml up -d
```
## Troubleshooting
**Logto fails to start**: Check that PostgreSQL is healthy first. Logto needs the `logto` database created by `docker/init-databases.sh`. Run `docker compose logs logto` for details.
**cameleer-saas won't start**: Check `docker compose logs cameleer-saas`. Common issues:
- PostgreSQL not ready (wait for healthcheck)
- Flyway migration conflict (check for manual schema changes)
**Ephemeral key warnings**: `No Ed25519 key files configured -- generating ephemeral keys (dev mode)` is normal in development. For production, generate keys as described above.
**Container deployment fails**: Check that Docker socket is mounted (`/var/run/docker.sock`) and the `cameleer-runtime-base` image is available. Pull it with: `docker pull gitea.siegeln.net/cameleer/cameleer-runtime-base:latest`

Binary file not shown.

After

Width:  |  Height:  |  Size: 43 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 45 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 43 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 43 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 47 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 48 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 14 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 43 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 43 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 48 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 70 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 69 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 45 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 56 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 41 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 46 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 72 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 72 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 56 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 57 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 42 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 44 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 44 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 16 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 44 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 48 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 49 KiB

BIN
audit/03-login-page.png Normal file

Binary file not shown.

After

Width:  |  Height:  |  Size: 16 KiB

BIN
audit/04-login-error.png Normal file

Binary file not shown.

After

Width:  |  Height:  |  Size: 18 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 48 KiB

BIN
audit/06-license-page.png Normal file

Binary file not shown.

After

Width:  |  Height:  |  Size: 48 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 49 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 47 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 49 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 211 KiB

BIN
audit/11-search-modal.png Normal file

Binary file not shown.

After

Width:  |  Height:  |  Size: 48 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 48 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 42 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 30 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 45 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 10 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 4.4 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 6.0 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 5.4 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 8.8 KiB

BIN
audit/21-sidebar-detail.png Normal file

Binary file not shown.

After

Width:  |  Height:  |  Size: 8.9 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 9.1 KiB

View File

@@ -0,0 +1,269 @@
# Cameleer SaaS Platform UI Audit Findings
**Date:** 2026-04-09
**Auditor:** Claude Opus 4.6
**URL:** https://desktop-fb5vgj9.siegeln.internal/
**Credentials:** admin/admin
**Browser:** Playwright (Chromium)
---
## 1. Login Page (`/sign-in`)
**Screenshot:** `03-login-page.png`, `04-login-error.png`
### What works well
- Clean, centered card layout with consistent design system components
- Fun rotating subtitle taglines (e.g., "No ticket, no caravan") add personality
- Cameleer logo is displayed correctly
- Error handling works -- "Invalid username or password" alert appears on bad credentials (red alert banner)
- Sign in button is correctly disabled until both fields are populated
- Loading state on button during authentication
- Uses proper `autoComplete` attributes (`username`, `current-password`)
### Issues found
| Severity | Issue | Element |
|----------|-------|---------|
| Important | **No password visibility toggle** -- the Password input uses `type="password"` with no eye icon to reveal. Most modern login forms offer this. | Password field |
| Important | **Branding says "cameleer"** not "Cameleer" or "Cameleer SaaS" -- the product name on the login page is the internal repo name, not the user-facing brand | `.logo` text content |
| Nice-to-have | **No "Forgot password" link** -- even if it goes to a "contact admin" page, users expect this | Below password field |
| Nice-to-have | **No Enter-key submit hint** -- though Enter does work via form submit, there's no visual affordance | Form area |
| Nice-to-have | **Page title is "Sign in -- cameleer"** -- should match product branding ("Cameleer SaaS") | `<title>` tag |
---
## 2. Platform Dashboard (`/platform/`)
**Screenshots:** `05-platform-dashboard-loggedin.png`, `15-dashboard-desktop-1280.png`, `19-tenant-info-detail.png`, `20-kpi-strip-detail.png`
### What works well
- Clear tenant name as page heading ("Example Tenant")
- Tier badge next to tenant name provides immediate context
- KPI strip with Tier, Status, License cards is visually clean and well-structured
- License KPI card shows expiry date in green "expires 8.4.2027" trend indicator
- "Server Management" card provides clear description of what the server dashboard does
### Issues found
| Severity | Issue | Element |
|----------|-------|---------|
| **Critical** | **Label/value collision in Tenant Information card** -- "Slugdefault", "Created8.4.2026" have no visual separation between label and value. The source uses `flex justify-between` but the deployed Card component doesn't give the inner `div` full width, so items stack/collapse. | Tenant Information card |
| **Critical** | **"Open Server Dashboard" appears 3 times** on one page: (1) primary button in header area below tenant name, (2) "Server Management" card with secondary button, (3) sidebar footer link. This is redundant and clutters the page. Reduce to 1-2 locations max. | Header area, Server Management card, sidebar footer |
| Important | **Breadcrumb is always empty** -- the `breadcrumb` prop is passed as `[]`. Platform pages should have breadcrumbs like "Platform > Dashboard" or "Platform > License". | TopBar breadcrumb nav |
| Important | **Massive empty space below content** -- the dashboard only has ~4 cards but the page extends far below with blank white/cream space. The page feels sparse and "stub-like." | Below Server Management card |
| Important | **Tier badge color is misleading** -- "LOW" tier uses `primary` (orange) color, which doesn't convey it's the lowest/cheapest tier. The `tierColor()` function in DashboardPage maps to enterprise=success, pro=primary, starter=warning, but the actual data uses LOW/MID/HIGH/BUSINESS tiers (defined in LicensePage). Dashboard and License pages have different tier color mappings. | Tier badge |
| Important | **Status is shown redundantly** -- "ACTIVE" appears in (1) KPI strip Status card, (2) Tenant Information card with badge, and (3) header area badge. This is excessive for a single piece of information. | Multiple locations |
| Nice-to-have | **No tenant ID/slug in breadcrumb or subtitle** -- the slug "default" only appears buried in the Tenant Information card | Page header area |
---
## 3. License Page (`/platform/license`)
**Screenshots:** `06-license-page.png`, `07-license-token-revealed.png`, `16-license-features-detail.png`, `17-license-limits-detail.png`, `18-license-validity-detail.png`
### What works well
- Well-structured layout with logical sections (Validity, Features, Limits, License Token)
- Tier badge in header provides context
- Feature matrix clearly shows enabled vs disabled features
- "Days remaining" with color-coded badge (green for healthy, warning for <30 days, red for expired)
- Token show/hide toggle works correctly
- Token revealed in monospace code block with appropriate styling
### Issues found
| Severity | Issue | Element |
|----------|-------|---------|
| **Critical** | **Label/value collision in Validity section** -- "Issued8. April 2026" and "Expires8. April 2027" have no separation. Source code uses `flex items-center justify-between` but the flex container seems to not be stretching to full width. | Validity card rows |
| **Critical** | **Label/value collision in Limits section** -- "Max Agents3", "Retention Days7", "Max Environments1" have labels and values mashed together. Source uses `flex items-center justify-between` layout but the same rendering bug prevents proper spacing. | Limits card rows |
| Important | **No "Copy to clipboard" button** for the license token -- users need to manually select and copy. A copy button with confirmation toast is standard UX for tokens/secrets. | License Token section |
| Important | **Feature badge text mismatch** -- Source code says `'Not included'` for disabled features, but deployed version shows "DISABLED". This suggests the deployed build is out of sync with the source. | Features card badges |
| Important | **"Disabled" badge color** -- disabled features use `color='auto'` (which renders as a neutral/red-ish badge), while "Enabled" uses green. Consider using a muted gray for "Not included" to make it feel less like an error state. Red implies something is wrong, but a feature simply not being in the plan is not an error. | Features card disabled badges |
| Nice-to-have | **Limits values are not right-aligned** -- due to the label/value collision, the numeric values don't align in a column, making comparison harder | Limits card |
| Nice-to-have | **No units on limits** -- "Retention Days7" should be "7 days", "Max Agents3" should be "3 agents" or just "3" with clear formatting | Limits card values |
---
## 4. Admin Pages (`/platform/admin/tenants`)
**No screenshot available -- page returns HTTP error**
### Issues found
| Severity | Issue | Element |
|----------|-------|---------|
| **Critical** | **Admin page returns HTTP error (net::ERR_HTTP_RESPONSE_CODE_FAILURE)** -- navigating to `/platform/admin/tenants` fails with an HTTP error. The route exists in the router (`AdminTenantsPage`), but the admin section is not visible in the sidebar (no "Platform" item shown). | Admin route |
| Important | **Admin section not visible in sidebar** -- the `platform:admin` scope check in Layout.tsx hides the "Platform" sidebar item. Even though the user is "admin", they apparently don't have the `platform:admin` scope in their JWT. This may be intentional (scope not assigned) or a bug. | Sidebar Platform section |
| Important | **No graceful fallback for unauthorized admin access** -- if a user manually navigates to `/admin/tenants` without the scope, the page should show a "Not authorized" message rather than an HTTP error. | Admin route error handling |
---
## 5. Navigation
**Screenshots:** `21-sidebar-detail.png`, `12-sidebar-collapsed.png`
### What works well
- Clean sidebar with Cameleer SaaS branding and logo
- "Open Server Dashboard" in sidebar footer is a good location
- Sidebar has only 2 navigation items (Dashboard, License) which keeps it simple
### Issues found
| Severity | Issue | Element |
|----------|-------|---------|
| **Critical** | **No active state on sidebar navigation items** -- when on the Dashboard page, neither Dashboard nor License is highlighted/active. The sidebar uses `Sidebar.Section` components with `open={false}` as navigation links via `onToggle`, but `Section` is designed for expandable/collapsible groups, not navigation links. There is no visual indicator of the current page. | Sidebar items |
| Important | **Sidebar collapse doesn't work visually** -- clicking "Collapse sidebar" toggles the `active` state on the button but the sidebar doesn't visually collapse. The Layout component passes `collapsed={false}` as a hardcoded prop and `onCollapseToggle={() => {}}` as a no-op. | Sidebar collapse button |
| Important | **No clear distinction between "platform" and "server" levels** -- there's nothing in the sidebar header that says "Platform" vs "Server". The sidebar says "Cameleer SaaS" but when you switch to the server dashboard, it becomes a completely different app. A user might not understand the relationship. | Sidebar header |
| Nice-to-have | **"Open Server Dashboard" opens in new tab** -- `window.open('/server/', '_blank', 'noopener')` is used. While reasonable, there's no visual indicator (external link icon) that it will open a new tab. | Sidebar footer link, dashboard buttons |
---
## 6. Header Bar (TopBar)
**Screenshot:** `22-header-bar-detail.png`
### Issues found
| Severity | Issue | Element |
|----------|-------|---------|
| **Critical** | **Server-specific controls shown on platform pages** -- the TopBar always renders: (1) Search (Ctrl+K), (2) Status filters (OK/Warn/Error/Running), (3) Time range pills (1h/3h/6h/Today/24h/7d), (4) Auto-refresh toggle (MANUAL/AUTO). None of these are relevant to the platform dashboard or license page. They are observability controls designed for the server's exchange/route monitoring. | Entire TopBar filter area |
| Important | **Search button does nothing** -- clicking "Search..." on the platform does not open a search modal. The CommandPaletteProvider is likely not configured for the platform context. | Search button |
| Important | **Status filter buttons are interactive but meaningless** -- clicking OK/Warn/Error/Running on platform pages toggles state (global filter provider) but has no effect on the displayed content. | Status filter buttons |
| Important | **Time range selector is interactive but meaningless** -- similarly, changing the time range from 1h to 7d has no effect on platform pages. | Time range pills |
| Important | **Auto-refresh toggle is misleading** -- shows "MANUAL" toggle on platform pages where there's nothing to auto-refresh. | Auto-refresh button |
---
## 7. User Menu
**Screenshot:** `02-user-menu-dropdown.png`
### What works well
- User name "admin" and avatar initials "AD" displayed correctly
- Dropdown appears on click with Logout option
### Issues found
| Severity | Issue | Element |
|----------|-------|---------|
| Important | **User menu only has "Logout"** -- there's no "Profile", "Settings", "About", or "Switch Tenant" option. For a SaaS platform, users should at minimum see their role and tenant context. | User dropdown menu |
| Nice-to-have | **Avatar shows "AD" for "admin"** -- the Avatar component appears to use first 2 characters of the name. For "admin" this produces "AD" which looks like initials for a different name. | Avatar component |
---
## 8. Dark Mode
**Screenshots:** `08-dashboard-dark-mode.png`, `09-license-dark-mode.png`
### What works well
- Dark mode toggle works and applies globally
- Background transitions to dark brown/charcoal
- Text colors adapt appropriately
- Cards maintain visual distinction from background
- Design system tokens handle the switch smoothly
### Issues found
| Severity | Issue | Element |
|----------|-------|---------|
| Nice-to-have | **Dark mode is warm-toned (brown)** rather than the more common cool dark gray/charcoal. This is consistent with the design system's cameleer branding but may feel unusual to users accustomed to dark mode in other apps. | Global dark theme |
| Nice-to-have | **The same label/value collision issues appear in dark mode** -- these are layout bugs, not color bugs, so dark mode doesn't help or hurt. | Card content |
---
## 9. Responsiveness
**Screenshots:** `13-responsive-tablet.png`, `14-responsive-mobile.png`
### Issues found
| Severity | Issue | Element |
|----------|-------|---------|
| **Critical** | **Mobile layout is broken** -- at 375px width, the sidebar overlaps the main content. The KPI strip cards are truncated ("LO...", "AC..."). The header bar overflows. Content is unreadable. | Full page at mobile widths |
| Important | **Tablet layout (768px) is functional but crowded** -- sidebar takes significant width, header bar items are compressed ("Se..." for Search), but content is readable. KPI strip wraps correctly. | Full page at tablet widths |
| Important | **Sidebar doesn't collapse on mobile** -- there's no hamburger menu or responsive sidebar behavior. The sidebar is always visible, eating screen space on narrow viewports. | Sidebar |
---
## 10. Cross-cutting Concerns
### Loading States
- Dashboard and License pages both show a centered `Spinner` during loading -- this works well.
- `EmptyState` component used for "No tenant associated" and "License unavailable" -- good error handling in components.
### Error States
- Login page error handling is good (alert banner)
- No visible error boundary for unexpected errors on platform pages
- Admin route fails silently with HTTP error -- no user-facing error message
### Toast Notifications
- No toast notifications observed during the audit
- License token copy should trigger a toast confirmation (if a copy button existed)
### Confirmation Dialogs
- No destructive actions available on the platform (no delete/deactivate buttons) so no confirmation dialogs needed currently
---
## Summary of Issues by Severity
### Critical (5)
1. **Label/value collision** throughout Tenant Information card, License Validity, and License Limits sections -- labels and values run together without spacing
2. **"Open Server Dashboard" appears 3 times** on the dashboard page -- excessive redundancy
3. **No active state on sidebar navigation items** -- users can't tell which page they're on
4. **Server-specific header controls shown on platform pages** -- search, status filters, time range, auto-refresh are all meaningless on platform pages
5. **Mobile layout completely broken** -- sidebar overlaps content, content truncated
### Important (17)
1. No password visibility toggle on login
2. Branding says "cameleer" instead of product name on login
3. Breadcrumbs always empty on platform pages
4. Massive empty space below dashboard content
5. Tier badge color mapping inconsistent between Dashboard and License pages
6. Status shown redundantly in 3 places on dashboard
7. No clipboard copy button for license token
8. Feature badge text mismatch between source and deployed build
9. "Disabled" badge uses red-ish color (implies error, not "not in plan")
10. Admin page returns HTTP error with no graceful fallback
11. Admin section invisible in sidebar despite being admin user
12. Sidebar collapse button doesn't work (no-op handler)
13. No clear platform vs server level distinction
14. Search button does nothing on platform
15. Status filters and time range interactive but meaningless on platform
16. User menu only has Logout (no profile/settings)
17. Sidebar doesn't collapse/hide on mobile
### Nice-to-have (8)
1. No "Forgot password" link on login
2. Login page title uses "cameleer" branding
3. No external link icon on "Open Server Dashboard"
4. Avatar shows "AD" for "admin"
5. No units on limit values
6. Dark mode warm-toned (not standard cool dark)
7. No Enter-key submit hint
8. No tenant ID in breadcrumb/subtitle
---
## Overarching Assessment
The platform UI currently feels like a **thin shell** around the server dashboard. It has only 2 functioning pages (Dashboard and License), and both suffer from the same fundamental layout bug (label/value collision in Card components). The header bar is entirely borrowed from the server observability UI without any platform-specific adaptation, making 70% of the header controls irrelevant.
**Key architectural concerns:**
1. The TopBar component from the design system is monolithic -- it always renders server-specific controls (status filters, time range, search). The platform needs either a simplified TopBar variant or the ability to hide these sections.
2. The sidebar uses `Sidebar.Section` (expandable groups) as navigation links, which prevents active-state highlighting. It should use `Sidebar.Link` or a similar component.
3. The platform provides very little actionable functionality -- a user can view their tenant info and license, but can't manage anything. The "Server Management" card is just a link to another app.
**What works well overall:**
- Design system integration is solid (same look and feel as server)
- Dark mode works correctly
- Loading and error states are handled
- Login page is clean and functional
- KPI strip component is effective at summarizing key info
**Recommended priorities:**
1. Fix the label/value collision bug (affects 3 cards across 2 pages)
2. Hide or replace server-specific header controls on platform pages
3. Add sidebar active state and fix the collapse behavior
4. Add clipboard copy for license token
5. Fix mobile responsiveness

View File

@@ -0,0 +1,433 @@
# Cameleer SaaS UI — Source Code Audit Findings
**Audit date:** 2026-04-09
**Scope:** `ui/src/` (platform SPA) + `ui/sign-in/src/` (custom Logto sign-in)
**Design system:** `@cameleer/design-system@0.1.38`
---
## 1. Layout and Styling Patterns
### 1.1 Container Padding/Margin
All three page components use an identical outer wrapper pattern:
```tsx
// DashboardPage.tsx:67, LicensePage.tsx:82, AdminTenantsPage.tsx:60
<div className="space-y-6 p-6">
```
**Verdict:** Consistent across all pages. However, this padding is applied by each page individually rather than by the `Layout` component. If a new page omits `p-6`, the layout will be inconsistent. Consider moving container padding to the `Layout` component wrapping `<Outlet />`.
### 1.2 Use of Design System Components vs Custom HTML
| Component | DashboardPage | LicensePage | AdminTenantsPage |
|-----------|:---:|:---:|:---:|
| Badge | Yes | Yes | Yes |
| Button | Yes | - | - |
| Card | Yes | Yes | Yes |
| DataTable | - | - | Yes |
| EmptyState | Yes | Yes | - |
| KpiStrip | Yes | - | - |
| Spinner | Yes | Yes | Yes |
**Issues found:**
- **LicensePage.tsx:166-170** — Raw `<button>` for "Show token" / "Hide token" toggle instead of DS `Button variant="ghost"`:
```tsx
<button
type="button"
className="text-sm text-primary-400 hover:text-primary-300 underline underline-offset-2 focus:outline-none"
onClick={() => setTokenExpanded((v) => !v)}
>
```
This uses hardcoded Tailwind color classes (`text-primary-400`, `hover:text-primary-300`) instead of design tokens or a DS Button.
- **LicensePage.tsx:174** — Raw `<div>` + `<code>` for token display instead of DS `CodeBlock` (which is available and supports `copyable`):
```tsx
<div className="mt-2 rounded bg-white/5 border border-white/10 p-3 overflow-x-auto">
<code className="text-xs font-mono text-white/80 break-all">
{license.token}
</code>
</div>
```
- **AdminTenantsPage.tsx** — No empty state when `tenants` is empty. The DataTable renders with zero rows but no guidance for the admin.
### 1.3 Card/Section Grouping
- **DashboardPage** uses: KpiStrip + "Tenant Information" Card + "Server Management" Card. Good grouping.
- **LicensePage** uses: "Validity" Card + "Features" Card + "Limits" Card + "License Token" Card. Well-structured.
- **AdminTenantsPage** uses: single Card wrapping DataTable. Appropriate for a list view.
### 1.4 Typography
All pages use the same heading pattern:
```tsx
<h1 className="text-2xl font-semibold text-white">...</h1>
```
**Issue:** `text-white` is hardcoded rather than using a DS color token like `var(--text-primary)`. This will break if the design system ever supports a light theme (the DS has `ThemeProvider` and a theme toggle in the TopBar). The same pattern appears:
- `DashboardPage.tsx:73` — `text-white`
- `LicensePage.tsx:85` — `text-white`
- `AdminTenantsPage.tsx:62` — `text-white`
Similarly, muted text uses `text-white/60` and `text-white/80` throughout:
- `DashboardPage.tsx:96` — `text-white/80`
- `LicensePage.tsx:96,106,109` — `text-white/60`, `text-white`
- `LicensePage.tsx:129` — `text-sm text-white`
- `LicensePage.tsx:150` — `text-sm text-white/60`
These should use `var(--text-primary)` / `var(--text-secondary)` / `var(--text-muted)` from the design system.
### 1.5 Color Token Usage
**Positive:** The sign-in page CSS module (`SignInPage.module.css`) correctly uses DS variables:
```css
color: var(--text-primary); /* line 30 */
color: var(--text-muted); /* line 40 */
background: var(--bg-base); /* line 7 */
font-family: var(--font-body); /* line 20 */
```
**Negative:** The platform SPA pages bypass the design system's CSS variables entirely, using Tailwind utility classes with hardcoded dark-theme colors (`text-white`, `text-white/60`, `bg-white/5`, `border-white/10`, `divide-white/10`).
---
## 2. Interaction Patterns
### 2.1 Button Placement and Order
- **DashboardPage.tsx:81-87** — "Open Server Dashboard" button is top-right (standard). Also repeated inside a Card at line 119-125. Two identical CTAs on the same page is redundant.
- No forms exist in the platform pages. No create/edit/delete operations are exposed in the UI (read-only dashboard).
### 2.2 Confirmation Dialogs for Destructive Actions
- The DS provides `ConfirmDialog` and `AlertDialog` — neither is used anywhere.
- **AdminTenantsPage.tsx:47-57** — Row click silently switches tenant context and navigates to `/`. No confirmation dialog for context switching, which could be disorienting. The user clicks a row in the admin table, and their entire session context changes.
### 2.3 Loading States
All pages use the same loading pattern — centered `<Spinner />` in a fixed-height container:
```tsx
<div className="flex items-center justify-center h-64">
<Spinner />
</div>
```
**Issues:**
- Full-page auth loading screens (LoginPage, CallbackPage, ProtectedRoute, OrgResolver) use inline styles instead of Tailwind:
```tsx
<div style={{ display: 'flex', alignItems: 'center', justifyContent: 'center', minHeight: '100vh' }}>
```
This is inconsistent with the page components which use Tailwind classes.
- The `main.tsx` app bootstrap loading (line 59) also uses inline styles. Six files use this identical inline style pattern — it should be a shared component or consistent class.
- No `Skeleton` components are used anywhere, despite the DS providing `Skeleton`. For the dashboard and license pages which fetch data, skeletons would give better perceived performance than a generic spinner.
### 2.4 Error Handling
- **API client (`api/client.ts`):** Errors are thrown as generic `Error` objects. No toast notifications on failure.
- **LicensePage.tsx:63-69** — Shows `EmptyState` for `isError`. Good.
- **DashboardPage.tsx** — No error state handling at all. If `useTenant()` or `useLicense()` fails, the page renders with fallback `-` values silently. No `isError` check.
- **AdminTenantsPage.tsx** — No error state. If `useAllTenants()` fails, falls through to rendering the table with empty data.
- **OrgResolver.tsx:88-89** — On error, renders `null` (blank screen). The user sees nothing — no error message, no retry option, no redirect. This is the worst error UX in the app.
- No component imports or uses `useToast()` from the DS. Toasts are never shown for any operation.
### 2.5 Empty States
- **DashboardPage.tsx:57-63** — `EmptyState` for no tenant. Good.
- **LicensePage.tsx:54-60** — `EmptyState` for no tenant. Good.
- **LicensePage.tsx:63-69** — `EmptyState` for license fetch error. Good.
- **AdminTenantsPage.tsx** — **Missing.** No empty state when `tenants` array is empty. DataTable will render an empty table body.
---
## 3. Component Usage
### 3.1 DS Imports by File
| File | DS Components Imported |
|------|----------------------|
| `main.tsx` | ThemeProvider, ToastProvider, BreadcrumbProvider, GlobalFilterProvider, CommandPaletteProvider, Spinner |
| `Layout.tsx` | AppShell, Sidebar, TopBar |
| `DashboardPage.tsx` | Badge, Button, Card, EmptyState, KpiStrip, Spinner |
| `LicensePage.tsx` | Badge, Card, EmptyState, Spinner |
| `AdminTenantsPage.tsx` | Badge, Card, DataTable, Spinner + Column type |
| `LoginPage.tsx` | Spinner |
| `CallbackPage.tsx` | Spinner |
| `ProtectedRoute.tsx` | Spinner |
| `OrgResolver.tsx` | Spinner |
| `SignInPage.tsx` (sign-in) | Card, Input, Button, Alert, FormField |
### 3.2 Available but Unused DS Components
These DS components are relevant to the platform UI but unused:
| Component | Could be used for |
|-----------|------------------|
| `AlertDialog` / `ConfirmDialog` | Confirming tenant context switch in AdminTenantsPage |
| `CodeBlock` | License token display (currently raw HTML) |
| `Skeleton` | Loading states instead of spinner |
| `Tooltip` | Badge hover explanations, info about features |
| `StatusDot` | Tenant status indicators |
| `Breadcrumb` / `useBreadcrumb` | Page navigation context (currently empty `[]`) |
| `LoginForm` | Could replace the custom sign-in form (DS already has one) |
| `useToast` | Error/success notifications |
### 3.3 Raw HTML Where DS Components Exist
1. **LicensePage.tsx:166-170** — Raw `<button>` instead of `Button variant="ghost"`
2. **LicensePage.tsx:174-178** — Raw `<div><code>` instead of `CodeBlock`
3. **Layout.tsx:26-62** — Four inline SVG icon components instead of using `lucide-react` icons (the DS depends on lucide-react)
4. **DashboardPage.tsx:95-112** — Manual label/value list with `<div className="flex justify-between">` instead of using a DS pattern (the DS has no explicit key-value list component, so this is acceptable)
### 3.4 Styling Approach
- **Platform SPA pages:** Tailwind CSS utility classes (via class names like `space-y-6`, `p-6`, `flex`, `items-center`, etc.)
- **Sign-in page:** CSS modules (`SignInPage.module.css`) with DS CSS variables
- **Auth loading screens:** Inline `style={{}}` objects
- **No CSS modules** in the platform SPA at all (zero `.module.css` files in `ui/src/`)
This is a three-way inconsistency: Tailwind in pages, CSS modules in sign-in, inline styles in auth components.
---
## 4. Navigation
### 4.1 Sidebar
**File:** `ui/src/components/Layout.tsx:70-118`
The sidebar uses `Sidebar.Section` with `open={false}` and `{null}` children as a workaround to make sections act as navigation links (via `onToggle`). This is a semantic misuse — sections are designed as collapsible containers, not nav links.
```tsx
<Sidebar.Section
icon={<DashboardIcon />}
label="Dashboard"
open={false}
onToggle={() => navigate('/')}
>
{null}
</Sidebar.Section>
```
**Issues:**
- No `active` state is set on any section. The DS supports `active?: boolean` on `SidebarSectionProps` (line 988 of DS types), but it's never passed. The user has no visual indicator of which page they're on.
- `collapsed={false}` is hardcoded with `onCollapseToggle={() => {}}` — the sidebar cannot be collapsed. This is a no-op handler.
- Only three nav items: Dashboard, License, Platform (admin-only). Very sparse.
### 4.2 "Open Server Dashboard"
Two implementations, both identical:
1. **Sidebar footer** (`Layout.tsx:112-116`): `Sidebar.FooterLink` with `window.open('/server/', '_blank', 'noopener')`
2. **Dashboard page** (`DashboardPage.tsx:84`): Primary Button, same `window.open` call
3. **Dashboard page** (`DashboardPage.tsx:120-125`): Secondary Button in a Card, same `window.open` call
Three separate "Open Server Dashboard" triggers on the dashboard. The footer link is good; the two dashboard buttons are redundant.
### 4.3 Breadcrumbs
**File:** `Layout.tsx:124` — `<TopBar breadcrumb={[]} ... />`
Breadcrumbs are permanently empty. The DS provides `useBreadcrumb()` hook (exported, see line 1255 of DS types) that pages can call to set page-specific breadcrumbs, but none of the pages use it. The TopBar renders an empty breadcrumb area.
### 4.4 User Menu / Avatar
**File:** `Layout.tsx:125-126`
```tsx
<TopBar
user={username ? { name: username } : undefined}
onLogout={logout}
/>
```
The TopBar's `user` prop triggers a `Dropdown` with only a "Logout" option. The avatar is rendered by the DS using the `Avatar` component with the user's name.
**Issue:** When `username` is `null` (common if the Logto ID token doesn't have `username`, `name`, or `email` claims), no user indicator is shown at all — no avatar, no logout button. The user has no way to log out from the UI.
---
## 5. Header Bar
### 5.1 Shared TopBar with Server
The platform SPA and the server SPA both use the same `TopBar` component from `@cameleer/design-system`. This means they share identical header chrome.
### 5.2 Irrelevant Controls on Platform Pages
**Critical issue.** The `TopBar` component (DS source, lines 5569-5588 of `index.es.js`) **always** renders:
1. **Status filter pills** (Completed, Warning, Error, Running) — `ButtonGroup` with global filter status values
2. **Time range dropdown** — `TimeRangeDropdown` with presets like "Last 1h", "Last 24h"
3. **Auto-refresh toggle** — "AUTO" / "MANUAL" button
4. **Theme toggle** — Light/dark mode switch
5. **Command palette search** — "Search... Ctrl+K" button
These controls are hardcoded in the DS `TopBar` component. They read from `useGlobalFilters()` and operate on exchange status filters and time ranges — concepts that are **completely irrelevant** to the SaaS platform pages (Dashboard, License, Admin Tenants).
The platform wraps everything in `GlobalFilterProvider` (in `main.tsx:96`), which initializes the filter state, but nothing in the platform UI reads or uses these filters. They are dead UI elements that confuse users.
**Recommendation:** Either:
- The DS should make these controls optional/configurable on `TopBar`
- The platform should use a simpler header component
- The platform should not wrap in `GlobalFilterProvider` / `CommandPaletteProvider` (but this may cause runtime errors if TopBar assumes they exist)
---
## 6. Specific Issues
### 6.1 Label/Value Formatting — "Slugdefault" Concatenation Bug
**Not found in source code.** The source code properly formats label/value pairs with `flex justify-between` layout:
```tsx
// DashboardPage.tsx:96-99
<div className="flex justify-between text-white/80">
<span>Slug</span>
<span className="font-mono">{tenant?.slug ?? '-'}</span>
</div>
```
If "Slugdefault" concatenation is visible in the UI, it's a **rendering/CSS issue** rather than a template bug — the `flex justify-between` may collapse if the container is too narrow, or there may be a DS Card padding issue causing the spans to not separate. The code itself has proper separation.
Similarly for limits on the License page:
```tsx
// LicensePage.tsx:147-155
<span className="text-sm text-white/60">{label}</span>
<span className="text-sm font-mono text-white">{value !== undefined ? value : '—'}</span>
```
Labels and values are in separate `<span>` elements within `flex justify-between` containers. The code is correct.
### 6.2 Badge Colors
**Feature badges (LicensePage.tsx:130-133):**
```tsx
<Badge
label={enabled ? 'Enabled' : 'Not included'}
color={enabled ? 'success' : 'auto'}
/>
```
- Enabled features: `color="success"` (green) — appropriate
- Disabled features: `color="auto"` — this uses the DS's auto-color logic (hash-based). For a disabled/not-included state, `color="error"` or a neutral muted variant would be more appropriate to clearly communicate "not available."
**Tenant status badges (DashboardPage.tsx:102-105, AdminTenantsPage.tsx:24-29):**
```tsx
color={tenant?.status === 'ACTIVE' ? 'success' : 'warning'}
color={row.status === 'ACTIVE' ? 'success' : 'warning'}
```
- ACTIVE: green — appropriate
- Anything else (SUSPENDED, PENDING): yellow/warning — reasonable but SUSPENDED should arguably be `error` (red)
**Tier badges:** Use `tierColor()` function but it's defined differently in each file:
- `DashboardPage.tsx:12-18` maps: enterprise->success, pro->primary, starter->warning
- `LicensePage.tsx:25-33` maps: BUSINESS->success, HIGH->primary, MID->warning, LOW->error
These use **different tier names** (enterprise/pro/starter vs BUSINESS/HIGH/MID/LOW). One is for tenant tiers, the other for license tiers, but the inconsistency suggests either the data model has diverged or one mapping is stale.
### 6.3 Sign-In Page (`ui/sign-in/src/`)
**Positive findings:**
- Uses DS components: `Card`, `Input`, `Button`, `Alert`, `FormField`
- Uses CSS modules with DS CSS variables (`var(--bg-base)`, `var(--text-primary)`, etc.)
- Proper form with `aria-label="Sign in"`, `autoComplete` attributes
- Loading state on submit button via `loading` prop
- Error display via DS `Alert variant="error"`
- Creative rotating subtitle strings — good personality touch
**Issues:**
1. **No `ThemeProvider` wrapper** (`sign-in/src/main.tsx`):
```tsx
createRoot(document.getElementById('root')!).render(
<StrictMode>
<App />
</StrictMode>,
);
```
The sign-in page imports `@cameleer/design-system/style.css` which provides CSS variable defaults, so it works. But the theme toggle won't function, and if the DS ever requires `ThemeProvider` for initialization, this will break.
2. **No `ToastProvider`** — if any DS component internally uses `useToast()`, it will throw.
3. **Hardcoded branding** (`SignInPage.tsx:61`):
```tsx
cameleer
```
The brand name is hardcoded text, not sourced from configuration.
4. **`React` import unused** (`SignInPage.tsx:1`): `useMemo` and `useState` are imported from `react` but the `import React` default import is absent, which is fine for React 19.
5. **No "forgot password" flow** — the form has username + password only. No recovery link. The DS `LoginForm` component supports `onForgotPassword` and `onSignUp` callbacks.
---
## 7. Architecture Observations
### 7.1 Provider Stack Over-provisioning
`main.tsx` wraps the app in:
```
ThemeProvider > ToastProvider > BreadcrumbProvider > GlobalFilterProvider > CommandPaletteProvider
```
`GlobalFilterProvider` and `CommandPaletteProvider` are server-dashboard concepts (exchange status filters, time range, search). They are unused by any platform page but are required because `TopBar` reads from them internally. This creates coupling between the server's observability UI concerns and the SaaS platform pages.
### 7.2 Route Guard Nesting
The route structure is:
```
ProtectedRoute > OrgResolver > Layout > (pages)
```
`OrgResolver` fetches `/api/me` and resolves tenant context. If it fails (`isError`), it renders `null` — a blank screen inside the Layout shell. This means the sidebar and TopBar render but the content area is completely empty with no explanation.
### 7.3 Unused Import
- `LicensePage.tsx:1` imports `React` and `useState` — `React` import is not needed with React 19's JSX transform, and `useState` is used so that's fine. But `React` as a namespace import isn't used.
### 7.4 DataTable Requires `id` Field
`AdminTenantsPage.tsx:67` passes `tenants` to `DataTable`. The DS type requires `T extends { id: string }`. The `TenantResponse` type has `id: string`, so this works, but the `createdAt` column (line 31) renders the raw ISO timestamp string without formatting — unlike DashboardPage which formats it with `toLocaleDateString()`.
---
## 8. Summary of Issues by Severity
### High Priority
| # | Issue | File(s) | Line(s) |
|---|-------|---------|---------|
| H1 | TopBar shows irrelevant status filters, time range, auto-refresh for platform pages | `Layout.tsx` / DS `TopBar` | 122-128 |
| H2 | OrgResolver error state renders blank screen (no error UI) | `OrgResolver.tsx` | 88-89 |
| H3 | Hardcoded `text-white` colors break light theme | All pages | Multiple |
### Medium Priority
| # | Issue | File(s) | Line(s) |
|---|-------|---------|---------|
| M1 | No active state on sidebar navigation items | `Layout.tsx` | 79-108 |
| M2 | Breadcrumbs permanently empty | `Layout.tsx` | 124 |
| M3 | DashboardPage has no error handling for failed API calls | `DashboardPage.tsx` | 23-26 |
| M4 | AdminTenantsPage missing empty state | `AdminTenantsPage.tsx` | 67-72 |
| M5 | AdminTenantsPage row click silently switches tenant context | `AdminTenantsPage.tsx` | 47-57 |
| M6 | Toasts never used despite ToastProvider being mounted | All pages | - |
| M7 | Raw `<button>` and `<code>` instead of DS components in LicensePage | `LicensePage.tsx` | 166-178 |
| M8 | AdminTenantsPage `createdAt` column renders raw ISO string | `AdminTenantsPage.tsx` | 31 |
| M9 | `tierColor()` defined twice with different tier mappings | `DashboardPage.tsx`, `LicensePage.tsx` | 12-18, 25-33 |
| M10 | "Not included" feature badge uses `color="auto"` instead of muted/neutral | `LicensePage.tsx` | 133 |
### Low Priority
| # | Issue | File(s) | Line(s) |
|---|-------|---------|---------|
| L1 | Three "Open Server Dashboard" buttons/links on dashboard | `Layout.tsx`, `DashboardPage.tsx` | 112-116, 81-87, 119-125 |
| L2 | Inconsistent loading style (inline styles vs Tailwind) | Auth files vs pages | Multiple |
| L3 | No Skeleton loading used (all Spinner) | All pages | - |
| L4 | Sidebar collapse disabled (no-op handler) | `Layout.tsx` | 71 |
| L5 | Sign-in page missing ThemeProvider wrapper | `sign-in/src/main.tsx` | 6-9 |
| L6 | Sign-in page has no forgot-password or sign-up link | `sign-in/src/SignInPage.tsx` | - |
| L7 | Custom SVG icons in Layout instead of lucide-react | `Layout.tsx` | 26-62 |
| L8 | Username null = no logout button visible | `Layout.tsx` | 125-126 |
| L9 | Page padding `p-6` repeated per-page instead of in Layout | All pages | - |

Binary file not shown.

After

Width:  |  Height:  |  Size: 43 KiB

BIN
audit/verify-02-license.png Normal file

Binary file not shown.

After

Width:  |  Height:  |  Size: 47 KiB

4974
ci-docker-log.txt Normal file

File diff suppressed because it is too large Load Diff

View File

@@ -1,21 +0,0 @@
# Development overrides: exposes ports for direct access
# Usage: docker compose -f docker-compose.yml -f docker-compose.dev.yml up
services:
postgres:
ports:
- "5432:5432"
logto:
ports:
- "3001:3001"
- "3002:3002"
cameleer-saas:
ports:
- "8080:8080"
environment:
SPRING_PROFILES_ACTIVE: dev
clickhouse:
ports:
- "8123:8123"

View File

@@ -1,122 +1,23 @@
# Dev overrides — layered on top of installer/templates/ via COMPOSE_FILE in .env
# Usage: docker compose up (reads .env automatically)
services: services:
traefik: cameleer-postgres:
image: traefik:v3
restart: unless-stopped
ports: ports:
- "80:80" - "5432:5432"
- "443:443"
volumes:
- /var/run/docker.sock:/var/run/docker.sock:ro
- ./traefik.yml:/etc/traefik/traefik.yml:ro
- acme:/etc/traefik/acme
networks:
- cameleer
postgres: cameleer-clickhouse:
image: postgres:16-alpine ports:
restart: unless-stopped - "8123:8123"
environment:
POSTGRES_DB: ${POSTGRES_DB:-cameleer_saas}
POSTGRES_USER: ${POSTGRES_USER:-cameleer}
POSTGRES_PASSWORD: ${POSTGRES_PASSWORD:-cameleer_dev}
volumes:
- pgdata:/var/lib/postgresql/data
- ./docker/init-databases.sh:/docker-entrypoint-initdb.d/init-databases.sh:ro
healthcheck:
test: ["CMD-SHELL", "pg_isready -U ${POSTGRES_USER:-cameleer}"]
interval: 5s
timeout: 5s
retries: 5
networks:
- cameleer
logto: cameleer-logto:
image: ghcr.io/logto-io/logto:latest ports:
restart: unless-stopped - "3001:3001"
depends_on:
postgres:
condition: service_healthy
entrypoint: ["sh", "-c", "npm run cli db seed -- --swe && npm start"]
environment:
DB_URL: postgres://${POSTGRES_USER:-cameleer}:${POSTGRES_PASSWORD:-cameleer_dev}@postgres:5432/logto
ENDPOINT: ${LOGTO_ENDPOINT:-http://localhost:3001}
ADMIN_ENDPOINT: ${LOGTO_ADMIN_ENDPOINT:-http://localhost:3002}
TRUST_PROXY_HEADER: 1
labels:
- traefik.enable=true
- traefik.http.routers.logto.rule=PathPrefix(`/oidc`) || PathPrefix(`/interaction`)
- traefik.http.services.logto.loadbalancer.server.port=3001
networks:
- cameleer
cameleer-saas: cameleer-saas:
image: ${CAMELEER_IMAGE:-gitea.siegeln.net/cameleer/cameleer-saas}:${VERSION:-latest} ports:
restart: unless-stopped - "8080:8080"
depends_on:
postgres:
condition: service_healthy
volumes: volumes:
- /var/run/docker.sock:/var/run/docker.sock - ./ui/dist:/app/static
- ./keys:/etc/cameleer/keys:ro
environment: environment:
SPRING_DATASOURCE_URL: jdbc:postgresql://postgres:5432/${POSTGRES_DB:-cameleer_saas} SPRING_PROFILES_ACTIVE: dev
SPRING_DATASOURCE_USERNAME: ${POSTGRES_USER:-cameleer} SPRING_WEB_RESOURCES_STATIC_LOCATIONS: file:/app/static/,classpath:/static/
SPRING_DATASOURCE_PASSWORD: ${POSTGRES_PASSWORD:-cameleer_dev}
LOGTO_ENDPOINT: ${LOGTO_ENDPOINT:-http://logto:3001}
LOGTO_ISSUER_URI: ${LOGTO_ISSUER_URI:-http://logto:3001/oidc}
LOGTO_JWK_SET_URI: ${LOGTO_JWK_SET_URI:-http://logto:3001/oidc/jwks}
LOGTO_M2M_CLIENT_ID: ${LOGTO_M2M_CLIENT_ID:-}
LOGTO_M2M_CLIENT_SECRET: ${LOGTO_M2M_CLIENT_SECRET:-}
CAMELEER_JWT_PRIVATE_KEY_PATH: ${CAMELEER_JWT_PRIVATE_KEY_PATH:-}
CAMELEER_JWT_PUBLIC_KEY_PATH: ${CAMELEER_JWT_PUBLIC_KEY_PATH:-}
labels:
- traefik.enable=true
- traefik.http.routers.api.rule=PathPrefix(`/api`)
- traefik.http.services.api.loadbalancer.server.port=8080
- traefik.http.routers.forwardauth.rule=Path(`/auth/verify`)
- traefik.http.services.forwardauth.loadbalancer.server.port=8080
networks:
- cameleer
cameleer3-server:
image: ${CAMELEER3_SERVER_IMAGE:-gitea.siegeln.net/cameleer/cameleer3-server}:${VERSION:-latest}
restart: unless-stopped
depends_on:
postgres:
condition: service_healthy
clickhouse:
condition: service_started
environment:
SPRING_DATASOURCE_URL: jdbc:postgresql://postgres:5432/${POSTGRES_DB:-cameleer_saas}
CLICKHOUSE_URL: jdbc:clickhouse://clickhouse:8123/cameleer
labels:
- traefik.enable=true
- traefik.http.routers.observe.rule=PathPrefix(`/observe`)
- traefik.http.routers.observe.middlewares=forward-auth
- traefik.http.middlewares.forward-auth.forwardauth.address=http://cameleer-saas:8080/auth/verify
- traefik.http.middlewares.forward-auth.forwardauth.authResponseHeaders=X-Tenant-Id,X-User-Id,X-User-Email
- traefik.http.services.observe.loadbalancer.server.port=8080
networks:
- cameleer
clickhouse:
image: clickhouse/clickhouse-server:latest
restart: unless-stopped
volumes:
- chdata:/var/lib/clickhouse
healthcheck:
test: ["CMD-SHELL", "clickhouse-client --query 'SELECT 1'"]
interval: 10s
timeout: 5s
retries: 3
networks:
- cameleer
networks:
cameleer:
driver: bridge
volumes:
pgdata:
chdata:
acme:

94
docker/CLAUDE.md Normal file
View File

@@ -0,0 +1,94 @@
# Docker & Infrastructure
## Routing (single-domain, path-based via Traefik)
All services on one hostname. Infrastructure containers (Traefik, Logto) use `PUBLIC_HOST` + `PUBLIC_PROTOCOL` env vars directly. The SaaS app reads these via `CAMELEER_SAAS_PROVISIONING_PUBLICHOST` / `CAMELEER_SAAS_PROVISIONING_PUBLICPROTOCOL` (Spring Boot properties `cameleer.saas.provisioning.publichost` / `cameleer.saas.provisioning.publicprotocol`).
| Path | Target | Notes |
|------|--------|-------|
| `/platform/*` | cameleer-saas:8080 | SPA + API (`server.servlet.context-path: /platform`) |
| `/platform/vendor/*` | (SPA routes) | Vendor console (platform:admin) |
| `/platform/tenant/*` | (SPA routes) | Tenant admin portal (org-scoped) |
| `/t/{slug}/*` | per-tenant server-ui | Provisioned tenant UI containers (Traefik labels) |
| `/` | redirect -> `/platform/` | Via `docker/traefik-dynamic.yml` |
| `/*` (catch-all) | cameleer-logto:3001 (priority=1) | Custom sign-in UI, OIDC, interaction |
- SPA assets at `/_app/` (Vite `assetsDir: '_app'`) to avoid conflict with Logto's `/assets/`
- Logto `ENDPOINT` = `${PUBLIC_PROTOCOL}://${PUBLIC_HOST}` (same domain, same origin)
- TLS: `traefik-certs` init container generates self-signed cert (dev) or copies user-supplied cert via `CERT_FILE`/`KEY_FILE`/`CA_FILE` env vars. Default cert configured in `docker/traefik-dynamic.yml` (NOT static `traefik.yml` — Traefik v3 ignores `tls.stores.default` in static config). Runtime cert replacement via vendor UI (stage/activate/restore). ACME for production (future). Server containers import `/certs/ca.pem` into JVM truststore at startup via `docker-entrypoint.sh` for OIDC trust.
- Root `/` -> `/platform/` redirect via Traefik file provider (`docker/traefik-dynamic.yml`)
- LoginPage auto-redirects to Logto OIDC (no intermediate button)
- Per-tenant server containers get Traefik labels for `/t/{slug}/*` routing at provisioning time
## Docker Networks
Compose-defined networks:
| Network | Name on Host | Purpose |
|---------|-------------|---------|
| `cameleer` | `cameleer-saas_cameleer` | Compose default — shared services (DB, Logto, SaaS) |
| `cameleer-traefik` | `cameleer-traefik` (fixed `name:`) | Traefik + provisioned tenant containers |
Per-tenant networks (created dynamically by `DockerTenantProvisioner`):
| Network | Name Pattern | Purpose |
|---------|-------------|---------|
| Tenant network | `cameleer-tenant-{slug}` | Internal bridge, no internet — isolates tenant server + apps |
| Environment network | `cameleer-env-{tenantId}-{envSlug}` | Tenant-scoped (includes tenantId to prevent slug collision across tenants) |
Server containers join three networks: tenant network (primary), shared services network (`cameleer`), and traefik network. Apps deployed by the server use the tenant network as primary.
**Backend IP resolution:** Traefik's Docker provider is configured with `network: cameleer-traefik` (static `traefik.yml`). Every cameleer-managed container — saas-provisioned tenant containers (via `DockerTenantProvisioner`) and cameleer-server's per-app containers (via `DockerNetworkManager`) — is attached to `cameleer-traefik` at creation, so Traefik always resolves a reachable backend IP. Provisioned tenant containers additionally emit a `traefik.docker.network=cameleer-traefik` label as per-service defense-in-depth. (Pre-2026-04-23 the static config pointed at `network: cameleer`, a name that never matched any real network — that produced 504 Gateway Timeout on every managed app until the Traefik image was rebuilt.)
## Custom sign-in UI (`ui/sign-in/`)
Separate Vite+React SPA replacing Logto's default sign-in page. Supports both sign-in and self-service registration.
- Built as custom Logto Docker image (`cameleer-logto`): `ui/sign-in/Dockerfile` = node build stage + `FROM ghcr.io/logto-io/logto:latest` + install official connectors (SMTP) + COPY dist over `/etc/logto/packages/experience/dist/`
- Uses `@cameleer/design-system` components (Card, Input, Button, FormField, Alert)
- **Sign-in**: Logto Experience API (4-step: init -> verify password -> identify -> submit -> redirect). Auto-detects email vs username identifier.
- **Registration**: 2-phase flow. Phase 1: init Register -> send verification code to email. Phase 2: verify code -> set password -> identify (creates user) -> submit -> redirect.
- Reads `first_screen=register` from URL query params to show register form initially (set by `@logto/react` SDK's `firstScreen` option)
- `CUSTOM_UI_PATH` env var does NOT work for Logto OSS — must volume-mount or replace the experience dist directory
- Favicon bundled in `ui/sign-in/public/favicon.svg` (served by Logto, not SaaS)
## Deployment pipeline
App deployment is handled by the cameleer-server's `DeploymentExecutor` (7-stage async flow):
1. PRE_FLIGHT — validate config, check JAR exists
2. PULL_IMAGE — pull base image if missing
3. CREATE_NETWORK — ensure cameleer-traefik and cameleer-env-{slug} networks
4. START_REPLICAS — create N containers with Traefik labels
5. HEALTH_CHECK — poll `/cameleer/health` on agent port 9464
6. SWAP_TRAFFIC — stop old deployment (blue/green)
7. COMPLETE — mark RUNNING or DEGRADED
Key files:
- `DeploymentExecutor.java` (in cameleer-server) — async staged deployment, runtime type auto-detection
- `DockerRuntimeOrchestrator.java` (in cameleer-server) — Docker client, container lifecycle, builds runtime-type-specific entrypoints (spring-boot uses `-cp` + `PropertiesLauncher` with `-Dloader.path` for log appender; quarkus uses `-jar`; plain-java uses `-cp` + detected main class; native exec directly). Overrides the Dockerfile ENTRYPOINT.
- `docker/runtime-base/Dockerfile` — base image with agent JAR + `cameleer-log-appender.jar` + JRE. The Dockerfile ENTRYPOINT (`-jar /app/app.jar`) is a fallback — `DockerRuntimeOrchestrator` overrides it at container creation.
- `RuntimeDetector.java` (in cameleer-server) — detects runtime type from JAR manifest `Main-Class`; derives correct `PropertiesLauncher` package (Spring Boot 3.2+ vs pre-3.2)
- `ServerApiClient.java` — M2M token acquisition for SaaS->server API calls (agent status). Uses `X-Cameleer-Protocol-Version: 1` header
- Docker socket access: `group_add: ["0"]` in docker-compose.dev.yml (not root group membership in Dockerfile)
- Network: deployed containers join `cameleer-tenant-{slug}` (primary, isolation) + `cameleer-traefik` (routing) + `cameleer-env-{tenantId}-{envSlug}` (environment isolation)
## Bootstrap (`docker/logto-bootstrap.sh`)
Idempotent script run inside the Logto container entrypoint. **Clean slate** — no example tenant, no viewer user, no server configuration. Phases:
1. Wait for Logto health (no server to wait for — servers are provisioned per-tenant)
2. Get Management API token (reads `m-default` secret from DB)
3. Create Logto apps (SPA, Traditional Web App with `skipConsent`, M2M with Management API role + server API role)
3b. Create API resource scopes (1 platform + 9 tenant + 3 server scopes)
4. Create org roles (owner, operator, viewer with API resource scope assignments) + M2M server role (`cameleer-m2m-server` with `server:admin` scope)
5. Create admin user (SaaS admin with Logto console access)
7b. Configure Logto Custom JWT for access tokens (maps org roles -> `roles` claim: owner->server:admin, operator->server:operator, viewer->server:viewer; saas-vendor global role -> server:admin)
8. Configure Logto sign-in branding (Cameleer colors `#C6820E`/`#D4941E`, logo from `/platform/logo.svg`)
8b. Configure SMTP email connector (if `SMTP_HOST`/`SMTP_USER` env vars set) — discovers factory via `/api/connector-factories`, creates connector with Cameleer-branded HTML email templates for Register/SignIn/ForgotPassword/Generic. Skips gracefully if SMTP not configured.
8c. Enable self-service registration — sets `signInMode: "SignInAndRegister"`, `signUp: { identifiers: ["email"], password: true, verify: true }`, sign-in methods: email+password and username+password (backwards-compatible with admin user).
9. Cleanup seeded Logto apps
10. Write bootstrap results to `/data/logto-bootstrap.json`
12. Create `saas-vendor` global role with all API scopes and assign to admin user (always runs — admin IS the platform admin).
SMTP env vars for email verification: `SMTP_HOST`, `SMTP_PORT` (default 587), `SMTP_USER`, `SMTP_PASS`, `SMTP_FROM_EMAIL` (default `noreply@cameleer.io`). Passed to `cameleer-logto` container via docker-compose. Both installers prompt for these in SaaS mode.
The multi-tenant compose stack is: Traefik + PostgreSQL + ClickHouse + Logto (with bootstrap entrypoint) + cameleer-saas. No `cameleer-server` or `cameleer-server-ui` in compose — those are provisioned per-tenant by `DockerTenantProvisioner`.

View File

@@ -0,0 +1,4 @@
FROM clickhouse/clickhouse-server:latest
COPY init.sql /docker-entrypoint-initdb.d/init.sql
COPY users.xml /etc/clickhouse-server/users.d/default-user.xml
COPY prometheus.xml /etc/clickhouse-server/config.d/prometheus.xml

View File

@@ -0,0 +1 @@
CREATE DATABASE IF NOT EXISTS cameleer;

View File

@@ -0,0 +1,9 @@
<clickhouse>
<prometheus>
<endpoint>/metrics</endpoint>
<port>9363</port>
<metrics>true</metrics>
<events>true</events>
<asynchronous_metrics>true</asynchronous_metrics>
</prometheus>
</clickhouse>

View File

@@ -0,0 +1,16 @@
<clickhouse>
<users>
<default remove="remove">
</default>
<default>
<profile>default</profile>
<networks>
<ip>::/0</ip>
</networks>
<password from_env="CLICKHOUSE_PASSWORD" />
<quota>default</quota>
<access_management>0</access_management>
</default>
</users>
</clickhouse>

View File

@@ -0,0 +1,65 @@
#!/bin/sh
set -e
# Save the real public endpoints for after bootstrap
REAL_ENDPOINT="$ENDPOINT"
REAL_ADMIN_ENDPOINT="$ADMIN_ENDPOINT"
echo "[entrypoint] Seeding Logto database..."
npm run cli db seed -- --swe 2>/dev/null || true
echo "[entrypoint] Deploying database alterations..."
npm run cli db alteration deploy 2>/dev/null || true
# Start Logto with localhost endpoints so it can reach itself without Traefik
export ENDPOINT="http://localhost:3001"
export ADMIN_ENDPOINT="http://localhost:3002"
echo "[entrypoint] Starting Logto (bootstrap mode)..."
npm start &
LOGTO_PID=$!
echo "[entrypoint] Waiting for Logto to be ready..."
for i in $(seq 1 120); do
if node -e "require('http').get('http://localhost:3001/oidc/.well-known/openid-configuration', r => process.exit(r.statusCode === 200 ? 0 : 1)).on('error', () => process.exit(1))" 2>/dev/null; then
echo "[entrypoint] Logto is ready."
break
fi
if [ "$i" -eq 120 ]; then
echo "[entrypoint] ERROR: Logto not ready after 120s"
exit 1
fi
sleep 1
done
# Run bootstrap — use localhost endpoints, skip Host headers (BOOTSTRAP_LOCAL flag)
# PUBLIC_HOST and PUBLIC_PROTOCOL stay real for redirect URI generation
BOOTSTRAP_FILE="/data/logto-bootstrap.json"
export LOGTO_ENDPOINT="http://localhost:3001"
export LOGTO_ADMIN_ENDPOINT="http://localhost:3002"
export BOOTSTRAP_LOCAL="true"
if [ -f "$BOOTSTRAP_FILE" ]; then
CACHED_SECRET=$(jq -r '.m2mClientSecret // empty' "$BOOTSTRAP_FILE" 2>/dev/null)
CACHED_SPA=$(jq -r '.spaClientId // empty' "$BOOTSTRAP_FILE" 2>/dev/null)
if [ -n "$CACHED_SECRET" ] && [ -n "$CACHED_SPA" ]; then
echo "[entrypoint] Bootstrap already complete."
else
echo "[entrypoint] Incomplete bootstrap found, re-running..."
/scripts/logto-bootstrap.sh
fi
else
echo "[entrypoint] Running bootstrap..."
/scripts/logto-bootstrap.sh
fi
# Restart Logto with real public endpoints
echo "[entrypoint] Bootstrap done. Restarting Logto with public endpoints..."
kill $LOGTO_PID 2>/dev/null || true
wait $LOGTO_PID 2>/dev/null || true
export ENDPOINT="$REAL_ENDPOINT"
export ADMIN_ENDPOINT="$REAL_ADMIN_ENDPOINT"
echo "[entrypoint] Starting Logto (production mode)..."
exec npm start

View File

@@ -0,0 +1,3 @@
FROM postgres:16-alpine
COPY init-databases.sh /docker-entrypoint-initdb.d/init-databases.sh
RUN chmod +x /docker-entrypoint-initdb.d/init-databases.sh

View File

@@ -0,0 +1,9 @@
#!/bin/bash
set -e
psql -v ON_ERROR_STOP=1 --username "$POSTGRES_USER" --dbname "$POSTGRES_DB" <<-EOSQL
CREATE DATABASE logto;
CREATE DATABASE cameleer;
GRANT ALL PRIVILEGES ON DATABASE logto TO $POSTGRES_USER;
GRANT ALL PRIVILEGES ON DATABASE cameleer TO $POSTGRES_USER;
EOSQL

View File

@@ -0,0 +1,7 @@
FROM traefik:v3
RUN apk add --no-cache openssl
COPY traefik.yml /etc/traefik/traefik.yml
COPY traefik-dynamic.yml /etc/traefik/dynamic.yml
COPY entrypoint.sh /entrypoint.sh
RUN chmod +x /entrypoint.sh
ENTRYPOINT ["/entrypoint.sh"]

View File

@@ -0,0 +1,68 @@
#!/bin/sh
set -e
CERTS_DIR="/certs"
# Skip if certs already exist (idempotent)
if [ ! -f "$CERTS_DIR/cert.pem" ]; then
mkdir -p "$CERTS_DIR"
if [ -n "$CERT_FILE" ] && [ -n "$KEY_FILE" ]; then
# User-supplied certificate
echo "[certs] Installing user-supplied certificate..."
cp "$CERT_FILE" "$CERTS_DIR/cert.pem"
cp "$KEY_FILE" "$CERTS_DIR/key.pem"
if [ -n "$CA_FILE" ]; then
cp "$CA_FILE" "$CERTS_DIR/ca.pem"
fi
# Validate key matches cert
CERT_MOD=$(openssl x509 -noout -modulus -in "$CERTS_DIR/cert.pem" 2>/dev/null | md5sum)
KEY_MOD=$(openssl rsa -noout -modulus -in "$CERTS_DIR/key.pem" 2>/dev/null | md5sum)
if [ "$CERT_MOD" != "$KEY_MOD" ]; then
echo "[certs] ERROR: Certificate and key do not match!"
rm -f "$CERTS_DIR/cert.pem" "$CERTS_DIR/key.pem" "$CERTS_DIR/ca.pem"
exit 1
fi
SELF_SIGNED=false
echo "[certs] Installed user-supplied certificate."
else
# Generate self-signed certificate
HOST="${PUBLIC_HOST:-localhost}"
AUTH="${AUTH_HOST:-$HOST}"
echo "[certs] Generating self-signed certificate for $HOST..."
# Build SAN list; deduplicate when AUTH_HOST equals PUBLIC_HOST
if [ "$AUTH" = "$HOST" ]; then
SAN="DNS:$HOST,DNS:*.$HOST"
else
SAN="DNS:$HOST,DNS:*.$HOST,DNS:$AUTH,DNS:*.$AUTH"
echo "[certs] (+ auth domain: $AUTH)"
fi
openssl req -x509 -newkey rsa:4096 \
-keyout "$CERTS_DIR/key.pem" -out "$CERTS_DIR/cert.pem" \
-days 365 -nodes \
-subj "/CN=$HOST" \
-addext "subjectAltName=$SAN"
SELF_SIGNED=true
echo "[certs] Generated self-signed certificate for $HOST."
fi
# Write metadata for SaaS app to seed DB
SUBJECT=$(openssl x509 -noout -subject -in "$CERTS_DIR/cert.pem" 2>/dev/null | sed 's/subject=//')
FINGERPRINT=$(openssl x509 -noout -fingerprint -sha256 -in "$CERTS_DIR/cert.pem" 2>/dev/null | sed 's/.*=//')
NOT_BEFORE=$(openssl x509 -noout -startdate -in "$CERTS_DIR/cert.pem" 2>/dev/null | sed 's/notBefore=//')
NOT_AFTER=$(openssl x509 -noout -enddate -in "$CERTS_DIR/cert.pem" 2>/dev/null | sed 's/notAfter=//')
HAS_CA=false
[ -f "$CERTS_DIR/ca.pem" ] && HAS_CA=true
cat > "$CERTS_DIR/meta.json" <<METAEOF
{"subject":"$SUBJECT","fingerprint":"$FINGERPRINT","selfSigned":$SELF_SIGNED,"hasCa":$HAS_CA,"notBefore":"$NOT_BEFORE","notAfter":"$NOT_AFTER"}
METAEOF
mkdir -p "$CERTS_DIR/staged" "$CERTS_DIR/prev"
chmod 775 "$CERTS_DIR" "$CERTS_DIR/staged" "$CERTS_DIR/prev"
chmod 660 "$CERTS_DIR"/*.pem 2>/dev/null || true
else
echo "[certs] Certificates already exist, skipping generation."
fi
# Start Traefik
exec traefik "$@"

View File

@@ -0,0 +1,6 @@
tls:
stores:
default:
defaultCertificate:
certFile: /certs/cert.pem
keyFile: /certs/key.pem

View File

@@ -0,0 +1,23 @@
api:
dashboard: false
entryPoints:
web:
address: ":80"
http:
redirections:
entryPoint:
to: websecure
scheme: https
websecure:
address: ":443"
admin-console:
address: ":3002"
providers:
docker:
endpoint: "unix:///var/run/docker.sock"
exposedByDefault: false
network: cameleer-traefik
file:
filename: /etc/traefik/dynamic.yml

View File

@@ -1,7 +0,0 @@
#!/bin/bash
set -e
psql -v ON_ERROR_STOP=1 --username "$POSTGRES_USER" <<-EOSQL
CREATE DATABASE logto;
GRANT ALL PRIVILEGES ON DATABASE logto TO $POSTGRES_USER;
EOSQL

760
docker/logto-bootstrap.sh Normal file
View File

@@ -0,0 +1,760 @@
#!/bin/sh
set -e
# Cameleer SaaS — Bootstrap Script
# Creates Logto apps, users, organizations, roles.
# Seeds cameleer_saas DB with tenant, environment, license.
# Configures cameleer-server OIDC.
# Idempotent: checks existence before creating.
LOGTO_ENDPOINT="${LOGTO_ENDPOINT:-http://cameleer-logto:3001}"
LOGTO_ADMIN_ENDPOINT="${LOGTO_ADMIN_ENDPOINT:-http://cameleer-logto:3002}"
LOGTO_PUBLIC_ENDPOINT="${LOGTO_PUBLIC_ENDPOINT:-http://localhost:3001}"
MGMT_API_RESOURCE="https://default.logto.app/api"
BOOTSTRAP_FILE="/data/logto-bootstrap.json"
PG_HOST="${PG_HOST:-cameleer-postgres}"
PG_USER="${PG_USER:-cameleer}"
PG_DB_LOGTO="logto"
PG_DB_SAAS="${PG_DB_SAAS:-cameleer_saas}"
# App names
SPA_APP_NAME="Cameleer SaaS"
M2M_APP_NAME="Cameleer SaaS Backend"
TRAD_APP_NAME="Cameleer Dashboard"
API_RESOURCE_INDICATOR="https://api.cameleer.local"
API_RESOURCE_NAME="Cameleer SaaS API"
# Users (configurable via env vars)
SAAS_ADMIN_USER="${SAAS_ADMIN_USER:-admin}"
SAAS_ADMIN_PASS="${SAAS_ADMIN_PASS:-admin}"
# No server config — servers are provisioned dynamically by the admin console
# Redirect URIs (derived from PUBLIC_HOST and PUBLIC_PROTOCOL)
HOST="${PUBLIC_HOST:-localhost}"
AUTH="${AUTH_HOST:-$HOST}"
PROTO="${PUBLIC_PROTOCOL:-https}"
SPA_REDIRECT_URIS="[\"${PROTO}://${HOST}/platform/callback\"]"
SPA_POST_LOGOUT_URIS="[\"${PROTO}://${HOST}/platform/login\",\"${PROTO}://${HOST}/platform/\"]"
TRAD_REDIRECT_URIS="[\"${PROTO}://${HOST}/oidc/callback\",\"${PROTO}://${HOST}/server/oidc/callback\"]"
TRAD_POST_LOGOUT_URIS="[\"${PROTO}://${HOST}\",\"${PROTO}://${HOST}/server\",\"${PROTO}://${HOST}/server/login?local\"]"
log() { echo "[bootstrap] $1"; }
pgpass() { PGPASSWORD="${PG_PASSWORD:-cameleer_dev}"; export PGPASSWORD; }
# When BOOTSTRAP_LOCAL=true (running inside Logto container with localhost endpoints),
# skip Host/X-Forwarded-Proto headers — they cause issuer mismatches with localhost
if [ "$BOOTSTRAP_LOCAL" = "true" ]; then
HOST_ARGS=""
ADMIN_HOST_ARGS=""
else
# Logto validates Host header against its ENDPOINT, which uses AUTH_HOST
HOST_ARGS="-H Host:${AUTH}"
ADMIN_HOST_ARGS="-H Host:${AUTH}:3002 -H X-Forwarded-Proto:https"
fi
# Install jq + curl if not already available (deps are baked into cameleer-logto image)
if ! command -v jq >/dev/null 2>&1 || ! command -v curl >/dev/null 2>&1; then
if command -v apk >/dev/null 2>&1; then
apk add --no-cache jq curl >/dev/null 2>&1
elif command -v apt-get >/dev/null 2>&1; then
apt-get update -qq && apt-get install -y -qq jq curl >/dev/null 2>&1
fi
fi
# Read cached secrets from previous run
if [ -f "$BOOTSTRAP_FILE" ]; then
CACHED_M2M_SECRET=$(jq -r '.m2mClientSecret // empty' "$BOOTSTRAP_FILE" 2>/dev/null)
CACHED_TRAD_SECRET=$(jq -r '.tradAppSecret // empty' "$BOOTSTRAP_FILE" 2>/dev/null)
CACHED_SPA_ID=$(jq -r '.spaClientId // empty' "$BOOTSTRAP_FILE" 2>/dev/null)
log "Found cached bootstrap file"
if [ -n "$CACHED_M2M_SECRET" ] && [ -n "$CACHED_SPA_ID" ]; then
log "Bootstrap already complete — skipping. Delete $BOOTSTRAP_FILE to force re-run."
exit 0
fi
fi
# ============================================================
# PHASE 1: Wait for services
# ============================================================
log "Waiting for Logto..."
for i in $(seq 1 60); do
if curl -sf "${LOGTO_ENDPOINT}/oidc/.well-known/openid-configuration" >/dev/null 2>&1; then
log "Logto is ready."
break
fi
[ "$i" -eq 60 ] && { log "ERROR: Logto not ready after 60s"; exit 1; }
sleep 1
done
# No server wait — servers are provisioned dynamically by the admin console
# ============================================================
# PHASE 2: Get Management API token
# ============================================================
log "Reading m-default secret from database..."
pgpass
M_DEFAULT_SECRET=$(psql -h "$PG_HOST" -U "$PG_USER" -d "$PG_DB_LOGTO" -t -A -c \
"SELECT secret FROM applications WHERE id = 'm-default' AND tenant_id = 'admin';")
[ -z "$M_DEFAULT_SECRET" ] && { log "ERROR: m-default app not found"; exit 1; }
get_admin_token() {
curl -s -X POST "${LOGTO_ADMIN_ENDPOINT}/oidc/token" \
-H "Content-Type: application/x-www-form-urlencoded" \
$ADMIN_HOST_ARGS \
-d "grant_type=client_credentials&client_id=${1}&client_secret=${2}&resource=${MGMT_API_RESOURCE}&scope=all"
}
get_default_token() {
curl -s -X POST "${LOGTO_ENDPOINT}/oidc/token" \
-H "Content-Type: application/x-www-form-urlencoded" \
$HOST_ARGS \
-d "grant_type=client_credentials&client_id=${1}&client_secret=${2}&resource=${MGMT_API_RESOURCE}&scope=all"
}
log "Getting Management API token..."
TOKEN_RESPONSE=$(get_admin_token "m-default" "$M_DEFAULT_SECRET")
TOKEN=$(echo "$TOKEN_RESPONSE" | jq -r '.access_token' 2>/dev/null)
[ -z "$TOKEN" ] || [ "$TOKEN" = "null" ] && { log "ERROR: Failed to get token"; exit 1; }
log "Got Management API token."
# Verify Management API is fully ready (Logto may still be initializing internally)
log "Verifying Management API is responsive..."
for i in $(seq 1 30); do
VERIFY_RESPONSE=$(curl -s -H "Authorization: Bearer $TOKEN" $HOST_ARGS "${LOGTO_ENDPOINT}/api/roles" 2>/dev/null)
if echo "$VERIFY_RESPONSE" | jq -e 'type == "array"' >/dev/null 2>&1; then
log "Management API is ready."
break
fi
[ "$i" -eq 30 ] && { log "ERROR: Management API not responsive after 30s"; exit 1; }
sleep 1
done
# --- Helper: Logto API calls ---
api_get() {
curl -s -H "Authorization: Bearer $TOKEN" $HOST_ARGS "${LOGTO_ENDPOINT}${1}" 2>/dev/null || echo "[]"
}
api_post() {
curl -s -X POST -H "Authorization: Bearer $TOKEN" -H "Content-Type: application/json" $HOST_ARGS \
-d "$2" "${LOGTO_ENDPOINT}${1}" 2>/dev/null || true
}
api_put() {
curl -s -X PUT -H "Authorization: Bearer $TOKEN" -H "Content-Type: application/json" $HOST_ARGS \
-d "$2" "${LOGTO_ENDPOINT}${1}" 2>/dev/null || true
}
api_delete() {
curl -s -X DELETE -H "Authorization: Bearer $TOKEN" $HOST_ARGS "${LOGTO_ENDPOINT}${1}" 2>/dev/null || true
}
api_patch() {
curl -s -X PATCH -H "Authorization: Bearer $TOKEN" -H "Content-Type: application/json" $HOST_ARGS \
-d "$2" "${LOGTO_ENDPOINT}${1}" 2>/dev/null || true
}
# ============================================================
# PHASE 3: Create Logto applications
# ============================================================
EXISTING_APPS=$(api_get "/api/applications")
# --- SPA app (for SaaS frontend) ---
SPA_ID=$(echo "$EXISTING_APPS" | jq -r ".[] | select(.name == \"$SPA_APP_NAME\" and .type == \"SPA\") | .id")
if [ -n "$SPA_ID" ]; then
log "SPA app exists: $SPA_ID"
else
log "Creating SPA app..."
SPA_RESPONSE=$(api_post "/api/applications" "{
\"name\": \"$SPA_APP_NAME\",
\"type\": \"SPA\",
\"oidcClientMetadata\": {
\"redirectUris\": $SPA_REDIRECT_URIS,
\"postLogoutRedirectUris\": $SPA_POST_LOGOUT_URIS
}
}")
SPA_ID=$(echo "$SPA_RESPONSE" | jq -r '.id')
log "Created SPA app: $SPA_ID"
fi
# --- Traditional Web App (for cameleer-server OIDC) ---
TRAD_ID=$(echo "$EXISTING_APPS" | jq -r ".[] | select(.name == \"$TRAD_APP_NAME\" and .type == \"Traditional\") | .id")
TRAD_SECRET=""
if [ -n "$TRAD_ID" ]; then
log "Traditional app exists: $TRAD_ID"
TRAD_SECRET="${CACHED_TRAD_SECRET:-}"
else
log "Creating Traditional Web app..."
TRAD_RESPONSE=$(api_post "/api/applications" "{
\"name\": \"$TRAD_APP_NAME\",
\"type\": \"Traditional\",
\"oidcClientMetadata\": {
\"redirectUris\": $TRAD_REDIRECT_URIS,
\"postLogoutRedirectUris\": $TRAD_POST_LOGOUT_URIS
}
}")
TRAD_ID=$(echo "$TRAD_RESPONSE" | jq -r '.id')
TRAD_SECRET=$(echo "$TRAD_RESPONSE" | jq -r '.secret')
[ "$TRAD_SECRET" = "null" ] && TRAD_SECRET=""
log "Created Traditional app: $TRAD_ID"
fi
# Enable skip consent for the Traditional app (first-party SSO)
api_put "/api/applications/$TRAD_ID" '{"isThirdParty": false, "customClientMetadata": {"alwaysIssueRefreshToken": true, "skipConsent": true}}' >/dev/null 2>&1
log "Traditional app: skip consent enabled."
# --- API resource ---
EXISTING_RESOURCES=$(api_get "/api/resources")
API_RESOURCE_ID=$(echo "$EXISTING_RESOURCES" | jq -r ".[] | select(.indicator == \"$API_RESOURCE_INDICATOR\") | .id")
if [ -n "$API_RESOURCE_ID" ]; then
log "API resource exists: $API_RESOURCE_ID"
else
log "Creating API resource..."
RESOURCE_RESPONSE=$(api_post "/api/resources" "{
\"name\": \"$API_RESOURCE_NAME\",
\"indicator\": \"$API_RESOURCE_INDICATOR\"
}")
API_RESOURCE_ID=$(echo "$RESOURCE_RESPONSE" | jq -r '.id')
log "Created API resource: $API_RESOURCE_ID"
fi
# ============================================================
# PHASE 3b: Create API resource scopes
# ============================================================
log "Creating API resource scopes..."
EXISTING_SCOPES=$(api_get "/api/resources/${API_RESOURCE_ID}/scopes")
create_scope() {
local name="$1"
local desc="$2"
local existing_id=$(echo "$EXISTING_SCOPES" | jq -r ".[] | select(.name == \"$name\") | .id")
if [ -n "$existing_id" ]; then
log " Scope '$name' exists: $existing_id" >&2
echo "$existing_id"
else
local resp=$(api_post "/api/resources/${API_RESOURCE_ID}/scopes" "{\"name\": \"$name\", \"description\": \"$desc\"}")
local new_id=$(echo "$resp" | jq -r '.id')
log " Created scope '$name': $new_id" >&2
echo "$new_id"
fi
}
# Platform-level scope
SCOPE_PLATFORM_ADMIN=$(create_scope "platform:admin" "SaaS platform administration")
# Tenant-level scopes
SCOPE_TENANT_MANAGE=$(create_scope "tenant:manage" "Manage tenant settings")
SCOPE_BILLING_MANAGE=$(create_scope "billing:manage" "Manage billing")
SCOPE_TEAM_MANAGE=$(create_scope "team:manage" "Manage team members")
SCOPE_APPS_MANAGE=$(create_scope "apps:manage" "Create and delete apps")
SCOPE_APPS_DEPLOY=$(create_scope "apps:deploy" "Deploy apps")
SCOPE_SECRETS_MANAGE=$(create_scope "secrets:manage" "Manage secrets")
SCOPE_OBSERVE_READ=$(create_scope "observe:read" "View observability data")
SCOPE_OBSERVE_DEBUG=$(create_scope "observe:debug" "Debug and replay operations")
SCOPE_SETTINGS_MANAGE=$(create_scope "settings:manage" "Manage settings")
# Server-level scopes (mapped to server RBAC roles via JWT scope claim)
SCOPE_SERVER_ADMIN=$(create_scope "server:admin" "Full server access")
SCOPE_SERVER_OPERATOR=$(create_scope "server:operator" "Deploy and manage apps in server")
SCOPE_SERVER_VIEWER=$(create_scope "server:viewer" "Read-only server observability")
# Collect scope IDs for role assignment
# Owner: full tenant control
OWNER_SCOPE_IDS="\"$SCOPE_TENANT_MANAGE\",\"$SCOPE_BILLING_MANAGE\",\"$SCOPE_TEAM_MANAGE\",\"$SCOPE_APPS_MANAGE\",\"$SCOPE_APPS_DEPLOY\",\"$SCOPE_SECRETS_MANAGE\",\"$SCOPE_OBSERVE_READ\",\"$SCOPE_OBSERVE_DEBUG\",\"$SCOPE_SETTINGS_MANAGE\",\"$SCOPE_SERVER_ADMIN\""
# Operator: app lifecycle + observability (no billing/team/secrets/settings)
OPERATOR_SCOPE_IDS="\"$SCOPE_APPS_MANAGE\",\"$SCOPE_APPS_DEPLOY\",\"$SCOPE_OBSERVE_READ\",\"$SCOPE_OBSERVE_DEBUG\",\"$SCOPE_SERVER_OPERATOR\""
# Viewer: read-only observability
VIEWER_SCOPE_IDS="\"$SCOPE_OBSERVE_READ\",\"$SCOPE_SERVER_VIEWER\""
# Vendor (saas-vendor global role): platform:admin + all tenant scopes
ALL_SCOPE_IDS="\"$SCOPE_PLATFORM_ADMIN\",$OWNER_SCOPE_IDS"
# --- M2M app ---
M2M_ID=$(echo "$EXISTING_APPS" | jq -r ".[] | select(.name == \"$M2M_APP_NAME\" and .type == \"MachineToMachine\") | .id")
M2M_SECRET=""
if [ -n "$M2M_ID" ]; then
log "M2M app exists: $M2M_ID"
M2M_SECRET="${CACHED_M2M_SECRET:-}"
else
log "Creating M2M app..."
M2M_RESPONSE=$(api_post "/api/applications" "{
\"name\": \"$M2M_APP_NAME\",
\"type\": \"MachineToMachine\"
}")
M2M_ID=$(echo "$M2M_RESPONSE" | jq -r '.id')
M2M_SECRET=$(echo "$M2M_RESPONSE" | jq -r '.secret')
log "Created M2M app: $M2M_ID"
# Assign Management API role
log "Assigning Management API access to M2M app..."
pgpass
MGMT_RESOURCE_ID=$(psql -h "$PG_HOST" -U "$PG_USER" -d "$PG_DB_LOGTO" -t -A -c \
"SELECT id FROM resources WHERE indicator = '$MGMT_API_RESOURCE' AND tenant_id = 'default';")
if [ -n "$MGMT_RESOURCE_ID" ]; then
SCOPE_IDS=$(psql -h "$PG_HOST" -U "$PG_USER" -d "$PG_DB_LOGTO" -t -A -c \
"SELECT json_agg(id) FROM scopes WHERE resource_id = '$MGMT_RESOURCE_ID' AND tenant_id = 'default';" | tr -d '[:space:]')
ROLE_RESPONSE=$(api_post "/api/roles" "{
\"name\": \"cameleer-m2m-management\",
\"description\": \"Full Management API access for Cameleer SaaS\",
\"type\": \"MachineToMachine\",
\"scopeIds\": $SCOPE_IDS
}")
ROLE_ID=$(echo "$ROLE_RESPONSE" | jq -r '.id')
if [ -n "$ROLE_ID" ] && [ "$ROLE_ID" != "null" ]; then
api_post "/api/roles/$ROLE_ID/applications" "{\"applicationIds\": [\"$M2M_ID\"]}" >/dev/null
log "Assigned Management API role to M2M app."
VERIFY=$(get_default_token "$M2M_ID" "$M2M_SECRET")
VERIFY_TOKEN=$(echo "$VERIFY" | jq -r '.access_token')
if [ -n "$VERIFY_TOKEN" ] && [ "$VERIFY_TOKEN" != "null" ]; then
log "Verified M2M app works."
else
log "WARNING: M2M verification failed"
M2M_SECRET=""
fi
fi
fi
fi
# Create M2M role for the Cameleer API resource (server:admin access) — idempotent
EXISTING_M2M_SERVER_ROLE=$(api_get "/api/roles" | jq -r '.[] | select(.name == "cameleer-m2m-server") | .id')
if [ -z "$EXISTING_M2M_SERVER_ROLE" ]; then
log "Creating M2M server access role..."
SERVER_M2M_ROLE_RESPONSE=$(api_post "/api/roles" "{
\"name\": \"cameleer-m2m-server\",
\"description\": \"Server API access for SaaS backend (M2M)\",
\"type\": \"MachineToMachine\",
\"scopeIds\": [\"$SCOPE_SERVER_ADMIN\"]
}")
EXISTING_M2M_SERVER_ROLE=$(echo "$SERVER_M2M_ROLE_RESPONSE" | jq -r '.id')
fi
if [ -n "$EXISTING_M2M_SERVER_ROLE" ] && [ "$EXISTING_M2M_SERVER_ROLE" != "null" ] && [ -n "$M2M_ID" ]; then
api_post "/api/roles/$EXISTING_M2M_SERVER_ROLE/applications" "{\"applicationIds\": [\"$M2M_ID\"]}" >/dev/null 2>&1
log "Assigned server API role to M2M app: $EXISTING_M2M_SERVER_ROLE"
fi
# ============================================================
# PHASE 4: Create roles
# ============================================================
# --- Organization roles: owner, operator, viewer ---
# Note: saas-vendor global role is created in Phase 12 and assigned to the admin user.
log "Creating organization roles..."
EXISTING_ORG_ROLES=$(api_get "/api/organization-roles")
ORG_OWNER_ROLE_ID=$(echo "$EXISTING_ORG_ROLES" | jq -r '.[] | select(.name == "owner") | .id')
if [ -n "$ORG_OWNER_ROLE_ID" ]; then
log "Org owner role exists: $ORG_OWNER_ROLE_ID"
else
ORG_OWNER_RESPONSE=$(api_post "/api/organization-roles" "{
\"name\": \"owner\",
\"description\": \"Platform owner — full tenant control\"
}")
ORG_OWNER_ROLE_ID=$(echo "$ORG_OWNER_RESPONSE" | jq -r '.id')
log "Created org owner role: $ORG_OWNER_ROLE_ID"
fi
ORG_OPERATOR_ROLE_ID=$(echo "$EXISTING_ORG_ROLES" | jq -r '.[] | select(.name == "operator") | .id')
if [ -z "$ORG_OPERATOR_ROLE_ID" ]; then
ORG_OPERATOR_RESPONSE=$(api_post "/api/organization-roles" "{
\"name\": \"operator\",
\"description\": \"Operator — manage apps, deploy, observe\"
}")
ORG_OPERATOR_ROLE_ID=$(echo "$ORG_OPERATOR_RESPONSE" | jq -r '.id')
log "Created org operator role: $ORG_OPERATOR_ROLE_ID"
fi
ORG_VIEWER_ROLE_ID=$(echo "$EXISTING_ORG_ROLES" | jq -r '.[] | select(.name == "viewer") | .id')
if [ -z "$ORG_VIEWER_ROLE_ID" ]; then
ORG_VIEWER_RESPONSE=$(api_post "/api/organization-roles" "{
\"name\": \"viewer\",
\"description\": \"Viewer — read-only observability\"
}")
ORG_VIEWER_ROLE_ID=$(echo "$ORG_VIEWER_RESPONSE" | jq -r '.id')
log "Created org viewer role: $ORG_VIEWER_ROLE_ID"
fi
# Assign API resource scopes to org roles (these appear in org-scoped resource tokens)
log "Assigning API resource scopes to organization roles..."
api_put "/api/organization-roles/${ORG_OWNER_ROLE_ID}/resource-scopes" "{\"scopeIds\": [$OWNER_SCOPE_IDS]}" >/dev/null 2>&1
api_put "/api/organization-roles/${ORG_OPERATOR_ROLE_ID}/resource-scopes" "{\"scopeIds\": [$OPERATOR_SCOPE_IDS]}" >/dev/null 2>&1
api_put "/api/organization-roles/${ORG_VIEWER_ROLE_ID}/resource-scopes" "{\"scopeIds\": [$VIEWER_SCOPE_IDS]}" >/dev/null 2>&1
log "API resource scopes assigned to organization roles."
# ============================================================
# PHASE 5: Create users
# ============================================================
# --- Platform Owner ---
log "Checking for platform owner user '$SAAS_ADMIN_USER'..."
ADMIN_USER_ID=$(api_get "/api/users?search=$SAAS_ADMIN_USER" | jq -r ".[] | select(.username == \"$SAAS_ADMIN_USER\") | .id")
if [ -n "$ADMIN_USER_ID" ]; then
log "Platform owner exists: $ADMIN_USER_ID"
else
log "Creating platform owner '$SAAS_ADMIN_USER'..."
ADMIN_RESPONSE=$(api_post "/api/users" "{
\"username\": \"$SAAS_ADMIN_USER\",
\"password\": \"$SAAS_ADMIN_PASS\",
\"name\": \"Platform Owner\"
}")
ADMIN_USER_ID=$(echo "$ADMIN_RESPONSE" | jq -r '.id')
log "Created platform owner: $ADMIN_USER_ID"
fi
# --- Grant SaaS admin Logto console access (admin tenant, port 3002) ---
log "Granting SaaS admin Logto console access..."
# Get admin-tenant M2M token (m-default token has wrong audience for port 3002)
ADMIN_MGMT_RESOURCE="https://admin.logto.app/api"
log "Reading m-admin secret from database..."
M_ADMIN_SECRET=$(psql -h "$PG_HOST" -U "$PG_USER" -d "$PG_DB_LOGTO" -t -A -c \
"SELECT secret FROM applications WHERE id = 'm-admin' AND tenant_id = 'admin';" 2>/dev/null)
if [ -z "$M_ADMIN_SECRET" ]; then
log "WARNING: m-admin app not found — skipping console access"
else
ADMIN_TOKEN_RESPONSE=$(curl -s -X POST "${LOGTO_ADMIN_ENDPOINT}/oidc/token" \
-H "Content-Type: application/x-www-form-urlencoded" \
$ADMIN_HOST_ARGS \
-d "grant_type=client_credentials&client_id=m-admin&client_secret=${M_ADMIN_SECRET}&resource=${ADMIN_MGMT_RESOURCE}&scope=all")
ADMIN_TOKEN=$(echo "$ADMIN_TOKEN_RESPONSE" | jq -r '.access_token' 2>/dev/null)
if [ -z "$ADMIN_TOKEN" ] || [ "$ADMIN_TOKEN" = "null" ]; then
log "WARNING: Failed to get admin tenant token — skipping console access"
log "Response: $(echo "$ADMIN_TOKEN_RESPONSE" | head -c 200)"
else
log "Got admin tenant token."
# Admin-tenant API helpers (port 3002, admin token)
admin_api_get() {
curl -s -H "Authorization: Bearer $ADMIN_TOKEN" $ADMIN_HOST_ARGS "${LOGTO_ADMIN_ENDPOINT}${1}" 2>/dev/null || echo "[]"
}
admin_api_post() {
curl -s -X POST -H "Authorization: Bearer $ADMIN_TOKEN" -H "Content-Type: application/json" $ADMIN_HOST_ARGS \
-d "$2" "${LOGTO_ADMIN_ENDPOINT}${1}" 2>/dev/null || true
}
admin_api_patch() {
curl -s -X PATCH -H "Authorization: Bearer $ADMIN_TOKEN" -H "Content-Type: application/json" $ADMIN_HOST_ARGS \
-d "$2" "${LOGTO_ADMIN_ENDPOINT}${1}" 2>/dev/null || true
}
# Check if admin user already exists on admin tenant
ADMIN_TENANT_USER_ID=$(admin_api_get "/api/users?search=$SAAS_ADMIN_USER" | jq -r ".[] | select(.username == \"$SAAS_ADMIN_USER\") | .id" 2>/dev/null)
if [ -z "$ADMIN_TENANT_USER_ID" ] || [ "$ADMIN_TENANT_USER_ID" = "null" ]; then
log "Creating admin console user '$SAAS_ADMIN_USER'..."
ADMIN_TENANT_RESPONSE=$(admin_api_post "/api/users" "{
\"username\": \"$SAAS_ADMIN_USER\",
\"password\": \"$SAAS_ADMIN_PASS\",
\"name\": \"Platform Admin\"
}")
ADMIN_TENANT_USER_ID=$(echo "$ADMIN_TENANT_RESPONSE" | jq -r '.id')
log "Created admin console user: $ADMIN_TENANT_USER_ID"
else
log "Admin console user exists: $ADMIN_TENANT_USER_ID"
fi
if [ -n "$ADMIN_TENANT_USER_ID" ] && [ "$ADMIN_TENANT_USER_ID" != "null" ]; then
# Assign both 'user' (required base role) and 'default:admin' (Management API access)
ADMIN_USER_ROLE_ID=$(admin_api_get "/api/roles" | jq -r '.[] | select(.name == "user") | .id')
ADMIN_ROLE_ID=$(admin_api_get "/api/roles" | jq -r '.[] | select(.name == "default:admin") | .id')
ROLE_IDS_JSON="[]"
if [ -n "$ADMIN_USER_ROLE_ID" ] && [ "$ADMIN_USER_ROLE_ID" != "null" ]; then
ROLE_IDS_JSON=$(echo "$ROLE_IDS_JSON" | jq ". + [\"$ADMIN_USER_ROLE_ID\"]")
fi
if [ -n "$ADMIN_ROLE_ID" ] && [ "$ADMIN_ROLE_ID" != "null" ]; then
ROLE_IDS_JSON=$(echo "$ROLE_IDS_JSON" | jq ". + [\"$ADMIN_ROLE_ID\"]")
fi
if [ "$ROLE_IDS_JSON" != "[]" ]; then
admin_api_post "/api/users/$ADMIN_TENANT_USER_ID/roles" "{\"roleIds\": $ROLE_IDS_JSON}" >/dev/null 2>&1
log "Assigned admin tenant roles (user + default:admin)."
else
log "WARNING: admin tenant roles not found"
fi
# Switch sign-in mode from Register to SignIn (admin user already created)
admin_api_patch "/api/sign-in-exp" '{"signInMode": "SignIn"}' >/dev/null 2>&1
log "Set sign-in mode to SignIn."
# Register admin-console redirect URIs (Logto ships with empty URIs)
ADMIN_PUBLIC="${ADMIN_ENDPOINT:-${PROTO}://${HOST}:3002}"
admin_api_patch "/api/applications/admin-console" "{
\"oidcClientMetadata\": {
\"redirectUris\": [\"${ADMIN_PUBLIC}/console/callback\"],
\"postLogoutRedirectUris\": [\"${ADMIN_PUBLIC}/console\"]
}
}" >/dev/null 2>&1
log "Registered admin-console redirect URIs."
# Add admin user to Logto's internal organizations (required for console login)
for ORG_ID in t-default t-admin; do
admin_api_post "/api/organizations/${ORG_ID}/users" "{\"userIds\": [\"$ADMIN_TENANT_USER_ID\"]}" >/dev/null 2>&1
done
ADMIN_ORG_ROLE_ID=$(admin_api_get "/api/organization-roles" | jq -r '.[] | select(.name == "admin") | .id')
if [ -n "$ADMIN_ORG_ROLE_ID" ] && [ "$ADMIN_ORG_ROLE_ID" != "null" ]; then
for ORG_ID in t-default t-admin; do
admin_api_post "/api/organizations/${ORG_ID}/users/${ADMIN_TENANT_USER_ID}/roles" "{\"organizationRoleIds\": [\"$ADMIN_ORG_ROLE_ID\"]}" >/dev/null 2>&1
done
fi
log "Added admin to Logto console organizations."
log "SaaS admin granted Logto console access."
else
log "WARNING: Could not create admin console user"
fi
fi # end: ADMIN_TOKEN check
fi # end: M_ADMIN_SECRET check
# No viewer user — tenant users are created by the admin during tenant provisioning.
# No example organization — tenants are created via the admin console.
# No server OIDC config — each provisioned server gets OIDC from env vars.
ORG_ID=""
# ============================================================
# PHASE 7b: Configure Logto Custom JWT for access tokens
# ============================================================
# Adds a 'roles' claim to access tokens based on user's org roles and global roles.
# This allows the server to extract roles from the access token using rolesClaim: "roles".
log "Configuring Logto Custom JWT for access tokens..."
CUSTOM_JWT_SCRIPT='const getCustomJwtClaims = async ({ token, context, environmentVariables }) => {
const roleMap = { owner: "server:admin", operator: "server:operator", viewer: "server:viewer" };
const roles = new Set();
if (context?.user?.organizationRoles) {
for (const orgRole of context.user.organizationRoles) {
const mapped = roleMap[orgRole.roleName];
if (mapped) roles.add(mapped);
}
}
if (context?.user?.roles) {
for (const role of context.user.roles) {
if (role.name === "saas-vendor") roles.add("server:admin");
}
}
return roles.size > 0 ? { roles: [...roles] } : {};
};'
CUSTOM_JWT_PAYLOAD=$(jq -n --arg script "$CUSTOM_JWT_SCRIPT" '{ script: $script }')
CUSTOM_JWT_RESPONSE=$(api_put "/api/configs/jwt-customizer/access-token" "$CUSTOM_JWT_PAYLOAD" 2>&1)
if echo "$CUSTOM_JWT_RESPONSE" | jq -e '.script' >/dev/null 2>&1; then
log "Custom JWT configured for access tokens."
else
log "WARNING: Custom JWT configuration failed — server OIDC login may fall back to local roles"
log "Response: $(echo "$CUSTOM_JWT_RESPONSE" | head -c 200)"
fi
# ============================================================
# PHASE 8: Configure sign-in branding
# ============================================================
log "Configuring sign-in experience branding..."
api_patch "/api/sign-in-exp" "{
\"color\": {
\"primaryColor\": \"#C6820E\",
\"isDarkModeEnabled\": true,
\"darkPrimaryColor\": \"#D4941E\"
},
\"branding\": {
\"logoUrl\": \"${PROTO}://${HOST}/platform/logo.svg\",
\"darkLogoUrl\": \"${PROTO}://${HOST}/platform/logo-dark.svg\"
}
}"
log "Sign-in branding configured."
# ============================================================
# PHASE 8b: Configure SMTP email connector
# ============================================================
# Required for email verification during registration and password reset.
# Skipped if SMTP_HOST is not set (registration will not work without email delivery).
if [ -n "${SMTP_HOST:-}" ] && [ -n "${SMTP_USER:-}" ]; then
log "Configuring SMTP email connector..."
# Discover available email connector factories
FACTORIES=$(api_get "/api/connector-factories")
# Prefer a factory with "smtp" in the ID
SMTP_FACTORY_ID=$(echo "$FACTORIES" | jq -r '[.[] | select(.type == "Email" and (.id | test("smtp"; "i")))] | .[0].id // empty')
if [ -z "$SMTP_FACTORY_ID" ]; then
# Fall back to any non-demo Email factory
SMTP_FACTORY_ID=$(echo "$FACTORIES" | jq -r '[.[] | select(.type == "Email" and .isDemo != true)] | .[0].id // empty')
fi
if [ -n "$SMTP_FACTORY_ID" ]; then
# Build SMTP config JSON
SMTP_CONFIG=$(jq -n \
--arg host "$SMTP_HOST" \
--arg port "${SMTP_PORT:-587}" \
--arg user "$SMTP_USER" \
--arg pass "${SMTP_PASS:-}" \
--arg from "${SMTP_FROM_EMAIL:-noreply@cameleer.io}" \
'{
host: $host,
port: ($port | tonumber),
auth: { user: $user, pass: $pass },
fromEmail: $from,
templates: [
{
usageType: "Register",
contentType: "text/html",
subject: "Verify your email for Cameleer",
content: "<div style=\"font-family:sans-serif;max-width:480px;margin:0 auto;padding:24px\"><div style=\"text-align:center;margin-bottom:24px\"><span style=\"font-size:24px;font-weight:700;color:#C6820E\">Cameleer</span></div><p style=\"color:#333;font-size:15px;line-height:1.6\">Enter this code to verify your email and create your account:</p><div style=\"text-align:center;margin:24px 0\"><span style=\"font-size:32px;font-weight:700;letter-spacing:6px;color:#C6820E\">{{code}}</span></div><p style=\"color:#666;font-size:13px\">This code expires in 10 minutes. If you did not request this, you can safely ignore this email.</p></div>"
},
{
usageType: "SignIn",
contentType: "text/html",
subject: "Your Cameleer sign-in code",
content: "<div style=\"font-family:sans-serif;max-width:480px;margin:0 auto;padding:24px\"><div style=\"text-align:center;margin-bottom:24px\"><span style=\"font-size:24px;font-weight:700;color:#C6820E\">Cameleer</span></div><p style=\"color:#333;font-size:15px;line-height:1.6\">Your sign-in verification code:</p><div style=\"text-align:center;margin:24px 0\"><span style=\"font-size:32px;font-weight:700;letter-spacing:6px;color:#C6820E\">{{code}}</span></div><p style=\"color:#666;font-size:13px\">This code expires in 10 minutes.</p></div>"
},
{
usageType: "ForgotPassword",
contentType: "text/html",
subject: "Reset your Cameleer password",
content: "<div style=\"font-family:sans-serif;max-width:480px;margin:0 auto;padding:24px\"><div style=\"text-align:center;margin-bottom:24px\"><span style=\"font-size:24px;font-weight:700;color:#C6820E\">Cameleer</span></div><p style=\"color:#333;font-size:15px;line-height:1.6\">Enter this code to reset your password:</p><div style=\"text-align:center;margin:24px 0\"><span style=\"font-size:32px;font-weight:700;letter-spacing:6px;color:#C6820E\">{{code}}</span></div><p style=\"color:#666;font-size:13px\">This code expires in 10 minutes. If you did not request a password reset, you can safely ignore this email.</p></div>"
},
{
usageType: "Generic",
contentType: "text/html",
subject: "Your Cameleer verification code",
content: "<div style=\"font-family:sans-serif;max-width:480px;margin:0 auto;padding:24px\"><div style=\"text-align:center;margin-bottom:24px\"><span style=\"font-size:24px;font-weight:700;color:#C6820E\">Cameleer</span></div><p style=\"color:#333;font-size:15px;line-height:1.6\">Your verification code:</p><div style=\"text-align:center;margin:24px 0\"><span style=\"font-size:32px;font-weight:700;letter-spacing:6px;color:#C6820E\">{{code}}</span></div><p style=\"color:#666;font-size:13px\">This code expires in 10 minutes.</p></div>"
}
]
}')
# Check if an email connector already exists
EXISTING_CONNECTORS=$(api_get "/api/connectors")
EMAIL_CONNECTOR_ID=$(echo "$EXISTING_CONNECTORS" | jq -r '[.[] | select(.type == "Email")] | .[0].id // empty')
if [ -n "$EMAIL_CONNECTOR_ID" ]; then
api_patch "/api/connectors/$EMAIL_CONNECTOR_ID" "{\"config\": $SMTP_CONFIG}" >/dev/null 2>&1
log "Updated existing email connector: $EMAIL_CONNECTOR_ID"
else
CONNECTOR_RESPONSE=$(api_post "/api/connectors" "{\"connectorId\": \"$SMTP_FACTORY_ID\", \"config\": $SMTP_CONFIG}")
CREATED_ID=$(echo "$CONNECTOR_RESPONSE" | jq -r '.id // empty')
if [ -n "$CREATED_ID" ]; then
log "Created SMTP email connector: $CREATED_ID (factory: $SMTP_FACTORY_ID)"
else
log "WARNING: Failed to create SMTP connector. Response: $(echo "$CONNECTOR_RESPONSE" | head -c 300)"
fi
fi
else
log "WARNING: No email connector factory found — email delivery will not work."
log "Available factories: $(echo "$FACTORIES" | jq -c '[.[] | select(.type == "Email") | .id]')"
fi
else
log "SMTP not configured (SMTP_HOST/SMTP_USER not set) — email delivery disabled."
log "Set SMTP_HOST, SMTP_USER, SMTP_PASS, SMTP_FROM_EMAIL env vars to enable."
fi
# ============================================================
# PHASE 8c: Enable registration (email + password)
# ============================================================
# Configures sign-in experience to allow self-service registration with email verification.
# This runs AFTER the SMTP connector so email delivery is ready before registration opens.
log "Configuring sign-in experience for registration..."
api_patch "/api/sign-in-exp" '{
"signInMode": "SignInAndRegister",
"signUp": {
"identifiers": ["email"],
"password": true,
"verify": true
},
"signIn": {
"methods": [
{
"identifier": "email",
"password": true,
"verificationCode": false,
"isPasswordPrimary": true
},
{
"identifier": "username",
"password": true,
"verificationCode": false,
"isPasswordPrimary": true
}
]
}
}' >/dev/null 2>&1
log "Sign-in experience configured: SignInAndRegister (email + password)."
# ============================================================
# PHASE 9: Cleanup seeded apps
# ============================================================
if [ -n "$M2M_SECRET" ]; then
log "Cleaning up seeded apps with known secrets..."
for SEEDED_ID in "m-default" "m-admin" "s6cz3wajdv8gtdyz8e941"; do
if echo "$EXISTING_APPS" | jq -e ".[] | select(.id == \"$SEEDED_ID\")" >/dev/null 2>&1; then
api_delete "/api/applications/$SEEDED_ID"
log "Deleted seeded app: $SEEDED_ID"
fi
done
fi
# ============================================================
# PHASE 10: Write bootstrap results
# ============================================================
log "Writing bootstrap config to $BOOTSTRAP_FILE..."
mkdir -p "$(dirname "$BOOTSTRAP_FILE")"
cat > "$BOOTSTRAP_FILE" <<EOF
{
"spaClientId": "$SPA_ID",
"m2mClientId": "$M2M_ID",
"m2mClientSecret": "$M2M_SECRET",
"tradAppId": "$TRAD_ID",
"tradAppSecret": "$TRAD_SECRET",
"apiResourceIndicator": "$API_RESOURCE_INDICATOR",
"platformAdminUser": "$SAAS_ADMIN_USER",
"oidcIssuerUri": "${LOGTO_ENDPOINT}/oidc",
"oidcAudience": "$API_RESOURCE_INDICATOR"
}
EOF
chmod 644 "$BOOTSTRAP_FILE"
# ============================================================
# Phase 12: SaaS Admin Role
# ============================================================
log ""
log "=== Phase 12: SaaS Admin Role ==="
# Create saas-vendor global role with all API scopes
log "Checking for saas-vendor role..."
EXISTING_ROLES=$(api_get "/api/roles")
VENDOR_ROLE_ID=$(echo "$EXISTING_ROLES" | jq -r '.[] | select(.name == "saas-vendor" and .type == "User") | .id')
if [ -z "$VENDOR_ROLE_ID" ]; then
ALL_SCOPE_IDS=$(api_get "/api/resources/$API_RESOURCE_ID/scopes" | jq '[.[].id]')
log "Creating saas-vendor role with all scopes..."
VENDOR_ROLE_RESPONSE=$(api_post "/api/roles" "{
\"name\": \"saas-vendor\",
\"description\": \"SaaS vendor — full platform control across all tenants\",
\"type\": \"User\",
\"scopeIds\": $ALL_SCOPE_IDS
}")
VENDOR_ROLE_ID=$(echo "$VENDOR_ROLE_RESPONSE" | jq -r '.id')
log "Created saas-vendor role: $VENDOR_ROLE_ID"
else
log "saas-vendor role exists: $VENDOR_ROLE_ID"
fi
# Assign vendor role to admin user
if [ -n "$VENDOR_ROLE_ID" ] && [ "$VENDOR_ROLE_ID" != "null" ] && [ -n "$ADMIN_USER_ID" ]; then
api_post "/api/users/$ADMIN_USER_ID/roles" "{\"roleIds\": [\"$VENDOR_ROLE_ID\"]}" >/dev/null 2>&1
log "Assigned saas-vendor role to admin user."
fi
log "SaaS admin role configured."
log ""
log "=== Bootstrap complete! ==="
# dev only — remove credential logging in production
log " SPA Client ID: $SPA_ID"
log ""
log " No tenants created — use the admin console to create tenants."
log ""

View File

@@ -0,0 +1,19 @@
FROM eclipse-temurin:21-jre-alpine
WORKDIR /app
# Agent JAR and log appender JAR are copied during CI build from Gitea Maven registry
COPY agent.jar /app/agent.jar
COPY cameleer-log-appender.jar /app/cameleer-log-appender.jar
ENTRYPOINT exec java \
-Dcameleer.export.type=${CAMELEER_EXPORT_TYPE:-HTTP} \
-Dcameleer.export.endpoint=${CAMELEER_SERVER_URL} \
-Dcameleer.agent.name=${HOSTNAME} \
-Dcameleer.agent.application=${CAMELEER_APPLICATION_ID:-default} \
-Dcameleer.agent.environment=${CAMELEER_ENVIRONMENT_ID:-default} \
-Dcameleer.routeControl.enabled=${CAMELEER_ROUTE_CONTROL_ENABLED:-false} \
-Dcameleer.replay.enabled=${CAMELEER_REPLAY_ENABLED:-false} \
-Dcameleer.health.enabled=true \
-Dcameleer.health.port=9464 \
-javaagent:/app/agent.jar \
-jar /app/app.jar

View File

@@ -0,0 +1,20 @@
#!/bin/sh
# Patched entrypoint: fixes the sed ordering bug in the server-ui image.
# The original entrypoint inserts <base href> then rewrites ALL href="/..."
# including the just-inserted base tag, causing /server/server/ doubling.
BASE_PATH="${BASE_PATH:-/}"
if [ "$BASE_PATH" != "/" ]; then
BASE_PATH=$(echo "$BASE_PATH" | sed 's#/*$#/#; s#^/*#/#')
INDEX="/usr/share/nginx/html/index.html"
# Rewrite absolute asset paths FIRST (before inserting <base>)
sed -i "s|href=\"/|href=\"${BASE_PATH}|g; s|src=\"/|src=\"${BASE_PATH}|g" "$INDEX"
# THEN inject <base> tag
sed -i "s|<head>|<head><base href=\"${BASE_PATH}\">|" "$INDEX"
echo "BASE_PATH set to ${BASE_PATH} — rewrote index.html"
fi
exec /docker-entrypoint.sh "$@"

999
docs/architecture.md Normal file
View File

@@ -0,0 +1,999 @@
# Cameleer SaaS Architecture
**Last updated:** 2026-04-05
**Status:** Living document -- update as the system evolves
---
## 1. System Overview
Cameleer SaaS is a multi-tenant platform that provides managed observability for
Apache Camel applications. Customers deploy their Camel JARs through the SaaS
platform and get zero-code instrumentation, execution tracing, route topology
visualization, and runtime control -- without running any observability
infrastructure themselves.
The system comprises three components:
**Cameleer Agent** (`cameleer` repo) -- A Java agent using ByteBuddy for
zero-code bytecode instrumentation. Captures route executions, processor traces,
payloads, metrics, and route graph topology. Deployed as a `-javaagent` JAR
alongside the customer's application.
**Cameleer Server** (`cameleer-server` repo) -- A Spring Boot observability
backend. Receives telemetry from agents via HTTP, pushes configuration and
commands to agents via SSE. Stores data in PostgreSQL and ClickHouse. Provides
a React SPA dashboard for direct observability access. JWT auth with Ed25519
config signing.
**Cameleer SaaS** (this repo) -- The multi-tenancy, deployment, and management
layer. Handles user authentication via Logto OIDC, tenant provisioning, JAR
upload and deployment, API key management, license generation, and audit
logging. Serves a React SPA that wraps the full user experience.
---
## 2. Component Topology
```
Internet / LAN
|
+-----+-----+
| Traefik | :80 / :443
| (v3) | Reverse proxy + TLS termination
+-----+-----+
|
+---------------+---------------+-------------------+
| | | |
PathPrefix(/api) PathPrefix(/) PathPrefix(/oidc) PathPrefix(/observe)
PathPrefix(/api) priority=1 PathPrefix( PathPrefix(/dashboard)
| | /interaction) |
v v v v
+--------------+ +--------------+ +-----------+ +------------------+
| cameleer-saas| | cameleer-saas| | Logto | | cameleer-server |
| (API) | | (SPA) | | | | |
| :8080 | | :8080 | | :3001 | | :8081 |
+--------------+ +--------------+ +-----------+ +------------------+
| | |
+------+-------------------------+------------------+
| | |
+------+------+ +------+------+ +------+------+
| PostgreSQL | | PostgreSQL | | ClickHouse |
| :5432 | | (logto DB) | | :8123 |
| cameleer_ | | :5432 | | cameleer |
| saas DB | +--------------+ +-------------+
+--------------+
|
+------+------+
| Customer |
| App + Agent |
| (container) |
+-------------+
```
### Services
| Service | Image | Internal Port | Network | Purpose |
|-------------------|---------------------------------------------|---------------|----------|----------------------------------|
| traefik | `traefik:v3` | 80, 443 | cameleer | Reverse proxy, TLS, routing |
| postgres | `postgres:16-alpine` | 5432 | cameleer | Shared PostgreSQL (3 databases) |
| logto | `ghcr.io/logto-io/logto:latest` | 3001 | cameleer | OIDC identity provider |
| logto-bootstrap | `postgres:16-alpine` (ephemeral) | -- | cameleer | One-shot bootstrap script |
| cameleer-saas | `gitea.siegeln.net/cameleer/cameleer-saas` | 8080 | cameleer | SaaS API + SPA serving |
| cameleer-server | `gitea.siegeln.net/cameleer/cameleer-server`| 8081 | cameleer | Observability backend |
| clickhouse | `clickhouse/clickhouse-server:latest` | 8123 | cameleer | Time-series telemetry storage |
### Docker Network
All services share a single Docker bridge network named `cameleer`. Customer app
containers are also attached to this network so agents can reach the
cameleer-server.
### Volumes
| Volume | Mounted By | Purpose |
|-----------------|---------------------|--------------------------------------------|
| `pgdata` | postgres | PostgreSQL data persistence |
| `chdata` | clickhouse | ClickHouse data persistence |
| `acme` | traefik | TLS certificate storage |
| `jardata` | cameleer-saas | Uploaded customer JAR files |
| `bootstrapdata` | logto-bootstrap, cameleer-saas | Bootstrap output JSON (shared) |
### Databases on PostgreSQL
The shared PostgreSQL instance hosts three databases:
- `cameleer_saas` -- SaaS platform tables (tenants, environments, apps, etc.)
- `logto` -- Logto identity provider data
- `cameleer` -- cameleer-server operational data
The `docker/init-databases.sh` init script creates all three during first start.
---
## 3. Authentication & Authorization
### 3.1 Design Principles
1. **Logto is the single identity provider** for all human users.
2. **Zero trust** -- every service validates tokens independently via JWKS or its
own signing key. No identity in HTTP headers.
3. **No custom crypto** -- standard protocols only (OAuth2, OIDC, JWT, SHA-256).
4. **API keys for agents** -- per-environment opaque secrets, exchanged for
server-issued JWTs via the bootstrap registration flow.
### 3.2 Token Types
| Token | Issuer | Algorithm | Validator | Used By |
|--------------------|-----------------|------------------|----------------------|--------------------------------|
| Logto user JWT | Logto | ES384 (asymmetric)| Any service via JWKS | SaaS UI users, server users |
| Logto M2M JWT | Logto | ES384 (asymmetric)| Any service via JWKS | SaaS platform -> server calls |
| Server internal JWT| cameleer-server| HS256 (symmetric) | Issuing server only | Agents (after registration) |
| API key (opaque) | SaaS platform | N/A (SHA-256 hash)| cameleer-server | Agent initial registration |
| Ed25519 signature | cameleer-server| EdDSA | Agent | Server -> agent command signing|
### 3.3 Scope Model
The Logto API resource `https://api.cameleer.local` has 10 scopes, created by
the bootstrap script (`docker/logto-bootstrap.sh`):
| Scope | Description | Platform Admin | Org Admin | Org Member |
|--------------------|--------------------------------|:--------------:|:---------:|:----------:|
| `platform:admin` | SaaS platform administration | x | | |
| `tenant:manage` | Manage tenant settings | x | x | |
| `billing:manage` | Manage billing | x | x | |
| `team:manage` | Manage team members | x | x | |
| `apps:manage` | Create and delete apps | x | x | |
| `apps:deploy` | Deploy apps | x | x | x |
| `secrets:manage` | Manage secrets | x | x | |
| `observe:read` | View observability data | x | x | x |
| `observe:debug` | Debug and replay operations | x | x | x |
| `settings:manage` | Manage settings | x | x | |
**Role hierarchy:**
- **Global role `platform-admin`** -- All 10 scopes. Assigned to SaaS owner.
- **Organization role `admin`** -- 9 tenant-level scopes (all except `platform:admin`).
- **Organization role `member`** -- 3 scopes: `apps:deploy`, `observe:read`,
`observe:debug`.
### 3.4 Authentication Flows
**Human user -> SaaS Platform:**
```
Browser Logto cameleer-saas
| | |
|--- OIDC auth code flow ->| |
|<-- id_token, auth code --| |
| | |
|--- getAccessToken(resource, orgId) ---------------->|
| (org-scoped JWT with scope claim) |
| | |
|--- GET /api/me, Authorization: Bearer <jwt> ------->|
| | validate via JWKS |
| | extract organization_id|
| | resolve to tenant |
|<-- { userId, tenants } -----------------------------|
```
1. User authenticates with Logto (OIDC authorization code flow via `@logto/react`).
2. Frontend obtains org-scoped access token via `getAccessToken(resource, orgId)`.
3. Backend validates via Logto JWKS (Spring OAuth2 Resource Server).
4. `organization_id` claim in JWT resolves to internal tenant ID via
`TenantIsolationInterceptor`.
**SaaS platform -> cameleer-server API (M2M):**
1. SaaS platform obtains Logto M2M token (`client_credentials` grant) via
`LogtoManagementClient`.
2. Calls server API with `Authorization: Bearer <logto-m2m-token>`.
3. Server validates via Logto JWKS (OIDC resource server support).
4. Server grants ADMIN role to valid M2M tokens.
**Agent -> cameleer-server:**
1. Agent reads `CAMELEER_SERVER_SECURITY_BOOTSTRAPTOKEN` environment variable (API key).
2. Calls `POST /api/v1/agents/register` with the key as Bearer token.
3. Server validates via `BootstrapTokenValidator` (constant-time comparison).
4. Server issues internal HMAC JWT (access + refresh) + Ed25519 public key.
5. Agent uses JWT for all subsequent requests, refreshes on expiry.
**Server -> Agent (commands):**
1. Server signs command payload with Ed25519 private key.
2. Sends via SSE with signature field.
3. Agent verifies using server's public key (received at registration).
4. Destructive commands require a nonce (replay protection).
### 3.5 Spring Security Configuration
`SecurityConfig.java` configures a single stateless filter chain:
```java
@Configuration
@EnableWebSecurity
@EnableMethodSecurity
public class SecurityConfig {
@Bean
public SecurityFilterChain filterChain(HttpSecurity http) throws Exception {
http
.csrf(csrf -> csrf.disable())
.sessionManagement(s -> s.sessionCreationPolicy(STATELESS))
.authorizeHttpRequests(auth -> auth
.requestMatchers("/actuator/health").permitAll()
.requestMatchers("/api/config").permitAll()
.requestMatchers("/", "/index.html", "/login", "/callback",
"/environments/**", "/license", "/admin/**").permitAll()
.requestMatchers("/assets/**", "/favicon.ico").permitAll()
.anyRequest().authenticated()
)
.oauth2ResourceServer(oauth2 -> oauth2.jwt(jwt ->
jwt.jwtAuthenticationConverter(jwtAuthenticationConverter())));
return http.build();
}
}
```
**JWT processing pipeline:**
1. `BearerTokenAuthenticationFilter` (Spring built-in) extracts the Bearer token.
2. `JwtDecoder` validates the token signature (ES384 via Logto JWKS) and issuer.
Accepts both `JWT` and `at+jwt` token types (RFC 9068 / Logto convention).
3. `JwtAuthenticationConverter` maps the `scope` claim to Spring authorities:
`scope: "platform:admin observe:read"` becomes `SCOPE_platform:admin` and
`SCOPE_observe:read`.
4. `TenantIsolationInterceptor` (registered as a `HandlerInterceptor` on
`/api/**` via `WebConfig`) reads `organization_id` from the JWT, resolves it
to an internal tenant UUID via `TenantService.getByLogtoOrgId()`, stores it
on `TenantContext` (ThreadLocal), and validates path variable isolation (see
Section 8.1).
**Authorization enforcement** -- Every mutating API endpoint uses Spring
`@PreAuthorize` annotations with `SCOPE_` authorities. Read-only list/get
endpoints require authentication only (no specific scope). The scope-to-endpoint
mapping:
| Scope | Endpoints |
|------------------|--------------------------------------------------------------------------|
| `platform:admin` | `GET /api/tenants` (list all), `POST /api/tenants` (create tenant) |
| `apps:manage` | Environment create/update/delete, app create/delete |
| `apps:deploy` | JAR upload, routing patch, deploy/stop/restart |
| `billing:manage` | License generation |
| `observe:read` | Log queries, agent status, observability status |
| *(auth only)* | List/get-by-ID endpoints (environments, apps, deployments, licenses) |
Example:
```java
@PreAuthorize("hasAuthority('SCOPE_apps:manage')")
public ResponseEntity<EnvironmentResponse> create(...) { ... }
```
### 3.6 Frontend Auth Architecture
**Logto SDK integration** (`main.tsx`):
The `LogtoProvider` is configured with scopes including `UserScope.Organizations`
and `UserScope.OrganizationRoles`, requesting organization-aware tokens from
Logto.
**Token management** (`TokenSync` component in `main.tsx`):
When an organization is selected, `setTokenProvider` is called with
`getAccessToken(resource, orgId)` to produce org-scoped JWTs. When no org is
selected, a non-org-scoped token is used.
**Organization resolution** (`OrgResolver.tsx`):
`OrgResolver` uses two separate `useEffect` hooks to keep org state and scopes
in sync:
- **Effect 1: Org population** (depends on `[me]`) -- Calls `GET /api/me` to
fetch tenant memberships, maps them to `OrgInfo` objects in the Zustand org
store, and auto-selects the first org if the user belongs to exactly one.
- **Effect 2: Scope fetching** (depends on `[me, currentOrgId]`) -- Fetches the
API resource identifier from `/api/config`, then obtains an org-scoped access
token (`getAccessToken(resource, orgId)`). Scopes are decoded from the JWT
payload and written to the store via `setScopes()`. A single token fetch is
sufficient because Logto merges all granted scopes (including global scopes
like `platform:admin`) into the org-scoped token.
The two-effect split ensures scopes are re-fetched whenever the user switches
organizations, preventing stale scope sets from a previously selected org.
**Scope-based UI gating:**
The `useOrgStore` exposes a `scopes: Set<string>` that components check to
conditionally render UI elements. For example, admin-only controls check for
`platform:admin` in the scope set.
**Route protection** (`ProtectedRoute.tsx`):
Wraps authenticated routes. Redirects to `/login` when the user is not
authenticated. Uses a ref to avoid showing a spinner after the initial auth
check completes (the Logto SDK sets `isLoading=true` for every async method,
not just initial load).
---
## 4. Data Model
### 4.1 Entity Relationship Diagram
```
+-------------------+
| tenants |
+-------------------+
| id (PK, UUID) |
| name |
| slug (UNIQUE) |
| tier |
| status |
| logto_org_id |
| stripe_customer_id|
| stripe_sub_id |
| settings (JSONB) |
| created_at |
| updated_at |
+--------+----------+
|
+-----+-----+------------------+
| | |
v v v
+----------+ +----------+ +-----------+
| licenses | | environ- | | audit_log |
| | | ments | | |
+----------+ +----------+ +-----------+
| id (PK) | | id (PK) | | id (PK) |
| tenant_id| | tenant_id| | tenant_id |
| tier | | slug | | actor_id |
| features | | display_ | | action |
| limits | | name | | resource |
| token | | status | | result |
| issued_at| | created_ | | metadata |
| expires_ | | at | | created_at|
| at | +-----+----+ +-----------+
+----------+ |
+----+----+
| |
v v
+----------+ +-----------+
| api_keys | | apps |
+----------+ +-----------+
| id (PK) | | id (PK) |
| environ_ | | environ_ |
| ment_id | | ment_id |
| key_hash | | slug |
| key_ | | display_ |
| prefix | | name |
| status | | jar_* |
| created_ | | exposed_ |
| at | | port |
| revoked_ | | current_ |
| at | | deploy_id|
+----------+ | previous_ |
| deploy_id|
+-----+-----+
|
v
+-------------+
| deployments |
+-------------+
| id (PK) |
| app_id |
| version |
| image_ref |
| desired_ |
| status |
| observed_ |
| status |
| orchestrator|
| _metadata |
| error_msg |
| deployed_at |
| stopped_at |
| created_at |
+-------------+
```
### 4.2 Table Descriptions
**`tenants`** (V001) -- Top-level multi-tenancy entity. Each tenant maps to a
Logto organization via `logto_org_id`. The `tier` column (`LOW` default) drives
license feature gates. The `status` column tracks provisioning state
(`PROVISIONING`, `ACTIVE`, etc.). `settings` is a JSONB bag for tenant-specific
configuration. Stripe columns support future billing integration.
**`licenses`** (V002) -- Per-tenant license tokens with feature flags and usage
limits. The `token` column stores the generated license string. `features` and
`limits` are JSONB columns holding structured capability data. Licenses have
explicit expiry and optional revocation.
**`environments`** (V003) -- Logical deployment environments within a tenant
(e.g., `dev`, `staging`, `production`). Scoped by `(tenant_id, slug)` unique
constraint. Each environment gets its own set of API keys and apps.
**`api_keys`** (V004) -- Per-environment opaque API keys for agent
authentication. The plaintext is never stored -- only `key_hash` (SHA-256 hex,
64 chars) and `key_prefix` (first 12 chars of the `cmk_`-prefixed key, for
identification). Status lifecycle: `ACTIVE` -> `ROTATED` or `REVOKED`.
**`apps`** (V005) -- Customer applications within an environment. Tracks
uploaded JAR metadata (`jar_storage_path`, `jar_checksum`, `jar_size_bytes`,
`jar_original_filename`), optional `exposed_port` for inbound HTTP routing,
and deployment references (`current_deployment_id`, `previous_deployment_id`
for rollback).
**`deployments`** (V006) -- Versioned deployment records for each app. Tracks a
two-state lifecycle: `desired_status` (what the user wants: `RUNNING` or
`STOPPED`) and `observed_status` (what the system sees: `BUILDING`, `STARTING`,
`RUNNING`, `STOPPED`, `FAILED`). `orchestrator_metadata` (JSONB) stores the
Docker container ID. Versioned with `(app_id, version)` unique constraint.
**`audit_log`** (V007) -- Append-only audit trail. Records actor, tenant,
action, resource, environment, result, and optional metadata JSONB. Indexed
by `(tenant_id, created_at)`, `(actor_id, created_at)`, and
`(action, created_at)` for efficient querying.
### 4.3 Audit Actions
Defined in `AuditAction.java`:
| Category | Actions |
|---------------|----------------------------------------------------------------|
| Auth | `AUTH_REGISTER`, `AUTH_LOGIN`, `AUTH_LOGIN_FAILED`, `AUTH_LOGOUT`|
| Tenant | `TENANT_CREATE`, `TENANT_UPDATE`, `TENANT_SUSPEND`, `TENANT_REACTIVATE`, `TENANT_DELETE` |
| Environment | `ENVIRONMENT_CREATE`, `ENVIRONMENT_UPDATE`, `ENVIRONMENT_DELETE`|
| App lifecycle | `APP_CREATE`, `APP_DEPLOY`, `APP_PROMOTE`, `APP_ROLLBACK`, `APP_SCALE`, `APP_STOP`, `APP_DELETE` |
| Secrets | `SECRET_CREATE`, `SECRET_READ`, `SECRET_UPDATE`, `SECRET_DELETE`, `SECRET_ROTATE` |
| Config | `CONFIG_UPDATE` |
| Team | `TEAM_INVITE`, `TEAM_REMOVE`, `TEAM_ROLE_CHANGE` |
| License | `LICENSE_GENERATE`, `LICENSE_REVOKE` |
---
## 5. Deployment Model
### 5.1 Server-Per-Tenant
Each tenant gets a dedicated cameleer-server instance. The SaaS platform
provisions and manages these servers. In the current Docker Compose topology, a
single shared cameleer-server is used for the default tenant. Production
deployments will run per-tenant servers as separate containers or K8s pods.
### 5.2 Customer App Deployment Flow
The deployment lifecycle is managed by `DeploymentService`:
```
User uploads JAR Build Docker image Start container
via AppController --> from base image + --> on cameleer network
(multipart POST) uploaded JAR with agent env vars
| | |
v v v
apps.jar_storage_path deployments.image_ref deployments.orchestrator_metadata
apps.jar_checksum deployments.observed_ {"containerId": "..."}
apps.jar_size_bytes status = BUILDING
```
**Step-by-step (from `DeploymentService.deploy()`):**
1. **Validate** -- Ensure the app has an uploaded JAR.
2. **Version** -- Increment deployment version via
`deploymentRepository.findMaxVersionByAppId()`.
3. **Image ref** -- Generate `cameleer-runtime-{env}-{app}:v{n}`.
4. **Persist** -- Save deployment record with `observed_status = BUILDING`.
5. **Audit** -- Log `APP_DEPLOY` action.
6. **Async execution** (`@Async("deploymentExecutor")`):
a. Build Docker image from base image + customer JAR.
b. Stop previous container if one exists.
c. Start new container with environment variables:
| Variable | Value |
|-----------------------------|----------------------------------------|
| `CAMELEER_SERVER_SECURITY_BOOTSTRAPTOKEN` | API key for agent registration |
| `CAMELEER_EXPORT_TYPE` | `HTTP` |
| `CAMELEER_SERVER_RUNTIME_SERVERURL` | cameleer-server internal URL |
| `CAMELEER_APPLICATION_ID` | App slug |
| `CAMELEER_ENVIRONMENT_ID` | Environment slug |
| `CAMELEER_DISPLAY_NAME` | `{tenant}-{env}-{app}` |
d. Apply resource limits (`container-memory-limit`, `container-cpu-shares`).
e. Configure Traefik labels for inbound routing if `exposed_port` is set:
`{app}.{env}.{tenant}.{domain}`.
f. Poll container health for up to `health-check-timeout` seconds.
g. Update deployment status to `RUNNING` or `FAILED`.
h. Update app's `current_deployment_id` and `previous_deployment_id`.
### 5.3 Container Resource Limits
Configured via `RuntimeConfig`:
| Property | Default | Description |
|-----------------------------------|-------------|-----------------------------|
| `cameleer.runtime.container-memory-limit` | `512m` | Docker memory limit |
| `cameleer.runtime.container-cpu-shares` | `512` | Docker CPU shares |
| `cameleer.runtime.max-jar-size` | `200MB` | Max upload size |
| `cameleer.runtime.health-check-timeout` | `60` | Seconds to wait for healthy |
| `cameleer.runtime.deployment-thread-pool-size` | `4`| Concurrent deployments |
---
## 6. Agent-Server Protocol
The agent-server protocol is defined in full in
`cameleer/cameleer-common/PROTOCOL.md`. This section summarizes the key
aspects relevant to the SaaS platform.
### 6.1 Agent Registration
1. Agent starts with `CAMELEER_SERVER_SECURITY_BOOTSTRAPTOKEN` environment variable (an API key
generated by the SaaS platform, prefixed with `cmk_`).
2. Agent calls `POST /api/v1/agents/register` on the cameleer-server with the
API key as a Bearer token.
3. Server validates the key and returns:
- HMAC JWT access token (short-lived, ~1 hour)
- HMAC JWT refresh token (longer-lived, ~7 days)
- Ed25519 public key (for verifying server commands)
4. Agent uses the access token for all subsequent API calls.
5. On access token expiry, agent uses refresh token to obtain a new pair.
6. On refresh token expiry, agent re-registers using the original API key.
### 6.2 Telemetry Ingestion
Agents send telemetry to the server via HTTP POST:
- Route executions with processor-level traces
- Payload captures (configurable granularity with redaction)
- Route graph topology (tree + graph dual representation)
- Metrics and heartbeats
### 6.3 Server-to-Agent Commands (SSE)
The server maintains an SSE (Server-Sent Events) push channel to each agent:
- Configuration changes (engine level, payload capture settings)
- Deep trace requests for specific correlation IDs
- Exchange replay commands
- Per-processor payload capture overrides
**Command signing:** All commands are signed with the server's Ed25519 private
key. The agent verifies signatures using the public key received during
registration. Destructive commands include a nonce for replay protection.
---
## 7. API Overview
All endpoints under `/api/` require authentication unless noted otherwise.
Authentication is via Logto JWT Bearer token. Mutating endpoints additionally
require specific scopes via `@PreAuthorize` (see Section 3.5 for the full
mapping). The Auth column below shows `JWT` for authentication-only endpoints
and the required scope name for scope-gated endpoints.
### 7.1 Platform Configuration
| Method | Path | Auth | Description |
|--------|-------------------|----------|--------------------------------------------|
| GET | `/api/config` | Public | Frontend config (Logto endpoint, client ID, API resource, scopes) |
| GET | `/api/health/secured` | JWT | Auth verification endpoint |
| GET | `/actuator/health`| Public | Spring Boot health check |
`/api/config` response shape:
```json
{
"logtoEndpoint": "http://localhost:3001",
"logtoClientId": "<from bootstrap or env>",
"logtoResource": "https://api.cameleer.local",
"scopes": [
"platform:admin", "tenant:manage", "billing:manage", "team:manage",
"apps:manage", "apps:deploy", "secrets:manage", "observe:read",
"observe:debug", "settings:manage"
]
}
```
The `scopes` array is authoritative -- the frontend reads it during Logto
provider initialization to request the correct API resource scopes during
sign-in. Scopes are defined as a constant list in `PublicConfigController`
rather than being queried from Logto at runtime.
### 7.2 Identity
| Method | Path | Auth | Description |
|--------|-------------------|----------|--------------------------------------------|
| GET | `/api/me` | JWT | Current user info + tenant memberships |
`MeController` extracts `organization_id` from the JWT to resolve the tenant.
For non-org-scoped tokens, it falls back to `LogtoManagementClient.getUserOrganizations()`
to enumerate all organizations the user belongs to.
### 7.3 Tenants
| Method | Path | Auth | Description |
|--------|----------------------------|----------------------------------|------------------------|
| GET | `/api/tenants` | `SCOPE_platform:admin` | List all tenants |
| POST | `/api/tenants` | `SCOPE_platform:admin` | Create tenant |
| GET | `/api/tenants/{id}` | JWT | Get tenant by UUID |
| GET | `/api/tenants/by-slug/{slug}` | JWT | Get tenant by slug |
### 7.4 Environments
| Method | Path | Auth | Description |
|--------|----------------------------------------------------|---------------------|--------------------------|
| POST | `/api/tenants/{tenantId}/environments` | `apps:manage` | Create environment |
| GET | `/api/tenants/{tenantId}/environments` | JWT | List environments |
| GET | `/api/tenants/{tenantId}/environments/{envId}` | JWT | Get environment |
| PATCH | `/api/tenants/{tenantId}/environments/{envId}` | `apps:manage` | Update display name |
| DELETE | `/api/tenants/{tenantId}/environments/{envId}` | `apps:manage` | Delete environment |
### 7.5 Apps
| Method | Path | Auth | Description |
|--------|----------------------------------------------------|-----------------|------------------------|
| POST | `/api/environments/{envId}/apps` | `apps:manage` | Create app (multipart: metadata + JAR) |
| GET | `/api/environments/{envId}/apps` | JWT | List apps |
| GET | `/api/environments/{envId}/apps/{appId}` | JWT | Get app |
| PUT | `/api/environments/{envId}/apps/{appId}/jar` | `apps:deploy` | Re-upload JAR |
| DELETE | `/api/environments/{envId}/apps/{appId}` | `apps:manage` | Delete app |
| PATCH | `/api/environments/{envId}/apps/{appId}/routing` | `apps:deploy` | Set exposed port |
### 7.6 Deployments
| Method | Path | Auth | Description |
|--------|----------------------------------------------------|-----------------|--------------------------|
| POST | `/api/apps/{appId}/deploy` | `apps:deploy` | Deploy app (async, 202) |
| POST | `/api/apps/{appId}/stop` | `apps:deploy` | Stop running deployment |
| POST | `/api/apps/{appId}/restart` | `apps:deploy` | Stop + redeploy |
| GET | `/api/apps/{appId}/deployments` | JWT | List deployment history |
| GET | `/api/apps/{appId}/deployments/{deploymentId}` | JWT | Get deployment details |
### 7.7 Observability
| Method | Path | Auth | Description |
|--------|--------------------------------------------------|-----------------|---------------------------|
| GET | `/api/apps/{appId}/agent-status` | `observe:read` | Agent connectivity status |
| GET | `/api/apps/{appId}/observability-status` | `observe:read` | Observability data status |
| GET | `/api/apps/{appId}/logs` | `observe:read` | Container logs (query params: `since`, `until`, `limit`, `stream`) |
### 7.8 Licenses
| Method | Path | Auth | Description |
|--------|-------------------------------------------------|-------------------|--------------------------|
| POST | `/api/tenants/{tenantId}/license` | `billing:manage` | Generate license (365d) |
| GET | `/api/tenants/{tenantId}/license` | JWT | Get active license |
### 7.9 SPA Routing
The `SpaController` forwards all non-API paths to `index.html` for client-side
routing:
```java
@GetMapping(value = {"/", "/login", "/callback", "/environments/**", "/license"})
public String spa() { return "forward:/index.html"; }
```
---
## 8. Security Model
### 8.1 Tenant Isolation
Tenant isolation is enforced by a single Spring `HandlerInterceptor` --
`TenantIsolationInterceptor` -- registered on `/api/**` via `WebConfig`. It
handles both tenant resolution and ownership validation in one place:
**Resolution (every `/api/**` request):**
The interceptor's `preHandle()` reads the JWT's `organization_id` claim,
resolves it to an internal tenant UUID via `TenantService.getByLogtoOrgId()`,
and stores it on `TenantContext` (ThreadLocal). If no organization context is
resolved and the user is not a platform admin, the interceptor returns
**403 Forbidden**.
**Path variable validation (automatic, fail-closed):**
After resolution, the interceptor reads Spring's
`HandlerMapping.URI_TEMPLATE_VARIABLES_ATTRIBUTE` to inspect path variables
defined on the matched handler method. It checks three path variable names:
- `{tenantId}` -- Compared directly against the resolved tenant ID.
- `{environmentId}` -- The environment is loaded and its `tenantId` is compared.
- `{appId}` -- The app -> environment -> tenant chain is followed and compared.
If any path variable is present and the resolved tenant does not own that
resource, the interceptor returns **403 Forbidden**. This is **fail-closed**:
any new endpoint that uses these path variable names is automatically isolated
without requiring manual validation calls.
**Platform admin bypass:**
Users with `SCOPE_platform:admin` bypass all isolation checks. Their
`TenantContext` is left empty (null tenant ID), which downstream services
interpret as unrestricted access.
**Cleanup:**
`TenantContext.clear()` is called in `afterCompletion()` to prevent ThreadLocal
leaks regardless of whether the request succeeded or failed.
**Additional isolation boundaries:**
- Environment and app queries are scoped by tenant through foreign key
relationships (`environments.tenant_id`).
- Customer app containers run in isolated Docker containers with per-container
resource limits.
### 8.2 API Key Security
- Keys are generated with 32 bytes of `SecureRandom` entropy, prefixed with
`cmk_` and Base64url-encoded.
- Only the SHA-256 hash is stored in the database (`key_hash` column, 64 hex
chars). The `key_prefix` (first 12 chars) is stored for identification in
UI listings.
- The plaintext key is returned exactly once at creation time and never stored.
- Key lifecycle: `ACTIVE` -> `ROTATED` (old keys remain for grace period) or
`REVOKED` (immediately invalidated, `revoked_at` timestamp set).
- Validation is via SHA-256 hash comparison:
`ApiKeyService.validate(plaintext)` -> hash -> lookup by hash and status.
### 8.3 Token Lifetimes
| Token | Lifetime | Notes |
|----------------------|-------------|------------------------------------|
| Logto access token | ~1 hour | Configured in Logto, refreshed by SDK |
| Logto refresh token | ~14 days | Used by `@logto/react` for silent refresh |
| Server agent JWT | ~1 hour | cameleer-server `CAMELEER_JWT_SECRET` |
| Server refresh token | ~7 days | Agent re-registers when expired |
### 8.4 Audit Logging
All state-changing operations are logged to the `audit_log` table via
`AuditService.log()`. Each entry records:
- `actor_id` -- UUID of the user (from JWT subject)
- `tenant_id` -- UUID of the affected tenant
- `action` -- Enum value from `AuditAction`
- `resource` -- Identifier of the affected resource (e.g., app slug)
- `environment` -- Environment slug if applicable
- `result` -- `SUCCESS` or error indicator
- `metadata` -- Optional JSONB for additional context
Audit entries are immutable (append-only, no UPDATE/DELETE operations).
### 8.5 Security Boundaries
- CSRF is disabled (stateless API, Bearer token auth only).
- Sessions are disabled (`SessionCreationPolicy.STATELESS`).
- The Docker socket is mounted read-write on cameleer-saas for container
management. This is the highest-privilege access in the system.
- Logto's admin endpoint (`:3002`) is not exposed through Traefik.
- ClickHouse has no external port exposure.
---
## 9. Frontend Architecture
### 9.1 Stack
| Technology | Purpose |
|-----------------------|-------------------------------------------|
| React 19 | UI framework |
| Vite | Build tool and dev server |
| `@logto/react` | OIDC SDK (auth code flow, token mgmt) |
| Zustand | Org/tenant state management (`useOrgStore`)|
| TanStack React Query | Server state, caching, background refresh |
| React Router (v7) | Client-side routing |
| `@cameleer/design-system` | Shared component library (Gitea npm) |
### 9.2 Component Hierarchy
```
<ThemeProvider>
<ToastProvider>
<BreadcrumbProvider>
<GlobalFilterProvider>
<CommandPaletteProvider>
<LogtoProvider>
<TokenSync /> -- Manages org-scoped token provider
<QueryClientProvider>
<BrowserRouter>
<AppRouter>
/login -- LoginPage
/callback -- CallbackPage (OIDC redirect)
<ProtectedRoute>
<OrgResolver> -- Fetches /api/me, populates org store
<Layout>
/ -- DashboardPage
/environments -- EnvironmentsPage
/environments/:envId -- EnvironmentDetailPage
/environments/:envId/apps/:appId -- AppDetailPage
/license -- LicensePage
/admin/tenants -- AdminTenantsPage
```
### 9.3 Auth Data Flow
```
LogtoProvider -- Configured with 10 API resource scopes from /api/config
|
v
ProtectedRoute -- Gates on isAuthenticated, redirects to /login
|
v
OrgResolver -- Effect 1 [me]: populate org store from /api/me
| -- Effect 2 [me, currentOrgId]: fetch org-scoped
| -- access token, decode scopes into Set
| -- Re-runs Effect 2 on org switch (stale scope fix)
v
Layout + pages -- Read from useOrgStore for tenant context
-- Read from useAuth() for auth state
-- Read scopes for UI gating
```
### 9.4 State Stores
**`useOrgStore`** (Zustand) -- `ui/src/auth/useOrganization.ts`:
| Field | Type | Purpose |
|------------------|------------------|------------------------------------|
| `currentOrgId` | `string | null` | Logto org ID (for token scoping) |
| `currentTenantId`| `string | null` | DB UUID (for API calls) |
| `organizations` | `OrgInfo[]` | All orgs the user belongs to |
| `scopes` | `Set<string>` | OAuth2 scopes from access token |
**`useAuth()`** hook -- `ui/src/auth/useAuth.ts`:
Combines `@logto/react` state (`isAuthenticated`, `isLoading`) with org store
state (`currentTenantId`). Provides `logout` and `signIn` callbacks.
---
## 10. Configuration Reference
### 10.1 cameleer-saas
**Spring / Database:**
| Variable | Default | Description |
|------------------------------|----------------------------------------------|----------------------------------|
| `SPRING_DATASOURCE_URL` | `jdbc:postgresql://cameleer-postgres:5432/cameleer_saas` | PostgreSQL JDBC URL |
| `SPRING_DATASOURCE_USERNAME`| `cameleer` | PostgreSQL user |
| `SPRING_DATASOURCE_PASSWORD`| `cameleer_dev` | PostgreSQL password |
**Identity / OIDC:**
| Variable | Default | Description |
|---------------------------|------------|--------------------------------------------|
| `CAMELEER_SAAS_IDENTITY_LOGTOENDPOINT` | (empty) | Logto internal URL (Docker-internal) |
| `CAMELEER_SAAS_IDENTITY_LOGTOPUBLICENDPOINT` | (empty) | Logto public URL (browser-accessible) |
| `CAMELEER_SAAS_IDENTITY_M2MCLIENTID` | (empty) | M2M app client ID (from bootstrap) |
| `CAMELEER_SAAS_IDENTITY_M2MCLIENTSECRET` | (empty) | M2M app client secret (from bootstrap) |
| `CAMELEER_SAAS_IDENTITY_SPACLIENTID` | (empty) | SPA app client ID (fallback; bootstrap preferred) |
**Provisioning** (`cameleer.saas.provisioning.*` / `CAMELEER_SAAS_PROVISIONING_*`):
| Variable | Default | Description |
|-----------------------------------|------------------------------------|----------------------------------|
| `CAMELEER_SAAS_PROVISIONING_SERVERIMAGE` | `gitea.siegeln.net/cameleer/cameleer-server:latest` | Docker image for per-tenant server |
| `CAMELEER_SAAS_PROVISIONING_SERVERUIIMAGE` | `gitea.siegeln.net/cameleer/cameleer-server-ui:latest` | Docker image for per-tenant UI |
| `CAMELEER_SAAS_PROVISIONING_NETWORKNAME` | `cameleer-saas_cameleer` | Shared services Docker network |
| `CAMELEER_SAAS_PROVISIONING_TRAEFIKNETWORK` | `cameleer-traefik` | Traefik Docker network |
| `CAMELEER_SAAS_PROVISIONING_PUBLICHOST` | `localhost` | Public hostname (same as infrastructure `PUBLIC_HOST`) |
| `CAMELEER_SAAS_PROVISIONING_PUBLICPROTOCOL` | `https` | Public protocol (same as infrastructure `PUBLIC_PROTOCOL`) |
| `CAMELEER_SAAS_PROVISIONING_DATASOURCEURL` | `jdbc:postgresql://cameleer-postgres:5432/cameleer` | PostgreSQL URL passed to tenant servers |
| `CAMELEER_SAAS_PROVISIONING_CLICKHOUSEURL` | `jdbc:clickhouse://cameleer-clickhouse:8123/cameleer` | ClickHouse URL passed to tenant servers |
### 10.2 cameleer-server (per-tenant)
Env vars injected into provisioned per-tenant server containers by `DockerTenantProvisioner`. All server properties use the `cameleer.server.*` prefix (env vars: `CAMELEER_SERVER_*`).
| Variable | Default / Value | Description |
|------------------------------|----------------------------------------------|----------------------------------|
| `SPRING_DATASOURCE_URL` | `jdbc:postgresql://cameleer-postgres:5432/cameleer` | PostgreSQL JDBC URL |
| `SPRING_DATASOURCE_USERNAME`| `cameleer` | PostgreSQL user |
| `SPRING_DATASOURCE_PASSWORD`| `cameleer_dev` | PostgreSQL password |
| `CAMELEER_SERVER_CLICKHOUSE_URL` | `jdbc:clickhouse://cameleer-clickhouse:8123/cameleer` | ClickHouse JDBC URL |
| `CAMELEER_SERVER_TENANT_ID` | *(tenant slug)* | Tenant identifier for data isolation |
| `CAMELEER_SERVER_SECURITY_BOOTSTRAPTOKEN` | *(generated)* | Agent bootstrap token |
| `CAMELEER_SERVER_SECURITY_JWTSECRET` | *(generated, must be non-empty)* | JWT signing secret |
| `CAMELEER_SERVER_SECURITY_OIDC_ISSUERURI` | `${PUBLIC_PROTOCOL}://${PUBLIC_HOST}/oidc` | OIDC issuer for M2M tokens |
| `CAMELEER_SERVER_SECURITY_OIDC_JWKSETURI` | `http://cameleer-logto:3001/oidc/jwks` | Docker-internal JWK fetch |
| `CAMELEER_SERVER_SECURITY_OIDC_AUDIENCE` | `https://api.cameleer.local` | JWT audience validation |
| `CAMELEER_SERVER_SECURITY_CORSALLOWEDORIGINS` | `${PUBLIC_PROTOCOL}://${PUBLIC_HOST}` | CORS for browser requests |
| `CAMELEER_SERVER_RUNTIME_ENABLED` | `true` | Enable Docker orchestration |
| `CAMELEER_SERVER_RUNTIME_SERVERURL` | `http://cameleer-server-{slug}:8081` | Per-tenant server URL |
| `CAMELEER_SERVER_RUNTIME_ROUTINGDOMAIN` | `${PUBLIC_HOST}` | Domain for Traefik routing |
| `CAMELEER_SERVER_RUNTIME_ROUTINGMODE` | `path` | `path` or `subdomain` routing |
| `CAMELEER_SERVER_RUNTIME_JARSTORAGEPATH` | `/data/jars` | JAR file storage directory |
| `CAMELEER_SERVER_RUNTIME_DOCKERNETWORK` | `cameleer-tenant-{slug}` | Primary network for app containers |
| `CAMELEER_SERVER_RUNTIME_JARDOCKERVOLUME` | `cameleer-jars-{slug}` | Docker volume for JAR sharing |
| `CAMELEER_JWT_SECRET` | `cameleer-dev-jwt-secret-...` | HMAC secret for internal JWTs |
| `CAMELEER_SERVER_TENANT_ID` | `default` | Tenant slug for data isolation |
| `CAMELEER_SERVER_SECURITY_OIDCISSUERURI` | (empty) | Logto issuer for M2M token validation |
| `CAMELEER_SERVER_SECURITY_OIDCAUDIENCE` | (empty) | Expected JWT audience |
### 10.3 logto
| Variable | Default | Description |
|---------------------|--------------------------|---------------------------------|
| `LOGTO_PUBLIC_ENDPOINT` | `http://localhost:3001`| Public-facing Logto URL |
| `LOGTO_ADMIN_ENDPOINT` | `http://localhost:3002`| Admin console URL (not exposed) |
### 10.4 postgres
| Variable | Default | Description |
|---------------------|-------------------|---------------------------------|
| `POSTGRES_DB` | `cameleer_saas` | Default database name |
| `POSTGRES_USER` | `cameleer` | PostgreSQL superuser |
| `POSTGRES_PASSWORD` | `cameleer_dev` | PostgreSQL password |
### 10.5 logto-bootstrap
| Variable | Default | Description |
|----------------------|----------------------------|--------------------------------|
| `SAAS_ADMIN_USER` | `admin` | Platform admin username |
| `SAAS_ADMIN_PASS` | `admin` | Platform admin password |
| `TENANT_ADMIN_USER` | `camel` | Default tenant admin username |
| `TENANT_ADMIN_PASS` | `camel` | Default tenant admin password |
| `CAMELEER_SERVER_SECURITY_BOOTSTRAPTOKEN`| `default-bootstrap-token` | Agent bootstrap token |
### 10.6 Bootstrap Output
The bootstrap script writes `/data/logto-bootstrap.json` containing:
```json
{
"spaClientId": "<auto-generated>",
"m2mClientId": "<auto-generated>",
"m2mClientSecret": "<auto-generated>",
"tradAppId": "<auto-generated>",
"tradAppSecret": "<auto-generated>",
"apiResourceIndicator": "https://api.cameleer.local",
"organizationId": "<auto-generated>",
"tenantName": "Example Tenant",
"tenantSlug": "default",
"bootstrapToken": "<from env>",
"platformAdminUser": "<from env>",
"tenantAdminUser": "<from env>",
"oidcIssuerUri": "http://cameleer-logto:3001/oidc",
"oidcAudience": "https://api.cameleer.local"
}
```
This file is mounted read-only into cameleer-saas via the `bootstrapdata`
volume. `PublicConfigController` reads it to serve SPA client IDs and the API
resource indicator without requiring environment variable configuration. The
controller also includes a `scopes` array (see Section 7.1) so the frontend
can request the correct API resource scopes during Logto sign-in.
---
## Appendix: Key Source Files
| File | Purpose |
|------|---------|
| `docker-compose.yml` | Service topology and configuration |
| `docker/logto-bootstrap.sh` | Idempotent Logto + DB bootstrap |
| `src/.../config/SecurityConfig.java` | Spring Security filter chain |
| `src/.../config/TenantIsolationInterceptor.java` | JWT org_id -> tenant resolution + path variable ownership validation (fail-closed) |
| `src/.../config/WebConfig.java` | Registers `TenantIsolationInterceptor` on `/api/**` |
| `src/.../config/TenantContext.java` | ThreadLocal tenant ID holder |
| `src/.../config/MeController.java` | User identity + tenant endpoint |
| `src/.../config/PublicConfigController.java` | SPA configuration endpoint (Logto config + scopes) |
| `src/.../tenant/TenantController.java` | Tenant CRUD (platform:admin gated) |
| `src/.../environment/EnvironmentController.java` | Environment CRUD |
| `src/.../app/AppController.java` | App CRUD + JAR upload |
| `src/.../deployment/DeploymentService.java` | Async deployment orchestration |
| `src/.../deployment/DeploymentController.java` | Deploy/stop/restart endpoints |
| `src/.../apikey/ApiKeyService.java` | API key generation, rotation, revocation |
| `src/.../identity/LogtoManagementClient.java` | Logto Management API client |
| `src/.../audit/AuditService.java` | Audit log writer |
| `src/.../runtime/RuntimeConfig.java` | Container runtime configuration |
| `ui/src/main.tsx` | React app entry, Logto provider setup |
| `ui/src/router.tsx` | Client-side route definitions |
| `ui/src/auth/OrgResolver.tsx` | Org + scope resolution from JWT |
| `ui/src/auth/useOrganization.ts` | Zustand org/tenant store |
| `ui/src/auth/useAuth.ts` | Auth convenience hook |
| `ui/src/auth/ProtectedRoute.tsx` | Route guard component |

View File

@@ -80,7 +80,7 @@ Note: Phase 9 (Frontend) can be developed in parallel with Phases 3-8, building
**PRD Sections:** 6 (Tenant Provisioning), 11 (Networking & Tenant Isolation) **PRD Sections:** 6 (Tenant Provisioning), 11 (Networking & Tenant Isolation)
**Gitea Epics:** #3 (Tenant Provisioning), #8 (Networking) **Gitea Epics:** #3 (Tenant Provisioning), #8 (Networking)
**Depends on:** Phase 2 **Depends on:** Phase 2
**Produces:** Automated tenant provisioning pipeline. Signup creates tenant → Flux HelmRelease generated → namespace provisioned → cameleer3-server deployed → PostgreSQL schema + OpenSearch index created → tenant ACTIVE. NetworkPolicies enforced. **Produces:** Automated tenant provisioning pipeline. Signup creates tenant → Flux HelmRelease generated → namespace provisioned → cameleer-server deployed → PostgreSQL schema + OpenSearch index created → tenant ACTIVE. NetworkPolicies enforced.
**Key deliverables:** **Key deliverables:**
- Provisioning state machine (idempotent, retryable) - Provisioning state machine (idempotent, retryable)
@@ -91,7 +91,7 @@ Note: Phase 9 (Frontend) can be developed in parallel with Phases 3-8, building
- Readiness checking (poll tenant server health) - Readiness checking (poll tenant server health)
- Tenant lifecycle operations (suspend, reactivate, delete) - Tenant lifecycle operations (suspend, reactivate, delete)
- K8s NetworkPolicy templates (default deny + allow rules) - K8s NetworkPolicy templates (default deny + allow rules)
- Helm chart for cameleer3-server tenant deployment - Helm chart for cameleer-server tenant deployment
--- ---
@@ -143,11 +143,11 @@ Note: Phase 9 (Frontend) can be developed in parallel with Phases 3-8, building
**PRD Sections:** 8 (Observability Integration) **PRD Sections:** 8 (Observability Integration)
**Gitea Epics:** #6 (Observability Integration), #13 (Exchange Replay — gating only) **Gitea Epics:** #6 (Observability Integration), #13 (Exchange Replay — gating only)
**Depends on:** Phase 3 (server already deployed per tenant), Phase 2 (license for feature gating) **Depends on:** Phase 3 (server already deployed per tenant), Phase 2 (license for feature gating)
**Produces:** Tenants see their cameleer3-server UI embedded in the SaaS shell. API gateway routes to tenant server. MOAT features gated by license tier. **Produces:** Tenants see their cameleer-server UI embedded in the SaaS shell. API gateway routes to tenant server. MOAT features gated by license tier.
**Key deliverables:** **Key deliverables:**
- Ingress routing rules: `/t/{tenant}/api/*` → tenant's cameleer3-server - Ingress routing rules: `/t/{tenant}/api/*` → tenant's cameleer-server
- cameleer3-server "managed mode" configuration (trust SaaS JWT, report metrics) - cameleer-server "managed mode" configuration (trust SaaS JWT, report metrics)
- Bootstrap token generation API - Bootstrap token generation API
- MOAT feature gating via license (topology=all, lineage=limited/full, correlation=mid+, debugger=high+, replay=high+) - MOAT feature gating via license (topology=all, lineage=limited/full, correlation=mid+, debugger=high+, replay=high+)
- Server UI embedding approach (iframe or reverse proxy with path rewriting) - Server UI embedding approach (iframe or reverse proxy with path rewriting)
@@ -211,7 +211,7 @@ Note: Phase 9 (Frontend) can be developed in parallel with Phases 3-8, building
- SaaS shell (navigation, tenant switcher, user menu) - SaaS shell (navigation, tenant switcher, user menu)
- Dashboard (platform overview) - Dashboard (platform overview)
- Apps list + App deployment page (upload, config, secrets, status, logs, versions) - Apps list + App deployment page (upload, config, secrets, status, logs, versions)
- Observability section (embedded cameleer3-server UI) - Observability section (embedded cameleer-server UI)
- Team management pages - Team management pages
- Settings pages (tenant config, SSO/OIDC, vault connections) - Settings pages (tenant config, SSO/OIDC, vault connections)
- Billing pages (usage, invoices, plan management) - Billing pages (usage, invoices, plan management)

View File

@@ -2006,7 +2006,7 @@ available throughout request lifecycle."
**Files:** **Files:**
- Create: `src/main/java/net/siegeln/cameleer/saas/config/ForwardAuthController.java` - Create: `src/main/java/net/siegeln/cameleer/saas/config/ForwardAuthController.java`
This endpoint is called by Traefik's ForwardAuth middleware to validate requests routed to non-platform services (e.g., cameleer3-server). It validates the JWT, resolves the tenant, and returns tenant context headers. This endpoint is called by Traefik's ForwardAuth middleware to validate requests routed to non-platform services (e.g., cameleer-server). It validates the JWT, resolves the tenant, and returns tenant context headers.
- [ ] **Step 1: Create ForwardAuthController** - [ ] **Step 1: Create ForwardAuthController**
@@ -2455,8 +2455,8 @@ services:
networks: networks:
- cameleer - cameleer
cameleer3-server: cameleer-server:
image: ${CAMELEER3_SERVER_IMAGE:-gitea.siegeln.net/cameleer/cameleer3-server}:${VERSION:-latest} image: ${CAMELEER_SERVER_IMAGE:-gitea.siegeln.net/cameleer/cameleer-server}:${VERSION:-latest}
restart: unless-stopped restart: unless-stopped
depends_on: depends_on:
postgres: postgres:
@@ -2539,9 +2539,9 @@ git add docker-compose.yml docker-compose.dev.yml traefik.yml docker/init-databa
git commit -m "feat: add Docker Compose production stack with Traefik + Logto git commit -m "feat: add Docker Compose production stack with Traefik + Logto
7-container stack: Traefik (reverse proxy), PostgreSQL (shared), 7-container stack: Traefik (reverse proxy), PostgreSQL (shared),
Logto (identity), cameleer-saas (control plane), cameleer3-server Logto (identity), cameleer-saas (control plane), cameleer-server
(observability), ClickHouse (traces). ForwardAuth middleware for (observability), ClickHouse (traces). ForwardAuth middleware for
tenant-aware routing to cameleer3-server." tenant-aware routing to cameleer-server."
``` ```
--- ---

File diff suppressed because it is too large Load Diff

View File

@@ -0,0 +1,789 @@
# Phase 4: Observability Pipeline + Inbound Routing — Implementation Plan
> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
**Goal:** Complete the deploy → hit endpoint → see traces loop. Serve the existing cameleer-server dashboard, add agent connectivity verification, enable optional inbound HTTP routing for customer apps, and wire up observability data health checks.
**Architecture:** Wiring phase — cameleer-server already has full observability. Phase 4 adds Traefik routing for the dashboard + customer app endpoints, new API endpoints in cameleer-saas for agent-status and observability-status, and configures `CAMELEER_TENANT_ID` on the server.
**Tech Stack:** Spring Boot 3.4.3, docker-java 3.4.1, ClickHouse JDBC, Traefik v3 labels, Spring RestClient
---
## File Structure
### New Files
- `src/main/java/net/siegeln/cameleer/saas/observability/AgentStatusService.java` — Queries cameleer-server for agent registration
- `src/main/java/net/siegeln/cameleer/saas/observability/AgentStatusController.java` — Agent status + observability status endpoints
- `src/main/java/net/siegeln/cameleer/saas/observability/dto/AgentStatusResponse.java` — Response DTO
- `src/main/java/net/siegeln/cameleer/saas/observability/dto/ObservabilityStatusResponse.java` — Response DTO
- `src/main/java/net/siegeln/cameleer/saas/observability/dto/UpdateRoutingRequest.java` — Request DTO for PATCH routing
- `src/main/java/net/siegeln/cameleer/saas/observability/ConnectivityHealthCheck.java` — Startup connectivity verification
- `src/test/java/net/siegeln/cameleer/saas/observability/AgentStatusServiceTest.java` — Unit tests
- `src/test/java/net/siegeln/cameleer/saas/observability/AgentStatusControllerTest.java` — Integration tests
- `src/main/resources/db/migration/V010__add_exposed_port_to_apps.sql` — Migration
### Modified Files
- `src/main/java/net/siegeln/cameleer/saas/runtime/StartContainerRequest.java` — Add `labels` field
- `src/main/java/net/siegeln/cameleer/saas/runtime/DockerRuntimeOrchestrator.java` — Apply labels on container create
- `src/main/java/net/siegeln/cameleer/saas/runtime/RuntimeConfig.java` — Add `domain` property
- `src/main/java/net/siegeln/cameleer/saas/app/AppEntity.java` — Add `exposedPort` field
- `src/main/java/net/siegeln/cameleer/saas/app/AppService.java` — Add `updateRouting` method
- `src/main/java/net/siegeln/cameleer/saas/app/AppController.java` — Add PATCH routing endpoint
- `src/main/java/net/siegeln/cameleer/saas/app/dto/AppResponse.java` — Add `exposedPort` + `routeUrl` fields
- `src/main/java/net/siegeln/cameleer/saas/deployment/DeploymentService.java` — Build labels for Traefik routing
- `src/main/resources/application.yml` — Add `domain` property
- `docker-compose.yml` — Add dashboard Traefik route, `CAMELEER_TENANT_ID`
- `.env.example` — Add `CAMELEER_TENANT_SLUG`
- `HOWTO.md` — Update with observability + routing docs
---
## Task 1: Database Migration + Entity Changes
**Files:**
- Create: `src/main/resources/db/migration/V010__add_exposed_port_to_apps.sql`
- Modify: `src/main/java/net/siegeln/cameleer/saas/app/AppEntity.java`
- [ ] **Step 1: Create migration V010**
```sql
ALTER TABLE apps ADD COLUMN exposed_port INTEGER;
```
- [ ] **Step 2: Add exposedPort field to AppEntity**
Add after `previousDeploymentId` field:
```java
@Column(name = "exposed_port")
private Integer exposedPort;
```
Add getter and setter:
```java
public Integer getExposedPort() { return exposedPort; }
public void setExposedPort(Integer exposedPort) { this.exposedPort = exposedPort; }
```
- [ ] **Step 3: Verify compilation**
Run: `mvn compile -B -q`
- [ ] **Step 4: Commit**
```bash
git add src/main/resources/db/migration/V010__add_exposed_port_to_apps.sql \
src/main/java/net/siegeln/cameleer/saas/app/AppEntity.java
git commit -m "feat: add exposed_port column to apps table"
```
---
## Task 2: StartContainerRequest Labels + DockerRuntimeOrchestrator
**Files:**
- Modify: `src/main/java/net/siegeln/cameleer/saas/runtime/StartContainerRequest.java`
- Modify: `src/main/java/net/siegeln/cameleer/saas/runtime/DockerRuntimeOrchestrator.java`
- [ ] **Step 1: Add labels field to StartContainerRequest**
Replace the current record with:
```java
package net.siegeln.cameleer.saas.runtime;
import java.util.Map;
public record StartContainerRequest(
String imageRef,
String containerName,
String network,
Map<String, String> envVars,
long memoryLimitBytes,
int cpuShares,
int healthCheckPort,
Map<String, String> labels
) {}
```
- [ ] **Step 2: Apply labels in DockerRuntimeOrchestrator.startContainer**
In the `startContainer` method, after `.withHostConfig(hostConfig)` and before `.withHealthcheck(...)`, add:
```java
.withLabels(request.labels() != null ? request.labels() : Map.of())
```
- [ ] **Step 3: Fix all existing callers of StartContainerRequest**
The `DeploymentService.executeDeploymentAsync` method creates a `StartContainerRequest`. Add `Map.of()` as the labels argument (empty labels for now — routing labels come in Task 5):
Find the existing `new StartContainerRequest(...)` call and add `Map.of()` as the last argument.
- [ ] **Step 4: Verify compilation and run unit tests**
Run: `mvn test -B -Dsurefire.excludes="**/*ControllerTest.java,**/AuditRepositoryTest.java,**/CameleerSaasApplicationTest.java" -q`
- [ ] **Step 5: Commit**
```bash
git add src/main/java/net/siegeln/cameleer/saas/runtime/StartContainerRequest.java \
src/main/java/net/siegeln/cameleer/saas/runtime/DockerRuntimeOrchestrator.java \
src/main/java/net/siegeln/cameleer/saas/deployment/DeploymentService.java
git commit -m "feat: add labels support to StartContainerRequest and DockerRuntimeOrchestrator"
```
---
## Task 3: RuntimeConfig Domain + AppResponse + AppService Routing
**Files:**
- Modify: `src/main/java/net/siegeln/cameleer/saas/runtime/RuntimeConfig.java`
- Modify: `src/main/java/net/siegeln/cameleer/saas/app/dto/AppResponse.java`
- Modify: `src/main/java/net/siegeln/cameleer/saas/app/AppService.java`
- Modify: `src/main/java/net/siegeln/cameleer/saas/app/AppController.java`
- Create: `src/main/java/net/siegeln/cameleer/saas/observability/dto/UpdateRoutingRequest.java`
- Modify: `src/main/resources/application.yml`
- [ ] **Step 1: Add domain property to RuntimeConfig**
Add field and getter:
```java
@Value("${cameleer.runtime.domain:localhost}")
private String domain;
public String getDomain() { return domain; }
```
- [ ] **Step 2: Add domain to application.yml**
In the `cameleer.runtime` section, add:
```yaml
domain: ${DOMAIN:localhost}
```
- [ ] **Step 3: Update AppResponse to include exposedPort and routeUrl**
Replace the record:
```java
package net.siegeln.cameleer.saas.app.dto;
import java.time.Instant;
import java.util.UUID;
public record AppResponse(
UUID id,
UUID environmentId,
String slug,
String displayName,
String jarOriginalFilename,
Long jarSizeBytes,
String jarChecksum,
Integer exposedPort,
String routeUrl,
UUID currentDeploymentId,
UUID previousDeploymentId,
Instant createdAt,
Instant updatedAt
) {}
```
- [ ] **Step 4: Create UpdateRoutingRequest**
```java
package net.siegeln.cameleer.saas.observability.dto;
public record UpdateRoutingRequest(
Integer exposedPort
) {}
```
- [ ] **Step 5: Add updateRouting method to AppService**
```java
public AppEntity updateRouting(UUID appId, Integer exposedPort, UUID actorId) {
var app = appRepository.findById(appId)
.orElseThrow(() -> new IllegalArgumentException("App not found"));
app.setExposedPort(exposedPort);
return appRepository.save(app);
}
```
- [ ] **Step 6: Update AppController — add PATCH routing endpoint and update toResponse**
Add the endpoint:
```java
@PatchMapping("/{appId}/routing")
public ResponseEntity<AppResponse> updateRouting(
@PathVariable UUID environmentId,
@PathVariable UUID appId,
@RequestBody UpdateRoutingRequest request,
Authentication authentication) {
try {
var actorId = resolveActorId(authentication);
var app = appService.updateRouting(appId, request.exposedPort(), actorId);
var env = environmentService.getById(app.getEnvironmentId()).orElse(null);
return ResponseEntity.ok(toResponse(app, env));
} catch (IllegalArgumentException e) {
return ResponseEntity.notFound().build();
}
}
```
This requires adding `EnvironmentService` and `RuntimeConfig` as constructor dependencies to `AppController`. Update the constructor.
Update `toResponse` to accept the environment and compute the route URL:
```java
private AppResponse toResponse(AppEntity app, EnvironmentEntity env) {
String routeUrl = null;
if (app.getExposedPort() != null && env != null) {
var tenant = tenantRepository.findById(env.getTenantId()).orElse(null);
if (tenant != null) {
routeUrl = "http://" + app.getSlug() + "." + env.getSlug() + "."
+ tenant.getSlug() + "." + runtimeConfig.getDomain();
}
}
return new AppResponse(
app.getId(), app.getEnvironmentId(), app.getSlug(), app.getDisplayName(),
app.getJarOriginalFilename(), app.getJarSizeBytes(), app.getJarChecksum(),
app.getExposedPort(), routeUrl,
app.getCurrentDeploymentId(), app.getPreviousDeploymentId(),
app.getCreatedAt(), app.getUpdatedAt());
}
```
This requires adding `TenantRepository` as a constructor dependency too. Update the existing `toResponse(AppEntity)` calls in other methods to pass the environment — look up the environment from the `environmentId` path variable or from `environmentService`.
For the list/get/create endpoints that already have `environmentId` in the path, look up the environment once and pass it.
- [ ] **Step 7: Verify compilation**
Run: `mvn compile -B -q`
- [ ] **Step 8: Commit**
```bash
git add src/main/java/net/siegeln/cameleer/saas/runtime/RuntimeConfig.java \
src/main/java/net/siegeln/cameleer/saas/app/dto/AppResponse.java \
src/main/java/net/siegeln/cameleer/saas/app/AppService.java \
src/main/java/net/siegeln/cameleer/saas/app/AppController.java \
src/main/java/net/siegeln/cameleer/saas/observability/dto/UpdateRoutingRequest.java \
src/main/resources/application.yml
git commit -m "feat: add exposed port routing and route URL to app API"
```
---
## Task 4: Agent Status + Observability Status Endpoints (TDD)
**Files:**
- Create: `src/main/java/net/siegeln/cameleer/saas/observability/dto/AgentStatusResponse.java`
- Create: `src/main/java/net/siegeln/cameleer/saas/observability/dto/ObservabilityStatusResponse.java`
- Create: `src/main/java/net/siegeln/cameleer/saas/observability/AgentStatusService.java`
- Create: `src/main/java/net/siegeln/cameleer/saas/observability/AgentStatusController.java`
- Create: `src/test/java/net/siegeln/cameleer/saas/observability/AgentStatusServiceTest.java`
- [ ] **Step 1: Create DTOs**
`AgentStatusResponse.java`:
```java
package net.siegeln.cameleer.saas.observability.dto;
import java.time.Instant;
import java.util.List;
public record AgentStatusResponse(
boolean registered,
String state,
Instant lastHeartbeat,
List<String> routeIds,
String applicationId,
String environmentId
) {}
```
`ObservabilityStatusResponse.java`:
```java
package net.siegeln.cameleer.saas.observability.dto;
import java.time.Instant;
public record ObservabilityStatusResponse(
boolean hasTraces,
boolean hasMetrics,
boolean hasDiagrams,
Instant lastTraceAt,
long traceCount24h
) {}
```
- [ ] **Step 2: Write failing tests**
```java
package net.siegeln.cameleer.saas.observability;
import net.siegeln.cameleer.saas.app.AppEntity;
import net.siegeln.cameleer.saas.app.AppRepository;
import net.siegeln.cameleer.saas.environment.EnvironmentEntity;
import net.siegeln.cameleer.saas.environment.EnvironmentRepository;
import net.siegeln.cameleer.saas.runtime.RuntimeConfig;
import org.junit.jupiter.api.BeforeEach;
import org.junit.jupiter.api.Test;
import org.junit.jupiter.api.extension.ExtendWith;
import org.mockito.Mock;
import org.mockito.junit.jupiter.MockitoExtension;
import java.util.Optional;
import java.util.UUID;
import static org.junit.jupiter.api.Assertions.*;
import static org.mockito.Mockito.*;
@ExtendWith(MockitoExtension.class)
class AgentStatusServiceTest {
@Mock private AppRepository appRepository;
@Mock private EnvironmentRepository environmentRepository;
@Mock private RuntimeConfig runtimeConfig;
private AgentStatusService agentStatusService;
@BeforeEach
void setUp() {
when(runtimeConfig.getCameleerServerEndpoint()).thenReturn("http://cameleer-server:8081");
agentStatusService = new AgentStatusService(appRepository, environmentRepository, runtimeConfig);
}
@Test
void getAgentStatus_appNotFound_shouldThrow() {
when(appRepository.findById(any())).thenReturn(Optional.empty());
assertThrows(IllegalArgumentException.class,
() -> agentStatusService.getAgentStatus(UUID.randomUUID()));
}
@Test
void getAgentStatus_shouldReturnUnknownWhenServerUnreachable() {
var appId = UUID.randomUUID();
var envId = UUID.randomUUID();
var app = new AppEntity();
app.setId(appId);
app.setEnvironmentId(envId);
app.setSlug("my-app");
when(appRepository.findById(appId)).thenReturn(Optional.of(app));
var env = new EnvironmentEntity();
env.setId(envId);
env.setSlug("default");
when(environmentRepository.findById(envId)).thenReturn(Optional.of(env));
var result = agentStatusService.getAgentStatus(appId);
assertNotNull(result);
assertFalse(result.registered());
assertEquals("UNKNOWN", result.state());
}
}
```
- [ ] **Step 3: Implement AgentStatusService**
```java
package net.siegeln.cameleer.saas.observability;
import net.siegeln.cameleer.saas.app.AppRepository;
import net.siegeln.cameleer.saas.environment.EnvironmentRepository;
import net.siegeln.cameleer.saas.observability.dto.AgentStatusResponse;
import net.siegeln.cameleer.saas.observability.dto.ObservabilityStatusResponse;
import net.siegeln.cameleer.saas.runtime.RuntimeConfig;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.beans.factory.annotation.Qualifier;
import org.springframework.stereotype.Service;
import org.springframework.web.client.RestClient;
import javax.sql.DataSource;
import java.sql.Timestamp;
import java.time.Instant;
import java.util.List;
import java.util.UUID;
@Service
public class AgentStatusService {
private static final Logger log = LoggerFactory.getLogger(AgentStatusService.class);
private final AppRepository appRepository;
private final EnvironmentRepository environmentRepository;
private final RuntimeConfig runtimeConfig;
private final RestClient restClient;
@Autowired(required = false)
@Qualifier("clickHouseDataSource")
private DataSource clickHouseDataSource;
public AgentStatusService(AppRepository appRepository,
EnvironmentRepository environmentRepository,
RuntimeConfig runtimeConfig) {
this.appRepository = appRepository;
this.environmentRepository = environmentRepository;
this.runtimeConfig = runtimeConfig;
this.restClient = RestClient.builder()
.baseUrl(runtimeConfig.getCameleerServerEndpoint())
.build();
}
public AgentStatusResponse getAgentStatus(UUID appId) {
var app = appRepository.findById(appId)
.orElseThrow(() -> new IllegalArgumentException("App not found"));
var env = environmentRepository.findById(app.getEnvironmentId())
.orElseThrow(() -> new IllegalStateException("Environment not found"));
try {
var response = restClient.get()
.uri("/api/v1/agents")
.header("Authorization", "Bearer " + runtimeConfig.getBootstrapToken())
.retrieve()
.body(List.class);
if (response != null) {
for (var agentObj : response) {
if (agentObj instanceof java.util.Map<?, ?> agent) {
var agentAppId = String.valueOf(agent.get("applicationId"));
var agentEnvId = String.valueOf(agent.get("environmentId"));
if (app.getSlug().equals(agentAppId) && env.getSlug().equals(agentEnvId)) {
var state = String.valueOf(agent.getOrDefault("state", "UNKNOWN"));
var routeIds = agent.get("routeIds");
@SuppressWarnings("unchecked")
var routes = routeIds instanceof List<?> r ? (List<String>) r : List.<String>of();
return new AgentStatusResponse(true, state, null, routes,
agentAppId, agentEnvId);
}
}
}
}
return new AgentStatusResponse(false, "NOT_REGISTERED", null,
List.of(), app.getSlug(), env.getSlug());
} catch (Exception e) {
log.warn("Failed to query agent status from cameleer-server: {}", e.getMessage());
return new AgentStatusResponse(false, "UNKNOWN", null,
List.of(), app.getSlug(), env.getSlug());
}
}
public ObservabilityStatusResponse getObservabilityStatus(UUID appId) {
var app = appRepository.findById(appId)
.orElseThrow(() -> new IllegalArgumentException("App not found"));
var env = environmentRepository.findById(app.getEnvironmentId())
.orElseThrow(() -> new IllegalStateException("Environment not found"));
if (clickHouseDataSource == null) {
return new ObservabilityStatusResponse(false, false, false, null, 0);
}
try (var conn = clickHouseDataSource.getConnection();
var ps = conn.prepareStatement("""
SELECT
count() as trace_count,
max(start_time) as last_trace
FROM executions
WHERE application_id = ? AND environment = ?
AND start_time > now() - INTERVAL 24 HOUR
""")) {
ps.setString(1, app.getSlug());
ps.setString(2, env.getSlug());
try (var rs = ps.executeQuery()) {
if (rs.next()) {
var count = rs.getLong("trace_count");
var lastTrace = rs.getTimestamp("last_trace");
return new ObservabilityStatusResponse(
count > 0, false, false,
lastTrace != null ? lastTrace.toInstant() : null,
count);
}
}
} catch (Exception e) {
log.warn("Failed to query observability status from ClickHouse: {}", e.getMessage());
}
return new ObservabilityStatusResponse(false, false, false, null, 0);
}
}
```
- [ ] **Step 4: Create AgentStatusController**
```java
package net.siegeln.cameleer.saas.observability;
import net.siegeln.cameleer.saas.observability.dto.AgentStatusResponse;
import net.siegeln.cameleer.saas.observability.dto.ObservabilityStatusResponse;
import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.*;
import java.util.UUID;
@RestController
@RequestMapping("/api/apps/{appId}")
public class AgentStatusController {
private final AgentStatusService agentStatusService;
public AgentStatusController(AgentStatusService agentStatusService) {
this.agentStatusService = agentStatusService;
}
@GetMapping("/agent-status")
public ResponseEntity<AgentStatusResponse> getAgentStatus(@PathVariable UUID appId) {
try {
var status = agentStatusService.getAgentStatus(appId);
return ResponseEntity.ok(status);
} catch (IllegalArgumentException e) {
return ResponseEntity.notFound().build();
}
}
@GetMapping("/observability-status")
public ResponseEntity<ObservabilityStatusResponse> getObservabilityStatus(@PathVariable UUID appId) {
try {
var status = agentStatusService.getObservabilityStatus(appId);
return ResponseEntity.ok(status);
} catch (IllegalArgumentException e) {
return ResponseEntity.notFound().build();
}
}
}
```
- [ ] **Step 5: Run tests**
Run: `mvn test -pl . -Dtest=AgentStatusServiceTest -B`
Expected: 2 tests PASS
- [ ] **Step 6: Commit**
```bash
git add src/main/java/net/siegeln/cameleer/saas/observability/ \
src/test/java/net/siegeln/cameleer/saas/observability/
git commit -m "feat: add agent status and observability status endpoints"
```
---
## Task 5: Traefik Routing Labels in DeploymentService
**Files:**
- Modify: `src/main/java/net/siegeln/cameleer/saas/deployment/DeploymentService.java`
- [ ] **Step 1: Build Traefik labels when app has exposedPort**
In `executeDeploymentAsync`, after building the `envVars` map and before creating `startRequest`, add label computation:
```java
// Build Traefik labels for inbound routing
var labels = new java.util.HashMap<String, String>();
if (app.getExposedPort() != null) {
labels.put("traefik.enable", "true");
labels.put("traefik.http.routers." + containerName + ".rule",
"Host(`" + app.getSlug() + "." + env.getSlug() + "."
+ tenant.getSlug() + "." + runtimeConfig.getDomain() + "`)");
labels.put("traefik.http.services." + containerName + ".loadbalancer.server.port",
String.valueOf(app.getExposedPort()));
}
```
Then pass `labels` to the `StartContainerRequest` constructor (replacing the `Map.of()` added in Task 2).
Note: The `tenant` variable is already looked up earlier in the method for container naming.
- [ ] **Step 2: Run unit tests**
Run: `mvn test -pl . -Dtest=DeploymentServiceTest -B`
Expected: All tests PASS
- [ ] **Step 3: Commit**
```bash
git add src/main/java/net/siegeln/cameleer/saas/deployment/DeploymentService.java
git commit -m "feat: add Traefik routing labels for customer apps with exposed ports"
```
---
## Task 6: Connectivity Health Check
**Files:**
- Create: `src/main/java/net/siegeln/cameleer/saas/observability/ConnectivityHealthCheck.java`
- [ ] **Step 1: Create startup connectivity check**
```java
package net.siegeln.cameleer.saas.observability;
import net.siegeln.cameleer.saas.runtime.RuntimeConfig;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.boot.context.event.ApplicationReadyEvent;
import org.springframework.context.event.EventListener;
import org.springframework.stereotype.Component;
import org.springframework.web.client.RestClient;
@Component
public class ConnectivityHealthCheck {
private static final Logger log = LoggerFactory.getLogger(ConnectivityHealthCheck.class);
private final RuntimeConfig runtimeConfig;
public ConnectivityHealthCheck(RuntimeConfig runtimeConfig) {
this.runtimeConfig = runtimeConfig;
}
@EventListener(ApplicationReadyEvent.class)
public void verifyConnectivity() {
checkCameleerServer();
}
private void checkCameleerServer() {
try {
var client = RestClient.builder()
.baseUrl(runtimeConfig.getCameleerServerEndpoint())
.build();
var response = client.get()
.uri("/actuator/health")
.retrieve()
.toBodilessEntity();
if (response.getStatusCode().is2xxSuccessful()) {
log.info("cameleer-server connectivity: OK ({})",
runtimeConfig.getCameleerServerEndpoint());
} else {
log.warn("cameleer-server connectivity: HTTP {} ({})",
response.getStatusCode(), runtimeConfig.getCameleerServerEndpoint());
}
} catch (Exception e) {
log.warn("cameleer-server connectivity: FAILED ({}) - {}",
runtimeConfig.getCameleerServerEndpoint(), e.getMessage());
}
}
}
```
- [ ] **Step 2: Verify compilation**
Run: `mvn compile -B -q`
- [ ] **Step 3: Commit**
```bash
git add src/main/java/net/siegeln/cameleer/saas/observability/ConnectivityHealthCheck.java
git commit -m "feat: add cameleer-server startup connectivity check"
```
---
## Task 7: Docker Compose + .env + CI Updates
**Files:**
- Modify: `docker-compose.yml`
- Modify: `.env.example`
- Modify: `.gitea/workflows/ci.yml`
- [ ] **Step 1: Update docker-compose.yml — add dashboard route and CAMELEER_TENANT_ID**
In the `cameleer-server` service:
Add to environment section:
```yaml
CAMELEER_TENANT_ID: ${CAMELEER_TENANT_SLUG:-default}
```
Add new Traefik labels (after existing ones):
```yaml
- traefik.http.routers.dashboard.rule=PathPrefix(`/dashboard`)
- traefik.http.routers.dashboard.middlewares=forward-auth,dashboard-strip
- traefik.http.middlewares.dashboard-strip.stripprefix.prefixes=/dashboard
- traefik.http.services.dashboard.loadbalancer.server.port=8080
```
- [ ] **Step 2: Update .env.example**
Add:
```
CAMELEER_TENANT_SLUG=default
```
- [ ] **Step 3: Update CI excludes**
In `.gitea/workflows/ci.yml`, add `**/AgentStatusControllerTest.java` to the Surefire excludes (if integration test exists).
- [ ] **Step 4: Run all unit tests**
Run: `mvn test -B -Dsurefire.excludes="**/*ControllerTest.java,**/AuditRepositoryTest.java,**/CameleerSaasApplicationTest.java" -q`
- [ ] **Step 5: Commit**
```bash
git add docker-compose.yml .env.example .gitea/workflows/ci.yml
git commit -m "feat: add dashboard Traefik route and CAMELEER_TENANT_ID config"
```
---
## Task 8: Update HOWTO.md
**Files:**
- Modify: `HOWTO.md`
- [ ] **Step 1: Add observability and routing sections**
After the "Deploy a Camel Application" section, add:
**Observability Dashboard section** — explains how to access the dashboard at `/dashboard`, what data is visible.
**Inbound HTTP Routing section** — explains how to set `exposedPort` on an app and what URL to use.
**Agent Status section** — explains the agent-status and observability-status endpoints.
Update the API Reference table with the new endpoints:
- `GET /api/apps/{aid}/agent-status`
- `GET /api/apps/{aid}/observability-status`
- `PATCH /api/environments/{eid}/apps/{aid}/routing`
Update the .env table to include `CAMELEER_TENANT_SLUG`.
- [ ] **Step 2: Commit**
```bash
git add HOWTO.md
git commit -m "docs: update HOWTO with observability dashboard, routing, and agent status"
```
---
## Summary of Spec Coverage
| Spec Requirement | Task |
|---|---|
| Serve cameleer-server dashboard via Traefik | Task 7 (dashboard Traefik labels) |
| CAMELEER_TENANT_ID configuration | Task 7 (docker-compose env) |
| Agent connectivity verification endpoint | Task 4 (AgentStatusService + Controller) |
| Observability data health endpoint | Task 4 (ObservabilityStatusResponse) |
| Inbound HTTP routing (exposedPort + Traefik labels) | Tasks 1, 2, 3, 5 |
| StartContainerRequest labels support | Task 2 |
| AppResponse with routeUrl | Task 3 |
| PATCH routing API | Task 3 |
| Startup connectivity check | Task 6 |
| Docker Compose changes | Task 7 |
| .env.example updates | Task 7 |
| HOWTO.md updates | Task 8 |
| V010 migration | Task 1 |

File diff suppressed because it is too large Load Diff

File diff suppressed because it is too large Load Diff

View File

@@ -0,0 +1,986 @@
# Plan 1: Auth & RBAC Overhaul
> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
**Goal:** Add claim-based RBAC with managed/direct assignment origins, and make the server operate as a pure OAuth2 resource server when OIDC is configured.
**Architecture:** Extend the existing RBAC schema with an `origin` column (direct vs managed) on assignment tables, add a `claim_mapping_rules` table, and implement a ClaimMappingService that evaluates JWT claims against mapping rules on every OIDC login. When OIDC is configured, the server becomes a pure resource server — no local login, no JWT generation for users. Agents always use server-issued tokens regardless of auth mode.
**Tech Stack:** Java 17, Spring Boot 3.4.3, PostgreSQL 16, Flyway, JUnit 5, Testcontainers, AssertJ
**Repo:** `C:\Users\Hendrik\Documents\projects\cameleer-server`
---
## File Map
### New Files
- `cameleer-server-app/src/main/resources/db/migration/V2__claim_mapping.sql`
- `cameleer-server-core/src/main/java/com/cameleer/server/core/rbac/ClaimMappingRule.java`
- `cameleer-server-core/src/main/java/com/cameleer/server/core/rbac/ClaimMappingRepository.java`
- `cameleer-server-core/src/main/java/com/cameleer/server/core/rbac/ClaimMappingService.java`
- `cameleer-server-core/src/main/java/com/cameleer/server/core/rbac/AssignmentOrigin.java`
- `cameleer-server-app/src/main/java/com/cameleer/server/app/storage/PostgresClaimMappingRepository.java`
- `cameleer-server-app/src/main/java/com/cameleer/server/app/controller/ClaimMappingAdminController.java`
- `cameleer-server-app/src/test/java/com/cameleer/server/core/rbac/ClaimMappingServiceTest.java`
- `cameleer-server-app/src/test/java/com/cameleer/server/app/controller/ClaimMappingAdminControllerIT.java`
- `cameleer-server-app/src/test/java/com/cameleer/server/app/security/OidcOnlyModeIT.java`
### Modified Files
- `cameleer-server-app/src/main/resources/db/migration/V1__init.sql` — no changes (immutable)
- `cameleer-server-app/src/main/java/com/cameleer/server/app/rbac/RbacServiceImpl.java` — add origin-aware query methods
- `cameleer-server-app/src/main/java/com/cameleer/server/app/storage/PostgresUserRepository.java` — add origin-aware queries
- `cameleer-server-app/src/main/java/com/cameleer/server/app/security/OidcAuthController.java` — replace syncOidcRoles with claim mapping
- `cameleer-server-app/src/main/java/com/cameleer/server/app/security/JwtAuthenticationFilter.java` — disable internal token path in OIDC-only mode
- `cameleer-server-app/src/main/java/com/cameleer/server/app/security/SecurityConfig.java` — conditional endpoint registration
- `cameleer-server-app/src/main/java/com/cameleer/server/app/controller/UserAdminController.java` — disable in OIDC-only mode
- `cameleer-server-app/src/main/java/com/cameleer/server/app/config/AgentRegistryBeanConfig.java` — wire ClaimMappingService
- `cameleer-server-app/src/main/resources/application.yml` — no new properties needed (OIDC config already exists)
---
### Task 1: Database Migration — Add Origin Tracking and Claim Mapping Rules
**Files:**
- Create: `cameleer-server-app/src/main/resources/db/migration/V2__claim_mapping.sql`
- [ ] **Step 1: Write the migration**
```sql
-- V2__claim_mapping.sql
-- Add origin tracking to assignment tables
ALTER TABLE user_roles ADD COLUMN origin TEXT NOT NULL DEFAULT 'direct';
ALTER TABLE user_roles ADD COLUMN mapping_id UUID;
ALTER TABLE user_groups ADD COLUMN origin TEXT NOT NULL DEFAULT 'direct';
ALTER TABLE user_groups ADD COLUMN mapping_id UUID;
-- Drop old primary keys (they don't include origin)
ALTER TABLE user_roles DROP CONSTRAINT user_roles_pkey;
ALTER TABLE user_roles ADD PRIMARY KEY (user_id, role_id, origin);
ALTER TABLE user_groups DROP CONSTRAINT user_groups_pkey;
ALTER TABLE user_groups ADD PRIMARY KEY (user_id, group_id, origin);
-- Claim mapping rules table
CREATE TABLE claim_mapping_rules (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
claim TEXT NOT NULL,
match_type TEXT NOT NULL,
match_value TEXT NOT NULL,
action TEXT NOT NULL,
target TEXT NOT NULL,
priority INT NOT NULL DEFAULT 0,
created_at TIMESTAMPTZ NOT NULL DEFAULT now(),
CONSTRAINT chk_match_type CHECK (match_type IN ('equals', 'contains', 'regex')),
CONSTRAINT chk_action CHECK (action IN ('assignRole', 'addToGroup'))
);
-- Foreign key from assignments to mapping rules
ALTER TABLE user_roles ADD CONSTRAINT fk_user_roles_mapping
FOREIGN KEY (mapping_id) REFERENCES claim_mapping_rules(id) ON DELETE CASCADE;
ALTER TABLE user_groups ADD CONSTRAINT fk_user_groups_mapping
FOREIGN KEY (mapping_id) REFERENCES claim_mapping_rules(id) ON DELETE CASCADE;
-- Index for fast managed assignment cleanup
CREATE INDEX idx_user_roles_origin ON user_roles(user_id, origin);
CREATE INDEX idx_user_groups_origin ON user_groups(user_id, origin);
```
- [ ] **Step 2: Run migration to verify**
Run: `cd /c/Users/Hendrik/Documents/projects/cameleer-server && mvn flyway:migrate -pl cameleer-server-app -Dflyway.url=jdbc:postgresql://localhost:5432/cameleer -Dflyway.user=cameleer -Dflyway.password=cameleer_dev`
If no local PostgreSQL, verify syntax by running the existing test suite which uses Testcontainers.
- [ ] **Step 3: Commit**
```bash
git add cameleer-server-app/src/main/resources/db/migration/V2__claim_mapping.sql
git commit -m "feat: add claim mapping rules table and origin tracking to RBAC assignments"
```
---
### Task 2: Core Domain — ClaimMappingRule, AssignmentOrigin, Repository Interface
**Files:**
- Create: `cameleer-server-core/src/main/java/com/cameleer/server/core/rbac/AssignmentOrigin.java`
- Create: `cameleer-server-core/src/main/java/com/cameleer/server/core/rbac/ClaimMappingRule.java`
- Create: `cameleer-server-core/src/main/java/com/cameleer/server/core/rbac/ClaimMappingRepository.java`
- [ ] **Step 1: Create AssignmentOrigin enum**
```java
package com.cameleer.server.core.rbac;
public enum AssignmentOrigin {
direct, managed
}
```
- [ ] **Step 2: Create ClaimMappingRule record**
```java
package com.cameleer.server.core.rbac;
import java.time.Instant;
import java.util.UUID;
public record ClaimMappingRule(
UUID id,
String claim,
String matchType,
String matchValue,
String action,
String target,
int priority,
Instant createdAt
) {
public enum MatchType { equals, contains, regex }
public enum Action { assignRole, addToGroup }
}
```
- [ ] **Step 3: Create ClaimMappingRepository interface**
```java
package com.cameleer.server.core.rbac;
import java.util.List;
import java.util.Optional;
import java.util.UUID;
public interface ClaimMappingRepository {
List<ClaimMappingRule> findAll();
Optional<ClaimMappingRule> findById(UUID id);
UUID create(String claim, String matchType, String matchValue, String action, String target, int priority);
void update(UUID id, String claim, String matchType, String matchValue, String action, String target, int priority);
void delete(UUID id);
}
```
- [ ] **Step 4: Commit**
```bash
git add cameleer-server-core/src/main/java/com/cameleer/server/core/rbac/AssignmentOrigin.java
git add cameleer-server-core/src/main/java/com/cameleer/server/core/rbac/ClaimMappingRule.java
git add cameleer-server-core/src/main/java/com/cameleer/server/core/rbac/ClaimMappingRepository.java
git commit -m "feat: add ClaimMappingRule domain model and repository interface"
```
---
### Task 3: Core Domain — ClaimMappingService
**Files:**
- Create: `cameleer-server-core/src/main/java/com/cameleer/server/core/rbac/ClaimMappingService.java`
- Create: `cameleer-server-app/src/test/java/com/cameleer/server/core/rbac/ClaimMappingServiceTest.java`
- [ ] **Step 1: Write tests for ClaimMappingService**
```java
package com.cameleer.server.core.rbac;
import org.junit.jupiter.api.BeforeEach;
import org.junit.jupiter.api.Test;
import java.util.*;
import static org.assertj.core.api.Assertions.assertThat;
class ClaimMappingServiceTest {
private ClaimMappingService service;
@BeforeEach
void setUp() {
service = new ClaimMappingService();
}
@Test
void evaluate_containsMatch_onStringArrayClaim() {
var rule = new ClaimMappingRule(
UUID.randomUUID(), "groups", "contains", "cameleer-admins",
"assignRole", "ADMIN", 0, null);
Map<String, Object> claims = Map.of("groups", List.of("eng", "cameleer-admins", "devops"));
var results = service.evaluate(List.of(rule), claims);
assertThat(results).hasSize(1);
assertThat(results.get(0).rule()).isEqualTo(rule);
}
@Test
void evaluate_equalsMatch_onStringClaim() {
var rule = new ClaimMappingRule(
UUID.randomUUID(), "department", "equals", "platform",
"assignRole", "OPERATOR", 0, null);
Map<String, Object> claims = Map.of("department", "platform");
var results = service.evaluate(List.of(rule), claims);
assertThat(results).hasSize(1);
}
@Test
void evaluate_regexMatch() {
var rule = new ClaimMappingRule(
UUID.randomUUID(), "email", "regex", ".*@example\\.com$",
"addToGroup", "Example Corp", 0, null);
Map<String, Object> claims = Map.of("email", "john@example.com");
var results = service.evaluate(List.of(rule), claims);
assertThat(results).hasSize(1);
}
@Test
void evaluate_noMatch_returnsEmpty() {
var rule = new ClaimMappingRule(
UUID.randomUUID(), "groups", "contains", "cameleer-admins",
"assignRole", "ADMIN", 0, null);
Map<String, Object> claims = Map.of("groups", List.of("eng", "devops"));
var results = service.evaluate(List.of(rule), claims);
assertThat(results).isEmpty();
}
@Test
void evaluate_missingClaim_returnsEmpty() {
var rule = new ClaimMappingRule(
UUID.randomUUID(), "groups", "contains", "admins",
"assignRole", "ADMIN", 0, null);
Map<String, Object> claims = Map.of("department", "eng");
var results = service.evaluate(List.of(rule), claims);
assertThat(results).isEmpty();
}
@Test
void evaluate_rulesOrderedByPriority() {
var lowPriority = new ClaimMappingRule(
UUID.randomUUID(), "role", "equals", "dev",
"assignRole", "VIEWER", 0, null);
var highPriority = new ClaimMappingRule(
UUID.randomUUID(), "role", "equals", "dev",
"assignRole", "OPERATOR", 10, null);
Map<String, Object> claims = Map.of("role", "dev");
var results = service.evaluate(List.of(highPriority, lowPriority), claims);
assertThat(results).hasSize(2);
assertThat(results.get(0).rule().priority()).isEqualTo(0);
assertThat(results.get(1).rule().priority()).isEqualTo(10);
}
@Test
void evaluate_containsMatch_onSpaceSeparatedString() {
var rule = new ClaimMappingRule(
UUID.randomUUID(), "scope", "contains", "server:admin",
"assignRole", "ADMIN", 0, null);
Map<String, Object> claims = Map.of("scope", "openid profile server:admin");
var results = service.evaluate(List.of(rule), claims);
assertThat(results).hasSize(1);
}
}
```
- [ ] **Step 2: Run tests to verify they fail**
Run: `cd /c/Users/Hendrik/Documents/projects/cameleer-server && mvn test -pl cameleer-server-app -Dtest=ClaimMappingServiceTest -Dsurefire.failIfNoSpecifiedTests=false`
Expected: Compilation error — ClaimMappingService does not exist yet.
- [ ] **Step 3: Implement ClaimMappingService**
```java
package com.cameleer.server.core.rbac;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import java.util.*;
import java.util.regex.Pattern;
import java.util.stream.Collectors;
public class ClaimMappingService {
private static final Logger log = LoggerFactory.getLogger(ClaimMappingService.class);
public record MappingResult(ClaimMappingRule rule) {}
public List<MappingResult> evaluate(List<ClaimMappingRule> rules, Map<String, Object> claims) {
return rules.stream()
.sorted(Comparator.comparingInt(ClaimMappingRule::priority))
.filter(rule -> matches(rule, claims))
.map(MappingResult::new)
.toList();
}
private boolean matches(ClaimMappingRule rule, Map<String, Object> claims) {
Object claimValue = claims.get(rule.claim());
if (claimValue == null) return false;
return switch (rule.matchType()) {
case "equals" -> equalsMatch(claimValue, rule.matchValue());
case "contains" -> containsMatch(claimValue, rule.matchValue());
case "regex" -> regexMatch(claimValue, rule.matchValue());
default -> {
log.warn("Unknown match type: {}", rule.matchType());
yield false;
}
};
}
private boolean equalsMatch(Object claimValue, String matchValue) {
if (claimValue instanceof String s) {
return s.equalsIgnoreCase(matchValue);
}
return String.valueOf(claimValue).equalsIgnoreCase(matchValue);
}
private boolean containsMatch(Object claimValue, String matchValue) {
if (claimValue instanceof List<?> list) {
return list.stream().anyMatch(item -> String.valueOf(item).equalsIgnoreCase(matchValue));
}
if (claimValue instanceof String s) {
// Space-separated string (e.g., OAuth2 scope claim)
return Arrays.stream(s.split("\\s+"))
.anyMatch(part -> part.equalsIgnoreCase(matchValue));
}
return false;
}
private boolean regexMatch(Object claimValue, String matchValue) {
String s = String.valueOf(claimValue);
try {
return Pattern.matches(matchValue, s);
} catch (Exception e) {
log.warn("Invalid regex in claim mapping rule: {}", matchValue, e);
return false;
}
}
}
```
- [ ] **Step 4: Run tests to verify they pass**
Run: `cd /c/Users/Hendrik/Documents/projects/cameleer-server && mvn test -pl cameleer-server-app -Dtest=ClaimMappingServiceTest`
Expected: All 7 tests PASS.
- [ ] **Step 5: Commit**
```bash
git add cameleer-server-core/src/main/java/com/cameleer/server/core/rbac/ClaimMappingService.java
git add cameleer-server-app/src/test/java/com/cameleer/server/core/rbac/ClaimMappingServiceTest.java
git commit -m "feat: implement ClaimMappingService with equals/contains/regex matching"
```
---
### Task 4: PostgreSQL Repository — ClaimMappingRepository Implementation
**Files:**
- Create: `cameleer-server-app/src/main/java/com/cameleer/server/app/storage/PostgresClaimMappingRepository.java`
- [ ] **Step 1: Implement PostgresClaimMappingRepository**
```java
package com.cameleer.server.app.storage;
import com.cameleer.server.core.rbac.ClaimMappingRepository;
import com.cameleer.server.core.rbac.ClaimMappingRule;
import org.springframework.jdbc.core.JdbcTemplate;
import java.util.List;
import java.util.Optional;
import java.util.UUID;
public class PostgresClaimMappingRepository implements ClaimMappingRepository {
private final JdbcTemplate jdbc;
public PostgresClaimMappingRepository(JdbcTemplate jdbc) {
this.jdbc = jdbc;
}
@Override
public List<ClaimMappingRule> findAll() {
return jdbc.query("""
SELECT id, claim, match_type, match_value, action, target, priority, created_at
FROM claim_mapping_rules ORDER BY priority, created_at
""", (rs, i) -> new ClaimMappingRule(
rs.getObject("id", UUID.class),
rs.getString("claim"),
rs.getString("match_type"),
rs.getString("match_value"),
rs.getString("action"),
rs.getString("target"),
rs.getInt("priority"),
rs.getTimestamp("created_at").toInstant()
));
}
@Override
public Optional<ClaimMappingRule> findById(UUID id) {
var results = jdbc.query("""
SELECT id, claim, match_type, match_value, action, target, priority, created_at
FROM claim_mapping_rules WHERE id = ?
""", (rs, i) -> new ClaimMappingRule(
rs.getObject("id", UUID.class),
rs.getString("claim"),
rs.getString("match_type"),
rs.getString("match_value"),
rs.getString("action"),
rs.getString("target"),
rs.getInt("priority"),
rs.getTimestamp("created_at").toInstant()
), id);
return results.isEmpty() ? Optional.empty() : Optional.of(results.get(0));
}
@Override
public UUID create(String claim, String matchType, String matchValue, String action, String target, int priority) {
UUID id = UUID.randomUUID();
jdbc.update("""
INSERT INTO claim_mapping_rules (id, claim, match_type, match_value, action, target, priority)
VALUES (?, ?, ?, ?, ?, ?, ?)
""", id, claim, matchType, matchValue, action, target, priority);
return id;
}
@Override
public void update(UUID id, String claim, String matchType, String matchValue, String action, String target, int priority) {
jdbc.update("""
UPDATE claim_mapping_rules
SET claim = ?, match_type = ?, match_value = ?, action = ?, target = ?, priority = ?
WHERE id = ?
""", claim, matchType, matchValue, action, target, priority, id);
}
@Override
public void delete(UUID id) {
jdbc.update("DELETE FROM claim_mapping_rules WHERE id = ?", id);
}
}
```
- [ ] **Step 2: Wire the bean in AgentRegistryBeanConfig (or a new RbacBeanConfig)**
Add to `cameleer-server-app/src/main/java/com/cameleer/server/app/config/AgentRegistryBeanConfig.java` (or create a new `RbacBeanConfig.java`):
```java
@Bean
public ClaimMappingRepository claimMappingRepository(JdbcTemplate jdbcTemplate) {
return new PostgresClaimMappingRepository(jdbcTemplate);
}
@Bean
public ClaimMappingService claimMappingService() {
return new ClaimMappingService();
}
```
- [ ] **Step 3: Commit**
```bash
git add cameleer-server-app/src/main/java/com/cameleer/server/app/storage/PostgresClaimMappingRepository.java
git add cameleer-server-app/src/main/java/com/cameleer/server/app/config/AgentRegistryBeanConfig.java
git commit -m "feat: implement PostgresClaimMappingRepository and wire beans"
```
---
### Task 5: Modify RbacServiceImpl — Origin-Aware Assignments
**Files:**
- Modify: `cameleer-server-app/src/main/java/com/cameleer/server/app/rbac/RbacServiceImpl.java`
- [ ] **Step 1: Add managed assignment methods to RbacService interface**
In `cameleer-server-core/src/main/java/com/cameleer/server/core/rbac/RbacService.java`, add:
```java
void clearManagedAssignments(String userId);
void assignManagedRole(String userId, UUID roleId, UUID mappingId);
void addUserToManagedGroup(String userId, UUID groupId, UUID mappingId);
```
- [ ] **Step 2: Implement in RbacServiceImpl**
Add these methods to `RbacServiceImpl.java`:
```java
@Override
public void clearManagedAssignments(String userId) {
jdbc.update("DELETE FROM user_roles WHERE user_id = ? AND origin = 'managed'", userId);
jdbc.update("DELETE FROM user_groups WHERE user_id = ? AND origin = 'managed'", userId);
}
@Override
public void assignManagedRole(String userId, UUID roleId, UUID mappingId) {
jdbc.update("""
INSERT INTO user_roles (user_id, role_id, origin, mapping_id)
VALUES (?, ?, 'managed', ?)
ON CONFLICT (user_id, role_id, origin) DO UPDATE SET mapping_id = EXCLUDED.mapping_id
""", userId, roleId, mappingId);
}
@Override
public void addUserToManagedGroup(String userId, UUID groupId, UUID mappingId) {
jdbc.update("""
INSERT INTO user_groups (user_id, group_id, origin, mapping_id)
VALUES (?, ?, 'managed', ?)
ON CONFLICT (user_id, group_id, origin) DO UPDATE SET mapping_id = EXCLUDED.mapping_id
""", userId, groupId, mappingId);
}
```
- [ ] **Step 3: Update existing assignRoleToUser to specify origin='direct'**
Modify the existing `assignRoleToUser` and `addUserToGroup` methods to explicitly set `origin = 'direct'`:
```java
@Override
public void assignRoleToUser(String userId, UUID roleId) {
jdbc.update("""
INSERT INTO user_roles (user_id, role_id, origin)
VALUES (?, ?, 'direct')
ON CONFLICT (user_id, role_id, origin) DO NOTHING
""", userId, roleId);
}
@Override
public void addUserToGroup(String userId, UUID groupId) {
jdbc.update("""
INSERT INTO user_groups (user_id, group_id, origin)
VALUES (?, ?, 'direct')
ON CONFLICT (user_id, group_id, origin) DO NOTHING
""", userId, groupId);
}
```
- [ ] **Step 4: Update getDirectRolesForUser to filter by origin='direct'**
```java
@Override
public List<RoleSummary> getDirectRolesForUser(String userId) {
return jdbc.query("""
SELECT r.id, r.name, r.system FROM user_roles ur
JOIN roles r ON r.id = ur.role_id
WHERE ur.user_id = ? AND ur.origin = 'direct'
""", (rs, i) -> new RoleSummary(
rs.getObject("id", UUID.class),
rs.getString("name"),
rs.getBoolean("system"),
"direct"
), userId);
}
```
- [ ] **Step 5: Run existing tests**
Run: `cd /c/Users/Hendrik/Documents/projects/cameleer-server && mvn test -pl cameleer-server-app`
Expected: All existing tests still pass (migration adds columns with defaults).
- [ ] **Step 6: Commit**
```bash
git add cameleer-server-core/src/main/java/com/cameleer/server/core/rbac/RbacService.java
git add cameleer-server-app/src/main/java/com/cameleer/server/app/rbac/RbacServiceImpl.java
git commit -m "feat: add origin-aware managed/direct assignment methods to RbacService"
```
---
### Task 6: Modify OidcAuthController — Replace syncOidcRoles with Claim Mapping
**Files:**
- Modify: `cameleer-server-app/src/main/java/com/cameleer/server/app/security/OidcAuthController.java`
- [ ] **Step 1: Inject ClaimMappingService and ClaimMappingRepository**
Add to constructor:
```java
private final ClaimMappingService claimMappingService;
private final ClaimMappingRepository claimMappingRepository;
```
- [ ] **Step 2: Replace syncOidcRoles with applyClaimMappings**
Replace the `syncOidcRoles` method (lines 176-208) with:
```java
private void applyClaimMappings(String userId, Map<String, Object> claims) {
List<ClaimMappingRule> rules = claimMappingRepository.findAll();
if (rules.isEmpty()) {
log.debug("No claim mapping rules configured, skipping for user {}", userId);
return;
}
rbacService.clearManagedAssignments(userId);
List<ClaimMappingService.MappingResult> results = claimMappingService.evaluate(rules, claims);
for (var result : results) {
ClaimMappingRule rule = result.rule();
switch (rule.action()) {
case "assignRole" -> {
UUID roleId = SystemRole.BY_NAME.get(SystemRole.normalizeScope(rule.target()));
if (roleId == null) {
log.warn("Claim mapping target role '{}' not found, skipping", rule.target());
continue;
}
rbacService.assignManagedRole(userId, roleId, rule.id());
log.debug("Managed role {} assigned to {} via mapping {}", rule.target(), userId, rule.id());
}
case "addToGroup" -> {
// Look up group by name
var groups = groupRepository.findAll();
var group = groups.stream().filter(g -> g.name().equalsIgnoreCase(rule.target())).findFirst();
if (group.isEmpty()) {
log.warn("Claim mapping target group '{}' not found, skipping", rule.target());
continue;
}
rbacService.addUserToManagedGroup(userId, group.get().id(), rule.id());
log.debug("Managed group {} assigned to {} via mapping {}", rule.target(), userId, rule.id());
}
}
}
}
```
- [ ] **Step 3: Update callback() to call applyClaimMappings**
In the `callback()` method, replace the `syncOidcRoles(userId, oidcRoles, config)` call with:
```java
// Extract all claims from the access token for claim mapping
Map<String, Object> claims = tokenExchanger.extractAllClaims(oidcUser);
applyClaimMappings(userId, claims);
```
Note: `extractAllClaims` needs to be added to `OidcTokenExchanger` — it returns the raw JWT claims map from the access token.
- [ ] **Step 4: Run existing tests**
Run: `cd /c/Users/Hendrik/Documents/projects/cameleer-server && mvn test -pl cameleer-server-app`
Expected: PASS (OIDC tests may need adjustment if they test syncOidcRoles directly).
- [ ] **Step 5: Commit**
```bash
git add cameleer-server-app/src/main/java/com/cameleer/server/app/security/OidcAuthController.java
git commit -m "feat: replace syncOidcRoles with claim mapping evaluation on OIDC login"
```
---
### Task 7: OIDC-Only Mode — Disable Local Auth When OIDC Configured
**Files:**
- Modify: `cameleer-server-app/src/main/java/com/cameleer/server/app/security/SecurityConfig.java`
- Modify: `cameleer-server-app/src/main/java/com/cameleer/server/app/security/JwtAuthenticationFilter.java`
- [ ] **Step 1: Add isOidcEnabled() helper to SecurityConfig**
```java
private boolean isOidcEnabled() {
return oidcIssuerUri != null && !oidcIssuerUri.isBlank();
}
```
- [ ] **Step 2: Conditionally disable local login endpoints**
In `SecurityConfig.filterChain()`, when OIDC is enabled, remove `/api/v1/auth/login` and `/api/v1/auth/refresh` from public endpoints (or let them return 404). The simplest approach: add a condition in `UiAuthController`:
```java
// In UiAuthController
@PostMapping("/login")
public ResponseEntity<?> login(@RequestBody LoginRequest request) {
if (oidcEnabled) {
return ResponseEntity.status(404).body(Map.of("error", "Local login disabled when OIDC is configured"));
}
// ... existing logic
}
```
- [ ] **Step 3: Modify JwtAuthenticationFilter to skip internal token path for user tokens in OIDC mode**
In `JwtAuthenticationFilter`, when OIDC is enabled, only accept internal (HMAC) tokens for agent subjects (starting with no `user:` prefix or explicitly agent subjects). User-facing tokens must come from the OIDC decoder:
```java
private void tryInternalToken(String token, HttpServletRequest request) {
try {
JwtService.JwtValidationResult result = jwtService.validateAccessToken(token);
// In OIDC mode, only accept agent tokens via internal validation
if (oidcDecoder != null && result.subject() != null && result.subject().startsWith("user:")) {
return; // User tokens must go through OIDC path
}
setAuthentication(result, request);
} catch (Exception e) {
// Not a valid internal token, will try OIDC next
}
}
```
- [ ] **Step 4: Disable user admin endpoints in OIDC mode**
In `UserAdminController`, add a guard for user creation and password reset:
```java
@PostMapping
public ResponseEntity<?> createUser(@RequestBody CreateUserRequest request) {
if (oidcEnabled) {
return ResponseEntity.status(400).body(Map.of("error", "User creation disabled when OIDC is configured. Users are auto-provisioned on OIDC login."));
}
// ... existing logic
}
@PostMapping("/{userId}/password")
public ResponseEntity<?> resetPassword(@PathVariable String userId, @RequestBody PasswordRequest request) {
if (oidcEnabled) {
return ResponseEntity.status(400).body(Map.of("error", "Password management disabled when OIDC is configured"));
}
// ... existing logic
}
```
- [ ] **Step 5: Run full test suite**
Run: `cd /c/Users/Hendrik/Documents/projects/cameleer-server && mvn test -pl cameleer-server-app`
Expected: PASS.
- [ ] **Step 6: Commit**
```bash
git add cameleer-server-app/src/main/java/com/cameleer/server/app/security/SecurityConfig.java
git add cameleer-server-app/src/main/java/com/cameleer/server/app/security/JwtAuthenticationFilter.java
git add cameleer-server-app/src/main/java/com/cameleer/server/app/controller/UserAdminController.java
git commit -m "feat: disable local auth when OIDC is configured (resource server mode)"
```
---
### Task 8: Claim Mapping Admin Controller
**Files:**
- Create: `cameleer-server-app/src/main/java/com/cameleer/server/app/controller/ClaimMappingAdminController.java`
- [ ] **Step 1: Implement the controller**
```java
package com.cameleer.server.app.controller;
import com.cameleer.server.core.rbac.ClaimMappingRepository;
import com.cameleer.server.core.rbac.ClaimMappingRule;
import io.swagger.v3.oas.annotations.Operation;
import io.swagger.v3.oas.annotations.tags.Tag;
import org.springframework.http.ResponseEntity;
import org.springframework.security.access.prepost.PreAuthorize;
import org.springframework.web.bind.annotation.*;
import java.net.URI;
import java.util.List;
import java.util.UUID;
@RestController
@RequestMapping("/api/v1/admin/claim-mappings")
@PreAuthorize("hasRole('ADMIN')")
@Tag(name = "Claim Mapping Admin", description = "Manage OIDC claim-to-role/group mapping rules")
public class ClaimMappingAdminController {
private final ClaimMappingRepository repository;
public ClaimMappingAdminController(ClaimMappingRepository repository) {
this.repository = repository;
}
@GetMapping
@Operation(summary = "List all claim mapping rules")
public List<ClaimMappingRule> list() {
return repository.findAll();
}
@GetMapping("/{id}")
@Operation(summary = "Get a claim mapping rule by ID")
public ResponseEntity<ClaimMappingRule> get(@PathVariable UUID id) {
return repository.findById(id)
.map(ResponseEntity::ok)
.orElse(ResponseEntity.notFound().build());
}
record CreateRuleRequest(String claim, String matchType, String matchValue,
String action, String target, int priority) {}
@PostMapping
@Operation(summary = "Create a claim mapping rule")
public ResponseEntity<ClaimMappingRule> create(@RequestBody CreateRuleRequest request) {
UUID id = repository.create(
request.claim(), request.matchType(), request.matchValue(),
request.action(), request.target(), request.priority());
return repository.findById(id)
.map(rule -> ResponseEntity.created(URI.create("/api/v1/admin/claim-mappings/" + id)).body(rule))
.orElse(ResponseEntity.internalServerError().build());
}
@PutMapping("/{id}")
@Operation(summary = "Update a claim mapping rule")
public ResponseEntity<ClaimMappingRule> update(@PathVariable UUID id, @RequestBody CreateRuleRequest request) {
if (repository.findById(id).isEmpty()) {
return ResponseEntity.notFound().build();
}
repository.update(id, request.claim(), request.matchType(), request.matchValue(),
request.action(), request.target(), request.priority());
return repository.findById(id)
.map(ResponseEntity::ok)
.orElse(ResponseEntity.internalServerError().build());
}
@DeleteMapping("/{id}")
@Operation(summary = "Delete a claim mapping rule")
public ResponseEntity<Void> delete(@PathVariable UUID id) {
if (repository.findById(id).isEmpty()) {
return ResponseEntity.notFound().build();
}
repository.delete(id);
return ResponseEntity.noContent().build();
}
}
```
- [ ] **Step 2: Add endpoint to SecurityConfig**
In `SecurityConfig.filterChain()`, the `/api/v1/admin/**` path already requires ADMIN role. No changes needed.
- [ ] **Step 3: Run full test suite**
Run: `cd /c/Users/Hendrik/Documents/projects/cameleer-server && mvn test -pl cameleer-server-app`
Expected: PASS.
- [ ] **Step 4: Commit**
```bash
git add cameleer-server-app/src/main/java/com/cameleer/server/app/controller/ClaimMappingAdminController.java
git commit -m "feat: add ClaimMappingAdminController for CRUD on mapping rules"
```
---
### Task 9: Integration Test — Claim Mapping End-to-End
**Files:**
- Create: `cameleer-server-app/src/test/java/com/cameleer/server/app/controller/ClaimMappingAdminControllerIT.java`
- [ ] **Step 1: Write integration test**
```java
package com.cameleer.server.app.controller;
import com.cameleer.server.app.AbstractPostgresIT;
import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;
import org.junit.jupiter.api.BeforeEach;
import org.junit.jupiter.api.Test;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.test.web.client.TestRestTemplate;
import org.springframework.http.*;
import static org.assertj.core.api.Assertions.assertThat;
class ClaimMappingAdminControllerIT extends AbstractPostgresIT {
@Autowired private TestRestTemplate restTemplate;
@Autowired private ObjectMapper objectMapper;
@Autowired private TestSecurityHelper securityHelper;
private HttpHeaders adminHeaders;
@BeforeEach
void setUp() {
adminHeaders = securityHelper.adminHeaders();
}
@Test
void createAndListRules() throws Exception {
String body = """
{"claim":"groups","matchType":"contains","matchValue":"admins","action":"assignRole","target":"ADMIN","priority":0}
""";
var createResponse = restTemplate.exchange("/api/v1/admin/claim-mappings",
HttpMethod.POST, new HttpEntity<>(body, adminHeaders), String.class);
assertThat(createResponse.getStatusCode()).isEqualTo(HttpStatus.CREATED);
var listResponse = restTemplate.exchange("/api/v1/admin/claim-mappings",
HttpMethod.GET, new HttpEntity<>(adminHeaders), String.class);
assertThat(listResponse.getStatusCode()).isEqualTo(HttpStatus.OK);
JsonNode rules = objectMapper.readTree(listResponse.getBody());
assertThat(rules.isArray()).isTrue();
assertThat(rules.size()).isGreaterThanOrEqualTo(1);
}
@Test
void deleteRule() throws Exception {
String body = """
{"claim":"dept","matchType":"equals","matchValue":"eng","action":"assignRole","target":"VIEWER","priority":0}
""";
var createResponse = restTemplate.exchange("/api/v1/admin/claim-mappings",
HttpMethod.POST, new HttpEntity<>(body, adminHeaders), String.class);
JsonNode created = objectMapper.readTree(createResponse.getBody());
String id = created.get("id").asText();
var deleteResponse = restTemplate.exchange("/api/v1/admin/claim-mappings/" + id,
HttpMethod.DELETE, new HttpEntity<>(adminHeaders), Void.class);
assertThat(deleteResponse.getStatusCode()).isEqualTo(HttpStatus.NO_CONTENT);
var getResponse = restTemplate.exchange("/api/v1/admin/claim-mappings/" + id,
HttpMethod.GET, new HttpEntity<>(adminHeaders), String.class);
assertThat(getResponse.getStatusCode()).isEqualTo(HttpStatus.NOT_FOUND);
}
}
```
- [ ] **Step 2: Run integration tests**
Run: `cd /c/Users/Hendrik/Documents/projects/cameleer-server && mvn test -pl cameleer-server-app -Dtest=ClaimMappingAdminControllerIT`
Expected: PASS.
- [ ] **Step 3: Commit**
```bash
git add cameleer-server-app/src/test/java/com/cameleer/server/app/controller/ClaimMappingAdminControllerIT.java
git commit -m "test: add integration tests for claim mapping admin API"
```
---
### Task 10: Run Full Test Suite and Final Verification
- [ ] **Step 1: Run all tests**
Run: `cd /c/Users/Hendrik/Documents/projects/cameleer-server && mvn clean verify`
Expected: All tests PASS. Build succeeds.
- [ ] **Step 2: Verify migration applies cleanly on fresh database**
Run: `cd /c/Users/Hendrik/Documents/projects/cameleer-server && mvn test -pl cameleer-server-app -Dtest=AbstractPostgresIT`
Expected: Testcontainers starts fresh PostgreSQL, Flyway applies V1 + V2, context loads.
- [ ] **Step 3: Commit any remaining fixes**
```bash
git add -A
git commit -m "chore: finalize auth & RBAC overhaul — all tests passing"
```

View File

@@ -0,0 +1,615 @@
# Plan 2: Server-Side License Validation
> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
**Goal:** Add Ed25519-signed license JWT validation to the server, enabling feature gating for MOAT features (debugger, lineage, correlation) by tier.
**Architecture:** The SaaS generates Ed25519-signed license JWTs containing tier, features, limits, and expiry. The server validates the license on startup (from env var or file) or at runtime (via admin API). A `LicenseGate` service checks whether a feature is enabled before serving gated endpoints. The server's existing Ed25519 infrastructure (JDK 17 `java.security`) is reused for verification. In standalone mode without a license, all features are available (open/dev mode).
**Tech Stack:** Java 17, Spring Boot 3.4.3, Ed25519 (JDK built-in), Nimbus JOSE JWT, JUnit 5, AssertJ
**Repo:** `C:\Users\Hendrik\Documents\projects\cameleer-server`
---
## File Map
### New Files
- `cameleer-server-core/src/main/java/com/cameleer/server/core/license/LicenseInfo.java`
- `cameleer-server-core/src/main/java/com/cameleer/server/core/license/LicenseValidator.java`
- `cameleer-server-core/src/main/java/com/cameleer/server/core/license/LicenseGate.java`
- `cameleer-server-core/src/main/java/com/cameleer/server/core/license/Feature.java`
- `cameleer-server-app/src/main/java/com/cameleer/server/app/config/LicenseBeanConfig.java`
- `cameleer-server-app/src/main/java/com/cameleer/server/app/controller/LicenseAdminController.java`
- `cameleer-server-app/src/test/java/com/cameleer/server/core/license/LicenseValidatorTest.java`
- `cameleer-server-app/src/test/java/com/cameleer/server/core/license/LicenseGateTest.java`
### Modified Files
- `cameleer-server-app/src/main/resources/application.yml` — add license config properties
---
### Task 1: Core Domain — LicenseInfo, Feature Enum
**Files:**
- Create: `cameleer-server-core/src/main/java/com/cameleer/server/core/license/Feature.java`
- Create: `cameleer-server-core/src/main/java/com/cameleer/server/core/license/LicenseInfo.java`
- [ ] **Step 1: Create Feature enum**
```java
package com.cameleer.server.core.license;
public enum Feature {
topology,
lineage,
correlation,
debugger,
replay
}
```
- [ ] **Step 2: Create LicenseInfo record**
```java
package com.cameleer.server.core.license;
import java.time.Instant;
import java.util.Map;
import java.util.Set;
public record LicenseInfo(
String tier,
Set<Feature> features,
Map<String, Integer> limits,
Instant issuedAt,
Instant expiresAt
) {
public boolean isExpired() {
return expiresAt != null && Instant.now().isAfter(expiresAt);
}
public boolean hasFeature(Feature feature) {
return features.contains(feature);
}
public int getLimit(String key, int defaultValue) {
return limits.getOrDefault(key, defaultValue);
}
/** Open license — all features enabled, no limits. Used when no license is configured. */
public static LicenseInfo open() {
return new LicenseInfo("open", Set.of(Feature.values()), Map.of(), Instant.now(), null);
}
}
```
- [ ] **Step 3: Commit**
```bash
git add cameleer-server-core/src/main/java/com/cameleer/server/core/license/Feature.java
git add cameleer-server-core/src/main/java/com/cameleer/server/core/license/LicenseInfo.java
git commit -m "feat: add LicenseInfo and Feature domain model"
```
---
### Task 2: LicenseValidator — Ed25519 JWT Verification
**Files:**
- Create: `cameleer-server-core/src/main/java/com/cameleer/server/core/license/LicenseValidator.java`
- Create: `cameleer-server-app/src/test/java/com/cameleer/server/core/license/LicenseValidatorTest.java`
- [ ] **Step 1: Write tests**
```java
package com.cameleer.server.core.license;
import org.junit.jupiter.api.Test;
import java.security.*;
import java.security.spec.NamedParameterSpec;
import java.time.Instant;
import java.time.temporal.ChronoUnit;
import java.util.Base64;
import static org.assertj.core.api.Assertions.assertThat;
import static org.assertj.core.api.Assertions.assertThatThrownBy;
class LicenseValidatorTest {
private KeyPair generateKeyPair() throws Exception {
KeyPairGenerator kpg = KeyPairGenerator.getInstance("Ed25519");
return kpg.generateKeyPair();
}
private String sign(PrivateKey key, String payload) throws Exception {
Signature signer = Signature.getInstance("Ed25519");
signer.initSign(key);
signer.update(payload.getBytes());
return Base64.getEncoder().encodeToString(signer.sign());
}
@Test
void validate_validLicense_returnsLicenseInfo() throws Exception {
KeyPair kp = generateKeyPair();
String publicKeyBase64 = Base64.getEncoder().encodeToString(kp.getPublic().getEncoded());
LicenseValidator validator = new LicenseValidator(publicKeyBase64);
Instant expires = Instant.now().plus(365, ChronoUnit.DAYS);
String payload = """
{"tier":"HIGH","features":["topology","lineage","debugger"],"limits":{"max_agents":50,"retention_days":90},"iat":%d,"exp":%d}
""".formatted(Instant.now().getEpochSecond(), expires.getEpochSecond()).trim();
String signature = sign(kp.getPrivate(), payload);
String token = Base64.getEncoder().encodeToString(payload.getBytes()) + "." + signature;
LicenseInfo info = validator.validate(token);
assertThat(info.tier()).isEqualTo("HIGH");
assertThat(info.hasFeature(Feature.debugger)).isTrue();
assertThat(info.hasFeature(Feature.replay)).isFalse();
assertThat(info.getLimit("max_agents", 0)).isEqualTo(50);
assertThat(info.isExpired()).isFalse();
}
@Test
void validate_expiredLicense_throwsException() throws Exception {
KeyPair kp = generateKeyPair();
String publicKeyBase64 = Base64.getEncoder().encodeToString(kp.getPublic().getEncoded());
LicenseValidator validator = new LicenseValidator(publicKeyBase64);
Instant past = Instant.now().minus(1, ChronoUnit.DAYS);
String payload = """
{"tier":"LOW","features":["topology"],"limits":{},"iat":%d,"exp":%d}
""".formatted(past.minus(30, ChronoUnit.DAYS).getEpochSecond(), past.getEpochSecond()).trim();
String signature = sign(kp.getPrivate(), payload);
String token = Base64.getEncoder().encodeToString(payload.getBytes()) + "." + signature;
assertThatThrownBy(() -> validator.validate(token))
.isInstanceOf(IllegalArgumentException.class)
.hasMessageContaining("expired");
}
@Test
void validate_tamperedPayload_throwsException() throws Exception {
KeyPair kp = generateKeyPair();
String publicKeyBase64 = Base64.getEncoder().encodeToString(kp.getPublic().getEncoded());
LicenseValidator validator = new LicenseValidator(publicKeyBase64);
String payload = """
{"tier":"LOW","features":["topology"],"limits":{},"iat":0,"exp":9999999999}
""".trim();
String signature = sign(kp.getPrivate(), payload);
// Tamper with payload
String tampered = payload.replace("LOW", "BUSINESS");
String token = Base64.getEncoder().encodeToString(tampered.getBytes()) + "." + signature;
assertThatThrownBy(() -> validator.validate(token))
.isInstanceOf(SecurityException.class)
.hasMessageContaining("signature");
}
}
```
- [ ] **Step 2: Run tests to verify they fail**
Run: `cd /c/Users/Hendrik/Documents/projects/cameleer-server && mvn test -pl cameleer-server-app -Dtest=LicenseValidatorTest -Dsurefire.failIfNoSpecifiedTests=false`
Expected: Compilation error — LicenseValidator does not exist.
- [ ] **Step 3: Implement LicenseValidator**
```java
package com.cameleer.server.core.license;
import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import java.security.*;
import java.security.spec.X509EncodedKeySpec;
import java.time.Instant;
import java.util.*;
import java.util.stream.Collectors;
import java.util.stream.StreamSupport;
public class LicenseValidator {
private static final Logger log = LoggerFactory.getLogger(LicenseValidator.class);
private static final ObjectMapper objectMapper = new ObjectMapper();
private final PublicKey publicKey;
public LicenseValidator(String publicKeyBase64) {
try {
byte[] keyBytes = Base64.getDecoder().decode(publicKeyBase64);
KeyFactory kf = KeyFactory.getInstance("Ed25519");
this.publicKey = kf.generatePublic(new X509EncodedKeySpec(keyBytes));
} catch (Exception e) {
throw new IllegalStateException("Failed to load license public key", e);
}
}
public LicenseInfo validate(String token) {
String[] parts = token.split("\\.", 2);
if (parts.length != 2) {
throw new IllegalArgumentException("Invalid license token format: expected payload.signature");
}
byte[] payloadBytes = Base64.getDecoder().decode(parts[0]);
byte[] signatureBytes = Base64.getDecoder().decode(parts[1]);
// Verify signature
try {
Signature verifier = Signature.getInstance("Ed25519");
verifier.initVerify(publicKey);
verifier.update(payloadBytes);
if (!verifier.verify(signatureBytes)) {
throw new SecurityException("License signature verification failed");
}
} catch (SecurityException e) {
throw e;
} catch (Exception e) {
throw new SecurityException("License signature verification failed", e);
}
// Parse payload
try {
JsonNode root = objectMapper.readTree(payloadBytes);
String tier = root.get("tier").asText();
Set<Feature> features = new HashSet<>();
if (root.has("features")) {
for (JsonNode f : root.get("features")) {
try {
features.add(Feature.valueOf(f.asText()));
} catch (IllegalArgumentException e) {
log.warn("Unknown feature in license: {}", f.asText());
}
}
}
Map<String, Integer> limits = new HashMap<>();
if (root.has("limits")) {
root.get("limits").fields().forEachRemaining(entry ->
limits.put(entry.getKey(), entry.getValue().asInt()));
}
Instant issuedAt = root.has("iat") ? Instant.ofEpochSecond(root.get("iat").asLong()) : Instant.now();
Instant expiresAt = root.has("exp") ? Instant.ofEpochSecond(root.get("exp").asLong()) : null;
LicenseInfo info = new LicenseInfo(tier, features, limits, issuedAt, expiresAt);
if (info.isExpired()) {
throw new IllegalArgumentException("License expired at " + expiresAt);
}
return info;
} catch (IllegalArgumentException e) {
throw e;
} catch (Exception e) {
throw new IllegalArgumentException("Failed to parse license payload", e);
}
}
}
```
- [ ] **Step 4: Run tests**
Run: `cd /c/Users/Hendrik/Documents/projects/cameleer-server && mvn test -pl cameleer-server-app -Dtest=LicenseValidatorTest`
Expected: All 3 tests PASS.
- [ ] **Step 5: Commit**
```bash
git add cameleer-server-core/src/main/java/com/cameleer/server/core/license/LicenseValidator.java
git add cameleer-server-app/src/test/java/com/cameleer/server/core/license/LicenseValidatorTest.java
git commit -m "feat: implement LicenseValidator with Ed25519 signature verification"
```
---
### Task 3: LicenseGate — Feature Check Service
**Files:**
- Create: `cameleer-server-core/src/main/java/com/cameleer/server/core/license/LicenseGate.java`
- Create: `cameleer-server-app/src/test/java/com/cameleer/server/core/license/LicenseGateTest.java`
- [ ] **Step 1: Write tests**
```java
package com.cameleer.server.core.license;
import org.junit.jupiter.api.Test;
import java.time.Instant;
import java.time.temporal.ChronoUnit;
import java.util.Map;
import java.util.Set;
import static org.assertj.core.api.Assertions.assertThat;
class LicenseGateTest {
@Test
void noLicense_allFeaturesEnabled() {
LicenseGate gate = new LicenseGate();
// No license loaded → open mode
assertThat(gate.isEnabled(Feature.debugger)).isTrue();
assertThat(gate.isEnabled(Feature.replay)).isTrue();
assertThat(gate.isEnabled(Feature.lineage)).isTrue();
assertThat(gate.getTier()).isEqualTo("open");
}
@Test
void withLicense_onlyLicensedFeaturesEnabled() {
LicenseGate gate = new LicenseGate();
LicenseInfo license = new LicenseInfo("MID",
Set.of(Feature.topology, Feature.lineage, Feature.correlation),
Map.of("max_agents", 10, "retention_days", 30),
Instant.now(), Instant.now().plus(365, ChronoUnit.DAYS));
gate.load(license);
assertThat(gate.isEnabled(Feature.topology)).isTrue();
assertThat(gate.isEnabled(Feature.lineage)).isTrue();
assertThat(gate.isEnabled(Feature.debugger)).isFalse();
assertThat(gate.isEnabled(Feature.replay)).isFalse();
assertThat(gate.getTier()).isEqualTo("MID");
assertThat(gate.getLimit("max_agents", 0)).isEqualTo(10);
}
}
```
- [ ] **Step 2: Implement LicenseGate**
```java
package com.cameleer.server.core.license;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import java.util.concurrent.atomic.AtomicReference;
public class LicenseGate {
private static final Logger log = LoggerFactory.getLogger(LicenseGate.class);
private final AtomicReference<LicenseInfo> current = new AtomicReference<>(LicenseInfo.open());
public void load(LicenseInfo license) {
current.set(license);
log.info("License loaded: tier={}, features={}, expires={}",
license.tier(), license.features(), license.expiresAt());
}
public boolean isEnabled(Feature feature) {
return current.get().hasFeature(feature);
}
public String getTier() {
return current.get().tier();
}
public int getLimit(String key, int defaultValue) {
return current.get().getLimit(key, defaultValue);
}
public LicenseInfo getCurrent() {
return current.get();
}
}
```
- [ ] **Step 3: Run tests**
Run: `cd /c/Users/Hendrik/Documents/projects/cameleer-server && mvn test -pl cameleer-server-app -Dtest=LicenseGateTest`
Expected: PASS.
- [ ] **Step 4: Commit**
```bash
git add cameleer-server-core/src/main/java/com/cameleer/server/core/license/LicenseGate.java
git add cameleer-server-app/src/test/java/com/cameleer/server/core/license/LicenseGateTest.java
git commit -m "feat: implement LicenseGate for feature checking"
```
---
### Task 4: License Loading — Bean Config and Startup
**Files:**
- Create: `cameleer-server-app/src/main/java/com/cameleer/server/app/config/LicenseBeanConfig.java`
- Modify: `cameleer-server-app/src/main/resources/application.yml`
- [ ] **Step 1: Add license config properties to application.yml**
```yaml
license:
token: ${CAMELEER_LICENSE_TOKEN:}
file: ${CAMELEER_LICENSE_FILE:}
public-key: ${CAMELEER_LICENSE_PUBLIC_KEY:}
```
- [ ] **Step 2: Implement LicenseBeanConfig**
```java
package com.cameleer.server.app.config;
import com.cameleer.server.core.license.LicenseGate;
import com.cameleer.server.core.license.LicenseInfo;
import com.cameleer.server.core.license.LicenseValidator;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import java.nio.file.Files;
import java.nio.file.Path;
@Configuration
public class LicenseBeanConfig {
private static final Logger log = LoggerFactory.getLogger(LicenseBeanConfig.class);
@Value("${license.token:}")
private String licenseToken;
@Value("${license.file:}")
private String licenseFile;
@Value("${license.public-key:}")
private String licensePublicKey;
@Bean
public LicenseGate licenseGate() {
LicenseGate gate = new LicenseGate();
String token = resolveLicenseToken();
if (token == null || token.isBlank()) {
log.info("No license configured — running in open mode (all features enabled)");
return gate;
}
if (licensePublicKey == null || licensePublicKey.isBlank()) {
log.warn("License token provided but no public key configured (CAMELEER_LICENSE_PUBLIC_KEY). Running in open mode.");
return gate;
}
try {
LicenseValidator validator = new LicenseValidator(licensePublicKey);
LicenseInfo info = validator.validate(token);
gate.load(info);
} catch (Exception e) {
log.error("Failed to validate license: {}. Running in open mode.", e.getMessage());
}
return gate;
}
private String resolveLicenseToken() {
if (licenseToken != null && !licenseToken.isBlank()) {
return licenseToken;
}
if (licenseFile != null && !licenseFile.isBlank()) {
try {
return Files.readString(Path.of(licenseFile)).trim();
} catch (Exception e) {
log.warn("Failed to read license file {}: {}", licenseFile, e.getMessage());
}
}
return null;
}
}
```
- [ ] **Step 3: Commit**
```bash
git add cameleer-server-app/src/main/java/com/cameleer/server/app/config/LicenseBeanConfig.java
git add cameleer-server-app/src/main/resources/application.yml
git commit -m "feat: add license loading at startup from env var or file"
```
---
### Task 5: License Admin API — Runtime License Update
**Files:**
- Create: `cameleer-server-app/src/main/java/com/cameleer/server/app/controller/LicenseAdminController.java`
- [ ] **Step 1: Implement controller**
```java
package com.cameleer.server.app.controller;
import com.cameleer.server.core.license.LicenseGate;
import com.cameleer.server.core.license.LicenseInfo;
import com.cameleer.server.core.license.LicenseValidator;
import io.swagger.v3.oas.annotations.Operation;
import io.swagger.v3.oas.annotations.tags.Tag;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.http.ResponseEntity;
import org.springframework.security.access.prepost.PreAuthorize;
import org.springframework.web.bind.annotation.*;
import java.util.Map;
@RestController
@RequestMapping("/api/v1/admin/license")
@PreAuthorize("hasRole('ADMIN')")
@Tag(name = "License Admin", description = "License management")
public class LicenseAdminController {
private final LicenseGate licenseGate;
private final String licensePublicKey;
public LicenseAdminController(LicenseGate licenseGate,
@Value("${license.public-key:}") String licensePublicKey) {
this.licenseGate = licenseGate;
this.licensePublicKey = licensePublicKey;
}
@GetMapping
@Operation(summary = "Get current license info")
public ResponseEntity<LicenseInfo> getCurrent() {
return ResponseEntity.ok(licenseGate.getCurrent());
}
record UpdateLicenseRequest(String token) {}
@PostMapping
@Operation(summary = "Update license token at runtime")
public ResponseEntity<?> update(@RequestBody UpdateLicenseRequest request) {
if (licensePublicKey == null || licensePublicKey.isBlank()) {
return ResponseEntity.badRequest().body(Map.of("error", "No license public key configured"));
}
try {
LicenseValidator validator = new LicenseValidator(licensePublicKey);
LicenseInfo info = validator.validate(request.token());
licenseGate.load(info);
return ResponseEntity.ok(info);
} catch (Exception e) {
return ResponseEntity.badRequest().body(Map.of("error", e.getMessage()));
}
}
}
```
- [ ] **Step 2: Run full test suite**
Run: `cd /c/Users/Hendrik/Documents/projects/cameleer-server && mvn clean verify`
Expected: PASS.
- [ ] **Step 3: Commit**
```bash
git add cameleer-server-app/src/main/java/com/cameleer/server/app/controller/LicenseAdminController.java
git commit -m "feat: add license admin API for runtime license updates"
```
---
### Task 6: Feature Gating — Wire LicenseGate Into Endpoints
This task is a placeholder — MOAT feature endpoints don't exist yet. When they're added (debugger, lineage, correlation), they should inject `LicenseGate` and check `isEnabled(Feature.xxx)` before serving:
```java
@GetMapping("/api/v1/debug/sessions")
public ResponseEntity<?> listDebugSessions() {
if (!licenseGate.isEnabled(Feature.debugger)) {
return ResponseEntity.status(403).body(Map.of("error", "Feature 'debugger' requires a HIGH or BUSINESS tier license"));
}
// ... serve debug sessions
}
```
- [ ] **Step 1: No code changes needed now — document the pattern for MOAT feature implementation**
- [ ] **Step 2: Final verification**
Run: `cd /c/Users/Hendrik/Documents/projects/cameleer-server && mvn clean verify`
Expected: All tests PASS.

View File

@@ -0,0 +1,993 @@
# Plan 3: Runtime Management in the Server
> **Status: COMPLETED** — Verified 2026-04-09. All runtime management fully ported to cameleer-server with enhancements beyond the original plan.
> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [x]`) syntax for tracking.
**Goal:** Move environment management, app lifecycle, JAR upload, and Docker container orchestration from the SaaS layer into the server, so the server is a self-sufficient product that can deploy and manage Camel applications.
**Architecture:** The server gains Environment/App/AppVersion/Deployment entities stored in its PostgreSQL. A `RuntimeOrchestrator` interface abstracts Docker/K8s/disabled modes, auto-detected at startup. The Docker implementation uses a shared base image + volume-mounted JARs (no per-deployment image builds). Apps are promoted between environments by creating new Deployments pointing to the same AppVersion. Routing supports both path-based and subdomain-based modes via Traefik labels.
**Tech Stack:** Java 17, Spring Boot 3.4.3, docker-java (zerodep transport), PostgreSQL 16, Flyway, JUnit 5, Testcontainers
**Repo:** `C:\Users\Hendrik\Documents\projects\cameleer-server`
**Source reference:** Code ported from `C:\Users\Hendrik\Documents\projects\cameleer-saas` (environment, app, deployment, runtime packages)
---
## File Map
### New Files — Core Module (`cameleer-server-core`)
```
src/main/java/com/cameleer/server/core/runtime/
├── Environment.java Record: id, slug, displayName, status, createdAt
├── EnvironmentStatus.java Enum: ACTIVE, SUSPENDED
├── EnvironmentRepository.java Interface: CRUD + findBySlug
├── EnvironmentService.java Business logic: create, list, delete, enforce limits
├── App.java Record: id, environmentId, slug, displayName, createdAt
├── AppVersion.java Record: id, appId, version, jarPath, sha256, uploadedAt
├── AppRepository.java Interface: CRUD + findByEnvironmentId
├── AppVersionRepository.java Interface: CRUD + findByAppId
├── AppService.java Business logic: create, upload JAR, list, delete
├── Deployment.java Record: id, appId, appVersionId, environmentId, status, containerId
├── DeploymentStatus.java Enum: STARTING, RUNNING, FAILED, STOPPED
├── DeploymentRepository.java Interface: CRUD + findByAppId + findByEnvironmentId
├── DeploymentService.java Business logic: deploy, stop, restart, promote
├── RuntimeOrchestrator.java Interface: startContainer, stopContainer, getStatus, getLogs
├── RuntimeConfig.java Record: jarStoragePath, baseImage, dockerNetwork, routing, etc.
├── ContainerRequest.java Record: containerName, jarPath, envVars, memoryLimit, cpuShares
├── ContainerStatus.java Record: state, running, exitCode, error
└── RoutingMode.java Enum: path, subdomain
```
### New Files — App Module (`cameleer-server-app`)
```
src/main/java/com/cameleer/server/app/runtime/
├── DockerRuntimeOrchestrator.java Docker implementation using docker-java
├── DisabledRuntimeOrchestrator.java No-op implementation (observability-only mode)
├── RuntimeOrchestratorAutoConfig.java @Configuration: auto-detects Docker vs K8s vs disabled
├── DeploymentExecutor.java @Service: async deployment pipeline
├── JarStorageService.java File-system JAR storage with versioning
└── ContainerLogCollector.java Collects Docker container stdout/stderr
src/main/java/com/cameleer/server/app/storage/
├── PostgresEnvironmentRepository.java
├── PostgresAppRepository.java
├── PostgresAppVersionRepository.java
└── PostgresDeploymentRepository.java
src/main/java/com/cameleer/server/app/controller/
├── EnvironmentAdminController.java CRUD endpoints under /api/v1/admin/environments
├── AppController.java App + version CRUD + JAR upload
└── DeploymentController.java Deploy, stop, restart, promote, logs
src/main/resources/db/migration/
└── V3__runtime_management.sql Environments, apps, app_versions, deployments tables
```
### Modified Files
- `pom.xml` (parent) — add docker-java dependency
- `cameleer-server-app/pom.xml` — add docker-java dependency
- `application.yml` — add runtime config properties
---
### Task 1: Add docker-java Dependency
**Files:**
- Modify: `cameleer-server-app/pom.xml`
- [x] **Step 1: Add docker-java dependency**
```xml
<dependency>
<groupId>com.github.docker-java</groupId>
<artifactId>docker-java-core</artifactId>
<version>3.4.1</version>
</dependency>
<dependency>
<groupId>com.github.docker-java</groupId>
<artifactId>docker-java-transport-zerodep</artifactId>
<version>3.4.1</version>
</dependency>
```
- [x] **Step 2: Verify build**
Run: `cd /c/Users/Hendrik/Documents/projects/cameleer-server && mvn compile -pl cameleer-server-app`
Expected: BUILD SUCCESS.
- [x] **Step 3: Commit**
```bash
git add cameleer-server-app/pom.xml
git commit -m "chore: add docker-java dependency for runtime orchestration"
```
---
### Task 2: Database Migration — Runtime Management Tables
**Files:**
- Create: `cameleer-server-app/src/main/resources/db/migration/V3__runtime_management.sql`
- [x] **Step 1: Write migration**
```sql
-- V3__runtime_management.sql
-- Runtime management: environments, apps, app versions, deployments
CREATE TABLE environments (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
slug VARCHAR(100) NOT NULL UNIQUE,
display_name VARCHAR(255) NOT NULL,
status VARCHAR(20) NOT NULL DEFAULT 'ACTIVE',
created_at TIMESTAMPTZ NOT NULL DEFAULT now(),
updated_at TIMESTAMPTZ NOT NULL DEFAULT now()
);
CREATE TABLE apps (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
environment_id UUID NOT NULL REFERENCES environments(id) ON DELETE CASCADE,
slug VARCHAR(100) NOT NULL,
display_name VARCHAR(255) NOT NULL,
created_at TIMESTAMPTZ NOT NULL DEFAULT now(),
updated_at TIMESTAMPTZ NOT NULL DEFAULT now(),
UNIQUE(environment_id, slug)
);
CREATE INDEX idx_apps_environment_id ON apps(environment_id);
CREATE TABLE app_versions (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
app_id UUID NOT NULL REFERENCES apps(id) ON DELETE CASCADE,
version INTEGER NOT NULL,
jar_path VARCHAR(500) NOT NULL,
jar_checksum VARCHAR(64) NOT NULL,
jar_filename VARCHAR(255),
jar_size_bytes BIGINT,
uploaded_at TIMESTAMPTZ NOT NULL DEFAULT now(),
UNIQUE(app_id, version)
);
CREATE INDEX idx_app_versions_app_id ON app_versions(app_id);
CREATE TABLE deployments (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
app_id UUID NOT NULL REFERENCES apps(id) ON DELETE CASCADE,
app_version_id UUID NOT NULL REFERENCES app_versions(id),
environment_id UUID NOT NULL REFERENCES environments(id),
status VARCHAR(20) NOT NULL DEFAULT 'STARTING',
container_id VARCHAR(100),
container_name VARCHAR(255),
error_message TEXT,
deployed_at TIMESTAMPTZ,
stopped_at TIMESTAMPTZ,
created_at TIMESTAMPTZ NOT NULL DEFAULT now()
);
CREATE INDEX idx_deployments_app_id ON deployments(app_id);
CREATE INDEX idx_deployments_env_id ON deployments(environment_id);
-- Default environment (standalone mode always has at least one)
INSERT INTO environments (slug, display_name) VALUES ('default', 'Default');
```
- [x] **Step 2: Commit**
```bash
git add cameleer-server-app/src/main/resources/db/migration/V3__runtime_management.sql
git commit -m "feat: add runtime management database schema (environments, apps, versions, deployments)"
```
---
### Task 3: Core Domain — Environment, App, AppVersion, Deployment Records
**Files:**
- Create all records in `cameleer-server-core/src/main/java/com/cameleer/server/core/runtime/`
- [x] **Step 1: Create all domain records**
```java
// Environment.java
package com.cameleer.server.core.runtime;
import java.time.Instant;
import java.util.UUID;
public record Environment(UUID id, String slug, String displayName, EnvironmentStatus status, Instant createdAt) {}
// EnvironmentStatus.java
package com.cameleer.server.core.runtime;
public enum EnvironmentStatus { ACTIVE, SUSPENDED }
// App.java
package com.cameleer.server.core.runtime;
import java.time.Instant;
import java.util.UUID;
public record App(UUID id, UUID environmentId, String slug, String displayName, Instant createdAt) {}
// AppVersion.java
package com.cameleer.server.core.runtime;
import java.time.Instant;
import java.util.UUID;
public record AppVersion(UUID id, UUID appId, int version, String jarPath, String jarChecksum,
String jarFilename, Long jarSizeBytes, Instant uploadedAt) {}
// Deployment.java
package com.cameleer.server.core.runtime;
import java.time.Instant;
import java.util.UUID;
public record Deployment(UUID id, UUID appId, UUID appVersionId, UUID environmentId,
DeploymentStatus status, String containerId, String containerName,
String errorMessage, Instant deployedAt, Instant stoppedAt, Instant createdAt) {
public Deployment withStatus(DeploymentStatus newStatus) {
return new Deployment(id, appId, appVersionId, environmentId, newStatus,
containerId, containerName, errorMessage, deployedAt, stoppedAt, createdAt);
}
}
// DeploymentStatus.java
package com.cameleer.server.core.runtime;
public enum DeploymentStatus { STARTING, RUNNING, FAILED, STOPPED }
// RoutingMode.java
package com.cameleer.server.core.runtime;
public enum RoutingMode { path, subdomain }
```
- [x] **Step 2: Commit**
```bash
git add cameleer-server-core/src/main/java/com/cameleer/server/core/runtime/
git commit -m "feat: add runtime management domain records"
```
---
### Task 4: Core — Repository Interfaces and RuntimeOrchestrator
**Files:**
- Create repository interfaces and RuntimeOrchestrator in `core/runtime/`
- [x] **Step 1: Create repository interfaces**
```java
// EnvironmentRepository.java
package com.cameleer.server.core.runtime;
import java.util.*;
public interface EnvironmentRepository {
List<Environment> findAll();
Optional<Environment> findById(UUID id);
Optional<Environment> findBySlug(String slug);
UUID create(String slug, String displayName);
void updateDisplayName(UUID id, String displayName);
void updateStatus(UUID id, EnvironmentStatus status);
void delete(UUID id);
}
// AppRepository.java
package com.cameleer.server.core.runtime;
import java.util.*;
public interface AppRepository {
List<App> findByEnvironmentId(UUID environmentId);
Optional<App> findById(UUID id);
Optional<App> findByEnvironmentIdAndSlug(UUID environmentId, String slug);
UUID create(UUID environmentId, String slug, String displayName);
void delete(UUID id);
}
// AppVersionRepository.java
package com.cameleer.server.core.runtime;
import java.util.*;
public interface AppVersionRepository {
List<AppVersion> findByAppId(UUID appId);
Optional<AppVersion> findById(UUID id);
int findMaxVersion(UUID appId);
UUID create(UUID appId, int version, String jarPath, String jarChecksum, String jarFilename, Long jarSizeBytes);
}
// DeploymentRepository.java
package com.cameleer.server.core.runtime;
import java.util.*;
public interface DeploymentRepository {
List<Deployment> findByAppId(UUID appId);
List<Deployment> findByEnvironmentId(UUID environmentId);
Optional<Deployment> findById(UUID id);
Optional<Deployment> findActiveByAppIdAndEnvironmentId(UUID appId, UUID environmentId);
UUID create(UUID appId, UUID appVersionId, UUID environmentId, String containerName);
void updateStatus(UUID id, DeploymentStatus status, String containerId, String errorMessage);
void markDeployed(UUID id);
void markStopped(UUID id);
}
```
- [x] **Step 2: Create RuntimeOrchestrator interface**
```java
// RuntimeOrchestrator.java
package com.cameleer.server.core.runtime;
import java.util.stream.Stream;
public interface RuntimeOrchestrator {
boolean isEnabled();
String startContainer(ContainerRequest request);
void stopContainer(String containerId);
void removeContainer(String containerId);
ContainerStatus getContainerStatus(String containerId);
Stream<String> getLogs(String containerId, int tailLines);
}
// ContainerRequest.java
package com.cameleer.server.core.runtime;
import java.util.Map;
public record ContainerRequest(
String containerName,
String baseImage,
String jarPath,
String network,
Map<String, String> envVars,
Map<String, String> labels,
long memoryLimitBytes,
int cpuShares,
int healthCheckPort
) {}
// ContainerStatus.java
package com.cameleer.server.core.runtime;
public record ContainerStatus(String state, boolean running, int exitCode, String error) {
public static ContainerStatus notFound() {
return new ContainerStatus("not_found", false, -1, "Container not found");
}
}
```
- [x] **Step 3: Commit**
```bash
git add cameleer-server-core/src/main/java/com/cameleer/server/core/runtime/
git commit -m "feat: add runtime repository interfaces and RuntimeOrchestrator"
```
---
### Task 5: Core — EnvironmentService, AppService, DeploymentService
**Files:**
- Create service classes in `core/runtime/`
- [x] **Step 1: Create EnvironmentService**
```java
package com.cameleer.server.core.runtime;
import java.util.List;
import java.util.UUID;
public class EnvironmentService {
private final EnvironmentRepository repo;
public EnvironmentService(EnvironmentRepository repo) {
this.repo = repo;
}
public List<Environment> listAll() { return repo.findAll(); }
public Environment getById(UUID id) { return repo.findById(id).orElseThrow(() -> new IllegalArgumentException("Environment not found: " + id)); }
public Environment getBySlug(String slug) { return repo.findBySlug(slug).orElseThrow(() -> new IllegalArgumentException("Environment not found: " + slug)); }
public UUID create(String slug, String displayName) {
if (repo.findBySlug(slug).isPresent()) {
throw new IllegalArgumentException("Environment with slug '" + slug + "' already exists");
}
return repo.create(slug, displayName);
}
public void delete(UUID id) {
Environment env = getById(id);
if ("default".equals(env.slug())) {
throw new IllegalArgumentException("Cannot delete the default environment");
}
repo.delete(id);
}
}
```
- [x] **Step 2: Create AppService**
```java
package com.cameleer.server.core.runtime;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.*;
import java.security.MessageDigest;
import java.util.HexFormat;
import java.util.List;
import java.util.UUID;
public class AppService {
private static final Logger log = LoggerFactory.getLogger(AppService.class);
private final AppRepository appRepo;
private final AppVersionRepository versionRepo;
private final String jarStoragePath;
public AppService(AppRepository appRepo, AppVersionRepository versionRepo, String jarStoragePath) {
this.appRepo = appRepo;
this.versionRepo = versionRepo;
this.jarStoragePath = jarStoragePath;
}
public List<App> listByEnvironment(UUID environmentId) { return appRepo.findByEnvironmentId(environmentId); }
public App getById(UUID id) { return appRepo.findById(id).orElseThrow(() -> new IllegalArgumentException("App not found: " + id)); }
public List<AppVersion> listVersions(UUID appId) { return versionRepo.findByAppId(appId); }
public UUID createApp(UUID environmentId, String slug, String displayName) {
if (appRepo.findByEnvironmentIdAndSlug(environmentId, slug).isPresent()) {
throw new IllegalArgumentException("App with slug '" + slug + "' already exists in this environment");
}
return appRepo.create(environmentId, slug, displayName);
}
public AppVersion uploadJar(UUID appId, String filename, InputStream jarData, long size) throws IOException {
App app = getById(appId);
int nextVersion = versionRepo.findMaxVersion(appId) + 1;
// Store JAR: {jarStoragePath}/{appId}/v{version}/app.jar
Path versionDir = Path.of(jarStoragePath, appId.toString(), "v" + nextVersion);
Files.createDirectories(versionDir);
Path jarFile = versionDir.resolve("app.jar");
MessageDigest digest;
try { digest = MessageDigest.getInstance("SHA-256"); }
catch (Exception e) { throw new RuntimeException(e); }
try (InputStream in = jarData) {
byte[] buffer = new byte[8192];
int bytesRead;
try (var out = Files.newOutputStream(jarFile)) {
while ((bytesRead = in.read(buffer)) != -1) {
out.write(buffer, 0, bytesRead);
digest.update(buffer, 0, bytesRead);
}
}
}
String checksum = HexFormat.of().formatHex(digest.digest());
UUID versionId = versionRepo.create(appId, nextVersion, jarFile.toString(), checksum, filename, size);
log.info("Uploaded JAR for app {}: version={}, size={}, sha256={}", appId, nextVersion, size, checksum);
return versionRepo.findById(versionId).orElseThrow();
}
public String resolveJarPath(UUID appVersionId) {
AppVersion version = versionRepo.findById(appVersionId)
.orElseThrow(() -> new IllegalArgumentException("AppVersion not found: " + appVersionId));
return version.jarPath();
}
public void deleteApp(UUID id) {
appRepo.delete(id);
}
}
```
- [x] **Step 3: Create DeploymentService**
```java
package com.cameleer.server.core.runtime;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import java.util.List;
import java.util.UUID;
public class DeploymentService {
private static final Logger log = LoggerFactory.getLogger(DeploymentService.class);
private final DeploymentRepository deployRepo;
private final AppService appService;
private final EnvironmentService envService;
public DeploymentService(DeploymentRepository deployRepo, AppService appService, EnvironmentService envService) {
this.deployRepo = deployRepo;
this.appService = appService;
this.envService = envService;
}
public List<Deployment> listByApp(UUID appId) { return deployRepo.findByAppId(appId); }
public Deployment getById(UUID id) { return deployRepo.findById(id).orElseThrow(() -> new IllegalArgumentException("Deployment not found: " + id)); }
/** Create a deployment record. Actual container start is handled by DeploymentExecutor (async). */
public Deployment createDeployment(UUID appId, UUID appVersionId, UUID environmentId) {
App app = appService.getById(appId);
Environment env = envService.getById(environmentId);
String containerName = env.slug() + "-" + app.slug();
UUID deploymentId = deployRepo.create(appId, appVersionId, environmentId, containerName);
return deployRepo.findById(deploymentId).orElseThrow();
}
/** Promote: deploy the same app version to a different environment. */
public Deployment promote(UUID appId, UUID appVersionId, UUID targetEnvironmentId) {
return createDeployment(appId, appVersionId, targetEnvironmentId);
}
public void markRunning(UUID deploymentId, String containerId) {
deployRepo.updateStatus(deploymentId, DeploymentStatus.RUNNING, containerId, null);
deployRepo.markDeployed(deploymentId);
}
public void markFailed(UUID deploymentId, String errorMessage) {
deployRepo.updateStatus(deploymentId, DeploymentStatus.FAILED, null, errorMessage);
}
public void markStopped(UUID deploymentId) {
deployRepo.updateStatus(deploymentId, DeploymentStatus.STOPPED, null, null);
deployRepo.markStopped(deploymentId);
}
}
```
- [x] **Step 4: Commit**
```bash
git add cameleer-server-core/src/main/java/com/cameleer/server/core/runtime/
git commit -m "feat: add EnvironmentService, AppService, DeploymentService"
```
---
### Task 6: App Module — PostgreSQL Repositories
**Files:**
- Create all Postgres repositories in `app/storage/`
- [x] **Step 1: Implement all four repositories**
Follow the pattern from `PostgresUserRepository.java``JdbcTemplate` with row mappers. Each repository implements its core interface with standard SQL (INSERT, SELECT, UPDATE, DELETE).
Key patterns to follow:
- Constructor injection of `JdbcTemplate`
- RowMapper lambdas returning records
- `UUID.randomUUID()` for ID generation
- `Timestamp.from(Instant)` for timestamp parameters
- [x] **Step 2: Wire beans**
Create `RuntimeBeanConfig.java` in `app/config/`:
```java
@Configuration
public class RuntimeBeanConfig {
@Bean
public EnvironmentRepository environmentRepository(JdbcTemplate jdbc) {
return new PostgresEnvironmentRepository(jdbc);
}
@Bean
public AppRepository appRepository(JdbcTemplate jdbc) {
return new PostgresAppRepository(jdbc);
}
@Bean
public AppVersionRepository appVersionRepository(JdbcTemplate jdbc) {
return new PostgresAppVersionRepository(jdbc);
}
@Bean
public DeploymentRepository deploymentRepository(JdbcTemplate jdbc) {
return new PostgresDeploymentRepository(jdbc);
}
@Bean
public EnvironmentService environmentService(EnvironmentRepository repo) {
return new EnvironmentService(repo);
}
@Bean
public AppService appService(AppRepository appRepo, AppVersionRepository versionRepo,
@Value("${cameleer.runtime.jar-storage-path:/data/jars}") String jarStoragePath) {
return new AppService(appRepo, versionRepo, jarStoragePath);
}
@Bean
public DeploymentService deploymentService(DeploymentRepository deployRepo, AppService appService, EnvironmentService envService) {
return new DeploymentService(deployRepo, appService, envService);
}
}
```
- [x] **Step 3: Run tests**
Run: `cd /c/Users/Hendrik/Documents/projects/cameleer-server && mvn test -pl cameleer-server-app`
Expected: PASS (Flyway applies V3 migration, context loads).
- [x] **Step 4: Commit**
```bash
git add cameleer-server-app/src/main/java/com/cameleer/server/app/storage/Postgres*Repository.java
git add cameleer-server-app/src/main/java/com/cameleer/server/app/config/RuntimeBeanConfig.java
git commit -m "feat: implement PostgreSQL repositories for runtime management"
```
---
### Task 7: Docker Runtime Orchestrator
**Files:**
- Create: `cameleer-server-app/src/main/java/com/cameleer/server/app/runtime/DockerRuntimeOrchestrator.java`
- Create: `cameleer-server-app/src/main/java/com/cameleer/server/app/runtime/DisabledRuntimeOrchestrator.java`
- Create: `cameleer-server-app/src/main/java/com/cameleer/server/app/runtime/RuntimeOrchestratorAutoConfig.java`
- [x] **Step 1: Implement DisabledRuntimeOrchestrator**
```java
package com.cameleer.server.app.runtime;
import com.cameleer.server.core.runtime.*;
import java.util.stream.Stream;
public class DisabledRuntimeOrchestrator implements RuntimeOrchestrator {
@Override public boolean isEnabled() { return false; }
@Override public String startContainer(ContainerRequest r) { throw new UnsupportedOperationException("Runtime management disabled"); }
@Override public void stopContainer(String id) { throw new UnsupportedOperationException("Runtime management disabled"); }
@Override public void removeContainer(String id) { throw new UnsupportedOperationException("Runtime management disabled"); }
@Override public ContainerStatus getContainerStatus(String id) { return ContainerStatus.notFound(); }
@Override public Stream<String> getLogs(String id, int tail) { return Stream.empty(); }
}
```
- [x] **Step 2: Implement DockerRuntimeOrchestrator**
Port from SaaS `DockerRuntimeOrchestrator.java`, adapted:
- Uses docker-java `DockerClientImpl` with zerodep transport
- `startContainer()`: creates container from base image with volume mount for JAR (instead of image build), sets env vars, Traefik labels, health check, resource limits
- `stopContainer()`: stops with 30s timeout
- `removeContainer()`: force remove
- `getContainerStatus()`: inspect container state
- `getLogs()`: tail container logs
Key difference from SaaS version: **no image build**. The base image is pre-built. JAR is volume-mounted:
```java
@Override
public String startContainer(ContainerRequest request) {
List<String> envList = request.envVars().entrySet().stream()
.map(e -> e.getKey() + "=" + e.getValue()).toList();
// Volume bind: mount JAR into container
Bind jarBind = new Bind(request.jarPath(), new Volume("/app/app.jar"), AccessMode.ro);
HostConfig hostConfig = HostConfig.newHostConfig()
.withMemory(request.memoryLimitBytes())
.withMemorySwap(request.memoryLimitBytes())
.withCpuShares(request.cpuShares())
.withNetworkMode(request.network())
.withBinds(jarBind);
CreateContainerResponse container = dockerClient.createContainerCmd(request.baseImage())
.withName(request.containerName())
.withEnv(envList)
.withLabels(request.labels())
.withHostConfig(hostConfig)
.withHealthcheck(new HealthCheck()
.withTest(List.of("CMD-SHELL", "wget -qO- http://localhost:" + request.healthCheckPort() + "/cameleer/health || exit 1"))
.withInterval(10_000_000_000L)
.withTimeout(5_000_000_000L)
.withRetries(3)
.withStartPeriod(30_000_000_000L))
.exec();
dockerClient.startContainerCmd(container.getId()).exec();
return container.getId();
}
```
- [x] **Step 3: Implement RuntimeOrchestratorAutoConfig**
```java
package com.cameleer.server.app.runtime;
import com.cameleer.server.core.runtime.RuntimeOrchestrator;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import java.nio.file.Files;
import java.nio.file.Path;
@Configuration
public class RuntimeOrchestratorAutoConfig {
private static final Logger log = LoggerFactory.getLogger(RuntimeOrchestratorAutoConfig.class);
@Bean
public RuntimeOrchestrator runtimeOrchestrator() {
// Auto-detect: Docker socket available?
if (Files.exists(Path.of("/var/run/docker.sock"))) {
log.info("Docker socket detected — enabling Docker runtime orchestrator");
return new DockerRuntimeOrchestrator();
}
// TODO: K8s detection (check for service account token)
log.info("No Docker socket or K8s detected — runtime management disabled (observability-only mode)");
return new DisabledRuntimeOrchestrator();
}
}
```
- [x] **Step 4: Commit**
```bash
git add cameleer-server-app/src/main/java/com/cameleer/server/app/runtime/
git commit -m "feat: implement DockerRuntimeOrchestrator with volume-mount JAR deployment"
```
---
### Task 8: DeploymentExecutor — Async Deployment Pipeline
**Files:**
- Create: `cameleer-server-app/src/main/java/com/cameleer/server/app/runtime/DeploymentExecutor.java`
- [x] **Step 1: Implement async deployment pipeline**
```java
package com.cameleer.server.app.runtime;
import com.cameleer.server.core.runtime.*;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.scheduling.annotation.Async;
import org.springframework.stereotype.Service;
import java.util.HashMap;
import java.util.Map;
@Service
public class DeploymentExecutor {
private static final Logger log = LoggerFactory.getLogger(DeploymentExecutor.class);
private final RuntimeOrchestrator orchestrator;
private final DeploymentService deploymentService;
private final AppService appService;
private final EnvironmentService envService;
// Inject runtime config values
public DeploymentExecutor(RuntimeOrchestrator orchestrator, DeploymentService deploymentService,
AppService appService, EnvironmentService envService) {
this.orchestrator = orchestrator;
this.deploymentService = deploymentService;
this.appService = appService;
this.envService = envService;
}
@Async("deploymentExecutor")
public void executeAsync(Deployment deployment) {
try {
// Stop existing deployment in same environment for same app
// ... (find active deployment, stop container)
String jarPath = appService.resolveJarPath(deployment.appVersionId());
App app = appService.getById(deployment.appId());
Environment env = envService.getById(deployment.environmentId());
Map<String, String> envVars = new HashMap<>();
envVars.put("CAMELEER_EXPORT_TYPE", "HTTP");
envVars.put("CAMELEER_EXPORT_ENDPOINT", /* server endpoint */);
envVars.put("CAMELEER_AUTH_TOKEN", /* bootstrap token */);
envVars.put("CAMELEER_APPLICATION_ID", app.slug());
envVars.put("CAMELEER_ENVIRONMENT_ID", env.slug());
envVars.put("CAMELEER_DISPLAY_NAME", deployment.containerName());
Map<String, String> labels = buildTraefikLabels(app, env, deployment);
ContainerRequest request = new ContainerRequest(
deployment.containerName(),
/* baseImage */, jarPath, /* network */,
envVars, labels, /* memoryLimit */, /* cpuShares */, 9464);
String containerId = orchestrator.startContainer(request);
waitForHealthy(containerId, 60);
deploymentService.markRunning(deployment.id(), containerId);
log.info("Deployment {} is RUNNING (container={})", deployment.id(), containerId);
} catch (Exception e) {
log.error("Deployment {} FAILED: {}", deployment.id(), e.getMessage(), e);
deploymentService.markFailed(deployment.id(), e.getMessage());
}
}
private void waitForHealthy(String containerId, int timeoutSeconds) throws InterruptedException {
long deadline = System.currentTimeMillis() + timeoutSeconds * 1000L;
while (System.currentTimeMillis() < deadline) {
ContainerStatus status = orchestrator.getContainerStatus(containerId);
if ("healthy".equalsIgnoreCase(status.state()) || (status.running() && "running".equalsIgnoreCase(status.state()))) {
return;
}
if (!status.running()) {
throw new RuntimeException("Container stopped unexpectedly: " + status.error());
}
Thread.sleep(2000);
}
throw new RuntimeException("Container health check timed out after " + timeoutSeconds + "s");
}
private Map<String, String> buildTraefikLabels(App app, Environment env, Deployment deployment) {
// TODO: implement path-based and subdomain-based Traefik labels based on routing config
return Map.of("traefik.enable", "true");
}
}
```
- [x] **Step 2: Add async config**
Add to `RuntimeBeanConfig.java` or create `AsyncConfig.java`:
```java
@Bean(name = "deploymentExecutor")
public TaskExecutor deploymentTaskExecutor() {
ThreadPoolTaskExecutor executor = new ThreadPoolTaskExecutor();
executor.setCorePoolSize(4);
executor.setMaxPoolSize(4);
executor.setQueueCapacity(25);
executor.setThreadNamePrefix("deploy-");
executor.initialize();
return executor;
}
```
- [x] **Step 3: Commit**
```bash
git add cameleer-server-app/src/main/java/com/cameleer/server/app/runtime/DeploymentExecutor.java
git commit -m "feat: implement async DeploymentExecutor pipeline"
```
---
### Task 9: REST Controllers — Environment, App, Deployment
**Files:**
- Create: `EnvironmentAdminController.java` (under `/api/v1/admin/environments`, ADMIN role)
- Create: `AppController.java` (under `/api/v1/apps`, OPERATOR role)
- Create: `DeploymentController.java` (under `/api/v1/apps/{appId}/deployments`, OPERATOR role)
- [x] **Step 1: Implement EnvironmentAdminController**
CRUD for environments. Path: `/api/v1/admin/environments`. Requires ADMIN role. Follows existing controller patterns (OpenAPI annotations, ResponseEntity).
- [x] **Step 2: Implement AppController**
App CRUD + JAR upload. Path: `/api/v1/apps`. Requires OPERATOR role. JAR upload via `multipart/form-data`. Returns app versions.
Key endpoint for JAR upload:
```java
@PostMapping(value = "/{appId}/versions", consumes = MediaType.MULTIPART_FORM_DATA_VALUE)
public ResponseEntity<AppVersion> uploadJar(@PathVariable UUID appId,
@RequestParam("file") MultipartFile file) throws IOException {
AppVersion version = appService.uploadJar(appId, file.getOriginalFilename(), file.getInputStream(), file.getSize());
return ResponseEntity.status(201).body(version);
}
```
- [x] **Step 3: Implement DeploymentController**
Deploy, stop, restart, promote, logs. Path: `/api/v1/apps/{appId}/deployments`. Requires OPERATOR role.
Key endpoints:
```java
@PostMapping
public ResponseEntity<Deployment> deploy(@PathVariable UUID appId, @RequestBody DeployRequest request) {
// request contains: appVersionId, environmentId
Deployment deployment = deploymentService.createDeployment(appId, request.appVersionId(), request.environmentId());
deploymentExecutor.executeAsync(deployment);
return ResponseEntity.accepted().body(deployment);
}
@PostMapping("/{deploymentId}/promote")
public ResponseEntity<Deployment> promote(@PathVariable UUID appId, @PathVariable UUID deploymentId,
@RequestBody PromoteRequest request) {
Deployment source = deploymentService.getById(deploymentId);
Deployment promoted = deploymentService.promote(appId, source.appVersionId(), request.targetEnvironmentId());
deploymentExecutor.executeAsync(promoted);
return ResponseEntity.accepted().body(promoted);
}
```
- [x] **Step 4: Add security rules to SecurityConfig**
Add to `SecurityConfig.filterChain()`:
```java
// Runtime management (OPERATOR+)
.requestMatchers("/api/v1/apps/**").hasAnyRole("OPERATOR", "ADMIN")
```
- [x] **Step 5: Commit**
```bash
git add cameleer-server-app/src/main/java/com/cameleer/server/app/controller/EnvironmentAdminController.java
git add cameleer-server-app/src/main/java/com/cameleer/server/app/controller/AppController.java
git add cameleer-server-app/src/main/java/com/cameleer/server/app/controller/DeploymentController.java
git commit -m "feat: add REST controllers for environment, app, and deployment management"
```
---
### Task 10: Configuration and Application Properties
**Files:**
- Modify: `cameleer-server-app/src/main/resources/application.yml`
- [x] **Step 1: Add runtime config properties**
```yaml
cameleer:
runtime:
enabled: ${CAMELEER_RUNTIME_ENABLED:true}
jar-storage-path: ${CAMELEER_JAR_STORAGE_PATH:/data/jars}
base-image: ${CAMELEER_RUNTIME_BASE_IMAGE:cameleer-runtime-base:latest}
docker-network: ${CAMELEER_DOCKER_NETWORK:cameleer}
agent-health-port: 9464
health-check-timeout: 60
container-memory-limit: ${CAMELEER_CONTAINER_MEMORY_LIMIT:512m}
container-cpu-shares: ${CAMELEER_CONTAINER_CPU_SHARES:512}
routing-mode: ${CAMELEER_ROUTING_MODE:path}
routing-domain: ${CAMELEER_ROUTING_DOMAIN:localhost}
```
- [x] **Step 2: Run full test suite**
Run: `cd /c/Users/Hendrik/Documents/projects/cameleer-server && mvn clean verify`
Expected: PASS.
- [x] **Step 3: Commit**
```bash
git add cameleer-server-app/src/main/resources/application.yml
git commit -m "feat: add runtime management configuration properties"
```
---
### Task 11: Integration Tests
- [x] **Step 1: Write EnvironmentAdminController integration test**
Test CRUD operations for environments. Follows existing pattern from `AgentRegistrationControllerIT`.
- [x] **Step 2: Write AppController integration test**
Test app creation, JAR upload, version listing.
- [x] **Step 3: Write DeploymentController integration test**
Test deployment creation (with `DisabledRuntimeOrchestrator` — verifies the deployment record is created even if Docker is unavailable). Full Docker tests require Docker-in-Docker and are out of scope for CI.
- [x] **Step 4: Commit**
```bash
git add cameleer-server-app/src/test/java/com/cameleer/server/app/controller/
git commit -m "test: add integration tests for runtime management API"
```
---
### Task 12: Final Verification
- [x] **Step 1: Run full build**
Run: `cd /c/Users/Hendrik/Documents/projects/cameleer-server && mvn clean verify`
Expected: All tests PASS.
- [x] **Step 2: Verify schema applies cleanly**
Fresh Testcontainers PostgreSQL should apply V1 + V2 + V3 without errors.
- [x] **Step 3: Commit any remaining fixes**
```bash
git add -A
git commit -m "chore: finalize runtime management — all tests passing"
```

View File

@@ -0,0 +1,377 @@
# Plan 4: SaaS Cleanup — Strip to Vendor Management Plane
> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
**Goal:** Remove all migrated code from the SaaS layer (environments, apps, deployments, ClickHouse access) and strip it down to a thin vendor management plane: tenant lifecycle, license generation, billing, and Logto organization management.
**Architecture:** The SaaS retains only vendor-level concerns. All runtime management, observability, and user management is now in the server. The SaaS communicates with server instances exclusively via REST API (ServerApiClient). ClickHouse dependency is removed entirely.
**Tech Stack:** Java 21, Spring Boot 3.4.3, PostgreSQL 16
**Repo:** `C:\Users\Hendrik\Documents\projects\cameleer-saas`
**Prerequisite:** Plans 1-3 must be implemented in cameleer-server first.
---
## Summary of Changes
### Files to DELETE (migrated to server or no longer needed)
```
src/main/java/net/siegeln/cameleer/saas/environment/
├── EnvironmentEntity.java
├── EnvironmentService.java
├── EnvironmentController.java
├── EnvironmentRepository.java
├── EnvironmentStatus.java
└── dto/
├── CreateEnvironmentRequest.java
├── UpdateEnvironmentRequest.java
└── EnvironmentResponse.java
src/main/java/net/siegeln/cameleer/saas/app/
├── AppEntity.java
├── AppService.java
├── AppController.java
├── AppRepository.java
└── dto/
├── CreateAppRequest.java
└── AppResponse.java
src/main/java/net/siegeln/cameleer/saas/deployment/
├── DeploymentEntity.java
├── DeploymentService.java
├── DeploymentController.java
├── DeploymentRepository.java
├── DeploymentExecutor.java
├── DesiredStatus.java
├── ObservedStatus.java
└── dto/
└── DeploymentResponse.java
src/main/java/net/siegeln/cameleer/saas/runtime/
├── RuntimeOrchestrator.java
├── DockerRuntimeOrchestrator.java
├── RuntimeConfig.java
├── BuildImageRequest.java
├── StartContainerRequest.java
├── ContainerStatus.java
└── LogConsumer.java
src/main/java/net/siegeln/cameleer/saas/log/
├── ClickHouseConfig.java
├── ClickHouseProperties.java
├── ContainerLogService.java
├── LogController.java
└── dto/
└── LogEntry.java
src/main/java/net/siegeln/cameleer/saas/observability/
├── AgentStatusService.java
├── AgentStatusController.java
└── dto/
├── AgentStatusResponse.java
└── ObservabilityStatusResponse.java
```
### Files to MODIFY
```
src/main/java/net/siegeln/cameleer/saas/config/AsyncConfig.java — remove deploymentExecutor bean
src/main/java/net/siegeln/cameleer/saas/tenant/TenantService.java — remove createDefaultForTenant() call
src/main/resources/application.yml — remove clickhouse + runtime config sections
docker-compose.yml — remove Docker socket mount from SaaS, update routing
```
### Files to KEEP (vendor management plane)
```
src/main/java/net/siegeln/cameleer/saas/tenant/ — Tenant CRUD, lifecycle
src/main/java/net/siegeln/cameleer/saas/license/ — License generation
src/main/java/net/siegeln/cameleer/saas/identity/ — Logto org management, ServerApiClient
src/main/java/net/siegeln/cameleer/saas/config/ — SecurityConfig, SpaController
src/main/java/net/siegeln/cameleer/saas/audit/ — Vendor audit logging
src/main/java/net/siegeln/cameleer/saas/apikey/ — API key management (if used)
ui/ — Vendor management dashboard
```
### Flyway Migrations to KEEP
The existing migrations (V001-V009) can remain since they're already applied. Add a new cleanup migration:
```
src/main/resources/db/migration/V010__drop_migrated_tables.sql
```
---
### Task 1: Remove ClickHouse Dependency
- [ ] **Step 1: Delete ClickHouse files**
```bash
rm -rf src/main/java/net/siegeln/cameleer/saas/log/ClickHouseConfig.java
rm -rf src/main/java/net/siegeln/cameleer/saas/log/ClickHouseProperties.java
rm -rf src/main/java/net/siegeln/cameleer/saas/log/ContainerLogService.java
rm -rf src/main/java/net/siegeln/cameleer/saas/log/LogController.java
rm -rf src/main/java/net/siegeln/cameleer/saas/log/dto/
```
- [ ] **Step 2: Remove ClickHouse from AgentStatusService**
Delete `AgentStatusService.java` and `AgentStatusController.java` entirely (agent status is now a server concern).
```bash
rm -rf src/main/java/net/siegeln/cameleer/saas/observability/
```
- [ ] **Step 3: Remove ClickHouse config from application.yml**
Remove the entire `cameleer.clickhouse:` section.
- [ ] **Step 4: Remove ClickHouse JDBC dependency from pom.xml**
Remove:
```xml
<dependency>
<groupId>com.clickhouse</groupId>
<artifactId>clickhouse-jdbc</artifactId>
</dependency>
```
- [ ] **Step 5: Verify build**
Run: `cd /c/Users/Hendrik/Documents/projects/cameleer-saas && mvn compile`
Expected: BUILD SUCCESS. Fix any remaining import errors.
- [ ] **Step 6: Commit**
```bash
git add -A
git commit -m "feat: remove all ClickHouse dependencies from SaaS layer"
```
---
### Task 2: Remove Environment/App/Deployment Code
- [ ] **Step 1: Delete environment package**
```bash
rm -rf src/main/java/net/siegeln/cameleer/saas/environment/
```
- [ ] **Step 2: Delete app package**
```bash
rm -rf src/main/java/net/siegeln/cameleer/saas/app/
```
- [ ] **Step 3: Delete deployment package**
```bash
rm -rf src/main/java/net/siegeln/cameleer/saas/deployment/
```
- [ ] **Step 4: Delete runtime package**
```bash
rm -rf src/main/java/net/siegeln/cameleer/saas/runtime/
```
- [ ] **Step 5: Remove AsyncConfig deploymentExecutor bean**
In `AsyncConfig.java`, remove the `deploymentExecutor` bean (or delete AsyncConfig if it only had that bean).
- [ ] **Step 6: Update TenantService**
Remove any calls to `EnvironmentService.createDefaultForTenant()` from `TenantService.java`. The server now handles default environment creation.
- [ ] **Step 7: Remove runtime config from application.yml**
Remove the entire `cameleer.runtime:` section.
- [ ] **Step 8: Verify build**
Run: `cd /c/Users/Hendrik/Documents/projects/cameleer-saas && mvn compile`
Expected: BUILD SUCCESS. Fix any remaining import errors.
- [ ] **Step 9: Commit**
```bash
git add -A
git commit -m "feat: remove migrated environment/app/deployment/runtime code from SaaS"
```
---
### Task 3: Database Cleanup Migration
- [ ] **Step 1: Create cleanup migration**
```sql
-- V010__drop_migrated_tables.sql
-- Drop tables that have been migrated to cameleer-server
DROP TABLE IF EXISTS deployments CASCADE;
DROP TABLE IF EXISTS apps CASCADE;
DROP TABLE IF EXISTS environments CASCADE;
DROP TABLE IF EXISTS api_keys CASCADE;
```
- [ ] **Step 2: Commit**
```bash
git add src/main/resources/db/migration/V010__drop_migrated_tables.sql
git commit -m "feat: drop migrated tables from SaaS database"
```
---
### Task 4: Remove Docker Socket Dependency
- [ ] **Step 1: Update docker-compose.yml**
Remove from `cameleer-saas` service:
```yaml
volumes:
- /var/run/docker.sock:/var/run/docker.sock
- jardata:/data/jars
group_add:
- "0"
```
The Docker socket mount now belongs to the `cameleer-server` service instead.
- [ ] **Step 2: Remove docker-java dependency from pom.xml**
Remove:
```xml
<dependency>
<groupId>com.github.docker-java</groupId>
<artifactId>docker-java-core</artifactId>
</dependency>
<dependency>
<groupId>com.github.docker-java</groupId>
<artifactId>docker-java-transport-zerodep</artifactId>
</dependency>
```
- [ ] **Step 3: Commit**
```bash
git add docker-compose.yml pom.xml
git commit -m "feat: remove Docker socket dependency from SaaS layer"
```
---
### Task 5: Update SaaS UI
- [ ] **Step 1: Remove environment/app/deployment pages from SaaS frontend**
Remove pages that now live in the server UI:
- `EnvironmentsPage`
- `EnvironmentDetailPage`
- `AppDetailPage`
The SaaS UI retains:
- `DashboardPage` — vendor overview (tenant list, status)
- `AdminTenantsPage` — tenant management
- `LicensePage` — license management
- [ ] **Step 2: Update navigation**
Remove links to environments/apps/deployments. The SaaS UI should link to the tenant's server instance for those features (e.g., "Open Dashboard" link to `https://{tenant-slug}.cameleer.example.com/server/`).
- [ ] **Step 3: Commit**
```bash
git add ui/
git commit -m "feat: strip SaaS UI to vendor management dashboard"
```
---
### Task 6: Expand ServerApiClient
- [ ] **Step 1: Add provisioning-related API calls**
The `ServerApiClient` should gain methods for tenant provisioning:
```java
public void pushLicense(String serverEndpoint, String licenseToken) {
post(serverEndpoint + "/api/v1/admin/license")
.body(Map.of("token", licenseToken))
.retrieve()
.toBodilessEntity();
}
public Map<String, Object> getHealth(String serverEndpoint) {
return get(serverEndpoint + "/api/v1/health")
.retrieve()
.body(Map.class);
}
```
- [ ] **Step 2: Commit**
```bash
git add src/main/java/net/siegeln/cameleer/saas/identity/ServerApiClient.java
git commit -m "feat: expand ServerApiClient with license push and health check methods"
```
---
### Task 7: Write SAAS-INTEGRATION.md
- [ ] **Step 1: Create integration contract document**
Create `docs/SAAS-INTEGRATION.md` in the cameleer-server repo documenting:
- Which server API endpoints the SaaS calls
- Required auth (M2M token with `server:admin` scope)
- License injection mechanism (`POST /api/v1/admin/license`)
- Health check endpoint (`GET /api/v1/health`)
- What the server exposes vs what the SaaS must never access directly
- Env vars the SaaS sets when provisioning a server instance
- [ ] **Step 2: Commit**
```bash
cd /c/Users/Hendrik/Documents/projects/cameleer-server
git add docs/SAAS-INTEGRATION.md
git commit -m "docs: add SaaS integration contract documentation"
```
---
### Task 8: Final Verification
- [ ] **Step 1: Build SaaS**
Run: `cd /c/Users/Hendrik/Documents/projects/cameleer-saas && mvn clean verify`
Expected: BUILD SUCCESS with reduced dependency footprint.
- [ ] **Step 2: Verify SaaS starts without ClickHouse**
The SaaS should start with only PostgreSQL (and Logto). No ClickHouse required.
- [ ] **Step 3: Verify remaining code footprint**
The SaaS source should now contain approximately:
- `tenant/` — ~4 files
- `license/` — ~5 files
- `identity/` — ~3 files (LogtoConfig, ServerApiClient, M2M token)
- `config/` — ~3 files (SecurityConfig, SpaController, TLS)
- `audit/` — ~3 files
- `ui/` — stripped dashboard
Total: ~20 Java files (down from ~75).
- [ ] **Step 4: Final commit**
```bash
git add -A
git commit -m "chore: finalize SaaS cleanup — vendor management plane only"
```

File diff suppressed because it is too large Load Diff

View File

@@ -0,0 +1,760 @@
# SaaS Platform UX Polish — Implementation Plan
> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
**Goal:** Fix layout bugs, replace hardcoded dark-only colors with design system tokens, improve navigation/header, add error handling, and adopt design system components consistently across the SaaS platform UI.
**Architecture:** All changes are in the existing SaaS platform UI (`ui/src/`) and sign-in page (`ui/sign-in/src/`). The platform uses `@cameleer/design-system` components and Tailwind CSS. The key issue is that pages use hardcoded `text-white` Tailwind classes instead of DS CSS variables, and the DS `TopBar` renders server-specific controls that are irrelevant on platform pages.
**Tech Stack:** React 19, TypeScript, Tailwind CSS, `@cameleer/design-system`, React Router v6, Logto SDK
**Spec:** `docs/superpowers/specs/2026-04-09-saas-ux-polish-design.md`
---
## Task 1: Fix label/value collision and replace hardcoded colors
**Spec items:** 1.1, 1.2
**Files:**
- Create: `ui/src/styles/platform.module.css`
- Modify: `ui/src/pages/DashboardPage.tsx`
- Modify: `ui/src/pages/LicensePage.tsx`
- Modify: `ui/src/pages/AdminTenantsPage.tsx`
- [ ] **Step 1: Create shared platform CSS module**
Create `ui/src/styles/platform.module.css` with DS-variable-based classes replacing the hardcoded Tailwind colors:
```css
.heading {
font-size: 1.5rem;
font-weight: 600;
color: var(--text-primary);
}
.textPrimary {
color: var(--text-primary);
}
.textSecondary {
color: var(--text-secondary);
}
.textMuted {
color: var(--text-muted);
}
.mono {
font-family: var(--font-mono);
}
.kvRow {
display: flex;
align-items: center;
justify-content: space-between;
width: 100%;
}
.kvLabel {
font-size: 0.875rem;
color: var(--text-muted);
}
.kvValue {
font-size: 0.875rem;
color: var(--text-primary);
}
.kvValueMono {
font-size: 0.875rem;
color: var(--text-primary);
font-family: var(--font-mono);
}
.dividerList {
display: flex;
flex-direction: column;
}
.dividerList > * + * {
border-top: 1px solid var(--border-subtle);
}
.dividerRow {
display: flex;
align-items: center;
justify-content: space-between;
padding: 0.75rem 0;
}
.dividerRow:first-child {
padding-top: 0;
}
.dividerRow:last-child {
padding-bottom: 0;
}
.description {
font-size: 0.875rem;
color: var(--text-muted);
}
.tokenBlock {
margin-top: 0.5rem;
border-radius: var(--radius-sm);
background: var(--bg-inset);
border: 1px solid var(--border-subtle);
padding: 0.75rem;
overflow-x: auto;
}
.tokenCode {
font-size: 0.75rem;
font-family: var(--font-mono);
color: var(--text-secondary);
word-break: break-all;
}
```
- [ ] **Step 2: Update DashboardPage to use CSS module + fix label/value**
In `ui/src/pages/DashboardPage.tsx`:
1. Add import:
```typescript
import s from '../styles/platform.module.css';
```
2. Replace all hardcoded color classes:
- Line 71: `text-2xl font-semibold text-white``className={s.heading}`
- Lines 96, 100, 107: `className="flex justify-between text-white/80"``className={s.kvRow}`
- Inner label spans: wrap with `className={s.kvLabel}`
- Inner value spans: wrap with `className={s.kvValueMono}` (for mono) or `className={s.kvValue}`
- Line 116: `text-sm text-white/60``className={s.description}`
3. The label/value collision fix: the `kvRow` class uses explicit `display: flex; width: 100%; justify-content: space-between` which ensures the flex container stretches to full Card width regardless of Card's inner layout.
- [ ] **Step 3: Update LicensePage to use CSS module**
In `ui/src/pages/LicensePage.tsx`:
1. Add import: `import s from '../styles/platform.module.css';`
2. Replace all hardcoded color classes:
- Line 85: heading → `className={s.heading}`
- Lines 95-115 (Validity rows): `flex items-center justify-between``className={s.kvRow}`, labels → `className={s.kvLabel}`, values → `className={s.kvValue}`
- Lines 121-136 (Features): `divide-y divide-white/10``className={s.dividerList}`, rows → `className={s.dividerRow}`, feature name `text-sm text-white``className={s.textPrimary}` + `text-sm`
- Lines 142-157 (Limits): same dividerList/dividerRow pattern, label → `className={s.kvLabel}`, value → `className={s.kvValueMono}`
- Line 163: description text → `className={s.description}`
- Lines 174-178: token code block → `className={s.tokenBlock}` on outer div, `className={s.tokenCode}` on code element
- [ ] **Step 4: Update AdminTenantsPage to use CSS module**
In `ui/src/pages/AdminTenantsPage.tsx`:
- Line 62: `text-2xl font-semibold text-white``className={s.heading}`
- [ ] **Step 5: Verify in both themes**
1. Open the platform dashboard in browser
2. Check label/value pairs have proper spacing (Slug on left, "default" on right)
3. Toggle to light theme via TopBar toggle
4. Verify all text is readable in light mode (no invisible white-on-white)
5. Toggle back to dark mode — should look the same as before
- [ ] **Step 6: Commit**
```bash
git add ui/src/styles/platform.module.css ui/src/pages/DashboardPage.tsx ui/src/pages/LicensePage.tsx ui/src/pages/AdminTenantsPage.tsx
git commit -m "fix: replace hardcoded text-white with DS variables, fix label/value layout"
```
---
## Task 2: Remove redundant dashboard elements
**Spec items:** 1.3, 2.4
**Files:**
- Modify: `ui/src/pages/DashboardPage.tsx`
- [ ] **Step 1: Remove primary "Open Server Dashboard" button from header**
In `ui/src/pages/DashboardPage.tsx`, find the header area (lines ~75-88). Remove the primary Button for "Open Server Dashboard" (lines ~81-87). Keep:
- The Server Management Card with its secondary button (lines ~113-126)
- The sidebar footer link (in Layout.tsx — don't touch)
The header area should just have the tenant name heading + tier badge, no button.
- [ ] **Step 2: Commit**
```bash
git add ui/src/pages/DashboardPage.tsx
git commit -m "fix: remove redundant Open Server Dashboard button from dashboard header"
```
---
## Task 3: Fix header controls and sidebar navigation
**Spec items:** 2.1, 2.2, 2.3, 2.5
**Files:**
- Modify: `ui/src/components/Layout.tsx`
- Modify: `ui/src/main.tsx` (possibly)
- [ ] **Step 1: Investigate TopBar props for hiding controls**
The DS `TopBar` interface (from types):
```typescript
interface TopBarProps {
breadcrumb: BreadcrumbItem[];
environment?: ReactNode;
user?: { name: string };
onLogout?: () => void;
className?: string;
}
```
The TopBar has NO props to hide status filters, time range, auto-refresh, or search. These are hardcoded inside the component.
**Options:**
1. Check if removing `GlobalFilterProvider` and `CommandPaletteProvider` from `main.tsx` makes TopBar gracefully hide those sections (test this first)
2. If that causes errors, add `display: none` CSS overrides for the irrelevant sections
3. If neither works, build a simplified platform header
Try option 1 first. In `main.tsx`, remove `GlobalFilterProvider` and `CommandPaletteProvider` from the provider stack. Test if the app still renders. If TopBar crashes without them, revert and try option 2.
- [ ] **Step 2: Add sidebar active state**
In `ui/src/components/Layout.tsx`, add route-based active state:
```typescript
import { useLocation } from 'react-router';
// Inside the Layout component:
const location = useLocation();
```
Update each `Sidebar.Section`:
```tsx
<Sidebar.Section
icon={<DashboardIcon />}
label="Dashboard"
open={false}
active={location.pathname === '/' || location.pathname === ''}
onToggle={() => navigate('/')}
>
{null}
</Sidebar.Section>
<Sidebar.Section
icon={<LicenseIcon />}
label="License"
open={false}
active={location.pathname === '/license'}
onToggle={() => navigate('/license')}
>
{null}
</Sidebar.Section>
{scopes.has('platform:admin') && (
<Sidebar.Section
icon={<PlatformIcon />}
label="Platform"
open={false}
active={location.pathname.startsWith('/admin')}
onToggle={() => navigate('/admin/tenants')}
>
{null}
</Sidebar.Section>
)}
```
- [ ] **Step 3: Add breadcrumbs**
In Layout.tsx, compute breadcrumbs from the current route:
```typescript
const breadcrumb = useMemo((): BreadcrumbItem[] => {
const path = location.pathname;
if (path.startsWith('/admin')) return [{ label: 'Admin' }, { label: 'Tenants' }];
if (path.startsWith('/license')) return [{ label: 'License' }];
return [{ label: 'Dashboard' }];
}, [location.pathname]);
```
Pass to TopBar:
```tsx
<TopBar breadcrumb={breadcrumb} ... />
```
Import `BreadcrumbItem` type from `@cameleer/design-system` if needed.
- [ ] **Step 4: Fix sidebar collapse**
Replace the hardcoded collapse state:
```typescript
const [sidebarCollapsed, setSidebarCollapsed] = useState(false);
```
```tsx
<Sidebar collapsed={sidebarCollapsed} onCollapseToggle={() => setSidebarCollapsed(c => !c)}>
```
- [ ] **Step 5: Fix username null fallback**
Update the user prop (line ~125):
```tsx
const displayName = username || 'User';
<TopBar
breadcrumb={breadcrumb}
user={{ name: displayName }}
onLogout={logout}
/>
```
This ensures the logout button is always visible.
- [ ] **Step 6: Replace custom SVG icons with lucide-react**
Replace the 4 custom SVG icon components (lines 25-62) with lucide-react icons:
```typescript
import { LayoutDashboard, ShieldCheck, Building, Server } from 'lucide-react';
```
Then update sidebar sections:
```tsx
icon={<LayoutDashboard size={18} />} // was <DashboardIcon />
icon={<ShieldCheck size={18} />} // was <LicenseIcon />
icon={<Building size={18} />} // was <PlatformIcon />
```
Remove the 4 custom SVG component functions (DashboardIcon, LicenseIcon, ObsIcon, PlatformIcon).
- [ ] **Step 7: Verify**
1. Sidebar shows active highlight on current page
2. Breadcrumbs show "Dashboard", "License", or "Admin > Tenants"
3. Sidebar collapse works (click collapse button, sidebar minimizes)
4. User avatar/logout always visible
5. Icons render correctly from lucide-react
6. Check if server controls are hidden (depending on step 1 result)
- [ ] **Step 8: Commit**
```bash
git add ui/src/components/Layout.tsx ui/src/main.tsx
git commit -m "fix: sidebar active state, breadcrumbs, collapse, username fallback, lucide icons"
```
---
## Task 4: Error handling and OrgResolver fix
**Spec items:** 3.1, 3.2, 3.7
**Files:**
- Modify: `ui/src/auth/OrgResolver.tsx`
- Modify: `ui/src/pages/DashboardPage.tsx`
- Modify: `ui/src/pages/AdminTenantsPage.tsx`
- [ ] **Step 1: Fix OrgResolver error state**
In `ui/src/auth/OrgResolver.tsx`, find the error handling (lines 88-90):
```tsx
// BEFORE:
if (isError) {
return null;
}
// AFTER:
if (isError) {
return (
<div className="flex flex-col items-center justify-center h-64 gap-4">
<EmptyState
title="Unable to load account"
description="Failed to retrieve your organization. Please try again or contact support."
/>
<Button variant="secondary" size="sm" onClick={() => refetch()}>
Retry
</Button>
</div>
);
}
```
Add imports: `EmptyState`, `Button` from `@cameleer/design-system`. Ensure `refetch` is available from the query hook (check if `useQuery` returns it).
- [ ] **Step 2: Add error handling to DashboardPage**
In `ui/src/pages/DashboardPage.tsx`, after the loading check (line ~49) and tenant check (line ~57), add error handling:
```tsx
const { data: tenant, isError: tenantError } = useTenant();
const { data: license, isError: licenseError } = useLicense();
// After loading spinner check:
if (tenantError || licenseError) {
return (
<div className="p-6">
<EmptyState
title="Unable to load dashboard"
description="Failed to retrieve tenant information. Please try again later."
/>
</div>
);
}
```
Check how `useTenant()` and `useLicense()` expose error state — they may use `isError` from React Query.
- [ ] **Step 3: Add empty state and date formatting to AdminTenantsPage**
In `ui/src/pages/AdminTenantsPage.tsx`:
1. Add error handling:
```tsx
if (isError) {
return (
<div className="p-6">
<EmptyState
title="Unable to load tenants"
description="You may not have admin permissions, or the server is unavailable."
/>
</div>
);
}
```
2. Format the `createdAt` column (line 31):
```tsx
// BEFORE:
{ key: 'createdAt', header: 'Created' },
// AFTER:
{ key: 'createdAt', header: 'Created', render: (_, row) => new Date(row.createdAt).toLocaleDateString() },
```
3. Add empty state to DataTable (if supported) or show EmptyState when tenants is empty:
```tsx
{(!tenants || tenants.length === 0) ? (
<EmptyState title="No tenants" description="No tenants have been created yet." />
) : (
<DataTable columns={columns} data={tenants} onRowClick={handleRowClick} />
)}
```
- [ ] **Step 4: Commit**
```bash
git add ui/src/auth/OrgResolver.tsx ui/src/pages/DashboardPage.tsx ui/src/pages/AdminTenantsPage.tsx
git commit -m "fix: add error states to OrgResolver, DashboardPage, AdminTenantsPage"
```
---
## Task 5: DS component adoption and license token copy
**Spec items:** 3.3, 3.4
**Files:**
- Modify: `ui/src/pages/LicensePage.tsx`
- [ ] **Step 1: Replace raw button with DS Button**
In `ui/src/pages/LicensePage.tsx`, find the token toggle button (lines ~166-172):
```tsx
// BEFORE:
<button
type="button"
className="text-sm text-primary-400 hover:text-primary-300 underline underline-offset-2 focus:outline-none"
onClick={() => setTokenExpanded((v) => !v)}
>
{tokenExpanded ? 'Hide token' : 'Show token'}
</button>
// AFTER:
<Button variant="ghost" size="sm" onClick={() => setTokenExpanded((v) => !v)}>
{tokenExpanded ? 'Hide token' : 'Show token'}
</Button>
```
Ensure `Button` is imported from `@cameleer/design-system`.
- [ ] **Step 2: Add copy-to-clipboard button**
Add `useToast` import and `Copy` icon:
```typescript
import { useToast } from '@cameleer/design-system';
import { Copy } from 'lucide-react';
```
Add toast hook in component:
```typescript
const { toast } = useToast();
```
Next to the show/hide button, add a copy button (only when expanded):
```tsx
<div style={{ display: 'flex', gap: 8, alignItems: 'center' }}>
<Button variant="ghost" size="sm" onClick={() => setTokenExpanded((v) => !v)}>
{tokenExpanded ? 'Hide token' : 'Show token'}
</Button>
{tokenExpanded && (
<Button variant="ghost" size="sm" onClick={() => {
navigator.clipboard.writeText(license.token);
toast({ title: 'Token copied to clipboard', variant: 'success' });
}}>
<Copy size={14} /> Copy
</Button>
)}
</div>
```
- [ ] **Step 3: Commit**
```bash
git add ui/src/pages/LicensePage.tsx
git commit -m "fix: replace raw button with DS Button, add token copy-to-clipboard"
```
---
## Task 6: Sign-in page improvements
**Spec items:** 3.6, 4.5
**Files:**
- Modify: `ui/sign-in/src/SignInPage.tsx`
- [ ] **Step 1: Add password visibility toggle**
In `ui/sign-in/src/SignInPage.tsx`, add state and imports:
```typescript
import { Eye, EyeOff } from 'lucide-react';
const [showPassword, setShowPassword] = useState(false);
```
Update the password FormField (lines ~84-94):
```tsx
<FormField label="Password" htmlFor="login-password">
<div style={{ position: 'relative' }}>
<Input
id="login-password"
type={showPassword ? 'text' : 'password'}
value={password}
onChange={(e) => setPassword(e.target.value)}
placeholder="••••••••"
autoComplete="current-password"
disabled={loading}
/>
<button
type="button"
onClick={() => setShowPassword(!showPassword)}
style={{
position: 'absolute', right: 8, top: '50%', transform: 'translateY(-50%)',
background: 'none', border: 'none', cursor: 'pointer', color: 'var(--text-muted)',
padding: 4, display: 'flex', alignItems: 'center',
}}
tabIndex={-1}
>
{showPassword ? <EyeOff size={16} /> : <Eye size={16} />}
</button>
</div>
</FormField>
```
Note: Using a raw `<button>` here because the sign-in page may not have the full DS Button available (it's a separate Vite build). Use inline styles for positioning since the sign-in page uses CSS modules.
- [ ] **Step 2: Fix branding text**
In `ui/sign-in/src/SignInPage.tsx`, find the logo text (line ~61):
```tsx
// BEFORE:
<div className={styles.logo}>
<img src={cameleerLogo} alt="" className={styles.logoImg} />
cameleer
</div>
// AFTER:
<div className={styles.logo}>
<img src={cameleerLogo} alt="" className={styles.logoImg} />
Cameleer
</div>
```
Also update the page title if it's set anywhere (check `index.html` in `ui/sign-in/`):
```html
<title>Sign in — Cameleer</title>
```
- [ ] **Step 3: Commit**
```bash
git add ui/sign-in/src/SignInPage.tsx ui/sign-in/index.html
git commit -m "fix: add password visibility toggle and fix branding to 'Cameleer'"
```
---
## Task 7: Unify tier colors and fix badges
**Spec items:** 4.1, 4.2
**Files:**
- Create: `ui/src/utils/tier.ts`
- Modify: `ui/src/pages/DashboardPage.tsx`
- Modify: `ui/src/pages/LicensePage.tsx`
- [ ] **Step 1: Create shared tier utility**
Create `ui/src/utils/tier.ts`:
```typescript
export type TierColor = 'primary' | 'success' | 'warning' | 'error' | 'auto';
export function tierColor(tier: string): TierColor {
switch (tier?.toUpperCase()) {
case 'BUSINESS':
case 'ENTERPRISE':
return 'success';
case 'HIGH':
case 'PRO':
return 'primary';
case 'MID':
case 'STARTER':
return 'warning';
case 'LOW':
case 'FREE':
return 'auto';
default:
return 'auto';
}
}
```
- [ ] **Step 2: Replace local tierColor in both pages**
In `DashboardPage.tsx`, remove the local `tierColor` function (lines 12-19) and add:
```typescript
import { tierColor } from '../utils/tier';
```
In `LicensePage.tsx`, remove the local `tierColor` function (lines 25-33) and add:
```typescript
import { tierColor } from '../utils/tier';
```
- [ ] **Step 3: Fix feature badge color**
In `LicensePage.tsx`, find the feature badge (line ~131-132):
```tsx
// BEFORE:
color={enabled ? 'success' : 'auto'}
// Check what neutral badge colors the DS supports.
// If 'auto' hashes to inconsistent colors, use a fixed muted option.
// AFTER:
color={enabled ? 'success' : 'warning'}
```
Use `'warning'` (amber/muted) for "Not included" — it's neutral without implying error. If the DS has a better neutral option, use that.
- [ ] **Step 4: Commit**
```bash
git add ui/src/utils/tier.ts ui/src/pages/DashboardPage.tsx ui/src/pages/LicensePage.tsx
git commit -m "fix: unify tier color mapping, fix feature badge colors"
```
---
## Task 8: AdminTenantsPage confirmation and polish
**Spec items:** 4.3
**Files:**
- Modify: `ui/src/pages/AdminTenantsPage.tsx`
- [ ] **Step 1: Add confirmation before tenant context switch**
In `ui/src/pages/AdminTenantsPage.tsx`, add state and import:
```typescript
import { AlertDialog } from '@cameleer/design-system';
const [switchTarget, setSwitchTarget] = useState<TenantResponse | null>(null);
```
Update the row click handler:
```tsx
// BEFORE:
const handleRowClick = (tenant: TenantResponse) => {
const orgs = useOrgStore.getState().organizations;
const match = orgs.find((o) => o.name === tenant.name || o.slug === tenant.slug);
if (match) {
setCurrentOrg(match.id);
navigate('/');
}
};
// AFTER:
const handleRowClick = (tenant: TenantResponse) => {
setSwitchTarget(tenant);
};
const confirmSwitch = () => {
if (!switchTarget) return;
const orgs = useOrgStore.getState().organizations;
const match = orgs.find((o) => o.name === switchTarget.name || o.slug === switchTarget.slug);
if (match) {
setCurrentOrg(match.id);
navigate('/');
}
setSwitchTarget(null);
};
```
Add the AlertDialog at the bottom of the component return:
```tsx
<AlertDialog
open={!!switchTarget}
onCancel={() => setSwitchTarget(null)}
onConfirm={confirmSwitch}
title="Switch tenant?"
description={`Switch to tenant "${switchTarget?.name}"? Your dashboard context will change.`}
confirmLabel="Switch"
variant="warning"
/>
```
- [ ] **Step 2: Commit**
```bash
git add ui/src/pages/AdminTenantsPage.tsx
git commit -m "fix: add confirmation dialog before tenant context switch"
```
---
## Summary
| Task | Batch | Key Changes | Commit |
|------|-------|-------------|--------|
| 1 | Layout | CSS module with DS variables, fix label/value, replace text-white | `fix: replace hardcoded text-white with DS variables, fix label/value layout` |
| 2 | Layout | Remove redundant "Open Server Dashboard" button | `fix: remove redundant Open Server Dashboard button` |
| 3 | Navigation | Sidebar active state, breadcrumbs, collapse, username fallback, lucide icons | `fix: sidebar active state, breadcrumbs, collapse, username fallback, lucide icons` |
| 4 | Error Handling | OrgResolver error UI, DashboardPage error state, AdminTenantsPage error + date format | `fix: add error states to OrgResolver, DashboardPage, AdminTenantsPage` |
| 5 | Components | DS Button for token toggle, copy-to-clipboard with toast | `fix: replace raw button with DS Button, add token copy-to-clipboard` |
| 6 | Sign-in | Password visibility toggle, branding fix to "Cameleer" | `fix: add password visibility toggle and fix branding to 'Cameleer'` |
| 7 | Polish | Shared tierColor(), fix feature badge colors | `fix: unify tier color mapping, fix feature badge colors` |
| 8 | Polish | Confirmation dialog for admin tenant switch | `fix: add confirmation dialog before tenant context switch` |

View File

@@ -0,0 +1,210 @@
# Fleet Health at a Glance Implementation Plan
> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
**Goal:** Add agent count, environment count, and agent limit columns to the vendor tenant list so the vendor can see fleet utilization at a glance.
**Architecture:** Extend the existing `VendorTenantSummary` record with three int fields. The list endpoint fetches counts from each active tenant's server via existing M2M API methods (`getAgentCount`, `getEnvironmentCount`), parallelized with `CompletableFuture`. Frontend adds two columns (Agents, Envs) to the DataTable.
**Tech Stack:** Java 21, Spring Boot, CompletableFuture, React, TypeScript, @cameleer/design-system DataTable
---
### Task 1: Extend backend — VendorTenantSummary + parallel fetch
**Files:**
- Modify: `src/main/java/net/siegeln/cameleer/saas/vendor/VendorTenantController.java`
- [ ] **Step 1: Extend the VendorTenantSummary record**
In `VendorTenantController.java`, replace the record at lines 39-48:
```java
public record VendorTenantSummary(
UUID id,
String name,
String slug,
String tier,
String status,
String serverState,
String licenseExpiry,
String provisionError,
int agentCount,
int environmentCount,
int agentLimit
) {}
```
- [ ] **Step 2: Update the listAll() endpoint to fetch counts in parallel**
Replace the `listAll()` method at lines 60-77:
```java
@GetMapping
public ResponseEntity<List<VendorTenantSummary>> listAll() {
var tenants = vendorTenantService.listAll();
// Parallel health fetch for active tenants
var futures = tenants.stream().map(tenant -> java.util.concurrent.CompletableFuture.supplyAsync(() -> {
ServerStatus status = vendorTenantService.getServerStatus(tenant);
String licenseExpiry = vendorTenantService
.getLicenseForTenant(tenant.getId())
.map(l -> l.getExpiresAt() != null ? l.getExpiresAt().toString() : null)
.orElse(null);
int agentCount = 0;
int environmentCount = 0;
int agentLimit = -1;
String endpoint = tenant.getServerEndpoint();
boolean isActive = "ACTIVE".equals(tenant.getStatus().name());
if (isActive && endpoint != null && !endpoint.isBlank() && "RUNNING".equals(status.state().name())) {
var serverApi = vendorTenantService.getServerApiClient();
agentCount = serverApi.getAgentCount(endpoint);
environmentCount = serverApi.getEnvironmentCount(endpoint);
}
var license = vendorTenantService.getLicenseForTenant(tenant.getId());
if (license.isPresent() && license.get().getLimits() != null) {
var limits = license.get().getLimits();
if (limits.containsKey("agents")) {
agentLimit = ((Number) limits.get("agents")).intValue();
}
}
return new VendorTenantSummary(
tenant.getId(), tenant.getName(), tenant.getSlug(),
tenant.getTier().name(), tenant.getStatus().name(),
status.state().name(), licenseExpiry, tenant.getProvisionError(),
agentCount, environmentCount, agentLimit
);
})).toList();
List<VendorTenantSummary> summaries = futures.stream()
.map(java.util.concurrent.CompletableFuture::join)
.toList();
return ResponseEntity.ok(summaries);
}
```
- [ ] **Step 3: Expose ServerApiClient from VendorTenantService**
Add a getter in `src/main/java/net/siegeln/cameleer/saas/vendor/VendorTenantService.java`:
```java
public ServerApiClient getServerApiClient() {
return serverApiClient;
}
```
(The `serverApiClient` field already exists in VendorTenantService — check around line 30.)
- [ ] **Step 4: Verify compilation**
Run: `./mvnw compile -pl . -q`
Expected: BUILD SUCCESS
- [ ] **Step 5: Commit**
```bash
git add src/main/java/net/siegeln/cameleer/saas/vendor/VendorTenantController.java \
src/main/java/net/siegeln/cameleer/saas/vendor/VendorTenantService.java
git commit -m "feat: add agent/env counts to vendor tenant list endpoint"
```
---
### Task 2: Update frontend types and columns
**Files:**
- Modify: `ui/src/types/api.ts`
- Modify: `ui/src/pages/vendor/VendorTenantsPage.tsx`
- [ ] **Step 1: Add fields to VendorTenantSummary TypeScript type**
In `ui/src/types/api.ts`, update the `VendorTenantSummary` interface:
```typescript
export interface VendorTenantSummary {
id: string;
name: string;
slug: string;
tier: string;
status: string;
serverState: string;
licenseExpiry: string | null;
provisionError: string | null;
agentCount: number;
environmentCount: number;
agentLimit: number;
}
```
- [ ] **Step 2: Add Agents and Envs columns to VendorTenantsPage**
In `ui/src/pages/vendor/VendorTenantsPage.tsx`, add a helper function after `statusColor`:
```typescript
function formatUsage(used: number, limit: number): string {
return limit < 0 ? `${used} / ∞` : `${used} / ${limit}`;
}
```
Then add two column entries in the `columns` array, after the `serverState` column (after line 54) and before the `licenseExpiry` column:
```typescript
{
key: 'agentCount',
header: 'Agents',
render: (_v, row) => (
<span style={{ fontFamily: 'monospace', fontSize: '0.875rem' }}>
{formatUsage(row.agentCount, row.agentLimit)}
</span>
),
},
{
key: 'environmentCount',
header: 'Envs',
render: (_v, row) => (
<span style={{ fontFamily: 'monospace', fontSize: '0.875rem' }}>
{row.environmentCount}
</span>
),
},
```
- [ ] **Step 3: Build the UI**
Run: `cd ui && npm run build`
Expected: Build succeeds with no errors.
- [ ] **Step 4: Commit**
```bash
git add ui/src/types/api.ts ui/src/pages/vendor/VendorTenantsPage.tsx
git commit -m "feat: show agent/env counts in vendor tenant list"
```
---
### Task 3: Verify end-to-end
- [ ] **Step 1: Run backend tests**
Run: `./mvnw test -pl . -q`
Expected: All tests pass. (Existing tests use mocks, the new parallel fetch doesn't break them since it only affects the controller's list mapping.)
- [ ] **Step 2: Verify in browser**
Navigate to the vendor tenant list. Confirm:
- "Agents" column shows "0 / ∞" (or actual count if agents are connected)
- "Envs" column shows "1" (or actual count)
- PROVISIONING/SUSPENDED tenants show "0" for both
- 30s auto-refresh still works
- [ ] **Step 3: Final commit and push**
```bash
git push
```

File diff suppressed because it is too large Load Diff

View File

@@ -0,0 +1,961 @@
# Externalize Docker Compose Templates — Implementation Plan
> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
**Goal:** Replace inline docker-compose generation in installer scripts with static template files, reducing duplication and enabling user customization.
**Architecture:** Static YAML templates in `installer/templates/` are copied to the install directory. The installer writes `.env` (including `COMPOSE_FILE` to select which templates are active) and runs `docker compose up -d`. Conditional features (TLS, monitoring) are handled via compose file layering and `.env` variables instead of heredoc injection.
**Tech Stack:** Docker Compose v2, YAML, Bash, PowerShell
**Spec:** `docs/superpowers/specs/2026-04-15-externalize-compose-templates-design.md`
---
### Task 1: Create `docker-compose.yml` (infra base template)
**Files:**
- Create: `installer/templates/docker-compose.yml`
This is the shared infrastructure base — always loaded regardless of deployment mode.
- [ ] **Step 1: Create the infra base template**
```yaml
# Cameleer Infrastructure
# Shared base — always loaded. Mode-specific services in separate compose files.
services:
cameleer-traefik:
image: ${TRAEFIK_IMAGE:-gitea.siegeln.net/cameleer/cameleer-traefik}:${VERSION:-latest}
restart: unless-stopped
ports:
- "${HTTP_PORT:-80}:80"
- "${HTTPS_PORT:-443}:443"
- "${LOGTO_CONSOLE_BIND:-127.0.0.1}:${LOGTO_CONSOLE_PORT:-3002}:3002"
environment:
PUBLIC_HOST: ${PUBLIC_HOST:-localhost}
CERT_FILE: ${CERT_FILE:-}
KEY_FILE: ${KEY_FILE:-}
CA_FILE: ${CA_FILE:-}
volumes:
- cameleer-certs:/certs
- ${DOCKER_SOCKET:-/var/run/docker.sock}:/var/run/docker.sock:ro
labels:
- "prometheus.io/scrape=true"
- "prometheus.io/port=8082"
- "prometheus.io/path=/metrics"
networks:
- cameleer
- cameleer-traefik
- monitoring
cameleer-postgres:
image: ${POSTGRES_IMAGE:-gitea.siegeln.net/cameleer/cameleer-postgres}:${VERSION:-latest}
restart: unless-stopped
environment:
POSTGRES_DB: ${POSTGRES_DB:-cameleer_saas}
POSTGRES_USER: ${POSTGRES_USER:-cameleer}
POSTGRES_PASSWORD: ${POSTGRES_PASSWORD:?POSTGRES_PASSWORD must be set in .env}
volumes:
- cameleer-pgdata:/var/lib/postgresql/data
healthcheck:
test: ["CMD-SHELL", "pg_isready -U $${POSTGRES_USER:-cameleer} -d $${POSTGRES_DB:-cameleer_saas}"]
interval: 5s
timeout: 5s
retries: 5
networks:
- cameleer
- monitoring
cameleer-clickhouse:
image: ${CLICKHOUSE_IMAGE:-gitea.siegeln.net/cameleer/cameleer-clickhouse}:${VERSION:-latest}
restart: unless-stopped
environment:
CLICKHOUSE_PASSWORD: ${CLICKHOUSE_PASSWORD:?CLICKHOUSE_PASSWORD must be set in .env}
volumes:
- cameleer-chdata:/var/lib/clickhouse
healthcheck:
test: ["CMD-SHELL", "clickhouse-client --password $${CLICKHOUSE_PASSWORD} --query 'SELECT 1'"]
interval: 10s
timeout: 5s
retries: 3
labels:
- "prometheus.io/scrape=true"
- "prometheus.io/port=9363"
- "prometheus.io/path=/metrics"
networks:
- cameleer
- monitoring
volumes:
cameleer-pgdata:
cameleer-chdata:
cameleer-certs:
networks:
cameleer:
driver: bridge
cameleer-traefik:
name: cameleer-traefik
driver: bridge
monitoring:
name: cameleer-monitoring-noop
```
Key changes from the generated version:
- Logto console port always present with `LOGTO_CONSOLE_BIND` controlling exposure
- Prometheus labels unconditional on traefik and clickhouse
- `monitoring` network defined as local noop bridge
- All services join `monitoring` network
- `POSTGRES_DB` uses `${POSTGRES_DB:-cameleer_saas}` (parameterized — standalone overrides via `.env`)
- Password variables use `:?` fail-if-unset
Note: The SaaS mode uses `cameleer-postgres` (custom multi-DB image) while standalone uses `postgres:16-alpine`. The `POSTGRES_IMAGE` variable already handles this — the infra base uses `${POSTGRES_IMAGE:-...}` and standalone `.env` sets `POSTGRES_IMAGE=postgres:16-alpine`.
- [ ] **Step 2: Verify YAML is valid**
Run: `python -c "import yaml; yaml.safe_load(open('installer/templates/docker-compose.yml'))"`
Expected: No output (valid YAML). If python/yaml not available, use `docker compose -f installer/templates/docker-compose.yml config --quiet` (will fail on unset vars, but validates structure).
- [ ] **Step 3: Commit**
```bash
git add installer/templates/docker-compose.yml
git commit -m "feat(installer): add infra base docker-compose template"
```
---
### Task 2: Create `docker-compose.saas.yml` (SaaS mode template)
**Files:**
- Create: `installer/templates/docker-compose.saas.yml`
SaaS-specific services: Logto identity provider and cameleer-saas management plane.
- [ ] **Step 1: Create the SaaS template**
```yaml
# Cameleer SaaS — Logto + management plane
# Loaded in SaaS deployment mode
services:
cameleer-logto:
image: ${LOGTO_IMAGE:-gitea.siegeln.net/cameleer/cameleer-logto}:${VERSION:-latest}
restart: unless-stopped
depends_on:
cameleer-postgres:
condition: service_healthy
environment:
DB_URL: postgres://${POSTGRES_USER:-cameleer}:${POSTGRES_PASSWORD}@cameleer-postgres:5432/logto
ENDPOINT: ${PUBLIC_PROTOCOL:-https}://${PUBLIC_HOST:-localhost}
ADMIN_ENDPOINT: ${PUBLIC_PROTOCOL:-https}://${PUBLIC_HOST:-localhost}:${LOGTO_CONSOLE_PORT:-3002}
TRUST_PROXY_HEADER: 1
NODE_TLS_REJECT_UNAUTHORIZED: "${NODE_TLS_REJECT:-0}"
LOGTO_ENDPOINT: http://cameleer-logto:3001
LOGTO_ADMIN_ENDPOINT: http://cameleer-logto:3002
LOGTO_PUBLIC_ENDPOINT: ${PUBLIC_PROTOCOL:-https}://${PUBLIC_HOST:-localhost}
PUBLIC_HOST: ${PUBLIC_HOST:-localhost}
PUBLIC_PROTOCOL: ${PUBLIC_PROTOCOL:-https}
PG_HOST: cameleer-postgres
PG_USER: ${POSTGRES_USER:-cameleer}
PG_PASSWORD: ${POSTGRES_PASSWORD}
PG_DB_SAAS: cameleer_saas
SAAS_ADMIN_USER: ${SAAS_ADMIN_USER:-admin}
SAAS_ADMIN_PASS: ${SAAS_ADMIN_PASS:?SAAS_ADMIN_PASS must be set in .env}
healthcheck:
test: ["CMD-SHELL", "node -e \"require('http').get('http://localhost:3001/oidc/.well-known/openid-configuration', r => process.exit(r.statusCode === 200 ? 0 : 1)).on('error', () => process.exit(1))\" && test -f /data/logto-bootstrap.json"]
interval: 10s
timeout: 5s
retries: 60
start_period: 30s
labels:
- traefik.enable=true
- traefik.http.routers.cameleer-logto.rule=PathPrefix(`/`)
- traefik.http.routers.cameleer-logto.priority=1
- traefik.http.routers.cameleer-logto.entrypoints=websecure
- traefik.http.routers.cameleer-logto.tls=true
- traefik.http.routers.cameleer-logto.service=cameleer-logto
- traefik.http.routers.cameleer-logto.middlewares=cameleer-logto-cors
- "traefik.http.middlewares.cameleer-logto-cors.headers.accessControlAllowOriginList=${PUBLIC_PROTOCOL:-https}://${PUBLIC_HOST:-localhost}:${LOGTO_CONSOLE_PORT:-3002}"
- traefik.http.middlewares.cameleer-logto-cors.headers.accessControlAllowMethods=GET,POST,PUT,PATCH,DELETE,OPTIONS
- traefik.http.middlewares.cameleer-logto-cors.headers.accessControlAllowHeaders=Authorization,Content-Type
- traefik.http.middlewares.cameleer-logto-cors.headers.accessControlAllowCredentials=true
- traefik.http.services.cameleer-logto.loadbalancer.server.port=3001
- traefik.http.routers.cameleer-logto-console.rule=PathPrefix(`/`)
- traefik.http.routers.cameleer-logto-console.entrypoints=admin-console
- traefik.http.routers.cameleer-logto-console.tls=true
- traefik.http.routers.cameleer-logto-console.service=cameleer-logto-console
- traefik.http.services.cameleer-logto-console.loadbalancer.server.port=3002
volumes:
- cameleer-bootstrapdata:/data
networks:
- cameleer
- monitoring
cameleer-saas:
image: ${CAMELEER_IMAGE:-gitea.siegeln.net/cameleer/cameleer-saas}:${VERSION:-latest}
restart: unless-stopped
depends_on:
cameleer-logto:
condition: service_healthy
environment:
# SaaS database
SPRING_DATASOURCE_URL: jdbc:postgresql://cameleer-postgres:5432/cameleer_saas
SPRING_DATASOURCE_USERNAME: ${POSTGRES_USER:-cameleer}
SPRING_DATASOURCE_PASSWORD: ${POSTGRES_PASSWORD}
# Identity (Logto)
CAMELEER_SAAS_IDENTITY_LOGTOENDPOINT: http://cameleer-logto:3001
CAMELEER_SAAS_IDENTITY_LOGTOPUBLICENDPOINT: ${PUBLIC_PROTOCOL:-https}://${PUBLIC_HOST:-localhost}
# Provisioning — passed to per-tenant server containers
CAMELEER_SAAS_PROVISIONING_PUBLICHOST: ${PUBLIC_HOST:-localhost}
CAMELEER_SAAS_PROVISIONING_PUBLICPROTOCOL: ${PUBLIC_PROTOCOL:-https}
CAMELEER_SAAS_PROVISIONING_NETWORKNAME: ${COMPOSE_PROJECT_NAME:-cameleer-saas}_cameleer
CAMELEER_SAAS_PROVISIONING_TRAEFIKNETWORK: cameleer-traefik
CAMELEER_SAAS_PROVISIONING_DATASOURCEUSERNAME: ${POSTGRES_USER:-cameleer}
CAMELEER_SAAS_PROVISIONING_DATASOURCEPASSWORD: ${POSTGRES_PASSWORD}
CAMELEER_SAAS_PROVISIONING_CLICKHOUSEPASSWORD: ${CLICKHOUSE_PASSWORD}
CAMELEER_SAAS_PROVISIONING_SERVERIMAGE: ${CAMELEER_SAAS_PROVISIONING_SERVERIMAGE:-gitea.siegeln.net/cameleer/cameleer-server:latest}
CAMELEER_SAAS_PROVISIONING_SERVERUIIMAGE: ${CAMELEER_SAAS_PROVISIONING_SERVERUIIMAGE:-gitea.siegeln.net/cameleer/cameleer-server-ui:latest}
labels:
- traefik.enable=true
- traefik.http.routers.saas.rule=PathPrefix(`/platform`)
- traefik.http.routers.saas.entrypoints=websecure
- traefik.http.routers.saas.tls=true
- traefik.http.services.saas.loadbalancer.server.port=8080
- "prometheus.io/scrape=true"
- "prometheus.io/port=8080"
- "prometheus.io/path=/platform/actuator/prometheus"
volumes:
- cameleer-bootstrapdata:/data/bootstrap:ro
- cameleer-certs:/certs
- ${DOCKER_SOCKET:-/var/run/docker.sock}:/var/run/docker.sock
group_add:
- "${DOCKER_GID:-0}"
networks:
- cameleer
- monitoring
volumes:
cameleer-bootstrapdata:
networks:
monitoring:
name: cameleer-monitoring-noop
```
Key changes:
- Logto console traefik labels always included (harmless when port is localhost-only)
- Prometheus labels on cameleer-saas always included
- `DOCKER_GID` read from `.env` via `${DOCKER_GID:-0}` instead of inline `stat`
- Both services join `monitoring` network
- `monitoring` network redefined as noop bridge (compose merges with base definition)
- [ ] **Step 2: Commit**
```bash
git add installer/templates/docker-compose.saas.yml
git commit -m "feat(installer): add SaaS docker-compose template"
```
---
### Task 3: Create `docker-compose.server.yml` (standalone mode template)
**Files:**
- Create: `installer/templates/docker-compose.server.yml`
- Create: `installer/templates/traefik-dynamic.yml`
Standalone-specific services: cameleer-server + server-ui. Also includes the traefik dynamic config that standalone mode needs (overrides the baked-in SaaS redirect).
- [ ] **Step 1: Create the standalone template**
```yaml
# Cameleer Server (standalone)
# Loaded in standalone deployment mode
services:
cameleer-traefik:
volumes:
- ./traefik-dynamic.yml:/etc/traefik/dynamic.yml:ro
cameleer-postgres:
image: postgres:16-alpine
environment:
POSTGRES_DB: ${POSTGRES_DB:-cameleer}
healthcheck:
test: ["CMD-SHELL", "pg_isready -U $${POSTGRES_USER:-cameleer} -d $${POSTGRES_DB:-cameleer}"]
cameleer-server:
image: ${SERVER_IMAGE:-gitea.siegeln.net/cameleer/cameleer-server}:${VERSION:-latest}
container_name: cameleer-server
restart: unless-stopped
depends_on:
cameleer-postgres:
condition: service_healthy
environment:
CAMELEER_SERVER_TENANT_ID: default
SPRING_DATASOURCE_URL: jdbc:postgresql://cameleer-postgres:5432/${POSTGRES_DB:-cameleer}?currentSchema=tenant_default
SPRING_DATASOURCE_USERNAME: ${POSTGRES_USER:-cameleer}
SPRING_DATASOURCE_PASSWORD: ${POSTGRES_PASSWORD}
CAMELEER_SERVER_CLICKHOUSE_URL: jdbc:clickhouse://cameleer-clickhouse:8123/cameleer
CAMELEER_SERVER_CLICKHOUSE_USERNAME: default
CAMELEER_SERVER_CLICKHOUSE_PASSWORD: ${CLICKHOUSE_PASSWORD}
CAMELEER_SERVER_SECURITY_BOOTSTRAPTOKEN: ${BOOTSTRAP_TOKEN:?BOOTSTRAP_TOKEN must be set in .env}
CAMELEER_SERVER_SECURITY_UIUSER: ${SERVER_ADMIN_USER:-admin}
CAMELEER_SERVER_SECURITY_UIPASSWORD: ${SERVER_ADMIN_PASS:?SERVER_ADMIN_PASS must be set in .env}
CAMELEER_SERVER_SECURITY_CORSALLOWEDORIGINS: ${PUBLIC_PROTOCOL:-https}://${PUBLIC_HOST:-localhost}
CAMELEER_SERVER_RUNTIME_ENABLED: "true"
CAMELEER_SERVER_RUNTIME_SERVERURL: http://cameleer-server:8081
CAMELEER_SERVER_RUNTIME_ROUTINGDOMAIN: ${PUBLIC_HOST:-localhost}
CAMELEER_SERVER_RUNTIME_ROUTINGMODE: path
CAMELEER_SERVER_RUNTIME_JARSTORAGEPATH: /data/jars
CAMELEER_SERVER_RUNTIME_DOCKERNETWORK: cameleer-apps
CAMELEER_SERVER_RUNTIME_JARDOCKERVOLUME: cameleer-jars
CAMELEER_SERVER_RUNTIME_BASEIMAGE: gitea.siegeln.net/cameleer/cameleer-runtime-base:${VERSION:-latest}
labels:
- traefik.enable=true
- traefik.http.routers.server-api.rule=PathPrefix(`/api`)
- traefik.http.routers.server-api.entrypoints=websecure
- traefik.http.routers.server-api.tls=true
- traefik.http.services.server-api.loadbalancer.server.port=8081
- traefik.docker.network=cameleer-traefik
healthcheck:
test: ["CMD-SHELL", "curl -sf http://localhost:8081/api/v1/health || exit 1"]
interval: 10s
timeout: 5s
retries: 30
start_period: 30s
volumes:
- jars:/data/jars
- cameleer-certs:/certs:ro
- ${DOCKER_SOCKET:-/var/run/docker.sock}:/var/run/docker.sock
group_add:
- "${DOCKER_GID:-0}"
networks:
- cameleer
- cameleer-traefik
- cameleer-apps
- monitoring
cameleer-server-ui:
image: ${SERVER_UI_IMAGE:-gitea.siegeln.net/cameleer/cameleer-server-ui}:${VERSION:-latest}
restart: unless-stopped
depends_on:
cameleer-server:
condition: service_healthy
environment:
CAMELEER_API_URL: http://cameleer-server:8081
BASE_PATH: ""
labels:
- traefik.enable=true
- traefik.http.routers.ui.rule=PathPrefix(`/`)
- traefik.http.routers.ui.priority=1
- traefik.http.routers.ui.entrypoints=websecure
- traefik.http.routers.ui.tls=true
- traefik.http.services.ui.loadbalancer.server.port=80
- traefik.docker.network=cameleer-traefik
networks:
- cameleer-traefik
- monitoring
volumes:
jars:
networks:
cameleer-apps:
name: cameleer-apps
driver: bridge
monitoring:
name: cameleer-monitoring-noop
```
Key design decisions:
- `cameleer-traefik` and `cameleer-postgres` entries are **overrides** — compose merges them with the base. The postgres image switches to `postgres:16-alpine` and the healthcheck uses `${POSTGRES_DB:-cameleer}` instead of hardcoded `cameleer_saas`. Traefik gets the `traefik-dynamic.yml` volume mount.
- `DOCKER_GID` from `.env` via `${DOCKER_GID:-0}`
- `BOOTSTRAP_TOKEN` uses `:?` fail-if-unset
- Both server and server-ui join `monitoring` network
- [ ] **Step 2: Create the traefik dynamic config template**
```yaml
tls:
stores:
default:
defaultCertificate:
certFile: /certs/cert.pem
keyFile: /certs/key.pem
```
This file is only relevant in standalone mode (overrides the baked-in SaaS `/` -> `/platform/` redirect in the traefik image).
- [ ] **Step 3: Commit**
```bash
git add installer/templates/docker-compose.server.yml installer/templates/traefik-dynamic.yml
git commit -m "feat(installer): add standalone docker-compose and traefik templates"
```
---
### Task 4: Create overlay templates (TLS + monitoring)
**Files:**
- Create: `installer/templates/docker-compose.tls.yml`
- Create: `installer/templates/docker-compose.monitoring.yml`
- [ ] **Step 1: Create the TLS overlay**
```yaml
# Custom TLS certificates overlay
# Adds user-supplied certificate volume to traefik
services:
cameleer-traefik:
volumes:
- ./certs:/user-certs:ro
```
- [ ] **Step 2: Create the monitoring overlay**
```yaml
# External monitoring network overlay
# Overrides the noop monitoring bridge with a real external network
networks:
monitoring:
external: true
name: ${MONITORING_NETWORK:?MONITORING_NETWORK must be set in .env}
```
This is the key to the monitoring pattern: the base compose files define `monitoring` as a local noop bridge and all services join it. When this overlay is included in `COMPOSE_FILE`, compose merges the network definition — overriding it to point at the real external monitoring network. No per-service entries needed.
- [ ] **Step 3: Commit**
```bash
git add installer/templates/docker-compose.tls.yml installer/templates/docker-compose.monitoring.yml
git commit -m "feat(installer): add TLS and monitoring overlay templates"
```
---
### Task 5: Create `.env.example`
**Files:**
- Create: `installer/templates/.env.example`
- [ ] **Step 1: Create the documented variable reference**
```bash
# Cameleer Configuration
# Copy this file to .env and fill in the values.
# The installer generates .env automatically — this file is for reference.
# ============================================================
# Compose file assembly (set by installer)
# ============================================================
# SaaS: docker-compose.yml:docker-compose.saas.yml
# Standalone: docker-compose.yml:docker-compose.server.yml
# Add :docker-compose.tls.yml for custom TLS certificates
# Add :docker-compose.monitoring.yml for external monitoring network
COMPOSE_FILE=docker-compose.yml:docker-compose.saas.yml
# ============================================================
# Image version
# ============================================================
VERSION=latest
# ============================================================
# Public access
# ============================================================
PUBLIC_HOST=localhost
PUBLIC_PROTOCOL=https
# ============================================================
# Ports
# ============================================================
HTTP_PORT=80
HTTPS_PORT=443
# Set to 0.0.0.0 to expose Logto admin console externally (default: localhost only)
# LOGTO_CONSOLE_BIND=0.0.0.0
LOGTO_CONSOLE_PORT=3002
# ============================================================
# PostgreSQL
# ============================================================
POSTGRES_USER=cameleer
POSTGRES_PASSWORD=CHANGE_ME
# SaaS: cameleer_saas, Standalone: cameleer
POSTGRES_DB=cameleer_saas
# ============================================================
# ClickHouse
# ============================================================
CLICKHOUSE_PASSWORD=CHANGE_ME
# ============================================================
# Admin credentials (SaaS mode)
# ============================================================
SAAS_ADMIN_USER=admin
SAAS_ADMIN_PASS=CHANGE_ME
# ============================================================
# Admin credentials (standalone mode)
# ============================================================
# SERVER_ADMIN_USER=admin
# SERVER_ADMIN_PASS=CHANGE_ME
# BOOTSTRAP_TOKEN=CHANGE_ME
# ============================================================
# TLS
# ============================================================
# Set to 1 to reject unauthorized TLS certificates (production)
NODE_TLS_REJECT=0
# Custom TLS certificate paths (inside container, set by installer)
# CERT_FILE=/user-certs/cert.pem
# KEY_FILE=/user-certs/key.pem
# CA_FILE=/user-certs/ca.pem
# ============================================================
# Docker
# ============================================================
DOCKER_SOCKET=/var/run/docker.sock
# GID of the docker socket — detected by installer, used for container group_add
DOCKER_GID=0
# ============================================================
# Provisioning images (SaaS mode only)
# ============================================================
# CAMELEER_SAAS_PROVISIONING_SERVERIMAGE=gitea.siegeln.net/cameleer/cameleer-server:latest
# CAMELEER_SAAS_PROVISIONING_SERVERUIIMAGE=gitea.siegeln.net/cameleer/cameleer-server-ui:latest
# ============================================================
# Monitoring (optional)
# ============================================================
# External Docker network name for Prometheus scraping.
# Only needed when docker-compose.monitoring.yml is in COMPOSE_FILE.
# MONITORING_NETWORK=prometheus
```
- [ ] **Step 2: Commit**
```bash
git add installer/templates/.env.example
git commit -m "feat(installer): add .env.example with documented variables"
```
---
### Task 6: Update `install.sh` — replace compose generation with template copying
**Files:**
- Modify: `installer/install.sh:574-672` (generate_env_file — add COMPOSE_FILE and LOGTO_CONSOLE_BIND)
- Modify: `installer/install.sh:674-1135` (replace generate_compose_file + generate_compose_file_standalone with copy_templates)
- Modify: `installer/install.sh:1728-1731` (reinstall cleanup — delete template files)
- Modify: `installer/install.sh:1696-1710` (upgrade path — copy templates instead of generate)
- Modify: `installer/install.sh:1790-1791` (main — call copy_templates instead of generate_compose_file)
- [ ] **Step 1: Replace `generate_compose_file` and `generate_compose_file_standalone` with `copy_templates`**
Delete both functions (`generate_compose_file` at line 674 and `generate_compose_file_standalone` at line 934) and replace with:
```bash
copy_templates() {
local src
src="$(cd "$(dirname "$0")" && pwd)/templates"
# Base infra — always copied
cp "$src/docker-compose.yml" "$INSTALL_DIR/docker-compose.yml"
cp "$src/.env.example" "$INSTALL_DIR/.env.example"
# Mode-specific
if [ "$DEPLOYMENT_MODE" = "standalone" ]; then
cp "$src/docker-compose.server.yml" "$INSTALL_DIR/docker-compose.server.yml"
cp "$src/traefik-dynamic.yml" "$INSTALL_DIR/traefik-dynamic.yml"
else
cp "$src/docker-compose.saas.yml" "$INSTALL_DIR/docker-compose.saas.yml"
fi
# Optional overlays
if [ "$TLS_MODE" = "custom" ]; then
cp "$src/docker-compose.tls.yml" "$INSTALL_DIR/docker-compose.tls.yml"
fi
if [ -n "$MONITORING_NETWORK" ]; then
cp "$src/docker-compose.monitoring.yml" "$INSTALL_DIR/docker-compose.monitoring.yml"
fi
log_info "Copied docker-compose templates to $INSTALL_DIR"
}
```
- [ ] **Step 2: Update `generate_env_file` to include `COMPOSE_FILE`, `LOGTO_CONSOLE_BIND`, and `DOCKER_GID`**
In the standalone `.env` block (line 577-614), add after the `DOCKER_GID` line:
```bash
# Compose file assembly
COMPOSE_FILE=docker-compose.yml:docker-compose.server.yml$([ "$TLS_MODE" = "custom" ] && echo ":docker-compose.tls.yml")$([ -n "$MONITORING_NETWORK" ] && echo ":docker-compose.monitoring.yml")
EOF
```
In the SaaS `.env` block (line 617-668), add `LOGTO_CONSOLE_BIND` and `COMPOSE_FILE`. After the `LOGTO_CONSOLE_PORT` line:
```bash
LOGTO_CONSOLE_BIND=$([ "$LOGTO_CONSOLE_EXPOSED" = "true" ] && echo "0.0.0.0" || echo "127.0.0.1")
```
And at the end of the SaaS block, add the `COMPOSE_FILE` line:
```bash
# Compose file assembly
COMPOSE_FILE=docker-compose.yml:docker-compose.saas.yml$([ "$TLS_MODE" = "custom" ] && echo ":docker-compose.tls.yml")$([ -n "$MONITORING_NETWORK" ] && echo ":docker-compose.monitoring.yml")
```
Also add the `MONITORING_NETWORK` variable to `.env` when set:
```bash
if [ -n "$MONITORING_NETWORK" ]; then
echo "" >> "$f"
echo "# Monitoring" >> "$f"
echo "MONITORING_NETWORK=${MONITORING_NETWORK}" >> "$f"
fi
```
- [ ] **Step 3: Update `main()` — replace `generate_compose_file` call with `copy_templates`**
At line 1791, change:
```bash
generate_compose_file
```
to:
```bash
copy_templates
```
- [ ] **Step 4: Update `handle_rerun` upgrade path**
At line 1703, change:
```bash
generate_compose_file
```
to:
```bash
copy_templates
```
- [ ] **Step 5: Update reinstall cleanup to remove template files**
At lines 1728-1731, update the `rm -f` list to include all possible template files:
```bash
rm -f "$INSTALL_DIR/.env" "$INSTALL_DIR/.env.bak" "$INSTALL_DIR/.env.example" \
"$INSTALL_DIR/docker-compose.yml" "$INSTALL_DIR/docker-compose.saas.yml" \
"$INSTALL_DIR/docker-compose.server.yml" "$INSTALL_DIR/docker-compose.tls.yml" \
"$INSTALL_DIR/docker-compose.monitoring.yml" "$INSTALL_DIR/traefik-dynamic.yml" \
"$INSTALL_DIR/cameleer.conf" "$INSTALL_DIR/credentials.txt" \
"$INSTALL_DIR/INSTALL.md"
```
- [ ] **Step 6: Commit**
```bash
git add installer/install.sh
git commit -m "refactor(installer): replace sh compose generation with template copying"
```
---
### Task 7: Update `install.ps1` — replace compose generation with template copying
**Files:**
- Modify: `installer/install.ps1:574-666` (Generate-EnvFile — add COMPOSE_FILE and LOGTO_CONSOLE_BIND)
- Modify: `installer/install.ps1:671-1105` (replace Generate-ComposeFile + Generate-ComposeFileStandalone with Copy-Templates)
- Modify: `installer/install.ps1:1706-1723` (upgrade path)
- Modify: `installer/install.ps1:1746` (reinstall cleanup)
- Modify: `installer/install.ps1:1797-1798` (Main — call Copy-Templates)
- [ ] **Step 1: Replace `Generate-ComposeFile` and `Generate-ComposeFileStandalone` with `Copy-Templates`**
Delete both functions and replace with:
```powershell
function Copy-Templates {
$c = $script:cfg
$src = Join-Path $PSScriptRoot 'templates'
# Base infra — always copied
Copy-Item (Join-Path $src 'docker-compose.yml') (Join-Path $c.InstallDir 'docker-compose.yml') -Force
Copy-Item (Join-Path $src '.env.example') (Join-Path $c.InstallDir '.env.example') -Force
# Mode-specific
if ($c.DeploymentMode -eq 'standalone') {
Copy-Item (Join-Path $src 'docker-compose.server.yml') (Join-Path $c.InstallDir 'docker-compose.server.yml') -Force
Copy-Item (Join-Path $src 'traefik-dynamic.yml') (Join-Path $c.InstallDir 'traefik-dynamic.yml') -Force
} else {
Copy-Item (Join-Path $src 'docker-compose.saas.yml') (Join-Path $c.InstallDir 'docker-compose.saas.yml') -Force
}
# Optional overlays
if ($c.TlsMode -eq 'custom') {
Copy-Item (Join-Path $src 'docker-compose.tls.yml') (Join-Path $c.InstallDir 'docker-compose.tls.yml') -Force
}
if ($c.MonitoringNetwork) {
Copy-Item (Join-Path $src 'docker-compose.monitoring.yml') (Join-Path $c.InstallDir 'docker-compose.monitoring.yml') -Force
}
Log-Info "Copied docker-compose templates to $($c.InstallDir)"
}
```
- [ ] **Step 2: Update `Generate-EnvFile` to include `COMPOSE_FILE`, `LOGTO_CONSOLE_BIND`, and `MONITORING_NETWORK`**
In the standalone `.env` content block, add after `DOCKER_GID`:
```powershell
$composeFile = 'docker-compose.yml:docker-compose.server.yml'
if ($c.TlsMode -eq 'custom') { $composeFile += ':docker-compose.tls.yml' }
if ($c.MonitoringNetwork) { $composeFile += ':docker-compose.monitoring.yml' }
```
Then append to `$content`:
```powershell
$content += "`n`n# Compose file assembly`nCOMPOSE_FILE=$composeFile"
if ($c.MonitoringNetwork) {
$content += "`n`n# Monitoring`nMONITORING_NETWORK=$($c.MonitoringNetwork)"
}
```
In the SaaS `.env` content block, add `LOGTO_CONSOLE_BIND` after `LOGTO_CONSOLE_PORT`:
```powershell
$consoleBind = if ($c.LogtoConsoleExposed -eq 'true') { '0.0.0.0' } else { '127.0.0.1' }
```
Add to the content string: `LOGTO_CONSOLE_BIND=$consoleBind`
Build `COMPOSE_FILE`:
```powershell
$composeFile = 'docker-compose.yml:docker-compose.saas.yml'
if ($c.TlsMode -eq 'custom') { $composeFile += ':docker-compose.tls.yml' }
if ($c.MonitoringNetwork) { $composeFile += ':docker-compose.monitoring.yml' }
```
And append to `$content`:
```powershell
$content += "`n`n# Compose file assembly`nCOMPOSE_FILE=$composeFile"
if ($c.MonitoringNetwork) {
$content += "`n`n# Monitoring`nMONITORING_NETWORK=$($c.MonitoringNetwork)"
}
```
- [ ] **Step 3: Update `Main` — replace `Generate-ComposeFile` call with `Copy-Templates`**
At line 1798, change:
```powershell
Generate-ComposeFile
```
to:
```powershell
Copy-Templates
```
- [ ] **Step 4: Update `Handle-Rerun` upgrade path**
At line 1716, change:
```powershell
Generate-ComposeFile
```
to:
```powershell
Copy-Templates
```
- [ ] **Step 5: Update reinstall cleanup to remove template files**
At line 1746, update the filename list:
```powershell
foreach ($fname in @('.env','.env.bak','.env.example','docker-compose.yml','docker-compose.saas.yml','docker-compose.server.yml','docker-compose.tls.yml','docker-compose.monitoring.yml','traefik-dynamic.yml','cameleer.conf','credentials.txt','INSTALL.md')) {
```
- [ ] **Step 6: Commit**
```bash
git add installer/install.ps1
git commit -m "refactor(installer): replace ps1 compose generation with template copying"
```
---
### Task 8: Update existing generated install and clean up
**Files:**
- Modify: `installer/cameleer/docker-compose.yml` (replace with template copy for dev environment)
- [ ] **Step 1: Remove the old generated docker-compose.yml from the cameleer/ directory**
The `installer/cameleer/` directory contains a previously generated install. The `docker-compose.yml` there is now stale — it was generated by the old inline method. Since this is a dev environment output, remove it (it will be recreated by running the installer with the new template approach).
```bash
git rm installer/cameleer/docker-compose.yml
```
- [ ] **Step 2: Add `installer/cameleer/` to `.gitignore` if not already there**
The install output directory should not be tracked. Check if `.gitignore` already covers it. If not, add:
```
installer/cameleer/
```
This prevents generated `.env`, `credentials.txt`, and compose files from being committed.
- [ ] **Step 3: Commit**
```bash
git add -A installer/cameleer/ .gitignore
git commit -m "chore(installer): remove generated install output, add to gitignore"
```
---
### Task 9: Verify the templates produce equivalent output
**Files:** (no changes — verification only)
- [ ] **Step 1: Compare template output against the old generated compose**
Create a temporary `.env` file and run `docker compose config` to render the resolved compose. Compare against the old generated output:
```bash
cd installer/cameleer
# Back up old generated file for comparison
cp docker-compose.yml docker-compose.old.yml 2>/dev/null || true
# Create a test .env that exercises the SaaS path
cat > /tmp/test-saas.env << 'EOF'
COMPOSE_FILE=docker-compose.yml:docker-compose.saas.yml
VERSION=latest
PUBLIC_HOST=test.example.com
PUBLIC_PROTOCOL=https
HTTP_PORT=80
HTTPS_PORT=443
LOGTO_CONSOLE_PORT=3002
LOGTO_CONSOLE_BIND=0.0.0.0
POSTGRES_USER=cameleer
POSTGRES_PASSWORD=testpass
POSTGRES_DB=cameleer_saas
CLICKHOUSE_PASSWORD=testpass
SAAS_ADMIN_USER=admin
SAAS_ADMIN_PASS=testpass
NODE_TLS_REJECT=0
DOCKER_SOCKET=/var/run/docker.sock
DOCKER_GID=0
CAMELEER_SAAS_PROVISIONING_SERVERIMAGE=gitea.siegeln.net/cameleer/cameleer-server:latest
CAMELEER_SAAS_PROVISIONING_SERVERUIIMAGE=gitea.siegeln.net/cameleer/cameleer-server-ui:latest
EOF
# Render the new templates
cd ../templates
docker compose --env-file /tmp/test-saas.env config
```
Expected: A fully resolved compose with all 5 services (traefik, postgres, clickhouse, logto, saas), correct environment variables, and the monitoring noop network.
- [ ] **Step 2: Test standalone mode rendering**
```bash
cat > /tmp/test-standalone.env << 'EOF'
COMPOSE_FILE=docker-compose.yml:docker-compose.server.yml
VERSION=latest
PUBLIC_HOST=test.example.com
PUBLIC_PROTOCOL=https
HTTP_PORT=80
HTTPS_PORT=443
POSTGRES_IMAGE=postgres:16-alpine
POSTGRES_USER=cameleer
POSTGRES_PASSWORD=testpass
POSTGRES_DB=cameleer
CLICKHOUSE_PASSWORD=testpass
SERVER_ADMIN_USER=admin
SERVER_ADMIN_PASS=testpass
BOOTSTRAP_TOKEN=testtoken
DOCKER_SOCKET=/var/run/docker.sock
DOCKER_GID=0
EOF
cd ../templates
docker compose --env-file /tmp/test-standalone.env config
```
Expected: 5 services (traefik, postgres with `postgres:16-alpine` image, clickhouse, server, server-ui). Postgres `POSTGRES_DB` should be `cameleer`. Server should have all env vars resolved.
- [ ] **Step 3: Test with TLS + monitoring overlays**
```bash
cat > /tmp/test-full.env << 'EOF'
COMPOSE_FILE=docker-compose.yml:docker-compose.saas.yml:docker-compose.tls.yml:docker-compose.monitoring.yml
VERSION=latest
PUBLIC_HOST=test.example.com
PUBLIC_PROTOCOL=https
HTTP_PORT=80
HTTPS_PORT=443
LOGTO_CONSOLE_PORT=3002
LOGTO_CONSOLE_BIND=0.0.0.0
POSTGRES_USER=cameleer
POSTGRES_PASSWORD=testpass
POSTGRES_DB=cameleer_saas
CLICKHOUSE_PASSWORD=testpass
SAAS_ADMIN_USER=admin
SAAS_ADMIN_PASS=testpass
NODE_TLS_REJECT=0
DOCKER_SOCKET=/var/run/docker.sock
DOCKER_GID=0
MONITORING_NETWORK=prometheus
CAMELEER_SAAS_PROVISIONING_SERVERIMAGE=gitea.siegeln.net/cameleer/cameleer-server:latest
CAMELEER_SAAS_PROVISIONING_SERVERUIIMAGE=gitea.siegeln.net/cameleer/cameleer-server-ui:latest
EOF
cd ../templates
docker compose --env-file /tmp/test-full.env config
```
Expected: Same as SaaS mode but with `./certs:/user-certs:ro` volume on traefik and the `monitoring` network declared as `external: true` with name `prometheus`.
- [ ] **Step 4: Clean up temp files**
```bash
rm -f /tmp/test-saas.env /tmp/test-standalone.env /tmp/test-full.env
```
- [ ] **Step 5: Commit verification results as a note (optional)**
No code changes — this task is verification only. If all checks pass, proceed to the final commit.
---
### Task 10: Final commit — update CLAUDE.md deployment modes table
**Files:**
- Modify: `CLAUDE.md` (update Deployment Modes section to reference template files)
- [ ] **Step 1: Update the deployment modes documentation**
In the "Deployment Modes (installer)" section of CLAUDE.md, add a note about the template-based approach:
After the deployment modes table, add:
```markdown
The installer uses static docker-compose templates in `installer/templates/`. Templates are copied to the install directory and composed via `COMPOSE_FILE` in `.env`:
- `docker-compose.yml` — shared infrastructure (traefik, postgres, clickhouse)
- `docker-compose.saas.yml` — SaaS mode (logto, cameleer-saas)
- `docker-compose.server.yml` — standalone mode (server, server-ui)
- `docker-compose.tls.yml` — overlay: custom TLS cert volume
- `docker-compose.monitoring.yml` — overlay: external monitoring network
```
- [ ] **Step 2: Commit**
```bash
git add CLAUDE.md
git commit -m "docs: update CLAUDE.md with template-based installer architecture"
```

View File

@@ -0,0 +1,464 @@
# Per-Tenant PostgreSQL Isolation Implementation Plan
> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
**Goal:** Give each tenant its own PostgreSQL user and schema so tenant servers can only access their own data at the database level.
**Architecture:** During provisioning, create a dedicated PG user (`tenant_<slug>`) with a matching schema. Pass per-tenant credentials and `currentSchema`/`ApplicationName` JDBC parameters to the server container. On delete, drop both schema and user. Existing tenants without `dbPassword` fall back to shared credentials for backwards compatibility.
**Tech Stack:** Java 21, Spring Boot 3.4, Flyway, PostgreSQL 16, Docker Java API
**Spec:** `docs/superpowers/specs/2026-04-15-per-tenant-pg-isolation-design.md`
---
### Task 1: Flyway Migration — add `db_password` column
**Files:**
- Create: `src/main/resources/db/migration/V015__add_tenant_db_password.sql`
- [ ] **Step 1: Create migration file**
```sql
ALTER TABLE tenants ADD COLUMN db_password VARCHAR(255);
```
- [ ] **Step 2: Verify migration applies**
Run: `mvn flyway:info -pl .` or start the app and check logs for `V015__add_tenant_db_password` in Flyway output.
- [ ] **Step 3: Commit**
```bash
git add src/main/resources/db/migration/V015__add_tenant_db_password.sql
git commit -m "feat: add db_password column to tenants table (V015)"
```
---
### Task 2: TenantEntity — add `dbPassword` field
**Files:**
- Modify: `src/main/java/net/siegeln/cameleer/saas/tenant/TenantEntity.java`
- [ ] **Step 1: Add field and accessors**
After the `provisionError` field (line 59), add:
```java
@Column(name = "db_password")
private String dbPassword;
```
After the `setProvisionError` method (line 102), add:
```java
public String getDbPassword() { return dbPassword; }
public void setDbPassword(String dbPassword) { this.dbPassword = dbPassword; }
```
- [ ] **Step 2: Commit**
```bash
git add src/main/java/net/siegeln/cameleer/saas/tenant/TenantEntity.java
git commit -m "feat: add dbPassword field to TenantEntity"
```
---
### Task 3: Create `TenantDatabaseService`
**Files:**
- Create: `src/main/java/net/siegeln/cameleer/saas/provisioning/TenantDatabaseService.java`
- [ ] **Step 1: Implement the service**
```java
package net.siegeln.cameleer.saas.provisioning;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.stereotype.Service;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;
/**
* Creates and drops per-tenant PostgreSQL users and schemas
* on the shared cameleer database for DB-level tenant isolation.
*/
@Service
public class TenantDatabaseService {
private static final Logger log = LoggerFactory.getLogger(TenantDatabaseService.class);
private final ProvisioningProperties props;
public TenantDatabaseService(ProvisioningProperties props) {
this.props = props;
}
/**
* Create a dedicated PG user and schema for a tenant.
* Idempotent — skips if user/schema already exist.
*/
public void createTenantDatabase(String slug, String password) {
validateSlug(slug);
String url = props.datasourceUrl();
if (url == null || url.isBlank()) {
log.warn("No datasource URL configured — skipping tenant DB setup");
return;
}
String user = "tenant_" + slug;
String schema = "tenant_" + slug;
try (Connection conn = DriverManager.getConnection(url, props.datasourceUsername(), props.datasourcePassword());
Statement stmt = conn.createStatement()) {
// Create user if not exists
boolean userExists;
try (ResultSet rs = stmt.executeQuery(
"SELECT 1 FROM pg_roles WHERE rolname = '" + user + "'")) {
userExists = rs.next();
}
if (!userExists) {
stmt.execute("CREATE USER \"" + user + "\" WITH PASSWORD '" + escapePassword(password) + "'");
log.info("Created PostgreSQL user: {}", user);
} else {
// Update password on re-provision
stmt.execute("ALTER USER \"" + user + "\" WITH PASSWORD '" + escapePassword(password) + "'");
log.info("Updated password for existing PostgreSQL user: {}", user);
}
// Create schema if not exists
boolean schemaExists;
try (ResultSet rs = stmt.executeQuery(
"SELECT 1 FROM information_schema.schemata WHERE schema_name = '" + schema + "'")) {
schemaExists = rs.next();
}
if (!schemaExists) {
stmt.execute("CREATE SCHEMA \"" + schema + "\" AUTHORIZATION \"" + user + "\"");
log.info("Created PostgreSQL schema: {}", schema);
} else {
// Ensure ownership is correct
stmt.execute("ALTER SCHEMA \"" + schema + "\" OWNER TO \"" + user + "\"");
log.info("Schema {} already exists — ensured ownership", schema);
}
// Revoke access to public schema
stmt.execute("REVOKE ALL ON SCHEMA public FROM \"" + user + "\"");
} catch (Exception e) {
throw new RuntimeException("Failed to create tenant database for '" + slug + "': " + e.getMessage(), e);
}
}
/**
* Drop tenant schema (CASCADE) and user. Idempotent.
*/
public void dropTenantDatabase(String slug) {
validateSlug(slug);
String url = props.datasourceUrl();
if (url == null || url.isBlank()) {
log.warn("No datasource URL configured — skipping tenant DB cleanup");
return;
}
String user = "tenant_" + slug;
String schema = "tenant_" + slug;
try (Connection conn = DriverManager.getConnection(url, props.datasourceUsername(), props.datasourcePassword());
Statement stmt = conn.createStatement()) {
stmt.execute("DROP SCHEMA IF EXISTS \"" + schema + "\" CASCADE");
log.info("Dropped PostgreSQL schema: {}", schema);
stmt.execute("DROP USER IF EXISTS \"" + user + "\"");
log.info("Dropped PostgreSQL user: {}", user);
} catch (Exception e) {
log.warn("Failed to drop tenant database for '{}': {}", slug, e.getMessage());
}
}
private void validateSlug(String slug) {
if (slug == null || !slug.matches("^[a-z0-9-]+$")) {
throw new IllegalArgumentException("Invalid tenant slug: " + slug);
}
}
private String escapePassword(String password) {
return password.replace("'", "''");
}
}
```
- [ ] **Step 2: Commit**
```bash
git add src/main/java/net/siegeln/cameleer/saas/provisioning/TenantDatabaseService.java
git commit -m "feat: add TenantDatabaseService for per-tenant PG user+schema"
```
---
### Task 4: Add `dbPassword` to `TenantProvisionRequest`
**Files:**
- Modify: `src/main/java/net/siegeln/cameleer/saas/provisioning/TenantProvisionRequest.java`
- [ ] **Step 1: Add field to record**
Replace the entire record with:
```java
package net.siegeln.cameleer.saas.provisioning;
import java.util.UUID;
public record TenantProvisionRequest(
UUID tenantId,
String slug,
String tier,
String licenseToken,
String dbPassword
) {}
```
- [ ] **Step 2: Commit**
```bash
git add src/main/java/net/siegeln/cameleer/saas/provisioning/TenantProvisionRequest.java
git commit -m "feat: add dbPassword to TenantProvisionRequest"
```
---
### Task 5: Update `DockerTenantProvisioner` — per-tenant JDBC URL
**Files:**
- Modify: `src/main/java/net/siegeln/cameleer/saas/provisioning/DockerTenantProvisioner.java:197-200`
- [ ] **Step 1: Replace shared credentials with per-tenant credentials**
In `createServerContainer()` (line 197-200), replace:
```java
var env = new java.util.ArrayList<>(List.of(
"SPRING_DATASOURCE_URL=" + props.datasourceUrl(),
"SPRING_DATASOURCE_USERNAME=" + props.datasourceUsername(),
"SPRING_DATASOURCE_PASSWORD=" + props.datasourcePassword(),
```
With:
```java
// Per-tenant DB isolation: dedicated user+schema when dbPassword is set,
// shared credentials for backwards compatibility with pre-isolation tenants.
String dsUrl;
String dsUser;
String dsPass;
if (req.dbPassword() != null) {
dsUrl = props.datasourceUrl() + "?currentSchema=tenant_" + slug + "&ApplicationName=tenant_" + slug;
dsUser = "tenant_" + slug;
dsPass = req.dbPassword();
} else {
dsUrl = props.datasourceUrl();
dsUser = props.datasourceUsername();
dsPass = props.datasourcePassword();
}
var env = new java.util.ArrayList<>(List.of(
"SPRING_DATASOURCE_URL=" + dsUrl,
"SPRING_DATASOURCE_USERNAME=" + dsUser,
"SPRING_DATASOURCE_PASSWORD=" + dsPass,
```
- [ ] **Step 2: Commit**
```bash
git add src/main/java/net/siegeln/cameleer/saas/provisioning/DockerTenantProvisioner.java
git commit -m "feat: construct per-tenant JDBC URL with currentSchema and ApplicationName"
```
---
### Task 6: Update `VendorTenantService` — provisioning and delete flows
**Files:**
- Modify: `src/main/java/net/siegeln/cameleer/saas/vendor/VendorTenantService.java`
- [ ] **Step 1: Inject `TenantDatabaseService`**
Add to the constructor and field declarations:
```java
private final TenantDatabaseService tenantDatabaseService;
```
Add to the constructor parameter list and assignment. (Follow the existing pattern of other injected services.)
- [ ] **Step 2: Update `provisionAsync()` — create DB before containers**
In `provisionAsync()` (around line 120), add DB creation before the provision call. Replace:
```java
var provisionRequest = new TenantProvisionRequest(tenantId, slug, tier, licenseToken);
ProvisionResult result = tenantProvisioner.provision(provisionRequest);
```
With:
```java
// Create per-tenant PG user + schema
String dbPassword = UUID.randomUUID().toString().replace("-", "")
+ UUID.randomUUID().toString().replace("-", "").substring(0, 8);
try {
tenantDatabaseService.createTenantDatabase(slug, dbPassword);
} catch (Exception e) {
log.error("Failed to create tenant database for {}: {}", slug, e.getMessage(), e);
tenantRepository.findById(tenantId).ifPresent(t -> {
t.setProvisionError("Database setup failed: " + e.getMessage());
tenantRepository.save(t);
});
return;
}
// Store DB password on entity
TenantEntity tenantForDb = tenantRepository.findById(tenantId).orElse(null);
if (tenantForDb == null) {
log.error("Tenant {} disappeared during provisioning", slug);
return;
}
tenantForDb.setDbPassword(dbPassword);
tenantRepository.save(tenantForDb);
var provisionRequest = new TenantProvisionRequest(tenantId, slug, tier, licenseToken, dbPassword);
ProvisionResult result = tenantProvisioner.provision(provisionRequest);
```
- [ ] **Step 3: Update the existing `TenantProvisionRequest` constructor call in upgrade flow**
Search for any other `new TenantProvisionRequest(...)` calls. The `upgradeServer` method (or re-provision after upgrade) also creates a provision request. Update it to pass `dbPassword` from the entity:
```java
TenantEntity tenant = ...;
var provisionRequest = new TenantProvisionRequest(
tenant.getId(), tenant.getSlug(), tenant.getTier().name(),
licenseToken, tenant.getDbPassword());
```
If the tenant has `dbPassword == null` (pre-existing), this is fine — Task 5 handles the null fallback.
- [ ] **Step 4: Update `delete()` — use TenantDatabaseService**
In `delete()` (around line 306), replace:
```java
// Erase tenant data from server databases (GDPR)
dataCleanupService.cleanup(tenant.getSlug());
```
With:
```java
// Drop per-tenant PG schema + user
tenantDatabaseService.dropTenantDatabase(tenant.getSlug());
// Erase ClickHouse data (GDPR)
dataCleanupService.cleanupClickHouse(tenant.getSlug());
```
- [ ] **Step 5: Commit**
```bash
git add src/main/java/net/siegeln/cameleer/saas/vendor/VendorTenantService.java
git commit -m "feat: create per-tenant PG database during provisioning, drop on delete"
```
---
### Task 7: Refactor `TenantDataCleanupService` — ClickHouse only
**Files:**
- Modify: `src/main/java/net/siegeln/cameleer/saas/provisioning/TenantDataCleanupService.java`
- [ ] **Step 1: Remove PG logic, rename public method**
Remove the `dropPostgresSchema()` method and the `cleanup()` method. Replace with a single public method:
```java
/**
* Deletes tenant data from ClickHouse tables (GDPR data erasure).
* PostgreSQL cleanup is handled by TenantDatabaseService.
*/
public void cleanupClickHouse(String slug) {
deleteClickHouseData(slug);
}
```
Remove the `dropPostgresSchema()` private method entirely. Keep `deleteClickHouseData()` unchanged.
- [ ] **Step 2: Commit**
```bash
git add src/main/java/net/siegeln/cameleer/saas/provisioning/TenantDataCleanupService.java
git commit -m "refactor: move PG cleanup to TenantDatabaseService, keep only ClickHouse"
```
---
### Task 8: Verify end-to-end
- [ ] **Step 1: Build**
```bash
mvn compile -pl .
```
Verify no compilation errors.
- [ ] **Step 2: Deploy and test tenant creation**
Deploy the updated SaaS image. Create a new tenant via the UI. Verify in PostgreSQL:
```sql
-- Should show the new tenant user
SELECT rolname FROM pg_roles WHERE rolname LIKE 'tenant_%';
-- Should show the new tenant schema
SELECT schema_name FROM information_schema.schemata WHERE schema_name LIKE 'tenant_%';
```
- [ ] **Step 3: Verify server container env vars**
```bash
docker inspect cameleer-server-<slug> | grep -E "DATASOURCE|currentSchema|ApplicationName"
```
Expected: URL contains `?currentSchema=tenant_<slug>&ApplicationName=tenant_<slug>`, username is `tenant_<slug>`.
- [ ] **Step 4: Verify Infrastructure page**
Navigate to Vendor > Infrastructure. The PostgreSQL card should now show the tenant schema with size/tables/rows.
- [ ] **Step 5: Test tenant deletion**
Delete the tenant. Verify:
```sql
-- User should be gone
SELECT rolname FROM pg_roles WHERE rolname LIKE 'tenant_%';
-- Schema should be gone
SELECT schema_name FROM information_schema.schemata WHERE schema_name LIKE 'tenant_%';
```
- [ ] **Step 6: Commit all remaining changes**
```bash
git add -A
git commit -m "feat: per-tenant PostgreSQL isolation — complete implementation"
```

View File

@@ -3,7 +3,7 @@
**Date:** 2026-03-29 **Date:** 2026-03-29
**Status:** Draft — Awaiting Review **Status:** Draft — Awaiting Review
**Author:** Boardroom simulation (Strategist, Skeptic, Architect, Growth Hacker) **Author:** Boardroom simulation (Strategist, Skeptic, Architect, Growth Hacker)
**Gitea Issues:** cameleer/cameleer3 #57-#72 (label: MOAT) **Gitea Issues:** cameleer/cameleer #57-#72 (label: MOAT)
## Executive Summary ## Executive Summary
@@ -32,14 +32,14 @@ Week 8-14: Live Route Debugger (agent + server + UI)
- #59 — Cross-Service Trace Correlation + Topology Map - #59 — Cross-Service Trace Correlation + Topology Map
**Debugger sub-issues:** **Debugger sub-issues:**
- #60 — Protocol: Debug session command types (`cameleer3-common`) - #60 — Protocol: Debug session command types (`cameleer-common`)
- #61 — Agent: DebugSessionManager + breakpoint InterceptStrategy integration - #61 — Agent: DebugSessionManager + breakpoint InterceptStrategy integration
- #62 — Agent: ExchangeStateSerializer + synthetic direct route wrapper - #62 — Agent: ExchangeStateSerializer + synthetic direct route wrapper
- #63 — Server: DebugSessionService + WebSocket + REST API - #63 — Server: DebugSessionService + WebSocket + REST API
- #70 — UI: Debug session frontend components - #70 — UI: Debug session frontend components
**Lineage sub-issues:** **Lineage sub-issues:**
- #64 — Protocol: Lineage command types (`cameleer3-common`) - #64 — Protocol: Lineage command types (`cameleer-common`)
- #65 — Agent: LineageManager + capture mode integration - #65 — Agent: LineageManager + capture mode integration
- #66 — Server: LineageService + DiffEngine + REST API - #66 — Server: LineageService + DiffEngine + REST API
- #71 — UI: Lineage timeline + diff viewer components - #71 — UI: Lineage timeline + diff viewer components
@@ -69,14 +69,14 @@ Browser (SaaS UI)
WebSocket <--------------------------------------+ WebSocket <--------------------------------------+
| | | |
v | v |
cameleer3-server | cameleer-server |
| POST /api/v1/debug/sessions | | POST /api/v1/debug/sessions |
| POST /api/v1/debug/sessions/{id}/step | | POST /api/v1/debug/sessions/{id}/step |
| POST /api/v1/debug/sessions/{id}/resume | | POST /api/v1/debug/sessions/{id}/resume |
| DELETE /api/v1/debug/sessions/{id} | | DELETE /api/v1/debug/sessions/{id} |
| | | |
v | v |
SSE Command Channel --> cameleer3 agent | SSE Command Channel --> cameleer agent |
| | | | | |
| "start-debug" | | | "start-debug" | |
| command v | | command v |
@@ -101,7 +101,7 @@ SSE Command Channel --> cameleer3 agent |
| Continue to next processor | Continue to next processor
``` ```
### 1.3 Protocol Additions (cameleer3-common) ### 1.3 Protocol Additions (cameleer-common)
#### New SSE Commands #### New SSE Commands
@@ -160,11 +160,11 @@ SSE Command Channel --> cameleer3 agent |
} }
``` ```
### 1.4 Agent Implementation (cameleer3-agent) ### 1.4 Agent Implementation (cameleer-agent)
#### DebugSessionManager #### DebugSessionManager
- Location: `com.cameleer3.agent.debug.DebugSessionManager` - Location: `com.cameleer.agent.debug.DebugSessionManager`
- Stores active sessions: `ConcurrentHashMap<sessionId, DebugSession>` - Stores active sessions: `ConcurrentHashMap<sessionId, DebugSession>`
- Enforces max concurrent sessions (default 3, configurable via `cameleer.debug.maxSessions`) - Enforces max concurrent sessions (default 3, configurable via `cameleer.debug.maxSessions`)
- Allocates **dedicated Thread** per session (NOT from Camel thread pool) - Allocates **dedicated Thread** per session (NOT from Camel thread pool)
@@ -213,7 +213,7 @@ For non-direct routes (timer, jms, http, file):
3. Debug exchange enters via `ProducerTemplate.send()` 3. Debug exchange enters via `ProducerTemplate.send()`
4. Remove temporary route on session completion 4. Remove temporary route on session completion
### 1.5 Server Implementation (cameleer3-server) ### 1.5 Server Implementation (cameleer-server)
#### REST Endpoints #### REST Endpoints
@@ -308,7 +308,7 @@ Capture the full transformation history of a message flowing through a route. At
### 2.2 Architecture ### 2.2 Architecture
``` ```
cameleer3 agent cameleer agent
| |
| On lineage-enabled exchange: | On lineage-enabled exchange:
| Before processor: capture INPUT | Before processor: capture INPUT
@@ -319,7 +319,7 @@ cameleer3 agent
POST /api/v1/data/executions (processors carry full snapshots) POST /api/v1/data/executions (processors carry full snapshots)
| |
v v
cameleer3-server cameleer-server
| |
| LineageService: | LineageService:
| > Flatten processor tree to ordered list | > Flatten processor tree to ordered list
@@ -334,7 +334,7 @@ GET /api/v1/executions/{id}/lineage
Browser: LineageTimeline + DiffViewer Browser: LineageTimeline + DiffViewer
``` ```
### 2.3 Protocol Additions (cameleer3-common) ### 2.3 Protocol Additions (cameleer-common)
#### New SSE Commands #### New SSE Commands
@@ -370,11 +370,11 @@ Browser: LineageTimeline + DiffViewer
| `EXPRESSION` | Any exchange matching a Simple/JsonPath predicate | | `EXPRESSION` | Any exchange matching a Simple/JsonPath predicate |
| `NEXT_N` | Next N exchanges on the route (countdown) | | `NEXT_N` | Next N exchanges on the route (countdown) |
### 2.4 Agent Implementation (cameleer3-agent) ### 2.4 Agent Implementation (cameleer-agent)
#### LineageManager #### LineageManager
- Location: `com.cameleer3.agent.lineage.LineageManager` - Location: `com.cameleer.agent.lineage.LineageManager`
- Stores active configs: `ConcurrentHashMap<lineageId, LineageConfig>` - Stores active configs: `ConcurrentHashMap<lineageId, LineageConfig>`
- Tracks capture count per lineageId: auto-disables at `maxCaptures` - Tracks capture count per lineageId: auto-disables at `maxCaptures`
- Duration timeout via `ScheduledExecutorService`: auto-disables after expiry - Duration timeout via `ScheduledExecutorService`: auto-disables after expiry
@@ -412,7 +412,7 @@ cameleer.lineage.maxBodySize=65536 # 64KB for lineage captures (vs 4KB normal
cameleer.lineage.enabled=true # master switch cameleer.lineage.enabled=true # master switch
``` ```
### 2.5 Server Implementation (cameleer3-server) ### 2.5 Server Implementation (cameleer-server)
#### LineageService #### LineageService
@@ -548,7 +548,7 @@ New (added):
| Direct/SEDA | URI prefix `direct:`, `seda:`, `vm:` | Exchange property (in-process) | | Direct/SEDA | URI prefix `direct:`, `seda:`, `vm:` | Exchange property (in-process) |
| File/FTP | URI prefix `file:`, `ftp:` | Not propagated (async) | | File/FTP | URI prefix `file:`, `ftp:` | Not propagated (async) |
### 3.3 Agent Implementation (cameleer3-agent) ### 3.3 Agent Implementation (cameleer-agent)
#### Outgoing Propagation (InterceptStrategy) #### Outgoing Propagation (InterceptStrategy)
@@ -597,7 +597,7 @@ execution.setHopIndex(...); // depth in distributed trace
- Parse failure: log warning, continue without context (no exchange failure) - Parse failure: log warning, continue without context (no exchange failure)
- Only inject on outgoing processors, never on FROM consumers - Only inject on outgoing processors, never on FROM consumers
### 3.4 Server Implementation: Trace Assembly (cameleer3-server) ### 3.4 Server Implementation: Trace Assembly (cameleer-server)
#### CorrelationService #### CorrelationService
@@ -665,7 +665,7 @@ CREATE INDEX idx_executions_parent_span
- **Fan-out:** parallel multicast creates multiple children from same processor - **Fan-out:** parallel multicast creates multiple children from same processor
- **Circular calls:** detected via hopIndex (max depth 20) - **Circular calls:** detected via hopIndex (max depth 20)
### 3.5 Server Implementation: Topology Graph (cameleer3-server) ### 3.5 Server Implementation: Topology Graph (cameleer-server)
#### DependencyGraphService #### DependencyGraphService
@@ -799,11 +799,11 @@ Reserve `sourceTenantHash` in TraceContext for future use:
| Work | Repo | Issue | | Work | Repo | Issue |
|------|------|-------| |------|------|-------|
| Service topology materialized view | cameleer3-server | #69 | | Service topology materialized view | cameleer-server | #69 |
| Topology REST API | cameleer3-server | #69 | | Topology REST API | cameleer-server | #69 |
| ServiceTopologyGraph.tsx | cameleer3-server + saas | #72 | | ServiceTopologyGraph.tsx | cameleer-server + saas | #72 |
| WebSocket infrastructure (for debugger) | cameleer3-server | #63 | | WebSocket infrastructure (for debugger) | cameleer-server | #63 |
| TraceContext DTO in cameleer3-common | cameleer3 | #67 | | TraceContext DTO in cameleer-common | cameleer | #67 |
**Ship:** Topology graph visible from existing data. Zero agent changes. Immediate visual payoff. **Ship:** Topology graph visible from existing data. Zero agent changes. Immediate visual payoff.
@@ -811,10 +811,10 @@ Reserve `sourceTenantHash` in TraceContext for future use:
| Work | Repo | Issue | | Work | Repo | Issue |
|------|------|-------| |------|------|-------|
| Lineage protocol DTOs | cameleer3-common | #64 | | Lineage protocol DTOs | cameleer-common | #64 |
| LineageManager + capture integration | cameleer3-agent | #65 | | LineageManager + capture integration | cameleer-agent | #65 |
| LineageService + DiffEngine | cameleer3-server | #66 | | LineageService + DiffEngine | cameleer-server | #66 |
| Lineage UI components | cameleer3-server + saas | #71 | | Lineage UI components | cameleer-server + saas | #71 |
**Ship:** Payload flow lineage independently usable. **Ship:** Payload flow lineage independently usable.
@@ -822,10 +822,10 @@ Reserve `sourceTenantHash` in TraceContext for future use:
| Work | Repo | Issue | | Work | Repo | Issue |
|------|------|-------| |------|------|-------|
| Trace context header propagation | cameleer3-agent | #67 | | Trace context header propagation | cameleer-agent | #67 |
| Executions table migration (new columns) | cameleer3-server | #68 | | Executions table migration (new columns) | cameleer-server | #68 |
| CorrelationService + trace assembly | cameleer3-server | #68 | | CorrelationService + trace assembly | cameleer-server | #68 |
| DistributedTraceView + TraceSearch UI | cameleer3-server + saas | #72 | | DistributedTraceView + TraceSearch UI | cameleer-server + saas | #72 |
**Ship:** Distributed traces + topology — full correlation story. **Ship:** Distributed traces + topology — full correlation story.
@@ -833,11 +833,11 @@ Reserve `sourceTenantHash` in TraceContext for future use:
| Work | Repo | Issue | | Work | Repo | Issue |
|------|------|-------| |------|------|-------|
| Debug protocol DTOs | cameleer3-common | #60 | | Debug protocol DTOs | cameleer-common | #60 |
| DebugSessionManager + InterceptStrategy | cameleer3-agent | #61 | | DebugSessionManager + InterceptStrategy | cameleer-agent | #61 |
| ExchangeStateSerializer + synthetic wrapper | cameleer3-agent | #62 | | ExchangeStateSerializer + synthetic wrapper | cameleer-agent | #62 |
| DebugSessionService + WS + REST | cameleer3-server | #63 | | DebugSessionService + WS + REST | cameleer-server | #63 |
| Debug UI components | cameleer3-server + saas | #70 | | Debug UI components | cameleer-server + saas | #70 |
**Ship:** Full browser-based route debugger with integration to lineage and correlation. **Ship:** Full browser-based route debugger with integration to lineage and correlation.

View File

@@ -10,12 +10,12 @@
## 1. Product Definition ## 1. Product Definition
**Cameleer SaaS** is a Camel application runtime platform with built-in observability. Customers deploy Apache Camel applications and get zero-configuration tracing, topology mapping, payload lineage, distributed correlation, live debugging, and exchange replay — powered by the cameleer3 agent (auto-injected) and cameleer3-server (managed per tenant). **Cameleer SaaS** is a Camel application runtime platform with built-in observability. Customers deploy Apache Camel applications and get zero-configuration tracing, topology mapping, payload lineage, distributed correlation, live debugging, and exchange replay — powered by the cameleer agent (auto-injected) and cameleer-server (managed per tenant).
### Three Pillars ### Three Pillars
1. **Runtime** — Deploy and run Camel applications with automatic agent injection 1. **Runtime** — Deploy and run Camel applications with automatic agent injection
2. **Observability** — Per-tenant cameleer3-server (traces, topology, lineage, correlation, debugger, replay) 2. **Observability** — Per-tenant cameleer-server (traces, topology, lineage, correlation, debugger, replay)
3. **Management** — Auth, billing, teams, provisioning, secrets, environments 3. **Management** — Auth, billing, teams, provisioning, secrets, environments
### Two Deployment Modes ### Two Deployment Modes
@@ -27,8 +27,8 @@
| Component | Role | Changes Required | | Component | Role | Changes Required |
|-----------|------|------------------| |-----------|------|------------------|
| cameleer3 (agent) | Zero-code Camel instrumentation, auto-injected into customer JARs | MOAT features (lineage, correlation, debugger, replay) | | cameleer (agent) | Zero-code Camel instrumentation, auto-injected into customer JARs | MOAT features (lineage, correlation, debugger, replay) |
| cameleer3-server | Per-tenant observability backend | Managed mode (trust SaaS JWT), license module, MOAT features | | cameleer-server | Per-tenant observability backend | Managed mode (trust SaaS JWT), license module, MOAT features |
| cameleer-saas (this repo) | SaaS management platform — control plane | New: everything in this document | | cameleer-saas (this repo) | SaaS management platform — control plane | New: everything in this document |
| design-system | Shared React component library | Used by both SaaS shell and server UI | | design-system | Shared React component library | Used by both SaaS shell and server UI |
@@ -81,7 +81,7 @@ Single Spring Boot application with well-bounded internal modules. K8s ingress h
``` ```
[Browser] → [Ingress (Traefik/Envoy)] → [SaaS Platform (modular Spring Boot)] [Browser] → [Ingress (Traefik/Envoy)] → [SaaS Platform (modular Spring Boot)]
↓ (tenant routes) ↓ (provisioning) ↓ (tenant routes) ↓ (provisioning)
[Tenant cameleer3-server] [Flux CD → K8s] [Tenant cameleer-server] [Flux CD → K8s]
``` ```
### Component Map ### Component Map
@@ -114,7 +114,7 @@ Single Spring Boot application with well-bounded internal modules. K8s ingress h
│ (PostgreSQL) │ │ API │ │ │ │ (PostgreSQL) │ │ API │ │ │
│ - tenants │ └────────┘ │ ┌─────────────────────┐ │ │ - tenants │ └────────┘ │ ┌─────────────────────┐ │
│ - users │ │ │ tenant-a namespace │ │ │ - users │ │ │ tenant-a namespace │ │
│ - teams │ ┌─────┐ │ │ ├─ cameleer3-server │ │ │ - teams │ ┌─────┐ │ │ ├─ cameleer-server │ │
│ - audit log │ │Flux │ │ │ ├─ camel-app-1 │ │ │ - audit log │ │Flux │ │ │ ├─ camel-app-1 │ │
│ - licenses │ │ CD │ │ │ ├─ camel-app-2 │ │ │ - licenses │ │ CD │ │ │ ├─ camel-app-2 │ │
└──────────────┘ └──┬──┘ │ │ └─ NetworkPolicies │ │ └──────────────┘ └──┬──┘ │ │ └─ NetworkPolicies │ │
@@ -144,7 +144,7 @@ Same management platform routes to dedicated cluster(s) per customer. Dedicated
| Management Platform backend | Spring Boot 3, Java 21 | | Management Platform backend | Spring Boot 3, Java 21 |
| Management Platform frontend | React, @cameleer/design-system | | Management Platform frontend | React, @cameleer/design-system |
| Platform database | PostgreSQL | | Platform database | PostgreSQL |
| Tenant observability | cameleer3-server (Spring Boot), PostgreSQL, OpenSearch | | Tenant observability | cameleer-server (Spring Boot), PostgreSQL, OpenSearch |
| GitOps | Flux CD | | GitOps | Flux CD |
| K8s distribution | Talos (production), k3s (dev) | | K8s distribution | Talos (production), k3s (dev) |
| Ingress | Traefik or Envoy | | Ingress | Traefik or Envoy |
@@ -192,7 +192,7 @@ Stores all SaaS control plane data — completely separate from tenant observabi
### Tenant Data (Shared PostgreSQL) ### Tenant Data (Shared PostgreSQL)
Each tenant's cameleer3-server uses its own PostgreSQL schema on the shared instance (dedicated instance for high/business). This is the existing cameleer3-server data model — unchanged: Each tenant's cameleer-server uses its own PostgreSQL schema on the shared instance (dedicated instance for high/business). This is the existing cameleer-server data model — unchanged:
- Route executions, processor traces, metrics - Route executions, processor traces, metrics
- Route graph topology - Route graph topology
@@ -215,12 +215,12 @@ Completely separate: Prometheus TSDB for metrics, Loki for logs.
### Architecture ### Architecture
The SaaS management platform is the single identity plane. It owns authentication and authorization. Per-tenant cameleer3-server instances trust SaaS-issued tokens. The SaaS management platform is the single identity plane. It owns authentication and authorization. Per-tenant cameleer-server instances trust SaaS-issued tokens.
- Spring Security OAuth2 for OIDC federation with customer IdPs - Spring Security OAuth2 for OIDC federation with customer IdPs
- Ed25519 JWT signing (consistent with existing cameleer3-server pattern) - Ed25519 JWT signing (consistent with existing cameleer-server pattern)
- Tokens carry: tenant ID, user ID, roles, feature entitlements - Tokens carry: tenant ID, user ID, roles, feature entitlements
- cameleer3-server validates SaaS-issued JWTs in managed mode - cameleer-server validates SaaS-issued JWTs in managed mode
- Standalone mode retains its own auth for air-gapped deployments - Standalone mode retains its own auth for air-gapped deployments
### RBAC Model ### RBAC Model
@@ -252,7 +252,7 @@ Customer signs up + payment
→ Create tenant record + Stripe customer/subscription → Create tenant record + Stripe customer/subscription
→ Generate signed license token (Ed25519) → Generate signed license token (Ed25519)
→ Create Flux HelmRelease CR → Create Flux HelmRelease CR
→ Flux reconciles: namespace, ResourceQuota, NetworkPolicies, cameleer3-server → Flux reconciles: namespace, ResourceQuota, NetworkPolicies, cameleer-server
→ Provision PostgreSQL schema + per-tenant credentials → Provision PostgreSQL schema + per-tenant credentials
→ Provision OpenSearch index template + per-tenant credentials → Provision OpenSearch index template + per-tenant credentials
→ Readiness check: server healthy, DB migrated, auth working → Readiness check: server healthy, DB migrated, auth working
@@ -297,7 +297,7 @@ Full Cluster API automation deferred to future release.
### JAR Upload → Immutable Image ### JAR Upload → Immutable Image
1. **Validation** — File type check, size limit per tier, SHA-256 checksum, Trivy security scan, secret detection (reject JARs with embedded credentials) 1. **Validation** — File type check, size limit per tier, SHA-256 checksum, Trivy security scan, secret detection (reject JARs with embedded credentials)
2. **Image Build** — Templated Dockerfile: distroless JRE base + customer JAR + cameleer3-agent.jar + `-javaagent` flag + agent pre-configured for tenant server. Image tagged: `registry/{tenant}/{app}:v{N}-{sha256short}`. Signed with cosign. SBOM attached. 2. **Image Build** — Templated Dockerfile: distroless JRE base + customer JAR + cameleer-agent.jar + `-javaagent` flag + agent pre-configured for tenant server. Image tagged: `registry/{tenant}/{app}:v{N}-{sha256short}`. Signed with cosign. SBOM attached.
3. **Registry Push** — Per-tenant repository in platform container registry 3. **Registry Push** — Per-tenant repository in platform container registry
4. **Deploy** — K8s Deployment in tenant namespace with resource limits, secrets mounted, config injected, NetworkPolicy applied, liveness/readiness probes 4. **Deploy** — K8s Deployment in tenant namespace with resource limits, secrets mounted, config injected, NetworkPolicy applied, liveness/readiness probes
@@ -350,7 +350,7 @@ Central UI for managing each deployed application:
### Architecture ### Architecture
Each tenant gets a dedicated cameleer3-server instance: Each tenant gets a dedicated cameleer-server instance:
- Shared tiers: deployed in tenant's namespace - Shared tiers: deployed in tenant's namespace
- Dedicated tiers: deployed in tenant's cluster - Dedicated tiers: deployed in tenant's cluster
@@ -359,7 +359,7 @@ The SaaS API gateway routes `/t/{tenant}/api/*` to the correct server instance.
### Agent Connection ### Agent Connection
- Agent bootstrap tokens generated by the SaaS platform - Agent bootstrap tokens generated by the SaaS platform
- Agents connect directly to their tenant's cameleer3-server instance - Agents connect directly to their tenant's cameleer-server instance
- Agent auto-injected into customer Camel apps deployed on the platform - Agent auto-injected into customer Camel apps deployed on the platform
- External agents (customer-hosted Camel apps) can also connect using bootstrap tokens - External agents (customer-hosted Camel apps) can also connect using bootstrap tokens
@@ -448,7 +448,7 @@ K8s NetworkPolicies per tenant namespace:
- **Allow:** tenant namespace → shared PostgreSQL/OpenSearch (authenticated per-tenant credentials) - **Allow:** tenant namespace → shared PostgreSQL/OpenSearch (authenticated per-tenant credentials)
- **Allow:** tenant namespace → public internet (Camel app external connectivity) - **Allow:** tenant namespace → public internet (Camel app external connectivity)
- **Allow:** SaaS platform namespace → all tenant namespaces (management access) - **Allow:** SaaS platform namespace → all tenant namespaces (management access)
- **Allow:** tenant Camel apps → tenant cameleer3-server (intra-namespace) - **Allow:** tenant Camel apps → tenant cameleer-server (intra-namespace)
### Zero-Trust Tenant Boundary ### Zero-Trust Tenant Boundary
@@ -546,7 +546,7 @@ Completely separate from tenant observability data.
- TLS certificate expiry < 14 days - TLS certificate expiry < 14 days
- Metering pipeline stale > 1 hour - Metering pipeline stale > 1 hour
- Disk usage > 80% on any PV - Disk usage > 80% on any PV
- Tenant cameleer3-server unhealthy > 5 minutes - Tenant cameleer-server unhealthy > 5 minutes
- OOMKill on any tenant workload - OOMKill on any tenant workload
### Dashboards ### Dashboards
@@ -577,7 +577,7 @@ K8s Metrics → Metrics Collector → Usage Aggregator (hourly) → Stripe Usage
|-----------|------|--------| |-----------|------|--------|
| CPU | core·hours | K8s metrics (namespace aggregate) | | CPU | core·hours | K8s metrics (namespace aggregate) |
| RAM | GB·hours | K8s metrics (namespace aggregate) | | RAM | GB·hours | K8s metrics (namespace aggregate) |
| Data volume | GB ingested | cameleer3-server reports | | Data volume | GB ingested | cameleer-server reports |
- Aggregated per tenant, per hour, stored in platform DB before Stripe submission - Aggregated per tenant, per hour, stored in platform DB before Stripe submission
- Idempotent aggregation (safe to re-run) - Idempotent aggregation (safe to re-run)
@@ -613,7 +613,7 @@ K8s Metrics → Metrics Collector → Usage Aggregator (hourly) → Stripe Usage
| **App → Status** | Pod health, resource usage, agent connection, events | | **App → Status** | Pod health, resource usage, agent connection, events |
| **App → Logs** | Live stdout/stderr stream | | **App → Logs** | Live stdout/stderr stream |
| **App → Versions** | Image history, promotion log, rollback | | **App → Versions** | Image history, promotion log, rollback |
| **Observe** | Embedded cameleer3-server UI (topology, traces, lineage, correlation, debugger, replay) | | **Observe** | Embedded cameleer-server UI (topology, traces, lineage, correlation, debugger, replay) |
| **Team** | Users, roles, invites | | **Team** | Users, roles, invites |
| **Settings** | Tenant config, SSO/OIDC, vault connections | | **Settings** | Tenant config, SSO/OIDC, vault connections |
| **Billing** | Usage, invoices, plan management | | **Billing** | Usage, invoices, plan management |
@@ -621,7 +621,7 @@ K8s Metrics → Metrics Collector → Usage Aggregator (hourly) → Stripe Usage
### Design ### Design
- SaaS shell built with `@cameleer/design-system` - SaaS shell built with `@cameleer/design-system`
- cameleer3-server React UI embedded (same design system, visual consistency) - cameleer-server React UI embedded (same design system, visual consistency)
- Responsive but desktop-primary (observability tooling is a desktop workflow) - Responsive but desktop-primary (observability tooling is a desktop workflow)
--- ---
@@ -681,4 +681,4 @@ K8s Metrics → Metrics Collector → Usage Aggregator (hourly) → Stripe Usage
| 12 | Platform Operations & Self-Monitoring | epic, ops | | 12 | Platform Operations & Self-Monitoring | epic, ops |
| 13 | MOAT: Exchange Replay | epic, observability | | 13 | MOAT: Exchange Replay | epic, observability |
MOAT features (Debugger, Lineage, Correlation) tracked in cameleer/cameleer3 #57#72. MOAT features (Debugger, Lineage, Correlation) tracked in cameleer/cameleer #57#72.

View File

@@ -27,7 +27,7 @@ Key constraints:
| **Identity & Auth** | **Logto** | MPL-2.0 | Lightest IdP (2 containers, ~0.5-1 GB). Orgs, RBAC, M2M tokens, OIDC/SSO federation all in OSS. Replaces ~3-4 months of custom auth build (OIDC, SSO, teams, invites, MFA, password reset, custom roles). | | **Identity & Auth** | **Logto** | MPL-2.0 | Lightest IdP (2 containers, ~0.5-1 GB). Orgs, RBAC, M2M tokens, OIDC/SSO federation all in OSS. Replaces ~3-4 months of custom auth build (OIDC, SSO, teams, invites, MFA, password reset, custom roles). |
| **Reverse Proxy** | **Traefik** | MIT | Native Docker provider (labels) and K8s provider (IngressRoute CRDs). Same mental model in both environments. Already on the k3s cluster. ForwardAuth middleware for tenant-aware routing. Auto-HTTPS via Let's Encrypt. ~256 MB RAM. | | **Reverse Proxy** | **Traefik** | MIT | Native Docker provider (labels) and K8s provider (IngressRoute CRDs). Same mental model in both environments. Already on the k3s cluster. ForwardAuth middleware for tenant-aware routing. Auto-HTTPS via Let's Encrypt. ~256 MB RAM. |
| **Database** | **PostgreSQL** | PostgreSQL License | Already chosen. Platform data + Logto data (separate schemas). | | **Database** | **PostgreSQL** | PostgreSQL License | Already chosen. Platform data + Logto data (separate schemas). |
| **Trace/Metrics Storage** | **ClickHouse** | Apache-2.0 | Replaced OpenSearch in the cameleer3-server stack. Columnar OLAP, excellent for time-series observability data. | | **Trace/Metrics Storage** | **ClickHouse** | Apache-2.0 | Replaced OpenSearch in the cameleer-server stack. Columnar OLAP, excellent for time-series observability data. |
| **Schema Migrations** | **Flyway** | Apache-2.0 | Already in place. | | **Schema Migrations** | **Flyway** | Apache-2.0 | Already in place. |
| **Billing (subscriptions)** | **Stripe** | N/A (API) | Start with Stripe Checkout for fixed-tier subscriptions. No custom billing infrastructure day 1. | | **Billing (subscriptions)** | **Stripe** | N/A (API) | Start with Stripe Checkout for fixed-tier subscriptions. No custom billing infrastructure day 1. |
| **Billing (usage metering)** | **Lago** (deferred) | AGPL-3.0 | Purpose-built for event-based metering. 8 containers — deploy only when usage-based pricing launches. Design event model with Lago's API shape in mind from day 1. Integrate via API only (keeps AGPL safe). | | **Billing (usage metering)** | **Lago** (deferred) | AGPL-3.0 | Purpose-built for event-based metering. 8 containers — deploy only when usage-based pricing launches. Design event model with Lago's API shape in mind from day 1. Integrate via API only (keeps AGPL safe). |
@@ -42,14 +42,14 @@ Key constraints:
| Subsystem | Why Build | | Subsystem | Why Build |
|---|---| |---|---|
| **License signing & validation** | Ed25519 signed JWT with tier, features, limits, expiry. Dual mode: online API check + offline signed file. No off-the-shelf tool does this. Core IP. | | **License signing & validation** | Ed25519 signed JWT with tier, features, limits, expiry. Dual mode: online API check + offline signed file. No off-the-shelf tool does this. Core IP. |
| **Agent bootstrap tokens** | Tightly coupled to the cameleer3 agent protocol (PROTOCOL.md). Custom Ed25519 tokens for agent registration. | | **Agent bootstrap tokens** | Tightly coupled to the cameleer agent protocol (PROTOCOL.md). Custom Ed25519 tokens for agent registration. |
| **Tenant lifecycle** | CRUD, configuration, status management. Core business logic. User management (invites, teams, roles) is delegated to Logto's organization model. | | **Tenant lifecycle** | CRUD, configuration, status management. Core business logic. User management (invites, teams, roles) is delegated to Logto's organization model. |
| **Runtime orchestration** | The core of the "managed Camel runtime" product. `RuntimeOrchestrator` interface with Docker and K8s implementations. No off-the-shelf tool does "managed Camel runtime with agent injection." | | **Runtime orchestration** | The core of the "managed Camel runtime" product. `RuntimeOrchestrator` interface with Docker and K8s implementations. No off-the-shelf tool does "managed Camel runtime with agent injection." |
| **Image build pipeline** | Templated Dockerfile: JRE + cameleer3-agent.jar + customer JAR + `-javaagent` flag. Simple but custom. | | **Image build pipeline** | Templated Dockerfile: JRE + cameleer-agent.jar + customer JAR + `-javaagent` flag. Simple but custom. |
| **Feature gating** | Tier-based feature gating logic. Which features are available at which tier. Business logic. | | **Feature gating** | Tier-based feature gating logic. Which features are available at which tier. Business logic. |
| **Billing integration** | Stripe API calls, subscription lifecycle, webhook handling. Thin integration layer. | | **Billing integration** | Stripe API calls, subscription lifecycle, webhook handling. Thin integration layer. |
| **Observability proxy** | Routing authenticated requests to tenant-specific cameleer3-server instances. | | **Observability proxy** | Routing authenticated requests to tenant-specific cameleer-server instances. |
| **MOAT features** | Debugger, Lineage, Correlation — the defensible product. Built in cameleer3 agent + server. | | **MOAT features** | Debugger, Lineage, Correlation — the defensible product. Built in cameleer agent + server. |
### SKIP / DEFER ### SKIP / DEFER
@@ -74,7 +74,7 @@ Key constraints:
+--------+---------------------+------------------------+ +--------+---------------------+------------------------+
| | | |
+--------v--------+ +---------v-----------+ +--------v--------+ +---------v-----------+
| cameleer-saas | | cameleer3-server | | cameleer-saas | | cameleer-server |
| (Spring Boot) | | (observability) | | (Spring Boot) | | (observability) |
| Control plane | | Per-tenant instance | | Control plane | | Per-tenant instance |
+---+-------+-----+ +----------+----------+ +---+-------+-----+ +----------+----------+
@@ -99,10 +99,10 @@ API request:
-> Traefik forwards to upstream service -> Traefik forwards to upstream service
Machine auth (agent bootstrap): Machine auth (agent bootstrap):
cameleer3-agent -> cameleer-saas /api/agent/register cameleer-agent -> cameleer-saas /api/agent/register
-> Validates bootstrap token (Ed25519) -> Validates bootstrap token (Ed25519)
-> Issues agent session token -> Issues agent session token
-> Agent connects to cameleer3-server -> Agent connects to cameleer-server
``` ```
Logto handles all user-facing identity. The cameleer-saas app handles machine-to-machine auth (agent tokens, license tokens) using Ed25519. Logto handles all user-facing identity. The cameleer-saas app handles machine-to-machine auth (agent tokens, license tokens) using Ed25519.
@@ -137,9 +137,9 @@ Customer uploads JAR
-> Validation (file type, size, SHA-256, security scan) -> Validation (file type, size, SHA-256, security scan)
-> Templated Dockerfile generation: -> Templated Dockerfile generation:
FROM eclipse-temurin:21-jre-alpine FROM eclipse-temurin:21-jre-alpine
COPY cameleer3-agent.jar /opt/agent/ COPY cameleer-agent.jar /opt/agent/
COPY customer-app.jar /opt/app/ COPY customer-app.jar /opt/app/
ENTRYPOINT ["java", "-javaagent:/opt/agent/cameleer3-agent.jar", "-jar", "/opt/app/customer-app.jar"] ENTRYPOINT ["java", "-javaagent:/opt/agent/cameleer-agent.jar", "-jar", "/opt/app/customer-app.jar"]
-> Build: -> Build:
Docker mode: docker build via docker-java (local image cache) Docker mode: docker build via docker-java (local image cache)
K8s mode: Kaniko Job -> push to registry K8s mode: Kaniko Job -> push to registry
@@ -152,7 +152,7 @@ Customer uploads JAR
- **Schema-per-tenant** in PostgreSQL for platform data isolation. - **Schema-per-tenant** in PostgreSQL for platform data isolation.
- **Logto organizations** map 1:1 to tenants. Logto handles user-tenant membership. - **Logto organizations** map 1:1 to tenants. Logto handles user-tenant membership.
- **ClickHouse** data partitioned by tenant_id. - **ClickHouse** data partitioned by tenant_id.
- **cameleer3-server** instances are per-tenant (separate containers/pods). - **cameleer-server** instances are per-tenant (separate containers/pods).
- **K8s bonus:** Namespace-per-tenant for network isolation, resource quotas. - **K8s bonus:** Namespace-per-tenant for network isolation, resource quotas.
### Environment Model ### Environment Model
@@ -232,8 +232,8 @@ services:
- traefik.enable=true - traefik.enable=true
- traefik.http.routers.auth.rule=PathPrefix(`/auth`) - traefik.http.routers.auth.rule=PathPrefix(`/auth`)
cameleer3-server: cameleer-server:
image: gitea.siegeln.net/cameleer/cameleer3-server:${VERSION} image: gitea.siegeln.net/cameleer/cameleer-server:${VERSION}
environment: environment:
- CLICKHOUSE_URL=jdbc:clickhouse://clickhouse:8123/cameleer - CLICKHOUSE_URL=jdbc:clickhouse://clickhouse:8123/cameleer
labels: labels:
@@ -312,9 +312,9 @@ volumes:
### Phase 4: Observability Pipeline ### Phase 4: Observability Pipeline
**Goal:** Customer can see traces, metrics, and route topology for deployed apps. **Goal:** Customer can see traces, metrics, and route topology for deployed apps.
- Connect cameleer3-server to customer app containers - Connect cameleer-server to customer app containers
- ClickHouse tenant-scoped data partitioning - ClickHouse tenant-scoped data partitioning
- Observability API proxy (tenant-aware routing to cameleer3-server) - Observability API proxy (tenant-aware routing to cameleer-server)
- Basic topology graph endpoint - Basic topology graph endpoint
- Agent ↔ server connectivity verification - Agent ↔ server connectivity verification
@@ -367,13 +367,13 @@ volumes:
1. Upload a sample Camel JAR via API 1. Upload a sample Camel JAR via API
2. Platform builds container image 2. Platform builds container image
3. Deploy to "dev" environment 3. Deploy to "dev" environment
4. Container starts with cameleer3 agent attached 4. Container starts with cameleer agent attached
5. App is reachable via Traefik routing 5. App is reachable via Traefik routing
6. Logs are accessible via API 6. Logs are accessible via API
7. Deploy same image to "prod" with different config 7. Deploy same image to "prod" with different config
### Phase 4 Verification ### Phase 4 Verification
1. Running Camel app sends traces to cameleer3-server 1. Running Camel app sends traces to cameleer-server
2. Traces visible in ClickHouse with correct tenant_id 2. Traces visible in ClickHouse with correct tenant_id
3. Topology graph shows route structure 3. Topology graph shows route structure
4. Different tenant cannot see another tenant's data 4. Different tenant cannot see another tenant's data
@@ -393,7 +393,7 @@ docker compose up -d
# Create tenant + user via API/Logto # Create tenant + user via API/Logto
# Upload sample Camel JAR # Upload sample Camel JAR
# Deploy to environment # Deploy to environment
# Verify agent connects to cameleer3-server # Verify agent connects to cameleer-server
# Verify traces in ClickHouse # Verify traces in ClickHouse
# Verify observability API returns data # Verify observability API returns data
``` ```

View File

@@ -0,0 +1,424 @@
# Phase 3: Runtime Orchestration + Environments
**Date:** 2026-04-04
**Status:** Draft
**Depends on:** Phase 2 (Tenants + Identity + Licensing)
**Gitea issue:** #26
## Context
Phase 2 delivered multi-tenancy, identity (Logto OIDC), and license management. The platform can create tenants and issue licenses, but there is nothing to run yet. Phase 3 is the core product differentiator: customers upload a Camel JAR, the platform builds an immutable container image with the cameleer agent auto-injected, and deploys it to a logical environment. This is "managed Camel runtime" — similar to Coolify or MuleSoft CloudHub, but purpose-built for Apache Camel with deep observability.
Docker-first. The `KubernetesRuntimeOrchestrator` is deferred to Phase 5.
**Single-node constraint:** Because Phase 3 builds images locally via Docker socket (no registry push), the cameleer-saas control plane and the Docker daemon must reside on the same host. This is inherent to the single-tenant Docker Compose stack and is acceptable for that target. In K8s mode (Phase 5), images are built via Kaniko and pushed to a registry, removing this constraint.
## Key Decisions
| Decision | Choice | Rationale |
|----------|--------|-----------|
| JAR delivery | Direct HTTP upload (multipart) | Simplest path. Git-based and image-ref options can be added later. |
| Agent JAR source | Bundled in `cameleer-runtime-base` image | Version-locked to platform release. Updated by rebuilding the platform image with the new agent version. No runtime network dependency. |
| Build speed | Pre-built base image + single-layer customer add | Customer image build is `FROM base` + `COPY app.jar`. ~1-3 seconds. |
| Deployment model | Async with polling | Image builds are inherently slow. Deploy returns immediately with deployment ID. Client polls for status. |
| Entity hierarchy | Environment → App → Deployment | User thinks "I'm in dev, deploy my app." Environment is the workspace context. |
| Environment provisioning | Hybrid auto + manual | Every tenant gets a `default` environment on creation. Additional environments created manually, tier limit enforced. |
| Cross-environment isolation | Logical (not network) | Docker single-tenant mode — customer owns the stack. Data separated by `environmentId` in cameleer-server. Network isolation is a K8s Phase 5 concern. |
| Container networking | Shared `cameleer` bridge network | Customer containers join the existing network. Agent reaches cameleer-server at `http://cameleer-server:8081`. |
| Container naming | `{tenant-slug}-{env-slug}-{app-slug}` | Human-readable, unique, identifies tenant+environment+app at a glance. |
| Bootstrap tokens | Shared `CAMELEER_AUTH_TOKEN` from cameleer-server config | Platform reads the existing token and injects it into customer containers. Environment separation via agent `environmentId` claim, not token. Per-environment tokens deferred to K8s Phase 5. |
| Health checking | Agent health endpoint (port 9464) | Guaranteed to exist, no user config needed. User-defined health endpoints deferred. |
| Inbound HTTP routing | Not in Phase 3 | Most Camel apps are consumers (queues, polls), not servers. Traefik routing for customer apps deferred to Phase 4/4.5. |
| Container logs | Captured via docker-java, written to ClickHouse | Unified log query surface from day 1. Same pattern future app logs will use. |
| Resource constraints | cgroups via docker-java `mem_limit` + `cpu_shares` | Protect the control plane from noisy neighbors. Tier-based defaults. Even in single-tenant Docker mode, a runaway Camel app shouldn't starve Traefik/Postgres/Logto. |
| Orchestrator metadata | JSONB field on deployment entity | Docker stores `containerId`. K8s (Phase 5) stores `namespace`, `deploymentName`, `gitCommit`. Same table, different orchestrator. |
## Data Model
### Environment Entity
```sql
CREATE TABLE environments (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
tenant_id UUID NOT NULL REFERENCES tenants(id) ON DELETE CASCADE,
slug VARCHAR(100) NOT NULL,
display_name VARCHAR(255) NOT NULL,
bootstrap_token TEXT NOT NULL,
status VARCHAR(20) NOT NULL DEFAULT 'ACTIVE',
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
UNIQUE(tenant_id, slug)
);
CREATE INDEX idx_environments_tenant_id ON environments(tenant_id);
```
- `slug` — URL-safe, immutable, unique per tenant. Auto-created environment gets slug `default`.
- `display_name` — User-editable. Auto-created environment gets `Default`.
- `bootstrap_token` — The `CAMELEER_AUTH_TOKEN` value used for customer containers in this environment. In Docker mode, all environments share the same value (read from platform config). In K8s mode (Phase 5), can be unique per environment.
- `status``ACTIVE` or `SUSPENDED`.
### App Entity
```sql
CREATE TABLE apps (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
environment_id UUID NOT NULL REFERENCES environments(id) ON DELETE CASCADE,
slug VARCHAR(100) NOT NULL,
display_name VARCHAR(255) NOT NULL,
jar_storage_path VARCHAR(500),
jar_checksum VARCHAR(64),
jar_original_filename VARCHAR(255),
jar_size_bytes BIGINT,
current_deployment_id UUID,
previous_deployment_id UUID,
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
UNIQUE(environment_id, slug)
);
CREATE INDEX idx_apps_environment_id ON apps(environment_id);
```
- `slug` — URL-safe, immutable, unique per environment.
- `jar_storage_path` — Relative path to uploaded JAR (e.g., `tenants/{tenant-slug}/envs/{env-slug}/apps/{app-slug}/app.jar`). Relative to the configured storage root (`cameleer.runtime.jar-storage-path`). Makes it easy to migrate the storage volume to a different mount point or cloud provider.
- `jar_checksum` — SHA-256 hex digest of the uploaded JAR.
- `current_deployment_id` — Points to the active deployment. Nullable (app created but never deployed).
- `previous_deployment_id` — Points to the last known good deployment. When a new deploy succeeds, `current` becomes the new one and `previous` becomes the old `current`. When a deploy fails, `current` stays as the failed one but `previous` still points to the last good version, enabling a rollback button.
### Deployment Entity
```sql
CREATE TABLE deployments (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
app_id UUID NOT NULL REFERENCES apps(id) ON DELETE CASCADE,
version INTEGER NOT NULL,
image_ref VARCHAR(500) NOT NULL,
desired_status VARCHAR(20) NOT NULL DEFAULT 'RUNNING',
observed_status VARCHAR(20) NOT NULL DEFAULT 'BUILDING',
orchestrator_metadata JSONB DEFAULT '{}',
error_message TEXT,
deployed_at TIMESTAMPTZ,
stopped_at TIMESTAMPTZ,
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
UNIQUE(app_id, version)
);
CREATE INDEX idx_deployments_app_id ON deployments(app_id);
```
- `version` — Sequential per app (1, 2, 3...). Incremented on each deploy.
- `image_ref` — Docker image reference, e.g., `cameleer-runtime-{tenant}-{app}:v3`.
- `desired_status` — What the user wants: `RUNNING`, `STOPPED`.
- `observed_status` — What the platform sees: `BUILDING`, `STARTING`, `RUNNING`, `FAILED`, `STOPPED`.
- `orchestrator_metadata` — Docker mode: `{"containerId": "abc123"}`. K8s mode (Phase 5): `{"namespace": "...", "deploymentName": "...", "gitCommit": "..."}`.
- `error_message` — Populated when `observed_status` is `FAILED`. Build error, startup crash, etc.
## Component Architecture
### RuntimeOrchestrator Interface
```java
public interface RuntimeOrchestrator {
String buildImage(BuildImageRequest request);
void startContainer(StartContainerRequest request);
void stopContainer(String containerId);
void removeContainer(String containerId);
ContainerStatus getContainerStatus(String containerId);
void streamLogs(String containerId, LogConsumer consumer);
}
```
- Single interface, implemented by `DockerRuntimeOrchestrator` (Phase 3) and `KubernetesRuntimeOrchestrator` (Phase 5).
- Injected via Spring `@Profile` or `@ConditionalOnProperty`.
- Request objects carry all context (image name, env vars, network, labels, etc.).
### DockerRuntimeOrchestrator
Uses `com.github.docker-java:docker-java` library. Connects via Docker socket (`/var/run/docker.sock`).
**buildImage:**
1. Creates a temporary build context directory
2. Writes a Dockerfile:
```dockerfile
FROM cameleer-runtime-base:{platform-version}
COPY app.jar /app/app.jar
```
3. Copies the customer JAR as `app.jar`
4. Calls `docker build` via docker-java
5. Tags as `cameleer-runtime-{tenant-slug}-{app-slug}:v{version}`
6. Returns the image reference
**startContainer:**
1. Creates container with:
- Image: the built image reference
- Name: `{tenant-slug}-{env-slug}-{app-slug}`
- Network: `cameleer` (the platform bridge network)
- Environment variables:
- `CAMELEER_AUTH_TOKEN={bootstrap-token}`
- `CAMELEER_EXPORT_TYPE=HTTP`
- `CAMELEER_EXPORT_ENDPOINT=http://cameleer-server:8081`
- `CAMELEER_APPLICATION_ID={app-slug}`
- `CAMELEER_ENVIRONMENT_ID={env-slug}`
- `CAMELEER_DISPLAY_NAME={tenant-slug}-{env-slug}-{app-slug}`
- Resource constraints (cgroups):
- `memory` / `memorySwap` — hard memory limit per container
- `cpuShares` — relative CPU weight (default 512)
- Defaults configurable via `cameleer.runtime.container-memory-limit` (default `512m`) and `cameleer.runtime.container-cpu-shares` (default `512`)
- Protects the control plane (Traefik, Postgres, Logto, cameleer-saas) from noisy neighbor Camel apps
- Health check: HTTP GET to agent health port 9464
2. Starts container
3. Returns container ID
**streamLogs:**
- Attaches to container stdout/stderr via docker-java `LogContainerCmd`
- Passes log lines to a `LogConsumer` callback (for ClickHouse ingestion)
### cameleer-runtime-base Image
A pre-built Docker image containing everything except the customer JAR:
```dockerfile
FROM eclipse-temurin:21-jre-alpine
WORKDIR /app
COPY cameleer-agent-{version}-shaded.jar /app/agent.jar
ENTRYPOINT exec java \
-Dcameleer.export.type=${CAMELEER_EXPORT_TYPE:-HTTP} \
-Dcameleer.export.endpoint=${CAMELEER_EXPORT_ENDPOINT} \
-Dcameleer.agent.name=${HOSTNAME} \
-Dcameleer.agent.application=${CAMELEER_APPLICATION_ID:-default} \
-Dcameleer.agent.environment=${CAMELEER_ENVIRONMENT_ID:-default} \
-Dcameleer.routeControl.enabled=${CAMELEER_ROUTE_CONTROL_ENABLED:-false} \
-Dcameleer.replay.enabled=${CAMELEER_REPLAY_ENABLED:-false} \
-Dcameleer.health.enabled=true \
-Dcameleer.health.port=9464 \
-javaagent:/app/agent.jar \
-jar /app/app.jar
```
- Built as part of the CI pipeline for cameleer-saas.
- Published to Gitea registry: `gitea.siegeln.net/cameleer/cameleer-runtime-base:{version}`.
- Version tracks the platform version + agent version (e.g., `0.2.0` includes agent `1.0-SNAPSHOT`).
- Updating the agent JAR = rebuild this image with the new agent version → rebuild cameleer-saas image → all new deployments use the new agent.
### JAR Upload
- `POST /api/environments/{eid}/apps` with multipart file
- Validation:
- File extension: `.jar`
- Max size: 200 MB (configurable via `cameleer.runtime.max-jar-size`)
- SHA-256 checksum computed and stored
- Storage: relative path `tenants/{tenant-slug}/envs/{env-slug}/apps/{app-slug}/app.jar` under the configured storage root (`cameleer.runtime.jar-storage-path`, default `/data/jars`)
- Docker volume `jardata` mounted into cameleer-saas container
- Database stores the relative path only — decoupled from mount point
- JAR is overwritten on re-upload (new deploy uses new JAR)
### Async Deployment Pipeline
1. **API receives deploy request** → creates `Deployment` entity with `observed_status=BUILDING` → returns deployment ID (HTTP 202 Accepted)
2. **Background thread** (Spring `@Async` with a bounded thread pool):
a. Calls `orchestrator.buildImage(...)` → updates `observed_status=STARTING`
b. Calls `orchestrator.startContainer(...)` → updates `observed_status=STARTING`
c. Polls agent health endpoint (port 9464) with timeout → updates to `RUNNING` or `FAILED`
d. On any failure → updates `observed_status=FAILED`, `error_message=...`
3. **Client polls** `GET /api/apps/{aid}/deployments/{did}` for status updates
4. **On success:** set `previous_deployment_id = old current_deployment_id`, then `current_deployment_id = new deployment`. Stop and remove the old container.
5. **On failure:** `current_deployment_id` is set to the failed deployment (so status is visible), `previous_deployment_id` still points to the last known good version. Enables rollback.
### Container Logs → ClickHouse
- When a container starts, platform attaches a log consumer via `orchestrator.streamLogs()`
- Log consumer batches lines and writes to ClickHouse table:
```sql
CREATE TABLE IF NOT EXISTS container_logs (
tenant_id UUID,
environment_id UUID,
app_id UUID,
deployment_id UUID,
timestamp DateTime64(3),
stream String, -- 'stdout' or 'stderr'
message String
) ENGINE = MergeTree()
ORDER BY (tenant_id, environment_id, app_id, timestamp);
```
- Logs retrieved via `GET /api/apps/{aid}/logs?since=...&limit=...` which queries ClickHouse
- ClickHouse TTL can enforce retention based on license `retention_days` limit (future enhancement)
### Bootstrap Token Handling
In Docker single-tenant mode, all environments share the single cameleer-server instance and its single `CAMELEER_AUTH_TOKEN`. The platform reads this token from its own configuration (`cameleer.runtime.bootstrap-token` / `CAMELEER_AUTH_TOKEN` env var) and injects it into every customer container. No changes to cameleer-server are needed.
Environment-level data separation happens at the agent registration level — the agent sends its `environmentId` claim when it registers, and cameleer-server uses that to scope all data. The bootstrap token is the same across environments in a Docker stack.
The `bootstrap_token` column on the environment entity stores the token value used for that environment's containers. In Docker mode this is the same shared value for all environments. In K8s mode (Phase 5), each environment could have its own cameleer-server instance with a unique token, enabling true per-environment token isolation.
## API Surface
### Environment Endpoints
```
POST /api/tenants/{tenantId}/environments
Body: { "slug": "dev", "displayName": "Development" }
Returns: 201 Created + EnvironmentResponse
Enforces: tier-based max_environments limit from license
GET /api/tenants/{tenantId}/environments
Returns: 200 + List<EnvironmentResponse>
GET /api/tenants/{tenantId}/environments/{environmentId}
Returns: 200 + EnvironmentResponse
PATCH /api/tenants/{tenantId}/environments/{environmentId}
Body: { "displayName": "New Name" }
Returns: 200 + EnvironmentResponse
DELETE /api/tenants/{tenantId}/environments/{environmentId}
Returns: 204 No Content
Precondition: no running apps in environment
Restriction: cannot delete the auto-created "default" environment
```
### App Endpoints
```
POST /api/environments/{environmentId}/apps
Multipart: file (JAR) + metadata { "slug": "order-service", "displayName": "Order Service" }
Returns: 201 Created + AppResponse
Validates: file extension, size, checksum
GET /api/environments/{environmentId}/apps
Returns: 200 + List<AppResponse>
GET /api/environments/{environmentId}/apps/{appId}
Returns: 200 + AppResponse (includes current deployment status)
PUT /api/environments/{environmentId}/apps/{appId}/jar
Multipart: file (JAR)
Returns: 200 + AppResponse
Purpose: re-upload JAR without creating new app
DELETE /api/environments/{environmentId}/apps/{appId}
Returns: 204 No Content
Side effect: stops running container, removes image
```
### Deployment Endpoints
```
POST /api/apps/{appId}/deploy
Body: {} (empty — uses current JAR)
Returns: 202 Accepted + DeploymentResponse (with deployment ID, status=BUILDING)
GET /api/apps/{appId}/deployments
Returns: 200 + List<DeploymentResponse> (ordered by version desc)
GET /api/apps/{appId}/deployments/{deploymentId}
Returns: 200 + DeploymentResponse (poll this for status updates)
POST /api/apps/{appId}/stop
Returns: 200 + DeploymentResponse (desired_status=STOPPED)
POST /api/apps/{appId}/restart
Returns: 202 Accepted + DeploymentResponse (stops + redeploys same image)
```
### Log Endpoints
```
GET /api/apps/{appId}/logs
Query: since (ISO timestamp), until (ISO timestamp), limit (default 500), stream (stdout/stderr/both)
Returns: 200 + List<LogEntry>
Source: ClickHouse container_logs table
```
## Tier Enforcement
| Tier | max_environments | max_agents (apps) |
|------|-----------------|-------------------|
| LOW | 1 | 3 |
| MID | 2 | 10 |
| HIGH | unlimited (-1) | 50 |
| BUSINESS | unlimited (-1) | unlimited (-1) |
- `max_environments` enforced on `POST /api/tenants/{tid}/environments`. The auto-created `default` environment counts toward the limit.
- `max_agents` enforced on `POST /api/environments/{eid}/apps`. Count is total apps across all environments in the tenant.
## Docker Compose Changes
The cameleer-saas service needs:
- Docker socket mount: `/var/run/docker.sock:/var/run/docker.sock` (already present in docker-compose.yml)
- JAR storage volume: `jardata:/data/jars`
- `cameleer-runtime-base` image must be available (pre-pulled or built locally)
The cameleer-server `CAMELEER_AUTH_TOKEN` is read by cameleer-saas from shared environment config and injected into customer containers.
New volume in docker-compose.yml:
```yaml
volumes:
jardata:
```
## Dependencies
### New Maven Dependencies
```xml
<!-- Docker Java client -->
<dependency>
<groupId>com.github.docker-java</groupId>
<artifactId>docker-java-core</artifactId>
<version>3.4.1</version>
</dependency>
<dependency>
<groupId>com.github.docker-java</groupId>
<artifactId>docker-java-transport-httpclient5</artifactId>
<version>3.4.1</version>
</dependency>
<!-- ClickHouse JDBC -->
<dependency>
<groupId>com.clickhouse</groupId>
<artifactId>clickhouse-jdbc</artifactId>
<version>0.7.1</version>
<classifier>all</classifier>
</dependency>
```
### New Configuration Properties
```yaml
cameleer:
runtime:
max-jar-size: 209715200 # 200 MB
jar-storage-path: /data/jars
base-image: cameleer-runtime-base:latest
docker-network: cameleer
agent-health-port: 9464
health-check-timeout: 60 # seconds to wait for healthy status
deployment-thread-pool-size: 4
container-memory-limit: 512m # per customer container
container-cpu-shares: 512 # relative weight (default Docker is 1024)
clickhouse:
url: jdbc:clickhouse://clickhouse:8123/cameleer
```
## Verification Plan
1. Upload a sample Camel JAR via `POST /api/environments/{eid}/apps`
2. Deploy via `POST /api/apps/{aid}/deploy` — returns 202 with deployment ID
3. Poll `GET /api/apps/{aid}/deployments/{did}` — status transitions: `BUILDING` → `STARTING` → `RUNNING`
4. Container visible in `docker ps` as `{tenant}-{env}-{app}`
5. Container is on the `cameleer` network
6. cameleer agent registers with cameleer-server (visible in server logs)
7. Agent health endpoint responds on port 9464
8. Container logs appear in ClickHouse `container_logs` table
9. `GET /api/apps/{aid}/logs` returns log entries
10. `POST /api/apps/{aid}/stop` stops the container, status becomes `STOPPED`
11. `POST /api/apps/{aid}/restart` restarts with same image
12. Re-upload JAR + redeploy creates deployment v2, stops v1
13. Tier limits enforced: LOW tenant cannot create more than 1 environment or 3 apps
14. Default environment auto-created on tenant provisioning

Some files were not shown because too many files have changed in this diff Show More