**Not in scope:** Security hardening, code quality, performance
---
## Executive Summary
The cameleer ecosystem has a clear vision: a standalone observability and runtime platform for Apache Camel, optionally managed by a thin SaaS vendor layer. Both deployment modes must be first-class.
The server (cameleer-server) is solid for observability but currently lacks runtime management capabilities (deploying and managing Camel application containers). These capabilities exist in the SaaS layer today but belong in the server, since standalone customers also need them.
The SaaS layer (cameleer-saas) has taken on too many responsibilities: environment management, app lifecycle, container orchestration, direct ClickHouse access, and partial auth duplication. It should be a thin vendor management plane: onboard tenants, provision server instances, manage billing. Nothing more.
**The revised direction:**
- **Server layer** = the product. Observability + runtime management + auth/RBAC. Self-sufficient standalone, or managed by SaaS.
- **SaaS layer** = vendor management plane. Owns tenant lifecycle (onboard, offboard, bill), provisions server instances, communicates exclusively via server REST APIs.
- **Strong data separation.** Each layer has its own dedicated PostgreSQL and ClickHouse. No cross-layer database access.
- **Logto as federation hub.** In SaaS mode, Logto handles all user authentication. Customers bring their own OIDC providers via Logto Enterprise SSO connectors.
- **One deployable.** The server adapts to its environment. A standalone customer mounts the Docker socket and gets full runtime management.
---
### Problem 2: SaaS bypasses the server to access its databases
**What happens today:**
- `AgentStatusService` queries the server's ClickHouse directly (`SELECT count(), max(start_time) FROM executions`)
- `ContainerLogService` creates and manages its own `container_logs` table in the server's ClickHouse
- The SaaS has its own ClickHouse connection pool (HikariCP, 10 connections)
**Why this is wrong:**
- Violates the exclusive data ownership principle. The server owns its ClickHouse schema.
- Schema changes in the server silently break the SaaS.
- Creates tight coupling where there should be a clean API boundary.
- Two connection pools to ClickHouse from different services add unnecessary operational complexity.
**What should happen:**
- The SaaS has zero access to the server's databases. All data flows through the server's REST API.
- Container logs (a runtime concern) move to the server along with runtime management.
- The SaaS has its own PostgreSQL for vendor concerns (tenants, billing, provisioning records). No ClickHouse needed.
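The API-only boundary described above can be sketched as follows. This is a minimal illustration, not the server's actual API: the class name, the `/api/v1/agents/status` path, and the bearer-token scheme are all assumptions made for the example. It only builds the request, so it shows the shape of the call without implying a particular response format.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.time.Duration;

/** Sketch: the SaaS asks the server for agent status over REST instead of querying ClickHouse. */
final class ServerApiRequests {

    private ServerApiRequests() {}

    /**
     * Builds (but does not send) a GET request for agent status.
     * The "/api/v1/agents/status" path is a hypothetical placeholder, not a documented endpoint.
     */
    static HttpRequest agentStatus(URI serverBaseUrl, String bearerToken) {
        return HttpRequest.newBuilder(serverBaseUrl.resolve("/api/v1/agents/status"))
                .header("Authorization", "Bearer " + bearerToken)
                .header("Accept", "application/json")
                .timeout(Duration.ofSeconds(5))
                .GET()
                .build();
    }
}
```

With this shape, a schema change inside the server's ClickHouse stays invisible to the SaaS as long as the REST contract is kept stable.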
---
### Problem 3: Auth architecture doesn't support per-tenant OIDC
**What happens today:**
- The server has one global OIDC configuration
- In SaaS mode, it validates Logto tokens. All tenants use the same Logto instance.
- Customers cannot bring their own OIDC providers (Okta, Azure AD, etc.)
- The server generates its own JWTs after OIDC callback, creating dual-issuer problems (#38)
- `syncOidcRoles` writes shadow copies of roles to PostgreSQL on every login
**Why this matters:**
- Enterprise customers require SSO with their own identity provider. This is table stakes for B2B SaaS.
- The dual-issuer pattern (server JWTs + Logto JWTs) causes the session synchronization problem (#38).
- Per-tenant OIDC needs a federation hub, not per-server OIDC config changes.
**What should happen:**
- **Standalone mode:** Server manages users/groups/roles independently. Optional OIDC integration pointing to customer's IdP directly. Works exactly as today.
- **SaaS mode:** Logto acts as federation hub via Enterprise SSO connectors. Each tenant/organization configures their own SSO connector (SAML or OIDC). Logto handles federation and issues a single token type. The server validates Logto tokens (one OIDC config). Single token issuer eliminates #38.
- **Server auth behavior (inferred from config, no explicit mode flag):**
  - No OIDC configured: Full local auth. Server generates JWTs, manages users/groups/roles.
  - OIDC configured: Local + OIDC coexist. Claim mapping available.
  - OIDC configured + `cameleer.auth.local.enabled=false`: Pure resource server. No local login, no JWT generation, no shadow role sync. SaaS provisioner sets this.
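The three inferred behaviors listed above can be expressed as a small decision function. This is a sketch under assumptions: the enum and method names are invented for illustration, `oidcConfigured` stands for whatever check the server uses to detect an OIDC configuration, and `cameleer.auth.local.enabled` is assumed to default to `true`.

```java
/** Sketch of inferring auth behavior from configuration, with no explicit mode flag. */
enum AuthBehavior {
    LOCAL_ONLY,       // no OIDC configured: full local auth
    LOCAL_AND_OIDC,   // OIDC configured: local and OIDC users coexist
    RESOURCE_SERVER;  // OIDC configured + local auth disabled: external tokens only

    /**
     * @param oidcConfigured   whether an OIDC provider is configured (hypothetical input)
     * @param localAuthEnabled value of cameleer.auth.local.enabled (assumed default: true)
     */
    static AuthBehavior infer(boolean oidcConfigured, boolean localAuthEnabled) {
        if (!oidcConfigured) {
            // Assumption: without OIDC the flag is ignored, so there is always a login path.
            return LOCAL_ONLY;
        }
        return localAuthEnabled ? LOCAL_AND_OIDC : RESOURCE_SERVER;
    }

    /** A pure resource server offers no local login. */
    boolean allowsLocalLogin() {
        return this != RESOURCE_SERVER;
    }
}
```

Treating `local.enabled=false` without OIDC as `LOCAL_ONLY` is one possible choice; rejecting that combination at startup would be an equally reasonable alternative.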
---
### Problem 4: License validation has no server-side implementation
**What exists today:**
- The SaaS generates Ed25519-signed license JWTs with tier/features/limits
- The server has zero license awareness — no validation, no feature gating, no tier concept
- MOAT features cannot be gated at the server level
**What should happen:**
- Server validates Ed25519-signed license JWTs
- License loaded from: env var, file path, or API endpoint
- MOAT feature endpoints check license before serving data
- In standalone: license file at `/etc/cameleer/license.jwt`
- In SaaS: license injected during tenant provisioning
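As a rough illustration of the server-side check, the signature of an Ed25519-signed license JWT can be verified with the JDK alone (Java 15 or later). The class and method names below are hypothetical, and the sketch covers only the signature: a real implementation would also need to check the `alg` header and the license claims (expiry, tier, features, limits), whose exact names are not specified here.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.PublicKey;
import java.security.Signature;
import java.util.Base64;

/** Sketch: verify the Ed25519 signature of a compact license JWT using only the JDK. */
final class LicenseSignatureVerifier {

    private LicenseSignatureVerifier() {}

    /**
     * Returns true if "header.payload.signature" carries a valid Ed25519 signature
     * for the given vendor public key. Claim validation is intentionally omitted.
     */
    static boolean hasValidSignature(String compactJwt, PublicKey vendorPublicKey) {
        String[] parts = compactJwt.split("\\.", -1);
        if (parts.length != 3) {
            return false;
        }
        try {
            // The JWS signing input is the ASCII bytes of "<header>.<payload>".
            byte[] signingInput = (parts[0] + "." + parts[1]).getBytes(StandardCharsets.US_ASCII);
            byte[] signature = Base64.getUrlDecoder().decode(parts[2]);
            Signature verifier = Signature.getInstance("Ed25519");
            verifier.initVerify(vendorPublicKey);
            verifier.update(signingInput);
            return verifier.verify(signature);
        } catch (GeneralSecurityException | IllegalArgumentException e) {
            // Malformed Base64, wrong key type, or a badly encoded signature: treat as invalid.
            return false;
        }
    }
}
```

Because verification needs only the vendor's public key, the same check works whether the license comes from an env var, a file such as `/etc/cameleer/license.jwt`, or provisioning.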
- `ADMIN`: full access including user/group/role management, server settings, license
Custom roles may be defined by the server admin for finer-grained control.
### Assignment Types
Every user-role and user-group assignment has an **origin**:
| Origin | Set by | Lifecycle | On OIDC login |
|--------|--------|-----------|---------------|
| `direct` | Admin manually assigns via UI/API | Persisted until admin removes it | Untouched |
| `managed` | Claim mapping rules evaluate JWT | Recalculated on every OIDC login | Cleared and re-evaluated |
Effective permissions = union of direct roles + managed roles + roles inherited from groups (both direct and managed group memberships).
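The union rule and the two origin lifecycles can be sketched in a few lines. All type and method names here are illustrative, and the group-to-roles lookup is passed in as a plain map rather than loaded from the database.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Sketch of origin-aware assignments; names are illustrative only. */
final class EffectiveRoles {

    enum Origin { DIRECT, MANAGED }

    record RoleAssignment(String role, Origin origin) {}

    record GroupMembership(String group, Origin origin) {}

    private EffectiveRoles() {}

    /** Union of direct roles, managed roles, and roles inherited from any group membership. */
    static Set<String> resolve(List<RoleAssignment> userRoles,
                               List<GroupMembership> userGroups,
                               Map<String, Set<String>> rolesByGroup) {
        Set<String> effective = new HashSet<>();
        for (RoleAssignment assignment : userRoles) {
            effective.add(assignment.role()); // origin does not affect the union
        }
        for (GroupMembership membership : userGroups) {
            effective.addAll(rolesByGroup.getOrDefault(membership.group(), Set.of()));
        }
        return effective;
    }

    /** On OIDC login: direct assignments stay untouched, managed ones are cleared and re-evaluated. */
    static List<RoleAssignment> refreshManaged(List<RoleAssignment> current,
                                               Set<String> rolesFromClaimMapping) {
        List<RoleAssignment> next = new ArrayList<>();
        for (RoleAssignment assignment : current) {
            if (assignment.origin() == Origin.DIRECT) {
                next.add(assignment);
            }
        }
        for (String role : rolesFromClaimMapping) {
            next.add(new RoleAssignment(role, Origin.MANAGED));
        }
        return next;
    }
}
```

Keeping the origin on each assignment is what lets a login drop stale managed roles without disturbing anything an admin assigned by hand.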
**Schema:**
```sql
-- user_roles
user_id UUID NOT NULL
role_id UUID NOT NULL
origin VARCHAR NOT NULL -- 'direct' or 'managed'
mapping_id UUID -- NULL for direct; FK to claim_mapping_rules for managed
-- user_groups (same pattern)
user_id UUID NOT NULL
group_id UUID NOT NULL
origin VARCHAR NOT NULL -- 'direct' or 'managed'
mapping_id UUID -- NULL for direct; FK to claim_mapping_rules for managed
```
### Claim Mapping
When OIDC is configured, the server admin can define **claim mapping rules** that automatically assign roles or group memberships based on JWT claims. Rules are server-level config (one set per server instance = effectively per-tenant in SaaS mode).
**Rule structure:**
```sql
-- claim_mapping_rules
id UUID PRIMARY KEY
claim VARCHAR NOT NULL -- JWT claim to read (e.g., 'groups', 'roles', 'department')
match_type VARCHAR NOT NULL -- 'equals', 'contains', 'regex'
match_value VARCHAR NOT NULL -- value to match against
action VARCHAR NOT NULL -- 'assignRole' or 'addToGroup'
target VARCHAR NOT NULL -- role name or group name
priority INT DEFAULT 0 -- evaluation order (higher = later, for conflict resolution)
```
- Full local user management: create users, set passwords, assign roles/groups
- Admin manages everything via server UI
**Standalone + OIDC (OIDC configured and enabled):**
- Local user management still available
- OIDC users auto-signup on first login (configurable: on/off, default role)
- Claim mapping available for automated role/group assignment
- Both local and OIDC users coexist
**SaaS / OIDC-only (OIDC configured and enabled, local auth disabled):**
- Inferred: when OIDC is configured and enabled and local auth is disabled (`cameleer.auth.local.enabled=false`), the server operates as a pure resource server
- No local user creation or password management
- Users exist only after first OIDC login (auto-signup always on)
- Claim mapping is the primary role assignment mechanism
- Admin can still make direct assignments via UI (for overrides)
- User/password management UI sections hidden
- Logto org roles → JWT `roles` claim → server claim mapping → server roles
**Note:** There is no explicit `cameleer.auth.mode` flag. The server infers its auth behavior from whether OIDC is configured and enabled. If OIDC is present, the server acts as a resource server for user-facing auth (agents always use server-issued tokens regardless).
### SaaS with Enterprise SSO (per-tenant customer IdPs)
```
→ Same flow, different mapping rules configured by that tenant's admin:
"roles contains CameleerOperator" → OPERATOR role (managed)
```
Each tenant configures their own mapping rules in their server instance. The server doesn't care which IdP issued the claims — it just evaluates rules against whatever JWT it receives.
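Rule evaluation against an arbitrary set of claims might look like the sketch below. The record fields mirror the `claim_mapping_rules` columns, but the matching semantics are assumptions: `equals` requires a single-valued claim equal to the match value, `contains` checks whether a (possibly multi-valued) claim includes the match value as an element, and `regex` requires some value to match the pattern in full. Rules are applied in ascending `priority` so that higher priorities are evaluated later.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

/** Sketch of claim mapping evaluation; matching semantics are assumptions, not a spec. */
final class ClaimMapping {

    enum MatchType { EQUALS, CONTAINS, REGEX }

    enum Action { ASSIGN_ROLE, ADD_TO_GROUP }

    record Rule(String claim, MatchType matchType, String matchValue,
                Action action, String target, int priority) {}

    record Assignment(Action action, String target) {}

    private ClaimMapping() {}

    /** Returns the managed assignments produced by the rules, in ascending priority order. */
    static List<Assignment> evaluate(List<Rule> rules, Map<String, Object> claims) {
        List<Assignment> result = new ArrayList<>();
        rules.stream()
                .sorted(Comparator.comparingInt(Rule::priority))
                .filter(rule -> matches(rule, claims.get(rule.claim())))
                .forEach(rule -> result.add(new Assignment(rule.action(), rule.target())));
        return result;
    }

    private static boolean matches(Rule rule, Object claimValue) {
        if (claimValue == null) {
            return false;
        }
        // A claim may be a single value or an array; normalize to a list of strings.
        List<String> values = new ArrayList<>();
        if (claimValue instanceof Iterable<?> many) {
            many.forEach(v -> values.add(String.valueOf(v)));
        } else {
            values.add(String.valueOf(claimValue));
        }
        return switch (rule.matchType()) {
            case EQUALS -> values.size() == 1 && values.get(0).equals(rule.matchValue());
            case CONTAINS -> values.contains(rule.matchValue());
            case REGEX -> {
                Pattern pattern = Pattern.compile(rule.matchValue());
                yield values.stream().anyMatch(v -> pattern.matcher(v).matches());
            }
        };
    }
}
```

The evaluator never looks at the token issuer, which matches the point above: the same rules engine serves Logto-issued tokens and tokens from a customer IdP reached directly in standalone mode.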
---
## Resolved Design Questions
### Server Instance Topology
- `CAMELEER_TENANT_ID` env var (already exists) scopes all data access in PG and CH
- Standalone: defaults to `"default"`, customer never thinks about it
- SaaS: the SaaS provisioner sets it when starting the server container
- Auth behavior is inferred from OIDC configuration (no explicit mode flag)
- The server doesn't need to "know" it's in SaaS mode — tenant_id + OIDC config is sufficient
### Runtime Orchestrator Scope (Routing)
The server owns routing as part of the RuntimeOrchestrator abstraction. Two routing strategies, configured at the server level:
```
cameleer.routing.mode=path → api.example.com/apps/{env}/{app} (default, works everywhere)
cameleer.routing.mode=subdomain → {app}.{env}.apps.example.com (requires wildcard DNS + TLS)
```
- **Path-based** is the default — no wildcard DNS or TLS required, works in every environment
- **Subdomain-based** is opt-in for customers who prefer it and can provide wildcard infrastructure
- Docker mode: Traefik labels on deployed containers
- K8s mode: Service + Ingress resources
- The routing mechanism is an implementation detail of each RuntimeOrchestrator, not a separate concern
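To make the two strategies concrete, the sketch below derives the public address and a matching Traefik router rule for a deployed app. The class and method names are hypothetical, the `api.` and `apps.` prefixes are taken from the example addresses above, and how the server actually attaches such a rule to a container (for instance through a `traefik.http.routers.<name>.rule` label) is an assumption.

```java
import java.util.Locale;

/** Sketch: derive addresses and Traefik rules for the two routing modes. */
final class AppRoutes {

    enum RoutingMode {
        PATH, SUBDOMAIN;

        /** Parses the value of cameleer.routing.mode; path-based routing is the default. */
        static RoutingMode fromConfig(String value) {
            if (value == null || value.isBlank()) {
                return PATH;
            }
            return valueOf(value.trim().toUpperCase(Locale.ROOT));
        }
    }

    private AppRoutes() {}

    /** Public address of an app, without scheme, e.g. "api.example.com/apps/prod/orders". */
    static String publicAddress(RoutingMode mode, String baseDomain, String env, String app) {
        return switch (mode) {
            case PATH -> "api." + baseDomain + "/apps/" + env + "/" + app;
            case SUBDOMAIN -> app + "." + env + ".apps." + baseDomain;
        };
    }

    /** Traefik router rule matching the same address. */
    static String traefikRule(RoutingMode mode, String baseDomain, String env, String app) {
        return switch (mode) {
            case PATH -> "Host(`api." + baseDomain + "`) && PathPrefix(`/apps/" + env + "/" + app + "`)";
            case SUBDOMAIN -> "Host(`" + app + "." + env + ".apps." + baseDomain + "`)";
        };
    }
}
```

A Kubernetes orchestrator would translate the same two inputs into Ingress host and path fields instead, which is why routing fits inside each RuntimeOrchestrator implementation.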
### UI Navigation
The current server navigation structure is preserved. Runtime management integrates as follows:
Applications page design requires a UI mock before finalizing — to be explored in the frontend design phase.
### Data Migration
Not applicable — greenfield. No existing installations to migrate.
---
## Summary
The cameleer ecosystem is well-conceived but the current SaaS-server boundary is in the wrong place. The SaaS has grown into a second product rather than a thin vendor layer.
The fix is architectural: move runtime management (environments, apps, deployments) into the server, make the SaaS a pure vendor management plane, enforce strict data separation, and use Logto Enterprise SSO as the federation hub for per-tenant OIDC.
The result: **the server is the complete product (observability + runtime + auth). The SaaS is how the vendor manages tenants of that product. Both standalone and SaaS are first-class because the server doesn't depend on the SaaS for any of its capabilities.**