Dual Deployment Architecture: Docker + Kubernetes
Date: 2026-04-04
Status: Approved
Supersedes: Portions of 2026-03-29-saas-platform-prd.md (deployment model, phase ordering, auth strategy)
Context
Cameleer SaaS must serve two deployment targets:
- Docker Compose — production-viable for small customers and air-gapped installs (single-tenant per stack)
- Kubernetes — managed SaaS and enterprise self-hosted (multi-tenant)
The original PRD assumed K8s-only production. This design restructures the architecture and roadmap to treat Docker Compose as a first-class production target, uses the Docker+K8s dual requirement as a filter for build-vs-buy decisions, and reorders the phase roadmap to ship a deployable product faster.
Key constraints:
- The application is always multi-tenant — Docker deployments have exactly 1 tenant
- Don't build custom abstractions over K8s-only primitives when no Docker equivalent exists
- Prefer right-sized OSS tools over Swiss Army knives or custom builds
- K8s-only features (NetworkPolicies, HPA, Flux CD) are operational enhancements, never functional requirements
Build-vs-Buy Decisions
BUY (Use 3rd Party OSS)
| Subsystem | Tool | License | Why This Tool |
|---|---|---|---|
| Identity & Auth | Logto | MPL-2.0 | Lightest IdP (2 containers, ~0.5-1 GB). Orgs, RBAC, M2M tokens, OIDC/SSO federation all in OSS. Replaces ~3-4 months of custom auth build (OIDC, SSO, teams, invites, MFA, password reset, custom roles). |
| Reverse Proxy | Traefik | MIT | Native Docker provider (labels) and K8s provider (IngressRoute CRDs). Same mental model in both environments. Already on the k3s cluster. ForwardAuth middleware for tenant-aware routing. Auto-HTTPS via Let's Encrypt. ~256 MB RAM. |
| Database | PostgreSQL | PostgreSQL License | Already chosen. Platform data + Logto data (separate schemas). |
| Trace/Metrics Storage | ClickHouse | Apache-2.0 | Replaced OpenSearch in the cameleer-server stack. Columnar OLAP, excellent for time-series observability data. |
| Schema Migrations | Flyway | Apache-2.0 | Already in place. |
| Billing (subscriptions) | Stripe | N/A (API) | Start with Stripe Checkout for fixed-tier subscriptions. No custom billing infrastructure day 1. |
| Billing (usage metering) | Lago (deferred) | AGPL-3.0 | Purpose-built for event-based metering. 8 containers, so deploy only when usage-based pricing launches. Design the event model with Lago's API shape in mind from day 1. Integrate via API only, so that no AGPL code is linked into the platform. |
| GitOps (K8s only) | Flux CD | Apache-2.0 | K8s-only, and that's acceptable. Docker deployments get release tarballs + upgrade scripts. |
| Image Builds (K8s) | Kaniko | Apache-2.0 | Daemonless container image builds inside K8s. For Docker mode, docker build via docker-java is simpler. |
| Monitoring | Prometheus + Grafana + Loki | Apache-2.0 | Works in both Docker and K8s. Optional for Docker (customer's choice), standard for K8s SaaS. |
| TLS Certificates | Traefik ACME (Docker) / cert-manager (K8s) | MIT / Apache-2.0 | Standard tools, no custom code. |
| Container Registry (K8s) | Gitea Registry (SaaS) / registry:2 (self-hosted) | — | Docker mode doesn't need a registry (local image cache). |
BUILD (Custom / Core IP)
| Subsystem | Why Build |
|---|---|
| License signing & validation | Ed25519 signed JWT with tier, features, limits, expiry. Dual mode: online API check + offline signed file. No off-the-shelf tool does this. Core IP. |
| Agent bootstrap tokens | Tightly coupled to the cameleer agent protocol (PROTOCOL.md). Custom Ed25519 tokens for agent registration. |
| Tenant lifecycle | CRUD, configuration, status management. Core business logic. User management (invites, teams, roles) is delegated to Logto's organization model. |
| Runtime orchestration | The core of the "managed Camel runtime" product. RuntimeOrchestrator interface with Docker and K8s implementations. No off-the-shelf tool does "managed Camel runtime with agent injection." |
| Image build pipeline | Templated Dockerfile: JRE + cameleer-agent.jar + customer JAR + -javaagent flag. Simple but custom. |
| Feature gating | Tier-based feature gating logic. Which features are available at which tier. Business logic. |
| Billing integration | Stripe API calls, subscription lifecycle, webhook handling. Thin integration layer. |
| Observability proxy | Routing authenticated requests to tenant-specific cameleer-server instances. |
| MOAT features | Debugger, Lineage, Correlation — the defensible product. Built in cameleer agent + server. |
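The license-signing and bootstrap-token rows above both rest on Ed25519 signatures. The following is a minimal sketch of the sign/verify step using only the JDK (Ed25519 is available in `java.security` since Java 15). The class name `Ed25519Tokens` and the two-part `payload.signature` token layout are assumptions made for illustration; a real JWT also carries a header segment (with `alg` set to `EdDSA`) and would normally be produced by a JWT library.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PrivateKey;
import java.security.PublicKey;
import java.security.Signature;
import java.util.Base64;

// Hypothetical helper: signs and verifies compact "payload.signature" tokens.
final class Ed25519Tokens {
    private static final Base64.Encoder ENC = Base64.getUrlEncoder().withoutPadding();
    private static final Base64.Decoder DEC = Base64.getUrlDecoder();

    // Generates a fresh key pair; production keys would be loaded from mounted files.
    static KeyPair generateKeyPair() throws GeneralSecurityException {
        return KeyPairGenerator.getInstance("Ed25519").generateKeyPair();
    }

    // Returns base64url(payload) + "." + base64url(signature over the payload bytes).
    static String sign(String payload, PrivateKey key) throws GeneralSecurityException {
        byte[] body = payload.getBytes(StandardCharsets.UTF_8);
        Signature signer = Signature.getInstance("Ed25519");
        signer.initSign(key);
        signer.update(body);
        return ENC.encodeToString(body) + "." + ENC.encodeToString(signer.sign());
    }

    // Returns the payload if the signature is valid, otherwise throws.
    static String verify(String token, PublicKey key) throws GeneralSecurityException {
        String[] parts = token.split("\\.");
        if (parts.length != 2) {
            throw new GeneralSecurityException("malformed token");
        }
        byte[] body;
        byte[] signature;
        try {
            body = DEC.decode(parts[0]);
            signature = DEC.decode(parts[1]);
        } catch (IllegalArgumentException e) {
            throw new GeneralSecurityException("malformed token", e);
        }
        Signature verifier = Signature.getInstance("Ed25519");
        verifier.initVerify(key);
        verifier.update(body);
        if (!verifier.verify(signature)) {
            throw new GeneralSecurityException("bad signature");
        }
        return new String(body, StandardCharsets.UTF_8);
    }
}
```

Because verification needs only the public key, the same check can run offline against a signed license file, which is what the dual online/offline mode relies on. Expiry and limit checks on the decoded payload are left out of this sketch.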
SKIP / DEFER
| Subsystem | Why Skip |
|---|---|
| Secrets management (Vault) | Docker: env vars + mounted files. K8s: K8s Secrets. Vault is enterprise-tier complexity. Defer until demanded. |
| Custom role management UI | Logto provides this. |
| OIDC provider implementation | Logto provides this. |
| WireGuard VPN / VPC peering | Far future, dedicated-tier only. |
| Cluster API for dedicated tiers | Don't design for this until enterprise customers exist. |
| Management agent for updates | Watchtower is optional for connected customers. Air-gapped gets release tarballs. Don't build custom. |
Architecture
Platform Stack (Docker Compose — 6 base containers)
+-------------------------------------------------------+
| Traefik (reverse proxy, TLS, ForwardAuth)             |
| - Docker: labels-based routing                        |
| - K8s: IngressRoute CRDs                              |
+--------+---------------------+------------------------+
         |                     |
+--------v--------+  +---------v-----------+
| cameleer-saas   |  | cameleer-server     |
| (Spring Boot)   |  | (observability)     |
| Control plane   |  | Per-tenant instance |
+---+-------+-----+  +----------+----------+
    |       |                   |
+---v--+ +--v----+    +---------v---------+
| PG   | | Logto |    | ClickHouse        |
|      | | (IdP) |    | (traces/metrics)  |
+------+ +-------+    +-------------------+
Customer Camel apps are additional containers dynamically managed by the control plane via Docker API (Docker mode) or K8s API (K8s mode).
Auth Flow
User login:
  Browser -> Traefik -> Logto (OIDC flow) -> JWT issued by Logto
API request:
  Browser -> Traefik -> ForwardAuth (cameleer-saas /auth/verify)
    -> Validates Logto JWT, injects X-Tenant-Id header
    -> Traefik forwards to upstream service
Machine auth (agent bootstrap):
  cameleer-agent -> cameleer-saas /api/agent/register
    -> Validates bootstrap token (Ed25519)
    -> Issues agent session token
    -> Agent connects to cameleer-server
Logto handles all user-facing identity. The cameleer-saas app handles machine-to-machine auth (agent tokens, license tokens) using Ed25519.
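The API request flow above depends on how Traefik's ForwardAuth middleware behaves: a 2xx response from the auth endpoint lets the request through, and headers listed in the middleware's `authResponseHeaders` option are copied onto the forwarded request. The sketch below shows only the decision logic, kept free of any HTTP framework. `ForwardAuthVerifier`, `AuthDecision`, and the token-to-tenant function are hypothetical names; real JWT validation against Logto's keys is assumed to live behind that function.

```java
import java.util.Map;
import java.util.Optional;
import java.util.function.Function;

// Status code and headers the /auth/verify endpoint would send back to Traefik.
record AuthDecision(int status, Map<String, String> headers) {}

final class ForwardAuthVerifier {
    private static final String BEARER = "Bearer ";

    // Hypothetical validator: returns the tenant id for a valid token, empty otherwise.
    private final Function<String, Optional<String>> tenantFromToken;

    ForwardAuthVerifier(Function<String, Optional<String>> tenantFromToken) {
        this.tenantFromToken = tenantFromToken;
    }

    AuthDecision verify(String authorizationHeader) {
        // Reject requests without a Bearer token (scheme match is case-insensitive).
        if (authorizationHeader == null
                || !authorizationHeader.regionMatches(true, 0, BEARER, 0, BEARER.length())) {
            return new AuthDecision(401, Map.of());
        }
        String token = authorizationHeader.substring(BEARER.length()).trim();
        // 200 plus X-Tenant-Id on success; 401 with no headers on failure.
        return tenantFromToken.apply(token)
                .map(tenantId -> new AuthDecision(200, Map.of("X-Tenant-Id", tenantId)))
                .orElseGet(() -> new AuthDecision(401, Map.of()));
    }
}
```

Keeping the decision separate from the web layer makes it straightforward to unit-test the rule that no request reaches an upstream service without a tenant id derived from a validated token.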
Runtime Orchestration
RuntimeOrchestrator (interface)
  + deployApp(tenantId, appId, envId, imageRef, config) -> Deployment
  + stopApp(tenantId, appId, envId) -> void
  + restartApp(tenantId, appId, envId) -> void
  + getAppLogs(tenantId, appId, envId, since) -> Stream<LogLine>
  + getAppStatus(tenantId, appId, envId) -> AppStatus
  + listApps(tenantId) -> List<AppSummary>
DockerRuntimeOrchestrator (docker-java library)
  - Talks to Docker daemon via /var/run/docker.sock
  - Creates containers with labels for Traefik routing
  - Manages container lifecycle
  - Builds images locally via docker build
KubernetesRuntimeOrchestrator (fabric8 kubernetes-client)
  - Creates Deployments, Services, ConfigMaps in tenant namespace
  - Builds images via Kaniko Jobs, pushes to registry
  - Manages rollout lifecycle
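The interface outline above could translate to Java roughly as follows. The method names and parameters come from the outline; the shapes of `Deployment`, `LogLine`, `AppSummary`, and `AppStatus` are assumptions, and `InMemoryRuntimeOrchestrator` is a hypothetical test double that starts no containers. It is not the Docker or Kubernetes implementation.

```java
import java.time.Instant;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.stream.Stream;

enum AppStatus { RUNNING, STOPPED, UNKNOWN }
record Deployment(String tenantId, String appId, String envId, String imageRef) {}
record LogLine(Instant timestamp, String message) {}
record AppSummary(String appId, String envId, AppStatus status) {}

interface RuntimeOrchestrator {
    Deployment deployApp(String tenantId, String appId, String envId,
                         String imageRef, Map<String, String> config);
    void stopApp(String tenantId, String appId, String envId);
    void restartApp(String tenantId, String appId, String envId);
    Stream<LogLine> getAppLogs(String tenantId, String appId, String envId, Instant since);
    AppStatus getAppStatus(String tenantId, String appId, String envId);
    List<AppSummary> listApps(String tenantId);
}

// Test double: tracks state in memory so callers can be tested without Docker or K8s.
final class InMemoryRuntimeOrchestrator implements RuntimeOrchestrator {
    private record AppKey(String tenantId, String appId, String envId) {}
    private final Map<AppKey, AppSummary> apps = new ConcurrentHashMap<>();

    @Override
    public Deployment deployApp(String tenantId, String appId, String envId,
                                String imageRef, Map<String, String> config) {
        apps.put(new AppKey(tenantId, appId, envId),
                 new AppSummary(appId, envId, AppStatus.RUNNING));
        return new Deployment(tenantId, appId, envId, imageRef);
    }

    @Override
    public void stopApp(String tenantId, String appId, String envId) {
        setStatus(new AppKey(tenantId, appId, envId), AppStatus.STOPPED);
    }

    @Override
    public void restartApp(String tenantId, String appId, String envId) {
        setStatus(new AppKey(tenantId, appId, envId), AppStatus.RUNNING);
    }

    @Override
    public Stream<LogLine> getAppLogs(String tenantId, String appId, String envId, Instant since) {
        return Stream.empty(); // no real process, so no logs
    }

    @Override
    public AppStatus getAppStatus(String tenantId, String appId, String envId) {
        AppSummary app = apps.get(new AppKey(tenantId, appId, envId));
        return app == null ? AppStatus.UNKNOWN : app.status();
    }

    @Override
    public List<AppSummary> listApps(String tenantId) {
        // Only entries whose key carries the requested tenant id are visible.
        return apps.entrySet().stream()
                .filter(e -> e.getKey().tenantId().equals(tenantId))
                .map(Map.Entry::getValue)
                .toList();
    }

    private void setStatus(AppKey key, AppStatus status) {
        apps.computeIfPresent(key, (k, app) -> new AppSummary(app.appId(), app.envId(), status));
    }
}
```

Every method takes `tenantId` explicitly, so tenant scoping is part of the contract rather than something each implementation has to remember.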
Image Build Pipeline
Customer uploads JAR
  -> Validation (file type, size, SHA-256, security scan)
  -> Templated Dockerfile generation:
       FROM eclipse-temurin:21-jre-alpine
       COPY cameleer-agent.jar /opt/agent/
       COPY customer-app.jar /opt/app/
       ENTRYPOINT ["java", "-javaagent:/opt/agent/cameleer-agent.jar", "-jar", "/opt/app/customer-app.jar"]
  -> Build:
       Docker mode: docker build via docker-java (local image cache)
       K8s mode: Kaniko Job -> push to registry
  -> Deploy to requested environment
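The Dockerfile generation step above could be sketched as a small rendering function. `DockerfileTemplate` is a hypothetical name, and the file-name whitelist is an assumption added here because the uploaded JAR name is customer-controlled and ends up inside a build instruction. The sketch copies both JARs to fixed destination names so the ENTRYPOINT does not depend on what the customer called the file.

```java
import java.util.regex.Pattern;

// Hypothetical renderer for the templated Dockerfile described above.
final class DockerfileTemplate {
    // Conservative whitelist: plain file names ending in .jar, no paths or whitespace.
    private static final Pattern SAFE_JAR = Pattern.compile("[A-Za-z0-9._-]+\\.jar");

    static String render(String baseImage, String agentJarName, String appJarName) {
        requireSafe(agentJarName);
        requireSafe(appJarName);
        return String.join("\n",
                "FROM " + baseImage,
                "COPY " + agentJarName + " /opt/agent/cameleer-agent.jar",
                "COPY " + appJarName + " /opt/app/customer-app.jar",
                "ENTRYPOINT [\"java\", \"-javaagent:/opt/agent/cameleer-agent.jar\", "
                        + "\"-jar\", \"/opt/app/customer-app.jar\"]",
                "");
    }

    private static void requireSafe(String name) {
        if (name == null || !SAFE_JAR.matcher(name).matches()) {
            throw new IllegalArgumentException("unsafe JAR file name: " + name);
        }
    }
}
```

A call such as `DockerfileTemplate.render("eclipse-temurin:21-jre-alpine", "cameleer-agent.jar", "orders-1.2.jar")` yields the four instructions shown in the pipeline. The base image is treated as platform-controlled and is not validated in this sketch.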
Multi-Tenancy Model
- Always multi-tenant. Docker Compose has 1 pre-configured tenant.
- Schema-per-tenant in PostgreSQL for platform data isolation.
- Logto organizations map 1:1 to tenants. Logto handles user-tenant membership.
- ClickHouse data partitioned by tenant_id.
- cameleer-server instances are per-tenant (separate containers/pods).
- K8s bonus: Namespace-per-tenant for network isolation, resource quotas.
Environment Model
Each tenant can have multiple logical environments (tier-dependent):
| Tier | Environments |
|---|---|
| Low | prod only |
| Mid | dev, prod |
| High+ | dev, staging, prod + custom |
Each environment is a separate deployment of the same app image with different configuration:
- Docker: separate container, different env vars
- K8s: separate Deployment, different ConfigMap
Promotion = deploy same image tag to a different environment with that environment's config.
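As an illustration of the promotion rule, the sketch below models an environment deployment as an image tag plus configuration, and promotion as copying the tag while swapping the configuration. The names `Tier`, `EnvDeployment`, and `Environments` are hypothetical, and the custom environments available at the highest tier are left out.

```java
import java.util.List;
import java.util.Map;

enum Tier { LOW, MID, HIGH }

// One app in one environment: which image runs there and with what config.
record EnvDeployment(String envId, String imageTag, Map<String, String> config) {}

final class Environments {
    // Environments available per tier, mirroring the table above (custom ones omitted).
    static List<String> allowed(Tier tier) {
        return switch (tier) {
            case LOW -> List.of("prod");
            case MID -> List.of("dev", "prod");
            case HIGH -> List.of("dev", "staging", "prod");
        };
    }

    // Promotion keeps the image tag and replaces only the environment and its config.
    static EnvDeployment promote(EnvDeployment source, String targetEnv,
                                 Map<String, String> targetConfig, Tier tier) {
        if (!allowed(tier).contains(targetEnv)) {
            throw new IllegalArgumentException(
                    "environment '" + targetEnv + "' is not available on tier " + tier);
        }
        return new EnvDeployment(targetEnv, source.imageTag(), Map.copyOf(targetConfig));
    }
}
```

Since the image tag is carried over unchanged, what runs in prod is the same artifact that was tested in dev; only the configuration differs.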
Configuration Strategy
The application is configured entirely via environment variables and Spring Boot profiles:
# Detected at startup
cameleer.deployment.mode: docker | kubernetes # auto-detected
cameleer.deployment.docker.socket: /var/run/docker.sock
cameleer.deployment.k8s.namespace-template: tenant-{tenantId}
# Identity provider
cameleer.identity.issuer-uri: http://logto:3001/oidc
cameleer.identity.client-id: ${LOGTO_CLIENT_ID}
cameleer.identity.client-secret: ${LOGTO_CLIENT_SECRET}
# Ed25519 keys (externalized, not per-boot)
cameleer.jwt.private-key-path: /etc/cameleer/keys/ed25519.key
cameleer.jwt.public-key-path: /etc/cameleer/keys/ed25519.pub
# Database
spring.datasource.url: ${DATABASE_URL}
# ClickHouse
cameleer.clickhouse.url: ${CLICKHOUSE_URL}
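The configuration above says the deployment mode is auto-detected but does not say how. One possible approach, offered as an assumption rather than the project's actual logic: honor an explicit override, then look for `KUBERNETES_SERVICE_HOST` (which Kubernetes sets in every pod), then fall back to checking for the Docker socket. The `CAMELEER_DEPLOYMENT_MODE` variable name and the `DeploymentSettings` class are hypothetical. The sketch also shows how the `tenant-{tenantId}` namespace template might be expanded, with a check that the result is a valid DNS label.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Locale;
import java.util.Map;

enum DeploymentMode { DOCKER, KUBERNETES }

final class DeploymentSettings {
    // Detection order: explicit override, in-cluster marker, Docker socket.
    static DeploymentMode detect(Map<String, String> env, Path dockerSocket) {
        String explicit = env.get("CAMELEER_DEPLOYMENT_MODE"); // hypothetical override
        if (explicit != null && !explicit.isBlank()) {
            return DeploymentMode.valueOf(explicit.trim().toUpperCase(Locale.ROOT));
        }
        if (env.containsKey("KUBERNETES_SERVICE_HOST")) {
            return DeploymentMode.KUBERNETES;
        }
        if (Files.exists(dockerSocket)) {
            return DeploymentMode.DOCKER;
        }
        throw new IllegalStateException("cannot detect deployment mode; set it explicitly");
    }

    // Expands a template such as "tenant-{tenantId}" into a namespace name.
    static String namespaceFor(String template, String tenantId) {
        String namespace = template.replace("{tenantId}", tenantId);
        // Kubernetes namespaces must be DNS labels: lowercase alphanumerics and '-', max 63 chars.
        if (namespace.length() > 63 || !namespace.matches("[a-z0-9]([-a-z0-9]*[a-z0-9])?")) {
            throw new IllegalArgumentException("invalid namespace name: " + namespace);
        }
        return namespace;
    }
}
```

In the application this would be called as `DeploymentSettings.detect(System.getenv(), Path.of("/var/run/docker.sock"))`. Passing the environment and socket path in as parameters keeps the detection testable.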
Docker Compose Production Template
services:
  traefik:
    image: traefik:v3
    ports: ["80:80", "443:443"]
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock:ro
      - ./traefik.yml:/etc/traefik/traefik.yml
      - acme:/etc/traefik/acme
    labels:
      # Dashboard (optional, secured)
  cameleer-saas:
    image: gitea.siegeln.net/cameleer/cameleer-saas:${VERSION}
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock  # For runtime orchestration
      - ./keys:/etc/cameleer/keys:ro
    environment:
      - DATABASE_URL=jdbc:postgresql://postgres:5432/cameleer_saas
      - LOGTO_CLIENT_ID=${LOGTO_CLIENT_ID}
      - LOGTO_CLIENT_SECRET=${LOGTO_CLIENT_SECRET}
    labels:
      - traefik.enable=true
      - traefik.http.routers.api.rule=PathPrefix(`/api`)
  logto:
    image: svhd/logto:latest
    environment:
      - DB_URL=postgresql://postgres:5432/logto
    labels:
      - traefik.enable=true
      - traefik.http.routers.auth.rule=PathPrefix(`/auth`)
  cameleer-server:
    image: gitea.siegeln.net/cameleer/cameleer-server:${VERSION}
    environment:
      - CLICKHOUSE_URL=jdbc:clickhouse://clickhouse:8123/cameleer
    labels:
      - traefik.enable=true
      - traefik.http.routers.observe.rule=PathPrefix(`/observe`)
  postgres:
    image: postgres:16-alpine
    volumes: [pgdata:/var/lib/postgresql/data]
  clickhouse:
    image: clickhouse/clickhouse-server:latest
    volumes: [chdata:/var/lib/clickhouse]
volumes:
  pgdata:
  chdata:
  acme:
Docker vs K8s Feature Matrix
| Feature | Docker Compose | Kubernetes |
|---|---|---|
| Deploy Camel apps | Yes (Docker API) | Yes (K8s API) |
| Multiple environments | Yes (separate containers) | Yes (separate Deployments) |
| Agent injection | Yes | Yes |
| Observability (traces, topology) | Yes | Yes |
| Identity / SSO / Teams | Yes (Logto) | Yes (Logto) |
| Licensing | Yes | Yes |
| Auto-scaling | No | Yes (HPA) |
| Network isolation (multi-tenant) | Docker networks | NetworkPolicies |
| GitOps deployment | No (manual updates) | Yes (Flux CD) |
| Rolling updates | Manual restart | Native |
| Platform monitoring | Optional (customer adds Grafana) | Standard (Prometheus/Grafana/Loki) |
| Certificate management | Traefik ACME | cert-manager |
Revised Phase Roadmap
Phase 2: Tenants + Identity + Licensing
Goal: A customer can sign up, get a tenant, and access the platform via Traefik.
- Integrate Logto as identity provider
- Replace custom user-facing auth (login, registration, password management)
- Keep Ed25519 JWT for machine tokens (agent bootstrap, license signing)
- Configure Logto organizations to map to tenants
- Tenant entity + CRUD API
- License token generation (Ed25519 signed JWT: tier, features, limits, expiry)
- Traefik integration with ForwardAuth middleware
- Docker Compose production stack (6 containers)
- Externalize Ed25519 keys (mounted files, not per-boot)
Files to modify/create:
- src/main/java/net/siegeln/cameleer/saas/tenant/ (new package)
- src/main/java/net/siegeln/cameleer/saas/license/ (new package)
- src/main/java/net/siegeln/cameleer/saas/config/SecurityConfig.java (Logto OIDC integration)
- src/main/resources/db/migration/V005__create_tenants.sql
- src/main/resources/db/migration/V006__create_licenses.sql
- docker-compose.yml (expand to full production stack)
- traefik.yml (static config)
- src/main/resources/application.yml (Logto + Traefik config)
Phase 3: Runtime Orchestration + Environments
Goal: Customer can upload a Camel JAR, deploy it to dev/prod, see it running with agent attached.
- RuntimeOrchestrator interface
- DockerRuntimeOrchestrator implementation (docker-java)
- Customer JAR upload endpoint
- Image build pipeline (Dockerfile template + docker build)
- Logical environment model (dev/staging/prod per tenant, tier-dependent)
- Environment-specific config overlays
- App lifecycle API (deploy, start, stop, restart, logs, health)
Key dependencies: docker-java, Kaniko (for future K8s)
Phase 4: Observability Pipeline
Goal: Customer can see traces, metrics, and route topology for deployed apps.
- Connect cameleer-server to customer app containers
- ClickHouse tenant-scoped data partitioning
- Observability API proxy (tenant-aware routing to cameleer-server)
- Basic topology graph endpoint
- Agent ↔ server connectivity verification
Phase 5: K8s Operational Layer
Goal: Same product works on K8s with operational enhancements.
- KubernetesRuntimeOrchestrator implementation (fabric8)
- Kaniko-based image builds
- Flux CD integration for platform GitOps
- Namespace-per-tenant provisioning
- NetworkPolicies, ResourceQuotas
- Helm chart for K8s deployment
- Registry integration (Gitea registry / registry:2)
Phase 6: Billing
Goal: Customers can subscribe and pay.
- Stripe Checkout integration
- Subscription lifecycle (create, upgrade, downgrade, cancel)
- Tier enforcement (feature gating based on active subscription)
- Usage tracking in platform DB (prep for Lago integration later)
- Webhook handling for payment events
Phase 7: Security Hardening + Monitoring
Goal: Production-hardened platform.
- Prometheus/Grafana/Loki stack (optional Docker compose overlay, standard K8s)
- SOC 2 compliance review
- Rate limiting
- Container image signing (cosign)
- Supply chain security (SBOM, Trivy scanning)
- Audit log shipping to separate sink
Frontend (React Shell) — Parallel Track (Phase 2+)
- Can start as soon as Phase 2 API contracts are defined
- Uses @cameleer/design-system
- Screens: login, dashboard, app deployment, environment management, observability views, team management, billing
Verification Plan
Phase 2 Verification
- docker compose up starts all 6 containers
- Navigate to Logto admin, create a user
- User logs in via OIDC flow through Traefik
- API calls with JWT include X-Tenant-Id header
- License token can be generated and verified
- All existing tests still pass
Phase 3 Verification
- Upload a sample Camel JAR via API
- Platform builds container image
- Deploy to "dev" environment
- Container starts with cameleer agent attached
- App is reachable via Traefik routing
- Logs are accessible via API
- Deploy same image to "prod" with different config
Phase 4 Verification
- Running Camel app sends traces to cameleer-server
- Traces visible in ClickHouse with correct tenant_id
- Topology graph shows route structure
- Different tenant cannot see another tenant's data
Phase 5 Verification
- Helm install deploys full platform to k3s
- Tenant provisioning creates namespace + resources
- App deployment creates K8s Deployment + Service
- Kaniko builds image and pushes to registry
- NetworkPolicy blocks cross-tenant traffic
- Same API contracts work as Docker mode
End-to-End Smoke Test (Any Phase)
# Docker Compose
docker compose up -d
# Create tenant + user via API/Logto
# Upload sample Camel JAR
# Deploy to environment
# Verify agent connects to cameleer-server
# Verify traces in ClickHouse
# Verify observability API returns data