Phase 3: Runtime Orchestration + Environments
Date: 2026-04-04
Status: Draft
Depends on: Phase 2 (Tenants + Identity + Licensing)
Gitea issue: #26
Context
Phase 2 delivered multi-tenancy, identity (Logto OIDC), and license management. The platform can create tenants and issue licenses, but there is nothing to run yet. Phase 3 is the core product differentiator: customers upload a Camel JAR, the platform builds an immutable container image with the cameleer3 agent auto-injected, and deploys it to a logical environment. This is "managed Camel runtime" — similar to Coolify or MuleSoft CloudHub, but purpose-built for Apache Camel with deep observability.
Docker-first. The KubernetesRuntimeOrchestrator is deferred to Phase 5.
Single-node constraint: Because Phase 3 builds images locally via Docker socket (no registry push), the cameleer-saas control plane and the Docker daemon must reside on the same host. This is inherent to the single-tenant Docker Compose stack and is acceptable for that target. In K8s mode (Phase 5), images are built via Kaniko and pushed to a registry, removing this constraint.
Key Decisions
| Decision | Choice | Rationale |
|---|---|---|
| JAR delivery | Direct HTTP upload (multipart) | Simplest path. Git-based and image-ref options can be added later. |
| Agent JAR source | Bundled in cameleer-runtime-base image | Version-locked to platform release. Updated by rebuilding the platform image with the new agent version. No runtime network dependency. |
| Build speed | Pre-built base image + single-layer customer add | Customer image build is FROM base + COPY app.jar. ~1-3 seconds. |
| Deployment model | Async with polling | Image builds are inherently slow. Deploy returns immediately with deployment ID. Client polls for status. |
| Entity hierarchy | Environment → App → Deployment | User thinks "I'm in dev, deploy my app." Environment is the workspace context. |
| Environment provisioning | Hybrid auto + manual | Every tenant gets a default environment on creation. Additional environments created manually, tier limit enforced. |
| Cross-environment isolation | Logical (not network) | Docker single-tenant mode — customer owns the stack. Data separated by environmentId in cameleer3-server. Network isolation is a K8s Phase 5 concern. |
| Container networking | Shared cameleer bridge network | Customer containers join the existing network. Agent reaches cameleer3-server at http://cameleer3-server:8081. |
| Container naming | {tenant-slug}-{env-slug}-{app-slug} | Human-readable, unique, identifies tenant+environment+app at a glance. |
| Bootstrap tokens | Shared CAMELEER_AUTH_TOKEN from cameleer3-server config | Platform reads the existing token and injects it into customer containers. Environment separation via agent environmentId claim, not token. Per-environment tokens deferred to K8s Phase 5. |
| Health checking | Agent health endpoint (port 9464) | Guaranteed to exist, no user config needed. User-defined health endpoints deferred. |
| Inbound HTTP routing | Not in Phase 3 | Most Camel apps are consumers (queues, polls), not servers. Traefik routing for customer apps deferred to Phase 4/4.5. |
| Container logs | Captured via docker-java, written to ClickHouse | Unified log query surface from day 1. Same pattern future app logs will use. |
| Resource constraints | cgroups via docker-java mem_limit + cpu_shares | Protect the control plane from noisy neighbors. Tier-based defaults. Even in single-tenant Docker mode, a runaway Camel app shouldn't starve Traefik/Postgres/Logto. |
| Orchestrator metadata | JSONB field on deployment entity | Docker stores containerId. K8s (Phase 5) stores namespace, deploymentName, gitCommit. Same table, different orchestrator. |
Data Model
Environment Entity
CREATE TABLE environments (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    tenant_id UUID NOT NULL REFERENCES tenants(id) ON DELETE CASCADE,
    slug VARCHAR(100) NOT NULL,
    display_name VARCHAR(255) NOT NULL,
    bootstrap_token TEXT NOT NULL,
    status VARCHAR(20) NOT NULL DEFAULT 'ACTIVE',
    created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
    updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
    UNIQUE(tenant_id, slug)
);
CREATE INDEX idx_environments_tenant_id ON environments(tenant_id);
- slug: URL-safe, immutable, unique per tenant. The auto-created environment gets the slug "default".
- display_name: User-editable. The auto-created environment gets "Default".
- bootstrap_token: The CAMELEER_AUTH_TOKEN value used for customer containers in this environment. In Docker mode, all environments share the same value (read from platform config). In K8s mode (Phase 5), it can be unique per environment.
- status: ACTIVE or SUSPENDED.
App Entity
CREATE TABLE apps (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    environment_id UUID NOT NULL REFERENCES environments(id) ON DELETE CASCADE,
    slug VARCHAR(100) NOT NULL,
    display_name VARCHAR(255) NOT NULL,
    jar_storage_path VARCHAR(500),
    jar_checksum VARCHAR(64),
    jar_original_filename VARCHAR(255),
    jar_size_bytes BIGINT,
    current_deployment_id UUID,
    previous_deployment_id UUID,
    created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
    updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
    UNIQUE(environment_id, slug)
);
CREATE INDEX idx_apps_environment_id ON apps(environment_id);
- slug: URL-safe, immutable, unique per environment.
- jar_storage_path: Relative path to the uploaded JAR (e.g., tenants/{tenant-slug}/envs/{env-slug}/apps/{app-slug}/app.jar), resolved against the configured storage root (cameleer.runtime.jar-storage-path). Storing a relative path makes it easy to migrate the storage volume to a different mount point or cloud provider.
- jar_checksum: SHA-256 hex digest of the uploaded JAR.
- current_deployment_id: Points to the active deployment. Nullable (app created but never deployed).
- previous_deployment_id: Points to the last known good deployment. When a new deploy succeeds, current becomes the new one and previous becomes the old current. When a deploy fails, current points to the failed deployment while previous still points to the last good version, which enables a rollback button.
Deployment Entity
CREATE TABLE deployments (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    app_id UUID NOT NULL REFERENCES apps(id) ON DELETE CASCADE,
    version INTEGER NOT NULL,
    image_ref VARCHAR(500) NOT NULL,
    desired_status VARCHAR(20) NOT NULL DEFAULT 'RUNNING',
    observed_status VARCHAR(20) NOT NULL DEFAULT 'BUILDING',
    orchestrator_metadata JSONB DEFAULT '{}',
    error_message TEXT,
    deployed_at TIMESTAMPTZ,
    stopped_at TIMESTAMPTZ,
    created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
    UNIQUE(app_id, version)
);
CREATE INDEX idx_deployments_app_id ON deployments(app_id);
- version: Sequential per app (1, 2, 3...). Incremented on each deploy.
- image_ref: Docker image reference, e.g., cameleer-runtime-{tenant}-{app}:v3.
- desired_status: What the user wants (RUNNING or STOPPED).
- observed_status: What the platform sees (BUILDING, STARTING, RUNNING, FAILED, or STOPPED).
- orchestrator_metadata: In Docker mode, {"containerId": "abc123"}. In K8s mode (Phase 5), {"namespace": "...", "deploymentName": "...", "gitCommit": "..."}.
- error_message: Populated when observed_status is FAILED. Build error, startup crash, etc.
Component Architecture
RuntimeOrchestrator Interface
public interface RuntimeOrchestrator {
    /** Builds the customer image and returns its image reference. */
    String buildImage(BuildImageRequest request);

    /** Creates and starts the container; returns the container ID. */
    String startContainer(StartContainerRequest request);

    void stopContainer(String containerId);
    void removeContainer(String containerId);
    ContainerStatus getContainerStatus(String containerId);
    void streamLogs(String containerId, LogConsumer consumer);
}
- Single interface, implemented by DockerRuntimeOrchestrator (Phase 3) and KubernetesRuntimeOrchestrator (Phase 5).
- Injected via Spring @Profile or @ConditionalOnProperty.
- Request objects carry all context (image name, env vars, network, labels, etc.).
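The interface names BuildImageRequest and StartContainerRequest without defining them. A minimal sketch of what such request objects could carry is shown below; every field name here is an assumption for illustration, not part of the spec.

```java
import java.nio.file.Path;
import java.util.Map;

// Hypothetical request shapes; field names are illustrative and not taken from the spec.
record BuildImageRequest(String baseImage, Path jarPath, String imageTag) {}

record StartContainerRequest(
        String imageRef,
        String containerName,
        String network,
        Map<String, String> envVars,
        long memoryLimitBytes,
        int cpuShares,
        int healthCheckPort) {

    StartContainerRequest {
        // Defensive copy so the request stays immutable after construction.
        envVars = Map.copyOf(envVars);
    }
}
```

Keeping the requests as immutable value objects means the same request can be handed to either orchestrator implementation without the caller knowing which one is active.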
DockerRuntimeOrchestrator
Uses com.github.docker-java:docker-java library. Connects via Docker socket (/var/run/docker.sock).
buildImage:
- Creates a temporary build context directory
- Writes a Dockerfile:
  FROM cameleer-runtime-base:{platform-version}
  COPY app.jar /app/app.jar
- Copies the customer JAR into the build context as app.jar
- Calls docker build via docker-java
- Tags the image as cameleer-runtime-{tenant-slug}-{app-slug}:v{version}
- Returns the image reference
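The build-context preparation can be sketched with the standard library alone. This is a rough illustration, not the actual implementation: the class and method names are hypothetical, and the docker-java build call itself is left out.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper; names are illustrative only.
final class BuildContextSketch {

    /** Image tag in the format described above. */
    static String imageTag(String tenantSlug, String appSlug, int version) {
        return "cameleer-runtime-" + tenantSlug + "-" + appSlug + ":v" + version;
    }

    /** Two-line Dockerfile: platform base image plus the customer JAR as a single layer. */
    static String dockerfile(String baseImage) {
        return "FROM " + baseImage + "\nCOPY app.jar /app/app.jar\n";
    }

    /** Creates a temporary build context holding the Dockerfile and the JAR copied as app.jar. */
    static Path prepare(Path customerJar, String baseImage) {
        try {
            Path context = Files.createTempDirectory("cameleer-build-");
            Files.writeString(context.resolve("Dockerfile"), dockerfile(baseImage));
            Files.copy(customerJar, context.resolve("app.jar"), StandardCopyOption.REPLACE_EXISTING);
            return context;
        } catch (IOException e) {
            throw new UncheckedIOException("Could not prepare build context", e);
        }
    }
}
```

The returned directory would then be passed to the docker-java build command and deleted once the build finishes.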
startContainer:
- Creates container with:
  - Image: the built image reference
  - Name: {tenant-slug}-{env-slug}-{app-slug}
  - Network: cameleer (the platform bridge network)
  - Environment variables:
    - CAMELEER_AUTH_TOKEN={bootstrap-token}
    - CAMELEER_EXPORT_TYPE=HTTP
    - CAMELEER_EXPORT_ENDPOINT=http://cameleer3-server:8081
    - CAMELEER_APPLICATION_ID={app-slug}
    - CAMELEER_ENVIRONMENT_ID={env-slug}
    - CAMELEER_DISPLAY_NAME={tenant-slug}-{env-slug}-{app-slug}
  - Resource constraints (cgroups):
    - memory / memorySwap: hard memory limit per container
    - cpuShares: relative CPU weight (default 512)
    - Defaults configurable via cameleer.runtime.container-memory-limit (default 512m) and cameleer.runtime.container-cpu-shares (default 512)
    - Protects the control plane (Traefik, Postgres, Logto, cameleer-saas) from noisy neighbor Camel apps
  - Health check: HTTP GET to agent health port 9464
- Starts container
- Returns container ID
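As an illustration of the inputs listed above, the sketch below assembles the container name, the agent environment variables, and a byte value for the memory limit. The helper names are hypothetical, and the memory-limit parser only assumes the k/m/g suffixes implied by the 512m default.

```java
import java.util.List;

// Hypothetical helper; names are illustrative only.
final class ContainerSpecSketch {

    static String containerName(String tenantSlug, String envSlug, String appSlug) {
        return tenantSlug + "-" + envSlug + "-" + appSlug;
    }

    /** Environment variables in KEY=value form, one entry per variable listed in the spec. */
    static List<String> agentEnv(String bootstrapToken, String tenantSlug, String envSlug, String appSlug) {
        return List.of(
                "CAMELEER_AUTH_TOKEN=" + bootstrapToken,
                "CAMELEER_EXPORT_TYPE=HTTP",
                "CAMELEER_EXPORT_ENDPOINT=http://cameleer3-server:8081",
                "CAMELEER_APPLICATION_ID=" + appSlug,
                "CAMELEER_ENVIRONMENT_ID=" + envSlug,
                "CAMELEER_DISPLAY_NAME=" + containerName(tenantSlug, envSlug, appSlug));
    }

    /** Parses limits such as "512m" or "1g" into bytes; a bare number is taken as bytes. */
    static long parseMemoryLimit(String value) {
        String v = value.trim().toLowerCase();
        char unit = v.charAt(v.length() - 1);
        if (Character.isDigit(unit)) {
            return Long.parseLong(v);
        }
        long n = Long.parseLong(v.substring(0, v.length() - 1));
        return switch (unit) {
            case 'k' -> n * 1024L;
            case 'm' -> n * 1024L * 1024L;
            case 'g' -> n * 1024L * 1024L * 1024L;
            default -> throw new IllegalArgumentException("Unknown memory unit in: " + value);
        };
    }
}
```

Setting memorySwap to the same byte value as memory is the usual way to make the limit hard, since the container then gets no additional swap.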
streamLogs:
- Attaches to container stdout/stderr via docker-java LogContainerCmd
- Passes log lines to a LogConsumer callback (for ClickHouse ingestion)
cameleer-runtime-base Image
A pre-built Docker image containing everything except the customer JAR:
FROM eclipse-temurin:21-jre-alpine
WORKDIR /app
COPY cameleer3-agent-{version}-shaded.jar /app/agent.jar
ENTRYPOINT exec java \
    -Dcameleer.export.type=${CAMELEER_EXPORT_TYPE:-HTTP} \
    -Dcameleer.export.endpoint=${CAMELEER_EXPORT_ENDPOINT} \
    -Dcameleer.agent.name=${HOSTNAME} \
    -Dcameleer.agent.application=${CAMELEER_APPLICATION_ID:-default} \
    -Dcameleer.agent.environment=${CAMELEER_ENVIRONMENT_ID:-default} \
    -Dcameleer.routeControl.enabled=${CAMELEER_ROUTE_CONTROL_ENABLED:-false} \
    -Dcameleer.replay.enabled=${CAMELEER_REPLAY_ENABLED:-false} \
    -Dcameleer.health.enabled=true \
    -Dcameleer.health.port=9464 \
    -javaagent:/app/agent.jar \
    -jar /app/app.jar
- Built as part of the CI pipeline for cameleer-saas.
- Published to Gitea registry: gitea.siegeln.net/cameleer/cameleer-runtime-base:{version}.
- Version tracks the platform version + agent version (e.g., 0.2.0 includes agent 1.0-SNAPSHOT).
- Updating the agent JAR = rebuild this image with the new agent version → rebuild cameleer-saas image → all new deployments use the new agent.
JAR Upload
- POST /api/environments/{eid}/apps with multipart file
- Validation:
  - File extension: .jar
  - Max size: 200 MB (configurable via cameleer.runtime.max-jar-size)
  - SHA-256 checksum computed and stored
- Storage: relative path tenants/{tenant-slug}/envs/{env-slug}/apps/{app-slug}/app.jar under the configured storage root (cameleer.runtime.jar-storage-path, default /data/jars)
  - Docker volume jardata mounted into cameleer-saas container
  - Database stores the relative path only, decoupled from the mount point
- JAR is overwritten on re-upload (new deploy uses new JAR)
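The validation and storage rules above could look roughly like this. It is a sketch under stated assumptions: the class and method names are hypothetical, and a real upload handler would stream the file rather than hash an in-memory byte array.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HexFormat;

// Hypothetical helper; names are illustrative only.
final class JarUploadSketch {
    /** 200 MB, matching the cameleer.runtime.max-jar-size default. */
    static final long DEFAULT_MAX_JAR_SIZE = 200L * 1024 * 1024;

    /** Rejects uploads that are not .jar files or exceed the configured size. */
    static void validate(String originalFilename, long sizeBytes, long maxSizeBytes) {
        if (originalFilename == null || !originalFilename.toLowerCase().endsWith(".jar")) {
            throw new IllegalArgumentException("Only .jar uploads are accepted");
        }
        if (sizeBytes > maxSizeBytes) {
            throw new IllegalArgumentException("JAR exceeds the configured maximum of " + maxSizeBytes + " bytes");
        }
    }

    /** Lowercase hex SHA-256 digest; always 64 characters, which fits jar_checksum VARCHAR(64). */
    static String sha256Hex(byte[] content) {
        try {
            return HexFormat.of().formatHex(MessageDigest.getInstance("SHA-256").digest(content));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 must be available on every Java platform", e);
        }
    }

    /** Relative storage path; the storage root is prepended only when the file is read or written. */
    static String storagePath(String tenantSlug, String envSlug, String appSlug) {
        return "tenants/" + tenantSlug + "/envs/" + envSlug + "/apps/" + appSlug + "/app.jar";
    }
}
```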
Async Deployment Pipeline
- API receives deploy request → creates Deployment entity with observed_status=BUILDING → returns deployment ID (HTTP 202 Accepted)
- Background thread (Spring @Async with a bounded thread pool):
  a. Calls orchestrator.buildImage(...) → updates observed_status=STARTING
  b. Calls orchestrator.startContainer(...) → observed_status stays STARTING
  c. Polls agent health endpoint (port 9464) with timeout → updates to RUNNING or FAILED
  d. On any failure → updates observed_status=FAILED, error_message=...
- Client polls GET /api/apps/{aid}/deployments/{did} for status updates
- On success: set previous_deployment_id = old current_deployment_id, then current_deployment_id = new deployment. Stop and remove the old container.
- On failure: current_deployment_id is set to the failed deployment (so status is visible), previous_deployment_id still points to the last known good version. Enables rollback.
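The pointer handling in the last two steps can be sketched as a small state holder. One detail is an assumption of this sketch rather than something the spec states: the old current deployment is promoted to previous only if it had itself reached RUNNING, so that a run of failed deploys never replaces the rollback target with a failed deployment. The class name is hypothetical.

```java
import java.util.UUID;

// Hypothetical in-memory model of apps.current_deployment_id / apps.previous_deployment_id.
final class DeploymentPointers {
    private UUID current;          // active (or most recently attempted) deployment
    private boolean currentIsGood; // whether `current` reached RUNNING
    private UUID previous;         // rollback target: last known good deployment

    /** Records the outcome of a finished deploy attempt. */
    void record(UUID deploymentId, boolean succeeded) {
        // Assumption: only a deployment that was itself healthy becomes the rollback target.
        if (current != null && currentIsGood) {
            previous = current;
        }
        current = deploymentId;
        currentIsGood = succeeded;
    }

    UUID current() { return current; }

    UUID previous() { return previous; }
}
```

For example, after a successful v1, a failed v2, and a successful v3, current points to v3 while previous still points to v1.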
Container Logs → ClickHouse
- When a container starts, platform attaches a log consumer via orchestrator.streamLogs()
- Log consumer batches lines and writes to ClickHouse table:
CREATE TABLE IF NOT EXISTS container_logs (
    tenant_id UUID,
    environment_id UUID,
    app_id UUID,
    deployment_id UUID,
    timestamp DateTime64(3),
    stream String, -- 'stdout' or 'stderr'
    message String
) ENGINE = MergeTree()
ORDER BY (tenant_id, environment_id, app_id, timestamp);
- Logs retrieved via GET /api/apps/{aid}/logs?since=...&limit=... which queries ClickHouse
- ClickHouse TTL can enforce retention based on license retention_days limit (future enhancement)
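The batching behaviour of the log consumer might be sketched as follows. The spec names LogConsumer without defining it, so the callback signature, the batch size parameter, and the sink abstraction (standing in for a ClickHouse batch INSERT) are all assumptions.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical callback shape; the spec names LogConsumer but does not define it. */
interface LogConsumer {
    void accept(String stream, String message);
}

/** Buffers log lines and hands them to a sink in batches, e.g. one INSERT per batch. */
final class BatchingLogConsumer implements LogConsumer {
    record LogLine(long timestampMillis, String stream, String message) {}

    private final int batchSize;
    private final Consumer<List<LogLine>> sink;
    private final List<LogLine> buffer = new ArrayList<>();

    BatchingLogConsumer(int batchSize, Consumer<List<LogLine>> sink) {
        this.batchSize = batchSize;
        this.sink = sink;
    }

    @Override
    public synchronized void accept(String stream, String message) {
        buffer.add(new LogLine(System.currentTimeMillis(), stream, message));
        if (buffer.size() >= batchSize) {
            flush();
        }
    }

    /** Flushes whatever is buffered; intended to be called on a timer and when the container stops. */
    public synchronized void flush() {
        if (buffer.isEmpty()) {
            return;
        }
        sink.accept(List.copyOf(buffer));
        buffer.clear();
    }
}
```

Batching matters here because MergeTree tables handle a few large inserts much better than many single-row inserts.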
Bootstrap Token Handling
In Docker single-tenant mode, all environments share the single cameleer3-server instance and its single CAMELEER_AUTH_TOKEN. The platform reads this token from its own configuration (cameleer.runtime.bootstrap-token / CAMELEER_AUTH_TOKEN env var) and injects it into every customer container. No changes to cameleer3-server are needed.
Environment-level data separation happens at the agent registration level — the agent sends its environmentId claim when it registers, and cameleer3-server uses that to scope all data. The bootstrap token is the same across environments in a Docker stack.
The bootstrap_token column on the environment entity stores the token value used for that environment's containers. In Docker mode this is the same shared value for all environments. In K8s mode (Phase 5), each environment could have its own cameleer3-server instance with a unique token, enabling true per-environment token isolation.
API Surface
Environment Endpoints
POST /api/tenants/{tenantId}/environments
Body: { "slug": "dev", "displayName": "Development" }
Returns: 201 Created + EnvironmentResponse
Enforces: tier-based max_environments limit from license
GET /api/tenants/{tenantId}/environments
Returns: 200 + List<EnvironmentResponse>
GET /api/tenants/{tenantId}/environments/{environmentId}
Returns: 200 + EnvironmentResponse
PATCH /api/tenants/{tenantId}/environments/{environmentId}
Body: { "displayName": "New Name" }
Returns: 200 + EnvironmentResponse
DELETE /api/tenants/{tenantId}/environments/{environmentId}
Returns: 204 No Content
Precondition: no running apps in environment
Restriction: cannot delete the auto-created "default" environment
App Endpoints
POST /api/environments/{environmentId}/apps
Multipart: file (JAR) + metadata { "slug": "order-service", "displayName": "Order Service" }
Returns: 201 Created + AppResponse
Validates: file extension, size, checksum
GET /api/environments/{environmentId}/apps
Returns: 200 + List<AppResponse>
GET /api/environments/{environmentId}/apps/{appId}
Returns: 200 + AppResponse (includes current deployment status)
PUT /api/environments/{environmentId}/apps/{appId}/jar
Multipart: file (JAR)
Returns: 200 + AppResponse
Purpose: re-upload JAR without creating new app
DELETE /api/environments/{environmentId}/apps/{appId}
Returns: 204 No Content
Side effect: stops running container, removes image
Deployment Endpoints
POST /api/apps/{appId}/deploy
Body: {} (empty — uses current JAR)
Returns: 202 Accepted + DeploymentResponse (with deployment ID, status=BUILDING)
GET /api/apps/{appId}/deployments
Returns: 200 + List<DeploymentResponse> (ordered by version desc)
GET /api/apps/{appId}/deployments/{deploymentId}
Returns: 200 + DeploymentResponse (poll this for status updates)
POST /api/apps/{appId}/stop
Returns: 200 + DeploymentResponse (desired_status=STOPPED)
POST /api/apps/{appId}/restart
Returns: 202 Accepted + DeploymentResponse (stops + redeploys same image)
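On the client side, polling the deployment endpoint amounts to repeating a status fetch until a terminal state or a timeout. The sketch below keeps the HTTP call abstract behind a Supplier, so the helper name, the interval, and the choice of terminal states are assumptions rather than part of the API.

```java
import java.time.Duration;
import java.util.Set;
import java.util.function.Supplier;

// Hypothetical client-side helper; names are illustrative only.
final class DeploymentPollerSketch {
    /** Observed statuses after which polling can stop. */
    static final Set<String> TERMINAL = Set.of("RUNNING", "FAILED", "STOPPED");

    /** Calls fetchStatus until it returns a terminal status or the timeout elapses. */
    static String awaitTerminal(Supplier<String> fetchStatus, Duration interval, Duration timeout) {
        long deadline = System.nanoTime() + timeout.toNanos();
        while (true) {
            String status = fetchStatus.get();
            if (TERMINAL.contains(status)) {
                return status;
            }
            if (System.nanoTime() >= deadline) {
                throw new IllegalStateException("Deployment still " + status + " after " + timeout);
            }
            try {
                Thread.sleep(interval.toMillis());
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("Interrupted while polling deployment status", e);
            }
        }
    }
}
```

In practice fetchStatus would wrap GET /api/apps/{appId}/deployments/{deploymentId} and extract the observed status from the response.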
Log Endpoints
GET /api/apps/{appId}/logs
Query: since (ISO timestamp), until (ISO timestamp), limit (default 500), stream (stdout/stderr/both)
Returns: 200 + List<LogEntry>
Source: ClickHouse container_logs table
Tier Enforcement
| Tier | max_environments | max_agents (apps) |
|---|---|---|
| LOW | 1 | 3 |
| MID | 2 | 10 |
| HIGH | unlimited (-1) | 50 |
| BUSINESS | unlimited (-1) | unlimited (-1) |
- max_environments enforced on POST /api/tenants/{tid}/environments. The auto-created default environment counts toward the limit.
- max_agents enforced on POST /api/environments/{eid}/apps. Count is total apps across all environments in the tenant.
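A limit check that treats -1 as unlimited could be sketched like this. The enum hard-codes the table values purely for illustration; in the platform the limits come from the tenant's license, and the type and method names are hypothetical.

```java
// Hypothetical enum mirroring the tier table; real limits are read from the license.
enum Tier {
    LOW(1, 3), MID(2, 10), HIGH(-1, 50), BUSINESS(-1, -1);

    static final int UNLIMITED = -1;

    final int maxEnvironments;
    final int maxAgents;

    Tier(int maxEnvironments, int maxAgents) {
        this.maxEnvironments = maxEnvironments;
        this.maxAgents = maxAgents;
    }

    /** existingEnvironments includes the auto-created default environment. */
    boolean canAddEnvironment(long existingEnvironments) {
        return withinLimit(maxEnvironments, existingEnvironments);
    }

    /** existingApps is the total across all environments in the tenant. */
    boolean canAddApp(long existingApps) {
        return withinLimit(maxAgents, existingApps);
    }

    private static boolean withinLimit(int limit, long existing) {
        return limit == UNLIMITED || existing < limit;
    }
}
```

A LOW tenant therefore cannot add a second environment, because its default environment already uses the single slot.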
Docker Compose Changes
The cameleer-saas service needs:
- Docker socket mount: /var/run/docker.sock:/var/run/docker.sock (already present in docker-compose.yml)
- JAR storage volume: jardata:/data/jars
- cameleer-runtime-base image must be available (pre-pulled or built locally)
The cameleer3-server CAMELEER_AUTH_TOKEN is read by cameleer-saas from shared environment config and injected into customer containers.
New volume in docker-compose.yml:
volumes:
  jardata:
Dependencies
New Maven Dependencies
<!-- Docker Java client -->
<dependency>
    <groupId>com.github.docker-java</groupId>
    <artifactId>docker-java-core</artifactId>
    <version>3.4.1</version>
</dependency>
<dependency>
    <groupId>com.github.docker-java</groupId>
    <artifactId>docker-java-transport-httpclient5</artifactId>
    <version>3.4.1</version>
</dependency>
<!-- ClickHouse JDBC -->
<dependency>
    <groupId>com.clickhouse</groupId>
    <artifactId>clickhouse-jdbc</artifactId>
    <version>0.7.1</version>
    <classifier>all</classifier>
</dependency>
New Configuration Properties
cameleer:
  runtime:
    max-jar-size: 209715200 # 200 MB
    jar-storage-path: /data/jars
    base-image: cameleer-runtime-base:latest
    docker-network: cameleer
    agent-health-port: 9464
    health-check-timeout: 60 # seconds to wait for healthy status
    deployment-thread-pool-size: 4
    container-memory-limit: 512m # per customer container
    container-cpu-shares: 512 # relative weight (default Docker is 1024)
  clickhouse:
    url: jdbc:clickhouse://clickhouse:8123/cameleer
Verification Plan
- Upload a sample Camel JAR via POST /api/environments/{eid}/apps
- Deploy via POST /api/apps/{aid}/deploy: returns 202 with deployment ID
- Poll GET /api/apps/{aid}/deployments/{did}: status transitions BUILDING → STARTING → RUNNING
- Container visible in docker ps as {tenant}-{env}-{app}
- Container is on the cameleer network
- cameleer3 agent registers with cameleer3-server (visible in server logs)
- Agent health endpoint responds on port 9464
- Container logs appear in ClickHouse container_logs table
- GET /api/apps/{aid}/logs returns log entries
- POST /api/apps/{aid}/stop stops the container, status becomes STOPPED
- POST /api/apps/{aid}/restart restarts with same image
- Re-upload JAR + redeploy creates deployment v2, stops v1
- Tier limits enforced: LOW tenant cannot create more than 1 environment or 3 apps
- Default environment auto-created on tenant provisioning