cameleer-server/docs/SERVER-CAPABILITIES.md
hsiegeln 4496be08bd
docs: document SSO auto-redirect, consent handling, and auto-signup
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 01:45:45 +02:00

Cameleer3 Server — Capabilities Reference

Standalone reference for systems integrating with or managing Cameleer3 Server instances. Generated 2026-04-04. Source of truth: the codebase and OpenAPI spec at /api/v1/api-docs.

What It Does

Cameleer3 Server is an observability platform for Apache Camel applications. It receives execution traces, metrics, logs, and route diagrams from instrumented Camel agents, stores them in ClickHouse, and serves a web UI for searching, visualizing, and controlling routes.

Core capabilities:

  • Real-time execution tracing with processor-level detail
  • Full-text search across executions, logs, and attributes
  • Route topology diagrams with live execution overlays
  • Application configuration push via SSE
  • Route control (start/stop/suspend) and exchange replay
  • Agent lifecycle management with auto-heal on server restart
  • RBAC with local users, groups, roles, and OIDC federation
  • Multi-tenant isolation (one tenant per server instance)

Multi-Tenancy Model

Each server instance serves exactly one tenant. Multiple tenants share infrastructure but are isolated at the data layer.

| Concern | Isolation |
|---|---|
| PostgreSQL | Schema-per-tenant (?currentSchema=tenant_{id}) |
| ClickHouse | Shared DB, tenant_id column on all tables, partitioned by (tenant_id, toYYYYMM(timestamp)) |
| Configuration | CAMELEER_TENANT_ID env var (default: "default") |
| Agents | Each agent belongs to one tenant, one environment |
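As a rough illustration of the schema-per-tenant rule, the sketch below derives the PostgreSQL schema name and appends it to a JDBC URL. The function names `tenant_schema` and `jdbc_url` are hypothetical; the only behaviour taken from this document is the `tenant_{id}` naming, the "default" tenant fallback, and the CAMELEER_DB_SCHEMA override.

```python
import os


def tenant_schema(env=None):
    """Resolve the PostgreSQL schema name for this server instance."""
    env = os.environ if env is None else env
    tenant_id = env.get("CAMELEER_TENANT_ID", "default")
    # CAMELEER_DB_SCHEMA, when set, replaces the tenant-derived name.
    return env.get("CAMELEER_DB_SCHEMA") or f"tenant_{tenant_id}"


def jdbc_url(base_url, env=None):
    """Append ?currentSchema=... to a JDBC URL (hypothetical helper)."""
    separator = "&" if "?" in base_url else "?"
    return f"{base_url}{separator}currentSchema={tenant_schema(env)}"


if __name__ == "__main__":
    print(jdbc_url("jdbc:postgresql://localhost:5432/cameleer3",
                   {"CAMELEER_TENANT_ID": "acme"}))
```

Passing the environment as a dictionary keeps the helper easy to exercise without touching real process variables.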

Environments (dev/staging/prod) are first-class within a tenant. Agents send environmentId at registration and in every heartbeat. The UI filters by environment. JWTs carry an env claim so that an agent's environment survives a server restart.


Agent Protocol

Lifecycle

Register (bootstrap token) → Receive JWT + SSE URL
    ↓
Connect SSE ← Receive commands (config-update, deep-trace, replay, route-control)
    ↓
Heartbeat (every 30s) → Send capabilities, environmentId, routeStates
    ↓
Deregister (graceful shutdown)

State Machine

LIVE ──(no heartbeat for 90s)──→ STALE ──(300s more)──→ DEAD
  ↑                                 │
  └────(heartbeat arrives)──────────┘

Thresholds are configurable via agent-registry.* properties.
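The state machine can be read as a pure function of the time since the last heartbeat. The sketch below is an interpretation, not the server's code: it assumes the 300s DEAD threshold is measured from the moment the agent became STALE (so 390s after the last heartbeat with default settings) and that both boundaries are inclusive. It does not model whether a DEAD agent can return to LIVE.

```python
STALE_THRESHOLD_MS = 90_000   # default for the stale threshold
DEAD_THRESHOLD_MS = 300_000   # default for the dead threshold


def agent_state(ms_since_last_heartbeat,
                stale_ms=STALE_THRESHOLD_MS,
                dead_ms=DEAD_THRESHOLD_MS):
    """Classify an agent by how long ago its last heartbeat arrived."""
    if ms_since_last_heartbeat < stale_ms:
        return "LIVE"
    # Assumption: the DEAD timer starts when the agent turns STALE.
    if ms_since_last_heartbeat < stale_ms + dead_ms:
        return "STALE"
    return "DEAD"


if __name__ == "__main__":
    for elapsed in (0, 90_000, 390_000):
        print(elapsed, agent_state(elapsed))
```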

Registration

POST /api/v1/agents/register — requires bootstrap token in Authorization: Bearer header.

Request:

{
  "instanceId": "agent-abc-123",
  "displayName": "Order Service #1",
  "applicationId": "order-service",
  "environmentId": "production",
  "version": "3.2.1",
  "routeIds": ["processOrder", "handlePayment"],
  "capabilities": { "replay": true, "routeControl": true }
}

Response:

{
  "instanceId": "agent-abc-123",
  "eventStreamUrl": "/api/v1/agents/agent-abc-123/events",
  "heartbeatIntervalMs": 30000,
  "signingPublicKeyBase64": "<ed25519-public-key>",
  "accessToken": "<jwt>",
  "refreshToken": "<jwt>"
}
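For integrators writing their own agent client, a registration call might be assembled as follows. This is a minimal sketch using only the Python standard library; `build_register_request` is a hypothetical helper, and the base URL assumes the default port 8081 on localhost. It builds the request without sending it.

```python
import json
import urllib.request


def build_register_request(base_url, bootstrap_token, payload):
    """Build (but do not send) the POST /api/v1/agents/register request."""
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/v1/agents/register",
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={
            # The bootstrap token goes in a standard Bearer header.
            "Authorization": f"Bearer {bootstrap_token}",
            "Content-Type": "application/json",
        },
    )


if __name__ == "__main__":
    request = build_register_request(
        "http://localhost:8081",
        "bootstrap-secret",
        {"instanceId": "agent-abc-123", "applicationId": "order-service",
         "environmentId": "production"},
    )
    print(request.get_method(), request.full_url)
```

Sending it with `urllib.request.urlopen(request)` would return the JSON response shown above, from which the client would keep accessToken, refreshToken, and eventStreamUrl.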

Heartbeat

POST /api/v1/agents/{id}/heartbeat — JWT auth.

{
  "capabilities": { "replay": true, "routeControl": true },
  "environmentId": "production",
  "routeStates": { "processOrder": "Started", "handlePayment": "Suspended" }
}

The registry auto-heals after a server restart: if a heartbeat arrives from an agent that is not in the registry, the server re-registers the agent from its JWT claims and the heartbeat body. Environment priority: heartbeat environmentId > JWT env claim > "default".
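The environment priority can be expressed in a few lines. The sketch below uses a hypothetical `resolve_environment` function and treats a missing or empty value as absent, which is an assumption.

```python
def resolve_environment(heartbeat_body, jwt_claims):
    """Pick the environment: heartbeat body, then JWT claim, then "default"."""
    return (
        heartbeat_body.get("environmentId")
        or jwt_claims.get("env")
        or "default"
    )


if __name__ == "__main__":
    print(resolve_environment({"environmentId": "production"}, {"env": "staging"}))
    print(resolve_environment({}, {"env": "staging"}))
    print(resolve_environment({}, {}))
```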

SSE Event Stream

GET /api/v1/agents/{id}/events — long-lived SSE connection. Keepalive ping every 15s.

Event types pushed to agents: config-update, deep-trace, replay, set-traced-processors, test-expression, route-control.
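An agent-side consumer needs to split the stream into typed events. The following is a minimal parser for the standard text/event-stream framing. It assumes the keepalive ping arrives as an SSE comment line (a line starting with ":"), and the JSON payload in the usage example is invented for illustration; neither detail is confirmed by this document.

```python
def parse_sse(lines):
    """Yield (event_type, data) tuples from an iterable of SSE lines."""
    event_type, data = "message", []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if data:
                yield event_type, "\n".join(data)
            event_type, data = "message", []
        elif line.startswith(":"):
            continue  # comment line, assumed here to be the keepalive ping
        else:
            field, _, value = line.partition(":")
            if value.startswith(" "):
                value = value[1:]
            if field == "event":
                event_type = value
            elif field == "data":
                data.append(value)


if __name__ == "__main__":
    stream = [": ping", "", "event: config-update",
              'data: {"samplingRate": 0.5}', ""]
    for kind, payload in parse_sse(stream):
        print(kind, payload)
```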

Token Refresh

POST /api/v1/agents/{id}/refresh — public endpoint, validates refresh token.

{ "refreshToken": "<refresh-jwt>" }

Returns new accessToken + refreshToken. Preserves roles, application, and environment from the original token.


Data Ingestion

All ingestion endpoints require JWT with AGENT role.

| Endpoint | Data | Notes |
|---|---|---|
| POST /api/v1/data/executions | Execution chunks (route + processor traces) | Buffered, flushed periodically |
| POST /api/v1/data/diagrams | Route graph definitions | Single or array |
| POST /api/v1/data/events | Agent lifecycle events | Triggers registry state transitions |
| POST /api/v1/data/logs | Application log batches | Buffered, 503 if buffer full |
| POST /api/v1/data/metrics | Metrics snapshots | Buffered, 503 if buffer full |

Command System

Commands are delivered to agents via SSE. Three dispatch modes:

| Mode | Endpoint | Behavior |
|---|---|---|
| Single agent | POST /api/v1/agents/{id}/commands | Async (202), DELIVERED or PENDING |
| Group (application) | POST /api/v1/agents/groups/{group}/commands | Sync wait (10s), returns per-agent results |
| Broadcast (all LIVE) | POST /api/v1/agents/commands | Fire-and-forget (202) |

Command types: config-update, deep-trace, replay, set-traced-processors, test-expression, route-control

Replay has a dedicated sync endpoint: POST /api/v1/agents/{id}/replay (30s timeout, returns result or 504).

Acknowledgment: POST /api/v1/agents/{id}/commands/{commandId}/ack — agent confirms receipt with status/message/data.


Query & Analytics API

All query endpoints require JWT with VIEWER role or higher.

| Endpoint | Description |
|---|---|
| GET /api/v1/search/executions | Search by status, time, text, route, app, environment |
| POST /api/v1/search/executions | Advanced search with full filter object |
| GET /api/v1/executions/{id} | Execution detail with processor tree |
| GET /api/v1/executions/{id}/processors/by-id/{pid}/snapshot | Exchange data at processor |

Statistics & Analytics

| Endpoint | Description |
|---|---|
| GET /api/v1/search/stats | Aggregated stats (P99, error rate, SLA compliance) |
| GET /api/v1/search/stats/timeseries | Bucketed time series |
| GET /api/v1/search/stats/timeseries/by-app | Time series grouped by application |
| GET /api/v1/search/stats/timeseries/by-route | Time series grouped by route |
| GET /api/v1/search/stats/punchcard | Transaction heatmap (weekday x hour) |
| GET /api/v1/search/errors/top | Top N errors with velocity trends |
| GET /api/v1/search/attributes/keys | Distinct attribute key names |

Route Catalog & Metrics

| Endpoint | Description |
|---|---|
| GET /api/v1/routes/catalog | Applications with routes, agents, health |
| GET /api/v1/routes/metrics | Per-route performance (TPS, P99, error rate) |
| GET /api/v1/routes/metrics/processors | Per-processor metrics for a route |

Logs

| Endpoint | Description |
|---|---|
| GET /api/v1/logs | Cursor-based log search with level aggregation |

Diagrams

| Endpoint | Description |
|---|---|
| GET /api/v1/diagrams | Find diagram by application + routeId |
| GET /api/v1/diagrams/{hash}/render | SVG or JSON layout |

Agent Monitoring

| Endpoint | Description |
|---|---|
| GET /api/v1/agents | List agents (filter by status, app, environment) |
| GET /api/v1/agents/events-log | Agent lifecycle event history |
| GET /api/v1/agents/{id}/metrics | Agent-level metrics time series |

Application Configuration

| Endpoint | Role | Description |
|---|---|---|
| GET /api/v1/config | VIEWER | List all app configs |
| GET /api/v1/config/{app} | VIEWER | Get config (returns defaults if none stored) |
| PUT /api/v1/config/{app} | OPERATOR | Save config + push to all LIVE agents |
| GET /api/v1/config/{app}/processor-routes | VIEWER | Processor-to-route mapping |
| POST /api/v1/config/{app}/test-expression | VIEWER | Test Camel expression via live agent |

Config fields: metricsEnabled, samplingRate, tracedProcessors, logLevels, engineLevel, payloadCaptureMode, version.


Security

Authentication

| Method | Endpoint | Purpose |
|---|---|---|
| Bootstrap token | POST /api/v1/agents/register | One-time agent registration |
| Local credentials | POST /api/v1/auth/login | UI login (username/password) |
| OIDC code exchange | POST /api/v1/auth/oidc/callback | External identity provider |
| OIDC access token | Bearer token in Authorization header | SaaS M2M / external OIDC |
| Token refresh | POST /api/v1/auth/refresh | UI token refresh |
| Token refresh | POST /api/v1/agents/{id}/refresh | Agent token refresh |

JWT Structure

  • Algorithm: HMAC-SHA256
  • Access token: 1 hour (configurable)
  • Refresh token: 7 days (configurable)
  • Claims: sub (agent ID or user:<username>), group (application), env (environment), roles (array), type (access/refresh)
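To make the token structure concrete, the sketch below signs and verifies an HS256 JWT with only the Python standard library. It is not the server's implementation: the helper names are hypothetical, and the `exp` claim is the standard JWT expiry claim, assumed here because the document lists lifetimes but not the claim that carries them.

```python
import base64
import hashlib
import hmac
import json
import time


def _b64url(data):
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text):
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def sign_jwt(claims, secret):
    """Serialize claims as a compact HS256 (HMAC-SHA256) JWT."""
    header = {"alg": "HS256", "typ": "JWT"}
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, claims)
    )
    signature = hmac.new(secret, signing_input.encode("ascii"), hashlib.sha256)
    return signing_input + "." + _b64url(signature.digest())


def verify_jwt(token, secret, now=None):
    """Return the claims if signature and expiry check out, else raise ValueError."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("malformed token")
    header_b64, payload_b64, signature_b64 = parts
    if json.loads(_b64url_decode(header_b64)).get("alg") != "HS256":
        raise ValueError("unexpected algorithm")
    expected = hmac.new(secret, f"{header_b64}.{payload_b64}".encode("ascii"),
                        hashlib.sha256).digest()
    # compare_digest avoids leaking how many signature bytes matched.
    if not hmac.compare_digest(expected, _b64url_decode(signature_b64)):
        raise ValueError("bad signature")
    claims = json.loads(_b64url_decode(payload_b64))
    # Assumption: expiry is carried in the standard "exp" claim (seconds).
    if claims.get("exp", float("inf")) <= (time.time() if now is None else now):
        raise ValueError("token expired")
    return claims


if __name__ == "__main__":
    token = sign_jwt({"sub": "agent-abc-123", "group": "order-service",
                      "env": "production", "roles": ["AGENT"],
                      "type": "access", "exp": int(time.time()) + 3600},
                     b"dev-secret")
    print(verify_jwt(token, b"dev-secret")["sub"])
```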

RBAC Roles

| Role | Permissions |
|---|---|
| AGENT | Data ingestion, heartbeat, SSE, command ack |
| VIEWER | Read-only: executions, search, diagrams, metrics, logs, config |
| OPERATOR | VIEWER + send commands, modify config, replay |
| ADMIN | OPERATOR + user/group/role management, OIDC config, database admin |
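The cumulative VIEWER, OPERATOR, ADMIN ladder can be checked with a small helper. This is a sketch under one explicit assumption: AGENT sits outside the ladder, so ADMIN does not imply AGENT and AGENT does not imply VIEWER. The document does not state this either way, and `is_allowed` is a hypothetical name.

```python
_LADDER = ["VIEWER", "OPERATOR", "ADMIN"]  # each role includes the ones before it


def is_allowed(user_roles, required_role):
    """Return True if any of the caller's roles satisfies the required role."""
    if required_role == "AGENT":
        # Assumption: AGENT is a separate machine role, not part of the ladder.
        return "AGENT" in user_roles
    needed = _LADDER.index(required_role)
    return any(
        role in _LADDER and _LADDER.index(role) >= needed
        for role in user_roles
    )


if __name__ == "__main__":
    print(is_allowed(["OPERATOR"], "VIEWER"))   # an operator can read
    print(is_allowed(["VIEWER"], "OPERATOR"))   # a viewer cannot send commands
```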

Ed25519 Config Signing

Server derives an Ed25519 keypair deterministically from the JWT secret. Public key is shared with agents at registration. Config-update payloads are signed so agents can verify authenticity.

OIDC Integration

Configured via admin API (/api/v1/admin/oidc). Supports any OpenID Connect provider. Features: role claim extraction (supports nested paths like realm_access.roles), auto-signup (auto-provisions new users on first OIDC login), configurable display name claim, constant-time token rotation via dual bootstrap tokens. Supports ES384 (Logto default), ES256, and RS256 for id_token validation.

SSO Auto-Redirect

When OIDC is configured and enabled, the login page automatically redirects to the OIDC provider with prompt=none for silent SSO. If the user has an active provider session, they are signed in without seeing a login form. If the provider returns consent_required (first login, scopes not yet granted), the flow retries without prompt=none so the user can grant consent once. If the provider returns login_required (no provider session), the UI falls back to the login form. To bypass the auto-redirect, open /login?local.
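The redirect rules above amount to a small decision function. The sketch below is illustrative only: the function name and the returned action labels are hypothetical, and sending any other provider error to the login form is an assumption.

```python
from urllib.parse import parse_qs, urlsplit


def next_sso_step(login_url, oidc_enabled, provider_error=None):
    """Decide what the login page does next.

    provider_error is the OIDC error returned by a previous silent attempt,
    or None if no attempt has been made yet.
    """
    query = parse_qs(urlsplit(login_url).query, keep_blank_values=True)
    if not oidc_enabled or "local" in query:
        return "show_login_form"
    if provider_error is None:
        return "redirect_prompt_none"      # first try: silent SSO
    if provider_error == "consent_required":
        return "redirect_interactive"      # retry without prompt=none
    # login_required, plus (by assumption) any other provider error
    return "show_login_form"


if __name__ == "__main__":
    print(next_sso_step("/login", True))
    print(next_sso_step("/login", True, "consent_required"))
    print(next_sso_step("/login?local", True))
```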

OIDC Resource Server

When CAMELEER_OIDC_ISSUER_URI is configured, the server accepts external access tokens (e.g., Logto M2M tokens) in addition to internal HMAC JWTs. Validation is dual-path: the server tries internal HMAC validation first and falls back to OIDC JWKS validation. Roles are mapped from OAuth2 scopes: the admin scope maps to ADMIN, operator to OPERATOR, and viewer to VIEWER. The ES384, ES256, and RS256 algorithms are supported, as is the RFC 9068 at+jwt token type.
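A sketch of the two ideas in this paragraph, dual-path validation and scope-to-role mapping, might look like this. The validator callables are stand-ins supplied by the caller (the real HMAC and JWKS checks are not shown), the function names are hypothetical, and signalling a failed validation with ValueError is an assumption.

```python
SCOPE_TO_ROLE = {"admin": "ADMIN", "operator": "OPERATOR", "viewer": "VIEWER"}


def roles_from_scope(scope_claim):
    """Map an OAuth2 scope claim (space-delimited string or list) to roles."""
    if isinstance(scope_claim, str):
        scopes = scope_claim.split()
    else:
        scopes = list(scope_claim or [])
    # Unknown scopes such as "openid" are ignored.
    return sorted({SCOPE_TO_ROLE[s] for s in scopes if s in SCOPE_TO_ROLE})


def validate_dual_path(token, validate_internal, validate_oidc):
    """Try the internal HMAC validator first, then fall back to OIDC/JWKS."""
    try:
        return validate_internal(token)
    except ValueError:
        return validate_oidc(token)


if __name__ == "__main__":
    print(roles_from_scope("openid viewer operator"))
```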

| Variable | Purpose |
|---|---|
| CAMELEER_OIDC_ISSUER_URI | OIDC issuer URI for token validation (e.g., https://auth.example.com/oidc) |
| CAMELEER_OIDC_JWK_SET_URI | Direct JWKS URL (e.g., http://logto:3001/oidc/jwks); use when the public issuer isn't reachable from inside containers |
| CAMELEER_OIDC_AUDIENCE | Expected audience (API resource indicator) |
| CAMELEER_OIDC_TLS_SKIP_VERIFY | Skip TLS certificate verification for OIDC calls (default false); use when the provider has a self-signed CA |

Logto is proxy-aware (TRUST_PROXY_HEADER=1). The LOGTO_ENDPOINT env var sets the public-facing URL used in OIDC discovery, issuer URI, and redirect URLs. Logto requires its own subdomain (not a path prefix).


Admin API

All admin endpoints require ADMIN role. Prefix: /api/v1/admin/.

User Management

| Endpoint | Method | Description |
|---|---|---|
| /users | GET | List all users |
| /users | POST | Create local user |
| /users/{id} | GET/PUT/DELETE | Get/update/delete user |
| /users/{id}/password | POST | Reset password |
| /users/{id}/roles/{roleId} | POST/DELETE | Assign/remove role |
| /users/{id}/groups/{groupId} | POST/DELETE | Add/remove from group |

Group & Role Management

| Endpoint | Method | Description |
|---|---|---|
| /groups | GET/POST | List/create groups |
| /groups/{id} | GET/PUT/DELETE | Manage group (cycle detection on parent change) |
| /groups/{id}/roles/{roleId} | POST/DELETE | Assign/remove role from group |
| /roles | GET/POST | List/create roles |
| /roles/{id} | GET/PUT/DELETE | Manage role (system roles protected) |
| /rbac/stats | GET | RBAC statistics |

Infrastructure

| Endpoint | Description |
|---|---|
| /database/status | PostgreSQL version, schema, health |
| /database/pool | HikariCP connection pool stats |
| /database/tables | Table sizes and row counts |
| /database/queries | Active queries (with kill) |
| /clickhouse/status | ClickHouse version, uptime |
| /clickhouse/tables | Table info, row counts, sizes |
| /clickhouse/performance | Disk, memory, compression, partitions |
| /clickhouse/queries | Active ClickHouse queries |
| /clickhouse/pipeline | Ingestion pipeline stats |

Settings & Configuration

| Endpoint | Description |
|---|---|
| /app-settings | Per-application settings (CRUD) |
| /thresholds | Monitoring threshold configuration |
| /oidc | OIDC provider configuration (CRUD + test) |
| /audit | Paginated audit log search |
| /usage | UI usage analytics (ClickHouse) |

Storage

PostgreSQL

Used for RBAC, configuration, and audit. Schema-per-tenant isolation via ?currentSchema=tenant_{id}.

Tables: users, groups, roles, user_roles, user_groups, group_roles, server_config, application_config, audit_log.

Flyway migrations (V1-V11) manage schema evolution.

ClickHouse

Used for all observability data. Schema managed by ClickHouseSchemaInitializer (idempotent on startup).

| Table | Engine | Purpose | TTL |
|---|---|---|---|
| executions | ReplacingMergeTree | Route execution records | 365d |
| processor_executions | MergeTree | Per-processor trace data | 365d |
| agent_events | MergeTree | Agent lifecycle audit trail | 365d |
| route_diagrams | ReplacingMergeTree | Route graph definitions | - |
| logs | MergeTree | Application logs | 365d |
| usage_events | MergeTree | UI action tracking | 90d |
| stats_1m_all | AggregatingMergeTree | Global 1-minute rollups | - |
| stats_1m_app | AggregatingMergeTree | Per-application rollups | - |
| stats_1m_route | AggregatingMergeTree | Per-route rollups | - |
| stats_1m_processor | AggregatingMergeTree | Per-processor-type rollups | - |
| stats_1m_processor_detail | AggregatingMergeTree | Per-processor-instance rollups | - |

All tables include tenant_id and environment columns. Partitioned by (tenant_id, toYYYYMM(timestamp)).

Stats tables are fed by Materialized Views from base tables. Query with -Merge() combinators (e.g., countMerge(total_count)).
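Because AggregatingMergeTree tables store partial aggregate states, a plain count() over them returns the number of stored rows rather than the number of executions. The sketch below builds a query that finalizes the states with countMerge. It uses only column names mentioned in this document (tenant_id, environment, total_count), but the assumption that total_count exists on stats_1m_route specifically is unverified, and the `%(tenant_id)s` placeholder style depends on the client library in use.

```python
def executions_per_environment_query(table="stats_1m_route"):
    """Build a ClickHouse query that merges partial count states per environment."""
    return (
        "SELECT environment, countMerge(total_count) AS total\n"
        f"FROM {table}\n"
        # Always filter by tenant: the database is shared across tenants.
        "WHERE tenant_id = %(tenant_id)s\n"
        "GROUP BY environment"
    )


if __name__ == "__main__":
    print(executions_per_environment_query())
```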


Deployment

Container Image

Multi-stage Docker build: Maven 3.9 + JDK 17 (build) → JRE 17 (runtime). Port 8081.

Registry: gitea.siegeln.net/cameleer/cameleer3-server

Infrastructure Requirements

| Component | Version | Purpose |
|---|---|---|
| PostgreSQL | 16+ | RBAC, config, audit |
| ClickHouse | 24.12+ | All observability data |

Required Environment Variables

| Variable | Required | Default | Purpose |
|---|---|---|---|
| CAMELEER_AUTH_TOKEN | Yes | - | Bootstrap token for agent registration |
| CAMELEER_JWT_SECRET | Recommended | Random (ephemeral) | JWT signing secret |
| CAMELEER_TENANT_ID | No | default | Tenant identifier |
| CAMELEER_UI_USER | No | admin | Default admin username |
| CAMELEER_UI_PASSWORD | No | admin | Default admin password |
| CAMELEER_UI_ORIGIN | No | http://localhost:5173 | CORS allowed origin (single, legacy) |
| CAMELEER_CORS_ALLOWED_ORIGINS | No | (empty) | Comma-separated CORS origins; overrides UI_ORIGIN when set |
| CLICKHOUSE_URL | No | jdbc:clickhouse://localhost:8123/cameleer | ClickHouse JDBC URL |
| CLICKHOUSE_USERNAME | No | default | ClickHouse user |
| CLICKHOUSE_PASSWORD | No | (empty) | ClickHouse password |
| SPRING_DATASOURCE_URL | No | jdbc:postgresql://localhost:5432/cameleer3 | PostgreSQL JDBC URL |
| SPRING_DATASOURCE_USERNAME | No | cameleer | PostgreSQL user |
| SPRING_DATASOURCE_PASSWORD | No | cameleer_dev | PostgreSQL password |
| CAMELEER_DB_SCHEMA | No | tenant_{CAMELEER_TENANT_ID} | PostgreSQL schema (override for feature branches) |
| CAMELEER_OIDC_ISSUER_URI | No | (empty) | OIDC issuer URI; enables resource server mode for M2M tokens |
| CAMELEER_OIDC_JWK_SET_URI | No | (empty) | Direct JWKS URL; bypasses OIDC discovery for container networking |
| CAMELEER_OIDC_AUDIENCE | No | (empty) | Expected JWT audience (API resource indicator) |
| CAMELEER_OIDC_TLS_SKIP_VERIFY | No | false | Skip TLS cert verification for OIDC calls (self-signed CAs) |
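The interaction between the two CORS variables is easy to get wrong, so here is a small sketch of the precedence rule stated in the table. `allowed_origins` is a hypothetical helper; trimming whitespace and ignoring empty list entries are assumptions.

```python
def allowed_origins(env):
    """Resolve the effective CORS origins from environment variables."""
    raw = env.get("CAMELEER_CORS_ALLOWED_ORIGINS", "")
    origins = [origin.strip() for origin in raw.split(",") if origin.strip()]
    if origins:
        # The comma-separated list takes precedence over the legacy variable.
        return origins
    return [env.get("CAMELEER_UI_ORIGIN", "http://localhost:5173")]


if __name__ == "__main__":
    print(allowed_origins({}))
    print(allowed_origins({
        "CAMELEER_UI_ORIGIN": "https://ui.example.com",
        "CAMELEER_CORS_ALLOWED_ORIGINS": "https://a.example.com, https://b.example.com",
    }))
```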

Health Probes

  • Endpoint: GET /api/v1/health (public, no auth)
  • Liveness: 30s initial delay, 10s period
  • Readiness: 10s initial delay, 5s period

Ingestion Tuning

| Variable | Default | Purpose |
|---|---|---|
| INGESTION_BUFFER_CAPACITY | 50000 | Ring buffer size |
| INGESTION_BATCH_SIZE | 5000 | Flush batch size |
| INGESTION_FLUSH_INTERVAL_MS | 5000 | Periodic flush interval |
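To show how the three settings relate, the sketch below models a bounded ingestion buffer: writes are rejected once capacity is reached (the case an ingestion endpoint would answer with 503), and a periodic flusher drains at most one batch at a time. It is a simplified single-threaded model built on a deque rather than a true ring buffer, and the class and method names are hypothetical.

```python
from collections import deque


class IngestionBuffer:
    """Bounded FIFO buffer drained in fixed-size batches."""

    def __init__(self, capacity=50_000, batch_size=5_000):
        self.capacity = capacity
        self.batch_size = batch_size
        self._items = deque()

    def offer(self, item):
        """Queue an item; return False when full (caller would respond 503)."""
        if len(self._items) >= self.capacity:
            return False
        self._items.append(item)
        return True

    def drain_batch(self):
        """Remove and return up to batch_size items, oldest first."""
        count = min(self.batch_size, len(self._items))
        return [self._items.popleft() for _ in range(count)]


if __name__ == "__main__":
    buffer = IngestionBuffer(capacity=3, batch_size=2)
    print([buffer.offer(n) for n in range(4)])  # the fourth offer is rejected
    print(buffer.drain_batch())
```

In a real server the flusher would call `drain_batch` on a timer (every INGESTION_FLUSH_INTERVAL_MS) and the buffer would need to be thread-safe.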

Agent Registry Tuning

| Variable | Default | Purpose |
|---|---|---|
| AGENT_REGISTRY_STALE_THRESHOLD_MS | 90000 | Heartbeat miss → STALE |
| AGENT_REGISTRY_DEAD_THRESHOLD_MS | 300000 | STALE duration → DEAD |
| AGENT_REGISTRY_PING_INTERVAL_MS | 15000 | SSE keepalive interval |
| AGENT_REGISTRY_COMMAND_EXPIRY_MS | 60000 | Pending command TTL |

Public Endpoints (No Auth)

These endpoints do not require a JWT (agent registration still requires the bootstrap token):

  • GET /api/v1/health
  • POST /api/v1/agents/register (requires bootstrap token)
  • POST /api/v1/agents/*/refresh
  • POST /api/v1/auth/login
  • POST /api/v1/auth/refresh
  • GET /api/v1/auth/oidc/config
  • POST /api/v1/auth/oidc/callback
  • GET /api/v1/api-docs/** (OpenAPI spec)
  • GET /swagger-ui.html (Swagger UI)
  • Static resources: /, /index.html, /config.js, /favicon.svg, /assets/**

All other endpoints require a valid JWT with appropriate role.
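A gateway or test harness in front of the server may need to know which requests can pass without a JWT. The sketch below encodes the list above as method and path patterns and matches them with an approximation of Ant-style wildcards ("*" within one path segment, a trailing "/**" for any depth). It may differ from the server's actual matcher in edge cases, and treating the static resources as GET-only is an assumption.

```python
import re

PUBLIC_ENDPOINTS = [
    ("GET", "/api/v1/health"),
    ("POST", "/api/v1/agents/register"),
    ("POST", "/api/v1/agents/*/refresh"),
    ("POST", "/api/v1/auth/login"),
    ("POST", "/api/v1/auth/refresh"),
    ("GET", "/api/v1/auth/oidc/config"),
    ("POST", "/api/v1/auth/oidc/callback"),
    ("GET", "/api/v1/api-docs/**"),
    ("GET", "/swagger-ui.html"),
    # Static resources (GET assumed)
    ("GET", "/"), ("GET", "/index.html"), ("GET", "/config.js"),
    ("GET", "/favicon.svg"), ("GET", "/assets/**"),
]


def _to_regex(pattern):
    """Translate a path pattern with * and trailing /** into a regex."""
    out, i = [], 0
    while i < len(pattern):
        if pattern.startswith("/**", i) and i + 3 == len(pattern):
            out.append("(/.*)?")      # the prefix itself or anything below it
            i += 3
        elif pattern[i] == "*":
            out.append("[^/]*")       # exactly one path segment
            i += 1
        else:
            out.append(re.escape(pattern[i]))
            i += 1
    return re.compile("".join(out))


_COMPILED = [(method, _to_regex(path)) for method, path in PUBLIC_ENDPOINTS]


def is_public(method, path):
    """Return True if the request can be served without a JWT."""
    return any(
        method.upper() == allowed and regex.fullmatch(path)
        for allowed, regex in _COMPILED
    )


if __name__ == "__main__":
    print(is_public("POST", "/api/v1/agents/agent-abc-123/refresh"))
    print(is_public("GET", "/api/v1/agents"))
```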