cameleer-server/HOWTO.md
hsiegeln 0609220cdf
docs: add CAMELEER_OIDC_TLS_SKIP_VERIFY to all documentation
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-06 00:30:18 +02:00

HOWTO — Cameleer3 Server

Prerequisites

  • Java 17+
  • Maven 3.9+
  • Node.js 22+ and npm
  • Docker & Docker Compose
  • Access to the Gitea Maven registry (for cameleer3-common dependency)

Build

# Build UI first (required for embedded mode)
cd ui && npm ci && npm run build && cd ..

# Backend
mvn clean compile          # compile only
mvn clean verify           # compile + run all tests (needs Docker for integration tests)

Infrastructure Setup

Start PostgreSQL:

docker compose up -d

This starts PostgreSQL 16. The database schema is applied automatically via Flyway migrations on server startup. ClickHouse tables are created by the schema initializer on startup.

| Service | Port | Purpose |
|---------|------|---------|
| PostgreSQL | 5432 | JDBC (Spring JDBC) |

PostgreSQL credentials: cameleer / cameleer_dev, database cameleer3.

Run the Server

mvn clean package -DskipTests
CAMELEER_AUTH_TOKEN=my-secret-token java -jar cameleer3-server-app/target/cameleer3-server-app-1.0-SNAPSHOT.jar

The server starts on port 8081. The CAMELEER_AUTH_TOKEN environment variable is required — the server fails fast on startup if it is not set.

For token rotation without downtime, set CAMELEER_AUTH_TOKEN_PREVIOUS to the old token while rolling out the new one. The server accepts both during the overlap window.
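The overlap rule can be pictured with a small shell sketch. It only illustrates the acceptance logic described above and is not the server's actual code; the function name `bootstrap_token_accepted` is hypothetical.

```shell
# Hypothetical helper: mirrors the documented rule that a presented bootstrap
# token is accepted if it equals the current token or, when one is set, the
# previous token.
bootstrap_token_accepted() {
  presented=$1
  # An empty token is never accepted.
  [ -n "$presented" ] || return 1
  [ "$presented" = "${CAMELEER_AUTH_TOKEN:-}" ] && return 0
  [ -n "${CAMELEER_AUTH_TOKEN_PREVIOUS:-}" ] &&
    [ "$presented" = "$CAMELEER_AUTH_TOKEN_PREVIOUS" ] && return 0
  return 1
}
```

During a rotation, the server would run with CAMELEER_AUTH_TOKEN set to the new token and CAMELEER_AUTH_TOKEN_PREVIOUS set to the old one. Once all agents use the new token, CAMELEER_AUTH_TOKEN_PREVIOUS can be dropped.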

API Endpoints

Authentication (Phase 4)

All endpoints except health, agent registration, the /api/v1/auth/** login endpoints, and the API docs require a JWT Bearer token. The typical agent flow:

# 1. Register agent (requires bootstrap token)
curl -s -X POST http://localhost:8081/api/v1/agents/register \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer my-secret-token" \
  -d '{"agentId":"agent-1","name":"Order Service","group":"order-service-prod","version":"1.0.0","routeIds":["route-1"],"capabilities":["deep-trace","replay"]}'
# Response includes: accessToken, refreshToken, serverPublicKey (Ed25519, Base64)

# 2. Use access token for all subsequent requests
TOKEN="<accessToken from registration>"

# 3. Refresh when access token expires (1h default)
curl -s -X POST http://localhost:8081/api/v1/agents/agent-1/refresh \
  -H "Authorization: Bearer <refreshToken>"
# Response: { "accessToken": "new-jwt" }
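Step 2 above leaves TOKEN as a placeholder. One way to fill it from the registration response without extra tooling is a small sed helper. This is a minimal sketch: the name `json_string_field` is hypothetical, and it assumes the response is a flat JSON object whose string values contain no escaped quotes (a JSON-aware tool such as jq is more robust).

```shell
# Hypothetical helper: print the value of a string field from a JSON object
# read on stdin. Works for the flat token responses shown above; it does not
# handle escaped quotes or nested objects with the same field name.
json_string_field() {
  sed -n 's/.*"'"$1"'"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# Usage against a running server (not executed here):
#   TOKEN=$(curl -s -X POST http://localhost:8081/api/v1/agents/register \
#     -H "Content-Type: application/json" \
#     -H "Authorization: Bearer my-secret-token" \
#     -d '{"agentId":"agent-1","name":"Order Service"}' | json_string_field accessToken)
```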

UI Login (for browser access):

# Login with UI credentials (returns JWT tokens)
curl -s -X POST http://localhost:8081/api/v1/auth/login \
  -H "Content-Type: application/json" \
  -d '{"username":"admin","password":"admin"}'
# Response: { "accessToken": "...", "refreshToken": "..." }

# Refresh UI token
curl -s -X POST http://localhost:8081/api/v1/auth/refresh \
  -H "Content-Type: application/json" \
  -d '{"refreshToken":"<refreshToken>"}'

UI credentials are configured via CAMELEER_UI_USER / CAMELEER_UI_PASSWORD env vars (default: admin / admin).

Public endpoints (no JWT required): GET /api/v1/health, POST /api/v1/agents/register (uses bootstrap token), POST /api/v1/auth/**, OpenAPI/Swagger docs.

Protected endpoints (JWT required): All other endpoints including ingestion, search, agent management, commands.

SSE connections: Authenticated via the token query parameter, e.g. /api/v1/agents/{id}/events?token=<jwt>, because the browser EventSource API does not support custom headers.

Ed25519 signatures: All SSE command payloads (config-update, deep-trace, replay) include a signature field. Agents verify payload integrity using the serverPublicKey received during registration. The server generates a new ephemeral keypair on each startup — agents must re-register to get the new key.

RBAC (Role-Based Access Control)

JWTs carry a roles claim. Endpoints are restricted by role:

| Role | Access |
|------|--------|
| AGENT | Data ingestion (/data/**: executions, diagrams, metrics, logs), heartbeat, SSE events, command ack |
| VIEWER | Search, execution detail, diagrams, agent list |
| OPERATOR | VIEWER + send commands to agents |
| ADMIN | OPERATOR + user management (/admin/**) |

The local user configured via CAMELEER_UI_USER / CAMELEER_UI_PASSWORD gets the ADMIN role. Agents get the AGENT role at registration.
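When debugging a 403, it can help to look at the roles a token actually carries. The sketch below decodes the payload segment of a JWT in plain shell. It only decodes and does not verify the signature, and the helper name `jwt_payload` is hypothetical. It assumes a `base64` command that accepts `-d` (GNU coreutils, recent macOS).

```shell
# Hypothetical helper: print the decoded JSON payload of a JWT.
# Decoding only; the signature is NOT checked.
jwt_payload() {
  # Take the second dot-separated segment and map base64url to base64.
  seg=$(printf '%s' "$1" | cut -d. -f2 | tr '_-' '/+')
  # Restore the padding that base64url omits.
  case $(( ${#seg} % 4 )) in
    2) seg="${seg}==" ;;
    3) seg="${seg}=" ;;
  esac
  printf '%s' "$seg" | base64 -d
}

# Usage (not executed here):
#   jwt_payload "$TOKEN"    # look for the "roles" claim in the output
```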

OIDC Login (Optional)

OIDC configuration is stored in PostgreSQL and managed via the admin API or UI. The SPA checks if OIDC is available:

# 1. SPA checks if OIDC is available (returns 404 if not configured)
curl -s http://localhost:8081/api/v1/auth/oidc/config
# Returns: { "issuer": "...", "clientId": "...", "authorizationEndpoint": "..." }

# 2. After OIDC redirect, SPA sends the authorization code
curl -s -X POST http://localhost:8081/api/v1/auth/oidc/callback \
  -H "Content-Type: application/json" \
  -d '{"code":"auth-code-from-provider","redirectUri":"http://localhost:5173/callback"}'
# Returns: { "accessToken": "...", "refreshToken": "..." }

Local login remains available as fallback even when OIDC is enabled.

OIDC Admin Configuration (ADMIN only)

OIDC settings are managed at runtime via the admin API. No server restart needed.

# Get current OIDC config
curl -s -H "Authorization: Bearer $TOKEN" http://localhost:8081/api/v1/admin/oidc

# Save OIDC config (client_secret: send "********" to keep existing, or new value to update)
curl -s -X PUT http://localhost:8081/api/v1/admin/oidc \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $TOKEN" \
  -d '{
    "enabled": true,
    "issuerUri": "http://logto:3001/oidc",
    "clientId": "your-client-id",
    "clientSecret": "your-client-secret",
    "rolesClaim": "realm_access.roles",
    "defaultRoles": ["VIEWER"]
  }'

# Test OIDC provider connectivity
curl -s -X POST http://localhost:8081/api/v1/admin/oidc/test \
  -H "Authorization: Bearer $TOKEN"

# Delete OIDC config (disables OIDC)
curl -s -X DELETE http://localhost:8081/api/v1/admin/oidc \
  -H "Authorization: Bearer $TOKEN"

Initial provisioning: OIDC can also be seeded from CAMELEER_OIDC_* env vars on first startup (when DB is empty). After that, the admin API takes over.

Logto Setup (OIDC Provider)

Logto is deployed alongside the Cameleer stack.

Logto is proxy-aware via TRUST_PROXY_HEADER=1. The LOGTO_ENDPOINT and LOGTO_ADMIN_ENDPOINT secrets define the public-facing URLs that Logto uses for OIDC discovery, issuer URI, and redirect URLs. When behind a reverse proxy (e.g., Traefik), set these to the external URLs (e.g., https://auth.cameleer.my.domain). Logto needs its own subdomain; it cannot be path-prefixed under another app.

After first deployment:

  1. Initial setup: Open the Logto admin console (the LOGTO_ADMIN_ENDPOINT URL) and create the admin account
  2. Create SPA application: Applications → Create → Single Page App
    • Name: Cameleer UI
    • Redirect URI: your UI URL + /oidc/callback
    • Note the Client ID
  3. Create API Resource: API Resources → Create
    • Name: Cameleer Server API
    • Indicator: your API URL (e.g., https://cameleer.siegeln.net/api)
    • Add permissions: admin, operator, viewer
  4. Create M2M application (for SaaS platform): Applications → Create → Machine-to-Machine
    • Name: Cameleer SaaS
    • Assign the API Resource created above with admin scope
    • Note the Client ID and Client Secret
  5. Configure Cameleer OIDC login: Use the admin API (PUT /api/v1/admin/oidc) or set env vars for initial seeding:
    CAMELEER_OIDC_ENABLED=true
    CAMELEER_OIDC_ISSUER=<LOGTO_ENDPOINT>/oidc
    CAMELEER_OIDC_CLIENT_ID=<client-id-from-step-2>
    CAMELEER_OIDC_CLIENT_SECRET=<not-needed-for-public-spa>
    
  6. Configure resource server (for M2M token validation):
    CAMELEER_OIDC_ISSUER_URI=<LOGTO_ENDPOINT>/oidc
    CAMELEER_OIDC_JWK_SET_URI=http://logto:3001/oidc/jwks
    CAMELEER_OIDC_AUDIENCE=<api-resource-indicator-from-step-3>
    CAMELEER_OIDC_TLS_SKIP_VERIFY=true   # optional — skip cert verification for self-signed CAs
    
    JWK_SET_URI is needed when the public issuer URL isn't reachable from inside containers — it fetches JWKS directly from the internal Logto service. TLS_SKIP_VERIFY disables certificate verification for all OIDC HTTP calls (discovery, token exchange, JWKS); use only when the provider has a self-signed CA.

User Management (ADMIN only)

# List all users
curl -s -H "Authorization: Bearer $TOKEN" http://localhost:8081/api/v1/admin/users

# Update user roles
curl -s -X PUT http://localhost:8081/api/v1/admin/users/{userId}/roles \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $TOKEN" \
  -d '{"roles":["VIEWER","OPERATOR"]}'

# Delete user
curl -s -X DELETE http://localhost:8081/api/v1/admin/users/{userId} \
  -H "Authorization: Bearer $TOKEN"

Ingestion (POST, returns 202 Accepted)

# Post route execution data (JWT required)
curl -s -X POST http://localhost:8081/api/v1/data/executions \
  -H "Content-Type: application/json" \
  -H "X-Protocol-Version: 1" \
  -H "Authorization: Bearer $TOKEN" \
  -d '{"agentId":"agent-1","routeId":"route-1","executionId":"exec-1","status":"COMPLETED","startTime":"2026-03-11T00:00:00Z","endTime":"2026-03-11T00:00:01Z","processorExecutions":[]}'

# Post route diagram
curl -s -X POST http://localhost:8081/api/v1/data/diagrams \
  -H "Content-Type: application/json" \
  -H "X-Protocol-Version: 1" \
  -H "Authorization: Bearer $TOKEN" \
  -d '{"agentId":"agent-1","routeId":"route-1","version":1,"nodes":[],"edges":[]}'

# Post agent metrics
curl -s -X POST http://localhost:8081/api/v1/data/metrics \
  -H "Content-Type: application/json" \
  -H "X-Protocol-Version: 1" \
  -H "Authorization: Bearer $TOKEN" \
  -d '[{"agentId":"agent-1","metricName":"cpu","value":42.0,"timestamp":"2026-03-11T00:00:00Z","tags":{}}]'

# Post application log entries (batch)
curl -s -X POST http://localhost:8081/api/v1/data/logs \
  -H "Content-Type: application/json" \
  -H "X-Protocol-Version: 1" \
  -H "Authorization: Bearer $TOKEN" \
  -d '{
    "entries": [{
      "timestamp": "2026-03-25T10:00:00Z",
      "level": "INFO",
      "loggerName": "com.acme.MyService",
      "message": "Processing order #12345",
      "threadName": "main"
    }]
  }'

Note: The X-Protocol-Version: 1 header is required on all /api/v1/data/** endpoints. A missing or incorrect version returns 400 Bad Request.

Health & Docs

# Health check
curl -s http://localhost:8081/api/v1/health

# OpenAPI JSON
curl -s http://localhost:8081/api/v1/api-docs

# Swagger UI (open is macOS-specific; use xdg-open on Linux)
open http://localhost:8081/api/v1/swagger-ui.html

Search (Phase 2)

# Search by status (GET with basic filters)
curl -s -H "Authorization: Bearer $TOKEN" \
  "http://localhost:8081/api/v1/search/executions?status=COMPLETED&limit=10"

# Search by time range
curl -s -H "Authorization: Bearer $TOKEN" \
  "http://localhost:8081/api/v1/search/executions?timeFrom=2026-03-11T00:00:00Z&timeTo=2026-03-12T00:00:00Z"

# Advanced search (POST with full-text)
curl -s -X POST http://localhost:8081/api/v1/search/executions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $TOKEN" \
  -d '{"status":"FAILED","text":"NullPointerException","limit":20}'

# Execution detail (nested processor tree)
curl -s -H "Authorization: Bearer $TOKEN" \
  http://localhost:8081/api/v1/executions/{executionId}

# Processor exchange snapshot
curl -s -H "Authorization: Bearer $TOKEN" \
  http://localhost:8081/api/v1/executions/{executionId}/processors/{index}/snapshot

# Render diagram as SVG
curl -s -H "Authorization: Bearer $TOKEN" \
  -H "Accept: image/svg+xml" \
  http://localhost:8081/api/v1/diagrams/{contentHash}/render

# Render diagram as JSON layout
curl -s -H "Authorization: Bearer $TOKEN" \
  -H "Accept: application/json" \
  http://localhost:8081/api/v1/diagrams/{contentHash}/render

Search response format: { "data": [...], "total": N, "offset": 0, "limit": 50 }

Supported search filters (GET): status, timeFrom, timeTo, correlationId, limit, offset

Additional POST filters: durationMin, durationMax, text (global full-text), textInBody, textInHeaders, textInErrors
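The total, offset, and limit fields in the response are enough to page through a large result set. The following is a minimal sketch of the arithmetic, assuming offset counts the number of results skipped; the helper name `next_offset` is hypothetical.

```shell
# Hypothetical helper: given total, offset and limit from a search response,
# print the offset of the next page, or print nothing when the current page
# is the last one.
next_offset() {
  total=$1 offset=$2 limit=$3
  next=$((offset + limit))
  if [ "$next" -lt "$total" ]; then
    printf '%s\n' "$next"
  fi
}

# Usage (not executed here): request the following page while one exists.
#   off=$(next_offset 120 0 50)
#   [ -n "$off" ] && curl -s -H "Authorization: Bearer $TOKEN" \
#     "http://localhost:8081/api/v1/search/executions?status=COMPLETED&limit=50&offset=$off"
```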

Agent Registry & SSE (Phase 3)

# Register an agent (uses bootstrap token, not JWT — see Authentication section above)
curl -s -X POST http://localhost:8081/api/v1/agents/register \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer my-secret-token" \
  -d '{"agentId":"agent-1","name":"Order Service","group":"order-service-prod","version":"1.0.0","routeIds":["route-1","route-2"],"capabilities":["deep-trace","replay"]}'

# Heartbeat (call every 30s)
curl -s -X POST http://localhost:8081/api/v1/agents/agent-1/heartbeat \
  -H "Authorization: Bearer $TOKEN"

# List agents (optionally filter by status)
curl -s -H "Authorization: Bearer $TOKEN" "http://localhost:8081/api/v1/agents"
curl -s -H "Authorization: Bearer $TOKEN" "http://localhost:8081/api/v1/agents?status=LIVE"

# Connect to SSE event stream (JWT via query parameter)
curl -s -N "http://localhost:8081/api/v1/agents/agent-1/events?token=$TOKEN"

# Send command to single agent
curl -s -X POST http://localhost:8081/api/v1/agents/agent-1/commands \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $TOKEN" \
  -d '{"type":"config-update","payload":{"samplingRate":0.5}}'

# Send command to agent group
curl -s -X POST http://localhost:8081/api/v1/agents/groups/order-service-prod/commands \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $TOKEN" \
  -d '{"type":"deep-trace","payload":{"routeId":"route-1","durationSeconds":60}}'

# Send route control command to agent group (start/stop/suspend/resume)
curl -s -X POST http://localhost:8081/api/v1/agents/groups/order-service-prod/commands \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $TOKEN" \
  -d '{"type":"route-control","payload":{"routeId":"route-1","action":"stop","nonce":"unique-uuid"}}'

# Broadcast command to all live agents
curl -s -X POST http://localhost:8081/api/v1/agents/commands \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $TOKEN" \
  -d '{"type":"config-update","payload":{"samplingRate":1.0}}'

# Acknowledge command delivery
curl -s -X POST http://localhost:8081/api/v1/agents/agent-1/commands/{commandId}/ack \
  -H "Authorization: Bearer $TOKEN"

Agent lifecycle: LIVE (heartbeat within 90s) → STALE (missed 3 heartbeats) → DEAD (5min after STALE). DEAD agents are kept indefinitely.
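The lifecycle thresholds can be expressed as a function of the time since the last heartbeat. This is an illustrative sketch using the documented defaults (90s until STALE, a further 300s until DEAD). The exact boundary handling (inclusive vs. exclusive) is an assumption, and `agent_state` is a hypothetical name.

```shell
# Defaults taken from the documented settings:
STALE_AFTER=${STALE_AFTER:-90}    # agent-registry.stale-threshold-seconds
DEAD_AFTER=${DEAD_AFTER:-300}     # agent-registry.dead-threshold-seconds

# Hypothetical helper: print the expected state for an agent whose last
# heartbeat was $1 seconds ago. Boundary values are treated as inclusive here.
agent_state() {
  silent=$1
  if [ "$silent" -le "$STALE_AFTER" ]; then
    echo LIVE
  elif [ "$silent" -le $((STALE_AFTER + DEAD_AFTER)) ]; then
    echo STALE
  else
    echo DEAD
  fi
}
```

With the defaults, an agent that has been silent for 120 seconds is classified as STALE, and one that has been silent for more than 390 seconds as DEAD.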

Server restart resilience: The agent registry is in-memory and lost on server restart. Agents auto-re-register on their next heartbeat or SSE connection — the server reconstructs registry entries from JWT claims (subject, application). Route catalog uses ClickHouse execution data as fallback until agents re-register with full route IDs. Agents should also handle 404 on heartbeat by triggering a full re-registration.
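On the agent side, the 404 rule translates into a small decision on the heartbeat's HTTP status. The sketch below is one possible shape, not the agent's actual implementation; `heartbeat_action` and the action names are hypothetical, and treating every other status as "retry" is an assumption.

```shell
# Hypothetical helper: map the HTTP status of a heartbeat call to the next
# action an agent might take.
heartbeat_action() {
  case $1 in
    2??) echo ok ;;            # heartbeat accepted
    404) echo re-register ;;   # server no longer knows the agent
    *)   echo retry ;;         # assumption: anything else is retried later
  esac
}

# Usage against a running server (not executed here):
#   status=$(curl -s -o /dev/null -w '%{http_code}' -X POST \
#     -H "Authorization: Bearer $TOKEN" \
#     http://localhost:8081/api/v1/agents/agent-1/heartbeat)
#   heartbeat_action "$status"
```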

SSE events: config-update, deep-trace, replay, and route-control commands are pushed in real time. The server sends a ping keepalive every 15s.
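For quick inspection from a terminal, the event stream can be piped through a small parser that follows the standard SSE framing (event: and data: fields, with a blank line ending each event). This is a minimal sketch: `parse_sse` is a hypothetical name, it assumes a single data: line per event, and how the server frames its ping keepalive (named event or comment line) is not stated here, so both are tolerated.

```shell
# Hypothetical helper: read an SSE stream on stdin and print one
# "event<TAB>data" line per dispatched event.
parse_sse() {
  event= data=
  while IFS= read -r line || [ -n "$line" ]; do
    line=${line%"$(printf '\r')"}          # tolerate CRLF line endings
    case $line in
      event:*) event=${line#event:}; event=${event# } ;;
      data:*)  data=${line#data:}; data=${data# } ;;
      :*)      ;;                           # comment line, ignored
      '')      # a blank line ends the current event
               if [ -n "$event$data" ]; then
                 printf '%s\t%s\n' "${event:-message}" "$data"
               fi
               event= data= ;;
    esac
  done
}

# Usage against a running server (not executed here):
#   curl -s -N "http://localhost:8081/api/v1/agents/agent-1/events?token=$TOKEN" | parse_sse
```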

Command expiry: Unacknowledged commands expire after 60 seconds.

Route control responses: Route control commands return CommandGroupResponse with per-agent status, response count, and timed-out agent IDs.

Backpressure

When the write buffer is full (default capacity: 50,000), ingestion endpoints return 503 Service Unavailable. Already-buffered data is not lost.
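A client can react to the 503 by backing off and resending. The sketch below shows one way to do that with exponential backoff. The helper names (`retry_on_503`, `post_executions`), the RETRY_DELAY and RETRY_MAX variables, and the backoff policy itself are assumptions for illustration, not part of the server or agent.

```shell
# Hypothetical helper: run a command that prints an HTTP status code and
# repeat it while the status is 503, doubling the delay between attempts.
# Prints the final status code.
retry_on_503() {
  delay=${RETRY_DELAY:-1}     # initial delay in seconds
  max=${RETRY_MAX:-5}         # maximum number of attempts
  attempt=1
  while :; do
    status=$("$@")
    if [ "$status" != 503 ] || [ "$attempt" -ge "$max" ]; then
      printf '%s\n' "$status"
      return 0
    fi
    sleep "$delay"
    delay=$((delay * 2))
    attempt=$((attempt + 1))
  done
}

# Hypothetical sender: posts the JSON file given as $1 and prints only the
# HTTP status code. Defined here but not executed.
post_executions() {
  curl -s -o /dev/null -w '%{http_code}' -X POST \
    http://localhost:8081/api/v1/data/executions \
    -H "Content-Type: application/json" \
    -H "X-Protocol-Version: 1" \
    -H "Authorization: Bearer $TOKEN" \
    -d @"$1"
}
```

With these definitions, `retry_on_503 post_executions batch.json` would resend the file until the server answers with something other than 503 or the attempt limit is reached.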

Configuration

Key settings in cameleer3-server-app/src/main/resources/application.yml:

| Setting | Default | Description |
|---------|---------|-------------|
| server.port | 8081 | Server port |
| ingestion.buffer-capacity | 50000 | Max items in write buffer |
| ingestion.batch-size | 5000 | Items per batch insert |
| ingestion.flush-interval-ms | 1000 | Buffer flush interval (ms) |
| agent-registry.heartbeat-interval-seconds | 30 | Expected heartbeat interval |
| agent-registry.stale-threshold-seconds | 90 | Time before agent marked STALE |
| agent-registry.dead-threshold-seconds | 300 | Time after STALE before DEAD |
| agent-registry.command-expiry-seconds | 60 | Pending command TTL |
| agent-registry.keepalive-interval-seconds | 15 | SSE ping keepalive interval |
| security.access-token-expiry-ms | 3600000 | JWT access token lifetime (1h) |
| security.refresh-token-expiry-ms | 604800000 | Refresh token lifetime (7d) |
| security.bootstrap-token | ${CAMELEER_AUTH_TOKEN} | Bootstrap token for agent registration (required) |
| security.bootstrap-token-previous | ${CAMELEER_AUTH_TOKEN_PREVIOUS} | Previous bootstrap token for rotation (optional) |
| security.ui-user | admin | UI login username (CAMELEER_UI_USER env var) |
| security.ui-password | admin | UI login password (CAMELEER_UI_PASSWORD env var) |
| security.ui-origin | http://localhost:5173 | CORS allowed origin for UI (CAMELEER_UI_ORIGIN env var) |
| security.jwt-secret | (random) | HMAC secret for JWT signing (CAMELEER_JWT_SECRET). If set, tokens survive restarts |
| security.oidc.enabled | false | Enable OIDC login (CAMELEER_OIDC_ENABLED) |
| security.oidc.issuer-uri | | OIDC provider issuer URL (CAMELEER_OIDC_ISSUER) |
| security.oidc.client-id | | OAuth2 client ID (CAMELEER_OIDC_CLIENT_ID) |
| security.oidc.client-secret | | OAuth2 client secret (CAMELEER_OIDC_CLIENT_SECRET) |
| security.oidc.roles-claim | realm_access.roles | JSONPath to roles in OIDC id_token (CAMELEER_OIDC_ROLES_CLAIM) |
| security.oidc.default-roles | VIEWER | Default roles for new OIDC users (CAMELEER_OIDC_DEFAULT_ROLES) |
| cameleer.indexer.debounce-ms | 2000 | Search indexer debounce delay (CAMELEER_INDEXER_DEBOUNCE_MS) |
| cameleer.indexer.queue-size | 10000 | Search indexer queue capacity (CAMELEER_INDEXER_QUEUE_SIZE) |

Web UI Development

cd ui
npm install
npm run dev          # Vite dev server on http://localhost:5173 (proxies /api to :8081)
npm run build        # Production build to ui/dist/

Login with admin / admin (or whatever CAMELEER_UI_USER / CAMELEER_UI_PASSWORD are set to).

The UI uses runtime configuration via public/config.js. In Kubernetes, a ConfigMap overrides this file to set the correct API base URL.

Regenerate API Types

When the backend OpenAPI spec changes:

cd ui
npm run generate-api   # Requires backend running on :8081

Running Tests

Integration tests use Testcontainers (starts PostgreSQL automatically — requires Docker):

# All tests
mvn verify

# Unit tests only (no Docker needed)
mvn test -pl cameleer3-server-core

# Specific integration test
mvn test -pl cameleer3-server-app -Dtest=ExecutionControllerIT

Verify Database Data

After posting data and waiting for the flush interval (1s default):

docker exec -it cameleer3-server-postgres-1 psql -U cameleer -d cameleer3 \
  -c "SELECT count(*) FROM route_executions"

Kubernetes Deployment

The full stack is deployed to k3s via CI/CD on push to main. K8s manifests are in deploy/.

Architecture

cameleer namespace:
  PostgreSQL (StatefulSet, 10Gi PVC)       ← postgres:5432 (ClusterIP)
  ClickHouse (StatefulSet, 10Gi PVC)       ← clickhouse:8123 (ClusterIP)
  cameleer3-server (Deployment)            ← NodePort 30081
  cameleer3-ui (Deployment, Nginx)         ← NodePort 30090
  cameleer-deploy-demo (Deployment)        ← NodePort 30092
  Logto Server (Deployment)                ← NodePort 30951/30952
  Logto PostgreSQL (StatefulSet, 1Gi)      ← ClusterIP

cameleer-demo namespace:
  (deployed Camel applications — managed by cameleer-deploy-demo)

Access (from your network)

| Service | URL |
|---------|-----|
| Web UI | http://192.168.50.86:30090 |
| Server API | http://192.168.50.86:30081/api/v1/health |
| Swagger UI | http://192.168.50.86:30081/api/v1/swagger-ui.html |
| Deploy Demo | http://192.168.50.86:30092 |
| Logto API | LOGTO_ENDPOINT secret (NodePort 30951 direct, or behind reverse proxy) |
| Logto Admin | LOGTO_ADMIN_ENDPOINT secret (NodePort 30952 direct, or behind reverse proxy) |

CI/CD Pipeline

Push to main triggers: build (UI npm + Maven, unit tests) → docker (buildx amd64 for server + UI, push to Gitea registry) → deploy (kubectl apply + rolling update).

Required Gitea org secrets:

  • REGISTRY_TOKEN
  • KUBECONFIG_BASE64
  • CAMELEER_AUTH_TOKEN
  • CAMELEER_JWT_SECRET
  • POSTGRES_USER, POSTGRES_PASSWORD, POSTGRES_DB
  • CLICKHOUSE_USER, CLICKHOUSE_PASSWORD
  • CAMELEER_UI_USER (optional), CAMELEER_UI_PASSWORD (optional)
  • LOGTO_PG_USER, LOGTO_PG_PASSWORD
  • LOGTO_ENDPOINT (public-facing Logto URL, e.g., https://auth.cameleer.my.domain)
  • LOGTO_ADMIN_ENDPOINT (admin console URL)
  • CAMELEER_OIDC_ISSUER_URI (optional, for resource server M2M token validation)
  • CAMELEER_OIDC_AUDIENCE (optional, API resource indicator)
  • CAMELEER_OIDC_TLS_SKIP_VERIFY (optional, skip TLS cert verification for self-signed CAs)

Manual K8s Commands

# Check pod status
kubectl -n cameleer get pods

# View server logs
kubectl -n cameleer logs -f deploy/cameleer3-server

# View PostgreSQL logs
kubectl -n cameleer logs -f statefulset/postgres

# View ClickHouse logs
kubectl -n cameleer logs -f statefulset/clickhouse

# Restart server
kubectl -n cameleer rollout restart deployment/cameleer3-server