cameleer-server/HOWTO.md
hsiegeln 9c2391e5d4
Move ClickHouse credentials to K8s Secret and add health probes
- ClickHouse user/password now injected via `clickhouse-credentials` Secret
  instead of hardcoded plaintext in deploy manifests (#33)
- CI deploy step creates the secret idempotently from Gitea CI secrets
- Added liveness/readiness probes: server uses /api/v1/health, ClickHouse
  uses /ping (#35)
- Updated HOWTO.md and CLAUDE.md with new secrets and probe details

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-13 10:59:15 +01:00

HOWTO — Cameleer3 Server

Prerequisites

  • Java 17+
  • Maven 3.9+
  • Docker & Docker Compose
  • Access to the Gitea Maven registry (for cameleer3-common dependency)

Build

mvn clean compile          # compile only
mvn clean verify           # compile + run all tests (needs Docker for integration tests)

Infrastructure Setup

Start ClickHouse:

docker compose up -d

This starts ClickHouse 25.3 and automatically runs the schema init scripts (clickhouse/init/01-schema.sql, clickhouse/init/02-search-columns.sql).

| Service | Port | Purpose |
|---|---|---|
| ClickHouse | 8123 | HTTP API (JDBC) |
| ClickHouse | 9000 | Native protocol |

ClickHouse credentials for local development: user cameleer, password cameleer_dev, database cameleer3.

Run the Server

mvn clean package -DskipTests
CAMELEER_AUTH_TOKEN=my-secret-token java -jar cameleer3-server-app/target/cameleer3-server-app-1.0-SNAPSHOT.jar

The server starts on port 8081. The CAMELEER_AUTH_TOKEN environment variable is required — the server fails fast on startup if it is not set.

For token rotation without downtime, set CAMELEER_AUTH_TOKEN_PREVIOUS to the old token while rolling out the new one. The server accepts both during the overlap window.
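
The overlap rule can be modelled in a few lines of shell. This is only an illustration of the behaviour described above, not the server's own code; token_accepted is a hypothetical name, and a real implementation would normally use a constant-time comparison.

```shell
# Hypothetical model of the overlap window: a presented bootstrap token is
# accepted if it matches the current token, or the previous token when one is set.
token_accepted() {
  presented="$1"
  # An empty token is never valid.
  [ -n "$presented" ] || return 1
  if [ "$presented" = "${CAMELEER_AUTH_TOKEN:-}" ]; then
    return 0
  fi
  # The previous token only counts while CAMELEER_AUTH_TOKEN_PREVIOUS is set.
  if [ -n "${CAMELEER_AUTH_TOKEN_PREVIOUS:-}" ] &&
     [ "$presented" = "$CAMELEER_AUTH_TOKEN_PREVIOUS" ]; then
    return 0
  fi
  return 1
}
```

Once every agent has re-registered with the new token, unsetting CAMELEER_AUTH_TOKEN_PREVIOUS and restarting closes the window.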

API Endpoints

Authentication (Phase 4)

All endpoints except health, registration, and docs require a JWT Bearer token. The typical flow:

# 1. Register agent (requires bootstrap token)
curl -s -X POST http://localhost:8081/api/v1/agents/register \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer my-secret-token" \
  -d '{"agentId":"agent-1","name":"Order Service","group":"order-service-prod","version":"1.0.0","routeIds":["route-1"],"capabilities":["deep-trace","replay"]}'
# Response includes: accessToken, refreshToken, serverPublicKey (Ed25519, Base64)

# 2. Use access token for all subsequent requests
TOKEN="<accessToken from registration>"

# 3. Refresh when access token expires (1h default)
curl -s -X POST http://localhost:8081/api/v1/agents/agent-1/refresh \
  -H "Authorization: Bearer <refreshToken>"
# Response: { "accessToken": "new-jwt" }
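
Step 2 leaves TOKEN as a placeholder. A minimal sketch of filling it from the registration response follows, assuming the response is a JSON object with a plain string accessToken field as the comment in step 1 suggests. extract_json_string and REGISTRATION_JSON are hypothetical names; a real JSON parser such as jq is more robust if you have one installed.

```shell
# Hypothetical helper: print the value of a string field from a JSON document
# read on stdin. This sed-based sketch assumes the value contains no escaped
# quotes, which holds for JWTs and Base64 keys.
extract_json_string() {
  field="$1"
  # Join lines first so pretty-printed JSON is handled too.
  tr -d '\n' |
    sed -n 's/.*"'"$field"'"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# Usage against a running server (commented out so the sketch has no side effects):
#   TOKEN=$(curl -s -X POST http://localhost:8081/api/v1/agents/register \
#     -H "Content-Type: application/json" \
#     -H "Authorization: Bearer my-secret-token" \
#     -d "$REGISTRATION_JSON" | extract_json_string accessToken)
```

The same helper can pull out refreshToken or serverPublicKey from the saved response.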

Public endpoints (no JWT required): GET /api/v1/health, POST /api/v1/agents/register (uses bootstrap token), OpenAPI/Swagger docs.

Protected endpoints (JWT required): All other endpoints including ingestion, search, agent management, commands.

SSE connections: Authenticated via a token query parameter, /api/v1/agents/{id}/events?token=<jwt>, because the EventSource API doesn't support custom headers.

Ed25519 signatures: All SSE command payloads (config-update, deep-trace, replay) include a signature field. Agents verify payload integrity using the serverPublicKey received during registration. The server generates a new ephemeral keypair on each startup — agents must re-register to get the new key.

Ingestion (POST, returns 202 Accepted)

# Post route execution data (JWT required)
curl -s -X POST http://localhost:8081/api/v1/data/executions \
  -H "Content-Type: application/json" \
  -H "X-Protocol-Version: 1" \
  -H "Authorization: Bearer $TOKEN" \
  -d '{"agentId":"agent-1","routeId":"route-1","executionId":"exec-1","status":"COMPLETED","startTime":"2026-03-11T00:00:00Z","endTime":"2026-03-11T00:00:01Z","processorExecutions":[]}'

# Post route diagram
curl -s -X POST http://localhost:8081/api/v1/data/diagrams \
  -H "Content-Type: application/json" \
  -H "X-Protocol-Version: 1" \
  -H "Authorization: Bearer $TOKEN" \
  -d '{"agentId":"agent-1","routeId":"route-1","version":1,"nodes":[],"edges":[]}'

# Post agent metrics
curl -s -X POST http://localhost:8081/api/v1/data/metrics \
  -H "Content-Type: application/json" \
  -H "X-Protocol-Version: 1" \
  -H "Authorization: Bearer $TOKEN" \
  -d '[{"agentId":"agent-1","metricName":"cpu","value":42.0,"timestamp":"2026-03-11T00:00:00Z","tags":{}}]'

Note: The X-Protocol-Version: 1 header is required on all /api/v1/data/** endpoints. A missing or unsupported version returns 400 Bad Request.

Health & Docs

# Health check
curl -s http://localhost:8081/api/v1/health

# OpenAPI JSON
curl -s http://localhost:8081/api/v1/api-docs

# Swagger UI (open is the macOS launcher; on Linux use xdg-open)
open http://localhost:8081/api/v1/swagger-ui.html
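
In scripts it is often useful to wait for the health endpoint before registering agents or posting data. The sketch below assumes the endpoint answers with a 2xx status once the server is ready; wait_for_health and HEALTH_POLL_DELAY are hypothetical names, not part of the project.

```shell
# Hypothetical helper: poll the health endpoint until it answers successfully.
# Usage: wait_for_health [url] [max-tries]
wait_for_health() {
  url="${1:-http://localhost:8081/api/v1/health}"
  tries="${2:-30}"
  while [ "$tries" -gt 0 ]; do
    # -f makes curl exit non-zero on HTTP errors (status 400 and above).
    if curl -sf -o /dev/null "$url"; then
      return 0
    fi
    tries=$((tries - 1))
    sleep "${HEALTH_POLL_DELAY:-1}"
  done
  return 1
}
```

For example, `wait_for_health && echo "server is up"` blocks for at most about 30 seconds with the defaults.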

Search (Phase 2)

# Search by status (GET with basic filters)
curl -s -H "Authorization: Bearer $TOKEN" \
  "http://localhost:8081/api/v1/search/executions?status=COMPLETED&limit=10"

# Search by time range
curl -s -H "Authorization: Bearer $TOKEN" \
  "http://localhost:8081/api/v1/search/executions?timeFrom=2026-03-11T00:00:00Z&timeTo=2026-03-12T00:00:00Z"

# Advanced search (POST with full-text)
curl -s -X POST http://localhost:8081/api/v1/search/executions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $TOKEN" \
  -d '{"status":"FAILED","text":"NullPointerException","limit":20}'

# Transaction detail (nested processor tree)
curl -s -H "Authorization: Bearer $TOKEN" \
  http://localhost:8081/api/v1/executions/{executionId}

# Processor exchange snapshot
curl -s -H "Authorization: Bearer $TOKEN" \
  http://localhost:8081/api/v1/executions/{executionId}/processors/{index}/snapshot

# Render diagram as SVG
curl -s -H "Authorization: Bearer $TOKEN" \
  -H "Accept: image/svg+xml" \
  http://localhost:8081/api/v1/diagrams/{contentHash}/render

# Render diagram as JSON layout
curl -s -H "Authorization: Bearer $TOKEN" \
  -H "Accept: application/json" \
  http://localhost:8081/api/v1/diagrams/{contentHash}/render

Search response format: { "data": [...], "total": N, "offset": 0, "limit": 50 }

Supported search filters (GET): status, timeFrom, timeTo, correlationId, limit, offset

Additional POST filters: durationMin, durationMax, text (global full-text), textInBody, textInHeaders, textInErrors

Agent Registry & SSE (Phase 3)

# Register an agent (uses bootstrap token, not JWT — see Authentication section above)
curl -s -X POST http://localhost:8081/api/v1/agents/register \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer my-secret-token" \
  -d '{"agentId":"agent-1","name":"Order Service","group":"order-service-prod","version":"1.0.0","routeIds":["route-1","route-2"],"capabilities":["deep-trace","replay"]}'

# Heartbeat (call every 30s)
curl -s -X POST http://localhost:8081/api/v1/agents/agent-1/heartbeat \
  -H "Authorization: Bearer $TOKEN"

# List agents (optionally filter by status)
curl -s -H "Authorization: Bearer $TOKEN" "http://localhost:8081/api/v1/agents"
curl -s -H "Authorization: Bearer $TOKEN" "http://localhost:8081/api/v1/agents?status=LIVE"

# Connect to SSE event stream (JWT via query parameter)
curl -s -N "http://localhost:8081/api/v1/agents/agent-1/events?token=$TOKEN"

# Send command to single agent
curl -s -X POST http://localhost:8081/api/v1/agents/agent-1/commands \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $TOKEN" \
  -d '{"type":"config-update","payload":{"samplingRate":0.5}}'

# Send command to agent group
curl -s -X POST http://localhost:8081/api/v1/agents/groups/order-service-prod/commands \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $TOKEN" \
  -d '{"type":"deep-trace","payload":{"routeId":"route-1","durationSeconds":60}}'

# Broadcast command to all live agents
curl -s -X POST http://localhost:8081/api/v1/agents/commands \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $TOKEN" \
  -d '{"type":"config-update","payload":{"samplingRate":1.0}}'

# Acknowledge command delivery
curl -s -X POST http://localhost:8081/api/v1/agents/agent-1/commands/{commandId}/ack \
  -H "Authorization: Bearer $TOKEN"

Agent lifecycle: LIVE (heartbeat received within the last 90s) → STALE (3 missed heartbeats at the 30s interval) → DEAD (5 min after becoming STALE). DEAD agents are kept indefinitely.
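
The thresholds can be turned into a small calculation. This sketch uses the documented defaults (90s to STALE, a further 300s to DEAD); whether the boundaries are inclusive is an assumption, and the server may only re-evaluate state periodically, so real transitions can lag slightly. agent_state is a hypothetical name.

```shell
# Defaults mirror agent-registry.stale-threshold-seconds and
# agent-registry.dead-threshold-seconds.
STALE_THRESHOLD_SECONDS="${STALE_THRESHOLD_SECONDS:-90}"
DEAD_THRESHOLD_SECONDS="${DEAD_THRESHOLD_SECONDS:-300}"

# Hypothetical helper: print the expected state of an agent whose last
# heartbeat arrived $1 seconds ago. Inclusive boundaries are an assumption.
agent_state() {
  idle="$1"
  if [ "$idle" -le "$STALE_THRESHOLD_SECONDS" ]; then
    echo LIVE
  elif [ "$idle" -le $((STALE_THRESHOLD_SECONDS + DEAD_THRESHOLD_SECONDS)) ]; then
    echo STALE   # the dead threshold counts from the moment the agent went STALE
  else
    echo DEAD
  fi
}
```

With the defaults, an agent that stops sending heartbeats is expected to be reported DEAD roughly 390 seconds after its last heartbeat.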

SSE events: config-update, deep-trace, and replay commands are pushed in real time. The server sends a ping keepalive every 15s.
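
To inspect the stream from a terminal, the raw text/event-stream output can be reduced to one line per event. The sketch below follows the standard SSE framing (event: and data: fields, a blank line ends an event, lines starting with a colon are comments). How the server encodes its ping keepalive is not stated here, so the helper simply skips comment lines and, unlike a browser's EventSource, also prints events that carry no data. parse_sse is a hypothetical name.

```shell
# Hypothetical helper: read an SSE stream on stdin and print one
# "event<TAB>data" line per event. Multi-line data fields are joined with a
# space for readability (the SSE standard joins them with newlines).
parse_sse() {
  awk '
    { sub(/\r$/, "") }                         # tolerate CRLF line endings
    /^:/      { next }                         # comment line, skipped
    /^event:/ { sub(/^event:[ ]?/, ""); event = $0; next }
    /^data:/  { sub(/^data:[ ]?/, "")
                data = (data == "" ? $0 : data " " $0); next }
    /^$/      {                                # blank line ends the event
                if (event != "" || data != "") {
                  name = (event == "" ? "message" : event)
                  print name "\t" data
                }
                event = ""; data = ""
              }
  '
}
```

Typical use is to pipe the stream into it: `curl -s -N "http://localhost:8081/api/v1/agents/agent-1/events?token=$TOKEN" | parse_sse`.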

Command expiry: Unacknowledged commands expire after 60 seconds.

Backpressure

When the write buffer is full (default capacity: 50,000), ingestion endpoints return 503 Service Unavailable. Already-buffered data is not lost.
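
A client can treat 503 as a signal to back off and retry. The following is a minimal sketch of that idea, not the agent's actual behaviour: post_with_retry, MAX_ATTEMPTS and RETRY_BASE_DELAY are hypothetical names, and the doubling delay is one reasonable choice among many.

```shell
# Hypothetical helper: POST a JSON body to an ingestion endpoint and retry
# with exponential backoff while the server answers 503.
# Usage: post_with_retry <url> <json-body>   (expects TOKEN to hold a JWT)
post_with_retry() {
  url="$1"; body="$2"
  attempt=1
  delay="${RETRY_BASE_DELAY:-1}"
  while [ "$attempt" -le "${MAX_ATTEMPTS:-5}" ]; do
    # Print only the HTTP status code; fall back to 000 if curl itself fails.
    code=$(curl -s -o /dev/null -w '%{http_code}' -X POST "$url" \
      -H "Content-Type: application/json" \
      -H "X-Protocol-Version: 1" \
      -H "Authorization: Bearer $TOKEN" \
      -d "$body") || code=000
    case "$code" in
      202) return 0 ;;                  # accepted into the write buffer
      503) sleep "$delay"               # buffer full: wait, then retry
           delay=$((delay * 2)) ;;
      *)   echo "unexpected HTTP status: $code" >&2
           return 1 ;;
    esac
    attempt=$((attempt + 1))
  done
  echo "gave up after $((attempt - 1)) attempts" >&2
  return 1
}
```

For example, `post_with_retry http://localhost:8081/api/v1/data/metrics "$METRICS_JSON"`, where METRICS_JSON is a placeholder for a payload like the one in the Ingestion section.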

Configuration

Key settings in cameleer3-server-app/src/main/resources/application.yml:

| Setting | Default | Description |
|---|---|---|
| server.port | 8081 | Server port |
| ingestion.buffer-capacity | 50000 | Max items in write buffer |
| ingestion.batch-size | 5000 | Items per ClickHouse batch insert |
| ingestion.flush-interval-ms | 1000 | Buffer flush interval (ms) |
| ingestion.data-ttl-days | 30 | ClickHouse TTL for auto-deletion |
| agent-registry.heartbeat-interval-seconds | 30 | Expected heartbeat interval |
| agent-registry.stale-threshold-seconds | 90 | Time before agent marked STALE |
| agent-registry.dead-threshold-seconds | 300 | Time after STALE before DEAD |
| agent-registry.command-expiry-seconds | 60 | Pending command TTL |
| agent-registry.keepalive-interval-seconds | 15 | SSE ping keepalive interval |
| security.access-token-expiry-ms | 3600000 | JWT access token lifetime (1h) |
| security.refresh-token-expiry-ms | 604800000 | Refresh token lifetime (7d) |
| security.bootstrap-token | ${CAMELEER_AUTH_TOKEN} | Bootstrap token for agent registration (required) |
| security.bootstrap-token-previous | ${CAMELEER_AUTH_TOKEN_PREVIOUS} | Previous bootstrap token for rotation (optional) |

Running Tests

Integration tests use Testcontainers (starts ClickHouse automatically — requires Docker):

# All tests
mvn verify

# Unit tests only (no Docker needed)
mvn test -pl cameleer3-server-core

# Specific integration test
mvn test -pl cameleer3-server-app -Dtest=ExecutionControllerIT

Verify ClickHouse Data

After posting data and waiting for the flush interval (1s default):

docker exec -it cameleer3-server-clickhouse-1 clickhouse-client \
  --user cameleer --password cameleer_dev -d cameleer3 \
  -q "SELECT count() FROM route_executions"

Kubernetes Deployment

The full stack is deployed to k3s via CI/CD on push to main. K8s manifests are in deploy/.

Architecture

cameleer namespace:
  ClickHouse (StatefulSet, 2Gi PVC) ← clickhouse:8123 (ClusterIP)
  cameleer3-server (Deployment)     ← NodePort 30081
  cameleer3-sample (Deployment)     ← NodePort 30080  (from cameleer3 repo)

Access (from your network)

| Service | URL |
|---|---|
| Server API | http://192.168.50.86:30081/api/v1/health |
| Swagger UI | http://192.168.50.86:30081/api/v1/swagger-ui.html |
| Sample App | http://192.168.50.86:30080/api/orders |

CI/CD Pipeline

Push to main triggers: build (Maven, unit tests) → docker (buildx cross-compile amd64, push to Gitea registry) → deploy (kubectl apply + rolling update).

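Required Gitea org secrets: REGISTRY_TOKEN, KUBECONFIG_BASE64, CAMELEER_AUTH_TOKEN, CLICKHOUSE_USER, CLICKHOUSE_PASSWORD. The deploy step uses CLICKHOUSE_USER and CLICKHOUSE_PASSWORD to create the clickhouse-credentials Kubernetes Secret.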
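
If the clickhouse-credentials Secret ever has to be created or refreshed by hand, a common idempotent pattern is to render it client-side and pipe it into kubectl apply. This is a sketch of that pattern, not necessarily what the CI deploy step runs. The key names user and password are assumptions, so match them to the secretKeyRef entries in the manifests under deploy/. apply_clickhouse_secret is a hypothetical name.

```shell
# Hypothetical helper: create or update the clickhouse-credentials Secret in
# the cameleer namespace. Safe to run repeatedly, because apply updates an
# existing Secret instead of failing like a plain "create" would.
# Assumption: the key names "user" and "password" match the deploy manifests.
apply_clickhouse_secret() {
  kubectl -n cameleer create secret generic clickhouse-credentials \
    --from-literal=user="$CLICKHOUSE_USER" \
    --from-literal=password="$CLICKHOUSE_PASSWORD" \
    --dry-run=client -o yaml | kubectl -n cameleer apply -f -
}
```

Export CLICKHOUSE_USER and CLICKHOUSE_PASSWORD in the shell before calling the function, then restart the workloads that read the Secret so they pick up the new values.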

Manual K8s Commands

# Check pod status
kubectl -n cameleer get pods

# View server logs
kubectl -n cameleer logs -f deploy/cameleer3-server

# View ClickHouse logs
kubectl -n cameleer logs -f statefulset/clickhouse

# Restart server
kubectl -n cameleer rollout restart deployment/cameleer3-server