Files
cameleer-server/HOWTO.md
2026-04-05 13:15:09 +02:00

497 lines
20 KiB
Markdown

# HOWTO — Cameleer3 Server
## Prerequisites
- Java 17+
- Maven 3.9+
- Node.js 22+ and npm
- Docker & Docker Compose
- Access to the Gitea Maven registry (for `cameleer3-common` dependency)
## Build
```bash
# Build UI first (required for embedded mode)
cd ui && npm ci && npm run build && cd ..
# Backend
mvn clean compile # compile only
mvn clean verify # compile + run all tests (needs Docker for integration tests)
```
## Infrastructure Setup
Start PostgreSQL:
```bash
docker compose up -d
```
This starts PostgreSQL 16. The database schema is applied automatically via Flyway migrations on server startup. ClickHouse tables are created by the schema initializer on startup.
| Service | Port | Purpose |
|------------|------|----------------------|
| PostgreSQL | 5432 | JDBC (Spring JDBC) |
PostgreSQL credentials: `cameleer` / `cameleer_dev`, database `cameleer3`.
## Run the Server
```bash
mvn clean package -DskipTests
CAMELEER_AUTH_TOKEN=my-secret-token java -jar cameleer3-server-app/target/cameleer3-server-app-1.0-SNAPSHOT.jar
```
The server starts on **port 8081**. The `CAMELEER_AUTH_TOKEN` environment variable is **required** — the server fails fast on startup if it is not set.
For token rotation without downtime, set `CAMELEER_AUTH_TOKEN_PREVIOUS` to the old token while rolling out the new one. The server accepts both during the overlap window.
## API Endpoints
### Authentication (Phase 4)
All endpoints except health, registration, and docs require a JWT Bearer token. The typical flow:
```bash
# 1. Register agent (requires bootstrap token)
curl -s -X POST http://localhost:8081/api/v1/agents/register \
-H "Content-Type: application/json" \
-H "Authorization: Bearer my-secret-token" \
-d '{"agentId":"agent-1","name":"Order Service","group":"order-service-prod","version":"1.0.0","routeIds":["route-1"],"capabilities":["deep-trace","replay"]}'
# Response includes: accessToken, refreshToken, serverPublicKey (Ed25519, Base64)
# 2. Use access token for all subsequent requests
TOKEN="<accessToken from registration>"
# 3. Refresh when access token expires (1h default)
curl -s -X POST http://localhost:8081/api/v1/agents/agent-1/refresh \
-H "Authorization: Bearer <refreshToken>"
# Response: { "accessToken": "new-jwt" }
```
**UI Login (for browser access):**
```bash
# Login with UI credentials (returns JWT tokens)
curl -s -X POST http://localhost:8081/api/v1/auth/login \
-H "Content-Type: application/json" \
-d '{"username":"admin","password":"admin"}'
# Response: { "accessToken": "...", "refreshToken": "..." }
# Refresh UI token
curl -s -X POST http://localhost:8081/api/v1/auth/refresh \
-H "Content-Type: application/json" \
-d '{"refreshToken":"<refreshToken>"}'
```
UI credentials are configured via `CAMELEER_UI_USER` / `CAMELEER_UI_PASSWORD` env vars (default: `admin` / `admin`).
**Public endpoints (no JWT required):** `GET /api/v1/health`, `POST /api/v1/agents/register` (uses bootstrap token), `POST /api/v1/auth/**`, OpenAPI/Swagger docs.
**Protected endpoints (JWT required):** All other endpoints including ingestion, search, agent management, commands.
**SSE connections:** Authenticated via query parameter: `/agents/{id}/events?token=<jwt>` (EventSource API doesn't support custom headers).
**Ed25519 signatures:** All SSE command payloads (config-update, deep-trace, replay) include a `signature` field. Agents verify payload integrity using the `serverPublicKey` received during registration. The server generates a new ephemeral keypair on each startup — agents must re-register to get the new key.
### RBAC (Role-Based Access Control)
JWTs carry a `roles` claim. Endpoints are restricted by role:
| Role | Access |
|------|--------|
| `AGENT` | Data ingestion (`/data/**` — executions, diagrams, metrics, logs), heartbeat, SSE events, command ack |
| `VIEWER` | Search, execution detail, diagrams, agent list |
| `OPERATOR` | VIEWER + send commands to agents |
| `ADMIN` | OPERATOR + user management (`/admin/**`) |
The env-var local user gets `ADMIN` role. Agents get `AGENT` role at registration.
### OIDC Login (Optional)
OIDC configuration is stored in PostgreSQL and managed via the admin API or UI. The SPA checks if OIDC is available:
```bash
# 1. SPA checks if OIDC is available (returns 404 if not configured)
curl -s http://localhost:8081/api/v1/auth/oidc/config
# Returns: { "issuer": "...", "clientId": "...", "authorizationEndpoint": "..." }
# 2. After OIDC redirect, SPA sends the authorization code
curl -s -X POST http://localhost:8081/api/v1/auth/oidc/callback \
-H "Content-Type: application/json" \
-d '{"code":"auth-code-from-provider","redirectUri":"http://localhost:5173/callback"}'
# Returns: { "accessToken": "...", "refreshToken": "..." }
```
Local login remains available as fallback even when OIDC is enabled.
### OIDC Admin Configuration (ADMIN only)
OIDC settings are managed at runtime via the admin API. No server restart needed.
```bash
# Get current OIDC config
curl -s -H "Authorization: Bearer $TOKEN" http://localhost:8081/api/v1/admin/oidc
# Save OIDC config (client_secret: send "********" to keep existing, or new value to update)
curl -s -X PUT http://localhost:8081/api/v1/admin/oidc \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $TOKEN" \
-d '{
"enabled": true,
"issuerUri": "http://logto:3001/oidc",
"clientId": "your-client-id",
"clientSecret": "your-client-secret",
"rolesClaim": "realm_access.roles",
"defaultRoles": ["VIEWER"]
}'
# Test OIDC provider connectivity
curl -s -X POST http://localhost:8081/api/v1/admin/oidc/test \
-H "Authorization: Bearer $TOKEN"
# Delete OIDC config (disables OIDC)
curl -s -X DELETE http://localhost:8081/api/v1/admin/oidc \
-H "Authorization: Bearer $TOKEN"
```
**Initial provisioning**: OIDC can also be seeded from `CAMELEER_OIDC_*` env vars on first startup (when DB is empty). After that, the admin API takes over.
### Logto Setup (OIDC Provider)
Logto is deployed alongside the Cameleer stack. After first deployment:
1. **Initial setup**: Open `http://192.168.50.86:30952` (admin console) and create the admin account
2. **Create SPA application**: Applications → Create → Single Page App
- Name: `Cameleer UI`
- Redirect URI: `http://192.168.50.86:30090/oidc/callback` (or your UI URL)
- Note the **Client ID**
3. **Create API Resource**: API Resources → Create
- Name: `Cameleer Server API`
- Indicator: `https://cameleer.siegeln.net/api` (or your API URL)
- Add permissions: `admin`, `operator`, `viewer`
4. **Create M2M application** (for SaaS platform): Applications → Create → Machine-to-Machine
- Name: `Cameleer SaaS`
- Assign the API Resource created above with `admin` scope
- Note the **Client ID** and **Client Secret**
5. **Configure Cameleer**: Use the admin API (`PUT /api/v1/admin/oidc`) or set env vars for initial seeding:
```
CAMELEER_OIDC_ENABLED=true
CAMELEER_OIDC_ISSUER=http://logto:3001/oidc
CAMELEER_OIDC_CLIENT_ID=<client-id-from-step-2>
CAMELEER_OIDC_CLIENT_SECRET=<not-needed-for-public-spa>
```
6. **Configure resource server** (for M2M token validation):
```
CAMELEER_OIDC_ISSUER_URI=http://logto:3001/oidc
CAMELEER_OIDC_AUDIENCE=https://cameleer.siegeln.net/api
```
### User Management (ADMIN only)
```bash
# List all users
curl -s -H "Authorization: Bearer $TOKEN" http://localhost:8081/api/v1/admin/users
# Update user roles
curl -s -X PUT http://localhost:8081/api/v1/admin/users/{userId}/roles \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $TOKEN" \
-d '{"roles":["VIEWER","OPERATOR"]}'
# Delete user
curl -s -X DELETE http://localhost:8081/api/v1/admin/users/{userId} \
-H "Authorization: Bearer $TOKEN"
```
### Ingestion (POST, returns 202 Accepted)
```bash
# Post route execution data (JWT required)
curl -s -X POST http://localhost:8081/api/v1/data/executions \
-H "Content-Type: application/json" \
-H "X-Protocol-Version: 1" \
-H "Authorization: Bearer $TOKEN" \
-d '{"agentId":"agent-1","routeId":"route-1","executionId":"exec-1","status":"COMPLETED","startTime":"2026-03-11T00:00:00Z","endTime":"2026-03-11T00:00:01Z","processorExecutions":[]}'
# Post route diagram
curl -s -X POST http://localhost:8081/api/v1/data/diagrams \
-H "Content-Type: application/json" \
-H "X-Protocol-Version: 1" \
-H "Authorization: Bearer $TOKEN" \
-d '{"agentId":"agent-1","routeId":"route-1","version":1,"nodes":[],"edges":[]}'
# Post agent metrics
curl -s -X POST http://localhost:8081/api/v1/data/metrics \
-H "Content-Type: application/json" \
-H "X-Protocol-Version: 1" \
-H "Authorization: Bearer $TOKEN" \
-d '[{"agentId":"agent-1","metricName":"cpu","value":42.0,"timestamp":"2026-03-11T00:00:00Z","tags":{}}]'
# Post application log entries (batch)
curl -s -X POST http://localhost:8081/api/v1/data/logs \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $TOKEN" \
-d '{
"entries": [{
"timestamp": "2026-03-25T10:00:00Z",
"level": "INFO",
"loggerName": "com.acme.MyService",
"message": "Processing order #12345",
"threadName": "main"
}]
}'
```
**Note:** The `X-Protocol-Version: 1` header is required on all `/api/v1/data/**` endpoints. Missing or wrong version returns 400.
### Health & Docs
```bash
# Health check
curl -s http://localhost:8081/api/v1/health
# OpenAPI JSON
curl -s http://localhost:8081/api/v1/api-docs
# Swagger UI
open http://localhost:8081/api/v1/swagger-ui.html
```
### Search (Phase 2)
```bash
# Search by status (GET with basic filters)
curl -s -H "Authorization: Bearer $TOKEN" \
"http://localhost:8081/api/v1/search/executions?status=COMPLETED&limit=10"
# Search by time range
curl -s -H "Authorization: Bearer $TOKEN" \
"http://localhost:8081/api/v1/search/executions?timeFrom=2026-03-11T00:00:00Z&timeTo=2026-03-12T00:00:00Z"
# Advanced search (POST with full-text)
curl -s -X POST http://localhost:8081/api/v1/search/executions \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $TOKEN" \
-d '{"status":"FAILED","text":"NullPointerException","limit":20}'
# Transaction detail (nested processor tree)
curl -s -H "Authorization: Bearer $TOKEN" \
http://localhost:8081/api/v1/executions/{executionId}
# Processor exchange snapshot
curl -s -H "Authorization: Bearer $TOKEN" \
http://localhost:8081/api/v1/executions/{executionId}/processors/{index}/snapshot
# Render diagram as SVG
curl -s -H "Authorization: Bearer $TOKEN" \
-H "Accept: image/svg+xml" \
http://localhost:8081/api/v1/diagrams/{contentHash}/render
# Render diagram as JSON layout
curl -s -H "Authorization: Bearer $TOKEN" \
-H "Accept: application/json" \
http://localhost:8081/api/v1/diagrams/{contentHash}/render
```
**Search response format:** `{ "data": [...], "total": N, "offset": 0, "limit": 50 }`
**Supported search filters (GET):** `status`, `timeFrom`, `timeTo`, `correlationId`, `limit`, `offset`
**Additional POST filters:** `durationMin`, `durationMax`, `text` (global full-text), `textInBody`, `textInHeaders`, `textInErrors`
### Agent Registry & SSE (Phase 3)
```bash
# Register an agent (uses bootstrap token, not JWT — see Authentication section above)
curl -s -X POST http://localhost:8081/api/v1/agents/register \
-H "Content-Type: application/json" \
-H "Authorization: Bearer my-secret-token" \
-d '{"agentId":"agent-1","name":"Order Service","group":"order-service-prod","version":"1.0.0","routeIds":["route-1","route-2"],"capabilities":["deep-trace","replay"]}'
# Heartbeat (call every 30s)
curl -s -X POST http://localhost:8081/api/v1/agents/agent-1/heartbeat \
-H "Authorization: Bearer $TOKEN"
# List agents (optionally filter by status)
curl -s -H "Authorization: Bearer $TOKEN" "http://localhost:8081/api/v1/agents"
curl -s -H "Authorization: Bearer $TOKEN" "http://localhost:8081/api/v1/agents?status=LIVE"
# Connect to SSE event stream (JWT via query parameter)
curl -s -N "http://localhost:8081/api/v1/agents/agent-1/events?token=$TOKEN"
# Send command to single agent
curl -s -X POST http://localhost:8081/api/v1/agents/agent-1/commands \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $TOKEN" \
-d '{"type":"config-update","payload":{"samplingRate":0.5}}'
# Send command to agent group
curl -s -X POST http://localhost:8081/api/v1/agents/groups/order-service-prod/commands \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $TOKEN" \
-d '{"type":"deep-trace","payload":{"routeId":"route-1","durationSeconds":60}}'
# Send route control command to agent group (start/stop/suspend/resume)
curl -s -X POST http://localhost:8081/api/v1/agents/groups/order-service-prod/commands \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $TOKEN" \
-d '{"type":"route-control","payload":{"routeId":"route-1","action":"stop","nonce":"unique-uuid"}}'
# Broadcast command to all live agents
curl -s -X POST http://localhost:8081/api/v1/agents/commands \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $TOKEN" \
-d '{"type":"config-update","payload":{"samplingRate":1.0}}'
# Acknowledge command delivery
curl -s -X POST http://localhost:8081/api/v1/agents/agent-1/commands/{commandId}/ack \
-H "Authorization: Bearer $TOKEN"
```
**Agent lifecycle:** LIVE (heartbeat within 90s) → STALE (missed 3 heartbeats) → DEAD (5min after STALE). DEAD agents kept indefinitely.
**Server restart resilience:** The agent registry is in-memory and lost on server restart. Agents auto-re-register on their next heartbeat or SSE connection — the server reconstructs registry entries from JWT claims (subject, application). Route catalog uses ClickHouse execution data as fallback until agents re-register with full route IDs. Agents should also handle 404 on heartbeat by triggering a full re-registration.
**SSE events:** `config-update`, `deep-trace`, `replay`, `route-control` commands pushed in real time. Server sends ping keepalive every 15s.
**Command expiry:** Unacknowledged commands expire after 60 seconds.
**Route control responses:** Route control commands return `CommandGroupResponse` with per-agent status, response count, and timed-out agent IDs.
### Backpressure
When the write buffer is full (default capacity: 50,000), ingestion endpoints return **503 Service Unavailable**. Already-buffered data is not lost.
## Configuration
Key settings in `cameleer3-server-app/src/main/resources/application.yml`:
| Setting | Default | Description |
|---------|---------|-------------|
| `server.port` | 8081 | Server port |
| `ingestion.buffer-capacity` | 50000 | Max items in write buffer |
| `ingestion.batch-size` | 5000 | Items per batch insert |
| `ingestion.flush-interval-ms` | 1000 | Buffer flush interval (ms) |
| `agent-registry.heartbeat-interval-seconds` | 30 | Expected heartbeat interval |
| `agent-registry.stale-threshold-seconds` | 90 | Time before agent marked STALE |
| `agent-registry.dead-threshold-seconds` | 300 | Time after STALE before DEAD |
| `agent-registry.command-expiry-seconds` | 60 | Pending command TTL |
| `agent-registry.keepalive-interval-seconds` | 15 | SSE ping keepalive interval |
| `security.access-token-expiry-ms` | 3600000 | JWT access token lifetime (1h) |
| `security.refresh-token-expiry-ms` | 604800000 | Refresh token lifetime (7d) |
| `security.bootstrap-token` | `${CAMELEER_AUTH_TOKEN}` | Bootstrap token for agent registration (required) |
| `security.bootstrap-token-previous` | `${CAMELEER_AUTH_TOKEN_PREVIOUS}` | Previous bootstrap token for rotation (optional) |
| `security.ui-user` | `admin` | UI login username (`CAMELEER_UI_USER` env var) |
| `security.ui-password` | `admin` | UI login password (`CAMELEER_UI_PASSWORD` env var) |
| `security.ui-origin` | `http://localhost:5173` | CORS allowed origin for UI (`CAMELEER_UI_ORIGIN` env var) |
| `security.jwt-secret` | *(random)* | HMAC secret for JWT signing (`CAMELEER_JWT_SECRET`). If set, tokens survive restarts |
| `security.oidc.enabled` | `false` | Enable OIDC login (`CAMELEER_OIDC_ENABLED`) |
| `security.oidc.issuer-uri` | | OIDC provider issuer URL (`CAMELEER_OIDC_ISSUER`) |
| `security.oidc.client-id` | | OAuth2 client ID (`CAMELEER_OIDC_CLIENT_ID`) |
| `security.oidc.client-secret` | | OAuth2 client secret (`CAMELEER_OIDC_CLIENT_SECRET`) |
| `security.oidc.roles-claim` | `realm_access.roles` | JSONPath to roles in OIDC id_token (`CAMELEER_OIDC_ROLES_CLAIM`) |
| `security.oidc.default-roles` | `VIEWER` | Default roles for new OIDC users (`CAMELEER_OIDC_DEFAULT_ROLES`) |
| `cameleer.indexer.debounce-ms` | `2000` | Search indexer debounce delay (`CAMELEER_INDEXER_DEBOUNCE_MS`) |
| `cameleer.indexer.queue-size` | `10000` | Search indexer queue capacity (`CAMELEER_INDEXER_QUEUE_SIZE`) |
## Web UI Development
```bash
cd ui
npm install
npm run dev # Vite dev server on http://localhost:5173 (proxies /api to :8081)
npm run build # Production build to ui/dist/
```
Login with `admin` / `admin` (or whatever `CAMELEER_UI_USER` / `CAMELEER_UI_PASSWORD` are set to).
The UI uses runtime configuration via `public/config.js`. In Kubernetes, a ConfigMap overrides this file to set the correct API base URL.
### Regenerate API Types
When the backend OpenAPI spec changes:
```bash
cd ui
npm run generate-api # Requires backend running on :8081
```
## Running Tests
Integration tests use Testcontainers (starts PostgreSQL automatically — requires Docker):
```bash
# All tests
mvn verify
# Unit tests only (no Docker needed)
mvn test -pl cameleer3-server-core
# Specific integration test
mvn test -pl cameleer3-server-app -Dtest=ExecutionControllerIT
```
## Verify Database Data
After posting data and waiting for the flush interval (1s default):
```bash
docker exec -it cameleer3-server-postgres-1 psql -U cameleer -d cameleer3 \
-c "SELECT count(*) FROM route_executions"
```
## Kubernetes Deployment
The full stack is deployed to k3s via CI/CD on push to `main`. K8s manifests are in `deploy/`.
### Architecture
```
cameleer namespace:
PostgreSQL (StatefulSet, 10Gi PVC) ← postgres:5432 (ClusterIP)
ClickHouse (StatefulSet, 10Gi PVC) ← clickhouse:8123 (ClusterIP)
cameleer3-server (Deployment) ← NodePort 30081
cameleer3-ui (Deployment, Nginx) ← NodePort 30090
cameleer-deploy-demo (Deployment) ← NodePort 30092
Logto Server (Deployment) ← NodePort 30951/30952
Logto PostgreSQL (StatefulSet, 1Gi) ← ClusterIP
cameleer-demo namespace:
(deployed Camel applications — managed by cameleer-deploy-demo)
```
### Access (from your network)
| Service | URL |
|---------|-----|
| Web UI | `http://192.168.50.86:30090` |
| Server API | `http://192.168.50.86:30081/api/v1/health` |
| Swagger UI | `http://192.168.50.86:30081/api/v1/swagger-ui.html` |
| Deploy Demo | `http://192.168.50.86:30092` |
| Logto API | `http://192.168.50.86:30951` |
| Logto Admin | `http://192.168.50.86:30952` |
### CI/CD Pipeline
Push to `main` triggers: **build** (UI npm + Maven, unit tests) → **docker** (buildx amd64 for server + UI, push to Gitea registry) → **deploy** (kubectl apply + rolling update).
Required Gitea org secrets: `REGISTRY_TOKEN`, `KUBECONFIG_BASE64`, `CAMELEER_AUTH_TOKEN`, `CAMELEER_JWT_SECRET`, `POSTGRES_USER`, `POSTGRES_PASSWORD`, `POSTGRES_DB`, `CLICKHOUSE_USER`, `CLICKHOUSE_PASSWORD`, `CAMELEER_UI_USER` (optional), `CAMELEER_UI_PASSWORD` (optional), `LOGTO_PG_USER`, `LOGTO_PG_PASSWORD`, `LOGTO_ENDPOINT`, `LOGTO_ADMIN_ENDPOINT`, `CAMELEER_OIDC_ENABLED`, `CAMELEER_OIDC_ISSUER`, `CAMELEER_OIDC_CLIENT_ID`, `CAMELEER_OIDC_CLIENT_SECRET`, `CAMELEER_OIDC_ISSUER_URI` (optional), `CAMELEER_OIDC_AUDIENCE` (optional).
### Manual K8s Commands
```bash
# Check pod status
kubectl -n cameleer get pods
# View server logs
kubectl -n cameleer logs -f deploy/cameleer3-server
# View PostgreSQL logs
kubectl -n cameleer logs -f statefulset/postgres
# View ClickHouse logs
kubectl -n cameleer logs -f statefulset/clickhouse
# Restart server
kubectl -n cameleer rollout restart deployment/cameleer3-server
```