cameleer-server/CLAUDE.md
hsiegeln c6aef5ab35
fix(deploy): Checkpoints — preserve STOPPED history, fix filter + placement
- Backend: rename deleteTerminalByAppAndEnvironment → deleteFailedByAppAndEnvironment.
  STOPPED rows were being wiped on every redeploy, so Checkpoints was always empty.
  Now only FAILED rows are pruned; STOPPED deployments are retained as restorable
  checkpoints (they still carry deployed_config_snapshot from their RUNNING window).
- UI filter: any deployment with a snapshot is a checkpoint (was RUNNING|DEGRADED only,
  which excluded the main case — the previous blue/green deployment now in STOPPED).
- UI placement: Checkpoints disclosure now renders inside IdentitySection, matching
  the design spec.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-23 10:26:46 +02:00


CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

Project

Cameleer Server — observability server that receives, stores, and serves Camel route execution data and route diagrams from Cameleer agents. Pushes config and commands to agents via SSE. Also orchestrates Docker container deployments when running under cameleer-saas.

  • cameleer (https://gitea.siegeln.net/cameleer/cameleer) — the Java agent that instruments Camel applications
  • Protocol defined in cameleer-common/PROTOCOL.md in the agent repo
  • This server depends on com.cameleer:cameleer-common (shared models and graph API)

Modules

  • cameleer-server-core — domain logic, storage interfaces, services (no Spring dependencies)
  • cameleer-server-app — Spring Boot web app, REST controllers, SSE, persistence, Docker orchestration

Build Commands

mvn clean compile          # Compile all modules
mvn clean verify           # Full build with tests

Run

java -jar cameleer-server-app/target/cameleer-server-app-1.0-SNAPSHOT.jar

Key Conventions

  • Java 17+ required
  • Spring Boot 3.4.3 parent POM
  • Depends on com.cameleer:cameleer-common from Gitea Maven registry
  • Jackson JavaTimeModule for Instant deserialization
  • Communication: receives HTTP POST data from agents (executions, diagrams, metrics, logs), serves SSE event streams for config push/commands (config-update, deep-trace, replay, route-control)
  • URL taxonomy: user-facing data, config, and query endpoints live under /api/v1/environments/{envSlug}/.... Env is a path segment, resolved via the @EnvPath argument resolver (404 on unknown slug). Flat endpoints are only for: agent self-service (JWT-authoritative), cross-env admin (RBAC, OIDC, audit, license, thresholds, env CRUD), cross-env discovery (/catalog), content-addressed lookups (/diagrams/{contentHash}/render, /executions/{id}), and auth. See .claude/rules/app-classes.md for the full allow-list.
  • Slug immutability: environment and app slugs are immutable after creation (both appear in URLs, Docker network names, container names, and ClickHouse partition keys). Slug regex ^[a-z0-9][a-z0-9-]{0,63}$ is enforced on POST; update endpoints silently drop any slug field in the request body, because the update DTOs do not declare it and Spring Boot's default Jackson configuration ignores unknown properties.
  • App uniqueness: (environment_id, app_slug) is the natural key. The same app slug can legitimately exist in multiple environments; AppService.getByEnvironmentAndSlug(envId, slug) is the canonical lookup for controllers. Bare getBySlug(slug) remains for internal use but is ambiguous across envs.
  • Environment filtering: all data queries filter by the selected environment. All commands target only agents in the selected environment. Env is required on every env-scoped endpoint (path param); the legacy ?environment= query form is retired.
  • Maintains agent instance registry (in-memory) with states: LIVE -> STALE -> DEAD. Auto-heals from JWT env claim + heartbeat body on heartbeat/SSE after server restart (priority: heartbeat environmentId > JWT env claim; no silent default — missing env on heartbeat auto-heal returns 400). Registration (POST /api/v1/agents/register) requires environmentId in the request body; missing or blank returns 400. Capabilities and route states updated on every heartbeat (protocol v2). Route catalog merges three sources: in-memory agent registry, persistent route_catalog table (ClickHouse), and stats_1m_route execution stats. The persistent catalog tracks first_seen/last_seen per route per environment, updated on every registration and heartbeat. Routes appear in the sidebar when their lifecycle overlaps the selected time window (first_seen <= to AND last_seen >= from), so historical routes remain visible even after being dropped from newer app versions.
  • Multi-tenancy: each server instance serves one tenant (configured via CAMELEER_SERVER_TENANT_ID, default: "default"). Environments (dev/staging/prod) are first-class. PostgreSQL isolated via schema-per-tenant (?currentSchema=tenant_{id}) and ApplicationName=tenant_{id} on the JDBC URL. ClickHouse shared DB with tenant_id + environment columns, partitioned by (tenant_id, toYYYYMM(timestamp)).
  • Storage: PostgreSQL for RBAC, config, and audit; ClickHouse for all observability data (executions, search, logs, metrics, stats, diagrams). ClickHouse schema migrations in clickhouse/*.sql, run idempotently on startup by ClickHouseSchemaInitializer. Use IF NOT EXISTS for CREATE and ADD PROJECTION.
  • Log exchange correlation: ClickHouseLogStore extracts exchange_id from log entry MDC, preferring cameleer.exchangeId over camel.exchangeId (fallback for older agents). For ON_COMPLETION exchange copies, the agent sets cameleer.exchangeId to the parent's exchange ID via CORRELATION_ID.
  • Log processor correlation: The agent sets cameleer.processorId in MDC, identifying which processor node emitted a log line.
  • Logging: ClickHouse JDBC set to INFO (com.clickhouse), HTTP client to WARN (org.apache.hc.client5) in application.yml
  • Security: JWT auth with RBAC (AGENT/VIEWER/OPERATOR/ADMIN roles), Ed25519 config signing (key derived deterministically from JWT secret via HMAC-SHA256), bootstrap token for registration. CORS: CAMELEER_SERVER_SECURITY_CORSALLOWEDORIGINS (comma-separated) overrides CAMELEER_SERVER_SECURITY_UIORIGIN for multi-origin setups. Infrastructure access: CAMELEER_SERVER_SECURITY_INFRASTRUCTUREENDPOINTS=false disables Database and ClickHouse admin endpoints. Last-ADMIN guard: system prevents removal of the last ADMIN role (409 Conflict). Password policy: min 12 chars, 3-of-4 character classes, no username match. Brute-force protection: 5 failed attempts -> 15 min lockout. Token revocation: token_revoked_before column on users, checked in JwtAuthenticationFilter, set on password change.
  • OIDC: Optional external identity provider support (token exchange pattern). Configured via admin API/UI, stored in database (server_config table). Resource server mode: accepts external access tokens (Logto M2M) via JWKS validation when CAMELEER_SERVER_SECURITY_OIDCISSUERURI is set. Scope-based role mapping via SystemRole.normalizeScope(). System roles synced on every OIDC login via applyClaimMappings() in OidcAuthController (calls clearManagedAssignments + assignManagedRole on RbacService) — always overwrites managed role assignments; uses managed assignment origin to avoid touching group-inherited or directly-assigned roles. Supports ES384, ES256, RS256.
  • OIDC role extraction: OidcTokenExchanger reads roles from the access_token first (JWT with at+jwt type), then falls back to id_token. OidcConfig includes audience (RFC 8707 resource indicator) and additionalScopes. All provider-specific configuration is external — no provider-specific code in the server.
  • Sensitive keys: Global enforced baseline for masking sensitive data in agent payloads. Merge rule: final = global UNION per-app (case-insensitive dedup, per-app can only add, never remove global keys).
  • User persistence: PostgreSQL users table, admin CRUD at /api/v1/admin/users. users.user_id is the bare identifier — local users as <username>, OIDC users as oidc:<sub>. JWT sub carries the user: namespace prefix so JwtAuthenticationFilter can tell user tokens from agent tokens; write paths (UiAuthController, OidcAuthController, UserAdminController) all upsert unprefixed, and env-scoped read-path controllers strip the user: prefix before using the value as an FK to users.user_id / user_roles.user_id. Alerting / outbound FKs (alert_rules.created_by, outbound_connections.created_by, …) therefore all reference the bare form.
  • Usage analytics: ClickHouse usage_events table tracks authenticated UI requests, flushed every 5s
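
Several of the conventions above are mechanical enough to sketch in code. The following is a minimal, dependency-free illustration, not the server's implementation: the class ConventionSketch and all of its method names are hypothetical. Only the slug regex, the sensitive-key merge rule, the route lifecycle overlap condition, the MDC key names, and the user: prefix are taken from the list above.

```java
import java.time.Instant;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.regex.Pattern;

/** Illustrative helpers only; names are hypothetical, not the server's API. */
final class ConventionSketch {

    // Slug regex quoted in the "Slug immutability" convention.
    private static final Pattern SLUG = Pattern.compile("^[a-z0-9][a-z0-9-]{0,63}$");

    static boolean isValidSlug(String slug) {
        return slug != null && SLUG.matcher(slug).matches();
    }

    // final = global UNION per-app, de-duplicated case-insensitively.
    // Global keys are inserted first, so per-app entries can only add keys.
    static Set<String> mergeSensitiveKeys(List<String> global, List<String> perApp) {
        Map<String, String> merged = new LinkedHashMap<>();
        for (String key : global) {
            merged.putIfAbsent(key.toLowerCase(Locale.ROOT), key);
        }
        for (String key : perApp) {
            merged.putIfAbsent(key.toLowerCase(Locale.ROOT), key);
        }
        return new LinkedHashSet<>(merged.values());
    }

    // A route is shown when first_seen <= to AND last_seen >= from.
    static boolean overlapsWindow(Instant firstSeen, Instant lastSeen, Instant from, Instant to) {
        return !firstSeen.isAfter(to) && !lastSeen.isBefore(from);
    }

    // Prefer cameleer.exchangeId; fall back to camel.exchangeId for older agents.
    static String exchangeId(Map<String, String> mdc) {
        String id = mdc.get("cameleer.exchangeId");
        return id != null ? id : mdc.get("camel.exchangeId");
    }

    // JWT sub carries the "user:" namespace prefix; users.user_id stores the bare form.
    static String bareUserId(String jwtSubject) {
        String prefix = "user:";
        return jwtSubject.startsWith(prefix) ? jwtSubject.substring(prefix.length()) : jwtSubject;
    }
}
```

For example, bareUserId("user:oidc:abc123") yields oidc:abc123, which matches the bare identifier form described under User persistence, and a per-app key PASSWORD adds nothing when the global list already contains password.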

Database Migrations

PostgreSQL (Flyway): cameleer-server-app/src/main/resources/db/migration/

  • V1 — Consolidated baseline schema. All prior V1-V18 evolution was collapsed before first prod deploy. Contains: RBAC (users, roles, groups, user_roles, user_groups, group_roles, claim_mapping_rules), runtime management (environments, apps, app_versions, deployments), env-scoped application config (application_config PK (application, environment), app_settings PK (application_id, environment)), audit_log, outbound_connections, server_config, and the full alerting subsystem (alert_rules, alert_rule_targets, alert_instances, alert_silences, alert_notifications). Seeds the 4 system roles (AGENT/VIEWER/OPERATOR/ADMIN), the Admins group with ADMIN role, and a default environment. Invariants covered by SchemaBootstrapIT.

ClickHouse: cameleer-server-app/src/main/resources/clickhouse/init.sql (run idempotently on startup)

Regenerating OpenAPI schema (SPA types)

After any change to REST controller paths, request/response DTOs, or @PathVariable/@RequestParam/@RequestBody signatures, regenerate the TypeScript types the SPA consumes. Required for every controller-level change.

# Backend must be running on :8081
cd ui && npm run generate-api:live   # fetches fresh openapi.json AND regenerates schema.d.ts
# OR, if openapi.json was updated by other means:
cd ui && npm run generate-api        # regenerates schema.d.ts from existing openapi.json

After regeneration, ui/src/api/schema.d.ts and ui/src/api/openapi.json will update. The TypeScript compiler then surfaces every SPA call site that needs updating — fix all compile errors before testing in the browser. Commit the regenerated files with the controller change.

Maintaining .claude/rules/

When adding, removing, or renaming classes, controllers, endpoints, UI components, or metrics, update the corresponding .claude/rules/ file as part of the same change. The rule files are the class/API map that future sessions rely on — stale rules cause wrong assumptions. Treat rule file updates like updating an import: part of the change, not a separate task.

Disabled Skills

  • Do NOT use any gsd:* skills in this project. This includes all /gsd: prefixed commands.

GitNexus — Code Intelligence

This project is indexed by GitNexus as cameleer-server (9318 symbols, 23997 relationships, 300 execution flows). Use the GitNexus MCP tools to understand code, assess impact, and navigate safely.

If any GitNexus tool warns the index is stale, run npx gitnexus analyze in terminal first.

Always Do

  • MUST run impact analysis before editing any symbol. Before modifying a function, class, or method, run gitnexus_impact({target: "symbolName", direction: "upstream"}) and report the blast radius (direct callers, affected processes, risk level) to the user.
  • MUST run gitnexus_detect_changes() before committing to verify your changes only affect expected symbols and execution flows.
  • MUST warn the user if impact analysis returns HIGH or CRITICAL risk before proceeding with edits.
  • When exploring unfamiliar code, use gitnexus_query({query: "concept"}) to find execution flows instead of grepping. It returns process-grouped results ranked by relevance.
  • When you need full context on a specific symbol — callers, callees, which execution flows it participates in — use gitnexus_context({name: "symbolName"}).

When Debugging

  1. gitnexus_query({query: "<error or symptom>"}) — find execution flows related to the issue
  2. gitnexus_context({name: "<suspect function>"}) — see all callers, callees, and process participation
  3. READ gitnexus://repo/cameleer-server/process/{processName} — trace the full execution flow step by step
  4. For regressions: gitnexus_detect_changes({scope: "compare", base_ref: "main"}) — see what your branch changed

When Refactoring

  • Renaming: MUST use gitnexus_rename({symbol_name: "old", new_name: "new", dry_run: true}) first. Review the preview — graph edits are safe, text_search edits need manual review. Then run with dry_run: false.
  • Extracting/Splitting: MUST run gitnexus_context({name: "target"}) to see all incoming/outgoing refs, then gitnexus_impact({target: "target", direction: "upstream"}) to find all external callers before moving code.
  • After any refactor: run gitnexus_detect_changes({scope: "all"}) to verify only expected files changed.

Never Do

  • NEVER edit a function, class, or method without first running gitnexus_impact on it.
  • NEVER ignore HIGH or CRITICAL risk warnings from impact analysis.
  • NEVER rename symbols with find-and-replace — use gitnexus_rename which understands the call graph.
  • NEVER commit changes without running gitnexus_detect_changes() to check affected scope.

Tools Quick Reference

| Tool | When to use | Command |
|------|-------------|---------|
| query | Find code by concept | gitnexus_query({query: "auth validation"}) |
| context | 360-degree view of one symbol | gitnexus_context({name: "validateUser"}) |
| impact | Blast radius before editing | gitnexus_impact({target: "X", direction: "upstream"}) |
| detect_changes | Pre-commit scope check | gitnexus_detect_changes({scope: "staged"}) |
| rename | Safe multi-file rename | gitnexus_rename({symbol_name: "old", new_name: "new", dry_run: true}) |
| cypher | Custom graph queries | gitnexus_cypher({query: "MATCH ..."}) |

Impact Risk Levels

| Depth | Meaning | Action |
|-------|---------|--------|
| d=1 | WILL BREAK — direct callers/importers | MUST update these |
| d=2 | LIKELY AFFECTED — indirect deps | Should test |
| d=3 | MAY NEED TESTING — transitive | Test if critical path |

Resources

| Resource | Use for |
|----------|---------|
| gitnexus://repo/cameleer-server/context | Codebase overview, check index freshness |
| gitnexus://repo/cameleer-server/clusters | All functional areas |
| gitnexus://repo/cameleer-server/processes | All execution flows |
| gitnexus://repo/cameleer-server/process/{name} | Step-by-step execution trace |

Self-Check Before Finishing

Before completing any code modification task, verify:

  1. gitnexus_impact was run for all modified symbols
  2. No HIGH/CRITICAL risk warnings were ignored
  3. gitnexus_detect_changes() confirms changes match expected scope
  4. All d=1 (WILL BREAK) dependents were updated

Keeping the Index Fresh

After committing code changes, the GitNexus index becomes stale. Re-run analyze to update it:

npx gitnexus analyze

If the index previously included embeddings, preserve them by adding --embeddings:

npx gitnexus analyze --embeddings

To check whether embeddings exist, inspect .gitnexus/meta.json — the stats.embeddings field shows the count (0 means no embeddings). Running analyze without --embeddings will delete any previously generated embeddings.
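
The decision described above (add --embeddings only when the index already has embeddings) can be sketched as a small pure function. This is an illustration under stated assumptions, not part of GitNexus or this repository: EmbeddingsCheck and its methods are hypothetical, and the regex assumes meta.json contains an "embeddings": <count> entry under stats as described above. A real JSON parser would be more robust than the regex used here to stay dependency-free.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper; not part of GitNexus or this repository. */
final class EmbeddingsCheck {

    // Assumption: the embeddings count appears as "embeddings": <number> in meta.json.
    private static final Pattern EMBEDDINGS =
            Pattern.compile("\"embeddings\"\\s*:\\s*(\\d+)");

    // Returns the embeddings count, or 0 when the field is absent.
    static long embeddingCount(String metaJson) {
        Matcher m = EMBEDDINGS.matcher(metaJson);
        return m.find() ? Long.parseLong(m.group(1)) : 0L;
    }

    // Chooses the analyze command so existing embeddings are not deleted.
    static String analyzeCommand(String metaJson) {
        return embeddingCount(metaJson) > 0
                ? "npx gitnexus analyze --embeddings"
                : "npx gitnexus analyze";
    }
}
```

Passing the contents of .gitnexus/meta.json to analyzeCommand returns the command to run; a count of 0 or a missing field falls back to the plain analyze command.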

Claude Code users: A PostToolUse hook handles this automatically after git commit and git merge.

CLI

| Task | Read this skill file |
|------|----------------------|
| Understand architecture / "How does X work?" | .claude/skills/gitnexus/gitnexus-exploring/SKILL.md |
| Blast radius / "What breaks if I change X?" | .claude/skills/gitnexus/gitnexus-impact-analysis/SKILL.md |
| Trace bugs / "Why is X failing?" | .claude/skills/gitnexus/gitnexus-debugging/SKILL.md |
| Rename / extract / split / refactor | .claude/skills/gitnexus/gitnexus-refactoring/SKILL.md |
| Tools, resources, schema reference | .claude/skills/gitnexus/gitnexus-guide/SKILL.md |
| Index, status, clean, wiki CLI commands | .claude/skills/gitnexus/gitnexus-cli/SKILL.md |