cameleer-server/.claude/rules/metrics.md
hsiegeln 810f493639
chore: track .claude/rules/ and add self-maintenance instruction
Un-ignore .claude/rules/ so path-scoped rule files are shared via git.
Add instruction in CLAUDE.md to update rule files when modifying classes,
controllers, endpoints, or metrics — keeps rules current as part of
normal workflow rather than requiring separate maintenance.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 09:26:53 +02:00

paths
cameleer-server-app/**/metrics/**
cameleer-server-app/**/ServerMetrics*
ui/src/pages/RuntimeTab/**
ui/src/pages/DashboardTab/**

Prometheus Metrics

The server exposes /api/v1/prometheus (unauthenticated, Prometheus text format). Spring Boot Actuator provides JVM, GC, thread pool, and http.server.requests metrics automatically. Business metrics are exposed via the ServerMetrics component:

Gauges (auto-polled)

| Metric | Tags | Source |
|---|---|---|
| cameleer.agents.connected | state (live, stale, dead, shutdown) | AgentRegistryService.findByState() |
| cameleer.agents.sse.active | | SseConnectionManager.getConnectionCount() |
| cameleer.ingestion.buffer.size | type (execution, processor, log, metrics) | WriteBuffer.size() |
| cameleer.ingestion.accumulator.pending | | ChunkAccumulator.getPendingCount() |

Counters

| Metric | Tags | Instrumented in |
|---|---|---|
| cameleer.ingestion.drops | reason (buffer_full, no_agent, no_identity) | LogIngestionController |
| cameleer.agents.transitions | transition (went_stale, went_dead, recovered) | AgentLifecycleMonitor |
| cameleer.deployments.outcome | status (running, failed, degraded) | DeploymentExecutor |
| cameleer.auth.failures | reason (invalid_token, revoked, oidc_rejected) | JwtAuthenticationFilter |

Timers

| Metric | Tags | Instrumented in |
|---|---|---|
| cameleer.ingestion.flush.duration | type (execution, processor, log) | ExecutionFlushScheduler |
| cameleer.deployments.duration | | DeploymentExecutor |
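
The dotted meter names in the gauge, counter, and timer tables usually look different in the scrape output. The sketch below approximates the renaming, assuming the server uses Micrometer's Prometheus registry with its default naming convention (dots become underscores, counters gain `_total`, timers gain `_seconds`). `PrometheusNameSketch` and `MeterType` are hypothetical names for illustration and are not part of cameleer-server; check the real endpoint output before relying on an exact name.

```java
// Hypothetical helper, not part of cameleer-server. It approximates how a
// Micrometer meter name is rendered in Prometheus text format under the
// default naming convention (an assumption about the server's setup).
final class PrometheusNameSketch {

    enum MeterType { GAUGE, COUNTER, TIMER }

    static String prometheusName(String micrometerName, MeterType type) {
        // Characters outside [a-zA-Z0-9_:] (notably dots) become underscores.
        String base = micrometerName.replaceAll("[^a-zA-Z0-9_:]", "_");
        switch (type) {
            case COUNTER:
                // Counters carry a _total suffix.
                return base.endsWith("_total") ? base : base + "_total";
            case TIMER:
                // Timers are reported in seconds.
                return base.endsWith("_seconds") ? base : base + "_seconds";
            default:
                return base;
        }
    }

    public static void main(String[] args) {
        System.out.println(prometheusName("cameleer.agents.connected", MeterType.GAUGE));
        System.out.println(prometheusName("cameleer.ingestion.drops", MeterType.COUNTER));
        System.out.println(prometheusName("cameleer.ingestion.flush.duration", MeterType.TIMER));
    }
}
```

Under the same assumption, a timer is scraped as several series sharing that base name (for example with `_count`, `_sum`, and `_max` suffixes), so PromQL queries and alerts should target those series, not the dotted name.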

Agent container Prometheus labels (set by PrometheusLabelBuilder at deploy time)

| Runtime Type | prometheus.path | prometheus.port |
|---|---|---|
| spring-boot | /actuator/prometheus | 8081 |
| quarkus / native | /q/metrics | 9000 |
| plain-java | /metrics | 9464 |

All containers also get prometheus.scrape=true. These labels enable Prometheus docker_sd_configs auto-discovery.
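
The label mapping above can be sketched as a small lookup. This is an illustration only and is not the actual PrometheusLabelBuilder: the class name `PrometheusLabelSketch`, the method `labelsFor`, the runtime-type strings `quarkus` and `native` as separate identifiers, and the choice to reject unknown runtime types are all assumptions.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of the runtime-type -> container label mapping.
// Not the real PrometheusLabelBuilder; names and error handling are assumed.
final class PrometheusLabelSketch {

    static Map<String, String> labelsFor(String runtimeType) {
        Map<String, String> labels = new LinkedHashMap<>();
        // Every container is marked as scrapeable.
        labels.put("prometheus.scrape", "true");
        switch (runtimeType) {
            case "spring-boot":
                labels.put("prometheus.path", "/actuator/prometheus");
                labels.put("prometheus.port", "8081");
                break;
            case "quarkus":
            case "native":
                // Both runtime types share the same path and port.
                labels.put("prometheus.path", "/q/metrics");
                labels.put("prometheus.port", "9000");
                break;
            case "plain-java":
                labels.put("prometheus.path", "/metrics");
                labels.put("prometheus.port", "9464");
                break;
            default:
                // Assumption: unknown runtime types are rejected, not defaulted.
                throw new IllegalArgumentException("Unknown runtime type: " + runtimeType);
        }
        return labels;
    }

    public static void main(String[] args) {
        System.out.println(labelsFor("spring-boot"));
        System.out.println(labelsFor("plain-java"));
    }
}
```

A Prometheus docker_sd_configs job would then need relabeling rules that read these container labels to keep only scrape-enabled targets and to set the metrics path and port.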

Agent Metric Names (Micrometer)

Agents send MetricsSnapshot records with Micrometer-convention metric names. The server stores them generically (ClickHouse agent_metrics.metric_name). The UI references specific names in AgentInstance.tsx for JVM charts.

JVM metrics (used by UI)

| Metric name | UI usage |
|---|---|
| process.cpu.usage.value | CPU % stat card + chart |
| jvm.memory.used.value | Heap MB stat card + chart (tags: area=heap) |
| jvm.memory.max.value | Heap max for % calculation (tags: area=heap) |
| jvm.threads.live.value | Thread count chart |
| jvm.gc.pause.total_time | GC time chart |
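
The heap figures in the table combine two metrics. A minimal sketch of that arithmetic follows, written in Java for consistency with the server code even though the UI itself lives in AgentInstance.tsx. It assumes both values are in bytes (Micrometer's base unit for jvm.memory.*), that they are already aggregated over the heap pools, and that "MB" means 1024 × 1024 bytes. `HeapUsageSketch` is a hypothetical name.

```java
import java.util.OptionalDouble;

// Hypothetical sketch of the heap stat-card arithmetic; not taken from the UI code.
final class HeapUsageSketch {

    // Percentage of heap in use, or empty when no usable maximum is reported
    // (Micrometer reports -1 for jvm.memory.max when the limit is undefined).
    static OptionalDouble heapPercent(double usedBytes, double maxBytes) {
        if (maxBytes <= 0) {
            return OptionalDouble.empty();
        }
        return OptionalDouble.of(100.0 * usedBytes / maxBytes);
    }

    // Assumption: "MB" is interpreted as 1024 * 1024 bytes.
    static double toMegabytes(double bytes) {
        return bytes / (1024.0 * 1024.0);
    }

    public static void main(String[] args) {
        System.out.println(heapPercent(256.0, 1024.0));
        System.out.println(toMegabytes(1_048_576.0));
    }
}
```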

Camel route metrics (stored, queried by dashboard)

| Metric name | Type | Tags |
|---|---|---|
| camel.exchanges.succeeded.count | counter | routeId, camelContext |
| camel.exchanges.failed.count | counter | routeId, camelContext |
| camel.exchanges.total.count | counter | routeId, camelContext |
| camel.exchanges.failures.handled.count | counter | routeId, camelContext |
| camel.route.policy.count | count | routeId, camelContext |
| camel.route.policy.total_time | total | routeId, camelContext |
| camel.route.policy.max | gauge | routeId, camelContext |
| camel.routes.running.value | gauge | |

Mean processing time = camel.route.policy.total_time / camel.route.policy.count. Min processing time is not available (Micrometer does not track minimums).
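
The mean formula is undefined for a route that has not processed any exchanges yet, so a caller has to guard against a zero count. A minimal sketch, assuming both values come from the same snapshot for the same routeId and camelContext; `RouteTimingSketch` is a hypothetical name, and the result is in whatever time unit total_time is stored in.

```java
import java.util.OptionalDouble;

// Hypothetical sketch of the mean processing time calculation.
final class RouteTimingSketch {

    // totalTime: value of camel.route.policy.total_time
    // count:     value of camel.route.policy.count
    // Returns empty when the route has no recorded exchanges.
    static OptionalDouble meanProcessingTime(double totalTime, double count) {
        if (count <= 0) {
            return OptionalDouble.empty();
        }
        return OptionalDouble.of(totalTime / count);
    }

    public static void main(String[] args) {
        System.out.println(meanProcessingTime(1500.0, 30.0));
        System.out.println(meanProcessingTime(0.0, 0.0));
    }
}
```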

Cameleer agent metrics

| Metric name | Type | Tags |
|---|---|---|
| cameleer.chunks.exported.count | counter | instanceId |
| cameleer.chunks.dropped.count | counter | instanceId, reason |
| cameleer.sse.reconnects.count | counter | instanceId |
| cameleer.taps.evaluated.count | counter | instanceId |
| cameleer.metrics.exported.count | counter | instanceId |