Commit Graph

1101 Commits

Author SHA1 Message Date
hsiegeln
2fad8811c6 feat: merge global sensitive keys into app config GET and SSE push
- GET /config/{app} now returns AppConfigResponse with globalSensitiveKeys and mergedSensitiveKeys alongside the config
- PUT /config/{app} merges global + per-app sensitive keys before pushing CONFIG_UPDATE to agents via SSE
- extractSensitiveKeys() uses JsonNode reflection to avoid compile-time dependency on cameleer3-common getSensitiveKeys()
- SensitiveKeysRepository injected as new constructor parameter

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-14 18:19:59 +02:00
hsiegeln
28e38e4dee fix: add audit logging to GET /admin/sensitive-keys
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 18:17:42 +02:00
hsiegeln
c3892151a5 feat: add SensitiveKeysAdminController with fan-out support
GET/PUT /api/v1/admin/sensitive-keys (ADMIN only). PUT accepts optional
pushToAgents param — when true, fans out merged global+per-app sensitive
keys to all live agents via CONFIG_UPDATE SSE commands with 10-second
shared deadline. Per-app keys extracted via JsonNode to avoid depending
on ApplicationConfig.getSensitiveKeys() not yet in the published
cameleer3-common jar. Includes audit logging on every PUT.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-14 18:16:27 +02:00
hsiegeln
84641fe81a feat: add PostgresSensitiveKeysRepository 2026-04-14 18:08:45 +02:00
hsiegeln
d72a6511da feat: add SensitiveKeysMerger with case-insensitive union dedup
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-14 18:07:53 +02:00
hsiegeln
86b6c85aa7 feat: add SensitiveKeysConfig record and SensitiveKeysRepository interface 2026-04-14 18:06:12 +02:00
hsiegeln
dcd0b4ebcd fix: use managed assignments for OIDC fallback role paths
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m31s
CI / docker (push) Successful in 1m7s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 38s
The roles-claim and default-roles fallback paths in applyClaimMappings
were using assignRoleToUser (origin='direct'), causing OIDC-derived
roles to accumulate across logins and never be cleared. Changed both
to assignManagedRole (origin='managed') so all OIDC-assigned roles
are cleared and re-evaluated on every login, same as claim mapping
rules. Only roles assigned directly via the admin UI are preserved.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 17:19:20 +02:00
hsiegeln
58e802e2d4 feat: close modal on successful apply, update design spec
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m24s
CI / docker (push) Successful in 1m10s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 39s
Modal auto-closes after Apply succeeds. Design spec updated to reflect
implemented behavior: local-edit-then-apply pattern, target select
dropdowns, amber pill for add-to-group, close-on-success.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 17:12:39 +02:00
hsiegeln
9959e30e1e fix: use --amber DS variable for add-to-group pill color
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 17:09:31 +02:00
hsiegeln
5edefb2180 refactor: switch claim mapping editor to local-edit-then-apply pattern
All edits (add, edit, delete, reorder) now modify local state only.
Cancel discards changes, Apply diffs local vs server and issues the
necessary create/update/delete API calls. Target selects now include
a placeholder option. Footer shows Cancel and Apply buttons.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 17:07:36 +02:00
hsiegeln
0e87161426 feat: use select dropdowns for target role/group in claim mapping editor
Populate target field from existing roles (assign role) or groups
(add to group) instead of free-text input, preventing typos.
Switching action resets the target selection.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 17:02:09 +02:00
hsiegeln
c02fd77c30 fix: use correct DS CSS variables for modal background
Replace non-existent --surface-1/--surface-2 with --bg-raised (modal)
and --bg-hover (subtle backgrounds) from the design system.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 16:59:50 +02:00
hsiegeln
a3ec0aaef3 fix: address code review findings for claim mapping rules editor
- Bump all font sizes from 11px/10px to 12px (project minimum)
- Fix handleMove race condition: use mutateAsync + Promise.all
- Clear stale test results after rule create/edit/delete/reorder
- Replace inline styles with CSS module classes in OidcConfigPage
- Remove dead .editRow CSS class
- Replace inline chevron with Lucide icon

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 16:58:06 +02:00
hsiegeln
3985bb8a43 feat: wire claim mapping rules modal into OIDC config page
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-14 16:51:28 +02:00
hsiegeln
e8a697d185 feat: add claim mapping rules editor modal component
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 16:50:00 +02:00
hsiegeln
344700e29e feat: add React Query hooks for claim mapping rules API
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-14 16:46:40 +02:00
hsiegeln
f110169d54 feat: add POST /test endpoint for claim mapping rule evaluation
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 16:42:54 +02:00
hsiegeln
90ae1d6a14 fix: include properties in hasTrace for ProcessorExecution path
Some checks failed
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m39s
CI / docker (push) Failing after 47s
CI / deploy (push) Has been skipped
CI / deploy-feature (push) Has been skipped
Now that cameleer3-common has getInputProperties/getOutputProperties on
ProcessorExecution, add the check to the processors_json deserialization
path as well.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 14:34:29 +02:00
hsiegeln
05d91c16e7 fix: include properties in hasTrace check for ProcessorRecord paths
Some checks failed
CI / build (push) Successful in 1m57s
CI / cleanup-branch (push) Has been skipped
CI / deploy (push) Has been cancelled
CI / deploy-feature (push) Has been cancelled
CI / docker (push) Has been cancelled
The hasTrace flag on ProcessorNode now also checks inputProperties and
outputProperties on the flat-record code paths (buildTreeBySeq and
buildTreeByProcessorId). The ProcessorExecution path (processors_json)
will be updated once cameleer3-common publishes the new snapshot.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 14:32:18 +02:00
hsiegeln
0827fd21e3 feat: persist and display exchange properties from agent
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m59s
CI / docker (push) Successful in 2m13s
CI / deploy (push) Successful in 58s
CI / deploy-feature (push) Has been skipped
Add support for exchange properties sent by the agent alongside headers.
Properties flow through the same pipeline as headers: ClickHouse columns
(input_properties, output_properties) on both executions and
processor_executions tables, MergedExecution record, ChunkAccumulator
extraction, DetailService snapshot, and REST API response.

UI adds a Properties tab next to Headers in the process diagram detail
panel, with the same input/output split table layout.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 14:23:53 +02:00
hsiegeln
199d0259cd feat: add "+ App" shortcut button to sidebar Applications header
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m20s
CI / docker (push) Successful in 1m12s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 37s
Adds a subtle "+ App" button in the sidebar section header for quick
app creation without navigating to the Deployments tab first. Only
visible to OPERATOR and ADMIN roles.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 09:10:41 +02:00
hsiegeln
ac680b7f3f refactor: prefix all third-party service names with cameleer-
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 2m7s
CI / docker (push) Successful in 1m33s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 1m51s
SonarQube / sonarqube (push) Successful in 3m28s
Rename all Docker/K8s service names, DNS hostnames, secrets, volumes,
and manifest files to use the cameleer- prefix, making it clear which
software package each container belongs to.

Services renamed:
- postgres → cameleer-postgres
- clickhouse → cameleer-clickhouse
- logto → cameleer-logto
- logto-postgresql → cameleer-logto-postgresql
- traefik (service) → cameleer-traefik
- postgres-external → cameleer-postgres-external

Secrets renamed:
- postgres-credentials → cameleer-postgres-credentials
- clickhouse-credentials → cameleer-clickhouse-credentials
- logto-credentials → cameleer-logto-credentials

Volumes renamed:
- pgdata → cameleer-pgdata
- chdata → cameleer-chdata
- certs → cameleer-certs
- bootstrapdata → cameleer-bootstrapdata

K8s manifests renamed:
- deploy/postgres.yaml → deploy/cameleer-postgres.yaml
- deploy/clickhouse.yaml → deploy/cameleer-clickhouse.yaml
- deploy/logto.yaml → deploy/cameleer-logto.yaml

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 22:51:08 +02:00
hsiegeln
fe283674fb fix: use relative asset paths with always-injected <base> tag
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m22s
CI / docker (push) Successful in 1m11s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 43s
Switch Vite base back to './' (relative paths) and always inject
<base href="${BASE_PATH}"> in the entrypoint, even when BASE_PATH=/.

This fixes asset loading for both deployment modes:
- Single-instance: <base href="/"> resolves ./assets/x.js to /assets/x.js
- SaaS tenant: <base href="/t/slug/"> resolves to /t/slug/assets/x.js

Previously base:'/' produced absolute /assets/ paths that the <base>
tag couldn't redirect, breaking SaaS tenants. And base:'./' without
<base> broke deep URLs in single-instance mode. Always injecting the
tag makes relative paths work universally.

The patched server-ui-entrypoint.sh in cameleer-saas (which rewrote
absolute href/src attributes via sed) is no longer needed and can be
removed.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 21:30:00 +02:00
hsiegeln
67e2c1a531 fix: revert relative base path and fix processor table overflow
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m22s
CI / docker (push) Successful in 1m10s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 42s
Revert base: './' back to '/' — relative asset paths break on deep
URLs like /dashboard/app/route where the browser resolves assets to
/dashboard/app/assets/ instead of /assets/.

Also fix processor metrics table clipping: remove flex:1/min-height:0
from .processorSection so the table takes its natural content height
and the page scrolls to show all rows (was clipping at ~12 of 18).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 21:16:59 +02:00
hsiegeln
025e1cfc34 docs: update CLAUDE.md GitNexus stats
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 2m3s
CI / docker (push) Successful in 32s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 40s
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 21:04:59 +02:00
hsiegeln
d942425cb1 fix: use relative base path for Vite assets
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m16s
CI / docker (push) Successful in 1m8s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 37s
When the server-ui is deployed under a subpath (/t/{slug}/), absolute
asset paths (/assets/...) resolve to the domain root instead of the
subpath, causing 404s. Using './' makes asset URLs relative to the
HTML page, so they resolve correctly regardless of mount path.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 20:52:49 +02:00
hsiegeln
882198d59a fix: use lagInFrame instead of lag for ClickHouse compatibility
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m29s
CI / docker (push) Successful in 1m10s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 41s
ClickHouse does not have lag() as a window function. Use lagInFrame()
with explicit ROWS BETWEEN frame instead.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 12:06:25 +02:00
hsiegeln
5edb833d21 chore: remove stats table migration logic from ClickHouseSchemaInitializer
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m48s
CI / docker (push) Successful in 1m9s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 1m44s
Not needed yet -- all deployments are under our control and can be
reset manually if the old schema is encountered.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 11:51:34 +02:00
hsiegeln
3f2392b8f7 refactor: consolidate ClickHouse init.sql as clean idempotent schema
Some checks failed
CI / build (push) Has been cancelled
CI / docker (push) Has been cancelled
CI / deploy (push) Has been cancelled
CI / deploy-feature (push) Has been cancelled
CI / cleanup-branch (push) Has been cancelled
Rewrite init.sql as a pure CREATE IF NOT EXISTS file with no DROP or
INSERT statements. Safe for repeated runs on every startup without
corrupting aggregated stats data.

Old deployments with count()-based stats tables are migrated
automatically: ClickHouseSchemaInitializer checks system.columns for
the old AggregateFunction(count) type and drops those tables before
init.sql recreates them with the correct uniq() schema. This runs
once per table and is a no-op on fresh installs or already-migrated
deployments.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 11:49:53 +02:00
hsiegeln
6b558b649d fix: use absolute asset paths and prevent route-level runtime navigation
Some checks failed
CI / build (push) Has been cancelled
CI / docker (push) Has been cancelled
CI / deploy (push) Has been cancelled
CI / deploy-feature (push) Has been cancelled
CI / cleanup-branch (push) Has been cancelled
Change vite base from './' to '/' so asset paths are absolute. With
relative paths, direct navigation to multi-segment URLs like
/runtime/app/instance resolved assets to /runtime/assets/ which 404'd.

Also fix sidebar navigation: clicking a route while on the runtime tab
no longer navigates to /runtime/{appId}/{routeId} (which the runtime
page interprets as an instanceId). It stays at /runtime/{appId}.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 11:24:15 +02:00
hsiegeln
e57343e3df feat: add delta mode for counter metrics using ClickHouse lag()
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m17s
CI / docker (push) Successful in 1m12s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 42s
Counter metrics like chunks.exported.count are monotonically increasing.
Add mode=delta query parameter to the agent metrics API that computes
per-bucket deltas server-side using ClickHouse lag() window function:
max(value) per bucket, then greatest(0, current - previous) to get the
increase per period with counter-reset handling.

The chunks exported/dropped charts now show throughput per bucket
instead of the ever-increasing cumulative total.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 10:56:06 +02:00
hsiegeln
ae908fb382 fix: deduplicate all stats MVs and preserve loop iterations
All checks were successful
CI / build (push) Successful in 2m25s
CI / cleanup-branch (push) Has been skipped
CI / docker (push) Successful in 1m20s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 1m3s
SonarQube / sonarqube (push) Successful in 3m49s
Extend uniq-based dedup from processor tables to all stats tables
(stats_1m_all, stats_1m_app, stats_1m_route). Execution-level tables
use uniq(execution_id). Processor-level tables now use
uniq(concat(execution_id, toString(seq))) so loop iterations (same
exchange, different seq) are counted while chunk retry duplicates
(same exchange+seq) are collapsed.

All stats tables are dropped, recreated, and backfilled from raw
data on startup. All Java queries updated: countMerge -> uniqMerge,
countIfMerge -> uniqIfMerge.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 23:48:01 +02:00
hsiegeln
1872d46466 fix: remove semicolons from SQL comments that broke schema initializer
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m30s
CI / docker (push) Successful in 1m11s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 50s
The ClickHouseSchemaInitializer splits on semicolons before filtering
comments, so semicolons inside comment text created invalid statements.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 23:17:35 +02:00
hsiegeln
e2f784bf82 fix: deduplicate processor stats using uniq(execution_id)
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m17s
CI / docker (push) Successful in 1m10s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 38s
Processor execution counts were inflated by duplicate inserts into the
plain MergeTree processor_executions table (chunk retries, reconnects).
Replace count()/countIf() with uniq(execution_id)/uniqIf() in both
stats_1m_processor and stats_1m_processor_detail MVs so each exchange
is counted once per processor regardless of duplicates.

Tables are dropped and rebuilt from raw data on startup. MV created
after backfill to avoid double-counting.

Also adds stats_1m_processor_detail to the catalog purge list (was
missing).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 23:12:00 +02:00
hsiegeln
27f2503640 fix: clean up runtime UI and harden session expiry handling
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m27s
CI / docker (push) Successful in 1m13s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 42s
Remove redundant "X/X LIVE" badge from runtime page, breadcrumb trail
and routes section from agent detail page (pills moved into Process
Information card). Fix session expiry: guard against concurrent 401
refresh races and skip re-entrant triggers on auth endpoints.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 22:33:44 +02:00
hsiegeln
ffce3b714f chore: update design system to 0.1.48 (x-axis tick label overlap fix)
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 2m24s
CI / docker (push) Successful in 2m12s
CI / deploy (push) Successful in 54s
CI / deploy-feature (push) Has been skipped
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 22:12:12 +02:00
hsiegeln
d34abeb5cb fix: diagram zoom not working on initial render
All checks were successful
CI / build (push) Successful in 2m10s
CI / cleanup-branch (push) Has been skipped
CI / docker (push) Successful in 1m32s
CI / deploy (push) Successful in 52s
CI / deploy-feature (push) Has been skipped
The wheel event listener was attached in a useEffect with empty deps,
but the SVG element doesn't exist during the loading state. Switch
svgRef from a plain ref to a callback ref that triggers re-attachment
when the SVG element becomes available after data loads.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 22:05:35 +02:00
hsiegeln
aaf9a00d67 fix: replace native SVG tooltip with styled heatmap tooltip overlay
Some checks failed
CI / build (push) Successful in 2m12s
CI / cleanup-branch (push) Has been skipped
CI / deploy (push) Has been cancelled
CI / deploy-feature (push) Has been cancelled
CI / docker (push) Has been cancelled
Renders an HTML tooltip below hovered diagram nodes with processor
metrics (avg, p99, % time, invocations, error rate). Styled inline
with the existing NodeToolbar pattern — positioned via screen-space
coordinates, uses DS tokens for background/border/shadow/typography.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 22:03:05 +02:00
hsiegeln
00c9a0006e feat: rework runtime charts and fix time range propagation
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m28s
CI / docker (push) Successful in 1m10s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 41s
Runtime page (AgentInstance):
- Rearrange charts: CPU, Memory, GC (top); Threads, Chunks Exported,
  Chunks Dropped (bottom). Removes throughput/error charts (belong on
  Dashboard, not Runtime).
- Pass global time range (from/to) to useAgentMetrics — charts now
  respect the time filter instead of always showing last 60 minutes.
- Bottom row (logs + timeline) fills remaining vertical space.

Dashboard L3:
- Processor metrics section fills remaining vertical space.
- Chart x-axis uses timestamps instead of bucket indices.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 21:59:38 +02:00
hsiegeln
98ce7c2204 feat: combine process diagram and processor table into toggled card
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m25s
CI / docker (push) Successful in 1m9s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 40s
Dashboard L3 now shows a single Processor Metrics card with
Diagram/Table toggle buttons. The diagram shows native tooltips on
hover with full processor metrics (avg, p99, invocations, error rate,
% time).

Also fixes:
- Chart x-axis uses actual timestamps instead of bucket indices
- formatDurationShort uses locale formatting with max 3 decimals

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 21:40:43 +02:00
hsiegeln
66248f6b1c fix: accept logs from unregistered agents using JWT claims
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m24s
CI / docker (push) Successful in 1m7s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 38s
After server restart, agents send logs before re-registering. Instead
of dropping these logs, fall back to application and environment from
the JWT token claims. Only drops logs when neither registry nor JWT
provide an applicationId.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 21:29:05 +02:00
hsiegeln
ce8a2a1525 fix: use raw timestamp string for throughput/error chart data
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m22s
CI / docker (push) Successful in 1m10s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 56s
Avoids Date round-trip that crashes with toISOString() on invalid
timestamps from the timeseries API.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 21:25:00 +02:00
hsiegeln
65ed94f0e0 fix: commit DS v0.1.47 dependency update (missed in migration commit)
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m17s
CI / docker (push) Successful in 1m34s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 36s
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 21:19:56 +02:00
hsiegeln
b1e1f789c5 ci: retrigger build (DS v0.1.47 publish race condition)
Some checks failed
CI / cleanup-branch (push) Has been skipped
CI / build (push) Failing after 37s
CI / docker (push) Has been skipped
CI / deploy (push) Has been skipped
CI / deploy-feature (push) Has been skipped
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 21:13:09 +02:00
hsiegeln
0dae1f1cc7 feat: migrate agent charts to ThemedChart + Recharts
Some checks failed
CI / cleanup-branch (push) Has been skipped
CI / docker (push) Has been cancelled
CI / deploy (push) Has been cancelled
CI / deploy-feature (push) Has been cancelled
CI / build (push) Has been cancelled
Replace custom LineChart/AreaChart/BarChart usage with ThemedChart
wrapper. Data format changed from ChartSeries[] to Recharts-native
flat objects. Uses DS v0.1.47.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-12 19:44:55 +02:00
hsiegeln
a0af53f8f5 chore: update design system to 0.1.46 (responsive charts, timestamp tooltips)
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m17s
CI / docker (push) Successful in 1m31s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 42s
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 18:52:53 +02:00
hsiegeln
aa91b867c5 fix: use Date x-values in agent charts for proper time axes
Some checks failed
CI / cleanup-branch (push) Has been skipped
CI / docker (push) Has been cancelled
CI / deploy (push) Has been cancelled
CI / deploy-feature (push) Has been cancelled
CI / build (push) Has been cancelled
All chart series now use Date objects from the API response instead
of integer indices. This gives proper date/time on x-axes and in
tooltips (leveraging DS v0.1.46 responsive charts + timestamp
tooltips). GC chart switched from BarChart to AreaChart for
consistency with Date x-values.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 18:51:18 +02:00
hsiegeln
0b19fe8319 fix: update UI metric names from JMX to Micrometer convention
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 2m34s
CI / docker (push) Successful in 2m3s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 53s
Agent team migrated from JMX to Micrometer metrics. Update the 5
hardcoded metric names in AgentInstance.tsx JVM charts:
- jvm.cpu.process → process.cpu.usage.value
- jvm.memory.heap.used → jvm.memory.used.value
- jvm.memory.heap.max → jvm.memory.max.value
- jvm.threads.count → jvm.threads.live.value
- jvm.gc.time → jvm.gc.pause.total_time

Server backend is unaffected (generic MetricsSnapshot storage).
CLAUDE.md updated with full agent metric name reference.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 18:35:29 +02:00
hsiegeln
dc29afd4c8 docs: add Prometheus metrics reference to CLAUDE.md
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m15s
CI / docker (push) Successful in 3m42s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 43s
Lists all business metrics (gauges, counters, timers) with their
tags and source classes, plus agent container label mapping table.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 18:29:01 +02:00
hsiegeln
6bf7175a6c feat: add Micrometer Prometheus metrics to server
Some checks failed
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 2m36s
CI / deploy (push) Has been cancelled
CI / docker (push) Has been cancelled
CI / deploy-feature (push) Has been cancelled
Adds micrometer-registry-prometheus and exposes /api/v1/prometheus
endpoint (unauthenticated for scraping). ServerMetrics component
provides business metrics beyond default JVM/HTTP:

Gauges: agents by state, SSE connections, buffer depths (execution,
processor, log, metrics), accumulator pending exchanges.

Counters: ingestion drops (buffer_full, no_agent, no_identity),
agent transitions (went_stale, went_dead, recovered), deployment
outcomes (running, failed, degraded), auth failures (invalid_token,
revoked, oidc_rejected).

Timers: ClickHouse flush duration by type, deployment duration.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 18:23:27 +02:00