Commit Graph

774 Commits

Author SHA1 Message Date
hsiegeln
e0aac4bf0a perf: enable ClickHouse async_insert to batch small inserts server-side
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m6s
CI / docker (push) Successful in 41s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 43s
Diagnostics showed 3,200 tiny inserts per 5 minutes (processor_executions:
2,376 at 14 rows avg, logs: 803 at 5 rows avg), each creating a new part
and triggering MV aggregations + background merges. This was the root cause
of ~400m CPU usage at 3 tx/s.

async_insert=1 with 5s busy timeout lets ClickHouse buffer incoming inserts
and consolidate them into fewer, larger parts before writing to disk.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-03 22:33:48 +02:00
hsiegeln
ac94a67a49 fix: reduce ClickHouse CPU by increasing flush interval, rename LIVE→AUTO labels
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m7s
CI / docker (push) Successful in 1m24s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 41s
- Increase ingestion flush interval from 500ms to 5000ms to reduce MV merge storms
- Reduce ClickHouse background_schedule_pool_size from 8 to 4
- Rename LIVE/PAUSED badge labels to AUTO/MANUAL across all pages
- Update design system to v0.1.29

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-03 22:05:29 +02:00
hsiegeln
e1cb9d7872 fix: extract snapshot data from chunks, reduce ClickHouse log noise
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m11s
CI / docker (push) Successful in 41s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 37s
- ChunkAccumulator now extracts inputBody/outputBody/inputHeaders/outputHeaders
  from ExecutionChunk.inputSnapshot/outputSnapshot instead of storing empty strings
- Set ClickHouse server log level to warning (was trace by default)
- Update CLAUDE.md to document Ed25519 key derivation

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-03 17:58:54 +02:00
hsiegeln
a9ec424d52 fix: derive Ed25519 signing key from JWT secret, no DB storage
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m11s
CI / docker (push) Successful in 42s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 37s
Replace DB-persisted keypair with deterministic derivation from
CAMELEER_JWT_SECRET via HMAC-SHA256 seed + seeded SHA1PRNG KeyPairGenerator.
Same secret = same key pair across restarts, no private key in the database.

Closes #121

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-03 17:18:43 +02:00
hsiegeln
81f13396a0 fix: persist Ed25519 signing key to survive server restarts
All checks were successful
CI / build (push) Successful in 2m8s
CI / cleanup-branch (push) Has been skipped
CI / docker (push) Successful in 50s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 54s
The keypair was generated ephemerally on each startup, causing agents
to reject all commands after a server restart (signature mismatch).
Now persisted to PostgreSQL server_config table and restored on startup.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-03 17:13:40 +02:00
hsiegeln
670e458376 fix: update ITs to use consolidated init.sql, remove dead code
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m23s
CI / docker (push) Successful in 1m29s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 50s
- All 7 ClickHouse integration tests now load init.sql via shared
  ClickHouseTestHelper instead of deleted V1-V11 migration files
- Remove unused useScope exports (setApp, setRoute, setExchange, clearScope)
- Remove unused CSS classes (monoCell, punchcardStack)
- Update ui/README.md DS version to v0.1.28

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-03 17:03:54 +02:00
hsiegeln
d4327af6a4 refactor: consolidate ClickHouse schema into single init.sql, cache diagrams
All checks were successful
CI / build (push) Successful in 2m2s
CI / cleanup-branch (push) Has been skipped
CI / docker (push) Successful in 51s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 37s
- Merge all V1-V11 migration scripts into one idempotent init.sql
- Simplify ClickHouseSchemaInitializer to load single file
- Replace route_diagrams projection with in-memory caches:
  hashCache (routeId+instanceId → contentHash) warm-loaded on startup,
  graphCache (contentHash → RouteGraph) lazy-populated on access
- Eliminates 9M+ row scans on diagram lookups

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-03 15:24:53 +02:00
hsiegeln
bb3e1e2bc3 fix: set deduplicate_merge_projection_mode for ReplacingMergeTree projection
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m8s
CI / docker (push) Successful in 42s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 37s
ClickHouse 24.12 requires this setting before adding projections to
ReplacingMergeTree tables. Using 'drop' mode which discards the projection
during deduplication merges and rebuilds it afterward.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-03 15:14:56 +02:00
hsiegeln
984bb2d40f fix: sort ClickHouse migration scripts by numeric version prefix
All checks were successful
CI / build (push) Successful in 2m32s
CI / cleanup-branch (push) Has been skipped
CI / docker (push) Successful in 55s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 52s
Alphabetical sort put V10/V11 before V2-V9 ("V11" < "V1_" in ASCII),
causing the route_diagrams projection to run before the table existed.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-03 15:06:56 +02:00
hsiegeln
6f00ff2e28 fix: reduce ClickHouse log noise, admin query spam, and diagram scan perf
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m7s
CI / docker (push) Successful in 1m25s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 40s
- Set com.clickhouse log level to INFO and org.apache.hc.client5 to WARN
- Admin hooks (useUsers/useGroups/useRoles) now only fetch on admin pages,
  eliminating AUDIT view_users entries on every UI click
- Add ClickHouse projection on route_diagrams for (tenant_id, route_id,
  instance_id, created_at) to avoid full table scans on diagram lookups
- Bump @cameleer/design-system to v0.1.28 (PAUSED mode time range fix,
  refreshTimeRange API)
- Call refreshTimeRange before invalidateQueries in PAUSED mode manual
  refresh so sidebar clicks use current time window

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-03 14:48:30 +02:00
hsiegeln
2708bcec17 fix: first exchange click doesn't highlight selected row
All checks were successful
CI / build (push) Successful in 1m47s
CI / cleanup-branch (push) Has been skipped
CI / docker (push) Successful in 1m45s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 53s
On first click, Dashboard was in non-split mode. The click set
selectedId locally then triggered split view, which remounted
Dashboard — losing the selectedId state.

Added activeExchangeId prop passed from ExchangesPage so the
selection survives the remount. Also syncs via useEffect when
parent changes selection (e.g. correlated exchange navigation).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-03 11:28:26 +02:00
hsiegeln
901dfd1eb8 fix: PAUSED mode disabled queries entirely instead of just polling
Some checks failed
CI / build (push) Successful in 1m49s
CI / cleanup-branch (push) Has been skipped
CI / deploy (push) Has been cancelled
CI / deploy-feature (push) Has been cancelled
CI / docker (push) Has been cancelled
useLiveQuery returned enabled:false when paused, which prevented
queries from running at all. Changed to enabled:true always —
PAUSED now means "fetch once, no polling" instead of "don't fetch".

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-03 11:25:04 +02:00
hsiegeln
726e77bb91 docs: update all documentation for session changes
Some checks failed
CI / build (push) Successful in 2m2s
CI / cleanup-branch (push) Has been skipped
CI / docker (push) Has been cancelled
CI / deploy (push) Has been cancelled
CI / deploy-feature (push) Has been cancelled
CLAUDE.md:
- Agent registry auto-heal note (in-memory, JWT fallback)
- Usage analytics (ClickHouse usage_events table)

HOWTO.md:
- Architecture diagram: added deploy-demo (NodePort 30092) and cameleer-demo namespace
- Access URLs: added Deploy Demo
- Agent registry: server restart resilience documentation
- Route control: CommandGroupResponse note

ui/README.md:
- Fixed outdated generate-api command
- Added DS version (v0.1.26)
- Fixed VITE_API_TARGET (30081 not 30090)
- Added key features section (cmd-k, LIVE mode, route control, event icons)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-03 11:22:44 +02:00
hsiegeln
d30c267292 fix: route catalog missing routes after server restart
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m20s
CI / docker (push) Successful in 52s
CI / deploy (push) Successful in 54s
CI / deploy-feature (push) Has been skipped
After server restart, auto-healed agents register with empty
routeIds. The catalog only looked at agent registry for routes,
so routes and counts disappeared.

Now merges route IDs from ClickHouse stats_1m_route into the
catalog. Also includes apps that only exist in ClickHouse data
(no agent currently registered). Routes and exchange counts
survive server restarts.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-03 11:14:27 +02:00
hsiegeln
37c10ae0a6 feat: manual refresh on sidebar navigation when LIVE mode is off
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m9s
CI / docker (push) Successful in 57s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 37s
When autoRefresh is disabled, sidebar clicks now invalidate all
queries (queryClient.invalidateQueries()), triggering a re-fetch.
This gives users "click to refresh" behavior instead of stale data.

When LIVE mode is on, queries already poll at intervals, so no
invalidation is needed.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-03 11:01:29 +02:00
hsiegeln
c16f0e62ed fix: clicking Applications header navigates back to all apps
Some checks failed
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m31s
CI / docker (push) Successful in 1m22s
CI / deploy (push) Failing after 2m26s
CI / deploy-feature (push) Has been skipped
When the Applications section is already expanded, clicking the
header now navigates to /{tab} (all applications) instead of
collapsing. When collapsed, clicking expands as before.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-03 10:49:54 +02:00
hsiegeln
2bc3efad7f fix: agent auth, heartbeat, and SSE all break after server restart
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m9s
CI / docker (push) Successful in 41s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 37s
Three related issues caused by in-memory agent registry being empty
after server restart:

1. JwtAuthenticationFilter rejected valid agent JWTs if agent wasn't
   in registry — now authenticates any valid JWT regardless

2. Heartbeat returned 404 for unknown agents — now auto-registers
   the agent from JWT claims (subject, application)

3. SSE endpoint returned 404 — same auto-registration fix

JWT validation result is stored as a request attribute so downstream
controllers can extract the application claim for auto-registration.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-03 10:41:23 +02:00
hsiegeln
0632f1c6a8 fix: agent token refresh returns 404 after server restart
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m8s
CI / docker (push) Successful in 42s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 1m23s
The refresh endpoint required the agent to exist in the in-memory
registry. After server restart the registry is empty, so all refresh
attempts got 404. The refresh token itself is self-contained with
subject, application, and roles — the registry lookup is optional.

Now uses application from the JWT, falling back to registry only
if the agent happens to be registered.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-03 10:37:57 +02:00
hsiegeln
bdac363e40 fix: active queries list always showed itself
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m8s
CI / docker (push) Successful in 40s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 36s
The system.processes query was returning its own row. Added
filter: query NOT LIKE '%system.processes%'

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-03 10:33:47 +02:00
hsiegeln
d9615204bf fix: admin pages not scrollable (content clipped by overflow:hidden)
Some checks failed
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m8s
CI / docker (push) Successful in 1m0s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Has been cancelled
AdminLayout was a plain div with padding but no scroll. The parent
<main> has overflow:hidden, so admin page content beyond viewport
height was clipped. Added flex:1, overflow:auto, minHeight:0 to
make AdminLayout a proper scroll container.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-03 10:31:04 +02:00
hsiegeln
2896bb90a9 fix: usage events never flushed to ClickHouse
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m11s
CI / docker (push) Successful in 42s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 1m7s
UsageFlushScheduler was a @Component with @ConditionalOnBean, but
ClickHouseUsageTracker is created via @Bean — component scan runs
first, so the condition always evaluated false. Events accumulated
in the WriteBuffer but flush() was never called.

Moved scheduler to @Bean in StorageBeanConfig with the same
@ConditionalOnProperty guard as the tracker.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-03 10:07:13 +02:00
hsiegeln
a036d8a027 docs: spec for cameleer-deploy-demo prototype
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-03 00:13:04 +02:00
hsiegeln
44a37317d1 fix: cmd-k context key for tab reset and Enter-to-navigate on admin pages
Some checks failed
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m10s
CI / docker (push) Successful in 1m27s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 51s
SonarQube / sonarqube (push) Failing after 2m22s
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 23:53:09 +02:00
hsiegeln
146398b183 feat: RBAC page reads cmd-k navigation state for tab switch and highlight
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-02 23:42:18 +02:00
hsiegeln
69ca52b25e feat: handle admin cmd-k selection with tab navigation state 2026-04-02 23:38:06 +02:00
hsiegeln
111bcc302d feat: build admin search data for cmd-k on admin pages
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-02 23:34:52 +02:00
hsiegeln
cf36f81ef1 chore: bump @cameleer/design-system to v0.1.26 2026-04-02 23:33:00 +02:00
hsiegeln
28f38331cc docs: implementation plan for context-aware cmd-k search
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 23:27:07 +02:00
hsiegeln
394fde30c7 docs: spec for context-aware cmd-k search
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 23:21:53 +02:00
hsiegeln
62b5c56c56 feat: event-type icons for agent event feeds
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m7s
CI / docker (push) Successful in 1m0s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 53s
Icons now reflect event type (UserPlus for registration, Skull
for dead, HeartPulse for recovery, Route for state changes, etc.)
while severity still drives the color. Updated in both
AgentInstance and AgentHealth pages.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 23:06:01 +02:00
hsiegeln
9b401558a5 fix: make disabled route control buttons visually distinct
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m6s
CI / docker (push) Successful in 59s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 41s
Disabled buttons now show reduced opacity (0.35) and muted icon
color instead of just changing the cursor.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 22:58:46 +02:00
hsiegeln
38b76513c7 feat: route control buttons reflect current route state
Some checks failed
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m8s
CI / deploy (push) Has been cancelled
CI / deploy-feature (push) Has been cancelled
CI / docker (push) Has been cancelled
Buttons are disabled based on route state: Started disables
Start/Resume, Stopped disables Stop/Suspend/Resume, Suspended
disables Start/Suspend. State looked up from catalog API.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 22:56:49 +02:00
hsiegeln
2265ebf801 chore: bump @cameleer/design-system to v0.1.25
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m9s
CI / docker (push) Successful in 1m23s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 1m12s
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 22:47:53 +02:00
hsiegeln
20af81a5dc feat: show server version in sidebar header
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m19s
CI / docker (push) Successful in 1m30s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 1m39s
Version injected at build time via VITE_APP_VERSION env var.
CI sets it to branch@sha. Falls back to 'dev' in local dev.
Displayed next to "Cameleer" in the sidebar header.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 22:42:06 +02:00
hsiegeln
d819f88ae4 fix: starred routes not showing — starKey prefix mismatch
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m11s
CI / docker (push) Successful in 1m1s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 53s
collectStarredItems used 'app:' prefix for route keys but
buildAppTreeNodes uses 'route:' prefix. Routes were starred
but never matched in the starred section.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 22:36:28 +02:00
hsiegeln
5880abdd93 fix: keep admin section in place, don't move to top
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m13s
CI / docker (push) Successful in 57s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 39s
Admin section stays in its fixed position (after Starred, before
Footer). Entering admin mode collapses Applications and Starred
but does not reorder sections.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 22:32:53 +02:00
hsiegeln
b676450995 fix: simplify sidebar to Applications + Starred + Admin footer
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m7s
CI / docker (push) Successful in 59s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 42s
Remove Agents and Routes sections from sidebar. Layout is now:
Header (camel logo + Cameleer) → Search → Applications section →
Starred section (when items exist) → Footer (Admin + API Docs).

Admin accordion: clicking Admin navigates to /admin/rbac and
expands Admin section at top while collapsing Applications and
Starred. Clicking Applications exits admin mode.

Removed buildAgentTreeNodes and buildRouteTreeNodes from
sidebar-utils (no longer needed).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 22:29:44 +02:00
hsiegeln
e495b80432 fix: increase ClickHouse pool size and reduce flush interval
All checks were successful
CI / build (push) Successful in 1m49s
CI / cleanup-branch (push) Has been skipped
CI / docker (push) Successful in 2m10s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 43s
Pool was hardcoded to 10 connections serving 7 concurrent write
streams + UI reads, causing "too many simultaneous queries" and
WriteBuffer overflow. Pool now defaults to 50 (configurable via
clickhouse.pool-size), flush interval reduced from 1000ms to 500ms.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 22:11:15 +02:00
hsiegeln
45eab761b7 chore: bump @cameleer/design-system to v0.1.24
Some checks failed
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 2m3s
CI / deploy (push) Has been cancelled
CI / deploy-feature (push) Has been cancelled
CI / docker (push) Has been cancelled
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 22:06:13 +02:00
hsiegeln
8d899cc70c refactor: use HeartbeatRequest from cameleer3-common
Some checks failed
CI / docker (push) Has been cancelled
CI / deploy (push) Has been cancelled
CI / deploy-feature (push) Has been cancelled
CI / cleanup-branch (push) Has been cancelled
CI / build (push) Has been cancelled
Replace local HeartbeatRequest DTO with the shared model from
cameleer3-common. Message types exchanged between server and agent
belong in the common module.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 22:05:26 +02:00
hsiegeln
520b80444a feat(#119): accept route states in heartbeat and state-change events
Some checks failed
CI / cleanup-branch (push) Has been skipped
CI / build (push) Failing after 34s
CI / docker (push) Has been skipped
CI / deploy (push) Has been skipped
CI / deploy-feature (push) Has been skipped
Replace ACK-based route state inference with agent-reported state.
Heartbeats now carry optional routeStates map, and ROUTE_STATE_CHANGED
events update the registry immediately.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 21:45:13 +02:00
hsiegeln
17aff5ef9d docs: route state protocol extension spec
Defines two backward-compatible mechanisms for accurate route state
tracking: heartbeat extension (routeStates map in heartbeat body)
and ROUTE_STATE_CHANGED events for real-time updates. Covers
agent-side detection via Camel EventNotifier, server-side handling,
multi-agent conflict resolution, and migration path.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 19:26:38 +02:00
hsiegeln
b714d3363f feat(#119): expose route state in catalog API and sidebar/dashboard
Some checks failed
CI / cleanup-branch (push) Has been skipped
CI / build (push) Failing after 29s
CI / docker (push) Has been skipped
CI / deploy (push) Has been skipped
CI / deploy-feature (push) Has been skipped
Add routeState field to RouteSummary DTO (null for started, 'stopped'
or 'suspended' for non-default states). Sidebar shows stop/pause icons
and state badge for affected routes in both Apps and Routes sections.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 19:15:46 +02:00
hsiegeln
0acceaf1a9 feat(#119): add RouteStateRegistry for tracking route operational state
In-memory registry that infers route state (started/stopped/suspended)
from successful route-control command ACKs. Updates state only when all
agents in a group confirm success.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 19:15:35 +02:00
hsiegeln
ca1d472b78 feat(#117): agent-count toasts and persistent error toast dismiss
Some checks failed
CI / cleanup-branch (push) Has been skipped
CI / build (push) Failing after 30s
CI / docker (push) Has been skipped
CI / deploy (push) Has been skipped
CI / deploy-feature (push) Has been skipped
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 19:08:00 +02:00
hsiegeln
c3b4f70913 feat(#116): update command hooks for synchronous group response
Some checks failed
CI / cleanup-branch (push) Has been skipped
CI / build (push) Failing after 30s
CI / docker (push) Has been skipped
CI / deploy (push) Has been skipped
CI / deploy-feature (push) Has been skipped
Add CommandGroupResponse and ConfigUpdateResponse types. Switch
useSendGroupCommand and useSendRouteCommand from openapi-fetch to authFetch
returning CommandGroupResponse. Update useUpdateApplicationConfig to return
ConfigUpdateResponse and fix all consumer onSuccess callbacks to access
saved.config.version instead of saved.version.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 19:01:06 +02:00
hsiegeln
027e45aadf feat(#116): synchronous group command dispatch with multi-agent response collection
Add addGroupCommandWithReplies() to AgentRegistryService that sends commands
to all LIVE agents in a group and returns CompletableFuture per agent for
collecting replies. Update sendGroupCommand() and pushConfigToAgents() to
wait with a shared 10-second deadline, returning CommandGroupResponse with
per-agent status, timeouts, and overall success. Config update endpoint now
returns ConfigUpdateResponse wrapping both the saved config and push result.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 19:00:56 +02:00
hsiegeln
f39f07e7bf feat(#118): add confirmation dialog for stop and suspend commands
Some checks failed
CI / cleanup-branch (push) Has been skipped
CI / build (push) Failing after 35s
CI / docker (push) Has been skipped
CI / deploy (push) Has been skipped
CI / deploy-feature (push) Has been skipped
Stop and suspend route commands now show a ConfirmDialog requiring
typed confirmation before dispatch. Start and resume execute
immediately without confirmation.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 18:54:23 +02:00
hsiegeln
d21d8b2c48 fix(#112): initialize sidebar accordion state from initial route
Some checks failed
CI / build (push) Failing after 43s
CI / cleanup-branch (push) Has been skipped
CI / docker (push) Has been skipped
CI / deploy (push) Has been skipped
CI / deploy-feature (push) Has been skipped
Direct navigation to /admin/* now correctly opens Admin section
and collapses operational sections on first render. Previously
the accordion effect only triggered on route transitions.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 18:36:43 +02:00
hsiegeln
d5f5601554 fix(#112): add missing Routes section, fix admin double padding
Review feedback: buildRouteTreeNodes was defined but never rendered.
Added Routes section between Agents and Admin. Removed duplicate
padding on admin pages (AdminLayout handles its own padding).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 18:32:26 +02:00