Commit Graph

762 Commits

Author SHA1 Message Date
hsiegeln
726e77bb91 docs: update all documentation for session changes
Some checks failed
CI / build (push) Successful in 2m2s
CI / cleanup-branch (push) Has been skipped
CI / docker (push) Has been cancelled
CI / deploy (push) Has been cancelled
CI / deploy-feature (push) Has been cancelled
CLAUDE.md:
- Agent registry auto-heal note (in-memory, JWT fallback)
- Usage analytics (ClickHouse usage_events table)

HOWTO.md:
- Architecture diagram: added deploy-demo (NodePort 30092) and cameleer-demo namespace
- Access URLs: added Deploy Demo
- Agent registry: server restart resilience documentation
- Route control: CommandGroupResponse note

ui/README.md:
- Fixed outdated generate-api command
- Added DS version (v0.1.26)
- Fixed VITE_API_TARGET (30081 not 30090)
- Added key features section (cmd-k, LIVE mode, route control, event icons)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-03 11:22:44 +02:00
hsiegeln
d30c267292 fix: route catalog missing routes after server restart
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m20s
CI / docker (push) Successful in 52s
CI / deploy (push) Successful in 54s
CI / deploy-feature (push) Has been skipped
After server restart, auto-healed agents register with empty
routeIds. The catalog only looked at agent registry for routes,
so routes and counts disappeared.

Now merges route IDs from ClickHouse stats_1m_route into the
catalog. Also includes apps that only exist in ClickHouse data
(no agent currently registered). Routes and exchange counts
survive server restarts.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-03 11:14:27 +02:00
hsiegeln
37c10ae0a6 feat: manual refresh on sidebar navigation when LIVE mode is off
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m9s
CI / docker (push) Successful in 57s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 37s
When autoRefresh is disabled, sidebar clicks now invalidate all
queries (queryClient.invalidateQueries()), triggering a re-fetch.
This gives users "click to refresh" behavior instead of stale data.

When LIVE mode is on, queries already poll at intervals, so no
invalidation is needed.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-03 11:01:29 +02:00
hsiegeln
c16f0e62ed fix: clicking Applications header navigates back to all apps
Some checks failed
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m31s
CI / docker (push) Successful in 1m22s
CI / deploy (push) Failing after 2m26s
CI / deploy-feature (push) Has been skipped
When the Applications section is already expanded, clicking the
header now navigates to /{tab} (all applications) instead of
collapsing. When collapsed, clicking expands as before.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-03 10:49:54 +02:00
hsiegeln
2bc3efad7f fix: agent auth, heartbeat, and SSE all break after server restart
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m9s
CI / docker (push) Successful in 41s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 37s
Three related issues caused by in-memory agent registry being empty
after server restart:

1. JwtAuthenticationFilter rejected valid agent JWTs if agent wasn't
   in registry — now authenticates any valid JWT regardless

2. Heartbeat returned 404 for unknown agents — now auto-registers
   the agent from JWT claims (subject, application)

3. SSE endpoint returned 404 — same auto-registration fix

JWT validation result is stored as a request attribute so downstream
controllers can extract the application claim for auto-registration.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-03 10:41:23 +02:00
hsiegeln
0632f1c6a8 fix: agent token refresh returns 404 after server restart
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m8s
CI / docker (push) Successful in 42s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 1m23s
The refresh endpoint required the agent to exist in the in-memory
registry. After server restart the registry is empty, so all refresh
attempts got 404. The refresh token itself is self-contained with
subject, application, and roles — the registry lookup is optional.

Now uses application from the JWT, falling back to registry only
if the agent happens to be registered.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-03 10:37:57 +02:00
hsiegeln
bdac363e40 fix: active queries list always showed itself
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m8s
CI / docker (push) Successful in 40s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 36s
The system.processes query was returning its own row. Added
filter: query NOT LIKE '%system.processes%'

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-03 10:33:47 +02:00
hsiegeln
d9615204bf fix: admin pages not scrollable (content clipped by overflow:hidden)
Some checks failed
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m8s
CI / docker (push) Successful in 1m0s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Has been cancelled
AdminLayout was a plain div with padding but no scroll. The parent
<main> has overflow:hidden, so admin page content beyond viewport
height was clipped. Added flex:1, overflow:auto, minHeight:0 to
make AdminLayout a proper scroll container.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-03 10:31:04 +02:00
hsiegeln
2896bb90a9 fix: usage events never flushed to ClickHouse
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m11s
CI / docker (push) Successful in 42s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 1m7s
UsageFlushScheduler was a @Component with @ConditionalOnBean, but
ClickHouseUsageTracker is created via @Bean — component scan runs
first, so the condition always evaluated false. Events accumulated
in the WriteBuffer but flush() was never called.

Moved scheduler to @Bean in StorageBeanConfig with the same
@ConditionalOnProperty guard as the tracker.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-03 10:07:13 +02:00
hsiegeln
a036d8a027 docs: spec for cameleer-deploy-demo prototype
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-03 00:13:04 +02:00
hsiegeln
44a37317d1 fix: cmd-k context key for tab reset and Enter-to-navigate on admin pages
Some checks failed
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m10s
CI / docker (push) Successful in 1m27s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 51s
SonarQube / sonarqube (push) Failing after 2m22s
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 23:53:09 +02:00
hsiegeln
146398b183 feat: RBAC page reads cmd-k navigation state for tab switch and highlight
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-02 23:42:18 +02:00
hsiegeln
69ca52b25e feat: handle admin cmd-k selection with tab navigation state 2026-04-02 23:38:06 +02:00
hsiegeln
111bcc302d feat: build admin search data for cmd-k on admin pages
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-02 23:34:52 +02:00
hsiegeln
cf36f81ef1 chore: bump @cameleer/design-system to v0.1.26 2026-04-02 23:33:00 +02:00
hsiegeln
28f38331cc docs: implementation plan for context-aware cmd-k search
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 23:27:07 +02:00
hsiegeln
394fde30c7 docs: spec for context-aware cmd-k search
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 23:21:53 +02:00
hsiegeln
62b5c56c56 feat: event-type icons for agent event feeds
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m7s
CI / docker (push) Successful in 1m0s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 53s
Icons now reflect event type (UserPlus for registration, Skull
for dead, HeartPulse for recovery, Route for state changes, etc.)
while severity still drives the color. Updated in both
AgentInstance and AgentHealth pages.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 23:06:01 +02:00
hsiegeln
9b401558a5 fix: make disabled route control buttons visually distinct
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m6s
CI / docker (push) Successful in 59s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 41s
Disabled buttons now show reduced opacity (0.35) and muted icon
color instead of just changing the cursor.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 22:58:46 +02:00
hsiegeln
38b76513c7 feat: route control buttons reflect current route state
Some checks failed
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m8s
CI / deploy (push) Has been cancelled
CI / deploy-feature (push) Has been cancelled
CI / docker (push) Has been cancelled
Buttons are disabled based on route state: Started disables
Start/Resume, Stopped disables Stop/Suspend/Resume, Suspended
disables Start/Suspend. State looked up from catalog API.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 22:56:49 +02:00
hsiegeln
2265ebf801 chore: bump @cameleer/design-system to v0.1.25
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m9s
CI / docker (push) Successful in 1m23s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 1m12s
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 22:47:53 +02:00
hsiegeln
20af81a5dc feat: show server version in sidebar header
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m19s
CI / docker (push) Successful in 1m30s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 1m39s
Version injected at build time via VITE_APP_VERSION env var.
CI sets it to branch@sha. Falls back to 'dev' in local dev.
Displayed next to "Cameleer" in the sidebar header.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 22:42:06 +02:00
hsiegeln
d819f88ae4 fix: starred routes not showing — starKey prefix mismatch
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m11s
CI / docker (push) Successful in 1m1s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 53s
collectStarredItems used 'app:' prefix for route keys but
buildAppTreeNodes uses 'route:' prefix. Routes were starred
but never matched in the starred section.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 22:36:28 +02:00
hsiegeln
5880abdd93 fix: keep admin section in place, don't move to top
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m13s
CI / docker (push) Successful in 57s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 39s
Admin section stays in its fixed position (after Starred, before
Footer). Entering admin mode collapses Applications and Starred
but does not reorder sections.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 22:32:53 +02:00
hsiegeln
b676450995 fix: simplify sidebar to Applications + Starred + Admin footer
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m7s
CI / docker (push) Successful in 59s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 42s
Remove Agents and Routes sections from sidebar. Layout is now:
Header (camel logo + Cameleer) → Search → Applications section →
Starred section (when items exist) → Footer (Admin + API Docs).

Admin accordion: clicking Admin navigates to /admin/rbac and
expands Admin section at top while collapsing Applications and
Starred. Clicking Applications exits admin mode.

Removed buildAgentTreeNodes and buildRouteTreeNodes from
sidebar-utils (no longer needed).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 22:29:44 +02:00
hsiegeln
e495b80432 fix: increase ClickHouse pool size and reduce flush interval
All checks were successful
CI / build (push) Successful in 1m49s
CI / cleanup-branch (push) Has been skipped
CI / docker (push) Successful in 2m10s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 43s
Pool was hardcoded to 10 connections serving 7 concurrent write
streams + UI reads, causing "too many simultaneous queries" and
WriteBuffer overflow. Pool now defaults to 50 (configurable via
clickhouse.pool-size), flush interval reduced from 1000ms to 500ms.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 22:11:15 +02:00
hsiegeln
45eab761b7 chore: bump @cameleer/design-system to v0.1.24
Some checks failed
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 2m3s
CI / deploy (push) Has been cancelled
CI / deploy-feature (push) Has been cancelled
CI / docker (push) Has been cancelled
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 22:06:13 +02:00
hsiegeln
8d899cc70c refactor: use HeartbeatRequest from cameleer3-common
Some checks failed
CI / docker (push) Has been cancelled
CI / deploy (push) Has been cancelled
CI / deploy-feature (push) Has been cancelled
CI / cleanup-branch (push) Has been cancelled
CI / build (push) Has been cancelled
Replace local HeartbeatRequest DTO with the shared model from
cameleer3-common. Message types exchanged between server and agent
belong in the common module.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 22:05:26 +02:00
hsiegeln
520b80444a feat(#119): accept route states in heartbeat and state-change events
Some checks failed
CI / cleanup-branch (push) Has been skipped
CI / build (push) Failing after 34s
CI / docker (push) Has been skipped
CI / deploy (push) Has been skipped
CI / deploy-feature (push) Has been skipped
Replace ACK-based route state inference with agent-reported state.
Heartbeats now carry optional routeStates map, and ROUTE_STATE_CHANGED
events update the registry immediately.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 21:45:13 +02:00
hsiegeln
17aff5ef9d docs: route state protocol extension spec
Defines two backward-compatible mechanisms for accurate route state
tracking: heartbeat extension (routeStates map in heartbeat body)
and ROUTE_STATE_CHANGED events for real-time updates. Covers
agent-side detection via Camel EventNotifier, server-side handling,
multi-agent conflict resolution, and migration path.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 19:26:38 +02:00
hsiegeln
b714d3363f feat(#119): expose route state in catalog API and sidebar/dashboard
Some checks failed
CI / cleanup-branch (push) Has been skipped
CI / build (push) Failing after 29s
CI / docker (push) Has been skipped
CI / deploy (push) Has been skipped
CI / deploy-feature (push) Has been skipped
Add routeState field to RouteSummary DTO (null for started, 'stopped'
or 'suspended' for non-default states). Sidebar shows stop/pause icons
and state badge for affected routes in both Apps and Routes sections.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 19:15:46 +02:00
hsiegeln
0acceaf1a9 feat(#119): add RouteStateRegistry for tracking route operational state
In-memory registry that infers route state (started/stopped/suspended)
from successful route-control command ACKs. Updates state only when all
agents in a group confirm success.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 19:15:35 +02:00
hsiegeln
ca1d472b78 feat(#117): agent-count toasts and persistent error toast dismiss
Some checks failed
CI / cleanup-branch (push) Has been skipped
CI / build (push) Failing after 30s
CI / docker (push) Has been skipped
CI / deploy (push) Has been skipped
CI / deploy-feature (push) Has been skipped
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 19:08:00 +02:00
hsiegeln
c3b4f70913 feat(#116): update command hooks for synchronous group response
Some checks failed
CI / cleanup-branch (push) Has been skipped
CI / build (push) Failing after 30s
CI / docker (push) Has been skipped
CI / deploy (push) Has been skipped
CI / deploy-feature (push) Has been skipped
Add CommandGroupResponse and ConfigUpdateResponse types. Switch
useSendGroupCommand and useSendRouteCommand from openapi-fetch to authFetch
returning CommandGroupResponse. Update useUpdateApplicationConfig to return
ConfigUpdateResponse and fix all consumer onSuccess callbacks to access
saved.config.version instead of saved.version.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 19:01:06 +02:00
hsiegeln
027e45aadf feat(#116): synchronous group command dispatch with multi-agent response collection
Add addGroupCommandWithReplies() to AgentRegistryService that sends commands
to all LIVE agents in a group and returns CompletableFuture per agent for
collecting replies. Update sendGroupCommand() and pushConfigToAgents() to
wait with a shared 10-second deadline, returning CommandGroupResponse with
per-agent status, timeouts, and overall success. Config update endpoint now
returns ConfigUpdateResponse wrapping both the saved config and push result.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 19:00:56 +02:00
hsiegeln
f39f07e7bf feat(#118): add confirmation dialog for stop and suspend commands
Some checks failed
CI / cleanup-branch (push) Has been skipped
CI / build (push) Failing after 35s
CI / docker (push) Has been skipped
CI / deploy (push) Has been skipped
CI / deploy-feature (push) Has been skipped
Stop and suspend route commands now show a ConfirmDialog requiring
typed confirmation before dispatch. Start and resume execute
immediately without confirmation.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 18:54:23 +02:00
hsiegeln
d21d8b2c48 fix(#112): initialize sidebar accordion state from initial route
Some checks failed
CI / build (push) Failing after 43s
CI / cleanup-branch (push) Has been skipped
CI / docker (push) Has been skipped
CI / deploy (push) Has been skipped
CI / deploy-feature (push) Has been skipped
Direct navigation to /admin/* now correctly opens Admin section
and collapses operational sections on first render. Previously
the accordion effect only triggered on route transitions.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 18:36:43 +02:00
hsiegeln
d5f5601554 fix(#112): add missing Routes section, fix admin double padding
Review feedback: buildRouteTreeNodes was defined but never rendered.
Added Routes section between Agents and Admin. Removed duplicate
padding on admin pages (AdminLayout handles its own padding).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 18:32:26 +02:00
hsiegeln
00042b1d14 feat(#112): remove admin tabs, sidebar handles navigation
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 18:29:29 +02:00
hsiegeln
fe49eb5aba feat(#112): migrate to composable sidebar with accordion and collapse
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 18:29:25 +02:00
hsiegeln
bc913eef6e feat(#112): extract sidebar tree builders and types from DS
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 18:29:22 +02:00
hsiegeln
d70ad91b33 docs: clarify search ownership and icon-rail click behavior
Search: DS renders dumb input, app owns filterQuery state and
passes it to each SidebarTree. Icon-rail click: fires both
onCollapseToggle and onToggle simultaneously, no navigation.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 17:41:31 +02:00
hsiegeln
ba361af2d7 docs: composable sidebar design spec for #112
Replaces the previous "hide sidebar on admin" approach with a
composable compound component design. DS provides shell + building
blocks (Sidebar, Section, Footer, SidebarTree); consuming app
controls all content, section ordering, accordion behavior, and
icon-rail collapse.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 17:38:01 +02:00
hsiegeln
78777d2ba6 Revert "feat(#112): hide sidebar, topbar, cmd palette on admin pages"
This reverts commit d95e518622.
2026-04-02 17:22:06 +02:00
hsiegeln
3f8a9715a4 Revert "feat(#112): add admin header bar with back button and logout"
This reverts commit a484364029.
2026-04-02 17:22:06 +02:00
hsiegeln
f00a3e8b97 Revert "fix(#112): remove dead admin breadcrumb code, add logout aria-label"
This reverts commit d5028193c0.
2026-04-02 17:22:06 +02:00
hsiegeln
d5028193c0 fix(#112): remove dead admin breadcrumb code, add logout aria-label
Review feedback: breadcrumb memo had an unused isAdminPage branch
(TopBar no longer renders on admin pages). Added aria-label to
icon-only logout button for screen readers.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 17:16:01 +02:00
hsiegeln
a484364029 feat(#112): add admin header bar with back button and logout
AdminLayout gains a self-contained header (Back / Admin / user+logout)
with CSS module styles, replacing the inline padding wrapper. Admin
pages now render fully without the main app chrome.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-02 17:12:50 +02:00
hsiegeln
d95e518622 feat(#112): hide sidebar, topbar, cmd palette on admin pages
Pass null as sidebar prop, guard TopBar and CommandPalette with
!isAdminPage, and remove conditional admin padding from main element.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-02 17:12:44 +02:00
hsiegeln
56297701e6 fix: use ILIKE for case-insensitive log search in ClickHouse
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m4s
CI / docker (push) Successful in 57s
CI / deploy-feature (push) Has been skipped
CI / deploy (push) Successful in 1m29s
LIKE is case-sensitive in ClickHouse. Switch to ILIKE for message,
stack_trace, and logger_name searches so queries match regardless
of casing.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 16:35:34 +02:00