refactor: rename group/groupName to application/applicationName
Some checks failed
CI / build (push) Failing after 40s
CI / cleanup-branch (push) Has been skipped
CI / docker (push) Has been skipped
CI / deploy (push) Has been skipped
CI / deploy-feature (push) Has been skipped

The execution-related "group" concept actually represents the
application name. Rename all Java fields, API parameters, and frontend
types from groupName→applicationName and group→application for clarity.

- Java records: ExecutionSummary, ExecutionDetail, ExecutionDocument,
  ExecutionRecord, ProcessorRecord
- API params: SearchRequest.group→application, SearchController
  @RequestParam group→application
- Services: IngestionService, DetailService, SearchIndexer, StatsStore
- Frontend: schema.d.ts, Dashboard, ExchangeDetail, RouteDetail,
  executions query hooks

Database column names (group_name) and OpenSearch field names are
unchanged — only the API-facing Java/TS field names are renamed.

RBAC group references (groups table, GroupRepository, GroupsTab) are
a separate domain concept and are NOT affected by this change.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
This commit is contained in:
hsiegeln
2026-03-23 21:21:38 +01:00
parent 3c226de62f
commit 8ad0016a8e
54 changed files with 21442 additions and 73 deletions

File diff suppressed because it is too large Load Diff

File diff suppressed because it is too large Load Diff

View File

@@ -0,0 +1,142 @@
# RBAC CRUD Gaps — Design Specification
## Goal
Add missing CRUD and assignment UI to the RBAC management page, fix date formatting, seed a built-in Admins group, and fix dashboard diagram ordering.
## References
- Parent spec: `docs/superpowers/specs/2026-03-17-rbac-management-design.md`
- Visual prototype: `examples/RBAC/rbac_management_ui.html`
---
## Changes
### 1. Users Tab — Delete + Assignments
Users cannot be created manually (they arrive via login). The detail pane gains:
- **Delete button** in the detail header area. Uses existing `ConfirmDeleteDialog` with the user's `displayName` as the confirmation string. Calls `useDeleteUser()`. **Guard:** the currently authenticated user (from `useAuthStore`) cannot delete themselves — button disabled with tooltip "Cannot delete your own account".
- **Group membership section** — "+ Add" chip opens a **multi-select dropdown** listing all groups the user is NOT already a member of. Checkboxes for batch selection, "Apply" button to commit. Calls are batched via `Promise.allSettled()` — if any fail, show an inline error, invalidate queries regardless to refresh. Existing group chips gain an "x" remove button calling `useRemoveUserFromGroup()`.
- **Direct roles section** — the existing "Effective roles" section renders both direct and inherited roles. The "+ Add" multi-select dropdown lists roles not yet directly assigned. Calls `useAssignRoleToUser()` (batched via `Promise.allSettled()`). Direct role chips gain an "x" button calling `useRemoveRoleFromUser()`. Inherited role chips (dashed border) do NOT get remove buttons — they can only be removed by changing group membership or group role assignments.
- **Created field** — change from date-only to full date+time: `new Date(createdAt).toLocaleString()`.
- **Mutation button states** — all action buttons (delete, remove chip "x") disable while their mutation is in-flight to prevent double-clicks.
### 2. Groups Tab — CRUD + Assignments
- **"+ Add group" button** in the panel header (`.btnAdd` style exists). Opens an inline form below the search bar with: name text input, optional parent group dropdown, "Create" button. Calls `useCreateGroup()`. Form clears and closes on success. On error: shows error message inline.
- **Delete button** in detail pane header. Uses `ConfirmDeleteDialog` with group name. Calls `useDeleteGroup()`. Resets selected group. **Guard:** the built-in Admins group (`SystemRole.ADMINS_GROUP_ID`) cannot be deleted — button disabled with tooltip "Built-in group cannot be deleted".
- **Assigned roles section** — "+ Add" multi-select dropdown listing roles not yet assigned to this group. Batched via `Promise.allSettled()`. Calls `useAssignRoleToGroup()`. Role chips gain "x" for `useRemoveRoleFromGroup()`.
- **Parent group** — shown as a dropdown in the detail header area, allowing re-parenting. Calls `useUpdateGroup()`. The dropdown excludes the group itself and its transitive descendants (cycle prevention — requires recursive traversal of `childGroups` on each `GroupDetail`). Setting to empty/none makes it top-level.
### 3. Roles Tab — CRUD
- **"+ Add role" button** in panel header. Opens an inline form: name (required), description (optional), scope (optional, free-text, defaults to "custom"). Calls `useCreateRole()`.
- **Delete button** in detail pane header. **Disabled for system roles** (lock icon + tooltip "System roles cannot be deleted"). Custom roles use `ConfirmDeleteDialog` with role name → `useDeleteRole()`.
- No assignment UI on the roles tab — assignments are managed from the User and Group detail panes.
### 4. Multi-Select Dropdown Component
A reusable component used across all assignment actions:
```
Props:
items: { id: string; label: string }[] — available items to pick from
onApply: (selectedIds: string[]) => void — called with all checked IDs
placeholder?: string — search filter placeholder
```
Behavior:
- Opens as a positioned dropdown below the "+ Add" chip
- Search/filter input at top
- Checkbox list of items (max-height with scroll)
- "Apply" button at bottom (disabled when nothing selected)
- Closes on Apply, Escape, or click-outside
- Shows count badge on Apply button: "Apply (3)"
Styling: background `var(--bg-raised)`, border `var(--border)`, border-radius `var(--radius-md)`, items with `var(--bg-hover)` on hover, checkboxes with `var(--amber)` accent.
### 5. Inline Create Form
A reusable pattern for "Add group" and "Add role":
- Appears below the search bar in the list pane, pushing content down
- Input fields with labels
- "Create" and "Cancel" buttons
- On success: closes form, clears inputs, new entity appears in list
- On error: shows error message inline
- "Create" button disabled while mutation is in-flight
### 6. Built-in Admins Group Seed
**Database migration** — new `V2__admin_group_seed.sql` (V1 is already deployed, V2-V10 were deleted in the migration consolidation so V2 is safe):
```sql
-- Built-in Admins group
INSERT INTO groups (id, name) VALUES
('00000000-0000-0000-0000-000000000010', 'Admins');
-- Assign ADMIN role to Admins group
INSERT INTO group_roles (group_id, role_id) VALUES
('00000000-0000-0000-0000-000000000010', '00000000-0000-0000-0000-000000000004');
```
**SystemRole.java** — add constants:
```java
public static final UUID ADMINS_GROUP_ID = UUID.fromString("00000000-0000-0000-0000-000000000010");
```
**UiAuthController.login()** — after upserting the user and assigning ADMIN role, also add to Admins group:
```java
rbacService.addUserToGroup(subject, SystemRole.ADMINS_GROUP_ID);
```
**Frontend guard:** The Admins group UUID is hardcoded as a constant in the frontend to disable deletion. Alternatively, check if a group's ID matches a known system group ID.
### 7. Dashboard Diagram Ordering
The inheritance diagram's three columns (Groups → Roles → Users) must show items in a consistent, matching order:
- **Groups column**: alphabetical by name, children indented under parents
- **Roles column**: iterate groups top-to-bottom, collect their direct roles, deduplicate preserving first-seen order. Roles not assigned to any group are omitted from the diagram.
- **Users column**: alphabetical by display name
Sort explicitly in `DashboardTab.tsx` before rendering.
---
## Files Changed
### Frontend — Modified
| File | Change |
|---|---|
| `ui/src/pages/admin/rbac/UsersTab.tsx` | Delete button, group/role assignment dropdowns, date format fix, self-delete guard |
| `ui/src/pages/admin/rbac/GroupsTab.tsx` | Add group form, delete button, role assignment dropdown, parent group dropdown, Admins guard |
| `ui/src/pages/admin/rbac/RolesTab.tsx` | Add role form, delete button (disabled for system) |
| `ui/src/pages/admin/rbac/DashboardTab.tsx` | Sort diagram columns consistently |
| `ui/src/pages/admin/rbac/RbacPage.module.css` | Styles for multi-select dropdown, inline create form, delete button, action chips, remove buttons |
### Frontend — New
| File | Responsibility |
|---|---|
| `ui/src/pages/admin/rbac/components/MultiSelectDropdown.tsx` | Reusable multi-select picker with search, checkboxes, batch apply |
### Backend — Modified
| File | Change |
|---|---|
| `cameleer3-server-core/.../rbac/SystemRole.java` | Add `ADMINS_GROUP_ID` constant |
| `cameleer3-server-app/.../security/UiAuthController.java` | Add admin user to Admins group on login |
### Backend — New Migration
| File | Change |
|---|---|
| `cameleer3-server-app/src/main/resources/db/migration/V2__admin_group_seed.sql` | Seed Admins group + ADMIN role assignment |
---
## Out of Scope
- Editing user profile fields (name, email) — users are managed by their identity provider
- Drag-and-drop group hierarchy management
- Role permission editing (custom roles have no effect on Spring Security yet)

View File

@@ -0,0 +1,327 @@
# RBAC Management — Design Specification
## Goal
Implement a full RBAC management system (issue #41) with group hierarchy, role inheritance, and a management UI integrated into the admin section. Replace the flat `users.roles` text array with a proper relational model.
## References
- Functional spec: `examples/RBAC/rbac-ui-spec.md`
- Visual prototype: `examples/RBAC/rbac_management_ui.html`
---
## Backend
### Database Schema
Squash V1V10 Flyway migrations into a single `V1__init.sql`. The `users` table drops the `roles TEXT[]` column. New tables:
```sql
CREATE TABLE users (
user_id TEXT PRIMARY KEY,
provider TEXT NOT NULL,
email TEXT,
display_name TEXT,
created_at TIMESTAMPTZ NOT NULL DEFAULT now(),
updated_at TIMESTAMPTZ NOT NULL DEFAULT now()
);
-- RBAC: all roles — system roles seeded with fixed UUIDs, custom roles created by admins
CREATE TABLE roles (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
name TEXT NOT NULL UNIQUE,
description TEXT NOT NULL DEFAULT '',
scope TEXT NOT NULL DEFAULT 'custom',
system BOOLEAN NOT NULL DEFAULT false,
created_at TIMESTAMPTZ NOT NULL DEFAULT now()
);
-- Seed system roles with fixed UUIDs (stable across environments)
INSERT INTO roles (id, name, description, scope, system) VALUES
('00000000-0000-0000-0000-000000000001', 'AGENT', 'Agent registration and data ingestion', 'system-wide', true),
('00000000-0000-0000-0000-000000000002', 'VIEWER', 'Read-only access to dashboards and data', 'system-wide', true),
('00000000-0000-0000-0000-000000000003', 'OPERATOR', 'Operational commands (start/stop/configure agents)', 'system-wide', true),
('00000000-0000-0000-0000-000000000004', 'ADMIN', 'Full administrative access', 'system-wide', true);
-- RBAC: groups with self-referential hierarchy
CREATE TABLE groups (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
name TEXT NOT NULL UNIQUE,
parent_group_id UUID REFERENCES groups(id) ON DELETE SET NULL,
created_at TIMESTAMPTZ NOT NULL DEFAULT now()
);
-- Join: roles assigned to groups (system + custom)
CREATE TABLE group_roles (
group_id UUID NOT NULL REFERENCES groups(id) ON DELETE CASCADE,
role_id UUID NOT NULL REFERENCES roles(id) ON DELETE CASCADE,
PRIMARY KEY (group_id, role_id)
);
-- Join: direct group membership for users
CREATE TABLE user_groups (
user_id TEXT NOT NULL REFERENCES users(user_id) ON DELETE CASCADE,
group_id UUID NOT NULL REFERENCES groups(id) ON DELETE CASCADE,
PRIMARY KEY (user_id, group_id)
);
-- Join: direct role assignments to users (system + custom)
CREATE TABLE user_roles (
user_id TEXT NOT NULL REFERENCES users(user_id) ON DELETE CASCADE,
role_id UUID NOT NULL REFERENCES roles(id) ON DELETE CASCADE,
PRIMARY KEY (user_id, role_id)
);
-- Indexes for join query performance
CREATE INDEX idx_user_roles_user_id ON user_roles(user_id);
CREATE INDEX idx_user_groups_user_id ON user_groups(user_id);
CREATE INDEX idx_group_roles_group_id ON group_roles(group_id);
CREATE INDEX idx_groups_parent ON groups(parent_group_id);
```
Note: `roles TEXT[]` column is removed from `users`. All roles (system and custom) live in the `roles` table. System roles are seeded rows with `system = true` and fixed UUIDs — the application prevents their deletion or modification.
### System Roles
The four system roles (AGENT, VIEWER, OPERATOR, ADMIN) are:
- Seeded as rows in the `roles` table with `system = true` and fixed UUIDs
- Assigned to users via the same `user_roles` join table as custom roles
- Protected by application logic: creation, deletion, and name/scope modification are rejected
- Displayed in the UI as read-only entries (lock icon, non-deletable)
- Used by Spring Security / JWT for authorization decisions
Custom roles (`system = false`) are application-defined and have no effect on Spring Security — they serve the RBAC management model only (for future permission expansion).
The `scope` field distinguishes role domains: system roles use `system-wide`, custom roles can use descriptive scopes like `monitoring:read`, `config:write` for future permission gating.
### Domain Model (Java)
```java
// Existing, modified — drop roles field
public record UserInfo(
String userId,
String provider,
String email,
String displayName,
Instant createdAt
) {}
// New — enriched user for admin API responses
public record UserDetail(
String userId,
String provider,
String email,
String displayName,
Instant createdAt,
List<RoleSummary> directRoles, // from user_roles join (system + custom)
List<GroupSummary> directGroups, // from user_groups join
List<RoleSummary> effectiveRoles, // computed: union of direct + inherited via groups
List<GroupSummary> effectiveGroups // computed: direct groups + their ancestor chain
) {}
public record GroupDetail(
UUID id,
String name,
UUID parentGroupId, // nullable
Instant createdAt,
List<RoleSummary> directRoles,
List<RoleSummary> effectiveRoles, // direct + inherited from parent chain
List<UserSummary> members, // direct members
List<GroupSummary> childGroups
) {}
public record RoleDetail(
UUID id,
String name,
String description,
String scope,
boolean system, // true for AGENT/VIEWER/OPERATOR/ADMIN
Instant createdAt,
List<GroupSummary> assignedGroups,
List<UserSummary> directUsers,
List<UserSummary> effectivePrincipals // all users who hold this role
) {}
// Summaries for embedding in detail responses
public record UserSummary(String userId, String displayName, String provider) {}
public record GroupSummary(UUID id, String name) {}
public record RoleSummary(UUID id, String name, boolean system, String source) {}
// source: "direct" | group name (for inherited)
```
### Inheritance Logic
Server-side computation in a service class (e.g., `RbacService`):
1. **Effective groups for user**: Start from `user_groups` (direct memberships), then for each group walk `parent_group_id` chain upward to collect all ancestor groups. The union is every group the user is transitively a member of.
2. **Effective roles for user**: Direct `user_roles` + all `group_roles` for every effective group. Both system and custom roles flow through the same path.
3. **Effective roles for group**: Direct `group_roles` + inherited from parent chain.
4. **Effective principals for role**: All users who hold the role directly + all users in any group that has the role (transitively).
No role negation — roles only grant, never deny.
**Cycle detection**: When setting `parent_group_id` on a group, the application must walk the proposed parent chain upward and reject the update if it would create a cycle (i.e., the group appears in its own ancestor chain). Return HTTP 409 Conflict.
### Auth Integration
`JwtService` and `SecurityConfig` read system roles from `user_roles` joined to `roles WHERE system = true`, instead of `users.roles`. The `UserRepository` methods that currently read/write `users.roles` are updated to use the join table. JWT claims remain unchanged (`roles: ["ADMIN", "VIEWER"]`).
OIDC auto-signup: When a user is auto-registered via OIDC token exchange, they get a row in `users` with `provider = "oidc:<issuer>"` and a default system role (VIEWER) via `user_roles`. No group membership by default.
### API Endpoints
All under `/api/v1/admin/` prefix, protected by `@PreAuthorize("hasRole('ADMIN')")`.
The existing `PUT /users/{userId}/roles` bulk endpoint is removed. Role assignments use individual add/remove endpoints.
All mutation endpoints log to the `AuditService` (category: `USER_MGMT` for user operations, `RBAC` for group/role operations).
**Users** — response type: `UserDetail`
| Method | Path | Description | Request Body |
|---|---|---|---|
| GET | `/users` | List all users with effective roles/groups | — |
| GET | `/users/{id}` | Full user detail | — |
| POST | `/users/{id}/roles/{roleId}` | Assign role to user (system or custom) | — |
| DELETE | `/users/{id}/roles/{roleId}` | Remove role from user | — |
| POST | `/users/{id}/groups/{groupId}` | Add user to group | — |
| DELETE | `/users/{id}/groups/{groupId}` | Remove user from group | — |
| DELETE | `/users/{id}` | Delete user | — |
**Groups** — response type: `GroupDetail`
| Method | Path | Description | Request Body |
|---|---|---|---|
| GET | `/groups` | List all groups with hierarchy | — |
| GET | `/groups/{id}` | Full group detail | — |
| POST | `/groups` | Create group | `{ name, parentGroupId? }` |
| PUT | `/groups/{id}` | Update group | `{ name?, parentGroupId? }` — returns 409 on cycle |
| DELETE | `/groups/{id}` | Delete group — cascades role/member associations; child groups become top-level (parent set to null) | — |
| POST | `/groups/{id}/roles/{roleId}` | Assign role to group | — |
| DELETE | `/groups/{id}/roles/{roleId}` | Remove role from group | — |
**Roles** — response type: `RoleDetail`
| Method | Path | Description | Request Body |
|---|---|---|---|
| GET | `/roles` | List all roles (system + custom) | — |
| GET | `/roles/{id}` | Role detail | — |
| POST | `/roles` | Create custom role | `{ name, description?, scope? }` |
| PUT | `/roles/{id}` | Update custom role (rejects system roles) | `{ name?, description?, scope? }` |
| DELETE | `/roles/{id}` | Delete custom role (rejects system roles) | — |
**Dashboard:**
| Method | Path | Description |
|---|---|---|
| GET | `/rbac/stats` | `{ userCount, activeUserCount, groupCount, maxGroupDepth, roleCount }` |
---
## Frontend
### Routing
New route at `/admin/rbac` in `router.tsx`, lazy-loaded:
```tsx
const RbacPage = lazy(() => import('./pages/admin/rbac/RbacPage').then(m => ({ default: m.RbacPage })));
// ...
{ path: 'admin/rbac', element: <Suspense fallback={null}><RbacPage /></Suspense> }
```
Update `AppSidebar` ADMIN_LINKS to add `{ to: '/admin/rbac', label: 'User Management' }`.
### Component Structure
```
pages/admin/rbac/
├── RbacPage.tsx ← ADMIN role gate + tab navigation
├── RbacPage.module.css ← All RBAC-specific styles
├── DashboardTab.tsx ← Stat cards + inheritance diagram
├── UsersTab.tsx ← Split pane orchestrator
├── GroupsTab.tsx ← Split pane orchestrator
├── RolesTab.tsx ← Split pane orchestrator
├── components/
│ ├── EntityListPane.tsx ← Reusable: search input + scrollable card list
│ ├── EntityCard.tsx ← Single list row: avatar, name, meta, tags, status dot
│ ├── UserDetail.tsx ← Header, fields, groups, effective roles, group tree
│ ├── GroupDetail.tsx ← Header, fields, members, children, roles, hierarchy
│ ├── RoleDetail.tsx ← Header, fields, assigned groups/users, effective principals
│ ├── InheritanceChip.tsx ← Chip with dashed border + "↑ Source" annotation
│ ├── GroupTree.tsx ← Indented tree with corner connectors
│ ├── EntityAvatar.tsx ← Circle (user), rounded-square (group/role), color by type
│ ├── OidcBadge.tsx ← Small badge showing OIDC provider origin
│ ├── InheritanceDiagram.tsx ← Three-column Groups→Roles→Users read-only diagram
│ └── InheritanceNote.tsx ← Green-bordered explanation block
api/queries/admin/
│ └── rbac.ts ← useUsers, useUser, useGroups, useGroup, useRoles, useRole, useRbacStats + mutation hooks
```
### Tab Navigation
`RbacPage` uses a horizontal tab bar (Dashboard | Users | Groups | Roles) with URL-synced active state via query parameter (`?tab=users`). Each tab renders its content below the tab bar in the full main panel area.
### Split Pane Layout
Users, Groups, and Roles tabs share the same layout:
- **Left (52%)**: `EntityListPane` with search input + scrollable entity cards
- **Right (48%)**: Detail pane showing selected entity, or empty state prompt
- Resizable via `ResizableDivider` (existing shared component)
### Entity Card Patterns
**User card:** Circle avatar (initials, blue tint) + name + email/primary-group meta + role tags (amber) + group tags (green) + status dot + OIDC badge if `provider !== "local"`
**Group card:** Rounded-square avatar (initials, green/amber/red by domain) + name + parent/member-count meta + role tags (direct solid, inherited faded+italic)
**Role card:** Rounded-square avatar (initials, amber tint) + name + description/assignment-count meta + assigned-to tags. System roles show a lock icon.
### Badge/Chip Styling
Following the spec and existing CSS token system:
| Chip type | Background | Border | Text |
|---|---|---|---|
| Role (direct) | `var(--amber-dim)` | solid `var(--amber)` | amber text |
| Role (inherited) | transparent | dashed `var(--amber)` | faded amber, italic |
| Group | `var(--green-dim)` / `#E1F5EE` | solid green | green text |
| OIDC badge | `var(--cyan-dim)` | solid cyan | cyan text, shows provider |
| System role | Same as role but with lock icon | — | — |
Inherited role chips include `↑ GroupName` annotation in the detail pane.
### OIDC Badge
Displayed on user cards and user detail when `provider !== "local"`. Shows a small cyan-tinted pill with the provider name (e.g., "OIDC" or the issuer hostname). Positioned after the user's name in the card, and as a field in the detail pane.
### Search
Client-side filtering on entity list panes — filter by any visible text (name, email, group, role). Sufficient for the expected user count.
### State Management
- React Query for all server state (users, groups, roles, stats)
- Local `useState` for selected entity, search filter, active tab
- Mutations invalidate related queries (e.g., updating a user's groups invalidates both user and group queries)
---
## Migration Strategy
1. Delete all V1V10 migration files
2. Create single `V1__init.sql` containing the full consolidated schema
3. Deployed environments: drop and recreate the database (data loss accepted)
4. CI/CD: no special handling — clean database on deploy
5. Update `application.yml` if needed: `spring.flyway.clean-on-validation-error: true` or manual DB drop
---
## Out of Scope
- Permission-based access control (custom roles don't gate endpoints — system roles do)
- Audit log panel within RBAC (existing audit log page covers this)
- Bulk import/export of users or groups
- SCIM provisioning
- Role negation / deny rules

View File

@@ -0,0 +1,261 @@
# Cameleer3 Dashboard Review -- Senior Camel Developer Perspective
**Reviewer**: Senior Apache Camel Developer (10+ years, Java DSL / Spring Boot)
**Artifact reviewed**: `mock-v2-light.html` -- Operations Dashboard (v2 synthesis)
**Date**: 2026-03-17
---
## 1. What the Dashboard Gets RIGHT
### Business ID as First-Class Citizen
The Order ID and Customer columns in the execution table are exactly what I need. When support calls me about "order OP-88421", I can paste that into the search and find the execution immediately. Every other monitoring tool I have used forces me to map business IDs to correlation IDs manually. This alone would save me 10-15 minutes per incident.
### Inline Error Previews
Showing the exception message directly in the table row without requiring a click-through is genuinely useful. The two error examples in the mock (`HttpOperationFailedException` with a 504, `SQLTransientConnectionException` with HikariPool exhaustion) are realistic Camel exceptions. I can scan the error list and immediately tell whether it is a downstream timeout or a connection pool issue. That distinction determines whether I investigate our code or page the DBA.
### Processor Timeline (Gantt View)
The processor timeline in the detail panel is the single most valuable feature. Seeing that `to(payment-api)` consumed 280ms out of a 412ms total execution, while `enrich(inventory)` took 85ms, immediately tells me WHERE the bottleneck is. In my experience, 95% of Camel performance issues are in external calls, and this view pinpoints them. The color coding (green/yellow/red) for processor bars makes the slow step obvious at a glance.
### SLA Awareness Baked In
The SLA threshold line on the latency chart, the "SLA" tag on slow durations, and the "CLOSE" warning on the p99 card are exactly the kind of proactive indicators I want. Most monitoring tools show me raw numbers; this dashboard shows me numbers in context. I know immediately that 287ms p99 is dangerously close to our 300ms SLA.
### Shift-Aware Time Context
The "since 06:00" shift concept is something I have never seen in a developer tool but actually matches how production support works. When I start my day shift, I want to see what happened overnight and what is happening now, not a rolling 24-hour window that mixes yesterday afternoon with this morning.
### Agent Health in Sidebar
Seeing agent status (live/stale/dead), throughput per agent, and error rates at a glance in the sidebar is practical. When an agent goes stale, I know to check if a pod restarted or if there is a network partition.
### Application-to-Route Navigation Hierarchy
The sidebar tree (Applications > order-service > Routes > order-intake, order-enrichment, etc.) matches how I think about Camel deployments. I have multiple applications, each with multiple routes. Being able to filter by application first, then drill into routes, is the right hierarchy.
---
## 2. What is MISSING or Could Be Better
### 2.1 Exchange Body/Header Inspection -- CRITICAL GAP
**Pain point**: The "Exchange" tab exists in the detail panel tabs but its content is not shown. This is the single most important debugging feature for a Camel developer. When a message fails at step 5 of 7, I need to see:
- What was the original inbound message (before any transformation)?
- What did the exchange body look like at each processor step?
- Which headers were present at each step, and which were added/removed?
- What was the exception body (often different from the exception message)?
**How to address it**: The Exchange tab should show a step-by-step diff view of the exchange. For each processor in the route, show the body (with a JSON/XML pretty-printer) and the headers map. Highlight headers that were added at that step. Allow comparing any two steps side-by-side. Show the original inbound message prominently at the top.
**Priority**: **Must-Have**. Without this, the dashboard is an operations monitor, not a debugging tool. This is the difference between "I can see something failed" and "I can see WHY it failed."
### 2.2 Route Diagram / Visual Graph -- MENTIONED BUT NOT SHOWN
**Pain point**: The "View Route Diagram" button exists in the detail actions, but there is no mockup of what the route diagram looks like. As a Camel developer, I need to see the DAG (directed acyclic graph) of my route: from(jms:orders) -> unmarshal -> validate -> choice -> [branch A: enrich -> transform -> to(http)] [branch B: log -> to(dlq)]. I also need to see execution overlay on the diagram -- which path did THIS specific exchange take, and how long did each node take.
**How to address it**: Add a Route Diagram page/view that shows:
- The route definition as an interactive DAG (nodes = processors, edges = flow)
- Execution overlay: color-code each node by success/failure for a specific execution
- Aggregate overlay: color-code each node by throughput/error rate over a time window
- Highlight the path taken by the selected exchange (dim the branches not taken)
- Show inter-route connections (e.g., `direct:`, `seda:`, `vm:` endpoints linking routes)
**Priority**: **Must-Have**. Cameleer already has `RouteGraph` data from agents -- this is the tool's differentiating feature.
### 2.3 Cross-Route Correlation / Message Tracing
**Pain point**: A single business transaction (e.g., an order) often spans multiple routes: `order-intake` -> `order-enrichment` -> `payment-process` -> `shipment-dispatch`. The dashboard shows each route execution as a separate row. There is no way to see the full journey of order OP-88421 across all routes.
**How to address it**: Add a "Transaction Trace" or "Message Flow" view that:
- Groups all executions sharing a breadcrumbId or correlation ID
- Shows them as a horizontal timeline or waterfall chart
- Highlights which route in the chain failed
- Works across `direct:`, `seda:`, and `vm:` endpoints that link routes
The search bar says "Search by Order ID, correlation ID" which is a good start, but the results should show the correlated group, not just individual rows.
**Priority**: **Must-Have**. Splitter/aggregator patterns and multi-route flows are the norm, not the exception, in real Camel applications.
### 2.4 Dead Letter Queue Monitoring
**Pain point**: When messages fail and are routed to a dead letter channel (which is the standard Camel error handling pattern), I need to know: how many messages are in the DLQ, what are they, how long have they been there, and can I retry them?
**How to address it**: Add a DLQ section or page showing:
- Count of messages per dead letter endpoint
- Age distribution (how many are from today vs. last week)
- Message preview (body + headers + the exception that caused routing to DLQ)
- Retry action (re-submit the message to the original route)
- Purge action (acknowledge and discard)
**Priority**: **Must-Have**. DLQ management is a daily production task.
### 2.5 Per-Processor Statistics (Aggregate View)
**Pain point**: The processor timeline in the detail panel shows per-processor timing for a single execution. But I also need aggregate statistics: for processor `to(payment-api)`, what is the p50/p95/p99 latency over the last hour? How many times did it fail? Is it getting slower over time?
**How to address it**: Clicking a processor name in the timeline should show aggregate stats for that processor. Alternatively, the Route Detail page should have a "Processors" tab with a table of all processors in the route, their call count, success rate, and latency percentiles.
**Priority**: **Must-Have**. Identifying a chronically slow processor is different from identifying a one-off slow execution.
### 2.6 Error Pattern Grouping / Top Errors
**Pain point**: The dashboard shows individual error rows. When there are 38 errors, I do not want to scroll through all 38. I want to see: "23 of the 38 errors are `HttpOperationFailedException` on `payment-process`, 10 are `SQLTransientConnectionException` on `order-enrichment`, 5 are `ValidationException` on `order-intake`." The design notes mention "Top error pattern grouping panel" from the operator expert, but it is not in the final mock.
**How to address it**: Add an error summary panel above or alongside the execution table showing errors grouped by exception class + route. Each group should show count, first/last occurrence, and whether the count is trending up.
**Priority**: **Must-Have**. Pattern recognition is more important than individual error viewing.
### 2.7 Route Status Management
**Pain point**: I need to know which routes are started, stopped, or suspended. And I need the ability to stop/start/suspend individual routes without redeploying. This is routine in production -- temporarily suspending a route that is flooding a downstream system.
**How to address it**: The sidebar route list should show route status (started/stopped/suspended) with icons. Right-click or action menu on a route should offer start/stop/suspend. This maps directly to Camel's route controller API.
**Priority**: **Nice-to-Have** for v1, **Must-Have** for v2. Operators will ask for this quickly.
### 2.8 Route Version Comparison
**Pain point**: After a deployment, I want to compare the current route definition with the previous version. Did someone add a processor? Change an endpoint URI? Route definition drift is a real source of production issues.
**How to address it**: Store route graph snapshots per deployment/version. Show a diff view highlighting added/removed/modified processors.
**Priority**: **Nice-to-Have**. Valuable but less urgent than the above.
### 2.9 Thread Pool / Resource Monitoring
**Pain point**: Camel's default thread pool max is 20. When all threads are consumed, messages queue up silently. The HikariPool error in the mock is a perfect example -- pool exhaustion. I need visibility into thread pool utilization, connection pool utilization, and inflight exchange count.
**How to address it**: Add a "Resources" section (either in the agent detail or a separate page) showing:
- Camel thread pool utilization (active/max)
- Connection pool utilization (from endpoint components)
- Inflight exchange count per route
- Consumer prefetch/backlog (for JMS/Kafka consumers)
**Priority**: **Nice-to-Have** initially, but becomes **Must-Have** when debugging pool exhaustion issues.
### 2.10 Saved Searches / Alert Rules
**Pain point**: I find myself searching for the same patterns repeatedly: "errors on payment-process in the last hour", "executions over 500ms for order-enrichment". There is no way to save these as bookmarks or convert them into alert rules.
**How to address it**: Allow saving filter configurations as named views. Allow converting a saved search into an alerting rule (email/webhook when count exceeds threshold).
**Priority**: **Nice-to-Have**.
---
## 3. Specific Page/Feature Recommendations
### 3.1 Route Detail Page
When I click a route name (e.g., `order-intake`) from the sidebar, I should see:
- **Header**: Route name, status (started/stopped), uptime, route definition source (Java DSL / XML / YAML)
- **KPI Strip**: Total executions, success rate, p50/p99 latency, inflight count, throughput -- all for this route only
- **Processor Table**: Every processor in the route with columns: name, type, call count, success rate, p50 latency, p99 latency, total time %. Sortable by any column. This is where I find the bottleneck processor.
- **Route Diagram**: Interactive DAG with execution overlay. Nodes sized by throughput, colored by error rate. Clicking a node filters the execution list to that processor.
- **Recent Executions**: Filtered version of the main table, showing only this route's executions.
- **Error Patterns**: Top errors for this route, grouped by exception class.
### 3.2 Exchange / Message Inspector
When I click "Exchange" tab in the detail panel:
- **Inbound Message**: The original message as received by the route's consumer. Body + headers. Shown prominently, always visible.
- **Step-by-Step Trace**: For each processor, show the exchange state AFTER that processor ran. Diff mode should highlight what changed (body mutations, added headers, removed headers).
- **Properties**: Camel exchange properties (not just headers). Properties often carry routing decisions.
- **Exception**: If the exchange failed, show the caught exception, the handled flag, and whether it was routed to a dead letter channel.
- **Response**: If the route produces a response (e.g., REST endpoint), show the outbound body.
Display format should auto-detect JSON/XML and pretty-print. Binary payloads should show hex dump with size.
### 3.3 Metrics Dashboard (Developer vs. Operator KPIs)
The current metrics (throughput, latency p99, error rate) are operator KPIs. A Camel developer also needs:
**Developer KPIs** (add a "Developer" metrics view):
- Per-processor latency breakdown (stacked bar: which processors consume the most time)
- External endpoint response time (HTTP, DB, JMS) -- separate from Camel processing time
- Type converter cache hit rate (rarely needed, but valuable when debugging serialization issues)
- Redelivery count (how many messages required retries before succeeding)
- Content-based router distribution (for `choice()` routes: how many messages went down each branch)
**Operator KPIs** (already well-covered):
- Throughput, error rate, latency percentiles -- these are solid as-is
### 3.4 Dead Letter Queue View
A dedicated DLQ page:
- **Summary Cards**: One card per DLQ endpoint (e.g., `jms:DLQ.orders`, `seda:error-handler`), showing message count, oldest message age, newest message timestamp.
- **Message List**: Table with columns: original route, exception class, business ID, timestamp, retry count.
- **Message Detail**: Click a DLQ message to see the exchange snapshot (body + headers + exception) at the time of failure.
- **Actions**: Retry (re-submit to original endpoint), Retry All (bulk retry for a pattern), Discard, Move to another queue.
- **Filters**: By exception type, by route, by age.
### 3.5 Route Comparison
Two use cases:
1. **Version diff**: Compare route graph v3.2.0 vs. v3.2.1. Show added/removed/modified processors as a visual diff on the DAG.
2. **Performance comparison**: Compare this week's latency distribution for `payment-process` with last week's. Overlay histograms. Useful for validating that a deployment improved (or degraded) performance.
---
## 4. Information Architecture Critique
### What Works
- **Sidebar hierarchy** (Applications > Routes) is correct and matches how Camel projects are structured.
- **Health strip at top** provides instant situational awareness without scrolling.
- **Master-detail pattern** (table + slide-in panel) avoids page navigation for quick inspection. This keeps context.
- **Keyboard shortcuts** (Ctrl+K search, arrow navigation, Esc to close) are the right accelerators for power users.
### What Needs Adjustment
**The sidebar is too flat.** It shows applications and routes in the same list, but there is no way to navigate to:
- A dedicated Route Detail page (with per-processor stats, diagram, error patterns)
- An Agent Detail page (with resource utilization, version info, configuration)
- A DLQ page
- A Search/Trace page (for cross-route correlation)
Recommendation: Add top-level navigation items to the sidebar:
```
Dashboard (the current view)
Routes (route list with status, drill into route detail)
Traces (cross-route message flow / correlation)
Errors (grouped error patterns, DLQ)
Agents (agent health, resource utilization)
Diagrams (route graph visualization)
```
**Route click should go deeper.** Currently, clicking a route in the sidebar filters the execution table. This is useful, but clicking the route NAME in a table row or in the detail panel should navigate to a dedicated Route Detail page with per-processor aggregate stats and the route diagram.
**Search results need grouping.** The Ctrl+K search bar says "Search by Order ID, route, error..." but search results should group by correlation ID when searching by business ID. If I search for "OP-88421", I want to see ALL executions related to that order across all routes, not just the one row in `payment-process`.
**1-click access priorities:**
- Health overview: 1 click (current: 0 clicks -- it is the home page -- good)
- Filter by errors only: 1 click (current: 1 click on Error pill -- good)
- View a specific execution's processor timeline: 2 clicks (current: 1 click on row -- good)
- View exchange body/headers: should be 2 clicks (click row, click Exchange tab). Currently not implemented.
- View route diagram: should be 2 clicks (click route name, see diagram). Currently requires finding the button in the detail panel.
- Cross-route trace: should be 2 clicks (click correlation ID or business ID, see trace). Currently not possible.
- DLQ status: should be 1 click from sidebar. Currently not available.
---
## 5. Score Card
| Dimension | Score (1-10) | Notes |
|-----------------------------|:---:|-------|
| Transaction tracking | 4 | Individual executions visible, but no cross-route transaction view. Correlation ID shown but not actionable. |
| Root cause analysis | 6 | Processor timeline identifies the slow/failing step. Error messages shown inline. But no exchange body inspection, no stack trace expansion, no header diff. |
| Performance monitoring | 7 | Throughput, latency p99, error rate charts with SLA lines are solid. Missing per-processor aggregate stats and resource utilization. |
| Route visualization | 3 | Route names in sidebar, but no actual route diagram/DAG. The "View Route Diagram" button exists with no destination. This is Cameleer's key differentiator -- it must ship. |
| Exchange/message visibility | 2 | Exchange tab exists but has no content. No body inspection, no header view, no step-by-step diff. This is the most critical gap. |
| Correlation/tracing | 3 | Correlation ID displayed in detail panel, but no way to trace a message across routes. No breadcrumb linking. No transaction waterfall. |
| Overall daily usefulness | 5 | As an operations monitor (is anything broken right now?), it scores 7-8. As a developer debugging tool (why is it broken and how do I fix it?), it scores 3-4. The gap is in the debugging/inspection features. |
### Summary Verdict
The dashboard is a **strong operations monitor** -- it answers "what is happening right now?" effectively. The health strip, SLA awareness, shift context, business ID columns, and inline error previews are genuinely useful and better than most tools I have used.
However, it is a **weak debugging tool** -- it does not yet answer "why did this specific message fail?" or "what did the exchange look like at each step?" The Exchange tab, route diagram, cross-route tracing, and error pattern grouping are the features that would make this a daily-driver tool rather than a pretty overview I glance at in the morning.
The processor Gantt chart in the detail panel is the single best feature in the entire dashboard. Build on that. Make it clickable (click a processor to see the exchange state at that point). Add aggregate stats. Link it to the route diagram. That is where this tool becomes indispensable.
**Bottom line**: Ship the exchange inspector, the route diagram, and cross-route tracing, and this goes from a 5/10 to an 8/10 daily-use tool.

File diff suppressed because one or more lines are too long

File diff suppressed because it is too large Load Diff

File diff suppressed because it is too large Load Diff

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long

File diff suppressed because one or more lines are too long