Commit Graph

17 Commits

Author SHA1 Message Date
hsiegeln
873e1d3df7 feat!: move query/logs/routes/diagram/agent-view endpoints under /environments/{envSlug}/
P3C — the last data/query wave of the taxonomy migration. Every user-
facing read endpoint that was keyed on env-as-query-param is now under
the env-scoped URL, making env impossible to omit and unambiguous in
server-side tenant+env filtering.

Server:
- SearchController: /api/v1/search/** → /api/v1/environments/{envSlug}/...
  Endpoints: /executions (GET), /executions/search (POST), /stats,
  /stats/timeseries, /stats/timeseries/by-app, /stats/timeseries/by-route,
  /stats/punchcard, /attributes/keys, /errors/top. Env comes from path.
- LogQueryController: /api/v1/logs → /api/v1/environments/{envSlug}/logs.
- RouteCatalogController: /api/v1/routes/catalog → /api/v1/environments/
  {envSlug}/routes. Env filter unconditional (path).
- RouteMetricsController: /api/v1/routes/metrics →
  /api/v1/environments/{envSlug}/routes/metrics (and /metrics/processors).
- DiagramRenderController: /{contentHash}/render stays flat (hashes are
  globally unique). Find-by-route moved to /api/v1/environments/{envSlug}/
  apps/{appSlug}/routes/{routeId}/diagram — the old GET /diagrams?...
  handler is removed.
- Agent views split cleanly:
  - AgentListController (new): /api/v1/environments/{envSlug}/agents
  - AgentEventsController: /api/v1/environments/{envSlug}/agents/events
  - AgentMetricsController: /api/v1/environments/{envSlug}/agents/
    {agentId}/metrics — now also rejects cross-env agents (404) as a
    defense-in-depth check, fulfilling B3.
  Agent self-service endpoints (register/refresh/heartbeat/deregister)
  remain flat at /api/v1/agents/** — JWT-authoritative.

SPA:
- queries/agents.ts, agent-metrics.ts, logs.ts, catalog.ts (route
  metrics only; /catalog stays flat), processor-metrics.ts,
  executions.ts (attributes/keys, stats, timeseries, search),
  dashboard.ts (all stats/errors/punchcard), correlation.ts,
  diagrams.ts (by-route) — all rewritten to env-scoped URLs.
- Hooks now either read env from useEnvironmentStore internally or
  require it as an argument. Query keys include env so switching env
  invalidates caches.
- useAgents/useAgentEvents signature simplified — env is no longer a
  parameter; it's read from the store. Callers (LayoutShell,
  AgentHealth, AgentInstance) updated accordingly.
- LogTab and useStartupLogs thread env through to useLogs.
- envFetch helper introduced in executions.ts for env-prefixed raw
  fetch until schema.d.ts is regenerated against the new backend.

BREAKING CHANGE: All these flat paths are removed:
  /api/v1/search/**, /api/v1/logs, /api/v1/routes/catalog,
  /api/v1/routes/metrics (and /processors), /api/v1/diagrams
  (lookup), /api/v1/agents (list), /api/v1/agents/events-log,
  /api/v1/agents/{id}/metrics, /api/v1/agent-events.
Clients must use the /api/v1/environments/{envSlug}/... equivalents.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-16 23:48:25 +02:00
hsiegeln
fcb53dd010 fix!: require environment on diagram lookup and attribute keys queries
Closes two cross-env data leakage paths. Both endpoints previously
returned data aggregated across all environments, so a diagram or
attribute key from dev could appear in a prod UI query (and vice versa).

B1: GET /api/v1/diagrams?application=&routeId= now requires
?environment= and resolves agents via
registryService.findByApplicationAndEnvironment instead of
findByApplication. Prevents serving a dev diagram for a prod route.

B2: GET /api/v1/search/attributes/keys now requires ?environment=.
SearchIndex.distinctAttributeKeys gains an environment parameter and
the ClickHouse query adds the env filter alongside the existing
tenant_id filter. Prevents prod attribute names leaking into dev
autocompletion (and vice versa).

SPA hooks updated to thread environment through from
useEnvironmentStore; query keys include environment so React Query
re-fetches on env switch. No call-site changes needed — hook
signatures unchanged.

B3 (AgentMetricsController env scope) deferred to P3C: agent-env is
effectively 1:1 today via the instance_id naming
({envSlug}-{appSlug}-{replicaIndex}), and the URL migration in P3C
to /api/v1/environments/{env}/agents/{agentId}/metrics naturally
introduces env from path. A minimal P1 fix would regress the "view
metrics of a killed agent" case.

BREAKING CHANGE: Both endpoints now require ?environment= (slug).
Clients omitting the parameter receive 400.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-16 23:19:55 +02:00
hsiegeln
694d0eef59 feat: add environment filtering across all APIs and UI
Some checks failed
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 1m8s
CI / deploy (push) Has been cancelled
CI / deploy-feature (push) Has been cancelled
CI / docker (push) Has been cancelled
Backend: Added optional `environment` query parameter to catalog,
search, stats, timeseries, punchcard, top-errors, logs, and agents
endpoints. ClickHouse queries filter by environment when specified
(literal SQL for AggregatingMergeTree, ? binds for raw tables).
StatsStore interface methods all accept environment parameter.

UI: Added EnvironmentSelector component (compact native select).
LayoutShell extracts distinct environments from agent data and
passes selected environment to catalog and agent queries via URL
search param (?env=). TopBar shows current environment label.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-04 15:42:26 +02:00
hsiegeln
f3feaddbfe feat: show distinct attribute keys in cmd-k Attributes tab
All checks were successful
CI / build (push) Successful in 1m58s
CI / cleanup-branch (push) Has been skipped
CI / docker (push) Successful in 1m46s
CI / deploy (push) Successful in 47s
CI / deploy-feature (push) Has been skipped
Add GET /search/attributes/keys endpoint that queries distinct
attribute key names from ClickHouse using JSONExtractKeys. Attribute
keys appear in the cmd-k Attributes tab alongside attribute value
matches from exchange results.

- SearchIndex.distinctAttributeKeys() interface method
- ClickHouseSearchIndex implementation using arrayJoin(JSONExtractKeys)
- SearchController /attributes/keys endpoint
- useAttributeKeys() React Query hook
- buildSearchData includes attribute keys as 'attribute' category items

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-01 21:39:27 +02:00
hsiegeln
3af1d1f3b6 feat: add useProcessorSnapshotById hook for snapshot-by-processorId endpoint
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-27 18:54:01 +01:00
hsiegeln
996ea65293 feat: LIVE/PAUSED toggle controls data fetching on sidebar navigation
All checks were successful
CI / build (push) Successful in 1m13s
CI / cleanup-branch (push) Has been skipped
CI / docker (push) Successful in 55s
CI / deploy (push) Successful in 39s
CI / deploy-feature (push) Has been skipped
LIVE: sidebar clicks trigger initial fetch + polling for the new route.
PAUSED: sidebar clicks navigate but queries are disabled — no fetches
until the user switches back to LIVE.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-25 10:01:14 +01:00
hsiegeln
4ac11551c9 feat: add auto-refresh toggle wired to all polling queries
Some checks failed
CI / build (push) Failing after 51s
CI / cleanup-branch (push) Has been skipped
CI / docker (push) Has been skipped
CI / deploy (push) Has been skipped
CI / deploy-feature (push) Has been skipped
Upgrade @cameleer/design-system to ^0.1.3 which adds LIVE/PAUSED
toggle to TopBar backed by autoRefresh state in GlobalFilterProvider.

Add useRefreshInterval() hook that returns the polling interval when
auto-refresh is on, or false when paused. Wire it into all query
hooks that use refetchInterval (executions, catalog, agents, metrics,
admin database/opensearch).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-24 18:10:32 +01:00
hsiegeln
8ad0016a8e refactor: rename group/groupName to application/applicationName
Some checks failed
CI / build (push) Failing after 40s
CI / cleanup-branch (push) Has been skipped
CI / docker (push) Has been skipped
CI / deploy (push) Has been skipped
CI / deploy-feature (push) Has been skipped
The execution-related "group" concept actually represents the
application name. Rename all Java fields, API parameters, and frontend
types from groupName→applicationName and group→application for clarity.

- Java records: ExecutionSummary, ExecutionDetail, ExecutionDocument,
  ExecutionRecord, ProcessorRecord
- API params: SearchRequest.group→application, SearchController
  @RequestParam group→application
- Services: IngestionService, DetailService, SearchIndexer, StatsStore
- Frontend: schema.d.ts, Dashboard, ExchangeDetail, RouteDetail,
  executions query hooks

Database column names (group_name) and OpenSearch field names are
unchanged — only the API-facing Java/TS field names are renamed.

RBAC group references (groups table, GroupRepository, GroupsTab) are
a separate domain concept and are NOT affected by this change.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-23 21:21:38 +01:00
hsiegeln
a108b57591 Fix route diagram open issues: bugs, visual polish, interactive features
Some checks failed
CI / build (push) Successful in 1m12s
CI / deploy (push) Has been cancelled
CI / docker (push) Has been cancelled
Batch 1 — Bug fixes:
- #51: Pass group+routeId to stats/timeseries API for route-scoped data
- #55: Propagate processor FAILED status to diagram error node highlighting

Batch 2 — Visual polish:
- #56: Brighter canvas background with amber/cyan radial gradients
- #57: Stronger glow filters (stdDeviation 3→6, opacity 0.4→0.6)
- #58: Uniform 200×40px leaf nodes with label truncation at 22 chars
- #59: Diagram legend (node types, edge types, overlay indicators)
- #64: SVG <title> tooltips on all nodes showing type, status, duration

Batch 3 — Interactive features:
- #60: Draggable minimap viewport (click-to-center, drag-to-pan)
- #62: CSS View Transitions slide animation, back arrow, Backspace key

Batch 4 — Advanced features:
- #50: Execution picker dropdown scoped to group+routeId
- #49: Iteration count badge (×N) on compound nodes
- #63: Route header stats (Executions Today, Success Rate, Avg, P99)

Closes #49 #50 #51 #55 #56 #57 #58 #59 #60 #62 #63 #64

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-14 22:14:23 +01:00
hsiegeln
465f210aee Contract-first API with DTOs, validation, and server-side OpenAPI post-processing
All checks were successful
CI / build (push) Successful in 1m27s
CI / docker (push) Successful in 2m6s
CI / deploy (push) Successful in 30s
Add dedicated request/response DTOs for all controllers, replacing raw
JsonNode parameters with validated types. Move OpenAPI path-prefix stripping
and ProcessorNode children injection into OpenApiCustomizer beans so the
spec served at /api/v1/api-docs is already clean — eliminating the need for
the ui/scripts/process-openapi.mjs post-processing script.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-14 15:33:37 +01:00
hsiegeln
50bb22d6f6 Add OIDC logout, fix OpenAPI schema types, expose end_session_endpoint
All checks were successful
CI / build (push) Successful in 1m8s
CI / docker (push) Successful in 51s
CI / deploy (push) Successful in 29s
Backend:
- Expose end_session_endpoint from OIDC provider metadata in /auth/oidc/config
- Add getEndSessionEndpoint() to OidcTokenExchanger

Frontend:
- On OIDC logout, redirect to provider's end_session_endpoint to clear SSO session
- Strip /api/v1 prefix from OpenAPI paths to match client baseUrl convention
- Add schema-types.ts with convenience type re-exports from generated schema
- Fix all type imports to use schema-types instead of raw generated schema
- Fix optional field access (processors, children, duration) with proper typing
- Fix AgentInstance.state → status field name

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-14 14:43:18 +01:00
hsiegeln
cdf4c93630 Make stats endpoint respect selected time window instead of hardcoded last hour
All checks were successful
CI / build (push) Successful in 1m10s
CI / docker (push) Successful in 48s
CI / deploy (push) Successful in 28s
P99 latency and active count now use the same from/to parameters as the
timeseries sparklines, so all stat cards are consistent with the user's
selected time range.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-13 22:19:59 +01:00
hsiegeln
96a5b00b99 Fix sparklines not rendering on initial page load in Chromium
All checks were successful
CI / build (push) Successful in 1m2s
CI / docker (push) Successful in 48s
CI / deploy (push) Successful in 38s
Replace useId() with colon-free ref-based ID generator to avoid SVG
url() gradient resolution failures, and add placeholderData to
timeseries query to prevent flash during refetch.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-13 22:00:44 +01:00
hsiegeln
cf804638d7 Add live/paused toggle, env badge, remove topnav search, rename labels
All checks were successful
CI / build (push) Successful in 1m0s
CI / docker (push) Successful in 46s
CI / deploy (push) Successful in 28s
- Add LIVE/PAUSED toggle button that auto-refreshes search results every 5s
- Source environment badge from VITE_ENV_NAME env var (defaults to DEV locally, PRODUCTION in Docker)
- Remove search trigger button from topnav (command palette still available via keyboard)
- Rename "Transaction Explorer" to "Route Explorer" and "Active Now" to "In-Flight"

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-13 18:54:24 +01:00
hsiegeln
9e6e1b350a Add stat card sparkline graphs with timeseries backend endpoint
All checks were successful
CI / build (push) Successful in 1m0s
CI / docker (push) Successful in 45s
CI / deploy (push) Successful in 23s
New /search/stats/timeseries endpoint returns bucketed counts/metrics
over a time window using ClickHouse toStartOfInterval(). Frontend
Sparkline component renders SVG polyline + gradient fill on each
stat card, driven by a useStatsTimeseries query hook.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-13 18:20:08 +01:00
hsiegeln
3f98467ba5 Fix status filter OR logic and add P99/active stats endpoint
All checks were successful
CI / build (push) Successful in 1m1s
CI / docker (push) Successful in 47s
CI / deploy (push) Successful in 29s
Status filter now parses comma-separated values into SQL IN clause
instead of exact match, so filtering by multiple statuses works.

Added GET /api/v1/search/stats returning P99 latency (last hour) and
active execution count, wired into the UI stat cards with 10s polling.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-13 17:34:11 +01:00
hsiegeln
3eb83f97d3 Add React UI with Execution Explorer, auth, and standalone deployment
Some checks failed
CI / build (push) Failing after 1m53s
CI / docker (push) Has been skipped
CI / deploy (push) Has been skipped
- Scaffold Vite + React + TypeScript frontend in ui/ with full design
  system (dark/light themes) matching the HTML mockups
- Implement Execution Explorer page: search filters, results table with
  expandable processor tree and exchange detail sidebar, pagination
- Add UI authentication: UiAuthController (login/refresh endpoints),
  JWT filter handles ui: subject prefix, CORS configuration
- Shared components: StatusPill, DurationBar, StatCard, AppBadge,
  FilterChip, Pagination — all using CSS Modules with design tokens
- API client layer: openapi-fetch with auth middleware, TanStack Query
  hooks for search/detail/snapshot queries, Zustand for state
- Standalone deployment: Nginx Dockerfile, K8s Deployment + ConfigMap +
  NodePort (30080), runtime config.js for API base URL
- Embedded mode: maven-resources-plugin copies ui/dist into JAR static
  resources, SPA forward controller for client-side routing
- CI/CD: UI build step, Docker build/push for server-ui image, K8s
  deploy step for UI, UI credential secrets

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-13 13:59:22 +01:00