117 Commits

Author SHA1 Message Date
hsiegeln
b714d3363f feat(#119): expose route state in catalog API and sidebar/dashboard
Add routeState field to RouteSummary DTO (null for started, 'stopped'
or 'suspended' for non-default states). The sidebar shows stop/pause
icons and a state badge for affected routes in both the Apps and Routes
sections.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 19:15:46 +02:00
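A minimal sketch of the DTO contract described above, assuming a Java record; the field names other than routeState are guesses:

```java
// Hypothetical shape of the RouteSummary DTO after this change:
// routeState is null for started routes, "stopped" or "suspended" otherwise.
public record RouteSummary(
        String applicationId,
        String routeId,
        String routeState) { // null | "stopped" | "suspended"
}
```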
hsiegeln
0acceaf1a9 feat(#119): add RouteStateRegistry for tracking route operational state
In-memory registry that infers route state (started/stopped/suspended)
from successful route-control command ACKs. Updates state only when all
agents in a group confirm success.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 19:15:35 +02:00
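A sketch of how such a registry could work, assuming a map-backed store and a per-agent ACK callback; all names besides RouteStateRegistry are hypothetical:

```java
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: track per-route state, updating only once every
// agent in the group has ACKed the route-control command successfully.
public class RouteStateRegistry {

    private final Map<String, String> stateByRoute = new ConcurrentHashMap<>();

    /** Called after each successful command ACK from one agent. */
    public void onCommandAck(String routeKey, String command,
                             Set<String> ackedAgents, Set<String> groupAgents) {
        // Only infer the new state when all agents in the group confirmed.
        if (ackedAgents.containsAll(groupAgents)) {
            switch (command) {
                case "stop"    -> stateByRoute.put(routeKey, "stopped");
                case "suspend" -> stateByRoute.put(routeKey, "suspended");
                case "start", "resume" -> stateByRoute.remove(routeKey); // back to default
            }
        }
    }

    /** null means the route is in its default (started) state. */
    public String stateOf(String routeKey) {
        return stateByRoute.get(routeKey);
    }
}
```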
hsiegeln
ca1d472b78 feat(#117): agent-count toasts and persistent error toast dismiss
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 19:08:00 +02:00
hsiegeln
c3b4f70913 feat(#116): update command hooks for synchronous group response
Add CommandGroupResponse and ConfigUpdateResponse types. Switch
useSendGroupCommand and useSendRouteCommand from openapi-fetch to authFetch
returning CommandGroupResponse. Update useUpdateApplicationConfig to return
ConfigUpdateResponse and fix all consumer onSuccess callbacks to access
saved.config.version instead of saved.version.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 19:01:06 +02:00
hsiegeln
027e45aadf feat(#116): synchronous group command dispatch with multi-agent response collection
Add addGroupCommandWithReplies() to AgentRegistryService that sends commands
to all LIVE agents in a group and returns CompletableFuture per agent for
collecting replies. Update sendGroupCommand() and pushConfigToAgents() to
wait with a shared 10-second deadline, returning CommandGroupResponse with
per-agent status, timeouts, and overall success. Config update endpoint now
returns ConfigUpdateResponse wrapping both the saved config and push result.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 19:00:56 +02:00
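A sketch of the reply-collection half of this pattern: one CompletableFuture per agent, all awaited against a single shared deadline. The CommandGroupResponse shape here is a simplified assumption:

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical sketch of collecting per-agent replies with one shared budget.
public class GroupCommandCollector {

    public record CommandGroupResponse(Map<String, String> statusByAgent, boolean success) {}

    /** Await one future per agent against a single shared 10-second deadline. */
    public static CommandGroupResponse collect(Map<String, CompletableFuture<String>> repliesByAgent) {
        long deadline = System.nanoTime() + TimeUnit.SECONDS.toNanos(10);
        Map<String, String> status = new LinkedHashMap<>();
        boolean allOk = true;
        for (var entry : repliesByAgent.entrySet()) {
            // Shared budget: each later agent gets only the remaining time.
            long remaining = Math.max(0, deadline - System.nanoTime());
            try {
                status.put(entry.getKey(), entry.getValue().get(remaining, TimeUnit.NANOSECONDS));
            } catch (TimeoutException e) {
                status.put(entry.getKey(), "timeout");
                allOk = false;
            } catch (Exception e) {
                status.put(entry.getKey(), "error");
                allOk = false;
            }
        }
        return new CommandGroupResponse(status, allOk);
    }
}
```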
hsiegeln
f39f07e7bf feat(#118): add confirmation dialog for stop and suspend commands
Stop and suspend route commands now show a ConfirmDialog requiring
typed confirmation before dispatch. Start and resume execute
immediately without confirmation.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 18:54:23 +02:00
hsiegeln
d21d8b2c48 fix(#112): initialize sidebar accordion state from initial route
Direct navigation to /admin/* now correctly opens Admin section
and collapses operational sections on first render. Previously
the accordion effect only triggered on route transitions.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 18:36:43 +02:00
hsiegeln
d5f5601554 fix(#112): add missing Routes section, fix admin double padding
Review feedback: buildRouteTreeNodes was defined but never rendered.
Added Routes section between Agents and Admin. Removed duplicate
padding on admin pages (AdminLayout handles its own padding).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 18:32:26 +02:00
hsiegeln
00042b1d14 feat(#112): remove admin tabs, sidebar handles navigation
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 18:29:29 +02:00
hsiegeln
fe49eb5aba feat(#112): migrate to composable sidebar with accordion and collapse
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 18:29:25 +02:00
hsiegeln
bc913eef6e feat(#112): extract sidebar tree builders and types from DS
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 18:29:22 +02:00
hsiegeln
d70ad91b33 docs: clarify search ownership and icon-rail click behavior
Search: the DS renders a dumb input; the app owns filterQuery state and
passes it to each SidebarTree. Icon-rail click: fires both
onCollapseToggle and onToggle simultaneously, with no navigation.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 17:41:31 +02:00
hsiegeln
ba361af2d7 docs: composable sidebar design spec for #112
Replaces the previous "hide sidebar on admin" approach with a
composable compound component design. DS provides shell + building
blocks (Sidebar, Section, Footer, SidebarTree); consuming app
controls all content, section ordering, accordion behavior, and
icon-rail collapse.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 17:38:01 +02:00
hsiegeln
78777d2ba6 Revert "feat(#112): hide sidebar, topbar, cmd palette on admin pages"
This reverts commit d95e518622.
2026-04-02 17:22:06 +02:00
hsiegeln
3f8a9715a4 Revert "feat(#112): add admin header bar with back button and logout"
This reverts commit a484364029.
2026-04-02 17:22:06 +02:00
hsiegeln
f00a3e8b97 Revert "fix(#112): remove dead admin breadcrumb code, add logout aria-label"
This reverts commit d5028193c0.
2026-04-02 17:22:06 +02:00
hsiegeln
d5028193c0 fix(#112): remove dead admin breadcrumb code, add logout aria-label
Review feedback: breadcrumb memo had an unused isAdminPage branch
(TopBar no longer renders on admin pages). Added aria-label to
icon-only logout button for screen readers.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 17:16:01 +02:00
hsiegeln
a484364029 feat(#112): add admin header bar with back button and logout
AdminLayout gains a self-contained header (Back / Admin / user+logout)
with CSS module styles, replacing the inline padding wrapper. Admin
pages now render fully without the main app chrome.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-02 17:12:50 +02:00
hsiegeln
d95e518622 feat(#112): hide sidebar, topbar, cmd palette on admin pages
Pass null as sidebar prop, guard TopBar and CommandPalette with
!isAdminPage, and remove conditional admin padding from main element.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-02 17:12:44 +02:00
hsiegeln
56297701e6 fix: use ILIKE for case-insensitive log search in ClickHouse
LIKE is case-sensitive in ClickHouse. Switch to ILIKE for message,
stack_trace, and logger_name searches so queries match regardless
of casing.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 16:35:34 +02:00
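An illustrative fragment of the change, assuming a logs table with these columns and an injected ClickHouse JdbcTemplate:

```java
// Hypothetical sketch: ClickHouse's LIKE is case-sensitive; ILIKE is not.
String sql = """
    SELECT timestamp, level, message
    FROM logs
    WHERE message ILIKE ?        -- was: LIKE
       OR stack_trace ILIKE ?
       OR logger_name ILIKE ?
    ORDER BY timestamp DESC
    LIMIT 100
    """;
String pattern = "%" + query + "%";
var rows = clickHouseJdbcTemplate.queryForList(sql, pattern, pattern, pattern);
```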
hsiegeln
8c7c9911c4 feat: highlight search matches in log results
Recursive case-insensitive highlighting of the search query in
collapsed message, expanded full message, and stack trace. Uses the
project's amber accent color for the highlight mark.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 16:34:15 +02:00
hsiegeln
4d66d6ab23 fix: use deterministic badge color for app names in Logs tab
Use attributeBadgeColor() (hash-based) instead of "auto" so the same
application name gets the same badge color across all pages.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 16:31:04 +02:00
hsiegeln
b73f5e6dd4 feat: add Logs tab with cursor-paginated search, level filters, and live tail
- Extend GET /api/v1/logs with cursor pagination, multi-level filtering,
  optional application scoping, and level count aggregation
- Add exchangeId, instanceId, application, mdc fields to log responses
- Refactor ClickHouseLogStore with keyset pagination (N+1 pattern)
- Add LogSearchRequest/LogSearchResponse core domain records
- Create LogSearchPageResponse wrapper DTO
- Add Logs as 4th content tab (Exchanges | Dashboard | Runtime | Logs)
- Implement LogSearch component with debounced search, level filter bar,
  expandable log entries, cursor pagination, and live tail mode
- Add cross-navigation: exchange header → logs, log tab → logs tab
- Update ClickHouseLogStoreIT with cursor, multi-level, cross-app tests

Closes: #104

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-02 08:47:16 +02:00
hsiegeln
a52751da1b fix: avoid alias shadowing in processor metrics -Merge query
ClickHouse 24.12's new query analyzer resolves countMerge(total_count)
in the CASE WHEN to the SELECT alias (UInt64) instead of the original
AggregateFunction column when the alias has the same name as the column.
Renamed the aliases to tc/fc to avoid the collision.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-01 22:24:50 +02:00
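A hedged reconstruction of the query shape after both this fix and the ORDER BY fix in the next commit; the table and column names come from nearby commits, while the ratio expression is invented for illustration:

```java
// tc/fc no longer shadow the AggregateFunction columns, so the
// countMerge(...) inside CASE WHEN resolves to the column again. After
// GROUP BY the result is a plain UInt64, so ORDER BY must use the alias.
String sql = """
    SELECT processor_id,
           countMerge(total_count)    AS tc,   -- was: AS total_count
           countIfMerge(failed_count) AS fc,   -- was: AS failed_count
           CASE WHEN countMerge(total_count) = 0 THEN 0
                ELSE countIfMerge(failed_count) / countMerge(total_count)
           END AS failure_ratio
    FROM stats_1m_processor_detail
    GROUP BY processor_id
    ORDER BY tc DESC
    """;
```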
hsiegeln
51780031ea fix: use alias in ORDER BY for processor metrics query
ClickHouse rejects countMerge() in ORDER BY after GROUP BY because the
column is already finalized to UInt64. Use the SELECT alias instead.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-01 22:11:54 +02:00
hsiegeln
eb2cafc7fa fix: use jar instead of unzip in sonarqube workflow
The build container lacks unzip. The JDK jar command handles zip
extraction natively.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-01 22:02:09 +02:00
hsiegeln
805e6d51cb fix: add processor_type to stats_1m_processor_detail MV
The table and materialized view were missing the processor_type column,
causing the RouteMetricsController query to fail and the dashboard
processor metrics table to render empty.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-01 22:00:23 +02:00
hsiegeln
f3feaddbfe feat: show distinct attribute keys in cmd-k Attributes tab
Add GET /search/attributes/keys endpoint that queries distinct
attribute key names from ClickHouse using JSONExtractKeys. Attribute
keys appear in the cmd-k Attributes tab alongside attribute value
matches from exchange results.

- SearchIndex.distinctAttributeKeys() interface method
- ClickHouseSearchIndex implementation using arrayJoin(JSONExtractKeys)
- SearchController /attributes/keys endpoint
- useAttributeKeys() React Query hook
- buildSearchData includes attribute keys as 'attribute' category items

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-01 21:39:27 +02:00
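A sketch of the distinct-keys query, assuming exchanges live in an executions table with a JSON attributes column:

```java
// arrayJoin expands each row's key array into one row per key, so
// DISTINCT yields the set of attribute key names across all exchanges.
String sql = """
    SELECT DISTINCT arrayJoin(JSONExtractKeys(attributes)) AS attribute_key
    FROM executions
    ORDER BY attribute_key
    """;
List<String> keys = clickHouseJdbcTemplate.queryForList(sql, String.class);
```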
hsiegeln
9057981cf7 fix: use composite ID for routes in command palette search data
Routes with the same name across different applications (e.g., "route1"
in both QUARKUS-APP and BACKEND-APP) were deduplicated because they
shared the same id (routeId). Use appId/routeId as the id so all
routes appear in cmd-k results.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-01 21:33:23 +02:00
hsiegeln
b30a5b5760 fix: prevent cmd-k scroll reset on catalog poll refresh
The searchData useMemo recomputed on every catalog poll cycle because
catalogData got a new array reference even when content was unchanged.
This caused the CommandPalette list to re-render and reset scroll.

Use a ref with deep equality check to keep a stable catalog reference,
only updating when the actual data changes.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-01 21:22:50 +02:00
hsiegeln
910230cbf8 fix: add <mark> highlighting to search match context snippets
The command palette renders matchContext via dangerouslySetInnerHTML
expecting HTML with <mark> tags, but extractSnippet() returned plain
text. Wrap the matched term in <mark> tags and escape surrounding
text to prevent XSS.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-01 21:18:04 +02:00
hsiegeln
1d791bb329 fix: use exact match for ID fields in full-text search
ID fields (execution_id, correlation_id, exchange_id) should use
exact equality, not LIKE with wildcards. LIKE is only needed for
the _search_text full-text columns.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-01 21:13:54 +02:00
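A fragment illustrating the split, with the same assumed table shape (_search_text is the materialized full-text column described in the next commit):

```java
// Hypothetical WHERE clause: exact equality for ID columns, LIKE with
// wildcards only for the ngram-indexed full-text column.
String sql = """
    SELECT execution_id, route_id, timestamp
    FROM executions
    WHERE execution_id = ?       -- was: LIKE with wildcards
       OR correlation_id = ?
       OR exchange_id = ?
       OR _search_text LIKE ?
    LIMIT 50
    """;
var rows = clickHouseJdbcTemplate.queryForList(sql, term, term, term, "%" + term + "%");
```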
hsiegeln
9781fe0d7c fix: include execution/correlation/exchange IDs in full-text search
The _search_text materialized column only contained error messages,
bodies, and headers — not execution_id, correlation_id, exchange_id,
or route_id. Searching by ID via cmd-k returned no results.

- Add ID fields to _search_text in ClickHouse DDL (covered by ngram
  bloom filter index)
- Add direct LIKE matches on execution_id, correlation_id, exchange_id
  in the text search WHERE clause for faster exact ID lookups

Requires ClickHouse table recreation (fresh install).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-01 21:12:15 +02:00
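A hedged sketch of the DDL shape this implies; the concrete column list, ngram parameters, and index name are all assumptions:

```java
// Hypothetical CREATE TABLE fragment: ID fields folded into the
// materialized _search_text column, covered by an ngram bloom filter.
String ddl = """
    CREATE TABLE executions
    (
        execution_id   String,
        correlation_id String,
        exchange_id    String,
        route_id       String,
        -- ... remaining columns (error_message, body_text, headers_text, ...) ...
        _search_text String MATERIALIZED concat(
            error_message, ' ', body_text, ' ', headers_text, ' ',
            execution_id, ' ', correlation_id, ' ', exchange_id, ' ', route_id),
        INDEX idx_search_text _search_text TYPE ngrambf_v1(3, 65536, 4, 0) GRANULARITY 4
    )
    ENGINE = ReplacingMergeTree
    ORDER BY execution_id
    """;
```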
hsiegeln
92951f1dcf chore: update @cameleer/design-system to v0.1.22
Sidebar selectedPath now uses sidebarReveal on all tabs, not just
exchanges. This fixes sidebar highlighting on dashboard and runtime.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-01 21:09:20 +02:00
hsiegeln
a7d256b38a fix: compute hasTraceData from processor records in chunk accumulator
The chunked ingestion path hardcoded hasTraceData=false because the
execution envelope doesn't carry processor bodies. But the processor
records DO have inputBody/outputBody — we just need to check them.

Track hasTraceData across chunks in PendingExchange and pass it to
MergedExecution when the final chunk arrives or on stale sweep.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-01 21:04:34 +02:00
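A fragment of the accumulation logic, with the PendingExchange accessor names assumed:

```java
// A chunk carries trace data if any of its flat processor records has a
// captured input or output body; OR it into the pending exchange's flag.
boolean chunkHasTrace = chunk.processors().stream()
        .anyMatch(p -> p.inputBody() != null || p.outputBody() != null);
pending.setHasTraceData(pending.hasTraceData() || chunkHasTrace);
// On the final chunk (or stale sweep), pending.hasTraceData() is copied
// into the MergedExecution instead of the previous hardcoded false.
```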
hsiegeln
e26266532a fix: regenerate OpenAPI types, fix search scoping by applicationId
The identity rename (application→applicationId) broke search filtering
because the stale schema.d.ts still had 'application' as the field name.
The backend silently ignored the unknown field, returning unfiltered results.

- Regenerate openapi.json and schema.d.ts from live backend
- Fix Dashboard: application→applicationId in search request
- Fix RouteDetail: application→applicationId in search request (2 places)
- LayoutShell: scope command palette search by appId/routeId
- LayoutShell: pass sidebarReveal state on sidebar click navigation

Note for DS team: the Sidebar selectedPath logic (line 5451 in dist)
has a hardcoded pathname.startsWith("/exchanges/") guard. This should
be broadened to simply `S ? S : $.pathname` so sidebarReveal works on
all tabs (dashboard, runtime), not just exchanges.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-01 20:55:19 +02:00
hsiegeln
178bc40706 Revert "fix: sidebar selection highlight and scoped command palette search"
This reverts commit 4168a6d45b.
2026-04-01 20:43:27 +02:00
hsiegeln
4168a6d45b fix: sidebar selection highlight and scoped command palette search
Two fixes:
- Pass sidebarReveal state on sidebar navigation so the design system
  can highlight the selected entry (it compares internal /apps/... paths
  against this state value, not the browser URL)
- Command palette search now includes scope.appId and scope.routeId
  so results are filtered to the current sidebar selection

Note: sidebar highlighting works on the exchanges tab. The design
system's selectedPath logic only checks pathname.startsWith("/exchanges/")
for sidebarReveal — a DS update is needed to support /dashboard/ and
/runtime/ tabs too.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-01 20:41:42 +02:00
hsiegeln
a028905e41 fix: update agent field names in frontend to match backend DTO
The AgentInstanceResponse backend DTO uses instanceId, displayName,
applicationId, status — but the stale schema.d.ts still had id, name,
application, state. This caused the runtime table to show no data.

- Update schema.d.ts AgentInstanceResponse fields
- Fix AgentHealth: row.id→instanceId, row.name→displayName,
  row.application→applicationId, inst.id→instanceId
- Fix AgentInstance: agent.id→instanceId, agent.name→displayName
- Fix ExchangeHeader: agent.id→instanceId, agent.state→status
- Fix LayoutShell search: agent.state→status, agentTps→tps

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-01 20:36:31 +02:00
hsiegeln
f82aa26371 fix: improve ClickHouse admin page, fix AgentHealth type error
Rewrite ClickHouse admin to show useful storage metrics instead of
often-empty system.events data. Add active queries section.

- Replace performance endpoint: query system.parts for disk size,
  uncompressed size, compression ratio, total rows, part count
- Add /queries endpoint querying system.processes for active queries
- Frontend: storage overview strip, tables with total size, active
  queries DataTable
- Fix AgentHealth.tsx type: agentId → instanceId in inline type cast

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-01 20:18:06 +02:00
hsiegeln
188810e54b feat: remove TimescaleDB, dead PG stores, and storage feature flags
Complete the ClickHouse migration by removing all PostgreSQL analytics
code. PostgreSQL now serves only RBAC, config, and audit — all
observability data is exclusively in ClickHouse.

- Delete 6 dead PostgreSQL store classes (executions, stats, diagrams,
  events, metrics, metrics-query) and 2 integration tests
- Delete RetentionScheduler (ClickHouse TTL handles retention)
- Remove all 7 cameleer.storage.* feature flags from application.yml
- Remove all @ConditionalOnProperty from ClickHouse beans in StorageBeanConfig
- Consolidate 14 Flyway migrations (V1-V14) into single clean V1 with
  only RBAC/config/audit tables (no TimescaleDB, no analytics tables)
- Switch from timescale/timescaledb-ha:pg16 to postgres:16 everywhere
  (docker-compose, deploy/postgres.yaml, test containers)
- Remove TimescaleDB check and /metrics-pipeline from DatabaseAdminController
- Set clickhouse.enabled default to true

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-01 20:10:58 +02:00
hsiegeln
283e38a20d feat: remove OpenSearch, add ClickHouse admin page
Remove all OpenSearch code, dependencies, configuration, deployment
manifests, and CI/CD references. Replace the OpenSearch admin page
with a ClickHouse admin page showing cluster status, table sizes,
performance metrics, and indexer pipeline stats.

- Delete 11 OpenSearch Java files (config, search impl, admin controller, DTOs, tests)
- Delete 3 OpenSearch frontend files (admin page, CSS, query hooks)
- Delete deploy/opensearch.yaml K8s manifest
- Remove opensearch Maven dependencies from pom.xml
- Remove opensearch config from application.yml, Dockerfile, docker-compose
- Remove opensearch from CI workflow (secrets, deploy, cleanup steps)
- Simplify ThresholdConfig (remove OpenSearch thresholds, database-only)
- Change default search backend from opensearch to clickhouse
- Add ClickHouseAdminController with /status, /tables, /performance, /pipeline
- Add ClickHouseAdminPage with StatCards, pipeline ProgressBar, tables DataTable
- Update CLAUDE.md, HOWTO.md, and source comments

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-01 18:56:06 +02:00
hsiegeln
5ed7d38bf7 fix: sort sidebar entries alphanumerically
Applications, routes within each app, and agents within each app
are now sorted by name using localeCompare.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-01 18:24:39 +02:00
hsiegeln
4cdbcdaeea fix: update frontend field names for identity rename (applicationId, instanceId)
The backend identity rename (applicationName → applicationId,
agentId → instanceId) was not reflected in the frontend. This caused
drilldown to fail (detail.applicationName was undefined, disabling
the diagram fetch) and various display issues.

Updated schema.d.ts, ExchangeHeader, ExecutionDiagram, Dashboard,
AgentHealth, AgentInstance, LayoutShell, LogTab, InfoTab, DetailPanel,
ExchangesPage, and tracing-store.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-01 18:22:16 +02:00
hsiegeln
aa2d203f4e feat: add UI usage analytics tracking
Tracks authenticated UI user requests to understand usage patterns:
- New ClickHouse usage_events table with 90-day TTL
- UsageTrackingInterceptor captures method, path, duration, user
- Path normalization groups dynamic segments ({id}, {hash})
- Buffered writes via WriteBuffer + periodic flush
- Admin endpoint GET /api/v1/admin/usage with groupBy=endpoint|user|hour
- Skips agent requests, health checks, and data ingestion

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-01 17:53:32 +02:00
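A sketch of the path normalization described above: collapse dynamic segments so usage rolls up per endpoint rather than per entity. The class name and regexes are hypothetical:

```java
import java.util.regex.Pattern;

// Hypothetical normalizer: hex hashes become {hash}, UUIDs and numeric
// IDs become {id}, so "/api/v1/exchanges/550e8400-..." groups as one path.
public final class PathNormalizer {

    private static final Pattern HEX_HASH = Pattern.compile("/[0-9a-f]{16,}");
    private static final Pattern UUID_OR_NUMERIC =
            Pattern.compile("/[0-9a-fA-F-]{8,}|/\\d+");

    public static String normalize(String path) {
        String p = HEX_HASH.matcher(path).replaceAll("/{hash}");
        return UUID_OR_NUMERIC.matcher(p).replaceAll("/{id}");
    }
}
// normalize("/api/v1/exchanges/550e8400-e29b-41d4-a716-446655440000")
//   -> "/api/v1/exchanges/{id}"
```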
hsiegeln
ce4abaf862 fix: infer compound node color from descendants when no own overlay state
Path containers (EIP_WHEN, EIP_OTHERWISE, etc.) don't have their own
processor records, so they never get an overlay entry. Now inferred
from descendants: green if any descendant executed, red if any failed.
Gated (amber) only when no descendants executed at all.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-01 17:37:47 +02:00
hsiegeln
40ce4a57b4 fix: only show amber on containers where gate blocked all children
A container is only gated (amber) when filterMatched=false or
duplicateMessage=true AND no descendants were executed. Containers
with executed children (split, choice, idempotent that passed) now
correctly show green/red based on their execution status.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-01 17:32:39 +02:00
hsiegeln
b44ffd08be fix: color compound nodes by execution status in overlay
CompoundNode now uses execution overlay status to color its header:
failed (red) > completed (green) > default. Previously only used
static type-based color regardless of execution state.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-01 17:20:59 +02:00
hsiegeln
cf439248b5 feat: expose iteration/iterationSize fields for diagram overlay
Replace synthetic wrapper node approach with direct iteration fields:
- ProcessorNode gains iteration (child's index) and iterationSize
  (container's total) fields, populated from ClickHouse flat records
- Frontend hooks detect iteration containers from iterationSize != null
  instead of scanning for wrapper processorTypes
- useExecutionOverlay filters children by iteration field instead of
  wrapper nodes, eliminating ITERATION_WRAPPER_TYPES entirely
- Cleaner data contract: API returns exactly what the DB stores

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-01 17:14:36 +02:00
hsiegeln
e8f9ada1d1 fix: inject ClickHouse JdbcTemplate into stats-querying controllers
RouteCatalogController, RouteMetricsController, and AgentRegistrationController
had unqualified JdbcTemplate injection, receiving the PostgreSQL template
instead of ClickHouse. The stats queries silently failed (caught exception)
returning 0 counts. Added @Qualifier("clickHouseJdbcTemplate") to all three.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-01 16:34:56 +02:00
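A minimal sketch of the qualified injection; the constructor body and field name are assumptions:

```java
import org.springframework.beans.factory.annotation.Qualifier;
import org.springframework.jdbc.core.JdbcTemplate;
import org.springframework.web.bind.annotation.RestController;

// Without the qualifier, the @Primary PostgreSQL JdbcTemplate is injected
// and the ClickHouse stats queries fail (here: silently, returning 0).
@RestController
public class RouteMetricsController {

    private final JdbcTemplate clickHouse;

    public RouteMetricsController(
            @Qualifier("clickHouseJdbcTemplate") JdbcTemplate clickHouse) {
        this.clickHouse = clickHouse;
    }
}
```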
hsiegeln
bc70797e31 fix: force UTC timezone in Docker runtime
Sets TZ=UTC and -Duser.timezone=UTC to guarantee all JVM time operations
use UTC regardless of the container's base image or host configuration.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-01 16:24:23 +02:00
hsiegeln
f6123b8a7c fix: use explicit UTC formatting in ClickHouse DateTime literals
Timestamp.toString() uses JVM local timezone which can mismatch with
ClickHouse's UTC timezone, causing time-filtered queries to return empty
results. Replaced with DateTimeFormatter.withZone(UTC) in all lit() methods.

Also added warn logging to RouteCatalogController catch blocks to surface
query errors instead of silently swallowing them.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-01 16:13:52 +02:00
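A sketch of the lit() fix, assuming the literal is built from a java.sql.Timestamp:

```java
import java.sql.Timestamp;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

// Format in explicit UTC instead of relying on Timestamp.toString(),
// which renders in the JVM's local timezone and can disagree with
// ClickHouse's UTC DateTime columns.
private static final DateTimeFormatter CH_DATETIME =
        DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss").withZone(ZoneOffset.UTC);

static String lit(Timestamp ts) {
    return "'" + CH_DATETIME.format(ts.toInstant()) + "'";
}
```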
hsiegeln
d739094a56 fix: update ClickHouse DDL files with new column names instead of ALTER RENAME
ClickHouse can't rename columns that are part of ORDER BY keys.
Updated the V1-V8 DDL files directly with the new column names
(instance_id, application_id) and removed the V9 migration. Requires
wiping ClickHouse and restarting.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-01 12:40:54 +02:00
hsiegeln
91400defe9 fix: add missing V9 (ClickHouse) and V14 (PostgreSQL) identity column rename migrations
Migration files were lost during worktree merge — recreated.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-01 12:33:02 +02:00
hsiegeln
909d713837 feat: rename agent identity fields for protocol v2 + add SHUTDOWN lifecycle state
Align all internal naming with the agent team's protocol v2 identity rename:
- agentId → instanceId (unique per-JVM identifier)
- applicationName → applicationId (shared app identifier)
- AgentInfo: id → instanceId, name → displayName, application → applicationId

Add SHUTDOWN lifecycle state for graceful agent shutdowns:
- New POST /data/events endpoint receives agent lifecycle events
- AGENT_STOPPED event transitions agent to SHUTDOWN (skips STALE/DEAD)
- New POST /{id}/deregister endpoint removes agent from registry
- Server now distinguishes graceful shutdown from crash (heartbeat timeout)

Includes ClickHouse V9 and PostgreSQL V14 migrations for column renames.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-01 12:22:42 +02:00
hsiegeln
ad8dd73596 fix: update ChunkAccumulator tests for DiagramStore constructor param
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-01 10:58:27 +02:00
hsiegeln
e50c9fa60d fix: address SonarQube reliability issues
- ElkDiagramRenderer.getElkRoot(): add null guard to prevent NPE
  when node is null (SQ java:S2259)
- WriteBuffer: add offerOrWarn() that logs when buffer is full instead
  of silently dropping data. ChunkAccumulator now uses this method
  so ingestion backpressure is visible in logs (SQ java:S899)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-01 10:55:31 +02:00
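A sketch of the offerOrWarn() idea, assuming WriteBuffer wraps a BlockingQueue:

```java
import java.util.concurrent.BlockingQueue;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

// Make a full buffer visible in the logs instead of silently dropping
// the element (the java:S899 finding on an ignored offer() result).
public class WriteBuffer<T> {

    private static final Logger log = LoggerFactory.getLogger(WriteBuffer.class);
    private final BlockingQueue<T> queue;

    public WriteBuffer(BlockingQueue<T> queue) {
        this.queue = queue;
    }

    public boolean offerOrWarn(T element) {
        boolean accepted = queue.offer(element);
        if (!accepted) {
            log.warn("Write buffer full ({} queued); dropping incoming record",
                    queue.size());
        }
        return accepted;
    }
}
```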
hsiegeln
d4dbfa7ae6 fix: populate diagramContentHash in chunked ingestion pipeline
ChunkAccumulator now injects DiagramStore and looks up the content hash
when converting to MergedExecution. Without this, the detail page had
no diagram hash, so the overlay couldn't find the route diagram.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-01 10:50:34 +02:00
hsiegeln
59374482bc fix: replace PostgreSQL aggregate functions with ClickHouse -Merge combinators
RouteCatalogController, RouteMetricsController, AgentRegistrationController
all had inline SQL using SUM() on AggregateFunction columns from stats_1m_*
AggregatingMergeTree tables. Replace with countMerge/countIfMerge/sumMerge.
Also fix time_bucket() → toStartOfInterval() and ::double → toFloat64().

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-01 10:49:06 +02:00
hsiegeln
43e187a023 fix: ChunkIngestionController ObjectMapper missing FAIL_ON_UNKNOWN_PROPERTIES
Adds DeserializationFeature.FAIL_ON_UNKNOWN_PROPERTIES=false (required
by PROTOCOL.md) and explicit TypeReference<List<ExecutionChunk>> for
array parsing. Without this, batched chunks from ChunkedExporter
(2+ chunks in a JSON array) were silently rejected, causing final:true
chunks to be lost and all exchanges to go stale.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-01 10:45:12 +02:00
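A sketch of the two fixes, assuming the raw request body arrives as a String (ExecutionChunk is the cameleer3-common model named above):

```java
import java.util.List;
import com.fasterxml.jackson.core.type.TypeReference;
import com.fasterxml.jackson.databind.DeserializationFeature;
import com.fasterxml.jackson.databind.ObjectMapper;

// Tolerate unknown fields per PROTOCOL.md, and parse batched chunk
// arrays with an explicit TypeReference so they aren't rejected.
ObjectMapper mapper = new ObjectMapper()
        .configure(DeserializationFeature.FAIL_ON_UNKNOWN_PROPERTIES, false);

List<ExecutionChunk> chunks = body.trim().startsWith("[")
        ? mapper.readValue(body, new TypeReference<List<ExecutionChunk>>() {})
        : List.of(mapper.readValue(body, ExecutionChunk.class));
```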
hsiegeln
bc1c71277c fix: resolve duplicate ExecutionStore bean conflict
ClickHouseExecutionStore implements ExecutionStore, so the concrete bean
already satisfies the interface — remove the redundant wrapper bean.
Align ChunkAccumulator and ExecutionFlushScheduler conditions to the
cameleer.storage.executions flag.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-01 09:44:02 +02:00
hsiegeln
520181d241 test(clickhouse): add integration tests for execution read path and tree reconstruction
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-01 00:11:44 +02:00
hsiegeln
95b9dea5c4 feat(clickhouse): wire ClickHouseExecutionStore as active ExecutionStore
Add cameleer.storage.executions feature flag (default: clickhouse).
PostgresExecutionStore activates only when explicitly set to postgres.
Add by-seq snapshot endpoint for iteration-aware processor lookup.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-01 00:09:14 +02:00
hsiegeln
151b96a680 feat: seq-based tree reconstruction for ClickHouse flat processor model
Dual-mode buildTree: detects seq presence and uses seq/parentSeq linkage
instead of processorId map. Handles duplicate processorIds across
iterations correctly. Old processorId-based mode kept for PG compat.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-01 00:07:20 +02:00
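A sketch of the seq/parentSeq mode; record shapes and names are assumptions:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical seq-based linkage: seq is unique per record, so
// processorIds repeated across loop/split iterations cannot collide.
final class SeqTreeBuilder {
    record Rec(long seq, Long parentSeq) {}
    record Node(Rec rec, List<Node> children) {
        Node(Rec rec) { this(rec, new ArrayList<>()); }
    }

    static List<Node> buildTreeBySeq(List<Rec> records) {
        Map<Long, Node> bySeq = new HashMap<>();
        records.forEach(r -> bySeq.put(r.seq(), new Node(r)));
        List<Node> roots = new ArrayList<>();
        for (Rec r : records) {
            Node node = bySeq.get(r.seq());
            Node parent = r.parentSeq() == null ? null : bySeq.get(r.parentSeq());
            if (parent == null) roots.add(node);
            else parent.children().add(node);
        }
        return roots;
    }
}
```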
hsiegeln
0661fd995f feat(clickhouse): add read methods to ClickHouseExecutionStore
Implements ExecutionStore interface with findById (FINAL for
ReplacingMergeTree), findProcessors (ORDER BY seq), findProcessorById,
and findProcessorBySeq. Write methods unchanged.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-01 00:04:03 +02:00
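The read-side queries this implies, sketched with assumed star-selects:

```java
// FINAL collapses not-yet-merged duplicate rows in the ReplacingMergeTree,
// so a re-ingested execution reads back as a single row.
String findById = "SELECT * FROM executions FINAL WHERE execution_id = ?";
// Flat processor records come back in traversal order for tree rebuilding.
String findProcessors =
        "SELECT * FROM processor_executions WHERE execution_id = ? ORDER BY seq";
```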
hsiegeln
190ae2797d refactor: extend ProcessorRecord with seq/iteration fields for ClickHouse model
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-01 00:02:03 +02:00
hsiegeln
968117c41a feat(clickhouse): wire Phase 4 stores with feature flags
Add conditional beans for ClickHouseDiagramStore, ClickHouseAgentEventRepository,
and ClickHouseLogStore. All default to ClickHouse (matchIfMissing=true).
PG/OS stores activate only when explicitly configured.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-31 23:44:10 +02:00
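A sketch of the flag wiring; the exact property key (cameleer.storage.logs) and constructor arguments are assumptions:

```java
import org.springframework.boot.autoconfigure.condition.ConditionalOnProperty;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

// ClickHouse is the default (matchIfMissing = true); the OpenSearch
// store activates only when explicitly configured.
@Configuration
public class StorageBeanConfig {

    @Bean
    @ConditionalOnProperty(name = "cameleer.storage.logs",
            havingValue = "clickhouse", matchIfMissing = true)
    public LogIndex clickHouseLogStore() {
        return new ClickHouseLogStore();
    }

    @Bean
    @ConditionalOnProperty(name = "cameleer.storage.logs", havingValue = "opensearch")
    public LogIndex openSearchLogIndex() {
        return new OpenSearchLogIndex();
    }
}
```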
hsiegeln
7d7eb52afb feat(clickhouse): add ClickHouseLogStore with LogIndex interface
Extract LogIndex interface from OpenSearchLogIndex. Both ClickHouseLogStore
and OpenSearchLogIndex implement it. Controllers now inject LogIndex.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-31 23:42:07 +02:00
hsiegeln
c73e4abf68 feat(clickhouse): add ClickHouseAgentEventRepository with integration tests
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-31 23:37:51 +02:00
hsiegeln
cd63d300b3 feat(clickhouse): add ClickHouseDiagramStore with integration tests
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-31 23:35:32 +02:00
hsiegeln
f7daadaaa9 feat(clickhouse): add DDL for route_diagrams, agent_events, and logs tables
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-31 23:30:38 +02:00
hsiegeln
af080337f5 feat: comprehensive ClickHouse low-memory tuning and switch all storage to ClickHouse
Replace partial memory config with full Altinity low-memory guide
settings. Revert container limit from 6Gi back to 4Gi — proper
tuning (mlock=false, reduced caches/pools/threads, disk spill for
aggregations) makes the original budget sufficient.

Switch all storage feature flags to ClickHouse:
- CAMELEER_STORAGE_SEARCH: opensearch → clickhouse
- CAMELEER_STORAGE_METRICS: postgres → clickhouse
- CAMELEER_STORAGE_STATS: already clickhouse

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-31 23:27:10 +02:00
hsiegeln
606f81a970 fix: align server with protocol v2 chunked transport spec
- ChunkIngestionController: /data/chunks → /data/executions (matches
  PROTOCOL.md endpoint the agent actually posts to)
- ExecutionController: conditional on ClickHouse being disabled to
  avoid mapping conflict
- Persist originalExchangeId and replayExchangeId from ExecutionChunk
  envelope through to ClickHouse (was silently dropped)
- V5 migration adds the two new columns to executions table

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-31 23:18:35 +02:00
hsiegeln
154bce366a fix: remove references to deleted ProcessorExecution tree fields
cameleer3-common removed children, loopIndex, splitIndex,
multicastIndex from ProcessorExecution (flat model only now).
Iteration context lives on synthetic wrapper nodes via processorType.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-31 23:00:11 +02:00
hsiegeln
a669df08bd fix(clickhouse): tune memory settings to prevent OOM on insert
ClickHouse 24.12 auto-sizes caches from the cgroup limit, leaving
insufficient headroom for MV processing and background merges.
Adds a custom config that shrinks mark/index/expression caches and
caps per-query memory at 2 GiB. Bumps container limit 4Gi → 6Gi.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-31 22:54:43 +02:00
hsiegeln
af18fc4142 Merge branch 'worktree-clickhouse-phase2'
2026-03-31 22:06:35 +02:00
hsiegeln
1a00eed389 fix: schema initializer skips comment-only SQL segments
The V4 DDL had a semicolon inside a comment, which caused the
split-on-semicolon logic to produce a comment-only segment that
ClickHouse rejected as an empty query. Fixed the comment and made
the initializer strip comment-only segments before execution.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-31 22:06:31 +02:00
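A sketch of the stripping step, assuming the initializer keeps its naive split-on-semicolon; the method name and regexes are hypothetical:

```java
import java.util.Arrays;
import java.util.List;

// Drop segments that contain nothing but comments and whitespace before
// sending each segment to ClickHouse as its own statement.
static List<String> executableSegments(String ddl) {
    return Arrays.stream(ddl.split(";"))
            .filter(seg -> !seg.replaceAll("--[^\n]*", "")      // line comments
                               .replaceAll("(?s)/\\*.*?\\*/", "") // block comments
                               .trim().isEmpty())
            .map(String::trim)
            .toList();
}
```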
hsiegeln
0423518f72 feat: ClickHouse Phase 3 — Stats & Analytics (materialized views)
- DDL for 5 AggregatingMergeTree tables + 5 materialized views
- ClickHouseStatsStore: all 15 StatsStore methods using -Merge combinators
- Stats/timeseries read from pre-aggregated MVs (countMerge, sumMerge, quantileMerge)
- SLA/topErrors/punchcard query raw executions FINAL table
- Feature flag: cameleer.storage.stats (default: clickhouse)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-31 21:52:13 +02:00
hsiegeln
9df00fdde0 feat(clickhouse): wire ClickHouseStatsStore with cameleer.storage.stats feature flag (default: clickhouse)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-31 21:51:45 +02:00
hsiegeln
052990bb59 feat(clickhouse): add ClickHouseStatsStore with -Merge aggregate queries
Implements StatsStore interface for ClickHouse using AggregatingMergeTree
tables with -Merge combinators (countMerge, countIfMerge, sumMerge,
quantileMerge). Uses literal SQL for aggregate table queries to avoid
ClickHouse JDBC driver PreparedStatement issues with AggregateFunction
columns. Raw table queries (SLA, topErrors, activeErrorTypes) use normal
prepared statements.

Includes 13 integration tests covering stats, timeseries, grouped
timeseries, SLA compliance, SLA counts by app/route, top errors, active
error types, punchcard, and processor stats. Also fixes AggregateFunction
type signatures in V4 DDL (count() takes no args, countIf takes UInt8).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-31 21:49:22 +02:00
hsiegeln
eb0d26814f feat(clickhouse): add stats materialized views DDL (5 tables + 5 MVs)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-31 20:11:38 +02:00
hsiegeln
c8e6bbe059 Merge branch 'worktree-clickhouse-phase2'
2026-03-31 20:02:49 +02:00
hsiegeln
a9eabe97f7 fix: wire @Primary JdbcTemplate to the @Primary DataSource bean
The jdbcTemplate() method was calling dataSource(properties) directly,
creating a new DataSource instance instead of using the Spring-managed
@Primary bean. This caused some repositories to receive the ClickHouse
connection instead of PostgreSQL.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-31 20:02:44 +02:00
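A sketch of the fix as a fragment of the configuration class: take the Spring-managed @Primary DataSource as a parameter instead of calling dataSource(properties) directly, which built a second, unmanaged instance:

```java
import javax.sql.DataSource;
import org.springframework.boot.autoconfigure.jdbc.DataSourceProperties;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Primary;
import org.springframework.jdbc.core.JdbcTemplate;

@Bean
@Primary
public DataSource dataSource(DataSourceProperties properties) {
    return properties.initializeDataSourceBuilder().build();
}

@Bean
@Primary
public JdbcTemplate jdbcTemplate(DataSource dataSource) {
    // Injected parameter resolves to the @Primary bean above, not a
    // freshly constructed DataSource.
    return new JdbcTemplate(dataSource);
}
```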
hsiegeln
e724607a66 feat: ClickHouse Phase 2 — Executions + Search (chunked transport)
- DDL for executions (ReplacingMergeTree) and processor_executions (MergeTree with seq/parentSeq/iteration)
- ClickHouseExecutionStore with batch INSERT for both tables
- ChunkAccumulator: buffers exchange envelope across chunks, inserts processors immediately, writes execution on final chunk
- ExecutionFlushScheduler drains WriteBuffers to ClickHouse
- ChunkIngestionController: POST /api/v1/data/chunks endpoint
- ClickHouseSearchIndex: ngram-accelerated SQL search implementing SearchIndex interface
- Feature flags: cameleer.storage.search=opensearch|clickhouse
- Uses cameleer3-common ExecutionChunk and FlatProcessorRecord models

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-31 19:37:21 +02:00
hsiegeln
07f215b0fd refactor: replace server-side DTOs with cameleer3-common ExecutionChunk and FlatProcessorRecord
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-31 19:33:49 +02:00
hsiegeln
38551eac9d test(clickhouse): add end-to-end chunk pipeline integration test
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-31 19:24:55 +02:00
hsiegeln
31f7113b3f feat(clickhouse): wire ChunkAccumulator, flush scheduler, and search feature flag
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-31 19:21:19 +02:00
hsiegeln
6052407c82 feat(clickhouse): add ClickHouseSearchIndex with ngram-accelerated SQL search
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-31 19:18:01 +02:00
hsiegeln
776f2ce90d feat(clickhouse): add ExecutionFlushScheduler and ChunkIngestionController
ExecutionFlushScheduler drains MergedExecution and ProcessorBatch write
buffers on a fixed interval and delegates batch inserts to
ClickHouseExecutionStore. Also sweeps stale exchanges every 60s.

ChunkIngestionController exposes POST /api/v1/data/chunks, accepts
single or array ExecutionChunk payloads, and feeds them into the
ChunkAccumulator. Conditional on ChunkAccumulator bean (clickhouse.enabled).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-31 19:12:38 +02:00
hsiegeln
62420cf0c2 feat(clickhouse): add ChunkAccumulator for chunked execution ingestion
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-31 19:10:21 +02:00
hsiegeln
81f7f8afe1 feat(clickhouse): add ClickHouseExecutionStore with batch insert for chunked format
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-31 19:07:33 +02:00
hsiegeln
b30dfa39f4 feat(clickhouse): add executions and processor_executions DDL for chunked transport
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-31 19:04:19 +02:00
hsiegeln
20c8e17843 feat: add server-side ExecutionChunk and FlatProcessorRecord DTOs
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-31 19:02:47 +02:00
a96fe59840 Merge pull request 'fix: add @Primary PG DataSource/JdbcTemplate to prevent CH bean conflict' (#99) from feature/clickhouse-phase1 into main
Reviewed-on: cameleer/cameleer3-server#99
2026-03-31 18:21:00 +02:00
hsiegeln
7cf849269f fix: add @Primary PG DataSource/JdbcTemplate to prevent CH bean conflict
When clickhouse.enabled=true, the ClickHouse JdbcTemplate bean prevents
Spring Boot auto-config from creating the default PG JdbcTemplate.
All PG repositories then get the CH JdbcTemplate and fail with
"Table cameleer.audit_log does not exist".

Fix: explicitly create @Primary DataSource and JdbcTemplate from
DataSourceProperties so PG remains the default for unqualified injections.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-31 18:18:09 +02:00
76afcaa637 Merge pull request 'fix: cast DateTime64 to DateTime in ClickHouse TTL expression' (#98) from feature/clickhouse-phase1 into main
Reviewed-on: cameleer/cameleer3-server#98
2026-03-31 18:10:58 +02:00
hsiegeln
b1c5cc0616 fix: cast DateTime64 to DateTime in ClickHouse TTL expression
2026-03-31 18:10:20 +02:00
8838077eff Merge pull request 'fix: remove unsupported async_insert params from ClickHouse JDBC URL' (#97) from feature/clickhouse-phase1 into main
Reviewed-on: cameleer/cameleer3-server#97
2026-03-31 18:04:22 +02:00
hsiegeln
8eeaecf6f3 fix: remove unsupported async_insert params from ClickHouse JDBC URL
clickhouse-jdbc 0.9.7 rejects async_insert and wait_for_async_insert as
unknown URL parameters. These are server-side settings, not driver config.
Can be set per-query later if needed via custom_settings.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-31 18:02:53 +02:00
b54bef302d Merge pull request 'fix: ClickHouse auth credentials and non-fatal schema init' (#96) from feature/clickhouse-phase1 into main
Reviewed-on: cameleer/cameleer3-server#96
2026-03-31 17:57:27 +02:00
hsiegeln
f8505401d7 fix: ClickHouse auth credentials and non-fatal schema init
- Set CLICKHOUSE_USER/PASSWORD via k8s secret (fixes "disabling network
  access for user 'default'" when no password is set)
- Add clickhouse-credentials secret to CI deploy + feature branch copy
- Pass CLICKHOUSE_USERNAME/PASSWORD env vars to server pod
- Make schema initializer non-fatal so server starts even if CH is
  temporarily unavailable

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-31 17:54:44 +02:00
a0f1a4aba4 Merge pull request 'feature/clickhouse-phase1' (#95) from feature/clickhouse-phase1 into main
Reviewed-on: cameleer/cameleer3-server#95
2026-03-31 17:48:41 +02:00
hsiegeln
aa5fc1b830 ci: retrigger after transient GitHub actions/cache 500 error
2026-03-31 17:43:40 +02:00
hsiegeln
c42e13932b ci: deploy ClickHouse StatefulSet in main deploy job
The deploy/clickhouse.yaml manifest was created but not referenced
in the CI workflow. Add kubectl apply between OpenSearch and Authentik.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-31 17:41:15 +02:00
hsiegeln
59dd629b0e fix: create cameleer database on ClickHouse startup
Some checks failed
CI / cleanup-branch (push) Has been skipped
CI / build (pull_request) Successful in 1m49s
CI / cleanup-branch (pull_request) Has been skipped
CI / docker (pull_request) Has been skipped
CI / deploy (pull_request) Has been skipped
CI / deploy-feature (pull_request) Has been skipped
CI / build (push) Successful in 1m7s
CI / docker (push) Successful in 10s
CI / deploy (push) Has been skipped
CI / deploy-feature (push) Has been cancelled
ClickHouse ships with only the 'default' database out of the box. The JDBC URL
connects to 'cameleer', so that database must exist before the server starts.
The fix mounts an init script into /docker-entrypoint-initdb.d/ via a ConfigMap.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-31 17:31:17 +02:00
hsiegeln
697c689192 fix: rename ClickHouse tests to *IT pattern for CI compatibility
Some checks failed
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 2m28s
CI / cleanup-branch (pull_request) Has been skipped
CI / build (pull_request) Successful in 2m27s
CI / docker (pull_request) Has been skipped
CI / deploy (pull_request) Has been skipped
CI / deploy-feature (pull_request) Has been skipped
CI / docker (push) Successful in 3m32s
CI / deploy (push) Has been skipped
CI / deploy-feature (push) Failing after 2m17s
Testcontainers tests need Docker, which isn't available in CI.
Renaming them to *IT makes Surefire skip them (Failsafe runs them with -DskipITs=false).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-31 17:19:33 +02:00
hsiegeln
7a2a0ee649 test: add ClickHouse testcontainer to integration test base
Some checks failed
CI / cleanup-branch (push) Has been skipped
CI / build (push) Failing after 2m29s
CI / docker (push) Has been skipped
CI / deploy (push) Has been skipped
CI / deploy-feature (push) Has been skipped
CI / cleanup-branch (pull_request) Has been skipped
CI / build (pull_request) Failing after 2m28s
CI / docker (pull_request) Has been skipped
CI / deploy (pull_request) Has been skipped
CI / deploy-feature (pull_request) Has been skipped
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-31 17:09:09 +02:00
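A hedged sketch of how such a base class might wire the container in (the testcontainers-clickhouse dependency and the clickhouse.* property names come from the diffs below; the class shape and image tag are assumptions):

```java
import org.springframework.test.context.DynamicPropertyRegistry;
import org.springframework.test.context.DynamicPropertySource;
import org.testcontainers.clickhouse.ClickHouseContainer;

public abstract class ClickHouseTestBase {

    // Shared container for all subclasses; started once per JVM.
    static final ClickHouseContainer CLICKHOUSE =
        new ClickHouseContainer("clickhouse/clickhouse-server:24.3");

    static {
        CLICKHOUSE.start();
    }

    // Point the clickhouse.* properties at the container's mapped port.
    @DynamicPropertySource
    static void clickHouseProperties(DynamicPropertyRegistry registry) {
        registry.add("clickhouse.enabled", () -> "true");
        registry.add("clickhouse.url", CLICKHOUSE::getJdbcUrl);
        registry.add("clickhouse.username", CLICKHOUSE::getUsername);
        registry.add("clickhouse.password", CLICKHOUSE::getPassword);
    }
}
```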
hsiegeln
1b991f99a3 deploy: add ClickHouse StatefulSet and server env vars
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-31 17:08:42 +02:00
hsiegeln
21991b6cf8 feat: wire MetricsStore and MetricsQueryStore with feature flag
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-31 17:07:35 +02:00
hsiegeln
53766aeb56 feat: add ClickHouseMetricsQueryStore with time-bucketed queries
Implements MetricsQueryStore using ClickHouse toStartOfInterval() for
time-bucketed aggregation queries; verified with 4 Testcontainers tests.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-31 17:05:45 +02:00
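A simplified sketch of such a query (single metric, no MetricTimeSeries wrapper; the agent_metrics column names match the controller diff below, and the interval interpolation reflects that ClickHouse interval widths cannot be bound as parameters):

```java
import org.springframework.jdbc.core.JdbcTemplate;

import java.sql.Timestamp;
import java.time.Duration;
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;

public class TimeBucketQuerySketch {
    private final JdbcTemplate clickHouseJdbc;

    public TimeBucketQuerySketch(JdbcTemplate clickHouseJdbc) {
        this.clickHouseJdbc = clickHouseJdbc;
    }

    /** Averages one metric into fixed-width buckets over [from, to). */
    public List<double[]> queryBuckets(String agentId, String metricName,
                                       Instant from, Instant to, int buckets) {
        // Bucket width in seconds; a validated long, safe to interpolate.
        long intervalSec = Math.max(1,
            Duration.between(from, to).toSeconds() / Math.max(buckets, 1));
        String sql = ("SELECT toStartOfInterval(collected_at, INTERVAL %d SECOND) AS bucket, "
            + "avg(metric_value) AS avg_value "
            + "FROM agent_metrics "
            + "WHERE agent_id = ? AND metric_name = ? "
            + "AND collected_at >= ? AND collected_at < ? "
            + "GROUP BY bucket ORDER BY bucket").formatted(intervalSec);
        List<double[]> result = new ArrayList<>();
        clickHouseJdbc.query(sql, rs -> result.add(new double[]{
                rs.getTimestamp("bucket").getTime() / 1000.0,
                rs.getDouble("avg_value")}),
            agentId, metricName, Timestamp.from(from), Timestamp.from(to));
        return result;
    }
}
```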
hsiegeln
bf0e9ea418 refactor: extract MetricsQueryStore interface from AgentMetricsController
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-31 17:00:57 +02:00
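Reconstructed from its call site in the AgentMetricsController diff below (the bucket record actually lives on MetricTimeSeries; it is inlined here to keep the sketch self-contained):

```java
import java.time.Instant;
import java.util.List;
import java.util.Map;

// Sketch of the extracted seam; method name and signature are taken from the
// controller's metricsQueryStore.queryTimeSeries(...) call, the rest is assumed.
public interface MetricsQueryStore {

    record Bucket(Instant time, double value) {}

    /** Time-bucketed averages per metric name over [from, to), in bucket order. */
    Map<String, List<Bucket>> queryTimeSeries(
        String agentId, List<String> metricNames, Instant from, Instant to, int buckets);
}
```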
hsiegeln
6e30b7ec65 feat: add ClickHouseMetricsStore with batch insert
TDD implementation of MetricsStore backed by ClickHouse. Uses native
Map(String,String) column type (no JSON cast), relies on ClickHouse
DEFAULT for server_received_at, and handles null tags by substituting
an empty HashMap. All 4 Testcontainers tests pass.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-31 16:58:20 +02:00
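A sketch of the batch-insert path under those constraints (assuming clickhouse-jdbc binds java.util.Map directly to a Map(String,String) column, per the "no JSON cast" note above; the row shape is illustrative):

```java
import org.springframework.jdbc.core.JdbcTemplate;

import java.time.Instant;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class MetricsBatchInsertSketch {

    /** Illustrative row shape; the real store consumes MetricsSnapshot. */
    public record Row(Instant collectedAt, String name, double value,
                      Map<String, String> tags) {}

    private final JdbcTemplate clickHouseJdbc;

    public MetricsBatchInsertSketch(JdbcTemplate clickHouseJdbc) {
        this.clickHouseJdbc = clickHouseJdbc;
    }

    public void insertBatch(String agentId, List<Row> rows) {
        // server_received_at is omitted so the column's ClickHouse DEFAULT applies.
        clickHouseJdbc.batchUpdate(
            "INSERT INTO agent_metrics (agent_id, collected_at, metric_name, metric_value, tags) "
                + "VALUES (?, ?, ?, ?, ?)",
            rows.stream().map(r -> new Object[]{
                agentId,
                java.sql.Timestamp.from(r.collectedAt()),
                r.name(),
                r.value(),
                // Null tags break Map(String,String) binding, hence the empty map.
                r.tags() != null ? r.tags() : new HashMap<String, String>()
            }).toList());
    }
}
```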
hsiegeln
08934376df feat: add ClickHouse schema initializer with agent_metrics DDL
Adds ClickHouseSchemaInitializer that runs on ApplicationReadyEvent,
scanning classpath:clickhouse/*.sql in filename order and executing each
statement. Adds V1__agent_metrics.sql with MergeTree table, tenant/agent
partitioning, and 365-day TTL.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-31 16:51:21 +02:00
hsiegeln
23f901279a feat: add ClickHouse DataSource and JdbcTemplate configuration
Adds ClickHouseProperties (bound to clickhouse.*), ClickHouseConfig
(conditional HikariDataSource + JdbcTemplate beans), and extends
application.yml with clickhouse.enabled/url/username/password and
cameleer.storage.metrics properties.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-31 16:51:14 +02:00
hsiegeln
6171827243 build: add clickhouse-jdbc and testcontainers-clickhouse dependencies
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-31 16:49:04 +02:00
hsiegeln
c77d8a7af0 docs: add Phase 1 implementation plan for ClickHouse migration
10-task TDD plan covering: CH dependency, config, schema init,
ClickHouseMetricsStore, MetricsQueryStore interface extraction,
ClickHouseMetricsQueryStore, feature flag wiring, k8s deployment,
integration tests.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-31 16:43:14 +02:00
hsiegeln
e7eda7a7b3 docs: add ClickHouse migration design and append-only protocol spec
Design for replacing PostgreSQL/TimescaleDB + OpenSearch with ClickHouse
OSS. Covers table schemas, ingestion pipeline (ExecutionAccumulator),
ngram search indexes, materialized views, multitenancy, and retention.

Companion doc proposes append-only execution protocol for the agent repo.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-31 16:36:22 +02:00
233 changed files with 12576 additions and 4274 deletions

View File

@@ -209,12 +209,6 @@ jobs:
--from-literal=POSTGRES_DB="${POSTGRES_DB:-cameleer}" \
--dry-run=client -o yaml | kubectl apply -f -
kubectl create secret generic opensearch-credentials \
--namespace=cameleer \
--from-literal=OPENSEARCH_USER="${OPENSEARCH_USER:-admin}" \
--from-literal=OPENSEARCH_PASSWORD="$OPENSEARCH_PASSWORD" \
--dry-run=client -o yaml | kubectl apply -f -
kubectl create secret generic authentik-credentials \
--namespace=cameleer \
--from-literal=PG_USER="${AUTHENTIK_PG_USER:-authentik}" \
@@ -222,11 +216,17 @@ jobs:
--from-literal=AUTHENTIK_SECRET_KEY="${AUTHENTIK_SECRET_KEY}" \
--dry-run=client -o yaml | kubectl apply -f -
kubectl create secret generic clickhouse-credentials \
--namespace=cameleer \
--from-literal=CLICKHOUSE_USER="${CLICKHOUSE_USER:-default}" \
--from-literal=CLICKHOUSE_PASSWORD="$CLICKHOUSE_PASSWORD" \
--dry-run=client -o yaml | kubectl apply -f -
kubectl apply -f deploy/postgres.yaml
kubectl -n cameleer rollout status statefulset/postgres --timeout=120s
kubectl apply -f deploy/opensearch.yaml
kubectl -n cameleer rollout status statefulset/opensearch --timeout=180s
kubectl apply -f deploy/clickhouse.yaml
kubectl -n cameleer rollout status statefulset/clickhouse --timeout=180s
kubectl apply -f deploy/authentik.yaml
kubectl -n cameleer rollout status deployment/authentik-server --timeout=180s
@@ -248,11 +248,11 @@ jobs:
POSTGRES_USER: ${{ secrets.POSTGRES_USER }}
POSTGRES_PASSWORD: ${{ secrets.POSTGRES_PASSWORD }}
POSTGRES_DB: ${{ secrets.POSTGRES_DB }}
OPENSEARCH_USER: ${{ secrets.OPENSEARCH_USER }}
OPENSEARCH_PASSWORD: ${{ secrets.OPENSEARCH_PASSWORD }}
AUTHENTIK_PG_USER: ${{ secrets.AUTHENTIK_PG_USER }}
AUTHENTIK_PG_PASSWORD: ${{ secrets.AUTHENTIK_PG_PASSWORD }}
AUTHENTIK_SECRET_KEY: ${{ secrets.AUTHENTIK_SECRET_KEY }}
CLICKHOUSE_USER: ${{ secrets.CLICKHOUSE_USER }}
CLICKHOUSE_PASSWORD: ${{ secrets.CLICKHOUSE_PASSWORD }}
deploy-feature:
needs: docker
@@ -292,7 +292,7 @@ jobs:
run: kubectl create namespace "$BRANCH_NS" --dry-run=client -o yaml | kubectl apply -f -
- name: Copy secrets from cameleer namespace
run: |
for SECRET in gitea-registry postgres-credentials opensearch-credentials cameleer-auth; do
for SECRET in gitea-registry postgres-credentials clickhouse-credentials cameleer-auth; do
kubectl get secret "$SECRET" -n cameleer -o json \
| jq 'del(.metadata.namespace, .metadata.resourceVersion, .metadata.uid, .metadata.creationTimestamp, .metadata.managedFields)' \
| kubectl apply -n "$BRANCH_NS" -f -
@@ -372,15 +372,6 @@ jobs:
kubectl wait --for=condition=Ready pod/cleanup-schema-${BRANCH_SLUG} -n cameleer --timeout=30s || true
kubectl wait --for=jsonpath='{.status.phase}'=Succeeded pod/cleanup-schema-${BRANCH_SLUG} -n cameleer --timeout=60s || true
kubectl delete pod cleanup-schema-${BRANCH_SLUG} -n cameleer --ignore-not-found
- name: Delete OpenSearch indices
run: |
kubectl run cleanup-indices-${BRANCH_SLUG} \
--namespace=cameleer \
--image=curlimages/curl:latest \
--restart=Never \
--command -- curl -sf -X DELETE "http://opensearch:9200/cam-${BRANCH_SLUG}-*"
kubectl wait --for=jsonpath='{.status.phase}'=Succeeded pod/cleanup-indices-${BRANCH_SLUG} -n cameleer --timeout=60s || true
kubectl delete pod cleanup-indices-${BRANCH_SLUG} -n cameleer --ignore-not-found
- name: Cleanup Docker images
run: |
API="https://gitea.siegeln.net/api/v1"

View File

@@ -66,7 +66,7 @@ jobs:
*) PLATFORM="linux-x64" ;;
esac
curl -sSLo sonar-scanner.zip "https://binaries.sonarsource.com/Distribution/sonar-scanner-cli/sonar-scanner-cli-${SONAR_SCANNER_VERSION}-${PLATFORM}.zip"
unzip -q sonar-scanner.zip
jar xf sonar-scanner.zip
ln -s "$(pwd)/sonar-scanner-${SONAR_SCANNER_VERSION}-${PLATFORM}/bin/sonar-scanner" /usr/local/bin/sonar-scanner
- name: SonarQube Analysis

View File

@@ -38,7 +38,7 @@ java -jar cameleer3-server-app/target/cameleer3-server-app-1.0-SNAPSHOT.jar
- Jackson `JavaTimeModule` for `Instant` deserialization
- Communication: receives HTTP POST data from agents (executions, diagrams, metrics, logs), serves SSE event streams for config push/commands (config-update, deep-trace, replay, route-control)
- Maintains agent instance registry with states: LIVE → STALE → DEAD
- Storage: PostgreSQL (TimescaleDB) for structured data, OpenSearch for full-text search and application log storage
- Storage: PostgreSQL for RBAC, config, and audit; ClickHouse for all observability data (executions, search, logs, metrics, stats, diagrams)
- Security: JWT auth with RBAC (AGENT/VIEWER/OPERATOR/ADMIN roles), Ed25519 config signing, bootstrap token for registration
- OIDC: Optional external identity provider support (token exchange pattern). Configured via admin API, stored in database (`server_config` table)
- User persistence: PostgreSQL `users` table, admin CRUD at `/api/v1/admin/users`
@@ -50,11 +50,11 @@ java -jar cameleer3-server-app/target/cameleer3-server-app-1.0-SNAPSHOT.jar
- Docker: multi-stage build (`Dockerfile`), `$BUILDPLATFORM` for native Maven on ARM64 runner, amd64 runtime
- `REGISTRY_TOKEN` build arg required for `cameleer3-common` dependency resolution
- Registry: `gitea.siegeln.net/cameleer/cameleer3-server` (container images)
- K8s manifests in `deploy/` — Kustomize base + overlays (main/feature), shared infra (PostgreSQL, OpenSearch, Authentik) as top-level manifests
- K8s manifests in `deploy/` — Kustomize base + overlays (main/feature), shared infra (PostgreSQL, ClickHouse, Authentik) as top-level manifests
- Deployment target: k3s at 192.168.50.86, namespace `cameleer` (main), `cam-<slug>` (feature branches)
- Feature branches: isolated namespace, PG schema, OpenSearch index prefix; Traefik Ingress at `<slug>-api.cameleer.siegeln.net`
- Secrets managed in CI deploy step (idempotent `--dry-run=client | kubectl apply`): `cameleer-auth`, `postgres-credentials`, `opensearch-credentials`
- K8s probes: server uses `/api/v1/health`, PostgreSQL uses `pg_isready`, OpenSearch uses `/_cluster/health`
- Feature branches: isolated namespace, PG schema; Traefik Ingress at `<slug>-api.cameleer.siegeln.net`
- Secrets managed in CI deploy step (idempotent `--dry-run=client | kubectl apply`): `cameleer-auth`, `postgres-credentials`, `clickhouse-credentials`
- K8s probes: server uses `/api/v1/health`, PostgreSQL uses `pg_isready`
- Docker build uses buildx registry cache + `--provenance=false` for Gitea compatibility
## UI Styling

View File

@@ -21,7 +21,7 @@ COPY --from=build /build/cameleer3-server-app/target/cameleer3-server-app-*.jar
ENV SPRING_DATASOURCE_URL=jdbc:postgresql://postgres:5432/cameleer3
ENV SPRING_DATASOURCE_USERNAME=cameleer
ENV SPRING_DATASOURCE_PASSWORD=cameleer_dev
ENV OPENSEARCH_URL=http://opensearch:9200
EXPOSE 8081
ENTRYPOINT exec java -jar /app/server.jar
ENV TZ=UTC
ENTRYPOINT exec java -Duser.timezone=UTC -jar /app/server.jar

View File

@@ -21,18 +21,17 @@ mvn clean verify # compile + run all tests (needs Docker for integrati
## Infrastructure Setup
Start PostgreSQL and OpenSearch:
Start PostgreSQL:
```bash
docker compose up -d
```
This starts TimescaleDB (PostgreSQL 16) and OpenSearch 2.19. The database schema is applied automatically via Flyway migrations on server startup.
This starts PostgreSQL 16. The database schema is applied automatically via Flyway migrations on server startup. ClickHouse tables are created by the schema initializer on startup.
| Service | Port | Purpose |
|------------|------|----------------------|
| PostgreSQL | 5432 | JDBC (Spring JDBC) |
| OpenSearch | 9200 | REST API (full-text) |
PostgreSQL credentials: `cameleer` / `cameleer_dev`, database `cameleer3`.
@@ -381,8 +380,8 @@ Key settings in `cameleer3-server-app/src/main/resources/application.yml`:
| `security.oidc.client-secret` | | OAuth2 client secret (`CAMELEER_OIDC_CLIENT_SECRET`) |
| `security.oidc.roles-claim` | `realm_access.roles` | JSONPath to roles in OIDC id_token (`CAMELEER_OIDC_ROLES_CLAIM`) |
| `security.oidc.default-roles` | `VIEWER` | Default roles for new OIDC users (`CAMELEER_OIDC_DEFAULT_ROLES`) |
| `opensearch.log-index-prefix` | `logs-` | OpenSearch index prefix for application logs (`CAMELEER_LOG_INDEX_PREFIX`) |
| `opensearch.log-retention-days` | `7` | Days before log indices are deleted (`CAMELEER_LOG_RETENTION_DAYS`) |
| `cameleer.indexer.debounce-ms` | `2000` | Search indexer debounce delay (`CAMELEER_INDEXER_DEBOUNCE_MS`) |
| `cameleer.indexer.queue-size` | `10000` | Search indexer queue capacity (`CAMELEER_INDEXER_QUEUE_SIZE`) |
## Web UI Development
@@ -407,7 +406,7 @@ npm run generate-api # Requires backend running on :8081
## Running Tests
Integration tests use Testcontainers (starts PostgreSQL and OpenSearch automatically — requires Docker):
Integration tests use Testcontainers (starts PostgreSQL automatically — requires Docker):
```bash
# All tests
@@ -438,7 +437,7 @@ The full stack is deployed to k3s via CI/CD on push to `main`. K8s manifests are
```
cameleer namespace:
PostgreSQL (StatefulSet, 10Gi PVC) ← postgres:5432 (ClusterIP)
OpenSearch (StatefulSet, 10Gi PVC) ← opensearch:9200 (ClusterIP)
ClickHouse (StatefulSet, 10Gi PVC) ← clickhouse:8123 (ClusterIP)
cameleer3-server (Deployment) ← NodePort 30081
cameleer3-ui (Deployment, Nginx) ← NodePort 30090
Authentik Server (Deployment) ← NodePort 30950
@@ -460,7 +459,7 @@ cameleer namespace:
Push to `main` triggers: **build** (UI npm + Maven, unit tests) → **docker** (buildx amd64 for server + UI, push to Gitea registry) → **deploy** (kubectl apply + rolling update).
Required Gitea org secrets: `REGISTRY_TOKEN`, `KUBECONFIG_BASE64`, `CAMELEER_AUTH_TOKEN`, `CAMELEER_JWT_SECRET`, `POSTGRES_USER`, `POSTGRES_PASSWORD`, `POSTGRES_DB`, `OPENSEARCH_USER`, `OPENSEARCH_PASSWORD`, `CAMELEER_UI_USER` (optional), `CAMELEER_UI_PASSWORD` (optional), `AUTHENTIK_PG_USER`, `AUTHENTIK_PG_PASSWORD`, `AUTHENTIK_SECRET_KEY`, `CAMELEER_OIDC_ENABLED`, `CAMELEER_OIDC_ISSUER`, `CAMELEER_OIDC_CLIENT_ID`, `CAMELEER_OIDC_CLIENT_SECRET`.
Required Gitea org secrets: `REGISTRY_TOKEN`, `KUBECONFIG_BASE64`, `CAMELEER_AUTH_TOKEN`, `CAMELEER_JWT_SECRET`, `POSTGRES_USER`, `POSTGRES_PASSWORD`, `POSTGRES_DB`, `CLICKHOUSE_USER`, `CLICKHOUSE_PASSWORD`, `CAMELEER_UI_USER` (optional), `CAMELEER_UI_PASSWORD` (optional), `AUTHENTIK_PG_USER`, `AUTHENTIK_PG_PASSWORD`, `AUTHENTIK_SECRET_KEY`, `CAMELEER_OIDC_ENABLED`, `CAMELEER_OIDC_ISSUER`, `CAMELEER_OIDC_CLIENT_ID`, `CAMELEER_OIDC_CLIENT_SECRET`.
### Manual K8s Commands
@@ -474,8 +473,8 @@ kubectl -n cameleer logs -f deploy/cameleer3-server
# View PostgreSQL logs
kubectl -n cameleer logs -f statefulset/postgres
# View OpenSearch logs
kubectl -n cameleer logs -f statefulset/opensearch
# View ClickHouse logs
kubectl -n cameleer logs -f statefulset/clickhouse
# Restart server
kubectl -n cameleer rollout restart deployment/cameleer3-server

View File

@@ -48,14 +48,10 @@
<artifactId>flyway-database-postgresql</artifactId>
</dependency>
<dependency>
<groupId>org.opensearch.client</groupId>
<artifactId>opensearch-java</artifactId>
<version>2.19.0</version>
</dependency>
<dependency>
<groupId>org.opensearch.client</groupId>
<artifactId>opensearch-rest-client</artifactId>
<version>2.19.0</version>
<groupId>com.clickhouse</groupId>
<artifactId>clickhouse-jdbc</artifactId>
<version>0.9.7</version>
<classifier>all</classifier>
</dependency>
<dependency>
<groupId>org.springdoc</groupId>
@@ -121,9 +117,8 @@
<scope>test</scope>
</dependency>
<dependency>
<groupId>org.opensearch</groupId>
<artifactId>opensearch-testcontainers</artifactId>
<version>2.1.1</version>
<groupId>org.testcontainers</groupId>
<artifactId>testcontainers-clickhouse</artifactId>
<scope>test</scope>
</dependency>
<dependency>

View File

@@ -39,7 +39,7 @@ public class AgentLifecycleMonitor {
// Snapshot states before lifecycle check
Map<String, AgentState> statesBefore = new HashMap<>();
for (AgentInfo agent : registryService.findAll()) {
statesBefore.put(agent.id(), agent.state());
statesBefore.put(agent.instanceId(), agent.state());
}
registryService.checkLifecycle();
@@ -47,12 +47,12 @@ public class AgentLifecycleMonitor {
// Detect transitions and record events
for (AgentInfo agent : registryService.findAll()) {
AgentState before = statesBefore.get(agent.id());
AgentState before = statesBefore.get(agent.instanceId());
if (before != null && before != agent.state()) {
String eventType = mapTransitionEvent(before, agent.state());
if (eventType != null) {
agentEventService.recordEvent(agent.id(), agent.application(), eventType,
agent.name() + " " + before + " -> " + agent.state());
agentEventService.recordEvent(agent.instanceId(), agent.applicationId(), eventType,
agent.displayName() + " " + before + " -> " + agent.state());
}
}
}

View File

@@ -0,0 +1,30 @@
package com.cameleer3.server.app.analytics;
import com.cameleer3.server.app.storage.ClickHouseUsageTracker;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.boot.autoconfigure.condition.ConditionalOnBean;
import org.springframework.scheduling.annotation.Scheduled;
import org.springframework.stereotype.Component;
@Component
@ConditionalOnBean(ClickHouseUsageTracker.class)
public class UsageFlushScheduler {
private static final Logger log = LoggerFactory.getLogger(UsageFlushScheduler.class);
private final ClickHouseUsageTracker tracker;
public UsageFlushScheduler(ClickHouseUsageTracker tracker) {
this.tracker = tracker;
}
@Scheduled(fixedDelayString = "${cameleer.usage.flush-interval-ms:5000}")
public void flush() {
try {
tracker.flush();
} catch (Exception e) {
log.warn("Usage event flush failed: {}", e.getMessage());
}
}
}

View File

@@ -0,0 +1,88 @@
package com.cameleer3.server.app.analytics;
import com.cameleer3.server.core.analytics.UsageEvent;
import com.cameleer3.server.core.analytics.UsageTracker;
import jakarta.servlet.http.HttpServletRequest;
import jakarta.servlet.http.HttpServletResponse;
import org.springframework.security.core.Authentication;
import org.springframework.security.core.context.SecurityContextHolder;
import org.springframework.web.servlet.HandlerInterceptor;
import java.time.Instant;
import java.util.regex.Pattern;
/**
* Tracks authenticated UI user requests for usage analytics.
* Skips agent requests, health checks, data ingestion, and static assets.
*/
public class UsageTrackingInterceptor implements HandlerInterceptor {
private static final String START_ATTR = "usage.startNanos";
// Patterns for normalizing dynamic path segments
private static final Pattern EXCHANGE_ID = Pattern.compile(
"/[A-F0-9]{15,}-[A-F0-9]{16}(?=/|$)", Pattern.CASE_INSENSITIVE);
private static final Pattern UUID = Pattern.compile(
"/[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}(?=/|$)", Pattern.CASE_INSENSITIVE);
private static final Pattern HEX_HASH = Pattern.compile(
"/[0-9a-f]{32,64}(?=/|$)", Pattern.CASE_INSENSITIVE);
private static final Pattern NUMERIC_ID = Pattern.compile(
"(?<=/)(\\d{2,})(?=/|$)");
// Agent instance IDs like "cameleer3-sample-598867949d-g7nt4-1"
private static final Pattern INSTANCE_ID = Pattern.compile(
"(?<=/agents/)[^/]+(?=/)", Pattern.CASE_INSENSITIVE);
private final UsageTracker usageTracker;
public UsageTrackingInterceptor(UsageTracker usageTracker) {
this.usageTracker = usageTracker;
}
@Override
public boolean preHandle(HttpServletRequest request, HttpServletResponse response, Object handler) {
request.setAttribute(START_ATTR, System.nanoTime());
return true;
}
@Override
public void afterCompletion(HttpServletRequest request, HttpServletResponse response,
Object handler, Exception ex) {
String username = extractUsername();
if (username == null) return; // unauthenticated or agent request
Long startNanos = (Long) request.getAttribute(START_ATTR);
long durationMs = startNanos != null ? (System.nanoTime() - startNanos) / 1_000_000 : 0;
String path = request.getRequestURI();
String queryString = request.getQueryString();
usageTracker.track(new UsageEvent(
Instant.now(),
username,
request.getMethod(),
path,
normalizePath(path),
response.getStatus(),
durationMs,
queryString
));
}
private String extractUsername() {
Authentication auth = SecurityContextHolder.getContext().getAuthentication();
if (auth == null || auth.getName() == null) return null;
String name = auth.getName();
// Only track UI users (user:admin), not agents
if (!name.startsWith("user:")) return null;
return name;
}
static String normalizePath(String path) {
String normalized = EXCHANGE_ID.matcher(path).replaceAll("/{id}");
normalized = UUID.matcher(normalized).replaceAll("/{id}");
normalized = HEX_HASH.matcher(normalized).replaceAll("/{hash}");
normalized = INSTANCE_ID.matcher(normalized).replaceAll("{id}");
normalized = NUMERIC_ID.matcher(normalized).replaceAll("{id}");
return normalized;
}
}
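As a worked example (the concrete path is hypothetical), a request to /api/v1/agents/cameleer3-sample-598867949d-g7nt4-1/routes/42 is first collapsed by INSTANCE_ID to /api/v1/agents/{id}/routes/42, then by NUMERIC_ID to /api/v1/agents/{id}/routes/{id}, so per-instance and per-entity paths aggregate under a single normalized key.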

View File

@@ -3,11 +3,13 @@ package com.cameleer3.server.app.config;
import com.cameleer3.server.core.agent.AgentEventRepository;
import com.cameleer3.server.core.agent.AgentEventService;
import com.cameleer3.server.core.agent.AgentRegistryService;
import com.cameleer3.server.core.agent.RouteStateRegistry;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
/**
* Creates the {@link AgentRegistryService} and {@link AgentEventService} beans.
* Creates the {@link AgentRegistryService}, {@link AgentEventService},
* and {@link RouteStateRegistry} beans.
* <p>
* Follows the established pattern: core module plain class, app module bean config.
*/
@@ -27,4 +29,9 @@ public class AgentRegistryBeanConfig {
public AgentEventService agentEventService(AgentEventRepository repository) {
return new AgentEventService(repository);
}
@Bean
public RouteStateRegistry routeStateRegistry() {
return new RouteStateRegistry();
}
}
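For orientation, a hedged reconstruction of the RouteStateRegistry wired up above; only setState(...) and the RouteState constants are visible in the AgentCommandController diff further down, so getState, the key format, and the STARTED default are assumptions:

```java
package com.cameleer3.server.core.agent;

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class RouteStateRegistry {

    public enum RouteState { STARTED, STOPPED, SUSPENDED }

    private final Map<String, RouteState> states = new ConcurrentHashMap<>();

    /** Called once all agents in a group have ACKed a route-control command. */
    public void setState(String group, String routeId, RouteState state) {
        states.put(key(group, routeId), state);
    }

    /** Assumed accessor: last confirmed state, defaulting to STARTED. */
    public RouteState getState(String group, String routeId) {
        return states.getOrDefault(key(group, routeId), RouteState.STARTED);
    }

    private static String key(String group, String routeId) {
        return group + "|" + routeId;
    }
}
```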

View File

@@ -0,0 +1,52 @@
package com.cameleer3.server.app.config;
import com.zaxxer.hikari.HikariDataSource;
import org.springframework.beans.factory.annotation.Qualifier;
import org.springframework.boot.autoconfigure.condition.ConditionalOnProperty;
import org.springframework.boot.autoconfigure.jdbc.DataSourceProperties;
import org.springframework.boot.context.properties.EnableConfigurationProperties;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.context.annotation.Primary;
import org.springframework.jdbc.core.JdbcTemplate;
import javax.sql.DataSource;
@Configuration
@EnableConfigurationProperties(ClickHouseProperties.class)
@ConditionalOnProperty(name = "clickhouse.enabled", havingValue = "true")
public class ClickHouseConfig {
/**
* Explicit primary PG DataSource. Required because adding a second DataSource
* (ClickHouse) prevents Spring Boot auto-configuration from creating the default one.
*/
@Bean
@Primary
public DataSource dataSource(DataSourceProperties properties) {
return properties.initializeDataSourceBuilder().build();
}
@Bean
@Primary
public JdbcTemplate jdbcTemplate(@Qualifier("dataSource") DataSource dataSource) {
return new JdbcTemplate(dataSource);
}
@Bean(name = "clickHouseDataSource")
public DataSource clickHouseDataSource(ClickHouseProperties props) {
HikariDataSource ds = new HikariDataSource();
ds.setJdbcUrl(props.getUrl());
ds.setUsername(props.getUsername());
ds.setPassword(props.getPassword());
ds.setMaximumPoolSize(10);
ds.setPoolName("clickhouse-pool");
return ds;
}
@Bean(name = "clickHouseJdbcTemplate")
public JdbcTemplate clickHouseJdbcTemplate(
@Qualifier("clickHouseDataSource") DataSource ds) {
return new JdbcTemplate(ds);
}
}

View File

@@ -0,0 +1,20 @@
package com.cameleer3.server.app.config;
import org.springframework.boot.context.properties.ConfigurationProperties;
@ConfigurationProperties(prefix = "clickhouse")
public class ClickHouseProperties {
private String url = "jdbc:clickhouse://localhost:8123/cameleer";
private String username = "default";
private String password = "";
public String getUrl() { return url; }
public void setUrl(String url) { this.url = url; }
public String getUsername() { return username; }
public void setUsername(String username) { this.username = username; }
public String getPassword() { return password; }
public void setPassword(String password) { this.password = password; }
}

View File

@@ -0,0 +1,62 @@
package com.cameleer3.server.app.config;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.beans.factory.annotation.Qualifier;
import org.springframework.boot.autoconfigure.condition.ConditionalOnProperty;
import org.springframework.boot.context.event.ApplicationReadyEvent;
import org.springframework.context.event.EventListener;
import org.springframework.core.io.Resource;
import org.springframework.core.io.support.PathMatchingResourcePatternResolver;
import org.springframework.jdbc.core.JdbcTemplate;
import org.springframework.stereotype.Component;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Comparator;
@Component
@ConditionalOnProperty(name = "clickhouse.enabled", havingValue = "true")
public class ClickHouseSchemaInitializer {
private static final Logger log = LoggerFactory.getLogger(ClickHouseSchemaInitializer.class);
private final JdbcTemplate clickHouseJdbc;
public ClickHouseSchemaInitializer(
@Qualifier("clickHouseJdbcTemplate") JdbcTemplate clickHouseJdbc) {
this.clickHouseJdbc = clickHouseJdbc;
}
@EventListener(ApplicationReadyEvent.class)
public void initializeSchema() {
try {
PathMatchingResourcePatternResolver resolver = new PathMatchingResourcePatternResolver();
Resource[] scripts = resolver.getResources("classpath:clickhouse/*.sql");
Arrays.sort(scripts, Comparator.comparing(Resource::getFilename));
for (Resource script : scripts) {
String sql = script.getContentAsString(StandardCharsets.UTF_8);
log.info("Executing ClickHouse schema script: {}", script.getFilename());
for (String statement : sql.split(";")) {
String trimmed = statement.trim();
// Skip empty segments and comment-only segments
String withoutComments = trimmed.lines()
.filter(line -> !line.stripLeading().startsWith("--"))
.map(String::trim)
.filter(line -> !line.isEmpty())
.reduce("", (a, b) -> a + b);
if (!withoutComments.isEmpty()) {
clickHouseJdbc.execute(trimmed);
}
}
}
log.info("ClickHouse schema initialization complete ({} scripts)", scripts.length);
} catch (Exception e) {
log.error("ClickHouse schema initialization failed — server will continue but ClickHouse features may not work", e);
}
}
}

View File

@@ -1,7 +1,10 @@
package com.cameleer3.server.app.config;
import com.cameleer3.server.core.ingestion.ChunkAccumulator;
import com.cameleer3.server.core.ingestion.MergedExecution;
import com.cameleer3.server.core.ingestion.WriteBuffer;
import com.cameleer3.server.core.storage.model.MetricsSnapshot;
import org.springframework.boot.autoconfigure.condition.ConditionalOnProperty;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
@@ -19,4 +22,16 @@ public class IngestionBeanConfig {
public WriteBuffer<MetricsSnapshot> metricsBuffer(IngestionConfig config) {
return new WriteBuffer<>(config.getBufferCapacity());
}
@Bean
@ConditionalOnProperty(name = "clickhouse.enabled", havingValue = "true")
public WriteBuffer<MergedExecution> executionBuffer(IngestionConfig config) {
return new WriteBuffer<>(config.getBufferCapacity());
}
@Bean
@ConditionalOnProperty(name = "clickhouse.enabled", havingValue = "true")
public WriteBuffer<ChunkAccumulator.ProcessorBatch> processorBatchBuffer(IngestionConfig config) {
return new WriteBuffer<>(config.getBufferCapacity());
}
}

View File

@@ -1,28 +0,0 @@
package com.cameleer3.server.app.config;
import org.apache.http.HttpHost;
import org.opensearch.client.RestClient;
import org.opensearch.client.json.jackson.JacksonJsonpMapper;
import org.opensearch.client.opensearch.OpenSearchClient;
import org.opensearch.client.transport.rest_client.RestClientTransport;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
@Configuration
public class OpenSearchConfig {
@Value("${opensearch.url:http://localhost:9200}")
private String opensearchUrl;
@Bean(destroyMethod = "close")
public RestClient opensearchRestClient() {
return RestClient.builder(HttpHost.create(opensearchUrl)).build();
}
@Bean
public OpenSearchClient openSearchClient(RestClient restClient) {
var transport = new RestClientTransport(restClient, new JacksonJsonpMapper());
return new OpenSearchClient(transport);
}
}

View File

@@ -1,16 +1,34 @@
package com.cameleer3.server.app.config;
import com.cameleer3.server.app.search.ClickHouseLogStore;
import com.cameleer3.server.app.storage.ClickHouseAgentEventRepository;
import com.cameleer3.server.app.storage.ClickHouseUsageTracker;
import com.cameleer3.server.app.storage.ClickHouseDiagramStore;
import com.cameleer3.server.app.storage.ClickHouseMetricsQueryStore;
import com.cameleer3.server.app.storage.ClickHouseMetricsStore;
import com.cameleer3.server.app.storage.ClickHouseStatsStore;
import com.cameleer3.server.core.admin.AuditRepository;
import com.cameleer3.server.core.admin.AuditService;
import com.cameleer3.server.core.agent.AgentEventRepository;
import com.cameleer3.server.core.detail.DetailService;
import com.cameleer3.server.core.indexing.SearchIndexer;
import com.cameleer3.server.app.ingestion.ExecutionFlushScheduler;
import com.cameleer3.server.app.search.ClickHouseSearchIndex;
import com.cameleer3.server.app.storage.ClickHouseExecutionStore;
import com.cameleer3.server.core.ingestion.ChunkAccumulator;
import com.cameleer3.server.core.ingestion.IngestionService;
import com.cameleer3.server.core.ingestion.MergedExecution;
import com.cameleer3.server.core.ingestion.WriteBuffer;
import com.cameleer3.server.core.storage.*;
import com.cameleer3.server.core.storage.LogIndex;
import com.cameleer3.server.core.storage.StatsStore;
import com.cameleer3.server.core.storage.model.MetricsSnapshot;
import org.springframework.beans.factory.annotation.Qualifier;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.boot.autoconfigure.condition.ConditionalOnProperty;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.jdbc.core.JdbcTemplate;
@Configuration
public class StorageBeanConfig {
@@ -22,8 +40,8 @@ public class StorageBeanConfig {
@Bean(destroyMethod = "shutdown")
public SearchIndexer searchIndexer(ExecutionStore executionStore, SearchIndex searchIndex,
@Value("${opensearch.debounce-ms:2000}") long debounceMs,
@Value("${opensearch.queue-size:10000}") int queueSize) {
@Value("${cameleer.indexer.debounce-ms:2000}") long debounceMs,
@Value("${cameleer.indexer.queue-size:10000}") int queueSize) {
return new SearchIndexer(executionStore, searchIndex, debounceMs, queueSize);
}
@@ -41,4 +59,102 @@ public class StorageBeanConfig {
return new IngestionService(executionStore, diagramStore, metricsBuffer,
searchIndexer::onExecutionUpdated, bodySizeLimit);
}
@Bean
public MetricsStore clickHouseMetricsStore(
@Qualifier("clickHouseJdbcTemplate") JdbcTemplate clickHouseJdbc) {
return new ClickHouseMetricsStore(clickHouseJdbc);
}
@Bean
public MetricsQueryStore clickHouseMetricsQueryStore(
@Qualifier("clickHouseJdbcTemplate") JdbcTemplate clickHouseJdbc) {
return new ClickHouseMetricsQueryStore(clickHouseJdbc);
}
// ── Execution Store ──────────────────────────────────────────────────
@Bean
public ClickHouseExecutionStore clickHouseExecutionStore(
@Qualifier("clickHouseJdbcTemplate") JdbcTemplate clickHouseJdbc) {
return new ClickHouseExecutionStore(clickHouseJdbc);
}
@Bean
public ChunkAccumulator chunkAccumulator(
WriteBuffer<MergedExecution> executionBuffer,
WriteBuffer<ChunkAccumulator.ProcessorBatch> processorBatchBuffer,
DiagramStore diagramStore) {
return new ChunkAccumulator(
executionBuffer::offerOrWarn,
processorBatchBuffer::offerOrWarn,
diagramStore,
java.time.Duration.ofMinutes(5));
}
@Bean
public ExecutionFlushScheduler executionFlushScheduler(
WriteBuffer<MergedExecution> executionBuffer,
WriteBuffer<ChunkAccumulator.ProcessorBatch> processorBatchBuffer,
ClickHouseExecutionStore executionStore,
ChunkAccumulator accumulator,
IngestionConfig config) {
return new ExecutionFlushScheduler(executionBuffer, processorBatchBuffer,
executionStore, accumulator, config);
}
@Bean
public SearchIndex clickHouseSearchIndex(
@Qualifier("clickHouseJdbcTemplate") JdbcTemplate clickHouseJdbc) {
return new ClickHouseSearchIndex(clickHouseJdbc);
}
// ── ClickHouse Stats Store ─────────────────────────────────────────
@Bean
public StatsStore clickHouseStatsStore(
@Qualifier("clickHouseJdbcTemplate") JdbcTemplate clickHouseJdbc) {
return new ClickHouseStatsStore(clickHouseJdbc);
}
// ── ClickHouse Diagram Store ──────────────────────────────────────
@Bean
public DiagramStore clickHouseDiagramStore(
@Qualifier("clickHouseJdbcTemplate") JdbcTemplate clickHouseJdbc) {
return new ClickHouseDiagramStore(clickHouseJdbc);
}
// ── ClickHouse Agent Event Repository ─────────────────────────────
@Bean
public AgentEventRepository clickHouseAgentEventRepository(
@Qualifier("clickHouseJdbcTemplate") JdbcTemplate clickHouseJdbc) {
return new ClickHouseAgentEventRepository(clickHouseJdbc);
}
// ── ClickHouse Log Store ──────────────────────────────────────────
@Bean
public LogIndex clickHouseLogStore(
@Qualifier("clickHouseJdbcTemplate") JdbcTemplate clickHouseJdbc) {
return new ClickHouseLogStore(clickHouseJdbc);
}
// ── Usage Analytics ──────────────────────────────────────────────
@Bean
@ConditionalOnProperty(name = "clickhouse.enabled", havingValue = "true")
public ClickHouseUsageTracker clickHouseUsageTracker(
@Qualifier("clickHouseJdbcTemplate") JdbcTemplate clickHouseJdbc) {
return new ClickHouseUsageTracker(clickHouseJdbc,
new com.cameleer3.server.core.ingestion.WriteBuffer<>(5000));
}
@Bean
@ConditionalOnProperty(name = "clickhouse.enabled", havingValue = "true")
public com.cameleer3.server.app.analytics.UsageTrackingInterceptor usageTrackingInterceptor(
ClickHouseUsageTracker usageTracker) {
return new com.cameleer3.server.app.analytics.UsageTrackingInterceptor(usageTracker);
}
}

View File

@@ -1,5 +1,6 @@
package com.cameleer3.server.app.config;
import com.cameleer3.server.app.analytics.UsageTrackingInterceptor;
import com.cameleer3.server.app.interceptor.AuditInterceptor;
import com.cameleer3.server.app.interceptor.ProtocolVersionInterceptor;
import org.springframework.context.annotation.Configuration;
@@ -14,11 +15,14 @@ public class WebConfig implements WebMvcConfigurer {
private final ProtocolVersionInterceptor protocolVersionInterceptor;
private final AuditInterceptor auditInterceptor;
private final UsageTrackingInterceptor usageTrackingInterceptor;
public WebConfig(ProtocolVersionInterceptor protocolVersionInterceptor,
AuditInterceptor auditInterceptor) {
AuditInterceptor auditInterceptor,
@org.springframework.lang.Nullable UsageTrackingInterceptor usageTrackingInterceptor) {
this.protocolVersionInterceptor = protocolVersionInterceptor;
this.auditInterceptor = auditInterceptor;
this.usageTrackingInterceptor = usageTrackingInterceptor;
}
@Override
@@ -35,6 +39,18 @@ public class WebConfig implements WebMvcConfigurer {
"/api/v1/agents/*/refresh"
);
// Usage analytics: tracks authenticated UI user requests
if (usageTrackingInterceptor != null) {
registry.addInterceptor(usageTrackingInterceptor)
.addPathPatterns("/api/v1/**")
.excludePathPatterns(
"/api/v1/data/**",
"/api/v1/agents/*/heartbeat",
"/api/v1/agents/*/events",
"/api/v1/health"
);
}
// Safety-net audit: catches any unaudited POST/PUT/DELETE
registry.addInterceptor(auditInterceptor)
.addPathPatterns("/api/v1/**")

View File

@@ -3,6 +3,7 @@ package com.cameleer3.server.app.controller;
import com.cameleer3.server.app.agent.SseConnectionManager;
import com.cameleer3.server.app.dto.CommandAckRequest;
import com.cameleer3.server.app.dto.CommandBroadcastResponse;
import com.cameleer3.server.app.dto.CommandGroupResponse;
import com.cameleer3.server.app.dto.CommandRequest;
import com.cameleer3.server.app.dto.CommandSingleResponse;
import com.cameleer3.server.app.dto.ReplayRequest;
@@ -17,6 +18,7 @@ import com.cameleer3.server.core.agent.AgentRegistryService;
import com.cameleer3.server.core.agent.AgentState;
import com.cameleer3.server.core.agent.CommandReply;
import com.cameleer3.server.core.agent.CommandType;
import com.cameleer3.server.core.agent.RouteStateRegistry;
import com.fasterxml.jackson.core.JsonProcessingException;
import com.fasterxml.jackson.databind.ObjectMapper;
import jakarta.servlet.http.HttpServletRequest;
@@ -66,17 +68,20 @@ public class AgentCommandController {
private final ObjectMapper objectMapper;
private final AgentEventService agentEventService;
private final AuditService auditService;
private final RouteStateRegistry routeStateRegistry;
public AgentCommandController(AgentRegistryService registryService,
SseConnectionManager connectionManager,
ObjectMapper objectMapper,
AgentEventService agentEventService,
AuditService auditService) {
AuditService auditService,
RouteStateRegistry routeStateRegistry) {
this.registryService = registryService;
this.connectionManager = connectionManager;
this.objectMapper = objectMapper;
this.agentEventService = agentEventService;
this.auditService = auditService;
this.routeStateRegistry = routeStateRegistry;
}
@PostMapping("/{id}/commands")
@@ -109,32 +114,83 @@ public class AgentCommandController {
@PostMapping("/groups/{group}/commands")
@Operation(summary = "Send command to all agents in a group",
description = "Sends a command to all LIVE agents in the specified group")
@ApiResponse(responseCode = "202", description = "Commands accepted")
description = "Sends a command to all LIVE agents in the specified group and waits for responses")
@ApiResponse(responseCode = "200", description = "Commands dispatched and responses collected")
@ApiResponse(responseCode = "400", description = "Invalid command payload")
public ResponseEntity<CommandBroadcastResponse> sendGroupCommand(@PathVariable String group,
@RequestBody CommandRequest request,
HttpServletRequest httpRequest) throws JsonProcessingException {
public ResponseEntity<CommandGroupResponse> sendGroupCommand(@PathVariable String group,
@RequestBody CommandRequest request,
HttpServletRequest httpRequest) throws JsonProcessingException {
CommandType type = mapCommandType(request.type());
String payloadJson = request.payload() != null ? objectMapper.writeValueAsString(request.payload()) : "{}";
List<AgentInfo> agents = registryService.findAll().stream()
.filter(a -> a.state() == AgentState.LIVE)
.filter(a -> group.equals(a.application()))
.toList();
Map<String, CompletableFuture<CommandReply>> futures =
registryService.addGroupCommandWithReplies(group, type, payloadJson);
List<String> commandIds = new ArrayList<>();
for (AgentInfo agent : agents) {
AgentCommand command = registryService.addCommand(agent.id(), type, payloadJson);
commandIds.add(command.id());
if (futures.isEmpty()) {
auditService.log("broadcast_group_command", AuditCategory.AGENT, group,
java.util.Map.of("type", request.type(), "agentCount", 0),
AuditResult.SUCCESS, httpRequest);
return ResponseEntity.ok(new CommandGroupResponse(true, 0, 0, List.of(), List.of()));
}
// Wait with shared 10-second deadline
long deadline = System.currentTimeMillis() + 10_000;
List<CommandGroupResponse.AgentResponse> responses = new ArrayList<>();
List<String> timedOut = new ArrayList<>();
for (var entry : futures.entrySet()) {
long remaining = deadline - System.currentTimeMillis();
if (remaining <= 0) {
timedOut.add(entry.getKey());
entry.getValue().cancel(false);
continue;
}
try {
CommandReply reply = entry.getValue().get(remaining, TimeUnit.MILLISECONDS);
responses.add(new CommandGroupResponse.AgentResponse(
entry.getKey(), reply.status(), reply.message()));
} catch (TimeoutException e) {
timedOut.add(entry.getKey());
entry.getValue().cancel(false);
} catch (Exception e) {
responses.add(new CommandGroupResponse.AgentResponse(
entry.getKey(), "ERROR", e.getMessage()));
}
}
boolean allSuccess = timedOut.isEmpty() &&
responses.stream().allMatch(r -> "SUCCESS".equals(r.status()));
// Update route state when all agents successfully ACK a route-control command
if (allSuccess && type == CommandType.ROUTE_CONTROL) {
try {
@SuppressWarnings("unchecked")
var payloadMap = objectMapper.readValue(payloadJson, java.util.Map.class);
String routeId = (String) payloadMap.get("routeId");
String action = (String) payloadMap.get("action");
if (routeId != null && action != null) {
RouteStateRegistry.RouteState newState = switch (action) {
case "start", "resume" -> RouteStateRegistry.RouteState.STARTED;
case "stop" -> RouteStateRegistry.RouteState.STOPPED;
case "suspend" -> RouteStateRegistry.RouteState.SUSPENDED;
default -> null;
};
if (newState != null) {
routeStateRegistry.setState(group, routeId, newState);
}
}
} catch (Exception e) {
log.warn("Failed to parse route-control payload for state tracking", e);
}
}
auditService.log("broadcast_group_command", AuditCategory.AGENT, group,
java.util.Map.of("type", request.type(), "agentCount", agents.size()),
java.util.Map.of("type", request.type(), "agentCount", futures.size(),
"responded", responses.size(), "timedOut", timedOut.size()),
AuditResult.SUCCESS, httpRequest);
return ResponseEntity.status(HttpStatus.ACCEPTED)
.body(new CommandBroadcastResponse(commandIds, agents.size()));
return ResponseEntity.ok(new CommandGroupResponse(
allSuccess, futures.size(), responses.size(), responses, timedOut));
}
@PostMapping("/commands")
@@ -151,7 +207,7 @@ public class AgentCommandController {
List<String> commandIds = new ArrayList<>();
for (AgentInfo agent : liveAgents) {
AgentCommand command = registryService.addCommand(agent.id(), type, payloadJson);
AgentCommand command = registryService.addCommand(agent.instanceId(), type, payloadJson);
commandIds.add(command.id());
}
@@ -185,7 +241,7 @@ public class AgentCommandController {
// Record command result in agent event log
if (body != null && body.status() != null) {
AgentInfo agent = registryService.findById(id);
String application = agent != null ? agent.application() : "unknown";
String application = agent != null ? agent.applicationId() : "unknown";
agentEventService.recordEvent(id, application, "COMMAND_" + body.status(),
"Command " + commandId + ": " + body.message());
log.debug("Command {} ack from agent {}: {} - {}", commandId, id, body.status(), body.message());

View File

@@ -2,22 +2,23 @@ package com.cameleer3.server.app.controller;
import com.cameleer3.server.app.dto.AgentMetricsResponse;
import com.cameleer3.server.app.dto.MetricBucket;
import org.springframework.jdbc.core.JdbcTemplate;
import com.cameleer3.server.core.storage.MetricsQueryStore;
import com.cameleer3.server.core.storage.model.MetricTimeSeries;
import org.springframework.web.bind.annotation.*;
import java.sql.Timestamp;
import java.time.Instant;
import java.time.temporal.ChronoUnit;
import java.util.*;
import java.util.stream.Collectors;
@RestController
@RequestMapping("/api/v1/agents/{agentId}/metrics")
public class AgentMetricsController {
private final JdbcTemplate jdbc;
private final MetricsQueryStore metricsQueryStore;
public AgentMetricsController(JdbcTemplate jdbc) {
this.jdbc = jdbc;
public AgentMetricsController(MetricsQueryStore metricsQueryStore) {
this.metricsQueryStore = metricsQueryStore;
}
@GetMapping
@@ -32,34 +33,18 @@ public class AgentMetricsController {
if (to == null) to = Instant.now();
List<String> metricNames = Arrays.asList(names.split(","));
long intervalMs = (to.toEpochMilli() - from.toEpochMilli()) / Math.max(buckets, 1);
String intervalStr = intervalMs + " milliseconds";
Map<String, List<MetricBucket>> result = new LinkedHashMap<>();
for (String name : metricNames) {
result.put(name.trim(), new ArrayList<>());
}
Map<String, List<MetricTimeSeries.Bucket>> raw =
metricsQueryStore.queryTimeSeries(agentId, metricNames, from, to, buckets);
String sql = """
SELECT time_bucket(CAST(? AS interval), collected_at) AS bucket,
metric_name,
AVG(metric_value) AS avg_value
FROM agent_metrics
WHERE agent_id = ?
AND collected_at >= ? AND collected_at < ?
AND metric_name = ANY(?)
GROUP BY bucket, metric_name
ORDER BY bucket
""";
String[] namesArray = metricNames.stream().map(String::trim).toArray(String[]::new);
jdbc.query(sql, rs -> {
String metricName = rs.getString("metric_name");
Instant bucket = rs.getTimestamp("bucket").toInstant();
double value = rs.getDouble("avg_value");
result.computeIfAbsent(metricName, k -> new ArrayList<>())
.add(new MetricBucket(bucket, value));
}, intervalStr, agentId, Timestamp.from(from), Timestamp.from(to), namesArray);
Map<String, List<MetricBucket>> result = raw.entrySet().stream()
.collect(Collectors.toMap(
Map.Entry::getKey,
e -> e.getValue().stream()
.map(b -> new MetricBucket(b.time(), b.value()))
.toList(),
(a, b) -> a,
LinkedHashMap::new));
return new AgentMetricsResponse(result);
}

View File

@@ -71,7 +71,7 @@ public class AgentRegistrationController {
Ed25519SigningService ed25519SigningService,
AgentEventService agentEventService,
AuditService auditService,
JdbcTemplate jdbc) {
@org.springframework.beans.factory.annotation.Qualifier("clickHouseJdbcTemplate") JdbcTemplate jdbc) {
this.registryService = registryService;
this.config = config;
this.bootstrapTokenValidator = bootstrapTokenValidator;
@@ -103,34 +103,34 @@ public class AgentRegistrationController {
return ResponseEntity.status(401).build();
}
if (request.agentId() == null || request.agentId().isBlank()
|| request.name() == null || request.name().isBlank()) {
if (request.instanceId() == null || request.instanceId().isBlank()
|| request.displayName() == null || request.displayName().isBlank()) {
return ResponseEntity.badRequest().build();
}
String application = request.application() != null ? request.application() : "default";
String application = request.applicationId() != null ? request.applicationId() : "default";
List<String> routeIds = request.routeIds() != null ? request.routeIds() : List.of();
var capabilities = request.capabilities() != null ? request.capabilities() : Collections.<String, Object>emptyMap();
AgentInfo agent = registryService.register(
request.agentId(), request.name(), application, request.version(), routeIds, capabilities);
log.info("Agent registered: {} (name={}, application={})", request.agentId(), request.name(), application);
request.instanceId(), request.displayName(), application, request.version(), routeIds, capabilities);
log.info("Agent registered: {} (name={}, application={})", request.instanceId(), request.displayName(), application);
agentEventService.recordEvent(request.agentId(), application, "REGISTERED",
"Agent registered: " + request.name());
agentEventService.recordEvent(request.instanceId(), application, "REGISTERED",
"Agent registered: " + request.displayName());
auditService.log(request.agentId(), "agent_register", AuditCategory.AGENT, request.agentId(),
Map.of("application", application, "name", request.name()),
auditService.log(request.instanceId(), "agent_register", AuditCategory.AGENT, request.instanceId(),
Map.of("application", application, "name", request.displayName()),
AuditResult.SUCCESS, httpRequest);
// Issue JWT tokens with AGENT role
List<String> roles = List.of("AGENT");
String accessToken = jwtService.createAccessToken(request.agentId(), application, roles);
String refreshToken = jwtService.createRefreshToken(request.agentId(), application, roles);
String accessToken = jwtService.createAccessToken(request.instanceId(), application, roles);
String refreshToken = jwtService.createRefreshToken(request.instanceId(), application, roles);
return ResponseEntity.ok(new AgentRegistrationResponse(
agent.id(),
"/api/v1/agents/" + agent.id() + "/events",
agent.instanceId(),
"/api/v1/agents/" + agent.instanceId() + "/events",
config.getHeartbeatIntervalMs(),
ed25519SigningService.getPublicKeyBase64(),
accessToken,
@@ -177,8 +177,8 @@ public class AgentRegistrationController {
// Preserve roles from refresh token
List<String> roles = result.roles().isEmpty()
? List.of("AGENT") : result.roles();
String newAccessToken = jwtService.createAccessToken(agentId, agent.application(), roles);
String newRefreshToken = jwtService.createRefreshToken(agentId, agent.application(), roles);
String newAccessToken = jwtService.createAccessToken(agentId, agent.applicationId(), roles);
String newRefreshToken = jwtService.createRefreshToken(agentId, agent.applicationId(), roles);
auditService.log(agentId, "agent_token_refresh", AuditCategory.AUTH, agentId,
null, AuditResult.SUCCESS, httpRequest);
@@ -199,6 +199,23 @@ public class AgentRegistrationController {
return ResponseEntity.ok().build();
}
@PostMapping("/{id}/deregister")
@Operation(summary = "Deregister agent",
description = "Removes the agent from the registry. Called by agents during graceful shutdown.")
@ApiResponse(responseCode = "200", description = "Agent deregistered")
@ApiResponse(responseCode = "404", description = "Agent not registered")
public ResponseEntity<Void> deregister(@PathVariable String id, HttpServletRequest httpRequest) {
AgentInfo agent = registryService.findById(id);
if (agent == null) {
return ResponseEntity.notFound().build();
}
String applicationId = agent.applicationId();
registryService.deregister(id);
agentEventService.recordEvent(id, applicationId, "DEREGISTERED", "Agent deregistered");
auditService.log(id, "agent_deregister", AuditCategory.AGENT, id, null, AuditResult.SUCCESS, httpRequest);
return ResponseEntity.ok().build();
}
@GetMapping
@Operation(summary = "List all agents",
description = "Returns all registered agents with runtime metrics, optionally filtered by status and/or application")
@@ -224,7 +241,7 @@ public class AgentRegistrationController {
// Apply application filter if specified
if (application != null && !application.isBlank()) {
agents = agents.stream()
.filter(a -> application.equals(a.application()))
.filter(a -> application.equals(a.applicationId()))
.toList();
}
@@ -235,10 +252,10 @@ public class AgentRegistrationController {
List<AgentInstanceResponse> response = finalAgents.stream()
.map(a -> {
AgentInstanceResponse dto = AgentInstanceResponse.from(a);
double[] m = agentMetrics.get(a.application());
double[] m = agentMetrics.get(a.applicationId());
if (m != null) {
long appAgentCount = finalAgents.stream()
.filter(ag -> ag.application().equals(a.application())).count();
.filter(ag -> ag.applicationId().equals(a.applicationId())).count();
double agentTps = appAgentCount > 0 ? m[0] / appAgentCount : 0;
double errorRate = m[1];
int activeRoutes = (int) m[2];
@@ -255,25 +272,33 @@ public class AgentRegistrationController {
Instant now = Instant.now();
Instant from1m = now.minus(1, ChronoUnit.MINUTES);
try {
// Literal SQL — ClickHouse JDBC driver wraps prepared statements in sub-queries
// that strip AggregateFunction column types, breaking -Merge combinators
jdbc.query(
"SELECT application_name, " +
"SUM(total_count) AS total, " +
"SUM(failed_count) AS failed, " +
"SELECT application_id, " +
"countMerge(total_count) AS total, " +
"countIfMerge(failed_count) AS failed, " +
"COUNT(DISTINCT route_id) AS active_routes " +
"FROM stats_1m_route WHERE bucket >= ? AND bucket < ? " +
"GROUP BY application_name",
"FROM stats_1m_route WHERE bucket >= " + lit(from1m) + " AND bucket < " + lit(now) +
" GROUP BY application_id",
rs -> {
long total = rs.getLong("total");
long failed = rs.getLong("failed");
double tps = total / 60.0;
double errorRate = total > 0 ? (double) failed / total : 0.0;
int activeRoutes = rs.getInt("active_routes");
result.put(rs.getString("application_name"), new double[]{tps, errorRate, activeRoutes});
},
Timestamp.from(from1m), Timestamp.from(now));
result.put(rs.getString("application_id"), new double[]{tps, errorRate, activeRoutes});
});
} catch (Exception e) {
log.debug("Could not query agent metrics: {}", e.getMessage());
}
return result;
}
/** Format an Instant as a ClickHouse DateTime literal. */
private static String lit(Instant instant) {
return "'" + java.time.format.DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss")
.withZone(java.time.ZoneOffset.UTC)
.format(instant.truncatedTo(ChronoUnit.SECONDS)) + "'";
}
}

View File

@@ -48,7 +48,7 @@ public class AppSettingsController {
@GetMapping("/{appId}")
@Operation(summary = "Get settings for a specific application (returns defaults if not configured)")
public ResponseEntity<AppSettings> getByAppId(@PathVariable String appId) {
AppSettings settings = repository.findByAppId(appId).orElse(AppSettings.defaults(appId));
AppSettings settings = repository.findByApplicationId(appId).orElse(AppSettings.defaults(appId));
return ResponseEntity.ok(settings);
}

View File

@@ -1,13 +1,14 @@
package com.cameleer3.server.app.controller;
import com.cameleer3.common.model.ApplicationConfig;
import com.cameleer3.server.app.dto.CommandGroupResponse;
import com.cameleer3.server.app.dto.ConfigUpdateResponse;
import com.cameleer3.server.app.dto.TestExpressionRequest;
import com.cameleer3.server.app.dto.TestExpressionResponse;
import com.cameleer3.server.app.storage.PostgresApplicationConfigRepository;
import com.cameleer3.server.core.admin.AuditCategory;
import com.cameleer3.server.core.admin.AuditResult;
import com.cameleer3.server.core.admin.AuditService;
import com.cameleer3.server.core.agent.AgentCommand;
import com.cameleer3.server.core.agent.AgentInfo;
import com.cameleer3.server.core.agent.AgentRegistryService;
import com.cameleer3.server.core.agent.AgentState;
@@ -27,6 +28,7 @@ import org.springframework.http.ResponseEntity;
import org.springframework.security.core.Authentication;
import org.springframework.web.bind.annotation.*;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
@@ -88,23 +90,25 @@ public class ApplicationConfigController {
@Operation(summary = "Update application config",
description = "Saves config and pushes CONFIG_UPDATE to all LIVE agents of this application")
@ApiResponse(responseCode = "200", description = "Config saved and pushed")
public ResponseEntity<ApplicationConfig> updateConfig(@PathVariable String application,
@RequestBody ApplicationConfig config,
Authentication auth,
HttpServletRequest httpRequest) {
public ResponseEntity<ConfigUpdateResponse> updateConfig(@PathVariable String application,
@RequestBody ApplicationConfig config,
Authentication auth,
HttpServletRequest httpRequest) {
String updatedBy = auth != null ? auth.getName() : "system";
config.setApplication(application);
ApplicationConfig saved = configRepository.save(application, config, updatedBy);
int pushed = pushConfigToAgents(application, saved);
log.info("Config v{} saved for '{}', pushed to {} agent(s)", saved.getVersion(), application, pushed);
CommandGroupResponse pushResult = pushConfigToAgents(application, saved);
log.info("Config v{} saved for '{}', pushed to {} agent(s), {} responded",
saved.getVersion(), application, pushResult.total(), pushResult.responded());
auditService.log("update_app_config", AuditCategory.CONFIG, application,
Map.of("version", saved.getVersion(), "agentsPushed", pushed),
Map.of("version", saved.getVersion(), "agentsPushed", pushResult.total(),
"responded", pushResult.responded(), "timedOut", pushResult.timedOut().size()),
AuditResult.SUCCESS, httpRequest);
return ResponseEntity.ok(saved);
return ResponseEntity.ok(new ConfigUpdateResponse(saved, pushResult));
}
@GetMapping("/{application}/processor-routes")
@@ -125,7 +129,7 @@ public class ApplicationConfigController {
@RequestBody TestExpressionRequest request) {
// Find a LIVE agent for this application
AgentInfo agent = registryService.findAll().stream()
.filter(a -> application.equals(a.application()))
.filter(a -> application.equals(a.applicationId()))
.filter(a -> a.state() == AgentState.LIVE)
.findFirst()
.orElse(null);
@@ -152,7 +156,7 @@ public class ApplicationConfigController {
// Send command and await reply
CompletableFuture<CommandReply> future = registryService.addCommandWithReply(
agent.id(), CommandType.TEST_EXPRESSION, payloadJson);
agent.instanceId(), CommandType.TEST_EXPRESSION, payloadJson);
try {
CommandReply reply = future.orTimeout(5, TimeUnit.SECONDS).join();
@@ -166,30 +170,56 @@ public class ApplicationConfigController {
return ResponseEntity.status(HttpStatus.GATEWAY_TIMEOUT)
.body(new TestExpressionResponse(null, "Agent did not respond within 5 seconds"));
}
log.error("Error awaiting test-expression reply from agent {}", agent.id(), e);
log.error("Error awaiting test-expression reply from agent {}", agent.instanceId(), e);
return ResponseEntity.status(HttpStatus.INTERNAL_SERVER_ERROR)
.body(new TestExpressionResponse(null, "Internal error: " + e.getCause().getMessage()));
}
}
private int pushConfigToAgents(String application, ApplicationConfig config) {
private CommandGroupResponse pushConfigToAgents(String application, ApplicationConfig config) {
String payloadJson;
try {
payloadJson = objectMapper.writeValueAsString(config);
} catch (JsonProcessingException e) {
log.error("Failed to serialize config for push", e);
return 0;
return new CommandGroupResponse(false, 0, 0, List.of(), List.of());
}
List<AgentInfo> agents = registryService.findAll().stream()
.filter(a -> a.state() == AgentState.LIVE)
.filter(a -> application.equals(a.application()))
.toList();
Map<String, CompletableFuture<CommandReply>> futures =
registryService.addGroupCommandWithReplies(application, CommandType.CONFIG_UPDATE, payloadJson);
for (AgentInfo agent : agents) {
registryService.addCommand(agent.id(), CommandType.CONFIG_UPDATE, payloadJson);
if (futures.isEmpty()) {
return new CommandGroupResponse(true, 0, 0, List.of(), List.of());
}
return agents.size();
// Wait with a shared 10-second deadline
long deadline = System.currentTimeMillis() + 10_000;
List<CommandGroupResponse.AgentResponse> responses = new ArrayList<>();
List<String> timedOut = new ArrayList<>();
for (var entry : futures.entrySet()) {
long remaining = deadline - System.currentTimeMillis();
if (remaining <= 0) {
timedOut.add(entry.getKey());
entry.getValue().cancel(false);
continue;
}
try {
CommandReply reply = entry.getValue().get(remaining, TimeUnit.MILLISECONDS);
responses.add(new CommandGroupResponse.AgentResponse(
entry.getKey(), reply.status(), reply.message()));
} catch (TimeoutException e) {
timedOut.add(entry.getKey());
entry.getValue().cancel(false);
} catch (Exception e) {
responses.add(new CommandGroupResponse.AgentResponse(
entry.getKey(), "ERROR", e.getMessage()));
}
}
boolean allSuccess = timedOut.isEmpty() &&
responses.stream().allMatch(r -> "SUCCESS".equals(r.status()));
return new CommandGroupResponse(allSuccess, futures.size(), responses.size(), responses, timedOut);
}
private static ApplicationConfig defaultConfig(String application) {
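The shared deadline above means slow agents draw down one common 10-second budget instead of getting 10 seconds each, so the endpoint's worst-case latency stays bounded regardless of group size. The response DTOs themselves are only implied by this hunk; a minimal sketch of CommandGroupResponse, inferred purely from the constructor calls and accessors used above (component names are assumptions, not the actual source), with ConfigUpdateResponse pairing the saved ApplicationConfig with this push result:

package com.cameleer3.server.app.dto;

import java.util.List;

/** Hypothetical sketch inferred from usage in ApplicationConfigController; the real record may differ. */
public record CommandGroupResponse(
        boolean success,               // true only when every agent replied SUCCESS and none timed out
        int total,                     // commands dispatched, one per LIVE agent in the group
        int responded,                 // replies collected before the shared 10-second deadline
        List<AgentResponse> responses, // per-agent reply status and message
        List<String> timedOut) {       // instance ids whose futures were cancelled at the deadline

    /** Mirrors new AgentResponse(instanceId, status, message) in the controller above. */
    public record AgentResponse(String instanceId, String status, String message) {}
}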

View File

@@ -0,0 +1,70 @@
package com.cameleer3.server.app.controller;
import com.cameleer3.server.core.ingestion.ChunkAccumulator;
import com.cameleer3.common.model.ExecutionChunk;
import com.fasterxml.jackson.core.type.TypeReference;
import com.fasterxml.jackson.databind.DeserializationFeature;
import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.datatype.jsr310.JavaTimeModule;
import io.swagger.v3.oas.annotations.Operation;
import io.swagger.v3.oas.annotations.tags.Tag;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.boot.autoconfigure.condition.ConditionalOnBean;
import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestBody;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RestController;
import java.util.List;
/**
* Ingestion endpoint for execution chunk data (ClickHouse pipeline).
* <p>
* Accepts a single {@link ExecutionChunk} or an array of chunks and feeds
* them into the {@link ChunkAccumulator}. Only active when
* {@code clickhouse.enabled=true} (conditional on the accumulator bean).
*/
@RestController
@RequestMapping("/api/v1/data")
@ConditionalOnBean(ChunkAccumulator.class)
@Tag(name = "Ingestion", description = "Data ingestion endpoints")
public class ChunkIngestionController {
private static final Logger log = LoggerFactory.getLogger(ChunkIngestionController.class);
private final ChunkAccumulator accumulator;
private final ObjectMapper objectMapper;
public ChunkIngestionController(ChunkAccumulator accumulator) {
this.accumulator = accumulator;
this.objectMapper = new ObjectMapper();
this.objectMapper.registerModule(new JavaTimeModule());
this.objectMapper.configure(DeserializationFeature.FAIL_ON_UNKNOWN_PROPERTIES, false);
}
@PostMapping("/executions")
@Operation(summary = "Ingest execution chunk")
public ResponseEntity<Void> ingestChunks(@RequestBody String body) {
try {
String trimmed = body.strip();
List<ExecutionChunk> chunks;
if (trimmed.startsWith("[")) {
chunks = objectMapper.readValue(trimmed, new TypeReference<List<ExecutionChunk>>() {});
} else {
ExecutionChunk single = objectMapper.readValue(trimmed, ExecutionChunk.class);
chunks = List.of(single);
}
for (ExecutionChunk chunk : chunks) {
accumulator.onChunk(chunk);
}
return ResponseEntity.accepted().build();
} catch (Exception e) {
log.warn("Failed to parse execution chunk payload: {}", e.getMessage());
return ResponseEntity.badRequest().build();
}
}
}
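A minimal client sketch for the endpoint above, assuming a server on localhost:8080 and bearer-token agent auth (both assumptions, as is the chunk JSON, which is illustrative rather than the real ExecutionChunk schema):

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class ChunkIngestionClientSketch {
    public static void main(String[] args) throws Exception {
        // Either a single object or a JSON array is accepted; the controller
        // sniffs the leading '[' to decide which way to deserialize.
        String json = "[{\"executionId\":\"abc-123\",\"routeId\":\"route-1\"}]";
        HttpRequest request = HttpRequest.newBuilder(
                        URI.create("http://localhost:8080/api/v1/data/executions"))
                .header("Content-Type", "application/json")
                .header("Authorization", "Bearer <agent-token>") // placeholder credential
                .POST(HttpRequest.BodyPublishers.ofString(json))
                .build();
        HttpResponse<Void> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.discarding());
        System.out.println(response.statusCode()); // 202 when accepted, 400 on parse failure
    }
}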

View File

@@ -0,0 +1,165 @@
package com.cameleer3.server.app.controller;
import com.cameleer3.server.app.dto.ClickHousePerformanceResponse;
import com.cameleer3.server.app.dto.ClickHouseQueryInfo;
import com.cameleer3.server.app.dto.ClickHouseStatusResponse;
import com.cameleer3.server.app.dto.ClickHouseTableInfo;
import com.cameleer3.server.app.dto.IndexerPipelineResponse;
import com.cameleer3.server.core.indexing.SearchIndexerStats;
import io.swagger.v3.oas.annotations.Operation;
import io.swagger.v3.oas.annotations.tags.Tag;
import org.springframework.beans.factory.annotation.Qualifier;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.jdbc.core.JdbcTemplate;
import org.springframework.security.access.prepost.PreAuthorize;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RestController;
import java.util.List;
@RestController
@RequestMapping("/api/v1/admin/clickhouse")
@PreAuthorize("hasRole('ADMIN')")
@Tag(name = "ClickHouse Admin", description = "ClickHouse monitoring and diagnostics (ADMIN only)")
public class ClickHouseAdminController {
private final JdbcTemplate clickHouseJdbc;
private final SearchIndexerStats indexerStats;
private final String clickHouseUrl;
public ClickHouseAdminController(
@Qualifier("clickHouseJdbcTemplate") JdbcTemplate clickHouseJdbc,
SearchIndexerStats indexerStats,
@Value("${clickhouse.url:}") String clickHouseUrl) {
this.clickHouseJdbc = clickHouseJdbc;
this.indexerStats = indexerStats;
this.clickHouseUrl = clickHouseUrl;
}
@GetMapping("/status")
@Operation(summary = "ClickHouse cluster status")
public ClickHouseStatusResponse getStatus() {
try {
var row = clickHouseJdbc.queryForMap(
"SELECT version() AS version, formatReadableTimeDelta(uptime()) AS uptime");
return new ClickHouseStatusResponse(true,
(String) row.get("version"),
(String) row.get("uptime"),
clickHouseUrl);
} catch (Exception e) {
return new ClickHouseStatusResponse(false, null, null, clickHouseUrl);
}
}
@GetMapping("/tables")
@Operation(summary = "List ClickHouse tables with sizes")
public List<ClickHouseTableInfo> getTables() {
return clickHouseJdbc.query("""
SELECT t.name, t.engine,
t.total_rows AS row_count,
formatReadableSize(t.total_bytes) AS data_size,
t.total_bytes AS data_size_bytes,
ifNull(p.partition_count, 0) AS partition_count
FROM system.tables t
LEFT JOIN (
SELECT table, countDistinct(partition) AS partition_count
FROM system.parts
WHERE database = currentDatabase() AND active
GROUP BY table
) p ON t.name = p.table
WHERE t.database = currentDatabase()
ORDER BY t.total_bytes DESC NULLS LAST
""",
(rs, rowNum) -> new ClickHouseTableInfo(
rs.getString("name"),
rs.getString("engine"),
rs.getLong("row_count"),
rs.getString("data_size"),
rs.getLong("data_size_bytes"),
rs.getInt("partition_count")));
}
@GetMapping("/performance")
@Operation(summary = "ClickHouse storage and performance metrics")
public ClickHousePerformanceResponse getPerformance() {
try {
var row = clickHouseJdbc.queryForMap("""
SELECT
formatReadableSize(sum(bytes_on_disk)) AS disk_size,
formatReadableSize(sum(data_uncompressed_bytes)) AS uncompressed_size,
if(sum(data_uncompressed_bytes) > 0,
round(sum(bytes_on_disk) / sum(data_uncompressed_bytes), 3), 0) AS compression_ratio,
sum(rows) AS total_rows,
count() AS part_count
FROM system.parts
WHERE database = currentDatabase() AND active
""");
String memory = "N/A";
try {
memory = clickHouseJdbc.queryForObject(
"SELECT formatReadableSize(value) FROM system.metrics WHERE metric = 'MemoryTracking'",
String.class);
} catch (Exception ignored) {}
int currentQueries = 0;
try {
Integer q = clickHouseJdbc.queryForObject(
"SELECT toInt32(value) FROM system.metrics WHERE metric = 'Query'",
Integer.class);
if (q != null) currentQueries = q;
} catch (Exception ignored) {}
return new ClickHousePerformanceResponse(
(String) row.get("disk_size"),
(String) row.get("uncompressed_size"),
((Number) row.get("compression_ratio")).doubleValue(),
((Number) row.get("total_rows")).longValue(),
((Number) row.get("part_count")).intValue(),
memory != null ? memory : "N/A",
currentQueries);
} catch (Exception e) {
return new ClickHousePerformanceResponse("N/A", "N/A", 0, 0, 0, "N/A", 0);
}
}
@GetMapping("/queries")
@Operation(summary = "Active ClickHouse queries")
public List<ClickHouseQueryInfo> getQueries() {
try {
return clickHouseJdbc.query("""
SELECT
query_id,
round(elapsed, 2) AS elapsed_seconds,
formatReadableSize(memory_usage) AS memory,
read_rows,
substring(query, 1, 200) AS query
FROM system.processes
WHERE is_initial_query = 1
ORDER BY elapsed DESC
""",
(rs, rowNum) -> new ClickHouseQueryInfo(
rs.getString("query_id"),
rs.getDouble("elapsed_seconds"),
rs.getString("memory"),
rs.getLong("read_rows"),
rs.getString("query")));
} catch (Exception e) {
return List.of();
}
}
@GetMapping("/pipeline")
@Operation(summary = "Search indexer pipeline statistics")
public IndexerPipelineResponse getPipeline() {
return new IndexerPipelineResponse(
indexerStats.getQueueDepth(),
indexerStats.getMaxQueueSize(),
indexerStats.getFailedCount(),
indexerStats.getIndexedCount(),
indexerStats.getDebounceMs(),
indexerStats.getIndexingRate(),
indexerStats.getLastIndexedAt());
}
}
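A note on getPerformance(): compression_ratio is sum(bytes_on_disk) divided by sum(data_uncompressed_bytes), so lower is better. For example, 1.2 GiB on disk holding 10 GiB of uncompressed data yields round(1.2 / 10, 3) = 0.12, i.e. roughly 8x compression.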

View File

@@ -7,7 +7,6 @@ import com.cameleer3.server.app.dto.TableSizeResponse;
import com.cameleer3.server.core.admin.AuditCategory;
import com.cameleer3.server.core.admin.AuditResult;
import com.cameleer3.server.core.admin.AuditService;
import com.cameleer3.server.core.ingestion.IngestionService;
import com.zaxxer.hikari.HikariDataSource;
import com.zaxxer.hikari.HikariPoolMXBean;
import io.swagger.v3.oas.annotations.Operation;
@@ -25,9 +24,7 @@ import org.springframework.web.bind.annotation.RestController;
import org.springframework.web.server.ResponseStatusException;
import javax.sql.DataSource;
import java.time.Instant;
import java.util.List;
import java.util.Map;
@RestController
@RequestMapping("/api/v1/admin/database")
@@ -38,14 +35,12 @@ public class DatabaseAdminController {
private final JdbcTemplate jdbc;
private final DataSource dataSource;
private final AuditService auditService;
private final IngestionService ingestionService;
public DatabaseAdminController(JdbcTemplate jdbc, DataSource dataSource,
AuditService auditService, IngestionService ingestionService) {
AuditService auditService) {
this.jdbc = jdbc;
this.dataSource = dataSource;
this.auditService = auditService;
this.ingestionService = ingestionService;
}
@GetMapping("/status")
@@ -53,14 +48,12 @@ public class DatabaseAdminController {
public ResponseEntity<DatabaseStatusResponse> getStatus() {
try {
String version = jdbc.queryForObject("SELECT version()", String.class);
boolean timescaleDb = Boolean.TRUE.equals(
jdbc.queryForObject("SELECT EXISTS(SELECT 1 FROM pg_extension WHERE extname = 'timescaledb')", Boolean.class));
String schema = jdbc.queryForObject("SELECT current_schema()", String.class);
String host = extractHost(dataSource);
return ResponseEntity.ok(new DatabaseStatusResponse(true, version, host, schema, timescaleDb));
return ResponseEntity.ok(new DatabaseStatusResponse(true, version, host, schema));
} catch (Exception e) {
return ResponseEntity.status(HttpStatus.SERVICE_UNAVAILABLE)
.body(new DatabaseStatusResponse(false, null, null, null, false));
.body(new DatabaseStatusResponse(false, null, null, null));
}
}
@@ -124,29 +117,6 @@ public class DatabaseAdminController {
return ResponseEntity.ok().build();
}
@GetMapping("/metrics-pipeline")
@Operation(summary = "Get metrics ingestion pipeline diagnostics")
public ResponseEntity<Map<String, Object>> getMetricsPipeline() {
int bufferDepth = ingestionService.getMetricsBufferDepth();
Long totalRows = jdbc.queryForObject(
"SELECT count(*) FROM agent_metrics", Long.class);
List<String> agentIds = jdbc.queryForList(
"SELECT DISTINCT agent_id FROM agent_metrics ORDER BY agent_id", String.class);
Instant latestCollected = jdbc.queryForObject(
"SELECT max(collected_at) FROM agent_metrics", Instant.class);
List<String> metricNames = jdbc.queryForList(
"SELECT DISTINCT metric_name FROM agent_metrics ORDER BY metric_name", String.class);
return ResponseEntity.ok(Map.of(
"bufferDepth", bufferDepth,
"totalRows", totalRows != null ? totalRows : 0,
"distinctAgents", agentIds,
"distinctMetrics", metricNames,
"latestCollectedAt", latestCollected != null ? latestCollected.toString() : "none"
));
}
private String extractHost(DataSource ds) {
try {
if (ds instanceof HikariDataSource hds) {

View File

@@ -81,4 +81,16 @@ public class DetailController {
.map(ResponseEntity::ok)
.orElse(ResponseEntity.notFound().build());
}
@GetMapping("/{executionId}/processors/by-seq/{seq}/snapshot")
@Operation(summary = "Get exchange snapshot for a processor by seq number")
@ApiResponse(responseCode = "200", description = "Snapshot data")
@ApiResponse(responseCode = "404", description = "Snapshot not found")
public ResponseEntity<Map<String, String>> processorSnapshotBySeq(
@PathVariable String executionId,
@PathVariable int seq) {
return detailService.getProcessorSnapshotBySeq(executionId, seq)
.map(ResponseEntity::ok)
.orElse(ResponseEntity.notFound().build());
}
}

View File

@@ -53,12 +53,12 @@ public class DiagramController {
description = "Accepts a single RouteGraph or an array of RouteGraphs")
@ApiResponse(responseCode = "202", description = "Data accepted for processing")
public ResponseEntity<Void> ingestDiagrams(@RequestBody String body) throws JsonProcessingException {
String agentId = extractAgentId();
String applicationName = resolveApplicationName(agentId);
String instanceId = extractAgentId();
String applicationId = resolveApplicationId(instanceId);
List<RouteGraph> graphs = parsePayload(body);
for (RouteGraph graph : graphs) {
ingestionService.ingestDiagram(new TaggedDiagram(agentId, applicationName, graph));
ingestionService.ingestDiagram(new TaggedDiagram(instanceId, applicationId, graph));
}
return ResponseEntity.accepted().build();
@@ -69,9 +69,9 @@ public class DiagramController {
return auth != null ? auth.getName() : "";
}
private String resolveApplicationName(String agentId) {
AgentInfo agent = registryService.findById(agentId);
return agent != null ? agent.application() : "";
private String resolveApplicationId(String instanceId) {
AgentInfo agent = registryService.findById(instanceId);
return agent != null ? agent.applicationId() : "";
}
private List<RouteGraph> parsePayload(String body) throws JsonProcessingException {

View File

@@ -100,7 +100,7 @@ public class DiagramRenderController {
@RequestParam String routeId,
@RequestParam(defaultValue = "LR") String direction) {
List<String> agentIds = registryService.findByApplication(application).stream()
.map(AgentInfo::id)
.map(AgentInfo::instanceId)
.toList();
if (agentIds.isEmpty()) {

View File

@@ -0,0 +1,88 @@
package com.cameleer3.server.app.controller;
import com.cameleer3.common.model.AgentEvent;
import com.cameleer3.server.core.agent.AgentEventService;
import com.cameleer3.server.core.agent.AgentInfo;
import com.cameleer3.server.core.agent.AgentRegistryService;
import com.fasterxml.jackson.core.type.TypeReference;
import com.fasterxml.jackson.databind.ObjectMapper;
import io.swagger.v3.oas.annotations.Operation;
import io.swagger.v3.oas.annotations.tags.Tag;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.http.ResponseEntity;
import org.springframework.security.core.Authentication;
import org.springframework.security.core.context.SecurityContextHolder;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestBody;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RestController;
import java.util.List;
/**
* Ingestion endpoint for agent lifecycle events.
* <p>
* Agents emit lifecycle events (AGENT_STARTED, AGENT_STOPPED, etc.), which are
* stored in the event log. AGENT_STOPPED triggers a graceful shutdown
* transition in the registry.
*/
@RestController
@RequestMapping("/api/v1/data")
@Tag(name = "Ingestion", description = "Data ingestion endpoints")
public class EventIngestionController {
private static final Logger log = LoggerFactory.getLogger(EventIngestionController.class);
private final AgentEventService agentEventService;
private final AgentRegistryService registryService;
private final ObjectMapper objectMapper;
public EventIngestionController(AgentEventService agentEventService,
AgentRegistryService registryService,
ObjectMapper objectMapper) {
this.agentEventService = agentEventService;
this.registryService = registryService;
this.objectMapper = objectMapper;
}
@PostMapping("/events")
@Operation(summary = "Ingest agent events")
public ResponseEntity<Void> ingestEvents(@RequestBody String body) {
String instanceId = extractInstanceId();
List<AgentEvent> events;
try {
String trimmed = body.strip();
if (trimmed.startsWith("[")) {
events = objectMapper.readValue(trimmed, new TypeReference<List<AgentEvent>>() {});
} else {
events = List.of(objectMapper.readValue(trimmed, AgentEvent.class));
}
} catch (Exception e) {
log.warn("Failed to parse event payload: {}", e.getMessage());
return ResponseEntity.badRequest().build();
}
AgentInfo agent = registryService.findById(instanceId);
String applicationId = agent != null ? agent.applicationId() : "";
for (AgentEvent event : events) {
agentEventService.recordEvent(instanceId, applicationId,
event.getEventType(),
event.getDetails() != null ? event.getDetails().toString() : null);
if ("AGENT_STOPPED".equals(event.getEventType())) {
log.info("Agent {} reported graceful shutdown", instanceId);
registryService.shutdown(instanceId);
}
}
return ResponseEntity.accepted().build();
}
private String extractInstanceId() {
Authentication auth = SecurityContextHolder.getContext().getAuthentication();
return auth != null ? auth.getName() : "";
}
}
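As with chunk ingestion, a leading '[' switches the parser to batch mode. A sketch of payloads that exercise both paths; only eventType and details are grounded in the getters the controller calls, the field values are illustrative:

public class AgentEventPayloadSketch {
    public static void main(String[] args) {
        // Single event; recorded against the authenticated instance id.
        String single = """
                {"eventType": "AGENT_STARTED", "details": {"version": "1.0.0"}}""";
        // Batch; the AGENT_STOPPED entry additionally triggers
        // registryService.shutdown(instanceId) for a graceful transition.
        String batch = """
                [{"eventType": "AGENT_STARTED"},
                 {"eventType": "AGENT_STOPPED", "details": {"reason": "SIGTERM"}}]""";
        System.out.println(single);
        System.out.println(batch); // POST either form to /api/v1/data/events
    }
}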

View File

@@ -3,6 +3,7 @@ package com.cameleer3.server.app.controller;
import com.cameleer3.common.model.RouteExecution;
import com.cameleer3.server.core.agent.AgentInfo;
import com.cameleer3.server.core.agent.AgentRegistryService;
import com.cameleer3.server.core.ingestion.ChunkAccumulator;
import com.cameleer3.server.core.ingestion.IngestionService;
import com.fasterxml.jackson.core.JsonProcessingException;
import com.fasterxml.jackson.core.type.TypeReference;
@@ -12,6 +13,7 @@ import io.swagger.v3.oas.annotations.responses.ApiResponse;
import io.swagger.v3.oas.annotations.tags.Tag;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.boot.autoconfigure.condition.ConditionalOnMissingBean;
import org.springframework.http.ResponseEntity;
import org.springframework.security.core.Authentication;
import org.springframework.security.core.context.SecurityContextHolder;
@@ -23,13 +25,17 @@ import org.springframework.web.bind.annotation.RestController;
import java.util.List;
/**
* Ingestion endpoint for route execution data.
* Legacy ingestion endpoint for route execution data (PostgreSQL path).
* <p>
* Accepts both a single {@link RouteExecution} and an array. Data is written
* synchronously to PostgreSQL via {@link IngestionService}.
* <p>
* Only active when ClickHouse is disabled — when ClickHouse is enabled,
* {@link ChunkIngestionController} takes over the {@code /executions} mapping.
*/
@RestController
@RequestMapping("/api/v1/data")
@ConditionalOnMissingBean(ChunkAccumulator.class)
@Tag(name = "Ingestion", description = "Data ingestion endpoints")
public class ExecutionController {
@@ -52,12 +58,12 @@ public class ExecutionController {
description = "Accepts a single RouteExecution or an array of RouteExecutions")
@ApiResponse(responseCode = "202", description = "Data accepted for processing")
public ResponseEntity<Void> ingestExecutions(@RequestBody String body) throws JsonProcessingException {
String agentId = extractAgentId();
String applicationName = resolveApplicationName(agentId);
String instanceId = extractAgentId();
String applicationId = resolveApplicationId(instanceId);
List<RouteExecution> executions = parsePayload(body);
for (RouteExecution execution : executions) {
ingestionService.ingestExecution(agentId, applicationName, execution);
ingestionService.ingestExecution(instanceId, applicationId, execution);
}
return ResponseEntity.accepted().build();
@@ -68,9 +74,9 @@ public class ExecutionController {
return auth != null ? auth.getName() : "";
}
private String resolveApplicationName(String agentId) {
AgentInfo agent = registryService.findById(agentId);
return agent != null ? agent.application() : "";
private String resolveApplicationId(String instanceId) {
AgentInfo agent = registryService.findById(instanceId);
return agent != null ? agent.applicationId() : "";
}
private List<RouteExecution> parsePayload(String body) throws JsonProcessingException {

View File

@@ -1,7 +1,7 @@
package com.cameleer3.server.app.controller;
import com.cameleer3.common.model.LogBatch;
import com.cameleer3.server.app.search.OpenSearchLogIndex;
import com.cameleer3.server.core.storage.LogIndex;
import com.cameleer3.server.core.agent.AgentInfo;
import com.cameleer3.server.core.agent.AgentRegistryService;
import io.swagger.v3.oas.annotations.Operation;
@@ -24,10 +24,10 @@ public class LogIngestionController {
private static final Logger log = LoggerFactory.getLogger(LogIngestionController.class);
private final OpenSearchLogIndex logIndex;
private final LogIndex logIndex;
private final AgentRegistryService registryService;
public LogIngestionController(OpenSearchLogIndex logIndex,
public LogIngestionController(LogIndex logIndex,
AgentRegistryService registryService) {
this.logIndex = logIndex;
this.registryService = registryService;
@@ -35,15 +35,15 @@ public class LogIngestionController {
@PostMapping("/logs")
@Operation(summary = "Ingest application log entries",
description = "Accepts a batch of log entries from an agent. Entries are indexed in OpenSearch.")
description = "Accepts a batch of log entries from an agent. Entries are stored in the configured log store.")
@ApiResponse(responseCode = "202", description = "Logs accepted for indexing")
public ResponseEntity<Void> ingestLogs(@RequestBody LogBatch batch) {
String agentId = extractAgentId();
String application = resolveApplicationName(agentId);
String instanceId = extractAgentId();
String applicationId = resolveApplicationId(instanceId);
if (batch.getEntries() != null && !batch.getEntries().isEmpty()) {
log.debug("Received {} log entries from agent={}, app={}", batch.getEntries().size(), agentId, application);
logIndex.indexBatch(agentId, application, batch.getEntries());
log.debug("Received {} log entries from instance={}, app={}", batch.getEntries().size(), instanceId, applicationId);
logIndex.indexBatch(instanceId, applicationId, batch.getEntries());
}
return ResponseEntity.accepted().build();
@@ -54,8 +54,8 @@ public class LogIngestionController {
return auth != null ? auth.getName() : "";
}
private String resolveApplicationName(String agentId) {
AgentInfo agent = registryService.findById(agentId);
return agent != null ? agent.application() : "";
private String resolveApplicationId(String instanceId) {
AgentInfo agent = registryService.findById(instanceId);
return agent != null ? agent.applicationId() : "";
}
}

View File

@@ -1,7 +1,10 @@
package com.cameleer3.server.app.controller;
import com.cameleer3.server.app.dto.LogEntryResponse;
import com.cameleer3.server.app.search.OpenSearchLogIndex;
import com.cameleer3.server.app.dto.LogSearchPageResponse;
import com.cameleer3.server.core.search.LogSearchRequest;
import com.cameleer3.server.core.search.LogSearchResponse;
import com.cameleer3.server.core.storage.LogIndex;
import io.swagger.v3.oas.annotations.Operation;
import io.swagger.v3.oas.annotations.tags.Tag;
import org.springframework.http.ResponseEntity;
@@ -11,40 +14,68 @@ import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;
import java.time.Instant;
import java.util.Arrays;
import java.util.List;
@RestController
@RequestMapping("/api/v1/logs")
@Tag(name = "Application Logs", description = "Query application logs stored in OpenSearch")
@Tag(name = "Application Logs", description = "Query application logs")
public class LogQueryController {
private final OpenSearchLogIndex logIndex;
private final LogIndex logIndex;
public LogQueryController(OpenSearchLogIndex logIndex) {
public LogQueryController(LogIndex logIndex) {
this.logIndex = logIndex;
}
@GetMapping
@Operation(summary = "Search application log entries",
description = "Returns log entries for a given application, optionally filtered by agent, level, time range, and text query")
public ResponseEntity<List<LogEntryResponse>> searchLogs(
@RequestParam String application,
@RequestParam(required = false) String agentId,
@RequestParam(required = false) String level,
description = "Returns log entries with cursor-based pagination and level count aggregation. " +
"Supports free-text search, multi-level filtering, and optional application scoping.")
public ResponseEntity<LogSearchPageResponse> searchLogs(
@RequestParam(required = false) String q,
@RequestParam(required = false) String query,
@RequestParam(required = false) String level,
@RequestParam(required = false) String application,
@RequestParam(name = "agentId", required = false) String instanceId,
@RequestParam(required = false) String exchangeId,
@RequestParam(required = false) String logger,
@RequestParam(required = false) String from,
@RequestParam(required = false) String to,
@RequestParam(defaultValue = "200") int limit) {
@RequestParam(required = false) String cursor,
@RequestParam(defaultValue = "100") int limit,
@RequestParam(defaultValue = "desc") String sort) {
limit = Math.min(limit, 1000);
// q takes precedence over the deprecated 'query' param
String searchText = q != null ? q : query;
// Parse CSV levels
List<String> levels = List.of();
if (level != null && !level.isEmpty()) {
levels = Arrays.stream(level.split(","))
.map(String::trim)
.filter(s -> !s.isEmpty())
.toList();
}
Instant fromInstant = from != null ? Instant.parse(from) : null;
Instant toInstant = to != null ? Instant.parse(to) : null;
List<LogEntryResponse> entries = logIndex.search(
application, agentId, level, query, exchangeId, fromInstant, toInstant, limit);
LogSearchRequest request = new LogSearchRequest(
searchText, levels, application, instanceId, exchangeId,
logger, fromInstant, toInstant, cursor, limit, sort);
return ResponseEntity.ok(entries);
LogSearchResponse result = logIndex.search(request);
List<LogEntryResponse> entries = result.data().stream()
.map(r -> new LogEntryResponse(
r.timestamp(), r.level(), r.loggerName(),
r.message(), r.threadName(), r.stackTrace(),
r.exchangeId(), r.instanceId(), r.application(),
r.mdc()))
.toList();
return ResponseEntity.ok(new LogSearchPageResponse(
entries, result.nextCursor(), result.hasMore(), result.levelCounts()));
}
}
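A sketch of a client paging through the cursor-based response, assuming a local server without auth and that the serialized property names follow the constructor arguments of LogSearchPageResponse above (all assumptions):

import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class LogPagingSketch {
    public static void main(String[] args) throws Exception {
        HttpClient http = HttpClient.newHttpClient();
        ObjectMapper mapper = new ObjectMapper();
        String base = "http://localhost:8080/api/v1/logs?q=timeout&level=WARN,ERROR&limit=100";
        String cursor = null;
        do {
            String url = cursor == null ? base : base + "&cursor=" + cursor; // cursor assumed URL-safe
            HttpResponse<String> resp = http.send(
                    HttpRequest.newBuilder(URI.create(url)).GET().build(),
                    HttpResponse.BodyHandlers.ofString());
            JsonNode page = mapper.readTree(resp.body());
            page.path("entries").forEach(e -> System.out.println(e.path("message").asText()));
            cursor = page.path("hasMore").asBoolean(false)
                    ? page.path("nextCursor").asText(null) : null;
        } while (cursor != null);
    }
}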

View File

@@ -1,266 +0,0 @@
package com.cameleer3.server.app.controller;
import com.cameleer3.server.app.dto.IndexInfoResponse;
import com.cameleer3.server.app.dto.IndicesPageResponse;
import com.cameleer3.server.app.dto.OpenSearchStatusResponse;
import com.cameleer3.server.app.dto.PerformanceResponse;
import com.cameleer3.server.app.dto.PipelineStatsResponse;
import com.cameleer3.server.core.admin.AuditCategory;
import com.cameleer3.server.core.admin.AuditResult;
import com.cameleer3.server.core.admin.AuditService;
import com.cameleer3.server.core.indexing.SearchIndexerStats;
import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;
import io.swagger.v3.oas.annotations.Operation;
import io.swagger.v3.oas.annotations.tags.Tag;
import jakarta.servlet.http.HttpServletRequest;
import org.opensearch.client.Request;
import org.opensearch.client.Response;
import org.opensearch.client.RestClient;
import org.opensearch.client.opensearch.OpenSearchClient;
import org.opensearch.client.opensearch.cluster.HealthResponse;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.http.HttpStatus;
import org.springframework.http.ResponseEntity;
import org.springframework.security.access.prepost.PreAuthorize;
import org.springframework.web.bind.annotation.DeleteMapping;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.PathVariable;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;
import org.springframework.web.server.ResponseStatusException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
@RestController
@RequestMapping("/api/v1/admin/opensearch")
@PreAuthorize("hasRole('ADMIN')")
@Tag(name = "OpenSearch Admin", description = "OpenSearch monitoring and management (ADMIN only)")
public class OpenSearchAdminController {
private final OpenSearchClient client;
private final RestClient restClient;
private final SearchIndexerStats indexerStats;
private final AuditService auditService;
private final ObjectMapper objectMapper;
private final String opensearchUrl;
private final String indexPrefix;
private final String logIndexPrefix;
public OpenSearchAdminController(OpenSearchClient client, RestClient restClient,
SearchIndexerStats indexerStats, AuditService auditService,
ObjectMapper objectMapper,
@Value("${opensearch.url:http://localhost:9200}") String opensearchUrl,
@Value("${opensearch.index-prefix:executions-}") String indexPrefix,
@Value("${opensearch.log-index-prefix:logs-}") String logIndexPrefix) {
this.client = client;
this.restClient = restClient;
this.indexerStats = indexerStats;
this.auditService = auditService;
this.objectMapper = objectMapper;
this.opensearchUrl = opensearchUrl;
this.indexPrefix = indexPrefix;
this.logIndexPrefix = logIndexPrefix;
}
@GetMapping("/status")
@Operation(summary = "Get OpenSearch cluster status and version")
public ResponseEntity<OpenSearchStatusResponse> getStatus() {
try {
HealthResponse health = client.cluster().health();
String version = client.info().version().number();
return ResponseEntity.ok(new OpenSearchStatusResponse(
true,
health.status().name(),
version,
health.numberOfNodes(),
opensearchUrl));
} catch (Exception e) {
return ResponseEntity.status(HttpStatus.SERVICE_UNAVAILABLE)
.body(new OpenSearchStatusResponse(
false, "UNREACHABLE", null, 0, opensearchUrl));
}
}
@GetMapping("/pipeline")
@Operation(summary = "Get indexing pipeline statistics")
public ResponseEntity<PipelineStatsResponse> getPipeline() {
return ResponseEntity.ok(new PipelineStatsResponse(
indexerStats.getQueueDepth(),
indexerStats.getMaxQueueSize(),
indexerStats.getFailedCount(),
indexerStats.getIndexedCount(),
indexerStats.getDebounceMs(),
indexerStats.getIndexingRate(),
indexerStats.getLastIndexedAt()));
}
@GetMapping("/indices")
@Operation(summary = "Get OpenSearch indices with pagination")
public ResponseEntity<IndicesPageResponse> getIndices(
@RequestParam(defaultValue = "0") int page,
@RequestParam(defaultValue = "20") int size,
@RequestParam(defaultValue = "") String search,
@RequestParam(defaultValue = "executions") String prefix) {
try {
Response response = restClient.performRequest(
new Request("GET", "/_cat/indices?format=json&h=index,health,docs.count,store.size,pri,rep&bytes=b"));
JsonNode indices;
try (InputStream is = response.getEntity().getContent()) {
indices = objectMapper.readTree(is);
}
String filterPrefix = "logs".equals(prefix) ? logIndexPrefix : indexPrefix;
List<IndexInfoResponse> allIndices = new ArrayList<>();
for (JsonNode idx : indices) {
String name = idx.path("index").asText("");
if (!name.startsWith(filterPrefix)) {
continue;
}
if (!search.isEmpty() && !name.contains(search)) {
continue;
}
allIndices.add(new IndexInfoResponse(
name,
parseLong(idx.path("docs.count").asText("0")),
humanSize(parseLong(idx.path("store.size").asText("0"))),
parseLong(idx.path("store.size").asText("0")),
idx.path("health").asText("unknown"),
parseInt(idx.path("pri").asText("0")),
parseInt(idx.path("rep").asText("0"))));
}
allIndices.sort(Comparator.comparing(IndexInfoResponse::name));
long totalDocs = allIndices.stream().mapToLong(IndexInfoResponse::docCount).sum();
long totalBytes = allIndices.stream().mapToLong(IndexInfoResponse::sizeBytes).sum();
int totalIndices = allIndices.size();
int totalPages = Math.max(1, (int) Math.ceil((double) totalIndices / size));
int fromIndex = Math.min(page * size, totalIndices);
int toIndex = Math.min(fromIndex + size, totalIndices);
List<IndexInfoResponse> pageItems = allIndices.subList(fromIndex, toIndex);
return ResponseEntity.ok(new IndicesPageResponse(
pageItems, totalIndices, totalDocs,
humanSize(totalBytes), page, size, totalPages));
} catch (Exception e) {
return ResponseEntity.status(HttpStatus.BAD_GATEWAY)
.body(new IndicesPageResponse(
List.of(), 0, 0, "0 B", page, size, 0));
}
}
@DeleteMapping("/indices/{name}")
@Operation(summary = "Delete an OpenSearch index")
public ResponseEntity<Void> deleteIndex(@PathVariable String name, HttpServletRequest request) {
try {
if (!name.startsWith(indexPrefix) && !name.startsWith(logIndexPrefix)) {
throw new ResponseStatusException(HttpStatus.FORBIDDEN, "Cannot delete index outside application scope");
}
boolean exists = client.indices().exists(r -> r.index(name)).value();
if (!exists) {
throw new ResponseStatusException(HttpStatus.NOT_FOUND, "Index not found: " + name);
}
client.indices().delete(r -> r.index(name));
auditService.log("delete_index", AuditCategory.INFRA, name, null, AuditResult.SUCCESS, request);
return ResponseEntity.ok().build();
} catch (ResponseStatusException e) {
throw e;
} catch (Exception e) {
throw new ResponseStatusException(HttpStatus.INTERNAL_SERVER_ERROR, "Failed to delete index: " + e.getMessage());
}
}
@GetMapping("/performance")
@Operation(summary = "Get OpenSearch performance metrics")
public ResponseEntity<PerformanceResponse> getPerformance() {
try {
Response response = restClient.performRequest(
new Request("GET", "/_nodes/stats/jvm,indices"));
JsonNode root;
try (InputStream is = response.getEntity().getContent()) {
root = objectMapper.readTree(is);
}
JsonNode nodes = root.path("nodes");
long heapUsed = 0, heapMax = 0;
long queryCacheHits = 0, queryCacheMisses = 0;
long requestCacheHits = 0, requestCacheMisses = 0;
long searchQueryTotal = 0, searchQueryTimeMs = 0;
long indexTotal = 0, indexTimeMs = 0;
var it = nodes.fields();
while (it.hasNext()) {
var entry = it.next();
JsonNode node = entry.getValue();
JsonNode jvm = node.path("jvm").path("mem");
heapUsed += jvm.path("heap_used_in_bytes").asLong(0);
heapMax += jvm.path("heap_max_in_bytes").asLong(0);
JsonNode indicesNode = node.path("indices");
JsonNode queryCache = indicesNode.path("query_cache");
queryCacheHits += queryCache.path("hit_count").asLong(0);
queryCacheMisses += queryCache.path("miss_count").asLong(0);
JsonNode requestCache = indicesNode.path("request_cache");
requestCacheHits += requestCache.path("hit_count").asLong(0);
requestCacheMisses += requestCache.path("miss_count").asLong(0);
JsonNode searchNode = indicesNode.path("search");
searchQueryTotal += searchNode.path("query_total").asLong(0);
searchQueryTimeMs += searchNode.path("query_time_in_millis").asLong(0);
JsonNode indexing = indicesNode.path("indexing");
indexTotal += indexing.path("index_total").asLong(0);
indexTimeMs += indexing.path("index_time_in_millis").asLong(0);
}
double queryCacheHitRate = (queryCacheHits + queryCacheMisses) > 0
? (double) queryCacheHits / (queryCacheHits + queryCacheMisses) : 0.0;
double requestCacheHitRate = (requestCacheHits + requestCacheMisses) > 0
? (double) requestCacheHits / (requestCacheHits + requestCacheMisses) : 0.0;
double searchLatency = searchQueryTotal > 0
? (double) searchQueryTimeMs / searchQueryTotal : 0.0;
double indexingLatency = indexTotal > 0
? (double) indexTimeMs / indexTotal : 0.0;
return ResponseEntity.ok(new PerformanceResponse(
queryCacheHitRate, requestCacheHitRate,
searchLatency, indexingLatency,
heapUsed, heapMax));
} catch (Exception e) {
return ResponseEntity.status(HttpStatus.BAD_GATEWAY)
.body(new PerformanceResponse(0, 0, 0, 0, 0, 0));
}
}
private static long parseLong(String s) {
try {
return Long.parseLong(s);
} catch (NumberFormatException e) {
return 0;
}
}
private static int parseInt(String s) {
try {
return Integer.parseInt(s);
} catch (NumberFormatException e) {
return 0;
}
}
private static String humanSize(long bytes) {
if (bytes < 1024) return bytes + " B";
if (bytes < 1024 * 1024) return String.format("%.1f KB", bytes / 1024.0);
if (bytes < 1024 * 1024 * 1024) return String.format("%.1f MB", bytes / (1024.0 * 1024));
return String.format("%.1f GB", bytes / (1024.0 * 1024 * 1024));
}
}

View File

@@ -7,6 +7,7 @@ import com.cameleer3.common.graph.RouteGraph;
import com.cameleer3.server.core.agent.AgentInfo;
import com.cameleer3.server.core.agent.AgentRegistryService;
import com.cameleer3.server.core.agent.AgentState;
import com.cameleer3.server.core.agent.RouteStateRegistry;
import com.cameleer3.server.core.storage.DiagramStore;
import com.cameleer3.server.core.storage.StatsStore;
import io.swagger.v3.oas.annotations.Operation;
@@ -35,16 +36,21 @@ import java.util.stream.Collectors;
@Tag(name = "Route Catalog", description = "Route catalog and discovery")
public class RouteCatalogController {
private static final org.slf4j.Logger log = org.slf4j.LoggerFactory.getLogger(RouteCatalogController.class);
private final AgentRegistryService registryService;
private final DiagramStore diagramStore;
private final JdbcTemplate jdbc;
private final RouteStateRegistry routeStateRegistry;
public RouteCatalogController(AgentRegistryService registryService,
DiagramStore diagramStore,
JdbcTemplate jdbc) {
@org.springframework.beans.factory.annotation.Qualifier("clickHouseJdbcTemplate") JdbcTemplate jdbc,
RouteStateRegistry routeStateRegistry) {
this.registryService = registryService;
this.diagramStore = diagramStore;
this.jdbc = jdbc;
this.routeStateRegistry = routeStateRegistry;
}
@GetMapping("/catalog")
@@ -58,7 +64,7 @@ public class RouteCatalogController {
// Group agents by application name
Map<String, List<AgentInfo>> agentsByApp = allAgents.stream()
.collect(Collectors.groupingBy(AgentInfo::application, LinkedHashMap::new, Collectors.toList()));
.collect(Collectors.groupingBy(AgentInfo::applicationId, LinkedHashMap::new, Collectors.toList()));
// Collect all distinct routes per app
Map<String, Set<String>> routesByApp = new LinkedHashMap<>();
@@ -78,38 +84,37 @@ public class RouteCatalogController {
Instant rangeTo = to != null ? Instant.parse(to) : now;
Instant from1m = now.minus(1, ChronoUnit.MINUTES);
// Route exchange counts from continuous aggregate
// Route exchange counts from the AggregatingMergeTree table (literal SQL: the ClickHouse
// JDBC driver wraps prepared statements in sub-queries that strip AggregateFunction column types)
Map<String, Long> routeExchangeCounts = new LinkedHashMap<>();
Map<String, Instant> routeLastSeen = new LinkedHashMap<>();
try {
jdbc.query(
"SELECT application_name, route_id, SUM(total_count) AS cnt, MAX(bucket) AS last_seen " +
"FROM stats_1m_route WHERE bucket >= ? AND bucket < ? " +
"GROUP BY application_name, route_id",
"SELECT application_id, route_id, countMerge(total_count) AS cnt, MAX(bucket) AS last_seen " +
"FROM stats_1m_route WHERE bucket >= " + lit(rangeFrom) + " AND bucket < " + lit(rangeTo) +
" GROUP BY application_id, route_id",
rs -> {
String key = rs.getString("application_name") + "/" + rs.getString("route_id");
String key = rs.getString("application_id") + "/" + rs.getString("route_id");
routeExchangeCounts.put(key, rs.getLong("cnt"));
Timestamp ts = rs.getTimestamp("last_seen");
if (ts != null) routeLastSeen.put(key, ts.toInstant());
},
Timestamp.from(rangeFrom), Timestamp.from(rangeTo));
});
} catch (Exception e) {
// Continuous aggregate may not exist yet
log.warn("Failed to query route exchange counts: {}", e.getMessage());
}
// Per-agent TPS from the last minute
Map<String, Double> agentTps = new LinkedHashMap<>();
try {
jdbc.query(
"SELECT application_name, SUM(total_count) AS cnt " +
"FROM stats_1m_route WHERE bucket >= ? AND bucket < ? " +
"GROUP BY application_name",
"SELECT application_id, countMerge(total_count) AS cnt " +
"FROM stats_1m_route WHERE bucket >= " + lit(from1m) + " AND bucket < " + lit(now) +
" GROUP BY application_id",
rs -> {
// This gives per-app TPS; we'll distribute among agents below
},
Timestamp.from(from1m), Timestamp.from(now));
});
} catch (Exception e) {
// Continuous aggregate may not exist yet
// AggregatingMergeTree table may not exist yet
}
// Build catalog entries
@@ -120,20 +125,23 @@ public class RouteCatalogController {
// Routes
Set<String> routeIds = routesByApp.getOrDefault(appId, Set.of());
List<String> agentIds = agents.stream().map(AgentInfo::id).toList();
List<String> agentIds = agents.stream().map(AgentInfo::instanceId).toList();
List<RouteSummary> routeSummaries = routeIds.stream()
.map(routeId -> {
String key = appId + "/" + routeId;
long count = routeExchangeCounts.getOrDefault(key, 0L);
Instant lastSeen = routeLastSeen.get(key);
String fromUri = resolveFromEndpointUri(routeId, agentIds);
return new RouteSummary(routeId, count, lastSeen, fromUri);
String state = routeStateRegistry.getState(appId, routeId).name().toLowerCase();
// Only include non-default states (stopped/suspended); null means started
String routeState = "started".equals(state) ? null : state;
return new RouteSummary(routeId, count, lastSeen, fromUri, routeState);
})
.toList();
// Agent summaries
List<AgentSummary> agentSummaries = agents.stream()
.map(a -> new AgentSummary(a.id(), a.name(), a.state().name().toLowerCase(), 0.0))
.map(a -> new AgentSummary(a.instanceId(), a.displayName(), a.state().name().toLowerCase(), 0.0))
.toList();
// Health = worst state among agents
@@ -158,6 +166,13 @@ public class RouteCatalogController {
.orElse(null);
}
/** Format an Instant as a ClickHouse DateTime literal in UTC. */
private static String lit(Instant instant) {
return "'" + java.time.format.DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss")
.withZone(java.time.ZoneOffset.UTC)
.format(instant.truncatedTo(ChronoUnit.SECONDS)) + "'";
}
private String computeWorstHealth(List<AgentInfo> agents) {
boolean hasDead = false;
boolean hasStale = false;

View File

@@ -32,7 +32,7 @@ public class RouteMetricsController {
private final StatsStore statsStore;
private final AppSettingsRepository appSettingsRepository;
public RouteMetricsController(JdbcTemplate jdbc, StatsStore statsStore,
public RouteMetricsController(@org.springframework.beans.factory.annotation.Qualifier("clickHouseJdbcTemplate") JdbcTemplate jdbc, StatsStore statsStore,
AppSettingsRepository appSettingsRepository) {
this.jdbc = jdbc;
this.statsStore = statsStore;
@@ -52,29 +52,27 @@ public class RouteMetricsController {
Instant fromInstant = from != null ? Instant.parse(from) : toInstant.minus(24, ChronoUnit.HOURS);
long windowSeconds = Duration.between(fromInstant, toInstant).toSeconds();
// Literal SQL: the ClickHouse JDBC driver wraps prepared statements in sub-queries
// that strip AggregateFunction column types, breaking the -Merge combinators
var sql = new StringBuilder(
"SELECT application_name, route_id, " +
"SUM(total_count) AS total, " +
"SUM(failed_count) AS failed, " +
"CASE WHEN SUM(total_count) > 0 THEN SUM(duration_sum) / SUM(total_count) ELSE 0 END AS avg_dur, " +
"COALESCE(MAX(p99_duration), 0) AS p99_dur " +
"FROM stats_1m_route WHERE bucket >= ? AND bucket < ?");
var params = new ArrayList<Object>();
params.add(Timestamp.from(fromInstant));
params.add(Timestamp.from(toInstant));
"SELECT application_id, route_id, " +
"countMerge(total_count) AS total, " +
"countIfMerge(failed_count) AS failed, " +
"CASE WHEN countMerge(total_count) > 0 THEN toFloat64(sumMerge(duration_sum)) / countMerge(total_count) ELSE 0 END AS avg_dur, " +
"COALESCE(quantileMerge(0.99)(p99_duration), 0) AS p99_dur " +
"FROM stats_1m_route WHERE bucket >= " + lit(fromInstant) + " AND bucket < " + lit(toInstant));
if (appId != null) {
sql.append(" AND application_name = ?");
params.add(appId);
sql.append(" AND application_id = " + lit(appId));
}
sql.append(" GROUP BY application_name, route_id ORDER BY application_name, route_id");
sql.append(" GROUP BY application_id, route_id ORDER BY application_id, route_id");
// Key struct for sparkline lookup
record RouteKey(String appId, String routeId) {}
List<RouteKey> routeKeys = new ArrayList<>();
List<RouteMetrics> metrics = jdbc.query(sql.toString(), (rs, rowNum) -> {
String applicationName = rs.getString("application_name");
String applicationId = rs.getString("application_id");
String routeId = rs.getString("route_id");
long total = rs.getLong("total");
long failed = rs.getLong("failed");
@@ -85,10 +83,10 @@ public class RouteMetricsController {
double errorRate = total > 0 ? (double) failed / total : 0.0;
double tps = windowSeconds > 0 ? (double) total / windowSeconds : 0.0;
routeKeys.add(new RouteKey(applicationName, routeId));
return new RouteMetrics(routeId, applicationName, total, successRate,
routeKeys.add(new RouteKey(applicationId, routeId));
return new RouteMetrics(routeId, applicationId, total, successRate,
avgDur, p99Dur, errorRate, tps, List.of(), -1.0);
}, params.toArray());
});
// Fetch sparklines (12 buckets over the time window)
if (!metrics.isEmpty()) {
@@ -98,15 +96,13 @@ public class RouteMetricsController {
for (int i = 0; i < metrics.size(); i++) {
RouteMetrics m = metrics.get(i);
try {
List<Double> sparkline = jdbc.query(
"SELECT time_bucket(? * INTERVAL '1 second', bucket) AS period, " +
"COALESCE(SUM(total_count), 0) AS cnt " +
"FROM stats_1m_route WHERE bucket >= ? AND bucket < ? " +
"AND application_name = ? AND route_id = ? " +
"GROUP BY period ORDER BY period",
(rs, rowNum) -> rs.getDouble("cnt"),
bucketSeconds, Timestamp.from(fromInstant), Timestamp.from(toInstant),
m.appId(), m.routeId());
String sparkSql = "SELECT toStartOfInterval(bucket, toIntervalSecond(" + bucketSeconds + ")) AS period, " +
"COALESCE(countMerge(total_count), 0) AS cnt " +
"FROM stats_1m_route WHERE bucket >= " + lit(fromInstant) + " AND bucket < " + lit(toInstant) +
" AND application_id = " + lit(m.appId()) + " AND route_id = " + lit(m.routeId()) +
" GROUP BY period ORDER BY period";
List<Double> sparkline = jdbc.query(sparkSql,
(rs, rowNum) -> rs.getDouble("cnt"));
metrics.set(i, new RouteMetrics(m.routeId(), m.appId(), m.exchangeCount(),
m.successRate(), m.avgDurationMs(), m.p99DurationMs(),
m.errorRate(), m.throughputPerSec(), sparkline, m.slaCompliance()));
@@ -120,7 +116,7 @@ public class RouteMetricsController {
if (!metrics.isEmpty()) {
// Determine SLA threshold (per-app or default)
String effectiveAppId = appId != null ? appId : (metrics.isEmpty() ? null : metrics.get(0).appId());
int threshold = appSettingsRepository.findByAppId(effectiveAppId != null ? effectiveAppId : "")
int threshold = appSettingsRepository.findByApplicationId(effectiveAppId != null ? effectiveAppId : "")
.map(AppSettings::slaThresholdMs).orElse(300);
Map<String, long[]> slaCounts = statsStore.slaCountsByRoute(fromInstant, toInstant,
@@ -153,42 +149,54 @@ public class RouteMetricsController {
Instant toInstant = to != null ? to : Instant.now();
Instant fromInstant = from != null ? from : toInstant.minus(24, ChronoUnit.HOURS);
// Literal SQL for AggregatingMergeTree -Merge combinators.
// Aliases (tc, fc) must NOT shadow the column names (total_count, failed_count):
// the new analyzer in ClickHouse 24.12 resolves a subsequent countMerge(total_count)
// to the alias (UInt64) instead of to the AggregateFunction column.
var sql = new StringBuilder(
"SELECT processor_id, processor_type, route_id, application_name, " +
"SUM(total_count) AS total_count, " +
"SUM(failed_count) AS failed_count, " +
"CASE WHEN SUM(total_count) > 0 THEN SUM(duration_sum)::double precision / SUM(total_count) ELSE 0 END AS avg_duration_ms, " +
"MAX(p99_duration) AS p99_duration_ms " +
"SELECT processor_id, processor_type, route_id, application_id, " +
"countMerge(total_count) AS tc, " +
"countIfMerge(failed_count) AS fc, " +
"CASE WHEN countMerge(total_count) > 0 THEN toFloat64(sumMerge(duration_sum)) / countMerge(total_count) ELSE 0 END AS avg_duration_ms, " +
"quantileMerge(0.99)(p99_duration) AS p99_duration_ms " +
"FROM stats_1m_processor_detail " +
"WHERE bucket >= ? AND bucket < ? AND route_id = ?");
var params = new ArrayList<Object>();
params.add(Timestamp.from(fromInstant));
params.add(Timestamp.from(toInstant));
params.add(routeId);
"WHERE bucket >= " + lit(fromInstant) + " AND bucket < " + lit(toInstant) +
" AND route_id = " + lit(routeId));
if (appId != null) {
sql.append(" AND application_name = ?");
params.add(appId);
sql.append(" AND application_id = " + lit(appId));
}
sql.append(" GROUP BY processor_id, processor_type, route_id, application_name");
sql.append(" ORDER BY SUM(total_count) DESC");
sql.append(" GROUP BY processor_id, processor_type, route_id, application_id");
sql.append(" ORDER BY tc DESC");
List<ProcessorMetrics> metrics = jdbc.query(sql.toString(), (rs, rowNum) -> {
long totalCount = rs.getLong("total_count");
long failedCount = rs.getLong("failed_count");
long totalCount = rs.getLong("tc");
long failedCount = rs.getLong("fc");
double errorRate = failedCount > 0 ? (double) failedCount / totalCount : 0.0;
return new ProcessorMetrics(
rs.getString("processor_id"),
rs.getString("processor_type"),
rs.getString("route_id"),
rs.getString("application_name"),
rs.getString("application_id"),
totalCount,
failedCount,
rs.getDouble("avg_duration_ms"),
rs.getDouble("p99_duration_ms"),
errorRate);
}, params.toArray());
});
return ResponseEntity.ok(metrics);
}
/** Format an Instant as a ClickHouse DateTime literal. */
private static String lit(Instant instant) {
return "'" + java.time.format.DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss")
.withZone(java.time.ZoneOffset.UTC)
.format(instant.truncatedTo(ChronoUnit.SECONDS)) + "'";
}
/** Format a string as a SQL literal with single-quote escaping. */
private static String lit(String value) {
return "'" + value.replace("'", "\\'") + "'";
}
}
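Because values are interpolated rather than bound, the single-quote escaping in lit(String) is all that separates caller-supplied route ids from SQL injection. A standalone check of both overloads (bodies copied from the helpers above):

import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.time.temporal.ChronoUnit;

public class LitDemo {
    static String lit(Instant instant) {
        return "'" + DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss")
                .withZone(ZoneOffset.UTC)
                .format(instant.truncatedTo(ChronoUnit.SECONDS)) + "'";
    }
    static String lit(String value) {
        return "'" + value.replace("'", "\\'") + "'";
    }
    public static void main(String[] args) {
        System.out.println(lit(Instant.parse("2026-04-02T17:00:00Z"))); // '2026-04-02 17:00:00'
        System.out.println(lit("o'brien-route"));                       // 'o\'brien-route'
    }
}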

View File

@@ -57,7 +57,7 @@ public class SearchController {
@RequestParam(required = false) String correlationId,
@RequestParam(required = false) String text,
@RequestParam(required = false) String routeId,
@RequestParam(required = false) String agentId,
@RequestParam(name = "agentId", required = false) String instanceId,
@RequestParam(required = false) String processorType,
@RequestParam(required = false) String application,
@RequestParam(defaultValue = "0") int offset,
@@ -72,7 +72,7 @@ public class SearchController {
null, null,
correlationId,
text, null, null, null,
routeId, agentId, processorType,
routeId, instanceId, processorType,
application, agentIds,
offset, limit,
sortField, sortDir
@@ -87,9 +87,9 @@ public class SearchController {
@RequestBody SearchRequest request) {
// Resolve application to agentIds if application is specified but agentIds is not
SearchRequest resolved = request;
if (request.application() != null && !request.application().isBlank()
&& (request.agentIds() == null || request.agentIds().isEmpty())) {
resolved = request.withAgentIds(resolveApplicationToAgentIds(request.application()));
if (request.applicationId() != null && !request.applicationId().isBlank()
&& (request.instanceIds() == null || request.instanceIds().isEmpty())) {
resolved = request.withInstanceIds(resolveApplicationToAgentIds(request.applicationId()));
}
return ResponseEntity.ok(searchService.search(resolved));
}
@@ -114,7 +114,7 @@ public class SearchController {
// Enrich with SLA compliance
int threshold = appSettingsRepository
.findByAppId(application != null ? application : "")
.findByApplicationId(application != null ? application : "")
.map(AppSettings::slaThresholdMs).orElse(300);
double sla = searchService.slaCompliance(from, end, threshold, application, routeId);
return ResponseEntity.ok(stats.withSlaCompliance(sla));
@@ -172,6 +172,12 @@ public class SearchController {
return ResponseEntity.ok(searchService.punchcard(from, to, application));
}
@GetMapping("/attributes/keys")
@Operation(summary = "Distinct attribute key names across all executions")
public ResponseEntity<List<String>> attributeKeys() {
return ResponseEntity.ok(searchService.distinctAttributeKeys());
}
@GetMapping("/errors/top")
@Operation(summary = "Top N errors with velocity trend")
public ResponseEntity<List<TopError>> topErrors(
@@ -193,7 +199,7 @@ public class SearchController {
return null;
}
return registryService.findByApplication(application).stream()
.map(AgentInfo::id)
.map(AgentInfo::instanceId)
.toList();
}
}

View File

@@ -0,0 +1,50 @@
package com.cameleer3.server.app.controller;
import com.cameleer3.server.app.storage.ClickHouseUsageTracker;
import com.cameleer3.server.core.analytics.UsageStats;
import io.swagger.v3.oas.annotations.Operation;
import io.swagger.v3.oas.annotations.tags.Tag;
import org.springframework.boot.autoconfigure.condition.ConditionalOnBean;
import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;
import java.time.Instant;
import java.time.temporal.ChronoUnit;
import java.util.List;
@RestController
@RequestMapping("/api/v1/admin/usage")
@ConditionalOnBean(ClickHouseUsageTracker.class)
@Tag(name = "Usage Analytics", description = "UI usage pattern analytics")
public class UsageAnalyticsController {
private final ClickHouseUsageTracker tracker;
public UsageAnalyticsController(ClickHouseUsageTracker tracker) {
this.tracker = tracker;
}
@GetMapping
@Operation(summary = "Query usage statistics",
description = "Returns aggregated API usage stats grouped by endpoint, user, or hour")
public ResponseEntity<List<UsageStats>> getUsage(
@RequestParam(required = false) String from,
@RequestParam(required = false) String to,
@RequestParam(required = false) String username,
@RequestParam(defaultValue = "endpoint") String groupBy) {
Instant fromInstant = from != null ? Instant.parse(from) : Instant.now().minus(7, ChronoUnit.DAYS);
Instant toInstant = to != null ? Instant.parse(to) : Instant.now();
List<UsageStats> stats = switch (groupBy) {
case "user" -> tracker.queryByUser(fromInstant, toInstant);
case "hour" -> tracker.queryByHour(fromInstant, toInstant, username);
default -> tracker.queryByEndpoint(fromInstant, toInstant, username);
};
return ResponseEntity.ok(stats);
}
}
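Since the controller defaults to a seven-day window and endpoint grouping, a plain Java call is enough to smoke-test it. A minimal sketch; the base URL and bearer-token auth are placeholders for a deployment, not values from this changeset:

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class UsageQueryExample {
    public static void main(String[] args) throws Exception {
        String base = "http://localhost:8080";            // placeholder host/port
        String token = System.getenv("CAMELEER_TOKEN");   // placeholder auth scheme

        // Group the default last-7-days window by user instead of endpoint.
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create(base + "/api/v1/admin/usage?groupBy=user"))
                .header("Authorization", "Bearer " + token)
                .GET()
                .build();

        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.statusCode() + "\n" + response.body());
    }
}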

View File

@@ -884,6 +884,7 @@ public class ElkDiagramRenderer implements DiagramRenderer {
}
private ElkNode getElkRoot(ElkNode node) {
if (node == null) return null;
ElkNode current = node;
while (current.getParent() != null) {
current = current.getParent();

View File

@@ -9,15 +9,15 @@ import java.time.Instant;
@Schema(description = "Agent lifecycle event")
public record AgentEventResponse(
@NotNull long id,
@NotNull String agentId,
@NotNull String appId,
@NotNull String instanceId,
@NotNull String applicationId,
@NotNull String eventType,
String detail,
@NotNull Instant timestamp
) {
public static AgentEventResponse from(AgentEventRecord record) {
return new AgentEventResponse(
record.id(), record.agentId(), record.appId(),
record.id(), record.instanceId(), record.applicationId(),
record.eventType(), record.detail(), record.timestamp()
);
}

View File

@@ -11,9 +11,9 @@ import java.util.Map;
@Schema(description = "Agent instance summary with runtime metrics")
public record AgentInstanceResponse(
@NotNull String id,
@NotNull String name,
@NotNull String application,
@NotNull String instanceId,
@NotNull String displayName,
@NotNull String applicationId,
@NotNull String status,
@NotNull List<String> routeIds,
@NotNull Instant registeredAt,
@@ -29,7 +29,7 @@ public record AgentInstanceResponse(
public static AgentInstanceResponse from(AgentInfo info) {
long uptime = Duration.between(info.registeredAt(), Instant.now()).toSeconds();
return new AgentInstanceResponse(
info.id(), info.name(), info.application(),
info.instanceId(), info.displayName(), info.applicationId(),
info.state().name(), info.routeIds(),
info.registeredAt(), info.lastHeartbeat(),
info.version(), info.capabilities(),
@@ -41,7 +41,7 @@ public record AgentInstanceResponse(
public AgentInstanceResponse withMetrics(double tps, double errorRate, int activeRoutes) {
return new AgentInstanceResponse(
id, name, application, status, routeIds, registeredAt, lastHeartbeat,
instanceId, displayName, applicationId, status, routeIds, registeredAt, lastHeartbeat,
version, capabilities,
tps, errorRate, activeRoutes, totalRoutes, uptimeSeconds
);

View File

@@ -8,9 +8,9 @@ import java.util.Map;
@Schema(description = "Agent registration payload")
public record AgentRegistrationRequest(
@NotNull String agentId,
@NotNull String name,
@Schema(defaultValue = "default") String application,
@NotNull String instanceId,
@NotNull String displayName,
@Schema(defaultValue = "default") String applicationId,
String version,
List<String> routeIds,
Map<String, Object> capabilities

View File

@@ -5,7 +5,7 @@ import jakarta.validation.constraints.NotNull;
@Schema(description = "Agent registration result with JWT tokens and SSE endpoint")
public record AgentRegistrationResponse(
@NotNull String agentId,
@NotNull String instanceId,
@NotNull String sseEndpoint,
long heartbeatIntervalMs,
@NotNull String serverPublicKey,

View File

@@ -0,0 +1,14 @@
package com.cameleer3.server.app.dto;
import io.swagger.v3.oas.annotations.media.Schema;
@Schema(description = "ClickHouse storage and performance metrics")
public record ClickHousePerformanceResponse(
String diskSize,
String uncompressedSize,
double compressionRatio,
long totalRows,
int partCount,
String memoryUsage,
int currentQueries
) {}

View File

@@ -0,0 +1,12 @@
package com.cameleer3.server.app.dto;
import io.swagger.v3.oas.annotations.media.Schema;
@Schema(description = "Active ClickHouse query information")
public record ClickHouseQueryInfo(
String queryId,
double elapsedSeconds,
String memory,
long readRows,
String query
) {}

View File

@@ -0,0 +1,11 @@
package com.cameleer3.server.app.dto;
import io.swagger.v3.oas.annotations.media.Schema;
@Schema(description = "ClickHouse cluster status")
public record ClickHouseStatusResponse(
boolean reachable,
String version,
String uptime,
String host
) {}

View File

@@ -0,0 +1,13 @@
package com.cameleer3.server.app.dto;
import io.swagger.v3.oas.annotations.media.Schema;
@Schema(description = "ClickHouse table information")
public record ClickHouseTableInfo(
String name,
String engine,
long rowCount,
String dataSize,
long dataSizeBytes,
int partitionCount
) {}

View File

@@ -0,0 +1,13 @@
package com.cameleer3.server.app.dto;
import java.util.List;
public record CommandGroupResponse(
boolean success,
int total,
int responded,
List<AgentResponse> responses,
List<String> timedOut
) {
public record AgentResponse(String agentId, String status, String message) {}
}
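This record is the fan-in side of a synchronous group dispatch. A minimal sketch of how per-agent reply futures could be folded into it under one shared deadline; the aggregator class, the String reply payload, and the "OK"/"ERROR" status labels are assumptions, not code from this changeset:

package com.cameleer3.server.app.dto;

import java.time.Duration;
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;

// Hypothetical aggregator: not the repository's AgentRegistryService,
// just the shape implied by CommandGroupResponse.
public class CommandReplyAggregator {

    /** Waits on per-agent reply futures with one shared deadline and folds them into a CommandGroupResponse. */
    public static CommandGroupResponse collect(Map<String, CompletableFuture<String>> repliesByAgent,
                                               Duration deadline) {
        Instant cutoff = Instant.now().plus(deadline);
        List<CommandGroupResponse.AgentResponse> responses = new ArrayList<>();
        List<String> timedOut = new ArrayList<>();
        for (Map.Entry<String, CompletableFuture<String>> entry : repliesByAgent.entrySet()) {
            // The shared deadline shrinks as earlier agents consume it.
            long remainingMs = Math.max(0, Duration.between(Instant.now(), cutoff).toMillis());
            try {
                String message = entry.getValue().get(remainingMs, TimeUnit.MILLISECONDS);
                responses.add(new CommandGroupResponse.AgentResponse(entry.getKey(), "OK", message));
            } catch (java.util.concurrent.TimeoutException e) {
                timedOut.add(entry.getKey());
            } catch (Exception e) {
                responses.add(new CommandGroupResponse.AgentResponse(entry.getKey(), "ERROR", e.getMessage()));
            }
        }
        boolean success = timedOut.isEmpty()
                && responses.stream().allMatch(r -> "OK".equals(r.status()));
        return new CommandGroupResponse(success, repliesByAgent.size(), responses.size(), responses, timedOut);
    }
}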

View File

@@ -0,0 +1,8 @@
package com.cameleer3.server.app.dto;
import com.cameleer3.common.model.ApplicationConfig;
public record ConfigUpdateResponse(
ApplicationConfig config,
CommandGroupResponse pushResult
) {}

View File

@@ -7,6 +7,5 @@ public record DatabaseStatusResponse(
@Schema(description = "Whether the database is reachable") boolean connected,
@Schema(description = "PostgreSQL version string") String version,
@Schema(description = "Database host") String host,
@Schema(description = "Current schema search path") String schema,
@Schema(description = "Whether TimescaleDB extension is available") boolean timescaleDb
@Schema(description = "Current schema") String schema
) {}

View File

@@ -1,14 +0,0 @@
package com.cameleer3.server.app.dto;
import io.swagger.v3.oas.annotations.media.Schema;
@Schema(description = "OpenSearch index information")
public record IndexInfoResponse(
@Schema(description = "Index name") String name,
@Schema(description = "Document count") long docCount,
@Schema(description = "Human-readable index size") String size,
@Schema(description = "Index size in bytes") long sizeBytes,
@Schema(description = "Index health status") String health,
@Schema(description = "Number of primary shards") int primaryShards,
@Schema(description = "Number of replica shards") int replicaShards
) {}

View File

@@ -0,0 +1,16 @@
package com.cameleer3.server.app.dto;
import io.swagger.v3.oas.annotations.media.Schema;
import java.time.Instant;
@Schema(description = "Search indexer pipeline statistics")
public record IndexerPipelineResponse(
int queueDepth,
int maxQueueSize,
long failedCount,
long indexedCount,
long debounceMs,
double indexingRate,
Instant lastIndexedAt
) {}

View File

@@ -1,16 +0,0 @@
package com.cameleer3.server.app.dto;
import io.swagger.v3.oas.annotations.media.Schema;
import java.util.List;
@Schema(description = "Paginated list of OpenSearch indices")
public record IndicesPageResponse(
@Schema(description = "Index list for current page") List<IndexInfoResponse> indices,
@Schema(description = "Total number of indices") long totalIndices,
@Schema(description = "Total document count across all indices") long totalDocs,
@Schema(description = "Human-readable total size") String totalSize,
@Schema(description = "Current page number (0-based)") int page,
@Schema(description = "Page size") int pageSize,
@Schema(description = "Total number of pages") int totalPages
) {}

View File

@@ -2,12 +2,18 @@ package com.cameleer3.server.app.dto;
import io.swagger.v3.oas.annotations.media.Schema;
@Schema(description = "Application log entry from OpenSearch")
import java.util.Map;
@Schema(description = "Application log entry")
public record LogEntryResponse(
@Schema(description = "Log timestamp (ISO-8601)") String timestamp,
@Schema(description = "Log level (INFO, WARN, ERROR, DEBUG)") String level,
@Schema(description = "Log level (INFO, WARN, ERROR, DEBUG, TRACE)") String level,
@Schema(description = "Logger name") String loggerName,
@Schema(description = "Log message") String message,
@Schema(description = "Thread name") String threadName,
@Schema(description = "Stack trace (if present)") String stackTrace
@Schema(description = "Stack trace (if present)") String stackTrace,
@Schema(description = "Camel exchange ID (if present)") String exchangeId,
@Schema(description = "Agent instance ID") String instanceId,
@Schema(description = "Application ID") String application,
@Schema(description = "MDC context map") Map<String, String> mdc
) {}

View File

@@ -0,0 +1,14 @@
package com.cameleer3.server.app.dto;
import io.swagger.v3.oas.annotations.media.Schema;
import java.util.List;
import java.util.Map;
@Schema(description = "Log search response with cursor pagination and level counts")
public record LogSearchPageResponse(
@Schema(description = "Log entries for the current page") List<LogEntryResponse> data,
@Schema(description = "Cursor for next page (null if no more results)") String nextCursor,
@Schema(description = "Whether more results exist beyond this page") boolean hasMore,
@Schema(description = "Count of logs per level (unaffected by level filter)") Map<String, Long> levelCounts
) {}

View File

@@ -1,12 +0,0 @@
package com.cameleer3.server.app.dto;
import io.swagger.v3.oas.annotations.media.Schema;
@Schema(description = "OpenSearch cluster status")
public record OpenSearchStatusResponse(
@Schema(description = "Whether the cluster is reachable") boolean reachable,
@Schema(description = "Cluster health status (GREEN, YELLOW, RED)") String clusterHealth,
@Schema(description = "OpenSearch version") String version,
@Schema(description = "Number of nodes in the cluster") int nodeCount,
@Schema(description = "OpenSearch host") String host
) {}

View File

@@ -1,13 +0,0 @@
package com.cameleer3.server.app.dto;
import io.swagger.v3.oas.annotations.media.Schema;
@Schema(description = "OpenSearch performance metrics")
public record PerformanceResponse(
@Schema(description = "Query cache hit rate (0.0-1.0)") double queryCacheHitRate,
@Schema(description = "Request cache hit rate (0.0-1.0)") double requestCacheHitRate,
@Schema(description = "Average search latency in milliseconds") double searchLatencyMs,
@Schema(description = "Average indexing latency in milliseconds") double indexingLatencyMs,
@Schema(description = "JVM heap used in bytes") long jvmHeapUsedBytes,
@Schema(description = "JVM heap max in bytes") long jvmHeapMaxBytes
) {}

View File

@@ -1,16 +0,0 @@
package com.cameleer3.server.app.dto;
import io.swagger.v3.oas.annotations.media.Schema;
import java.time.Instant;
@Schema(description = "Search indexing pipeline statistics")
public record PipelineStatsResponse(
@Schema(description = "Current queue depth") int queueDepth,
@Schema(description = "Maximum queue size") int maxQueueSize,
@Schema(description = "Number of failed indexing operations") long failedCount,
@Schema(description = "Number of successfully indexed documents") long indexedCount,
@Schema(description = "Debounce interval in milliseconds") long debounceMs,
@Schema(description = "Current indexing rate (docs/sec)") double indexingRate,
@Schema(description = "Timestamp of last indexed document") Instant lastIndexedAt
) {}

View File

@@ -11,5 +11,8 @@ public record RouteSummary(
@NotNull long exchangeCount,
Instant lastSeen,
@Schema(description = "The from() endpoint URI, e.g. 'direct:processOrder'")
String fromEndpointUri
String fromEndpointUri,
@Schema(description = "Operational state of the route: stopped, suspended, or null (started/default)")
String routeState
) {}
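The null-for-started convention means publishers must normalize the default state away before building the DTO. A minimal sketch, with a hypothetical state enum standing in for whatever the registry actually tracks:

package com.cameleer3.server.app.dto;

import java.util.Locale;

// Hypothetical enum; the registry's real state type is not part of this diff.
enum RouteState { STARTED, STOPPED, SUSPENDED }

final class RouteStateMapper {
    /** Default STARTED becomes null so RouteSummary only carries non-default states. */
    static String toDtoState(RouteState state) {
        if (state == null || state == RouteState.STARTED) {
            return null;
        }
        return state.name().toLowerCase(Locale.ROOT); // "stopped" or "suspended"
    }
}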

View File

@@ -5,18 +5,15 @@ import io.swagger.v3.oas.annotations.media.Schema;
import jakarta.validation.Valid;
import jakarta.validation.constraints.Max;
import jakarta.validation.constraints.Min;
import jakarta.validation.constraints.NotBlank;
import jakarta.validation.constraints.NotNull;
import jakarta.validation.constraints.Positive;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
@Schema(description = "Threshold configuration for admin monitoring")
public record ThresholdConfigRequest(
@Valid @NotNull DatabaseThresholdsRequest database,
@Valid @NotNull OpenSearchThresholdsRequest opensearch
@Valid @NotNull DatabaseThresholdsRequest database
) {
@Schema(description = "Database monitoring thresholds")
@@ -38,41 +35,6 @@ public record ThresholdConfigRequest(
double queryDurationCritical
) {}
@Schema(description = "OpenSearch monitoring thresholds")
public record OpenSearchThresholdsRequest(
@NotBlank
@Schema(description = "Cluster health warning threshold (GREEN, YELLOW, RED)")
String clusterHealthWarning,
@NotBlank
@Schema(description = "Cluster health critical threshold (GREEN, YELLOW, RED)")
String clusterHealthCritical,
@Min(0)
@Schema(description = "Queue depth warning threshold")
int queueDepthWarning,
@Min(0)
@Schema(description = "Queue depth critical threshold")
int queueDepthCritical,
@Min(0) @Max(100)
@Schema(description = "JVM heap usage warning threshold (percentage)")
int jvmHeapWarning,
@Min(0) @Max(100)
@Schema(description = "JVM heap usage critical threshold (percentage)")
int jvmHeapCritical,
@Min(0)
@Schema(description = "Failed document count warning threshold")
int failedDocsWarning,
@Min(0)
@Schema(description = "Failed document count critical threshold")
int failedDocsCritical
) {}
/** Convert to core domain model */
public ThresholdConfig toConfig() {
return new ThresholdConfig(
@@ -81,16 +43,6 @@ public record ThresholdConfigRequest(
database.connectionPoolCritical(),
database.queryDurationWarning(),
database.queryDurationCritical()
),
new ThresholdConfig.OpenSearchThresholds(
opensearch.clusterHealthWarning(),
opensearch.clusterHealthCritical(),
opensearch.queueDepthWarning(),
opensearch.queueDepthCritical(),
opensearch.jvmHeapWarning(),
opensearch.jvmHeapCritical(),
opensearch.failedDocsWarning(),
opensearch.failedDocsCritical()
)
);
}
@@ -108,37 +60,6 @@ public record ThresholdConfigRequest(
}
}
if (opensearch != null) {
if (opensearch.queueDepthWarning() > opensearch.queueDepthCritical()) {
errors.add("opensearch.queueDepthWarning must be <= queueDepthCritical");
}
if (opensearch.jvmHeapWarning() > opensearch.jvmHeapCritical()) {
errors.add("opensearch.jvmHeapWarning must be <= jvmHeapCritical");
}
if (opensearch.failedDocsWarning() > opensearch.failedDocsCritical()) {
errors.add("opensearch.failedDocsWarning must be <= failedDocsCritical");
}
// Validate health severity ordering: GREEN < YELLOW < RED
int warningSeverity = healthSeverity(opensearch.clusterHealthWarning());
int criticalSeverity = healthSeverity(opensearch.clusterHealthCritical());
if (warningSeverity < 0) {
errors.add("opensearch.clusterHealthWarning must be GREEN, YELLOW, or RED");
}
if (criticalSeverity < 0) {
errors.add("opensearch.clusterHealthCritical must be GREEN, YELLOW, or RED");
}
if (warningSeverity >= 0 && criticalSeverity >= 0 && warningSeverity > criticalSeverity) {
errors.add("opensearch.clusterHealthWarning severity must be <= clusterHealthCritical (GREEN < YELLOW < RED)");
}
}
return errors;
}
private static final Map<String, Integer> HEALTH_SEVERITY =
Map.of("GREEN", 0, "YELLOW", 1, "RED", 2);
private static int healthSeverity(String health) {
return HEALTH_SEVERITY.getOrDefault(health != null ? health.toUpperCase() : "", -1);
}
}

View File

@@ -0,0 +1,136 @@
package com.cameleer3.server.app.ingestion;
import com.cameleer3.server.app.config.IngestionConfig;
import com.cameleer3.server.app.storage.ClickHouseExecutionStore;
import com.cameleer3.server.core.ingestion.ChunkAccumulator;
import com.cameleer3.server.core.ingestion.MergedExecution;
import com.cameleer3.server.core.ingestion.WriteBuffer;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.context.SmartLifecycle;
import org.springframework.scheduling.annotation.Scheduled;
import java.util.List;
/**
* Scheduled flush task for ClickHouse execution and processor write buffers.
* <p>
* Drains both buffers on a fixed interval and delegates batch inserts to
* {@link ClickHouseExecutionStore}. Also periodically sweeps stale exchanges
* from the {@link ChunkAccumulator}.
* <p>
* Not a {@code @Component} — instantiated as a {@code @Bean} in StorageBeanConfig.
*/
public class ExecutionFlushScheduler implements SmartLifecycle {
private static final Logger log = LoggerFactory.getLogger(ExecutionFlushScheduler.class);
private final WriteBuffer<MergedExecution> executionBuffer;
private final WriteBuffer<ChunkAccumulator.ProcessorBatch> processorBuffer;
private final ClickHouseExecutionStore executionStore;
private final ChunkAccumulator accumulator;
private final int batchSize;
private volatile boolean running = false;
public ExecutionFlushScheduler(WriteBuffer<MergedExecution> executionBuffer,
WriteBuffer<ChunkAccumulator.ProcessorBatch> processorBuffer,
ClickHouseExecutionStore executionStore,
ChunkAccumulator accumulator,
IngestionConfig config) {
this.executionBuffer = executionBuffer;
this.processorBuffer = processorBuffer;
this.executionStore = executionStore;
this.accumulator = accumulator;
this.batchSize = config.getBatchSize();
}
@Scheduled(fixedDelayString = "${ingestion.flush-interval-ms:1000}")
public void flush() {
try {
List<MergedExecution> executions = executionBuffer.drain(batchSize);
if (!executions.isEmpty()) {
executionStore.insertExecutionBatch(executions);
log.debug("Flushed {} executions to ClickHouse", executions.size());
}
} catch (Exception e) {
log.error("Failed to flush executions", e);
}
try {
List<ChunkAccumulator.ProcessorBatch> batches = processorBuffer.drain(batchSize);
for (ChunkAccumulator.ProcessorBatch batch : batches) {
executionStore.insertProcessorBatch(
batch.tenantId(),
batch.executionId(),
batch.routeId(),
batch.applicationId(),
batch.execStartTime(),
batch.processors());
}
if (!batches.isEmpty()) {
log.debug("Flushed {} processor batches to ClickHouse", batches.size());
}
} catch (Exception e) {
log.error("Failed to flush processor batches", e);
}
}
@Scheduled(fixedDelay = 60_000)
public void sweepStale() {
try {
accumulator.sweepStale();
} catch (Exception e) {
log.error("Failed to sweep stale exchanges", e);
}
}
@Override
public void start() {
running = true;
}
@Override
public void stop() {
// Drain remaining executions on shutdown
while (executionBuffer.size() > 0) {
List<MergedExecution> batch = executionBuffer.drain(batchSize);
if (batch.isEmpty()) break;
try {
executionStore.insertExecutionBatch(batch);
} catch (Exception e) {
log.error("Failed to flush executions during shutdown", e);
break;
}
}
// Drain remaining processor batches on shutdown
while (processorBuffer.size() > 0) {
List<ChunkAccumulator.ProcessorBatch> batches = processorBuffer.drain(batchSize);
if (batches.isEmpty()) break;
try {
for (ChunkAccumulator.ProcessorBatch batch : batches) {
executionStore.insertProcessorBatch(
batch.tenantId(),
batch.executionId(),
batch.routeId(),
batch.applicationId(),
batch.execStartTime(),
batch.processors());
}
} catch (Exception e) {
log.error("Failed to flush processor batches during shutdown", e);
break;
}
}
running = false;
}
@Override
public boolean isRunning() {
return running;
}
@Override
public int getPhase() {
return Integer.MAX_VALUE - 1;
}
}
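WriteBuffer itself is not part of this changeset; the scheduler only relies on size() and a non-blocking drain(max). A minimal sketch of that contract, assuming a bounded queue underneath:

package com.cameleer3.server.core.ingestion;

import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.LinkedBlockingQueue;

// Sketch of the WriteBuffer contract used by ExecutionFlushScheduler; the real class is not in this diff.
public class WriteBuffer<T> {
    private final LinkedBlockingQueue<T> queue;

    public WriteBuffer(int capacity) {
        this.queue = new LinkedBlockingQueue<>(capacity);
    }

    /** Non-blocking add; returns false when the buffer is full so callers can drop or apply backpressure. */
    public boolean offer(T item) {
        return queue.offer(item);
    }

    /** Removes and returns up to max items; never blocks, may return an empty list. */
    public List<T> drain(int max) {
        List<T> out = new ArrayList<>(Math.min(max, queue.size()));
        queue.drainTo(out, max);
        return out;
    }

    public int size() {
        return queue.size();
    }
}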

View File

@@ -1,48 +0,0 @@
package com.cameleer3.server.app.retention;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.jdbc.core.JdbcTemplate;
import org.springframework.scheduling.annotation.Scheduled;
import org.springframework.stereotype.Component;
@Component
public class RetentionScheduler {
private static final Logger log = LoggerFactory.getLogger(RetentionScheduler.class);
private final JdbcTemplate jdbc;
private final int retentionDays;
public RetentionScheduler(JdbcTemplate jdbc,
@Value("${cameleer.retention-days:30}") int retentionDays) {
this.jdbc = jdbc;
this.retentionDays = retentionDays;
}
@Scheduled(cron = "0 0 2 * * *") // Daily at 2 AM UTC
public void dropExpiredChunks() {
String interval = retentionDays + " days";
try {
// Raw data
jdbc.execute("SELECT drop_chunks('executions', INTERVAL '" + interval + "')");
jdbc.execute("SELECT drop_chunks('processor_executions', INTERVAL '" + interval + "')");
jdbc.execute("SELECT drop_chunks('agent_metrics', INTERVAL '" + interval + "')");
// Continuous aggregates (keep 3x longer)
String caggInterval = (retentionDays * 3) + " days";
jdbc.execute("SELECT drop_chunks('stats_1m_all', INTERVAL '" + caggInterval + "')");
jdbc.execute("SELECT drop_chunks('stats_1m_app', INTERVAL '" + caggInterval + "')");
jdbc.execute("SELECT drop_chunks('stats_1m_route', INTERVAL '" + caggInterval + "')");
jdbc.execute("SELECT drop_chunks('stats_1m_processor', INTERVAL '" + caggInterval + "')");
log.info("Retention: dropped chunks older than {} days (aggregates: {} days)",
retentionDays, retentionDays * 3);
} catch (Exception e) {
log.error("Retention job failed", e);
}
}
// Note: OpenSearch daily index deletion should be handled via ILM policy
// configured at deployment time, not in application code.
}

View File

@@ -0,0 +1,207 @@
package com.cameleer3.server.app.search;
import com.cameleer3.common.model.LogEntry;
import com.cameleer3.server.core.search.LogSearchRequest;
import com.cameleer3.server.core.search.LogSearchResponse;
import com.cameleer3.server.core.storage.LogEntryResult;
import com.cameleer3.server.core.storage.LogIndex;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.jdbc.core.JdbcTemplate;
import java.sql.Timestamp;
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
/**
* ClickHouse-backed implementation of {@link LogIndex}.
* Stores application logs in the {@code logs} MergeTree table with
* ngram bloom-filter indexes for efficient substring search.
*/
public class ClickHouseLogStore implements LogIndex {
private static final Logger log = LoggerFactory.getLogger(ClickHouseLogStore.class);
private static final String TENANT = "default";
private static final DateTimeFormatter ISO_FMT = DateTimeFormatter.ISO_INSTANT;
private final JdbcTemplate jdbc;
public ClickHouseLogStore(JdbcTemplate jdbc) {
this.jdbc = jdbc;
}
@Override
public void indexBatch(String instanceId, String applicationId, List<LogEntry> entries) {
if (entries == null || entries.isEmpty()) {
return;
}
String sql = "INSERT INTO logs (tenant_id, timestamp, application, instance_id, level, " +
"logger_name, message, thread_name, stack_trace, exchange_id, mdc) " +
"VALUES ('default', ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)";
jdbc.batchUpdate(sql, entries, entries.size(), (ps, entry) -> {
Instant ts = entry.getTimestamp() != null ? entry.getTimestamp() : Instant.now();
ps.setTimestamp(1, Timestamp.from(ts));
ps.setString(2, applicationId);
ps.setString(3, instanceId);
ps.setString(4, entry.getLevel() != null ? entry.getLevel() : "");
ps.setString(5, entry.getLoggerName() != null ? entry.getLoggerName() : "");
ps.setString(6, entry.getMessage() != null ? entry.getMessage() : "");
ps.setString(7, entry.getThreadName() != null ? entry.getThreadName() : "");
ps.setString(8, entry.getStackTrace() != null ? entry.getStackTrace() : "");
Map<String, String> mdc = entry.getMdc() != null ? entry.getMdc() : Collections.emptyMap();
String exchangeId = mdc.getOrDefault("camel.exchangeId", "");
ps.setString(9, exchangeId);
ps.setObject(10, mdc);
});
log.debug("Indexed {} log entries for instance={}, app={}", entries.size(), instanceId, applicationId);
}
@Override
public LogSearchResponse search(LogSearchRequest request) {
// Build shared WHERE conditions (used by both data and count queries)
List<String> baseConditions = new ArrayList<>();
List<Object> baseParams = new ArrayList<>();
baseConditions.add("tenant_id = 'default'");
if (request.application() != null && !request.application().isEmpty()) {
baseConditions.add("application = ?");
baseParams.add(request.application());
}
if (request.instanceId() != null && !request.instanceId().isEmpty()) {
baseConditions.add("instance_id = ?");
baseParams.add(request.instanceId());
}
if (request.exchangeId() != null && !request.exchangeId().isEmpty()) {
baseConditions.add("(exchange_id = ? OR (mapContains(mdc, 'camel.exchangeId') AND mdc['camel.exchangeId'] = ?))");
baseParams.add(request.exchangeId());
baseParams.add(request.exchangeId());
}
if (request.q() != null && !request.q().isEmpty()) {
String term = "%" + escapeLike(request.q()) + "%";
baseConditions.add("(message ILIKE ? OR stack_trace ILIKE ?)");
baseParams.add(term);
baseParams.add(term);
}
if (request.logger() != null && !request.logger().isEmpty()) {
baseConditions.add("logger_name ILIKE ?");
baseParams.add("%" + escapeLike(request.logger()) + "%");
}
if (request.from() != null) {
baseConditions.add("timestamp >= ?");
baseParams.add(Timestamp.from(request.from()));
}
if (request.to() != null) {
baseConditions.add("timestamp <= ?");
baseParams.add(Timestamp.from(request.to()));
}
// Level counts query: uses base conditions WITHOUT level filter and cursor
String baseWhere = String.join(" AND ", baseConditions);
Map<String, Long> levelCounts = queryLevelCounts(baseWhere, baseParams);
// Data query conditions: add level filter and cursor on top of base
List<String> dataConditions = new ArrayList<>(baseConditions);
List<Object> dataParams = new ArrayList<>(baseParams);
if (request.levels() != null && !request.levels().isEmpty()) {
String placeholders = String.join(", ", Collections.nCopies(request.levels().size(), "?"));
dataConditions.add("level IN (" + placeholders + ")");
for (String lvl : request.levels()) {
dataParams.add(lvl.toUpperCase());
}
}
if (request.cursor() != null && !request.cursor().isEmpty()) {
Instant cursorTs = Instant.parse(request.cursor());
if ("asc".equalsIgnoreCase(request.sort())) {
dataConditions.add("timestamp > ?");
} else {
dataConditions.add("timestamp < ?");
}
dataParams.add(Timestamp.from(cursorTs));
}
String dataWhere = String.join(" AND ", dataConditions);
String orderDir = "asc".equalsIgnoreCase(request.sort()) ? "ASC" : "DESC";
int fetchLimit = request.limit() + 1; // fetch N+1 to detect hasMore
String dataSql = "SELECT timestamp, level, logger_name, message, thread_name, stack_trace, " +
"exchange_id, instance_id, application, mdc " +
"FROM logs WHERE " + dataWhere +
" ORDER BY timestamp " + orderDir + " LIMIT ?";
dataParams.add(fetchLimit);
List<LogEntryResult> results = jdbc.query(dataSql, dataParams.toArray(), (rs, rowNum) -> {
Timestamp ts = rs.getTimestamp("timestamp");
String timestampStr = ts != null
? ts.toInstant().atOffset(ZoneOffset.UTC).format(ISO_FMT)
: null;
@SuppressWarnings("unchecked")
Map<String, String> mdc = (Map<String, String>) rs.getObject("mdc");
if (mdc == null) mdc = Collections.emptyMap();
return new LogEntryResult(
timestampStr,
rs.getString("level"),
rs.getString("logger_name"),
rs.getString("message"),
rs.getString("thread_name"),
rs.getString("stack_trace"),
rs.getString("exchange_id"),
rs.getString("instance_id"),
rs.getString("application"),
mdc
);
});
boolean hasMore = results.size() > request.limit();
if (hasMore) {
results = new ArrayList<>(results.subList(0, request.limit()));
}
String nextCursor = null;
if (hasMore && !results.isEmpty()) {
nextCursor = results.get(results.size() - 1).timestamp();
}
return new LogSearchResponse(results, nextCursor, hasMore, levelCounts);
}
private Map<String, Long> queryLevelCounts(String baseWhere, List<Object> baseParams) {
String sql = "SELECT level, count() AS cnt FROM logs WHERE " + baseWhere + " GROUP BY level";
Map<String, Long> counts = new LinkedHashMap<>();
try {
jdbc.query(sql, baseParams.toArray(), (rs, rowNum) -> {
counts.put(rs.getString("level"), rs.getLong("cnt"));
return null;
});
} catch (Exception e) {
log.warn("Failed to query level counts", e);
}
return counts;
}
private static String escapeLike(String term) {
return term.replace("\\", "\\\\")
.replace("%", "\\%")
.replace("_", "\\_");
}
}
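Because search() fetches limit+1 rows and hands back the last row's timestamp as nextCursor, a caller drains all pages with a simple loop. A sketch under stated assumptions: the withCursor(...) copier and the data()/hasMore()/nextCursor() accessors are guesses at types not fully shown in this changeset:

package com.cameleer3.server.app.search;

import com.cameleer3.server.core.search.LogSearchRequest;
import com.cameleer3.server.core.search.LogSearchResponse;
import com.cameleer3.server.core.storage.LogIndex;

// Walks every page through the timestamp cursor. Accessor and copier
// names are assumptions about types outside this diff.
final class LogPager {
    static void printAll(LogIndex index, LogSearchRequest first) {
        LogSearchRequest request = first;
        while (true) {
            LogSearchResponse page = index.search(request);
            page.data().forEach(e -> System.out.println(e.timestamp() + " " + e.message()));
            if (!page.hasMore()) {
                break; // nextCursor is null here by contract
            }
            request = request.withCursor(page.nextCursor());
        }
    }
}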

View File

@@ -0,0 +1,331 @@
package com.cameleer3.server.app.search;
import com.cameleer3.server.core.search.ExecutionSummary;
import com.cameleer3.server.core.search.SearchRequest;
import com.cameleer3.server.core.search.SearchResult;
import com.cameleer3.server.core.storage.SearchIndex;
import com.cameleer3.server.core.storage.model.ExecutionDocument;
import com.fasterxml.jackson.core.type.TypeReference;
import com.fasterxml.jackson.databind.ObjectMapper;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.jdbc.core.JdbcTemplate;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Timestamp;
import java.time.Instant;
import java.util.*;
/**
* ClickHouse-backed implementation of {@link SearchIndex}.
* <p>
* Queries the {@code executions} and {@code processor_executions} tables directly
* using SQL with ngram bloom-filter indexes for full-text search acceleration.
* <p>
* The {@link #index} and {@link #delete} methods are no-ops because data is
* written by the accumulator/store pipeline, not the search index.
*/
public class ClickHouseSearchIndex implements SearchIndex {
private static final Logger log = LoggerFactory.getLogger(ClickHouseSearchIndex.class);
private static final ObjectMapper JSON = new ObjectMapper();
private static final TypeReference<Map<String, String>> STR_MAP = new TypeReference<>() {};
private static final int HIGHLIGHT_CONTEXT_CHARS = 120;
private static final Map<String, String> SORT_FIELD_MAP = Map.of(
"startTime", "start_time",
"durationMs", "duration_ms",
"status", "status",
"instanceId", "instance_id",
"routeId", "route_id",
"correlationId", "correlation_id",
"executionId", "execution_id",
"applicationId", "application_id"
);
private final JdbcTemplate jdbc;
public ClickHouseSearchIndex(JdbcTemplate jdbc) {
this.jdbc = jdbc;
}
@Override
public void index(ExecutionDocument document) {
// No-op: data is written by ClickHouseExecutionStore
}
@Override
public void delete(String executionId) {
// No-op: ClickHouse ReplacingMergeTree handles versioning
}
@Override
public SearchResult<ExecutionSummary> search(SearchRequest request) {
try {
List<Object> params = new ArrayList<>();
String whereClause = buildWhereClause(request, params);
String searchTerm = request.text();
// Count query
String countSql = "SELECT count() FROM executions FINAL WHERE " + whereClause;
Long total = jdbc.queryForObject(countSql, Long.class, params.toArray());
if (total == null || total == 0) {
return SearchResult.empty(request.offset(), request.limit());
}
// Data query
String sortColumn = SORT_FIELD_MAP.getOrDefault(request.sortField(), "start_time");
String sortDir = "asc".equalsIgnoreCase(request.sortDir()) ? "ASC" : "DESC";
String dataSql = "SELECT execution_id, route_id, instance_id, application_id, "
+ "status, start_time, end_time, duration_ms, correlation_id, "
+ "error_message, error_stacktrace, diagram_content_hash, attributes, "
+ "has_trace_data, is_replay, "
+ "input_body, output_body, input_headers, output_headers, root_cause_message "
+ "FROM executions FINAL WHERE " + whereClause
+ " ORDER BY " + sortColumn + " " + sortDir
+ " LIMIT ? OFFSET ?";
List<Object> dataParams = new ArrayList<>(params);
dataParams.add(request.limit());
dataParams.add(request.offset());
List<ExecutionSummary> data = jdbc.query(
dataSql, dataParams.toArray(),
(rs, rowNum) -> mapRow(rs, searchTerm));
return new SearchResult<>(data, total, request.offset(), request.limit());
} catch (Exception e) {
log.error("ClickHouse search failed", e);
return SearchResult.empty(request.offset(), request.limit());
}
}
@Override
public long count(SearchRequest request) {
try {
List<Object> params = new ArrayList<>();
String whereClause = buildWhereClause(request, params);
String sql = "SELECT count() FROM executions FINAL WHERE " + whereClause;
Long result = jdbc.queryForObject(sql, Long.class, params.toArray());
return result != null ? result : 0L;
} catch (Exception e) {
log.error("ClickHouse count failed", e);
return 0L;
}
}
private String buildWhereClause(SearchRequest request, List<Object> params) {
List<String> conditions = new ArrayList<>();
conditions.add("tenant_id = 'default'");
if (request.timeFrom() != null) {
conditions.add("start_time >= ?");
params.add(Timestamp.from(request.timeFrom()));
}
if (request.timeTo() != null) {
conditions.add("start_time <= ?");
params.add(Timestamp.from(request.timeTo()));
}
if (request.status() != null && !request.status().isBlank()) {
String[] statuses = request.status().split(",");
if (statuses.length == 1) {
conditions.add("status = ?");
params.add(statuses[0].trim());
} else {
String placeholders = String.join(", ", Collections.nCopies(statuses.length, "?"));
conditions.add("status IN (" + placeholders + ")");
for (String s : statuses) {
params.add(s.trim());
}
}
}
if (request.routeId() != null) {
conditions.add("route_id = ?");
params.add(request.routeId());
}
if (request.instanceId() != null) {
conditions.add("instance_id = ?");
params.add(request.instanceId());
}
if (request.correlationId() != null) {
conditions.add("correlation_id = ?");
params.add(request.correlationId());
}
if (request.applicationId() != null && !request.applicationId().isBlank()) {
conditions.add("application_id = ?");
params.add(request.applicationId());
}
if (request.instanceIds() != null && !request.instanceIds().isEmpty()) {
String placeholders = String.join(", ", Collections.nCopies(request.instanceIds().size(), "?"));
conditions.add("instance_id IN (" + placeholders + ")");
params.addAll(request.instanceIds());
}
if (request.durationMin() != null) {
conditions.add("duration_ms >= ?");
params.add(request.durationMin());
}
if (request.durationMax() != null) {
conditions.add("duration_ms <= ?");
params.add(request.durationMax());
}
// Global full-text search: exact ID match, full-text on execution + processor level
if (request.text() != null && !request.text().isBlank()) {
// Bind the raw text for the exact ID comparisons; LIKE-escaping is only for the substring patterns,
// otherwise IDs containing '_' or '%' would never match the equality branches.
String rawText = request.text();
String likeTerm = "%" + escapeLike(rawText) + "%";
conditions.add("(execution_id = ? OR correlation_id = ? OR exchange_id = ?"
+ " OR _search_text LIKE ? OR execution_id IN ("
+ "SELECT DISTINCT execution_id FROM processor_executions "
+ "WHERE tenant_id = 'default' AND _search_text LIKE ?))");
params.add(rawText);
params.add(rawText);
params.add(rawText);
params.add(likeTerm);
params.add(likeTerm);
}
// Scoped body search in processor_executions
if (request.textInBody() != null && !request.textInBody().isBlank()) {
String likeTerm = "%" + escapeLike(request.textInBody()) + "%";
conditions.add("execution_id IN ("
+ "SELECT DISTINCT execution_id FROM processor_executions "
+ "WHERE tenant_id = 'default' AND (input_body LIKE ? OR output_body LIKE ?))");
params.add(likeTerm);
params.add(likeTerm);
}
// Scoped headers search in processor_executions
if (request.textInHeaders() != null && !request.textInHeaders().isBlank()) {
String likeTerm = "%" + escapeLike(request.textInHeaders()) + "%";
conditions.add("execution_id IN ("
+ "SELECT DISTINCT execution_id FROM processor_executions "
+ "WHERE tenant_id = 'default' AND (input_headers LIKE ? OR output_headers LIKE ?))");
params.add(likeTerm);
params.add(likeTerm);
}
// Scoped error search: execution-level + processor-level
if (request.textInErrors() != null && !request.textInErrors().isBlank()) {
String likeTerm = "%" + escapeLike(request.textInErrors()) + "%";
conditions.add("(error_message LIKE ? OR error_stacktrace LIKE ? OR execution_id IN ("
+ "SELECT DISTINCT execution_id FROM processor_executions "
+ "WHERE tenant_id = 'default' AND (error_message LIKE ? OR error_stacktrace LIKE ?)))");
params.add(likeTerm);
params.add(likeTerm);
params.add(likeTerm);
params.add(likeTerm);
}
return String.join(" AND ", conditions);
}
private ExecutionSummary mapRow(ResultSet rs, String searchTerm) throws SQLException {
String executionId = rs.getString("execution_id");
String routeId = rs.getString("route_id");
String instanceId = rs.getString("instance_id");
String applicationId = rs.getString("application_id");
String status = rs.getString("status");
Timestamp startTs = rs.getTimestamp("start_time");
Instant startTime = startTs != null ? startTs.toInstant() : null;
Timestamp endTs = rs.getTimestamp("end_time");
Instant endTime = endTs != null ? endTs.toInstant() : null;
long durationMs = rs.getLong("duration_ms");
String correlationId = rs.getString("correlation_id");
String errorMessage = rs.getString("error_message");
String errorStacktrace = rs.getString("error_stacktrace");
String diagramContentHash = rs.getString("diagram_content_hash");
String attributesJson = rs.getString("attributes");
boolean hasTraceData = rs.getBoolean("has_trace_data");
boolean isReplay = rs.getBoolean("is_replay");
String inputBody = rs.getString("input_body");
String outputBody = rs.getString("output_body");
String inputHeaders = rs.getString("input_headers");
String outputHeaders = rs.getString("output_headers");
String rootCauseMessage = rs.getString("root_cause_message");
Map<String, String> attributes = parseAttributesJson(attributesJson);
// Application-side highlighting
String highlight = null;
if (searchTerm != null && !searchTerm.isBlank()) {
highlight = findHighlight(searchTerm, errorMessage, errorStacktrace,
inputBody, outputBody, inputHeaders, outputHeaders, attributesJson, rootCauseMessage);
}
return new ExecutionSummary(
executionId, routeId, instanceId, applicationId, status,
startTime, endTime, durationMs,
correlationId, errorMessage, diagramContentHash,
highlight, attributes, hasTraceData, isReplay
);
}
private String findHighlight(String searchTerm, String... fields) {
for (String field : fields) {
String snippet = extractSnippet(field, searchTerm, HIGHLIGHT_CONTEXT_CHARS);
if (snippet != null) {
return snippet;
}
}
return null;
}
static String extractSnippet(String text, String searchTerm, int contextChars) {
if (text == null || text.isEmpty() || searchTerm == null) return null;
int idx = text.toLowerCase().indexOf(searchTerm.toLowerCase());
if (idx < 0) return null;
int start = Math.max(0, idx - contextChars / 2);
int end = Math.min(text.length(), idx + searchTerm.length() + contextChars / 2);
String before = escapeHtml(text.substring(start, idx));
String match = escapeHtml(text.substring(idx, idx + searchTerm.length()));
String after = escapeHtml(text.substring(idx + searchTerm.length(), end));
return (start > 0 ? "..." : "") + before + "<mark>" + match + "</mark>" + after + (end < text.length() ? "..." : "");
}
private static String escapeHtml(String s) {
return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
}
private static String escapeLike(String term) {
return term.replace("\\", "\\\\")
.replace("%", "\\%")
.replace("_", "\\_");
}
@Override
public List<String> distinctAttributeKeys() {
try {
return jdbc.queryForList("""
SELECT DISTINCT arrayJoin(JSONExtractKeys(attributes)) AS attr_key
FROM executions FINAL
WHERE tenant_id = 'default' AND attributes != '' AND attributes != '{}'
ORDER BY attr_key
""", String.class);
} catch (Exception e) {
log.error("Failed to query distinct attribute keys", e);
return List.of();
}
}
private static Map<String, String> parseAttributesJson(String json) {
if (json == null || json.isBlank()) return null;
try {
return JSON.readValue(json, STR_MAP);
} catch (Exception e) {
return null;
}
}
}
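extractSnippet is package-private and static, so it can be exercised directly. A quick check in the same package; the expected output follows from the 120-character context window above (the whole input fits inside it, so no leading or trailing ellipsis):

package com.cameleer3.server.app.search;

// Demonstrates the snippet helper defined in ClickHouseSearchIndex above.
public class SnippetDemo {
    public static void main(String[] args) {
        String text = "Order failed: timeout while calling payment-service endpoint";
        String snippet = ClickHouseSearchIndex.extractSnippet(text, "payment", 120);
        System.out.println(snippet);
        // -> Order failed: timeout while calling <mark>payment</mark>-service endpoint
    }
}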

View File

@@ -1,433 +0,0 @@
package com.cameleer3.server.app.search;
import com.cameleer3.server.core.search.ExecutionSummary;
import com.cameleer3.server.core.search.SearchRequest;
import com.cameleer3.server.core.search.SearchResult;
import com.cameleer3.server.core.storage.SearchIndex;
import com.cameleer3.server.core.storage.model.ExecutionDocument;
import com.cameleer3.server.core.storage.model.ExecutionDocument.ProcessorDoc;
import com.fasterxml.jackson.core.type.TypeReference;
import com.fasterxml.jackson.databind.ObjectMapper;
import jakarta.annotation.PostConstruct;
import org.opensearch.client.json.JsonData;
import org.opensearch.client.opensearch.OpenSearchClient;
import org.opensearch.client.opensearch._types.FieldValue;
import org.opensearch.client.opensearch._types.SortOrder;
import org.opensearch.client.opensearch._types.query_dsl.*;
import org.opensearch.client.opensearch.core.*;
import org.opensearch.client.opensearch.core.search.Hit;
import org.opensearch.client.opensearch.indices.*;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.stereotype.Repository;
import java.io.IOException;
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.util.*;
import java.util.stream.Collectors;
@Repository
public class OpenSearchIndex implements SearchIndex {
private static final Logger log = LoggerFactory.getLogger(OpenSearchIndex.class);
private static final DateTimeFormatter DAY_FMT = DateTimeFormatter.ofPattern("yyyy-MM-dd")
.withZone(ZoneOffset.UTC);
private static final ObjectMapper JSON = new ObjectMapper();
private static final TypeReference<Map<String, String>> STR_MAP = new TypeReference<>() {};
private final OpenSearchClient client;
private final String indexPrefix;
public OpenSearchIndex(OpenSearchClient client,
@Value("${opensearch.index-prefix:executions-}") String indexPrefix) {
this.client = client;
this.indexPrefix = indexPrefix;
}
@PostConstruct
void ensureIndexTemplate() {
String templateName = indexPrefix + "template";
String indexPattern = indexPrefix + "*";
try {
boolean exists = client.indices().existsIndexTemplate(
ExistsIndexTemplateRequest.of(b -> b.name(templateName))).value();
if (!exists) {
client.indices().putIndexTemplate(PutIndexTemplateRequest.of(b -> b
.name(templateName)
.indexPatterns(List.of(indexPattern))
.template(t -> t
.settings(s -> s
.numberOfShards("3")
.numberOfReplicas("1"))
.mappings(m -> m
.properties("processors", p -> p
.nested(n -> n))))));
log.info("OpenSearch index template created");
}
} catch (IOException e) {
log.error("Failed to create index template", e);
}
}
@Override
public void index(ExecutionDocument doc) {
String indexName = indexPrefix + DAY_FMT.format(doc.startTime());
try {
client.index(IndexRequest.of(b -> b
.index(indexName)
.id(doc.executionId())
.document(toMap(doc))));
} catch (IOException e) {
log.error("Failed to index execution {}", doc.executionId(), e);
}
}
@Override
public SearchResult<ExecutionSummary> search(SearchRequest request) {
try {
var searchReq = buildSearchRequest(request, request.limit());
var response = client.search(searchReq, Map.class);
List<ExecutionSummary> items = response.hits().hits().stream()
.map(this::hitToSummary)
.collect(Collectors.toList());
long total = response.hits().total() != null ? response.hits().total().value() : 0;
return new SearchResult<>(items, total, request.offset(), request.limit());
} catch (IOException e) {
log.error("Search failed", e);
return SearchResult.empty(request.offset(), request.limit());
}
}
@Override
public long count(SearchRequest request) {
try {
var countReq = CountRequest.of(b -> b
.index(indexPrefix + "*")
.query(buildQuery(request)));
return client.count(countReq).count();
} catch (IOException e) {
log.error("Count failed", e);
return 0;
}
}
@Override
public void delete(String executionId) {
try {
client.deleteByQuery(DeleteByQueryRequest.of(b -> b
.index(List.of(indexPrefix + "*"))
.query(Query.of(q -> q.term(t -> t
.field("execution_id")
.value(FieldValue.of(executionId)))))));
} catch (IOException e) {
log.error("Failed to delete execution {}", executionId, e);
}
}
private static final List<String> HIGHLIGHT_FIELDS = List.of(
"error_message", "attributes_text",
"processors.input_body", "processors.output_body",
"processors.input_headers", "processors.output_headers",
"processors.attributes_text");
private org.opensearch.client.opensearch.core.SearchRequest buildSearchRequest(
SearchRequest request, int size) {
return org.opensearch.client.opensearch.core.SearchRequest.of(b -> {
b.index(indexPrefix + "*")
.query(buildQuery(request))
.trackTotalHits(th -> th.enabled(true))
.size(size)
.from(request.offset())
.sort(s -> s.field(f -> f
.field(request.sortColumn())
.order("asc".equalsIgnoreCase(request.sortDir())
? SortOrder.Asc : SortOrder.Desc)));
// Add highlight when full-text search is active
if (request.text() != null && !request.text().isBlank()) {
b.highlight(h -> {
for (String field : HIGHLIGHT_FIELDS) {
h.fields(field, hf -> hf
.fragmentSize(120)
.numberOfFragments(1));
}
return h;
});
}
return b;
});
}
private Query buildQuery(SearchRequest request) {
List<Query> must = new ArrayList<>();
List<Query> filter = new ArrayList<>();
// Time range
if (request.timeFrom() != null || request.timeTo() != null) {
filter.add(Query.of(q -> q.range(r -> {
r.field("start_time");
if (request.timeFrom() != null)
r.gte(JsonData.of(request.timeFrom().toString()));
if (request.timeTo() != null)
r.lte(JsonData.of(request.timeTo().toString()));
return r;
})));
}
// Keyword filters (use .keyword sub-field for exact matching on dynamically mapped text fields)
if (request.status() != null && !request.status().isBlank()) {
String[] statuses = request.status().split(",");
if (statuses.length == 1) {
filter.add(termQuery("status.keyword", statuses[0].trim()));
} else {
filter.add(Query.of(q -> q.terms(t -> t
.field("status.keyword")
.terms(tv -> tv.value(
java.util.Arrays.stream(statuses)
.map(String::trim)
.map(FieldValue::of)
.toList())))));
}
}
if (request.routeId() != null)
filter.add(termQuery("route_id.keyword", request.routeId()));
if (request.agentId() != null)
filter.add(termQuery("agent_id.keyword", request.agentId()));
if (request.correlationId() != null)
filter.add(termQuery("correlation_id.keyword", request.correlationId()));
if (request.application() != null && !request.application().isBlank())
filter.add(termQuery("application_name.keyword", request.application()));
// Full-text search across all fields + nested processor fields
if (request.text() != null && !request.text().isBlank()) {
String text = request.text();
String wildcard = "*" + text.toLowerCase() + "*";
List<Query> textQueries = new ArrayList<>();
// Search top-level text fields (analyzed match + wildcard for substring)
textQueries.add(Query.of(q -> q.multiMatch(m -> m
.query(text)
.fields("error_message", "error_stacktrace", "attributes_text"))));
textQueries.add(Query.of(q -> q.wildcard(w -> w
.field("error_message").value(wildcard).caseInsensitive(true))));
textQueries.add(Query.of(q -> q.wildcard(w -> w
.field("error_stacktrace").value(wildcard).caseInsensitive(true))));
textQueries.add(Query.of(q -> q.wildcard(w -> w
.field("attributes_text").value(wildcard).caseInsensitive(true))));
// Search nested processor fields (analyzed match + wildcard)
textQueries.add(Query.of(q -> q.nested(n -> n
.path("processors")
.query(nq -> nq.multiMatch(m -> m
.query(text)
.fields("processors.input_body", "processors.output_body",
"processors.input_headers", "processors.output_headers",
"processors.error_message", "processors.error_stacktrace",
"processors.attributes_text"))))));
textQueries.add(Query.of(q -> q.nested(n -> n
.path("processors")
.query(nq -> nq.bool(nb -> nb.should(
wildcardQuery("processors.input_body", wildcard),
wildcardQuery("processors.output_body", wildcard),
wildcardQuery("processors.input_headers", wildcard),
wildcardQuery("processors.output_headers", wildcard),
wildcardQuery("processors.attributes_text", wildcard)
).minimumShouldMatch("1"))))));
// Also try keyword fields for exact matches
textQueries.add(Query.of(q -> q.multiMatch(m -> m
.query(text)
.fields("execution_id", "route_id", "agent_id", "correlation_id", "exchange_id"))));
must.add(Query.of(q -> q.bool(b -> b.should(textQueries).minimumShouldMatch("1"))));
}
// Scoped text searches (multiMatch + wildcard fallback for substring matching)
if (request.textInBody() != null && !request.textInBody().isBlank()) {
String bodyText = request.textInBody();
String bodyWildcard = "*" + bodyText.toLowerCase() + "*";
must.add(Query.of(q -> q.nested(n -> n
.path("processors")
.query(nq -> nq.bool(nb -> nb.should(
Query.of(mq -> mq.multiMatch(m -> m
.query(bodyText)
.fields("processors.input_body", "processors.output_body"))),
wildcardQuery("processors.input_body", bodyWildcard),
wildcardQuery("processors.output_body", bodyWildcard)
).minimumShouldMatch("1"))))));
}
if (request.textInHeaders() != null && !request.textInHeaders().isBlank()) {
String headerText = request.textInHeaders();
String headerWildcard = "*" + headerText.toLowerCase() + "*";
must.add(Query.of(q -> q.nested(n -> n
.path("processors")
.query(nq -> nq.bool(nb -> nb.should(
Query.of(mq -> mq.multiMatch(m -> m
.query(headerText)
.fields("processors.input_headers", "processors.output_headers"))),
wildcardQuery("processors.input_headers", headerWildcard),
wildcardQuery("processors.output_headers", headerWildcard)
).minimumShouldMatch("1"))))));
}
if (request.textInErrors() != null && !request.textInErrors().isBlank()) {
String errText = request.textInErrors();
String errWildcard = "*" + errText.toLowerCase() + "*";
must.add(Query.of(q -> q.bool(b -> b.should(
Query.of(sq -> sq.multiMatch(m -> m
.query(errText)
.fields("error_message", "error_stacktrace"))),
wildcardQuery("error_message", errWildcard),
wildcardQuery("error_stacktrace", errWildcard),
Query.of(sq -> sq.nested(n -> n
.path("processors")
.query(nq -> nq.bool(nb -> nb.should(
Query.of(nmq -> nmq.multiMatch(m -> m
.query(errText)
.fields("processors.error_message", "processors.error_stacktrace"))),
wildcardQuery("processors.error_message", errWildcard),
wildcardQuery("processors.error_stacktrace", errWildcard)
).minimumShouldMatch("1")))))
).minimumShouldMatch("1"))));
}
// Duration range
if (request.durationMin() != null || request.durationMax() != null) {
filter.add(Query.of(q -> q.range(r -> {
r.field("duration_ms");
if (request.durationMin() != null)
r.gte(JsonData.of(request.durationMin()));
if (request.durationMax() != null)
r.lte(JsonData.of(request.durationMax()));
return r;
})));
}
return Query.of(q -> q.bool(b -> {
if (!must.isEmpty()) b.must(must);
if (!filter.isEmpty()) b.filter(filter);
if (must.isEmpty() && filter.isEmpty()) b.must(Query.of(mq -> mq.matchAll(m -> m)));
return b;
}));
}
private Query termQuery(String field, String value) {
return Query.of(q -> q.term(t -> t.field(field).value(FieldValue.of(value))));
}
private Query wildcardQuery(String field, String pattern) {
return Query.of(q -> q.wildcard(w -> w.field(field).value(pattern).caseInsensitive(true)));
}
private Map<String, Object> toMap(ExecutionDocument doc) {
Map<String, Object> map = new LinkedHashMap<>();
map.put("execution_id", doc.executionId());
map.put("route_id", doc.routeId());
map.put("agent_id", doc.agentId());
map.put("application_name", doc.applicationName());
map.put("status", doc.status());
map.put("correlation_id", doc.correlationId());
map.put("exchange_id", doc.exchangeId());
map.put("start_time", doc.startTime() != null ? doc.startTime().toString() : null);
map.put("end_time", doc.endTime() != null ? doc.endTime().toString() : null);
map.put("duration_ms", doc.durationMs());
map.put("error_message", doc.errorMessage());
map.put("error_stacktrace", doc.errorStacktrace());
if (doc.attributes() != null) {
Map<String, String> attrs = parseAttributesJson(doc.attributes());
map.put("attributes", attrs);
map.put("attributes_text", flattenAttributes(attrs));
}
if (doc.processors() != null) {
map.put("processors", doc.processors().stream().map(p -> {
Map<String, Object> pm = new LinkedHashMap<>();
pm.put("processor_id", p.processorId());
pm.put("processor_type", p.processorType());
pm.put("status", p.status());
pm.put("error_message", p.errorMessage());
pm.put("error_stacktrace", p.errorStacktrace());
pm.put("input_body", p.inputBody());
pm.put("output_body", p.outputBody());
pm.put("input_headers", p.inputHeaders());
pm.put("output_headers", p.outputHeaders());
if (p.attributes() != null) {
Map<String, String> pAttrs = parseAttributesJson(p.attributes());
pm.put("attributes", pAttrs);
pm.put("attributes_text", flattenAttributes(pAttrs));
}
return pm;
}).toList());
}
map.put("has_trace_data", doc.hasTraceData());
map.put("is_replay", doc.isReplay());
return map;
}
@SuppressWarnings("unchecked")
private ExecutionSummary hitToSummary(Hit<Map> hit) {
Map<String, Object> src = hit.source();
if (src == null) return null;
@SuppressWarnings("unchecked")
Map<String, String> attributes = src.get("attributes") instanceof Map
? new LinkedHashMap<>((Map<String, String>) src.get("attributes")) : null;
// Merge processor-level attributes (execution-level takes precedence)
if (src.get("processors") instanceof List<?> procs) {
for (Object pObj : procs) {
if (pObj instanceof Map<?, ?> pm && pm.get("attributes") instanceof Map<?, ?> pa) {
if (attributes == null) attributes = new LinkedHashMap<>();
for (var entry : pa.entrySet()) {
attributes.putIfAbsent(
String.valueOf(entry.getKey()),
String.valueOf(entry.getValue()));
}
}
}
}
return new ExecutionSummary(
(String) src.get("execution_id"),
(String) src.get("route_id"),
(String) src.get("agent_id"),
(String) src.get("application_name"),
(String) src.get("status"),
src.get("start_time") != null ? Instant.parse((String) src.get("start_time")) : null,
src.get("end_time") != null ? Instant.parse((String) src.get("end_time")) : null,
src.get("duration_ms") != null ? ((Number) src.get("duration_ms")).longValue() : 0L,
(String) src.get("correlation_id"),
(String) src.get("error_message"),
null, // diagramContentHash not stored in index
extractHighlight(hit),
attributes,
Boolean.TRUE.equals(src.get("has_trace_data")),
Boolean.TRUE.equals(src.get("is_replay"))
);
}
private String extractHighlight(Hit<Map> hit) {
if (hit.highlight() == null || hit.highlight().isEmpty()) return null;
for (List<String> fragments : hit.highlight().values()) {
if (fragments != null && !fragments.isEmpty()) {
return fragments.get(0);
}
}
return null;
}
private static Map<String, String> parseAttributesJson(String json) {
if (json == null || json.isBlank()) return null;
try {
return JSON.readValue(json, STR_MAP);
} catch (Exception e) {
return null;
}
}
private static String flattenAttributes(Map<String, String> attrs) {
if (attrs == null || attrs.isEmpty()) return "";
return attrs.entrySet().stream()
.map(e -> e.getKey() + "=" + e.getValue())
.collect(Collectors.joining(" "));
}
}

View File

@@ -1,223 +0,0 @@
package com.cameleer3.server.app.search;
import com.cameleer3.common.model.LogEntry;
import com.cameleer3.server.app.dto.LogEntryResponse;
import jakarta.annotation.PostConstruct;
import org.opensearch.client.json.JsonData;
import org.opensearch.client.opensearch.OpenSearchClient;
import org.opensearch.client.opensearch._types.FieldValue;
import org.opensearch.client.opensearch._types.SortOrder;
import org.opensearch.client.opensearch._types.mapping.Property;
import org.opensearch.client.opensearch._types.query_dsl.BoolQuery;
import org.opensearch.client.opensearch._types.query_dsl.Query;
import org.opensearch.client.opensearch.core.BulkRequest;
import org.opensearch.client.opensearch.core.BulkResponse;
import org.opensearch.client.opensearch.core.bulk.BulkResponseItem;
import org.opensearch.client.opensearch.indices.ExistsIndexTemplateRequest;
import org.opensearch.client.opensearch.indices.PutIndexTemplateRequest;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.stereotype.Repository;
import java.io.IOException;
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
@Repository
public class OpenSearchLogIndex {
private static final Logger log = LoggerFactory.getLogger(OpenSearchLogIndex.class);
private static final DateTimeFormatter DAY_FMT = DateTimeFormatter.ofPattern("yyyy-MM-dd")
.withZone(ZoneOffset.UTC);
private final OpenSearchClient client;
private final String indexPrefix;
private final int retentionDays;
public OpenSearchLogIndex(OpenSearchClient client,
@Value("${opensearch.log-index-prefix:logs-}") String indexPrefix,
@Value("${opensearch.log-retention-days:7}") int retentionDays) {
this.client = client;
this.indexPrefix = indexPrefix;
this.retentionDays = retentionDays;
}
@PostConstruct
void init() {
ensureIndexTemplate();
ensureIsmPolicy();
}
private void ensureIndexTemplate() {
String templateName = indexPrefix.replace("-", "") + "-template";
String indexPattern = indexPrefix + "*";
try {
boolean exists = client.indices().existsIndexTemplate(
ExistsIndexTemplateRequest.of(b -> b.name(templateName))).value();
if (!exists) {
client.indices().putIndexTemplate(PutIndexTemplateRequest.of(b -> b
.name(templateName)
.indexPatterns(List.of(indexPattern))
.template(t -> t
.settings(s -> s
.numberOfShards("1")
.numberOfReplicas("1"))
.mappings(m -> m
.properties("@timestamp", Property.of(p -> p.date(d -> d)))
.properties("level", Property.of(p -> p.keyword(k -> k)))
.properties("loggerName", Property.of(p -> p.keyword(k -> k)))
.properties("message", Property.of(p -> p.text(tx -> tx)))
.properties("threadName", Property.of(p -> p.keyword(k -> k)))
.properties("stackTrace", Property.of(p -> p.text(tx -> tx)))
.properties("agentId", Property.of(p -> p.keyword(k -> k)))
.properties("application", Property.of(p -> p.keyword(k -> k)))
.properties("exchangeId", Property.of(p -> p.keyword(k -> k)))))));
log.info("OpenSearch log index template '{}' created", templateName);
}
} catch (IOException e) {
log.error("Failed to create log index template", e);
}
}
private void ensureIsmPolicy() {
String policyId = "logs-retention";
try {
// ISM policies are managed via the _plugins/_ism/policies API, which the
// typed Java client does not expose; creating one here would require the
// low-level REST transport. For now, log a reminder: the ISM policy
// should be created via the OpenSearch API or dashboard.
log.info("Log retention policy: indices matching '{}*' should be deleted after {} days. " +
"Ensure ISM policy '{}' is configured in OpenSearch.", indexPrefix, retentionDays, policyId);
} catch (Exception e) {
log.warn("Could not verify ISM policy for log retention", e);
}
}
public List<LogEntryResponse> search(String application, String agentId, String level,
String query, String exchangeId,
Instant from, Instant to, int limit) {
try {
BoolQuery.Builder bool = new BoolQuery.Builder();
bool.must(Query.of(q -> q.term(t -> t.field("application").value(FieldValue.of(application)))));
if (agentId != null && !agentId.isEmpty()) {
bool.must(Query.of(q -> q.term(t -> t.field("agentId").value(FieldValue.of(agentId)))));
}
if (exchangeId != null && !exchangeId.isEmpty()) {
// Match on top-level field (new records) or MDC nested field (old records)
bool.must(Query.of(q -> q.bool(b -> b
.should(Query.of(s -> s.term(t -> t.field("exchangeId.keyword").value(FieldValue.of(exchangeId)))))
.should(Query.of(s -> s.term(t -> t.field("mdc.camel.exchangeId.keyword").value(FieldValue.of(exchangeId)))))
.minimumShouldMatch("1"))));
}
if (level != null && !level.isEmpty()) {
bool.must(Query.of(q -> q.term(t -> t.field("level").value(FieldValue.of(level.toUpperCase())))));
}
if (query != null && !query.isEmpty()) {
bool.must(Query.of(q -> q.match(m -> m.field("message").query(FieldValue.of(query)))));
}
if (from != null || to != null) {
bool.must(Query.of(q -> q.range(r -> {
r.field("@timestamp");
if (from != null) r.gte(JsonData.of(from.toString()));
if (to != null) r.lte(JsonData.of(to.toString()));
return r;
})));
}
var response = client.search(s -> s
.index(indexPrefix + "*")
.query(Query.of(q -> q.bool(bool.build())))
.sort(so -> so.field(f -> f.field("@timestamp").order(SortOrder.Desc)))
.size(limit), Map.class);
List<LogEntryResponse> results = new ArrayList<>();
for (var hit : response.hits().hits()) {
@SuppressWarnings("unchecked")
Map<String, Object> src = (Map<String, Object>) hit.source();
if (src == null) continue;
results.add(new LogEntryResponse(
str(src, "@timestamp"),
str(src, "level"),
str(src, "loggerName"),
str(src, "message"),
str(src, "threadName"),
str(src, "stackTrace")));
}
return results;
} catch (IOException e) {
log.error("Failed to search log entries for application={}", application, e);
return List.of();
}
}
private static String str(Map<String, Object> map, String key) {
Object v = map.get(key);
return v != null ? v.toString() : null;
}
public void indexBatch(String agentId, String application, List<LogEntry> entries) {
if (entries == null || entries.isEmpty()) {
return;
}
try {
BulkRequest.Builder bulkBuilder = new BulkRequest.Builder();
for (LogEntry entry : entries) {
String indexName = indexPrefix + DAY_FMT.format(
entry.getTimestamp() != null ? entry.getTimestamp() : Instant.now());
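// e.g. "logs-2026-04-02": one index per UTC day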
Map<String, Object> doc = toMap(entry, agentId, application);
bulkBuilder.operations(op -> op
.index(idx -> idx
.index(indexName)
.document(doc)));
}
BulkResponse response = client.bulk(bulkBuilder.build());
if (response.errors()) {
int errorCount = 0;
for (BulkResponseItem item : response.items()) {
if (item.error() != null) {
errorCount++;
if (errorCount == 1) {
log.error("Bulk log index error: {}", item.error().reason());
}
}
}
log.error("Bulk log indexing had {} error(s) out of {} entries", errorCount, entries.size());
} else {
log.debug("Indexed {} log entries for agent={}, app={}", entries.size(), agentId, application);
}
} catch (IOException e) {
log.error("Failed to bulk index {} log entries for agent={}", entries.size(), agentId, e);
}
}
private Map<String, Object> toMap(LogEntry entry, String agentId, String application) {
Map<String, Object> doc = new LinkedHashMap<>();
doc.put("@timestamp", entry.getTimestamp() != null ? entry.getTimestamp().toString() : null);
doc.put("level", entry.getLevel());
doc.put("loggerName", entry.getLoggerName());
doc.put("message", entry.getMessage());
doc.put("threadName", entry.getThreadName());
doc.put("stackTrace", entry.getStackTrace());
doc.put("mdc", entry.getMdc());
doc.put("agentId", agentId);
doc.put("application", application);
if (entry.getMdc() != null) {
String exId = entry.getMdc().get("camel.exchangeId");
if (exId != null) doc.put("exchangeId", exId);
}
return doc;
}
}

View File

@@ -0,0 +1,73 @@
package com.cameleer3.server.app.storage;
import com.cameleer3.server.core.agent.AgentEventRecord;
import com.cameleer3.server.core.agent.AgentEventRepository;
import org.springframework.jdbc.core.JdbcTemplate;
import java.sql.Timestamp;
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;
/**
* ClickHouse implementation of {@link AgentEventRepository}.
* <p>
* The ClickHouse table has no {@code id} column (no BIGSERIAL equivalent),
* so all returned {@link AgentEventRecord} instances have {@code id = 0}.
*/
public class ClickHouseAgentEventRepository implements AgentEventRepository {
private static final String TENANT = "default";
private static final String INSERT_SQL =
"INSERT INTO agent_events (tenant_id, instance_id, application_id, event_type, detail) VALUES (?, ?, ?, ?, ?)";
private static final String SELECT_BASE =
"SELECT 0 AS id, instance_id, application_id, event_type, detail, timestamp FROM agent_events WHERE tenant_id = ?";
private final JdbcTemplate jdbc;
public ClickHouseAgentEventRepository(JdbcTemplate jdbc) {
this.jdbc = jdbc;
}
@Override
public void insert(String instanceId, String applicationId, String eventType, String detail) {
jdbc.update(INSERT_SQL, TENANT, instanceId, applicationId, eventType, detail);
}
@Override
public List<AgentEventRecord> query(String applicationId, String instanceId, Instant from, Instant to, int limit) {
var sql = new StringBuilder(SELECT_BASE);
var params = new ArrayList<Object>();
params.add(TENANT);
if (applicationId != null) {
sql.append(" AND application_id = ?");
params.add(applicationId);
}
if (instanceId != null) {
sql.append(" AND instance_id = ?");
params.add(instanceId);
}
if (from != null) {
sql.append(" AND timestamp >= ?");
params.add(Timestamp.from(from));
}
if (to != null) {
sql.append(" AND timestamp < ?");
params.add(Timestamp.from(to));
}
sql.append(" ORDER BY timestamp DESC LIMIT ?");
params.add(limit);
return jdbc.query(sql.toString(), (rs, rowNum) -> new AgentEventRecord(
rs.getLong("id"),
rs.getString("instance_id"),
rs.getString("application_id"),
rs.getString("event_type"),
rs.getString("detail"),
rs.getTimestamp("timestamp").toInstant()
), params.toArray());
}
}

View File

@@ -1,6 +1,7 @@
package com.cameleer3.server.app.storage;
import com.cameleer3.common.graph.RouteGraph;
import com.cameleer3.common.graph.RouteNode;
import com.cameleer3.server.core.ingestion.TaggedDiagram;
import com.cameleer3.server.core.storage.DiagramStore;
import com.fasterxml.jackson.core.JsonProcessingException;
@@ -9,11 +10,12 @@ import com.fasterxml.jackson.datatype.jsr310.JavaTimeModule;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.jdbc.core.JdbcTemplate;
import org.springframework.stereotype.Repository;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.sql.Timestamp;
import java.time.Instant;
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
@@ -23,37 +25,49 @@ import java.util.Map;
import java.util.Optional;
/**
* PostgreSQL implementation of {@link DiagramStore}.
* ClickHouse implementation of {@link DiagramStore}.
* <p>
* Stores route graphs as JSON with SHA-256 content-hash deduplication.
* Uses {@code ON CONFLICT (content_hash) DO NOTHING} for idempotent inserts.
* Uses ReplacingMergeTree; duplicate inserts are deduplicated on merge.
* <p>
* {@code findProcessorRouteMapping} fetches all definitions for the application
* and deserializes them in Java because ClickHouse has no equivalent of
* PostgreSQL's {@code jsonb_array_elements()}.
*/
@Repository
public class PostgresDiagramStore implements DiagramStore {
public class ClickHouseDiagramStore implements DiagramStore {
private static final Logger log = LoggerFactory.getLogger(PostgresDiagramStore.class);
private static final Logger log = LoggerFactory.getLogger(ClickHouseDiagramStore.class);
private static final String TENANT = "default";
private static final String INSERT_SQL = """
INSERT INTO route_diagrams (content_hash, route_id, agent_id, application_name, definition)
VALUES (?, ?, ?, ?, ?::jsonb)
ON CONFLICT (content_hash) DO NOTHING
INSERT INTO route_diagrams
(tenant_id, content_hash, route_id, instance_id, application_id, definition, created_at)
VALUES (?, ?, ?, ?, ?, ?, ?)
""";
private static final String SELECT_BY_HASH = """
SELECT definition FROM route_diagrams WHERE content_hash = ? LIMIT 1
SELECT definition FROM route_diagrams
WHERE tenant_id = ? AND content_hash = ?
LIMIT 1
""";
private static final String SELECT_HASH_FOR_ROUTE = """
SELECT content_hash FROM route_diagrams
WHERE route_id = ? AND agent_id = ?
WHERE tenant_id = ? AND route_id = ? AND instance_id = ?
ORDER BY created_at DESC LIMIT 1
""";
private final JdbcTemplate jdbcTemplate;
private static final String SELECT_DEFINITIONS_FOR_APP = """
SELECT DISTINCT route_id, definition FROM route_diagrams
WHERE tenant_id = ? AND application_id = ?
""";
private final JdbcTemplate jdbc;
private final ObjectMapper objectMapper;
public PostgresDiagramStore(JdbcTemplate jdbcTemplate) {
this.jdbcTemplate = jdbcTemplate;
public ClickHouseDiagramStore(JdbcTemplate jdbc) {
this.jdbc = jdbc;
this.objectMapper = new ObjectMapper();
this.objectMapper.registerModule(new JavaTimeModule());
}
@@ -62,13 +76,20 @@ public class PostgresDiagramStore implements DiagramStore {
public void store(TaggedDiagram diagram) {
try {
RouteGraph graph = diagram.graph();
String agentId = diagram.agentId() != null ? diagram.agentId() : "";
String applicationName = diagram.applicationName() != null ? diagram.applicationName() : "";
String agentId = diagram.instanceId() != null ? diagram.instanceId() : "";
String applicationId = diagram.applicationId() != null ? diagram.applicationId() : "";
String json = objectMapper.writeValueAsString(graph);
String contentHash = sha256Hex(json);
String routeId = graph.getRouteId() != null ? graph.getRouteId() : "";
jdbcTemplate.update(INSERT_SQL, contentHash, routeId, agentId, applicationName, json);
jdbc.update(INSERT_SQL,
TENANT,
contentHash,
routeId,
agentId,
applicationId,
json,
Timestamp.from(Instant.now()));
log.debug("Stored diagram for route={} agent={} with hash={}", routeId, agentId, contentHash);
} catch (JsonProcessingException e) {
throw new RuntimeException("Failed to serialize RouteGraph to JSON", e);
@@ -77,7 +98,7 @@ public class PostgresDiagramStore implements DiagramStore {
@Override
public Optional<RouteGraph> findByContentHash(String contentHash) {
List<Map<String, Object>> rows = jdbcTemplate.queryForList(SELECT_BY_HASH, contentHash);
List<Map<String, Object>> rows = jdbc.queryForList(SELECT_BY_HASH, TENANT, contentHash);
if (rows.isEmpty()) {
return Optional.empty();
}
@@ -85,14 +106,15 @@ public class PostgresDiagramStore implements DiagramStore {
try {
return Optional.of(objectMapper.readValue(json, RouteGraph.class));
} catch (JsonProcessingException e) {
log.error("Failed to deserialize RouteGraph from PostgreSQL", e);
log.error("Failed to deserialize RouteGraph from ClickHouse", e);
return Optional.empty();
}
}
@Override
public Optional<String> findContentHashForRoute(String routeId, String agentId) {
List<Map<String, Object>> rows = jdbcTemplate.queryForList(SELECT_HASH_FOR_ROUTE, routeId, agentId);
List<Map<String, Object>> rows = jdbc.queryForList(
SELECT_HASH_FOR_ROUTE, TENANT, routeId, agentId);
if (rows.isEmpty()) {
return Optional.empty();
}
@@ -106,12 +128,13 @@ public class PostgresDiagramStore implements DiagramStore {
}
String placeholders = String.join(", ", Collections.nCopies(agentIds.size(), "?"));
String sql = "SELECT content_hash FROM route_diagrams " +
"WHERE route_id = ? AND agent_id IN (" + placeholders + ") " +
"WHERE tenant_id = ? AND route_id = ? AND instance_id IN (" + placeholders + ") " +
"ORDER BY created_at DESC LIMIT 1";
var params = new ArrayList<Object>();
params.add(TENANT);
params.add(routeId);
params.addAll(agentIds);
List<Map<String, Object>> rows = jdbcTemplate.queryForList(sql, params.toArray());
List<Map<String, Object>> rows = jdbc.queryForList(sql, params.toArray());
if (rows.isEmpty()) {
return Optional.empty();
}
@@ -119,20 +142,45 @@ public class PostgresDiagramStore implements DiagramStore {
}
@Override
public Map<String, String> findProcessorRouteMapping(String applicationName) {
public Map<String, String> findProcessorRouteMapping(String applicationId) {
Map<String, String> mapping = new HashMap<>();
jdbcTemplate.query("""
SELECT DISTINCT rd.route_id, node_elem->>'id' AS processor_id
FROM route_diagrams rd,
jsonb_array_elements(rd.definition::jsonb->'nodes') AS node_elem
WHERE rd.application_name = ?
AND node_elem->>'id' IS NOT NULL
""",
rs -> { mapping.put(rs.getString("processor_id"), rs.getString("route_id")); },
applicationName);
List<Map<String, Object>> rows = jdbc.queryForList(
SELECT_DEFINITIONS_FOR_APP, TENANT, applicationId);
for (Map<String, Object> row : rows) {
String routeId = (String) row.get("route_id");
String json = (String) row.get("definition");
if (json == null || routeId == null) {
continue;
}
try {
RouteGraph graph = objectMapper.readValue(json, RouteGraph.class);
collectNodeIds(graph.getRoot(), routeId, mapping);
} catch (JsonProcessingException e) {
log.warn("Failed to deserialize RouteGraph for route={} app={}", routeId, applicationId, e);
}
}
return mapping;
}
/**
* Recursively walks the RouteNode tree and maps each node ID to the given routeId.
*/
private void collectNodeIds(RouteNode node, String routeId, Map<String, String> mapping) {
if (node == null) {
return;
}
String id = node.getId();
if (id != null && !id.isEmpty()) {
mapping.put(id, routeId);
}
List<RouteNode> children = node.getChildren();
if (children != null) {
for (RouteNode child : children) {
collectNodeIds(child, routeId, mapping);
}
}
}
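// Example (hypothetical graph): a route "orders" with root "from-1" and
// children "split-1" -> "to-1" yields {from-1=orders, split-1=orders, to-1=orders}.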
static String sha256Hex(String input) {
try {
MessageDigest digest = MessageDigest.getInstance("SHA-256");

View File

@@ -0,0 +1,339 @@
package com.cameleer3.server.app.storage;
import com.cameleer3.server.core.ingestion.MergedExecution;
import com.cameleer3.server.core.storage.ExecutionStore;
import com.cameleer3.common.model.FlatProcessorRecord;
import com.fasterxml.jackson.core.JsonProcessingException;
import com.fasterxml.jackson.databind.ObjectMapper;
import org.springframework.jdbc.core.JdbcTemplate;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Timestamp;
import java.time.Instant;
import java.util.List;
import java.util.Map;
import java.util.Optional;
public class ClickHouseExecutionStore implements ExecutionStore {
private final JdbcTemplate jdbc;
private final ObjectMapper objectMapper;
public ClickHouseExecutionStore(JdbcTemplate jdbc) {
this(jdbc, new ObjectMapper());
}
public ClickHouseExecutionStore(JdbcTemplate jdbc, ObjectMapper objectMapper) {
this.jdbc = jdbc;
this.objectMapper = objectMapper;
}
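/**
 * Bulk-insert merged executions. The {@code _version} column is assumed to be
 * the ReplacingMergeTree version, so re-inserting an execution with a higher
 * version supersedes earlier rows once parts merge (see the {@code FINAL} reads).
 */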
public void insertExecutionBatch(List<MergedExecution> executions) {
if (executions.isEmpty()) return;
jdbc.batchUpdate("""
INSERT INTO executions (
tenant_id, _version, execution_id, route_id, instance_id, application_id,
status, correlation_id, exchange_id, start_time, end_time, duration_ms,
error_message, error_stacktrace, error_type, error_category,
root_cause_type, root_cause_message, diagram_content_hash, engine_level,
input_body, output_body, input_headers, output_headers, attributes,
trace_id, span_id, has_trace_data, is_replay,
original_exchange_id, replay_exchange_id
)
VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
""",
executions.stream().map(e -> new Object[]{
nullToEmpty(e.tenantId()),
e.version(),
nullToEmpty(e.executionId()),
nullToEmpty(e.routeId()),
nullToEmpty(e.instanceId()),
nullToEmpty(e.applicationId()),
nullToEmpty(e.status()),
nullToEmpty(e.correlationId()),
nullToEmpty(e.exchangeId()),
Timestamp.from(e.startTime()),
e.endTime() != null ? Timestamp.from(e.endTime()) : null,
e.durationMs(),
nullToEmpty(e.errorMessage()),
nullToEmpty(e.errorStacktrace()),
nullToEmpty(e.errorType()),
nullToEmpty(e.errorCategory()),
nullToEmpty(e.rootCauseType()),
nullToEmpty(e.rootCauseMessage()),
nullToEmpty(e.diagramContentHash()),
nullToEmpty(e.engineLevel()),
nullToEmpty(e.inputBody()),
nullToEmpty(e.outputBody()),
nullToEmpty(e.inputHeaders()),
nullToEmpty(e.outputHeaders()),
nullToEmpty(e.attributes()),
nullToEmpty(e.traceId()),
nullToEmpty(e.spanId()),
e.hasTraceData(),
e.isReplay(),
nullToEmpty(e.originalExchangeId()),
nullToEmpty(e.replayExchangeId())
}).toList());
}
public void insertProcessorBatch(String tenantId, String executionId, String routeId,
String applicationId, Instant execStartTime,
List<FlatProcessorRecord> processors) {
if (processors.isEmpty()) return;
jdbc.batchUpdate("""
INSERT INTO processor_executions (
tenant_id, execution_id, seq, parent_seq, parent_processor_id,
processor_id, processor_type, start_time, route_id, application_id,
iteration, iteration_size, status, end_time, duration_ms,
error_message, error_stacktrace, error_type, error_category,
root_cause_type, root_cause_message,
input_body, output_body, input_headers, output_headers, attributes,
resolved_endpoint_uri, circuit_breaker_state,
fallback_triggered, filter_matched, duplicate_message
)
VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
""",
processors.stream().map(p -> new Object[]{
nullToEmpty(tenantId),
nullToEmpty(executionId),
p.getSeq(),
p.getParentSeq(),
nullToEmpty(p.getParentProcessorId()),
nullToEmpty(p.getProcessorId()),
nullToEmpty(p.getProcessorType()),
Timestamp.from(p.getStartTime() != null ? p.getStartTime() : execStartTime),
nullToEmpty(routeId),
nullToEmpty(applicationId),
p.getIteration(),
p.getIterationSize(),
p.getStatus() != null ? p.getStatus().name() : "",
computeEndTime(p.getStartTime(), p.getDurationMs()),
p.getDurationMs(),
nullToEmpty(p.getErrorMessage()),
nullToEmpty(p.getErrorStackTrace()),
nullToEmpty(p.getErrorType()),
nullToEmpty(p.getErrorCategory()),
nullToEmpty(p.getRootCauseType()),
nullToEmpty(p.getRootCauseMessage()),
nullToEmpty(p.getInputBody()),
nullToEmpty(p.getOutputBody()),
mapToJson(p.getInputHeaders()),
mapToJson(p.getOutputHeaders()),
mapToJson(p.getAttributes()),
nullToEmpty(p.getResolvedEndpointUri()),
nullToEmpty(p.getCircuitBreakerState()),
boolOrFalse(p.getFallbackTriggered()),
boolOrFalse(p.getFilterMatched()),
boolOrFalse(p.getDuplicateMessage())
}).toList());
}
// --- ExecutionStore interface: read methods ---
@Override
public Optional<ExecutionRecord> findById(String executionId) {
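// FINAL collapses ReplacingMergeTree duplicates at read time, so the
// latest-version row wins even before background merges have run.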
List<ExecutionRecord> results = jdbc.query("""
SELECT execution_id, route_id, instance_id, application_id, status,
correlation_id, exchange_id, start_time, end_time, duration_ms,
error_message, error_stacktrace, diagram_content_hash, engine_level,
input_body, output_body, input_headers, output_headers, attributes,
error_type, error_category, root_cause_type, root_cause_message,
trace_id, span_id, has_trace_data, is_replay
FROM executions FINAL
WHERE tenant_id = 'default' AND execution_id = ?
LIMIT 1
""",
(rs, rowNum) -> mapExecutionRecord(rs),
executionId);
return results.isEmpty() ? Optional.empty() : Optional.of(results.get(0));
}
@Override
public List<ProcessorRecord> findProcessors(String executionId) {
return jdbc.query("""
SELECT execution_id, seq, parent_seq, parent_processor_id,
processor_id, processor_type, start_time, route_id, application_id,
iteration, iteration_size, status, end_time, duration_ms,
error_message, error_stacktrace, error_type, error_category,
root_cause_type, root_cause_message,
input_body, output_body, input_headers, output_headers, attributes,
resolved_endpoint_uri, circuit_breaker_state,
fallback_triggered, filter_matched, duplicate_message
FROM processor_executions
WHERE tenant_id = 'default' AND execution_id = ?
ORDER BY seq
""",
(rs, rowNum) -> mapProcessorRecord(rs),
executionId);
}
@Override
public Optional<ProcessorRecord> findProcessorById(String executionId, String processorId) {
List<ProcessorRecord> results = jdbc.query("""
SELECT execution_id, seq, parent_seq, parent_processor_id,
processor_id, processor_type, start_time, route_id, application_id,
iteration, iteration_size, status, end_time, duration_ms,
error_message, error_stacktrace, error_type, error_category,
root_cause_type, root_cause_message,
input_body, output_body, input_headers, output_headers, attributes,
resolved_endpoint_uri, circuit_breaker_state,
fallback_triggered, filter_matched, duplicate_message
FROM processor_executions
WHERE tenant_id = 'default' AND execution_id = ? AND processor_id = ?
LIMIT 1
""",
(rs, rowNum) -> mapProcessorRecord(rs),
executionId, processorId);
return results.isEmpty() ? Optional.empty() : Optional.of(results.get(0));
}
@Override
public Optional<ProcessorRecord> findProcessorBySeq(String executionId, int seq) {
List<ProcessorRecord> results = jdbc.query("""
SELECT execution_id, seq, parent_seq, parent_processor_id,
processor_id, processor_type, start_time, route_id, application_id,
iteration, iteration_size, status, end_time, duration_ms,
error_message, error_stacktrace, error_type, error_category,
root_cause_type, root_cause_message,
input_body, output_body, input_headers, output_headers, attributes,
resolved_endpoint_uri, circuit_breaker_state,
fallback_triggered, filter_matched, duplicate_message
FROM processor_executions
WHERE tenant_id = 'default' AND execution_id = ? AND seq = ?
LIMIT 1
""",
(rs, rowNum) -> mapProcessorRecord(rs),
executionId, seq);
return results.isEmpty() ? Optional.empty() : Optional.of(results.get(0));
}
// --- ExecutionStore interface: write methods (unsupported, use chunked pipeline) ---
@Override
public void upsert(ExecutionRecord execution) {
throw new UnsupportedOperationException("ClickHouse writes use the chunked pipeline");
}
@Override
public void upsertProcessors(String executionId, Instant startTime,
String applicationId, String routeId,
List<ProcessorRecord> processors) {
throw new UnsupportedOperationException("ClickHouse writes use the chunked pipeline");
}
// --- Row mappers ---
private static ExecutionRecord mapExecutionRecord(ResultSet rs) throws SQLException {
return new ExecutionRecord(
emptyToNull(rs.getString("execution_id")),
emptyToNull(rs.getString("route_id")),
emptyToNull(rs.getString("instance_id")),
emptyToNull(rs.getString("application_id")),
emptyToNull(rs.getString("status")),
emptyToNull(rs.getString("correlation_id")),
emptyToNull(rs.getString("exchange_id")),
toInstant(rs, "start_time"),
toInstant(rs, "end_time"),
rs.getObject("duration_ms") != null ? rs.getLong("duration_ms") : null,
emptyToNull(rs.getString("error_message")),
emptyToNull(rs.getString("error_stacktrace")),
emptyToNull(rs.getString("diagram_content_hash")),
emptyToNull(rs.getString("engine_level")),
emptyToNull(rs.getString("input_body")),
emptyToNull(rs.getString("output_body")),
emptyToNull(rs.getString("input_headers")),
emptyToNull(rs.getString("output_headers")),
emptyToNull(rs.getString("attributes")),
emptyToNull(rs.getString("error_type")),
emptyToNull(rs.getString("error_category")),
emptyToNull(rs.getString("root_cause_type")),
emptyToNull(rs.getString("root_cause_message")),
emptyToNull(rs.getString("trace_id")),
emptyToNull(rs.getString("span_id")),
null, // processorsJson not stored in ClickHouse
rs.getBoolean("has_trace_data"),
rs.getBoolean("is_replay")
);
}
private static ProcessorRecord mapProcessorRecord(ResultSet rs) throws SQLException {
return new ProcessorRecord(
emptyToNull(rs.getString("execution_id")),
emptyToNull(rs.getString("processor_id")),
emptyToNull(rs.getString("processor_type")),
emptyToNull(rs.getString("application_id")),
emptyToNull(rs.getString("route_id")),
0, // depth not stored in ClickHouse
emptyToNull(rs.getString("parent_processor_id")),
emptyToNull(rs.getString("status")),
toInstant(rs, "start_time"),
toInstant(rs, "end_time"),
rs.getObject("duration_ms") != null ? rs.getLong("duration_ms") : null,
emptyToNull(rs.getString("error_message")),
emptyToNull(rs.getString("error_stacktrace")),
emptyToNull(rs.getString("input_body")),
emptyToNull(rs.getString("output_body")),
emptyToNull(rs.getString("input_headers")),
emptyToNull(rs.getString("output_headers")),
emptyToNull(rs.getString("attributes")),
null, // loopIndex
null, // loopSize
null, // splitIndex
null, // splitSize
null, // multicastIndex
emptyToNull(rs.getString("resolved_endpoint_uri")),
emptyToNull(rs.getString("error_type")),
emptyToNull(rs.getString("error_category")),
emptyToNull(rs.getString("root_cause_type")),
emptyToNull(rs.getString("root_cause_message")),
null, // errorHandlerType
emptyToNull(rs.getString("circuit_breaker_state")),
rs.getObject("fallback_triggered") != null ? rs.getBoolean("fallback_triggered") : null,
rs.getObject("seq") != null ? rs.getInt("seq") : null,
rs.getObject("parent_seq") != null ? rs.getInt("parent_seq") : null,
rs.getObject("iteration") != null ? rs.getInt("iteration") : null,
rs.getObject("iteration_size") != null ? rs.getInt("iteration_size") : null,
rs.getObject("filter_matched") != null ? rs.getBoolean("filter_matched") : null,
rs.getObject("duplicate_message") != null ? rs.getBoolean("duplicate_message") : null
);
}
// --- Helpers ---
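// The ClickHouse columns are assumed to be non-Nullable Strings defaulting
// to '', so writes map null -> "" and reads map "" -> null to round-trip
// Java nulls through the empty-string sentinel.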
private static String emptyToNull(String value) {
return (value == null || value.isEmpty()) ? null : value;
}
private static Instant toInstant(ResultSet rs, String column) throws SQLException {
Timestamp ts = rs.getTimestamp(column);
return ts != null ? ts.toInstant() : null;
}
private static String nullToEmpty(String value) {
return value != null ? value : "";
}
private static boolean boolOrFalse(Boolean value) {
return value != null && value;
}
private static Timestamp computeEndTime(Instant startTime, long durationMs) {
if (startTime != null && durationMs > 0) {
return Timestamp.from(startTime.plusMillis(durationMs));
}
return null;
}
private String mapToJson(Map<String, String> map) {
if (map == null || map.isEmpty()) return "";
try {
return objectMapper.writeValueAsString(map);
} catch (JsonProcessingException e) {
return "";
}
}
}

View File

@@ -0,0 +1,66 @@
package com.cameleer3.server.app.storage;
import com.cameleer3.server.core.storage.MetricsQueryStore;
import com.cameleer3.server.core.storage.model.MetricTimeSeries;
import org.springframework.jdbc.core.JdbcTemplate;
import java.time.Instant;
import java.util.*;
public class ClickHouseMetricsQueryStore implements MetricsQueryStore {
private final JdbcTemplate jdbc;
public ClickHouseMetricsQueryStore(JdbcTemplate jdbc) {
this.jdbc = jdbc;
}
@Override
public Map<String, List<MetricTimeSeries.Bucket>> queryTimeSeries(
String instanceId, List<String> metricNames,
Instant from, Instant to, int buckets) {
long intervalSeconds = Math.max(60,
(to.getEpochSecond() - from.getEpochSecond()) / Math.max(buckets, 1));
Map<String, List<MetricTimeSeries.Bucket>> result = new LinkedHashMap<>();
for (String name : metricNames) {
result.put(name.trim(), new ArrayList<>());
}
String[] namesArray = metricNames.stream().map(String::trim).toArray(String[]::new);
// ClickHouse JDBC doesn't support array params with IN (?).
// Build the IN clause with properly escaped values.
StringBuilder inClause = new StringBuilder();
for (int i = 0; i < namesArray.length; i++) {
if (i > 0) inClause.append(", ");
inClause.append("'").append(namesArray[i].replace("'", "\\'")).append("'");
}
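// e.g. for names ["cpu.usage", "mem.used"] this renders as: 'cpu.usage', 'mem.used'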
String finalSql = """
SELECT toStartOfInterval(collected_at, INTERVAL %d SECOND) AS bucket,
metric_name,
avg(metric_value) AS avg_value
FROM agent_metrics
WHERE instance_id = ?
AND collected_at >= ?
AND collected_at < ?
AND metric_name IN (%s)
GROUP BY bucket, metric_name
ORDER BY bucket
""".formatted(intervalSeconds, inClause);
jdbc.query(finalSql, rs -> {
String metricName = rs.getString("metric_name");
Instant bucket = rs.getTimestamp("bucket").toInstant();
double value = rs.getDouble("avg_value");
result.computeIfAbsent(metricName, k -> new ArrayList<>())
.add(new MetricTimeSeries.Bucket(bucket, value));
}, instanceId,
java.sql.Timestamp.from(from),
java.sql.Timestamp.from(to));
return result;
}
}

View File

@@ -0,0 +1,41 @@
package com.cameleer3.server.app.storage;
import com.cameleer3.server.core.storage.MetricsStore;
import com.cameleer3.server.core.storage.model.MetricsSnapshot;
import org.springframework.jdbc.core.JdbcTemplate;
import java.sql.Timestamp;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
public class ClickHouseMetricsStore implements MetricsStore {
private final JdbcTemplate jdbc;
public ClickHouseMetricsStore(JdbcTemplate jdbc) {
this.jdbc = jdbc;
}
@Override
public void insertBatch(List<MetricsSnapshot> snapshots) {
if (snapshots.isEmpty()) return;
jdbc.batchUpdate("""
INSERT INTO agent_metrics (instance_id, metric_name, metric_value, tags, collected_at)
VALUES (?, ?, ?, ?, ?)
""",
snapshots.stream().map(s -> new Object[]{
s.instanceId(),
s.metricName(),
s.metricValue(),
tagsToClickHouseMap(s.tags()),
Timestamp.from(s.collectedAt())
}).toList());
}
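/**
 * The ClickHouse JDBC driver is assumed to bind java.util.Map to
 * Map(String, String) columns; normalize null to an empty map and copy
 * defensively before handing the value to the batch.
 */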
private Map<String, String> tagsToClickHouseMap(Map<String, String> tags) {
if (tags == null || tags.isEmpty()) return new HashMap<>();
return new HashMap<>(tags);
}
}

View File

@@ -0,0 +1,562 @@
package com.cameleer3.server.app.storage;
import com.cameleer3.server.core.search.ExecutionStats;
import com.cameleer3.server.core.search.StatsTimeseries;
import com.cameleer3.server.core.search.StatsTimeseries.TimeseriesBucket;
import com.cameleer3.server.core.search.TopError;
import com.cameleer3.server.core.storage.StatsStore;
import org.springframework.jdbc.core.JdbcTemplate;
import java.sql.Timestamp;
import java.time.Duration;
import java.time.Instant;
import java.time.temporal.ChronoUnit;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
/**
* ClickHouse implementation of {@link StatsStore}.
* Reads from AggregatingMergeTree tables populated by materialized views,
* using {@code -Merge} aggregate combinators to finalize partial states.
*
* <p>Queries against AggregatingMergeTree tables use literal SQL values instead
* of JDBC prepared-statement parameters because the ClickHouse JDBC v2 driver
* (0.9.x) wraps prepared statements in a sub-query that strips the
* {@code AggregateFunction} column type, breaking {@code -Merge} combinators.
* Queries against raw tables ({@code executions FINAL},
* {@code processor_executions}) use normal prepared-statement parameters
* since they have no AggregateFunction columns.</p>
*/
public class ClickHouseStatsStore implements StatsStore {
private static final String TENANT = "default";
private final JdbcTemplate jdbc;
public ClickHouseStatsStore(JdbcTemplate jdbc) {
this.jdbc = jdbc;
}
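// Illustration of the workaround (assumed driver behavior): inlined as
//   SELECT countMerge(total_count) FROM stats_1m_all
//   WHERE bucket >= '2026-04-01 00:00:00'
// the query runs as-is, whereas binding the timestamp with '?' makes the
// 0.9.x driver wrap the statement in a sub-query that loses the
// AggregateFunction column type, so countMerge() fails.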
// ── Stats (aggregate) ────────────────────────────────────────────────
@Override
public ExecutionStats stats(Instant from, Instant to) {
return queryStats("stats_1m_all", from, to, List.of(), true);
}
@Override
public ExecutionStats statsForApp(Instant from, Instant to, String applicationId) {
return queryStats("stats_1m_app", from, to, List.of(
new Filter("application_id", applicationId)), true);
}
@Override
public ExecutionStats statsForRoute(Instant from, Instant to, String routeId, List<String> agentIds) {
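// agentIds is not applied here: the stats_1m_route view is assumed to
// aggregate across all agents serving a route.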
return queryStats("stats_1m_route", from, to, List.of(
new Filter("route_id", routeId)), true);
}
@Override
public ExecutionStats statsForProcessor(Instant from, Instant to, String routeId, String processorType) {
return queryProcessorStatsRaw(from, to, routeId, processorType);
}
// ── Timeseries ───────────────────────────────────────────────────────
@Override
public StatsTimeseries timeseries(Instant from, Instant to, int bucketCount) {
return queryTimeseries("stats_1m_all", from, to, bucketCount, List.of(), true);
}
@Override
public StatsTimeseries timeseriesForApp(Instant from, Instant to, int bucketCount, String applicationId) {
return queryTimeseries("stats_1m_app", from, to, bucketCount, List.of(
new Filter("application_id", applicationId)), true);
}
@Override
public StatsTimeseries timeseriesForRoute(Instant from, Instant to, int bucketCount,
String routeId, List<String> agentIds) {
return queryTimeseries("stats_1m_route", from, to, bucketCount, List.of(
new Filter("route_id", routeId)), true);
}
@Override
public StatsTimeseries timeseriesForProcessor(Instant from, Instant to, int bucketCount,
String routeId, String processorType) {
return queryProcessorTimeseriesRaw(from, to, bucketCount, routeId, processorType);
}
// ── Grouped timeseries ───────────────────────────────────────────────
@Override
public Map<String, StatsTimeseries> timeseriesGroupedByApp(Instant from, Instant to, int bucketCount) {
return queryGroupedTimeseries("stats_1m_app", "application_id", from, to,
bucketCount, List.of());
}
@Override
public Map<String, StatsTimeseries> timeseriesGroupedByRoute(Instant from, Instant to,
int bucketCount, String applicationId) {
return queryGroupedTimeseries("stats_1m_route", "route_id", from, to,
bucketCount, List.of(new Filter("application_id", applicationId)));
}
// ── SLA compliance (raw table — prepared statements OK) ──────────────
@Override
public double slaCompliance(Instant from, Instant to, int thresholdMs,
String applicationId, String routeId) {
String sql = "SELECT " +
"countIf(duration_ms <= ? AND status != 'RUNNING') AS compliant, " +
"countIf(status != 'RUNNING') AS total " +
"FROM executions FINAL " +
"WHERE tenant_id = ? AND start_time >= ? AND start_time < ?";
List<Object> params = new ArrayList<>();
params.add(thresholdMs);
params.add(TENANT);
params.add(Timestamp.from(from));
params.add(Timestamp.from(to));
if (applicationId != null) {
sql += " AND application_id = ?";
params.add(applicationId);
}
if (routeId != null) {
sql += " AND route_id = ?";
params.add(routeId);
}
return jdbc.query(sql, (rs, rowNum) -> {
long total = rs.getLong("total");
// An empty window counts as fully compliant (percent scale, 0-100).
if (total == 0) return 100.0;
return rs.getLong("compliant") * 100.0 / total;
}, params.toArray()).stream().findFirst().orElse(100.0);
}
@Override
public Map<String, long[]> slaCountsByApp(Instant from, Instant to, int defaultThresholdMs) {
String sql = "SELECT application_id, " +
"countIf(duration_ms <= ? AND status != 'RUNNING') AS compliant, " +
"countIf(status != 'RUNNING') AS total " +
"FROM executions FINAL " +
"WHERE tenant_id = ? AND start_time >= ? AND start_time < ? " +
"GROUP BY application_id";
Map<String, long[]> result = new LinkedHashMap<>();
jdbc.query(sql, (rs) -> {
result.put(rs.getString("application_id"),
new long[]{rs.getLong("compliant"), rs.getLong("total")});
}, defaultThresholdMs, TENANT, Timestamp.from(from), Timestamp.from(to));
return result;
}
@Override
public Map<String, long[]> slaCountsByRoute(Instant from, Instant to,
String applicationId, int thresholdMs) {
String sql = "SELECT route_id, " +
"countIf(duration_ms <= ? AND status != 'RUNNING') AS compliant, " +
"countIf(status != 'RUNNING') AS total " +
"FROM executions FINAL " +
"WHERE tenant_id = ? AND start_time >= ? AND start_time < ? " +
"AND application_id = ? GROUP BY route_id";
Map<String, long[]> result = new LinkedHashMap<>();
jdbc.query(sql, (rs) -> {
result.put(rs.getString("route_id"),
new long[]{rs.getLong("compliant"), rs.getLong("total")});
}, thresholdMs, TENANT, Timestamp.from(from), Timestamp.from(to), applicationId);
return result;
}
// ── Top errors (raw table — prepared statements OK) ──────────────────
@Override
public List<TopError> topErrors(Instant from, Instant to, String applicationId,
String routeId, int limit) {
StringBuilder where = new StringBuilder(
"status = 'FAILED' AND start_time >= ? AND start_time < ?");
List<Object> params = new ArrayList<>();
params.add(Timestamp.from(from));
params.add(Timestamp.from(to));
if (applicationId != null) {
where.append(" AND application_id = ?");
params.add(applicationId);
}
String table;
String groupId;
if (routeId != null) {
table = "processor_executions";
groupId = "processor_id";
where.append(" AND route_id = ?");
params.add(routeId);
} else {
table = "executions FINAL";
groupId = "route_id";
}
Instant fiveMinAgo = Instant.now().minus(5, ChronoUnit.MINUTES);
Instant tenMinAgo = Instant.now().minus(10, ChronoUnit.MINUTES);
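// Velocity compares the last 5 minutes to the 5 minutes before that:
// recent_5m / 5.0 is errors per minute, and a ±20% band around prev_5m
// classifies the trend as accelerating / decelerating / stable.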
String sql = "WITH counted AS (" +
" SELECT COALESCE(error_type, substring(error_message, 1, 200)) AS error_key, " +
" " + groupId + " AS group_id, " +
" count() AS cnt, max(start_time) AS last_seen " +
" FROM " + table + " WHERE tenant_id = ? AND " + where +
" GROUP BY error_key, group_id ORDER BY cnt DESC LIMIT ?" +
"), velocity AS (" +
" SELECT COALESCE(error_type, substring(error_message, 1, 200)) AS error_key, " +
" countIf(start_time >= ?) AS recent_5m, " +
" countIf(start_time >= ? AND start_time < ?) AS prev_5m " +
" FROM " + table + " WHERE tenant_id = ? AND " + where +
" GROUP BY error_key" +
") SELECT c.error_key, c.group_id, c.cnt, c.last_seen, " +
" COALESCE(v.recent_5m, 0) / 5.0 AS velocity, " +
" CASE " +
" WHEN COALESCE(v.recent_5m, 0) > COALESCE(v.prev_5m, 0) * 1.2 THEN 'accelerating' " +
" WHEN COALESCE(v.recent_5m, 0) < COALESCE(v.prev_5m, 0) * 0.8 THEN 'decelerating' " +
" ELSE 'stable' END AS trend " +
"FROM counted c LEFT JOIN velocity v ON c.error_key = v.error_key " +
"ORDER BY c.cnt DESC";
List<Object> fullParams = new ArrayList<>();
fullParams.add(TENANT);
fullParams.addAll(params);
fullParams.add(limit);
fullParams.add(Timestamp.from(fiveMinAgo));
fullParams.add(Timestamp.from(tenMinAgo));
fullParams.add(Timestamp.from(fiveMinAgo));
fullParams.add(TENANT);
fullParams.addAll(params);
return jdbc.query(sql, (rs, rowNum) -> {
String errorKey = rs.getString("error_key");
String gid = rs.getString("group_id");
return new TopError(
errorKey,
routeId != null ? routeId : gid,
routeId != null ? gid : null,
rs.getLong("cnt"),
rs.getDouble("velocity"),
rs.getString("trend"),
rs.getTimestamp("last_seen").toInstant());
}, fullParams.toArray());
}
@Override
public int activeErrorTypes(Instant from, Instant to, String applicationId) {
String sql = "SELECT COUNT(DISTINCT COALESCE(error_type, substring(error_message, 1, 200))) " +
"FROM executions FINAL " +
"WHERE tenant_id = ? AND status = 'FAILED' AND start_time >= ? AND start_time < ?";
List<Object> params = new ArrayList<>();
params.add(TENANT);
params.add(Timestamp.from(from));
params.add(Timestamp.from(to));
if (applicationId != null) {
sql += " AND application_id = ?";
params.add(applicationId);
}
Integer count = jdbc.queryForObject(sql, Integer.class, params.toArray());
return count != null ? count : 0;
}
// ── Punchcard (AggregatingMergeTree — literal SQL) ───────────────────
@Override
public List<PunchcardCell> punchcard(Instant from, Instant to, String applicationId) {
String view = applicationId != null ? "stats_1m_app" : "stats_1m_all";
String sql = "SELECT toDayOfWeek(bucket, 1) % 7 AS weekday, " +
"toHour(bucket) AS hour, " +
"countMerge(total_count) AS total_count, " +
"countIfMerge(failed_count) AS failed_count " +
"FROM " + view +
" WHERE tenant_id = " + lit(TENANT) +
" AND bucket >= " + lit(from) +
" AND bucket < " + lit(to);
if (applicationId != null) {
sql += " AND application_id = " + lit(applicationId);
}
sql += " GROUP BY weekday, hour ORDER BY weekday, hour";
return jdbc.query(sql, (rs, rowNum) -> new PunchcardCell(
rs.getInt("weekday"), rs.getInt("hour"),
rs.getLong("total_count"), rs.getLong("failed_count")));
}
// ── Private helpers ──────────────────────────────────────────────────
private record Filter(String column, String value) {}
/**
* Format an Instant as a ClickHouse DateTime literal.
* Renders the value in UTC, truncated to second precision for DateTime
* column compatibility (assumes the ClickHouse server timezone is UTC).
*/
private static String lit(Instant instant) {
return "'" + java.time.format.DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss")
.withZone(java.time.ZoneOffset.UTC)
.format(instant.truncatedTo(ChronoUnit.SECONDS)) + "'";
}
/** Format a string as a SQL literal with single-quote escaping. */
private static String lit(String value) {
return "'" + value.replace("'", "\\'") + "'";
}
/** Convert Instant to java.sql.Timestamp for JDBC binding. */
private static Timestamp ts(Instant instant) {
return Timestamp.from(instant);
}
/**
* Build -Merge combinator SQL for the given view and time range.
*/
private String buildStatsSql(String view, Instant rangeFrom, Instant rangeTo,
List<Filter> filters, boolean hasRunning) {
String runningCol = hasRunning ? "countIfMerge(running_count)" : "0";
String sql = "SELECT " +
"countMerge(total_count) AS total_count, " +
"countIfMerge(failed_count) AS failed_count, " +
"sumMerge(duration_sum) AS duration_sum, " +
"quantileMerge(0.99)(p99_duration) AS p99_duration, " +
runningCol + " AS active_count " +
"FROM " + view +
" WHERE tenant_id = " + lit(TENANT) +
" AND bucket >= " + lit(rangeFrom) +
" AND bucket < " + lit(rangeTo);
for (Filter f : filters) {
sql += " AND " + f.column() + " = " + lit(f.value());
}
return sql;
}
/**
* Query an AggregatingMergeTree stats table using -Merge combinators.
* Uses literal SQL to avoid ClickHouse JDBC driver PreparedStatement issues.
*/
private ExecutionStats queryStats(String view, Instant from, Instant to,
List<Filter> filters, boolean hasRunning) {
String sql = buildStatsSql(view, from, to, filters, hasRunning);
long totalCount = 0, failedCount = 0, avgDuration = 0, p99Duration = 0, activeCount = 0;
var currentResult = jdbc.query(sql, (rs, rowNum) -> {
long tc = rs.getLong("total_count");
long fc = rs.getLong("failed_count");
long ds = rs.getLong("duration_sum"); // Nullable → 0 if null
long p99 = (long) rs.getDouble("p99_duration"); // quantileMerge returns Float64
long ac = rs.getLong("active_count");
return new long[]{tc, fc, ds, p99, ac};
});
if (!currentResult.isEmpty()) {
long[] r = currentResult.get(0);
totalCount = r[0]; failedCount = r[1];
avgDuration = totalCount > 0 ? r[2] / totalCount : 0;
p99Duration = r[3]; activeCount = r[4];
}
// Previous period (shifted back 24h)
Instant prevFrom = from.minus(Duration.ofHours(24));
Instant prevTo = to.minus(Duration.ofHours(24));
String prevSql = buildStatsSql(view, prevFrom, prevTo, filters, hasRunning);
long prevTotal = 0, prevFailed = 0, prevAvg = 0, prevP99 = 0;
var prevResult = jdbc.query(prevSql, (rs, rowNum) -> {
long tc = rs.getLong("total_count");
long fc = rs.getLong("failed_count");
long ds = rs.getLong("duration_sum");
long p99 = (long) rs.getDouble("p99_duration");
return new long[]{tc, fc, ds, p99};
});
if (!prevResult.isEmpty()) {
long[] r = prevResult.get(0);
prevTotal = r[0]; prevFailed = r[1];
prevAvg = prevTotal > 0 ? r[2] / prevTotal : 0;
prevP99 = r[3];
}
// Today total
Instant todayStart = Instant.now().truncatedTo(ChronoUnit.DAYS);
String todaySql = buildStatsSql(view, todayStart, Instant.now(), filters, hasRunning);
long totalToday = 0;
var todayResult = jdbc.query(todaySql, (rs, rowNum) -> rs.getLong("total_count"));
if (!todayResult.isEmpty()) totalToday = todayResult.get(0);
return new ExecutionStats(
totalCount, failedCount, avgDuration, p99Duration, activeCount,
totalToday, prevTotal, prevFailed, prevAvg, prevP99);
}
/**
* Timeseries from AggregatingMergeTree using -Merge combinators.
*/
private StatsTimeseries queryTimeseries(String view, Instant from, Instant to,
int bucketCount, List<Filter> filters,
boolean hasRunningCount) {
long intervalSeconds = Duration.between(from, to).toSeconds() / Math.max(bucketCount, 1);
if (intervalSeconds < 60) intervalSeconds = 60;
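// 60s floor: the stats_1m_* views are pre-aggregated per minute.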
String runningCol = hasRunningCount ? "countIfMerge(running_count)" : "0";
String sql = "SELECT " +
"toStartOfInterval(bucket, INTERVAL " + intervalSeconds + " SECOND) AS period, " +
"countMerge(total_count) AS total_count, " +
"countIfMerge(failed_count) AS failed_count, " +
"sumMerge(duration_sum) AS duration_sum, " +
"quantileMerge(0.99)(p99_duration) AS p99_duration, " +
runningCol + " AS active_count " +
"FROM " + view +
" WHERE tenant_id = " + lit(TENANT) +
" AND bucket >= " + lit(from) +
" AND bucket < " + lit(to);
for (Filter f : filters) {
sql += " AND " + f.column() + " = " + lit(f.value());
}
sql += " GROUP BY period ORDER BY period";
List<TimeseriesBucket> buckets = jdbc.query(sql, (rs, rowNum) -> {
long tc = rs.getLong("total_count");
long ds = rs.getLong("duration_sum");
return new TimeseriesBucket(
rs.getTimestamp("period").toInstant(),
tc, rs.getLong("failed_count"),
tc > 0 ? ds / tc : 0, (long) rs.getDouble("p99_duration"),
rs.getLong("active_count"));
});
return new StatsTimeseries(buckets);
}
/**
* Grouped timeseries from AggregatingMergeTree.
*/
private Map<String, StatsTimeseries> queryGroupedTimeseries(
String view, String groupCol, Instant from, Instant to,
int bucketCount, List<Filter> filters) {
long intervalSeconds = Duration.between(from, to).toSeconds() / Math.max(bucketCount, 1);
if (intervalSeconds < 60) intervalSeconds = 60;
String sql = "SELECT " +
"toStartOfInterval(bucket, INTERVAL " + intervalSeconds + " SECOND) AS period, " +
groupCol + " AS group_key, " +
"countMerge(total_count) AS total_count, " +
"countIfMerge(failed_count) AS failed_count, " +
"sumMerge(duration_sum) AS duration_sum, " +
"quantileMerge(0.99)(p99_duration) AS p99_duration, " +
"countIfMerge(running_count) AS active_count " +
"FROM " + view +
" WHERE tenant_id = " + lit(TENANT) +
" AND bucket >= " + lit(from) +
" AND bucket < " + lit(to);
for (Filter f : filters) {
sql += " AND " + f.column() + " = " + lit(f.value());
}
sql += " GROUP BY period, group_key ORDER BY period, group_key";
Map<String, List<TimeseriesBucket>> grouped = new LinkedHashMap<>();
jdbc.query(sql, (rs) -> {
String key = rs.getString("group_key");
long tc = rs.getLong("total_count");
long ds = rs.getLong("duration_sum");
TimeseriesBucket bucket = new TimeseriesBucket(
rs.getTimestamp("period").toInstant(),
tc, rs.getLong("failed_count"),
tc > 0 ? ds / tc : 0, (long) rs.getDouble("p99_duration"),
rs.getLong("active_count"));
grouped.computeIfAbsent(key, k -> new ArrayList<>()).add(bucket);
});
Map<String, StatsTimeseries> result = new LinkedHashMap<>();
grouped.forEach((key, buckets) -> result.put(key, new StatsTimeseries(buckets)));
return result;
}
/**
* Direct aggregation on processor_executions for processor-level stats.
*/
private ExecutionStats queryProcessorStatsRaw(Instant from, Instant to,
String routeId, String processorType) {
String sql = "SELECT " +
"count() AS total_count, " +
"countIf(status = 'FAILED') AS failed_count, " +
"CASE WHEN count() > 0 THEN sum(duration_ms) / count() ELSE 0 END AS avg_duration, " +
"quantile(0.99)(duration_ms) AS p99_duration, " +
"0 AS active_count " +
"FROM processor_executions " +
"WHERE tenant_id = ? AND start_time >= ? AND start_time < ? " +
"AND route_id = ? AND processor_type = ?";
long totalCount = 0, failedCount = 0, avgDuration = 0, p99Duration = 0;
var currentResult = jdbc.query(sql, (rs, rowNum) -> new long[]{
rs.getLong("total_count"), rs.getLong("failed_count"),
(long) rs.getDouble("avg_duration"), (long) rs.getDouble("p99_duration"),
rs.getLong("active_count")
}, TENANT, ts(from), ts(to), routeId, processorType);
if (!currentResult.isEmpty()) {
long[] r = currentResult.get(0);
totalCount = r[0]; failedCount = r[1]; avgDuration = r[2]; p99Duration = r[3];
}
Instant prevFrom = from.minus(Duration.ofHours(24));
Instant prevTo = to.minus(Duration.ofHours(24));
long prevTotal = 0, prevFailed = 0, prevAvg = 0, prevP99 = 0;
var prevResult = jdbc.query(sql, (rs, rowNum) -> new long[]{
rs.getLong("total_count"), rs.getLong("failed_count"),
(long) rs.getDouble("avg_duration"), (long) rs.getDouble("p99_duration")
}, TENANT, ts(prevFrom), ts(prevTo), routeId, processorType);
if (!prevResult.isEmpty()) {
long[] r = prevResult.get(0);
prevTotal = r[0]; prevFailed = r[1]; prevAvg = r[2]; prevP99 = r[3];
}
Instant todayStart = Instant.now().truncatedTo(ChronoUnit.DAYS);
long totalToday = 0;
var todayResult = jdbc.query(sql, (rs, rowNum) -> rs.getLong("total_count"),
TENANT, ts(todayStart), ts(Instant.now()), routeId, processorType);
if (!todayResult.isEmpty()) totalToday = todayResult.get(0);
return new ExecutionStats(
totalCount, failedCount, avgDuration, p99Duration, 0,
totalToday, prevTotal, prevFailed, prevAvg, prevP99);
}
/**
* Direct aggregation on processor_executions for processor-level timeseries.
*/
private StatsTimeseries queryProcessorTimeseriesRaw(Instant from, Instant to,
int bucketCount,
String routeId, String processorType) {
long intervalSeconds = Duration.between(from, to).toSeconds() / Math.max(bucketCount, 1);
if (intervalSeconds < 60) intervalSeconds = 60;
String sql = "SELECT " +
"toStartOfInterval(start_time, INTERVAL " + intervalSeconds + " SECOND) AS period, " +
"count() AS total_count, " +
"countIf(status = 'FAILED') AS failed_count, " +
"CASE WHEN count() > 0 THEN sum(duration_ms) / count() ELSE 0 END AS avg_duration, " +
"quantile(0.99)(duration_ms) AS p99_duration, " +
"0 AS active_count " +
"FROM processor_executions " +
"WHERE tenant_id = ? AND start_time >= ? AND start_time < ? " +
"AND route_id = ? AND processor_type = ? " +
"GROUP BY period ORDER BY period";
List<TimeseriesBucket> buckets = jdbc.query(sql, (rs, rowNum) ->
new TimeseriesBucket(
rs.getTimestamp("period").toInstant(),
rs.getLong("total_count"), rs.getLong("failed_count"),
(long) rs.getDouble("avg_duration"), (long) rs.getDouble("p99_duration"),
rs.getLong("active_count")
), TENANT, ts(from), ts(to), routeId, processorType);
return new StatsTimeseries(buckets);
}
}

View File

@@ -0,0 +1,108 @@
package com.cameleer3.server.app.storage;
import com.cameleer3.server.core.analytics.UsageEvent;
import com.cameleer3.server.core.analytics.UsageStats;
import com.cameleer3.server.core.analytics.UsageTracker;
import com.cameleer3.server.core.ingestion.WriteBuffer;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.jdbc.core.JdbcTemplate;
import java.sql.Timestamp;
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.time.temporal.ChronoUnit;
import java.util.ArrayList;
import java.util.List;
public class ClickHouseUsageTracker implements UsageTracker {
private static final Logger log = LoggerFactory.getLogger(ClickHouseUsageTracker.class);
private static final DateTimeFormatter CH_FMT = DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss")
.withZone(ZoneOffset.UTC);
private final JdbcTemplate jdbc;
private final WriteBuffer<UsageEvent> buffer;
public ClickHouseUsageTracker(JdbcTemplate jdbc, WriteBuffer<UsageEvent> buffer) {
this.jdbc = jdbc;
this.buffer = buffer;
}
@Override
public void track(UsageEvent event) {
buffer.offerOrWarn(event);
}
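/** Drain up to 200 buffered events and bulk-insert them; presumably invoked periodically by a scheduler (not shown here). */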
public void flush() {
List<UsageEvent> batch = buffer.drain(200);
if (batch.isEmpty()) return;
jdbc.batchUpdate("""
INSERT INTO usage_events (timestamp, username, method, path, normalized,
status_code, duration_ms, query_params)
VALUES (?, ?, ?, ?, ?, ?, ?, ?)
""",
batch.stream().map(e -> new Object[]{
Timestamp.from(e.timestamp()),
e.username(),
e.method(),
e.path(),
e.normalized(),
e.statusCode(),
e.durationMs(),
e.queryParams() != null ? e.queryParams() : ""
}).toList());
log.debug("Flushed {} usage events to ClickHouse", batch.size());
}
public List<UsageStats> queryByEndpoint(Instant from, Instant to, String username) {
StringBuilder sql = new StringBuilder("""
SELECT concat(method, ' ', normalized) AS key,
count() AS cnt,
avg(duration_ms) AS avg_dur
FROM usage_events
WHERE timestamp >= '%s' AND timestamp < '%s'
""".formatted(CH_FMT.format(from), CH_FMT.format(to)));
if (username != null && !username.isBlank()) {
sql.append(" AND username = '").append(username.replace("'", "\\'")).append("'");
}
sql.append(" GROUP BY key ORDER BY cnt DESC LIMIT 100");
return jdbc.query(sql.toString(), (rs, i) -> new UsageStats(
rs.getString("key"), rs.getLong("cnt"), rs.getLong("avg_dur")));
}
public List<UsageStats> queryByUser(Instant from, Instant to) {
String sql = """
SELECT username AS key, count() AS cnt, avg(duration_ms) AS avg_dur
FROM usage_events
WHERE timestamp >= '%s' AND timestamp < '%s'
GROUP BY key ORDER BY cnt DESC LIMIT 100
""".formatted(CH_FMT.format(from), CH_FMT.format(to));
return jdbc.query(sql, (rs, i) -> new UsageStats(
rs.getString("key"), rs.getLong("cnt"), rs.getLong("avg_dur")));
}
public List<UsageStats> queryByHour(Instant from, Instant to, String username) {
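// The '%' in formatDateTime patterns is doubled ("%%Y") because this text
// block goes through String.formatted(), which consumes one escape level.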
StringBuilder sql = new StringBuilder("""
SELECT formatDateTime(toStartOfHour(timestamp), '%%Y-%%m-%%d %%H:00') AS key,
count() AS cnt,
avg(duration_ms) AS avg_dur
FROM usage_events
WHERE timestamp >= '%s' AND timestamp < '%s'
""".formatted(CH_FMT.format(from), CH_FMT.format(to)));
if (username != null && !username.isBlank()) {
sql.append(" AND username = '").append(username.replace("'", "\\'")).append("'");
}
sql.append(" GROUP BY key ORDER BY key LIMIT 720");
return jdbc.query(sql.toString(), (rs, i) -> new UsageStats(
rs.getString("key"), rs.getLong("cnt"), rs.getLong("avg_dur")));
}
}

View File

@@ -1,62 +0,0 @@
package com.cameleer3.server.app.storage;
import com.cameleer3.server.core.agent.AgentEventRecord;
import com.cameleer3.server.core.agent.AgentEventRepository;
import org.springframework.jdbc.core.JdbcTemplate;
import org.springframework.stereotype.Repository;
import java.sql.Timestamp;
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;
@Repository
public class PostgresAgentEventRepository implements AgentEventRepository {
private final JdbcTemplate jdbc;
public PostgresAgentEventRepository(JdbcTemplate jdbc) {
this.jdbc = jdbc;
}
@Override
public void insert(String agentId, String appId, String eventType, String detail) {
jdbc.update(
"INSERT INTO agent_events (agent_id, app_id, event_type, detail) VALUES (?, ?, ?, ?)",
agentId, appId, eventType, detail);
}
@Override
public List<AgentEventRecord> query(String appId, String agentId, Instant from, Instant to, int limit) {
var sql = new StringBuilder("SELECT id, agent_id, app_id, event_type, detail, timestamp FROM agent_events WHERE 1=1");
var params = new ArrayList<Object>();
if (appId != null) {
sql.append(" AND app_id = ?");
params.add(appId);
}
if (agentId != null) {
sql.append(" AND agent_id = ?");
params.add(agentId);
}
if (from != null) {
sql.append(" AND timestamp >= ?");
params.add(Timestamp.from(from));
}
if (to != null) {
sql.append(" AND timestamp < ?");
params.add(Timestamp.from(to));
}
sql.append(" ORDER BY timestamp DESC LIMIT ?");
params.add(limit);
return jdbc.query(sql.toString(), (rs, rowNum) -> new AgentEventRecord(
rs.getLong("id"),
rs.getString("agent_id"),
rs.getString("app_id"),
rs.getString("event_type"),
rs.getString("detail"),
rs.getTimestamp("timestamp").toInstant()
), params.toArray());
}
}

View File

@@ -15,7 +15,7 @@ public class PostgresAppSettingsRepository implements AppSettingsRepository {
 private final JdbcTemplate jdbc;
 private static final RowMapper<AppSettings> ROW_MAPPER = (rs, rowNum) -> new AppSettings(
-rs.getString("app_id"),
+rs.getString("application_id"),
 rs.getInt("sla_threshold_ms"),
 rs.getDouble("health_error_warn"),
 rs.getDouble("health_error_crit"),
@@ -29,24 +29,24 @@ public class PostgresAppSettingsRepository implements AppSettingsRepository {
 }
 @Override
-public Optional<AppSettings> findByAppId(String appId) {
+public Optional<AppSettings> findByApplicationId(String applicationId) {
 List<AppSettings> results = jdbc.query(
-"SELECT * FROM app_settings WHERE app_id = ?", ROW_MAPPER, appId);
+"SELECT * FROM app_settings WHERE application_id = ?", ROW_MAPPER, applicationId);
 return results.isEmpty() ? Optional.empty() : Optional.of(results.get(0));
 }
 @Override
 public List<AppSettings> findAll() {
-return jdbc.query("SELECT * FROM app_settings ORDER BY app_id", ROW_MAPPER);
+return jdbc.query("SELECT * FROM app_settings ORDER BY application_id", ROW_MAPPER);
 }
 @Override
 public AppSettings save(AppSettings settings) {
 jdbc.update("""
-INSERT INTO app_settings (app_id, sla_threshold_ms, health_error_warn,
+INSERT INTO app_settings (application_id, sla_threshold_ms, health_error_warn,
 health_error_crit, health_sla_warn, health_sla_crit, created_at, updated_at)
 VALUES (?, ?, ?, ?, ?, ?, now(), now())
-ON CONFLICT (app_id) DO UPDATE SET
+ON CONFLICT (application_id) DO UPDATE SET
 sla_threshold_ms = EXCLUDED.sla_threshold_ms,
 health_error_warn = EXCLUDED.health_error_warn,
 health_error_crit = EXCLUDED.health_error_crit,
@@ -54,14 +54,14 @@ public class PostgresAppSettingsRepository implements AppSettingsRepository {
 health_sla_crit = EXCLUDED.health_sla_crit,
 updated_at = now()
 """,
-settings.appId(), settings.slaThresholdMs(),
+settings.applicationId(), settings.slaThresholdMs(),
 settings.healthErrorWarn(), settings.healthErrorCrit(),
 settings.healthSlaWarn(), settings.healthSlaCrit());
-return findByAppId(settings.appId()).orElseThrow();
+return findByApplicationId(settings.applicationId()).orElseThrow();
 }
 @Override
 public void delete(String appId) {
-jdbc.update("DELETE FROM app_settings WHERE app_id = ?", appId);
+jdbc.update("DELETE FROM app_settings WHERE application_id = ?", appId);
 }
 }

View File

@@ -1,215 +0,0 @@
package com.cameleer3.server.app.storage;
import com.cameleer3.server.core.storage.ExecutionStore;
import org.springframework.jdbc.core.JdbcTemplate;
import org.springframework.jdbc.core.RowMapper;
import org.springframework.stereotype.Repository;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Timestamp;
import java.time.Instant;
import java.util.List;
import java.util.Optional;
@Repository
public class PostgresExecutionStore implements ExecutionStore {
private final JdbcTemplate jdbc;
public PostgresExecutionStore(JdbcTemplate jdbc) {
this.jdbc = jdbc;
}
@Override
public void upsert(ExecutionRecord execution) {
jdbc.update("""
INSERT INTO executions (execution_id, route_id, agent_id, application_name,
status, correlation_id, exchange_id, start_time, end_time,
duration_ms, error_message, error_stacktrace, diagram_content_hash,
engine_level, input_body, output_body, input_headers, output_headers,
attributes,
error_type, error_category, root_cause_type, root_cause_message,
trace_id, span_id,
processors_json, has_trace_data, is_replay,
created_at, updated_at)
VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?::jsonb, ?::jsonb, ?::jsonb,
?, ?, ?, ?, ?, ?, ?::jsonb, ?, ?, now(), now())
ON CONFLICT (execution_id, start_time) DO UPDATE SET
status = CASE
WHEN EXCLUDED.status IN ('COMPLETED', 'FAILED')
AND executions.status = 'RUNNING'
THEN EXCLUDED.status
WHEN EXCLUDED.status = executions.status THEN executions.status
ELSE EXCLUDED.status
END,
end_time = COALESCE(EXCLUDED.end_time, executions.end_time),
duration_ms = COALESCE(EXCLUDED.duration_ms, executions.duration_ms),
error_message = COALESCE(EXCLUDED.error_message, executions.error_message),
error_stacktrace = COALESCE(EXCLUDED.error_stacktrace, executions.error_stacktrace),
diagram_content_hash = COALESCE(EXCLUDED.diagram_content_hash, executions.diagram_content_hash),
engine_level = COALESCE(EXCLUDED.engine_level, executions.engine_level),
input_body = COALESCE(EXCLUDED.input_body, executions.input_body),
output_body = COALESCE(EXCLUDED.output_body, executions.output_body),
input_headers = COALESCE(EXCLUDED.input_headers, executions.input_headers),
output_headers = COALESCE(EXCLUDED.output_headers, executions.output_headers),
attributes = COALESCE(EXCLUDED.attributes, executions.attributes),
error_type = COALESCE(EXCLUDED.error_type, executions.error_type),
error_category = COALESCE(EXCLUDED.error_category, executions.error_category),
root_cause_type = COALESCE(EXCLUDED.root_cause_type, executions.root_cause_type),
root_cause_message = COALESCE(EXCLUDED.root_cause_message, executions.root_cause_message),
trace_id = COALESCE(EXCLUDED.trace_id, executions.trace_id),
span_id = COALESCE(EXCLUDED.span_id, executions.span_id),
processors_json = COALESCE(EXCLUDED.processors_json, executions.processors_json),
has_trace_data = EXCLUDED.has_trace_data OR executions.has_trace_data,
is_replay = EXCLUDED.is_replay OR executions.is_replay,
updated_at = now()
""",
execution.executionId(), execution.routeId(), execution.agentId(),
execution.applicationName(), execution.status(), execution.correlationId(),
execution.exchangeId(),
Timestamp.from(execution.startTime()),
execution.endTime() != null ? Timestamp.from(execution.endTime()) : null,
execution.durationMs(), execution.errorMessage(),
execution.errorStacktrace(), execution.diagramContentHash(),
execution.engineLevel(),
execution.inputBody(), execution.outputBody(),
execution.inputHeaders(), execution.outputHeaders(),
execution.attributes(),
execution.errorType(), execution.errorCategory(),
execution.rootCauseType(), execution.rootCauseMessage(),
execution.traceId(), execution.spanId(),
execution.processorsJson(), execution.hasTraceData(), execution.isReplay());
}
@Override
public void upsertProcessors(String executionId, Instant startTime,
String applicationName, String routeId,
List<ProcessorRecord> processors) {
jdbc.batchUpdate("""
INSERT INTO processor_executions (execution_id, processor_id, processor_type,
application_name, route_id, depth, parent_processor_id,
status, start_time, end_time, duration_ms, error_message, error_stacktrace,
input_body, output_body, input_headers, output_headers, attributes,
loop_index, loop_size, split_index, split_size, multicast_index,
resolved_endpoint_uri,
error_type, error_category, root_cause_type, root_cause_message,
error_handler_type, circuit_breaker_state, fallback_triggered)
VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?::jsonb, ?::jsonb, ?::jsonb,
?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
ON CONFLICT (execution_id, processor_id, start_time) DO UPDATE SET
status = EXCLUDED.status,
end_time = COALESCE(EXCLUDED.end_time, processor_executions.end_time),
duration_ms = COALESCE(EXCLUDED.duration_ms, processor_executions.duration_ms),
error_message = COALESCE(EXCLUDED.error_message, processor_executions.error_message),
error_stacktrace = COALESCE(EXCLUDED.error_stacktrace, processor_executions.error_stacktrace),
input_body = COALESCE(EXCLUDED.input_body, processor_executions.input_body),
output_body = COALESCE(EXCLUDED.output_body, processor_executions.output_body),
input_headers = COALESCE(EXCLUDED.input_headers, processor_executions.input_headers),
output_headers = COALESCE(EXCLUDED.output_headers, processor_executions.output_headers),
attributes = COALESCE(EXCLUDED.attributes, processor_executions.attributes),
loop_index = COALESCE(EXCLUDED.loop_index, processor_executions.loop_index),
loop_size = COALESCE(EXCLUDED.loop_size, processor_executions.loop_size),
split_index = COALESCE(EXCLUDED.split_index, processor_executions.split_index),
split_size = COALESCE(EXCLUDED.split_size, processor_executions.split_size),
multicast_index = COALESCE(EXCLUDED.multicast_index, processor_executions.multicast_index),
resolved_endpoint_uri = COALESCE(EXCLUDED.resolved_endpoint_uri, processor_executions.resolved_endpoint_uri),
error_type = COALESCE(EXCLUDED.error_type, processor_executions.error_type),
error_category = COALESCE(EXCLUDED.error_category, processor_executions.error_category),
root_cause_type = COALESCE(EXCLUDED.root_cause_type, processor_executions.root_cause_type),
root_cause_message = COALESCE(EXCLUDED.root_cause_message, processor_executions.root_cause_message),
error_handler_type = COALESCE(EXCLUDED.error_handler_type, processor_executions.error_handler_type),
circuit_breaker_state = COALESCE(EXCLUDED.circuit_breaker_state, processor_executions.circuit_breaker_state),
fallback_triggered = COALESCE(EXCLUDED.fallback_triggered, processor_executions.fallback_triggered)
""",
processors.stream().map(p -> new Object[]{
p.executionId(), p.processorId(), p.processorType(),
p.applicationName(), p.routeId(),
p.depth(), p.parentProcessorId(), p.status(),
Timestamp.from(p.startTime()),
p.endTime() != null ? Timestamp.from(p.endTime()) : null,
p.durationMs(), p.errorMessage(), p.errorStacktrace(),
p.inputBody(), p.outputBody(), p.inputHeaders(), p.outputHeaders(),
p.attributes(),
p.loopIndex(), p.loopSize(), p.splitIndex(), p.splitSize(),
p.multicastIndex(),
p.resolvedEndpointUri(),
p.errorType(), p.errorCategory(),
p.rootCauseType(), p.rootCauseMessage(),
p.errorHandlerType(), p.circuitBreakerState(),
p.fallbackTriggered()
}).toList());
}
@Override
public Optional<ExecutionRecord> findById(String executionId) {
List<ExecutionRecord> results = jdbc.query(
"SELECT * FROM executions WHERE execution_id = ? ORDER BY start_time DESC LIMIT 1",
EXECUTION_MAPPER, executionId);
return results.isEmpty() ? Optional.empty() : Optional.of(results.get(0));
}
@Override
public List<ProcessorRecord> findProcessors(String executionId) {
return jdbc.query(
"SELECT * FROM processor_executions WHERE execution_id = ? ORDER BY depth, start_time",
PROCESSOR_MAPPER, executionId);
}
@Override
public Optional<ProcessorRecord> findProcessorById(String executionId, String processorId) {
String sql = "SELECT * FROM processor_executions WHERE execution_id = ? AND processor_id = ? LIMIT 1";
List<ProcessorRecord> results = jdbc.query(sql, PROCESSOR_MAPPER, executionId, processorId);
return results.isEmpty() ? Optional.empty() : Optional.of(results.get(0));
}
private static final RowMapper<ExecutionRecord> EXECUTION_MAPPER = (rs, rowNum) ->
new ExecutionRecord(
rs.getString("execution_id"), rs.getString("route_id"),
rs.getString("agent_id"), rs.getString("application_name"),
rs.getString("status"), rs.getString("correlation_id"),
rs.getString("exchange_id"),
toInstant(rs, "start_time"), toInstant(rs, "end_time"),
rs.getObject("duration_ms") != null ? rs.getLong("duration_ms") : null,
rs.getString("error_message"), rs.getString("error_stacktrace"),
rs.getString("diagram_content_hash"),
rs.getString("engine_level"),
rs.getString("input_body"), rs.getString("output_body"),
rs.getString("input_headers"), rs.getString("output_headers"),
rs.getString("attributes"),
rs.getString("error_type"), rs.getString("error_category"),
rs.getString("root_cause_type"), rs.getString("root_cause_message"),
rs.getString("trace_id"), rs.getString("span_id"),
rs.getString("processors_json"),
rs.getBoolean("has_trace_data"),
rs.getBoolean("is_replay"));
private static final RowMapper<ProcessorRecord> PROCESSOR_MAPPER = (rs, rowNum) ->
new ProcessorRecord(
rs.getString("execution_id"), rs.getString("processor_id"),
rs.getString("processor_type"),
rs.getString("application_name"), rs.getString("route_id"),
rs.getInt("depth"), rs.getString("parent_processor_id"),
rs.getString("status"),
toInstant(rs, "start_time"), toInstant(rs, "end_time"),
rs.getObject("duration_ms") != null ? rs.getLong("duration_ms") : null,
rs.getString("error_message"), rs.getString("error_stacktrace"),
rs.getString("input_body"), rs.getString("output_body"),
rs.getString("input_headers"), rs.getString("output_headers"),
rs.getString("attributes"),
rs.getObject("loop_index") != null ? rs.getInt("loop_index") : null,
rs.getObject("loop_size") != null ? rs.getInt("loop_size") : null,
rs.getObject("split_index") != null ? rs.getInt("split_index") : null,
rs.getObject("split_size") != null ? rs.getInt("split_size") : null,
rs.getObject("multicast_index") != null ? rs.getInt("multicast_index") : null,
rs.getString("resolved_endpoint_uri"),
rs.getString("error_type"), rs.getString("error_category"),
rs.getString("root_cause_type"), rs.getString("root_cause_message"),
rs.getString("error_handler_type"), rs.getString("circuit_breaker_state"),
rs.getObject("fallback_triggered") != null ? rs.getBoolean("fallback_triggered") : null);
private static Instant toInstant(ResultSet rs, String column) throws SQLException {
Timestamp ts = rs.getTimestamp(column);
return ts != null ? ts.toInstant() : null;
}
}
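Editor's note: the ON CONFLICT ... COALESCE(EXCLUDED.x, executions.x) pattern above is what made this upsert safe for out-of-order delivery: a partial or late event can fill in missing fields but can never null out data an earlier event already wrote. A minimal illustration of the merge semantics, using a hypothetical table t (not part of this schema):

-- Hypothetical table demonstrating the COALESCE merge-upsert pattern.
CREATE TABLE t (id INT PRIMARY KEY, a TEXT, b TEXT);
INSERT INTO t VALUES (1, 'first', NULL);
-- A second, partial event for the same key: 'a' is NULL, 'b' is set.
INSERT INTO t VALUES (1, NULL, 'second')
ON CONFLICT (id) DO UPDATE SET
a = COALESCE(EXCLUDED.a, t.a), -- NULL in the new row, so 'first' is kept
b = COALESCE(EXCLUDED.b, t.b); -- set in the new row, so 'second' wins
-- t now holds (1, 'first', 'second'): fields merge, nothing is erased.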

View File

@@ -1,42 +0,0 @@
package com.cameleer3.server.app.storage;
import com.cameleer3.server.core.storage.MetricsStore;
import com.cameleer3.server.core.storage.model.MetricsSnapshot;
import com.fasterxml.jackson.core.JsonProcessingException;
import com.fasterxml.jackson.databind.ObjectMapper;
import org.springframework.jdbc.core.JdbcTemplate;
import org.springframework.stereotype.Repository;
import java.sql.Timestamp;
import java.util.List;
@Repository
public class PostgresMetricsStore implements MetricsStore {
private static final ObjectMapper MAPPER = new ObjectMapper();
private final JdbcTemplate jdbc;
public PostgresMetricsStore(JdbcTemplate jdbc) {
this.jdbc = jdbc;
}
@Override
public void insertBatch(List<MetricsSnapshot> snapshots) {
jdbc.batchUpdate("""
INSERT INTO agent_metrics (agent_id, metric_name, metric_value, tags,
collected_at, server_received_at)
VALUES (?, ?, ?, ?::jsonb, ?, now())
""",
snapshots.stream().map(s -> new Object[]{
s.agentId(), s.metricName(), s.metricValue(),
tagsToJson(s.tags()),
Timestamp.from(s.collectedAt())
}).toList());
}
private String tagsToJson(java.util.Map<String, String> tags) {
if (tags == null || tags.isEmpty()) return null;
try { return MAPPER.writeValueAsString(tags); }
catch (JsonProcessingException e) { return null; }
}
}

View File

@@ -1,428 +0,0 @@
package com.cameleer3.server.app.storage;
import com.cameleer3.server.core.search.ExecutionStats;
import com.cameleer3.server.core.search.StatsTimeseries;
import com.cameleer3.server.core.search.StatsTimeseries.TimeseriesBucket;
import com.cameleer3.server.core.search.TopError;
import com.cameleer3.server.core.storage.StatsStore;
import org.springframework.jdbc.core.JdbcTemplate;
import org.springframework.stereotype.Repository;
import java.sql.Timestamp;
import java.time.Duration;
import java.time.Instant;
import java.time.temporal.ChronoUnit;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
@Repository
public class PostgresStatsStore implements StatsStore {
private final JdbcTemplate jdbc;
public PostgresStatsStore(JdbcTemplate jdbc) {
this.jdbc = jdbc;
}
@Override
public ExecutionStats stats(Instant from, Instant to) {
return queryStats("stats_1m_all", from, to, List.of());
}
@Override
public ExecutionStats statsForApp(Instant from, Instant to, String applicationName) {
return queryStats("stats_1m_app", from, to, List.of(
new Filter("application_name", applicationName)));
}
@Override
public ExecutionStats statsForRoute(Instant from, Instant to, String routeId, List<String> agentIds) {
// Note: agentIds is accepted for interface compatibility but not filterable
// on the continuous aggregate (it groups by route_id, not agent_id).
// All agents for the same route contribute to the same aggregate.
return queryStats("stats_1m_route", from, to, List.of(
new Filter("route_id", routeId)));
}
@Override
public ExecutionStats statsForProcessor(Instant from, Instant to, String routeId, String processorType) {
return queryStats("stats_1m_processor", from, to, List.of(
new Filter("route_id", routeId),
new Filter("processor_type", processorType)));
}
@Override
public StatsTimeseries timeseries(Instant from, Instant to, int bucketCount) {
return queryTimeseries("stats_1m_all", from, to, bucketCount, List.of(), true);
}
@Override
public StatsTimeseries timeseriesForApp(Instant from, Instant to, int bucketCount, String applicationName) {
return queryTimeseries("stats_1m_app", from, to, bucketCount, List.of(
new Filter("application_name", applicationName)), true);
}
@Override
public StatsTimeseries timeseriesForRoute(Instant from, Instant to, int bucketCount,
String routeId, List<String> agentIds) {
return queryTimeseries("stats_1m_route", from, to, bucketCount, List.of(
new Filter("route_id", routeId)), true);
}
@Override
public StatsTimeseries timeseriesForProcessor(Instant from, Instant to, int bucketCount,
String routeId, String processorType) {
// stats_1m_processor does NOT have running_count column
return queryTimeseries("stats_1m_processor", from, to, bucketCount, List.of(
new Filter("route_id", routeId),
new Filter("processor_type", processorType)), false);
}
private record Filter(String column, String value) {}
private ExecutionStats queryStats(String view, Instant from, Instant to, List<Filter> filters) {
// running_count only exists on execution-level aggregates, not processor
boolean hasRunning = !view.equals("stats_1m_processor");
String runningCol = hasRunning ? "COALESCE(SUM(running_count), 0)" : "0";
String sql = "SELECT COALESCE(SUM(total_count), 0) AS total_count, " +
"COALESCE(SUM(failed_count), 0) AS failed_count, " +
"CASE WHEN SUM(total_count) > 0 THEN SUM(duration_sum) / SUM(total_count) ELSE 0 END AS avg_duration, " +
"COALESCE(MAX(p99_duration), 0) AS p99_duration, " +
runningCol + " AS active_count " +
"FROM " + view + " WHERE bucket >= ? AND bucket < ?";
List<Object> params = new ArrayList<>();
params.add(Timestamp.from(from));
params.add(Timestamp.from(to));
for (Filter f : filters) {
sql += " AND " + f.column() + " = ?";
params.add(f.value());
}
long totalCount = 0, failedCount = 0, avgDuration = 0, p99Duration = 0, activeCount = 0;
var currentResult = jdbc.query(sql, (rs, rowNum) -> new long[]{
rs.getLong("total_count"), rs.getLong("failed_count"),
rs.getLong("avg_duration"), rs.getLong("p99_duration"),
rs.getLong("active_count")
}, params.toArray());
if (!currentResult.isEmpty()) {
long[] r = currentResult.get(0);
totalCount = r[0]; failedCount = r[1]; avgDuration = r[2];
p99Duration = r[3]; activeCount = r[4];
}
// Previous period (shifted back 24h)
Instant prevFrom = from.minus(Duration.ofHours(24));
Instant prevTo = to.minus(Duration.ofHours(24));
List<Object> prevParams = new ArrayList<>();
prevParams.add(Timestamp.from(prevFrom));
prevParams.add(Timestamp.from(prevTo));
for (Filter f : filters) prevParams.add(f.value());
String prevSql = sql; // same shape, different time params
long prevTotal = 0, prevFailed = 0, prevAvg = 0, prevP99 = 0;
var prevResult = jdbc.query(prevSql, (rs, rowNum) -> new long[]{
rs.getLong("total_count"), rs.getLong("failed_count"),
rs.getLong("avg_duration"), rs.getLong("p99_duration")
}, prevParams.toArray());
if (!prevResult.isEmpty()) {
long[] r = prevResult.get(0);
prevTotal = r[0]; prevFailed = r[1]; prevAvg = r[2]; prevP99 = r[3];
}
// Today total (from midnight UTC)
Instant todayStart = Instant.now().truncatedTo(ChronoUnit.DAYS);
List<Object> todayParams = new ArrayList<>();
todayParams.add(Timestamp.from(todayStart));
todayParams.add(Timestamp.from(Instant.now()));
for (Filter f : filters) todayParams.add(f.value());
String todaySql = sql;
long totalToday = 0;
var todayResult = jdbc.query(todaySql, (rs, rowNum) -> rs.getLong("total_count"),
todayParams.toArray());
if (!todayResult.isEmpty()) totalToday = todayResult.get(0);
return new ExecutionStats(
totalCount, failedCount, avgDuration, p99Duration, activeCount,
totalToday, prevTotal, prevFailed, prevAvg, prevP99);
}
private StatsTimeseries queryTimeseries(String view, Instant from, Instant to,
int bucketCount, List<Filter> filters,
boolean hasRunningCount) {
long intervalSeconds = Duration.between(from, to).toSeconds() / Math.max(bucketCount, 1);
if (intervalSeconds < 60) intervalSeconds = 60;
String runningCol = hasRunningCount ? "COALESCE(SUM(running_count), 0)" : "0";
String sql = "SELECT time_bucket(? * INTERVAL '1 second', bucket) AS period, " +
"COALESCE(SUM(total_count), 0) AS total_count, " +
"COALESCE(SUM(failed_count), 0) AS failed_count, " +
"CASE WHEN SUM(total_count) > 0 THEN SUM(duration_sum) / SUM(total_count) ELSE 0 END AS avg_duration, " +
"COALESCE(MAX(p99_duration), 0) AS p99_duration, " +
runningCol + " AS active_count " +
"FROM " + view + " WHERE bucket >= ? AND bucket < ?";
List<Object> params = new ArrayList<>();
params.add(intervalSeconds);
params.add(Timestamp.from(from));
params.add(Timestamp.from(to));
for (Filter f : filters) {
sql += " AND " + f.column() + " = ?";
params.add(f.value());
}
sql += " GROUP BY period ORDER BY period";
List<TimeseriesBucket> buckets = jdbc.query(sql, (rs, rowNum) ->
new TimeseriesBucket(
rs.getTimestamp("period").toInstant(),
rs.getLong("total_count"), rs.getLong("failed_count"),
rs.getLong("avg_duration"), rs.getLong("p99_duration"),
rs.getLong("active_count")
), params.toArray());
return new StatsTimeseries(buckets);
}
// ── Grouped timeseries ────────────────────────────────────────────────
@Override
public Map<String, StatsTimeseries> timeseriesGroupedByApp(Instant from, Instant to, int bucketCount) {
return queryGroupedTimeseries("stats_1m_app", "application_name", from, to,
bucketCount, List.of());
}
@Override
public Map<String, StatsTimeseries> timeseriesGroupedByRoute(Instant from, Instant to,
int bucketCount, String applicationName) {
return queryGroupedTimeseries("stats_1m_route", "route_id", from, to,
bucketCount, List.of(new Filter("application_name", applicationName)));
}
private Map<String, StatsTimeseries> queryGroupedTimeseries(
String view, String groupCol, Instant from, Instant to,
int bucketCount, List<Filter> filters) {
long intervalSeconds = Duration.between(from, to).toSeconds() / Math.max(bucketCount, 1);
if (intervalSeconds < 60) intervalSeconds = 60;
String sql = "SELECT time_bucket(? * INTERVAL '1 second', bucket) AS period, " +
groupCol + " AS group_key, " +
"COALESCE(SUM(total_count), 0) AS total_count, " +
"COALESCE(SUM(failed_count), 0) AS failed_count, " +
"CASE WHEN SUM(total_count) > 0 THEN SUM(duration_sum) / SUM(total_count) ELSE 0 END AS avg_duration, " +
"COALESCE(MAX(p99_duration), 0) AS p99_duration, " +
"COALESCE(SUM(running_count), 0) AS active_count " +
"FROM " + view + " WHERE bucket >= ? AND bucket < ?";
List<Object> params = new ArrayList<>();
params.add(intervalSeconds);
params.add(Timestamp.from(from));
params.add(Timestamp.from(to));
for (Filter f : filters) {
sql += " AND " + f.column() + " = ?";
params.add(f.value());
}
sql += " GROUP BY period, group_key ORDER BY period, group_key";
Map<String, List<TimeseriesBucket>> grouped = new LinkedHashMap<>();
jdbc.query(sql, (rs) -> {
String key = rs.getString("group_key");
TimeseriesBucket bucket = new TimeseriesBucket(
rs.getTimestamp("period").toInstant(),
rs.getLong("total_count"), rs.getLong("failed_count"),
rs.getLong("avg_duration"), rs.getLong("p99_duration"),
rs.getLong("active_count"));
grouped.computeIfAbsent(key, k -> new ArrayList<>()).add(bucket);
}, params.toArray());
Map<String, StatsTimeseries> result = new LinkedHashMap<>();
grouped.forEach((key, buckets) -> result.put(key, new StatsTimeseries(buckets)));
return result;
}
// ── SLA compliance ────────────────────────────────────────────────────
@Override
public double slaCompliance(Instant from, Instant to, int thresholdMs,
String applicationName, String routeId) {
String sql = "SELECT " +
"COUNT(*) FILTER (WHERE duration_ms <= ? AND status != 'RUNNING') AS compliant, " +
"COUNT(*) FILTER (WHERE status != 'RUNNING') AS total " +
"FROM executions WHERE start_time >= ? AND start_time < ?";
List<Object> params = new ArrayList<>();
params.add(thresholdMs);
params.add(Timestamp.from(from));
params.add(Timestamp.from(to));
if (applicationName != null) {
sql += " AND application_name = ?";
params.add(applicationName);
}
if (routeId != null) {
sql += " AND route_id = ?";
params.add(routeId);
}
return jdbc.query(sql, (rs, rowNum) -> {
long total = rs.getLong("total");
if (total == 0) return 1.0;
return rs.getLong("compliant") * 100.0 / total;
}, params.toArray()).stream().findFirst().orElse(1.0);
}
@Override
public Map<String, long[]> slaCountsByApp(Instant from, Instant to, int defaultThresholdMs) {
String sql = "SELECT application_name, " +
"COUNT(*) FILTER (WHERE duration_ms <= ? AND status != 'RUNNING') AS compliant, " +
"COUNT(*) FILTER (WHERE status != 'RUNNING') AS total " +
"FROM executions WHERE start_time >= ? AND start_time < ? " +
"GROUP BY application_name";
Map<String, long[]> result = new LinkedHashMap<>();
jdbc.query(sql, (rs) -> {
result.put(rs.getString("application_name"),
new long[]{rs.getLong("compliant"), rs.getLong("total")});
}, defaultThresholdMs, Timestamp.from(from), Timestamp.from(to));
return result;
}
@Override
public Map<String, long[]> slaCountsByRoute(Instant from, Instant to,
String applicationName, int thresholdMs) {
String sql = "SELECT route_id, " +
"COUNT(*) FILTER (WHERE duration_ms <= ? AND status != 'RUNNING') AS compliant, " +
"COUNT(*) FILTER (WHERE status != 'RUNNING') AS total " +
"FROM executions WHERE start_time >= ? AND start_time < ? " +
"AND application_name = ? GROUP BY route_id";
Map<String, long[]> result = new LinkedHashMap<>();
jdbc.query(sql, (rs) -> {
result.put(rs.getString("route_id"),
new long[]{rs.getLong("compliant"), rs.getLong("total")});
}, thresholdMs, Timestamp.from(from), Timestamp.from(to), applicationName);
return result;
}
// ── Top errors ────────────────────────────────────────────────────────
@Override
public List<TopError> topErrors(Instant from, Instant to, String applicationName,
String routeId, int limit) {
StringBuilder where = new StringBuilder(
"status = 'FAILED' AND start_time >= ? AND start_time < ?");
List<Object> params = new ArrayList<>();
params.add(Timestamp.from(from));
params.add(Timestamp.from(to));
if (applicationName != null) {
where.append(" AND application_name = ?");
params.add(applicationName);
}
String table;
String groupId;
if (routeId != null) {
// L3: attribute errors to processors
table = "processor_executions";
groupId = "processor_id";
where.append(" AND route_id = ?");
params.add(routeId);
} else {
// L1/L2: attribute errors to routes
table = "executions";
groupId = "route_id";
}
Instant fiveMinAgo = Instant.now().minus(5, ChronoUnit.MINUTES);
Instant tenMinAgo = Instant.now().minus(10, ChronoUnit.MINUTES);
String sql = "WITH counted AS (" +
" SELECT COALESCE(error_type, LEFT(error_message, 200)) AS error_key, " +
" " + groupId + " AS group_id, " +
" COUNT(*) AS cnt, MAX(start_time) AS last_seen " +
" FROM " + table + " WHERE " + where +
" GROUP BY error_key, group_id ORDER BY cnt DESC LIMIT ?" +
"), velocity AS (" +
" SELECT COALESCE(error_type, LEFT(error_message, 200)) AS error_key, " +
" COUNT(*) FILTER (WHERE start_time >= ?) AS recent_5m, " +
" COUNT(*) FILTER (WHERE start_time >= ? AND start_time < ?) AS prev_5m " +
" FROM " + table + " WHERE " + where +
" GROUP BY error_key" +
") SELECT c.error_key, c.group_id, c.cnt, c.last_seen, " +
" COALESCE(v.recent_5m, 0) / 5.0 AS velocity, " +
" CASE " +
" WHEN COALESCE(v.recent_5m, 0) > COALESCE(v.prev_5m, 0) * 1.2 THEN 'accelerating' " +
" WHEN COALESCE(v.recent_5m, 0) < COALESCE(v.prev_5m, 0) * 0.8 THEN 'decelerating' " +
" ELSE 'stable' END AS trend " +
"FROM counted c LEFT JOIN velocity v ON c.error_key = v.error_key " +
"ORDER BY c.cnt DESC";
// Build full params: counted-where params + limit + velocity timestamps + velocity-where params
List<Object> fullParams = new ArrayList<>(params);
fullParams.add(limit);
fullParams.add(Timestamp.from(fiveMinAgo));
fullParams.add(Timestamp.from(tenMinAgo));
fullParams.add(Timestamp.from(fiveMinAgo));
fullParams.addAll(params); // same where clause for velocity CTE
return jdbc.query(sql, (rs, rowNum) -> {
String errorKey = rs.getString("error_key");
String gid = rs.getString("group_id");
return new TopError(
errorKey,
routeId != null ? routeId : gid, // routeId
routeId != null ? gid : null, // processorId (only at L3)
rs.getLong("cnt"),
rs.getDouble("velocity"),
rs.getString("trend"),
rs.getTimestamp("last_seen").toInstant());
}, fullParams.toArray());
}
@Override
public int activeErrorTypes(Instant from, Instant to, String applicationName) {
String sql = "SELECT COUNT(DISTINCT COALESCE(error_type, LEFT(error_message, 200))) " +
"FROM executions WHERE status = 'FAILED' AND start_time >= ? AND start_time < ?";
List<Object> params = new ArrayList<>();
params.add(Timestamp.from(from));
params.add(Timestamp.from(to));
if (applicationName != null) {
sql += " AND application_name = ?";
params.add(applicationName);
}
Integer count = jdbc.queryForObject(sql, Integer.class, params.toArray());
return count != null ? count : 0;
}
// ── Punchcard ─────────────────────────────────────────────────────────
@Override
public List<PunchcardCell> punchcard(Instant from, Instant to, String applicationName) {
String view = applicationName != null ? "stats_1m_app" : "stats_1m_all";
String sql = "SELECT EXTRACT(DOW FROM bucket) AS weekday, " +
"EXTRACT(HOUR FROM bucket) AS hour, " +
"COALESCE(SUM(total_count), 0) AS total_count, " +
"COALESCE(SUM(failed_count), 0) AS failed_count " +
"FROM " + view + " WHERE bucket >= ? AND bucket < ?";
List<Object> params = new ArrayList<>();
params.add(Timestamp.from(from));
params.add(Timestamp.from(to));
if (applicationName != null) {
sql += " AND application_name = ?";
params.add(applicationName);
}
sql += " GROUP BY weekday, hour ORDER BY weekday, hour";
return jdbc.query(sql, (rs, rowNum) -> new PunchcardCell(
rs.getInt("weekday"), rs.getInt("hour"),
rs.getLong("total_count"), rs.getLong("failed_count")),
params.toArray());
}
}

View File

@@ -37,17 +37,11 @@ ingestion:
batch-size: 5000
flush-interval-ms: 1000
opensearch:
url: ${OPENSEARCH_URL:http://localhost:9200}
index-prefix: ${CAMELEER_OPENSEARCH_INDEX_PREFIX:executions-}
queue-size: ${CAMELEER_OPENSEARCH_QUEUE_SIZE:10000}
debounce-ms: ${CAMELEER_OPENSEARCH_DEBOUNCE_MS:2000}
log-index-prefix: ${CAMELEER_LOG_INDEX_PREFIX:logs-}
log-retention-days: ${CAMELEER_LOG_RETENTION_DAYS:7}
cameleer:
body-size-limit: ${CAMELEER_BODY_SIZE_LIMIT:16384}
retention-days: ${CAMELEER_RETENTION_DAYS:30}
indexer:
debounce-ms: ${CAMELEER_INDEXER_DEBOUNCE_MS:2000}
queue-size: ${CAMELEER_INDEXER_QUEUE_SIZE:10000}
security:
access-token-expiry-ms: 3600000
@@ -66,6 +60,12 @@ springdoc:
swagger-ui:
path: /api/v1/swagger-ui
clickhouse:
enabled: ${CLICKHOUSE_ENABLED:true}
url: ${CLICKHOUSE_URL:jdbc:clickhouse://localhost:8123/cameleer}
username: ${CLICKHOUSE_USERNAME:default}
password: ${CLICKHOUSE_PASSWORD:}
management:
endpoints:
web:

View File

@@ -0,0 +1,13 @@
CREATE TABLE IF NOT EXISTS usage_events (
timestamp DateTime64(3) DEFAULT now64(3),
username LowCardinality(String),
method LowCardinality(String),
path String,
normalized LowCardinality(String),
status_code UInt16,
duration_ms UInt32,
query_params String DEFAULT ''
)
ENGINE = MergeTree()
ORDER BY (username, timestamp)
TTL toDateTime(timestamp) + INTERVAL 90 DAY;
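Editor's note: ORDER BY (username, timestamp) stores each user's events contiguously, which is exactly the access pattern of the usage-stats repository shown earlier. A sketch of the kind of read this table serves (the time range is illustrative):

-- Top users over the last 24 hours; the (username, timestamp) sort key
-- lets ClickHouse scan each user's rows contiguously.
SELECT username AS key, count() AS cnt, avg(duration_ms) AS avg_dur
FROM usage_events
WHERE timestamp >= now() - INTERVAL 1 DAY
GROUP BY key
ORDER BY cnt DESC
LIMIT 100;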

View File

@@ -0,0 +1,14 @@
CREATE TABLE IF NOT EXISTS agent_metrics (
tenant_id LowCardinality(String) DEFAULT 'default',
collected_at DateTime64(3),
instance_id LowCardinality(String),
metric_name LowCardinality(String),
metric_value Float64,
tags Map(String, String) DEFAULT map(),
server_received_at DateTime64(3) DEFAULT now64(3)
)
ENGINE = MergeTree()
PARTITION BY (tenant_id, toYYYYMM(collected_at))
ORDER BY (tenant_id, instance_id, metric_name, collected_at)
TTL toDateTime(collected_at) + INTERVAL 365 DAY DELETE
SETTINGS index_granularity = 8192;
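Editor's note: tags is a Map column, so individual labels can be filtered with bracket access rather than JSON parsing. A sketch of a typical read; the metric name, tag key, and instance id are illustrative, not part of the schema:

-- Per-minute average of one metric for one instance, filtered by a tag.
SELECT toStartOfMinute(collected_at) AS minute,
avg(metric_value) AS avg_value
FROM agent_metrics
WHERE instance_id = 'agent-1' -- illustrative instance id
AND metric_name = 'jvm.memory.used' -- illustrative metric name
AND tags['area'] = 'heap' -- illustrative tag key/value
GROUP BY minute
ORDER BY minute;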

View File

@@ -0,0 +1,49 @@
CREATE TABLE IF NOT EXISTS executions (
tenant_id LowCardinality(String) DEFAULT 'default',
execution_id String,
start_time DateTime64(3),
_version UInt64 DEFAULT 1,
route_id LowCardinality(String),
instance_id LowCardinality(String),
application_id LowCardinality(String),
status LowCardinality(String),
correlation_id String DEFAULT '',
exchange_id String DEFAULT '',
end_time Nullable(DateTime64(3)),
duration_ms Nullable(Int64),
error_message String DEFAULT '',
error_stacktrace String DEFAULT '',
error_type LowCardinality(String) DEFAULT '',
error_category LowCardinality(String) DEFAULT '',
root_cause_type String DEFAULT '',
root_cause_message String DEFAULT '',
diagram_content_hash String DEFAULT '',
engine_level LowCardinality(String) DEFAULT '',
input_body String DEFAULT '',
output_body String DEFAULT '',
input_headers String DEFAULT '',
output_headers String DEFAULT '',
attributes String DEFAULT '',
trace_id String DEFAULT '',
span_id String DEFAULT '',
has_trace_data Bool DEFAULT false,
is_replay Bool DEFAULT false,
_search_text String MATERIALIZED
concat(execution_id, ' ', correlation_id, ' ', exchange_id, ' ', route_id,
' ', error_message, ' ', error_stacktrace, ' ', attributes,
' ', input_body, ' ', output_body, ' ', input_headers,
' ', output_headers, ' ', root_cause_message),
INDEX idx_search _search_text TYPE ngrambf_v1(3, 256, 2, 0) GRANULARITY 4,
INDEX idx_error error_message TYPE ngrambf_v1(3, 256, 2, 0) GRANULARITY 4,
INDEX idx_bodies concat(input_body, ' ', output_body) TYPE ngrambf_v1(3, 256, 2, 0) GRANULARITY 4,
INDEX idx_headers concat(input_headers, ' ', output_headers) TYPE ngrambf_v1(3, 256, 2, 0) GRANULARITY 4,
INDEX idx_status status TYPE set(10) GRANULARITY 1,
INDEX idx_corr correlation_id TYPE bloom_filter(0.01) GRANULARITY 4
)
ENGINE = ReplacingMergeTree(_version)
PARTITION BY (tenant_id, toYYYYMM(start_time))
ORDER BY (tenant_id, start_time, application_id, route_id, execution_id)
TTL toDateTime(start_time) + INTERVAL 365 DAY DELETE
SETTINGS index_granularity = 8192;
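Editor's note: ReplacingMergeTree(_version) deduplicates rows sharing a sorting key only at merge time, so reads that must see exactly one row per execution have to deduplicate themselves, either with FINAL or by aggregating on _version. A sketch of the latter, which is typically cheaper than FINAL over large ranges:

-- Latest known state per execution without FINAL: pick each field from
-- the row with the highest _version for that key.
SELECT execution_id,
argMax(status, _version) AS status,
argMax(duration_ms, _version) AS duration_ms
FROM executions
WHERE start_time >= now() - INTERVAL 1 HOUR
GROUP BY execution_id;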

View File

@@ -0,0 +1,45 @@
CREATE TABLE IF NOT EXISTS processor_executions (
tenant_id LowCardinality(String) DEFAULT 'default',
execution_id String,
seq UInt32,
parent_seq Nullable(UInt32),
parent_processor_id String DEFAULT '',
processor_id String,
processor_type LowCardinality(String),
start_time DateTime64(3),
route_id LowCardinality(String),
application_id LowCardinality(String),
iteration Nullable(Int32),
iteration_size Nullable(Int32),
status LowCardinality(String),
end_time Nullable(DateTime64(3)),
duration_ms Nullable(Int64),
error_message String DEFAULT '',
error_stacktrace String DEFAULT '',
error_type LowCardinality(String) DEFAULT '',
error_category LowCardinality(String) DEFAULT '',
root_cause_type String DEFAULT '',
root_cause_message String DEFAULT '',
input_body String DEFAULT '',
output_body String DEFAULT '',
input_headers String DEFAULT '',
output_headers String DEFAULT '',
attributes String DEFAULT '',
resolved_endpoint_uri String DEFAULT '',
circuit_breaker_state LowCardinality(String) DEFAULT '',
fallback_triggered Bool DEFAULT false,
filter_matched Bool DEFAULT false,
duplicate_message Bool DEFAULT false,
_search_text String MATERIALIZED
concat(error_message, ' ', error_stacktrace, ' ', attributes,
' ', input_body, ' ', output_body, ' ', input_headers, ' ', output_headers),
INDEX idx_search _search_text TYPE ngrambf_v1(3, 256, 2, 0) GRANULARITY 4,
INDEX idx_exec_id execution_id TYPE bloom_filter(0.01) GRANULARITY 4
)
ENGINE = MergeTree()
PARTITION BY (tenant_id, toYYYYMM(start_time))
ORDER BY (tenant_id, start_time, application_id, route_id, execution_id, seq)
TTL toDateTime(start_time) + INTERVAL 365 DAY DELETE
SETTINGS index_granularity = 8192;
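Editor's note: the bloom-filter index on execution_id supports the point lookup used when rendering a single trace, while seq/parent_seq encode the processor tree. A sketch of fetching one execution's steps in order (the id is illustrative):

-- All steps of one execution; idx_exec_id lets ClickHouse skip granules
-- that cannot contain this execution_id.
SELECT seq, parent_seq, processor_id, processor_type, status, duration_ms
FROM processor_executions
WHERE execution_id = 'exec-123' -- illustrative id
ORDER BY seq;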

View File

@@ -0,0 +1,167 @@
-- V4__stats_tables_and_mvs.sql
-- Pre-aggregated statistics tables and materialized views.
-- Tables use AggregatingMergeTree, MVs use -State combinators.
-- stats_1m_all (global)
CREATE TABLE IF NOT EXISTS stats_1m_all (
tenant_id LowCardinality(String),
bucket DateTime,
total_count AggregateFunction(count),
failed_count AggregateFunction(countIf, UInt8),
running_count AggregateFunction(countIf, UInt8),
duration_sum AggregateFunction(sum, Nullable(Int64)),
duration_max AggregateFunction(max, Nullable(Int64)),
p99_duration AggregateFunction(quantile(0.99), Nullable(Int64))
)
ENGINE = AggregatingMergeTree()
PARTITION BY (tenant_id, toYYYYMM(bucket))
ORDER BY (tenant_id, bucket)
TTL bucket + INTERVAL 365 DAY DELETE;
CREATE MATERIALIZED VIEW IF NOT EXISTS stats_1m_all_mv TO stats_1m_all AS
SELECT
tenant_id,
toStartOfMinute(start_time) AS bucket,
countState() AS total_count,
countIfState(status = 'FAILED') AS failed_count,
countIfState(status = 'RUNNING') AS running_count,
sumState(duration_ms) AS duration_sum,
maxState(duration_ms) AS duration_max,
quantileState(0.99)(duration_ms) AS p99_duration
FROM executions
GROUP BY tenant_id, bucket;
-- stats_1m_app (per-application)
CREATE TABLE IF NOT EXISTS stats_1m_app (
tenant_id LowCardinality(String),
application_id LowCardinality(String),
bucket DateTime,
total_count AggregateFunction(count),
failed_count AggregateFunction(countIf, UInt8),
running_count AggregateFunction(countIf, UInt8),
duration_sum AggregateFunction(sum, Nullable(Int64)),
duration_max AggregateFunction(max, Nullable(Int64)),
p99_duration AggregateFunction(quantile(0.99), Nullable(Int64))
)
ENGINE = AggregatingMergeTree()
PARTITION BY (tenant_id, toYYYYMM(bucket))
ORDER BY (tenant_id, application_id, bucket)
TTL bucket + INTERVAL 365 DAY DELETE;
CREATE MATERIALIZED VIEW IF NOT EXISTS stats_1m_app_mv TO stats_1m_app AS
SELECT
tenant_id,
application_id,
toStartOfMinute(start_time) AS bucket,
countState() AS total_count,
countIfState(status = 'FAILED') AS failed_count,
countIfState(status = 'RUNNING') AS running_count,
sumState(duration_ms) AS duration_sum,
maxState(duration_ms) AS duration_max,
quantileState(0.99)(duration_ms) AS p99_duration
FROM executions
GROUP BY tenant_id, application_id, bucket;
-- stats_1m_route (per-route)
CREATE TABLE IF NOT EXISTS stats_1m_route (
tenant_id LowCardinality(String),
application_id LowCardinality(String),
route_id LowCardinality(String),
bucket DateTime,
total_count AggregateFunction(count),
failed_count AggregateFunction(countIf, UInt8),
running_count AggregateFunction(countIf, UInt8),
duration_sum AggregateFunction(sum, Nullable(Int64)),
duration_max AggregateFunction(max, Nullable(Int64)),
p99_duration AggregateFunction(quantile(0.99), Nullable(Int64))
)
ENGINE = AggregatingMergeTree()
PARTITION BY (tenant_id, toYYYYMM(bucket))
ORDER BY (tenant_id, application_id, route_id, bucket)
TTL bucket + INTERVAL 365 DAY DELETE;
CREATE MATERIALIZED VIEW IF NOT EXISTS stats_1m_route_mv TO stats_1m_route AS
SELECT
tenant_id,
application_id,
route_id,
toStartOfMinute(start_time) AS bucket,
countState() AS total_count,
countIfState(status = 'FAILED') AS failed_count,
countIfState(status = 'RUNNING') AS running_count,
sumState(duration_ms) AS duration_sum,
maxState(duration_ms) AS duration_max,
quantileState(0.99)(duration_ms) AS p99_duration
FROM executions
GROUP BY tenant_id, application_id, route_id, bucket;
-- stats_1m_processor (per-processor-type)
CREATE TABLE IF NOT EXISTS stats_1m_processor (
tenant_id LowCardinality(String),
application_id LowCardinality(String),
processor_type LowCardinality(String),
bucket DateTime,
total_count AggregateFunction(count),
failed_count AggregateFunction(countIf, UInt8),
duration_sum AggregateFunction(sum, Nullable(Int64)),
duration_max AggregateFunction(max, Nullable(Int64)),
p99_duration AggregateFunction(quantile(0.99), Nullable(Int64))
)
ENGINE = AggregatingMergeTree()
PARTITION BY (tenant_id, toYYYYMM(bucket))
ORDER BY (tenant_id, application_id, processor_type, bucket)
TTL bucket + INTERVAL 365 DAY DELETE;
CREATE MATERIALIZED VIEW IF NOT EXISTS stats_1m_processor_mv TO stats_1m_processor AS
SELECT
tenant_id,
application_id,
processor_type,
toStartOfMinute(start_time) AS bucket,
countState() AS total_count,
countIfState(status = 'FAILED') AS failed_count,
sumState(duration_ms) AS duration_sum,
maxState(duration_ms) AS duration_max,
quantileState(0.99)(duration_ms) AS p99_duration
FROM processor_executions
GROUP BY tenant_id, application_id, processor_type, bucket;
-- stats_1m_processor_detail (per-processor-id)
CREATE TABLE IF NOT EXISTS stats_1m_processor_detail (
tenant_id LowCardinality(String),
application_id LowCardinality(String),
route_id LowCardinality(String),
processor_id String,
processor_type LowCardinality(String),
bucket DateTime,
total_count AggregateFunction(count),
failed_count AggregateFunction(countIf, UInt8),
duration_sum AggregateFunction(sum, Nullable(Int64)),
duration_max AggregateFunction(max, Nullable(Int64)),
p99_duration AggregateFunction(quantile(0.99), Nullable(Int64))
)
ENGINE = AggregatingMergeTree()
PARTITION BY (tenant_id, toYYYYMM(bucket))
ORDER BY (tenant_id, application_id, route_id, processor_id, processor_type, bucket)
TTL bucket + INTERVAL 365 DAY DELETE;
CREATE MATERIALIZED VIEW IF NOT EXISTS stats_1m_processor_detail_mv TO stats_1m_processor_detail AS
SELECT
tenant_id,
application_id,
route_id,
processor_id,
processor_type,
toStartOfMinute(start_time) AS bucket,
countState() AS total_count,
countIfState(status = 'FAILED') AS failed_count,
sumState(duration_ms) AS duration_sum,
maxState(duration_ms) AS duration_max,
quantileState(0.99)(duration_ms) AS p99_duration
FROM processor_executions
GROUP BY tenant_id, application_id, route_id, processor_id, processor_type, bucket;
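Editor's note: because these stats tables store AggregateFunction states rather than finished numbers, every read must finalize them with the matching -Merge combinator; selecting the columns directly returns opaque state blobs. A sketch of rolling the 1-minute buckets up to hours:

-- Roll 1-minute buckets up to hourly stats; -Merge finalizes the states.
SELECT toStartOfHour(bucket) AS hour,
countMerge(total_count) AS total,
countIfMerge(failed_count) AS failed,
sumMerge(duration_sum) / countMerge(total_count) AS avg_duration,
quantileMerge(0.99)(p99_duration) AS p99
FROM stats_1m_all
WHERE bucket >= now() - INTERVAL 1 DAY
GROUP BY hour
ORDER BY hour;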

View File

@@ -0,0 +1,2 @@
ALTER TABLE executions ADD COLUMN IF NOT EXISTS original_exchange_id String DEFAULT '';
ALTER TABLE executions ADD COLUMN IF NOT EXISTS replay_exchange_id String DEFAULT '';

View File

@@ -0,0 +1,12 @@
CREATE TABLE IF NOT EXISTS route_diagrams (
tenant_id LowCardinality(String) DEFAULT 'default',
content_hash String,
route_id LowCardinality(String),
instance_id LowCardinality(String),
application_id LowCardinality(String),
definition String,
created_at DateTime64(3) DEFAULT now64(3)
)
ENGINE = ReplacingMergeTree(created_at)
ORDER BY (tenant_id, content_hash)
SETTINGS index_granularity = 8192;
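Editor's note: diagrams are content-addressed. The sorting key is (tenant_id, content_hash), and ReplacingMergeTree(created_at) collapses re-uploads of the same hash to the newest row. A lookup sketch (the hash value is illustrative):

-- Fetch one diagram by hash; FINAL collapses any not-yet-merged duplicates.
SELECT definition
FROM route_diagrams FINAL
WHERE tenant_id = 'default' AND content_hash = 'abc123'; -- illustrative hash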

View File

@@ -0,0 +1,12 @@
CREATE TABLE IF NOT EXISTS agent_events (
tenant_id LowCardinality(String) DEFAULT 'default',
timestamp DateTime64(3) DEFAULT now64(3),
instance_id LowCardinality(String),
application_id LowCardinality(String),
event_type LowCardinality(String),
detail String DEFAULT ''
)
ENGINE = MergeTree()
PARTITION BY (tenant_id, toYYYYMM(timestamp))
ORDER BY (tenant_id, application_id, instance_id, timestamp)
TTL toDateTime(timestamp) + INTERVAL 365 DAY DELETE;

View File

@@ -0,0 +1,22 @@
CREATE TABLE IF NOT EXISTS logs (
tenant_id LowCardinality(String) DEFAULT 'default',
timestamp DateTime64(3),
application LowCardinality(String),
instance_id LowCardinality(String),
level LowCardinality(String),
logger_name LowCardinality(String) DEFAULT '',
message String,
thread_name LowCardinality(String) DEFAULT '',
stack_trace String DEFAULT '',
exchange_id String DEFAULT '',
mdc Map(String, String) DEFAULT map(),
INDEX idx_msg message TYPE ngrambf_v1(3, 256, 2, 0) GRANULARITY 4,
INDEX idx_stack stack_trace TYPE ngrambf_v1(3, 256, 2, 0) GRANULARITY 4,
INDEX idx_level level TYPE set(10) GRANULARITY 1
)
ENGINE = MergeTree()
PARTITION BY (tenant_id, toYYYYMM(timestamp))
ORDER BY (tenant_id, application, timestamp)
TTL toDateTime(timestamp) + INTERVAL 365 DAY DELETE
SETTINGS index_granularity = 8192;
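Editor's note: the ngrambf_v1 indexes turn substring search over message and stack_trace into a granule-skipping operation rather than a full scan, since any LIKE pattern of three or more characters can be pre-filtered through the 3-gram bloom filter. A sketch (the application name is illustrative):

-- Recent error logs containing a substring; idx_msg prunes granules whose
-- 3-gram bloom filter cannot match the pattern.
SELECT timestamp, level, logger_name, message
FROM logs
WHERE application = 'orders-service' -- illustrative application name
AND timestamp >= now() - INTERVAL 1 HOUR
AND level = 'ERROR'
AND message LIKE '%Connection refused%'
ORDER BY timestamp DESC
LIMIT 100;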

View File

@@ -1,23 +0,0 @@
-- executions: store raw processor tree for faithful detail response
ALTER TABLE executions ADD COLUMN processors_json JSONB;
-- executions: error categorization + OTel tracing
ALTER TABLE executions ADD COLUMN error_type TEXT;
ALTER TABLE executions ADD COLUMN error_category TEXT;
ALTER TABLE executions ADD COLUMN root_cause_type TEXT;
ALTER TABLE executions ADD COLUMN root_cause_message TEXT;
ALTER TABLE executions ADD COLUMN trace_id TEXT;
ALTER TABLE executions ADD COLUMN span_id TEXT;
-- processor_executions: error categorization + circuit breaker
ALTER TABLE processor_executions ADD COLUMN error_type TEXT;
ALTER TABLE processor_executions ADD COLUMN error_category TEXT;
ALTER TABLE processor_executions ADD COLUMN root_cause_type TEXT;
ALTER TABLE processor_executions ADD COLUMN root_cause_message TEXT;
ALTER TABLE processor_executions ADD COLUMN error_handler_type TEXT;
ALTER TABLE processor_executions ADD COLUMN circuit_breaker_state TEXT;
ALTER TABLE processor_executions ADD COLUMN fallback_triggered BOOLEAN;
-- Remove erroneous depth columns from V9
ALTER TABLE processor_executions DROP COLUMN IF EXISTS split_depth;
ALTER TABLE processor_executions DROP COLUMN IF EXISTS loop_depth;

View File

@@ -1,10 +0,0 @@
-- Flag indicating whether any processor in this execution captured trace data
ALTER TABLE executions ADD COLUMN IF NOT EXISTS has_trace_data BOOLEAN NOT NULL DEFAULT FALSE;
-- Backfill: set flag for existing executions that have processor trace data
UPDATE executions e SET has_trace_data = TRUE
WHERE EXISTS (
SELECT 1 FROM processor_executions pe
WHERE pe.execution_id = e.execution_id
AND (pe.input_body IS NOT NULL OR pe.output_body IS NOT NULL)
);

View File

@@ -1,11 +0,0 @@
-- Per-application dashboard settings (SLA thresholds, health dot thresholds)
CREATE TABLE app_settings (
app_id TEXT PRIMARY KEY,
sla_threshold_ms INTEGER NOT NULL DEFAULT 300,
health_error_warn DOUBLE PRECISION NOT NULL DEFAULT 1.0,
health_error_crit DOUBLE PRECISION NOT NULL DEFAULT 5.0,
health_sla_warn DOUBLE PRECISION NOT NULL DEFAULT 99.0,
health_sla_crit DOUBLE PRECISION NOT NULL DEFAULT 95.0,
created_at TIMESTAMPTZ NOT NULL DEFAULT now(),
updated_at TIMESTAMPTZ NOT NULL DEFAULT now()
);

View File

@@ -1,7 +0,0 @@
-- Flag indicating whether this execution is a replayed exchange
ALTER TABLE executions ADD COLUMN IF NOT EXISTS is_replay BOOLEAN NOT NULL DEFAULT FALSE;
-- Backfill: check inputHeaders JSON for X-Cameleer-Replay header
UPDATE executions SET is_replay = TRUE
WHERE input_headers IS NOT NULL
AND input_headers::jsonb ? 'X-Cameleer-Replay';

View File

@@ -1,8 +1,6 @@
--- V1__init.sql - Consolidated schema for Cameleer3
--- Extensions
-CREATE EXTENSION IF NOT EXISTS timescaledb;
-CREATE EXTENSION IF NOT EXISTS timescaledb_toolkit;
+-- V1__init.sql — PostgreSQL schema for Cameleer3 Server
+-- PostgreSQL stores RBAC, configuration, and audit data only.
+-- All observability data (executions, metrics, diagrams, logs, stats) is in ClickHouse.
-- =============================================================
-- RBAC
@@ -40,7 +38,6 @@ CREATE TABLE groups (
created_at TIMESTAMPTZ NOT NULL DEFAULT now()
);
-- Built-in Admins group
INSERT INTO groups (id, name) VALUES
('00000000-0000-0000-0000-000000000010', 'Admins');
@@ -50,7 +47,6 @@ CREATE TABLE group_roles (
PRIMARY KEY (group_id, role_id)
);
-- Assign ADMIN role to Admins group
INSERT INTO group_roles (group_id, role_id) VALUES
('00000000-0000-0000-0000-000000000010', '00000000-0000-0000-0000-000000000004');
@@ -71,113 +67,6 @@ CREATE INDEX idx_user_groups_user_id ON user_groups(user_id);
CREATE INDEX idx_group_roles_group_id ON group_roles(group_id);
CREATE INDEX idx_groups_parent ON groups(parent_group_id);
-- =============================================================
-- Execution data (TimescaleDB hypertables)
-- =============================================================
CREATE TABLE executions (
execution_id TEXT NOT NULL,
route_id TEXT NOT NULL,
agent_id TEXT NOT NULL,
application_name TEXT NOT NULL,
status TEXT NOT NULL,
correlation_id TEXT,
exchange_id TEXT,
start_time TIMESTAMPTZ NOT NULL,
end_time TIMESTAMPTZ,
duration_ms BIGINT,
error_message TEXT,
error_stacktrace TEXT,
diagram_content_hash TEXT,
created_at TIMESTAMPTZ NOT NULL DEFAULT now(),
updated_at TIMESTAMPTZ NOT NULL DEFAULT now(),
PRIMARY KEY (execution_id, start_time)
);
SELECT create_hypertable('executions', 'start_time', chunk_time_interval => INTERVAL '1 day');
CREATE INDEX idx_executions_agent_time ON executions (agent_id, start_time DESC);
CREATE INDEX idx_executions_route_time ON executions (route_id, start_time DESC);
CREATE INDEX idx_executions_app_time ON executions (application_name, start_time DESC);
CREATE INDEX idx_executions_correlation ON executions (correlation_id);
CREATE TABLE processor_executions (
id BIGSERIAL,
execution_id TEXT NOT NULL,
processor_id TEXT NOT NULL,
processor_type TEXT NOT NULL,
diagram_node_id TEXT,
application_name TEXT NOT NULL,
route_id TEXT NOT NULL,
depth INT NOT NULL,
parent_processor_id TEXT,
status TEXT NOT NULL,
start_time TIMESTAMPTZ NOT NULL,
end_time TIMESTAMPTZ,
duration_ms BIGINT,
error_message TEXT,
error_stacktrace TEXT,
input_body TEXT,
output_body TEXT,
input_headers JSONB,
output_headers JSONB,
created_at TIMESTAMPTZ NOT NULL DEFAULT now(),
UNIQUE (execution_id, processor_id, start_time)
);
SELECT create_hypertable('processor_executions', 'start_time', chunk_time_interval => INTERVAL '1 day');
CREATE INDEX idx_proc_exec_execution ON processor_executions (execution_id);
CREATE INDEX idx_proc_exec_type_time ON processor_executions (processor_type, start_time DESC);
-- =============================================================
-- Agent metrics
-- =============================================================
CREATE TABLE agent_metrics (
agent_id TEXT NOT NULL,
metric_name TEXT NOT NULL,
metric_value DOUBLE PRECISION NOT NULL,
tags JSONB,
collected_at TIMESTAMPTZ NOT NULL,
server_received_at TIMESTAMPTZ NOT NULL DEFAULT now()
);
SELECT create_hypertable('agent_metrics', 'collected_at', chunk_time_interval => INTERVAL '1 day');
CREATE INDEX idx_metrics_agent_name ON agent_metrics (agent_id, metric_name, collected_at DESC);
-- =============================================================
-- Route diagrams
-- =============================================================
CREATE TABLE route_diagrams (
content_hash TEXT PRIMARY KEY,
route_id TEXT NOT NULL,
agent_id TEXT NOT NULL,
definition TEXT NOT NULL,
created_at TIMESTAMPTZ NOT NULL DEFAULT now()
);
CREATE INDEX idx_diagrams_route_agent ON route_diagrams (route_id, agent_id);
-- =============================================================
-- Agent events
-- =============================================================
CREATE TABLE agent_events (
id BIGSERIAL PRIMARY KEY,
agent_id TEXT NOT NULL,
app_id TEXT NOT NULL,
event_type TEXT NOT NULL,
detail TEXT,
timestamp TIMESTAMPTZ NOT NULL DEFAULT NOW()
);
CREATE INDEX idx_agent_events_agent ON agent_events(agent_id, timestamp DESC);
CREATE INDEX idx_agent_events_app ON agent_events(app_id, timestamp DESC);
CREATE INDEX idx_agent_events_time ON agent_events(timestamp DESC);
-- =============================================================
-- Server configuration
-- =============================================================
@@ -190,7 +79,30 @@ CREATE TABLE server_config (
);
-- =============================================================
--- Admin
+-- Application configuration
-- =============================================================
CREATE TABLE application_config (
application TEXT PRIMARY KEY,
config_val JSONB NOT NULL,
version INTEGER NOT NULL DEFAULT 1,
updated_at TIMESTAMPTZ NOT NULL DEFAULT now(),
updated_by TEXT
);
CREATE TABLE app_settings (
application_id TEXT PRIMARY KEY,
sla_threshold_ms INTEGER NOT NULL DEFAULT 300,
health_error_warn DOUBLE PRECISION NOT NULL DEFAULT 1.0,
health_error_crit DOUBLE PRECISION NOT NULL DEFAULT 5.0,
health_sla_warn DOUBLE PRECISION NOT NULL DEFAULT 99.0,
health_sla_crit DOUBLE PRECISION NOT NULL DEFAULT 95.0,
created_at TIMESTAMPTZ NOT NULL DEFAULT now(),
updated_at TIMESTAMPTZ NOT NULL DEFAULT now()
);
-- =============================================================
-- Audit log
-- =============================================================
CREATE TABLE audit_log (
@@ -211,93 +123,3 @@ CREATE INDEX idx_audit_log_username ON audit_log (username);
CREATE INDEX idx_audit_log_category ON audit_log (category);
CREATE INDEX idx_audit_log_action ON audit_log (action);
CREATE INDEX idx_audit_log_target ON audit_log (target);
-- =============================================================
-- Continuous aggregates
-- =============================================================
CREATE MATERIALIZED VIEW stats_1m_all
WITH (timescaledb.continuous, timescaledb.materialized_only = false) AS
SELECT
time_bucket('1 minute', start_time) AS bucket,
COUNT(*) AS total_count,
COUNT(*) FILTER (WHERE status = 'FAILED') AS failed_count,
COUNT(*) FILTER (WHERE status = 'RUNNING') AS running_count,
SUM(duration_ms) AS duration_sum,
MAX(duration_ms) AS duration_max,
approx_percentile(0.99, percentile_agg(duration_ms::DOUBLE PRECISION)) AS p99_duration
FROM executions
WHERE status IS NOT NULL
GROUP BY bucket
WITH NO DATA;
CREATE MATERIALIZED VIEW stats_1m_app
WITH (timescaledb.continuous, timescaledb.materialized_only = false) AS
SELECT
time_bucket('1 minute', start_time) AS bucket,
application_name,
COUNT(*) AS total_count,
COUNT(*) FILTER (WHERE status = 'FAILED') AS failed_count,
COUNT(*) FILTER (WHERE status = 'RUNNING') AS running_count,
SUM(duration_ms) AS duration_sum,
MAX(duration_ms) AS duration_max,
approx_percentile(0.99, percentile_agg(duration_ms::DOUBLE PRECISION)) AS p99_duration
FROM executions
WHERE status IS NOT NULL
GROUP BY bucket, application_name
WITH NO DATA;
CREATE MATERIALIZED VIEW stats_1m_route
WITH (timescaledb.continuous, timescaledb.materialized_only = false) AS
SELECT
time_bucket('1 minute', start_time) AS bucket,
application_name,
route_id,
COUNT(*) AS total_count,
COUNT(*) FILTER (WHERE status = 'FAILED') AS failed_count,
COUNT(*) FILTER (WHERE status = 'RUNNING') AS running_count,
SUM(duration_ms) AS duration_sum,
MAX(duration_ms) AS duration_max,
approx_percentile(0.99, percentile_agg(duration_ms::DOUBLE PRECISION)) AS p99_duration
FROM executions
WHERE status IS NOT NULL
GROUP BY bucket, application_name, route_id
WITH NO DATA;
CREATE MATERIALIZED VIEW stats_1m_processor
WITH (timescaledb.continuous, timescaledb.materialized_only = false) AS
SELECT
time_bucket('1 minute', start_time) AS bucket,
application_name,
route_id,
processor_type,
COUNT(*) AS total_count,
COUNT(*) FILTER (WHERE status = 'FAILED') AS failed_count,
SUM(duration_ms) AS duration_sum,
MAX(duration_ms) AS duration_max,
approx_percentile(0.99, percentile_agg(duration_ms::DOUBLE PRECISION)) AS p99_duration
FROM processor_executions
GROUP BY bucket, application_name, route_id, processor_type
WITH NO DATA;
CREATE MATERIALIZED VIEW stats_1m_processor_detail
WITH (timescaledb.continuous, timescaledb.materialized_only = false) AS
SELECT
time_bucket('1 minute', start_time) AS bucket,
application_name,
route_id,
processor_id,
processor_type,
COUNT(*) AS total_count,
COUNT(*) FILTER (WHERE status = 'FAILED') AS failed_count,
SUM(duration_ms) AS duration_sum,
MAX(duration_ms) AS duration_max,
approx_percentile(0.99, percentile_agg(duration_ms)) AS p99_duration
FROM processor_executions
GROUP BY bucket, application_name, route_id, processor_id, processor_type
WITH NO DATA;

View File

@@ -1,38 +0,0 @@
-- V2__policies.sql - TimescaleDB policies (must run outside transaction)
-- flyway:executeInTransaction=false
-- Agent metrics retention & compression
ALTER TABLE agent_metrics SET (timescaledb.compress);
SELECT add_retention_policy('agent_metrics', INTERVAL '90 days', if_not_exists => true);
SELECT add_compression_policy('agent_metrics', INTERVAL '7 days', if_not_exists => true);
-- Continuous aggregate refresh policies
SELECT add_continuous_aggregate_policy('stats_1m_all',
start_offset => INTERVAL '1 hour',
end_offset => INTERVAL '1 minute',
schedule_interval => INTERVAL '1 minute',
if_not_exists => true);
SELECT add_continuous_aggregate_policy('stats_1m_app',
start_offset => INTERVAL '1 hour',
end_offset => INTERVAL '1 minute',
schedule_interval => INTERVAL '1 minute',
if_not_exists => true);
SELECT add_continuous_aggregate_policy('stats_1m_route',
start_offset => INTERVAL '1 hour',
end_offset => INTERVAL '1 minute',
schedule_interval => INTERVAL '1 minute',
if_not_exists => true);
SELECT add_continuous_aggregate_policy('stats_1m_processor',
start_offset => INTERVAL '1 hour',
end_offset => INTERVAL '1 minute',
schedule_interval => INTERVAL '1 minute',
if_not_exists => true);
SELECT add_continuous_aggregate_policy('stats_1m_processor_detail',
start_offset => INTERVAL '1 hour',
end_offset => INTERVAL '1 minute',
schedule_interval => INTERVAL '1 minute',
if_not_exists => true);

View File

@@ -1,9 +0,0 @@
-- Add engine level and route-level snapshot columns to executions table.
-- Required for REGULAR engine level where route-level payloads exist but
-- no processor execution records are created.
ALTER TABLE executions ADD COLUMN IF NOT EXISTS engine_level VARCHAR(16);
ALTER TABLE executions ADD COLUMN IF NOT EXISTS input_body TEXT;
ALTER TABLE executions ADD COLUMN IF NOT EXISTS output_body TEXT;
ALTER TABLE executions ADD COLUMN IF NOT EXISTS input_headers JSONB;
ALTER TABLE executions ADD COLUMN IF NOT EXISTS output_headers JSONB;

View File

@@ -1,9 +0,0 @@
-- Per-application configuration for agent observability settings.
-- Agents download this at startup and receive updates via SSE CONFIG_UPDATE.
CREATE TABLE application_config (
application TEXT PRIMARY KEY,
config_val JSONB NOT NULL,
version INTEGER NOT NULL DEFAULT 1,
updated_at TIMESTAMPTZ NOT NULL DEFAULT now(),
updated_by TEXT
);

View File

@@ -1,2 +0,0 @@
ALTER TABLE executions ADD COLUMN IF NOT EXISTS attributes JSONB;
ALTER TABLE processor_executions ADD COLUMN IF NOT EXISTS attributes JSONB;

View File

@@ -1 +0,0 @@
ALTER TABLE processor_executions DROP COLUMN IF EXISTS diagram_node_id;

View File

@@ -1,2 +0,0 @@
ALTER TABLE route_diagrams ADD COLUMN IF NOT EXISTS application_name TEXT NOT NULL DEFAULT '';
CREATE INDEX IF NOT EXISTS idx_diagrams_application ON route_diagrams (application_name);

View File

@@ -1,5 +0,0 @@
ALTER TABLE processor_executions ADD COLUMN IF NOT EXISTS loop_index INTEGER;
ALTER TABLE processor_executions ADD COLUMN IF NOT EXISTS loop_size INTEGER;
ALTER TABLE processor_executions ADD COLUMN IF NOT EXISTS split_index INTEGER;
ALTER TABLE processor_executions ADD COLUMN IF NOT EXISTS split_size INTEGER;
ALTER TABLE processor_executions ADD COLUMN IF NOT EXISTS multicast_index INTEGER;

View File

@@ -1,3 +0,0 @@
ALTER TABLE processor_executions ADD COLUMN resolved_endpoint_uri TEXT;
ALTER TABLE processor_executions ADD COLUMN split_depth INTEGER DEFAULT 0;
ALTER TABLE processor_executions ADD COLUMN loop_depth INTEGER DEFAULT 0;

Some files were not shown because too many files have changed in this diff.