2026-03-11 11:05:37 +01:00

Feature Landscape

Domain: Transaction monitoring / observability for Apache Camel route executions
Researched: 2026-03-11
Confidence: MEDIUM (based on domain expertise from nJAMS Server, Jaeger, Zipkin, Dynatrace; web search unavailable for latest feature sets)

Table Stakes

Features users expect. Missing = product feels incomplete.

Transaction Search and Filtering

| Feature | Why Expected | Complexity | Notes |
|---|---|---|---|
| Search by time range | Every monitoring tool has this; primary axis for incident investigation | Low | Date picker with presets (last 15m, 1h, 24h, 7d, custom) |
| Filter by transaction state | SUCCESS/ERROR/WARNING is the first thing ops checks | Low | Multi-select checkboxes, counts per state |
| Filter by duration | Finding slow transactions is core use case | Low | Min/max duration inputs, or predefined buckets |
| Full-text search across payload/attributes | Users need to find "that one order ID" across millions of records | Medium | Requires text index; match highlighting in results |
| Combined/compound filters | Users always combine: "errors in last hour on instance X" | Medium | AND-composition of all filter criteria |
| Paginated result list | Cannot load millions of rows; must page or virtual-scroll | Low | Cursor-based pagination preferred over offset for large datasets |
| Sort by time, duration, state | Basic result ordering | Low | Default: newest first |
| Filter by agent/instance | "Show me only transactions from production-instance-3" | Low | Dropdown populated from agent registry |
| Filter by route name | Users think in routes, not raw IDs | Low | Autocomplete from known route definitions |
| Save/bookmark search queries | Ops teams reuse the same searches during incidents | Medium | Named saved searches, shareable via URL |
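
The compound-filter and cursor-pagination rows can be illustrated with a small in-memory sketch. It is not the project's API: the `Transaction` fields, the `matches` and `search_page` helpers, and the `(start_ms, tx_id)` cursor shape are all assumptions made for the example, and a real implementation would push the same logic into the storage query.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transaction:
    # Hypothetical record shape, chosen only for this sketch.
    tx_id: str
    start_ms: int      # start time, epoch milliseconds
    duration_ms: int
    state: str         # "SUCCESS", "WARNING" or "ERROR"
    instance: str


def matches(tx, *, since_ms=None, states=None, min_duration_ms=None, instance=None):
    """AND-compose the criteria; a criterion left as None is not applied."""
    return ((since_ms is None or tx.start_ms >= since_ms)
            and (states is None or tx.state in states)
            and (min_duration_ms is None or tx.duration_ms >= min_duration_ms)
            and (instance is None or tx.instance == instance))


def search_page(transactions, page_size, cursor=None, **criteria):
    """Return (page, next_cursor), newest first.

    The cursor is the (start_ms, tx_id) sort key of the last row already
    returned, so the next page resumes strictly after that row instead of
    skipping a fixed number of rows as offset pagination would.
    """
    hits = [t for t in transactions if matches(t, **criteria)]
    hits.sort(key=lambda t: (t.start_ms, t.tx_id), reverse=True)
    if cursor is not None:
        hits = [t for t in hits if (t.start_ms, t.tx_id) < cursor]
    page = hits[:page_size]
    more = len(hits) > page_size
    next_cursor = (page[-1].start_ms, page[-1].tx_id) if more else None
    return page, next_cursor
```

Including `tx_id` in the sort key keeps the ordering total when two transactions share a start time, which is what lets the cursor resume without repeating or dropping rows.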

Transaction Detail and Drill-Down

| Feature | Why Expected | Complexity | Notes |
|---|---|---|---|
| Transaction summary view | One-glance: state, start time, duration, instance, route entry point | Low | Header card in detail page |
| Activity list (per-route breakdown) | Hierarchical view of all route executions within a transaction | Medium | Tree or table showing each activity with timing |
| Activity timing waterfall | Visual timeline showing which routes executed when, and their overlap | Medium | Horizontal bar chart; critical for finding bottlenecks |
| Payload/attribute inspection | View message body, headers, properties at each activity step | Medium | Expandable sections; JSON/XML pretty-printing |
| Error detail with stack trace | When a transaction fails, users need the exception detail immediately | Low | Rendered stack trace with copy button |
| Cross-instance correlation | Transaction spans instances A and B -- show the full chain | High | Requires correlation ID propagation; single unified view |
| Link to route diagram | From any activity, jump to the diagram showing the route definition | Low | Hyperlink; depends on diagram storage existing |

Route Diagram Visualization

| Feature | Why Expected | Complexity | Notes |
|---|---|---|---|
| Render route diagram from stored definition | The core differentiator vs generic tracing tools; users think in Camel routes | High | Server-side or client-side rendering from graph model |
| Diagram versioning | Route changed last Tuesday -- show the diagram as it was when the transaction ran | Medium | Version stored per diagram; transaction references specific version |
| Zoom and pan | Diagrams can be large (50+ nodes); must be navigable | Medium | Standard canvas controls; minimap helpful for large diagrams |
| Execution overlay on diagram | Highlight which path the transaction actually took through the route | High | Color/annotate nodes with state (success/error), timing |
| Node click for activity detail | Click a node in the diagram to see the activity data for that step | Medium | Links diagram nodes to activity records |
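
As a rough illustration of the execution overlay, the sketch below joins a transaction's activity records to the nodes of a diagram version. The record shape (`node_id`, `state`, `duration_ms`) and the `execution_overlay` function are hypothetical; the actual graph and activity models are not specified in this document.

```python
def execution_overlay(diagram_nodes, activities):
    """Annotate each diagram node with what the transaction did there.

    diagram_nodes: iterable of node IDs in the stored diagram version.
    activities: list of dicts with "node_id", "state" and "duration_ms"
                (an assumed record shape).
    Returns {node_id: {"visited": bool, "state": str or None, "duration_ms": int}}.
    """
    severity = {"SUCCESS": 0, "WARNING": 1, "ERROR": 2}
    overlay = {n: {"visited": False, "state": None, "duration_ms": 0}
               for n in diagram_nodes}
    for act in activities:
        node = overlay.get(act["node_id"])
        if node is None:
            continue  # activity refers to a node that is not in this diagram version
        node["visited"] = True
        node["duration_ms"] += act["duration_ms"]
        # If a node ran more than once, keep the most severe state.
        if node["state"] is None or severity[act["state"]] > severity[node["state"]]:
            node["state"] = act["state"]
    return overlay
```

Nodes that stay `visited: False` are the branches the transaction did not take, which is the information a renderer needs to grey them out.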

Agent Management

| Feature | Why Expected | Complexity | Notes |
|---|---|---|---|
| Agent list with status | See all connected agents and their lifecycle state (LIVE/STALE/DEAD) | Low | Table with status indicator; auto-refresh |
| Agent heartbeat monitoring | Detect when an agent goes silent | Low | Timestamp of last heartbeat; threshold-based state transitions |
| Agent detail view | Instance name, version, connected routes, uptime, config | Low | Detail page per agent |
| Agent registration/deregistration | New agents register via bootstrap token; dead agents get cleaned up | Medium | Registration endpoint; TTL-based cleanup |
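
The threshold-based LIVE/STALE/DEAD transitions could look roughly like this. The two thresholds are placeholder values picked for the example, not figures from the project.

```python
from datetime import datetime, timedelta, timezone

# Assumed thresholds; the real values would be configuration.
STALE_AFTER = timedelta(seconds=90)
DEAD_AFTER = timedelta(minutes=5)


def agent_state(last_heartbeat, now):
    """Derive the lifecycle state from the time since the last heartbeat."""
    silence = now - last_heartbeat
    if silence >= DEAD_AFTER:
        return "DEAD"
    if silence >= STALE_AFTER:
        return "STALE"
    return "LIVE"
```

Deriving the state from the stored timestamp on read, rather than storing the state itself, avoids a background job just to flip flags; TTL-based cleanup would then only need to delete agents that have been DEAD for long enough.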

Authentication and Security

| Feature | Why Expected | Complexity | Notes |
|---|---|---|---|
| JWT-based API authentication | Secure the REST API; every enterprise monitoring tool requires auth | Medium | Token issuance, validation, refresh |
| Bootstrap token for agent registration | Agents need a way to initially register without pre-existing credentials | Low | Shared secret, single-use or time-limited |
| Ed25519 config signing | Agents must verify config came from the server, not tampered | Medium | Key management, signature generation/verification |
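
One way to read "single-use or time-limited" for the bootstrap token is sketched below, combining both properties. The `BootstrapTokens` class and its in-memory store are hypothetical; JWT handling and Ed25519 signing are deliberately left out because they need more than the standard library.

```python
import hashlib
import secrets


class BootstrapTokens:
    """In-memory store for single-use, time-limited registration tokens (sketch)."""

    def __init__(self, ttl_seconds):
        self.ttl_seconds = ttl_seconds
        # SHA-256 digest of the token -> expiry time (epoch seconds).
        # Only digests are kept, so a leaked store does not reveal usable tokens.
        self._expiry_by_digest = {}

    @staticmethod
    def _digest(token):
        return hashlib.sha256(token.encode("utf-8")).hexdigest()

    def issue(self, now):
        """Create a new random token that expires ttl_seconds after `now`."""
        token = secrets.token_urlsafe(32)
        self._expiry_by_digest[self._digest(token)] = now + self.ttl_seconds
        return token

    def redeem(self, token, now):
        """Return True exactly once for a known, unexpired token."""
        expiry = self._expiry_by_digest.pop(self._digest(token), None)
        return expiry is not None and now < expiry
```

Passing `now` in explicitly keeps the expiry logic testable without patching the clock.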

Dashboard and Overview

| Feature | Why Expected | Complexity | Notes |
|---|---|---|---|
| Transaction volume chart (time series) | "How many transactions are we processing?" -- first question on login | Medium | Bar or line chart, grouped by time bucket |
| Error rate chart | "Is something broken right now?" -- second question | Medium | Error count or percentage over time |
| Active agents count | Quick health check of the agent fleet | Low | Simple counter with status breakdown |
| Recent errors list | Quick access to the latest failures without searching | Low | Pre-filtered list, auto-refreshing |
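
Both charts reduce to the same time-bucket aggregation. The following is a minimal in-memory sketch of it; the `(epoch_seconds, state)` input shape and the `error_rate_series` name are assumptions, and in practice the grouping would run in the storage layer.

```python
from collections import Counter


def error_rate_series(transactions, bucket_seconds):
    """Group transactions into fixed time buckets.

    transactions: iterable of (epoch_seconds, state) pairs.
    Returns a list of (bucket_start, total, errors, error_rate) tuples,
    ordered by bucket_start.
    """
    totals, errors = Counter(), Counter()
    for ts, state in transactions:
        bucket = ts - ts % bucket_seconds  # floor to the start of the bucket
        totals[bucket] += 1
        if state == "ERROR":
            errors[bucket] += 1
    return [(b, totals[b], errors[b], errors[b] / totals[b]) for b in sorted(totals)]
```

Buckets with no traffic are absent from the result, so a chart that needs a continuous axis would have to fill those gaps with zeros.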

Differentiators

Features that set product apart from generic tracing tools. Not expected, but valued.

Diagram-Centric Experience

| Feature | Value Proposition | Complexity | Notes |
|---|---|---|---|
| Route diagram as primary navigation | Instead of trace waterfall, users navigate via the Camel route diagram -- this is how they think | High | Diagram becomes the entry point, not just a visualization |
| Execution heatmap on diagram | Color nodes by frequency/error rate over a time window -- shows hotspots | High | Aggregate stats per node; requires efficient querying |
| Side-by-side diagram comparison | Compare two diagram versions to see what changed in a route | Medium | Diff view highlighting added/removed/changed nodes |
| Diagram-based search | "Show me all failed transactions that passed through this node" | High | Click a node, get filtered transaction list |
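
The added/removed/changed classification behind a diagram comparison can be sketched as a set difference over node IDs. This assumes each diagram version is available as a mapping from node ID to some comparable node definition, which is an assumption about the graph model rather than something this document specifies.

```python
def diff_diagrams(old, new):
    """Classify nodes between two diagram versions.

    old, new: dicts mapping node_id -> node definition (any value that
    supports equality comparison).
    """
    added = sorted(new.keys() - old.keys())
    removed = sorted(old.keys() - new.keys())
    changed = sorted(n for n in old.keys() & new.keys() if old[n] != new[n])
    return {"added": added, "removed": removed, "changed": changed}
```

This only works if node IDs are stable across versions; if IDs are regenerated on every deployment, every node would show up as removed and re-added.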

Advanced Search and Analytics

| Feature | Value Proposition | Complexity | Notes |
|---|---|---|---|
| Statistical duration analysis | P50/P95/P99 duration for a route over time -- detect degradation trends | Medium | Requires ClickHouse aggregation queries |
| Transaction comparison | Side-by-side diff of two transactions through the same route | Medium | Useful for "why did this one fail but that one succeed?" |
| Search result aggregations | Faceted counts: N errors, N warnings, distribution by route, by instance | Medium | ClickHouse GROUP BY queries alongside search results |
| Correlation graph | Visual graph showing how transactions flow across instances | High | Network diagram; requires correlation data |
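
To make P50/P95/P99 concrete, here is a small sketch using the nearest-rank definition of a percentile. The choice of nearest-rank is an assumption for the example; a database aggregation may use an approximate or interpolating method and return slightly different numbers.

```python
import math


def percentile(sorted_values, p):
    """Nearest-rank percentile: the smallest sample with at least p% of samples <= it."""
    if not sorted_values:
        raise ValueError("no samples")
    # Integer arithmetic before the division avoids floating-point drift in the rank.
    rank = max(1, math.ceil(p * len(sorted_values) / 100))
    return sorted_values[rank - 1]


def duration_summary(durations_ms):
    """Return the P50, P95 and P99 of a collection of durations."""
    ordered = sorted(durations_ms)
    return {f"p{p}": percentile(ordered, p) for p in (50, 95, 99)}
```

Computing the summary per time bucket, rather than once over the whole window, is what turns these numbers into a degradation trend.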

Configuration Push

| Feature | Value Proposition | Complexity | Notes |
|---|---|---|---|
| Per-route tracing level control | Turn on detailed tracing for one problematic route without restarting the agent | Medium | SSE push of config change; agent applies dynamically |
| Bulk config push to agent groups | "Enable debug tracing on all production instances" | Medium | Agent tagging/grouping + batch SSE dispatch |
| Config history and rollback | See what config was active when, roll back a bad change | Medium | Versioned config storage with timestamps |
| Ad-hoc command dispatch | Send a "flush cache" or "reconnect" command to specific agents | Medium | Command/response pattern over SSE; command status tracking |
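
For orientation, this is roughly what a config change looks like on the wire when framed as a Server-Sent Event. The `id:`, `event:` and `data:` fields and the blank-line terminator come from the SSE format itself; the event name `config-update` and the payload keys are invented for the example.

```python
import json


def sse_event(event, payload, event_id=None):
    """Serialize one Server-Sent Events message with a JSON payload."""
    lines = []
    if event_id is not None:
        lines.append(f"id: {event_id}")
    lines.append(f"event: {event}")
    # Each line of the payload needs its own "data:" field.
    for line in json.dumps(payload, sort_keys=True).splitlines():
        lines.append(f"data: {line}")
    # A blank line terminates the message.
    return "\n".join(lines) + "\n\n"
```

A call such as `sse_event("config-update", {"route": "orders-in", "tracingLevel": "DEBUG"}, event_id=7)` produces a three-line message followed by the terminating blank line. The `id` field lets a reconnecting agent report the last event it saw, which matters for not missing a config change during a network blip.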

Operational Intelligence

| Feature | Value Proposition | Complexity | Notes |
|---|---|---|---|
| Alerting on error rate thresholds | Notify when error rate exceeds threshold for a route | High | Threshold evaluation, notification channels (email, webhook) |
| Anomaly detection on duration | Alert when P95 duration spikes compared to baseline | High | Statistical baseline computation; deviation detection |
| Scheduled data export | Export transaction data as CSV/JSON for compliance or reporting | Medium | Job scheduler; file generation; download endpoint |
| Retention policy management | Configure per-route or per-instance retention periods | Medium | TTL management in ClickHouse; UI for policy CRUD |

Anti-Features

Features to explicitly NOT build.

| Anti-Feature | Why Avoid | What to Do Instead |
|---|---|---|
| General APM metrics (CPU, memory, GC) | Out of scope; Cameleer is transaction-focused, not an APM tool. Adding metrics creates scope creep and competes with Prometheus/Grafana which do it better | Provide a link/integration point to external metrics tools if needed |
| Log aggregation/viewer | Transactions are not logs. Mixing them confuses the data model and competes with ELK/Loki | Store transaction payloads and attributes, not raw log lines |
| Custom dashboard builder | Enormous complexity for marginal value. Ops teams already have Grafana for custom dashboards | Provide good built-in dashboards; expose metrics via Prometheus endpoint for Grafana |
| Multi-tenancy | Adds auth complexity, data isolation, billing concerns. Single-tenant deployment is simpler and sufficient for the target audience | Deploy separate instances per environment/team |
| Mobile app | Ops teams use desktop browsers during incidents. Mobile adds huge UI complexity | Responsive web UI that works on tablets if needed |
| Plugin/extension system | Premature abstraction; adds API stability burden before the core is stable | Build features directly; consider plugins much later if demand emerges |
| Real-time streaming transaction view | "Firehose" views of all transactions in real-time look impressive but are useless at scale (millions/day). Users cannot process the stream | Provide auto-refreshing search results and recent errors list |
| AI/ML-powered root cause analysis | Hype-driven feature with poor reliability. Requires massive training data and domain-specific models | Provide good search, filtering, and comparison tools so humans can find root causes efficiently |

Feature Dependencies

Agent Registration --> Agent List/Status
Agent Registration --> SSE Connection --> Config Push
Agent Registration --> SSE Connection --> Ad-hoc Commands

Transaction Ingestion --> Transaction Storage
Transaction Storage --> Transaction Search/Filtering
Transaction Search --> Transaction Detail View
Transaction Detail --> Activity Waterfall
Transaction Detail --> Payload Inspection
Transaction Detail --> Error Detail

Diagram Storage --> Diagram Rendering
Diagram Versioning --> Transaction-to-Diagram Linking
Diagram Rendering --> Execution Overlay (requires both diagram + activity data)
Diagram Rendering --> Execution Heatmap (requires aggregated activity data)
Diagram Rendering --> Diagram-based Search

Transaction Search --> Statistical Duration Analysis (aggregation of search results)
Transaction Search --> Search Result Aggregations

JWT Auth --> All REST API endpoints
Bootstrap Token --> Agent Registration
Ed25519 Signing --> Config Push

Transaction Volume Chart --> Transaction Storage (aggregation queries)
Error Rate Chart --> Transaction Storage (aggregation queries)
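
The arrows above form a directed graph, so a valid build order can be derived mechanically with a topological sort. The sketch below encodes only a subset of the arrows, and reading "activity data" as a dependency on Transaction Storage is an interpretation, not something the list states. `graphlib` is in the Python standard library from 3.9 onward.

```python
from graphlib import TopologicalSorter

# feature -> features it depends on (subset of the dependency list, for illustration)
DEPENDS_ON = {
    "Agent Registration": {"Bootstrap Token"},
    "Agent List/Status": {"Agent Registration"},
    "SSE Connection": {"Agent Registration"},
    "Config Push": {"SSE Connection", "Ed25519 Signing"},
    "Transaction Storage": {"Transaction Ingestion"},
    "Transaction Search": {"Transaction Storage"},
    "Transaction Detail": {"Transaction Search"},
    "Diagram Rendering": {"Diagram Storage"},
    # "activity data" is read here as Transaction Storage (an assumption).
    "Execution Overlay": {"Diagram Rendering", "Transaction Storage"},
}


def build_order(depends_on):
    """Return the features in an order where every dependency comes first."""
    return list(TopologicalSorter(depends_on).static_order())
```

`static_order()` raises `graphlib.CycleError` if the graph contains a cycle, which doubles as a sanity check that the dependency list is consistent.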

MVP Recommendation

Prioritize (Phase 1 -- Foundation):

  1. Transaction ingestion and storage -- nothing works without data flowing in
  2. Agent registration and lifecycle -- must know who is sending data
  3. Basic transaction search (time range, state, duration) -- core value proposition
  4. Transaction detail with activity breakdown -- users need to drill down

Prioritize (Phase 2 -- Core Experience):

  5. Full-text search -- the "find that one transaction" use case
  6. Route diagram rendering with version linking -- the Camel-specific differentiator
  7. JWT authentication -- required before any production deployment
  8. Dashboard overview (volume chart, error rate, agent status)

Prioritize (Phase 3 -- Differentiation):

  9. Execution overlay on diagrams -- the killer feature that generic tools cannot offer
  10. Config push via SSE -- operational value that justifies the agent-server architecture
  11. Cross-instance correlation -- required for complex multi-instance Camel deployments

Defer:

  • Alerting: defer until core search and dashboard are solid; alerting without good data is noise
  • Data export: useful but not blocking; add when compliance demands arise
  • Anomaly detection: requires baseline data that only accumulates over time
  • Diagram-based search: powerful but depends on both diagram rendering and search being mature
  • Execution heatmap: requires significant aggregation infrastructure

Sources

  • Domain knowledge from nJAMS Server (Integration Matters) feature set -- transaction monitoring for integration platforms, hierarchical transaction/activity model, route diagram visualization
  • Jaeger UI and Zipkin UI -- distributed tracing search, trace detail waterfall views, service dependency graphs
  • Dynatrace PurePath -- transaction-level drill-down, service flow visualization, statistical analysis
  • Apache Camel route model -- EIP-based visual representation, route definition structure
  • Project context from PROJECT.md and CLAUDE.md -- specific requirements, constraints, and architectural decisions

Confidence note: Feature categorization is based on training data knowledge of these products. Web search was unavailable to verify latest feature additions in 2025-2026 releases. The core feature landscape for this domain is mature and unlikely to have shifted dramatically, but specific UI patterns and newer differentiators may be missed. Confidence: MEDIUM.