# Infrastructure Overview — Admin Pages Design

**Date:** 2026-03-17
**Status:** Approved
**Scope:** Phase 1 implementation; full vision documented with Phase 2+ sections marked

## Overview

Add Database and OpenSearch admin pages to the Cameleer3 Server UI, allowing administrators to monitor subsystem health, inspect metrics, and perform basic maintenance actions. Restructure admin navigation from a single OIDC page to a sidebar sub-menu with dedicated pages per concern.

## Goals

- Give admins real-time visibility into PostgreSQL and OpenSearch health, performance, and storage
- Enable basic maintenance actions (kill queries, delete indices) without SSH/kubectl access
- Provide configurable thresholds for visual status indicators (green/yellow/red)
- Establish a database-backed audit log for all admin actions (SOC2 compliance foundation)
- Design for future expansion (VACUUM, reindex, OPERATOR role) without requiring restructuring

## Non-Goals (Phase 1)

- Database maintenance actions (VACUUM ANALYZE, Reindex)
- OpenSearch bulk operations (Force Reindex All, Flush)
- OPERATOR role with restricted permissions
- TimescaleDB-specific features (hypertable stats, continuous aggregate status)
- Alerting or notifications beyond visual indicators

---

## 1. Admin Navigation Restructuring

### Current State

Single gear icon at bottom of `AppSidebar` linking directly to `/admin/oidc`.

### New Structure

The gear icon expands/collapses an admin sub-menu in the sidebar:

```
── Apps ──────────────
   app-1
   app-2
── Admin (gear icon) ─
   Database    → /admin/database
   OpenSearch  → /admin/opensearch
   Audit Log   → /admin/audit
   OIDC        → /admin/oidc
   Users       → (future)
```

- Admin section visible only to users with `ADMIN` role
- Section collapsed by default; state persisted in localStorage
- Active sub-item highlighted
- `/admin` redirects to `/admin/database`
- Existing `OidcAdminPage` unchanged functionally, re-routed from being the sole admin page to a sub-page

---

## 2. Database Page (`/admin/database`)

### Header

- Connection status badge (green/red)
- PostgreSQL version (with TimescaleDB extension noted if present)
- Host and schema name
- Manual refresh button (refreshes all sections)

### Connection Pool Section

- Visual bar showing active connections vs. max pool size
- Metrics: active, idle, pending, max wait time
- Status badge based on configurable threshold (% of pool in use)
- Source: HikariCP pool MXBean
- **Auto-refreshes every 15 seconds**

### Table Sizes Section

- Table with columns: Table, Rows, Size, Index Size
- All application tables listed (executions, processor_executions, route_diagrams, agent_metrics, users, oidc_config, admin_thresholds)
- Summary row: total data size, total index size
- Source: `pg_stat_user_tables` + `pg_relation_size`
- **Manual refresh only** (expensive query)

### Active Queries Section

- Table with columns: PID, Duration, State, Query (truncated), Action
- Queries > warning threshold highlighted yellow, > critical threshold highlighted red
- Kill button per row → calls `pg_terminate_backend(pid)`
- Kill requires confirmation dialog
- After kill, query list refreshes automatically
- Source: `pg_stat_activity`
- **Auto-refreshes every 15 seconds**

### Maintenance Section (Phase 2 — Visible but Disabled)

- Buttons: Run VACUUM ANALYZE, Reindex Tables
- Greyed out with tooltip: "Available in a future release"

### Thresholds Section

- Collapsible, collapsed by default
- Configurable values:
  - Connection pool usage: warning % and critical %
  - Query duration: warning seconds and critical seconds
- Save button persists to database

---

## 3. OpenSearch Page (`/admin/opensearch`)

### Header

- Cluster health badge (green/yellow/red — maps directly to OpenSearch cluster health)
- OpenSearch version
- Node count
- Host URL
- Manual refresh button

### Indexing Pipeline Section

- Visual bar showing queue depth vs. max queue size
- Metrics: queue depth, failed document count, debounce interval, indexing rate (docs/s), time since last indexed
- Status badge based on configurable thresholds
- Source: `SearchIndexer` internal stats (exposed via `SearchIndexerStats` interface)
- **Auto-refreshes every 15 seconds**

### Indices Section

- **Search/filter** by index name pattern (text input)
- **Filter by health** — All / Green / Yellow / Red dropdown
- **Sortable columns** — Name, Docs, Size, Health, Shards (click column header)
- **Pagination** — 10 per page, server-side
- **Summary row** above table — total index count, total docs, total storage
- **Delete button** (trash icon) per row:
  - Confirmation dialog: "Delete index `{name}`? This cannot be undone."
  - User must type the index name to confirm
  - After deletion, table and summary refresh
- Table columns: Index, Docs, Size, Health, Shards (primary/replica)
- Source: OpenSearch `_cat/indices` API
- **Manual refresh only**

### Performance Section

- Metrics: query cache hit rate, request cache hit rate, average search latency, average indexing latency, JVM heap used (visual bar with used/max)
- Source: OpenSearch `_nodes/stats` API
- **Auto-refreshes every 15 seconds**

### Operations Section (Phase 2 — Visible but Disabled)

- Buttons: Force Reindex All, Flush Index, Delete Index (bulk via checkbox selection)
- Greyed out with tooltip: "Available in a future release"

### Thresholds Section

- Collapsible, collapsed by default
- Configurable values:
  - Cluster health: warning level, critical level
  - Queue depth: warning count, critical count
  - JVM heap usage: warning %, critical %
  - Failed docs: warning count, critical count
- Save button persists to database

---

## 4. Audit Log Page (`/admin/audit`)

### Purpose

Database-backed audit trail of all administrative actions across the system. Provides SOC2-compliant evidence of who did what, when, and from where. The audit log is append-only — entries cannot be modified or deleted through the UI or API.

### Header

- Total event count
- Date range selector (default: last 7 days)

### Audit Log Table

```
┌─ Audit Log ────────────────────────────────────────────────┐
│ Date range: [2026-03-10] to [2026-03-17]                   │
│ [User: All ▾] [Category: All ▾] [Search: ________]         │
│                                                            │
│ Timestamp            User   Category  Action        Target │
│ 2026-03-17 14:32:01  admin  INFRA     kill_query    PID 42 │
│ 2026-03-17 14:28:15  admin  INFRA     delete_idx    exec-… │
│ 2026-03-17 12:01:44  admin  CONFIG    update        oidc   │
│ 2026-03-17 09:15:22  jdoe   AUTH      login                │
│ 2026-03-16 18:45:00  admin  USER_MGMT update_roles  u:5    │
│ ...                                                        │
│                                                            │
│ ◀ 1 2 3 ... 12 ▶                       Showing 1-25 of 294 │
└────────────────────────────────────────────────────────────┘
```

- **Filterable** by user, category, date range
- **Searchable** by free text (matches action, target, detail)
- **Sortable** by timestamp (default: newest first)
- **Pagination** — 25 per page, server-side
- **Detail expansion** — click a row to expand and show full `detail` JSON
- **Read-only** — no edit or delete actions available (compliance requirement)
- **Export** (Phase 2) — CSV/JSON download for auditors

### Audit Categories

| Category | Actions Logged |
|----------|---------------|
| `INFRA` | kill_query, delete_index, update_thresholds |
| `AUTH` | login, login_oidc, logout, login_failed |
| `USER_MGMT` | create_user, update_roles, delete_user |
| `CONFIG` | update_oidc, delete_oidc, test_oidc |

### What Gets Logged

Every admin action across the system, not just infrastructure pages:

- **Infrastructure:** kill query, delete OpenSearch index, save thresholds
- **OIDC:** save config, delete config, test connection
- **User management:** update roles, delete user
- **Authentication:** login (success and failure), OIDC login, logout

### Audit Record Fields

| Field | Description |
|-------|-------------|
| `timestamp` | When the action occurred (server time, UTC) |
| `username` | Authenticated user who performed the action |
| `action` | Machine-readable action name (e.g., `kill_query`, `delete_index`) |
| `category` | Grouping: `INFRA`, `AUTH`, `USER_MGMT`, `CONFIG` |
| `target` | What was acted on (e.g., PID, index name, user ID) |
| `detail` | JSONB with action-specific context (e.g., query text for killed query, old/new roles for role change) |
| `result` | `SUCCESS` or `FAILURE` |
| `ip_address` | Client IP address from the request |

### Backend Implementation

- `AuditService` — central service injected into all admin controllers
- Single method: `log(action, category, target, detail, result)`
- Extracts username and IP from `SecurityContextHolder` and `HttpServletRequest`
- Writes to both the `audit_log` table AND SLF4J (belt and suspenders)
- Async write option not used — audit must be synchronous for compliance guarantees

---

## 5. Backend API

All endpoints under `/api/v1/admin/` — secured by existing Spring Security filter chain (`ROLE_ADMIN` required). Controllers additionally annotated with `@PreAuthorize("hasRole('ADMIN')")` for defense-in-depth.
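
As a rough illustration of how a frontend client might address these endpoints, the sketch below builds request URLs under the `/api/v1` prefix. The helper names (`adminUrl`, `killQueryUrl`, `deleteIndexUrl`) are hypothetical and not part of the design. The sketch also assumes that the paths listed in this section's endpoint tables (for example `/admin/database/status`) are relative to `/api/v1`.

```typescript
// Hypothetical URL helpers for the admin API; names are illustrative only.
// Assumption: table paths such as "/admin/database/status" are relative to "/api/v1".
const ADMIN_API_PREFIX = "/api/v1";

type QueryValue = string | number | boolean | undefined;

// Builds a full admin URL, skipping query parameters that are undefined
// so that server-side defaults apply.
function adminUrl(path: string, params: Record<string, QueryValue> = {}): string {
  if (!path.startsWith("/admin/")) {
    throw new Error(`Not an admin endpoint: ${path}`);
  }
  const query = Object.entries(params)
    .filter(([, value]) => value !== undefined)
    .map(([key, value]) => `${encodeURIComponent(key)}=${encodeURIComponent(String(value))}`)
    .join("&");
  return query ? `${ADMIN_API_PREFIX}${path}?${query}` : `${ADMIN_API_PREFIX}${path}`;
}

// POST target for terminating a query; rejects anything that is not a positive integer PID.
function killQueryUrl(pid: number): string {
  if (!Number.isInteger(pid) || pid <= 0) {
    throw new Error(`Invalid PID: ${pid}`);
  }
  return adminUrl(`/admin/database/queries/${pid}/kill`);
}

// DELETE target for a single index; the name is URL-encoded because it is user-visible input.
function deleteIndexUrl(name: string): string {
  return adminUrl(`/admin/opensearch/indices/${encodeURIComponent(name)}`);
}
```

Leaving out undefined parameters keeps the client from duplicating the server's default values, and encoding the index name guards against names with characters that are not URL-safe.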
### Database Endpoints

| Method | Path | Description |
|--------|------|-------------|
| `GET` | `/admin/database/status` | Version, host, schema, connection state |
| `GET` | `/admin/database/pool` | Active, idle, pending, max wait (HikariCP) |
| `GET` | `/admin/database/tables` | Table names, row counts, data sizes, index sizes |
| `GET` | `/admin/database/queries` | Active queries: pid, duration, state, SQL |
| `POST` | `/admin/database/queries/{pid}/kill` | Terminate query via `pg_terminate_backend` |

### OpenSearch Endpoints

| Method | Path | Description |
|--------|------|-------------|
| `GET` | `/admin/opensearch/status` | Version, host, cluster health, node count |
| `GET` | `/admin/opensearch/pipeline` | Queue depth, failed count, debounce, rate, last indexed |
| `GET` | `/admin/opensearch/indices` | Paginated, sortable, filterable index list |
| `DELETE` | `/admin/opensearch/indices/{name}` | Delete specific index (with audit log) |
| `GET` | `/admin/opensearch/performance` | Cache rates, latencies, JVM heap |

#### Indices Query Parameters

| Param | Type | Default | Description |
|-------|------|---------|-------------|
| `search` | string | — | Filter by index name pattern |
| `health` | enum | `ALL` | Filter by health: ALL, GREEN, YELLOW, RED |
| `sort` | string | `name` | Sort field: name, docs, size, health |
| `order` | enum | `asc` | Sort direction: asc, desc |
| `page` | int | `0` | Page number (zero-based) |
| `size` | int | `10` | Page size |

### Audit Log Endpoints

| Method | Path | Description |
|--------|------|-------------|
| `GET` | `/admin/audit` | Paginated, filterable audit log entries |

#### Audit Log Query Parameters

| Param | Type | Default | Description |
|-------|------|---------|-------------|
| `username` | string | — | Filter by username |
| `category` | enum | — | Filter by category: INFRA, AUTH, USER_MGMT, CONFIG |
| `search` | string | — | Free text search across action, target, detail |
| `from` | ISO date | 7 days ago | Start of date range |
| `to` | ISO date | now | End of date range |
| `sort` | string | `timestamp` | Sort field |
| `order` | enum | `desc` | Sort direction: asc, desc |
| `page` | int | `0` | Page number (zero-based) |
| `size` | int | `25` | Page size |

### Thresholds Endpoints

| Method | Path | Description |
|--------|------|-------------|
| `GET` | `/admin/thresholds` | All configured thresholds |
| `PUT` | `/admin/thresholds` | Save thresholds (database + OpenSearch in one payload) |

### Thresholds Payload

```json
{
  "database": {
    "connectionPoolWarning": 80,
    "connectionPoolCritical": 95,
    "queryDurationWarning": 1.0,
    "queryDurationCritical": 10.0
  },
  "opensearch": {
    "clusterHealthWarning": "YELLOW",
    "clusterHealthCritical": "RED",
    "queueDepthWarning": 100,
    "queueDepthCritical": 500,
    "jvmHeapWarning": 75,
    "jvmHeapCritical": 90,
    "failedDocsWarning": 1,
    "failedDocsCritical": 10
  }
}
```

---

## 6. Security

### Enforcement Layers

1. **Spring Security filter chain** — `/api/v1/admin/**` requires `ROLE_ADMIN` (existing configuration)
2. **Controller annotation** — `@PreAuthorize("hasRole('ADMIN')")` on each controller class (defense-in-depth)
3. **UI role check** — sidebar admin section hidden for non-admin users (cosmetic only, not a security boundary)

### Audit Logging

All admin actions are persisted to the `audit_log` database table (see Section 4 and Section 7 — Data Storage) AND logged via SLF4J at INFO level. The database record is the source of truth for compliance; the SLF4J log provides operational visibility.

The `AuditService` is injected into all admin controllers (infrastructure, OIDC, user management) and the authentication flow. See Section 4 (Audit Log Page) for full details on what is logged and the record structure.
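
To make the record structure from Section 4 concrete, here is a minimal TypeScript sketch of an audit record and of a builder that mirrors the `log(action, category, target, detail, result)` signature. It is an illustration only: the real `AuditService` is a backend service, and everything here beyond the field names and the category/action lists is an assumption, including the `RequestContext` type, the `buildAuditRecord` name, and the check that an action belongs to its category.

```typescript
type AuditCategory = "INFRA" | "AUTH" | "USER_MGMT" | "CONFIG";
type AuditResult = "SUCCESS" | "FAILURE";

// Actions per category, copied from the Audit Categories table in Section 4.
const ACTIONS_BY_CATEGORY: Record<AuditCategory, readonly string[]> = {
  INFRA: ["kill_query", "delete_index", "update_thresholds"],
  AUTH: ["login", "login_oidc", "logout", "login_failed"],
  USER_MGMT: ["create_user", "update_roles", "delete_user"],
  CONFIG: ["update_oidc", "delete_oidc", "test_oidc"],
};

// Field names follow the Audit Record Fields table; the JSON naming is an assumption.
interface AuditRecord {
  timestamp: string; // ISO-8601, UTC
  username: string;
  action: string;
  category: AuditCategory;
  target: string | null;
  detail: Record<string, unknown> | null;
  result: AuditResult;
  ip_address: string | null;
}

// Hypothetical stand-in for what the backend reads from the security context and the request.
interface RequestContext {
  username: string;
  ipAddress: string | null;
  now: Date;
}

function buildAuditRecord(
  ctx: RequestContext,
  action: string,
  category: AuditCategory,
  target: string | null,
  detail: Record<string, unknown> | null,
  result: AuditResult,
): AuditRecord {
  // Illustrative consistency check, not required by the design.
  if (!ACTIONS_BY_CATEGORY[category].includes(action)) {
    throw new Error(`Action ${action} is not listed under category ${category}`);
  }
  return {
    timestamp: ctx.now.toISOString(),
    username: ctx.username,
    action,
    category,
    target,
    detail,
    result,
    ip_address: ctx.ipAddress,
  };
}
```

Passing the user, IP address, and clock in through a context object keeps the builder deterministic, which makes the record shape easy to unit-test.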
### Future: OPERATOR Role (Phase 2+)

Design anticipates a read-only `OPERATOR` role:

- Can view all monitoring data
- Cannot perform destructive actions (kill, delete)
- Implementation: method-level `@PreAuthorize` on action endpoints, UI conditionally disables buttons based on role

---

## 7. Data Storage

### New Flyway Migration: V9

```sql
CREATE TABLE admin_thresholds (
    id          INTEGER PRIMARY KEY DEFAULT 1,
    config      JSONB NOT NULL DEFAULT '{}',
    updated_at  TIMESTAMPTZ NOT NULL DEFAULT now(),
    updated_by  TEXT NOT NULL,
    CONSTRAINT single_row CHECK (id = 1)
);

CREATE TABLE audit_log (
    id          BIGSERIAL PRIMARY KEY,
    timestamp   TIMESTAMPTZ NOT NULL DEFAULT now(),
    username    TEXT NOT NULL,
    action      TEXT NOT NULL,
    category    TEXT NOT NULL,
    target      TEXT,
    detail      JSONB,
    result      TEXT NOT NULL,
    ip_address  TEXT
);

CREATE INDEX idx_audit_log_timestamp ON audit_log (timestamp DESC);
CREATE INDEX idx_audit_log_username ON audit_log (username);
CREATE INDEX idx_audit_log_category ON audit_log (category);
```

**admin_thresholds:**

- Single-row table (same pattern as `oidc_config`)
- JSON column for flexibility — adding new thresholds doesn't require schema changes
- Tracks who last updated and when

**audit_log:**

- Append-only table — no UPDATE or DELETE exposed via API
- Indexed on timestamp (primary query axis), username, and category for filtered views
- JSONB `detail` column holds action-specific context without schema changes
- No foreign key to `users` table — username is denormalized so audit records survive user deletion

---

## 8. Frontend Architecture

### New Files

| File | Purpose |
|------|---------|
| `pages/admin/DatabaseAdminPage.tsx` | Database monitoring and management |
| `pages/admin/OpenSearchAdminPage.tsx` | OpenSearch monitoring and management |
| `pages/admin/AuditLogPage.tsx` | Audit log viewer |
| `api/queries/admin/database.ts` | React Query hooks for database endpoints |
| `api/queries/admin/opensearch.ts` | React Query hooks for OpenSearch endpoints |
| `api/queries/admin/thresholds.ts` | React Query hooks for threshold endpoints |
| `api/queries/admin/audit.ts` | React Query hooks for audit log endpoint |
| `components/admin/StatusBadge.tsx` | Color-coded status indicator (green/yellow/red) |
| `components/admin/RefreshableCard.tsx` | Card with manual refresh button + optional auto-refresh |
| `components/admin/ConfirmDeleteDialog.tsx` | Confirmation dialog requiring name input for destructive actions |

### Modified Files

| File | Change |
|------|--------|
| `components/layout/AppSidebar.tsx` | Refactor admin section to collapsible sub-menu with multiple items |
| `router.tsx` | Add routes for `/admin/database`, `/admin/opensearch`, `/admin/audit`, redirect `/admin` |
| `SpaForwardController.java` | Ensure `/admin/*` forwarding covers new routes |

### Auto-Refresh Strategy

- React Query `refetchInterval: 15000` on lightweight endpoints (pool, queries, pipeline, performance)
- Heavy endpoints (tables, indices) use `refetchInterval: false` — manual refresh only
- Refresh button calls `queryClient.invalidateQueries` for all queries on that page

---

## 9. Implementation Phases

### Phase 1 (Current Scope)

1. Admin sidebar restructuring
2. Database page — all monitoring sections + kill query
3. OpenSearch page — all monitoring sections + delete index
4. Threshold configuration (both pages)
5. Audit log — database-backed audit trail + admin viewer page
6. Retrofit audit logging into existing admin controllers (OIDC, user management) and auth flow
7. Backend endpoints with RBAC enforcement
8. Flyway migration V9 for thresholds + audit_log tables

### Phase 2

- Database maintenance actions (VACUUM ANALYZE, Reindex)
- OpenSearch operations (Force Reindex All, Flush)
- Bulk index operations (checkbox selection)
- Audit log CSV/JSON export for auditors
- OPERATOR role with view-only permissions

### Phase 3

- TimescaleDB-aware metrics (hypertable chunks, continuous aggregate status, compression)
- Historical trend charts for key metrics
- Alerting/notification system
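
---

As a closing illustration of the threshold configuration listed under Phase 1, the sketch below shows one way the database thresholds from the payload in Section 5 could be turned into a green/yellow/red status for `StatusBadge`. The function names are hypothetical. The strict "greater than" comparison follows the wording of the Active Queries section; applying the same rule to connection pool usage is an assumption, since the design does not state how boundary values are treated.

```typescript
type Status = "green" | "yellow" | "red";

// Mirrors the "database" block of the thresholds payload in Section 5.
interface DatabaseThresholds {
  connectionPoolWarning: number;  // percent of pool in use
  connectionPoolCritical: number; // percent of pool in use
  queryDurationWarning: number;   // seconds
  queryDurationCritical: number;  // seconds
}

// A value exactly equal to a threshold does not trigger it (strict comparison).
function statusFor(value: number, warning: number, critical: number): Status {
  if (value > critical) return "red";
  if (value > warning) return "yellow";
  return "green";
}

// Pool usage as a percentage of the maximum pool size.
function poolStatus(active: number, max: number, t: DatabaseThresholds): Status {
  const usedPercent = max > 0 ? (active * 100) / max : 0;
  return statusFor(usedPercent, t.connectionPoolWarning, t.connectionPoolCritical);
}

// Row highlight for an active query, based on its running time in seconds.
function queryStatus(durationSeconds: number, t: DatabaseThresholds): Status {
  return statusFor(durationSeconds, t.queryDurationWarning, t.queryDurationCritical);
}

// Example values taken from the thresholds payload.
const exampleThresholds: DatabaseThresholds = {
  connectionPoolWarning: 80,
  connectionPoolCritical: 95,
  queryDurationWarning: 1.0,
  queryDurationCritical: 10.0,
};

console.log(poolStatus(17, 20, exampleThresholds)); // 85% of the pool in use
console.log(queryStatus(12, exampleThresholds));    // 12-second query
```

Keeping the comparison in a single `statusFor` function means the OpenSearch numeric thresholds (queue depth, JVM heap, failed docs) could reuse it unchanged. The cluster health thresholds are enum values rather than numbers and would need separate handling.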