From a634bf9f9da8f0b714b267743829be54abadfd3b Mon Sep 17 00:00:00 2001
From: hsiegeln <37154749+hsiegeln@users.noreply.github.com>
Date: Tue, 17 Mar 2026 15:01:53 +0100
Subject: [PATCH] docs: address spec review feedback for infrastructure overview

- Document SearchIndexerStats interface and required SearchIndexer changes
- Add @EnableMethodSecurity prerequisite and retrofit of existing controllers
- Limit audit log free-text search to indexed text columns (not JSONB)
- Split migrations into V9 (thresholds) and V10 (audit_log)
- Add user_agent field to audit records for SOC2 forensics
- Add thresholds validation rules, pagination limits, error response shapes
- Clarify SPA forwarding, single-row pattern, OpenSearch client reuse
- Add audit log retention note for Phase 2

Co-Authored-By: Claude Opus 4.6 (1M context)
---
 ...26-03-17-infrastructure-overview-design.md | 98 +++++++++++++++----
 1 file changed, 77 insertions(+), 21 deletions(-)

diff --git a/docs/superpowers/specs/2026-03-17-infrastructure-overview-design.md b/docs/superpowers/specs/2026-03-17-infrastructure-overview-design.md
index f6045a44..0a13821d 100644
--- a/docs/superpowers/specs/2026-03-17-infrastructure-overview-design.md
+++ b/docs/superpowers/specs/2026-03-17-infrastructure-overview-design.md
@@ -45,7 +45,7 @@ The gear icon expands/collapses an admin sub-menu in the sidebar:
   OpenSearch → /admin/opensearch
   Audit Log → /admin/audit
   OIDC → /admin/oidc
-  Users → (future)
+  Users → /admin/users (backend exists, UI page is future scope)
 ```
 
 - Admin section visible only to users with `ADMIN` role
@@ -121,9 +121,17 @@ The gear icon expands/collapses an admin sub-menu in the sidebar:
 - Visual bar showing queue depth vs. max queue size
 - Metrics: queue depth, failed document count, debounce interval, indexing rate (docs/s), time since last indexed
 - Status badge based on configurable thresholds
-- Source: `SearchIndexer` internal stats (exposed via `SearchIndexerStats` interface)
+- Source: `SearchIndexer` internal stats, exposed via a new `SearchIndexerStats` interface in `cameleer3-server-core`
 - **Auto-refreshes every 15 seconds**
 
+**Implementation note:** `SearchIndexer` currently has no stats API. This requires adding:
+- `AtomicLong failedCount` — incremented on indexing errors
+- `AtomicLong indexedCount` — incremented on successful index operations
+- `volatile Instant lastIndexedAt` — updated after each successful batch
+- Rate calculation via a sliding window counter (e.g., count delta over last 15s interval)
+- `SearchIndexerStats` interface in core module with getters for all above, implemented by `SearchIndexer`
+- `queueDepth` and `maxQueueSize` already derivable from the internal `BlockingQueue`
+
 ### Indices Section
 
 - **Search/filter** by index name pattern (text input)
@@ -193,7 +201,7 @@ Database-backed audit trail of all administrative actions across the system. Pro
 ```
 
 - **Filterable** by user, category, date range
-- **Searchable** by free text (matches action, target, detail)
+- **Searchable** by free text (matches `action` and `target` columns only — JSONB `detail` excluded from text search for performance)
 - **Sortable** by timestamp (default: newest first)
 - **Pagination** — 25 per page, server-side
 - **Detail expansion** — click a row to expand and show full `detail` JSON
@@ -230,14 +238,17 @@ Every admin action across the system, not just infrastructure pages:
 | `detail` | JSONB with action-specific context (e.g., query text for killed query, old/new roles for role change) |
 | `result` | `SUCCESS` or `FAILURE` |
 | `ip_address` | Client IP address from the request |
+| `user_agent` | Browser/client identification from request header (SOC2 forensics) |
 
 ### Backend Implementation
 
-- `AuditService` — central service injected into all admin controllers
-- Single method: `log(action, category, target, detail, result)`
-- Extracts username and IP from `SecurityContextHolder` and `HttpServletRequest`
+- `AuditService` — central service in `cameleer3-server-core`, injected into all admin controllers via direct method calls (no AOP/interceptor — consistent with existing controller style)
+- Primary method: `log(action, category, target, detail, result)` — extracts username and IP from `SecurityContextHolder` and `HttpServletRequest`
+- Overloaded method for pre-auth contexts: `log(username, action, category, target, detail, result, request)` — used by auth controllers where `SecurityContext` is not yet populated (login success/failure)
+- Captures `user_agent` from `HttpServletRequest` header
 - Writes to both the `audit_log` table AND SLF4J (belt and suspenders)
 - Async write option not used — audit must be synchronous for compliance guarantees
+- Retrofit into existing controllers: add `auditService.log(...)` calls to `OidcConfigAdminController` (save, delete, test) and `UserAdminController` (update roles, delete user), and auth controllers (login, OIDC login, logout, failed login)
 
 ---
 
@@ -265,6 +276,8 @@ All endpoints under `/api/v1/admin/` — secured by existing Spring Security fil
 | `DELETE` | `/admin/opensearch/indices/{name}` | Delete specific index (with audit log) |
 | `GET` | `/admin/opensearch/performance` | Cache rates, latencies, JVM heap |
 
+**OpenSearch client:** All admin OpenSearch endpoints reuse the existing `OpenSearchClient` bean configured in `OpenSearchConfig.java`. No separate client or credentials needed — the admin endpoints call cluster-level APIs (`_cluster/health`, `_cat/indices`, `_nodes/stats`) using the same connection.
+
 #### Indices Query Parameters
 
 | Param | Type | Default | Description |
@@ -288,7 +301,7 @@ All endpoints under `/api/v1/admin/` — secured by existing Spring Security fil
 |-------|------|---------|-------------|
 | `username` | string | — | Filter by username |
 | `category` | enum | — | Filter by category: INFRA, AUTH, USER_MGMT, CONFIG |
-| `search` | string | — | Free text search across action, target, detail |
+| `search` | string | — | Free text search across `action` and `target` columns (not JSONB `detail`) |
 | `from` | ISO date | 7 days ago | Start of date range |
 | `to` | ISO date | now | End of date range |
 | `sort` | string | `timestamp` | Sort field |
@@ -326,6 +339,37 @@ All endpoints under `/api/v1/admin/` — secured by existing Spring Security fil
 }
 ```
 
+### Thresholds Validation Rules
+
+- Warning must be <= critical for all numeric threshold pairs
+- Percentage values must be 0–100
+- Duration values must be > 0
+- `clusterHealthWarning` must be less severe than `clusterHealthCritical` (GREEN < YELLOW < RED)
+- Backend returns 400 Bad Request with field-level error messages on validation failure
+
+### Pagination Limits
+
+- All paginated endpoints enforce a maximum page size of 100 (`min(requested, 100)`)
+- Applies to: indices listing, audit log
+
+### Error Responses
+
+All new endpoints return errors in a consistent shape:
+
+```json
+{
+  "status": 404,
+  "error": "Not Found",
+  "message": "No active query with PID 12345"
+}
+```
+
+Specific error cases:
+- `POST /admin/database/queries/{pid}/kill` — 404 if PID not found, 500 if `pg_terminate_backend` fails
+- `DELETE /admin/opensearch/indices/{name}` — 404 if index not found, 502 if OpenSearch unreachable
+- `GET /admin/database/status` — returns 200 with `"connected": false` if database is unreachable (not 503), so the frontend can render a red status badge rather than an error state
+- `GET /admin/opensearch/status` — returns 200 with `"clusterHealth": "UNREACHABLE"` if OpenSearch is down
+
 ---
 
 ## 6. Security
@@ -333,8 +377,9 @@ All endpoints under `/api/v1/admin/` — secured by existing Spring Security fil
 ### Enforcement Layers
 
 1. **Spring Security filter chain** — `/api/v1/admin/**` requires `ROLE_ADMIN` (existing configuration)
-2. **Controller annotation** — `@PreAuthorize("hasRole('ADMIN')")` on each controller class (defense-in-depth)
-3. **UI role check** — sidebar admin section hidden for non-admin users (cosmetic only, not a security boundary)
+2. **Controller annotation** — `@PreAuthorize("hasRole('ADMIN')")` on each controller class (defense-in-depth). This is a new convention — existing controllers (`OidcConfigAdminController`, `UserAdminController`) must be retrofitted with this annotation as part of Phase 1.
+3. **`@EnableMethodSecurity`** — must be added to `SecurityConfig.java` to activate `@PreAuthorize` processing (prerequisite for layer 2)
+4. **UI role check** — sidebar admin section hidden for non-admin users (cosmetic only, not a security boundary)
 
 ### Audit Logging
 
@@ -353,7 +398,7 @@ Design anticipates a read-only `OPERATOR` role:
 
 ## 7. Data Storage
 
-### New Flyway Migration: V9
+### Flyway Migration V9: Admin Thresholds
 
 ```sql
 CREATE TABLE admin_thresholds (
@@ -363,7 +408,15 @@ CREATE TABLE admin_thresholds (
     updated_by TEXT NOT NULL,
     CONSTRAINT single_row CHECK (id = 1)
 );
+```
 
+- Single-row table using `CHECK (id = 1)` constraint — stricter than the `oidc_config` pattern (which uses a text PK defaulting to `'default'` without a constraint). The CHECK approach is preferred going forward as it explicitly prevents multiple rows.
+- JSON column for flexibility — adding new thresholds doesn't require schema changes
+- Tracks who last updated and when
+
+### Flyway Migration V10: Audit Log
+
+```sql
 CREATE TABLE audit_log (
     id BIGSERIAL PRIMARY KEY,
     timestamp TIMESTAMPTZ NOT NULL DEFAULT now(),
@@ -373,24 +426,24 @@ CREATE TABLE audit_log (
     target TEXT,
     detail JSONB,
     result TEXT NOT NULL,
-    ip_address TEXT
+    ip_address TEXT,
+    user_agent TEXT
 );
 
 CREATE INDEX idx_audit_log_timestamp ON audit_log (timestamp DESC);
 CREATE INDEX idx_audit_log_username ON audit_log (username);
 CREATE INDEX idx_audit_log_category ON audit_log (category);
+CREATE INDEX idx_audit_log_action ON audit_log (action);
+CREATE INDEX idx_audit_log_target ON audit_log (target);
 ```
 
-**admin_thresholds:**
-- Single-row table (same pattern as `oidc_config`)
-- JSON column for flexibility — adding new thresholds doesn't require schema changes
-- Tracks who last updated and when
-
-**audit_log:**
+- Separate migration from thresholds so they can be developed and tested independently
 - Append-only table — no UPDATE or DELETE exposed via API
-- Indexed on timestamp (primary query axis), username, and category for filtered views
-- JSONB `detail` column holds action-specific context without schema changes
+- Indexed on timestamp (primary query axis), username, category, action, and target for filtered views and free-text search via `ILIKE` on indexed text columns
+- JSONB `detail` column holds action-specific context without schema changes (not searched via free text — use row expansion for detail inspection)
+- `user_agent` field captures client identification for forensic analysis (SOC2)
 - No foreign key to `users` table — username is denormalized so audit records survive user deletion
+- **Retention:** unbounded in Phase 1. Phase 2+ should add a retention/archival strategy (e.g., TimescaleDB hypertable with retention policy, or periodic archive to cold storage). Typical SOC2 retention is 7 years.
 
 ---
 
@@ -417,7 +470,7 @@ CREATE INDEX idx_audit_log_category ON audit_log (category);
 |------|--------|
 | `components/layout/AppSidebar.tsx` | Refactor admin section to collapsible sub-menu with multiple items |
 | `router.tsx` | Add routes for `/admin/database`, `/admin/opensearch`, `/admin/audit`, redirect `/admin` |
-| `SpaForwardController.java` | Ensure `/admin/*` forwarding covers new routes |
+| `SpaForwardController.java` | Existing `/admin/{path:[^\\.]*}` pattern already covers single-segment routes — no change needed unless deeper routes are added |
 
 ### Auto-Refresh Strategy
 
@@ -438,7 +491,9 @@ CREATE INDEX idx_audit_log_category ON audit_log (category);
 5. Audit log — database-backed audit trail + admin viewer page
 6. Retrofit audit logging into existing admin controllers (OIDC, user management) and auth flow
 7. Backend endpoints with RBAC enforcement
-8. Flyway migration V9 for thresholds + audit_log tables
+8. Flyway migrations V9 (thresholds) and V10 (audit_log)
+9. `SearchIndexerStats` interface and `SearchIndexer` stats instrumentation
+10. `@EnableMethodSecurity` + `@PreAuthorize` retrofit on existing admin controllers
 
 ### Phase 2
 
@@ -446,6 +501,7 @@ CREATE INDEX idx_audit_log_category ON audit_log (category);
 - OpenSearch operations (Force Reindex All, Flush)
 - Bulk index operations (checkbox selection)
 - Audit log CSV/JSON export for auditors
+- Audit log retention/archival strategy (7-year SOC2 requirement)
 - OPERATOR role with view-only permissions
 
 ### Phase 3