feat(alerting): Plan 01 — outbound HTTP infra + admin-managed outbound connections #139

Merged
hsiegeln merged 22 commits from feat/alerting-01-outbound-infra into main 2026-04-20 08:57:41 +02:00

Summary

Foundation for the alerting feature (spec: docs/superpowers/specs/2026-04-19-alerting-design.md). Adds a reusable outbound HTTP primitive + admin-managed outbound connections; no alerting behaviour yet (Plan 02 builds on this).

  • Cross-cutting outbound HTTP module (core/http/ + app/http/): OutboundHttpClientFactory interface with memoized Apache HttpClient 5 impl; SslContextBuilder supporting SYSTEM_DEFAULT / TRUST_ALL / TRUST_PATHS modes; startup validation (WARN on trust-all, fail-fast on missing CA paths).
  • Admin-managed outbound connections (core/outbound/ + app/outbound/): OutboundConnection domain record, SecretCipher (AES-GCM with JWT-derived key) for HMAC secrets at rest, Postgres repository with JSONB + env arrays, OutboundConnectionService with uniqueness + allowed-env + delete-if-referenced guards.
  • REST API: /api/v1/admin/outbound-connections (ADMIN CRUD + /test reachability probe + /{id}/usage). OPERATOR gets read-only access.
  • UI: admin page for outbound connections list + editor (TLS trust config, env restriction, HMAC).
  • V11 Flyway: outbound_connections table with HTTPS-only URL constraint.
  • Audit: new OUTBOUND_CONNECTION_CHANGE + OUTBOUND_HTTP_TRUST_CHANGE categories.
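
To illustrate the "AES-GCM with JWT-derived key" idea from the list above, here is a minimal sketch of an encrypt/decrypt round trip. It is not the `SecretCipher` class from this PR: the class name `SecretCipherSketch`, the SHA-256 key derivation, and the `IV || ciphertext` Base64 layout are all assumptions made for illustration.

```java
import javax.crypto.Cipher;
import javax.crypto.spec.GCMParameterSpec;
import javax.crypto.spec.SecretKeySpec;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import java.security.SecureRandom;
import java.util.Base64;

/** Hypothetical stand-in for SecretCipher; names and layout are assumptions. */
final class SecretCipherSketch {
    private static final int IV_BYTES = 12;   // 96-bit nonce, the usual GCM choice
    private static final int TAG_BITS = 128;  // authentication tag length

    private final SecretKeySpec key;
    private final SecureRandom random = new SecureRandom();

    SecretCipherSketch(String jwtSecret) {
        try {
            // ASSUMPTION: the AES-256 key is the SHA-256 digest of the JWT secret.
            byte[] digest = MessageDigest.getInstance("SHA-256")
                    .digest(jwtSecret.getBytes(StandardCharsets.UTF_8));
            this.key = new SecretKeySpec(digest, "AES");
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Returns Base64(IV || ciphertext+tag); a fresh random IV is used per call. */
    String encrypt(String plaintext) {
        try {
            byte[] iv = new byte[IV_BYTES];
            random.nextBytes(iv);
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(TAG_BITS, iv));
            byte[] ct = cipher.doFinal(plaintext.getBytes(StandardCharsets.UTF_8));
            byte[] out = ByteBuffer.allocate(iv.length + ct.length).put(iv).put(ct).array();
            return Base64.getEncoder().encodeToString(out);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Splits off the IV, then decrypts and verifies the GCM tag. */
    String decrypt(String encoded) {
        try {
            byte[] in = Base64.getDecoder().decode(encoded);
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.DECRYPT_MODE, key, new GCMParameterSpec(TAG_BITS, in, 0, IV_BYTES));
            byte[] pt = cipher.doFinal(in, IV_BYTES, in.length - IV_BYTES);
            return new String(pt, StandardCharsets.UTF_8);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

One consequence of deriving the key from the JWT secret, under these assumptions: rotating that secret would make previously stored HMAC secrets undecryptable unless they are re-encrypted.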

Plan 02 gate (important for reviewers)

OutboundConnectionServiceImpl.rulesReferencing(UUID) ships as a stub returning [] in this PR. Plan 02 wires it for real (queries alert_rules.webhooks). Until Plan 02 merges, the delete + narrow-envs guards are effectively no-ops: with no referencing rules ever reported, every delete and every env narrowing passes. Safe today because no alert rules exist yet; do NOT leave Plan 02 unmerged for long after this lands.
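
A small sketch of why the guard is inert while the lookup is stubbed. The class and method names here (`DeleteGuardSketch`, `canDelete`, `stubRulesReferencing`) are hypothetical and only model the shape of the check, not the real service code.

```java
import java.util.List;
import java.util.UUID;
import java.util.function.Function;

/** Hypothetical model of a delete-if-referenced guard. */
final class DeleteGuardSketch {

    /** A delete is allowed only when no rule references the connection. */
    static boolean canDelete(UUID connectionId, Function<UUID, List<UUID>> rulesReferencing) {
        return rulesReferencing.apply(connectionId).isEmpty();
    }

    /** Mirrors the stub described above: it always reports zero referencing rules. */
    static List<UUID> stubRulesReferencing(UUID connectionId) {
        return List.of();
    }
}
```

With `stubRulesReferencing` plugged in, `canDelete` returns true for every id, which is the no-op behaviour described above. A lookup that returns at least one rule id makes the same guard reject the delete.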

Test Plan

  • [x] Backend tests: 22 commits' worth of unit + IT, including SslContextBuilder modes, ApacheOutboundHttpClientFactory memoization, SecretCipher round-trip, Postgres repo with env-arrays + JSONB, service uniqueness + narrow-envs + delete guards, admin controller RBAC + audit (ADMIN mutations, OPERATOR read).
  • [x] UI: outbound connection list + editor render; TRUST_ALL warning banner; 409 on duplicate name; VIEWER can't see admin page.
  • [ ] Staging smoke: create connection pointing at https://httpbin.org/post, hit POST /{id}/test, verify 200 + latency + TLS summary. TRUST_ALL amber banner present in UI.
  • [ ] Verify startup WARN line when outbound-http.trust-all=true; fail-fast when trusted-ca-pem-paths contains a non-existent path.
  • [ ] Confirm no new test regressions beyond the pre-existing failures (see docs/alerting-02-verification.md on Plan 02 branch for the pre-existing failure roster).
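
The startup check in the fourth item can be pictured roughly as follows. This is a sketch under assumptions: the class name, logger name, and messages are invented here, and the real check lives in the Spring configuration class rather than a static helper.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.logging.Logger;

/** Hypothetical startup validation: WARN on trust-all, fail fast on missing CA paths. */
final class OutboundHttpStartupCheck {
    private static final Logger LOG = Logger.getLogger("outbound-http");

    /**
     * Throws if any configured CA PEM path is not a readable regular file.
     * Returns true when a trust-all warning was logged, false otherwise.
     */
    static boolean validate(boolean trustAll, List<String> trustedCaPemPaths) {
        for (String p : trustedCaPemPaths) {
            if (!Files.isRegularFile(Path.of(p))) {
                // Fail fast: a missing CA file should stop startup, not surface per request.
                throw new IllegalStateException("Trusted CA PEM path does not exist: " + p);
            }
        }
        if (trustAll) {
            LOG.warning("outbound-http.trust-all=true: TLS certificate validation is DISABLED");
            return true;
        }
        return false;
    }
}
```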

Documentation

  • docs/alerting-outbound-connections.md — admin guide (connection lifecycle, TLS modes, HMAC signing, test probe).
  • .claude/rules/app-classes.md + core-classes.md updated for new packages.

Deferred

  • rulesReferencing() wiring → Plan 02.
  • POST /{id}/test TLS summary is stubbed as "TLS" — Plan 02 or follow-up enriches with real protocol/cipher/peer-cert.
  • In-app CA bundle management UI — BL-001 / gitea#137 (deferred pending SaaS-layer CA reuse investigation).
  • OIDC retrofit to use OutboundHttpClientFactory — audit done, retrofit deferred to a separate commit.
claude added 22 commits 2026-04-20 08:39:16 +02:00
Defense-in-depth per code review. DTO layer already validates HTTPS at save
time; this DB-level check guards against future code paths that might bypass
the DTO validator. Mustache template variables in the URL (e.g., {{env.slug}})
remain valid since only the scheme prefix is constrained.
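
As an illustration of why templated URLs still pass, a scheme-prefix check looks only at the first characters of the string. The helper below is hypothetical; in particular, whether the actual constraint is case-sensitive is an assumption here.

```java
/** Hypothetical mirror of a scheme-prefix-only check. */
final class HttpsOnlyUrlCheck {

    /** True when the URL starts with the literal "https://"; nothing after the prefix is inspected. */
    static boolean isAllowed(String url) {
        // ASSUMPTION: case-sensitive match on the lowercase scheme.
        return url != null && url.startsWith("https://");
    }
}
```

Under this check, `https://{{env.slug}}.example.com/hook` is accepted because the template placeholder sits after the prefix, while `http://example.com/hook` is rejected.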
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds ApacheOutboundHttpClientFactory (Apache HttpClient 5) that memoizes
CloseableHttpClient instances keyed on effective TLS + timeout config, and
OutboundHttpConfig (@ConfigurationProperties) that validates trusted CA paths
at startup and exposes OutboundHttpClientFactory as a Spring bean.
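
The memoization idea can be sketched with a value-based key and `ConcurrentHashMap.computeIfAbsent`. To stay JDK-only, this sketch swaps in `java.net.http.HttpClient` for Apache's `CloseableHttpClient`; the `ClientKey` record and its fields are assumptions about what "effective TLS + timeout config" might include.

```java
import java.net.http.HttpClient;
import java.time.Duration;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical memoizing factory; uses the JDK HttpClient as a stand-in. */
final class MemoizingClientFactorySketch {
    enum TrustMode { SYSTEM_DEFAULT, TRUST_ALL, TRUST_PATHS }

    /** Cache key: records compare by value, so equal configs map to one client. */
    record ClientKey(TrustMode trustMode, List<String> caPaths, Duration connectTimeout) {
        ClientKey {
            caPaths = List.copyOf(caPaths); // defensive, immutable copy
        }
    }

    private final Map<ClientKey, HttpClient> cache = new ConcurrentHashMap<>();

    /** Builds a client the first time a config is seen and reuses it afterwards. */
    HttpClient clientFor(ClientKey key) {
        return cache.computeIfAbsent(key, k ->
                HttpClient.newBuilder().connectTimeout(k.connectTimeout()).build());
    }
}
```

Two requests with equal TLS and timeout settings get the same client instance (and therefore share its connection pool); any difference in the key yields a separate client.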

TRUST_ALL mode disables both cert validation (TrustAllManager in SslContextBuilder)
and hostname verification (NoopHostnameVerifier on SSLConnectionSocketFactoryBuilder).
WireMock HTTPS integration test covers trust-all bypass, system-default PKIX rejection,
and client memoization.
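
For readers unfamiliar with what TRUST_ALL disables, here is a JDK-only sketch of the two pieces: a trust manager that accepts any certificate chain and a hostname verifier that accepts any host. The wrapper class `TrustAllSketch` is hypothetical; the PR's own `TrustAllManager` and the Apache `NoopHostnameVerifier` may differ in detail.

```java
import javax.net.ssl.HostnameVerifier;
import javax.net.ssl.SSLContext;
import javax.net.ssl.TrustManager;
import javax.net.ssl.X509TrustManager;
import java.security.GeneralSecurityException;
import java.security.cert.X509Certificate;

/** Hypothetical illustration of trust-all TLS; do not use outside test or lab setups. */
final class TrustAllSketch {

    /** Accepts every certificate chain without inspecting it. */
    static final class TrustAllManager implements X509TrustManager {
        @Override public void checkClientTrusted(X509Certificate[] chain, String authType) { }
        @Override public void checkServerTrusted(X509Certificate[] chain, String authType) { }
        @Override public X509Certificate[] getAcceptedIssuers() { return new X509Certificate[0]; }
    }

    /** Accepts every hostname, so a cert issued for another host still passes. */
    static final HostnameVerifier NOOP_VERIFIER = (hostname, session) -> true;

    /** Builds an SSLContext whose only trust manager is the accept-everything one. */
    static SSLContext trustAllContext() {
        try {
            SSLContext ctx = SSLContext.getInstance("TLS");
            ctx.init(null, new TrustManager[] { new TrustAllManager() }, null);
            return ctx;
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Both halves matter: a trust-all context alone still fails against a self-signed cert with a mismatched hostname if the client keeps its default hostname verification.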

OIDC audit: OidcProviderHelper and OidcTokenExchanger use Nimbus SDK's own HTTP layer
(DefaultResourceRetriever for JWKS, HTTPRequest.send() for token exchange) plus the
bespoke InsecureTlsHelper for TLS skip-verify; neither uses OutboundHttpClientFactory.
Retrofit deferred to a separate follow-up per plan §20.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
V11 migration referenced users(id) as uuid, but V1 users table has
user_id as TEXT primary key. Amending V11 and the OutboundConnection
record before Task 7's integration tests catch this at Flyway startup.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- PostgresOutboundConnectionRepository: JdbcTemplate impl of
  OutboundConnectionRepository; UUID arrays via ConnectionCallback,
  JSONB for headers/auth/ca-paths, enum casts for method/trust/auth-kind
- OutboundBeanConfig: wires the repo + SecretCipher beans
- PostgresOutboundConnectionRepositoryIT: 5 Testcontainers tests
  (save+read, unique-name, allowed-env-ids round-trip, tenant isolation,
  delete); validates V11 Flyway migration end-to-end
- application-test.yml: add jwtsecret default so SecretCipher bean
  starts up in the Spring test context

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
rulesReferencing() is stubbed; wired to AlertRuleRepository in Plan 02.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
New audit categories: OUTBOUND_CONNECTION_CHANGE, OUTBOUND_HTTP_TRUST_CHANGE.
Controller-level @PreAuthorize defaults to ADMIN; GETs relaxed to ADMIN|OPERATOR.
SecurityConfig permits OPERATOR GETs on /api/v1/admin/outbound-connections/**.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
POST /{id}/test issues a synthetic probe against the connection URL.
TLS protocol/cipher/peer-cert details stubbed for now (Plan 02 follow-up).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Types are hand-authored (matching codebase admin-query convention);
schema.d.ts regeneration deferred until backend dev server is available.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds OutboundConnectionsPage (list view with delete), lazy route at
/admin/outbound-connections, and Outbound Connections nav node in the
admin sidebar tree. No test file created — UI codebase has no existing
test infrastructure to build on.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Shared Spring test context meant seeded test-admin/test-operator/test-viewer/test-alice
users persisted across IT classes, breaking FlywayMigrationIT's "users is empty" assertion.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
fix(outbound): null-guard TRUST_PATHS check; add RBAC test for probe endpoint
All checks were successful
CI / cleanup-branch (push) Has been skipped
CI / build (push) Successful in 3m5s
CI / build (pull_request) Successful in 2m13s
CI / cleanup-branch (pull_request) Has been skipped
CI / docker (pull_request) Has been skipped
CI / docker (push) Successful in 4m48s
CI / deploy (pull_request) Has been skipped
CI / deploy-feature (pull_request) Has been skipped
CI / deploy (push) Has been skipped
CI / deploy-feature (push) Successful in 32s
cacedd3f16
- OutboundConnectionRequest compact ctor: avoid NPE if tlsTrustMode is null
  (defense-in-depth alongside @NotNull Bean Validation).
- Add operatorCannotTest IT case to lock the ADMIN-only contract on
  POST /{id}/test — was previously untested.
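
A sketch of the kind of null-safe compact constructor the first item describes. The record shape, the field names, and the rule that TRUST_PATHS needs at least one CA path are assumptions for illustration; only the idea (do not dereference a possibly-null `tlsTrustMode` in the constructor) comes from the commit message.

```java
import java.util.List;

/** Hypothetical request record with a null-safe cross-field check. */
final class CompactCtorGuardSketch {
    enum TrustMode { SYSTEM_DEFAULT, TRUST_ALL, TRUST_PATHS }

    record ConnectionRequest(String name, TrustMode tlsTrustMode, List<String> caPemPaths) {
        ConnectionRequest {
            caPemPaths = caPemPaths == null ? List.of() : List.copyOf(caPemPaths);
            // '==' against the constant never dereferences tlsTrustMode, so a null
            // mode passes through here and is left for Bean Validation to reject.
            // ASSUMPTION: TRUST_PATHS requires at least one CA path.
            if (tlsTrustMode == TrustMode.TRUST_PATHS && caPemPaths.isEmpty()) {
                throw new IllegalArgumentException("TRUST_PATHS requires at least one CA path");
            }
        }
    }
}
```

By contrast, a check written as `tlsTrustMode.equals(...)` or a `switch` on the enum would throw a NullPointerException during deserialization, before the validator gets a chance to produce a proper error response.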

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
hsiegeln merged commit b763155a60 into main 2026-04-20 08:57:41 +02:00