Covers restricting DB/ClickHouse admin endpoints in SaaS-managed server instances via @ConditionalOnProperty flag, and building a vendor-facing infrastructure dashboard in the SaaS platform with per-tenant PostgreSQL and ClickHouse visibility. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
# Infrastructure Endpoint Visibility

**Date:** 2026-04-11
**Status:** Approved
**Scope:** cameleer3-server + cameleer-saas

---

## Problem

The server's admin section exposes PostgreSQL and ClickHouse diagnostic
endpoints (connection strings, pool stats, active queries, table sizes, server
versions). In standalone mode this is fine -- the admin user is the platform
owner. In SaaS mode, tenant admins receive `ADMIN` role via OIDC, which grants
them access to infrastructure internals they should not see.

Worse, the current endpoints have cross-tenant data leaks in shared-infra
deployments:

- `GET /admin/database/queries` returns `pg_stat_activity` which is
  database-wide, not schema-scoped. Tenant A can see and kill Tenant B's
  queries.
- `GET /admin/clickhouse/tables`, `/performance`, `/queries` query
  `system.tables`, `system.parts`, and `system.processes` globally -- no
  `tenant_id` filtering.

## Solution

Two complementary changes:

1. **Server**: an explicit flag disables infrastructure endpoints entirely when
   the server is provisioned by the SaaS platform.
2. **SaaS**: the vendor console gains its own infrastructure dashboard that
   queries shared PostgreSQL and ClickHouse directly, with per-tenant breakdown.

---

## Part 1: Server -- Disable Infrastructure Endpoints

### New Property

```yaml
cameleer:
  server:
    security:
      infrastructureendpoints: ${CAMELEER_SERVER_SECURITY_INFRASTRUCTUREENDPOINTS:true}
```

Default `true` (standalone mode -- endpoints available as today). The SaaS
provisioner sets `false` on tenant server containers.

### Bean Removal via @ConditionalOnProperty

Add to both `DatabaseAdminController` and `ClickHouseAdminController`:

```java
@ConditionalOnProperty(
    name = "cameleer.server.security.infrastructureendpoints",
    havingValue = "true",
    matchIfMissing = true
)
```

When `false`:

- Controller beans are not registered by Spring
- Requests to `/api/v1/admin/database/**` and `/api/v1/admin/clickhouse/**`
  return **404 Not Found** (Spring's default for unmapped paths)
- Controllers do not appear in the OpenAPI spec
- No role, no interceptor, no filter -- the endpoints simply do not exist

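
The matching rule can be illustrated without Spring. The sketch below is a plain-Java approximation of how the condition above evaluates, not the server's code: the class and method names (`InfrastructureFlag`, `controllersRegistered`) are hypothetical, and it assumes Spring Boot's case-insensitive `havingValue` comparison plus the `matchIfMissing = true` fallback.

```java
import java.util.Map;

/** Hypothetical helper; approximates the @ConditionalOnProperty rule above. */
final class InfrastructureFlag {

    static final String PROPERTY = "cameleer.server.security.infrastructureendpoints";

    private InfrastructureFlag() {}

    /**
     * Returns true when the admin controllers would be registered:
     * the property is absent (matchIfMissing = true) or equals "true"
     * ignoring case (havingValue = "true").
     */
    public static boolean controllersRegistered(Map<String, String> properties) {
        String value = properties.get(PROPERTY);
        if (value == null) {
            return true; // matchIfMissing = true
        }
        return "true".equalsIgnoreCase(value);
    }

    public static void main(String[] args) {
        // Standalone server: property not set at all.
        System.out.println(controllersRegistered(Map.of()));
        // SaaS-provisioned tenant server: property forced to false.
        System.out.println(controllersRegistered(Map.of(PROPERTY, "false")));
    }
}
```

Under this rule any value other than `true` disables the controllers, so a mistyped value fails closed rather than open.
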
### Health Endpoint Flag

Add `infrastructureEndpoints` boolean to the health endpoint response
(`GET /api/v1/health`). The value reflects the property. This is a public
endpoint, and the flag itself is not sensitive -- it only tells the UI whether
to render the Database/ClickHouse admin tabs.

Implementation: a custom `HealthIndicator` bean or a `@RestControllerAdvice`
that enriches the health response. The exact mechanism is an implementation
detail.

### UI Changes

`buildAdminTreeNodes()` in `sidebar-utils.ts` currently returns a static list
including Database and ClickHouse nodes. Change it to accept a parameter
(or read from a store) and omit those nodes when `infrastructureEndpoints` is
`false`.

The flag is fetched once from the health endpoint at startup (the UI already
calls health for connectivity). Store it in the auth store or a dedicated
capabilities store.

Router: the `/admin/database` and `/admin/clickhouse` routes remain defined
but are unreachable via navigation. If a user navigates directly, the API
returns 404 and the page shows its existing error state.

### Files Changed (Server)

| File | Change |
|------|--------|
| `application.yml` | Add `cameleer.server.security.infrastructureendpoints: true` |
| `DatabaseAdminController.java` | Add `@ConditionalOnProperty` annotation |
| `ClickHouseAdminController.java` | Add `@ConditionalOnProperty` annotation |
| Health response | Add `infrastructureEndpoints` boolean |
| `ui/src/components/sidebar-utils.ts` | Filter admin tree nodes based on flag |
| `ui/src/components/LayoutShell.tsx` | Fetch and pass flag |

---

## Part 2: SaaS -- Vendor Infrastructure Dashboard

### Architecture

The SaaS platform sits on the same Docker network as PostgreSQL and ClickHouse.
It already has their connection URLs in `ProvisioningProperties` (`datasourceUrl`
for cameleer3 PostgreSQL, `clickhouseUrl` for ClickHouse). It already uses raw
JDBC (`DriverManager.getConnection()`) for tenant data cleanup in
`TenantDataCleanupService`. The infrastructure dashboard uses the same pattern.

The SaaS does NOT call the server's admin endpoints. It queries the shared
infrastructure directly. This means:

- No dependency on server endpoint availability
- Cross-tenant aggregation is natural (the SaaS knows all tenants)
- Per-tenant filtering is explicit (`WHERE tenant_id = ?` for ClickHouse,
  schema-scoped queries for PostgreSQL)
- No new Logto scopes or roles needed

### Backend

**`InfrastructureService.java`** -- new service class. Raw JDBC connections
from `ProvisioningProperties.datasourceUrl()` and `.clickhouseUrl()`. Methods:

PostgreSQL:

- `getPostgresOverview()` -- server version (`SELECT version()`), total DB
  size (`pg_database_size`), active connection count (`pg_stat_activity`)
- `getPostgresTenantStats()` -- per-tenant schema sizes, table counts, row
  counts. Query `information_schema.tables` joined with `pg_stat_user_tables`
  grouped by `table_schema` where the schema name starts with `tenant_` (in a
  `LIKE` pattern the underscore is a single-character wildcard, so escape it:
  `tenant\_%`)
- `getPostgresTenantDetail(slug)` -- single tenant: table-level breakdown
  (name, rows, data size, index size) from `pg_stat_user_tables` filtered to
  `tenant_{slug}` schema

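
A minimal sketch of what `getPostgresTenantDetail(slug)` could look like with raw JDBC follows. The class name, the `TableStat` record, the slug validation rule, and the use of `pg_table_size` / `pg_indexes_size` for the size columns are assumptions made for illustration; only the `tenant_{slug}` schema convention and `pg_stat_user_tables` come from this spec.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

final class PostgresTenantDetailSketch {

    /** One row of the per-table breakdown (hypothetical shape). */
    record TableStat(String table, long rows, long dataBytes, long indexBytes) {}

    // n_live_tup is the statistics collector's live-row estimate, not an exact count.
    static final String DETAIL_SQL = """
            SELECT relname,
                   n_live_tup,
                   pg_table_size(relid)   AS data_bytes,
                   pg_indexes_size(relid) AS index_bytes
            FROM pg_stat_user_tables
            WHERE schemaname = ?
            ORDER BY relname
            """;

    private PostgresTenantDetailSketch() {}

    /** Maps a tenant slug to its schema name; the allowed characters are an assumption. */
    public static String schemaFor(String slug) {
        if (slug == null || !slug.matches("[a-z0-9_-]+")) {
            throw new IllegalArgumentException("invalid tenant slug: " + slug);
        }
        return "tenant_" + slug;
    }

    /** Opens a short-lived connection and reads the table-level stats for one tenant. */
    public static List<TableStat> tenantDetail(String jdbcUrl, String slug) throws SQLException {
        List<TableStat> stats = new ArrayList<>();
        try (Connection conn = DriverManager.getConnection(jdbcUrl);
             PreparedStatement ps = conn.prepareStatement(DETAIL_SQL)) {
            ps.setString(1, schemaFor(slug)); // bound as a value, never concatenated into SQL
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    stats.add(new TableStat(rs.getString(1), rs.getLong(2),
                            rs.getLong(3), rs.getLong(4)));
                }
            }
        }
        return stats;
    }

    public static void main(String[] args) {
        // Only the schema mapping is exercised here; tenantDetail needs a live database.
        System.out.println(schemaFor("acme"));
    }
}
```

Binding the schema name as a parameter keeps the slug out of the SQL text, which matters because the slug arrives as a path variable on the vendor endpoint.
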
ClickHouse:

- `getClickHouseOverview()` -- server version, uptime (the `uptime()`
  function or the `Uptime` entry in `system.asynchronous_metrics`),
  total disk size, total rows, compression ratio (`system.parts` aggregated)
- `getClickHouseTenantStats()` -- per-tenant row counts and disk usage. Query
  actual data tables (executions, logs, etc.) with
  `SELECT tenant_id, count(), sum(bytes) ... GROUP BY tenant_id`
- `getClickHouseTenantDetail(slug)` -- single tenant: per-table breakdown
  (table name, row count, disk size) filtered by `WHERE tenant_id = ?`

Note: ClickHouse `system.parts` does not have a `tenant_id` column (it is a
system table). Per-tenant ClickHouse stats require querying the actual data
tables. For the overview, `system.parts` provides aggregate stats across all
tenants.

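
Because per-tenant figures come from several data tables, the results of the individual `GROUP BY tenant_id` queries have to be merged per tenant. The sketch below shows only that merge step in plain Java, with the JDBC calls left out; `TenantTableRows` and `totalRowsByTenant` are hypothetical names, and the table names in the usage example are just the ones this spec mentions as examples.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class ClickHouseTenantStatsSketch {

    /** Row count for one tenant in one data table (hypothetical shape). */
    record TenantTableRows(String tenantId, String table, long rows) {}

    private ClickHouseTenantStatsSketch() {}

    /** Sums row counts across all data tables, keyed by tenant id in sorted order. */
    public static Map<String, Long> totalRowsByTenant(List<TenantTableRows> perTable) {
        Map<String, Long> totals = new TreeMap<>();
        for (TenantTableRows r : perTable) {
            totals.merge(r.tenantId(), r.rows(), Long::sum);
        }
        return totals;
    }

    public static void main(String[] args) {
        List<TenantTableRows> perTable = List.of(
                new TenantTableRows("acme", "executions", 1_200L),
                new TenantTableRows("acme", "logs", 300L),
                new TenantTableRows("globex", "executions", 50L));
        System.out.println(totalRowsByTenant(perTable)); // {acme=1500, globex=50}
    }
}
```

The unmerged per-table rows are the same data the single-tenant detail view needs, so one query result can plausibly serve both the breakdown and the totals.
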
**`InfrastructureController.java`** -- new REST controller at
`/api/vendor/infrastructure`. All endpoints require `platform:admin` scope
via `@PreAuthorize("hasAuthority('SCOPE_platform:admin')")`.

| Method | Path | Returns |
|--------|------|---------|
| GET | `/` | Combined PG + CH overview |
| GET | `/postgres` | PG overview + per-tenant breakdown |
| GET | `/postgres/{slug}` | Single tenant PG detail |
| GET | `/clickhouse` | CH overview + per-tenant breakdown |
| GET | `/clickhouse/{slug}` | Single tenant CH detail |

### Frontend

New vendor sidebar entry: **Infrastructure** (icon: `Server` or `Database`
from lucide-react) at `/vendor/infrastructure`.

**Page layout:**

- Two section cards: PostgreSQL and ClickHouse
- Each shows aggregate KPIs at top (version, total size, connections/queries)
- Per-tenant table below: slug, schema size / row count, disk usage
- Click tenant row to expand or navigate to detail view
- Detail view: per-table breakdown for that tenant

The page follows the existing vendor console patterns (card layout, tables,
KPI strips) using `@cameleer/design-system` components.

### SaaS Provisioner Change

`DockerTenantProvisioner.createServerContainer()` adds one env var to the
list passed to tenant server containers:

```java
env.add("CAMELEER_SERVER_SECURITY_INFRASTRUCTUREENDPOINTS=false");
```

### Files Changed (SaaS)

| File | Change |
|------|--------|
| `DockerTenantProvisioner.java` | Add env var to tenant server containers |
| `InfrastructureService.java` | New -- raw JDBC queries for PG + CH stats |
| `InfrastructureController.java` | New -- vendor-facing REST endpoints |
| `Layout.tsx` | Add Infrastructure to vendor sidebar |
| `router.tsx` | Add `/vendor/infrastructure` route |
| `InfrastructurePage.tsx` | New -- overview page with PG/CH cards |

---

## What Does NOT Change

- No new Logto scopes, roles, or API resources
- No new Spring datasource beans (raw JDBC, same as TenantDataCleanupService)
- No changes to SecurityConfig in either repo
- No changes to existing tenant admin endpoints or RBAC
- No changes to ServerApiClient (SaaS queries infra directly, not via server)
- Standalone server deployments are unaffected (flag defaults to `true`)

---

## Data Flow Summary

**Standalone mode (no SaaS):**

1. Admin user logs into server UI
2. Admin sidebar shows Database and ClickHouse tabs
3. Tabs work as today -- full infrastructure visibility

**SaaS managed mode:**

1. SaaS provisions tenant server with `INFRASTRUCTUREENDPOINTS=false`
2. Tenant admin logs into server UI via OIDC
3. Admin sidebar shows Users & Roles, OIDC, Audit, Environments -- no
   Database or ClickHouse tabs
4. Direct navigation to `/admin/database` returns 404
5. Vendor opens SaaS console -> Infrastructure page
6. SaaS queries shared PG + CH directly with per-tenant filtering
7. Vendor sees aggregate stats + per-tenant breakdown

---

## Security Properties

- **Tenant isolation**: tenant admins cannot see any infrastructure data.
  The endpoints do not exist on their server instance.
- **Cross-tenant prevention**: the SaaS infrastructure dashboard queries
  with explicit tenant filtering. No tenant can see another tenant's data.
- **Blast radius**: the flag is set at provisioning time via env var. A tenant
  admin cannot change it. Only someone with access to the Docker container
  config (platform operator) can toggle it.
- **Defense in depth**: even if the flag were somehow bypassed, the server's
  DB/CH admin endpoints expose `pg_stat_activity` and `system.processes`
  globally. The SaaS approach of querying directly with tenant filtering is
  inherently safer.