docs: per-tenant PostgreSQL isolation design spec

Per-tenant PG users and schemas for DB-level data isolation.
Each tenant server gets its own credentials and currentSchema/ApplicationName
JDBC parameters, aligned with server team's commit 7a63135.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
hsiegeln
2026-04-15 00:08:35 +02:00
parent 91e93696ed
commit 17c6723f7e


@@ -0,0 +1,147 @@
# Per-Tenant PostgreSQL Isolation
**Date:** 2026-04-15
**Status:** Approved
## Context
The cameleer3-server team introduced `currentSchema` and `ApplicationName` JDBC parameters (commit `7a63135`) to scope admin diagnostic queries to a single tenant's connections. Previously, all tenant servers shared one PostgreSQL user and connected to the `cameleer3` database without schema isolation — a tenant's server could theoretically see SQL text from other tenants via `pg_stat_activity`.
This spec adds per-tenant PostgreSQL users and schemas so each tenant server can only access its own data at the database level.
## Architecture
### Current State
- All tenant servers connect as the shared admin PG user to the `cameleer3` database and use the `public` schema.
- No per-tenant schemas exist — the server's Flyway runs in `public`.
- `TenantDataCleanupService` already attempts `DROP SCHEMA tenant_<slug>` on delete (no-op today since schemas don't exist).
- Standalone mode sets `currentSchema=tenant_default` in the compose file and is unaffected by this change.
### Target State
- Each tenant gets a dedicated PG user (`tenant_<slug>`) and schema (`tenant_<slug>`).
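- The tenant user owns only its schema and is never granted access to other tenants' schemas. `REVOKE ALL ON SCHEMA public FROM "tenant_<slug>"` additionally removes any direct grants on `public` (privileges granted to the `PUBLIC` pseudo-role are not affected by a per-user revoke).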
- The server's Flyway runs inside `tenant_<slug>` via the `currentSchema` JDBC parameter.
- `ApplicationName=tenant_<slug>` scopes `pg_stat_activity` visibility per the server team's convention.
- On tenant delete, both schema and user are dropped.
## New Component: `TenantDatabaseService`
A focused service with two methods:
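```java
import org.springframework.stereotype.Service;

@Service
public class TenantDatabaseService {
    /** Creates the tenant's PG user and schema; safe to call again. */
    void createTenantDatabase(String slug, String password) { /* steps described below */ }

    /** Drops the tenant's schema, then its PG user. */
    void dropTenantDatabase(String slug) { /* steps described below */ }
}
```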
### `createTenantDatabase(slug, password)`
Connects to `cameleer3` using the admin PG credentials from `ProvisioningProperties`. Executes:
1. Validate slug against `^[a-z0-9-]+$` (reject unexpected characters).
2. `CREATE USER "tenant_<slug>" WITH PASSWORD '<password>'` (skip if user already exists — idempotent for re-provisioning).
3. `CREATE SCHEMA "tenant_<slug>" AUTHORIZATION "tenant_<slug>"` (skip if schema already exists).
4. `REVOKE ALL ON SCHEMA public FROM "tenant_<slug>"`.
All identifiers are double-quoted. The password is a 32-character random alphanumeric string generated by the same `SecureRandom` utility used for other credential generation.
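The steps above can be sketched as a small helper that validates the slug, generates the password, and builds the statements. This is a sketch under stated assumptions, not the actual implementation: the class `TenantSql` and its methods are hypothetical, and the real service may use a different password utility. PostgreSQL has no `CREATE USER IF NOT EXISTS`, so the sketch assumes the caller first looks the role up in `pg_roles` and passes the result as `roleExists`; `CREATE SCHEMA IF NOT EXISTS` covers the schema case.

```java
import java.security.SecureRandom;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

/** Hypothetical helper; names are illustrative, not taken from the codebase. */
final class TenantSql {
    private static final Pattern SLUG = Pattern.compile("^[a-z0-9-]+$");
    private static final String ALPHANUMERIC =
            "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789";
    private static final SecureRandom RANDOM = new SecureRandom();

    private TenantSql() {}

    /** Step 1: validates the slug and returns "tenant_<slug>". */
    static String roleName(String slug) {
        if (slug == null || !SLUG.matcher(slug).matches()) {
            throw new IllegalArgumentException("Invalid tenant slug: " + slug);
        }
        return "tenant_" + slug;
    }

    /** Generates a random alphanumeric password of the given length. */
    static String randomPassword(int length) {
        StringBuilder sb = new StringBuilder(length);
        for (int i = 0; i < length; i++) {
            sb.append(ALPHANUMERIC.charAt(RANDOM.nextInt(ALPHANUMERIC.length())));
        }
        return sb.toString();
    }

    /**
     * Steps 2-4: builds the statements in execution order.
     * roleExists is the result of a prior pg_roles lookup (assumed, done by the caller).
     */
    static List<String> createStatements(String slug, String password, boolean roleExists) {
        String name = roleName(slug);
        if (!password.matches("[A-Za-z0-9]+")) {
            // Only alphanumeric passwords are safe to inline into the SQL literal below.
            throw new IllegalArgumentException("Password must be alphanumeric");
        }
        List<String> sql = new ArrayList<>();
        if (!roleExists) {
            sql.add("CREATE USER \"" + name + "\" WITH PASSWORD '" + password + "'");
        }
        sql.add("CREATE SCHEMA IF NOT EXISTS \"" + name + "\" AUTHORIZATION \"" + name + "\"");
        sql.add("REVOKE ALL ON SCHEMA public FROM \"" + name + "\"");
        return sql;
    }
}
```

In this sketch an existing role keeps its current password, so a caller that re-provisions would have to keep using the password already stored for the tenant.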
### `dropTenantDatabase(slug)`
1. `DROP SCHEMA IF EXISTS "tenant_<slug>" CASCADE`
2. `DROP USER IF EXISTS "tenant_<slug>"`
Schema must be dropped first (with `CASCADE`) because PG won't drop a user that owns objects.
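The ordering constraint can be made explicit by building the two statements as an ordered list. This is a minimal sketch; the class `TenantDropSql` is hypothetical and repeats the slug check from the create path as an assumption about how the real service guards its input.

```java
import java.util.List;

/** Hypothetical helper; the name is illustrative, not taken from the codebase. */
final class TenantDropSql {
    private TenantDropSql() {}

    /** Returns the drop statements in the order they must run. */
    static List<String> dropStatements(String slug) {
        if (slug == null || !slug.matches("[a-z0-9-]+")) {
            throw new IllegalArgumentException("Invalid tenant slug: " + slug);
        }
        String name = "tenant_" + slug;
        return List.of(
                // Schema first: CASCADE removes the schema and everything inside it.
                "DROP SCHEMA IF EXISTS \"" + name + "\" CASCADE",
                // The user no longer owns the schema, so it can now be dropped.
                "DROP USER IF EXISTS \"" + name + "\"");
    }
}
```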
## Entity Change
**New Flyway migration:** `V014__add_tenant_db_password.sql`
```sql
ALTER TABLE tenants ADD COLUMN db_password VARCHAR(255);
```
Nullable — existing tenants won't have it. Code checks for null and falls back to shared credentials for backwards compatibility.
**TenantEntity:** new `dbPassword` field with JPA `@Column` mapping.
## Provisioning Flow Changes
### `VendorTenantService.provisionAsync()` — new steps before container creation
```
1. Generate 32-char random password
2. tenantDatabaseService.createTenantDatabase(slug, password)
3. entity.setDbPassword(password)
4. tenantRepository.save(entity)
5. tenantProvisioner.provision(request) ← request now includes dbPassword
6. ... rest unchanged (health check, license push, OIDC push)
```
If step 2 fails, provisioning aborts with a stored error — no orphaned containers.
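The abort-on-failure behavior can be illustrated with a stripped-down version of the flow. Everything in this sketch is a stand-in: the `Database` and `Containers` interfaces, the `Tenant` class, and the `provision` method are hypothetical and only model the ordering of steps 2, 3, and 5; persistence (step 4) and the later steps are omitted.

```java
/** Hypothetical sketch of the step ordering; all types here are stand-ins. */
final class ProvisionFlowSketch {
    interface Database { void createTenantDatabase(String slug, String password); }
    interface Containers { void provision(String slug, String dbPassword); }

    /** Minimal stand-in for the tenant entity. */
    static final class Tenant {
        final String slug;
        String dbPassword;
        String error;
        Tenant(String slug) { this.slug = slug; }
    }

    private ProvisionFlowSketch() {}

    /** Returns true if the container step was reached, false if provisioning aborted. */
    static boolean provision(Tenant tenant, String password, Database db, Containers containers) {
        try {
            db.createTenantDatabase(tenant.slug, password);   // step 2
        } catch (RuntimeException e) {
            tenant.error = e.getMessage();                    // store the error on the tenant
            return false;                                     // abort before any container exists
        }
        tenant.dbPassword = password;                         // step 3 (step 4, save, omitted)
        containers.provision(tenant.slug, tenant.dbPassword); // step 5
        return true;
    }
}
```

Because the database step runs before the container step, a failure leaves nothing to clean up on the container side.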
### `DockerTenantProvisioner` — JDBC URL construction
The `ProvisionRequest` record gains a `dbPassword` field.
**When `dbPassword` is present** (new tenants):
```
SPRING_DATASOURCE_URL=jdbc:postgresql://cameleer-postgres:5432/cameleer3?currentSchema=tenant_<slug>&ApplicationName=tenant_<slug>
SPRING_DATASOURCE_USERNAME=tenant_<slug>
SPRING_DATASOURCE_PASSWORD=<generated>
```
**When `dbPassword` is null** (pre-existing tenants, backwards compat):
```
SPRING_DATASOURCE_URL=<props.datasourceUrl()> (no currentSchema/ApplicationName)
SPRING_DATASOURCE_USERNAME=<props.datasourceUsername()>
SPRING_DATASOURCE_PASSWORD=<props.datasourcePassword()>
```
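Server restart/upgrade re-creates containers via `provisionAsync()`, which re-reads `dbPassword` from the entity. A tenant with a stored `dbPassword` therefore picks up its isolated credentials again automatically on restart, while a tenant whose `dbPassword` is null keeps the shared credentials.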
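The two cases can be sketched as one function that returns the container's datasource environment. This is an illustration under assumptions: the class `DatasourceEnv`, the `build` method, and its parameters are hypothetical, and the shared values stand in for whatever `props.datasourceUrl()`, `props.datasourceUsername()`, and `props.datasourcePassword()` return.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch; class, method, and parameter names are illustrative. */
final class DatasourceEnv {
    private DatasourceEnv() {}

    static Map<String, String> build(String slug, String dbPassword,
                                     String sharedUrl, String sharedUser, String sharedPassword) {
        Map<String, String> env = new LinkedHashMap<>();
        if (dbPassword != null) {
            // New tenants: dedicated user, schema, and application name.
            String name = "tenant_" + slug;
            env.put("SPRING_DATASOURCE_URL",
                    "jdbc:postgresql://cameleer-postgres:5432/cameleer3"
                            + "?currentSchema=" + name + "&ApplicationName=" + name);
            env.put("SPRING_DATASOURCE_USERNAME", name);
            env.put("SPRING_DATASOURCE_PASSWORD", dbPassword);
        } else {
            // Pre-existing tenants: shared credentials, URL passed through unchanged.
            env.put("SPRING_DATASOURCE_URL", sharedUrl);
            env.put("SPRING_DATASOURCE_USERNAME", sharedUser);
            env.put("SPRING_DATASOURCE_PASSWORD", sharedPassword);
        }
        return env;
    }
}
```

Keeping the null check in one place means the fallback path stays identical to the pre-change behavior.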
## Delete Flow Changes
### `VendorTenantService.delete()`
```
1. tenantProvisioner.remove(slug) ← existing
2. licenseService.revokeLicense(...) ← existing
3. logtoClient.deleteOrganization(...) ← existing
4. tenantDatabaseService.dropTenantDatabase(slug) ← replaces TenantDataCleanupService PG logic
5. dataCleanupService.cleanupClickHouse(slug) ← ClickHouse cleanup stays separate
6. entity.setStatus(DELETED) ← existing
```
`TenantDataCleanupService` loses its PostgreSQL cleanup responsibility, which moves to `TenantDatabaseService`, and keeps only the ClickHouse cleanup. Its remaining method is renamed to `cleanupClickHouse(slug)` for clarity.
## Backwards Compatibility
| Scenario | Behavior |
|----------|----------|
| **Standalone mode** | Unaffected. Server is in compose, not provisioned by SaaS. Defaults to `tenant_default`. |
| **Existing SaaS tenants** (dbPassword=null) | Shared credentials, no `currentSchema`. Same as before. |
| **Existing tenants after restart/upgrade** | Still use shared credentials until re-provisioned with new code. |
| **New tenants** | Isolated user+schema+JDBC URL. Full isolation. |
| **Delete of pre-existing tenant** | `DROP USER IF EXISTS` is a no-op (user doesn't exist). Schema drop unchanged. |
## InfrastructureService
No changes needed. The service already queries `information_schema.schemata WHERE schema_name LIKE 'tenant_%'`. Once per-tenant schemas are created, the PostgreSQL tenant table on the Infrastructure page will populate automatically.
## Files Changed
| File | Change |
|------|--------|
| `TenantDatabaseService.java` | **New** — create/drop PG user+schema |
| `TenantEntity.java` | Add `dbPassword` field |
| `V014__add_tenant_db_password.sql` | **New** — nullable column |
| `VendorTenantService.java` | Call `createTenantDatabase` in provision, `dropTenantDatabase` in delete |
| `DockerTenantProvisioner.java` | Construct per-tenant JDBC URL, username, password |
| `ProvisionRequest` record | Add `dbPassword` field |
| `TenantDataCleanupService.java` | Remove PG logic, keep ClickHouse only, rename method |