# Cameleer SaaS Platform — Product Requirements Document
**Status:** Draft — Awaiting Review
**Date:** 2026-03-29
**Author:** Hendrik Siegeln + Claude (brainstorming session)
**Gitea Project:** cameleer/cameleer-saas
**Gitea Epics:** #1–#13

---
## 1. Product Definition
**Cameleer SaaS** is a Camel application runtime platform with built-in observability. Customers deploy Apache Camel applications and get zero-configuration tracing, topology mapping, payload lineage, distributed correlation, live debugging, and exchange replay — powered by the cameleer agent (auto-injected) and cameleer-server (managed per tenant).
### Three Pillars
1. **Runtime** — Deploy and run Camel applications with automatic agent injection
2. **Observability** — Per-tenant cameleer-server (traces, topology, lineage, correlation, debugger, replay)
3. **Management** — Auth, billing, teams, provisioning, secrets, environments
### Two Deployment Modes
- **SaaS (managed)** — Fully managed by the Cameleer platform
- **Self-hosted / Air-gapped** — Customer-operated, license-enforced feature parity with SaaS tiers
### Relationship to Existing Components
| Component | Role | Changes Required |
|-----------|------|------------------|
| cameleer (agent) | Zero-code Camel instrumentation, auto-injected into customer JARs | MOAT features (lineage, correlation, debugger, replay) |
| cameleer-server | Per-tenant observability backend | Managed mode (trust SaaS JWT), license module, MOAT features |
| cameleer-saas (this repo) | SaaS management platform — control plane | New: everything in this document |
| design-system | Shared React component library | Used by both SaaS shell and server UI |
---
## 2. Tier Structure
### Tier Matrix
| Dimension | Low | Mid | High | Business |
|-----------|-----|-----|------|----------|
| **Infrastructure** | Shared cluster, shared PG/OS | Shared cluster, shared PG/OS | Dedicated cluster(s) | Dedicated cluster(s) |
| **Pricing** | Base fee + usage (data vol, CPU, RAM) | Base fee + usage (data vol, CPU, RAM) | Committed resources | Committed resources |
| **Environments** | 1 (prod) | 2 (dev, prod) | Unlimited | Unlimited |
| **Agents** | Limited | Higher limit | Unlimited | Unlimited |
| **Data Retention** | 7 days | 30 days | 90 days | Custom |
| **Topology Graph** | Yes | Yes | Yes | Yes |
| **Payload Lineage** | Limited (route-scope only, max 10 captures/min) | Full | Full | Full |
| **Cross-Service Correlation** | No | Yes | Yes | Yes |
| **Live Route Debugger** | No | No | Yes | Yes |
| **Exchange Replay** | No | No | Yes | Yes |
| **SSO / OIDC** | No | No | Yes | Yes |
| **Custom Roles** | No | No | Yes | Yes |
| **Team Management** | Basic | Basic | Full | Full |
| **Secrets** | Platform-native | Platform-native + 1 vault | Unlimited vaults | Unlimited vaults |
| **Support** | Docs | Email | Priority | Dedicated CSM |
| **SLA** | Best effort | 99.5% | 99.9% | 99.95%+ custom |
| **VPN (future)** | No | No | Yes | Yes |
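
The feature and retention rows of the matrix can be read as a lookup from tier to entitlements. The following is a minimal Java sketch of that reading, not the platform's actual model: the names `TierMatrix`, `Tier`, and `Feature` are hypothetical, and only the rows with unambiguous values are encoded.

```java
import java.util.EnumSet;
import java.util.OptionalInt;
import java.util.Set;

/** Hypothetical sketch of the feature and retention rows of the tier matrix. */
final class TierMatrix {
    enum Tier { LOW, MID, HIGH, BUSINESS }

    enum Feature { TOPOLOGY, LINEAGE_LIMITED, LINEAGE_FULL, CORRELATION, DEBUGGER, REPLAY, SSO_OIDC, CUSTOM_ROLES }

    /** Features a tier is entitled to, mirroring the matrix rows. */
    static Set<Feature> featuresFor(Tier tier) {
        return switch (tier) {
            case LOW -> EnumSet.of(Feature.TOPOLOGY, Feature.LINEAGE_LIMITED);
            case MID -> EnumSet.of(Feature.TOPOLOGY, Feature.LINEAGE_FULL, Feature.CORRELATION);
            case HIGH, BUSINESS -> EnumSet.of(Feature.TOPOLOGY, Feature.LINEAGE_FULL, Feature.CORRELATION,
                    Feature.DEBUGGER, Feature.REPLAY, Feature.SSO_OIDC, Feature.CUSTOM_ROLES);
        };
    }

    /** Data retention in days; empty for Business, where retention is custom. */
    static OptionalInt retentionDays(Tier tier) {
        return switch (tier) {
            case LOW -> OptionalInt.of(7);
            case MID -> OptionalInt.of(30);
            case HIGH -> OptionalInt.of(90);
            case BUSINESS -> OptionalInt.empty();
        };
    }

    public static void main(String[] args) {
        for (Tier tier : Tier.values()) {
            System.out.println(tier + " -> " + featuresFor(tier) + ", retention " + retentionDays(tier));
        }
    }
}
```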
### Pricing Models
**Usage-based (Low/Mid):**
- Optional small monthly base fee
- Metered dimensions: data volume (GB ingested), CPU (core·hours), RAM (GB·hours)
- Stripe metered subscriptions with periodic usage reporting

**Committed resources (High/Business):**
- Fixed pricing based on reserved cluster capacity (CPU cores, RAM, storage, node count)
- Annual or multi-year contracts
- Overage alerts (upsell, not automatic billing)
---
## 3. System Architecture
### Approach: Modular Monolith Control Plane
Single Spring Boot application with well-bounded internal modules. K8s ingress handles tenant routing. Flux CD handles infrastructure reconciliation.
```
[Browser] → [Ingress (Traefik/Envoy)] → [SaaS Platform (modular Spring Boot)]
                        ↓ (tenant routes)                 ↓ (provisioning)
             [Tenant cameleer-server]              [Flux CD → K8s]
```
### Component Map
```
                  ┌─────────────────────────────────────────┐
                  │         Ingress (Traefik/Envoy)         │
                  │     TLS termination, tenant routing     │
                  └──────┬──────────────┬──────────────┬────┘
                         │              │              │
              ┌──────────▼──────────┐   │    ┌─────────▼─────────┐
              │  SaaS Management    │   │    │ Grafana/Prometheus│
              │  Platform           │   │    │ (self-monitoring) │
              │  (Spring Boot)      │   │    └───────────────────┘
              │                     │   │
              │  Modules:           │   │
              │  ├─ Auth            │   │
              │  ├─ Billing         │   │
              │  ├─ Provisioning    │   │
              │  ├─ Runtime         │   │
              │  ├─ License         │   │
              │  ├─ Secrets         │   │
              │  └─ Audit           │   │
              └──┬───┬──────┬───────┘   │
                 │   │      │           │
        ┌────────┘   │      └──────┐    │
        ▼            ▼             ▼    │
┌──────────────┐ ┌────────┐  ┌──────────▼───────────────┐
│ Platform DB  │ │ Stripe │  │  Shared K8s Cluster      │
│ (PostgreSQL) │ │ API    │  │                          │
│ - tenants    │ └────────┘  │  ┌─────────────────────┐ │
│ - users      │             │  │ tenant-a namespace  │ │
│ - teams      │   ┌─────┐   │  │ ├─ cameleer-server  │ │
│ - audit log  │   │Flux │   │  │ ├─ camel-app-1      │ │
│ - licenses   │   │ CD  │   │  │ ├─ camel-app-2      │ │
└──────────────┘   └──┬──┘   │  │ └─ NetworkPolicies  │ │
                      │      │  └─────────────────────┘ │
              ┌───────▼──┐   │  ┌─────────────────────┐ │
              │ GitOps   │   │  │ tenant-b namespace  │ │
              │ Repo     │   │  │ └─ ...              │ │
              │(HelmRel) │   │  └─────────────────────┘ │
              └──────────┘   │                          │
                             │  Shared:                 │
                             │  ├─ PostgreSQL (tenant   │
                             │  │   schemas)            │
                             │  ├─ OpenSearch (tenant   │
                             │  │   indices)            │
                             │  └─ Container Registry   │
                             └──────────────────────────┘
```
### Dedicated Tier (High/Business)
The same management platform routes to dedicated cluster(s) per customer, each with its own PostgreSQL, OpenSearch, and container registry inside the customer's cluster. Dedicated clusters are provisioned semi-manually at launch (Flux bootstrap); full Cluster API automation is deferred.
### Tech Stack
| Component | Technology |
|-----------|------------|
| Management Platform backend | Spring Boot 3, Java 21 |
| Management Platform frontend | React, @cameleer/design-system |
| Platform database | PostgreSQL |
| Tenant observability | cameleer-server (Spring Boot), PostgreSQL, OpenSearch |
| GitOps | Flux CD |
| K8s distribution | Talos (production), k3s (dev) |
| Ingress | Traefik or Envoy |
| Billing | Stripe (Subscriptions + Usage Records API) |
| Auth | Spring Security OAuth2, Ed25519 JWT |
| Secrets sync | K8s External Secrets Operator |
| Container registry | Platform-managed (Harbor or Gitea Container Registry) |
| Monitoring | Prometheus, Grafana, Loki, Alertmanager |
| Image signing | cosign/sigstore |
| Image scanning | Trivy |
### Key Architectural Decisions
1. **Modular monolith** — Single Spring Boot app with clean module boundaries. Extractable later if needed.
2. **K8s ingress handles routing** — Tenant routing via path or subdomain. No custom API gateway.
3. **Flux CD for reconciliation** — HelmRelease CRs per tenant. Drift detection, self-healing. K8s-distribution-agnostic.
4. **Platform DB separate from tenant data** — Management platform has its own PostgreSQL. Tenant observability data in separate shared (or dedicated) instances.
5. **Immutable artifact pipeline** — JAR upload → container image → promote through environments. Same binary everywhere.
6. **Dual-mode auth** — SaaS mode: platform is the IdP. Air-gapped mode: server uses standalone auth with local license file.
7. **SOC 2 baked in** — Not bolted on. Audit logging, encryption, image signing, SBOM from day 1.
8. **Self-monitoring** — Prometheus + Grafana stack, completely separate from tenant observability.
---
## 4. Data Architecture
### Platform Database (Management Platform)
Stores all SaaS control plane data — completely separate from tenant observability data.
| Table/Domain | Purpose |
|---|---|
| `tenants` | Tenant record: ID, name, tier, status, Stripe customer ID, created_at |
| `users` | Platform users: email, password hash, MFA, status |
| `tenant_members` | User-to-tenant mapping with role |
| `teams` | Team groupings within a tenant |
| `roles` / `permissions` | RBAC definitions (predefined + custom for high/business) |
| `licenses` | License records: tenant, tier, feature flags, limits, expiry, signing key |
| `audit_log` | Immutable append-only log: actor, action, resource, timestamp, IP, tenant |
| `applications` | Deployed Camel app metadata: name, tenant, version, image ref, status |
| `secrets_metadata` | Secret references (actual values in K8s Secrets or external vault) |
| `vault_configs` | External vault connection configs per tenant |
| `provisioning_events` | Tenant provisioning pipeline state and history |
| `billing_usage` | Aggregated usage snapshots before Stripe reporting |
### Tenant Data (Shared PostgreSQL)
Each tenant's cameleer-server uses its own PostgreSQL schema on the shared instance (dedicated instance for high/business). This is the existing cameleer-server data model — unchanged:
- Route executions, processor traces, metrics
- Route graph topology
- Agent registrations, config history
- Lineage captures, correlation traces, debug sessions
### Tenant Data (Shared OpenSearch)
- `{tenant_id}-executions-*` — time-series execution data
- `{tenant_id}-traces-*` — processor-level traces
- Full index-level isolation with index templates per tenant
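
The naming convention above can be captured in a small helper. This is a sketch under stated assumptions: the class `TenantIndices`, the daily `yyyy.MM.dd` suffix, and the tenant ID pattern are all hypothetical. The lowercase restriction reflects the OpenSearch rule that index names must be lowercase.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.util.Locale;
import java.util.regex.Pattern;

/** Hypothetical helper for per-tenant OpenSearch index names. */
final class TenantIndices {
    enum Kind { EXECUTIONS, TRACES }

    // Assumed tenant ID shape: lowercase alphanumerics and hyphens, not starting with a hyphen.
    private static final Pattern TENANT_ID = Pattern.compile("[a-z0-9][a-z0-9-]{0,62}");
    // Assumed rollover granularity: one index per day.
    private static final DateTimeFormatter DAY = DateTimeFormatter.ofPattern("yyyy.MM.dd");

    /** Wildcard pattern, e.g. for an index template or a role's index ACL. */
    static String pattern(String tenantId, Kind kind) {
        return prefix(tenantId, kind) + "*";
    }

    /** Concrete index name for one day of data. */
    static String dailyIndex(String tenantId, Kind kind, LocalDate day) {
        return prefix(tenantId, kind) + DAY.format(day);
    }

    private static String prefix(String tenantId, Kind kind) {
        if (!TENANT_ID.matcher(tenantId).matches()) {
            throw new IllegalArgumentException("invalid tenant id: " + tenantId);
        }
        return tenantId + "-" + kind.name().toLowerCase(Locale.ROOT) + "-";
    }

    public static void main(String[] args) {
        System.out.println(pattern("tenant-a", Kind.EXECUTIONS));
        System.out.println(dailyIndex("tenant-a", Kind.TRACES, LocalDate.of(2026, 3, 29)));
    }
}
```

One design point worth checking when tenant IDs may contain hyphens: a pattern such as `acme-executions-*` would also match the indices of a tenant whose ID is `acme-executions`, so the ID format and the index ACLs need to rule that overlap out.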
### Self-Monitoring Data
Completely separate: Prometheus TSDB for metrics, Loki for logs.

---
## 5. Identity & Access Management
### Architecture
The SaaS management platform is the single identity plane. It owns authentication and authorization. Per-tenant cameleer-server instances trust SaaS-issued tokens.
- Spring Security OAuth2 for OIDC federation with customer IdPs
- Ed25519 JWT signing (consistent with existing cameleer-server pattern)
- Tokens carry: tenant ID, user ID, roles, feature entitlements
- cameleer-server validates SaaS-issued JWTs in managed mode
- Standalone mode retains its own auth for air-gapped deployments
### RBAC Model
| Role | Capabilities |
|------|-------------|
| Owner | Full tenant admin, billing, team management, delete tenant |
| Admin | Manage apps, secrets, team members, environments. No billing. |
| Developer | Deploy apps, view traces, use debugger/replay. No team management. |
| Viewer | Read-only access to dashboards, traces, topology |

High/Business tiers: custom roles with granular permissions (e.g., "can replay in dev but not prod").
### Team Management
- Invite by email
- Role assignment per user
- Basic (low/mid): single team, predefined roles
- Full (high/business): multiple teams, custom roles, team-scoped permissions
---
## 6. Tenant Provisioning
### Shared Tier Flow (Low/Mid)
```
Customer signs up + payment
→ Create tenant record + Stripe customer/subscription
→ Generate signed license token (Ed25519)
→ Create Flux HelmRelease CR
→ Flux reconciles: namespace, ResourceQuota, NetworkPolicies, cameleer-server
→ Provision PostgreSQL schema + per-tenant credentials
→ Provision OpenSearch index template + per-tenant credentials
→ Readiness check: server healthy, DB migrated, auth working
→ Generate bootstrap tokens, present onboarding instructions
→ Tenant status → ACTIVE
```
**Target: < 5 minutes from payment to active environment.**
### Dedicated Tier Flow (High/Business)
Semi-manual at launch:
1. Customer signs committed resource agreement
2. Operator provisions dedicated cluster (Talos)
3. Flux bootstrap deploys full stack
4. Management platform configured to route to dedicated cluster
5. From this point on, lifecycle management is automated (same as for shared tiers)

Full Cluster API automation is deferred to a future release.
### Lifecycle Operations
| Operation | Mechanism |
|-----------|-----------|
| Suspension (non-payment) | Scale tenant workloads to 0, license set to suspended |
| Reactivation | Scale back up, license reactivated |
| Deletion | Remove namespace, drop PG schema, delete OS indices, scrub audit log references. GDPR compliant. |
| Tier upgrade (shared → dedicated) | Provision dedicated cluster, migrate data, update routing. Downtime window coordinated. |
| Tier downgrade | Reverse of upgrade. Data retention limits applied. |
### Failure Handling
- Each provisioning step is idempotent and retryable
- State machine in platform DB tracks progress per step
- Failed provisioning → alert ops + notify customer with ETA
- Partial provisioning cleanup on permanent failure
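
The combination of idempotent steps and a per-step state machine could look roughly like the sketch below. It is an in-memory stand-in only: the document places this state in the platform DB, and the names `ProvisioningRun`, `Step`, and `Status`, as well as the exact step list, are assumptions derived from the shared-tier flow. Each step callback is expected to be idempotent, so re-running after a failure resumes at the first step that is not yet done.

```java
import java.util.EnumMap;
import java.util.Map;
import java.util.function.Consumer;

/** Hypothetical in-memory sketch of a resumable provisioning run. */
final class ProvisioningRun {
    enum Step {
        CREATE_TENANT_RECORD, ISSUE_LICENSE, CREATE_HELM_RELEASE,
        PROVISION_POSTGRES, PROVISION_OPENSEARCH, READINESS_CHECK, BOOTSTRAP_TOKENS
    }

    enum Status { PENDING, DONE, FAILED }

    private final Map<Step, Status> state = new EnumMap<>(Step.class);
    private final int maxAttempts;

    ProvisioningRun(int maxAttempts) {
        this.maxAttempts = maxAttempts;
        for (Step step : Step.values()) {
            state.put(step, Status.PENDING);
        }
    }

    /** Runs every step that is not DONE, in order; returns true when all steps succeeded. */
    boolean run(Consumer<Step> executor) {
        for (Step step : Step.values()) {
            if (state.get(step) == Status.DONE) {
                continue; // completed in an earlier run, skip
            }
            if (!attempt(step, executor)) {
                return false; // stop here; a later run resumes at this step
            }
        }
        return true;
    }

    private boolean attempt(Step step, Consumer<Step> executor) {
        for (int i = 1; i <= maxAttempts; i++) {
            try {
                executor.accept(step);
                state.put(step, Status.DONE);
                return true;
            } catch (RuntimeException e) {
                state.put(step, Status.FAILED);
            }
        }
        return false;
    }

    Status status(Step step) {
        return state.get(step);
    }

    public static void main(String[] args) {
        ProvisioningRun run = new ProvisioningRun(3);
        System.out.println("active: " + run.run(step -> System.out.println("running " + step)));
    }
}
```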
---
## 7. Camel Application Runtime
### JAR Upload → Immutable Image
1. **Validation** — File type check, size limit per tier, SHA-256 checksum, Trivy security scan, secret detection (reject JARs with embedded credentials)
2. **Image Build** — Templated Dockerfile: distroless JRE base + customer JAR + cameleer-agent.jar + `-javaagent` flag + agent pre-configured for tenant server. Image tagged: `registry/{tenant}/{app}:v{N}-{sha256short}`. Signed with cosign. SBOM attached.
3. **Registry Push** — Per-tenant repository in platform container registry
4. **Deploy** — K8s Deployment in tenant namespace with resource limits, secrets mounted, config injected, NetworkPolicy applied, liveness/readiness probes
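
The tag format `registry/{tenant}/{app}:v{N}-{sha256short}` can be derived directly from the uploaded JAR's checksum. A minimal sketch follows; the class `ImageTags` is hypothetical, and the 12-character length of the short hash is an assumption, since the document does not specify it.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HexFormat;

/** Hypothetical helper that derives an immutable image reference from a JAR upload. */
final class ImageTags {
    private static final int SHORT_SHA_LENGTH = 12; // assumed; not specified in this document

    /** SHA-256 of the uploaded JAR bytes as lowercase hex. */
    static String sha256Hex(byte[] jar) {
        try {
            return HexFormat.of().formatHex(MessageDigest.getInstance("SHA-256").digest(jar));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 must be available on every Java platform", e);
        }
    }

    /** Builds registry/{tenant}/{app}:v{N}-{sha256short}. */
    static String imageRef(String registry, String tenant, String app, int version, byte[] jar) {
        String shortSha = sha256Hex(jar).substring(0, SHORT_SHA_LENGTH);
        return "%s/%s/%s:v%d-%s".formatted(registry, tenant, app, version, shortSha);
    }

    public static void main(String[] args) {
        byte[] jar = "not a real jar".getBytes(java.nio.charset.StandardCharsets.UTF_8);
        System.out.println(imageRef("registry.example.com", "tenant-a", "orders", 3, jar));
    }
}
```

Because the tag embeds the content hash, two uploads with the same version number but different bytes can never share a tag, which is what makes promotion by tag safe.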
### Environment Promotion
```
dev → staging → prod
(same image tag, different config + secrets per environment)
```
- Promotion = deploy existing image tag to target environment (no rebuild)
- Rollback = redeploy previous image tag
- Every promotion is audit-logged (who, what, from, to)
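
Promotion without rebuild amounts to creating a new deployment record that reuses the exact image reference. The sketch below illustrates this; `Promotion`, `Deployment`, and the fixed three-environment order are hypothetical and correspond to the default High/Business environment set.

```java
import java.util.List;

/** Hypothetical sketch: promotion copies the image reference to the next environment. */
final class Promotion {
    // Assumed order for a tenant with all three default environments.
    static final List<String> ORDER = List.of("dev", "staging", "prod");

    record Deployment(String app, String environment, String imageRef) {}

    /** Returns the deployment for the next environment, with the same image reference. */
    static Deployment promote(Deployment current) {
        int index = ORDER.indexOf(current.environment());
        if (index < 0) {
            throw new IllegalArgumentException("unknown environment: " + current.environment());
        }
        if (index == ORDER.size() - 1) {
            throw new IllegalStateException("already in the last environment");
        }
        return new Deployment(current.app(), ORDER.get(index + 1), current.imageRef());
    }

    public static void main(String[] args) {
        Deployment dev = new Deployment("orders", "dev", "registry.example.com/tenant-a/orders:v3-e3b0c44298fc");
        System.out.println(promote(promote(dev)));
    }
}
```

Rollback fits the same shape: it is a deployment to the same environment with a previous image reference.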
### Environment Model
| Tier | Default Environments | Custom Environments |
|------|---------------------|-------------------|
| Low | prod | No |
| Mid | dev, prod | No |
| High | dev, staging, prod | Unlimited |
| Business | dev, staging, prod | Unlimited |
### Application Deployment Page
Central UI for managing each deployed application:
- **Deploy** — Upload JAR, view build status, deploy to environment, promote, rollback
- **Configuration** — Environment variables, JVM options, agent config overrides, application properties. Per-environment. Changes trigger rolling restart.
- **Secrets** — Create/edit platform-managed secrets. Link external vault secrets. Scoped per environment. Masked in the UI; every reveal is audit-logged.
- **Status** — Pod health, resource usage, agent connection status, recent events
- **Logs** — Live stdout/stderr stream
- **Versions** — Image history, promotion history, rollback targets
### Application Lifecycle
| Action | Mechanism |
|--------|-----------|
| Deploy | Upload JAR → build image → deploy to environment |
| Promote | Redeploy same image tag to next environment |
| Rollback | Redeploy previous image tag |
| Scale | Update replica count |
| Stop | Scale to 0 (preserves config) |
| Delete | Remove Deployment + clean registry images per retention |
| Logs | Stream via K8s log API |
---
## 8. Observability Integration
### Architecture
Each tenant gets a dedicated cameleer-server instance:
- Shared tiers: deployed in tenant's namespace
- Dedicated tiers: deployed in tenant's cluster

The ingress routes `/t/{tenant}/api/*` to the correct server instance (no custom API gateway; see Key Architectural Decisions). The server's React UI is embedded in the SaaS shell: the shell provides navigation, the tenant switcher, and billing pages, and the product UI is rendered inside it.
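
The path convention `/t/{tenant}/api/*` can be illustrated in Java. This is only a sketch of the mapping, since the architecture delegates tenant routing to the K8s ingress. The class `TenantRoute`, the tenant ID pattern, and the choice to strip the `/t/{tenant}` prefix before forwarding are all assumptions.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical sketch of resolving a tenant and upstream path from a request path. */
final class TenantRoute {
    // Group 1: tenant ID (assumed lowercase alphanumerics and hyphens). Group 2: path sent upstream.
    private static final Pattern ROUTE = Pattern.compile("/t/([a-z0-9][a-z0-9-]*)(/api(?:/.*)?)");

    record Target(String tenant, String upstreamPath) {}

    /** Returns the routing target, or empty if the path is not a tenant API path. */
    static Optional<Target> resolve(String path) {
        Matcher matcher = ROUTE.matcher(path);
        if (!matcher.matches()) {
            return Optional.empty();
        }
        return Optional.of(new Target(matcher.group(1), matcher.group(2)));
    }

    public static void main(String[] args) {
        System.out.println(resolve("/t/tenant-a/api/routes"));
        System.out.println(resolve("/billing"));
    }
}
```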
### Agent Connection
- Agent bootstrap tokens generated by the SaaS platform
- Agents connect directly to their tenant's cameleer-server instance
- Agent auto-injected into customer Camel apps deployed on the platform
- External agents (customer-hosted Camel apps) can also connect using bootstrap tokens
### MOAT Features (gated by license)
| Feature | Description | Tier Availability |
|---------|-------------|-------------------|
| **Topology Graph** | Route dependency visualization from existing execution data | All tiers |
| **Payload Flow Lineage** | Per-processor before/after capture + format-aware diff | Limited on Low (route-scope only, max 10 captures/min), Full on Mid+ |
| **Cross-Service Correlation** | Distributed trace assembly + service dependency graph | Mid+ |
| **Live Route Debugger** | Browser-based route stepping with breakpoints | High+ |
| **Exchange Replay** | Re-execute recorded exchange with modified payload, fully audited | High+ |
### Server Configuration
- SaaS platform pushes tier-specific config: feature flags, retention limits, resource limits
- Server runs in "managed mode": trusts SaaS-issued JWTs, reports metrics back to platform
- Air-gapped mode: standalone with local license file
---
## 9. Secrets Management
### Day 1 Requirements
- **Platform-native secret store** — Encrypted at rest in K8s Secrets (sealed-secrets or SOPS)
- **External vault integration** — HashiCorp Vault at launch. AWS Secrets Manager, Azure Key Vault, GCP Secret Manager deferred to future release.
- **Injection** — Secrets injected into Camel app containers as environment variables or mounted files
- **Rotation** — Update secret → rolling restart of affected apps
- **RBAC** — Only authorized team members can create/view/rotate secrets
- **Per-environment scoping** — Dev secrets ≠ prod secrets
- **K8s External Secrets Operator** — Syncs external vault secrets into K8s Secrets
### Tenant Isolation
- Secrets strictly scoped to tenant + environment
- No cross-tenant secret access possible
- Envelope encryption with per-tenant keys on shared storage
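
Envelope encryption with per-tenant keys can be sketched with the JDK's built-in AES-GCM support. The example is illustrative only: `EnvelopeCrypto` and `Envelope` are hypothetical names, keys are generated in memory rather than held in a KMS or vault, and the choice of AES-256-GCM for both layers is an assumption.

```java
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;
import javax.crypto.spec.SecretKeySpec;

/** Hypothetical sketch: a per-secret data key wrapped under a per-tenant key. */
final class EnvelopeCrypto {
    private static final SecureRandom RANDOM = new SecureRandom();

    /** Ciphertext plus the wrapped data key; each layer has its own IV. */
    record Envelope(byte[] wrappedKey, byte[] keyIv, byte[] ciphertext, byte[] dataIv) {}

    static SecretKey newAesKey() {
        try {
            KeyGenerator generator = KeyGenerator.getInstance("AES");
            generator.init(256);
            return generator.generateKey();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Encrypts the secret with a fresh data key, then wraps that key with the tenant key. */
    static Envelope seal(SecretKey tenantKey, byte[] secret) {
        SecretKey dataKey = newAesKey();
        byte[] dataIv = randomIv();
        byte[] ciphertext = gcm(Cipher.ENCRYPT_MODE, dataKey, dataIv, secret);
        byte[] keyIv = randomIv();
        byte[] wrappedKey = gcm(Cipher.ENCRYPT_MODE, tenantKey, keyIv, dataKey.getEncoded());
        return new Envelope(wrappedKey, keyIv, ciphertext, dataIv);
    }

    /** Unwraps the data key with the tenant key and decrypts; fails for any other tenant's key. */
    static byte[] open(SecretKey tenantKey, Envelope envelope) {
        byte[] rawDataKey = gcm(Cipher.DECRYPT_MODE, tenantKey, envelope.keyIv(), envelope.wrappedKey());
        return gcm(Cipher.DECRYPT_MODE, new SecretKeySpec(rawDataKey, "AES"), envelope.dataIv(), envelope.ciphertext());
    }

    private static byte[] randomIv() {
        byte[] iv = new byte[12]; // 96-bit IV, the recommended size for GCM
        RANDOM.nextBytes(iv);
        return iv;
    }

    private static byte[] gcm(int mode, SecretKey key, byte[] iv, byte[] input) {
        try {
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(mode, key, new GCMParameterSpec(128, iv));
            return cipher.doFinal(input);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("encryption or authentication failed", e);
        }
    }
}
```

With this layout, rotating a tenant key only requires re-wrapping the small data keys, not re-encrypting every stored value.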
### Audit
- Every secret access logged (create, read, update, delete, inject)
- Audit trail queryable by tenant admins
---
## 10. License & Feature Gating
### License Token
Ed25519-signed JWT containing:
- Tenant ID, tier, expiry
- Feature flags (topology, lineage, correlation, debugger, replay)
- Resource limits (agents, retention, environments, vaults, debug sessions)
- SSO/OIDC and custom roles entitlements
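
Signing and verifying such a token needs nothing beyond the JDK, which has supported Ed25519 since Java 15. The sketch below builds a compact JWS by hand to show the mechanics; a real implementation would more likely use a JWT library and would also validate claims such as expiry. `LicenseSigner` and the claim names in the usage are hypothetical; the `EdDSA` algorithm identifier is the one defined for JOSE in RFC 8037.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PrivateKey;
import java.security.PublicKey;
import java.security.Signature;
import java.util.Base64;

/** Hypothetical sketch of Ed25519 signing and verification for a license JWT. */
final class LicenseSigner {
    private static final Base64.Encoder B64 = Base64.getUrlEncoder().withoutPadding();
    private static final String HEADER = "{\"alg\":\"EdDSA\",\"typ\":\"JWT\"}";

    static KeyPair newKeyPair() {
        try {
            return KeyPairGenerator.getInstance("Ed25519").generateKeyPair();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Returns header.payload.signature, each part base64url-encoded without padding. */
    static String sign(String claimsJson, PrivateKey key) {
        String signingInput = encode(HEADER) + "." + encode(claimsJson);
        try {
            Signature signature = Signature.getInstance("Ed25519");
            signature.initSign(key);
            signature.update(signingInput.getBytes(StandardCharsets.US_ASCII));
            return signingInput + "." + B64.encodeToString(signature.sign());
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Checks only the signature; claim validation (expiry, tenant) is out of scope here. */
    static boolean verify(String token, PublicKey key) {
        int lastDot = token.lastIndexOf('.');
        if (lastDot < 0) {
            return false;
        }
        try {
            Signature signature = Signature.getInstance("Ed25519");
            signature.initVerify(key);
            signature.update(token.substring(0, lastDot).getBytes(StandardCharsets.US_ASCII));
            return signature.verify(Base64.getUrlDecoder().decode(token.substring(lastDot + 1)));
        } catch (GeneralSecurityException | IllegalArgumentException e) {
            return false; // malformed signature or invalid base64
        }
    }

    private static String encode(String json) {
        return B64.encodeToString(json.getBytes(StandardCharsets.UTF_8));
    }
}
```

For example, `LicenseSigner.sign("{\"tenant\":\"tenant-a\",\"tier\":\"mid\"}", keyPair.getPrivate())` yields a token that verifies against `keyPair.getPublic()` and against no other key.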
### Dual Validation
| Mode | Mechanism |
|------|-----------|
| **SaaS** | Server polls platform API `GET /api/license/{tenant}`, caches 5 min, 24h grace on API failure |
| **Air-gapped** | Server validates local file `/etc/cameleer/license.jwt`, Ed25519 signature verification |

Both modes produce the same `LicenseContext` singleton used throughout the server.
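
The SaaS-mode behaviour (5-minute cache, 24-hour grace on API failure) can be sketched as follows. `CachedLicense` is a hypothetical name, the `Supplier` stands in for the call to `GET /api/license/{tenant}`, and measuring the grace window from the last successful fetch is an assumption, since the document does not say where the 24 hours start.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.Objects;
import java.util.Optional;
import java.util.function.Supplier;

/** Hypothetical sketch of license polling with a TTL cache and a grace window. */
final class CachedLicense {
    static final Duration TTL = Duration.ofMinutes(5);
    static final Duration GRACE = Duration.ofHours(24);

    private final Supplier<String> fetcher; // stands in for the platform API call; throws on failure
    private String token;
    private Instant fetchedAt;

    CachedLicense(Supplier<String> fetcher) {
        this.fetcher = fetcher;
    }

    /** Returns the token to use at {@code now}, or empty once no usable license remains. */
    synchronized Optional<String> current(Instant now) {
        boolean fresh = token != null && now.isBefore(fetchedAt.plus(TTL));
        if (!fresh) {
            try {
                token = Objects.requireNonNull(fetcher.get(), "license token");
                fetchedAt = now;
            } catch (RuntimeException e) {
                // Platform API unavailable: keep the last good token while inside the grace window.
                if (token == null || !now.isBefore(fetchedAt.plus(GRACE))) {
                    return Optional.empty();
                }
            }
        }
        return Optional.of(token);
    }
}
```

Passing `now` in explicitly keeps the sketch testable without a real clock; a server would typically pass `Instant.now()`.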
### Enforcement
- Feature endpoints return 403 with `not_entitled` reason and upgrade URL
- Graceful degradation: features disabled, not errors
- License expiry: 7-day grace period (read-only mode), then hard cutoff
### Lifecycle
- Generated on tenant signup, regenerated on tier change
- Air-gapped: downloadable from management platform
- Non-payment: license suspended → grace period → expired
---
## 11. Networking & Tenant Isolation
### Day 1: Namespace Isolation (Shared Tiers)
K8s NetworkPolicies per tenant namespace:
- **Default deny** all ingress/egress between tenant namespaces
- **Allow:** tenant namespace → shared PostgreSQL/OpenSearch (authenticated per-tenant credentials)
- **Allow:** tenant namespace → public internet (Camel app external connectivity)
- **Allow:** SaaS platform namespace → all tenant namespaces (management access)
- **Allow:** tenant Camel apps → tenant cameleer-server (intra-namespace)
### Zero-Trust Tenant Boundary
- Per-tenant database credentials (not shared superuser with row filtering)
- Per-tenant OpenSearch roles with index-level ACLs
- Connection pooling per tenant (PgBouncer per namespace)
- A compromised tenant server holds no credentials that grant access to another tenant's data
### Future: VPN / Private Connectivity
- WireGuard or IPsec tunnels to customer infrastructure
- Private DNS resolution for customer internal hostnames
- Available on High/Business tiers only
---
## 12. Security & SOC 2 Compliance
### Encryption
| Layer | Mechanism |
|-------|-----------|
| In transit (external) | TLS 1.3 at ingress |
| In transit (internal) | mTLS between services |
| In transit (agent ↔ server) | TLS + Ed25519-signed config payloads |
| At rest (databases) | Volume encryption (LUKS or cloud-native) |
| At rest (secrets) | Envelope encryption with per-tenant keys |
| At rest (registry) | Encrypted storage backend |
### Audit Trail
Every state-changing action produces an immutable audit record:
- Actor, tenant, action, resource, environment, source IP, result, metadata
- Append-only table with no UPDATE/DELETE grants
- Minimum 1 year retention
- Shipped to separate write-only sink (survives platform DB compromise)
- Covers: auth events, provisioning, deployments, config changes, secret access, billing, replay executions, debug sessions, admin actions
### Container Hardening
- Distroless base images (no shell in production)
- Read-only filesystem
- K8s Pod Security Standards: restricted profile (no root, no privilege escalation, no host access)
- Resource limits enforced — compromised tenant can't fork-bomb the node
### Supply Chain Security
- Container images signed with cosign/sigstore
- SBOM generated per build
- Dependency pinning (no floating versions)
- Trivy scanning in CI — block on critical CVEs
- Customer JAR uploads scanned
### Breach Detection
- Anomaly alerting: unusual API patterns, auth failures, cross-namespace DNS queries
- Runtime security scanning (Falco or similar)
- Audit log anomaly detection
### Payload Protection
- Application-level encryption of customer exchange payloads with per-tenant keys before writing to PG/OS
- Tenant key rotation without downtime
- Payload redaction rules configurable per tenant (agent already supports this)
### Compliance
- SOC 2 Trust Services Criteria: Security (CC6), Availability (A1), Processing Integrity (PI1), Confidentiality (C1), Privacy (P1-P8)
- Evidence collection: git history (change management), audit log (access), Prometheus (availability)
- Evaluate Vanta or Drata for continuous compliance monitoring
---
## 13. Platform Operations & Self-Monitoring
### Monitoring Stack
| Tool | Purpose |
|------|---------|
| Prometheus | Metrics collection (platform + tenant infra + K8s) |
| Grafana | Dashboards |
| Loki | Log aggregation |
| Alertmanager | Alert routing → PagerDuty/OpsGenie/Slack |
| Uptime Kuma or Checkly | External synthetic monitoring |

Completely separate from tenant observability data.
### Key Day-1 Alerts
- Control plane down/degraded
- Tenant provisioning failure
- Database connection pool exhaustion
- OpenSearch cluster red/yellow
- Flux reconciliation failure
- TLS certificate expiry < 14 days
- Metering pipeline stale > 1 hour
- Disk usage > 80% on any PV
- Tenant cameleer-server unhealthy > 5 minutes
- OOMKill on any tenant workload
### Dashboards
- Platform overview: tenant count, active agents, provisioning queue, error rates
- Per-tenant health: server status, app status, resource usage
- Billing: MRR, usage trends, metering pipeline health
- Infrastructure: cluster capacity, node utilization, storage growth
- Security: auth failures, audit log anomalies, certificate status
### SLA Reporting
- Automated uptime calculation per tenant
- SLA breach detection and alerting
- Monthly availability reports for high/business tier customers
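
The uptime calculation behind these reports is simple arithmetic: availability is one minus downtime divided by the reporting period. For a 30-day month (43,200 minutes), a 99.5% target allows 216 minutes of downtime and a 99.9% target allows 43.2 minutes. A minimal sketch, with the class `Sla` and the 30-day period as assumptions:

```java
import java.time.Duration;

/** Hypothetical sketch of per-tenant availability and downtime budget arithmetic. */
final class Sla {
    /** Availability as a fraction between 0 and 1 for one reporting period. */
    static double availability(Duration period, Duration downtime) {
        return 1.0 - (double) downtime.toMillis() / period.toMillis();
    }

    /** Downtime allowed within the period before the target (e.g. 0.999) is missed. */
    static Duration downtimeBudget(Duration period, double target) {
        return Duration.ofMillis(Math.round(period.toMillis() * (1.0 - target)));
    }

    /** True when observed downtime exceeds the budget for the target. */
    static boolean breached(Duration period, Duration downtime, double target) {
        return downtime.compareTo(downtimeBudget(period, target)) > 0;
    }

    public static void main(String[] args) {
        Duration month = Duration.ofDays(30);
        System.out.println("99.5% budget: " + downtimeBudget(month, 0.995).toMinutes() + " min");
        System.out.println("breached at 44 min, 99.9%: " + breached(month, Duration.ofMinutes(44), 0.999));
    }
}
```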
---
## 14. Billing & Metering
### Metering Pipeline (Low/Mid Tiers)
```
K8s Metrics → Metrics Collector → Usage Aggregator (hourly) → Stripe Usage Records API
```
| Dimension | Unit | Source |
|-----------|------|--------|
| CPU | core·hours | K8s metrics (namespace aggregate) |
| RAM | GB·hours | K8s metrics (namespace aggregate) |
| Data volume | GB ingested | cameleer-server reports |
- Aggregated per tenant, per hour, stored in platform DB before Stripe submission
- Idempotent aggregation (safe to re-run)
- Staleness alert if no data for > 1 hour
- Monthly reconciliation: platform records vs Stripe invoices
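
Hourly aggregation that is safe to re-run can be achieved by keying each row on tenant and hour and overwriting rather than accumulating. The sketch below assumes evenly spaced samples, so the mean usage over one hour equals the core·hours and GB·hours for that hour. `UsageAggregator` and its records are hypothetical, and the in-memory map stands in for the `billing_usage` table.

```java
import java.time.Instant;
import java.time.temporal.ChronoUnit;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of idempotent per-tenant, per-hour usage aggregation. */
final class UsageAggregator {
    /** One scrape of a tenant namespace: CPU cores and RAM (GB) in use at that instant. */
    record Sample(String tenant, Instant at, double cpuCores, double ramGb) {}

    record HourKey(String tenant, Instant hourStart) {}

    record Usage(double coreHours, double gbHours) {}

    private final Map<HourKey, Usage> store = new HashMap<>(); // stand-in for billing_usage

    /** Aggregates one tenant-hour; running it again for the same hour overwrites the same row. */
    void aggregate(String tenant, Instant anyTimeInHour, List<Sample> samples) {
        Instant start = anyTimeInHour.truncatedTo(ChronoUnit.HOURS);
        Instant end = start.plus(1, ChronoUnit.HOURS);
        List<Sample> inHour = samples.stream()
                .filter(s -> s.tenant().equals(tenant))
                .filter(s -> !s.at().isBefore(start) && s.at().isBefore(end))
                .toList();
        if (inHour.isEmpty()) {
            return; // nothing to record; the staleness alert covers missing data
        }
        // Mean usage over a one-hour window equals usage-hours for that window.
        double coreHours = inHour.stream().mapToDouble(Sample::cpuCores).average().orElseThrow();
        double gbHours = inHour.stream().mapToDouble(Sample::ramGb).average().orElseThrow();
        store.put(new HourKey(tenant, start), new Usage(coreHours, gbHours));
    }

    Usage usage(String tenant, Instant anyTimeInHour) {
        return store.get(new HourKey(tenant, anyTimeInHour.truncatedTo(ChronoUnit.HOURS)));
    }

    int rows() {
        return store.size();
    }
}
```

The same tenant-and-hour key could also serve as an idempotency key when the row is later reported to Stripe, so that a retried submission does not double-bill.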
### Committed Resources (High/Business)
- Fixed Stripe subscription per resource bundle
- Overage alerts (upsell, not automatic billing)
- Annual/multi-year contracts
### Billing UI
- Current period usage with live cost estimate
- Historical usage charts per dimension
- Invoice history
- Plan management (upgrade/downgrade)
---
## 15. Management Platform UI
### Navigation
| Section | Content |
|---------|---------|
| **Dashboard** | Platform overview: apps, health, usage summary |
| **Apps** | List deployed Camel applications |
| **App → Deploy** | Upload JAR, build status, deploy/promote/rollback |
| **App → Configuration** | Env vars, JVM options, agent config. Per environment. |
| **App → Secrets** | Manage secrets, link vaults. Per environment. |
| **App → Status** | Pod health, resource usage, agent connection, events |
| **App → Logs** | Live stdout/stderr stream |
| **App → Versions** | Image history, promotion log, rollback |
| **Observe** | Embedded cameleer-server UI (topology, traces, lineage, correlation, debugger, replay) |
| **Team** | Users, roles, invites |
| **Settings** | Tenant config, SSO/OIDC, vault connections |
| **Billing** | Usage, invoices, plan management |
### Design
- SaaS shell built with `@cameleer/design-system`
- cameleer-server React UI embedded (same design system, visual consistency)
- Responsive but desktop-primary (observability tooling is a desktop workflow)
---
## 16. Day-1 vs Future Scope
### Day 1 (Launch)
| Epic | Scope |
|------|-------|
| #1 Management Platform | Modular monolith, React shell |
| #2 Identity & Access | Registration, login, teams, JWT, OIDC (high/business) |
| #3 Tenant Provisioning | Automated shared tiers, semi-manual dedicated |
| #4 Billing & Metering | Stripe usage-based + committed. Full metering pipeline. |
| #5 Camel Runtime | JAR upload → immutable image → deploy. Agent auto-injection. |
| #6 Observability | Per-tenant server, embedded UI, all MOAT features gated by tier |
| #7 License Module | Dual-mode (SaaS API + local file), feature gating |
| #8 Networking | Namespace isolation, NetworkPolicies, public internet |
| #9 Secrets | Platform-native + HashiCorp Vault. Per-environment scoping. |
| #10 Environments | Build-once-deploy-often. Tier-based environment model. |
| #11 Security & SOC 2 | Full SOC 2 foundations, zero-trust tenant boundaries, audit logging |
| #12 Self-Monitoring | Prometheus/Grafana/Loki/Alertmanager, key alerts, dashboards |
| #13 Exchange Replay | MOAT feature, extends debugger infrastructure |
### Deferred (Future)
| Feature | Reason |
|---------|--------|
| Automated dedicated cluster provisioning (Cluster API) | Semi-manual sufficient for early high/business customers |
| Container image deployment | JAR upload covers day 1 |
| Git-based deployment | Nice-to-have |
| VPN / private connectivity | Public internet sufficient at launch |
| Auto-scaling (HPA) | Manual scaling sufficient |
| Data residency / region selection | Single region at launch |
| Cross-tenant correlation federation | Designed, deferred to v2 |
| Additional vault providers (AWS, Azure, GCP) | HashiCorp Vault covers day 1 |
| Compliance tooling integration (Vanta/Drata) | Manual evidence collection initially |
| Vulnerability scanning in registry | Trivy in CI covers basics |
---
## 17. Gitea Issue Map
| # | Epic | Labels |
|---|------|--------|
| 1 | SaaS Management Platform | epic, platform |
| 2 | Identity & Access Management | epic, auth |
| 3 | Tenant Provisioning & Lifecycle | epic, infra |
| 4 | Billing & Metering | epic, billing |
| 5 | Camel Application Runtime | epic, runtime |
| 6 | Observability Integration | epic, observability |
| 7 | License & Feature Gating | epic, licensing |
| 8 | Networking & Tenant Isolation | epic, networking |
| 9 | Secrets Management | epic, secrets |
| 10 | Environments & Promotion Pipeline | epic, runtime, day-1 |
| 11 | Security & SOC 2 Compliance | epic, security |
| 12 | Platform Operations & Self-Monitoring | epic, ops |
| 13 | MOAT: Exchange Replay | epic, observability |

MOAT features (Debugger, Lineage, Correlation) tracked in cameleer/cameleer #57–#72.