# Checkpoints table redesign + deployment audit gap closure

**Date:** 2026-04-23
**Status:** Spec — pending implementation
**Affects:** App deployment page, deployments backend, audit log

## Context

The Checkpoints disclosure on the unified app deployment page (`ui/src/pages/AppsTab/AppDeploymentPage/Checkpoints.tsx`) currently renders past deployments as a cramped row list — a Badge, a "12m ago" label, and a Restore button. It hides the operator information that matters most when reasoning about a checkpoint: who deployed it, the JAR filename (not just the version number), the deployment outcome, and access to the logs and config snapshot the deployment ran with.

Investigating this also surfaced a **gap in the audit log**: `DeploymentController.deploy / stop / promote` make zero `auditService.log(...)` calls. Container deployments — the most consequential operations the server performs — leave no audit trail today. Closing this gap is in scope because it is a prerequisite for the "Deployed by" column.

## Goals

1. Replace the cramped checkpoints list with a real table (DS `DataTable`) showing version, JAR filename, deployer, time, strategy, and outcome.
2. Capture and display "who deployed" — backend gains a `created_by` column on `deployments`, populated from `SecurityContextHolder`.
3. Audit deploy / stop / promote operations under a new `AuditCategory.DEPLOYMENT` value.
4. Provide an in-page detail view (side drawer) where the operator can review the deployment's logs and config snapshot before deciding to restore, with an optional diff against the current live config.
5. Cap the visible checkpoint list at the environment's JAR retention count, since older entries cannot be restored.

## Out of scope

- Sortable column headers (default newest-first is enough)
- Deep-linking via `?checkpoint=<id>` query param
- "Remember last drawer tab" preference
- Bulk actions on checkpoints
- Promoting `SideDrawer` into `@cameleer/design-system` (wait for a second consumer)

## Backend changes

### Audit category

Add `DEPLOYMENT` to `cameleer-server-core/src/main/java/com/cameleer/server/core/admin/AuditCategory.java`:

```java
public enum AuditCategory {
    INFRA, AUTH, USER_MGMT, CONFIG, RBAC, AGENT,
    OUTBOUND_CONNECTION_CHANGE, OUTBOUND_HTTP_TRUST_CHANGE,
    ALERT_RULE_CHANGE, ALERT_SILENCE_CHANGE,
    DEPLOYMENT
}
```

The `AuditCategory.valueOf(...)` lookup in `AuditLogController` picks this up automatically. The Admin → Audit page filter dropdown gets one new option in `ui/src/pages/Admin/AuditLogPage.tsx`.

### Audit calls in `DeploymentController`

Add `AuditService` injection and write audit rows on every successful and failed lifecycle operation. Action codes:

| Method | Action | Target | Details |
|---|---|---|---|
| `deploy` | `deploy_app` | `deployment.id().toString()` | `{ appSlug, envSlug, appVersionId, jarFilename, version }` |
| `stop` | `stop_deployment` | `deploymentId.toString()` | `{ appSlug, envSlug }` |
| `promote` | `promote_deployment` | `deploymentId.toString()` | `{ sourceEnv, targetEnv, appSlug, appVersionId }` |

Each `try` branch writes `AuditResult.SUCCESS`; `catch (IllegalArgumentException)` writes `AuditResult.FAILURE` with the exception message in details before returning the existing 404. This follows the pattern already used in `OutboundConnectionAdminController`.

### Flyway migration `V2__add_deployment_created_by.sql`

```sql
ALTER TABLE deployments ADD COLUMN created_by TEXT REFERENCES users(user_id);
CREATE INDEX idx_deployments_created_by ON deployments (created_by);
```

Nullable — existing rows stay `NULL` (rendered as `—` in UI). New rows always populated. No backfill: pre-V2 history is unrecoverable, and the column starts paying off from the next deploy onward.

### Service signature change

`DeploymentService.createDeployment(appId, appVersionId, envId, createdBy)` and `promote(targetAppId, sourceVersionId, targetEnvId, createdBy)` both gain a trailing `String createdBy` parameter. `PostgresDeploymentRepository` writes it to the new column.

`DeploymentController` resolves `createdBy` via the existing user-id convention: strip `"user:"` prefix from `SecurityContextHolder.getContext().getAuthentication().getName()`. Same helper pattern as `AlertRuleController` / `OutboundConnectionAdminController`.

### DTO change

`com.cameleer.server.core.runtime.Deployment` record gains `createdBy: String`. UI `Deployment` interface in `ui/src/api/queries/admin/apps.ts` gains `createdBy: string | null`.

### Log filter for the drawer

`LogQueryController.GET /api/v1/environments/{envSlug}/logs` accepts a new multi-value query param `instanceIds` (comma-split, OR-joined). It translates to `WHERE instance_id IN (...)` against `logs.instance_id`, a `LowCardinality(String)` column that is already part of the table's `ORDER BY` key, so no new index is needed.

`LogSearchRequest` gains `instanceIds: List<String>` (null-normalized). Service layer adds the `IN (...)` clause when non-null and non-empty.

The drawer client computes the instance_id list from `Deployment.replicaStates`: for each replica, `instance_id = "{envSlug}-{appSlug}-{replicaIndex}-{generation}"` where generation is the first 8 chars of `deployment.id`. This is the documented format from `.claude/rules/docker-orchestration.md` — pure client-side derivation, no extra server endpoint.

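A minimal sketch of the client-side derivation described above. It assumes each entry in `Deployment.replicaStates` exposes its replica index as `index` (a hypothetical field name) and that the hook sends the IDs as a single comma-joined `instanceIds` value. The helper names are illustrative, not existing code.

```typescript
// Hypothetical, trimmed-down shapes; the real Deployment type has more fields.
interface ReplicaState {
  index: number; // assumed field name for the replica index
}

interface DeploymentLike {
  id: string; // deployment id; its first 8 characters form the generation
  replicaStates: ReplicaState[];
}

// instance_id = "{envSlug}-{appSlug}-{replicaIndex}-{generation}"
function deriveInstanceIds(
  deployment: DeploymentLike,
  envSlug: string,
  appSlug: string,
): string[] {
  const generation = deployment.id.slice(0, 8);
  return deployment.replicaStates.map(
    (replica) => `${envSlug}-${appSlug}-${replica.index}-${generation}`,
  );
}

// Serializes the IDs as one comma-separated `instanceIds` query parameter.
// Returns an empty string when there is nothing to filter on, so the
// request keeps its unfiltered behavior.
function instanceIdsQuery(instanceIds: string[]): string {
  if (instanceIds.length === 0) return "";
  return "instanceIds=" + instanceIds.map((id) => encodeURIComponent(id)).join(",");
}
```

Keeping the derivation in one pure function makes it easy to unit-test against the documented format without mounting the drawer.
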
## Drawer infrastructure

The design system provides `Modal` but no drawer. Building a project-local component is preferred over submitting to DS first (single consumer; easier to iterate locally).

**File:** `ui/src/components/SideDrawer.tsx` + `SideDrawer.module.css` (~120 LOC total).

**API:**

```tsx
<SideDrawer
  open={!!selectedCheckpoint}
  onClose={() => setSelectedCheckpoint(null)}
  title={`Deployment v${version} · ${jarFilename}`}
  size="lg"          // 'md'=560px, 'lg'=720px, 'xl'=900px
  footer={<Button onClick={handleRestore}>Restore this checkpoint</Button>}
>
  {/* scrollable body */}
</SideDrawer>
```

**Behavior:**
- React portal to `document.body` (mirrors DS `Modal`).
- Slides in from right via `transform: translateX(100% → 0)` over 240ms ease-out.
- Click-blocking transparent backdrop (no dim — the parent table stays readable). Clicking outside closes.
- ESC closes.
- Focus trap on open; focus restored to trigger on close.
- Sticky header (title + close ×) and optional sticky footer.
- Body uses `overflow-y: auto`.
- All colors via DS CSS variables (`--bg`, `--border`, `--shadow-lg`).

**Unsaved-changes interaction:** Opening the drawer is unrestricted. The drawer is read-only — only Restore mutates form state, and Restore already triggers the existing unsaved-changes guard via `useUnsavedChangesBlocker`.

## Checkpoints table

**File:** `ui/src/pages/AppsTab/AppDeploymentPage/CheckpointsTable.tsx` — replaces `Checkpoints.tsx`.

**Columns** (left to right):

| Column | Source | Notes |
|---|---|---|
| Version | `versionMap.get(d.appVersionId).version` | Badge "v6" with auto-color (matches existing pattern) |
| JAR | `versionMap.get(d.appVersionId).jarFilename` | Monospace; truncate with tooltip on overflow |
| Deployed by | `d.createdBy` | Bare username; OIDC users show `oidc:<sub>` truncated with tooltip; null shows `—` muted |
| Deployed | `d.deployedAt` | Relative ("12m ago") + ISO subline |
| Strategy | `d.deploymentStrategy` | Small pill: "blue/green" or "rolling" |
| Outcome | `d.status` | Tinted pill: STOPPED (slate), DEGRADED (amber) |
| (chevron) | — | Visual affordance for "row click opens drawer" |

**Interaction:**
- Row click opens `CheckpointDetailDrawer` (no separate "View" button).
- No per-row Restore button — Restore lives inside the drawer to force review before action.
- Pruned-JAR rows (`!versionMap.has(d.appVersionId)`) render at 55% opacity with a strikethrough on the filename and an amber "archived — JAR pruned" hint. Row stays clickable; Restore inside the drawer is disabled with tooltip.
- Currently-running deployment is excluded (already represented by `StatusCard` above).

**Empty state:** When zero checkpoints, render a single full-width muted row: "No past deployments yet."

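The "Deployed by" and pruned-JAR rules from the table above could be captured in two small pure helpers, sketched here. The truncation width `MAX_DEPLOYER_LABEL` is an assumption (the spec does not fix one), and `deployedByCell`, `isJarPruned`, and `VersionLookup` are hypothetical names; any `Map` keyed by app version id satisfies `VersionLookup`.

```typescript
interface DeployedByCell {
  label: string;    // text shown in the cell
  tooltip?: string; // full value, set only when the label was truncated
  muted: boolean;   // true for the placeholder shown on null
}

const MAX_DEPLOYER_LABEL = 24; // assumed width, not taken from the spec

function deployedByCell(createdBy: string | null): DeployedByCell {
  if (createdBy === null) {
    // Rows created before the migration have no deployer recorded.
    return { label: "—", muted: true };
  }
  if (createdBy.length > MAX_DEPLOYER_LABEL) {
    // Long values (typically `oidc:<sub>`) are cut and the full value moves to the tooltip.
    return {
      label: createdBy.slice(0, MAX_DEPLOYER_LABEL - 1) + "…",
      tooltip: createdBy,
      muted: false,
    };
  }
  return { label: createdBy, muted: false };
}

// Minimal read-only view of the version map used by the table.
interface VersionLookup {
  has(appVersionId: string): boolean;
}

// A row is "pruned" when its app version is no longer present in the lookup.
function isJarPruned(appVersionId: string, versions: VersionLookup): boolean {
  return !versions.has(appVersionId);
}
```

Both helpers return plain data, so the row component only has to map `muted` and the pruned flag to styles.
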
## Pagination

Visible cap = `Environment.jarRetentionCount` rows (newest first). Anything older has likely been pruned and is not restorable, so it's hidden by default.

- `total ≤ jarRetentionCount` → render all, no expander.
- `total > jarRetentionCount` → render newest `jarRetentionCount` rows + an expander row: **"Show older (N) — archived, postmortem only"**. Expanding renders the full list (older rows already styled as archived).
- `jarRetentionCount === 0` (unlimited or unconfigured) → fall back to a default cap of 10 and apply the two rules above against that cap.

`jarRetentionCount` comes from `useEnvironments()` (already in the env-store).

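The three cases above reduce to one cap computation followed by a split. A minimal sketch, assuming the checkpoint list is already sorted newest first; `paginateCheckpoints` and `expanderLabel` are hypothetical helper names.

```typescript
const DEFAULT_CHECKPOINT_CAP = 10; // fallback when jarRetentionCount is 0

interface CheckpointPage<T> {
  visible: T[]; // rows rendered by default
  older: T[];   // rows behind the "Show older (N)" expander
}

// `checkpoints` is assumed to be sorted newest first.
function paginateCheckpoints<T>(
  checkpoints: T[],
  jarRetentionCount: number,
): CheckpointPage<T> {
  const cap = jarRetentionCount > 0 ? jarRetentionCount : DEFAULT_CHECKPOINT_CAP;
  return {
    visible: checkpoints.slice(0, cap),
    older: checkpoints.slice(cap),
  };
}

// Label for the expander row; null means no expander is rendered.
function expanderLabel(olderCount: number): string | null {
  return olderCount > 0
    ? `Show older (${olderCount}) — archived, postmortem only`
    : null;
}
```

When `older` is empty the table renders everything and skips the expander, which covers the `total ≤ jarRetentionCount` case without a separate branch.
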
## Drawer detail view

**File:** `ui/src/pages/AppsTab/AppDeploymentPage/CheckpointDetailDrawer/index.tsx` plus two panel files: `LogsPanel.tsx` and `ConfigPanel.tsx`. The compare/diff view is a view mode inside `ConfigPanel`, not a separate file.

**Header:**
- Version badge + JAR filename + outcome pill.
- Meta line: "Deployed by **{createdBy}** · {relative} ({ISO}) · Strategy: {strategy} · {N} replicas · ran for {duration}".
- Close × top-right.

**Tabs** (DS `Tabs`):
- **Logs** — default on open
- **Config** — read-only render of the live config sub-tabs, with a view-mode toggle for "Snapshot" vs "Diff vs current"

### Logs panel

Reuses `useInfiniteApplicationLogs` with the new `instanceIds` filter. The hook signature gets an optional `instanceIds: string[]` parameter that flows through to the `LogQueryController` query string.

**Filters** (in addition to `instanceIds`):
- Existing source/level multi-select pills
- New replica filter dropdown: "all (N)" / "0" / "1" / ... / "N-1" — narrows to a single replica when troubleshooting blue-green or rolling deploys.

**Default sort:** newest first (matches operator mental model when investigating a stopped deployment).

**Total line count** displayed in the filter bar.

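One way to model the replica dropdown is as a list of options plus a narrowing step over the deployment's instance IDs. The sketch below assumes `instanceIds[i]` belongs to replica index `i`; `replicaOptions` and `narrowToReplica` are hypothetical names.

```typescript
interface ReplicaOption {
  label: string;
  value: number | null; // null selects all replicas
}

// Builds "all (N)", "0", "1", ..., "N-1".
function replicaOptions(replicaCount: number): ReplicaOption[] {
  const options: ReplicaOption[] = [{ label: `all (${replicaCount})`, value: null }];
  for (let i = 0; i < replicaCount; i++) {
    options.push({ label: String(i), value: i });
  }
  return options;
}

// Narrows the instance ID list to the selected replica.
// Assumes instanceIds is ordered by replica index.
function narrowToReplica(instanceIds: string[], selected: number | null): string[] {
  if (selected === null) return instanceIds;
  if (selected < 0) return [];
  // slice returns an empty array when the index is out of range
  return instanceIds.slice(selected, selected + 1);
}
```

The narrowed list can be passed straight to the hook's `instanceIds` parameter, so the replica filter needs no additional query param.
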
### Config panel

Renders the five existing live config sub-tabs (`Monitoring`, `Resources`, `Variables`, `SensitiveKeys`, `Deployment`) **read-only**, hydrated from `deployedConfigSnapshot`.

Each sub-tab component (`ui/src/pages/AppsTab/AppDeploymentPage/ConfigTabs/*`) gains an optional `readOnly?: boolean` prop. When `readOnly` is set:
- All inputs disabled (`disabled` attribute + visual styling)
- Save / edit buttons hidden
- Live banners (`LiveBanner`) hidden — these are not applicable to a frozen snapshot

If a sub-tab currently mixes derived state with form state in a way that makes a clean `readOnly` toggle awkward, refactor that sub-tab as part of this work. Don't proceed with leaky read-only behavior.

**View-mode toggle:** "Snapshot" / "Diff vs current". Default = Snapshot (full read-only render). Diff mode shows differences only — both old and new values per changed field, with red/green left borders, grouped by sub-tab. Each sub-tab pill shows a change-count badge (e.g. "Resources (2)"); sub-tabs with zero differences are dimmed and render a muted "No differences in this section" message when clicked.

Diff base = current live config, pulled via the existing `useApplicationConfig` hook the live form already uses. Algorithm: deep-equal field-level walk between snapshot and current.

The toggle is hidden entirely when JAR is pruned (the missing JAR makes "current vs snapshot" comparison incomplete and misleading).

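The field-level walk behind Diff mode could look roughly like the sketch below. It assumes the snapshot and the live config are plain JSON-like objects whose top-level keys correspond to the sub-tabs (for example `resources`, `variables`); that mapping and the names `diffConfig` and `countBySubTab` are illustrative assumptions, not existing code. Arrays are treated as leaf values and compared as a whole.

```typescript
type ConfigValue =
  | string | number | boolean | null
  | ConfigValue[]
  | { [key: string]: ConfigValue };
type ConfigObject = { [key: string]: ConfigValue };

interface FieldDiff {
  path: string; // dotted path, e.g. "resources.cpu"
  snapshot: ConfigValue | undefined;
  current: ConfigValue | undefined;
}

function isPlainObject(v: ConfigValue | undefined): v is ConfigObject {
  return typeof v === "object" && v !== null && !Array.isArray(v);
}

function unionKeys(a: ConfigObject, b: ConfigObject): string[] {
  const keys = Object.keys(a);
  for (const k of Object.keys(b)) {
    if (keys.indexOf(k) === -1) keys.push(k);
  }
  return keys;
}

function deepEqual(a: ConfigValue | undefined, b: ConfigValue | undefined): boolean {
  if (a === b) return true;
  if (Array.isArray(a) && Array.isArray(b)) {
    return a.length === b.length && a.every((item, i) => deepEqual(item, b[i]));
  }
  if (isPlainObject(a) && isPlainObject(b)) {
    return unionKeys(a, b).every((k) => deepEqual(a[k], b[k]));
  }
  return false;
}

// Recurses through objects and reports one entry per differing leaf field.
function diffConfig(
  snapshot: ConfigValue | undefined,
  current: ConfigValue | undefined,
  prefix: string = "",
): FieldDiff[] {
  if (isPlainObject(snapshot) && isPlainObject(current)) {
    let out: FieldDiff[] = [];
    for (const key of unionKeys(snapshot, current)) {
      const path = prefix ? `${prefix}.${key}` : key;
      out = out.concat(diffConfig(snapshot[key], current[key], path));
    }
    return out;
  }
  return deepEqual(snapshot, current) ? [] : [{ path: prefix, snapshot, current }];
}

// Change-count badge per sub-tab, keyed by the first path segment.
function countBySubTab(diffs: FieldDiff[]): { [subTab: string]: number } {
  const counts: { [subTab: string]: number } = {};
  for (const d of diffs) {
    const dot = d.path.indexOf(".");
    const tab = dot === -1 ? d.path : d.path.slice(0, dot);
    counts[tab] = (counts[tab] || 0) + 1;
  }
  return counts;
}
```

A sub-tab that is missing from `countBySubTab`'s result has zero differences and would render the muted "No differences in this section" message.
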
**Footer:** Sticky. Single primary button "Restore this checkpoint" + helper text "Restoring hydrates the form — you'll still need to Redeploy."

When JAR is pruned: button disabled with tooltip "JAR was pruned by the environment retention policy".

Restore behavior is unchanged from today: closes the drawer + hydrates the form via the existing `onRestore(deploymentId)` callback. No backend call; the eventual Redeploy generates the next `deploy_app` audit row.

## Authorization

`DeploymentController` and `AppController` are already annotated at class level with `@PreAuthorize("hasAnyRole('OPERATOR', 'ADMIN')")`, so the deployment page is operator-gated. The new `instanceIds` filter on `LogQueryController` (which is VIEWER+) widens nothing: viewers can already query the same logs by `application + environment`, and the filter only narrows that result set.

## Real-time updates

When a new deployment lands, the previous "current" becomes a checkpoint. TanStack Query already polls deployments via the existing `useDeployments(appSlug, envSlug)` hook; the new table consumes the same data — auto-refresh comes for free.

## Tests

**Backend integration tests:**

| Test | What it asserts |
|---|---|
| `V2MigrationIT` | `created_by` column exists, FK valid, index exists |
| `DeploymentServiceCreatedByIT` | `createDeployment(...createdBy)` persists the value |
| `DeploymentControllerAuditIT` | All three lifecycle actions write the expected audit row (action, category, target, details, actor, result) including FAILURE branches |
| `LogQueryControllerInstanceIdsFilterIT` | `?instanceIds=a,b,c` returns only matching rows; empty/missing param preserves prior behavior |

**UI component tests:**

| Test | What it asserts |
|---|---|
| `SideDrawer.test.tsx` | open/close, ESC closes, backdrop click closes, focus trap |
| `CheckpointsTable.test.tsx` | row click opens drawer; pruned-JAR row dimmed + clickable; empty state |
| `CheckpointDetailDrawer.test.tsx` | renders correct logs (mocked instance_id list); Restore disabled when JAR pruned |
| `ConfigPanel.test.tsx` | snapshot mode renders all fields read-only; diff mode counts differences correctly per sub-tab; "no differences" message when section unchanged; toggle hidden when JAR pruned |

## Files touched

**Backend:**
- New: `cameleer-server-app/src/main/resources/db/migration/V2__add_deployment_created_by.sql`
- Modified: `cameleer-server-core/src/main/java/com/cameleer/server/core/admin/AuditCategory.java` (add `DEPLOYMENT`)
- Modified: `cameleer-server-core/src/main/java/com/cameleer/server/core/runtime/Deployment.java` (record field)
- Modified: `cameleer-server-core/src/main/java/com/cameleer/server/core/runtime/DeploymentService.java` (signature + impl)
- Modified: `cameleer-server-app/src/main/java/com/cameleer/server/app/storage/PostgresDeploymentRepository.java` (insert + map)
- Modified: `cameleer-server-app/src/main/java/com/cameleer/server/app/controller/DeploymentController.java` (audit calls + createdBy resolution)
- Modified: `cameleer-server-app/src/main/java/com/cameleer/server/app/controller/LogQueryController.java` (instanceIds param)
- Modified: `cameleer-server-core/src/main/java/com/cameleer/server/core/search/LogSearchRequest.java` (instanceIds field)
- Regenerate: `cameleer-server-app/src/main/resources/openapi.json` (controller change → SPA types)

**UI:**
- New: `ui/src/components/SideDrawer.tsx` + `SideDrawer.module.css`
- New: `ui/src/pages/AppsTab/AppDeploymentPage/CheckpointsTable.tsx`
- New: `ui/src/pages/AppsTab/AppDeploymentPage/CheckpointDetailDrawer/{index,LogsPanel,ConfigPanel}.tsx` (Compare is a view-mode inside ConfigPanel, not a separate file)
- Modified: `ui/src/pages/AppsTab/AppDeploymentPage/IdentitySection.tsx` (swap Checkpoints → CheckpointsTable)
- Deleted: `ui/src/pages/AppsTab/AppDeploymentPage/Checkpoints.tsx`
- Modified: `ui/src/pages/AppsTab/AppDeploymentPage/ConfigTabs/{Monitoring,Resources,Variables,SensitiveKeys,Deployment}Tab.tsx` (add `readOnly?` prop)
- Modified: `ui/src/api/queries/logs.ts` (`useInfiniteApplicationLogs` accepts `instanceIds`)
- Modified: `ui/src/api/queries/admin/apps.ts` (`Deployment.createdBy` field)
- Modified: `ui/src/api/schema.d.ts` + `ui/src/api/openapi.json` (regenerated)
- Modified: `ui/src/pages/Admin/AuditLogPage.tsx` (one new category in filter dropdown)

**Docs / rules:**
- Modified: `.claude/rules/app-classes.md` (DeploymentController audit calls + LogQueryController instanceIds param)
- Modified: `.claude/rules/ui.md` (CheckpointsTable + SideDrawer pattern)
- Modified: `.claude/rules/core-classes.md` (`AuditCategory.DEPLOYMENT`, `Deployment.createdBy`)

## Rollout

Two phases, ideally two PRs:

1. **Backend phase** — V2 migration, `AuditCategory.DEPLOYMENT`, audit calls in `DeploymentController`, `created_by` plumbing through `DeploymentService` / record / repository, `LogQueryController` `instanceIds` param. Ships independently because the column is nullable, the audit category is picked up automatically, and the new log filter is opt-in.
2. **UI phase** — `SideDrawer`, `CheckpointsTable`, `CheckpointDetailDrawer`, `readOnly?` props on the five config sub-tabs, audit-page dropdown entry. Depends on the backend PR being merged + the OpenAPI schema regenerated.

Splitting in this order means production gets the audit trail and `created_by` capture immediately, even before the new UI lands, so the audit gap is closed as quickly as possible.