fix(deploy): stop previous active deployment before START_REPLICAS (fixes 409)

Container names are deterministic: {tenant}-{envSlug}-{appSlug}-{replica}.
The prior code did the stop-existing step at SWAP_TRAFFIC, *after*
START_REPLICAS had already tried to create containers with the same
names — so a redeploy against a RUNNING app consistently failed with
Docker 409 "container name already in use".

Move the stop-existing block to run right after CREATE_NETWORK and
before START_REPLICAS. SWAP_TRAFFIC becomes a label-only marker (traffic
is swapped implicitly by Traefik labels once new replicas are healthy).

Also: add `findActiveByAppIdAndEnvironmentIdExcluding` so the SQL
excludes the current deployment by id — previously the Java-side
`!id.equals(me)` guard failed because the newly-inserted row has
status=STARTING (DB default) and ORDER BY created_at DESC LIMIT 1
picked the new row, hiding the actual previous deployment.

Trade-off: this is destroy-then-start rather than true blue/green —
brief downtime during the swap. Matches the pre-unified-page behavior
and is what users reasonably expect. True blue/green would require
per-deployment container names.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This commit is contained in:
hsiegeln
2026-04-23 01:01:00 +02:00
parent 9ecc9ee72a
commit f8dccaae2b
3 changed files with 30 additions and 8 deletions

View File

@@ -9,6 +9,7 @@ public interface DeploymentRepository {
List<Deployment> findByEnvironmentId(UUID environmentId);
Optional<Deployment> findById(UUID id);
Optional<Deployment> findActiveByAppIdAndEnvironmentId(UUID appId, UUID environmentId);
Optional<Deployment> findActiveByAppIdAndEnvironmentIdExcluding(UUID appId, UUID environmentId, UUID excludeDeploymentId);
UUID create(UUID appId, UUID appVersionId, UUID environmentId, String containerName);
void updateStatus(UUID id, DeploymentStatus status, String containerId, String errorMessage);
void markDeployed(UUID id);