Add K8s liveness/readiness probes for server and ClickHouse #35

claude · 2026-03-12T21:23:23+01:00

claude commented

2026-03-12 21:23:23 +01:00

Problem

Neither the server Deployment nor the ClickHouse StatefulSet has liveness or readiness probes configured. If either process hangs or becomes unresponsive, K8s cannot detect it and will not restart the pod.

Solution

Server (`deploy/server.yaml`)

- Readiness probe: `GET /actuator/health` (Spring Boot Actuator, likely already on the classpath), or implement `GET /api/v1/health` (issue #30)
- Liveness probe: same endpoint with longer thresholds
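
For illustration, the server probes might look roughly like the fragment below. This is a sketch only: it assumes the Actuator variant, Spring Boot's default port 8080, and placeholder timing values, none of which are confirmed for `deploy/server.yaml`.

```yaml
# Fragment of the server container spec (sketch, not the actual manifest).
# The port and all timing values are assumptions.
readinessProbe:
  httpGet:
    path: /actuator/health
    port: 8080            # assumed: Spring Boot default port
  periodSeconds: 10
  failureThreshold: 3     # marked unready after about 30s of failed checks
livenessProbe:
  httpGet:
    path: /actuator/health
    port: 8080
  periodSeconds: 10
  failureThreshold: 6     # "longer thresholds": restarted only after about 60s
```

Giving liveness a higher `failureThreshold` than readiness means a slow pod is first taken out of rotation and only restarted if it stays unhealthy.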

ClickHouse (`deploy/clickhouse.yaml`)

- Readiness probe: `GET /ping` on port 8123 (built-in ClickHouse health endpoint, returns "Ok.\n")
- Liveness probe: same endpoint
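
A minimal sketch of the ClickHouse side, assuming the HTTP interface listens on its default port 8123 inside the pod. The timing values are placeholders, not taken from `deploy/clickhouse.yaml`.

```yaml
# Fragment of the ClickHouse container spec (sketch).
# /ping is served by the ClickHouse HTTP interface; timings are assumptions.
readinessProbe:
  httpGet:
    path: /ping
    port: 8123
  periodSeconds: 5
livenessProbe:
  httpGet:
    path: /ping
    port: 8123
  periodSeconds: 10
  failureThreshold: 3
```

An `httpGet` probe passes on any response status from 200 up to (but not including) 400, so the "Ok.\n" body does not need to be matched explicitly.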

Impact

Auto-recovery from hangs/crashes without manual intervention.

claude commented

2026-03-12 22:09:59 +01:00

Note: ClickHouse is now exposed externally via NodePort (30123 HTTP, 30900 native). Liveness/readiness probes are even more important now that the server depends on ClickHouse being ready before it can initialize its schema. Without a readiness probe on ClickHouse, the server may start before ClickHouse is ready, causing schema init to fail.

claude commented

2026-03-13 10:57:50 +01:00

Implemented: Added liveness and readiness probes to both deployments.

- **Server**: `httpGet` on `/api/v1/health`, port 8081 (initial delay: 30s for liveness, 10s for readiness)
- **ClickHouse**: `httpGet` on `/ping`, port 8123 (initial delay: 15s for liveness, 5s for readiness)
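
Based only on the values reported in this comment, the probe blocks would take roughly the following shape. Fields not mentioned here are left out and would fall back to Kubernetes defaults. The manifests in the referenced commit are authoritative, and this reconstruction may differ from them.

```yaml
# Server container (deploy/server.yaml), reconstructed from the comment above
livenessProbe:
  httpGet:
    path: /api/v1/health
    port: 8081
  initialDelaySeconds: 30
readinessProbe:
  httpGet:
    path: /api/v1/health
    port: 8081
  initialDelaySeconds: 10
---
# ClickHouse container (deploy/clickhouse.yaml), reconstructed likewise
livenessProbe:
  httpGet:
    path: /ping
    port: 8123
  initialDelaySeconds: 15
readinessProbe:
  httpGet:
    path: /ping
    port: 8123
  initialDelaySeconds: 5
```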

claude closed this issue

2026-03-13 10:57:55 +01:00

claude referenced this issue from a commit

2026-03-13 10:59:20 +01:00

Move ClickHouse credentials to K8s Secret and add health probes


Reference: cameleer/cameleer-server#35
