Add K8s liveness/readiness probes for server and ClickHouse #35

Closed
opened 2026-03-12 21:23:23 +01:00 by claude · 2 comments
Owner

Problem

Neither the server Deployment nor the ClickHouse StatefulSet has liveness or readiness probes configured. If either process hangs or becomes unresponsive, K8s won't detect it or restart the container automatically.

Solution

Server (deploy/server.yaml)

  • Readiness probe: GET /actuator/health (Spring Boot Actuator, likely already on classpath) or implement GET /api/v1/health (issue #30)
  • Liveness probe: same endpoint with longer thresholds
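A minimal sketch of what the server probes could look like. The path, port, and initial delays are taken from the implementation comment later in this thread. The container name, `periodSeconds`, and `failureThreshold` values are illustrative assumptions, not values from the actual manifest.

```yaml
# Fragment of deploy/server.yaml (Deployment); only the probe-related fields are shown.
spec:
  template:
    spec:
      containers:
        - name: server                # hypothetical container name
          readinessProbe:
            httpGet:
              path: /api/v1/health
              port: 8081
            initialDelaySeconds: 10
            periodSeconds: 10         # assumed
            failureThreshold: 3       # assumed
          livenessProbe:
            httpGet:
              path: /api/v1/health
              port: 8081
            initialDelaySeconds: 30
            periodSeconds: 20         # assumed; longer than readiness
            failureThreshold: 5       # assumed; tolerate more misses before a restart
```

Giving the liveness probe the longer delay and higher failure threshold means a slow pod is first taken out of rotation by the readiness probe and is only restarted if it stays unhealthy.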

ClickHouse (deploy/clickhouse.yaml)

  • Readiness probe: GET /ping on port 8123 (built-in ClickHouse health endpoint, returns "Ok.\n")
  • Liveness probe: same endpoint
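The ClickHouse side could be sketched the same way. The `/ping` path, port 8123, and the initial delays come from this thread. The container name and the period/threshold values are assumptions for illustration.

```yaml
# Fragment of deploy/clickhouse.yaml (StatefulSet); only the probe-related fields are shown.
spec:
  template:
    spec:
      containers:
        - name: clickhouse            # hypothetical container name
          readinessProbe:
            httpGet:
              path: /ping             # built-in health endpoint, answers "Ok.\n"
              port: 8123
            initialDelaySeconds: 5
            periodSeconds: 5          # assumed
          livenessProbe:
            httpGet:
              path: /ping
              port: 8123
            initialDelaySeconds: 15
            periodSeconds: 10         # assumed
            failureThreshold: 3       # assumed
```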

Impact

Auto-recovery from hangs/crashes without manual intervention.

Author
Owner

Note: ClickHouse is now exposed externally via NodePort (30123 HTTP, 30900 native). Liveness/readiness probes are even more important now that the server depends on ClickHouse being ready before it can initialize its schema. Without a readiness probe on ClickHouse, the server may start before ClickHouse is ready, causing schema init to fail.
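A readiness probe on ClickHouse only keeps the pod out of its Service endpoints until the probe passes. It does not by itself delay the start of the server pod. If the server does not retry schema initialization, one option is an init container that blocks until `/ping` answers. The sketch below assumes a Service named `clickhouse` in the same namespace and a `busybox` image; both are hypothetical and not taken from the actual manifests.

```yaml
# Possible addition to deploy/server.yaml: wait for ClickHouse before the server starts.
spec:
  template:
    spec:
      initContainers:
        - name: wait-for-clickhouse   # hypothetical name
          image: busybox:1.36         # assumed image
          command:
            - sh
            - -c
            - |
              # Poll the ClickHouse HTTP health endpoint until it responds successfully.
              until wget -q -O /dev/null http://clickhouse:8123/ping; do
                echo "waiting for ClickHouse"
                sleep 2
              done
```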

Author
Owner

Implemented: Added liveness and readiness probes to both deployments.

  • Server: httpGet /api/v1/health on port 8081 (initial delay: 30s liveness, 10s readiness)
  • ClickHouse: httpGet /ping on port 8123 (initial delay: 15s liveness, 5s readiness)