perf: batch processor and log inserts to reduce ClickHouse part creation

Diagnostics showed ~3,200 tiny inserts per 5 minutes:
- processor_executions: 2,376 inserts (14 rows avg) — one per chunk
- logs: 803 inserts (5 rows avg) — synchronous in HTTP handler

Fix 1: Consolidate processor inserts — new insertProcessorBatches() method
flattens all ProcessorBatch records into a single INSERT per flush cycle.
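
A minimal sketch of what such a consolidation might look like. The commit does not show the body of insertProcessorBatches(), so the record shapes, the column names, and the use of JDBC addBatch()/executeBatch() are assumptions made for illustration only. The sketch also assumes the driver sends one JDBC batch as a single INSERT.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical shapes: the real ProcessorBatch fields are not shown in this commit.
record ProcessorRow(String executionId, String processorId, long durationMs) {}
record ProcessorBatch(List<ProcessorRow> rows) {}

final class ProcessorInsertSketch {

    // Flatten every buffered batch into one list of rows, preserving order.
    static List<ProcessorRow> flatten(List<ProcessorBatch> batches) {
        List<ProcessorRow> all = new ArrayList<>();
        for (ProcessorBatch batch : batches) {
            all.addAll(batch.rows());
        }
        return all;
    }

    // Issue one JDBC batch per flush cycle instead of one statement per chunk.
    // Column names are illustrative assumptions, not the real schema.
    static int insertProcessorBatches(Connection conn, List<ProcessorBatch> batches)
            throws SQLException {
        List<ProcessorRow> rows = flatten(batches);
        if (rows.isEmpty()) {
            return 0; // nothing buffered, skip the round trip entirely
        }
        String sql = "INSERT INTO processor_executions "
                + "(execution_id, processor_id, duration_ms) VALUES (?, ?, ?)";
        try (PreparedStatement ps = conn.prepareStatement(sql)) {
            for (ProcessorRow row : rows) {
                ps.setString(1, row.executionId());
                ps.setString(2, row.processorId());
                ps.setLong(3, row.durationMs());
                ps.addBatch();
            }
            ps.executeBatch();
        }
        return rows.size();
    }
}
```

With this shape, the number of INSERT statements depends on the flush interval rather than on the number of chunks received.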

Fix 2: Buffer log inserts — route through WriteBuffer<BufferedLogEntry>,
flushed on the same 5s interval as executions. LogIngestionController now
pushes to buffer instead of inserting directly.
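
The buffering pattern described above can be sketched roughly as follows. This is not the project's WriteBuffer class, whose API is not shown in this commit. WriteBufferSketch, its method names, and the bounded-queue design are hypothetical stand-ins that only illustrate pushing from a request handler and draining on a timer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

// Hypothetical stand-in for the project's WriteBuffer<T>.
final class WriteBufferSketch<T> {
    private final BlockingQueue<T> queue;

    WriteBufferSketch(int capacity) {
        this.queue = new ArrayBlockingQueue<>(capacity);
    }

    // Called from the HTTP handler: never blocks, returns false when full.
    boolean push(T item) {
        return queue.offer(item);
    }

    // Remove and return everything currently buffered.
    List<T> drain() {
        List<T> out = new ArrayList<>();
        queue.drainTo(out);
        return out;
    }

    // Flush on a fixed interval; the sink would perform one INSERT per batch.
    ScheduledExecutorService startFlusher(long intervalSeconds, Consumer<List<T>> sink) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        scheduler.scheduleWithFixedDelay(() -> {
            try {
                List<T> batch = drain();
                if (!batch.isEmpty()) {
                    sink.accept(batch);
                }
            } catch (RuntimeException e) {
                // An uncaught exception would cancel all future runs, so swallow
                // it here. In this sketch the failed batch is simply dropped.
            }
        }, intervalSeconds, intervalSeconds, TimeUnit.SECONDS);
        return scheduler;
    }
}
```

A controller would call push(entry) and return immediately, while startFlusher(5, batch -> ...) performs the actual insert. A bounded queue trades possible drops under overload for a fixed memory ceiling; whether the real buffer makes the same choice is not stated in the commit.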

Also reverts async_insert config (doesn't work with JDBC inline VALUES).

Expected: ~3,200 inserts/5min → ~160 (20x reduction in part creation,
MV triggers, and background merge work).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
hsiegeln
2026-04-03 22:48:04 +02:00
parent e0aac4bf0a
commit 633a61d89d
8 changed files with 148 additions and 30 deletions

@@ -182,10 +182,6 @@ data:
 <max_block_size>8192</max_block_size>
 <queue_max_wait_ms>1000</queue_max_wait_ms>
 <max_execution_time>600</max_execution_time>
-<!-- Buffer small inserts server-side before creating parts -->
-<async_insert>1</async_insert>
-<wait_for_async_insert>0</wait_for_async_insert>
-<async_insert_busy_timeout_ms>5000</async_insert_busy_timeout_ms>
 <!-- Disable parallel parse/format to reduce per-query memory -->
 <input_format_parallel_parsing>0</input_format_parallel_parsing>
 <output_format_parallel_formatting>0</output_format_parallel_formatting>