Documentation Index
Fetch the complete documentation index at: https://private-7c7dfe99-page-updates.mintlify.app/llms.txt
Use this file to discover all available pages before exploring further.
Avoid OPTIMIZE FINAL
ClickHouse tables using the MergeTree engine store data on disk as immutable parts, which are created every time data is inserted.
Each insert creates a new part containing sorted, compressed column files, along with metadata like indexes and checksums. For a detailed description of part structures and how they’re formed we recommend this guide.
Over time, background processes merge smaller parts into larger ones to reduce fragmentation and improve query performance.
While it’s tempting to manually trigger this merge using:
OPTIMIZE FINAL operation in most cases as it initiates
resource intensive operations which may impact cluster performance.
OPTIMIZE FINAL vs FINAL
OPTIMIZE FINAL isn’t the same as FINAL, which is sometimes necessary to use
to get results without duplicates, such as with the ReplacingMergeTree. Generally,
FINAL is okay to use if your queries are filtering on the same columns as those
in your primary key.Why avoid?
It’s expensive
RunningOPTIMIZE FINAL forces ClickHouse to merge all active parts into a single part, even if large merges have already occurred. This involves:
- Decompressing all parts
- Merging the data
- Compressing it again
- Writing the final part to disk or object storage
It ignores safety limits
Normally, ClickHouse avoids merging parts larger than ~150 GB (configurable via max_bytes_to_merge_at_max_space_in_pool). ButOPTIMIZE FINAL ignores this safeguard, which means:
- It may try to merge multiple 150 GB parts into one massive part
- This could result in long merge times, memory pressure, or even out-of-memory errors
- These large parts may become challenging to merge, i.e. attempts to merge them further fails for the reasons stated above. In cases where merges are required for correct query time behavior, this can result in undesired consequences such as duplicates accumulating for a ReplacingMergeTree, diminishing query time performance.