Documentation Index
Fetch the complete documentation index at: https://private-7c7dfe99-page-updates.mintlify.app/llms.txt
Use this file to discover all available pages before exploring further.
The query condition cache only works when enable_analyzer is set to true, which is the default value.
WHERE col = 'xyz') on the same data will always return the same results.
More specifically, the query condition cache remembers for each evaluated filter and each granule (= a block of 8192 rows by default) if no row in the granule satisfy the filter condition.
The information is recorded as a single bit: a 0 bit represents that no row matches the filter whereas a 1 bit means that at least one matching row exists.
In the former case, ClickHouse may skip the corresponding granule during filter evaluation, in the latter case, the granule must be loaded and evaluated.
The query condition cache is effective if three prerequisites are fulfilled:
- First, the workload must evaluate the same filter conditions repeatedly. This happens naturally if a query is repeated multiple times but it can also happen if two queries share the same filters, e.g.
SELECT product FROM products WHERE quality > 3andSELECT vendor, count() FROM products WHERE quality > 3. - Second, the majority of the data is immutable, i.e., does not change between queries. This is generally the case in ClickHouse as parts are immutable and created only by INSERTs.
- Third, filters are selective, i.e. only relatively few rows satisfy the filter condition. The fewer rows match the filter condition, the more granules will be recorded with bit 0 (no matching rows), and the more data can be “pruned” from subsequent filter evaluations.
Memory consumption
Since the query condition cache stores only a single bit per filter condition and granule, it consumes only little memory. The maximum size of the query condition cache can be configured using server settingsquery_condition_cache_size (default: 100 MB).
A cache size of 100 MB corresponds to 100 * 1024 * 1024 * 8 = 838,860,800 entries.
Since each entry represents a mark (8192 rows by default), the cache can cover up to 6,871,947,673,600 (6.8 trillion) rows of a single column.
In practice, filter are evaluated on more than one column, so that number needs to be divided by the number of filtered columns.
Configuration settings and usage
Setting use_query_condition_cache controls whether a specific query or all queries of the current session should utilize the query condition cache. For example, the first execution of queryuse_query_condition_cache = true, will utilize the query condition cache to scan less data.
Administration
The query condition cache is not retained between restarts of ClickHouse. To clear the query condition cache, runSYSTEM CLEAR QUERY CONDITION CACHE.
The content of the cache is displayed in system table system.query_condition_cache.
To calculate the current size of the query condition cache in MB, run SELECT formatReadableSize(sum(entry_size)) FROM system.query_condition_cache.
If you like to investigate individual filter conditions, you can check field condition in system.query_condition_cache. Note that this field is only available in debug builds.
The number of query condition cache hits and misses since database start are shown as events “QueryConditionCacheHits” and “QueryConditionCacheMisses” in system table system.events.
Both counters are only updated for SELECT queries which run with setting use_query_condition_cache = true, other queries do not affect “QueryCacheMisses”.