The query cache allows to computeDocumentation Index
Fetch the complete documentation index at: https://private-7c7dfe99-page-updates.mintlify.app/llms.txt
Use this file to discover all available pages before exploring further.
SELECT queries just once and to serve further executions of the same query directly from the cache.
Depending on the type of the queries, this can dramatically reduce latency and resource consumption of the ClickHouse server.
Background, design and limitations
Query caches can generally be viewed as transactionally consistent or inconsistent.- In transactionally consistent caches, the database invalidates (discards) cached query results if the result of the
SELECTquery changes or potentially changes. In ClickHouse, operations which change the data include inserts/updates/deletes in/of/from tables or collapsing merges. Transactionally consistent caching is especially suitable for OLTP databases, for example MySQL (which removed query cache after v8.0) and Oracle. - In transactionally inconsistent caches, slight inaccuracies in query results are accepted under the assumption that all cache entries are
assigned a validity period after which they expire (e.g. 1 minute) and that the underlying data changes only little during this period.
This approach is overall more suitable for OLAP databases. As an example where transactionally inconsistent caching is sufficient,
consider an hourly sales report in a reporting tool which is simultaneously accessed by multiple users. Sales data changes typically
slowly enough that the database only needs to compute the report once (represented by the first
SELECTquery). Further queries can be served directly from the query cache. In this example, a reasonable validity period could be 30 min.
Configuration settings and usage
In ClickHouse Cloud, you must use query level settings to edit query cache settings. Editing config level settings is currently not supported.
clickhouse-local runs a single query at a time. Since query result caching does not make sense, the query
result cache is disabled in clickhouse-local.
use_query_cache = true) will
read the computed result from the cache and return it immediately.
Setting
use_query_cache and all other query-cache-related settings only take an effect on stand-alone SELECT statements. In particular,
the results of SELECTs to views created by CREATE VIEW AS SELECT [...] SETTINGS use_query_cache = true are not cached unless the SELECT
statement runs with SETTINGS use_query_cache = true.true by default). The former setting
controls whether query results are stored in the cache, whereas the latter setting determines if the database should try to retrieve query
results from the cache. For example, the following query will use the cache only passively, i.e. attempt to read from it but not store its
result in it:
use_query_cache, enable_writes_to_query_cache and
enable_reads_from_query_cache only with specific queries. It is also possible to enable caching at user or profile level (e.g. via SET use_query_cache = true) but one should keep in mind that all SELECT queries may return cached results then.
The query cache can be cleared using statement SYSTEM CLEAR QUERY CACHE. The content of the query cache is displayed in system table
system.query_cache. The number of query cache hits and misses since database start are shown as events
“QueryCacheHits” and “QueryCacheMisses” in system table system.events. Both counters are only updated for
SELECT queries which run with setting use_query_cache = true, other queries do not affect “QueryCacheMisses”. Field query_cache_usage
in system table system.query_log shows for each executed query whether the query result was written into or
read from the query cache. Metrics QueryCacheEntries and QueryCacheBytes in system table
system.metrics show how many entries / bytes the query cache currently contains.
The query cache exists once per ClickHouse server process. However, cache results are by default not shared between users. This can be
changed (see below) but doing so is not recommended for security reasons.
Query results are referenced in the query cache by the Abstract Syntax Tree (AST) of
their query. This means that caching is agnostic to upper/lowercase, for example SELECT 1 and select 1 are treated as the same query. To
make the matching more natural, all query-level settings related to the query cache and output formatting)
are removed from the AST.
If the query was aborted due to an exception or user cancellation, no entry is written into the query cache.
The size of the query cache in bytes, the maximum number of cache entries and the maximum size of individual cache entries (in bytes and in
records) can be configured using different server configuration options.
users.xml, then make both settings
readonly:
Age and Expires headers with the age (in seconds) and expiration timestamp of the
cached entry.
Entries in the query cache are compressed by default. This reduces the overall memory consumption at the cost of slower writes into / reads
from the query cache. To disable compression, use setting query_cache_compress_entries.
Sometimes it is useful to keep multiple results for the same query cached. This can be achieved using setting
query_cache_tag that acts as as a label (or namespace) for a query cache entries. The query cache
considers results of the same query with different tags different.
Example for creating three different query cache entries for the same query:
tag from the query cache, you can use statement SYSTEM CLEAR QUERY CACHE TAG 'tag'.
ClickHouse reads table data in blocks of max_block_size rows. Due to filtering, aggregation,
etc., result blocks are typically much smaller than ‘max_block_size’ but there are also cases where they are much bigger. Setting
query_cache_squash_partial_results (enabled by default) controls if result blocks
are squashed (if they are tiny) or split (if they are large) into blocks of ‘max_block_size’ size before insertion into the query result
cache. This reduces performance of writes into the query cache but improves compression rate of cache entries and provides more natural
block granularity when query results are later served from the query cache.
As a result, the query cache stores for each query multiple (partial)
result blocks. While this behavior is a good default, it can be suppressed using setting
query_cache_squash_partial_results.
Also, results of queries with non-deterministic functions are not cached by default. Such functions include
- functions for accessing dictionaries:
dictGet()etc. - user-defined functions without tag
<deterministic>true</deterministic>in their XML definition, - functions which return the current date or time:
now(),today(),yesterday()etc., - functions which return random values:
randomString(),fuzzBits()etc., - functions whose result depends on the size and order or the internal chunks used for query processing:
nowInBlock()etc.,rowNumberInBlock(),runningDifference(),blockSize()etc., - functions which depend on the environment:
currentUser(),queryID(),getMacro()etc.