v25.10 Changelog for Cloud - ClickHouse Documentation

Backward incompatible changes

Data format and schema changes

Changed default schema_inference_make_columns_nullable setting to respect column Nullable-ness information from Parquet/ORC/Arrow metadata, instead of making everything Nullable. No change for text formats. #71499 (Michael Kolupaev).

Query and function changes

Query result cache now ignores the log_comment setting, so that changing only the log_comment on a query no longer forces a cache miss. There is a small chance users intentionally segmented their cache by varying log_comment. This change alters that behavior and is therefore backward incompatible. Please use setting query_cache_tag for this purpose. #79878 (filimonov).
In previous versions, queries with table functions named the same way as the implementation functions for operators were formatted inconsistently. Closes #81601. Closes #81977. Closes #82834. Closes #82835. EXPLAIN SYNTAX queries will not format operators - the new behavior better reflects the purpose of explaining syntax. clickhouse-format, formatQuery, and similar will not format functions as operators if the query contained them in a functional form. #82825 (Alexey Milovidov).
Disable nonsensical binary operations with IPv4/IPv6: plus/minus of an IPv4/IPv6 with a non-integer type is disabled. Previously, operations with floating-point types were allowed, and some other types (such as DateTime) threw logical errors. #86336 (Raúl Marín).
Renamed functions searchAny and searchAll to hasAnyTokens and hasAllTokens for better consistency with existing function hasToken. #88109 (Robert Schulze).
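
The query result cache change above can be pictured as a change to how the cache key is derived. The following is a simplified Python model, not ClickHouse's implementation: the function name cache_key, the IGNORED_SETTINGS set, and the dictionary representation of settings are assumptions made for illustration. It shows that varying only log_comment no longer changes the key, while query_cache_tag still segments the cache.

```python
import hashlib
import json

# Settings assumed (for this model only) to be excluded from the cache key.
IGNORED_SETTINGS = {"log_comment"}


def cache_key(query: str, settings: dict) -> str:
    """Hypothetical cache key: query text plus the settings that still matter."""
    relevant = {k: v for k, v in settings.items() if k not in IGNORED_SETTINGS}
    payload = json.dumps({"query": query, "settings": relevant}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


q = "SELECT count() FROM hits"
a = cache_key(q, {"use_query_cache": 1, "log_comment": "dashboard A"})
b = cache_key(q, {"use_query_cache": 1, "log_comment": "dashboard B"})
c = cache_key(q, {"use_query_cache": 1, "query_cache_tag": "tenant-42"})

print(a == b)  # True: log_comment is ignored
print(a == c)  # False: query_cache_tag yields a separate cache entry
```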

Data type changes

Forbid using the Dynamic type in JOIN keys. It could lead to unexpected results when Dynamic type is compared to a non-Dynamic type. It’s better to cast a Dynamic column to the required type. #86358 (Pavel Kruglov).

Storage and index changes

Deprecate setting allow_dynamic_metadata_for_data_lakes. Now all Iceberg tables try to fetch an up-to-date table schema from storage before executing each query. #86366 (Daniil Ivanik).
The inverted text index was reworked from scratch to be scalable for datasets that don’t fit into RAM. #86485 (Anton Popov).
The storage_metadata_write_full_object_key server setting is turned on by default, and can no longer be turned off. #87335 (Sema Checherinda).
Remove cache_hits_threshold from the filesystem cache. cache_hits_threshold was added before the SLRU cache policy was added, and it is not necessary to support both. #88344 (Kseniia Sumarokova).

Settings and configuration changes

Decrease replicated_deduplication_window_seconds from one week to one hour in order to store fewer znodes in ZooKeeper when the insertion rate is low. #87414 (Sema Checherinda).
Rename setting query_plan_use_new_logical_join_step to query_plan_use_logical_join_step. #87679 (Vladimir Cherkasov).
The new syntax allows the tokenizer parameter to be more expressive. #87997 (Elmi Ahmadov).
Two slight changes to how the min_free_disk_ratio_to_perform_insert and min_free_disk_bytes_to_perform_insert settings work: 1. Use unreserved instead of available bytes to determine whether an insert should be rejected. This is probably not crucial if the reservations for background merges and mutations are small compared to the configured thresholds, but it seems more correct. 2. Don’t apply these settings to system tables, so that tables like query_log are still updated, which helps a lot with debugging. Data written to system tables is usually small compared to actual data, so they should be able to continue for much longer with a reasonable min_free_disk_ratio_to_perform_insert threshold. #88468 (c-end).
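
The two changes to the free-disk checks can be sketched as a small decision function. This is a hedged Python model rather than the server's code: the DiskState class, the should_reject_insert function, the definition of unreserved bytes as available minus reserved, and the strict "less than" comparisons are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class DiskState:
    total_bytes: int
    available_bytes: int  # free space reported by the filesystem
    reserved_bytes: int   # space reserved for background merges and mutations

    @property
    def unreserved_bytes(self) -> int:
        # Assumption: unreserved space is what remains after reservations.
        return max(self.available_bytes - self.reserved_bytes, 0)


def should_reject_insert(disk, database, min_free_bytes=0, min_free_ratio=0.0):
    """Return True if an insert should be rejected under this model."""
    # System tables (for example system.query_log) are exempt from the checks.
    if database == "system":
        return False
    free = disk.unreserved_bytes
    if min_free_bytes and free < min_free_bytes:
        return True
    if min_free_ratio and free / disk.total_bytes < min_free_ratio:
        return True
    return False


disk = DiskState(total_bytes=1000, available_bytes=150, reserved_bytes=80)
print(should_reject_insert(disk, "default", min_free_ratio=0.1))  # True
print(should_reject_insert(disk, "system", min_free_ratio=0.1))   # False
```

In this example the available space (15% of the disk) would pass a 10% threshold, but the unreserved space (7%) does not, which is the case the first change is meant to catch.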

Keeper changes

Enable async mode for Keeper’s internal replication. Keeper will preserve the same behavior as before with possible performance improvements. If you are updating from a version older than 23.9, you need to either update first to 23.9+ and then to 25.10+, or set keeper_server.coordination_settings.async_replication to 0 before the update and enable it after the update is done. #88515 (Antonio Andelic).

New features

Functions

Add naiveBayesClassifier function to classify text using Naive Bayes based on ngrams. #78700 (Nihal Z. Miaji).
Added function arrayExcept that subtracts one array as a set from another. #82368 (Joanna Hulboj).
New conv function for converting numbers between bases, currently supports bases from 2-36. #83058 (hp).
Added studentTTestOneSample aggregate function. #85436 (Dylan).
Added isValidASCII function to check if string contains only ASCII characters. Close #85377. #85786 (rajat mohan).
Added aggregate functions timeSeriesChangesToGrid and timeSeriesResetsToGrid. They behave similarly to timeSeriesRateToGrid, accepting parameters for start timestamp, end timestamp, step, and look-back window, as well as two arguments for the timestamps and values, but require at least 1 sample per window instead of 2. They calculate PromQL changes/resets, counting the number of times the sample value changes or decreases in the specified window for each timestamp in the time grid defined by the parameters. The return type is Array(Nullable(Float64)). #86010 (Stephen Chi).
Added aggregate function quantilePrometheusHistogram, which accepts the upper bounds and cumulative values of histogram buckets as arguments, and performs a linear interpolation between the upper and lower bounds of the bucket in which the quantile position is found. It behaves similarly to the PromQL histogram_quantile() function on classic histograms. #86294 (Stephen Chi).
Added optimized case-insensitive variants of startsWith and endsWith functions: startsWithCaseInsensitive, endsWithCaseInsensitive, startsWithCaseInsensitiveUTF8, and endsWithCaseInsensitiveUTF8. #87374 (Guang Zhao).
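
The interpolation described for quantilePrometheusHistogram can be illustrated with a short Python sketch of the classic-histogram quantile calculation. It is a model of the technique, not ClickHouse's or Prometheus's code: the function name histogram_quantile_sketch is hypothetical, and the sketch assumes sorted upper bounds ending in +inf and a positive lowest bound, so the first bucket starts at 0.

```python
import math


def histogram_quantile_sketch(q, upper_bounds, cumulative_counts):
    """Quantile of a classic histogram via linear interpolation in one bucket.

    upper_bounds must be sorted ascending and end with +inf;
    cumulative_counts[i] is the number of observations <= upper_bounds[i].
    """
    if not 0.0 <= q <= 1.0:
        raise ValueError("q must be in [0, 1]")
    if len(upper_bounds) < 2 or cumulative_counts[-1] == 0:
        return math.nan
    rank = q * cumulative_counts[-1]
    # First bucket whose cumulative count reaches the rank.
    idx = next(i for i, c in enumerate(cumulative_counts) if c >= rank)
    if math.isinf(upper_bounds[idx]):
        # Quantile falls into the +inf bucket: report the highest finite bound.
        return upper_bounds[idx - 1]
    lower = upper_bounds[idx - 1] if idx > 0 else 0.0
    below = cumulative_counts[idx - 1] if idx > 0 else 0
    in_bucket = cumulative_counts[idx] - below
    if in_bucket == 0:
        return math.nan
    # Linear interpolation between the bucket's lower and upper bound.
    return lower + (upper_bounds[idx] - lower) * (rank - below) / in_bucket


bounds = [0.1, 0.5, 1.0, math.inf]
counts = [10, 60, 90, 100]
print(histogram_quantile_sketch(0.5, bounds, counts))   # approximately 0.42
print(histogram_quantile_sketch(0.95, bounds, counts))  # 1.0 (falls in +inf bucket)
```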

System tables

Add a new system table database_replicas with information about database replicas. #83408 (Konstantin Morozov).
Adds a new system.aggregated_zookeeper_log table. The table contains statistics (e.g. number of operations, average latency, errors) of ZooKeeper operations grouped by session id, parent path and operation type, and periodically flushed to disk. #85102 (Miсhael Stetsyuk).
Add system table iceberg_metadata_log to retrieve Iceberg metadata files during SELECT statements. #86152 (scanhex12).
Add warnings for cpu and memory to system.warnings table. #86838 (Bharat Nallan).
System table for delta lake metadata files. #87263 (scanhex12).

Table engines and storage

Support table engine Alias. #76569 (RinChanNOW).
You can now use NATS JetStream to consume messages by specifying the new settings of nats_stream and nats_consumer for the NATS engine. #84799 (Dmitry Novikov).
Support Iceberg and Delta Lake tables with a disk configuration, which allows creating such tables on an existing disk. Add setting allowed_disks_for_table_engines, which specifies the disks that may be used for Iceberg. Example: CREATE TABLE test ENGINE = Iceberg('path/inside/disk') SETTINGS datalake_disk_name = '<some_user_disk>'; #86778 (scanhex12).
Add a new table setting min_level_for_wide_part that allows specifying the minimum level for a part to be created as a wide part. #88179 (Christoph Wurm).

Iceberg and data lakes

Add support for querying Apache Paimon in ClickHouse. This integration enables ClickHouse users to directly interact with Paimon’s data lake storage. #84423 (JIaQi).
ALTER UPDATE for Iceberg table engine. #86059 (scanhex12).

Indexes and statistics

New sparse_gram bloom filter index useful for finding long substrings. #79985 (scanhex12).
Added an ability to automatically create statistics on all suitable columns in MergeTree tables. Added table-level setting auto_statistics_types which stores comma-separated types of statistics to create (e.g. auto_statistics_types = 'minmax, uniq, countmin'). #87241 (Anton Popov).

SQL and query features

Added LIMIT BY ALL syntax support. Similar to GROUP BY ALL and ORDER BY ALL, LIMIT BY ALL automatically expands to use all non-aggregate expressions from the SELECT clause as LIMIT BY keys. For example, SELECT id, name, count(*) FROM table GROUP BY id LIMIT 1 BY ALL is equivalent to SELECT id, name, count(*) FROM table GROUP BY id LIMIT 1 BY id, name. This feature simplifies queries when you want to limit by all selected non-aggregate columns without explicitly listing them. Closes #59152. #84079 (Surya Kant Ranjan).
Treat a bare setting name in query setting as equal to 1 (e.g. SELECT ... SETTINGS use_query_cache is equivalent to use_query_cache = 1). #85800 (thraeka).
Allows users to create temporary views with the same syntax as temporary tables. #86432 (Aly Kafoury).
Add support for negative LIMIT and negative OFFSET. Closes #28913. #88411 (Nihal Z. Miaji).
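
The LIMIT BY ALL expansion described above can be modeled in a few lines of Python. This is an illustrative sketch, not the query analyzer's logic: the helper names expand_limit_by_all and limit_by, and the (name, is_aggregate) representation of the SELECT list, are assumptions.

```python
from collections import defaultdict


def expand_limit_by_all(select_list):
    """Return the keys that `LIMIT n BY ALL` stands for:
    every non-aggregate expression in the SELECT list, in order."""
    return [name for name, is_aggregate in select_list if not is_aggregate]


def limit_by(rows, n, keys):
    """Keep at most n rows for each distinct combination of the key columns."""
    seen = defaultdict(int)
    out = []
    for row in rows:
        group = tuple(row[k] for k in keys)
        if seen[group] < n:
            seen[group] += 1
            out.append(row)
    return out


# SELECT id, name, count(*) ... LIMIT 1 BY ALL
select_list = [("id", False), ("name", False), ("count(*)", True)]
keys = expand_limit_by_all(select_list)
rows = [
    {"id": 1, "name": "a", "count(*)": 3},
    {"id": 1, "name": "a", "count(*)": 5},
    {"id": 1, "name": "b", "count(*)": 2},
]
result = limit_by(rows, 1, keys)
print(keys)    # ['id', 'name']
print(result)  # the first and third rows
```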

Client and CLI features

Access ClickHouse Cloud instances using Cloud credentials with --login. #82753 (Krishna Mannem).
Add --semicolons_inline option to format queries so that semicolons are placed on the last line instead of on a new line. #88018 (Jan Rada).

Server configuration and workload management

New configuration options: logger.startupLevel and logger.shutdownLevel allow overriding the log level during the startup and shutdown of ClickHouse, respectively. #85967 (Lennard Eijsackers).
Adds a way to provide WORKLOAD and RESOURCE definitions in SQL using the server configuration “resources_and_workloads” section. #87430 (Sergei Trifonov).

System commands

Add SYSTEM RECONNECT ZOOKEEPER command to force zookeeper disconnect and reconnect (https://github.com/ClickHouse/ClickHouse/issues/87317). #87318 (Pradeep Chhetri).
Limit the number of named collections through setting max_named_collection_num_to_warn and max_named_collection_num_to_throw. Add new metric NamedCollection and error TOO_MANY_NAMED_COLLECTIONS. #87343 (Pablo Marcos).

Keeper

Add recursive variants of the cp and mv commands (cpr and mvr) in Keeper client. #88570 (Mikhail Artemenko).

Experimental features

Functions searchAll and searchAny now work on top of columns without text indexes. In those cases, they use the default tokenizer. #87722 (Jimmy Aguilar Mena).
Implement QBit data type that stores vectors in bit-sliced format and L2DistanceTransposed function that allows approximate vector search where precision-speed trade-off is controlled by a parameter. #87922 (Raufs Dunamalijevs).

Performance improvements

Query execution and optimization

Improved query performance by refactoring the order and integration of Query Condition Cache (QCC) with index analysis. QCC filtering is now applied before primary key and skip index analysis, reducing unnecessary index computation. Index analysis has been extended to support multiple range filters, and its filtering results are now stored back into the QCC. This significantly speeds up queries where index analysis dominates execution time—especially those relying on skip indexes (e.g. vector or inverted indexes). #82380 (Amos Bird).
A bunch of micro-optimizations to speed up small queries. #83096 (Raúl Marín).
Compress logs and profile events in the native protocol. On clusters with 100+ replicas, uncompressed profile events take 1..10 MB/sec, and the progress bar is sluggish on slow Internet connections. This closes #82533. #83586 (Alexey Milovidov).
Improve PREWHERE optimization for conditions like func(primary_column) = 'xx' and column in (xxx). #85529 (李扬).
Avoid full scan for system.tables with filter by uuid (Can be useful if you have only UUID from logs or zookeeper path). #88379 (Azat Khuzhin).

JOIN optimizations

Added logic for pushing down disjunctive JOIN predicates. Example: in TPC-H Q7, for a condition on 2 tables n1 and n2 like (n1.n_name = 'FRANCE' AND n2.n_name = 'GERMANY') OR (n1.n_name = 'GERMANY' AND n2.n_name = 'FRANCE'), we extract separate partial filters for each table: n1.n_name = 'FRANCE' OR n1.n_name = 'GERMANY' for n1 and n2.n_name = 'GERMANY' OR n2.n_name = 'FRANCE' for n2. #84735 (Yarik Briukhovetskyi).
Implemented rewriting of JOIN: 1. Convert LEFT ANY JOIN and RIGHT ANY JOIN to SEMI/ANTI JOIN if the filter condition is always false for matched or non-matched rows. This optimization is controlled by a new setting query_plan_convert_any_join_to_semi_or_anti_join. 2. Convert FULL ALL JOIN to LEFT ALL or RIGHT ALL JOIN if the filter condition is always false for non-matched rows from one side. #86028 (Dmitry Novik).
HashJoin performance optimised slightly in the case of LEFT/RIGHT join having a lot of unmatched rows. #86312 (Nikita Taranov).
Join reordering now uses statistics. The feature can be enabled by setting allow_statistics_optimize = 1 and query_plan_optimize_join_order_limit = 10. #86822 (Han Fei).
Skip runtime hash table statistics recalculation during join optimization. Added new profile events JoinOptimizeMicroseconds and QueryPlanOptimizeMicroseconds. #87683 (Vladimir Cherkasov).
Inline AddedColumns::appendFromBlock for slightly better join performance in some cases. #88455 (Nikita Taranov).
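
The disjunction push-down described above for TPC-H Q7 can be sketched in Python. The sketch is a simplified model under stated assumptions, not the optimizer's implementation: the predicate is given in disjunctive normal form as lists of (table, column, value) equality atoms, and the helper names extract_partial_filters and render are hypothetical. The derived per-table filters are weaker than the original condition, so the original condition still has to be applied after the join.

```python
def extract_partial_filters(dnf):
    """For each table, derive a filter implied by the whole disjunction."""
    tables = {atom[0] for conj in dnf for atom in conj}
    partial = {}
    for table in sorted(tables):
        per_disjunct = [[a for a in conj if a[0] == table] for conj in dnf]
        # A filter exists only if every disjunct constrains this table;
        # otherwise some disjunct would accept any row of the table.
        if all(per_disjunct):
            partial[table] = per_disjunct
    return partial


def render(filter_dnf):
    """Render a per-table filter as text, parenthesizing multi-atom conjunctions."""
    parts = []
    for conj in filter_dnf:
        text = " AND ".join(f"{t}.{c} = '{v}'" for t, c, v in conj)
        parts.append(f"({text})" if len(conj) > 1 else text)
    return " OR ".join(parts)


q7 = [
    [("n1", "n_name", "FRANCE"), ("n2", "n_name", "GERMANY")],
    [("n1", "n_name", "GERMANY"), ("n2", "n_name", "FRANCE")],
]
filters = extract_partial_filters(q7)
print(render(filters["n1"]))  # n1.n_name = 'FRANCE' OR n1.n_name = 'GERMANY'
print(render(filters["n2"]))  # n2.n_name = 'GERMANY' OR n2.n_name = 'FRANCE'
```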

String and function optimizations

Improve the performance of case sensitive string search (operations such as filtering, e.g. WHERE URL LIKE '%google%') by using the StringZilla library, using SIMD CPU instructions when available. #84161 (Raúl Marín).
Improves performance of LIKE with prefix or suffix by using the new default setting optimize_rewrite_like_perfect_affix. #85920 (Guang Zhao).
Improved performance of functions tokens, hasAllTokens, hasAnyTokens. #88416 (Anton Popov).

MergeTree and storage optimizations

Add optional .size subcolumn serialization for top-level String columns in MergeTree tables to improve compression and enable efficient subcolumn access. Introduce new MergeTree settings for serialization version control and expression optimization for empty strings. #82850 (Amos Bird).
Reduce memory allocation and memory copying when selecting from an AggregatingMergeTree table with FINAL when the table has columns of type SimpleAggregateFunction(anyLast). #84428 (Duc Canh Le).
Improved performance of vertical merges after executing a lightweight delete. #86169 (Anton Popov).
Improves performance of fast queries with lots of parts in table (by optimizing MarkRanges by using devector over deque). #86933 (Azat Khuzhin).
Improved performance of applying patch parts in join mode. #87094 (Anton Popov).
Enable saving marks in cache and avoid direct IO for the MergeTreeLazy reader. #87989 (Nikita Taranov).
A SELECT query with the FINAL clause on a ReplacingMergeTree table with the is_deleted column now executes faster because of improved parallelization from 2 existing optimizations: 1) the do_not_merge_across_partitions_select_final optimization for partitions of the table that have only a single part; 2) splitting the other selected ranges of the table into intersecting / non-intersecting ranges, so that only intersecting ranges have to pass through the FINAL merging transform. #88090 (Shankar Iyer).
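
The two FINAL optimizations mentioned above can be pictured with a coarse Python model. It is a sketch under simplifying assumptions, not the MergeTree reader: it works on whole parts described by a (name, partition, min_key, max_key) tuple instead of mark ranges, it assumes rows with the same key never span partitions, and the function name plan_final_read is hypothetical.

```python
from collections import defaultdict


def plan_final_read(parts):
    """parts: list of (name, partition, min_key, max_key).

    Returns (merge, skip): names of parts that must go through the FINAL
    merging step and names of parts that can be read without it.
    """
    by_partition = defaultdict(list)
    for part in parts:
        by_partition[part[1]].append(part)

    merge, skip = [], []
    for partition_parts in by_partition.values():
        if len(partition_parts) == 1:
            # A partition with a single part needs no cross-part merging.
            skip.append(partition_parts[0][0])
            continue
        for name, _, lo, hi in partition_parts:
            # Two key ranges intersect if each starts before the other ends.
            overlaps = any(
                other != name and lo <= o_hi and o_lo <= hi
                for other, _, o_lo, o_hi in partition_parts
            )
            (merge if overlaps else skip).append(name)
    return sorted(merge), sorted(skip)


parts = [
    ("p1", "2025-01", 1, 10),     # only part in its partition
    ("p2", "2025-02", 1, 50),     # intersects p3
    ("p3", "2025-02", 40, 90),    # intersects p2
    ("p4", "2025-02", 100, 120),  # intersects nothing
]
merge, skip = plan_final_read(parts)
print(merge)  # ['p2', 'p3']
print(skip)   # ['p1', 'p4']
```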

Aggregation and GROUP BY optimizations

Fix performance degradation caused by a large serialized key while grouping by multiple string/number columns. It is a follow-up of https://github.com/ClickHouse/ClickHouse/pull/83884 and closes https://github.com/ClickHouse/ClickHouse/pull/83884#issuecomment-3187972297. #85924 (李扬).
RadixSort: Help the compiler use SIMD and the CPU do better prefetching. Uses dynamic dispatch to use software prefetching with Intel CPUs only. Continues the work by @taiyang-li in https://github.com/ClickHouse/ClickHouse/pull/77029. #86378 (Raúl Marín).

Index and text search optimizations

Improved performance of building text index for documents that contain mostly non-frequent tokens. #87546 (Anton Popov).

Data lake optimizations

Read in order for Iceberg. #88454 (scanhex12).

Internal optimizations

Improvements to DB::SharedMutex. #87491 (Raúl Marín).
Speed up the common case of Field destructor. #87631 (Raúl Marín).
Reduce the impact of not using fail points. #88196 (Raúl Marín).

Improvements

Query optimization and execution

mannWhitneyUTest no longer throws an exception when both samples contain only identical values. Now returns a valid result, consistent with SciPy. This closes: #79814. #80009 (DeanNeaht).
Added experimental join order optimization that can automatically reorder JOINs for better performance (controlled by query_plan_optimize_join_order_limit setting). Note that the join order optimization currently has limited statistics support and primarily relies on row count estimates from storage engines; more sophisticated statistics collection and cardinality estimation will be added in future releases. If you encounter issues with JOIN queries after upgrading, you can temporarily disable the new implementation by setting SET query_plan_use_new_logical_join_step = 0 and report the issue for investigation.
Note about resolution of identifiers from the USING clause: changed resolving of the coalesced column from the OUTER JOIN ... USING clause to be more consistent. Previously, when selecting both the USING column and qualified columns (a, t1.a, t2.a) in an OUTER JOIN, the USING column would incorrectly be resolved to t1.a, showing 0/NULL for rows from the right table with no left match. Now identifiers from the USING clause are always resolved to the coalesced column, while qualified identifiers resolve to the non-coalesced columns, regardless of which other identifiers are present in the query. For example:
```sql
SELECT a, t1.a, t2.a FROM (SELECT 1 AS a WHERE 0) t1 FULL JOIN (SELECT 2 AS a) t2 USING (a)
-- Before: a=0, t1.a=0, t2.a=2 (incorrect: 'a' resolved to t1.a)
-- After: a=2, t1.a=0, t2.a=2 (correct: 'a' is coalesced)
```
#80848 (Vladimir Cherkasov).
Support filtering data parts using skip indexes during reading to reduce unnecessary index reads. Controlled by the new setting use_skip_indexes_on_data_read (disabled by default). This addresses #75774. This includes some common groundwork shared with #81021. #81526 (Amos Bird).
The rewrite transaction for disk object storage removes previous remote blobs if the metadata transaction is committed. #81787 (Sema Checherinda).
Make the S3 retry strategy configurable and allow S3 disk settings to be hot-reloaded when the config XML file changes. #82642 (RinChanNOW).
Fixed optimization pass for redundant equal expression when LowCardinality of the resulting type differs before and after optimization. #82651 (Yakov Olkhovskiy).
Special column may be used to indicate presence of part of oneof. #82885 (Ilya Golshtein).
Users are now given clearer instructions when incorrect settings are specified for the new Kafka table engine. #83701 (János Benjamin Antal).
When HTTP clients set the header X-ClickHouse-100-Continue: defer in addition to Expect: 100-continue, ClickHouse doesn’t send a 100 Continue response to the client until after quota validation passes, which avoids wasting network bandwidth on request bodies that would be thrown away anyway. This is relevant for INSERT queries where the query can be sent in the URL query string and the data is sent in the request body. Aborting a request without sending the full body prevents connection reuse with HTTP/1.1, but the additional latency introduced by opening new connections is usually insignificant compared to the total INSERT duration with large amounts of data. #84304 (c-end).
It’s no longer possible to specify time zones for the Time type. #84689 (Yarik Briukhovetskyi).
Client autocompletion is faster and more consistent by using system.completions rather than issuing multiple system-table queries. #84694 (|2ustam).
Simplified the logic for parsing Time[64] in best_effort format, which also avoids some bugs. #84730 (Yarik Briukhovetskyi).
Speed up some JOIN queries by building a bloom filter from the right subtree at runtime and passing this filter to the scan in the left subtree. This can be beneficial for queries like SELECT avg(o_totalprice) FROM orders, customer, nation WHERE c_custkey = o_custkey AND c_nationkey=n_nationkey AND n_name = 'FRANCE'. #84772 (Alexander Gololobov).
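
The runtime bloom filter idea above can be illustrated with a small Python sketch. This is a toy model of the technique, not ClickHouse's implementation: the BloomFilter class, its size and hash choices, and the sample data (a customer table already joined with nation names) are assumptions made for illustration.

```python
import hashlib


class BloomFilter:
    """Minimal bloom filter: no false negatives, occasional false positives."""

    def __init__(self, size_bits=1024, num_hashes=3):
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = 0  # Python int used as a bit array

    def _positions(self, key):
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, key):
        for pos in self._positions(key):
            self.bits |= 1 << pos

    def might_contain(self, key):
        return all((self.bits >> pos) & 1 for pos in self._positions(key))


customers = [(1, "FRANCE"), (2, "GERMANY"), (3, "FRANCE")]  # (c_custkey, nation)
orders = [(1, 100.0), (2, 250.0), (3, 300.0), (4, 80.0)]    # (o_custkey, o_totalprice)

# Build side: keys of the right subtree that survive its own filter.
build_keys = {custkey for custkey, nation in customers if nation == "FRANCE"}
bloom = BloomFilter()
for key in build_keys:
    bloom.add(key)

# Probe side: the filter discards most non-matching rows during the scan...
candidates = [row for row in orders if bloom.might_contain(row[0])]
# ...and the join itself removes any remaining false positives.
joined = [price for custkey, price in candidates if custkey in build_keys]
avg_price = sum(joined) / len(joined)
print(avg_price)  # 200.0
```

Because a bloom filter never produces false negatives, the pre-filtering step cannot change the query result; it only reduces the number of rows that reach the join.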
You can use query parameters after TO when creating a materialized view, for example: CREATE MATERIALIZED VIEW mv TO {to_table:Identifier} AS SELECT * FROM src_table. #84899 (Diskein).
Mask S3 credentials in logs when using DATABASE ENGINE = Backup with S3 storage. #85336 (Kenny Sun).
Update jemalloc to newer version. Improve allocation profiling based on jemalloc’s internal tooling. Global jemalloc profiler can now be enabled with config jemalloc_enable_global_profiler. Sampled global allocations and deallocations can be stored in system.trace_log under JemallocSample type by enabling config jemalloc_collect_global_profile_samples_in_trace_log. Jemalloc profiling can now be enabled for each query independently using setting jemalloc_enable_profiler. Storing samples in system.trace_log can be controlled per query using setting jemalloc_collect_profile_samples_in_trace_log. #85438 (Antonio Andelic).
Added deltaLakeAzureCluster function (similar to deltaLakeAzure for cluster) and deltaLakeS3Cluster (alias to deltaLakeCluster) function. Resolves #85358. #85547 (Smita Kulkarni).
Rename InterpreterSystemQuery::dropReplicaImpl to InterpreterSystemQuery::dropStorageReplica. In InterpreterSystemQuery::dropDatabaseReplica, when dropping with a database or dropping the whole replica, it also drops the replica for each table of the database: if ‘WITH TABLES’ is provided, drop the replica for each storage; otherwise, the logic is unchanged and only DatabaseReplicated::dropReplica is called on the databases. When dropping a database replica with the Keeper path: if ‘WITH TABLES’ is provided, restore the database as Atomic, restore RMT tables from the statements in Keeper, and drop the database (restored tables are also dropped); otherwise, only call DatabaseReplicated::dropReplica on the provided Keeper path. #85637 (Tuan Pham Anh).
Fix inconsistent formatting of TTL when it contains a materialize function. Closes #82828. #85749 (Alexey Milovidov).
Apply azure_max_single_part_copy_size setting for normal copy operations in the same way as for backup. #85767 (Ilya Golshtein).
Slow down S3 client threads on retryable errors in S3 Object Storage. This extends the previous setting backup_slow_all_threads_after_retryable_s3_error to S3 disks and renames it to the more general s3_slow_all_threads_after_retryable_error. #85918 (Julia Kartseva).
Mark settings allow_experimental_variant/dynamic/json and enable_variant/dynamic/json as obsolete. Now all three types are enabled unconditionally. #85934 (Pavel Kruglov).
Improved S3(Azure)Queue table engine to allow it to survive zookeeper connection loss without potential duplicates. Requires enabling S3Queue setting use_persistent_processing_nodes (changeable by ALTER TABLE MODIFY SETTING). #85995 (Kseniia Sumarokova).
Iceberg table state is not stored in a storage object anymore. This should make Iceberg in ClickHouse usable with concurrent queries. #86062 (Daniil Ivanik).
Added setting query_condition_cache_selectivity_threshold (default value: 1.0) which excludes scan results of predicates with low selectivity from insertion into the query condition cache. This allows to reduce the memory consumption of the query condition cache at the cost of a worse cache hit rate. #86076 (zhongyuankai).
Support filtering by complete URL string (full_url directive) in http_handlers (including schema and host:port). #86155 (Azat Khuzhin).
Add an experimental setting allow_experimental_delta_lake_writes for the Delta Lake writes feature, disabled by default. #86180 (Kseniia Sumarokova).
Fix detection of systemd in init.d script (fixes “Install packages” check). #86187 (Azat Khuzhin).
Add a new startup_scripts_failure_reason dimensional metric. This metric is needed to distinguish between different error types that result in failing startup scripts. In particular, for alerting purposes, we need to distinguish between transient (e.g., MEMORY_LIMIT_EXCEEDED or KEEPER_EXCEPTION) and non-transient errors. #86202 (Miсhael Stetsyuk).
Multiple data files in iceberg writes. #86275 (scanhex12).
More types for partitions in iceberg writes. This closes #86206. #86298 (scanhex12).
Allow to omit identity() function for partition for Iceberg table. #86314 (scanhex12).
Add ability to enable JSON logging only for specific channel, for this set logger.formatting.channel to one of syslog/console/errorlog/log. #86331 (Azat Khuzhin).
Add rows/bytes limit for inserted data files in delta lake. Controlled by settings delta_lake_insert_max_rows_in_data_file and delta_lake_insert_max_bytes_in_data_file. #86357 (Kseniia Sumarokova).
Allow using native numbers in WHERE. They are already allowed to be arguments of logical functions. This simplifies filter-push-down and move-to-prewhere optimizations. #86390 (Nikolai Kochetov).
Fixed error in case of executing SYSTEM DROP REPLICA against a Catalog with corrupted metadata. #86391 (Nikita Mikhaylov).
Add extra retries for disk access check (skip_access_check=0) in Azure because it may be provisioning access for quite a long time. #86419 (Alexander Tokmakov).
Rename setting evaluation_time to promql_evaluation_time. #86459 (Vitaly Baranov).
Setting to delete files in iceberg drop. This closes #86211. #86501 (scanhex12).
Reduce memory usage in iceberg writes. #86544 (scanhex12).
Make today() function case-insensitive to make it consistent with other date/time related functions like NOW(). #86561 (Kaviraj Kanagaraj).
Make the staleness window in timeSeries*() functions left-open and right-closed. #86588 (Vitaly Baranov).
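
A left-open, right-closed window means that, for a grid timestamp t and a window w, a sample is considered only if its timestamp lies in (t - w, t]. The Python sketch below illustrates just that interval rule; the function name samples_in_window and the sample data are hypothetical, and it does not reproduce the timeSeries*() functions themselves.

```python
def samples_in_window(samples, grid_ts, window):
    """Samples whose timestamp falls in the interval (grid_ts - window, grid_ts]."""
    return [(ts, v) for ts, v in samples if grid_ts - window < ts <= grid_ts]


samples = [(10, 1.0), (20, 2.0), (30, 3.0)]
selected = samples_in_window(samples, grid_ts=30, window=20)
print(selected)  # [(20, 2.0), (30, 3.0)]: ts=10 sits on the open left edge
```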
Add FailedInternal*Query profile events. #86627 (Shane Andrade).
Make bucket lock in S3Queue ordered mode a persistent mode, similar to processing nodes in case use_persistent_processing_nodes = 1. Add keeper fault injection in tests. #86628 (Kseniia Sumarokova).
Fixes handling of users with a dot in the name when added via config file. #86633 (Mikhail Koviazin).
Add asynchronous metric for memory usage in queries (QueriesMemoryUsage and QueriesPeakMemoryUsage). #86669 (Azat Khuzhin).
You can use the clickhouse-benchmark --precise flag for more precise reporting of QPS and other per-interval metrics. It helps to get consistent QPS when query durations are comparable to the reporting interval --delay D. #86684 (Sergei Trifonov).
Make nice values of Linux threads configurable to assign some threads (merge/mutate, query, materialized view, zookeeper client) higher or lower priorities. #86703 (Miсhael Stetsyuk).
Fix misleading “specified upload does not exist” error, which occurs when the original exception is lost in multipart upload because of a race condition. #86725 (Julia Kartseva).
Limit query plan description in the EXPLAIN query. Do not calculate the description for queries other than EXPLAIN. Added a setting query_plan_max_step_description_length. #86741 (Nikolai Kochetov).
Add the ability to tune pending signals in an attempt to overcome CANNOT_CREATE_TIMER (for query profilers, query_profiler_real_time_period_ns/query_profiler_cpu_time_period_ns). Also collect SigQ from /proc/self/status for introspection (if ProcessSignalQueueSize is close to ProcessSignalQueueLimit, you will likely get CANNOT_CREATE_TIMER errors). #86760 (Azat Khuzhin).
Distributed insert/select for data lakes. #86783 (scanhex12).
Improve performance of RemoveRecursive request in Keeper. #86789 (Antonio Andelic).
Remove extra whitespace in PrettyJSONEachRow during JSON type output. #86819 (Pavel Kruglov).
Increase replicated deduplication window up to 10000. #86820 (Sema Checherinda).
Blob sizes are now written for prefix.path when a directory is removed on a plain rewritable disk. #86908 (alesapin).
Make yesterday() function case insensitive and consistent with today() function. #86914 (Kaviraj Kanagaraj).
Support .xml performance testing against remote ClickHouse instances, including ClickHouse Cloud. Usage example: tests/performance/scripts/perf.py tests/performance/math.xml --runs 10 --user <username> --password <password> --host <hostname> --port <port> --secure. #86995 (Raufs Dunamalijevs).
Respect memory limits in some places that are known to allocate significant (>16MiB) amount of memory (sorting, async inserts, file log). #87035 (Azat Khuzhin).
Prevent non-boolean settings from being specified without a value in queries. Improvement of #85800. #87084 (thraeka).
Support hints for format names. Closes #86761. #87092 (flynn).
Remote replicas skip index analysis when there are no projections. #87096 (zoomxi).
Throw an exception if setting network_compression_method is not a supported generic codec. #87097 (Robert Schulze).
System table system.query_cache now returns all query result cache entries, whereas it previously returned only shared entries or non-shared entries of the same user and role. That is okay as non-shared entries are supposed to not reveal query results, whereas system.query_cache returns query strings. This makes the behavior of the system table more similar to system.query_log. #87104 (Robert Schulze).
Added support for authentication and SSL in the arrowFlight() table function. #87120 (Vitaly Baranov).
Add a new parameter storage_class_name to the S3 table engine and s3 table function, which allows specifying the Intelligent-Tiering storage class supported by AWS. It is supported both in key-value format and in the positional (deprecated) format. #87122 (alesapin).
Allow disabling utf8 encoding for ytsaurus table. #87150 (MikhailBurdukov).
Support azure for data lakes disks. #87173 (scanhex12).
Add new dictionary_block_frontcoding_compression text index parameter to control the dictionary compression. By default, it is enabled to use the front-coding compression. #87175 (Elmi Ahmadov).
Enable short circuit evaluation for parseDateTime function. #87184 (Pavel Kruglov).
Support ALTER TABLE ... MATERIALIZE STATISTICS ALL, which materializes all the statistics of a table. #87197 (Han Fei).
Disable s3_slow_all_threads_after_retryable_error by default. #87198 (Nikita Mikhaylov).
Adds a new system.aggregated_zookeeper_log table. The table contains statistics (e.g. number of operations, average latency, errors) of ZooKeeper operations grouped by session id, parent path and operation type, and periodically flushed to disk. #87208 (Miсhael Stetsyuk).
Rename table function arrowflight to arrowFlight. #87249 (Vitaly Baranov).
Updated clickhouse-benchmark to accept - in place of _ in its CLI flags. #87251 (Ahmed Gouda).
Added session setting to exclude list of skip indexes from materialization on inserts (exclude_materialize_skip_indexes_on_insert). Added merge tree table setting to exclude list of skip indexes from materialization during merge (exclude_materialize_skip_indexes_on_merge). #87252 (George Larionov).
Make flushing to system.crash_log in signal handling synchronous. #87253 (Miсhael Stetsyuk).
Add a new column statistics in system.parts_columns. #87259 (Han Fei).
Added a setting inject_random_order_for_select_without_order_by which injects ORDER BY rand() into top-level SELECT queries without ORDER BY clause. #87261 (Rui Zhang).
Support other formats (ORC, Avro) in iceberg writes. This closes #86179. #87277 (scanhex12).
Improve joinGet error message so that it properly states that the number of join_keys is not the same as the number of right_table_keys. #87279 (Isak Ellmer).
Squash data from all threads before inserting into materialized views, depending on the settings min_insert_block_size_rows_for_materialized_views and min_insert_block_size_bytes_for_materialized_views. Previously, if parallel_view_processing was enabled, each thread inserting into a specific materialized view would squash inserts independently, which could lead to a higher number of generated parts. #87280 (Antonio Andelic).
This patch adds the ability to check an arbitrary Keeper node’s stat during the write tx. This can help with ABA problem detection. #87282 (Mikhail Artemenko).
Redirect heavy ytsaurus requests to heavy proxies. #87342 (MikhailBurdukov).
This patch fixes rollbacks of unlink/rename/removeRecursive/removeDirectory/etc operations and also hardlink counts in any possible workloads for metadata from disk transactions, and simplifies the interfaces to make them more generic so that they can be reused in other meta stores. #87358 (Mikhail Artemenko).
Added keeper_server.tcp_nodelay configuration parameter that allows disabling TCP_NODELAY for Keeper. #87363 (Copilot).
Support --connection in clickhouse-benchmark. It works the same way as in clickhouse-client: you can specify predefined connections in the client config.xml/config.yaml under the connections_credentials path to avoid explicitly specifying user/password via command-line arguments. Add support for --accept-invalid-certificate to clickhouse-benchmark. #87370 (Azat Khuzhin).
Now setting max_insert_threads will take effect on Iceberg tables. #87407 (alesapin).
Add histogram and dimensional metrics to PrometheusMetricsWriter. This way, the PrometheusRequestHandler handler will have all the essential metrics and can be used for reliable and low-overhead metric collection in the cloud. #87521 (Miсhael Stetsyuk).
Function hasToken now returns zero matches for the empty token (whereas this previously threw an exception). #87564 (Jimmy Aguilar Mena).
Add text index support for Array and Map (mapKeys and mapValues) values. The supported functions are mapContainsKey and has. #87602 (Elmi Ahmadov).
Add a new ZooKeeperSessionExpired metric which indicates the number of expired global ZooKeeper sessions. #87613 (Miсhael Stetsyuk).
Use S3 storage client with backup-specific settings (for example, backup_slow_all_threads_after_retryable_s3_error) for server-side (native) copy to a backup destination. Make s3_slow_all_threads_after_retryable_error obsolete. #87660 (Julia Kartseva).
Fix incorrect handling of settings max_joined_block_size_rows and max_joined_block_size_bytes during query plan serialization with experimental make_distributed_plan. #87675 (Vladimir Cherkasov).
The setting enable_http_compression is now enabled by default. This means that if a client accepts HTTP compression, the server will use it. However, this change has certain downsides. The client can request a heavy compression method, such as bzip2, which is unreasonable and increases the resource consumption of the server (this is visible only when large results are transferred). The client can request gzip, which is not that bad, but suboptimal compared to zstd. Closes #71591. #87703 (Alexey Milovidov).
Added a new setting keeper_hosts that exposes the list of [Zoo]Keeper hosts ClickHouse can connect to. #87718 (Nikita Mikhaylov).
Add ALTER TABLE REWRITE PARTS - rewrites the table parts from scratch, by using all new settings (since some, like use_const_adaptive_granularity, will be applied only for new parts). #87774 (Azat Khuzhin).
Add from and to values to the system dashboards to facilitate historical investigations. #87823 (Mikhail f. Shiryaev).
Add more information for performance tracking in Iceberg SELECTs. #87903 (Daniil Ivanik).
Add new joined_block_split_single_row setting to reduce memory usage in hash joins with many matches per key. This allows hash join results to be chunked even within matches for a single left table row, which is particularly useful when one row from the left table matches thousands or millions of rows from the right table. Previously, all matches had to be materialized at once in memory. This reduces peak memory usage but may increase CPU usage. #87913 (Vladimir Cherkasov).
Filesystem cache improvement: reuse cache priority iterator among threads concurrently reserving space in cache. #87914 (Kseniia Sumarokova).
Add ability to limit requests for Keeper (max_request_size setting, same as jute.maxbuffer for ZooKeeper, default OFF for backward compatibility, will be set in the next releases). #87952 (Azat Khuzhin).
Fix clickhouse-benchmark to not include stacktraces in error messages by default. #87954 (Ahmed Gouda).
Avoid using the thread pool for asynchronous marks loading (load_marks_asynchronously=1) when the marks are already in cache (the pool can be under pressure, and queries would pay a penalty for this even though the marks are already cached). #87967 (Azat Khuzhin).
Ytsaurus: allow creating tables, table functions, and dictionaries with a subset of columns. #87982 (MikhailBurdukov).
system.zookeeper_connection_log is now enabled by default and can be used to get information about Keeper sessions. #88011 (János Benjamin Antal).
Make TCP and HTTP behavior consistent when duplicated external tables are passed. HTTP allows a temporary table to be passed several times. #88032 (Sema Checherinda).
Remove custom MemoryPools for reading Arrow/ORC/Parquet. This component seems unneeded after https://github.com/ClickHouse/ClickHouse/pull/84082 because now we track all the allocations regardless. #88035 (Nikita Mikhaylov).
Allow to create Replicated database without arguments. #88044 (Pervakov Grigorii).
Add support for connecting to the TLS port of clickhouse-keeper; the flag names are the same as in clickhouse-client. #88065 (Pradeep Chhetri).
Added a new profile event to track the number of times that a background merge was rejected due to exceeding memory limits. #88084 (Grant Holly).
Added optional start_value parameter to generateSerialID function to specify custom starting values for new series. #88085 (Manuel).
Enables the analyzer for CREATE/ALTER TABLE column default expression validation. #88087 (Max Justus Spransy).
Internal query planning improvement: use JoinStepLogical for CROSS JOIN. #88151 (Vladimir Cherkasov).
Full support of operator IS NOT DISTINCT FROM (<=>). #88155 (simonmichal).
Enable global sampling profiler by default: collect stacktraces of all threads every 10 seconds of CPU and real time. #88209 (Alexander Tokmakov).
Fixed support for EXCHANGE TABLES operations on tables with the Alias engine. The engine now stores the target table as database and table names instead of a constant storage id, allowing it to correctly resolve the target after table exchanges. #88233 (Kai Zhu).
Add setting temporary_files_buffer_size to control the size of the buffer for temporary file writers. Optimize memory consumption of the scatter operation (used, for example, in grace hash join) for LowCardinality columns. #88237 (Vladimir Cherkasov).
Added support of direct reading from text indexes with parallel replicas. Improved performance of reading text indexes from object storage. #88262 (Anton Popov).
Now the function generateSerialID supports a non-constant argument with the series name. Closes #83750. #88270 (Alexey Milovidov).
Datalakes catalogs database for distributed processing. #88273 (scanhex12).
Update azure sdk to include ‘Content-Length’ fix that is seen with copy and create container functionalities. #88278 (Smita Kulkarni).
Make function lag case insensitive for compatibility with MySQL. #88322 (Lonny Kapelushnik).
Add config keeper_server.coordination_settings.check_node_acl_on_remove. If enabled, before each delete of a node, ACLs of both the node itself and parent node will be verified. Otherwise, only the ACL of the parent node will be verified. #88513 (Antonio Andelic).
JSON columns are now pretty printed when using Vertical format. Closes #81794. #88524 (Frank Rosner).
Store clickhouse-client files (e.g. query history) in places described by XDG Base Directories specification instead of root of home directory. ~/.clickhouse-client-history will still be used if it is already present. #88538 (Konstantin Bogdanov).
Fixes memory leak due to GLOBAL IN (https://github.com/ClickHouse/ClickHouse/issues/88615). #88617 (pranav mehta).
Added overload to hasAny/hasAllTokens to accept a string input. #88679 (George Larionov).
After this patch, heuristic to_remove_small_parts_at_right will be executed before the calculation of the merge range score. Before that, the merge selector was choosing the wide merge, and after that, it filtered its suffix. Fixes: #85374. #88736 (Mikhail Artemenko).
Add a step to postinstall script for clickhouse-keeper which enables starting on boot. #88746 (YenchangChan).
Check credentials in the Web UI only on pasting, rather than on every key press. This avoids a problem with misconfigured LDAP servers. This closes #85777. #88769 (Alexey Milovidov).
Limit exception message length when a constraint is violated. In previous versions, you could get a very long exception message when a very long string was inserted, and it ended up being written in the query_log. Closes #87032. #88801 (Alexey Milovidov).
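
The joined_block_split_single_row entry above describes splitting join output into blocks even within the matches of a single left-table row. The following is only a conceptual sketch of that idea in Python, not ClickHouse's implementation: the function name hash_join_chunked and the parameter max_block_rows are hypothetical and chosen for illustration.

```python
from typing import Dict, Iterable, Iterator, List, Tuple

LeftRow = Tuple[str, int]    # (join key, left payload)
RightRow = Tuple[str, int]   # (join key, right payload)
JoinedRow = Tuple[str, int, int]


def hash_join_chunked(
    left: Iterable[LeftRow],
    right: Iterable[RightRow],
    max_block_rows: int,
) -> Iterator[List[JoinedRow]]:
    """Yield inner-join results in blocks of at most max_block_rows rows.

    A block boundary may fall in the middle of the matches for one left
    row, so a left row with very many matches never forces all of its
    joined rows to be held in memory at once.
    """
    if max_block_rows <= 0:
        raise ValueError("max_block_rows must be positive")

    # Build phase: hash table from join key to all right-side payloads.
    table: Dict[str, List[int]] = {}
    for key, payload in right:
        table.setdefault(key, []).append(payload)

    # Probe phase: emit a block as soon as it is full.
    block: List[JoinedRow] = []
    for key, left_payload in left:
        for right_payload in table.get(key, []):
            block.append((key, left_payload, right_payload))
            if len(block) == max_block_rows:
                yield block
                block = []
    if block:
        yield block


if __name__ == "__main__":
    # One left row that matches ten right rows, emitted four rows at a time.
    left_rows = [("a", 1)]
    right_rows = [("a", i) for i in range(10)]
    sizes = [len(b) for b in hash_join_chunked(left_rows, right_rows, 4)]
    print(sizes)  # [4, 4, 2]
```

The sketch shows the trade-off named in the entry: peak memory is bounded by the block size rather than by the largest per-key match count, at the cost of producing and handling more blocks.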

Bug fix (user-visible misbehavior in an official stable release)

The results of alter queries are only validated on the initiator node for replicated databases and internally replicated tables. This will fix situations where an already committed alter query could get stuck on other nodes. #83849 (János Benjamin Antal).
Limit the number of tasks of each type in BackgroundSchedulePool. Avoid situations when all slots are occupied by task of one type, while other tasks are starving. Also avoids deadlocks when tasks wait for each other. This is controlled by background_schedule_pool_max_parallel_tasks_per_type_ratio server setting. #84008 (Alexander Tokmakov).
Fixed GeoParquet causing client protocol errors. #84020 (Michael Kolupaev).
Fix resolving host-dependent functions like shardNum() in subqueries on initiator node. #84409 (Eduard Karacharov).
Shutdown tables properly when recovering database replica. Improper shutdown would lead to LOGICAL_ERROR for some table engines during database replica recovery. #84744 (Antonio Andelic).
Check access rights during typo correction hints generation for the database name. #85371 (Dmitry Novik).
Fixed incorrect handling of pre-epoch dates with fractional seconds in various date-time-related functions, such as parseDateTime64BestEffort, change{Year,Month,Day} and makeDateTime64. Previously the subsecond part was subtracted from the seconds instead of being added to them. For example, parseDateTime64BestEffort('1969-01-01 00:00:00.468') was returning 1968-12-31 23:59:59.532 instead of 1969-01-01 00:00:00.468. #85396 (xiaohuanlin).
Hive partitioning fixes: (1) LowCardinality for hive columns; (2) fill hive columns before virtual columns (required for https://github.com/ClickHouse/ClickHouse/pull/81040); (3) LOGICAL_ERROR on empty format for hive (#85528); (4) fix check for hive partition columns being the only columns; (5) assert all hive columns are specified in the schema; (6) partial fix for parallel_replicas_cluster with hive; (7) use ordered container in extractkeyValuePairs for hive utils (required for https://github.com/ClickHouse/ClickHouse/pull/81040). #85538 (Arthur Passos).
Prevent unnecessary optimization of the first argument of IN functions sometimes resulting in error when array mapping is used. #85546 (Yakov Olkhovskiy).
Mapping between iceberg source ids and parquet names was not adjusted to the schema when the parquet file was written. This PR processes schema relevant for each iceberg data file, not a current one. #85829 (Daniil Ivanik).
Fix reading file size separately from opening it. Relates to https://github.com/ClickHouse/ClickHouse/pull/33372, which was introduced in response to a bug in Linux kernels prior to 5.10 release. #85837 (Konstantin Bogdanov).
ClickHouse Keeper no longer fails to start on systems where IPv6 is disabled at the kernel level (e.g., RHEL with ipv6.disable=1). It now attempts to fall back to an IPv4 listener if the initial IPv6 listener fails. #85901 (jskong1124).
This PR closes #77990. Add TableFunctionRemote support for parallel replicas in globalJoin. #85929 (zoomxi).
Fix null pointer in OrcSchemaReader::initializeIfNeeded(). This PR addresses the following issue: #85292. #85951 (yanglongwei).
Add a check to allow correlated subqueries in the FROM clause only if they use columns from the outer query. Fixes #85469. Fixes #85402. #85966 (Dmitry Novik).
Fix alter update of a column with a subcolumn used in other column materialized expression. Previously materialized column with subcolumn in its expression was not updated properly. #85985 (Pavel Kruglov).
Forbid altering columns whose subcolumns are used in PK or partition expression. #86005 (Pavel Kruglov).
Fix ALTER COLUMN IF EXISTS commands failing when column state changes within the same ALTER statement. Commands like DROP COLUMN IF EXISTS, MODIFY COLUMN IF EXISTS, COMMENT COLUMN IF EXISTS, and RENAME COLUMN IF EXISTS now properly handle cases where a column is deleted by a previous command in the same statement. #86046 (xiaohuanlin).
Fix reading subcolumns with non-default column mapping mode in storage DeltaLake. #86064 (Kseniia Sumarokova).
Fix using wrong default values for path with Enum hint inside JSON. #86065 (Pavel Kruglov).
DataLake hive catalog url parsing with input sanitisation. Closes #86018. #86092 (rajat mohan).
Fix logical error during filesystem cache dynamic resize. Closes #86122. Closes https://github.com/ClickHouse/clickhouse-core-incidents/issues/473. #86130 (Kseniia Sumarokova).
Use NonZeroUInt64 for logs_to_keep in DatabaseReplicatedSettings. #86142 (Tuan Pham Anh).
A FINAL query with a skip index threw an exception if the table (e.g. ReplacingMergeTree) was created with the setting index_granularity_bytes = 0. This has been fixed. #86147 (Shankar Iyer).
Removes UB and fixes problems with parsing of Iceberg partition expression. #86166 (Daniil Ivanik).
Fix inferring Date/DateTime/DateTime64 on dates that are out of supported range. #86184 (Pavel Kruglov).
Fix crash in case of const and non-const blocks in one INSERT. #86230 (Azat Khuzhin).
Process includes from /etc/metrika.xml as a default when creating disks from SQL. #86232 (alekar).
Fix accurateCastOrNull/accurateCastOrDefault from String to JSON. #86240 (Pavel Kruglov).
Support directories without ’/’ in iceberg engine. #86249 (scanhex12).
Fix crash with replaceRegex, a FixedString haystack and an empty needle. #86270 (Raúl Marín).
Fix crash during ALTER UPDATE Nullable(JSON). #86281 (Pavel Kruglov).
Fix missing column definer in system.tables. #86295 (Raúl Marín).
Fix cast from LowCardinality(Nullable(T)) to Dynamic. #86365 (Pavel Kruglov).
Fix logical error during writes to DeltaLake. Closes #86175. #86367 (Kseniia Sumarokova).
Fix 416 The range specified is invalid for the current size of the resource. The range specified is invalid for the current size of the resource when reading empty blobs from Azure blob storage for plain_rewritable disk. #86400 (Julia Kartseva).
Fix GROUP BY Nullable(JSON). #86410 (Pavel Kruglov).
Fixed a bug in Materialized Views: an MV might not work if it was created, dropped, and then created again with the same name. #86413 (Alexander Tokmakov).
Fail if all replicas are unavailable when reading from *cluster functions. #86414 (Julian Maicher).
Fix leaking of MergesMutationsMemoryTracking due to Buffer tables and fix query_views_log for streaming from Kafka (and others). #86422 (Azat Khuzhin).
Fix show tables after dropping reference table of alias storage. #86433 (RinChanNOW).
Fix missing chunk header when send_chunk_header is enabled and UDF is invoked via HTTP protocol. #86469 (Vladimir Cherkasov).
Fix possible deadlock in case of jemalloc profile flushes enabled. #86473 (Azat Khuzhin).
Fix reading subcolumns in DeltaLake table engine. Closes #86204. #86477 (Kseniia Sumarokova).
Handle the loopback host ID properly to avoid collisions when processing DDL tasks. #86479 (Tuan Pham Anh).
Fix detach/attach for PostgreSQL database engine tables with numeric/decimal columns. #86480 (Julian Maicher).
Fix use of uninitialized memory in getSubcolumnType. #86498 (Raúl Marín).
Functions searchAny and searchAll when called with empty needles now return true (aka. “matches everything”). Previously, they returned false. (issue #86300). #86500 (Elmi Ahmadov).
Fix function timeSeriesResampleToGridWithStaleness() when the first bucket has no value. #86507 (Vitaly Baranov).
Fix crash caused by merge_tree_min_read_task_size being set to 0. #86527 (yanglongwei).
When reading, take the format of each data file from Iceberg metadata (previously it was taken from the table arguments). #86529 (Daniil Ivanik).
Fixes a crash where some valid user-submitted data to an AggregateFunction(quantileDD) column could cause merges to recurse infinitely. #86560 (Raphaël Thériault).
Fix Backup db engine raising exception on query with zero sized part files. #86563 (Max Justus Spransy).
Fix missing chunk header when send_chunk_header is enabled and UDF is invoked via HTTP protocol. #86606 (Vladimir Cherkasov).
Fix S3Queue logical error “Expected current processor to be equal to ”, which happened because of keeper session expiration. #86615 (Kseniia Sumarokova).
Fix nullability bugs in insert and pruning. This closes #86407. #86630 (scanhex12).
Do not disable file system cache if Iceberg metadata cache is disabled. #86635 (Daniil Ivanik).
Fixed ‘Deadlock in Parquet::ReadManager (single-threaded)’ error in parquet reader v3. #86644 (Michael Kolupaev).
Fix support for IPv6 in listen_host for ArrowFlight. #86664 (Vitaly Baranov).
Fix shutdown in ArrowFlight handler. This PR fixes #86596. #86665 (Vitaly Baranov).
Fix distributed queries with describe_compact_output=1. #86676 (Azat Khuzhin).
Fix window definition parsing and applying query parameters. #86720 (Azat Khuzhin).
Fix exception Partition strategy wildcard can not be used without a '_partition_id' wildcard. when creating a table with PARTITION BY, but without partition wildcard, which used to work in versions before 25.8. Closes https://github.com/ClickHouse/clickhouse-private/issues/37567. #86748 (Kseniia Sumarokova).
Fix LogicalError if parallel queries are trying to acquire single lock. #86751 (Pervakov Grigorii).
Fix writing NULL into JSON shared data in RowBinary input format and add some additional validations in ColumnObject. #86812 (Pavel Kruglov).
Support JSON/Dynamic types in table created as cluster table function. #86821 (Pavel Kruglov).
Fix empty Tuple permutation with limit. #86828 (Pavel Kruglov).
Do not use separate keeper node for persistent processing nodes. Fix for https://github.com/ClickHouse/ClickHouse/pull/85995. Closes #86406. #86841 (Kseniia Sumarokova).
Fix TimeSeries engine table breaking creation of new replica in Replicated Database. #86845 (Nikolay Degterinsky).
Fix querying system.distributed_ddl_queue in cases where tasks are missing certain Keeper nodes. #86848 (Antonio Andelic).
Fix seeking at the end of the decompressed block. #86906 (Pavel Kruglov).
Handle exceptions thrown during asynchronous execution of the Iceberg iterator. #86932 (Daniil Ivanik).
Fix saving of big preprocessed XML configs. #86934 (c-end).
Fix date field populating in system.iceberg_metadata_log table. #86961 (Daniil Ivanik).
Fixed infinite recalculation of TTL with WHERE. #86965 (Anton Popov).
Fix result of function calculated in CTE being non-deterministic in the query. #86967 (Yakov Olkhovskiy).
Fix LOGICAL_ERROR in EXPLAIN with pointInPolygon on primary key columns. #86971 (Michael Kolupaev).
Fixed possible incorrect result of uniqExact function with ROLLUP and CUBE modifiers. #87014 (Nikita Taranov).
Fix data lake tables with a percent-encoded sequence in the name. Closes #86626. #87020 (Anton Ivashkin).
Fix resolving table schema with url() table function when parallel_replicas_for_cluster_functions setting is set to 1. #87029 (Konstantin Bogdanov).
Correctly cast output of PREWHERE after splitting it into multiple steps. #87040 (Antonio Andelic).
Fixed lightweight updates with ON CLUSTER clause. #87043 (Anton Popov).
Fix compatibility of some aggregate function states with String argument. #87049 (Pavel Kruglov).
Fix incorrect IS NULL behavior on nullable columns in OUTER JOIN with optimize_functions_to_subcolumns, close #78625. #87058 (Vladimir Cherkasov).
Fixes an issue where model name from OpenAI wasn’t passed through. #87100 (Kaushik Iska).
EmbeddedRocksDB: Path must be inside user_files. #87109 (Raúl Marín).
Fix KeeperMap tables created before 25.1, leaving data in ZooKeeper after the DROP query. #87112 (Nikolay Degterinsky).
Fix maps and arrays field ids reading parquet. #87136 (scanhex12).
Fix reading array with array sizes subcolumn in lazy materialization. #87139 (Pavel Kruglov).
Fixed incorrect accounting of temporary data deallocations in max_temporary_data_on_disk_size limit tracking, close #87118. #87140 (JIaQi).
The function checkHeaders now properly validates the provided headers and rejects forbidden headers. Original author: Michael Anastasakis (@michael-anastasakis). #87172 (Raúl Marín).
Make toDate and toDate32 behave the same for all numeric types. Fixes Date32 underflow check during cast from int16. #87176 (Pervakov Grigorii).
Fix CASE function with Dynamic arguments. #87177 (Pavel Kruglov).
Fix logical error with parallel replicas for queries with multiple JOINs, with RIGHT JOIN after LEFT/INNER JOIN in particular. #87178 (Igor Nikonov).
Respect setting input_format_try_infer_variants in schema inference cache. #87180 (Pavel Kruglov).
Make pathStartsWith only match paths under the prefix. #87181 (Raúl Marín).
Fix reading empty array from empty string in CSV. #87182 (Pavel Kruglov).
Fix possible wrong result of non-correlated EXISTS. It was broken with execute_exists_as_scalar_subquery=1 which was introduced in https://github.com/ClickHouse/ClickHouse/pull/85481 and affects 25.8. Fixes #86415. #87207 (Nikolai Kochetov).
Fixed logical errors in _row_number virtual column and iceberg positioned deletes. #87220 (Michael Kolupaev).
Fix “Too large size passed to allocator” LOGICAL_ERROR in JOIN due to mixed const and non-const blocks. #87231 (Azat Khuzhin).
Throws an error if iceberg_metadata_log is not configured, but user tries to get debug iceberg metadata info. Fixes nullptr access. #87250 (Daniil Ivanik).
Fixed lightweight updates with subqueries that read from other MergeTree tables. #87285 (Anton Popov).
Fixed move-to-prewhere optimization, which did not work in the presence of row policy. Continuation of #85118. Closes #69777. Closes #83748. #87303 (Nikolai Kochetov).
Fixed applying patches to columns with default expression that are missing in data parts. #87347 (Anton Popov).
Fix EmbeddedRocksDB upgrade. #87392 (Raúl Marín).
Fixed direct reading from the text index on object storage. #87399 (Anton Popov).
Prevent a privilege with a non-existent engine from being created. #87419 (Jitendra).
Ignore only not found errors for s3_plain_rewritable (which may lead to all sort of troubles). #87426 (Azat Khuzhin).
Fix dictionaries with YTSaurus source and *range_hashed layouts. #87490 (MikhailBurdukov).
Fix creating an array of empty tuples. #87520 (Pavel Kruglov).
Check for illegal columns during temporary table creation. #87524 (Pavel Kruglov).
Never put hive partition columns in the format header. Fixes #87515. #87528 (Arthur Passos).
Fix preparing reading from format in DeltaLake when text format is used. #87529 (Pavel Kruglov).
Fixes access validation on select and insert for Buffer tables. #87545 (pufit).
Disallow creating data skipping index for S3 table. #87554 (Bharat Nallan).
Avoid leaking of tracked memory for async logging (can have a significant drift, for 10 hours, ~100GiB) and text_log (almost same drift is possible). #87584 (Azat Khuzhin).
Fixed a bug that might lead to overriding global server settings with SELECT settings of a View or Materialized View, if this view was dropped asynchronously and the server was restarted before finishing background cleanup. #87603 (Alexander Tokmakov).
Exclude userspace page cache bytes (if possible) when computing memory overload warning. #87610 (Bharat Nallan).
Fix a bug when incorrect type order during CSV deserialization led to the LOGICAL_ERROR. #87622 (Yarik Briukhovetskyi).
Fix incorrect handling of command_read_timeout for executable dictionaries. #87627 (Azat Khuzhin).
Fixed incorrect SELECT * REPLACE behavior in WHERE clause with new analyzer when filtering on replaced columns. #87630 (xiaohuanlin).
Fixed two-level aggregation when using Merge over Distributed. #87687 (c-end).
Fix the generation of the output block in the HashJoin algorithm when the right row list is not used. Fixes #87401. #87699 (Dmitry Novik).
Parallel replicas read mode could be chosen incorrectly if there are no data to read after applying index analysis. Closes #87653. #87700 (zoomxi).
Fix handling of timestamp / timestamptz columns in Glue. #87733 (Andrey Zvonov).
This closes #86587. #87761 (scanhex12).
Fix writing boolean values in PostgreSQL interface. #87762 (Artem Yurov).
Fix unknown table error in insert select query with CTE, #85368. #87789 (Guang Zhao).
Fix reading null map subcolumn from Variants that cannot be inside Nullable. #87798 (Pavel Kruglov).
Fix handling error when failing to drop the database completely on the cluster on the secondary node. #87802 (Tuan Pham Anh).
Fix several skip indices bugs. #87817 (Raúl Marín).
In AzureBlobStorage, try native copy first and fall back to read & write on an ‘Unauthorized’ error (in AzureBlobStorage, if the storage accounts differ for source and destination, an ‘Unauthorized’ error is returned). Also fix applying “use_native_copy” when the endpoint is defined in the configuration. #87826 (Smita Kulkarni).
ClickHouse crashes if ArrowStream file has non-unique dictionary. #87863 (Ilya Golshtein).
Fix merge with projections when the last block is empty. #87928 (Raúl Marín).
Don’t remove injective functions from GROUP BY if arguments types are not allowed in GROUP BY. #87958 (Pavel Kruglov).
Fix for incorrect granules/partitions elimination for datetime-based keys, when using session_timezone setting in queries. #87987 (Eduard Karacharov).
Returns affected rows count after query in PostgreSQL Interface. #87990 (Artem Yurov).
Restrict the use of filter pushdown for PASTE JOIN because it can cause incorrect results. #88078 (Yarik Briukhovetskyi).
Applies URI normalization before evaluation for the grants check introduced by https://github.com/ClickHouse/ClickHouse/pull/84503. #88089 (pufit).
Fix logical error when ARRAY JOIN COLUMNS() matches no columns in new analyzer. #88091 (xiaohuanlin).
Fix “High ClickHouse memory usage” warning (exclude page cache). #88092 (Azat Khuzhin).
Fixed possible data corruption in MergeTree tables with set column TTL. #88095 (Anton Popov).
Fixed crash in mortonEncode and hilbertEncode functions when called with empty tuple argument. #88110 (xiaohuanlin).
Now ON CLUSTER queries will take less time in case of inactive replicas in cluster. #88153 (alesapin).
The DDL worker now cleans up outdated hosts from the replicas set. This reduces the amount of metadata stored in ZooKeeper. #88154 (alesapin).
Do proper undo of the move directory operation in case of error. We need to rewrite all prefix.path objects changed during the execution, not only the root one. #88198 (Mikhail Artemenko).
Fixed propagation of is_shared flag in ColumnLowCardinality. It may lead to a wrong group-by result if a new value is inserted in a column after hash values are already pre-calculated and cached in the ReverseIndex. #88213 (Nikita Taranov).
Fixes a workload setting max_cpu_share. Now it can be used without max_cpus workload setting being set. #88217 (Neerav).
Fix a bug where very heavy mutations with subqueries could get stuck in the prepare stage. It is now possible to stop these mutations with SYSTEM STOP MERGES. #88241 (alesapin).
Now correlated subqueries will work with object storages. #88290 (alesapin).
Avoid trying to initialize DataLake databases while accessing system.projections and system.data_skipping_indices. #88330 (Azat Khuzhin).
Data lake catalogs are now shown in system introspection tables only if show_data_lake_catalogs_in_system_tables is explicitly enabled. #88341 (alesapin).
Fixed DatabaseReplicated to respect interserver_http_host configuration. #88378 (xiaohuanlin).
Positional arguments are now explicitly disabled in the context of defining Projections, as they are not sensible in this internal query stage. This fixes #48604. #88380 (Amos Bird).
Fix quadratic complexity in the countMatches function. Closes #88400. #88401 (Alexey Milovidov).
Make ALTER COLUMN ... COMMENT commands for KeeperMap tables replicated so they are committed to Replicated database metadata and propagated across all replicas. Closes #88077. #88408 (Eduard Karacharov).
Fix a case of false cyclic dependency with Materialized Views in Database Replicated, which prevented new replicas from being added to the database. #88423 (Nikolay Degterinsky).
Fix aggregation of sparse columns when group_by_overflow_mode is set to any. #88440 (Eduard Karacharov).
Fix “column not found” error when using query_plan_use_logical_join_step=0 with multiple FULL JOIN USING clauses. Closes #88103. #88473 (Vladimir Cherkasov).
Big clusters with node numbers > 10 have a high probability of failing the restore with error [941] 67c45db4-4df4-4879-87c5-25b8d1e0d414 <Trace>: RestoreCoordinationOnCluster The version of node /clickhouse/backups/restore-7c551a77-bd76-404c-bad0-3213618ac58e/stage/num_hosts changed (attempt #9), will try again. The num_hosts node is overwritten by many hosts at the same time. The fix makes the setting to control attempts dynamic. Closes #87721. #88484 (Mikhail f. Shiryaev).
Restore compatibility with 23.8 and earlier. The compatibility issue was introduced by https://github.com/ClickHouse/ClickHouse/pull/54240. The following query fails with enable_analyzer=0 (it worked before 23.8): select * from t1 s final join ( select * from t2 final ) r final on s.key = r.key join ( select * from t3 final ) c final on s.key = c.key. This happens because JoinToSubqueryTransformVisitor rewrites the query to: SELECT `_--s.key` AS `s.key`, `_--s.value` AS `s.value`, `_--r.key` AS `r.key`, `_--r.value` AS `r.value`, `_--c.key` AS `c.key`, `_--c.value` AS `c.value` FROM ( SELECT value AS `_--s.value`, key AS `_--s.key`, r.value AS `_--r.value`, r.key AS `_--r.key` FROM t1 AS s FINAL ALL INNER JOIN ( SELECT key, value FROM t2 FINAL ) AS r FINAL ON `_--s.key` = `_--r.key` ) AS `--.s` ALL INNER JOIN ( SELECT value AS `_--c.value`, key AS `_--c.key` FROM ( SELECT key, value FROM t3 FINAL ) AS c FINAL ) AS `--.t` ON `_--s.key` = `_--c.key`. With the fix, the query is rewritten as follows instead (only the last FINAL is moved): SELECT `_--s.key` AS `s.key`, `_--s.value` AS `s.value`, `_--r.key` AS `r.key`, `_--r.value` AS `r.value`, `_--c.key` AS `c.key`, `_--c.value` AS `c.value` FROM ( SELECT value AS `_--s.value`, key AS `_--s.key`, r.value AS `_--r.value`, r.key AS `_--r.key` FROM t1 AS s FINAL ALL INNER JOIN ( SELECT key, value FROM t2 FINAL ) AS r FINAL ON `_--s.key` = `_--r.key` ) AS `--.s` ALL INNER JOIN ( SELECT value AS `_--c.value`, key AS `_--c.key` FROM ( SELECT key, value FROM t3 FINAL ) AS c ) AS `--.t` FINAL ON `_--s.key` = `_--c.key`. #88491 (JIaQi).
Fix UBSAN integer overflow in accurateCast error message when converting large values to DateTime. #88520 (xiaohuanlin).
Fix coalescing merge tree for tuple types. This closes #88469. #88526 (scanhex12).
Forbid deletes for iceberg_format_version=1. This closes #88444. #88532 (scanhex12).
This patch fixes the move operation of plain-rewritable disks for folders of arbitrary depth. #88586 (Mikhail Artemenko).
Fix SQL SECURITY DEFINER with *cluster functions. #88588 (Julian Maicher).
Fix potential crash caused by concurrent mutation of underlying const PREWHERE columns. #88605 (Azat Khuzhin).
Fixed reading from the text index and enabled query condition cache (with enabled settings use_skip_indexes_on_data_read and use_query_condition_cache). #88660 (Anton Popov).
A Poco::TimeoutException exception thrown from Poco::Net::HTTPChunkedStreamBuf::readFromDevice leads to a crash with SIGABRT. #88668 (Miсhael Stetsyuk).
Fix appending to system.zookeeper_connection_log in case ClickHouse connects for the first time after config reload. #88728 (Antonio Andelic).
Fixed a bug where converting DateTime64 to Date with date_time_overflow_behavior = 'saturate' could lead to incorrect results for out-of-range values when working with time zones. #88737 (Manuel).
Nth attempt to fix “having zero bytes error” with s3 table engine with enabled cache. #88740 (Kseniia Sumarokova).
Fixes access validation on select for loop table function. #88802 (pufit).
Catch exceptions when async logging fails to prevent program aborts. #88814 (Raúl Marín).
