## Introduction
This guide focuses on the most common and effective performance optimizations for ClickStack, sufficient to optimize the majority of real-world observability workloads, typically up to tens of terabytes of data per day. The optimizations are presented in a deliberate order, starting with the simplest and highest-impact techniques and progressing toward more advanced and specialized tuning. Early optimizations should be applied first and will often deliver substantial gains on their own. As data volumes grow and workloads become more demanding, the later techniques become increasingly worthwhile to explore.

## ClickHouse concepts
Before applying any of the optimizations described in this guide, it’s important to be familiar with a few core ClickHouse concepts. In ClickStack, each data source maps directly to one or more ClickHouse tables. When using OpenTelemetry, ClickStack creates and manages a set of default tables that store logs, traces, and metrics data. If you’re using custom schemas or managing your own tables, you may already be familiar with these concepts. However, if you’re simply sending data via the OpenTelemetry Collector, these tables are created automatically and are where all optimizations described below will be applied.

| Data type | Table |
|---|---|
| Logs | otel_logs |
| Traces | otel_traces |
| Metrics (gauges) | otel_metrics_gauge |
| Metrics (sums) | otel_metrics_sum |
| Metrics (histogram) | otel_metrics_histogram |
| Metrics (Exponential histograms) | otel_metrics_exponentialhistogram |
| Metrics (summary) | otel_metrics_summary |
| Sessions | hyperdx_sessions |
These tables are created in the `default` database - this can be modified in the OpenTelemetry collector configuration.
:::important Focus on logs and traces
In most cases, performance tuning focuses on the logs and trace tables. While metrics tables can be optimized for filtering, their schemas are intentionally opinionated for Prometheus-style workloads and typically don’t require modification for standard charting. Logs and traces, by contrast, support a wider range of access patterns, and therefore benefit most from tuning. Session data has a fixed user experience, and its schema rarely needs to be modified.
:::
At a minimum, you should understand the following ClickHouse fundamentals:
| Concept | Description |
|---|---|
| Tables | How data sources in ClickStack correspond to underlying ClickHouse tables. Tables in ClickHouse mainly use the MergeTree engine. |
| Parts | How data is written in immutable parts and merged over time. |
| Partitions | Partitions group the data parts of a table into organized logical units, which are easier to manage, query, and optimize. |
| Merges | The background process that combines parts to reduce the number of parts a query must read. Essential for maintaining query performance. |
| Granules | The smallest unit of data that ClickHouse reads and prunes during query execution. |
| Primary (ordering) keys | How the ORDER BY key determines on-disk data layout, compression, and query pruning. |
For further reading, see:

- Creating tables in ClickHouse - A simple introduction to tables.
- Parts
- Partitions
- Merges
- Primary keys/indexes
- How ClickHouse stores data: parts and granules - More advanced guide on how data is structured and queried in ClickHouse, covering granules and primary keys in detail.
- MergeTree - Advanced MergeTree reference guide, useful for commands and internal specifics.
## Optimization 1. Materialize frequently queried attributes
The first and simplest optimization for ClickStack users is to identify commonly queried attributes in `LogAttributes`, `ScopeAttributes`, and `ResourceAttributes`, and promote them to top-level columns using materialized columns.
This optimization alone is often sufficient to scale ClickStack deployments to tens of terabytes per day and should be applied before considering more advanced tuning techniques.
### Why materialize attributes
ClickStack stores metadata such as Kubernetes labels, service metadata, and custom attributes in `Map(String, String)` columns. While this provides flexibility, querying map subkeys has an important performance implication.
When querying a single key from a Map column, ClickHouse must read the entire map column from disk. If the map contains many keys, this results in unnecessary IO and slower queries compared to reading a dedicated column.
Materializing frequently accessed attributes avoids this overhead by extracting the value at insert time and storing it as a first-class column.
Materialized columns:
- Are computed automatically during inserts
- Can’t be explicitly set in INSERT statements
- Support any ClickHouse expression
- Allow type conversion from String to more efficient numeric or date types
- Enable skip indexes and primary key usage
- Reduce disk reads by avoiding full map access
ClickStack automatically detects materialized columns extracted from maps and transparently uses them during query execution, even when users continue to query the original attribute path.
### Example
Consider the default ClickStack schema for traces, where Kubernetes metadata is stored in `ResourceAttributes`. Suppose users frequently search for a specific pod using the Lucene-style query `ResourceAttributes.k8s.pod.name:"checkout-675775c4cc-f2p9c"`.
This results in a SQL predicate similar to:
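A sketch of what the rewritten condition looks like in ClickHouse SQL, using map element access:

```sql
WHERE ResourceAttributes['k8s.pod.name'] = 'checkout-675775c4cc-f2p9c'
```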
Evaluating this predicate requires reading the entire `ResourceAttributes` column for each matching row - potentially very large if the Map contains many keys.
If this attribute is queried frequently, it should be materialized as a top-level column.
To extract the pod name at insert time, add a materialized column:
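A sketch of the DDL - the `LowCardinality(String)` type is an assumption; choose a type that fits your data:

```sql
ALTER TABLE otel_traces
    ADD COLUMN PodName LowCardinality(String)
    MATERIALIZED ResourceAttributes['k8s.pod.name'];
```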
This creates a first-class column named `PodName`.
Users can now query pod names efficiently using Lucene syntax, e.g. `PodName:"checkout-675775c4cc-f2p9c"`.
For newly inserted data, this avoids map access entirely and significantly reduces I/O.
However, even if users continue querying the original attribute path e.g. `ResourceAttributes.k8s.pod.name:"checkout-675775c4cc-f2p9c"`, ClickStack will automatically rewrite the query internally to use the materialized `PodName` column, i.e. using the predicate:
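Roughly:

```sql
WHERE PodName = 'checkout-675775c4cc-f2p9c'
```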
:::note
By default, materialized columns are excluded from `SELECT *` queries. This preserves the invariant that query results can always be reinserted into the table.
:::

### Materializing historical data
Materialized columns only apply automatically to data inserted after the column is created. For existing data, queries against the materialized column will transparently fall back to reading from the original map. If historical performance is critical, the column can be backfilled using a mutation. Mutations run asynchronously; progress can be tracked via the `system.mutations` table, and the backfill is complete when `is_done = 1` for the corresponding mutation.
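A sketch of the backfill and progress check, assuming the `PodName` column from the example above:

```sql
-- Rewrite existing parts to populate the materialized column
ALTER TABLE otel_traces MATERIALIZE COLUMN PodName;

-- Track progress; the backfill is complete when is_done = 1
SELECT command, is_done
FROM system.mutations
WHERE table = 'otel_traces';
```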
:::important
Mutations incur additional IO and CPU overhead and should be used sparingly. In many cases, it’s sufficient to allow older data to age out naturally and rely on the performance improvements for newly ingested data.
:::
## Optimization 2. Adding skip indices
After materializing frequently queried attributes, the next optimization is to add data skipping indexes to further reduce the amount of data ClickHouse needs to read during query execution. Skip indexes allow ClickHouse to avoid scanning entire blocks of data when it can determine that no matching values exist. Unlike traditional secondary indexes, skip indexes operate at the granule level and are most effective when query filters exclude large portions of the dataset. When used correctly, they can significantly accelerate filtering on high-cardinality attributes without changing query semantics.

Consider the default traces schema for ClickStack, which includes skip indices targeting two common access patterns:

- High-cardinality string filtering, such as `TraceId`, session identifiers, attribute keys, or values
- Numeric range filtering, such as span duration
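For reference, the index definitions in the default traces table look roughly like the following - the exact names and parameters vary by ClickStack version, so check `SHOW CREATE TABLE otel_traces`:

```sql
INDEX idx_trace_id TraceId TYPE bloom_filter(0.001) GRANULARITY 1,
INDEX idx_span_attr_key mapKeys(SpanAttributes) TYPE bloom_filter(0.01) GRANULARITY 1,
INDEX idx_duration Duration TYPE minmax GRANULARITY 1
```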
### Bloom filters
Bloom filter indexes are the most commonly used skip index type in ClickStack. They’re well-suited for string columns with high cardinality, typically at least tens of thousands of distinct values. A false-positive rate of 0.01 with granularity 1 is a good default starting point and balances storage overhead with effective pruning.

Continuing the example from Optimization 1, suppose the Kubernetes pod name has been materialized from `ResourceAttributes` as a `PodName` column. A Bloom filter index can then accelerate searches such as `PodName:"checkout-675775c4cc-f2p9c"`.
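A sketch of adding such an index - the index name is illustrative, and this assumes the `PodName` column from Optimization 1 exists:

```sql
ALTER TABLE otel_traces
    ADD INDEX idx_pod_name PodName TYPE bloom_filter(0.01) GRANULARITY 1;
```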
Bloom filters are most effective when the distribution of values is such that a given value appears in a relatively small number of parts. This often occurs naturally in observability workloads where metadata like pod names, trace IDs, or session identifiers is correlated with time, and therefore clustered by the table’s ordering key.
As with all skip indexes, Bloom filters should be added selectively and validated against real query patterns to ensure they provide measurable benefit - see “Evaluating skip index effectiveness.”
### Min-max indices
Minmax indexes store the minimum and maximum value per granule and are extremely lightweight. They’re particularly effective for numeric columns and range queries. While they may not accelerate every query, they’re low-cost and almost always worth adding for numeric fields. Minmax indexes work best when numeric values are either naturally ordered or confined to narrow ranges within each part.

Suppose a Kafka offset is frequently queried from `SpanAttributes`:
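A sketch under the assumption of a hypothetical `kafka.offset` span attribute - the column name and conversion function are illustrative:

```sql
-- Promote the attribute to a typed numeric column at insert time
ALTER TABLE otel_traces
    ADD COLUMN KafkaOffset UInt64
    MATERIALIZED toUInt64OrZero(SpanAttributes['kafka.offset']);

-- Add a lightweight minmax index over the new column
ALTER TABLE otel_traces
    ADD INDEX idx_kafka_offset KafkaOffset TYPE minmax GRANULARITY 1;
```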
### Materializing skip indexes
After a skip index has been added, it only applies to newly ingested data. Historical data won’t benefit from the index until it’s explicitly materialized.

:::note Materializing skip indexes
Materializing a skip index is typically lightweight and safe to run, especially for minmax indexes. For Bloom filter indexes on large datasets, users may prefer to materialize on a per-partition basis to better control resource usage.
:::
Index materialization also runs as a mutation; it’s complete when `is_done = 1` for the corresponding mutation.
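A sketch of the materialization, assuming the hypothetical `idx_pod_name` index from earlier - partition values depend on your table’s `PARTITION BY` expression:

```sql
-- Build the index for all existing parts
ALTER TABLE otel_traces MATERIALIZE INDEX idx_pod_name;

-- Or limit the work to one partition at a time
ALTER TABLE otel_traces MATERIALIZE INDEX idx_pod_name IN PARTITION '2025-01-23';
```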
Once complete, confirm that the index data has been created:
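For example, by querying the `system.data_skipping_indices` system table:

```sql
SELECT name, type_full, data_compressed_bytes
FROM system.data_skipping_indices
WHERE table = 'otel_traces';
```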
For Bloom filter indexes, increasing the false-positive rate from 0.01 to 0.05 produces a smaller index that evaluates faster, at the cost of less aggressive pruning. While fewer granules may be skipped, overall query latency can improve due to faster index evaluation.
Tuning Bloom filter parameters is therefore a workload-dependent optimization and should be validated using real query patterns and production-like data volumes.
For further details on skip indices, see the guide “Understanding ClickHouse data skipping indexes.”
### Evaluating skip index effectiveness
The most reliable way to evaluate skip index pruning is to use `EXPLAIN indexes = 1`, which shows how many parts and granules are eliminated at each stage of query planning. In most cases, you want to see a large reduction in granules at the Skip stage, ideally after the primary key has already reduced the search space. Skip indexes are evaluated after partition pruning and primary key pruning, so their impact is best measured relative to the remaining parts and granules.
EXPLAIN confirms whether pruning occurs, but it doesn’t guarantee a net speedup. Skip indexes have a cost to evaluate, especially if the index is large. Always benchmark queries before and after adding and materializing an index to confirm real performance improvements.
For example, consider the default Bloom filter skip index for `TraceId` included with the default traces schema. Run `EXPLAIN indexes = 1` to see how effective it is for a selective query:
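A sketch of such a query - the trace ID is a placeholder:

```sql
EXPLAIN indexes = 1
SELECT count()
FROM otel_traces
WHERE TraceId = '0123456789abcdef0123456789abcdef';
```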
When benchmarking the query, use `FORMAT Null` to avoid result serialization overhead, and disable the query condition cache to keep runs repeatable:
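For example, comparing a baseline without skip indexes to the indexed run - the trace ID is a placeholder:

```sql
-- Baseline: skip indexes disabled
SELECT count()
FROM otel_traces
WHERE TraceId = '0123456789abcdef0123456789abcdef'
SETTINGS use_skip_indexes = 0, use_query_condition_cache = 0
FORMAT Null;

-- With the skip index enabled
SELECT count()
FROM otel_traces
WHERE TraceId = '0123456789abcdef0123456789abcdef'
SETTINGS use_query_condition_cache = 0
FORMAT Null;
```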
Disabling `use_query_condition_cache` ensures results aren’t affected by cached filtering decisions, and setting `use_skip_indexes = 0` provides a clean baseline for comparison. If the pruning is effective and index evaluation cost is low, the indexed query should be materially faster.
### When to add skip indexes
Skip indexes should be added selectively, based on the types of filters users run most frequently and the shape of the data in parts and granules. The goal is to prune enough granules to offset the cost of evaluating the index itself, which is why benchmarking on production-like data is essential.

**Numeric columns.** For numeric columns that are used in filters, a minmax skip index is almost always a good choice. It’s lightweight, cheap to evaluate, and can be effective for range predicates - especially when values are loosely ordered, or confined to narrow ranges inside parts. Even when minmax doesn’t help a specific query pattern, its overhead is typically low enough that it’s still reasonable to keep.

**String columns.** Use Bloom filters when cardinality is high and values are sparse. Bloom filters are most effective for high-cardinality string columns where each value has relatively low frequency, meaning most parts and granules don’t contain the searched value. As a rule of thumb, Bloom filters are most promising when the column has at least 10,000 distinct values, and often perform best with 100,000+ distinct values. They’re also more effective when matching values are clustered into a small number of sequential parts, which typically happens when the column is correlated with the ordering key. Again, your mileage here may vary - nothing replaces real-world testing.

## Optimization 3. Modifying the primary key
The primary key is one of the most important components of ClickHouse performance tuning for most workloads. To tune it effectively, you must understand how it works and how it interacts with your query patterns. Ultimately, the primary key should align with how users access the data, particularly which columns are most commonly filtered on. While the primary key also influences compression and storage layout, its primary purpose is query performance. In ClickStack, the out-of-the-box primary keys are already optimized for the most common observability access patterns and for strong compression. The default keys for logs, traces, and metrics tables are designed to perform well for typical workflows. Filtering on columns that appear earlier in the primary key is more efficient than filtering on columns that appear later. While the default configuration is sufficient for most users, there are cases where modifying the primary key can improve performance for specific workloads.

:::note A note on terminology
Throughout this document, the term “ordering key” is used interchangeably with “primary key.” Strictly speaking, these differ in ClickHouse, but for ClickStack, they typically refer to the same columns specified in the table `ORDER BY` clause. For details, see the ClickHouse documentation on choosing a primary key that differs from the sorting key.
:::

The default keys are:

- Logs (`otel_logs`) - `(ServiceName, TimestampTime, Timestamp)`
- Traces (`otel_traces`) - `(ServiceName, SpanName, toDateTime(Timestamp))`
### Choosing a primary key
First, identify whether your access patterns differ substantially from the defaults for a specific table. For example, if you most commonly filter logs by Kubernetes node before service name, and this represents a dominant workflow, it may justify changing the primary key.

:::important Modifying the default primary key
The default primary keys are sufficient in most cases. Changes should be made cautiously and only with a clear understanding of query patterns. Modifying a primary key can degrade performance for other workflows, so testing is essential.
:::

When choosing an alternative key, consider the following:
- Select columns that align with your common filters and access patterns. If you typically start observability investigations by filtering on a specific column e.g. pod name, this column will be used frequently in `WHERE` clauses. Prioritize including these in your key over those that are used less frequently.
- Prefer columns that help exclude a large percentage of the total rows when filtered, thus reducing the amount of data that needs to be read. Service names and status codes are often good candidates - in the latter case, only if you filter by values which exclude most rows e.g. filtering by 200 codes will, in most systems, match most rows, in comparison to 500 errors, which will correspond to a small subset.
- Prefer columns that are likely to be highly correlated with other columns in the table. This will help ensure these values are also stored contiguously, improving compression.
- `GROUP BY` (aggregations for charts) and `ORDER BY` (sorting) operations on columns in the ordering key can be made more memory efficient.
### Changing the primary key
If you’re confident of your access patterns prior to data ingestion, simply drop and re-create the table for the relevant data type. The example below shows a simple way to create a new logs table with the existing schema, but with a new primary key that includes the column `SeverityText` before the `ServiceName`.
#### Create new table
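A sketch of such a statement - `CREATE TABLE AS` copies the column definitions, while the engine clause defines the new key. The table name, partitioning, and key below are illustrative:

```sql
CREATE TABLE otel_logs_v2 AS otel_logs
ENGINE = MergeTree
PARTITION BY toDate(TimestampTime)
PRIMARY KEY (SeverityText, ServiceName, TimestampTime)
ORDER BY (SeverityText, ServiceName, TimestampTime, Timestamp)
```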
:::note Ordering key vs primary key
Note in the above example, you’re required to specify both a `PRIMARY KEY` and an `ORDER BY`. In ClickStack, these are almost always the same. The `ORDER BY` controls the physical data layout, while the `PRIMARY KEY` defines the sparse index. In rare, very large workloads, these may differ, but most users should keep them aligned.
:::

#### Exchange and drop table
The `EXCHANGE` statement is used to swap the names of the tables atomically. The temporary table (now the old default table) can then be dropped.

:::note
Backfilling existing data into a new table is rarely worthwhile at scale. The compute and IO cost is usually high and doesn’t justify the performance benefits. Instead, allow older data to expire via TTL while newer data benefits from the improved key.
:::
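The swap and cleanup described above can be sketched as follows, assuming the new table was named `otel_logs_v2`:

```sql
-- Atomically swap names: otel_logs_v2 becomes the live otel_logs
EXCHANGE TABLES otel_logs AND otel_logs_v2;

-- otel_logs_v2 now holds the old default table and can be dropped
DROP TABLE otel_logs_v2;
```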
If the table already contains data that must be retained, a different approach can be used. The same example of placing `SeverityText` as the first column in the primary key is used below. In this case, a new table is created for new data, retaining the old table for historical analysis.
#### Create new table
Create the new table with the desired primary key, noting the `_23_01_2025` suffix - adapt this to be the current date.
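A sketch, reusing the `SeverityText`-first key from the previous example - adapt the suffix and key as needed:

```sql
CREATE TABLE otel_logs_23_01_2025 AS otel_logs
ENGINE = MergeTree
PARTITION BY toDate(TimestampTime)
PRIMARY KEY (SeverityText, ServiceName, TimestampTime)
ORDER BY (SeverityText, ServiceName, TimestampTime, Timestamp)
```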
#### Create a Merge table

The Merge engine (not to be confused with MergeTree) doesn’t store data itself, but allows reading from any number of other tables simultaneously. Using `currentDatabase()` assumes the command is run in the correct database; otherwise, specify the database name explicitly. The table pattern matches any table whose name begins with `otel_logs`.
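A sketch of the merge table - `AS otel_logs` copies the column structure, and the second argument is a regular expression over table names:

```sql
CREATE TABLE otel_logs_merge AS otel_logs
ENGINE = Merge(currentDatabase(), '^otel_logs')
```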
#### Update HyperDX to read from the merge table

Configure HyperDX to use `otel_logs_merge` as the table for the logs data source. At this point, writes continue to `otel_logs` with the original primary key, while reads use the merge table. There is no visible change for users or impact on ingestion.

#### Exchange the tables
An `EXCHANGE` statement is now used to swap the names of the `otel_logs` and `otel_logs_23_01_2025` tables atomically. This leaves an empty `otel_logs` table with the updated primary key. Existing data remains in `otel_logs_23_01_2025` and is still accessible via the merge table. The suffix indicates the date the change was applied and represents the latest timestamp contained in that table. This process allows primary key changes with no ingest interruption and no user-visible impact.

Suppose it’s later decided that `SeverityNumber` should be part of the primary key one week later, rather than `SeverityText`. The following process can be adapted as many times as primary key changes are required.
#### Create new table
Create the new table with the desired primary key. In the example below, `_30_01_2025` is used as our suffix to denote the date of the table.
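A sketch, this time placing `SeverityNumber` first in the key (illustrative):

```sql
CREATE TABLE otel_logs_30_01_2025 AS otel_logs
ENGINE = MergeTree
PARTITION BY toDate(TimestampTime)
PRIMARY KEY (SeverityNumber, ServiceName, TimestampTime)
ORDER BY (SeverityNumber, ServiceName, TimestampTime, Timestamp)
```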
#### Exchange the tables

An `EXCHANGE` statement is now used to swap the names of the `otel_logs` and `otel_logs_30_01_2025` tables atomically. This again leaves an empty `otel_logs` table with the updated primary key. The old data remains in `otel_logs_30_01_2025`, accessible via the merge table.

:::note Redundant tables
If TTL policies are in place, which is recommended, tables with older primary keys that are no longer receiving writes will gradually empty as data expires. They should be monitored and periodically cleaned up once they contain no data. At present, this cleanup process is manual.
:::
## Optimization 4. Exploiting Materialized Views
ClickStack can exploit Incremental Materialized Views to accelerate visualizations that rely on aggregation-heavy queries, such as computing average request duration per minute over time. This feature can dramatically improve query performance and is typically most beneficial for larger deployments, around 10 TB per day and above, while enabling scaling into the petabytes-per-day range. Incremental Materialized Views are in Beta and should be used with care. For details on using this feature in ClickStack, see our dedicated guide “ClickStack - Materialized Views.”

## Optimization 5. Exploiting Projections
Projections represent a final, advanced optimization that can be considered once materialized columns, skip indexes, primary keys, and materialized views have been evaluated. While projections and materialized views may appear similar, in ClickStack they serve different purposes and are best used in different scenarios. A projection stores an additional, hidden copy of a table’s data sorted by a different `ORDER BY` key, allowing ClickHouse to prune data more effectively for access patterns that don’t align with the original ordering.
Materialized views can achieve a similar effect by explicitly writing rows into a separate target table with a different ordering key. The key difference is that projections are maintained automatically and transparently by ClickHouse, while materialized views are explicit tables that must be registered and selected intentionally by ClickStack.
When a query targets the base table, ClickHouse evaluates the base layout and any available projections, samples their primary indexes, and selects the layout that can produce the correct result while reading the fewest granules. This decision is made automatically by the query analyzer.
In ClickStack, projections are therefore best suited for pure data reordering, where:
- Access patterns are fundamentally different from the default primary key
- It’s impractical to cover all workflows with a single ordering key
- You want ClickHouse to transparently choose the optimal physical layout
### Example projections
Suppose your traces table is optimized for the default ClickStack access pattern, but users also need to look up individual traces by `TraceId` across a time range. A projection with an alternative ordering key can serve this access pattern.

:::note Use wildcards
A wildcard (`SELECT *`) should generally be used in the projection. While selecting a subset of columns can reduce write overhead, it also limits when the projection can be used, since only queries that can be fully satisfied by those columns are eligible. In ClickStack, this often restricts projection usage to very narrow cases. For this reason, it is generally recommended to use a wildcard to maximize applicability.
:::

:::important
Materializing a projection can take a long time and consume significant resources. Because observability data typically expires via TTL, this should only be done when absolutely necessary. In most cases, it’s sufficient to let the projection apply only to newly ingested data, allowing it to optimize the most frequently queried time ranges, such as the last 24 hours.
:::
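A sketch of such a projection - the projection name is illustrative:

```sql
ALTER TABLE otel_traces
    ADD PROJECTION trace_id_lookup
    (
        SELECT *
        ORDER BY (TraceId, Timestamp)
    );

-- Optionally build the projection for existing parts (resource-intensive)
ALTER TABLE otel_traces MATERIALIZE PROJECTION trace_id_lookup;
```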
ClickHouse will consider using the projection when it covers all referenced columns (guaranteed here by `SELECT *`) and the query filters strongly align with the projection’s `ORDER BY`.
Queries that filter on TraceId (especially equality) and include a time range would benefit from the above projection. For example:
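For example - the trace ID and time range are placeholders:

```sql
SELECT Timestamp, SpanName, Duration
FROM otel_traces
WHERE TraceId = '0123456789abcdef0123456789abcdef'
  AND Timestamp >= now() - INTERVAL 1 DAY;
```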
Queries that don’t filter on `TraceId`, or that primarily filter on other dimensions that aren’t leading in the projection’s ordering key, typically won’t benefit (and may read via the base layout instead).
Projections can also store aggregations (similar to materialized views). In ClickStack, projection-based aggregations aren’t generally recommended because selection depends on the ClickHouse analyzer, and usage can be harder to control and reason about. Instead, prefer explicit materialized views that ClickStack can register and select intentionally at the application layer.
### Costs and guidance
- **Insert overhead:** A `SELECT *` projection with a different ordering key effectively writes the data twice, which increases write I/O and can require additional CPU and disk throughput to sustain ingestion.
- **Use sparingly:** Projections are best reserved for genuinely diverse access patterns where a second physical ordering unlocks meaningful pruning for a large share of queries, for example, two teams querying the same dataset in fundamentally different ways.
- **Validate with benchmarks:** As with all tuning, compare real query latency and resource usage before and after adding and materializing a projection.
### Lightweight projections with `_part_offset`
:::warning Lightweight projections are Beta for ClickStack
`_part_offset`-based lightweight projections aren’t recommended for ClickStack workloads. While they reduce storage and write I/O, they can introduce more random access at query time, and their production behavior at observability scale is still being evaluated. This recommendation may change as the feature matures and we gain more operational data.
:::

Lightweight projections store only a `_part_offset` pointer into the base table, rather than duplicating full rows. This can greatly reduce storage overhead, and recent improvements enable granule-level pruning, making them behave more like true secondary indexes.
### Alternatives
If you need multiple ordering keys, projections aren’t the only option. Depending on operational constraints and how you want ClickStack to route queries, consider:

- Configuring your OpenTelemetry collector to write to two tables with different `ORDER BY` keys, and creating separate ClickStack sources for each table.
- Creating a materialized view as a copy pipeline i.e. attaching a materialized view to the main table that selects raw rows into a secondary table with a different ordering key (a denormalization or routing pattern), and creating a source for this target table. Examples can be found here.