This example demonstrates how to create a materialized view, and then how to cascade a second materialized view on to the first. In this page, you will see how to do it, many of the possibilities, and the limitations. Different use cases can be answered by creating a Materialized view using a second Materialized view as the source.Documentation Index
Fetch the complete documentation index at: https://private-7c7dfe99-page-updates.mintlify.app/llms.txt
Use this file to discover all available pages before exploring further.
Example: We will use a fake dataset with the number of views per hour for a group of domain names. Our Goal
- We need the data aggregated by month for each domain name,
- We also need the data aggregated by year for each domain name.
- Write queries that will read and aggregate the data during the SELECT request
- Prepare the data at the ingest time to a new format
- Prepare the data at the time of ingest to a specific aggregation.
Source table for the materialized views
Create the source table, because our goals involve reporting on the aggregated data and not the individual rows, we can parse it, pass the information on to the Materialized Views, and discard the actual incoming data. This meets our goals and saves on storage so we will use theNull table engine.
You can create a materialized view on a Null table. So the data written to the table will end up affecting the view, but the original raw data will still be discarded.
Monthly aggregated table and materialized view
For the first materialized view, we need to create theTarget table, for this example, it will be analytics.monthly_aggregated_data and we will store the sum of the views by month and domain name.
Yearly aggregated table and materialized view
Now we will create the second Materialized view that will be linked to our previous target tablemonthly_aggregated_data.
First, we will create a new target table that will store the sum of views aggregated by year for each domain name.
FROM statement will use the monthly_aggregated_data table, this means the data flow will be:
- The data comes to the
hourly_datatable. - ClickHouse will forward the data received to the first materialized view
monthly_aggregated_datatable, - Finally, the data received in step 2 will be forwarded to the
year_aggregated_data.
A common misinterpretation when working with Materialized views is that data is read from the table, This isn’t how
Materialized views work; the data forwarded is the inserted block, not the final result in your table.Let’s imagine in this example that the engine used in monthly_aggregated_data is a CollapsingMergeTree, the data forwarded to our second Materialized view year_aggregated_data_mv won’t be the final result of the collapsed table, it will forward the block of data with the fields defined as in the SELECT ... GROUP BY.If you’re using CollapsingMergeTree, ReplacingMergeTree, or even SummingMergeTree and you plan to create a cascade Materialized view you need to understand the limitations described here.Sample data
Now is the time to test our cascade materialized view by inserting some data:analytics.hourly_data you will see the following because the table engine is Null, but the data was processed.
Results
If you try to query the target table by selecting thesumCountViews field, you will see the binary representation (in some terminals), as the value isn’t stored as a number but as an AggregateFunction type.
To get the final result of the aggregation you should use the -Merge suffix.
You can see the special characters stored in AggregateFunction with this query:
Merge suffix to get the sumCountViews value:
AggregatingMergeTree we have defined the AggregateFunction as sum, so we can use the sumMerge. When we use the function avg on the AggregateFunction, we will use avgMerge, and so forth.
monthly_aggregated_data we can get the data aggregated by month for each domain name:
Combining multiple source tables to single target table
Materialized views can also be used to combine multiple source tables into the same destination table. This is useful for creating a materialized view that is similar to aUNION ALL logic.
First, create two source tables representing different sets of metrics:
Target table with the combined set of metrics:
Target table. You don’t need to explicitly include the missing columns:
Target table:
Target table: