Using ClickHouse’s azureBlobStorage table function
This is one of the most efficient and straightforward ways to copy data from Azure Blob Storage or Azure Data Lake Storage into ClickHouse. With this table function, you can instruct ClickHouse to connect directly to Azure storage and read data on demand. It provides a table-like interface that allows you to select, insert, and filter data directly from the source. The function is highly optimized and supports many widely used file formats, including CSV, JSON, Parquet, Arrow, TSV, ORC, Avro, and more. For the full list, see "Data formats".
In this section, we’ll walk through a simple startup guide for transferring
data from Azure Blob Storage to ClickHouse, along with important considerations
for using this function effectively. For more details and advanced options,
refer to the official documentation:
azureBlobStorage Table Function documentation page
Acquiring Azure Blob Storage Access Keys
To allow ClickHouse to access your Azure Blob Storage, you’ll need a connection string with an access key.
- In the Azure portal, navigate to your Storage Account.
- In the left-hand menu, select Access keys under the Security + networking section.
- Choose either key1 or key2, and click the Show button next to the Connection string field.
- Copy the connection string — you’ll use this as a parameter in the azureBlobStorage table function.
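The copied value follows Azure's standard connection string shape. As a rough sketch (the account name and key shown here are placeholders, not values from this guide):

```text
DefaultEndpointsProtocol=https;AccountName=<your-account>;AccountKey=<your-key>;EndpointSuffix=core.windows.net
```

Keep this string secret: the embedded account key grants full access to the storage account.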
Querying the data from Azure Blob Storage
Open your preferred ClickHouse query console. This can be the ClickHouse Cloud web interface, the ClickHouse CLI client, or any other tool you use to run queries. Once you have both the connection string and your ClickHouse query console ready, you can start querying data directly from Azure Blob Storage. In the following example, we query all the data stored in JSON files located in a container named data-container.
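A minimal sketch of such a query, assuming the files are in JSONEachRow format; the connection string placeholder, the data-container name, and the `*.json` glob are illustrative:

```sql
-- Read JSON files directly from the container via the table function.
-- '<connection_string>' is the value copied from the Azure portal.
SELECT *
FROM azureBlobStorage(
    '<connection_string>',
    'data-container',
    '*.json',
    'JSONEachRow')
LIMIT 5;
```

Because the function exposes a table-like interface, the same `FROM azureBlobStorage(...)` clause also works inside `WHERE` filters, aggregations, or an `INSERT INTO ... SELECT` that copies the data into a local table.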
A simple example using the Environmental sensors dataset
As an example, we will download a single file from the Environmental Sensors Dataset.
- Download a sample file from the Environmental Sensors Dataset.
- In the Azure Portal, create a new storage account if you don’t already have one.
- Create a new container in your storage account. In this example, we name it sensors. You can skip this step if you’re using an existing container.
- Upload the previously downloaded 2019-06_bmp180.csv.zst file to the container.
- Follow the steps described earlier to obtain the Azure Blob Storage connection string.
- To load the data into a table, create a simplified version of the schema used in the original dataset:
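One possible simplified schema is sketched below; the column subset and types are an assumption based on what a BMP180 sensor reports, not the exact DDL from the original dataset guide:

```sql
-- Simplified sensors table; the full dataset has more columns.
CREATE TABLE sensors
(
    sensor_id   UInt16,
    lat         Float32,
    lon         Float32,
    timestamp   DateTime,
    temperature Float32
)
ENGINE = MergeTree
ORDER BY (timestamp, sensor_id);
```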
For more information on configuration options and schema inference when
querying external sources like Azure Blob Storage, see Automatic schema
inference from input data
- Now insert the data from Azure Blob Storage into the sensors table, reading the 2019-06_bmp180.csv.zst file stored in Azure Blob Storage.
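A sketch of that insert, assuming the simplified sensors table above, a container named sensors, and a CSVWithNames input format (the actual dataset files may need a different format or delimiter setting; ClickHouse detects the zstd compression from the .zst extension):

```sql
-- Copy the uploaded file from Azure Blob Storage into the local table.
-- '<connection_string>' is the value copied from the Azure portal.
INSERT INTO sensors
SELECT sensor_id, lat, lon, timestamp, temperature
FROM azureBlobStorage(
    '<connection_string>',
    'sensors',
    '2019-06_bmp180.csv.zst',
    'CSVWithNames');
```

After the insert completes, a quick `SELECT count() FROM sensors` confirms the rows arrived.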