Kafka and the JSON Data Type
With the introduction of the new JSON data type, ClickHouse is now a good choice of database for JSON analytics.
In this guide, we’re going to learn how to load JSON messages from Apache Kafka directly into a single JSON column in ClickHouse.
Set up Kafka
Let’s start by running a Kafka broker on our machine. We’re also going to map the broker’s port 9092 to port 9092 on our host operating system so that it’s easier to interact with Kafka.
Ingest data into Kafka
Once that’s running, we need to ingest some data. The Wikimedia recent changes feed is a good source of streaming data, so let’s ingest that into the wiki_events topic.
Ingest data into ClickHouse
Next, we’re going to ingest the data into ClickHouse. First, let’s enable the JSON type (which is currently experimental) by turning on the allow_experimental_json_type setting.
Then we’ll create the wiki_queue table, which uses the Kafka table engine.
We’re using the JSONAsObject format, which ensures that each incoming message is made available as a JSON object.
This format can only be parsed into a table that has a single column with the JSON type.
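As a minimal sketch of these two steps, the statements below enable the experimental JSON type for the current session and create the Kafka-backed queue table. The broker address localhost:9092 follows from the port mapping described earlier; the column name json and the consumer group name clickhouse-wiki-consumer are assumptions chosen for illustration.

```sql
-- Enable the experimental JSON type for this session.
SET allow_experimental_json_type = 1;

-- Queue table that reads from the wiki_events topic.
-- JSONAsObject requires exactly one column of type JSON.
CREATE TABLE wiki_queue
(
    json JSON  -- column name is an assumption
)
ENGINE = Kafka
SETTINGS
    kafka_broker_list = 'localhost:9092',
    kafka_topic_list = 'wiki_events',
    kafka_group_name = 'clickhouse-wiki-consumer',  -- hypothetical group name
    kafka_format = 'JSONAsObject';
```

A Kafka engine table is a consumer rather than a store: reading from it advances the consumer group's offsets, which is why the data is moved into a regular table in the following steps.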
Next, we’ll create the underlying table to store the Wiki data.
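One possible shape for this table is sketched below, assuming a MergeTree table with a single JSON column named json and no particular sort key. In practice we would probably order by a field that we filter on often.

```sql
-- Storage table for the Wikimedia events.
-- The empty sort key is an assumption made to keep the sketch small.
CREATE TABLE wiki
(
    json JSON
)
ENGINE = MergeTree
ORDER BY tuple();
```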
Finally, we’ll create a materialized view to populate the wiki table.
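A materialized view is the usual way to move rows from a Kafka engine table into a storage table. Here is a minimal sketch, assuming both tables have a single column named json; the view name wiki_mv is hypothetical.

```sql
-- Each batch consumed from wiki_queue is inserted into wiki.
CREATE MATERIALIZED VIEW wiki_mv TO wiki AS
SELECT json
FROM wiki_queue;
```

Once the view exists, ClickHouse starts consuming the topic in the background, and new rows should begin to appear in the wiki table.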
Querying JSON data in ClickHouse
We can then write queries against the wiki table.
For example, we could count the number of bots that have committed changes:
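A sketch of such a query follows. It assumes that each event carries a boolean bot field, as in the Wikimedia recent changes schema, and that the JSON column is named json. It counts changes split by the bot flag, casting the subcolumn to a concrete type because untyped JSON paths are read as Dynamic values.

```sql
-- Count changes made by bots versus everyone else.
SELECT
    CAST(json.bot, 'Nullable(Bool)') AS is_bot,
    count() AS changes
FROM wiki
GROUP BY is_bot
ORDER BY changes DESC;
```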
We could also restrict the analysis to changes made on en.wikipedia.org.
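As one illustration of filtering on en.wikipedia.org, the sketch below lists the ten users with the most changes on that wiki. The server_name and user fields are assumptions taken from the Wikimedia recent changes schema, and the choice of aggregation is ours.

```sql
-- Top 10 users by number of changes on English Wikipedia.
SELECT
    CAST(json.user, 'Nullable(String)') AS user_name,
    count() AS changes
FROM wiki
WHERE CAST(json.server_name, 'Nullable(String)') = 'en.wikipedia.org'
GROUP BY user_name
ORDER BY changes DESC
LIMIT 10;
```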