Documentation Index
Fetch the complete documentation index at: https://private-7c7dfe99-page-updates.mintlify.app/llms.txt
Use this file to discover all available pages before exploring further.
Goal
In this guide you will learn how to:- Load the OpenCelliD data in ClickHouse
- Connect Apache Superset to ClickHouse
- Build a dashboard based on data available in the dataset
Get the dataset
This dataset is from OpenCelliD - The world’s largest Open Database of Cell Towers. As of 2021, it contains more than 40 million records about cell towers (GSM, LTE, UMTS, etc.) around the world with their geographical coordinates and metadata (country code, network, etc.). OpenCelliD Project is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License, and we redistribute a snapshot of this dataset under the terms of the same license. The up-to-date version of the dataset is available to download after sign in.- ClickHouse Cloud
- Self-managed
Load the sample data
ClickHouse Cloud provides an easy-button for uploading this dataset from S3. Log in to your ClickHouse Cloud organization, or create a free trial at ClickHouse.cloud.Select your service, followed byData sources -> Predefined sample data.Choose the Cell Towers dataset from the Sample data tab, and Load data:Examine the schema of the cell_towers table
Connect to SQL console
Connect to SQL console
From your ClickHouse Cloud services list, click on a service.This will redirect you to the SQL console.
DESCRIBE. Down further in this guide the field type choices will be described.Run some example queries
- A number of cell towers by type:
- Cell towers by mobile country code (MCC):
Use case: incorporate geo data
Using thepointInPolygon function.
- Create a table where we will store polygons:
- ClickHouse Cloud
- Self-managed
- This is a rough shape of Moscow (without “new Moscow”):
- Check how many cell towers are in Moscow:
Review of the schema
Before building visualizations in Superset have a look at the columns that you will use. This dataset primarily provides the location (Longitude and Latitude) and radio types at mobile cellular towers worldwide. The column descriptions can be found in the community forum. The columns used in the visualizations that will be built are described below Here is a description of the columns taken from the OpenCelliD forum:| Column | Description |
|---|---|
| radio | Technology generation: CDMA, GSM, UMTS, 5G NR |
| mcc | Mobile Country Code: 204 is The Netherlands |
| lon | Longitude: With Latitude, approximate tower location |
| lat | Latitude: With Longitude, approximate tower location |
- The
radiodata is stored as anEnum8(UInt8) rather than a string. mccor Mobile country code, is stored as aUInt16as we know the range is 1 - 999.lonandlatareFloat64.
Build visualizations with Apache Superset
Superset is easy to run from Docker. If you already have Superset running, all you need to do is add ClickHouse Connect withpip install clickhouse-connect. If you need to install Superset open the Launch Apache Superset in Docker directly below.
Launch Apache Superset in Docker
Launch Apache Superset in Docker
Superset provides installing Superset locally using Docker Compose instructions. After checking out the Apache Superset repo from GitHub you can run the latest development code, or a specific tag. We recommend release 2.0.0 as it is the latest release not marked as
pre-release.There are a few tasks to be done before running docker compose:- Add the official ClickHouse Connect driver
- Obtain a Mapbox API key and add that as an environment variable (optional)
- Specify the version of Superset to run
Official ClickHouse connect driver
To make the ClickHouse Connect driver available in the Superset deployment add it to the local requirements file:Mapbox
This is optional, you can plot location data in Superset without a Mapbox API key, but you will see a message telling you that you should add a key and the background image of the map will be missing (you will only see the data points and not the map background). Mapbox provides a free tier if you would like to use it.Some of the sample visualizations that the guides have you create use location, for example longitude and latitude, data. Superset includes support for Mapbox maps. To use the Mapbox visualizations you need a Mapbox API key. Sign up for the Mapbox free tier, and generate an API key.Make the API key available to Superset:Deploy Superset version 2.0.0
To deploy release 2.0.0 run:- Add your ClickHouse service as a Superset database
- Add the table cell_towers as a Superset dataset
- Create some charts
- Add the charts to a dashboard
Add your ClickHouse service as a Superset database
To connect to ClickHouse with HTTP(S) you need this information:| Parameter(s) | Description |
|---|---|
HOST and PORT | Typically, the port is 8443 when using TLS or 8123 when not using TLS. |
DATABASE NAME | Out of the box, there is a database named default, use the name of the database that you want to connect to. |
USERNAME and PASSWORD | Out of the box, the username is default. Use the username appropriate for your use case. |
curl command.
If you’re using self-managed ClickHouse, the connection details are set by your ClickHouse administrator.
In Superset a database can be added by choosing the database type, and then providing the connection details. Open Superset and look for the +, it has a menu with Data and then Connect database options.
Choose ClickHouse Connect from the list:
If ClickHouse Connect isn’t one of your options, then you will need to install it. The command is
pip install clickhouse-connect, and more info is available here.Add your connection details
Add the table cell_towers as a Superset dataset
In Superset a dataset maps to a table within a database. Click on add a dataset and choose your ClickHouse service, the database containing your table (default), and choose the cell_towers table:
Create some charts
When you choose to add a chart in Superset you have to specify the dataset (cell_towers) and the chart type. Since the OpenCelliD dataset provides longitude and latitude coordinates for cell towers we will create a Map chart. The deck.gL Scatterplot type is suited to this dataset as it works well with dense data points on a map.
Specify the query used for the map
A deck.gl Scatterplot requires a longitude and latitude, and one or more filters can also be applied to the query. In this example two filters are applied, one for cell towers with UMTS radios, and one for the Mobile country code assigned to The Netherlands. The fieldslon and lat contain the longitude and latitude:
Add a filter with mcc = 204 (or substitute any other mcc value):
Add a filter with radio = 'UMTS' (or substitute any other radio value, you can see the choices in the output of DESCRIBE TABLE cell_towers):
This is the full configuration for the chart that filters on radio = 'UMTS' and mcc = 204:
Click on UPDATE CHART to render the visualization.