Documentation Index
Fetch the complete documentation index at: https://private-7c7dfe99-page-updates.mintlify.app/llms.txt
Use this file to discover all available pages before exploring further.
How to build a LangChain/LangGraph AI agent using ClickHouse MCP server
In this guide, you’ll learn how to build a LangChain/LangGraph AI agent that
can interact with ClickHouse’s SQL playground using ClickHouse’s MCP server.
Prerequisites
- You’ll need to have Python installed on your system.
- You’ll need to have
pip installed on your system.
- You’ll need an Anthropic API key, or API key from another LLM provider
You can run the following steps either from your Python REPL or via script.
Install libraries
Install the required libraries by running the following commands:pip install -q --upgrade pip
pip install -q langchain-mcp-adapters langgraph "langchain[anthropic]"
Setup credentials
Next, you’ll need to provide your Anthropic API key:import os, getpass
os.environ["ANTHROPIC_API_KEY"] = getpass.getpass("Enter Anthropic API Key:")
Enter Anthropic API Key: ········
Using another LLM providerIf you don’t have an Anthropic API key, and want to use another LLM provider,
you can find the instructions for setting up your credentials in the Langchain Providers docs Initialize MCP server
Now configure the ClickHouse MCP server to point at the ClickHouse SQL playground:from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client
server_params = StdioServerParameters(
command="uv",
args=[
"run",
"--with", "mcp-clickhouse",
"--python", "3.13",
"mcp-clickhouse"
],
env={
"CLICKHOUSE_HOST": "sql-clickhouse.clickhouse.com",
"CLICKHOUSE_PORT": "8443",
"CLICKHOUSE_USER": "demo",
"CLICKHOUSE_PASSWORD": "",
"CLICKHOUSE_SECURE": "true"
}
)
When working with Langchain and ClickHouse MCP server, query results are often
returned as streaming data rather than a single response. For large datasets or
complex analytical queries that may take time to process, it’s important to configure
a stream handler. Without proper handling, this streamed output can be difficult
to work with in your application.Configure the handler for the streamed output so that it’s easier to consume:class UltraCleanStreamHandler:
def __init__(self):
self.buffer = ""
self.in_text_generation = False
self.last_was_tool = False
def handle_chunk(self, chunk):
event = chunk.get("event", "")
if event == "on_chat_model_stream":
data = chunk.get("data", {})
chunk_data = data.get("chunk", {})
# Only handle actual text content, skip tool invocation streams
if hasattr(chunk_data, 'content'):
content = chunk_data.content
if isinstance(content, str) and not content.startswith('{"'):
# Add space after tool completion if needed
if self.last_was_tool:
print(" ", end="", flush=True)
self.last_was_tool = False
print(content, end="", flush=True)
self.in_text_generation = True
elif isinstance(content, list):
for item in content:
if (isinstance(item, dict) and
item.get('type') == 'text' and
'partial_json' not in str(item)):
text = item.get('text', '')
if text and not text.startswith('{"'):
# Add space after tool completion if needed
if self.last_was_tool:
print(" ", end="", flush=True)
self.last_was_tool = False
print(text, end="", flush=True)
self.in_text_generation = True
elif event == "on_tool_start":
if self.in_text_generation:
print(f"\n🔧 {chunk.get('name', 'tool')}", end="", flush=True)
self.in_text_generation = False
elif event == "on_tool_end":
print(" ✅", end="", flush=True)
self.last_was_tool = True
Call the agent
Finally, call your agent and ask it who’s committed the most code to ClickHouse:async with stdio_client(server_params) as (read, write):
async with ClientSession(read, write) as session:
await session.initialize()
tools = await load_mcp_tools(session)
agent = create_react_agent("anthropic:claude-sonnet-4-0", tools)
handler = UltraCleanStreamHandler()
async for chunk in agent.astream_events(
{"messages": [{"role": "user", "content": "Who's committed the most code to ClickHouse?"}]},
version="v1"
):
handler.handle_chunk(chunk)
print("\n")
You should see a similar response as below:I'll help you find who has committed the most code to ClickHouse by exploring the available databases and tables to locate git commit data.
🔧 list_databases ✅ I can see there's a `git` database which likely contains git commit information. Let me explore the tables in that database:
🔧 list_tables ✅ Perfect! I can see the `clickhouse_commits` table in the git database contains ClickHouse commit data with 80,644 commits. This table has information about each commit including the author, lines added/deleted, files modified, etc. Let me query this table to find who has committed the most code based on different metrics.
🔧 run_select_query ✅ Let me also look at just the lines added to see who has contributed the most new code:
🔧 run_select_query ✅ Based on the ClickHouse git commit data, **Alexey Milovidov** has committed the most code to ClickHouse by several measures:
## Key Statistics:
1. **Most Total Lines Changed**: Alexey Milovidov with **1,696,929 total lines changed** (853,049 added + 843,880 deleted)
2. **Most Lines Added**: Alexey Milovidov with **853,049 lines added**
3. **Most Commits**: Alexey Milovidov with **15,375 commits**
4. **Most Files Changed**: Alexey Milovidov with **73,529 files changed**
## Top Contributors by Lines Added:
1. **Alexey Milovidov**: 853,049 lines added (15,375 commits)
2. **s-kat**: 541,609 lines added (50 commits)
3. **Nikolai Kochetov**: 219,020 lines added (4,218 commits)
4. **alesapin**: 193,566 lines added (4,783 commits)
5. **Vitaly Baranov**: 168,807 lines added (1,152 commits)
Alexey Milovidov is clearly the most prolific contributor to ClickHouse, which makes sense as he is one of the original creators and lead developers of the project. His contribution dwarfs others both in terms of total code volume and number of commits, with nearly 16,000 commits and over 850,000 lines of code added to the project.