C++ style guide
General recommendations
The following are recommendations, not requirements. If you are editing code, it makes sense to follow the formatting of the existing code. Code style is needed for consistency. Consistency makes it easier to read the code, and it also makes it easier to search the code. Many of the rules do not have logical reasons; they are dictated by established practices.
Formatting
1. Most of the formatting is done automatically by clang-format.
2. Indents are 4 spaces. Configure your development environment so that a tab adds four spaces.
3. Opening and closing curly brackets must be on a separate line.
4. If the entire function body is a single statement, it can be placed on a single line. Place spaces around curly braces (besides the space at the end of the line).
6. In if, for, while and other expressions, a space is inserted in front of the opening bracket (as opposed to function calls).
7. Add spaces around binary operators (+, -, *, /, %, …) and the ternary operator ?:.
10. Do not use spaces around the operators ., ->.
If necessary, the operator can be wrapped to the next line. In this case, the offset in front of it is increased.
11. Do not use a space to separate unary operators (--, ++, *, &, …) from the argument.
12. Put a space after a comma, but not before it. The same rule goes for a semicolon inside a for expression.
13. Do not use spaces to separate the [] operator.
14. In a template <...> expression, use a space between template and <; no spaces after < or before >.
15. In classes and structs, write public, private, and protected on the same level as class/struct, and indent the rest of the code.
16. If the same namespace is used for the entire file, and there isn’t anything else significant, an offset is not necessary inside namespace.
17. If the block for an if, for, while, or other expression consists of a single statement, the curly brackets are optional. Place the statement on a separate line, instead. This rule is also valid for nested if, for, while, …
But if the inner statement contains curly brackets or else, the external block should be written in curly brackets.
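The formatting rules above can be illustrated with a minimal sketch. The function names `countPositive` and `sumOfMagnitudes` are hypothetical and exist only for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

/// Curly brackets on separate lines, a space after `for` and `if`,
/// spaces around binary operators, none around unary operators or [].
size_t countPositive(const std::vector<int> & values)
{
    size_t count = 0;
    for (size_t i = 0; i < values.size(); ++i)
        if (values[i] > 0)
            ++count;
    return count;
}

/// The inner statement contains `else`, so the outer block is written in curly brackets.
int sumOfMagnitudes(const std::vector<int> & values)
{
    int sum = 0;
    for (int value : values)
    {
        if (value >= 0)
            sum += value;
        else
            sum -= value;
    }
    return sum;
}
```

In the first function every block is a single statement, so the curly brackets are omitted at each nesting level.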
A const (related to a value) must be written before the type name.
When declaring a pointer or reference, the * and & symbols should be separated by spaces on both sides.
When using template types, alias them with the using keyword (except in the simplest cases).
In other words, the template parameters are specified only in using and aren’t repeated in the code.
using can be declared locally, such as inside a function.
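A short sketch of the const placement, pointer and reference spacing, and using rules. The alias `WordCounts` and both functions are hypothetical names chosen for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

/// The template parameters are spelled out once, in `using`, and not repeated in the code.
using WordCounts = std::map<std::string, size_t>;

WordCounts countWords(const std::vector<std::string> & words)
{
    WordCounts counts;
    for (const std::string & word : words)
        ++counts[word];
    return counts;
}

/// `const` is written before the type name; `*` and `&` have spaces on both sides.
const size_t * findCount(const WordCounts & counts, const std::string & word)
{
    const auto it = counts.find(word);
    if (it == counts.end())
        return nullptr;
    return &it->second;
}
```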
Comments
1. Be sure to add comments for all non-trivial parts of code. This is very important. Writing the comment might help you realize that the code isn’t necessary, or that it is designed wrong.
8. Single-line comments begin with three slashes: /// and multi-line comments begin with /**. These comments are considered “documentation”.
Note: You can use Doxygen to generate documentation from these comments. But Doxygen is not generally used because it is more convenient to navigate the code in the IDE.
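As an illustration of the two "documentation" comment forms, here is a minimal sketch. The class `LineCounter` is hypothetical, and the exact alignment of the multi-line comment is an assumption about layout rather than a stated rule.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

/// Hypothetical class used only to show the comment styles.
class LineCounter
{
public:
    /** Counts the lines in `text`.
      * An empty string has zero lines; a final line without '\n' still counts.
      */
    static size_t count(const std::string & text)
    {
        if (text.empty())
            return 0;

        size_t lines = 0;
        for (char c : text)
            if (c == '\n')
                ++lines;

        /// The last line may lack a terminating newline.
        if (text.back() != '\n')
            ++lines;
        return lines;
    }
};
```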
9. Multi-line comments must not have empty lines at the beginning and end (except the line that closes a multi-line comment).
10. For commenting out code, use basic comments, not “documenting” comments.
11. Delete the commented out parts of the code before committing.
12. Do not use profanity in comments or code.
13. Do not use uppercase letters. Do not use excessive punctuation.
Names
1. Use lowercase letters with underscores in the names of variables and class members.
4. Aliases declared with using are named the same way as classes.
5. Names of template type arguments: in simple cases, use T; T, U; T1, T2.
For more complex cases, either follow the rules for class names, or add the prefix T.
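6. Names of template constant arguments: either follow the rules for variable names, or use N in simple cases.
7. For abstract classes (interfaces) you can add the I prefix.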
Names of defines and global constants use ALL_CAPS with underscores.
If the name contains an abbreviation, then:
- For variable names, the abbreviation should use lowercase letters: mysql_connection (not mySQL_connection).
- For names of classes and functions, keep the uppercase letters in the abbreviation: MySQLConnection (not MySqlConnection).
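The naming rules so far can be sketched as follows. The class body, the `address` method, and the constant are hypothetical and only illustrate the naming; 3306 is the conventional MySQL port.

```cpp
#include <cassert>
#include <string>

/// Global constant: ALL_CAPS with underscores.
constexpr int DEFAULT_MYSQL_PORT = 3306;

/// Class name: the abbreviation keeps its uppercase letters.
class MySQLConnection
{
public:
    explicit MySQLConnection(const std::string & host)
        : host_name(host)
    {
    }

    std::string address() const
    {
        return host_name + ":" + std::to_string(DEFAULT_MYSQL_PORT);
    }

private:
    /// Class member: lowercase letters with underscores.
    std::string host_name;
};
```

A variable holding such an object would be named `mysql_connection`, with the abbreviation in lowercase.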
For the constants in an enum, use CamelCase with a capital letter. ALL_CAPS is also acceptable. If the enum is non-local, use an enum class.
16. Abbreviations are acceptable if they are well known, for example AST, SQL, but not NVDH (some random letters).
Incomplete words are acceptable if the shortened version is in common use.
You can also use an abbreviation if the full name is included next to it in the comments.
17. File names with C++ source code must have the .cpp extension. Header files must have the .h extension.
How to write code
1. Memory management. Manual memory deallocation (delete) can only be used in library code.
In library code, the delete operator can only be used in destructors.
In application code, memory must be freed by the object that owns it.
Examples:
- The easiest way is to place an object on the stack, or make it a member of another class.
- For a large number of small objects, use containers.
- For automatic deallocation of a small number of objects that reside in the heap, use shared_ptr/unique_ptr.
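A minimal sketch of these ownership options, with no explicit delete anywhere. `Column` and `Table` are hypothetical types invented for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

struct Column
{
    std::string name;
    /// Many small objects live in a container, which frees them itself.
    std::vector<int> values;
};

class Table
{
public:
    void addColumn(const std::string & name)
    {
        /// A few heap objects are owned through unique_ptr.
        auto column = std::make_unique<Column>();
        column->name = name;
        columns.push_back(std::move(column));
    }

    size_t size() const { return columns.size(); }

private:
    std::vector<std::unique_ptr<Column>> columns;
};
```

A `Table` placed on the stack releases all of its columns when it goes out of scope.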
2. Resource management.
Use RAII and see above.
3. Error handling.
Use exceptions. In most cases, you only need to throw an exception, and do not need to catch it (because of RAII).
In offline data processing applications, it’s often acceptable to not catch exceptions.
In servers that handle user requests, it’s usually enough to catch exceptions at the top level of the connection handler.
In thread functions, you should catch and keep all exceptions to rethrow them in the main thread after join.
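One way to keep an exception from a thread function and rethrow it after join is std::exception_ptr. The sketch below assumes a single worker thread; `failingTask` and `runInThread` are hypothetical names.

```cpp
#include <cassert>
#include <exception>
#include <stdexcept>
#include <string>
#include <thread>

/// Hypothetical task that always fails.
void failingTask()
{
    throw std::runtime_error("task failed");
}

void runInThread()
{
    std::exception_ptr exception;

    std::thread thread([&]
    {
        try
        {
            failingTask();
        }
        catch (...)
        {
            /// Keep the exception instead of letting it escape the thread.
            exception = std::current_exception();
        }
    });

    thread.join();

    /// Rethrow in the calling thread after join.
    if (exception)
        std::rethrow_exception(exception);
}
```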
When using functions with response codes or errno, always check the result and throw an exception in case of error.
- Create a function (done() or finalize()) that will do all the work in advance that might lead to an exception. If that function was called, there should be no exceptions in the destructor later.
- Tasks that are too complex (such as sending messages over the network) can be put in a separate method that the class user will have to call before destruction.
- If there is an exception in the destructor, it’s better to log it than to hide it (if the logger is available).
- In simple applications, it is acceptable to rely on std::terminate (for cases of noexcept by default in C++11) to handle exceptions.
- Try to get the best possible performance on a single CPU core. You can then parallelize your code if necessary.
- Use the thread pool to process requests. At this point, we haven’t had any tasks that required userspace context switching.
joinAll).
If synchronization is required, in most cases, it is sufficient to use mutex under lock_guard.
In other cases use system synchronization primitives. Do not use busy wait.
Atomic operations should be used only in the simplest cases.
Do not try to implement lock-free data structures unless it is your primary area of expertise.
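For the common case, a mutex under lock_guard might look like the sketch below. `Counter` and `countInParallel` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

class Counter
{
public:
    void increment()
    {
        std::lock_guard<std::mutex> lock(mutex);
        ++value;
    }

    size_t get() const
    {
        std::lock_guard<std::mutex> lock(mutex);
        return value;
    }

private:
    mutable std::mutex mutex;
    size_t value = 0;
};

size_t countInParallel(size_t num_threads, size_t increments_per_thread)
{
    Counter counter;
    std::vector<std::thread> threads;

    for (size_t i = 0; i < num_threads; ++i)
    {
        threads.emplace_back([&counter, increments_per_thread]
        {
            for (size_t j = 0; j < increments_per_thread; ++j)
                counter.increment();
        });
    }

    for (auto & thread : threads)
        thread.join();

    return counter.get();
}
```

The lock is released automatically when `lock` goes out of scope, including when an exception is thrown.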
9. Pointers vs references.
In most cases, prefer references.
10. const.
Use constant references, pointers to constants, const_iterator, and const methods.
Consider const to be default and use non-const only when necessary.
When passing variables by value, using const usually does not make sense.
11. unsigned.
Use unsigned if necessary.
12. Numeric types.
Use the types UInt8, UInt16, UInt32, UInt64, Int8, Int16, Int32, and Int64, as well as size_t, ssize_t, and ptrdiff_t.
Don’t use these types for numbers: signed/unsigned long, long long, short, signed/unsigned char, char.
13. Passing arguments.
Pass complex values by value if they are going to be moved, and use std::move; pass by reference if you want to update a value in a loop.
If a function captures ownership of an object created in the heap, make the argument type shared_ptr or unique_ptr.
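A sketch of both argument-passing cases, using hypothetical `Message` and `Outbox` classes: a value that is moved into a member, and a unique_ptr argument that makes the ownership transfer explicit.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

class Message
{
public:
    /// Passed by value because it is going to be moved into the member.
    explicit Message(std::string message_text)
        : text(std::move(message_text))
    {
    }

    const std::string & getText() const { return text; }

private:
    std::string text;
};

class Outbox
{
public:
    /// The argument type shows that the function takes ownership.
    void push(std::unique_ptr<Message> message)
    {
        messages.push_back(std::move(message));
    }

    size_t size() const { return messages.size(); }

private:
    std::vector<std::unique_ptr<Message>> messages;
};
```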
14. Return values.
In most cases, just use return. Do not write return std::move(res).
If the function allocates an object on heap and returns it, use shared_ptr or unique_ptr.
In rare cases (updating a value in a loop) you might need to return the value via an argument. In this case, the argument should be a reference.
15. namespace.
There is no need to use a separate namespace for application code.
Small libraries do not need this, either.
For medium to large libraries, put everything in a namespace.
In the library’s .h file, you can use namespace detail to hide implementation details not needed for the application code.
In a .cpp file, you can use a static or anonymous namespace to hide symbols.
Also, a namespace can be used for an enum to prevent the corresponding names from falling into an external namespace (but it’s better to use an enum class).
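A minimal sketch of a small library header that follows these namespace rules. The library name `text_utils` and everything inside it are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

namespace text_utils
{

namespace detail
{

/// Implementation detail, not needed by application code.
inline bool isSpace(char c)
{
    return c == ' ' || c == '\t';
}

}

/// An enum class keeps its names out of the enclosing namespace.
enum class TrimMode
{
    Left,
    Both,
};

inline std::string trim(const std::string & s, TrimMode mode)
{
    size_t begin = 0;
    size_t end = s.size();

    while (begin < end && detail::isSpace(s[begin]))
        ++begin;

    if (mode == TrimMode::Both)
        while (end > begin && detail::isSpace(s[end - 1]))
            --end;

    return s.substr(begin, end - begin);
}

}
```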
16. Deferred initialization.
If arguments are required for initialization, then you normally shouldn’t write a default constructor.
If later you’ll need to delay initialization, you can add a default constructor that will create an invalid object. Or, for a small number of objects, you can use shared_ptr/unique_ptr.
Use std::string and char *. Do not use std::wstring and wchar_t.
19. Logging.
See the examples everywhere in the code.
Before committing, delete all meaningless and debug logging, and any other types of debug output.
Logging in loops should be avoided, even on the Trace level.
Logs must be readable at any logging level.
Logging should only be used in application code, for the most part.
Log messages must be written in English.
The log should preferably be understandable for the system administrator.
Do not use profanity in the log.
Use UTF-8 encoding in the log. In rare cases you can use non-ASCII characters in the log.
20. Input-output.
Don’t use iostreams in inner loops that are critical for application performance (and never use stringstream).
Use the DB/IO library instead.
21. Date and time.
See the DateLUT library.
22. include.
Always use #pragma once instead of include guards.
23. using.
using namespace is not used. You can use using with something specific. But make it local inside a class or function.
24. Do not use trailing return type for functions unless necessary.
For virtual functions, write virtual in the base class, but write override instead of virtual in descendent classes.
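For example, with a hypothetical interface `IShape` and one descendant:

```cpp
#include <cassert>
#include <string>

class IShape
{
public:
    virtual ~IShape() = default;

    /// `virtual` appears only in the base class.
    virtual std::string name() const = 0;
};

class Square : public IShape
{
public:
    /// The descendant writes `override` instead of `virtual`.
    std::string name() const override { return "Square"; }
};
```

With override, the compiler reports an error if the signature stops matching the base class function.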
Unused features of C++
1. Virtual inheritance is not used.
2. Constructs which have convenient syntactic sugar in modern C++, e.g.
Platform
1. We write code for a specific platform. But other things being equal, cross-platform or portable code is preferred.
2. Language: C++20 (see the list of available C++20 features).
3. Compiler: clang. At the time of writing (March 2025), the code is compiled using clang version >= 19.
The standard library is used (libc++).
4. OS: Linux Ubuntu, not older than Precise.
5. Code is written for x86_64 CPU architecture.
The CPU instruction set is the minimum supported set among our servers. Currently, it is SSE 4.2.
6. Use -Wall -Wextra -Werror -Weverything compilation flags with a few exceptions.
7. Use static linking with all libraries except those that are difficult to connect to statically (see the output of the ldd command).
8. Code is developed and debugged with release settings.
Tools
1. KDevelop is a good IDE.
2. For debugging, use gdb, valgrind (memcheck), strace, -fsanitize=..., or tcmalloc_minimal_debug.
3. For profiling, use Linux Perf, valgrind (callgrind), or strace -cf.
4. Sources are in Git.
5. The build uses CMake.
6. Programs are released using deb packages.
7. Commits to master must not break the build.
Though only selected revisions are considered workable.
8. Make commits as often as possible, even if the code is only partially ready.
Use branches for this purpose.
If your code in the master branch is not buildable yet, exclude it from the build before the push. You’ll need to finish it or remove it within a few days.
9. For non-trivial changes, use branches and publish them on the server.
10. Unused code is removed from the repository.
Libraries
1. The C++20 standard library is used (experimental extensions are allowed), as well as the boost and Poco frameworks.
2. It is not allowed to use libraries from OS packages. It is also not allowed to use pre-installed libraries. All libraries should be placed in the form of source code in the contrib directory and built with ClickHouse. See Guidelines for adding new third-party libraries for details.
3. Preference is always given to libraries that are already in use.
General recommendations
1. Write as little code as possible.
2. Try the simplest solution.
3. Don’t write code until you know how it’s going to work and how the inner loop will function.
4. In the simplest cases, use using instead of classes or structs.
5. If possible, do not write copy constructors, assignment operators, destructors (other than a virtual one, if the class contains at least one virtual function), move constructors or move assignment operators. In other words, the compiler-generated functions must work correctly. You can use default.
6. Code simplification is encouraged. Reduce the size of your code where possible.
Additional recommendations
1. Explicitly specifying std:: for types from stddef.h is not recommended. In other words, we recommend writing size_t instead of std::size_t, because it’s shorter.
It is acceptable to add std::.
2. Explicitly specifying std:: for functions from the standard C library is not recommended. In other words, write memcpy instead of std::memcpy.
The reason is that there are similar non-standard functions, such as memmem. We do use these functions on occasion. These functions do not exist in namespace std.
If you write std::memcpy instead of memcpy everywhere, then memmem without std:: will look strange.
Nevertheless, you can still use std:: if you prefer it.
3. Using functions from C when the same ones are available in the standard C++ library.
This is acceptable if it is more efficient.
For example, use memcpy instead of std::copy for copying large chunks of memory.
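A minimal sketch of copying a chunk of memory with memcpy, written without the std:: prefix as recommended above. The helper name `copyBytes` is hypothetical; the header string.h is used so that the unqualified name is guaranteed to be declared.

```cpp
#include <cassert>
#include <cstddef>
#include <string.h>
#include <vector>

/// Copies `size` bytes starting at `src` into a new buffer.
std::vector<char> copyBytes(const char * src, size_t size)
{
    std::vector<char> dst(size);
    if (size)
        memcpy(dst.data(), src, size);
    return dst;
}
```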
4. Multiline function arguments.
Any of the following wrapping styles are allowed: