In the examples below you will see the backup destination specified as Disk('backups', '1.zip').
To use the Disk backup engine it is necessary to first add a file specifying
the backup destination at the path below:
/etc/clickhouse-server/config.d/backup_disk.xml
For example, the configuration below defines a disk named backups and then adds that disk to
the backups > allowed_disk list:
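A minimal backup_disk.xml might look like the following sketch; the /backups/ path is illustrative and should be adjusted to your environment:

```xml
<clickhouse>
    <storage_configuration>
        <disks>
            <!-- local disk named "backups" -->
            <backups>
                <type>local</type>
                <path>/backups/</path>
            </backups>
        </disks>
    </storage_configuration>
    <backups>
        <!-- allow the BACKUP/RESTORE commands to use this disk and path -->
        <allowed_disk>backups</allowed_disk>
        <allowed_path>/backups/</allowed_path>
    </backups>
</clickhouse>
```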
It is also possible to BACKUP/RESTORE to S3 by configuring an S3 disk in the
ClickHouse storage configuration. Configure the disk by adding a file to
/etc/clickhouse-server/config.d, as was done above for the local disk.
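As a sketch, an S3 disk configuration could look like the following; the disk name s3_plain matches the examples below, while the endpoint and credentials are placeholders you must fill in:

```xml
<clickhouse>
    <storage_configuration>
        <disks>
            <!-- S3 disk used only as a backup destination -->
            <s3_plain>
                <type>s3_plain</type>
                <endpoint>https://s3.amazonaws.com/my-bucket/backups/</endpoint>
                <access_key_id>REPLACE_WITH_KEY_ID</access_key_id>
                <secret_access_key>REPLACE_WITH_SECRET</secret_access_key>
            </s3_plain>
        </disks>
    </storage_configuration>
    <backups>
        <allowed_disk>s3_plain</allowed_disk>
    </backups>
</clickhouse>
```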
BACKUP/RESTORE for S3 disk is done in the same way as for local disk:
BACKUP TABLE data TO Disk('s3_plain', 'cloud_backup');
RESTORE TABLE data AS data_restored FROM Disk('s3_plain', 'cloud_backup');
This disk should not be used for MergeTree data itself, only for BACKUP/RESTORE.
If your tables are backed by S3 storage and the source and destination disks are of different
types, ClickHouse does not use CopyObject calls to copy parts to the destination bucket;
instead it downloads and re-uploads them, which is very inefficient. In that case, prefer
the BACKUP ... TO S3(<endpoint>) syntax.
The above RESTORE would fail if the table test_db.test_table contains data.
The setting allow_non_empty_tables=true allows RESTORE TABLE to insert data
into non-empty tables. This will mix earlier data in the table with the data extracted from the backup.
This setting can therefore cause data duplication in the table, and should be used with caution.
To restore the table with data already in it, run:
RESTORE TABLE test_db.test_table FROM Disk('backups', '1.zip')
SETTINGS allow_non_empty_tables = true
Tables can be restored, or backed up, with new names:
RESTORE TABLE test_db.test_table AS test_db.test_table_renamed FROM Disk('backups', '1.zip')
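Renaming works in the BACKUP direction as well; as a sketch (reusing the archive name from the examples above):

```sql
BACKUP TABLE test_db.test_table AS test_db.test_table_renamed TO Disk('backups', '1.zip')
```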
The backup archive for this backup has the following structure:
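The exact layout depends on the ClickHouse version; broadly, the archive contains a metadata file describing the backup plus per-table metadata and data, roughly like this sketch:

```
1.zip
├── .backup        -- XML file describing the backup's contents
├── metadata/      -- CREATE queries for the backed-up tables
└── data/          -- the table data parts
```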
A base backup in ClickHouse is the initial, full backup from which subsequent
incremental backups are created. Incremental backups store only the changes
made since the base backup, so the base backup must remain available in order to
restore from any incremental backup. The base backup destination is set with the
base_backup setting.
To make an incremental backup of a table, first make a base backup:
BACKUP TABLE test_db.test_table TO Disk('backups', 'd.zip')
Then take an incremental backup, specifying the base backup with the base_backup setting:
BACKUP TABLE test_db.test_table TO Disk('backups', 'incremental-a.zip')
SETTINGS base_backup = Disk('backups', 'd.zip')
All data from the incremental backup and the base backup can be restored into a
new table test_db.test_table2 with command:
RESTORE TABLE test_db.test_table AS test_db.test_table2
FROM Disk('backups', 'incremental-a.zip');
Backups written to disk can be password-protected.
The password is specified with the password setting.
Password protection is only supported for ZIP archives (.zip, .zipx).
The backup path must end with .zip or .zipx for the password to be accepted.
Using a password with any other format - including tar archives and non-archive paths - will
result in a BAD_ARGUMENTS error: Password is not applicable, backup cannot be encrypted.
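A password-protected backup might look like the following sketch; the archive name and the password value are placeholders:

```sql
BACKUP TABLE test_db.test_table TO Disk('backups', 'protected.zip')
SETTINGS password = 'qwerty'
```

The same password must then be supplied via SETTINGS password = '...' when running the corresponding RESTORE.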
Backups can be stored not only as zip archives, but also as tar archives.
The functionality is the same as for zip, except that password protection is not
supported for tar archives. Additionally, tar archives support a variety of
compression methods. To make a backup of a table as a tar archive:
BACKUP TABLE test_db.test_table TO Disk('backups', '1.tar')
To restore from a tar archive:
RESTORE TABLE test_db.test_table FROM Disk('backups', '1.tar')
To change the compression method, the correct file suffix should be appended to
the backup name. For example, to compress the tar archive using gzip run:
BACKUP TABLE test_db.test_table TO Disk('backups', '1.tar.gz')
If only specific partitions of a table need to be backed up or restored, these can be specified.
Let's create a simple table partitioned into four parts, insert some data into it, and then
take a backup of only the first and fourth partitions:
Setup
CREATE DATABASE IF NOT EXISTS test_db;

-- Create a partitioned table
CREATE TABLE test_db.partitioned
(
    id UInt32,
    data String,
    partition_key UInt8
)
ENGINE = MergeTree()
PARTITION BY partition_key
ORDER BY id;

INSERT INTO test_db.partitioned VALUES
    (1, 'data1', 1),
    (2, 'data2', 2),
    (3, 'data3', 3),
    (4, 'data4', 4);

SELECT count() FROM test_db.partitioned;

SELECT partition_key, count()
FROM test_db.partitioned
GROUP BY partition_key
ORDER BY partition_key;
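With the table in place, the backup of the first and fourth partitions can be taken with the PARTITIONS clause; the archive name partitioned.zip is illustrative:

```sql
BACKUP TABLE test_db.partitioned PARTITIONS '1', '4'
    TO Disk('backups', 'partitioned.zip')
```

Restoring accepts the same clause, so individual partitions can also be restored from this archive without touching the others.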