Dbdump

This program is designed to do regular database dumps from either PostgreSQL or MySQL. This is useful as a backup strategy for databases, since just copying the data-directories is not guaranteed to yield consistent database snapshots.

dbdump.py aims to automate this task along with some additional tasks you might want to do while dumping. To that end, the script can optionally dump to a remote directory via SSH and optionally encrypt the data with GPG. Both tools are shipped with most Linux distributions but might require some configuration to get them working.

It is developed together with dbclean, which periodically cleans old database dumps created by this script.

Authors, download and licence

dbdump is developed by Mati. The source is shared in the public part of the author's SVN repository; the URL is:

http://svn.fsinf.at/fs/dbdump

The whole project is licensed under the GPLv3.

Installation and configuration

A simple svn-checkout

svn co http://svn.fsinf.at/fs/dbdump

will download all the files you need.


You can run dbdump as a normal user. If you want to run dbdump as a cron job, it must be able to run without any interaction (such as manual password entry).

MySQL configuration

To use this script to dump MySQL databases, use --backend=mysql. So far, this backend only supports the option --defaults, which specifies a defaults-file that will be used by all mysql commands. This file must be used to specify login-credentials and any other options you might want to use. If the parameter is not specified, ~/.my.cnf is used. A typical file would look like this:

[mysqldump]
user            = dump
socket          = /var/run/mysqld/mysqld.sock
password        = <password> 

[client]
user            = dump
socket          = /var/run/mysqld/mysqld.sock
password        = <password>
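
How dbdump.py invokes the MySQL tools internally is not shown on this page. As a minimal sketch, a defaults file like the one above could be handed to mysqldump roughly as follows. The helper name mysqldump_command and the example path are hypothetical; --defaults-file is a standard mysqldump option and has to be the first option on the command line.

```python
import os


def mysqldump_command(database, defaults=None):
    """Build an argv list for dumping one database (hypothetical helper)."""
    # Assumption: fall back to ~/.my.cnf when no defaults file is given,
    # mirroring the behaviour described for --defaults.
    if defaults is None:
        defaults = os.path.expanduser("~/.my.cnf")
    # mysqldump requires --defaults-file to be the first option.
    return ["mysqldump", "--defaults-file=" + defaults, database]


# Example path and database name are made up for illustration.
print(mysqldump_command("wiki", defaults="/etc/dbdump/my.cnf"))
```

Because the credentials live in the defaults file, no password appears on the command line or in the process list.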

To create a dump-user, you might issue this SQL-statement:

GRANT SELECT, LOCK TABLES ON *.* TO 'dump'@'localhost' IDENTIFIED BY '<password>';
FLUSH PRIVILEGES;


PostgreSQL configuration

To use this script to dump PostgreSQL databases, use --backend=postgresql. Internally, the script uses the tools psql (to get a list of databases to dump) and pg_dump (to actually dump the databases); both must be available in your PATH. You can pass additional parameters to those tools using:

 --psql-options=PSQL_OPTS
       PSQL_OPTS will be passed unmodified to psql. Note that psql is already
       called with -lAq in any case.
 --pg_dump-options=PGDUMP_OPTS
       PGDUMP_OPTS will be passed unmodified to pg_dump.

If you want to specify more than one parameter, you usually have to quote them. If you want to dump the databases without the need for a password, you can use --su=postgres or any other user that has superuser rights.
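
The two steps described above (list the databases with psql -lAq, then call pg_dump for each one) might look roughly like the sketch below. This is not dbdump.py's actual code: the helper names, the sample output, and the use of shlex to split a quoted option string are assumptions for illustration.

```python
import shlex


def list_databases(psql_output):
    """Extract database names from `psql -lAq` output (hypothetical helper)."""
    names = []
    for line in psql_output.splitlines():
        fields = line.split("|")
        # Title, footer and wrapped privilege lines contain no '|' separator.
        if len(fields) < 2 or fields[0] == "Name":
            continue
        # template0 normally does not accept connections, so skip it.
        if fields[0] != "template0":
            names.append(fields[0])
    return names


def pg_dump_command(database, pgdump_opts=""):
    """Build a pg_dump argv list; pgdump_opts is one quoted option string."""
    return ["pg_dump"] + shlex.split(pgdump_opts) + [database]


# Illustrative sample only; the exact columns vary between PostgreSQL versions.
sample = """List of databases
Name|Owner|Encoding|Collate|Ctype|Access privileges
postgres|postgres|UTF8|en_US.UTF-8|en_US.UTF-8|
template0|postgres|UTF8|en_US.UTF-8|en_US.UTF-8|=c/postgres
postgres=CTc/postgres
wiki|postgres|UTF8|en_US.UTF-8|en_US.UTF-8|
(3 rows)"""

for name in list_databases(sample):
    print(pg_dump_command(name, "--clean --no-owner"))
```

Splitting the option string with shlex.split is why several options have to be passed as one quoted value: the shell hands the whole string to the script, which then breaks it into separate arguments.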

Dump to a remote location with SSH

To dump to a remote location, use the --remote parameter. Its value is passed directly to ssh, ahead of the command that ssh runs remotely, so you can use --remote to pass any option to ssh. At minimum, it must contain the hostname to connect to. Example:

dbdump.py <general opts> --remote="user@backup.example.com"

Note that if you intend to use this feature with a cron job, you have to be able to ssh to the remote machine without a password. If you don't know how to set this up, search for "SSH public key authentication".
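
To illustrate how the --remote value ends up in front of the remote command, here is a small sketch. The helper name ssh_command, the target path, and the use of cat to write the stream on the remote side are assumptions, not necessarily what dbdump.py does.

```python
import shlex


def ssh_command(remote, target_path):
    """Build an ssh argv that writes stdin to target_path on the remote host."""
    # Quote the path so the remote shell treats it as a single word.
    remote_cmd = "cat > " + shlex.quote(target_path)
    # The --remote value may carry extra ssh options before the hostname.
    return ["ssh"] + shlex.split(remote) + [remote_cmd]


# Port and path are made-up example values.
print(ssh_command("-p 2222 user@backup.example.com", "/backup/wiki.sql"))
```

The dump would then be piped into the standard input of this ssh process, so no temporary file is needed on the local machine.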

Sign/Encrypt dumps using GPG

Warning: If you sign or encrypt dumps with GPG and also back up the full filesystem to the same backup machine, take care that the private key is *not* included in that backup. Otherwise an attacker who compromises the backup machine might easily be able to forge backups.

This script can optionally sign and/or encrypt your dumps. Use the following two parameters to use gpg:

--sign=SIGN_KEY
      Use gpg to sign the dump using the key SIGN_KEY.
--encrypt=RECIPIENT
      Use gpg to encrypt the dumps for RECIPIENT.

Note that if you intend to use this feature with a cron job, you have to be able to use gpg without entering a password. This usually means that the key used to sign the dumps is not protected by a passphrase.

Also note that encryption is done on the local machine, so no unencrypted data ever reaches the backup machine. This ensures that the data cannot be sniffed at any point, but has the drawback of putting additional load on the primary machine.
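
The ordering described above (dump, then gpg locally, then ssh) can be sketched as follows. The helper names and the example key and host are hypothetical; --batch, --sign, --local-user, --encrypt and --recipient are standard gpg options, but whether dbdump.py uses exactly these is an assumption.

```python
def gpg_command(sign_key=None, recipient=None):
    """Build a gpg argv for the requested operations, or None if unused."""
    if sign_key is None and recipient is None:
        return None
    # --batch keeps gpg from prompting, which a cron job cannot answer.
    cmd = ["gpg", "--batch"]
    if sign_key is not None:
        cmd += ["--sign", "--local-user", sign_key]
    if recipient is not None:
        cmd += ["--encrypt", "--recipient", recipient]
    return cmd


def pipeline(dump_cmd, gpg_cmd=None, ssh_cmd=None):
    """Order the stages so data is encrypted before it leaves this machine."""
    return [stage for stage in (dump_cmd, gpg_cmd, ssh_cmd) if stage]


# Example values are made up for illustration.
stages = pipeline(
    ["pg_dump", "wiki"],
    gpg_command(recipient="backup@example.com"),
    ["ssh", "user@backup.example.com", "cat > wiki.sql.gpg"],
)
for stage in stages:
    print(stage)
```

Each stage would read the previous stage's output on its standard input, so gpg always sits between the dump tool and ssh.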