Point In Time Recovery tools for PostgreSQL
pitrery is a tool to manage Point In Time Recovery (PITR) backups for PostgreSQL. This is the user manual of pitrery which, hopefully, will guide you in the process of setting it up to perform your backups.
This section introduces the principles of point in time recovery in PostgreSQL.
Firstly, it is necessary to know that PostgreSQL always write data twice. Every transaction is written to the Write Ahead Log (or WAL) and the corresponding file is synchronised to disk before PostgreSQL answers to the user when it is committed.
The Write Ahead Log is divided in segments: these are files of 16MB each, which names are hex numbers keeping them ordered. Once PostgreSQL has filled a number of WAL files, when a timeout occurs or when a superuser asks it, the engine starts a checkpoint. The checkpoint consists of writing all the modification of the data to the database files. So data is first written to the WAL, then to the data files. The checkpoint permits PostgreSQL to recycle the WAL files.
The purpose of this is to permit crash recovery without losing data. If PostgreSQL detects that the cluster was not cleanly shut down at startup, then it enters recovery. Recovery is applying missing changes to the database files by reading transactions from the WAL, starting from the last known checkpoint.
Point In Time Recovery is based on those principles: since all the data changes are always stored in the WAL, it means that we could have PostgreSQL apply the changes they contain to the database files to let it know about not yet applied transactions, even if the cluster database files are in an inconsistent state. To perform PITR backups, we need to store the WAL files in a safe place, this is called WAL archiving: PostgreSQL is able to execute an arbitrary command to archive a WAL segment. Then, we need a copy of the database files along with the position in the WAL where recovery must start, this is called the base backup. Finally, the recovery phase is configurable, allowing to stop at a user defined date and time. The name, Point In Time Recovery, comes from this feature of PostgreSQL.
Finally, these features of PostgreSQL are used to create standby servers. When the WAL files are applied to another server, created from a base base backup, as soon as they are archived, we get a replicated server. While it is possible to setup replication with pitrery, it is not its purpose. One can do both: backups with pitrery and replication with other tools.
The purpose of pitrery is to manage the archiving of WAL segments and automate the base backup along with restoring the files and preparing the recovery to a date and time. Those two jobs are independent in the design of pitrery. This means that you can decide not to use pitrery to put WAL files in a safe place, which can be interesting if you already have WAL based replication set up and you do not want to replace your archiving script with the one provided by pitrery.
The pitrery
script takes an action as argument to perform a specific
task. Each task has its set of options that can be controlled by
commandline switches. A configuration file can reduce the size of the
commandline. Adding commandline switches override what is defined in
the configuration file. If it is well configured, then restore is
possible from a simple command with a few switches, because the
pressure on the person running it can be high at a time when end-users
cannot access the database. On the other side, adding command line
switches at run time will easily modify the behaviour of the action to
avoid modifying the configuration all the time.
The management of WAL files is done by two scripts, archive_wal
and
restore_wal
. The archive_wal
script takes care of WAL archiving.
If you need to archive WAL to many places, you can integrate it with
an already existing archiving script or simply modify the
archive_command
parameter of postgresql.conf
. The archive_wal
script can copy and compress WAL files locally or to another server
reachable using SSH. A configuration file can be used to reduce the
size of the command line defined in the configuration file of
PostgreSQL.
The management of base backups is divided in four actions: backup
,
restore
, purge
and list
. Two more actions are available to ease
the administration : check
can be used to ensure the configuration
file is correct and PostgreSQL properly configured. configure
can be
used to easily create a configuration file from the command line.
The storage place can be a remote server or the local machine. If is a remote server, it must be accessible using SSH in batch mode (One needs to setup passphraseless SSH keys to do so). Using the local machine as storage space can be useful to backup on a filer, whose filesystems are mounted locally.
On the backup host, pitrery organises backed up files the following way:
A backup root directory is used to store everything
Each backup is stored in a directory whose name is a date and time when it was started, this name is used by the restore action to find the best candidate for a target date.
Please note that the archived WAL files can be stored in a directory inside the backup root directory as long as its name does not start with a number, to avoid confusing the restore with a non backup directory. This is the case by default.
pitrery is a set of bash scripts, so bash is required. Apart from bash,
standard tools found on any Linux server are needed: grep
, sed
, awk
,
tar
, gzip
, ssh
, scp
…
rsync
is needed to archive WAL files over the network on both hosts, and
for backups using the rsync storage method.
GNU make is also needed to install manpages from the source tarball.
Distribution packages are now available on Dalibo Labs for YUM and APT repositories. Details are available in the downloads page.
Pitrery tarballs are also available on GitHub releases
Download the package:
wget https://github.com/dalibo/pitrery/releases/download/v3.1/pitrery-3.1.tar.gz
Then, unpack the tarball:
tar xzf pitrery-x.y.tar.gz
Go to the pitrery-x.y
directory and edit config.mk
to fit your
system. Once done run make
(or gmake
) to replace the interpreter and
paths in the scripts:
make
Finally, install it, as root if needed:
make install
By default, the files are installed in /usr/local
:
scripts are installed in /usr/local/bin
configuration samples are installed in /usr/local/etc/pitrery
manual pages are installed in /usr/local/share/man
You can use the script directly without running make install
, when
doing so the default configuration directory is /etc/pitrery
.
Since pitrery is action based, the execution uses the following syntax:
pitrery [options] action [action-specific options]
Each action that can be performed by pitrery
uses the parameters
from the configuration file. A non default configuration can be
specified before the action. The help message is:
$ ./pitrery -?
pitrery 3.1 - PostgreSQL Point In Time Recovery made easy
usage: pitrery [options] action [args]
options:
-f file Path to the configuration file
-l List configuration files in the default directory
-V Display the version and exit
-? Print help
actions:
list - Display information about backups
backup - Perform a base backup
restore - Restore a base backup and prepare PITR
purge - Clean old base backups and archived WAL files
check - Verify configuration and backups integrity
configure - Create a configuration file from the command line
help - Print help, optionnally for an action
The help or usage of each action listed in the general help message can be shown by invoking:
$ pitrery action -?
or using the help
action:
$ pitrery help [action]
For WAL file archiving and restoring, the two light weight
scripts archive_wal
and restore_wal
do not have any action, the
usage is straight forward:
$ ./archive_wal -?
archive_wal Archive a WAL segment
usage: archive_wal [options] walfile
options:
-C conf configuration file
-a [[user@]host:]/dir Place to store the archive
-X do not compress
-O do not overwrite the destination file
-H check the hash of the destination file (remote only)
-F flush the destination file to disk
-c command compression command
-s suffix compressed file suffix (ex: gz)
-E encrypt the file using gpg
-r keys:... colon separated list of recipients for GPG encryption
-S send messages to syslog
-f facility syslog facility
-t ident syslog ident
-m mode destination file permission mode in octal (e.g. chmod)
-T Timestamp log messages
-V Display the version and exit
-? print help
and restore_wal
:
$ ./restore_wal -?
restore_wal - Restore a WAL segment
usage: restore_wal [options] walfile destination
options:
-C conf Configuration file
-a [[user@]host:]/dir Place to get the archive
-X Do not uncompress
-c command Uncompression command
-s suffix Compressed file suffix (ex: gz)
-S Send messages to syslog
-f facility Syslog facility
-t ident Syslog ident
-T Timestamp log messages
-V Display the version and exit
-? Print help
It is not mandatory to use a configuration file, as parameters used the most can be given on the commandline. However, using a configuration file is recommended to reduce the size of the commanline. Values from the commandline take precedence over the configuration file at execution time.
The default configuration file is /etc/pitrery/pitrery.conf
,
containing all the default parameters.
The same configuration file is used by pitrery
, archive_wal
and
restore_wal
.
Use the configure
action to create a configuration file. It needs a
destination of the form [[user@]host:]/path
to know where backups
shall be stored. If a host is not provided, the backup is considered
local. Some options are available to create a configuration :
$ pitrery configure -?
pitrery configure - Create a configuration file from the command line
usage: pitrery configure [options] [[user@]host:]/path/to/backups
options:
-o config_file Output configuration file
-f Overwrite the destination file
-C Do not connect to PostgreSQL
-s mode Storage method, tar or rsync
-m count Number of backups to keep when purging
-g days Remove backups older than this number of days
-D dir Path to $PGDATA
-a [[user@]host:]/dir Place to store WAL archives
-E Encrypt tar backups with GPG
-r keys:... Colon separated list of recipients for GPG encryption
-P psql Path to the psql command
-h hostname Database server host or socket directory
-p port Database server port number
-U name Connect as specified database user
-d database Database to use for connection
-? Print help
Not all possible configuration options are provided, the purpose is to
quickly set pitrery up, then edit the configuration file created for
further tuning. It is worth noting that -C
avoids making pitrery
connect to PostgreSQL, otherwise it tries to guess the correct
parameters for WAL archiving. -o
writes the configuration files if
it does not exists, if only a keyword is given, the file is created in
the default configuration directory, with the .conf suffix.
When the -a
is not provided, the place where WAL files would be
stored is deduced from the location of backups: WAL are archived in
the archived_wal
subdirectory in the backup directory.
All parameters to access PostgreSQL are inherited from the environment if not given on the commandline.
For example, output the configuration needed to backup the cluster on port 5433:
$ sudo mkdir -p /var/backups/postgresql
$ sudo chown postgres:postgres /var/backups/postgresql
$ pitrery configure -p 5432 /var/backups/postgresql
2020-01-27 17:06:52 CET INFO: ==> checking access to PostgreSQL
2020-01-27 17:06:52 CET INFO: PostgreSQL version is: 12.1 (Debian 12.1-1.pgdg100+1)
2020-01-27 17:06:52 CET INFO: connection role can run backup functions
2020-01-27 17:06:52 CET INFO: current configuration:
2020-01-27 17:06:52 CET INFO: wal_level = replica
2020-01-27 17:06:52 CET INFO: archive_mode = on
2020-01-27 17:06:52 CET INFO: archive_command = '/usr/bin/archive_wal %p'
2020-01-27 17:06:52 CET INFO: please ensure archive_command includes a call to archive_wal
2020-01-27 17:06:52 CET INFO: ==> checking $PGDATA
2020-01-27 17:06:52 CET INFO: access to the contents of PGDATA ok
2020-01-27 17:06:52 CET INFO: ==> contents of the configuration file
PGDATA="/var/lib/postgresql/12/main"
PGPORT="5432"
BACKUP_DIR="/var/backups/postgresql"
PURGE_KEEP_COUNT="2"
STORAGE="tar"
ARCHIVE_DIR="/var/backups/postgresql/archived_wal"
When PostgreSQL is configured, use the check
action to test if everything
looks good.
Every time PostgreSQL fills a WAL segment, it can run a command to
archive it. It is an arbitrary command used as the value of the
archive_command
parameter in postgresql.conf
. PostgreSQL only checks
the return code of the command to know whether it worked or not.
pitrery provides the archive_wal
script to copy and possibly compress
WAL segments either on the local machine or on a remote server
reachable using an SSH connection. It is not mandatory to use it, any
script can be used: the only requirement is to provide a mean for the
restore action to get archived segments.
The archive_wal
script uses the configuration file named pitrery.conf
,
which sets up defaults. By default, its location is
/usr/local/etc/pitrery/pitrery.conf
, which can be overridden on
the command line with -C
option. The following parameters can be
configured:
ARCHIVE_DIR
is the target directory where to put files.
ARCHIVE_HOST
is the target hostname or IP address used when
copying over an SSH connection. Leave it empty to archive on the
local host.
ARCHIVE_USER
can be used to specify a username for the SSH
connection. When not set, the username is the system user used by
PostgreSQL.
ARCHIVE_COMPRESS
controls if the segment is compressed. Compression
is enabled by default, it can be disabled on busy servers doing a
lot write transactions, this can avoid contention on archiving.
ARCHIVE_COMPRESS_BIN
holds the command to commpress a segment.
ARCHIVE_UNCOMPRESS_BIN
holds the command to uncompress a segment,
it is the reverse of ARCHIVE_COMPRESS_BIN
.
ARCHIVE_COMPRESS_SUFFIX
configure the suffix of the compressed
file, without a starting dot. Eg gz
or bz2
.
ARCHIVE_OVERWRITE
can be set to “no” to check if the file to
archive already exists in the destination directory. Since it
reduces performance when archiving over SSH, it is set to “yes” by
default.
ARCHIVE_CHECK
can be set to “yes” to check the md5 of the archived
file to the md5 of the original WAL file. It is useful when the
storage and the network is not reliable. If overwriting is
disabled, the md5 check enabled and the archive already exists, the
archiving returns success if the md5 check is successful. This
option does not apply on local archiving.
ARCHIVE_FLUSH
can be set to “yes” to force an immediate flush of
the archived file to disk before returning success. It may slow down
the archiving process but ensure archives are not corrupted in case of
a power loss on the destination.
ARCHIVE_FILE_CHMOD
can be used to configure the permission of the
archived file. The value must be in octal form as understood by
chmod
. It can help with uid/gid issues on NFS shares used by
different hosts, and should not be necessary in most of the cases.
SYSLOG
can be set to “yes” to log messages to syslog, otherwise
stderr is used for messages. SYSLOG_FACILITY
and SYSLOG_IDENT
can then by used to store messages in the log file of PostgreSQL
when it is configured to use syslog. This should match the
configuration of PostgreSQL so that the messages of archive_wal
are written to the logfile of PostgreSQL, otherwise they would be
lost.
ARCHIVE_ENCRYPT
can be set to “yes” to encrypt the WAL file using
GnuPG. GPG_ENCRYPT_KEYS
must be configured to give the list of
recipients for encryption
If archiving is set up to a remote host, this host must be reachable using SSH in batch mode, meaning that passphraseless access using keys is to be configured for the system user running PostgreSQL to the remote host.
Once archive_wal
is configured, PostgreSQL must be setup to use it by
modifying the archive_command
parameter in postgresql.conf and
dependent parameters:
# If using PostgreSQL >= 9.0, wal_level must be set to archive or hot_standby
# If using PostgreSQL >= 9.6, wal_level must be set to replica
# Changing this requires a restart
wal_level = replica
# If using PostgreSQL >= 8.3, archiving must be enabled
# Changing this requires a restart
archive_mode = on
# The archive command using the defaults from pitrery.conf
archive_command = '/usr/bin/archive_wal %p'
# The archive command with parameters
#archive_command = '/usr/bin/archive_wal -C /path/to/myconf.conf %p'
# or to search /etc/pitrery for the configuration:
#archive_command = '/usr/bin/archive_wal -C myconf %p'
Depending on the version of PostgreSQL, restart the server if
wal_level
or archive_mode
were changed, otherwise reload it.
Since the WAL archiving is done by PostgreSQL, it is done by the system user running the instance. This means that the configuration file should be readable by this user if we want archive_wal to use it. To keep things simple, it is advised to run pitrery as this user too for backups and restores.
The first parameters configure how to connect to the PostgreSQL server
to backup. It is needed to run pg_start_backup()
and
pg_stop_backup()
to let us tell PostgreSQL a backup is being
run. pitrery
uses the client tools of PostgreSQL, so the usual
environment varibles are used to:
PGDATA
is the path to the directory storing the cluster
PGPSQL
is the path to the psql program
PostgreSQL access configuration: PGUSER
, PGPORT
, PGHOST
and
PGDATABASE
are the well known variables to reach the server.
If psql
is in the PATH, the variable PGPSQL
can be commented out
to use the one found in the PATH. If other variables are defined in
the environment, they can be commented out in the file to have pitrery
use them. Please note that it is usually safer to configure them in
the configuration file as environment variables may not be set when
running commands using cron.
The following parameters control the different actions accessible through pitrery:
PGOWNER
is the system user which owns the files of the cluster, it
is useful when restoring as root if the user want to restore as
another user.
PGWAL
is a path where transaction logs can be stored on restore,
pg_wal would then be a symbolic link to this path, like initdb -X
would do.
BACKUP_DIR
is the path to the directory where to store the backups.
BACKUP_HOST
is the name or IP address of the host where backups
shall be stored. If left empty, backups are local.
BACKUP_USER
is the username to use for SSH login, if empty, the
username is the one running pitrery.
RESTORE_COMMAND
can be used to define the command run by PostgreSQL
when it needs to retrieve a WAL file before applying it in recovery
mode. It is useful when WAL archiving is not performed by
pitrery. When archive_wal is used, e.g. RESTORE_COMMAND
is left
empty, it defaults to a call to restore_wal
and it is not necessary
to set it up here.
PURGE_KEEP_COUNT
controls how many backups must be kept when purging
old backups.
PURGE_OLDER_THAN
controls how many days backups are kept when
purging. If PURGE_KEEP_COUNT
is also set, age based purge will
always leave at least PURGE_KEEP_COUNT
backups.
LOG_TIMESTAMP
can be set to “yes” to prefix the messages with the
date for backup, restore and purge actions.
USE_ISO8601_TIMESTAMPS
, when set to “yes”, names the backup
directories using ISO 8601 format. Defaults to “no” to keep the
backward compatibility, as mixing formats of backup names would
break the sorting of backups on restore.
RSYNC_WHOLEFILE
, when set to “yes”, disable the rsync on the fly
comparison algorithm by adding --whole-file
to the rsync
commandline. This may improve performance over NFS. Default is “no”.
RSYNC_BWLIMIT
, limit the bandwidth usage for rsync. This is the
value of –bwlimit of rsync. With no unit, it is in kB/s. Leave
empty for no limit, there is no limit by default.
pitrery offers two storage techniques for the base backup.
The first, and historical, is tar
, where it creates one compressed
tarball (with gzip
by default) for PGDATA
and one for each
tablespace. The tar
method is quite slow and can become difficult to
use with bigger database clusters, however the compression saves a lot
of space.
The second is rsync
. It synchronises PGDATA and each tablespace to a
directory inside the backup, and try to optimise data transfer by
hardlinking the files of the previous backup (provided it was done
with the “rsync” method). This method should offer the best speed for
the base backup, and is recommended for bigger databases clusters (more
than several hundreds of gigabytes).
The default method is tar
. It can be configured by setting the
STORAGE
variable to either tar
or rsync
in the configuration
file.
By default, archive_wal
uses gzip -4
to compress the WAL files
when configured to do so (ARCHIVE_COMPRESS="yes"
). It is possible to
compress more and/or faster by using other compression tools, like
bzip2
, pigz
, the prerequisites are that the compression program
must accept the -c
option to output on stdout and the data to
compress from stdin. The compression program can be configured by
setting ARCHIVE_COMPRESS_BIN
in the configuration file. The output
filename has a suffix depending on the program used (e.g. “gz” or
“bz2”, etc), it must be configured using ARCHIVE_COMPRESS_SUFFIX
(without the leading dot), this suffix is most of the time mandatory
for decompression. The decompression program is then configured using
ARCHIVE_UNCOMPRESS_BIN
, this command must accept a compressed file
as its first argument.
For example, the fastest compression is archived with pigz
, a
multithreaded implementation of gzip:
ARCHIVE_COMPRESS_BIN="pigz"
ARCHIVE_UNCOMPRESS_BIN="pigz -d"
Or maximum, but slow, compression with the standard bzip2
:
ARCHIVE_COMPRESS_BIN="bzip2 -9"
ARCHIVE_COMPRESS_SUFFIX="bz2"
ARCHIVE_UNCOMPRESS_BIN="bunzip"
When encryption is active, compression is managed by GnuPG. The compression options listed before do not apply.
When using tar for storing backups, PGDATA and the tablespaces are
compressed using gzip
. This can be changed by configuring:
BACKUP_COMPRESS_BIN
to specify the command line to use for
compression of the tar files. The output of tar
is piped to this
command, then the result is redirected to the target file.
BACKUP_COMPRESS_SUFFIX
must be used to tell pitrery what is
the suffix appended by the compression program used in
BACKUP_COMPRESS_BIN
. This is mandatory for the restore.
BACKUP_UNCOMPRESS_BIN
to specify the command line to uncompress
files produced by the previous command. It must work with pipes and
understand that a -c
switch makes it output on stdout. Widely used
compression tools such as gzip
, bzip2
, pigz
, pbzip2
, xz
work this way.
When encryption is active, compression is managed by GnuPG. The compression options listed before do not apply.
Warning: Encryption of backups and archived WAL files is experimental and should be used with caution.
When using tar for storing backups, they can also be encrypted using GnuPG. For backup and WAL archiving, the public keys of recipients must be in the keyring of the user running the PostgreSQL instance and running pitrery.
When restoring, the secret key must be in the keyring of the user running pitrery and the PostgreSQL instance.
PostgreSQL can decrypt the restored WAL files without prompting for a
passphrase if it started with pg_ctl
in a session where a GnuPG
Agent is running with the passphrase cached. The simplest way to feed
the passphrase to the agent is to start PostgreSQL right after
restoring intarectively. See [Restoring an encrypted backup] below.
The encryption of tar backups is controlled by the following parameters:
BACKUP_ENCRYPT
must be set to “yes” to encrypt base backups.
ARCHIVE_ENCRYPT
must be set to “yes” to encrypt archived WAL files.
GPG_ENCRYPT_KEYS
is a colon separeted list of USER-ID recognized
by the –recipient option of gpg
. The keys must be in the keyring
of the user running PostgreSQL in order encrypt WAL files at
archiving time.
It is advised to encrypt with a local public key and another key that has its private counterpart stored on another machine: if the private keys are lost, the backups become unusable.
Some user defined commands can be executed, they are given in the following configuration variables:
PRE_BACKUP_COMMAND
is run before the backup is started.
POST_BACKUP_COMMAND
is run after the backup is finished. The
command is run even if the backup fails, but not if the backup fails
because of the PRE_BACKUP_COMMAND
or earlier (e.g. the order “pre
– base backup – post” is ensured).
The following variables are then available, to access the PostgreSQL or the current backup:
PITRERY_HOOK
is the name of the hook being run
PITRERY_PSQL
is the psql command line to run SQL statement on the
saved PostgreSQL server
PITRERY_DATABASE
is the name of the connection database
PITRERY_BACKUP_DIR
is the full path to the directory of the backup
PITRERY_BACKUP_LOCAL
can be used to know if SSH is required to access the backup directory
PITRERY_SSH_TARGET
the user@host part needed to access the backup server
PITRERY_EXIT_CODE
is the exit code of the backup. 0 for success, 1 for failure
By running pitrery without action, we can use the -l
option to list
the configuration files in the default configuration directory.
$ pitrery -l
INFO: listing configuration files in /etc/pitrery
prod
pitrery
pitrery
being the default configuration file, it is not mandatory to
specify it on the command line. If using another one, use the -f
option in every call to pitrery before the action to perform:
$ pitrery -f prod check
$ pitrery -f prod list -v
The configure
action can create a configuration file. It needs a
destination of the form [[user@]host:]/path
to know where backups
shall be stored. If a host is not provided, the backup is considered
local. Some options are available to create a configuration :
$ pitrery configure -?
pitrery configure - Create a configuration file from the command line
usage: pitrery configure [options] [[user@]host:]/path/to/backups
options:
-o config_file Output configuration file
-f Overwrite the destination file
-C Do not connect to PostgreSQL
-s mode Storage method, tar or rsync
-m count Number of backups to keep when purging
-g days Remove backups older than this number of days
-D dir Path to $PGDATA
-a [[user@]host:]/dir Place to store WAL archives
-E Encrypt tar backups with GPG
-r keys:... Colon separated list of recipients for GPG encryption
-P psql Path to the psql command
-h hostname Database server host or socket directory
-p port Database server port number
-U name Connect as specified database user
-d database Database to use for connection
-? Print help
Not all possible configuration options are provided, the purpose is to
quickly set pitrery up, then a edit the configuration file created for
further tuning. It is worth noting that -C
avoids making pitrery
connect to PostgreSQL so that the correct parameters for WAL archiving
are output. -o
writes the configuration files if it does not exists,
if only a keyword is given, the file is created in the default
configuration directory.
The check
action can check if a configuration file is correct. The
action tests if the backup directory is reachable, if WAL archiving
can be done with archive_wal
, if PostgreSQL is up and properly
configured for PITR and if the current user can actually backup the
files.
For example, the following commands checks the prod.conf
configuration file:
$ pitrery -f prod check
INFO: the configuration file contains:
PGDATA="/var/lib/postgresql/12/main"
PGPORT=5432
BACKUP_DIR="/var/backups/postgresql"
PURGE_KEEP_COUNT=2
USE_ISO8601_TIMESTAMPS="no"
INFO: ==> checking the configuration for inconsistencies
INFO: configuration seems correct
INFO: ==> checking backup configuration
INFO: backups are local, not checking SSH
INFO: target directory '/var/backups/postgresql' exists
INFO: target directory '/var/backups/postgresql' is writable
INFO: ==> checking WAL files archiving configuration
INFO: WAL archiving is local, not checking SSH
INFO: checking WAL archiving directory: /var/backups/postgresql/archived_wal
ERROR: target directory '/var/backups/postgresql/archived_wal' does NOT exist or is NOT reachable
INFO: ==> checking access to PostgreSQL
INFO: psql command and connection options are: psql -X -p 5432
INFO: connection database is: postgres
INFO: environment variables (maybe overwritten by the configuration file):
INFO: PostgreSQL version is: 12.1
INFO: connection role can run backup functions
INFO: current configuration:
INFO: wal_level = minimal
INFO: archive_mode = off
INFO: archive_command = '(disabled)'
ERROR: wal_level must be set at least to replica
ERROR: archive_mode must be set to on
INFO: ==> checking access to PGDATA
INFO: PostgreSQL and the configuration reports the same PGDATA
INFO: permissions of PGDATA ok
INFO: owner of PGDATA is the current user
INFO: access to the contents of PGDATA ok
Here the check
action reports that PostgreSQL is not properly
configured for archiving WAL files and that a directory is
missing. Fixing those error is mandatory to make the backups work:
$ pitrery -f prod check
INFO: the configuration file contains:
PGDATA="/var/lib/postgresql/12/main"
PGPORT=5433
BACKUP_DIR="/var/backups/postgresql"
PURGE_KEEP_COUNT=2
USE_ISO8601_TIMESTAMPS="no"
INFO: ==> checking the configuration for inconsistencies
INFO: configuration seems correct
INFO: ==> checking backup configuration
INFO: backups are local, not checking SSH
INFO: target directory '/var/backups/postgresql' exists
INFO: target directory '/var/backups/postgresql' is writable
INFO: ==> checking WAL files archiving configuration
INFO: WAL archiving is local, not checking SSH
INFO: checking WAL archiving directory: /var/backups/postgresql/archived_wal
INFO: target directory '/var/backups/postgresql/archived_wal' exists
INFO: target directory '/var/backups/postgresql/archived_wal' is writable
INFO: ==> checking access to PostgreSQL
INFO: psql command and connection options are: psql -X -p 5433
INFO: connection database is: postgres
INFO: environment variables (maybe overwritten by the configuration file):
INFO: PostgreSQL version is: 12.1
INFO: connection role can run backup functions
INFO: current configuration:
INFO: wal_level = replica
INFO: archive_mode = on
INFO: archive_command = '/usr/bin/archive_wal -C prod %p'
INFO: ==> checking access to PGDATA
INFO: PostgreSQL and the configuration reports the same PGDATA
INFO: permissions of PGDATA ok
INFO: owner of PGDATA is the current user
INFO: access to the contents of PGDATA ok
Beware that the backup must run on the PostgreSQL server host, SSH login is used to push data to a backup server, and PostgreSQL connection options to run SQL locally.
To run a backup with pitrery, either a configuration file is needed or the options must be put on the commandline. The usage of the backup action is:
$ pitrery backup -?
pitrery backup - Perform a base backup
usage: pitrery backup [options] [[[user@]host:]/path/to/backups]
options:
-D dir Path to $PGDATA
-s mode Storage method, tar or rsync
-c compress_bin Compression command for tar method
-e compress_suffix Suffix added by the compression program
-E Encrypt tar backups with GPG
-r keys:... Colon separated list of recipients for GPG encryption
-t Use ISO 8601 format to name backups
-T Timestamp log messages
-P PSQL path to the psql command
-h HOSTNAME database server host or socket directory
-p PORT database server port number
-U NAME connect as specified database user
-d DATABASE database to use for connection
-? Print help
For example:
$ pitrery -f prod backup
INFO: preparing directories in /var/backups/postgresql/
INFO: listing tablespaces
INFO: starting the backup process
INFO: performing a non-exclusive backup
INFO: backing up PGDATA with tar
INFO: archiving /var/lib/postgresql/12/main
INFO: stopping the backup process
INFO: saving /etc/postgresql/12/main/postgresql.conf
INFO: saving /etc/postgresql/12/main/pg_hba.conf
INFO: saving /etc/postgresql/12/main/pg_ident.conf
INFO: copying the backup history file
INFO: copying the tablespace_map file
INFO: copying the tablespaces list
INFO: copying PG_VERSION
INFO: backup directory is /var/backups/postgresql//2020.01.17_14.57.34
INFO: done
If we have a look at the contents of the /var/backups/postgresql
directory on the backup host:
$ tree /var/backups/postgresql
/var/backups/postgresql
├── 2020.01.17_14.57.34
│ ├── backup_label
│ ├── backup_timestamp
│ ├── conf
│ │ ├── pg_hba.conf
│ │ ├── pg_ident.conf
│ │ └── postgresql.conf
│ ├── pgdata.tar.gz
│ ├── PG_VERSION
│ ├── tablespace_map
│ ├── tblspc
│ └── tblspc_list
└── archived_wal
├── 00000001000000000000002E.gz
├── 00000001000000000000002F.gz
├── 000000010000000000000030.gz
├── 000000010000000000000031.00000028.backup.gz
└── 000000010000000000000031.gz
The backup directory is named with the stop date and time of the
backup. The backup_timestamp
file contains the timestamp value of
the stop time of the backup, which is used by the restore action to
find the best candidate when restoring to a specific date and time and
by the purge action. The directory stores the backup label file of
PostgreSQL, a tarball of the PGDATA directory, tarballs for each
tablespace and the tablespace list with their path. Finally, shown in
the example but not always present, a conf
directory can be created
to store configuration files of the database cluster
(postgresql.conf
, pg_hba.conf
and pg_ident.conf
) when they are
not located inside PGDATA
.
Notes:
archive_wal
to
store the WAL files in archived_wal
. This keep them close to
the base backups.rsync
storage method, tarballs are replaced with
directories with the same base name.The list action allow to find the backups the backup host or the localhost depending on the configuration. By default, it prints a parsable list of backups, with one backups on each line:
$ pitrery -f prod list
List of local backups
/var/backups/postgresql/2020.01.17_14.57.34 5.8M 2020-01-17 14:57:34 CEST
/var/backups/postgresql/2020.01.17_15.03.50 5.8M 2020-01-17 15:03:50 CEST
The -v
switch display more information on each backups, like needed space
for each tablespace :
The “space used” value is the size of the backup,
The disk usage for PGDATA and tablespaces is recorded at backup time, it is the space one need to restore
For example :
$ pitrery -f prod list -v
List of local backups
----------------------------------------------------------------------
Directory:
/var/backups/postgresql/2020.01.17_15.03.50
space used: 5.8M
storage: tar with gz compression
encryption: false
Minimum recovery target time:
2020-01-17 15:03:50 CEST
PGDATA:
pg_default 47 MB
pg_global 505 kB
Tablespaces:
----------------------------------------------------------------------
Directory:
/var/backups/postgresql/2020.01.17_15.06.46
space used: 5.8M
storage: tar with gpg compression
encryption: true
Minimum recovery target time:
2020-01-17 15:06:46 CEST
PGDATA:
pg_default 47 MB
pg_global 505 kB
Tablespaces:
"ts1" /var/lib/postgresql/tblspc/12/main/ts1 (18768) 0 bytes
Like the other commands, the options of the list action can be display by adding the -? option after the action:
$ pitrery list -?
pitrery list - Display information about backups
usage: pitrery list [options] [[user@]host:]/path/to/backups
options:
-v Display details of the backup
-? Print help
The restore action selects a backup and prepares the recovery to restore
to a particular point in time. The target date must be given on the
command line using the -d
option.
The best format is the one expected by PostgreSQL: YYYY-mm-DD HH:MM:SS
[+-]TZTZ
. The '[+-]TZTZ'
is the timezone offset, it must given as HHMM
,
.e.g ‘+2h30’ would be +0230
and ‘-7h’ would be -0700
. This work best with
the date
command found on most Unix systems.
Depending on the local date
command, the target date can be
anything it can parse, for example, offsets like 1 day ago
work with
GNU date.
This action perform the following steps:
Find the newest possible backup from the store.
Retrieve and extract the contents of PGDATA and the tablespaces.
Create a recovery.conf
file for PostgreSQL.
Optionally, restore the saved configuration files in
PGDATA/restored_config_files
if they were outside PGDATA at the time
of the backup.
Create a script which can be used to optionally restore any replication slots that were active (or inactive) at the time of the base backup.
Optionally, create a script to update the catalogue when paths to tablespaces have changed, for PostgreSQL <= 9.1.
The restore will only work if the target destination directory (PGDATA
in the configuration file of pitrery) and the directories used by
tablespaces exist or can be created, are writable and empty. It is
important to prepare those directories before running the restore. It
is possible to overwrite contents of target directories with the -R
option.
When specifying a target date, it will be used in the
$PGDATA/recovery.conf
file as value for the recovery_target_time
parameter.
Unless RESTORE_COMMAND
is defined to something else, the restore_wal
script will be used by PostgreSQL to retrieve archived WAL files. The
purpose of this script is to find, copy on PostgreSQL server, and
uncompress the archived WAL file asked by PostgreSQL.
The restore actions uses options values from the configuration, which
is passed by the restore action to restore_wal
, using the -C
option. If options, different from the configuration, must be given to
restore_wal
, the complete command must be provided to the restore
action with -r
.
The command line for the restore action can be tested using the -n
(dry run) option:
$ pitrery -f prod restore -n
INFO: searching backup directory
INFO: searching for tablespaces information
INFO:
INFO: backup directory:
INFO: /var/backups/postgresql/2020.01.17_15.06.46
INFO:
INFO: destinations directories:
INFO: PGDATA -> /var/lib/postgresql/12/main
INFO: tablespace "ts1" (18768) -> /var/lib/postgresql/tblspc/12/main/ts1 (relocated: no)
INFO:
INFO: recovery configuration:
INFO: target owner of the restored files: postgres
INFO: restore_command = 'restore_wal -C /etc/pitrery/prod.conf %f %p'
INFO:
Let’s say the target directories are ready for a restore run by the
postgres
user, the restore can be started with pitrery on an example
production server, ensure the date (or other arguments that need it)
are properly quoted:
$ pitrery -f prod restore -d '2020-01-17 15:04:30 +0200'
INFO: searching backup directory
INFO: searching for tablespaces information
INFO:
INFO: backup directory:
INFO: /var/backups/postgresql/2020.01.17_15.03.50
INFO:
INFO: destinations directories:
INFO: PGDATA -> /var/lib/postgresql/12/main
INFO:
INFO: recovery configuration:
INFO: target owner of the restored files: postgres
INFO: restore_command = 'restore_wal -C /etc/pitrery/prod.conf %f %p'
INFO: recovery_target_time = '2020-01-17 15:04:30 +0200'
INFO:
INFO: creating /var/lib/postgresql/12/main with permission 0700
INFO: extracting PGDATA to /var/lib/postgresql/12/main
INFO: extraction of PGDATA successful
INFO: restoring configuration files to /var/lib/postgresql/12/main/restored_config_files
INFO: preparing pg_wal directory
INFO: preparing recovery.conf file
INFO: done
INFO:
INFO: saved configuration files have been restored to:
INFO: /var/lib/postgresql/12/main/restored_config_files
INFO:
INFO: please check directories and recovery.conf before starting the cluster
INFO: and do not forget to update the configuration of pitrery if needed
INFO:
The restore script finds that the backup to be restored is located in
/var/backups/postgresql/2020.01.17_15.03.50
on our backup server. It
then extracts everything, including the tablespaces if some exists and
prepares the recovery.conf
at the root of $PGDATA
. The script asks
the user to check everything before starting the PostgreSQL cluster:
This behaviour is intentional, it allows the user to modify parameters
of PostgreSQL or change how the recovery is configured in
recovery.conf
.
When everything is fine, the PostgreSQL can be started, it will apply the archived WAL files until the target date is reached or until all archived WAL files are consumed if no target date was specified.
If unsure about the options to give for a restore, use the -n
switch
of the restore action to make it stop after showing the information.
Furthermore, it possible choose the target directories when restoring,
use -D
switch to set the target directory for PGDATA, and one to many
-t
switches to relocate the tablespaces to other directories. The
format of the value of a -t
option is tablespace_name_or_oid:new_directory
.
One -t
option apply to one tablespace. For example:
$ pitrery -f prod restore -D /var/lib/postgresql/12/main2 \
> -t ts1:/var/lib/postgresql/tblspc/12/main/ts1_2
INFO: searching backup directory
INFO: searching for tablespaces information
INFO:
INFO: backup directory:
INFO: /var/backups/postgresql/2020.01.17_15.06.46
INFO:
INFO: destinations directories:
INFO: PGDATA -> /var/lib/postgresql/12/main2
INFO: tablespace "ts1" (18768) -> /var/lib/postgresql/tblspc/12/main/ts1_2 (relocated: yes)
INFO:
INFO: recovery configuration:
INFO: target owner of the restored files: postgres
INFO: restore_command = 'restore_wal -C /etc/pitrery/prod.conf %f %p'
INFO:
INFO: creating /var/lib/postgresql/12/main2 with permission 0700
INFO: creating /var/lib/postgresql/tblspc/12/main/ts1_2 with permission 0700
INFO: extracting PGDATA to /var/lib/postgresql/12/main2
INFO: extraction of PGDATA successful
INFO: restoring configuration files to /var/lib/postgresql/12/main2/restored_config_files
INFO: extracting tablespace "ts1" to /var/lib/postgresql/tblspc/12/main/ts1_2
INFO: extraction of tablespace "ts1" successful
INFO: preparing pg_wal directory
INFO: preparing recovery.conf file
INFO: done
INFO:
INFO: saved configuration files have been restored to:
INFO: /var/lib/postgresql/12/main2/restored_config_files
INFO:
INFO: please check directories and recovery.conf before starting the cluster
INFO: and do not forget to update the configuration of pitrery if needed
INFO:
In the above example, the PGDATA has been changed along with the path of the ts1 tablespace.
When the version of PostgreSQL is 9.1 or older, pitrery creates a SQL
file with the UPDATE
statements needed to change the spclocation
column of pg_tablespace
(this columns has been removed as of
9.2). This script must be run as a superuser role on the restored
cluster after the recovery.
Again, if unsure, run the restore action with the -n
switch to display
what would be done.
The options of restore are:
$ pitrery restore -?
pitrery restore - Restore a base backup and prepare PITR
usage: pitrery restore [options] [[[user@]host:]/path/to/backups]
options:
-D dir Path to target $PGDATA
-x dir Path to the wal directory (only if outside $PGDATA)
-d date Restore until this date
-O user If run by root, owner of the files
-t tblspc:dir Change the target directory of tablespace "tblspc"
this switch can be used many times
-n Dry run: show restore information only
-R Overwrite destination directories
-c compress_bin Uncompression command for tar method
-e compress_suffix Suffix added by the compression program
-r command Command line to use in restore_command
-C config Configuration file for restore_wal in restore_command
-m restore_mode restore either in "standby" or "recovery" mode
-T Timestamp log messages
-? Print help
Warning: Encryption of backups and archived WAL files is experimental and should be used with caution.
When the backup is encrypted, decryption is transparent. The user running the restore action must have the secret key required to decrypt the data in its keyring. Since a restore can take quite a long time, it is recommended to start a gpg-agent explicitely for the operation with longer cache TTL values (7 days in the example):
eval $(/usr/bin/gpg-agent --daemon --allow-preset-passphrase \
--default-cache-ttl 604800 --max-cache-ttl 604800)
If the restore fails on GnuPG complaining there are no tty for prompting
the passphrase, get one using script
:
script /dev/null
export GPG_TTY=$(tty)
This will allow pinentry to prompt for the passphrase.
Since those information are in the shell environment, PostgreSQL can
use the agent when started using pg_ctl
. To make the recovery
process successfully decrypt the archived WAL files, start the
PostgreSQL instance with pg_ctl
after the restore in the same
console session, the restart it using the regular init script/systemd
unit if needed.
The purge action removes old backups according to a policy based on the number of backups to keep and/or their age in days. If the maximum number of backups and the maximum age are set, the number is always respected: it prevents the user from removing all backups if all of them are too old. The purge script will also try to remove unnecessary archived WAL files, provided it can reach the location where they are stored.
The -m
on the command line or PURGE_KEEP_COUNT
in the
configuration file define the maximum number of backups to keep. The
-d
on the command line or PURGE_OLDER_THAN
in the configuration
file is used to define the maximum age in days.
For example, we have three backups on the store and we want to keep only
one, while PURGE_KEEP_COUNT=2
:
$ pitrery -f prod purge -m 1
INFO: searching backups
INFO: Would be purging the following backups:
INFO: /var/backups/postgresql/2020.01.17_14.57.34
INFO: /var/backups/postgresql/2020.01.17_15.03.50
INFO: listing WAL files older than 000000010000000000000035
INFO: 9 old WAL file(s) to remove
INFO: purging old WAL files
INFO: done
Note that if there are no backups but archived WAL files, the purge action will not remove those WAL files.
The options of purge are:
$ pitrery purge -?
pitrery purge - Clean old base backups and archived WAL files
usage: pitrery purge [options] [[[user@]host:]/path/to/backups]
options:
-m count Keep this number of backups
-d days Purge backups older than this number of days
-a [[user@]host:]/dir Path to WAL archives
-N Dry run: show what would be purged only
-T Timestamp log messages
-? Print help
If unsure about the configuration of the purge, the -N
switch can be
used to display what would be done.
The check action can check if there are enough backups and their
against thresholds. This mode must be selected with the -B
option of
the check action.
The check is successful if the number of backups is greater or equal
than the number provided to the -m
option or PURGE_KEEP_COUNT
if
not specified. It is also successful if the age of the newest backup
is less than the interval provided to the -g
option or
PURGE_OLDER_THAN
if not specified. The age limit is a number of
days or less if a time unit is specified, the supported units being
“s” (seconds), “min” (minutes), “h” (hours) and “d” (days), like ine
the PostgreSQL configuration.
This check mode can behave like a Nagios plugin with the -n
option.
For example:
$ pitrery check -B -g 1d -m 2 /var/backups/postgresql
INFO: checking local backups in: /var/backups/postgresql
INFO: newest backup age: 16d 20h 31min 30s
INFO: number of backups: 3
ERROR: backups are too old
With the nagios output:
$ pitrery check -B -m 3 -g 30d -n localhost:/var/backups/postgresql
PITRERY BACKUPS OK - count: 3, newest: 16d 20h 40min 15s | count=3;3;3 newest=1456815s;2592000;2592000
The check action can check if there are no missing WAL files in the archives. Archived WAL files are in a sequence: missing files could make a backup impossible to restore or a make a point in time unreachable.
To check the archives, use the -A
option of the check action. The
path to the directory of the WAL archives can be specified with
-a
. Access to the backups is mandatory to find the version of
PostgreSQL and the number of segments per log (the last segment was
skipped before PostgreSQL 9.3).
Like the check backups mode, it can behave like a Nagios plugin with
the -n
option. Currently, backups and archives cannot be checked at
the same time when Nagios plugin output is selected.
For example:
$ pitrery check -A -a /var/backups/postgresql/archived_wal /var/backups/postgresql/
INFO: checking local archives in /var/backups/postgresql/archived_wal
INFO: oldest backup is: /var/backups/postgresql/2020.01.18_22.05.57
INFO: start wal file is: 000000010000000000000017
INFO: listing WAL files
INFO: first WAL file checked is: 00000001000000000000000F.gz
INFO: start WAL file found
ERROR: missing WAL file: 00000001000000000000001C
INFO: next found is: 0000000100000001000000F9
INFO: last WAL file checked is: 000000010000000200000003.gz
INFO: missing count is: 477
With the Nagios output:
$ pitrery check -A -a /var/backups/postgresql/archived_wal -n /var/backups/postgresql/
PITRERY WAL ARCHIVES CRITICAL - total: 32, missing: 477 | total=32;; missing=477;;
The options are:
$ pitrery check -?
pitrery check - Verify configuration and backups integrity
usage: pitrery check [options] [[[user@]host:]/path/to/backups]
options:
-C conf Configuration file
-B Check backups
-m count Fail when the number of backups is less than count
-g age Fail when the newest backup is older than age
-A Check WAL archives
-a [[user@]host:]/dir Path to WAL archives
-c command Uncompression command
-n Nagios compatible output for -B and -A
-? Print help