Creating backups

You should perform full backups of your database regularly.

When to back up your database

In addition to your regular backups, perform a full backup under the following circumstances:

Before:

  • You upgrade Vertica to another release.

  • You drop a partition.

  • You add, remove, or replace nodes in your database cluster.

After:

  • You load a large volume of data.

  • You add, remove, or replace nodes in your database cluster. Always create a new full backup in this case.

  • You recover a cluster from a crash.

If:

  • The epoch in the latest backup is earlier than the current ancient history mark.
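One way to check this condition is to compare the epoch of the most recent backup with the current ancient history mark (AHM) epoch. The following is a minimal sketch, not a definitive procedure. It assumes that vsql is on the PATH and connects without prompting, and that the v_monitor.database_backups system table (with its backup_epoch column) and the GET_AHM_EPOCH() function are available in your Vertica version. The function names latest_backup_epoch, ahm_epoch, and backup_is_stale are illustrative and are not part of Vertica.

```shell
#!/bin/sh
# Sketch: helpers to tell whether the newest backup predates the
# ancient history mark (AHM).
# Assumption: vsql is on the PATH and connects without prompting.
VSQL="${VSQL:-vsql}"

# Print the epoch of the most recent backup recorded in the
# database_backups system table.
latest_backup_epoch() {
    "$VSQL" -A -t -c "SELECT MAX(backup_epoch) FROM v_monitor.database_backups;"
}

# Print the current AHM epoch.
ahm_epoch() {
    "$VSQL" -A -t -c "SELECT GET_AHM_EPOCH();"
}

# Succeed (exit status 0) when backup epoch $1 is earlier than AHM epoch $2.
backup_is_stale() {
    [ "$1" -lt "$2" ]
}
```

Calling backup_is_stale "$(latest_backup_epoch)" "$(ahm_epoch)" returns success when a new full backup is due. If no backup has been recorded, the first query returns an empty value and the comparison fails, so handle that case separately.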

Ideally, schedule backups to run on a regular basis. You can run the Vertica vbr utility from a cron job or another task scheduler.
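As a sketch of such a schedule, the following wrapper script runs one backup and appends the vbr output to a log file, and the commented crontab entry runs it nightly. The vbr path /opt/vertica/bin/vbr, the configuration file path, the log path, the script location, and the 02:00 schedule are all assumptions for illustration; adjust them for your installation.

```shell
#!/bin/sh
# Sketch of a backup wrapper for cron. All paths below are assumed defaults.
VBR="${VBR:-/opt/vertica/bin/vbr}"
CONFIG="${CONFIG:-/home/dbadmin/full_backup.ini}"
LOG="${LOG:-/home/dbadmin/vbr_backup.log}"

# Run one backup with the configured file and record the outcome in the log.
run_backup() {
    echo "$(date) starting backup with $CONFIG" >> "$LOG"
    if "$VBR" -t backup -c "$CONFIG" >> "$LOG" 2>&1; then
        echo "$(date) backup succeeded" >> "$LOG"
    else
        status=$?
        echo "$(date) backup failed with status $status" >> "$LOG"
        return "$status"
    fi
}

# Run the backup only when the script is invoked with the argument "run".
if [ "${1:-}" = "run" ]; then
    run_backup
fi

# Example crontab entry for the database administrator account
# (every day at 02:00; the script path is hypothetical):
#   0 2 * * * /home/dbadmin/bin/run_full_backup.sh run
```

For unattended runs like this one, the configuration file must allow vbr to connect without prompting for a password.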

You can also back up selected objects. Use object backups to supplement full backups, not to replace them. Backup types are described in Types of backups.

Running vbr does not affect active database applications. vbr can create backups while applications concurrently execute DML statements, including COPY, INSERT, UPDATE, DELETE, and SELECT.

Backup locations and contents

Full and object-level backups reside on backup hosts, the computer systems on which backups and archives are stored.

Vertica saves backups in a specific backup location, which is a directory on a backup host. This location can contain multiple backups, both full and object-level, including associated archives. Backups in the same location are compatible with one another, so you can restore any objects from a full database backup. Backup locations for Eon Mode databases must be on S3.

Before beginning a backup, you must prepare your backup locations using the vbr init task, as in the following example:

$ vbr -t init -c full_backup.ini

For more information about backup locations, see Setting up backup locations.

Backups contain all committed data for the backed-up objects as of the start time of the backup. Backups do not contain uncommitted data or data committed during the backup. Backups do not delay mergeout or load activity.

Backing up HDFS storage locations

If your Vertica cluster uses HDFS storage locations, you must do some additional configuration before you can perform backups. See Requirements for backing up and restoring HDFS storage locations.

HDFS storage locations support only full backup and restore. You cannot perform object backup or restore on a cluster that uses HDFS storage locations.

Impact of backups on Vertica nodes

While a backup is taking place, the backup process can consume additional storage. The amount of space consumed depends on the size of your catalog and any objects that you drop during the backup. The backup process releases this storage when the backup is complete.

Best practices for creating backups

When creating backup configuration files:

  • Create separate configuration files for full and object-level backups.

  • Use a unique snapshot name in each configuration file.

  • Use the same backup host directory location for both kinds of backups:

    • Because the backups share disk space, they are compatible when performing a restore.

    • Each cluster node must also use the same directory location on its designated backup host.

  • For best network performance, use one backup host per cluster node.

  • Use one directory on each backup node to store successive backups.

  • For future reference, append the major Vertica version number to the configuration file name (for example, mybackup9x).

The selected objects of a backup can include one or more schemas or tables, or a combination of both. For example, you can include schema S1 and tables T1 and T2 in an object-level backup. Multiple backups can be combined into a single backup. A schema-level backup can be integrated with a database backup (and a table backup integrated with a schema-level backup, and so on).
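To make the configuration-file practices above concrete, the following sketch writes one full-backup and one object-level configuration file that use unique snapshot names but share the same backup directory. The database name exampledb, the node names, the backup host names, the directory /home/dbadmin/backups, and the file names are hypothetical. The parameter names (snapshotName, objects, dbName, dbUser, and the [Mapping] entries) are given from memory of the vbr configuration format and should be checked against the configuration reference for your Vertica version.

```shell
#!/bin/sh
# Sketch: generate separate vbr configuration files for full and
# object-level backups. Every name and path is a placeholder.
CONFIG_DIR="${CONFIG_DIR:-.}"

# Write one configuration file.
#   $1 = file name, $2 = unique snapshot name, $3 = object list (may be empty)
write_config() {
    {
        echo "[Misc]"
        echo "snapshotName = $2"
        # Only object-level backups list objects; full backups omit the line.
        if [ -n "$3" ]; then
            echo "objects = $3"
        fi
        echo
        echo "[Database]"
        echo "dbName = exampledb"
        echo "dbUser = dbadmin"
        echo
        # One backup host per node, same directory on every backup host.
        echo "[Mapping]"
        echo "v_exampledb_node0001 = backuphost01:/home/dbadmin/backups"
        echo "v_exampledb_node0002 = backuphost02:/home/dbadmin/backups"
        echo "v_exampledb_node0003 = backuphost03:/home/dbadmin/backups"
    } > "$CONFIG_DIR/$1"
}

# Full backup: no object list. File name carries the major version (9x).
write_config full_backup9x.ini exampledb_full ""
# Object-level backup of schema S1 and tables T1 and T2.
write_config object_backup9x.ini exampledb_objects "S1,T1,T2"
```

Before the first backup, the backup location still has to be prepared with the vbr init task, for example vbr -t init -c full_backup9x.ini.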