Best practices for disaster recovery

To protect your database from site failures caused by catastrophic disasters, maintain an off-site replica of your database to provide a standby. In case of disaster, you can switch database users over to the standby database. The amount of data lost between a disaster and failover to the off-site replica depends on how frequently you save a full database backup.

The disaster recovery solution to employ depends on two factors that you must determine for your application:

  • Recovery point objective (RPO): How much data loss can your organization tolerate upon a disaster recovery?

  • Recovery time objective (RTO): How quickly do you need to recover the database following a disaster?
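As a rough illustration of how backup frequency bounds data loss, the sketch below computes the data-loss window for a given disaster time and compares it with an RPO. The function names, the six-hour backup schedule, and the one-hour RPO are hypothetical values chosen for illustration; they are not part of Vertica.

```python
from datetime import datetime, timedelta


def data_loss_window(completed_backups, disaster_time):
    """Return the time between the last backup completed before the
    disaster and the disaster itself. Changes made in this window are lost."""
    usable = [t for t in completed_backups if t <= disaster_time]
    if not usable:
        raise ValueError("no backup completed before the disaster")
    return disaster_time - max(usable)


def meets_rpo(completed_backups, disaster_time, rpo):
    """Return True if the data-loss window does not exceed the RPO."""
    return data_loss_window(completed_backups, disaster_time) <= rpo


# Hypothetical schedule: a backup completes every 6 hours.
backups = [datetime(2024, 1, 1, h) for h in (0, 6, 12, 18)]
disaster = datetime(2024, 1, 1, 17, 30)

window = data_loss_window(backups, disaster)
print(window)  # 5:30:00
print(meets_rpo(backups, disaster, timedelta(hours=1)))  # False
```

With backups every six hours, the worst case approaches six hours of lost data, so a one-hour RPO would call for either more frequent backups or a different approach.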

Depending on your RPO and RTO, Vertica recommends choosing from the following solutions:

  1. Dual-load: During each load process for the database, simultaneously load a second database. You can achieve this easily with off-the-shelf ETL software.

  2. Periodic Incremental Backups: Use the procedure described in Copying the database to another cluster to periodically copy the data to the target database. Remember that the script copies only files that have changed.

  3. Replication solutions provided by storage vendors: Although some users have had success with SAN storage, the number of vendors and possible configurations prevents Vertica from providing support for SANs.
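The dual-load approach can be sketched as follows. This is a minimal illustration, not a Vertica procedure: Python's built-in sqlite3 module stands in for the two database connections, and the `events` table, the `dual_load` function, and the sample rows are hypothetical. In practice, ETL software or a Vertica client would perform the loads. The sketch also shows the kind of application logic needed to handle a load that fails on one side.

```python
import sqlite3


def make_db():
    """Create an in-memory stand-in for one database (assumption: sqlite3,
    not Vertica)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, payload TEXT)")
    return conn


def dual_load(rows, primary, standby):
    """Load the same batch into both databases.

    If either load raises an error, roll back both so the two copies stay
    in step. This is not a true two-phase commit: a failure between the
    two commits could still leave the databases out of sync.
    """
    insert = "INSERT INTO events (id, payload) VALUES (?, ?)"
    try:
        primary.executemany(insert, rows)
        standby.executemany(insert, rows)
    except sqlite3.Error:
        primary.rollback()
        standby.rollback()
        return False
    primary.commit()
    standby.commit()
    return True


def count(conn):
    """Return the number of rows currently stored in the events table."""
    return conn.execute("SELECT COUNT(*) FROM events").fetchone()[0]


primary, standby = make_db(), make_db()
ok = dual_load([(1, "a"), (2, "b")], primary, standby)
# The second batch reuses id 2, so the load fails and both sides roll back.
failed = dual_load([(3, "c"), (2, "dup")], primary, standby)

print(ok, failed, count(primary), count(standby))  # True False 2 2
```

A failed batch leaves both databases unchanged, so the loader can retry or raise an alert without the standby drifting from the primary.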

The following table summarizes the RPO, RTO, and the pros and cons of each approach:

|      | Dual Load | Periodic Incremental | Storage Replication |
|------|-----------|----------------------|---------------------|
| RPO  | Up to the minute data | Up to the last backup | Recover to the minute |
| RTO  | Available at all times | Available except when backup in progress | Available at all times |
| Pros | Standby database can have different configuration; Can use the standby database for queries | Built-in scripts; High performance due to compressed file transfers | Transparent to the database |
| Cons | Possibly incur additional ETL licenses; Requires application logic to handle errors | Need identical standby system | More expensive; Media corruptions are also replicated |