Epochs

An epoch represents a cutoff point of historical data within the database.

An epoch represents a cutoff point of historical data within the database. The timestamp of all commits within a given epoch are equal to or less than the epoch's timestamp. Understanding epochs is useful when you need to perform the following operations:

  • Database recovery: Vertica uses epochs to determine the last time data was consistent across all nodes in a database cluster.

  • Execute historical queries: A SELECT statement that includes an AT epoch clause only returns data that was committed on or before the specified epoch.

  • Purge deleted data: Deleted data is not removed from physical storage until it is purged from the database. You can purge deleted data from the database only if it precedes the ancient history marker (AHM) epoch.

Vertica has one open epoch and any number of closed epochs, depending on your system configuration. New and updated data is written into the open epoch, and each closed epoch represents a previous commit to your database. When data is committed with a DML operation (INSERT, UPDATE, MERGE, COPY, or DELETE), Vertica writes the data, closes the open epoch, and opens a new epoch. Each row committed to the database is associated with the epoch in which it was written.

The EPOCHS system table contains information about each available closed epoch. The epoch_close_time column stores the date and time of the commit. The epoch_number column stores the corresponding epoch number:

=> SELECT * FROM EPOCHS;
       epoch_close_time        | epoch_number
-------------------------------+--------------
 2020-07-27 14:29:49.687106-04 |           91
 2020-07-28 12:51:53.291795-04 |           92
(2 rows)

Epoch milestones

As an epoch progresses through its life cycle, it reaches milestones that Vertica uses it to perform a variety of operations and maintain the state of the database. The following image generally depicts these milestones within the epoch life cycle:

Vertica defines each milestone as follows:

  • Current epoch (CE): The current, open epoch that you are presently writing data to.

  • Latest epoch (LE): The most recently closed epoch.

  • Checkpoint epoch: Enterprise Mode only. A node-level epoch that is the latest epoch in which data is consistent across all projections on that node.

  • Last good epoch (LGE): The minimum checkpoint epoch in which data is consistent across all nodes.

  • Ancient history mark (AHM): The oldest epoch that contains data that is accessible by historical queries.

See Epoch life cycle for detailed information about each stage.