Epoch life cycle
The epoch life cycle consists of a sequence of milestones that enable you to perform a variety of operations and manage the state of your database.
Note
Depending on your configuration, a single epoch can represent the latest epoch, last good epoch, checkpoint epoch, and ancient history mark.
Vertica provides epoch management parameters and functions so that you can retrieve and adjust epoch values. Additionally, see Configuring epochs for recommendations on how to set epochs for specific use cases.
Current epoch (CE)
The open epoch that contains all uncommitted changes that you are presently writing to the database. The current epoch is stored in the SYSTEM system table:
=> SELECT CURRENT_EPOCH FROM SYSTEM;
CURRENT_EPOCH
---------------
71
(1 row)
The following example demonstrates how the current epoch advances when you commit data:
- Query the SYSTEM system table to return the current epoch:
  => SELECT CURRENT_EPOCH FROM SYSTEM;
   CURRENT_EPOCH
  ---------------
              71
  (1 row)
  The current epoch is open, which means it is the epoch that you are presently writing data to.
- Insert a row into the orders table:
  => INSERT INTO orders VALUES ('123456789', 323426, 'custacct@example.com');
   OUTPUT
  --------
        1
  (1 row)
  Each row of data has an implicit epoch column that stores that row's commit epoch. The row that you just inserted into the table was not committed, so the epoch column is blank:
  => SELECT epoch, orderkey, custkey, email_addrs FROM orders;
   epoch | orderkey  | custkey |     email_addrs
  -------+-----------+---------+----------------------
         | 123456789 |  323426 | custacct@example.com
  (1 row)
- Commit the data, then query the table again. The committed data is associated with epoch 71, the current epoch that was previously returned from the SYSTEM system table:
  => COMMIT;
  COMMIT
  => SELECT epoch, orderkey, custkey, email_addrs FROM orders;
   epoch | orderkey  | custkey |     email_addrs
  -------+-----------+---------+----------------------
      71 | 123456789 |  323426 | custacct@example.com
  (1 row)
- Query the SYSTEM system table again to return the current epoch. The current epoch has advanced by one:
  => SELECT CURRENT_EPOCH FROM SYSTEM;
   CURRENT_EPOCH
  ---------------
              72
  (1 row)
Latest epoch (LE)
The most recently closed epoch. The current epoch becomes the latest epoch after a commit operation.
The LE is the most recent epoch stored in the EPOCHS system table:
=> SELECT * FROM EPOCHS;
epoch_close_time | epoch_number
-------------------------------+--------------
2020-07-27 14:29:49.687106-04 | 91
2020-07-28 12:51:53.291795-04 | 92
(2 rows)
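To see the relationship between the latest epoch and the current epoch side by side, a query along the following lines may help. It is a minimal sketch that uses only the SYSTEM and EPOCHS columns shown above; the aliases s, e, and latest_epoch are illustrative names, not part of any Vertica schema.

```sql
-- Sketch: compare the most recently closed epoch (LE) with the open epoch (CE).
-- Assumes EPOCHS.epoch_number and SYSTEM.current_epoch as shown in the examples above.
SELECT s.current_epoch,
       e.latest_epoch
FROM SYSTEM s
CROSS JOIN (SELECT MAX(epoch_number) AS latest_epoch FROM EPOCHS) e;
```

On a healthy database, latest_epoch should be one less than current_epoch.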
Checkpoint epoch (CPE)
Valid in Enterprise Mode only. Each node has a checkpoint epoch, which is the most recent epoch in which the data on that node is consistent across all projections. When the database runs optimally, the checkpoint epoch is equal to the LE, which is always one epoch older than the current epoch.
The checkpoint epoch is used during node failure and recovery. When a single node fails, that node attempts to rebuild data beyond its checkpoint epoch from other nodes. If the failed node cannot recover data using any of those epochs, then the failed node recovers data using the checkpoint epoch.
Use PROJECTION_CHECKPOINT_EPOCHS to query information about the checkpoint epochs. The following query returns information about the checkpoint epoch on nodes that store the orders projection:
=> SELECT checkpoint_epoch, node_name, projection_name, is_up_to_date, would_recover, is_behind_ahm
   FROM PROJECTION_CHECKPOINT_EPOCHS WHERE projection_name ILIKE 'orders_b%';
 checkpoint_epoch |    node_name     | projection_name | is_up_to_date | would_recover | is_behind_ahm
------------------+------------------+-----------------+---------------+---------------+---------------
               92 | v_vmart_node0001 | orders_b1       | t             | f             | f
               92 | v_vmart_node0001 | orders_b0       | t             | f             | f
               92 | v_vmart_node0003 | orders_b1       | t             | f             | f
               92 | v_vmart_node0003 | orders_b0       | t             | f             | f
               92 | v_vmart_node0002 | orders_b0       | t             | f             | f
               92 | v_vmart_node0002 | orders_b1       | t             | f             | f
(6 rows)
This query confirms that the database epochs are advancing correctly. The would_recover column displays an f when the last good epoch (LGE) is equal to the CPE because Vertica gives precedence to the LGE for recovery when possible. The is_behind_ahm column shows whether the checkpoint epoch is behind the ancient history mark (AHM). Any data in an epoch that precedes the AHM is unrecoverable in case of a database or node failure.
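When a cluster stores many projections, it can be easier to list only the rows that need attention. The following sketch filters on the same columns returned by the query above; treating those columns as booleans in a WHERE clause is an assumption based on the t/f values in the sample output.

```sql
-- Sketch: list projections whose checkpoint epoch is stale or behind the AHM.
-- Column names are taken from the PROJECTION_CHECKPOINT_EPOCHS output shown above.
SELECT node_name,
       projection_name,
       checkpoint_epoch
FROM PROJECTION_CHECKPOINT_EPOCHS
WHERE NOT is_up_to_date
   OR is_behind_ahm
ORDER BY node_name, projection_name;
```

An empty result would be consistent with the healthy state shown in the example output.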
Last good epoch (LGE)
The minimum checkpoint epoch in which data is consistent across all nodes in the cluster. Each node has an LGE, and Vertica evaluates the LGE for each node to determine the cluster LGE. The cluster's LGE is stored in the SYSTEM system table:
=> SELECT LAST_GOOD_EPOCH FROM SYSTEM;
LAST_GOOD_EPOCH
-----------------
70
(1 row)
You can retrieve the LGE for each node by querying the expected recovery epoch:
=> SELECT GET_EXPECTED_RECOVERY_EPOCH();
INFO 4544: Recovery Epoch Computation:
Node Dependencies:
011 - cnt: 21
101 - cnt: 21
110 - cnt: 21
111 - cnt: 9
001 - name: v_vmart_node0001
010 - name: v_vmart_node0002
100 - name: v_vmart_node0003
Nodes certainly in the cluster:
Node 0(v_vmart_node0001), epoch 70
Node 1(v_vmart_node0002), epoch 70
Filling more nodes to satisfy node dependencies:
Data dependencies fulfilled, remaining nodes LGEs don't matter:
Node 2(v_vmart_node0003), epoch 70
--
GET_EXPECTED_RECOVERY_EPOCH
-----------------------------
70
(1 row)
Because the LGE is a snapshot of all of the most recent data on the disk, it is used to recover from database failure. Administration Tools uses the LGE to manually reset the database. If you are recovering from database failure after an unclean shutdown, Vertica prompts you to accept recovery using the LGE during restart.
Ancient history mark (AHM)
The oldest epoch that contains data that is accessible by historical queries. The AHM is stored in the SYSTEM system table:
=> SELECT AHM_EPOCH FROM SYSTEM;
AHM_EPOCH
-----------
70
(1 row)
Epochs that precede the AHM are unavailable for historical queries. The following example returns the AHM, and then returns an error when executing a historical query that precedes the AHM:
=> SELECT GET_AHM_EPOCH();
GET_AHM_EPOCH
---------------
93
(1 row)
=> AT EPOCH 92 SELECT * FROM orders;
ERROR 3183: Epoch number out of range
HINT: Epochs prior to [93] do not exist. Epochs [94] and later have not yet closed
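By contrast, a historical query succeeds when its epoch falls inside the range named in the hint. The statements below are a sketch that assumes the orders table and the AHM value of 93 from the example above; your own epoch numbers will differ.

```sql
-- Sketch: historical queries that stay within the valid epoch range.
-- Epoch 93 is the AHM in the example above, so it is the oldest queryable epoch.
AT EPOCH 93 SELECT * FROM orders;

-- AT EPOCH LATEST reads committed data up to, but not including, the current epoch.
AT EPOCH LATEST SELECT * FROM orders;
```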
The AHM advances according to your HistoryRetentionTime, HistoryRetentionEpochs, and AdvanceAHMInterval parameter settings. By default, the AHM advances every 180 seconds until it is equal to the LGE. This helps reduce the number of epochs saved to the epoch map, which reduces the catalog size. The AHM cannot advance beyond the LGE.
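To check how these parameters are currently set, you can query the CONFIGURATION_PARAMETERS system table. This is a minimal sketch; it assumes your user has permission to read that table, and the values returned depend on your database.

```sql
-- Sketch: inspect the parameters that control how the AHM advances.
SELECT parameter_name,
       current_value,
       default_value
FROM CONFIGURATION_PARAMETERS
WHERE parameter_name IN ('HistoryRetentionTime',
                         'HistoryRetentionEpochs',
                         'AdvanceAHMInterval');
```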
The AHM serves as the cutoff epoch for purging data from physical disk. As the AHM advances, the Tuple Mover mergeout process purges any deleted data that belongs to an epoch that precedes the AHM. See Purging deleted data for details about automated or manual purges.
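Because the current epoch, LGE, and AHM are all stored in the SYSTEM system table, a single query can return them together. The following sketch uses only the columns queried individually in the sections above.

```sql
-- Sketch: view the main epoch milestones in one row.
SELECT CURRENT_EPOCH,
       LAST_GOOD_EPOCH,
       AHM_EPOCH
FROM SYSTEM;
```

Given the rules described above, you would expect AHM_EPOCH to be no greater than LAST_GOOD_EPOCH, and LAST_GOOD_EPOCH to be less than CURRENT_EPOCH.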