When your Vertica database is recovering from a failure, it's important to monitor the recovery process. There are several ways to monitor database recovery:
Monitoring recovery
- 1: Viewing log files on each node
- 2: Using system tables to monitor recovery
- 3: Viewing cluster state and recovery status
- 4: Monitoring cluster status after recovery
1 - Viewing log files on each node
During database recovery, Vertica adds logging information to the vertica.log file on each host. Each message is identified with a [Recover] string.
Use the tail command to monitor recovery progress by viewing the relevant status messages, as follows:
$ tail -f catalog-path/database-name/node-name_catalog/vertica.log
01/23/08 10:35:31 thr:Recover:0x2a98700970 [Recover] <INFO> Changing host v_vmart_node0001 startup state from INITIALIZING to RECOVERING
01/23/08 10:35:31 thr:CatchUp:0x1724b80 [Recover] <INFO> Recovering to specified epoch 0x120b6
01/23/08 10:35:31 thr:CatchUp:0x1724b80 [Recover] <INFO> Running 1 split queries
01/23/08 10:35:31 thr:CatchUp:0x1724b80 [Recover] <INFO> Running query: ALTER PROJECTION proj_tradesquotes_0 SPLIT v_vmart_node0001 FROM 73911;
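Because vertica.log also carries messages unrelated to recovery, it can help to filter the stream down to the [Recover] lines. The following is a minimal sketch, not part of the Vertica tooling: the function name recover_lines is hypothetical, and the path in the usage comment is the same placeholder used in the example above.

```shell
# Print only the recovery-related lines from a Vertica log read on stdin.
# -F makes grep treat "[Recover]" as a literal string, not a bracket expression.
recover_lines() {
    grep -F '[Recover]'
}

# Hypothetical usage against a live log (path placeholders as in the example above):
#   tail -f catalog-path/database-name/node-name_catalog/vertica.log | recover_lines
```

When the output is piped further rather than shown on a terminal, grep may buffer its output; GNU grep's --line-buffered option is one way to avoid that.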
2 - Using system tables to monitor recovery
Use the recovery_status and projection_recoveries system tables to monitor recovery.
The recovery_status system table includes information about the node that is recovering, the epoch being recovered, the current recovery phase, and running status:
=> select node_name, recover_epoch, recovery_phase, current_completed, is_running from recovery_status;
node_name        | recover_epoch | recovery_phase    | current_completed | is_running
-----------------+---------------+-------------------+-------------------+-----------
v_vmart_node0001 |               |                   | 0                 | f
v_vmart_node0002 | 0             | historical pass 1 | 0                 | t
v_vmart_node0003 | 1             | current           | 0                 | f
The projection_recoveries system table maintains a history of projection recoveries. To check recovery status, summarize the data for the recovering node and run the same query several times to see whether the counts change. Changing counts indicate that recovery is in progress and is still recovering missing data.
=> select node_name, status, progress from projection_recoveries;
node_name        | status  | progress
-----------------+---------+---------
v_vmart_node0001 | running | 61
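One way to summarize the data for a recovering node is to count its projection_recoveries rows per status and compare the counts between runs. The sketch below is illustrative only: the function name summarize_recoveries is hypothetical, and it assumes the rows arrive unaligned and pipe-delimited (node_name|status|progress), as vsql should produce with its -A (no align) and -t (tuples only) options.

```shell
# Count projection_recoveries rows per status for one node.
# Reads unaligned, pipe-delimited rows (node_name|status|progress) on stdin.
# $1: name of the recovering node.
summarize_recoveries() {
    awk -F'|' -v node="$1" '
        $1 == node { count[$2]++ }                      # tally rows by status
        END { for (s in count) print s, count[s] }     # one "status count" line each
    ' | sort
}

# Hypothetical usage (assumes vsql is on the PATH and can connect to the database):
#   vsql -A -t -c "select node_name, status, progress from projection_recoveries" \
#       | summarize_recoveries v_vmart_node0001
```

Running the usage line a few times and comparing the output shows whether the per-status counts are moving.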
To see a single record from the projection_recoveries system table, add limit 1 to the query.
After a recovery has completed, Vertica continues to store information from the most recent recovery in these tables.
3 - Viewing cluster state and recovery status
Use the admintools view_cluster tool from the command line to see the cluster state:
$ /opt/vertica/bin/admintools -t view_cluster
DB          | Host         | State
------------+--------------+------------
<data_base> | 112.17.31.10 | RECOVERING
<data_base> | 112.17.31.11 | UP
<data_base> | 112.17.31.12 | UP
<data_base> | 112.17.31.17 | UP
________________________________
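On a larger cluster it can be easier to list only the hosts that are not yet UP. The following sketch assumes the view_cluster output keeps the three-column DB | Host | State layout shown above; the function name hosts_not_up is hypothetical and not part of admintools.

```shell
# List hosts whose state is anything other than UP.
# Reads `admintools -t view_cluster` output on stdin, assuming the
# DB | Host | State layout with a header row and a dashed separator row.
hosts_not_up() {
    awk -F'|' '
        NF == 3 {                                   # skip separator and rule lines
            host = $2; state = $3
            gsub(/^[ \t]+|[ \t]+$/, "", host)       # trim surrounding whitespace
            gsub(/^[ \t]+|[ \t]+$/, "", state)
            if (host != "Host" && state != "UP") print host, state
        }
    '
}

# Hypothetical usage:
#   /opt/vertica/bin/admintools -t view_cluster | hosts_not_up
```

An empty result suggests that every listed host reports UP.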
4 - Monitoring cluster status after recovery
When recovery has completed:
- Launch Administration Tools.
- From the Main Menu, select View Database Cluster State and click OK.
  The utility reports your node's status as UP.