Monitoring recovery

When your Vertica database is recovering from a failure, it's important to monitor the recovery process. There are several ways to monitor database recovery:

1 - Viewing log files on each node

During database recovery, Vertica adds logging information to the vertica.log file on each host. Each recovery message is identified by a [Recover] string.

Use the tail command to monitor recovery progress by viewing the relevant status messages, as follows:

$ tail -f catalog-path/database-name/node-name_catalog/vertica.log
01/23/08 10:35:31 thr:Recover:0x2a98700970 [Recover] <INFO> Changing host v_vmart_node0001 startup state from INITIALIZING to RECOVERING
01/23/08 10:35:31 thr:CatchUp:0x1724b80 [Recover] <INFO> Recovering to specified epoch 0x120b6
01/23/08 10:35:31 thr:CatchUp:0x1724b80 [Recover] <INFO> Running 1 split queries
01/23/08 10:35:31 thr:CatchUp:0x1724b80 [Recover] <INFO> Running query: ALTER PROJECTION proj_tradesquotes_0 SPLIT v_vmart_node0001 FROM 73911;
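
Because every recovery message carries the [Recover] tag, the log can be narrowed down to just those lines. The following is a minimal sketch, not a Vertica-supplied tool: the function names recover_messages and follow_recover_messages are hypothetical, and the log path you pass in is the same placeholder path shown above.

```shell
# Hypothetical helper (not part of Vertica): print only the lines of a
# vertica.log file that carry the [Recover] tag.
recover_messages() {
    # -F makes grep treat "[Recover]" as a literal string instead of a
    # regular-expression bracket expression.
    grep -F '[Recover]' "$1"
}

# Hypothetical helper: follow the log as it grows and show only recovery
# messages. --line-buffered is supported by GNU and BSD grep; it flushes
# each matching line immediately instead of waiting for a full buffer.
follow_recover_messages() {
    tail -f "$1" | grep --line-buffered -F '[Recover]'
}
```

For example, calling follow_recover_messages with catalog-path/database-name/node-name_catalog/vertica.log as its argument shows recovery messages as they are written and hides unrelated log traffic.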

2 - Using system tables to monitor recovery

Use the recovery_status and projection_recoveries system tables to monitor recovery.

Specifically, the recovery_status system table includes information about the node that is recovering, the epoch being recovered, the current recovery phase, and running status:

=> select node_name, recover_epoch, recovery_phase, current_completed, is_running from recovery_status;
node_name            | recover_epoch | recovery_phase    | current_completed | is_running
---------------------+---------------+-------------------+-------------------+--------------
 v_vmart_node0001    |               |                   | 0                 | f
 v_vmart_node0002    | 0             | historical pass 1 | 0                 | t
 v_vmart_node0003    | 1             | current           | 0                 | f

The projection_recoveries system table maintains a history of projection recoveries. To check recovery status, summarize the data for the recovering node and run the same query several times. If the counts change between runs, recovery is progressing and is still restoring missing data.

=> select node_name, status, progress from projection_recoveries;
node_name              | status      | progress
-----------------------+-------------+---------
v_vmart_node0001       | running     | 61

To see a single record from the projection_recoveries system table, add limit 1 to the query.

After a recovery has completed, Vertica continues to store information from the most recent recovery in these tables.
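
The repeated check described above can be scripted. The sketch below is one possible approach, not a documented Vertica procedure: it assumes the query is run through vsql with unaligned, tuples-only output (the -A and -t options), so that each row arrives as node_name|status|progress. The function name summarize_recoveries is hypothetical, and connection options such as host, user, and password are omitted.

```shell
# Hypothetical helper: read rows of the form node_name|status|progress on
# stdin and print one "node_name status count" line per combination.
summarize_recoveries() {
    awk -F'|' 'NF >= 2 { count[$1 " " $2]++ }
               END { for (key in count) print key, count[key] }' | sort
}

# Example (assumes vsql is on the PATH and can connect to the database):
# vsql -A -t -c "select node_name, status, progress from projection_recoveries;" \
#     | summarize_recoveries
```

Running the example several times and comparing the counts per status gives the same signal as re-running the query by hand. Because the table keeps history from the most recent recovery, the counts do not disappear when recovery finishes.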

3 - Viewing cluster state and recovery status

Use the admintools view_cluster tool from the command line to see the cluster state:

$ /opt/vertica/bin/admintools -t view_cluster
 DB          | Host         | State
-------------+--------------+------------
 <data_base> | 112.17.31.10 | RECOVERING
 <data_base> | 112.17.31.11 | UP
 <data_base> | 112.17.31.12 | UP
 <data_base> | 112.17.31.17 | UP
________________________________
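
For scripted monitoring, the view_cluster output can be filtered by state. The sketch below assumes the three-column DB | Host | State layout shown above; hosts_in_state is a hypothetical helper name, not an admintools option, and the 10-second polling interval is an arbitrary choice.

```shell
# Hypothetical helper: read view_cluster output on stdin and print the
# Host column of every row whose State column equals the requested state.
hosts_in_state() {
    awk -F'|' -v want="$1" '
        NF == 3 {
            host = $2; state = $3
            gsub(/^[ \t]+|[ \t]+$/, "", host)    # trim padding around the host
            gsub(/^[ \t]+|[ \t]+$/, "", state)   # trim padding around the state
            if (state == want) print host
        }'
}

# Example: poll every 10 seconds until no host reports RECOVERING.
# while [ -n "$(/opt/vertica/bin/admintools -t view_cluster | hosts_in_state RECOVERING)" ]; do
#     sleep 10
# done
```

Rows that do not have exactly three pipe-separated fields, such as the separator line, are ignored, and the header row is skipped because its State column never matches a real state name.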

4 - Monitoring cluster status after recovery

When recovery has completed:

  1. Launch Administration Tools.

  2. From the Main Menu, select View Database Cluster State and click OK.

    The utility reports your node's status as UP.