Rebalancing data across nodes

Vertica can rebalance your database when you add or remove nodes. As a superuser, you can manually trigger a rebalance with Administration Tools, SQL functions, or the Management Console.

A rebalance operation can take some time, depending on the size of the cluster, the number of projections, and the amount of data they contain. Allow the process to complete uninterrupted. If you must cancel the operation, call CANCEL_REBALANCE_CLUSTER.

Why rebalance?

Rebalancing is useful or even necessary after you perform one of the following operations:

  • Change the size of the cluster by adding or removing nodes.

  • Mark one or more nodes as ephemeral in preparation for removing them from the cluster.

  • Change the scaling factor of an elastic cluster, which determines the number of storage containers used to store a projection across the database.

  • Set the control node size or realign control nodes on a large cluster layout.

  • Specify more than 120 nodes in your initial Vertica cluster configuration.

  • Modify a fault group by adding or removing nodes.

General rebalancing tasks

When you rebalance a database cluster, Vertica performs the following tasks for all projections, segmented and unsegmented alike:

  • Distributes data based on:

  • Ignores node-specific distribution specifications in projection definitions. Node rebalancing always distributes data across all nodes.

  • When rebalancing is complete, sets the Ancient History Mark (AHM) to the greatest allowable epoch (now).

Vertica rebalances segmented and unsegmented projections differently, as described below.

Rebalancing segmented projections

For each segmented projection, Vertica performs the following tasks:

  1. Copies and renames projection buddies and distributes them evenly across all nodes. The renamed projections share the same base name.

  2. Refreshes the new projections.

  3. Drops the original projections.

Rebalancing unsegmented projections

For each unsegmented projection, Vertica performs the following tasks:

If adding nodes:

  • Creates projection buddies on them.

  • Maps the new projections to their shared name in the database catalog.

If dropping nodes: drops the projection buddies from them.

K-safety and rebalancing

Until rebalancing completes, Vertica operates with the existing K-safe value. After rebalancing completes, Vertica operates with the K-safe value specified during the rebalance operation. The new K-safe value must be equal to or higher than current K-safety. Vertica does not support downgrading K-safety and returns a warning if you try to reduce it from its current value. For more information, see Lowering K-Safety to enable node removal.

Rebalancing failure and projections

If a failure occurs while rebalancing the database, you can rebalance again. If the cause of the failure has been resolved, the rebalance operation continues from where it failed. However, a failed data rebalance can result in projections becoming out of date.

To locate out-of-date projections, query the system table PROJECTIONS as follows:

=> SELECT projection_name, anchor_table_name, is_up_to_date FROM projections
   WHERE is_up_to_date = false;

To remove out-of-date projections, use DROP PROJECTION.
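As a minimal sketch of that cleanup step, the statement below drops one projection. The schema and projection name are hypothetical placeholders, not names from this documentation; substitute a projection_name returned by the PROJECTIONS query above, and confirm that the projection is no longer needed before you drop it.

```sql
-- Hypothetical projection name, used for illustration only.
-- Replace it with a projection_name that the PROJECTIONS query reported
-- with is_up_to_date = false.
DROP PROJECTION public.store_sales_super;
```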

Temporary tables

Node rebalancing has no effect on projections of temporary tables.

For detailed information about rebalancing

See the Knowledge Base articles:

1 - Rebalancing data using the administration tools UI

To rebalance the data in your database:

  1. Open the Administration Tools. (See Using the administration tools.)

  2. On the Main Menu, select View Database Cluster State to verify that the database is running. If it is not, start it.

  3. From the Main Menu, select Advanced Menu and click OK.

  4. In the Advanced Menu, select Cluster Management and click OK.

  5. In the Cluster Management menu, select Re-balance Data and click OK.

  6. Select the database you want to rebalance and click OK.

  7. Enter the directory for the Database Designer outputs (for example, /tmp) and click OK.

  8. Accept the proposed K-safety value or provide a new value. Valid values are 0 to 2.

  9. Review the message and click Proceed to begin rebalancing data.

    The Database Designer modifies existing projections to rebalance data across all database nodes with the K-safety value you provided. It also generates a rebalance script in the path you specified (for example, /tmp/extend_catalog_rebalance.sql), which you can run manually at a later time.

    The terminal window notifies you when the rebalancing operation is complete.

  10. Press Enter to return to the Administration Tools.

2 - Rebalancing data using SQL functions

Vertica has three SQL functions for starting and stopping a cluster rebalance. You can call these functions from a script that runs during off-peak hours, rather than manually triggering a rebalance through Administration Tools.
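The following is a minimal sketch of how such a script might call these functions. It assumes the three functions are REBALANCE_CLUSTER, START_REBALANCE_CLUSTER, and CANCEL_REBALANCE_CLUSTER; only the last is named elsewhere in this section, so verify the names and behavior against the function reference for your Vertica version. The statements are alternatives rather than a sequence, and each must be run by a superuser.

```sql
-- Assumption: REBALANCE_CLUSTER runs synchronously, so the session
-- waits until the rebalance finishes or fails.
SELECT REBALANCE_CLUSTER();

-- Assumption: START_REBALANCE_CLUSTER starts the rebalance as a
-- background task and returns control to the session immediately.
SELECT START_REBALANCE_CLUSTER();

-- Cancels a rebalance that is in progress.
SELECT CANCEL_REBALANCE_CLUSTER();
```

A scheduled off-peak job would typically contain only one of the two start calls. The synchronous form suits scripts that need to know when the rebalance has ended before continuing.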