This is the multi-page printable view of this section. Click here to print.

Return to the regular view of this page.

Fault groups

You cannot create fault groups for an Eon Mode database.

Fault groups let you configure an Enterprise Mode database for your physical cluster layout. Sharing your cluster topology lets you use terrace routing to reduce the buffer requirements of large queries. It also helps to minimize the risk of correlated failures inherent in your environment, usually caused by shared resources.

Vertica automatically creates fault groups around control nodes (servers that run spread) in large cluster arrangements, placing nodes that share a control node in the same fault group. Automatic and user-defined fault groups do not include ephemeral nodes because such nodes hold no data.

Consider defining your own fault groups specific to your cluster's physical layout if you want to:

  • Use terrace routing to reduce the buffer requirements of large queries.

  • Reduce the risk of correlated failures. For example, by defining your rack layout, Vertica can better tolerate a rack failure.

  • Influence the placement of control nodes in the cluster.

Vertica supports complex, hierarchical fault groups of different shapes and sizes. The database platform provides a fault group script (DDL generator), SQL statements, system tables, and other monitoring tools.

See High availability with fault groups for an overview of fault groups with a cluster topology example.

1 - About the fault group script

To help you define fault groups on your cluster, Vertica provides a script named fault_group_ddl_generator.py in the /opt//scripts directory.

To help you define fault groups on your cluster, Vertica provides a script named fault_group_ddl_generator.py in the /opt/vertica/scripts directory. This script generates the SQL statements you need to run to create fault groups.

The fault_group_ddl_generator.py script does not create fault groups for you, but you can copy the output to a file. Then, when you run the helper script, you can use \i or vsql–f commands to pass the cluster topology to Vertica.

The fault group script takes the following arguments:

  • The database name

  • The fault group input file

For example:

$ python /opt/vertica/scripts/fault_group_ddl_generator.py VMartdb fault_grp_input.out

See also

2 - Creating a fault group input file

Use a text editor to create a fault group input file for the targeted cluster.

Use a text editor to create a fault group input file for the targeted cluster.

The following example shows how you can create a fault group input file for a cluster that has 8 racks with 8 nodes on each rack—for a total of 64 nodes in the cluster.

  1. On the first line of the file, list the parent (top-level) fault groups, delimited by spaces.

    rack1 rack2 rack3 rack4 rack5 rack6 rack7 rack8
    
  2. On the subsequent lines, list the parent fault group followed by an equals sign (=). After the equals sign, list the nodes or fault groups delimited by spaces.

    <parent> = <child_1> <child_2> <child_n...>
    

    Such as:

    rack1 = v_vmart_node0001 v_vmart_node0002 v_vmart_node0003 v_vmart_node0004
    rack2 = v_vmart_node0005 v_vmart_node0006 v_vmart_node0007 v_vmart_node0008
    rack3 = v_vmart_node0009 v_vmart_node0010 v_vmart_node0011 v_vmart_node0012
    rack4 = v_vmart_node0013 v_vmart_node0014 v_vmart_node0015 v_vmart_node0016
    rack5 = v_vmart_node0017 v_vmart_node0018 v_vmart_node0019 v_vmart_node0020
    rack6 = v_vmart_node0021 v_vmart_node0022 v_vmart_node0023 v_vmart_node0024
    rack7 = v_vmart_node0025 v_vmart_node0026 v_vmart_node0027 v_vmart_node0028
    rack8 = v_vmart_node0029 v_vmart_node0030 v_vmart_node0031 v_vmart_node0032
    

    After the first row of parent fault groups, the order in which you write the group descriptions does not matter. All fault groups that you define in this file must refer back to a parent fault group. You can indicate the parent group directly or by specifying the child of a fault group that is the child of a parent fault group.

    Such as:

    rack1 rack2 rack3 rack4 rack5 rack6 rack7 rack8
    rack1 = v_vmart_node0001 v_vmart_node0002 v_vmart_node0003 v_vmart_node0004
    rack2 = v_vmart_node0005 v_vmart_node0006 v_vmart_node0007 v_vmart_node0008
    rack3 = v_vmart_node0009 v_vmart_node0010 v_vmart_node0011 v_vmart_node0012
    rack4 = v_vmart_node0013 v_vmart_node0014 v_vmart_node0015 v_vmart_node0016
    rack5 = v_vmart_node0017 v_vmart_node0018 v_vmart_node0019 v_vmart_node0020
    rack6 = v_vmart_node0021 v_vmart_node0022 v_vmart_node0023 v_vmart_node0024
    rack7 = v_vmart_node0025 v_vmart_node0026 v_vmart_node0027 v_vmart_node0028
    rack8 = v_vmart_node0029 v_vmart_node0030 v_vmart_node0031 v_vmart_node0032
    

After you create your fault group input file, you are ready to run the fault_group_ddl_generator.py. This script generates the DDL statements you need to create fault groups in Vertica.

If your Vertica database is co-located on a Hadoop cluster, and that cluster uses more than one rack, you can use fault groups to improve performance. See Configuring rack locality.

See also

Creating fault groups

3 - Creating fault groups

When you define fault groups, Vertica distributes data segments across the cluster.

When you define fault groups, Vertica distributes data segments across the cluster. This allows the cluster to be aware of your cluster topology so it can tolerate correlated failures inherent in your environment, such as a rack failure. For an overview, see High Availability With Fault Groups.

Prerequisites

To define a fault group, you must have:

Run the fault group script

  1. As the database administrator, run the fault_group_ddl_generator.py script:

    python /opt/vertica/scripts/fault_group_ddl_generator.py databasename fault-group-inputfile > sql-filename
    

    For example, the following command writes the Python script output to the SQL file fault_group_ddl.sql.

    $ python /opt/vertica/scripts/fault_group_ddl_generator.py
        VMart fault_groups_VMart.out > fault_group_ddl.sql
    

    After the script returns, you can run the SQL file, instead of multiple DDL statements individually.

  2. Using vsql, run the DDL statements in fault_group_ddl.sql or execute the commands in the file using vsql.

    => \i fault_group_ddl.sql
    
  3. If large cluster is enabled, realign control nodes with REALIGN_CONTROL_NODES. Otherwise, skip this step.

    => SELECT REALIGN_CONTROL_NODES();
    
  4. Save cluster changes to the Spread configuration file by calling RELOAD_SPREAD:

    => SELECT RELOAD_SPREAD(true);
    
  5. Use Administration tools to restart the database.

  6. Save changes to the cluster's data layout by calling REBALANCE_CLUSTER:

    => SELECT REBALANCE_CLUSTER();
    

See also

4 - Monitoring fault groups

You can monitor fault groups by querying Vertica system tables or by logging in to the Management Console (MC) interface.

You can monitor fault groups by querying Vertica system tables or by logging in to the Management Console (MC) interface.

Monitor fault groups using system tables

Use the following system tables to view information about fault groups and cluster vulnerabilities, such as the nodes the cluster cannot lose without the database going down:

  • V_CATALOG.FAULT_GROUPS: View the hierarchy of all fault groups in the cluster.

  • V_CATALOG.CLUSTER_LAYOUT: Observe the arrangement of the nodes participating in the data business and the fault groups that affect them. Ephemeral nodes do not appear in the cluster layout ring because they hold no data.

Monitoring fault groups using Management Console

An MC administrator can monitor and highlight fault groups of interest by following these steps:

  1. Click the running database you want to monitor and click Manage in the task bar.

  2. Open the Fault Group View menu, and select the fault groups you want to view.

  3. (Optional) Hide nodes that are not in the selected fault group to focus on fault groups of interest.

Nodes assigned to a fault group each have a colored bubble attached to the upper-left corner of the node icon. Each fault group has a unique color.If the number of fault groups exceeds the number of colors available, MC recycles the colors used previously.

Because Vertica supports complex, hierarchical fault groups of different shapes and sizes, MC displays multiple fault group participation as a stack of different-colored bubbles. The higher bubbles represent a lower-tiered fault group, which means that bubble is closer to the parent fault group, not the child or grandchild fault group.

For more information about fault group hierarchy, see High Availability With Fault Groups.

5 - Dropping fault groups

When you remove a fault group from the cluster, be aware that the drop operation removes the specified fault group and its child fault groups.

When you remove a fault group from the cluster, be aware that the drop operation removes the specified fault group and its child fault groups. Vertica places all nodes under the parent of the dropped fault group. To see the current fault group hierarchy in the cluster, query system table FAULT_GROUPS.

Drop a fault group

Use the DROP FAULT GROUP statement to remove a fault group from the cluster. The following example shows how you can drops the group2 fault group:

=> DROP FAULT GROUP group2;
DROP FAULT GROUP

Drop all fault groups

Use the ALTER DATABASE statement to drop all fault groups, along with any child fault groups, from the specified database cluster.

The following command drops all fault groups from the current database.

=> ALTER DATABASE DEFAULT DROP ALL FAULT GROUP;
ALTER DATABASE

Add nodes back to a fault group

To add a node back to a fault group, you must manually reassign it to a new or existing fault group. To do so, use the CREATE FAULT GROUP and ALTER FAULT GROUP..ADD NODE statements.

See also