This is the multi-page printable view of this section. Click here to print.

Return to the regular view of this page.

Large cluster

Vertica uses the Spread service to broadcast control messages between database nodes.

1: Planning a large cluster
2: Enabling large cluster
3: Changing the number of control nodes and realigning
4: Monitoring large clusters

Vertica uses the Spread service to broadcast control messages between database nodes. This service can limit the growth of a Vertica database cluster. As you increase the number of cluster nodes, the load on the Spread service also increases as more participants exchange messages. This increased load can slow overall cluster performance. Also, network addressing limits the maximum number of participants in the Spread service to 120 (and often far less). In this case, you can use large cluster to overcome these Spread limitations.

When large cluster is enabled, a subset of cluster nodes, called control nodes, exchange messages using the Spread service. Other nodes in the cluster are assigned to one of these control nodes, and depend on them for cluster-wide communication. Each control node passes messages from the Spread service to its dependent nodes. When a dependent node needs to broadcast a message to other nodes in the cluster, it passes the message to its control node, which in turn sends the message out to its other dependent nodes and the Spread service.

By setting up dependencies between control nodes and other nodes, you can grow the total number of database nodes, and remain in compliance with the Spread limit of 120 nodes.

Note

Technically, when large cluster is disabled, all of the nodes in the cluster are control nodes. In this case, all nodes connect to Spread. When large cluster is enabled, some nodes become dependent on control nodes.

A downside of the large cluster feature is that if a control node fails, its dependent nodes are cut off from the rest of the database cluster. These nodes cannot participate in database activities, and Vertica considers them to be down as well. When the control node recovers, it re-establishes communication between its dependent nodes and the database, so all of the nodes rejoin the cluster.

Note

The Spread service demon runs as an independent process on the control node host. It is not part of the Vertica process. If the Vertica process goes down on the node—for example, you use admintools to stop the Vertica process on the host—Spread continues to run. As long as the Spread demon runs on the control node, the node's dependents can communicate with the database cluster and participate in database activity. Normally, the control node only goes down if the node's host has an issue—or example, you shut it down, it becomes disconnected from the network, or a hardware failure occurs.

Large cluster and database growth

When your database has large cluster enabled, Vertica decides whether to make a newly added node into a control or a dependent node as follows:

In Enterprise Mode, if the number of control nodes configured for the database cluster is greater than the current number of nodes it contains, Vertica makes the new node a control node. In Eon Mode, the number of control nodes is set at the subcluster level. If the number of control nodes set for the subcluster containing the new node is less than this setting, Vertica makes the new node a control node.
If the Enterprise Mode cluster or Eon Mode subcluster has reached its limit on control nodes, a new node becomes a dependent of an existing control node.

When a newly-added node is a dependent node, Vertica automatically assigns it to a control node. Which control node it chooses is guided by the database mode:

Enterprise Mode database: Vertica assigns the new node to the control node with the least number of dependents. If you created fault groups in your database, Vertica chooses a control node in the same fault group as the new node. This feature lets you use fault groups to organize control nodes and their dependents to reflect the physical layout of the underlying host hardware. For example, you might want dependent nodes to be in the same rack as their control nodes. Otherwise, a failure that affects the entire rack (such as a power supply failure) will not only cause nodes in the rack to go down, but also nodes in other racks whose control node is in the affected rack. See Fault groups for more information.
Eon Mode database: Vertica always adds new nodes to a subcluster. Vertica assigns the new node to the control node with the fewest dependent nodes in that subcluster. Every subcluster in an Eon Mode database with large cluster enabled has at least one control node. Keeping dependent nodes in the same subcluster as their control node maintains subcluster isolation.

Important

In versions of Vertica prior to 10.0.1, nodes in an Eon Mode database with large cluster enabled were not necessarily assigned a control node in their subcluster. If you have upgraded your Eon Mode database from a version of Vertica earlier than 10.0.1 and have large cluster enabled, realign the control nodes in your database. This process reassigns dependent nodes and fixes any cross-subcluster control node dependencies. See Realigning Control Nodes and Reloading Spread for more information.

Spread's upper limit of 120 participants can cause errors when adding a subcluster to an Eon Mode database. If your database cluster has 120 control nodes, attempting to create a subcluster fails with an error. Every subcluster must have at least one control node. When your cluster has 120 control nodes , Vertica cannot create a control node for the new subcluster. If this error occurs, you must reduce the number of control nodes in your database cluster before adding a subcluster.

When to enable large cluster

Vertica automatically enables large cluster in two cases:

The database cluster contains 120 or more nodes. This is true for both Enterprise Mode and Eon Mode.
You create an Eon Mode subcluster (either a primary subcluster or a secondary subcluster) with an initial node count of 16 or more.

Vertica does not automatically enable large cluster if you expand an existing subcluster to 16 or more nodes by adding nodes to it.

Note
You can prevent Vertica from automatically enabling large cluster when you create a subcluster with 16 or more nodes by setting the control-set-size parameter to -1. See Creating subclusters for details.

You can choose to manually enable large cluster mode before Vertica automatically enables it. Your best practice is to enable large cluster when your database cluster size reaches a threshold:

For cloud-based databases, enable large cluster when the cluster contains 16 or more nodes. In a cloud environment, your database uses point-to-point network communications. Spread scales poorly in point-to-point communications mode. Enabling large cluster when the database cluster reaches 16 nodes helps limit the impact caused by Spread being in point-to-point mode.
For on-premises databases, enable large cluster when the cluster reaches 50 to 80 nodes. Spread scales better in an on-premises environment. However, by the time the cluster size reaches 50 to 80 nodes, Spread may begin exhibiting performance issues.

In either cloud or on-premises environments, enable large cluster if you begin to notice Spread-related performance issues. Symptoms of Spread performance issues include:

The load on the spread service begins to cause performance issues. Because Vertica uses Spread for cluster-wide control messages, Spread performance issues can adversely affect database performance. This is particularly true for cloud-based databases, where Spread performance problems becomes a bottleneck sooner, due to the nature of network broadcasting in the cloud infrastructure. In on-premises databases, broadcast messages are usually less of a concern because messages usually remain within the local subnet. Even so, eventually, Spread usually becomes a bottleneck before Vertica automatically enables large cluster automatically when the cluster reaches 120 nodes.
The compressed list of addresses in your cluster is too large to fit in a maximum transmission unit (MTU) packet (1478 bytes). The MTU packet has to contain all of the addresses for the nodes participating in the Spread service. Under ideal circumstances (when your nodes have the IP addresses 1.1.1.1, 1.1.1.2 and so on) 120 addresses can fit in this packet. This is why Vertica automatically enables large cluster if your database cluster reaches 120 nodes. In practice, the compressed list of IP addresses will reach the MTU packet size limit at 50 to 80 nodes.

1 - Planning a large cluster

There are two factors you should consider when planning to expand your database cluster to the point that it needs to use large cluster:.

There are two factors you should consider when planning to expand your database cluster to the point that it needs to use large cluster:

How many control nodes should your database cluster have?
How should those control nodes be distributed?

Determining the number of control nodes

When you manually enable large cluster or add enough nodes to trigger Vertica to enable it automatically, a subset of the cluster nodes become control nodes. In subclusters with fewer than 16 nodes, all nodes are control nodes. In many cases, you can set the number of control nodes to the square root of the total number of nodes in the entire Enterprise Mode cluster, or in Eon Mode subclusters with more than 16 nodes. However, this formula for calculating the number of control is not guaranteed to always meet your requirements.

When choosing the number of control nodes in a database cluster, you must balance two competing considerations:

If a control node fails or is shut down, all nodes that depend on it are cut off from the database. They are also down until the control node rejoins the database. You can reduce the impact of a control node failure by increasing the number of control nodes in your cluster.
The more control nodes in your cluster, the greater the load on the spread service. In cloud environments, increased complexity of the network environment broadcast can contribute to high latency. This latency can cause messages sent over the spread service to take longer to reach all of the nodes in the cluster.

In a cloud environment, experience has shown that 16 control nodes balances the needs of reliability and performance. In an Eon Mode database, you must have at least one control node per subcluster. Therefore, if you have more than 16 subclusters, you must have more than 16 control nodes.

In an Eon Mode database, whether on-premises or in the cloud, consider adding more control nodes to your primary subclusters than to secondary subclusters. Only nodes in primary subclusters are responsible for maintaining K-safety in an Eon Mode database. Therefore, a control node failure in a primary subcluster can have greater impact on your database than a control node failure in a secondary subcluster.

In an on-premises Enterprise Mode database, consider the physical layout of the hosts running your database when choosing the number of control nodes. If your hosts are spread across multiple server racks, you want to have enough control nodes to distribute them across the racks. Distributing the control nodes helps ensure reliability in the case of a failure that involves the entire rack (such as a power supply or network switch failure). You can configure your database so no node depends on a control node that is in a separate rack. Limiting dependency to within a rack prevents a failure that affects an entire rack from causing additional node loss outside the rack due to control node loss.

Selecting the number of control nodes based on the physical layout also lets you reduce network traffic across switches. By having dependent nodes on the same racks as their control nodes, the communications between them remain in the rack, rather that traversing a network switch.

You might need to increase the number of control nodes to evenly distribute them across your racks. For example, on-premises Enterprise Mode database has 64 total nodes, spread across three racks. The square root of the number of nodes yields 8 control nodes for this cluster. However, you cannot evenly distribute eight control nodes among the three racks. Instead, you can have 9 control nodes and evenly distribute three control nodes per rack.

Influencing control node placement

After you determine the number of nodes for your cluster, you need to determine how to distribute them among the cluster nodes. Vertica chooses which nodes become control nodes. You can influence how Vertica chooses the control nodes and which nodes become their dependents. The exact process you use depends on your database's mode:

Enterprise Mode on-premises database: Define fault groups to influence control node placement. Dependent nodes are always in the same fault group as their control node. You usually define fault groups that reflect the physical layout of the hosts running your database. For example, you usually define one or more fault groups for the nodes in a single rack of servers. When the fault groups reflect your physical layout, Vertica places control nodes and dependents in a way that can limit the impact of rack failures. See Fault groups for more information.
Eon Mode database: Use subclusters to control the placement of control nodes. Each subcluster must have at least one control node. Dependent nodes are always in the same subcluster as their control nodes. You can set the number of control nodes for each subcluster. Doing so lets you assign more control nodes to primary subclusters, where it's important to minimize the impact of a control node failure.

How Vertica chooses a default number of control nodes

Vertica can automatically choose the number of control nodes in the entire cluster (when in Enterprise Mode) or for a subcluster (when in Eon Mode). It sets a default value in these circumstances:

When you pass the default keyword to the --large-cluster option of the install_vertica script (see Enable Large Cluster When Installing Vertica).
Vertica automatically enables large cluster when your database cluster grows to 120 or more nodes.
Vertica automatically enables large cluster for an Eon Mode subcluster if you create it with more than 16 nodes. Note that Vertica does not enable large cluster on a subcluster you expand past the 16 node limit. It only enables large clusters that start out larger than 16 nodes.

The number of control nodes Vertica chooses depends on what triggered Vertica to set the value.

If you pass the --large-cluster default option to the install_vertica script, Vertica sets the number of control nodes to the square root of the number of nodes in the initial cluster.

If your database cluster reaches 120 nodes, Vertica enables large cluster by making any newly-added nodes into dependents. The default value for the limit on the number of control nodes is 120. When you reach this limit, any newly-added nodes are added as dependents. For example, suppose you have a 115 node Enterprise Mode database cluster where you have not manually enabled large cluster. If you add 10 nodes to this cluster, Vertica adds 5 of the nodes as control nodes (bringing you up to the 120-node limit) and the other 5 nodes as dependents.

Important

You should manually enable large cluster before your database reaches 120 nodes.

In an Eon Mode database, each subcluster has its own setting for the number of control nodes. Vertica only automatically sets the number of control nodes when you create a subcluster with more than 16 nodes initially. When this occurs, Vertica sets the number of control nodes for the subcluster to the square root of the number of nodes in the subcluster.

For example, suppose you add a new subcluster with 25 nodes in it. This subcluster starts with more than the 16 node limit, so Vertica sets the number of control nodes for subcluster to 5 (which is the square root of 25). Five of the nodes are added as control nodes, and the remaining 20 are added as dependents of those five nodes.

Even though each subcluster has its own setting for the number of control nodes, an Eon Mode database cluster still has the 120 node limit on the total number of control nodes that it can have.

2 - Enabling large cluster

Vertica enables the large cluster feature automatically when:.

Vertica enables the large cluster feature automatically when:

The total number of nodes in the database cluster exceeds 120.
You create an Eon Mode subcluster with more than 16 nodes.

In most cases, you should consider manually enabling large cluster before your cluster size reaches either of these thresholds. See Planning a large cluster for guidance on when to enable large cluster.

You can enable large cluster on a new Vertica database, or on an existing database.

Enable large cluster when installing Vertica

You can enable large cluster when installing Vertica onto a new database cluster. This option is useful if you know from the beginning that your database will benefit from large cluster.

The install_vertica script's --large-cluster argument enables large cluster during installation. It takes a single integer value between 1 and 120 that specifies the number of control nodes to create in the new database cluster. Alternatively, this option can take the literal argument default. In this case, Vertica enables large cluster mode and sets the number of control nodes to the square root of the number nodes you provide in the --hosts argument. For example, if --hosts specifies 25 hosts and --large-cluster is set to default, the install script creates a database cluster with 5 control nodes.

The --large-cluster argument has a slightly different effect depending on the database mode you choose when creating your database:

Enterprise Mode: --large-cluster sets the total number of control nodes for the entire database cluster.
Eon Mode : --large-cluster sets the number of control nodes in the initial default subcluster. This setting has no effect on subclusters that you create later.

Note

You cannot use --large-cluster to set the number of control nodes in your initial database to be higher than the number of you pass in the --hosts argument. The installer sets the number of control nodes to whichever is the lower value: the value you pass to the --large-cluster option or the number of hosts in the --hosts option.

You can set the number of control nodes to be higher than the number of nodes currently in an existing database, with the meta-function SET_CONTROL_SET_SIZE function. You choose to set a higher number to preallocate control nodes when planning for future expansion. For details, see Changing the number of control nodes and realigning.

After the installation process completes, use the Administration tools or the [%=Vertica.MC%] to create a database. See Create an empty database for details.

If your database is on-premises and running in Enterprise Mode, you usually want to define fault groups that reflect the physical layout of your hosts. They let you define which hosts are in the same server racks, and are dependent on the same infrastructure (such power supplies and network switches). With this knowledge, Vertica can realign the control nodes to make your database better able to cope with hardware failures. See Fault groups for more information.

After creating a database, any nodes that you add are, by default, dependent nodes. You can change the number of control nodes in the database with the meta-function SET_CONTROL_SET_SIZE.

Enable large cluster in an existing database

You can manually enable large cluster in an existing database. You usually choose to enable large cluster manually before your database reaches the point where Vertica automatically enables it. See When To Enable Large Cluster for an explanation of when you should consider enabling large cluster.

Use the meta-function SET_CONTROL_SET_SIZE to enable large cluster and set the number of control nodes. You pass this function an integer value that sets the number of control nodes in the entire Enterprise Mode cluster, or in an Eon Mode subcluster.

3 - Changing the number of control nodes and realigning

You can change the number of control nodes in the entire database cluster in Enterprise Mode, or the number of control nodes in a subcluster in Eon Mode.

You can change the number of control nodes in the entire database cluster in Enterprise Mode, or the number of control nodes in a subcluster in Eon Mode. You may choose to change the number of control nodes in a cluster or subcluster to reduce the impact of control node loss on your database. See Planning a large cluster to learn more about when you should change the number of control nodes in your database.

You change the number of control nodes by calling the meta-function SET_CONTROL_SET_SIZE. If large cluster was not enabled before the call to SET_CONTROL_SET_SIZE, the function enables large cluster in your database. See Enabling large cluster for more information.

When you call SET_CONTROL_SET_SIZE in an Enterprise Mode database, it sets the number of control nodes in the entire database cluster. In an Eon Mode database, you must supply SET_CONTROL_SET_SIZE with the name of a subcluster in addition to the number of control nodes. The function sets the number of control nodes for that subcluster. Other subclusters in the database cluster are unaffected by this call.

Before changing the number of control nodes in an Eon Mode subcluster, verify that the subcluster is running. Changing the number of control nodes of a subcluster while it is down can cause configuration issues that prevent nodes in the subcluster from starting.

Note

You can set the number of control nodes to a value that is higher than the number of nodes currently in the cluster or subcluster. When the number of control nodes is higher than the current node count, newly-added nodes become control nodes until the number of nodes in the cluster or subcluster reaches the number control nodes you set.

You may choose to set the number of control nodes higher than the current node count to plan for future expansion. For example, suppose you have a 4-node subcluster in an Eon Mode database that you plan to expand in the future. You determine that you want limit the number of control nodes in this cluster to 8, even if you expand it beyond that size. In this case, you can choose to set the control node size for the subcluster to 8 now. As you add new nodes to the subcluster, they become control nodes until the size of the subcluster reaches 8. After that point, Vertica assigns newly-added nodes as a dependent of an existing control node in the subcluster.

Realigning control nodes and reloading spread

After you call the SET_CONTROL_SET_SIZE function, there are several additional steps you must take before the new setting takes effect.

Important

Follow these steps if you have upgraded your large-cluster enabled Eon Mode database from a version prior to 10.0.1. Earlier versions of Vertica did not restrict control node assignments to be within the same subcluster. When you realign the control nodes after an upgrade, Vertica configures each subcluster to have at least one control node, and assigns nodes to a control node in their own subcluster.

Call the REALIGN_CONTROL_NODES function. This function tells Vertica to re-evaluate the assignment of control nodes and their dependents in your cluster or subcluster. When calling this function in an Eon Mode database, you must supply the name of the subcluster where you changed the control node settings.
Call the RELOAD_SPREAD function. This function updates the control node assignment information in configuration files and triggers Spread to reload.
Restart the nodes affected by the change in control nodes. In an Enterprise Mode database, you must restart the entire database to ensure all nodes have updated configuration information. In Eon Mode, restart the subcluster or subclusters affected by your changes. You must restart the entire Eon Mode database if you changed a critical subcluster (such as the only primary subcluster).

Note
You do not need to restart nodes if the earlier steps didn't change control node assignments. This case usually only happens when you set the number of control nodes in an Eon Mode subcluster to higher than the subcluster's current node count, and all nodes in the subcluster are already control nodes. In this case, no control nodes are added or removed, so node dependencies do not change. Because the dependencies did not change, the nodes do not need to reload the Spread configuration.
In an Enterprise Mode database, call START_REBALANCE_CLUSTER to rebalance the cluster. This process improves your database's fault tolerance by shifting buddy projection assignments to limit the impact of a control node failure. You do not need to take this step in an Eon Mode database.

Enterprise Mode example

The following example makes 4 out of the 8 nodes in an Enterprise Mode database into control nodes. It queries the LARGE_CLUSTER_CONFIGURATION_STATUS system table which shows control node assignments for each node in the database. At the start, all nodes are their own control nodes. See Monitoring large clusters for more information the system tables associated with large cluster.

=> SELECT * FROM V_CATALOG.LARGE_CLUSTER_CONFIGURATION_STATUS;
    node_name     | spread_host_name | control_node_name
------------------+------------------+-------------------
 v_vmart_node0001 | v_vmart_node0001 | v_vmart_node0001
 v_vmart_node0002 | v_vmart_node0002 | v_vmart_node0002
 v_vmart_node0003 | v_vmart_node0003 | v_vmart_node0003
 v_vmart_node0004 | v_vmart_node0004 | v_vmart_node0004
 v_vmart_node0005 | v_vmart_node0005 | v_vmart_node0005
 v_vmart_node0006 | v_vmart_node0006 | v_vmart_node0006
 v_vmart_node0007 | v_vmart_node0007 | v_vmart_node0007
 v_vmart_node0008 | v_vmart_node0008 | v_vmart_node0008
(8 rows)

=> SELECT SET_CONTROL_SET_SIZE(4);
 SET_CONTROL_SET_SIZE
----------------------
 Control size set
(1 row)

=> SELECT REALIGN_CONTROL_NODES();
                       REALIGN_CONTROL_NODES
---------------------------------------------------------------
 The new control node assignments can be viewed in vs_nodes.
 Check vs_cluster_layout to see the proposed new layout. Reboot
 all the nodes and call rebalance_cluster now

(1 row)

=> SELECT RELOAD_SPREAD(true);
 RELOAD_SPREAD
---------------
 Reloaded
(1 row)

=> SELECT SHUTDOWN();

After restarting the database, the final step is to rebalance the cluster and query the LARGE_CLUSTER_CONFIGURATION_STATUS table to see the current control node assignments:

=> SELECT START_REBALANCE_CLUSTER();
 START_REBALANCE_CLUSTER
-------------------------
 REBALANCING
(1 row)

=> SELECT * FROM V_CATALOG.LARGE_CLUSTER_CONFIGURATION_STATUS;
    node_name     | spread_host_name | control_node_name
------------------+------------------+-------------------
 v_vmart_node0001 | v_vmart_node0001 | v_vmart_node0001
 v_vmart_node0002 | v_vmart_node0002 | v_vmart_node0002
 v_vmart_node0003 | v_vmart_node0003 | v_vmart_node0003
 v_vmart_node0004 | v_vmart_node0004 | v_vmart_node0004
 v_vmart_node0005 | v_vmart_node0001 | v_vmart_node0001
 v_vmart_node0006 | v_vmart_node0002 | v_vmart_node0002
 v_vmart_node0007 | v_vmart_node0003 | v_vmart_node0003
 v_vmart_node0008 | v_vmart_node0004 | v_vmart_node0004
(8 rows)

Eon Mode example

The following example configures 4 control nodes in an 8-node secondary subcluster named analytics. The primary subcluster is not changed. The primary differences between this example and the previous Enterprise Mode example is the need to specify a subcluster when calling SET_CONTROL_SET_SIZE, not having to restart the entire database, and not having to call START_REBALANCE_CLUSTER.

=>  SELECT * FROM V_CATALOG.LARGE_CLUSTER_CONFIGURATION_STATUS;
      node_name       |   spread_host_name   |  control_node_name
----------------------+----------------------+----------------------
 v_verticadb_node0001 | v_verticadb_node0001 | v_verticadb_node0001
 v_verticadb_node0002 | v_verticadb_node0002 | v_verticadb_node0002
 v_verticadb_node0003 | v_verticadb_node0003 | v_verticadb_node0003
 v_verticadb_node0004 | v_verticadb_node0004 | v_verticadb_node0004
 v_verticadb_node0005 | v_verticadb_node0005 | v_verticadb_node0005
 v_verticadb_node0006 | v_verticadb_node0006 | v_verticadb_node0006
 v_verticadb_node0007 | v_verticadb_node0007 | v_verticadb_node0007
 v_verticadb_node0008 | v_verticadb_node0008 | v_verticadb_node0008
 v_verticadb_node0009 | v_verticadb_node0009 | v_verticadb_node0009
 v_verticadb_node0010 | v_verticadb_node0010 | v_verticadb_node0010
 v_verticadb_node0011 | v_verticadb_node0011 | v_verticadb_node0011
(11 rows)


=> SELECT subcluster_name,node_name,is_primary,control_set_size FROM
   V_CATALOG.SUBCLUSTERS;
  subcluster_name   |      node_name       | is_primary | control_set_size
--------------------+----------------------+------------+------------------
 default_subcluster | v_verticadb_node0001 | t          |               -1
 default_subcluster | v_verticadb_node0002 | t          |               -1
 default_subcluster | v_verticadb_node0003 | t          |               -1
 analytics          | v_verticadb_node0004 | f          |               -1
 analytics          | v_verticadb_node0005 | f          |               -1
 analytics          | v_verticadb_node0006 | f          |               -1
 analytics          | v_verticadb_node0007 | f          |               -1
 analytics          | v_verticadb_node0008 | f          |               -1
 analytics          | v_verticadb_node0009 | f          |               -1
 analytics          | v_verticadb_node0010 | f          |               -1
 analytics          | v_verticadb_node0011 | f          |               -1
(11 rows)

=> SELECT SET_CONTROL_SET_SIZE('analytics',4);
 SET_CONTROL_SET_SIZE
----------------------
 Control size set
(1 row)

=> SELECT REALIGN_CONTROL_NODES('analytics');
                      REALIGN_CONTROL_NODES
-----------------------------------------------------------------------------
 The new control node assignments can be viewed in vs_nodes. Call
reload_spread(true). If the subcluster is critical, restart the database.
Otherwise, restart the subcluster

(1 row)

=> SELECT RELOAD_SPREAD(true);
 RELOAD_SPREAD
---------------
 Reloaded
(1 row)

At this point, the analytics subcluster needs to restart. You have several options to restart it. See Starting and stopping subclusters for details. This example uses the admintools command line to stop and start the subcluster.

$ admintools -t stop_subcluster -d verticadb -c analytics -p password
*** Forcing subcluster shutdown ***
Verifying subcluster 'analytics'
    Node 'v_verticadb_node0004' will shutdown
    Node 'v_verticadb_node0005' will shutdown
    Node 'v_verticadb_node0006' will shutdown
    Node 'v_verticadb_node0007' will shutdown
    Node 'v_verticadb_node0008' will shutdown
    Node 'v_verticadb_node0009' will shutdown
    Node 'v_verticadb_node0010' will shutdown
    Node 'v_verticadb_node0011' will shutdown
Shutdown subcluster command successfully sent to the database

$ admintools -t restart_subcluster -d verticadb -c analytics -p password
*** Restarting subcluster for database verticadb ***
    Restarting host [10.11.12.19] with catalog [v_verticadb_node0004_catalog]
    Restarting host [10.11.12.196] with catalog [v_verticadb_node0005_catalog]
    Restarting host [10.11.12.51] with catalog [v_verticadb_node0006_catalog]
    Restarting host [10.11.12.236] with catalog [v_verticadb_node0007_catalog]
    Restarting host [10.11.12.103] with catalog [v_verticadb_node0008_catalog]
    Restarting host [10.11.12.185] with catalog [v_verticadb_node0009_catalog]
    Restarting host [10.11.12.80] with catalog [v_verticadb_node0010_catalog]
    Restarting host [10.11.12.47] with catalog [v_verticadb_node0011_catalog]
    Issuing multi-node restart
    Starting nodes:
        v_verticadb_node0004 (10.11.12.19) [CONTROL]
        v_verticadb_node0005 (10.11.12.196) [CONTROL]
        v_verticadb_node0006 (10.11.12.51) [CONTROL]
        v_verticadb_node0007 (10.11.12.236) [CONTROL]
        v_verticadb_node0008 (10.11.12.103)
        v_verticadb_node0009 (10.11.12.185)
        v_verticadb_node0010 (10.11.12.80)
        v_verticadb_node0011 (10.11.12.47)
    Starting Vertica on all nodes. Please wait, databases with a large catalog may take a while to initialize.
    Node Status: v_verticadb_node0004: (DOWN) v_verticadb_node0005: (DOWN) v_verticadb_node0006: (DOWN)
                     v_verticadb_node0007: (DOWN) v_verticadb_node0008: (DOWN) v_verticadb_node0009: (DOWN)
                     v_verticadb_node0010: (DOWN) v_verticadb_node0011: (DOWN)
    Node Status: v_verticadb_node0004: (DOWN) v_verticadb_node0005: (DOWN) v_verticadb_node0006: (DOWN)
                     v_verticadb_node0007: (DOWN) v_verticadb_node0008: (DOWN) v_verticadb_node0009: (DOWN)
                     v_verticadb_node0010: (DOWN) v_verticadb_node0011: (DOWN)
    Node Status: v_verticadb_node0004: (INITIALIZING) v_verticadb_node0005: (INITIALIZING) v_verticadb_node0006:
                     (INITIALIZING) v_verticadb_node0007: (INITIALIZING) v_verticadb_node0008: (INITIALIZING)
                     v_verticadb_node0009: (INITIALIZING) v_verticadb_node0010: (INITIALIZING) v_verticadb_node0011: (INITIALIZING)
    Node Status: v_verticadb_node0004: (UP) v_verticadb_node0005: (UP) v_verticadb_node0006: (UP)
                     v_verticadb_node0007: (UP) v_verticadb_node0008: (UP) v_verticadb_node0009: (UP)
                     v_verticadb_node0010: (UP) v_verticadb_node0011: (UP)
Syncing catalog on verticadb with 2000 attempts.

Once the subcluster restarts, you can query the system tables to see the control node configuration:

=>  SELECT * FROM V_CATALOG.LARGE_CLUSTER_CONFIGURATION_STATUS;
      node_name       |   spread_host_name   |  control_node_name
----------------------+----------------------+----------------------
 v_verticadb_node0001 | v_verticadb_node0001 | v_verticadb_node0001
 v_verticadb_node0002 | v_verticadb_node0002 | v_verticadb_node0002
 v_verticadb_node0003 | v_verticadb_node0003 | v_verticadb_node0003
 v_verticadb_node0004 | v_verticadb_node0004 | v_verticadb_node0004
 v_verticadb_node0005 | v_verticadb_node0005 | v_verticadb_node0005
 v_verticadb_node0006 | v_verticadb_node0006 | v_verticadb_node0006
 v_verticadb_node0007 | v_verticadb_node0007 | v_verticadb_node0007
 v_verticadb_node0008 | v_verticadb_node0004 | v_verticadb_node0004
 v_verticadb_node0009 | v_verticadb_node0005 | v_verticadb_node0005
 v_verticadb_node0010 | v_verticadb_node0006 | v_verticadb_node0006
 v_verticadb_node0011 | v_verticadb_node0007 | v_verticadb_node0007
(11 rows)

=> SELECT subcluster_name,node_name,is_primary,control_set_size FROM subclusters;
  subcluster_name   |      node_name       | is_primary | control_set_size
--------------------+----------------------+------------+------------------
 default_subcluster | v_verticadb_node0001 | t          |               -1
 default_subcluster | v_verticadb_node0002 | t          |               -1
 default_subcluster | v_verticadb_node0003 | t          |               -1
 analytics          | v_verticadb_node0004 | f          |                4
 analytics          | v_verticadb_node0005 | f          |                4
 analytics          | v_verticadb_node0006 | f          |                4
 analytics          | v_verticadb_node0007 | f          |                4
 analytics          | v_verticadb_node0008 | f          |                4
 analytics          | v_verticadb_node0009 | f          |                4
 analytics          | v_verticadb_node0010 | f          |                4
 analytics          | v_verticadb_node0011 | f          |                4
(11 rows)

Disabling large cluster

To disable large cluster, call SET_CONTROL_SET_SIZE with a value of -1. This value is the default for non-large cluster databases. It tells Vertica to make all nodes into control nodes.

In an Eon Mode database, to fully disable large cluster you must to set the number of control nodes to -1 in every subcluster that has a set number of control nodes. You can see which subclusters have a set number of control nodes by querying the CONTROL_SET_SIZE column of the V_CATALOG.SUBCLUSTERS system table.

The following example resets the number of control nodes set in the previous Eon Mode example.

=> SELECT subcluster_name,node_name,is_primary,control_set_size FROM subclusters;
  subcluster_name   |      node_name       | is_primary | control_set_size
--------------------+----------------------+------------+------------------
 default_subcluster | v_verticadb_node0001 | t          |               -1
 default_subcluster | v_verticadb_node0002 | t          |               -1
 default_subcluster | v_verticadb_node0003 | t          |               -1
 analytics          | v_verticadb_node0004 | f          |                4
 analytics          | v_verticadb_node0005 | f          |                4
 analytics          | v_verticadb_node0006 | f          |                4
 analytics          | v_verticadb_node0007 | f          |                4
 analytics          | v_verticadb_node0008 | f          |                4
 analytics          | v_verticadb_node0009 | f          |                4
 analytics          | v_verticadb_node0010 | f          |                4
 analytics          | v_verticadb_node0011 | f          |                4
(11 rows)



=> SELECT SET_CONTROL_SET_SIZE('analytics',-1);
 SET_CONTROL_SET_SIZE
----------------------
 Control size set
(1 row)

=> SELECT REALIGN_CONTROL_NODES('analytics');
                       REALIGN_CONTROL_NODES
---------------------------------------------------------------------------------------
 The new control node assignments can be viewed in vs_nodes. Call reload_spread(true).
 If the subcluster is critical, restart the database. Otherwise, restart the subcluster

(1 row)

=> SELECT RELOAD_SPREAD(true);
 RELOAD_SPREAD
---------------
 Reloaded
(1 row)

-- After restarting the analytics subcluster...

=> SELECT * FROM V_CATALOG.LARGE_CLUSTER_CONFIGURATION_STATUS;
      node_name       |   spread_host_name   |  control_node_name
----------------------+----------------------+----------------------
 v_verticadb_node0001 | v_verticadb_node0001 | v_verticadb_node0001
 v_verticadb_node0002 | v_verticadb_node0002 | v_verticadb_node0002
 v_verticadb_node0003 | v_verticadb_node0003 | v_verticadb_node0003
 v_verticadb_node0004 | v_verticadb_node0004 | v_verticadb_node0004
 v_verticadb_node0005 | v_verticadb_node0005 | v_verticadb_node0005
 v_verticadb_node0006 | v_verticadb_node0006 | v_verticadb_node0006
 v_verticadb_node0007 | v_verticadb_node0007 | v_verticadb_node0007
 v_verticadb_node0008 | v_verticadb_node0008 | v_verticadb_node0008
 v_verticadb_node0009 | v_verticadb_node0009 | v_verticadb_node0009
 v_verticadb_node0010 | v_verticadb_node0010 | v_verticadb_node0010
 v_verticadb_node0011 | v_verticadb_node0011 | v_verticadb_node0011
(11 rows)

=> SELECT subcluster_name,node_name,is_primary,control_set_size FROM subclusters;
  subcluster_name   |      node_name       | is_primary | control_set_size
--------------------+----------------------+------------+------------------
 default_subcluster | v_verticadb_node0001 | t          |               -1
 default_subcluster | v_verticadb_node0002 | t          |               -1
 default_subcluster | v_verticadb_node0003 | t          |               -1
 analytics          | v_verticadb_node0004 | f          |               -1
 analytics          | v_verticadb_node0005 | f          |               -1
 analytics          | v_verticadb_node0006 | f          |               -1
 analytics          | v_verticadb_node0007 | f          |               -1
 analytics          | v_verticadb_node0008 | f          |               -1
 analytics          | v_verticadb_node0009 | f          |               -1
 analytics          | v_verticadb_node0010 | f          |               -1
 analytics          | v_verticadb_node0011 | f          |               -1
(11 rows)

4 - Monitoring large clusters

Monitor large cluster traits by querying the following system tables:.

Monitor large cluster traits by querying the following system tables:

V_CATALOG.LARGE_CLUSTER_CONFIGURATION_STATUS—Shows the current spread hosts and the control designations in the catalog so you can see if they match.
V_MONITOR.CRITICAL_HOSTS—Lists the hosts whose failure would cause the database to become unsafe and force a shutdown.

Tip
The CRITICAL_HOSTS view is especially useful for large cluster arrangements. For non-large clusters, query the CRITICAL_NODES table.
In an Eon Mode database, the CONTROL_SET_SIZE column of the V_CATALOG.SUBCLUSTERS system table shows the number of control nodes set for each subcluster.

You might also want to query the following system tables:

V_CATALOG.FAULT_GROUPS—Shows fault groups and their hierarchy in the cluster.
V_CATALOG.CLUSTER_LAYOUT—Shows the relative position of the actual arrangement of the nodes participating in the database cluster and the fault groups that affect them.

Large cluster

Note

Note

Large cluster and database growth

Important

When to enable large cluster

Note

1 - Planning a large cluster

Determining the number of control nodes

Influencing control node placement

How Vertica chooses a default number of control nodes

Important

2 - Enabling large cluster

Enable large cluster when installing Vertica

Note

Enable large cluster in an existing database

3 - Changing the number of control nodes and realigning

Note

Realigning control nodes and reloading spread

Important

Note

Enterprise Mode example

Eon Mode example

Disabling large cluster

4 - Monitoring large clusters

Tip