Adding and removing nodes from subclusters

You will often want to add new nodes to and remove existing nodes from a subcluster.

You will often want to add new nodes to and remove existing nodes from a subcluster. This ability lets you scale your database to respond to changing analytic needs. For more information on how adding nodes to a subcluster affects your database's performance, see Scaling your Eon Mode database.

Adding new nodes to a subcluster

You can add nodes to a subcluster to meet additional workloads. The nodes that you add to the subcluster must already be part of your cluster. These can be:

Nodes that you removed from other subclusters.
Nodes you added following the steps in Add nodes to a running cluster on the cloud.
Nodes you created using your cloud provider's interface, such as the AWS EC2 "Launch more like this" feature.

To add new nodes to a subcluster, use the db_add_node command of admintools:

$ adminTools -t db_add_node -h
Usage: db_add_node [options]

Options:
  -h, --help            show this help message and exit
  -d DB, --database=DB  Name of the database
  -s HOSTS, --hosts=HOSTS
                        Comma separated list of hosts to add to database
  -p DBPASSWORD, --password=DBPASSWORD
                        Database password in single quotes
  -a AHOSTS, --add=AHOSTS
                        Comma separated list of hosts to add to database
  -c SCNAME, --subcluster=SCNAME
                        Name of subcluster for the new node
  --timeout=NONINTERACTIVE_TIMEOUT
                        set a timeout (in seconds) to wait for actions to
                        complete ('never') will wait forever (implicitly sets
                        -i)
  -i, --noprompts       do not stop and wait for user input(default false).
                        Setting this implies a timeout of 20 min.
  --compat21            (deprecated) Use Vertica 2.1 method using node names
                        instead of hostnames

If you do not use the -c option, Vertica adds new nodes to the default subcluster (set to default_subcluster in new databases). This example adds a new node without specifying the subcluster:

$ adminTools -t db_add_node -p 'password' -d verticadb -s 10.11.12.117
Subcluster not specified, validating default subcluster
Nodes will be added to subcluster 'default_subcluster'
                Verifying database connectivity...10.11.12.10
Eon database detected, creating new depot locations for newly added nodes
Creating depots for each node
        Generating new configuration information and reloading spread
        Replicating configuration to all nodes
        Starting nodes
        Starting nodes:
                v_verticadb_node0004 (10.11.12.117)
        Starting Vertica on all nodes. Please wait, databases with a
            large catalog may take a while to initialize.
        Checking database state
        Node Status: v_verticadb_node0004: (DOWN)
        Node Status: v_verticadb_node0004: (DOWN)
        Node Status: v_verticadb_node0004: (DOWN)
        Node Status: v_verticadb_node0004: (DOWN)
        Node Status: v_verticadb_node0004: (UP)
Communal storage detected: syncing catalog

        Multi-node DB add completed
Nodes added to verticadb successfully.
You will need to redesign your schema to take advantage of the new nodes.

To add nodes to a specific existing subcluster, use the db_add_node tool's -c option:

$ adminTools -t db_add_node -s 10.11.12.178 -d verticadb -p 'password' \
             -c analytics_subcluster
Subcluster 'analytics_subcluster' specified, validating
Nodes will be added to subcluster 'analytics_subcluster'
                Verifying database connectivity...10.11.12.10
Eon database detected, creating new depot locations for newly added nodes
Creating depots for each node
        Generating new configuration information and reloading spread
        Replicating configuration to all nodes
        Starting nodes
        Starting nodes:
                v_verticadb_node0007 (10.11.12.178)
        Starting Vertica on all nodes. Please wait, databases with a
              large catalog may take a while to initialize.
        Checking database state
        Node Status: v_verticadb_node0007: (DOWN)
        Node Status: v_verticadb_node0007: (DOWN)
        Node Status: v_verticadb_node0007: (DOWN)
        Node Status: v_verticadb_node0007: (DOWN)
        Node Status: v_verticadb_node0007: (UP)
Communal storage detected: syncing catalog

        Multi-node DB add completed
Nodes added to verticadb successfully.
You will need to redesign your schema to take advantage of the new nodes.

Updating shard subscriptions after adding nodes

After you add nodes to a subcluster they do not yet subscribe to shards. You can view the subscription status of all nodes in your database using the following query that joins the V_CATALOG.NODES and V_CATALOG.NODE_SUBSCRIPTIONS system tables:

=> SELECT subcluster_name, n.node_name, shard_name, subscription_state FROM
   v_catalog.nodes n LEFT JOIN v_catalog.node_subscriptions ns ON (n.node_name
   = ns.node_name) ORDER BY 1,2,3;

   subcluster_name    |      node_name       | shard_name  | subscription_state
----------------------+----------------------+-------------+--------------------
 analytics_subcluster | v_verticadb_node0004 |             |
 analytics_subcluster | v_verticadb_node0005 |             |
 analytics_subcluster | v_verticadb_node0006 |             |
 default_subcluster   | v_verticadb_node0001 | replica     | ACTIVE
 default_subcluster   | v_verticadb_node0001 | segment0001 | ACTIVE
 default_subcluster   | v_verticadb_node0001 | segment0003 | ACTIVE
 default_subcluster   | v_verticadb_node0002 | replica     | ACTIVE
 default_subcluster   | v_verticadb_node0002 | segment0001 | ACTIVE
 default_subcluster   | v_verticadb_node0002 | segment0002 | ACTIVE
 default_subcluster   | v_verticadb_node0003 | replica     | ACTIVE
 default_subcluster   | v_verticadb_node0003 | segment0002 | ACTIVE
 default_subcluster   | v_verticadb_node0003 | segment0003 | ACTIVE
(12 rows)

You can see that none of the nodes in the newly-added analytics_subcluster have subscriptions.

To update the subscriptions for new nodes, call the REBALANCE_SHARDS function. You can limit the rebalance to the subcluster containing the new nodes by passing its name to the REBALANCE_SHARDS function call. The following example runs rebalance shards to update the analytics_subcluster's subscriptions:


=> SELECT REBALANCE_SHARDS('analytics_subcluster');
 REBALANCE_SHARDS
-------------------
 REBALANCED SHARDS
(1 row)

=> SELECT subcluster_name, n.node_name, shard_name, subscription_state FROM
   v_catalog.nodes n LEFT JOIN v_catalog.node_subscriptions ns ON (n.node_name
   = ns.node_name) ORDER BY 1,2,3;

   subcluster_name    |      node_name       | shard_name  | subscription_state
----------------------+----------------------+-------------+--------------------
 analytics_subcluster | v_verticadb_node0004 | replica     | ACTIVE
 analytics_subcluster | v_verticadb_node0004 | segment0001 | ACTIVE
 analytics_subcluster | v_verticadb_node0004 | segment0003 | ACTIVE
 analytics_subcluster | v_verticadb_node0005 | replica     | ACTIVE
 analytics_subcluster | v_verticadb_node0005 | segment0001 | ACTIVE
 analytics_subcluster | v_verticadb_node0005 | segment0002 | ACTIVE
 analytics_subcluster | v_verticadb_node0006 | replica     | ACTIVE
 analytics_subcluster | v_verticadb_node0006 | segment0002 | ACTIVE
 analytics_subcluster | v_verticadb_node0006 | segment0003 | ACTIVE
 default_subcluster   | v_verticadb_node0001 | replica     | ACTIVE
 default_subcluster   | v_verticadb_node0001 | segment0001 | ACTIVE
 default_subcluster   | v_verticadb_node0001 | segment0003 | ACTIVE
 default_subcluster   | v_verticadb_node0002 | replica     | ACTIVE
 default_subcluster   | v_verticadb_node0002 | segment0001 | ACTIVE
 default_subcluster   | v_verticadb_node0002 | segment0002 | ACTIVE
 default_subcluster   | v_verticadb_node0003 | replica     | ACTIVE
 default_subcluster   | v_verticadb_node0003 | segment0002 | ACTIVE
 default_subcluster   | v_verticadb_node0003 | segment0003 | ACTIVE
(18 rows)

Removing nodes

Your database must meet these requirements before you can remove a node from a subcluster:

To remove a node from a primary subcluster, all of the primary nodes in the subcluster must be up, and the database must be able to maintain quorum after the primary node is removed (see Data integrity and high availability in an Eon Mode database). These requirements are necessary because Vertica calls REBALANCE_SHARDS to redistribute shard subscriptions among the remaining nodes in the subcluster. If you attempt to remove a primary node when the database does not meet the requirements, the rebalance shards process waits until either the down nodes recover or a timeout elapses. While it waits, you periodically see a message "Rebalance shards polling iteration number [nn]" indicating that the rebalance process is waiting to complete.

You can remove nodes from a secondary subcluster even when nodes in the subcluster are down.
If your database has the large cluster feature enabled, you cannot remove a node if it is the subcluster's last control node and there are nodes that depend on it. See Large cluster for more information.

If there are other control nodes in the subcluster, you can drop a control node. Vertica reassigns the nodes that depend on the node being dropped to other control nodes.

To remove one or more nodes, use admintools's db_remove_node tool:

$ adminTools -t db_remove_node -p 'password' -d verticadb -s 10.11.12.117
connecting to 10.11.12.10
Waiting for rebalance shards. We will wait for at most 36000 seconds.
Rebalance shards polling iteration number [0], started at [14:56:41], time out at [00:56:41]
Attempting to drop node v_verticadb_node0004 ( 10.11.12.117 )
        Shutting down node v_verticadb_node0004
        Sending node shutdown command to '['v_verticadb_node0004', '10.11.12.117', '/vertica/data', '/vertica/data']'
        Deleting catalog and data directories
        Update admintools metadata for v_verticadb_node0004
        Eon mode detected. The node v_verticadb_node0004 has been removed from host 10.11.12.117. To remove the
        node metadata completely, please clean up the files corresponding to this node, at the communal
        location: s3://eonbucket/metadata/verticadb/nodes/v_verticadb_node0004
        Reload spread configuration
        Replicating configuration to all nodes
        Checking database state
        Node Status: v_verticadb_node0001: (UP) v_verticadb_node0002: (UP) v_verticadb_node0003: (UP)
Communal storage detected: syncing catalog

When you remove one or more nodes from a subcluster, Vertica automatically rebalances shards in the subcluster. You do not need to manually rebalance shards after removing nodes.

Moving nodes between subclusters

To move a node from one subcluster to another:

Remove the node or nodes from the subcluster it is currently a part of.
Add the node to the subcluster you want to move it to.