Eon Mode

You can operate your Vertica database in Eon Mode instead of in Enterprise Mode. The two modes differ primarily in how they store data:

  • Eon Mode databases use communal storage for their data.

  • Enterprise Mode databases store data locally in the file system of nodes that make up the database.

These different storage methods lead to a number of important differences between the two modes. In Enterprise Mode, each database node stores a portion of the data and performs a portion of the computation. In Eon Mode, computational processes are separated from a communal (shared) storage layer, which enables rapid scaling of computational resources as demand changes.

For more on how these two modes compare, see Architecture.

1 - Creating a database in Eon Mode

Creating an Eon Mode database in a cloud environment

The easiest way to create an Eon Mode database in the cloud is to use the MC. The MC can create your database and provision the nodes to run the database at the same time. See Creating a database using MC for details. For specific instructions for your cloud environment, see:

On AWS, you can also create a database using admintools:

  1. Provision nodes for the new database manually. See Deploy AWS instances for your Vertica database cluster for instructions.

  2. Create your database using admintools. See Create an Eon Mode Database for steps.

Creating an on-premises Eon Mode database

If you have an on-premises install, you create an Eon Mode database using admintools. For more information see the Create Database step in the installation instructions for your communal storage backend:

2 - Configuring your Vertica cluster for Eon Mode

Running Vertica in Eon Mode decouples the cluster size from the data volume and lets you configure for your compute needs independently from your storage needs. There are a number of factors you must consider when deciding what sorts of instances to use and how large your Eon Mode cluster will be.

Before you begin

Vertica Eon Mode works both in the cloud and on-premises. As a Vertica administrator setting up your production cluster running in Eon Mode, you must make decisions about the virtual or physical hardware you will use to meet your needs. This topic provides guidelines and best practices for selecting server types and cluster sizes for a Vertica database running in Eon Mode. It assumes that you have a basic understanding of the Eon Mode architecture and concepts such as communal storage, depot, and shards. If you need to refresh your understanding, see Eon Mode architecture.

Cluster size overview

Because Eon Mode separates data storage from the computing power of your nodes, choosing a cluster size is more complex than for an Enterprise Mode database. Usually, you choose a base number of nodes that will form one or more primary subclusters. These subclusters contain nodes that are always running in your database. You usually use them for workloads such as data loading and executing DDL statements. You rarely alter the size of these subclusters dynamically. As you need additional compute resources to execute queries, you add one or more subclusters (usually secondary subclusters) of nodes to your database.

When choosing your instances and sizing your cluster for Eon Mode, consider the working data size that your database will be dealing with. This is the amount of data that most of your queries operate on. For example, suppose your database stores sales data. If most of the queries running on your database analyze the last month or two of sales to create reports on sales trends, then your working data size is the amount of data you typically load over a few months.

Choosing instances or physical hardware

Depending on the complexity of your workload and expected concurrency, choose physical or virtual hardware that has sufficient CPU and memory. For production clusters Vertica recommends the following minimum configuration for either virtual or physical nodes in an Eon Mode database:

  • 16 cores

  • 128 GB RAM

  • 2 TB of local storage

You must have a minimum of 3 nodes in an Eon Mode database cluster.

For specific instance recommendations for cloud-based Eon Mode databases, see:

Determining local storage requirements

For both virtual and physical hardware, you must decide how much local storage your nodes need. In Eon Mode, the definitive copy of your database's data resides in the communal storage. This storage is provided by either a cloud-based object store such as AWS S3, or by an on-premises object store, such as a Pure Storage FlashBlade appliance.

Even though your database's data is stored in communal storage, your nodes still need some local storage. A node in an Eon Mode database uses local storage for three purposes:

  • Depot storage: To get the fastest response time for frequently executed queries, provision a depot large enough to hold your working data set after data compression. Divide the working data size by the number of nodes you will have in your subcluster to estimate the size of the depot you need for each node in a subcluster. See Choosing the Number of Shards and the Initial Node Count below to get an idea of how many nodes you want in your initial database subcluster. In cases where you expect to dynamically scale your cluster up, estimate the depot size based on the minimum number of nodes you anticipate having in the subcluster.

    Also consider how much data you will load at once when sizing your depot. When loading data, Vertica defaults to writing uncommitted ROS files into the depot before uploading the files to communal storage. If the free space in the depot is not sufficient, Vertica evicts files from the depot to make space for new files.

    Your data load fails if the amount of data you try to load in a single transaction is larger than the total size of all the depots in the subcluster. To load more data than there is space in the subcluster's combined depots, set UseDepotForWrites to 0 (see the example after this list). This configuration parameter tells Vertica to load the data directly into communal storage.

  • Data storage: The data storage location holds files that belong to temporary tables and temporary data from sort operators that spill to disk. When loading data into Vertica, the sort operator may spill to disk. Depending on the size of the load, Vertica may perform the sort in multiple merge phases. The amount of data concurrently loaded into Vertica cannot be larger than the sum of temporary storage location sizes across all nodes divided by 2.

  • Catalog storage. The catalog size depends on the number of database objects per shard and the number of shard subscriptions per node.
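
As noted in the depot bullet above, whether loads go through the depot is controlled by the UseDepotForWrites configuration parameter. The following sketch sets it at the database level and then confirms the value through the CONFIGURATION_PARAMETERS system table:

=> ALTER DATABASE DEFAULT SET PARAMETER UseDepotForWrites = 0;
=> SELECT parameter_name, current_value
   FROM configuration_parameters
   WHERE parameter_name = 'UseDepotForWrites';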

Vertica recommends a minimum local storage capacity of 2 TB per node, out of which 60% is reserved for the depot and the other 40% is shared between the catalog and data location. If you determine that you need a depot larger than 1.2 TB per node (which is 60% of 2 TB), then add more storage than this minimum recommendation. You can calculate the space you need using this equation:

Disk Space Per Node = Working Data Size / # of Nodes in Subcluster * 1.40

For example, suppose you have a compressed working data size of 24 TB, and you want to have an initial primary subcluster of 3 nodes. Using these values in the equation results in 11.2 TB per node:

24 TB / 3 Nodes * 1.40 = 11.2 TB
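
If the working data already exists in a Vertica database, one way to approximate its compressed size is to sum used_bytes from the PROJECTION_STORAGE system table. Treat the result as a rough upper bound, because the total includes buddy projection copies:

=> SELECT SUM(used_bytes) / (1024.0 * 1024 * 1024 * 1024) AS compressed_tb
   FROM projection_storage;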

Choosing the number of shards and the initial node count

Shards are how Vertica divides the responsibility for the data in communal storage among nodes. Each node in a subcluster subscribes to at least one shard in communal storage. During queries, data loads, and other database tasks, each node is responsible for the data in the shards it subscribes to. See Shards and subscriptions for more information.

The relation between shards and nodes means that when picking the number of shards for your database, you must consider the number of nodes you will have in your database, and how you expect to expand your database in the future.

You set the number of shards when you create your database. Once created, you cannot change the number of shards. Therefore, choose a shard count that will allow your database to grow over time.
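
Because the shard count is fixed at creation time, confirm the value before planning subcluster sizes. On an existing Eon Mode database, one way to check is to count the distinct segment shards in the NODE_SUBSCRIPTIONS system table (the replica shard is excluded because it holds unsegmented data):

=> SELECT COUNT(DISTINCT shard_name) AS shard_count
   FROM node_subscriptions
   WHERE shard_name != 'replica';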

The following table shows the recommended shard count and initial node count based on the working data size. The initial node count is the number of nodes you will have in your core primary subcluster. Note that the number of shards is always a multiple of the node count, which ensures that the shards are evenly divided between the nodes. For best performance, always make the number of shards a multiple or even divisor of the number of nodes in a subcluster.

Cluster Type    Working Data Size    Number of Shards    Initial Node Count
Small           Up to 24 TB          6                   3
Medium          Up to 48 TB          12                  6
Large           Up to 96 TB          24                  12
Extra large     Up to 192 TB         48                  24

How shard count affects scaling your cluster

The number of shards you choose for your database impacts your ability to scale your database in the future. If you have too few shards, your ability to efficiently scale your database can be limited. If you have too many shards, your database performance can suffer. A larger number of shards increases inter-node communication and catalog complexity.

One key factor in deciding your shard count is determining how you want to scale your database. There are two strategies you can use when adding nodes to your database. Each of these strategies let you improve different types of database performance:

  • To increase the performance of complex, long-running queries, add nodes to an existing subcluster. These additional nodes improve the overall performance of these complex queries by splitting the load across more compute nodes. You can add more nodes to a subcluster than you have shards in the database. In this case, nodes in the subcluster that subscribe to the same shard will split up the data in the shard when performing a query. See Elasticity for more information.

  • To increase the throughput of multiple short-term queries (often called "dashboard queries"), improve your cluster's parallelism by adding additional subclusters. Subclusters work independently and in parallel on these shorter queries. See Subclusters for more information.

These two approaches have an impact on the number of shards you choose to start your database with. Complex analytic queries perform better on subclusters with more nodes, which means that 6 nodes with 6 shards perform better than 3 nodes with 6 shards. Having more nodes than shards can increase performance further, but the performance gain is not linear. For example, a subcluster containing 12 nodes in a 6-shard database is not quite as efficient as a 12-node subcluster in a 12-shard database. Dashboard-type queries operating on smaller data sets may not see much difference between a 3-node subcluster in a 6-shard database and a 6-node subcluster in a 6-shard database.

In general, choose a shard count that matches your expected working data size 6–12 months in the future. For more information about scaling your database, see Elasticity.

Use cases

Let’s look at some use cases to learn how to size your Eon Mode cluster to meet your own particular requirements.

Use case 1: save compute by provisioning when needed, rather than for peak times

This use case highlights increasing query throughput in Eon Mode by scaling a cluster from 6 to 18 nodes with 3 subclusters of 6 nodes each. In this use case, you need to support a highly concurrent, short-query workload on a working data set of 24 TB or less. You create an initial database with 6 nodes and 6 shards. You scale your database for concurrent throughput on demand by adding one or more subclusters during certain days of the week or for specific date ranges when you are expecting a peak load. You can then shut down or terminate the additional subclusters when your database experiences lower demand. With Vertica in Eon Mode, you save money by provisioning on demand, rather than provisioning for the peak times.

Use case 2: complex analytic workload requires more compute nodes

This use case showcases the idea that complex analytic workloads on large working data sets benefit from high shard count and node count. You create an initial subcluster with 24 nodes and 24 shards. As needed, you can add an additional 24 nodes to your initial subcluster. These additional nodes enable the subcluster to use elastic crunch scaling to reduce the time it takes to complete complex analytic queries.

Use case 3: workload isolation

This use case showcases the idea of having separate subclusters to isolate ETL and report workloads. You create an initial primary subcluster with 6 nodes and 6 shards for servicing ETL workloads. Then add another 6-node secondary subcluster for executing query workloads. To separate the two workloads, you can configure a network load balancer or create connection load balancing policies in Vertica to direct clients to the correct subcluster based on the type of workloads they need to execute.

3 - Migrating an enterprise database to Eon Mode

Vertica meta-function MIGRATE_ENTERPRISE_TO_EON migrates an Enterprise database to Eon Mode. The migration process includes the following stages:

  1. Check migration prerequisites

  2. Verify compliance

  3. Execute the migration

  4. Check migration results

  5. Activate the Eon database

Migration prerequisites

The following conditions must be true; otherwise, MIGRATE_ENTERPRISE_TO_EON returns with an error:

  • The source Enterprise database version must be ≥10.0.

  • All nodes in the source database must be in an UP state and of type PERMANENT or EPHEMERAL. Verify by querying system table NODES:

    => SELECT node_name, node_type, node_state FROM nodes;
        node_name     | node_type | node_state
    ------------------+-----------+------------
     v_vmart_node0001 | PERMANENT | UP
     v_vmart_node0002 | PERMANENT | UP
     v_vmart_node0003 | PERMANENT | UP
    (3 rows)
    
  • The source database must be configured as an elastic cluster. By default, any database created since Vertica release 9.2.1 is configured as an elastic cluster. To verify whether an Enterprise database is configured as an elastic cluster, query system table ELASTIC_CLUSTER:

    => SELECT is_enabled FROM elastic_cluster;
     is_enabled
    ------------
     t
    (1 row)
    

    If the query returns false, call meta-function ENABLE_ELASTIC_CLUSTER on the Enterprise database (see the example after this list).

  • The source Enterprise database must configure Eon parameters as required by the target Eon object store (see Configuration Requirements below).

  • The database must not contain projections that are unsupported by Eon.
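
For example, enabling elastic cluster on the source Enterprise database is a single meta-function call, after which you can re-run the verification query:

=> SELECT ENABLE_ELASTIC_CLUSTER();
=> SELECT is_enabled FROM elastic_cluster;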

Unsupported projections

Eon databases do not support four types of projections, as described below. If MIGRATE_ENTERPRISE_TO_EON finds any of these projection types in the Enterprise database, it rolls back the migration and reports the offending projections or their anchor tables in the migration error log. For example:

The following projections are inconsistent with cluster segmentation. Rebalance them with REBALANCE_CLUSTER() or REBALANCE_TABLE():
Projection(Anchor Table): public.incon1_p1_b0(public.incon1)
The four projection types, and how to resolve each of them, are as follows (a rebalance example appears after this list):

  • Inconsistent with cluster segmentation: for example, nodes were added to the cluster, so the current distribution of projection data is inconsistent with the new cluster segmentation requirements. Resolution: rebalance the cluster or table. The error log file lists the names of all tables with problematic projections. You can use these names as arguments to meta-function REBALANCE_TABLE, or rebalance all projections by calling REBALANCE_CLUSTER.

  • Does not support elastic segmentation: the projection was created with the NODES option, or in a database where elastic segmentation was disabled. Resolution: drop the projection and recreate it with ALL NODES.

  • Defined with a GROUPED clause: the GROUPED clause consolidates multiple columns in a single ROS container. Resolution: drop the projection and recreate it without the GROUPED clause.

  • Data stored in unbundled storage containers: found only in Vertica databases that were created before storage container bundling was introduced in version 7.2. Resolution: bundle storage containers in the database with meta-function COMPACT_STORAGE. The error log names all tables with projections that store data in unbundled storage containers. You can use these names as arguments to COMPACT_STORAGE.
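
For example, to resolve the segmentation issue reported for public.incon1 in the sample error message above, you could rebalance just that anchor table or the entire cluster. A minimal sketch using the table name from the example output:

=> SELECT REBALANCE_TABLE('public.incon1');
=> SELECT REBALANCE_CLUSTER();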

Configuration requirements

Before migration, set the following configuration parameters on the source database.

The requirements depend on the target Eon environment:

  • S3: AWS, Pure Storage, MinIO

    The following requirements apply to all supported cloud and non-cloud (on-premises) environments: AWS, Pure Storage, and MinIO. One exception applies: migration from an Enterprise Mode database on AWS.

  • Azure

    • You must use an azb:// schema URI to set the Azure Blob Storage location for communal data storage. See Azure Blob Storage object store for the format of this URI.

    • Select one of the following authentication methods to grant Vertica access to the storage location:

      • Configure managed identities to grant your Azure VMs access to the storage location. This option does not require any configuration within Vertica.

      • Set the AzureStorageCredentials and AzureStorageEndpointConfig configuration parameters at the database level to have Vertica authenticate with the storage location.

    See Azure Blob Storage object store for more about the two authentication methods.

  • GCP

  • HDFS

    Set configuration parameter HadoopConfDir at the database level (not required for non-HA environments).

Compliance verification

Before running migration, check whether the Enterprise source database complies with all migration requirements. You do so by setting the last Boolean argument of MIGRATE_ENTERPRISE_TO_EON to true. This optional argument—by default set to false—specifies whether the meta-function only performs a compliance check. For example:

=> SELECT migrate_enterprise_to_eon('s3://verticadbbucket', '/vertica/depot', true);

If the meta-function discovers any compliance issues, it writes these to the migration error log migrate_enterprise_to_eon_error.log in the database directory.

Migration execution

MIGRATE_ENTERPRISE_TO_EON migrates an Enterprise database to an Eon Mode database. For example:

=> SELECT migrate_enterprise_to_eon('s3://verticadbbucket', '/vertica/depot', false);

If the meta-function's last Boolean argument is omitted or set to false, it executes the migration. MIGRATE_ENTERPRISE_TO_EON runs in the foreground, and until it returns—either with success or an error—it blocks all operations in the same session on the source Enterprise database. If successful, MIGRATE_ENTERPRISE_TO_EON returns with a list of nodes in the migrated database. You can then proceed to revive the migrated Eon database.

Handling interrupted migration

If migration is interrupted before the meta-function returns—for example, the client disconnects, or a network outage occurs—the migration errors out. In this case, call MIGRATE_ENTERPRISE_TO_EON to restart migration.

Communal storage of the target database retains data that was already copied before the error occurred. When you call MIGRATE_ENTERPRISE_TO_EON to resume migration, the meta-function first checks the data on communal storage and only copies unprocessed data from the source database.

Repeating migration

You can repeat migration multiple times to the same communal storage location. This can be useful for backfilling changes that occurred in the source database during the previous migration.

The following constraints apply:

  • You can migrate from only one database to the same communal storage location.

  • After reviving the newly migrated Eon database, you cannot migrate again to its communal storage, unless you first drop the database and then clean up storage.

Monitoring migration

System table DATABASE_MIGRATION_STATUS displays the progress of a migration in real time, and also stores data from previous migrations. The following example shows data of a migration that is in progress:

=> SELECT node_name, phase, status, bytes_to_transfer, bytes_transferred, communal_storage_location FROM database_migration_status ORDER BY node_name, start_time;
     node_name    |       phase        |  status   | bytes_to_transfer | bytes_transferred | communal_storage_location
------------------+--------------------+-----------+-------------------+------------------+---------------------------
 v_vmart_node0001 | Catalog Conversion | COMPLETED |                 0 |                0 | s3://verticadbbucket/
 v_vmart_node0001 | Data Transfer      | COMPLETED |              1134 |             1134 | s3://verticadbbucket/
 v_vmart_node0001 | Catalog Transfer   | COMPLETED |              3765 |             3765 | s3://verticadbbucket/
 v_vmart_node0002 | Catalog Conversion | COMPLETED |                 0 |                0 | s3://verticadbbucket/
 v_vmart_node0002 | Data Transfer      | COMPLETED |              1140 |             1140 | s3://verticadbbucket/
 v_vmart_node0002 | Catalog Transfer   | COMPLETED |              3766 |             3766 | s3://verticadbbucket/
 v_vmart_node0003 | Catalog Conversion | COMPLETED |                 0 |                0 | s3://verticadbbucket/
 v_vmart_node0003 | Data Transfer      | RUNNING   |           5272616 |           183955 | s3://verticadbbucket/

Error logging

MIGRATE_ENTERPRISE_TO_EON logs migration-related warnings, errors, and hints in migrate_enterprise_to_eon_error.log in the database directory. During execution, the meta-function also prints messages to standard output, together with the error log's pathname.

Conversion results

The following list shows how migration handles visible objects in the source Enterprise database:

  • Global catalog objects: synced to communal storage.

  • Multiple segmented projections in an identical buddy projection group: one projection in the group.

  • Unsegmented projection replicated on only one node: distributed across all nodes.

  • Number of nodes: the same number of nodes, and an equal number of segmented shards.

  • Depot location: as specified in MIGRATE_ENTERPRISE_TO_EON. The default depot size is set to 80% of the local file system after revive.

  • USER and TEMP storage locations: migrated.

  • DATA and TEMP,DATA storage locations: not migrated. New default DATA and TEMP,DATA storage locations are on the same path as the depot.

  • Fault groups, storage policies, external procedures, and catalog objects related to network settings (load balance groups, network addresses, routing rules, subnets, and so on): not migrated.

Eon database activation

After migration is complete and the Eon database is ready for use, perform these steps:

  1. If migrating to an Eon database with HDFS communal storage, create a bootstrapping file to use when you revive the new Eon database. The bootstrapping file must be on the same node where the revive operation is launched, and readable by the user who launches the revive operation.

    A bootstrapping file is required only if the new Eon database uses one or both of the following:

    • High Availability (HA) NameNodes
      Set HadoopConfDir to the location of configuration file hdfs-site.xml—typically, /etc/hadoop/conf. This file defines parameter dfs.nameservices and individual NameNodes, and must be distributed across all cluster nodes. For details, see Configuring HDFS access.

    • Kerberos authentication
      Set the following Kerberos configuration parameters:

      KerberosServiceName
      KerberosRealm
      KerberosKeytabFile
      

    For example, the bootstrapping file for an Eon database with HA and Kerberos authentication must have the following settings:

    HadoopConfDir = config-path
    KerberosServiceName = principal-name
    KerberosRealm = realm-name
    KerberosKeytabFile = keytab-path
    
  2. Revive the database using one of the following methods:

    • From communal storage on S3 or GCP with Management Console.
    • From communal storage on Azure, S3, GCP, or HDFS with admintools.

    In the following example, the admintools revive_db command revives a three-node database that uses S3 communal storage:

    admintools -t revive_db \
       -x auth_params.conf \
       --communal-storage-location=s3://verticadbbucket \
       -d VMart \
       -s 172.16.116.27,172.16.116.28,172.16.116.29 \
       --force
    

    In the next example, revive_db revives a three-node database that uses HDFS communal storage:

    admintools -t revive_db \
       -x bootstrap_params.conf \
       --communal-storage-location=webhdfs://mycluster/verticadb \
       -d verticadb \
       -s vnode01,vnode02,vnode03
    
  3. Check the controlmode setting in /opt/vertica/config/admintools.conf. This setting must be compatible with the network messaging requirements of your Eon implementation. For example, S3/AWS (Amazon Cloud) relies on unicast messaging, which is compatible with a controlmode setting of point-to-point (pt2pt). If the source database controlmode setting was broadcast and you migrate to S3/AWS communal storage, you must change controlmode with admintools:

    $ admintools -t re_ip -d dbname -T
    
  4. Start the Eon Mode database.

  5. Call CLEAN_COMMUNAL_STORAGE to remove unneeded data files that might be left over from the migration (see the example after these steps).

  6. If migrating to S3 on-premises communal storage—Pure Storage or MinIO—set configuration parameter AWSStreamingConnectionPercentage to 0 with ALTER DATABASE...SET PARAMETER.

  7. Review the depot storage location size and adjust as needed.
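
The SQL portions of steps 5 and 6 are each a single statement. The following sketch assumes an on-premises S3-compatible object store for step 6; the CLEAN_COMMUNAL_STORAGE argument is a Boolean string that controls whether the unneeded files are actually deleted:

=> SELECT CLEAN_COMMUNAL_STORAGE('true');
=> ALTER DATABASE DEFAULT SET PARAMETER AWSStreamingConnectionPercentage = 0;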

4 - Managing subclusters

Subclusters help you organize the nodes in your clusters to isolate workloads and make elastic scaling easier. See Subclusters for an overview of how subclusters can help you.

4.1 - Creating subclusters

By default, new Eon Mode databases contain a single primary subcluster named default_subcluster. This subcluster contains all of the nodes that are part of your database when you create it. You will often want to create subclusters to separate and manage workloads. You have three options to create new subclusters in your database:

Create a subcluster using admintools

To create a new subcluster, use the admintools db_add_subcluster tool:

$ admintools -t db_add_subcluster --help
Usage: db_add_subcluster [options]

Options:
  -h, --help            show this help message and exit
  -d DB, --database=DB  Name of database to be modified
  -s HOSTS, --hosts=HOSTS
                        Comma separated list of hosts to add to the subcluster
  -p DBPASSWORD, --password=DBPASSWORD
                        Database password in single quotes
  -c SCNAME, --subcluster=SCNAME
                        Name of the new subcluster for the new node
  --is-primary          Create primary subcluster
  --is-secondary        Create secondary subcluster
  --control-set-size=CONTROLSETSIZE
                        Set the number of nodes that will run spread within
                        the subcluster
  --like=CLONESUBCLUSTER
                        Name of an existing subcluster from which to clone
                        properties for the new subcluster
  --timeout=NONINTERACTIVE_TIMEOUT
                        set a timeout (in seconds) to wait for actions to
                        complete ('never') will wait forever (implicitly sets
                        -i)
  -i, --noprompts       do not stop and wait for user input(default false).
                        Setting this implies a timeout of 20 min.

The simplest command adds an empty subcluster. It requires the database name, password, and name for the new subcluster. This example adds a subcluster named analytics_cluster to the database named verticadb:

$ adminTools -t db_add_subcluster -d verticadb -p 'password' -c analytics_cluster
Creating new subcluster 'analytics_cluster'
Subcluster added to verticadb successfully.

By default, admintools creates the new subcluster as a secondary subcluster. You can have it create a primary subcluster instead by supplying the --is-primary argument.
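
To confirm whether the new subcluster is primary or secondary, you can check the is_primary column of the SUBCLUSTERS system table:

=> SELECT DISTINCT subcluster_name, is_primary FROM subclusters;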

Adding nodes while creating a subcluster

You can also specify one or more hosts for admintools to add to the subcluster as new nodes. These hosts must be part of the cluster but not already part of the database. For example, you can use hosts that you added to the cluster using the MC or admintools, or hosts that remain part of the cluster after you dropped nodes from the database. This example creates a subcluster named analytics_cluster and uses the -s option to specify the available hosts in the cluster:

$ adminTools -t db_add_subcluster -c analytics_cluster -d verticadb -p 'password' -s 10.0.33.77,10.0.33.181,10.0.33.85

View the subscription status of all nodes in your database with the following query that joins the V_CATALOG.NODES and V_CATALOG.NODE_SUBSCRIPTIONS system tables:

=> SELECT subcluster_name, n.node_name, shard_name, subscription_state FROM
   v_catalog.nodes n LEFT JOIN v_catalog.node_subscriptions ns ON (n.node_name
   = ns.node_name) ORDER BY 1,2,3;

subcluster_name    |      node_name       | shard_name  | subscription_state
-------------------+----------------------+-------------+--------------------
analytics_cluster  | v_verticadb_node0004 | replica     | ACTIVE
analytics_cluster  | v_verticadb_node0004 | segment0001 | ACTIVE
analytics_cluster  | v_verticadb_node0004 | segment0003 | ACTIVE
analytics_cluster  | v_verticadb_node0005 | replica     | ACTIVE
analytics_cluster  | v_verticadb_node0005 | segment0001 | ACTIVE
analytics_cluster  | v_verticadb_node0005 | segment0002 | ACTIVE
analytics_cluster  | v_verticadb_node0006 | replica     | ACTIVE
analytics_cluster  | v_verticadb_node0006 | segment0002 | ACTIVE
analytics_cluster  | v_verticadb_node0006 | segment0003 | ACTIVE
default_subcluster | v_verticadb_node0001 | replica     | ACTIVE
default_subcluster | v_verticadb_node0001 | segment0001 | ACTIVE
default_subcluster | v_verticadb_node0001 | segment0003 | ACTIVE
default_subcluster | v_verticadb_node0002 | replica     | ACTIVE
default_subcluster | v_verticadb_node0002 | segment0001 | ACTIVE
default_subcluster | v_verticadb_node0002 | segment0002 | ACTIVE
default_subcluster | v_verticadb_node0003 | replica     | ACTIVE
default_subcluster | v_verticadb_node0003 | segment0002 | ACTIVE
default_subcluster | v_verticadb_node0003 | segment0003 | ACTIVE
(18 rows)

If you do not include hosts when you create the subcluster, you must manually rebalance the shards in the subcluster when you add nodes at a later time. For more information, see Updating Shard Subscriptions After Adding Nodes.

Subclusters and large cluster

Vertica has a feature named large cluster that helps manage broadcast messages as the database cluster grows. It has several impacts on adding new subclusters:

  • If you create a new subcluster with 16 or more nodes, Vertica automatically enables the large cluster feature. It sets the number of control nodes to the square root of the number of nodes in your subcluster. See Planning a large cluster.

  • You can set the number of control nodes in a subcluster by using the --control-set-size option in the admintools command line.

  • If your database cluster has 120 control nodes, Vertica returns an error if you try to add a new subcluster. Every subcluster must have at least one control node. Your database cannot have more than 120 control nodes. When your database reaches this limit, you must reduce the number of control nodes in other subclusters before you can add a new subcluster. See Changing the number of control nodes and realigning for more information.

  • If you attempt to create a subcluster with a number of control nodes that would exceed the 120 control node limit, Vertica warns you and creates the subcluster with fewer control nodes. It adds as many control nodes as it can to the subcluster, which is 120 minus the current count of control nodes in the cluster. For example, suppose you create a 16-node subcluster in a database cluster that already has 118 control nodes. In this case, Vertica warns you and creates your subcluster with just 2 control nodes rather than the default 4.

See Large cluster for more information about the large cluster feature.

4.2 - Duplicating a subcluster

Subclusters have many settings you can tune to get them to work just the way you want. After you have tuned a subcluster, you may want additional subclusters that are configured the same way. For example, suppose you have a subcluster that you have tuned to perform analytics workloads. To improve query throughput, you can create several more subclusters configured exactly like it. Instead of creating the new subclusters and then manually configuring them from scratch, you can duplicate the existing subcluster (called the source subcluster) to a new subcluster (the target subcluster).

When you create a new subcluster based on another subcluster, Vertica copies most of the source subcluster's settings. See below for a list of the settings that Vertica copies. These settings are both on the node level and the subcluster level.

Requirements for the target subcluster

You must have a set of hosts in your database cluster that you will use as the target of the subcluster duplication. Vertica forms these hosts into a target subcluster that receives most of the settings of the source subcluster. The hosts for the target subcluster must meet the following requirements:

  • They must be part of your database cluster but not part of your database. For example, you can use hosts you have dropped from a subcluster or whose subcluster you have removed. Vertica returns an error if you attempt to duplicate a subcluster onto one or more nodes that are currently participating in the database.

  • The number of nodes you supply for the target subcluster must equal the number of nodes in the source subcluster. When duplicating the subcluster, Vertica performs a 1:1 copy of some node-level settings from each node in the source subcluster to a corresponding node in the target.

  • The RAM and disk allocation for the hosts in the target subcluster should be at least the same as the source nodes. Technically, your target nodes can have less RAM or disk space than the source nodes. However, you will usually see performance issues in the new subcluster because the settings of the original subcluster will not be tuned for the resources of the target subcluster.

You can duplicate a subcluster even if some of the nodes in the source subcluster or hosts in the target are down. If nodes in the target are down, they use the catalog Vertica copied from the source node when they recover.

Duplication of subcluster-level settings

Vertica copies the following subcluster-level settings from the source subcluster to the target:

  • Basic subcluster settings: whether the subcluster is a primary or secondary subcluster.

  • Large cluster settings: the number of control nodes in the subcluster.

  • Resource pool settings:

    • Vertica creates a new resource pool for every subcluster-specific resource pool in the source subcluster.

    • Subcluster-specific resource pool cascade settings are copied from the source subcluster and applied to the newly created resource pool for the target subcluster.

    • Subcluster-level overrides on global resource pool settings such as MEMORYSIZE are copied. See Managing workload resources in an Eon Mode database for more information.

    • Grants on resource pools are copied from the source subcluster.

  • Connection load balancing settings: if the source subcluster is part of a subcluster-based load balancing group (you created the load balancing group using CREATE LOAD BALANCE GROUP...WITH SUBCLUSTER), the new subcluster is added to the group. See Creating Connection Load Balance Groups.

  • Storage policy settings: table and table partition pinning policies are copied from the source to the target subcluster. Any existing storage policies on the target subcluster are dropped before the policies are copied from the source. See Pinning Depot Objects for more information.

Vertica does not copy the following subcluster settings:

  • Basic subcluster settings:

    • The subcluster name (you must provide a new name for the target subcluster).

    • Whether the subcluster is the default subcluster. Your Vertica database has a single default subcluster; if Vertica copied this value, the source subcluster could no longer be the default.

  • Connection load balancing settings: address-based load balancing groups are not duplicated for the target subcluster. For example, suppose you created a load balancing group for the source subcluster by adding the network addresses of all of the subcluster's nodes. In this case, Vertica does not create a load balancing group for the target subcluster because it does not duplicate the network addresses of the source nodes (see the next section). Because it does not copy the addresses, it cannot create an address-based group.

Duplication of node-level settings

When Vertica duplicates a subcluster, it maps each node in the source subcluster to a node in the destination subcluster. Then it copies relevant node-level settings from each individual source node to the corresponding target node.

For example, suppose you have a three-node subcluster consisting of nodes named node01, node02, and node03. The target subcluster has nodes named node04, node05, and node06. In this case, Vertica copies the settings from node01 to node04, from node02 to node05, and from node03 to node06.

The node-level settings that Vertica copies from the source nodes to the target nodes are:

  • Configuration parameters: Vertica copies the value of configuration parameters that you have set at the node level on the source node to the target node. For example, suppose you set CompressCatalogOnDisk on the source node using the statement:

    ALTER NODE node01 SET CompressCatalogOnDisk = 0;

    If you then duplicate the subcluster containing node01, the setting is copied to the target node.

  • Eon Mode settings:

    • Shard subscriptions are copied from the source node to the target.

    • Whether the node is the participating primary node for the shard.

  • Storage location settings: the DATA, TEMP, DEPOT, and USER storage location paths on the source node are duplicated on the target node. When duplicating node-specific paths (such as DATA or DEPOT), the path names are adjusted for the new node name. For example, suppose node 1 has a depot path of /vertica/depot/vmart/v_vmart_node0001_depot. If Vertica duplicates node 1 to node 4, it adjusts the path to /vertica/depot/vmart/v_vmart_node0004_depot.

  • Large cluster settings: control node assignments are copied from the source node to the target node:

    • If the source node is a control node, then the target node is made into a control node.

    • If the source node depends on a control node, then the target node becomes a dependent of the corresponding control node in the new subcluster.

Vertica does not copy the following node-level settings:

  • Connection load balancing settings: network addresses are not copied. The destination node's network addresses do not depend on the settings of the source node, so Vertica cannot determine what the target node's addresses should be.

  • Depot settings: depot-related configuration parameters that can be set at the node level (such as FileDeletionServiceInterval) are not copied from the source node to the target node.

Using admintools to duplicate a subcluster

To duplicate a subcluster, you use the same admintools db_add_subcluster tool that you use to create a new subcluster (see Creating subclusters). In addition to the required options to create a subcluster (the list of hosts, name for the new subcluster, database name, and so on), you also pass the --like option with the name of the source subcluster you want to duplicate.

The following examples demonstrate duplicating a three-node subcluster named analytics_1. The first example examines some of the settings in the analytics_1 subcluster:

  • An override of the global TM resource pool's memory size.

  • Its own resource pool named analytics_pool.

  • Its membership in a subcluster-based load balancing group named analytics.

=> SELECT name, subcluster_name, memorysize FROM SUBCLUSTER_RESOURCE_POOL_OVERRIDES;
 name | subcluster_name | memorysize
------+-----------------+------------
 tm   | analytics_1     | 0%
(1 row)

=> SELECT name, subcluster_name, memorysize, plannedconcurrency
      FROM resource_pools WHERE subcluster_name IS NOT NULL;
      name      | subcluster_name | memorysize | plannedconcurrency
----------------+-----------------+------------+--------------------
 analytics_pool | analytics_1     | 70%        | 8
(1 row)

=> SELECT * FROM LOAD_BALANCE_GROUPS;
   name    |   policy   |  filter   |    type    | object_name
-----------+------------+-----------+------------+-------------
 analytics | ROUNDROBIN | 0.0.0.0/0 | Subcluster | analytics_1
(1 row)

The following example calls admintool's db_add_subcluster tool to duplicate the analytics_1 subcluster onto a set of three hosts to create a subcluster named analytics_2.

$ admintools -t db_add_subcluster -d verticadb \
             -s 10.11.12.13,10.11.12.14,10.11.12.15 \
             -p mypassword --like=analytics_1 -c analytics_2

Creating new subcluster 'analytics_2'
Adding new hosts to 'analytics_2'
Eon database detected, creating new depot locations for newly added nodes
Creating depot locations for 1 nodes
 Warning when creating depot location for node: v_verticadb_node0007
 WARNING: Target node v_verticadb_node0007 is down, so depot size has been
          estimated from depot location on initiator. As soon as the node comes
          up, its depot size might be altered depending on its disk size
Eon database detected, creating new depot locations for newly added nodes
Creating depot locations for 1 nodes
 Warning when creating depot location for node: v_verticadb_node0008
 WARNING: Target node v_verticadb_node0008 is down, so depot size has been
          estimated from depot location on initiator. As soon as the node comes
          up, its depot size might be altered depending on its disk size
Eon database detected, creating new depot locations for newly added nodes
Creating depot locations for 1 nodes
 Warning when creating depot location for node: v_verticadb_node0009
 WARNING: Target node v_verticadb_node0009 is down, so depot size has been
          estimated from depot location on initiator. As soon as the node comes
          up, its depot size might be altered depending on its disk size
Cloning subcluster properties
NOTICE: Nodes in subcluster analytics_1 have network addresses, you
might need to configure network addresses for nodes in subcluster
analytics_2 in order to get load balance groups to work correctly.

    Replicating configuration to all nodes
    Generating new configuration information and reloading spread
    Starting nodes:
        v_verticadb_node0007 (10.11.12.81)
        v_verticadb_node0008 (10.11.12.209)
        v_verticadb_node0009 (10.11.12.186)
    Starting Vertica on all nodes. Please wait, databases with a large catalog
         may take a while to initialize.
    Checking database state for newly added nodes
    Node Status: v_verticadb_node0007: (DOWN) v_verticadb_node0008:
                 (DOWN) v_verticadb_node0009: (DOWN)
    Node Status: v_verticadb_node0007: (INITIALIZING) v_verticadb_node0008:
                 (INITIALIZING) v_verticadb_node0009: (INITIALIZING)
    Node Status: v_verticadb_node0007: (UP) v_verticadb_node0008:
                 (UP) v_verticadb_node0009: (UP)
Syncing catalog on verticadb with 2000 attempts.
    Multi-node DB add completed
Nodes added to subcluster analytics_2 successfully.
Subcluster added to verticadb successfully.

Re-running the queries in the first part of the example shows that the settings from analytics_1 have been duplicated in analytics_2:

=> SELECT name, subcluster_name, memorysize FROM SUBCLUSTER_RESOURCE_POOL_OVERRIDES;
 name | subcluster_name | memorysize
------+-----------------+------------
 tm   | analytics_1     | 0%
 tm   | analytics_2     | 0%
(2 rows)

=> SELECT name, subcluster_name, memorysize, plannedconcurrency
       FROM resource_pools WHERE subcluster_name IS NOT NULL;
      name      | subcluster_name | memorysize |  plannedconcurrency
----------------+-----------------+------------+--------------------
 analytics_pool | analytics_1     | 70%        | 8
 analytics_pool | analytics_2     | 70%        | 8
(2 rows)

=> SELECT * FROM LOAD_BALANCE_GROUPS;
   name    |   policy   |  filter   |    type    | object_name
-----------+------------+-----------+------------+-------------
 analytics | ROUNDROBIN | 0.0.0.0/0 | Subcluster | analytics_2
 analytics | ROUNDROBIN | 0.0.0.0/0 | Subcluster | analytics_1
(2 rows)

As noted earlier, even though the analytics_2 subcluster is part of the analytics load balancing group, its nodes do not have network addresses defined for them. Until you define network addresses for the nodes, Vertica cannot redirect client connections to them.
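
For example, you could define a network address for each node in analytics_2 so that the analytics load balancing group can route client connections to it. The following sketch uses the first new node and its IP address from the output above; the address name is arbitrary:

=> CREATE NETWORK ADDRESS node0007_addr ON v_verticadb_node0007 WITH '10.11.12.81';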

4.3 - Adding and removing nodes from subclusters

You will often want to add new nodes to and remove existing nodes from a subcluster. This ability lets you scale your database to respond to changing analytic needs. For more information on how adding nodes to a subcluster affects your database's performance, see Scaling your Eon Mode database.

Adding new nodes to a subcluster

You can add nodes to a subcluster to meet additional workloads. The nodes that you add to the subcluster must already be part of your cluster. These can be:

  • Nodes that you removed from other subclusters.

  • Nodes you added following the steps in Add nodes to a running cluster on the cloud.

  • Nodes you created using your cloud provider's interface, such as the AWS EC2 "Launch more like this" feature.

To add new nodes to a subcluster, use the db_add_node command of admintools:

$ adminTools -t db_add_node -h
Usage: db_add_node [options]

Options:
  -h, --help            show this help message and exit
  -d DB, --database=DB  Name of the database
  -s HOSTS, --hosts=HOSTS
                        Comma separated list of hosts to add to database
  -p DBPASSWORD, --password=DBPASSWORD
                        Database password in single quotes
  -a AHOSTS, --add=AHOSTS
                        Comma separated list of hosts to add to database
  -c SCNAME, --subcluster=SCNAME
                        Name of subcluster for the new node
  --timeout=NONINTERACTIVE_TIMEOUT
                        set a timeout (in seconds) to wait for actions to
                        complete ('never') will wait forever (implicitly sets
                        -i)
  -i, --noprompts       do not stop and wait for user input(default false).
                        Setting this implies a timeout of 20 min.
  --compat21            (deprecated) Use Vertica 2.1 method using node names
                        instead of hostnames

If you do not use the -c option, Vertica adds new nodes to the default subcluster (set to default_subcluster in new databases). This example adds a new node without specifying the subcluster:

$ adminTools -t db_add_node -p 'password' -d verticadb -s 10.11.12.117
Subcluster not specified, validating default subcluster
Nodes will be added to subcluster 'default_subcluster'
                Verifying database connectivity...10.11.12.10
Eon database detected, creating new depot locations for newly added nodes
Creating depots for each node
        Generating new configuration information and reloading spread
        Replicating configuration to all nodes
        Starting nodes
        Starting nodes:
                v_verticadb_node0004 (10.11.12.117)
        Starting Vertica on all nodes. Please wait, databases with a
            large catalog may take a while to initialize.
        Checking database state
        Node Status: v_verticadb_node0004: (DOWN)
        Node Status: v_verticadb_node0004: (DOWN)
        Node Status: v_verticadb_node0004: (DOWN)
        Node Status: v_verticadb_node0004: (DOWN)
        Node Status: v_verticadb_node0004: (UP)
Communal storage detected: syncing catalog

        Multi-node DB add completed
Nodes added to verticadb successfully.
You will need to redesign your schema to take advantage of the new nodes.

To add nodes to a specific existing subcluster, use the db_add_node tool's -c option:

$ adminTools -t db_add_node -s 10.11.12.178 -d verticadb -p 'password' \
             -c analytics_subcluster
Subcluster 'analytics_subcluster' specified, validating
Nodes will be added to subcluster 'analytics_subcluster'
                Verifying database connectivity...10.11.12.10
Eon database detected, creating new depot locations for newly added nodes
Creating depots for each node
        Generating new configuration information and reloading spread
        Replicating configuration to all nodes
        Starting nodes
        Starting nodes:
                v_verticadb_node0007 (10.11.12.178)
        Starting Vertica on all nodes. Please wait, databases with a
              large catalog may take a while to initialize.
        Checking database state
        Node Status: v_verticadb_node0007: (DOWN)
        Node Status: v_verticadb_node0007: (DOWN)
        Node Status: v_verticadb_node0007: (DOWN)
        Node Status: v_verticadb_node0007: (DOWN)
        Node Status: v_verticadb_node0007: (UP)
Communal storage detected: syncing catalog

        Multi-node DB add completed
Nodes added to verticadb successfully.
You will need to redesign your schema to take advantage of the new nodes.

Updating shard subscriptions after adding nodes

After you add nodes to a subcluster they do not yet subscribe to shards. You can view the subscription status of all nodes in your database using the following query that joins the V_CATALOG.NODES and V_CATALOG.NODE_SUBSCRIPTIONS system tables:

=> SELECT subcluster_name, n.node_name, shard_name, subscription_state FROM
   v_catalog.nodes n LEFT JOIN v_catalog.node_subscriptions ns ON (n.node_name
   = ns.node_name) ORDER BY 1,2,3;

   subcluster_name    |      node_name       | shard_name  | subscription_state
----------------------+----------------------+-------------+--------------------
 analytics_subcluster | v_verticadb_node0004 |             |
 analytics_subcluster | v_verticadb_node0005 |             |
 analytics_subcluster | v_verticadb_node0006 |             |
 default_subcluster   | v_verticadb_node0001 | replica     | ACTIVE
 default_subcluster   | v_verticadb_node0001 | segment0001 | ACTIVE
 default_subcluster   | v_verticadb_node0001 | segment0003 | ACTIVE
 default_subcluster   | v_verticadb_node0002 | replica     | ACTIVE
 default_subcluster   | v_verticadb_node0002 | segment0001 | ACTIVE
 default_subcluster   | v_verticadb_node0002 | segment0002 | ACTIVE
 default_subcluster   | v_verticadb_node0003 | replica     | ACTIVE
 default_subcluster   | v_verticadb_node0003 | segment0002 | ACTIVE
 default_subcluster   | v_verticadb_node0003 | segment0003 | ACTIVE
(12 rows)

You can see that none of the nodes in the newly-added analytics_subcluster have subscriptions.

To update the subscriptions for new nodes, call the REBALANCE_SHARDS function. You can limit the rebalance to the subcluster containing the new nodes by passing its name to the REBALANCE_SHARDS function call. The following example runs rebalance shards to update the analytics_subcluster's subscriptions:


=> SELECT REBALANCE_SHARDS('analytics_subcluster');
 REBALANCE_SHARDS
-------------------
 REBALANCED SHARDS
(1 row)

=> SELECT subcluster_name, n.node_name, shard_name, subscription_state FROM
   v_catalog.nodes n LEFT JOIN v_catalog.node_subscriptions ns ON (n.node_name
   = ns.node_name) ORDER BY 1,2,3;

   subcluster_name    |      node_name       | shard_name  | subscription_state
----------------------+----------------------+-------------+--------------------
 analytics_subcluster | v_verticadb_node0004 | replica     | ACTIVE
 analytics_subcluster | v_verticadb_node0004 | segment0001 | ACTIVE
 analytics_subcluster | v_verticadb_node0004 | segment0003 | ACTIVE
 analytics_subcluster | v_verticadb_node0005 | replica     | ACTIVE
 analytics_subcluster | v_verticadb_node0005 | segment0001 | ACTIVE
 analytics_subcluster | v_verticadb_node0005 | segment0002 | ACTIVE
 analytics_subcluster | v_verticadb_node0006 | replica     | ACTIVE
 analytics_subcluster | v_verticadb_node0006 | segment0002 | ACTIVE
 analytics_subcluster | v_verticadb_node0006 | segment0003 | ACTIVE
 default_subcluster   | v_verticadb_node0001 | replica     | ACTIVE
 default_subcluster   | v_verticadb_node0001 | segment0001 | ACTIVE
 default_subcluster   | v_verticadb_node0001 | segment0003 | ACTIVE
 default_subcluster   | v_verticadb_node0002 | replica     | ACTIVE
 default_subcluster   | v_verticadb_node0002 | segment0001 | ACTIVE
 default_subcluster   | v_verticadb_node0002 | segment0002 | ACTIVE
 default_subcluster   | v_verticadb_node0003 | replica     | ACTIVE
 default_subcluster   | v_verticadb_node0003 | segment0002 | ACTIVE
 default_subcluster   | v_verticadb_node0003 | segment0003 | ACTIVE
(18 rows)

Removing nodes

Your database must meet these requirements before you can remove a node from a subcluster:

  • To remove a node from a primary subcluster, all of the primary nodes in the subcluster must be up, and the database must be able to maintain quorum after the primary node is removed (see Data integrity and high availability in an Eon Mode database). These requirements are necessary because Vertica calls REBALANCE_SHARDS to redistribute shard subscriptions among the remaining nodes in the subcluster. If you attempt to remove a primary node when the database does not meet the requirements, the rebalance shards process waits until either the down nodes recover or a timeout elapses. While it waits, you periodically see a message "Rebalance shards polling iteration number [nn]" indicating that the rebalance process is waiting to complete.

    You can remove nodes from a secondary subcluster even when nodes in the subcluster are down.

  • If your database has the large cluster feature enabled, you cannot remove a node if it is the subcluster's last control node and there are nodes that depend on it. See Large cluster for more information.

    If there are other control nodes in the subcluster, you can drop a control node. Vertica reassigns the nodes that depend on the node being dropped to other control nodes.

To remove one or more nodes, use admintools's db_remove_node tool:

$ adminTools -t db_remove_node -p 'password' -d verticadb -s 10.11.12.117
connecting to 10.11.12.10
Waiting for rebalance shards. We will wait for at most 36000 seconds.
Rebalance shards polling iteration number [0], started at [14:56:41], time out at [00:56:41]
Attempting to drop node v_verticadb_node0004 ( 10.11.12.117 )
        Shutting down node v_verticadb_node0004
        Sending node shutdown command to '['v_verticadb_node0004', '10.11.12.117', '/vertica/data', '/vertica/data']'
        Deleting catalog and data directories
        Update admintools metadata for v_verticadb_node0004
        Eon mode detected. The node v_verticadb_node0004 has been removed from host 10.11.12.117. To remove the
        node metadata completely, please clean up the files corresponding to this node, at the communal
        location: s3://eonbucket/metadata/verticadb/nodes/v_verticadb_node0004
        Reload spread configuration
        Replicating configuration to all nodes
        Checking database state
        Node Status: v_verticadb_node0001: (UP) v_verticadb_node0002: (UP) v_verticadb_node0003: (UP)
Communal storage detected: syncing catalog

When you remove one or more nodes from a subcluster, Vertica automatically rebalances shards in the subcluster. You do not need to manually rebalance shards after removing nodes.
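If you want to confirm the new subscription layout after removing nodes, you can query the shard subscription information again, as in the listing earlier in this section. This is a sketch that assumes the V_CATALOG.NODES and V_CATALOG.NODE_SUBSCRIPTIONS system tables, joined on node name:

=> SELECT subcluster_name, n.node_name, shard_name, subscription_state
   FROM v_catalog.nodes n
   LEFT JOIN v_catalog.node_subscriptions ns ON (n.node_name = ns.node_name)
   ORDER BY subcluster_name, n.node_name, shard_name;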

Moving nodes between subclusters

To move a node from one subcluster to another:

  1. Remove the node or nodes from their current subcluster.

  2. Add the node to the subcluster you want to move it to.
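For example, to move the node at 10.11.12.117 into analytics_subcluster, you might run commands like the following. This is a sketch: the db_remove_node call matches the example above, and it assumes that db_add_node accepts a -c/--subcluster option like the other subcluster tools shown in this guide (run adminTools -t db_add_node -h to confirm the options for your version):

$ adminTools -t db_remove_node -p 'password' -d verticadb -s 10.11.12.117
$ adminTools -t db_add_node -p 'password' -d verticadb -s 10.11.12.117 \
      -c analytics_subcluster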

4.4 - Managing workloads with subclusters

By default, queries are limited to executing on the nodes in the subcluster that contains the initiator node (the node the client is connected to).

By default, queries are limited to executing on the nodes in the subcluster that contains the initiator node (the node the client is connected to). This example demonstrates executing an explain plan for a query when connected to node 4 of a cluster. Node 4 is part of a subcluster containing nodes 4 through 6. You can see that only the nodes in the subcluster will participate in a query:

=> EXPLAIN SELECT customer_name, customer_state FROM customer_dimension LIMIT 10;

                                   QUERY PLAN
--------------------------------------------------------------------------------

 QUERY PLAN DESCRIPTION:
 ------------------------------

 EXPLAIN SELECT customer_name, customer_state FROM customer_dimension LIMIT 10;

 Access Path:
 +-SELECT  LIMIT 10 [Cost: 442, Rows: 10 (NO STATISTICS)] (PATH ID: 0)
 |  Output Only: 10 tuples
 |  Execute on: Query Initiator
 | +---> STORAGE ACCESS for customer_dimension [Cost: 442, Rows: 10K (NO
           STATISTICS)] (PATH ID: 1)
 | |      Projection: public.customer_dimension_b0
 | |      Materialize: customer_dimension.customer_name,
            customer_dimension.customer_state
 | |      Output Only: 10 tuples
 | |      Execute on: v_verticadb_node0004, v_verticadb_node0005,
                      v_verticadb_node0006
     .   .   .

In Eon Mode, you can override the MEMORYSIZE, MAXMEMORYSIZE, and MAXQUERYMEMORYSIZE settings for built-in global resource pools to fine-tune workloads within a subcluster. See Managing workload resources in an Eon Mode database for more information.

What happens when a subcluster cannot run a query

In order to process queries, each subcluster's nodes must have full coverage of all shards in the database. If the nodes do not have full coverage (which can happen if nodes are down), the subcluster can no longer process queries. This state does not cause the subcluster to shut down. Instead, if you attempt to run a query on a subcluster in this state, you receive error messages telling you that not enough nodes are available to complete the query.

=> SELECT node_name, node_state FROM nodes
   WHERE subcluster_name = 'analytics_cluster';
      node_name       | node_state
----------------------+------------
 v_verticadb_node0004 | DOWN
 v_verticadb_node0005 | UP
 v_verticadb_node0006 | DOWN
(3 rows)

=> SELECT * FROM online_sales.online_sales_fact;
ERROR 9099:  Cannot find participating nodes to run the query

Once the down nodes have recovered and the subcluster has full shard coverage, it will be able to process queries.

Controlling where a query runs

You can control where specific types of queries run by controlling which subcluster the clients connect to. The best way to enforce restrictions is to create a set of connection load balancing policies that steer clients from specific IP address ranges to the nodes in the correct subcluster.

For example, suppose you have the following database with two subclusters: one for performing data loading, and one for performing analytics.

The data load tasks come from a set of ETL systems in the 10.20.0.0/16 IP address range. Analytics tasks can come from any other IP address. In this case, you can create a set of connection load balancing policies that ensure that the ETL systems connect to the data load subcluster and all other connections go to the analytics subcluster.

=> SELECT node_name,node_address,node_address_family,subcluster_name
   FROM v_catalog.nodes;
      node_name       | node_address | node_address_family |  subcluster_name
----------------------+--------------+---------------------+--------------------
 v_verticadb_node0001 | 10.11.12.10  | ipv4                | load_subcluster
 v_verticadb_node0002 | 10.11.12.20  | ipv4                | load_subcluster
 v_verticadb_node0003 | 10.11.12.30  | ipv4                | load_subcluster
 v_verticadb_node0004 | 10.11.12.40  | ipv4                | analytics_subcluster
 v_verticadb_node0005 | 10.11.12.50  | ipv4                | analytics_subcluster
 v_verticadb_node0006 | 10.11.12.60  | ipv4                | analytics_subcluster
(6 rows)

=> CREATE NETWORK ADDRESS node01 ON v_verticadb_node0001 WITH '10.11.12.10';
CREATE NETWORK ADDRESS
=> CREATE NETWORK ADDRESS node02 ON v_verticadb_node0002 WITH '10.11.12.20';
CREATE NETWORK ADDRESS
=> CREATE NETWORK ADDRESS node03 ON v_verticadb_node0003 WITH '10.11.12.30';
CREATE NETWORK ADDRESS
=> CREATE NETWORK ADDRESS node04 ON v_verticadb_node0004 WITH '10.11.12.40';
CREATE NETWORK ADDRESS
=> CREATE NETWORK ADDRESS node05 ON v_verticadb_node0005 WITH '10.11.12.50';
CREATE NETWORK ADDRESS
=> CREATE NETWORK ADDRESS node06 ON v_verticadb_node0006 WITH '10.11.12.60';
CREATE NETWORK ADDRESS

=> CREATE LOAD BALANCE GROUP load_subcluster WITH SUBCLUSTER load_subcluster
   FILTER '0.0.0.0/0';
CREATE LOAD BALANCE GROUP
=> CREATE LOAD BALANCE GROUP analytics_subcluster WITH SUBCLUSTER
   analytics_subcluster FILTER '0.0.0.0/0';
CREATE LOAD BALANCE GROUP

=> CREATE ROUTING RULE etl_systems ROUTE '10.20.0.0/16' TO load_subcluster;
CREATE ROUTING RULE
=> CREATE ROUTING RULE analytic_clients ROUTE '0.0.0.0/0' TO analytics_subcluster;
CREATE ROUTING RULE

Once you have created the load balance policies, you can test them using the DESCRIBE_LOAD_BALANCE_DECISION function.

=> SELECT describe_load_balance_decision('192.168.1.1');

               describe_load_balance_decision
           --------------------------------
 Describing load balance decision for address [192.168.1.1]
Load balance cache internal version id (node-local): [1]
Considered rule [etl_systems] source ip filter [10.20.0.0/16]...
   input address does not match source ip filter for this rule.
Considered rule [analytic_clients] source ip filter [0.0.0.0/0]...
   input address matches this rule
Matched to load balance group [analytics_cluster] the group has
   policy [ROUNDROBIN] number of addresses [3]
(0) LB Address: [10.11.12.181]:5433
(1) LB Address: [10.11.12.205]:5433
(2) LB Address: [10.11.12.192]:5433
Chose address at position [1]
Routing table decision: Success. Load balance redirect to: [10.11.12.205]
    port [5433]

(1 row)

=> SELECT describe_load_balance_decision('10.20.1.1');

        describe_load_balance_decision
    --------------------------------
 Describing load balance decision for address [10.20.1.1]
Load balance cache internal version id (node-local): [1]
Considered rule [etl_systems] source ip filter [10.20.0.0/16]...
  input address matches this rule
Matched to load balance group [default_cluster] the group has policy
  [ROUNDROBIN] number of addresses [3]
(0) LB Address: [10.11.12.10]:5433
(1) LB Address: [10.11.12.20]:5433
(2) LB Address: [10.11.12.30]:5433
Chose address at position [1]
Routing table decision: Success. Load balance redirect to: [10.11.12.20]
  port [5433]

(1 row)

Normally, with these policies, all queries run by the ETL system will run on the load subcluster. All other queries will run on the analytics subcluster. There are some cases (especially if a subcluster is down) where a client may connect to a node in another subcluster. For this reason, clients should always verify they are connected to the correct subcluster. See Connection load balancing policies for more information about load balancing policies.
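One way for a client to verify which subcluster it landed on is to look up the node for its own session. The following query is a sketch that assumes the V_MONITOR.CURRENT_SESSION system table reports the session's node in its node_name column:

=> SELECT n.subcluster_name, n.node_name
   FROM v_catalog.nodes n
   JOIN v_monitor.current_session cs ON cs.node_name = n.node_name;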

4.5 - Starting and stopping subclusters

Subclusters make it convenient to start and stop a group of nodes as needed.

Subclusters make it convenient to start and stop a group of nodes as needed. You start and stop them using the admintools command line.

Starting a subcluster

To start a subcluster, use the restart_subcluster tool:

$ adminTools -t restart_subcluster -h
Usage: restart_subcluster [options]

Options:
  -h, --help            show this help message and exit
  -d DB, --database=DB  Name of database whose subcluster is to be restarted
  -c SCNAME, --subcluster=SCNAME
                        Name of subcluster to be restarted
  -p DBPASSWORD, --password=DBPASSWORD
                        Database password in single quotes
  --timeout=NONINTERACTIVE_TIMEOUT
                        set a timeout (in seconds) to wait for actions to
                        complete ('never') will wait forever (implicitly sets
                        -i)
  -i, --noprompts       do not stop and wait for user input(default false).
                        Setting this implies a timeout of 20 min.
  -F, --force           Force the nodes in the subcluster to start and auto
                        recover if necessary

This example starts the subcluster named analytics_cluster:

$ adminTools -t restart_subcluster -c analytics_cluster \
          -d verticadb -p password
*** Restarting subcluster for database verticadb ***
        Restarting host [10.11.12.192] with catalog [v_verticadb_node0006_catalog]
        Restarting host [10.11.12.181] with catalog [v_verticadb_node0004_catalog]
        Restarting host [10.11.12.205] with catalog [v_verticadb_node0005_catalog]
        Issuing multi-node restart
        Starting nodes:
                v_verticadb_node0004 (10.11.12.181)
                v_verticadb_node0005 (10.11.12.205)
                v_verticadb_node0006 (10.11.12.192)
        Starting Vertica on all nodes. Please wait, databases with a large
            catalog may take a while to initialize.
        Node Status: v_verticadb_node0002: (UP) v_verticadb_node0004: (DOWN)
                     v_verticadb_node0005: (DOWN) v_verticadb_node0006: (DOWN)
        Node Status: v_verticadb_node0002: (UP) v_verticadb_node0004: (DOWN)
                     v_verticadb_node0005: (DOWN) v_verticadb_node0006: (DOWN)
        Node Status: v_verticadb_node0002: (UP) v_verticadb_node0004: (DOWN)
                     v_verticadb_node0005: (DOWN) v_verticadb_node0006: (DOWN)
        Node Status: v_verticadb_node0002: (UP) v_verticadb_node0004: (DOWN)
                     v_verticadb_node0005: (DOWN) v_verticadb_node0006: (DOWN)
        Node Status: v_verticadb_node0002: (UP) v_verticadb_node0004: (UP)
                     v_verticadb_node0005: (UP) v_verticadb_node0006: (UP)
Communal storage detected: syncing catalog

Restart Subcluster result:  1

Stopping a subcluster

To stop a subcluster, use the stop_subcluster tool:

$ adminTools -t stop_subcluster -h
Usage: stop_subcluster [options]

Options:
  -h, --help            show this help message and exit
  -d DB, --database=DB  Name of database whose subcluster is to be stopped
  -c SCNAME, --subcluster=SCNAME
                        Name of subcluster to be stopped
  -p DBPASSWORD, --password=DBPASSWORD
                        Database password in single quotes
  --timeout=NONINTERACTIVE_TIMEOUT
                        set a timeout (in seconds) to wait for actions to
                        complete ('never') will wait forever (implicitly sets
                        -i)
  -i, --noprompts       do not stop and wait for user input(default false).
                        Setting this implies a timeout of 20 min.

This example stops the subcluster named analytics_cluster:

$ adminTools -t stop_subcluster -c analytics_cluster -d verticadb -p password
*** Forcing subcluster shutdown ***
Verifying subcluster 'analytics_cluster'
        Node 'v_verticadb_node0004' will shutdown
        Node 'v_verticadb_node0005' will shutdown
        Node 'v_verticadb_node0006' will shutdown
Shutdown subcluster command sent to the database

You can also use the SHUTDOWN_SUBCLUSTER function to stop a subcluster. See SHUTDOWN_SUBCLUSTER for details.
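For example, assuming the same subcluster name as above, you can stop it from a SQL session like this:

=> SELECT SHUTDOWN_SUBCLUSTER('analytics_cluster');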

4.6 - Altering subcluster settings

There are several settings you can alter on a subcluster using the ALTER SUBCLUSTER statement.

There are several settings you can alter on a subcluster using the ALTER SUBCLUSTER statement. You can also switch a subcluster from a primary to a secondary subcluster, or from a secondary to a primary.

Renaming a subcluster

To rename an existing subcluster, use the ALTER SUBCLUSTER statement's RENAME TO clause:

=> ALTER SUBCLUSTER default_subcluster RENAME TO load_subcluster;
ALTER SUBCLUSTER

=> SELECT DISTINCT subcluster_name FROM subclusters;
  subcluster_name
-------------------
 load_subcluster
 analytics_cluster
(2 rows)

Changing the default subcluster

The default subcluster designates which subcluster Vertica adds nodes to if you do not explicitly specify a subcluster when adding nodes to the database. When you create a new database (or upgrade a database from a version prior to 9.3.0), the subcluster named default_subcluster is the default. You can find the current default subcluster by querying the is_default column of the SUBCLUSTERS system table.

The following example demonstrates finding the default subcluster, and then changing it to the subcluster named analytics_cluster:

=> SELECT DISTINCT subcluster_name FROM SUBCLUSTERS WHERE is_default = true;
  subcluster_name
--------------------
 default_subcluster
(1 row)

=> ALTER SUBCLUSTER analytics_cluster SET DEFAULT;
ALTER SUBCLUSTER
=> SELECT DISTINCT subcluster_name FROM SUBCLUSTERS WHERE is_default = true;
  subcluster_name
-------------------
 analytics_cluster
(1 row)

Converting a subcluster from primary to secondary, or secondary to primary

You usually choose whether a subcluster is primary or secondary when creating it (see Creating subclusters for more information). However, you can switch a subcluster between the two settings after you have created it. You may want to change whether a subcluster is primary or secondary to affect the K-safety of your database. For example, if you have a single primary subcluster that has down nodes that you cannot easily replace, you can promote a secondary subcluster to primary to ensure that losing another primary node will not cause your database to shut down. On the other hand, you may choose to convert a primary subcluster to a secondary before eventually shutting it down. This conversion can prevent the database from losing K-safety if the subcluster you are shutting down contains half or more of the total number of primary nodes in the database.

To make a secondary subcluster into a primary subcluster, use the PROMOTE_SUBCLUSTER_TO_PRIMARY function:

=> SELECT DISTINCT subcluster_name, is_primary from subclusters;
  subcluster_name  | is_primary
-------------------+------------
 analytics_cluster | f
 load_subcluster   | t
(2 rows)


=> SELECT PROMOTE_SUBCLUSTER_TO_PRIMARY('analytics_cluster');
 PROMOTE_SUBCLUSTER_TO_PRIMARY
-------------------------------
 PROMOTE SUBCLUSTER TO PRIMARY
(1 row)


=> SELECT DISTINCT subcluster_name, is_primary from subclusters;
  subcluster_name  | is_primary
-------------------+------------
 analytics_cluster | t
 load_subcluster   | t
(2 rows)

Making a primary subcluster into a secondary subcluster is similar. However, unlike promoting a secondary subcluster, demoting a primary subcluster can be blocked by several conditions. Vertica prevents you from making a primary subcluster into a secondary if any of the following is true:

  • The subcluster contains a critical node.

  • The subcluster is the only primary subcluster in the database. You must have at least one primary subcluster.

  • The initiator node is a member of the subcluster you are trying to demote. You must call DEMOTE_SUBCLUSTER_TO_SECONDARY from another subcluster.

To convert a primary subcluster to secondary, use the DEMOTE_SUBCLUSTER_TO_SECONDARY function:

=> SELECT DISTINCT subcluster_name, is_primary from subclusters;
  subcluster_name  | is_primary
-------------------+------------
 analytics_cluster | t
 load_subcluster   | t
(2 rows)

=> SELECT DEMOTE_SUBCLUSTER_TO_SECONDARY('analytics_cluster');
 DEMOTE_SUBCLUSTER_TO_SECONDARY
--------------------------------
 DEMOTE SUBCLUSTER TO SECONDARY
(1 row)

=> SELECT DISTINCT subcluster_name, is_primary from subclusters;
  subcluster_name  | is_primary
-------------------+------------
 analytics_cluster | f
 load_subcluster   | t
(2 rows)

4.7 - Removing subclusters

Removing a subcluster from the database deletes the subcluster from the Vertica catalog.

Removing a subcluster from the database deletes the subcluster from the Vertica catalog. During the removal, Vertica removes any nodes in the subcluster from the database. These nodes are still part of the database cluster, but are no longer part of the database. If you view your cluster in the MC, you will see these nodes with the status STANDBY. They can be added back to the database by adding them to another subcluster. See Creating subclusters and Adding New Nodes to a Subcluster.

Vertica places several restrictions on removing a subcluster: for example, you cannot remove the default subcluster, and you cannot remove the last primary subcluster in the database.

To remove a subcluster, use the admintools command line db_remove_subcluster tool:

$ adminTools -t db_remove_subcluster -h
Usage: db_remove_subcluster [options]

Options:
  -h, --help            show this help message and exit
  -d DB, --database=DB  Name of database to be modified
  -c SCNAME, --subcluster=SCNAME
                        Name of subcluster to be removed
  -p DBPASSWORD, --password=DBPASSWORD
                        Database password in single quotes
  --timeout=NONINTERACTIVE_TIMEOUT
                        set a timeout (in seconds) to wait for actions to
                        complete ('never') will wait forever (implicitly sets
                        -i)
  -i, --noprompts       do not stop and wait for user input(default false).
                        Setting this implies a timeout of 20 min.
  --skip-directory-cleanup
                        Caution: this option will force you to do a manual
                        cleanup. This option skips directory deletion during
                        remove subcluster. This is best used in a cloud
                        environment where the hosts being removed will be
                        subsequently discarded.

This example removes the subcluster named analytics_cluster:

$ adminTools -t db_remove_subcluster -d verticadb -c analytics_cluster -p 'password'
Found node v_verticadb_node0004 in subcluster analytics_cluster
Found node v_verticadb_node0005 in subcluster analytics_cluster
Found node v_verticadb_node0006 in subcluster analytics_cluster
Found node v_verticadb_node0007 in subcluster analytics_cluster
Waiting for rebalance shards. We will wait for at most 36000 seconds.
Rebalance shards polling iteration number [0], started at [17:09:35], time
    out at [03:09:35]
Attempting to drop node v_verticadb_node0004 ( 10.11.12.40 )
    Shutting down node v_verticadb_node0004
    Sending node shutdown command to '['v_verticadb_node0004', '10.11.12.40',
        '/vertica/data', '/vertica/data']'
    Deleting catalog and data directories
    Update admintools metadata for v_verticadb_node0004
    Eon mode detected. The node v_verticadb_node0004 has been removed from
        host 10.11.12.40. To remove the node metadata completely, please clean
        up the files corresponding to this node, at the communal location:
        s3://eonbucket/verticadb/metadata/verticadb/nodes/v_verticadb_node0004
Attempting to drop node v_verticadb_node0005 ( 10.11.12.50 )
    Shutting down node v_verticadb_node0005
    Sending node shutdown command to '['v_verticadb_node0005', '10.11.12.50',
        '/vertica/data', '/vertica/data']'
    Deleting catalog and data directories
    Update admintools metadata for v_verticadb_node0005
    Eon mode detected. The node v_verticadb_node0005 has been removed from
        host 10.11.12.50. To remove the node metadata completely, please clean
        up the files corresponding to this node, at the communal location:
        s3://eonbucket/verticadb/metadata/verticadb/nodes/v_verticadb_node0005
Attempting to drop node v_verticadb_node0006 ( 10.11.12.60 )
    Shutting down node v_verticadb_node0006
    Sending node shutdown command to '['v_verticadb_node0006', '10.11.12.60',
        '/vertica/data', '/vertica/data']'
    Deleting catalog and data directories
    Update admintools metadata for v_verticadb_node0006
    Eon mode detected. The node v_verticadb_node0006 has been removed from
        host 10.11.12.60. To remove the node metadata completely, please clean
        up the files corresponding to this node, at the communal location:
        s3://eonbucket/verticadb/metadata/verticadb/nodes/v_verticadb_node0006
Attempting to drop node v_verticadb_node0007 ( 10.11.12.70 )
    Shutting down node v_verticadb_node0007
    Sending node shutdown command to '['v_verticadb_node0007', '10.11.12.70',
        '/vertica/data', '/vertica/data']'
    Deleting catalog and data directories
    Update admintools metadata for v_verticadb_node0007
    Eon mode detected. The node v_verticadb_node0007 has been removed from
        host 10.11.12.70. To remove the node metadata completely, please clean
        up the files corresponding to this node, at the communal location:
        s3://eonbucket/verticadb/metadata/verticadb/nodes/v_verticadb_node0007
    Reload spread configuration
    Replicating configuration to all nodes
    Checking database state
    Node Status: v_verticadb_node0001: (UP) v_verticadb_node0002: (UP)
        v_verticadb_node0003: (UP)
Communal storage detected: syncing catalog

5 - Depot management

The nodes of an Eon Mode database fetch data from communal storage as needed to process queries, and cache that data locally on disk.

The nodes of an Eon Mode database fetch data from communal storage as needed to process queries, and cache that data locally on disk. The cached data of all nodes within a subcluster comprise that cluster's depot. Vertica uses depots to facilitate query execution: when processing a query, Vertica first checks the current depot for the required data. If the data is unavailable, Vertica fetches it from communal storage and saves a copy in the depot to expedite future queries. Vertica also uses the depot for load operations, caching newly-loaded data in the depot before uploading it to communal storage.

5.1 - Managing depot caching

You can control depot caching in several ways.

You can control depot caching in several ways, as described in the sections that follow.

You can monitor depot activity and settings with several V_MONITOR system tables, or with the Management Console.

Depot gateway parameters

Vertica depots can cache two types of data:

  • Queried data: The depot facilitates query execution by fetching queried data from communal storage and caching it in the depot. The cached data remains available until it is evicted to make room for fresher data, or for data that is fetched for more recent queries.

  • Loaded data: The depot expedites load operations such as COPY by temporarily caching data until it is uploaded to communal storage.

By default, depots are configured to cache both types of data.

Two configuration parameters determine whether a depot caches queried or loaded data:

Parameter Settings
UseDepotForReads

Boolean:

  • 1 (default): Search the depot for the queried data; if not found, fetch the data from communal storage.

  • 0: Bypass the depot and get queried data from communal storage.

UseDepotForWrites

Boolean:

  • 1 (default): Write loaded data to the depot, then upload files to communal storage.

  • 0: Bypass the depot and write directly to communal storage.

Both parameters can be set at the session, user, and database levels.

If set at the session or user level, these parameters can be used to segregate read and write activity on the depots of different subclusters. For example, the parameters UseDepotForReads and UseDepotForWrites might be set as follows for users joe and rhonda:

=> SHOW USER joe ALL;
          name           | setting
-------------------------+---------
 UseDepotForReads        | 1
 UseDepotForWrites       | 0
(2 rows)
=> SHOW USER rhonda ALL;
          name           | setting
-------------------------+---------
 UseDepotForReads        | 0
 UseDepotForWrites       | 1
(2 rows)

Given these user settings, when joe connects to a Vertica subcluster, his session only uses the current depot to process queries; all load operations are uploaded to communal storage. Conversely, rhonda's sessions only use the depot to process load operations; all queries must fetch their data from communal storage.
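For reference, user-level settings such as these can be applied with ALTER USER, using the same SET PARAMETER syntax that the next section shows for DepotOperationsForQuery. A minimal sketch of how joe's settings above might have been applied:

=> ALTER USER joe SET PARAMETER UseDepotForReads = 1;
ALTER USER
=> ALTER USER joe SET PARAMETER UseDepotForWrites = 0;
ALTER USER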

Depot fetching

If a depot is enabled to cache queried data (UseDepotForReads = 1), you can configure how it fetches data from communal storage with configuration parameter DepotOperationsForQuery. This parameter has three settings:

  • ALL (default): Fetch file data from communal storage, evicting existing files from the depot if necessary to make room.

  • FETCHES: Fetch file data from communal storage only if space is available; otherwise, read the queried data directly from communal storage.

  • NONE: Do not fetch file data to the depot, read the queried data directly from communal storage.

You can set fetching behavior at four levels, in ascending order of precedence: database, user, session, and individual query (with the DEPOT_FETCH hint).

For example, you can set DepotOperationsForQuery at the database level as follows:

=> ALTER DATABASE default SET PARAMETER DepotOperationsForQuery = FETCHES;
ALTER DATABASE

This setting applies to all database depots unless overridden at other levels. For example, the following ALTER USER statement specifies fetching behavior for a depot when it processes queries from user joe:

=> ALTER USER joe SET PARAMETER DepotOperationsForQuery = ALL;
ALTER USER

Finally, joe can override his own DepotOperationsForQuery setting by including the DEPOT_FETCH hint in individual queries:


=> SELECT /*+DEPOT_FETCH(NONE)*/ count(*) FROM bar;

Evicting depot data

In general, Vertica evicts data from the depot as needed to provide room for new data, and expedite request processing. Before writing new data to the depot, Vertica evaluates it as follows:

  • Data fetched from communal storage: Vertica sizes the download and evicts data from the depot accordingly.

  • Data uploaded from a DML operation such as COPY: Vertica cannot estimate the total size of the upload before it is complete, so it sizes individual buffers and evicts data from the depot as needed.

In both cases, Vertica assesses existing depot data and determines which objects to evict from the depot as follows, in descending order of precedence (most to least vulnerable):

  1. Least recently used unpinned object evicted for any new object, pinned or unpinned.

  2. Least recently used pinned object evicted for a new pinned object.

Pinning depot objects

You can set depot pinning policies on database objects to reduce their exposure to eviction. Pinning policies can be set on individual subclusters, or on the entire database, and at different levels of granularity—table, projection, and partitions:

Pinning of... Supported by...
Tables SET_DEPOT_PIN_POLICY_TABLE
Projections SET_DEPOT_PIN_POLICY_PROJECTION
Partitions SET_DEPOT_PIN_POLICY_PARTITION

By default, pinned objects are queued for download from communal storage only as needed to execute a query or DML operation. To override this behavior and immediately queue newly pinned objects for download, set the last Boolean argument of the SET_DEPOT_PIN_POLICY function to true.

In the following example, SET_DEPOT_PIN_POLICY_TABLE pins the data of table foo and specifies to queue the data immediately for download:

=> SELECT SET_DEPOT_PIN_POLICY_TABLE('foo', 'default_subcluster', true);

Usage guidelines

Pinning one or more objects on a depot affects its retention of fetched (queried) data and uploaded (newly loaded) data. If too much depot space is claimed by pinned objects, the depot might be unable to handle load operations on unpinned objects. In this case, set configuration parameter UseDepotForWrites to 0, so load operations are routed directly to communal storage for processing. Otherwise, load operations are liable to return with an error.

To minimize contention over depot usage, consider the following guidelines:

  • Pin only those objects that are most active in DML operations and queries.

  • Minimize the size of pinned data by setting policies at the smallest effective level—for example, pin only the data of a table's active partition.
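For example, to pin only a table's active partition on a subcluster, you could use SET_DEPOT_PIN_POLICY_PARTITION. This is a sketch under the assumption that the function takes the table name, the minimum and maximum partition key values, an optional subcluster name, and the same trailing download flag as the table-level function shown above; the table name and partition key values are hypothetical:

=> SELECT SET_DEPOT_PIN_POLICY_PARTITION('sales', '2024-05', '2024-05',
          'default_subcluster', true);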

Depot warming

On startup, the depots of new nodes are empty, while the depots of restarted nodes often contain stale data that must be refreshed. When depot warming is enabled, a node that is undergoing startup preemptively loads its depot with frequently queried and pinned data. When the node completes startup and begins to execute queries, its depot already contains much of the data it needs to process those queries. This reduces the need to fetch data from communal storage, and expedites query performance accordingly.

By default, depot warming is disabled (EnableDepotWarmingFromPeers = 0). A node executes depot warming as follows:

  1. The node checks configuration parameter PreFetchPinnedObjectsToDepotAtStartup. If enabled (set to 1), the node:

    • Gets from the database catalog a list of all objects that are pinned on this node's subcluster.

    • Queues the pinned objects for fetching and calculates their total size.

  2. The node checks configuration parameter EnableDepotWarmingFromPeers. If enabled (set to 1), the node:

    • Identifies a peer node in the same subcluster whose depot contents it can copy.

    • After taking into account all pinned objects, calculates how much space remains available in the warming depot.

    • Gets from the peer node a list of the most recently used objects that can fit in the depot.

    • Queues the objects for fetching.

  3. If BackgroundDepotWarming is enabled (set to 1, default), the node loads queued objects into its depot while it is warming, and continues to do so in the background after the node becomes active and starts executing queries. Otherwise (BackgroundDepotWarming = 0), node activation is deferred until the depot fetches and loads all queued objects.
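To turn depot warming on, you can enable these parameters at the database level, using the same ALTER DATABASE ... SET PARAMETER syntax shown earlier for DepotOperationsForQuery. A minimal sketch:

=> ALTER DATABASE default SET PARAMETER EnableDepotWarmingFromPeers = 1;
ALTER DATABASE
=> ALTER DATABASE default SET PARAMETER PreFetchPinnedObjectsToDepotAtStartup = 1;
ALTER DATABASE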

Monitoring the depot

You can monitor depot activity and settings with several V_MONITOR system tables.

System table... Shows...
DATA_READS All storage locations that a query reads to obtain data.
DEPOT_EVICTIONS Details about objects that were evicted from the depot.
DEPOT_FETCH_QUEUE Pending depot requests for queried file data to fetch from communal storage.
DEPOT_FILES Objects that are cached in database depots.
DEPOT_PIN_POLICIES Objects—tables and table partitions—that are pinned to database depots.
DEPOT_SIZES Depot caching capacity per node.
DEPOT_UPLOADS Details about depot uploads to communal storage.
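For example, to spot-check depot activity you might look at the pending fetch queue and recent evictions. This sketch selects all columns so that it does not depend on a particular column set:

=> SELECT * FROM v_monitor.depot_fetch_queue LIMIT 10;
=> SELECT * FROM v_monitor.depot_evictions LIMIT 10;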

5.2 - Resizing depot caching capacity

Each node in an Eon database caches depot data in a predefined storage location.

Each node in an Eon Mode database caches depot data in a predefined storage location. The storage location path depends on your Vertica installation's filesystem. By default, each node in a cluster can use up to 60 percent of disk space on the storage location's filesystem to cache depot data. You can change caching capacity with ALTER_LOCATION_SIZE, by specifying either a fixed size or a percentage of total disk space. The function can apply the change to a single node, a subcluster, or all nodes in the database cluster. You can increase depot caching capacity for each node up to 80 percent.

In the following example, ALTER_LOCATION_SIZE increases depot caching capacity to 80 percent of disk space on the storage location's filesystem. The function supplies an empty string as the second (node-name) argument, so the change applies to all nodes:

=> SELECT node_name, location_label, location_path, max_size, disk_percent FROM storage_locations WHERE location_usage = 'DEPOT' ORDER BY node_name;
    node_name     | location_label  |      location_path      |  max_size   | disk_percent
------------------+-----------------+-------------------------+-------------+--------------
 v_vmart_node0001 | auto-data-depot | /home/dbadmin/verticadb | 36060108800 | 70%
 v_vmart_node0002 | auto-data-depot | /home/dbadmin/verticadb | 36059377664 | 70%
 v_vmart_node0003 | auto-data-depot | /home/dbadmin/verticadb | 36060108800 | 70%
(3 rows)

=> SELECT alter_location_size('depot', '','80%');
 alter_location_size
---------------------
 depotSize changed.
(1 row)

=> SELECT node_name, location_label, location_path, max_size, disk_percent FROM storage_locations WHERE location_usage = 'DEPOT' ORDER BY node_name;
    node_name     | location_label  |      location_path      |  max_size   | disk_percent
------------------+-----------------+-------------------------+-------------+--------------
 v_vmart_node0001 | auto-data-depot | /home/dbadmin/verticadb | 41211552768 | 80%
 v_vmart_node0002 | auto-data-depot | /home/dbadmin/verticadb | 41210717184 | 80%
 v_vmart_node0003 | auto-data-depot | /home/dbadmin/verticadb | 41211552768 | 80%
(3 rows)

Rescaling depot capacity

When a database is revived on an instance with more or less disk space than it had previously, Vertica evaluates the depot size settings that were previously in effect. If the depot size was specified as a percentage of available disk space, Vertica proportionately rescales depot capacity. For example, if depot caching capacity for a given node was set to 70 percent, the revived node applies that setting to the new disk space and adjusts depot caching capacity accordingly. If depot capacity was set to a fixed size, Vertica applies that setting, unless doing so would consume more than 80 percent of available disk space. In that case, Vertica automatically adjusts depot size as needed.

6 - Scaling your Eon Mode database

One of the strengths of an Eon Mode database is its ability to grow or shrink to meet your workload demands.

One of the strengths of an Eon Mode database is its ability to grow or shrink to meet your workload demands. You can add nodes to and remove nodes from your database to meet changing workload demands. For an overview of why you would scale your database and how it affects queries, see Elasticity.

Scaling up your database by starting stopped nodes

The easiest way to scale up your database is to start any stopped nodes or subclusters (see Starting and stopping subclusters).

Scaling up your database by adding nodes

If you do not have stopped nodes in your database, or the stopped nodes are not in the subclusters where you want to add new nodes, then you can add new nodes to the database. In supported environments, you can use the MC to provision and add new nodes to your database in a single step. See Viewing and managing your cluster for more information.

You can also manually add new nodes, as described earlier in this guide.

Controlling how Vertica uses your new nodes

New nodes can improve your database's performance in one of two ways:

  • Increase the query throughput (the number of queries your database processes at the same time).

  • Increase individual query performance (how fast each query runs).

See Elasticity for details on these performance improvements. You control how the new nodes improve your database's performance by choosing what subclusters you add them to. The following topics explain how to use scaling to improve throughput and query performance.

6.1 - Improving query throughput using subclusters

Improving query throughput increases the number of queries your Eon Mode database processes at the same time.

Improving query throughput increases the number of queries your Eon Mode database processes at the same time. You are usually concerned about your database's throughput when your workload consists of many short-running queries. They are often referred to as "dashboard queries." This term describes the type of workload you see when a large number of users have web-based dashboard pages open to monitor some sort of status. These dashboards tend to update frequently, using simple, short-running queries instead of analytics-heavy, long-running queries.

The best way to improve your database's throughput is to add new subclusters to the database or start any stopped subclusters. Then distribute the client connections among these subclusters using connection load balancing policies. Subclusters independently process queries. By adding more subclusters, you improve your database's parallelism.

For the best performance, make the number of nodes in your subcluster the same as the number of shards in your database. If you choose to have fewer nodes than the number of shards, make the number of nodes an even divisor of the number of shards. When the number of shards is divisible by the number of nodes, the data in your database is equally divided among the nodes in the subcluster.
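To check the shard count when sizing a subcluster, you can query the SHARDS system table. This sketch assumes that the replica shard is named 'replica', as in the subscription listing earlier in this guide, and excludes it from the count:

=> SELECT COUNT(*) AS shard_count
   FROM v_catalog.shards
   WHERE shard_name != 'replica';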

The easiest way of adding subclusters is to use the MC:

  1. From the MC home page, click the database you want to add subclusters to.

  2. Click Manage.

  3. Click Add Subcluster.

  4. Follow the steps in the wizard to add the subcluster. Normally, the only items you need to fill in are the subcluster name and the number of instances to add to it.

Distributing clients among the throughput subclusters

To gain benefits from the added subclusters, the clients that execute short-running queries must connect to the nodes in those subclusters. Queries run only on the subcluster that contains the initiator node (the node the client is connected to). Use connection load balancing policies to spread the connections across all of the subclusters you created to increase query throughput. See Connection load balancing policies for details.

The following example creates a load balancing policy that spreads client connections across two three-node subclusters named query_pool_a and query_pool_b. This example:

  • Creates network addresses on the six nodes that are in the two subclusters.

  • Creates a load balance group from all the nodes in the two subclusters.

  • Creates the routing rule to redirect all incoming connections to the two subclusters.

=> CREATE NETWORK ADDRESS node04 ON v_verticadb_node0004 WITH '203.0.113.1';
CREATE NETWORK ADDRESS
=> CREATE NETWORK ADDRESS node05 ON v_verticadb_node0005 WITH '203.0.113.2';
CREATE NETWORK ADDRESS
=> CREATE NETWORK ADDRESS node06 ON v_verticadb_node0006 WITH '203.0.113.3';
CREATE NETWORK ADDRESS
=> CREATE NETWORK ADDRESS node07 ON v_verticadb_node0007 WITH '203.0.113.4';
CREATE NETWORK ADDRESS
=> CREATE NETWORK ADDRESS node08 ON v_verticadb_node0008 WITH '203.0.113.5';
CREATE NETWORK ADDRESS
=> CREATE NETWORK ADDRESS node09 ON v_verticadb_node0009 WITH '203.0.113.6';
CREATE NETWORK ADDRESS

=> CREATE LOAD BALANCE GROUP query_subclusters WITH SUBCLUSTER query_pool_a,
   query_pool_b FILTER '0.0.0.0/0';
CREATE LOAD BALANCE GROUP
=> CREATE ROUTING RULE query_clients ROUTE '0.0.0.0/0' TO query_subclusters;
CREATE ROUTING RULE

After creating the policy, any client that opts into load balancing is redirected to one of the nodes in the two subclusters. For example, when you connect to node 1 in the cluster (with the IP address 203.0.113.1) using vsql with the -C flag, you see output similar to this:

$ vsql -h 203.0.113.1 -U dbadmin -w mypassword -C
Welcome to vsql, the Vertica Analytic Database interactive terminal.

Type:  \h or \? for help with vsql commands
       \g or terminate with semicolon to execute query
       \q to quit

SSL connection (cipher: ECDHE-RSA-AES256-GCM-SHA384, bits: 256, protocol: TLSv1.2)

INFO: Connected using a load-balanced connection.
INFO: Connected to 203.0.113.7 at port 5433.
=>

Connection load balancing policies take into account nodes that are stopped when picking a node to handle a client connection. If you shut down one or more subclusters to save money during low-demand periods, you do not need to adjust your load balancing policy as long as some of the nodes are still up.

6.2 - Using elastic crunch scaling to improve query performance

You can choose to add nodes to your database to improve the performance of complex long-running analytic queries.

You can choose to add nodes to your database to improve the performance of complex long-running analytic queries. Adding nodes helps these queries run faster.

When you have more nodes in a subcluster than you have shards in your database, multiple nodes subscribe to each shard. To involve all of the nodes in the subcluster in queries, the Vertica query optimizer automatically uses a feature called Elastic Crunch Scaling (ECS). This feature splits the responsibility for processing the data in each shard among the nodes that subscribe to it. During a query, each node has less data to process and usually finishes the query faster.

For example, suppose you have a six-node subcluster in a three-shard database. In this subcluster, two nodes subscribe to each shard. When you execute a query, Vertica assigns each node roughly half of the data in the shard it subscribes to. Because all nodes in the subcluster participate in the query, the query usually finishes faster than if only half the nodes had participated.

ECS lets a subcluster that has more nodes than shards act as if the shard count in the database were higher. In a three-shard database, a six-node subcluster acts as if the database has six shards by splitting each shard in half. However, using ECS isn't as efficient as having a higher shard count. In practice, you will see slightly slower query performance on a six-node subcluster in a three-shard database than you would see from a six-node subcluster in a six-shard database.

You can determine when the optimizer will use ECS in a subcluster by querying the V_CATALOG.SESSION_SUBSCRIPTIONS system table and looking for nodes whose is_collaborating column is TRUE. Subclusters whose node count is less than or equal to the number of shards in the database have only participating nodes. Subclusters that have more nodes than the database's shard count assign the "extra" nodes the role of collaborators. The differences between the two types of nodes are not important when you are executing queries; the two types just relate to how Vertica organizes the nodes to execute ECS-enabled queries.

This example shows how to get the list of nodes that are participating or collaborating in resolving queries for the current session:

=> SELECT node_name, shard_name, is_collaborating, is_participating
        FROM V_CATALOG.SESSION_SUBSCRIPTIONS
        WHERE is_participating = TRUE OR is_collaborating = TRUE
        ORDER BY shard_name, node_name;
      node_name       | shard_name  | is_collaborating | is_participating
----------------------+-------------+------------------+------------------
 v_verticadb_node0004 | replica     | f                | t
 v_verticadb_node0005 | replica     | f                | t
 v_verticadb_node0006 | replica     | t                | f
 v_verticadb_node0007 | replica     | f                | t
 v_verticadb_node0008 | replica     | t                | f
 v_verticadb_node0009 | replica     | t                | f
 v_verticadb_node0007 | segment0001 | f                | t
 v_verticadb_node0008 | segment0001 | t                | f
 v_verticadb_node0005 | segment0002 | f                | t
 v_verticadb_node0009 | segment0002 | t                | f
 v_verticadb_node0004 | segment0003 | f                | t
 v_verticadb_node0006 | segment0003 | t                | f
(12 rows)

You can see that nodes 4, 5, and 7 are participating, and nodes 6, 8, and 9 are collaborating.

You can also see that ECS is enabled by looking at an EXPLAIN plan for a query. At the top of the plan for an ECS-enabled query is the statement "this query involves non-participating nodes." These non-participating nodes are the collaborating nodes that are splitting the data in the shard with the participating nodes. The plan also lists the nodes taking part in the query.

This example shows an explain plan for an ECS-enabled query in a six-node subcluster in a three-shard database:

=> EXPLAIN SELECT sales_quantity, sales_dollar_amount, transaction_type, cc_name
     FROM online_sales.online_sales_fact
     INNER JOIN online_sales.call_center_dimension
     ON (online_sales.online_sales_fact.call_center_key
         = online_sales.call_center_dimension.call_center_key
         AND sale_date_key = 156)
     ORDER BY sales_dollar_amount DESC;

                                      QUERY PLAN
--------------------------------------------------------------------------------
 ------------------------------
 QUERY PLAN DESCRIPTION:
 The execution of this query involves non-participating nodes. Crunch scaling
 strategy preserves data segmentation
 ------------------------------

 EXPLAIN SELECT sales_quantity, sales_dollar_amount, transaction_type, cc_name
 FROM online_sales.online_sales_fact
 INNER JOIN online_sales.call_center_dimension
 ON (online_sales.online_sales_fact.call_center_key
     = online_sales.call_center_dimension.call_center_key
     AND sale_date_key = 156)
 ORDER BY sales_dollar_amount DESC;

 Access Path:
 +-SORT [Cost: 6K, Rows: 754K] (PATH ID: 1)
 |  Order: online_sales_fact.sales_dollar_amount DESC
 |  Execute on: v_verticadb_node0007, v_verticadb_node0004, v_verticadb_node0005,
 |     v_verticadb_node0006, v_verticadb_node0008, v_verticadb_node0009
 | +---> JOIN MERGEJOIN(inputs presorted) [Cost: 530, Rows: 754K (202 RLE)] (PATH ID: 2)
 | |      Join Cond: (online_sales_fact.call_center_key = call_center_dimension.call_center_key)
 | |      Materialize at Output: online_sales_fact.sales_quantity,
 | |            online_sales_fact.sales_dollar_amount, online_sales_fact.transaction_type
 | |      Execute on: v_verticadb_node0007, v_verticadb_node0004,
 | |            v_verticadb_node0005, v_verticadb_node0006, v_verticadb_node0008,
 | |            v_verticadb_node0009
 | | +-- Outer -> STORAGE ACCESS for online_sales_fact [Cost: 13, Rows: 754K (202 RLE)] (PATH ID: 3)
 | | |      Projection: online_sales.online_sales_fact_DBD_18_seg_vmart_b0
 | | |      Materialize: online_sales_fact.call_center_key
 | | |      Filter: (online_sales_fact.sale_date_key = 156)
 | | |      Execute on: v_verticadb_node0007, v_verticadb_node0004,
 | | |          v_verticadb_node0005, v_verticadb_node0006, v_verticadb_node0008,
 | | |          v_verticadb_node0009
 | | |      Runtime Filter: (SIP1(MergeJoin): online_sales_fact.call_center_key)
 | | +-- Inner -> STORAGE ACCESS for call_center_dimension [Cost: 17, Rows: 200] (PATH ID: 4)
 | | |      Projection: online_sales.call_center_dimension_DBD_16_seg_vmart_b0
 | | |      Materialize: call_center_dimension.call_center_key, call_center_dimension.cc_name
 | | |      Execute on: v_verticadb_node0007, v_verticadb_node0004,
                v_verticadb_node0005, v_verticadb_node0006, v_verticadb_node0008,
                v_verticadb_node0009
 . . .

Taking advantage of ECS

To take advantage of ECS, create a secondary subcluster where the number of nodes is a multiple of the number of shards in your database. For example, in a 12-shard database, create a subcluster that contains a multiple of 12 nodes such as 24 or 36. The number of nodes must be a multiple of the number of shards to evenly distribute the data across the nodes in the subcluster. You create a secondary subcluster because secondary subclusters are better at performing queries. See Subclusters for more information.

Once you have created the subcluster, have users connect to it and run their analytic queries. Vertica automatically enables ECS in the subcluster because it has more nodes than there are shards in the database.

How the optimizer assigns data responsibilities to nodes

The optimizer has two strategies to choose from when dividing the data in a shard among its subscribing nodes. One strategy is optimized for queries that use data segmentation. Queries that contain a JOIN or GROUP BY clause rely on data segmentation. The other strategy is for queries that do not need segmentation.

By default, the optimizer automatically chooses the strategy to use. For most queries, the automatically-chosen strategy results in faster query performance. For some queries, you may want to manually override the strategy using hints. In a small number of queries, ECS does not help performance. In these cases, you can disable ECS. See Manually choosing an ECS strategy for details.

6.3 - Manually choosing an ECS strategy

When the number of nodes in a subcluster is greater than the number of database shards, the Vertica query optimizer uses elastic crunch scaling (ECS) to involve all nodes in processing queries.

When the number of nodes in a subcluster is greater than the number of database shards, the Vertica query optimizer uses elastic crunch scaling (ECS) to involve all nodes in processing queries. For each shard, the optimizer divides responsibility for processing shard data among its subscribing nodes, using one of the following strategies:

Strategy Description
I/O-optimized Optimizer divides the list of ROS containers in the shard among the subscribing nodes. Use this strategy when nodes must fetch the data for the query from communal storage, rather than the depot. Nodes only fetch the ROS containers they need to resolve the query from communal storage, reducing the amount of data each needs to transfer from communal storage. Due to the arbitrary division of data among the nodes, this strategy does not support query optimizations that rely on data segmentation.
Compute-optimized Optimizer uses data segmentation to assign portions to each subscribing node. The nodes scan the entire shard, but use sub-segment filtering to find their assigned segments of the data. Use this strategy when most data for the query is in the depot, because nodes must scan the entire contents of the shard. Because this strategy uses data segmentation, it supports optimizations such as local joins that the I/O-optimized strategy cannot.

The optimizer automatically chooses a strategy based on whether the query can take advantage of data segmentation. You can tell which strategy the optimizer chooses for a query by using EXPLAIN. The top of the plan explanation states whether ECS is preserving segmentation. For example, this simple query on a single table does not need to use segmentation, so it uses the I/O-optimized strategy:

=> EXPLAIN SELECT employee_last_name,
            employee_first_name,employee_age
            FROM employee_dimension
            ORDER BY employee_age DESC;

                                 QUERY PLAN
--------------------------------------------------------------------------------
 ------------------------------
 QUERY PLAN DESCRIPTION:
 The execution of this query involves non-participating nodes.
 Crunch scaling strategy does not preserve data segmentation
 ------------------------------
. . .

A more complex query using a JOIN results in ECS preserving data segmentation by using the compute-optimized strategy. The query plan tells you that segmentation is preserved:

=> EXPLAIN SELECT sales_quantity, sales_dollar_amount, transaction_type, cc_name
    FROM online_sales.online_sales_fact
    INNER JOIN online_sales.call_center_dimension
    ON (online_sales.online_sales_fact.call_center_key
        = online_sales.call_center_dimension.call_center_key
        AND sale_date_key = 156)
    ORDER BY sales_dollar_amount DESC;

                                     QUERY PLAN
--------------------------------------------------------------------------------
 ------------------------------
 QUERY PLAN DESCRIPTION:
 The execution of this query involves non-participating nodes.
 Crunch scaling strategy preserves data segmentation
 ------------------------------
 . . .

In most cases, the optimizer chooses the best strategy for splitting the data among the nodes that subscribe to the same shard. However, you might occasionally find that some queries perform poorly. In these cases, you can embed the ECSMODE hint in the query to specify which strategy to use, or even to disable ECS.

Setting the ECS strategy for individual queries

You can use the ECSMODE hint in a query to force the optimizer to use a specific ECS strategy (or disable ECS entirely). The ECSMODE hint takes one of the following arguments:

  • AUTO: The optimizer chooses the strategy to use, useful only if ECS mode is set at the session level (see Setting the ECS Strategy for the Session or Database).

  • IO_OPTIMIZED: Use I/O-optimized strategy.

  • COMPUTE_OPTIMIZED: Use compute-optimized strategy.

  • NONE: Disable use of ECS for this query. Only participating nodes are involved in query execution; collaborating nodes are not.

The following example shows the query plan for a simple single-table query that is forced to use the compute-optimized strategy:

=> EXPLAIN SELECT /*+ECSMode(COMPUTE_OPTIMIZED)*/ employee_last_name,
             employee_first_name,employee_age
             FROM employee_dimension
             ORDER BY employee_age DESC;

                                   QUERY PLAN
--------------------------------------------------------------------------------
 ------------------------------
 QUERY PLAN DESCRIPTION:
 The execution of this query involves non-participating nodes.
 Crunch scaling strategy preserves data segmentation
 ------------------------------
. . .

This example disables ECS in a six-node subcluster in a three-shard database:

=> EXPLAIN SELECT /*+ECSMode(NONE)*/ employee_last_name,
             employee_first_name,employee_age
             FROM employee_dimension
             ORDER BY employee_age DESC;

                                     QUERY PLAN
--------------------------------------------------------------------------------
 ------------------------------
 QUERY PLAN DESCRIPTION:
 ------------------------------

 EXPLAIN SELECT /*+ECSMode(NONE)*/ employee_last_name,
             employee_first_name,employee_age
             FROM employee_dimension
             ORDER BY employee_age DESC;

 Access Path:
 +-SORT [Cost: 243, Rows: 10K] (PATH ID: 1)
 |  Order: employee_dimension.employee_age DESC
 |  Execute on: v_verticadb_node0007, v_verticadb_node0004, v_verticadb_node0005
 | +---> STORAGE ACCESS for employee_dimension [Cost: 71, Rows: 10K] (PATH ID: 2)
 | |      Projection: public.employee_dimension_DBD_8_seg_vmart_b0
 | |      Materialize: employee_dimension.employee_first_name,
 | |      employee_dimension.employee_last_name, employee_dimension.employee_age
 | |      Execute on: v_verticadb_node0007, v_verticadb_node0004,
 | |          v_verticadb_node0005
 . . .

Note that this query plan lacks the "this query involves non-participating nodes" statement, indicating that it does not use ECS. It also lists just three participating nodes. These nodes are marked as participating in the V_CATALOG.SESSION_SUBSCRIPTIONS system table.

Setting the ECS strategy for the session or database

You can use the ECSMode configuration parameter to set the ECS strategy for the current session. This parameter accepts the same values as the ECSMODE hint except NONE, which is valid only for individual queries.

The following example demonstrates using the configuration parameter to force a simple query to use the COMPUTE_OPTIMIZED strategy. It then sets the parameter back to its default value of AUTO:

=> EXPLAIN SELECT employee_first_name,employee_age
    FROM employee_dimension ORDER BY employee_age DESC;

                                 QUERY PLAN
--------------------------------------------------------------------------------
 ------------------------------
 QUERY PLAN DESCRIPTION:
 The execution of this query involves non-participating nodes.
 Crunch scaling strategy does not preserve data segmentation
 ------------------------------
. . .

=> ALTER SESSION SET ECSMode = 'COMPUTE_OPTIMIZED';
ALTER SESSION
=> EXPLAIN SELECT employee_first_name,employee_age
    FROM employee_dimension ORDER BY employee_age DESC;


                                  QUERY PLAN
--------------------------------------------------------------------------------
 ------------------------------
 QUERY PLAN DESCRIPTION:
 The execution of this query involves non-participating nodes.
 Crunch scaling strategy preserves data segmentation
 ------------------------------
 . . .

=> ALTER SESSION SET ECSMode = 'AUTO';
ALTER SESSION

Individual query hints override the session-level settings. This example sets the session default to use COMPUTE_OPTIMIZED, then restores the default behavior for a query by using the ECSMode hint with the value AUTO:

=> ALTER SESSION SET ECSMode = 'COMPUTE_OPTIMIZED';
ALTER SESSION
=> EXPLAIN SELECT /*+ECSMode(AUTO)*/ employee_first_name,employee_age
   FROM employee_dimension ORDER BY employee_age DESC;


                                  QUERY PLAN
-----------------------------------------------------------------------------
 ------------------------------
 QUERY PLAN DESCRIPTION:
 The execution of this query involves non-participating nodes.
 Crunch scaling strategy does not preserve data segmentation
 ------------------------------

Note that setting the ECSMode hint to AUTO lets the optimizer pick the I/O-optimized strategy (which does not preserve segmentation) instead of the compute-optimized strategy set at the session level.

You can also set the ECS strategy at the database level using ALTER DATABASE. However, doing so overrides the Vertica optimizer's settings for all users in all subclusters that use ECS. Before setting the ECS strategy at the database level, verify that the majority of queries run by users of the ECS-enabled subclusters actually benefit from overriding the optimizer's default behavior. If not, use the session- or query-level settings to override the optimizer only for the queries that benefit from a specific strategy.
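
For example, the following statements (a sketch that reuses the values shown above for sessions) set a database-wide default and then restore the optimizer's default behavior:

=> ALTER DATABASE DEFAULT SET ECSMode = 'COMPUTE_OPTIMIZED';
ALTER DATABASE
=> ALTER DATABASE DEFAULT SET ECSMode = 'AUTO';
ALTER DATABASE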

7 - Local caching of storage containers

The Vertica execution engine uses the StorageMerge operator to read data from storage containers in cases where it is important to read container data in its projection-specified sort order.

The Vertica execution engine uses the StorageMerge operator to read data from storage containers in cases where it is important to read container data in its projection-specified sort order. This is particularly useful for operations that must preserve the sort order of the data read from multiple storage containers before merging it into a single storage container. Common operations that enforce sort order include mergeout and queries with ORDER BY clauses, such as CREATE TABLE...AS statements whose query includes an ORDER BY clause.

The execution engine typically allocates multiple threads to the StorageMerge operator. Each thread is assigned a single Scan operator to open and read container contents. If the number of containers to read is greater than the number of available threads, the execution engine is likely to assign individual Scan operators to multiple containers. In this case, Scan operators might need to switch among different containers and reopen them multiple times before all required data is fetched and assembled. Doing so is especially problematic when reading storage containers on remote filesystems such as S3. The extra overhead incurred by reopening and reading remote storage containers can significantly impact performance and usage costs.

You can configure your database so the execution engine caches on local disk the data of S3 storage containers that require multiple opens. The size of temp space allocated per query to the StorageMerge operator for caching is set by configuration parameter StorageMergeMaxTempCacheMB. By default, this configuration parameter is set to -1 (unlimited). If caching requests exceed temp space limits or available disk space, Vertica caches as much container data as it can, and then reads from S3.

To turn off caching, set StorageMergeMaxTempCacheMB to 0.
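
For example, the following statements (a sketch; pick a limit that fits your available temp space) cap the per-query cache at 4 GB and then disable caching entirely:

=> ALTER DATABASE DEFAULT SET StorageMergeMaxTempCacheMB = 4096;
ALTER DATABASE
=> ALTER DATABASE DEFAULT SET StorageMergeMaxTempCacheMB = 0;
ALTER DATABASE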

8 - Managing an Eon Mode database in MC

Vertica Management Console (MC), a database health and activity monitoring tool, provides in-browser wizards you can follow to deploy Vertica cluster instances and create an Eon Mode database on them.

Vertica Management Console (MC), a database health and activity monitoring tool, provides in-browser wizards you can follow to deploy Vertica cluster instances and create an Eon Mode database on them. You can also use MC to manage and monitor resources that are specific to Eon Mode.

9 - Stopping and starting an Eon Mode cluster

When running an Eon Mode database in the cloud, you usually want to stop the nodes running your database when you stop the database.

Stopping your Eon Mode database and cluster using the MC

When running an Eon Mode database in the cloud, you usually want to stop the nodes running your database when you stop the database. Stopping your nodes avoids paying for compute resources that are not needed while the database is down.

The easiest way to stop both your database and the nodes that run it is to use the MC:

  1. From the MC home page, click View Your Infrastructure.

  2. In the row labeled Databases, click the database you want to stop.

  3. In the popup, click Stop.

  4. Click OK to confirm you want to stop the database.

  5. Once your database has stopped, in the row labeled Clusters, click the entry for the cluster running the database you just stopped.

  6. In the popup, click Manage.

  7. In the ribbon at the top of the cluster view, click Stop Cluster.

  8. In the dialog box, check the I would like to stop all instances in the cluster box and click Stop Cluster.

Manually stopping the database and cluster

To manually stop your database and cluster, first stop the database.
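
For example, one way to stop the database from the command line is the admintools stop_db tool (the database name and password here are placeholders):

$ admintools -t stop_db -d database_name -p 'password'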

Once you have stopped the database, you can stop your nodes. If you are in a cloud environment, see your cloud provider's documentation for instructions on stopping nodes.

Starting your cluster and database using the MC

To start your database cluster and database:

  1. From the MC home page, click View Your Infrastructure.

  2. In the Clusters row, click the cluster that runs the database you want to start.

  3. In the pop-up, click Manage.

  4. In the ribbon at the top of the cluster's page, click Start Cluster.

  5. Check the I would like to start all instances in the cluster box and click Start Cluster.

Starting the cluster automatically starts the database.

Manually starting your cluster and database

To manually start your cluster and database:

  1. Start the nodes in your database cluster. If you are running in the cloud, see your cloud provider's documentation on how to start instances.

  2. Connect to one of the nodes in the cluster and use the admintools menus or command line to start your database. See Starting the database for instructions.
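
For example, you might start the database from the command line with the admintools start_db tool (the database name and password here are placeholders):

$ admintools -t start_db -d database_name -p 'password'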

10 - Terminating an Eon Mode database cluster

When you terminate an Eon Mode database's cluster, you free its resources.

When you terminate an Eon Mode database's cluster, you free its resources. In a cloud environment, terminating the cluster deletes the instances that ran the database's nodes. In an on-premises database, terminating the cluster usually means repurposing physical hardware for other uses. See Stopping, starting, terminating, and reviving Eon Mode database clusters for more information.

Terminating an Eon Mode database's cluster does not affect the data it stores. The data remains stored in the communal storage location. As long as you do not delete the communal storage location, you can revive the database onto a new Eon Mode cluster. See Reviving an Eon Mode database cluster for more information.

Terminating an Eon Mode cluster using Management Console

Management Console provides the easiest way to terminate an Eon Mode cluster. You must follow a two-step process: first stop the database, then terminate the cluster:

  1. If you have not yet synchronized the database's catalog, follow the steps in Synchronizing metadata.

  2. From the Management Console home page, click View Your Infrastructure.

  3. In the row labeled Databases, click the database whose cluster you want to terminate.

  4. In the popup, click Stop.

  5. Click OK to confirm you want to stop the database.

  6. After the database stops, in the row labeled Clusters, click the entry for the cluster you want to terminate.

  7. In the popup, click Manage.

  8. In the ribbon at the top of the cluster view, click Advanced and then select Terminate Cluster.

  9. In the dialog box:

    • Check I understand that terminating a cluster will terminate all instances in the cluster

    • Click Terminate Cluster.

Manually terminating an Eon Mode cluster

To manually terminate your Eon Mode cluster:

  1. If you have not yet synchronized the database's catalog, follow the steps in Synchronizing metadata.

  2. Stop the database (for example, with the admintools stop_db tool shown earlier).

  3. Terminate the database node instances. If you are in a cloud environment, see your cloud provider's documentation for instructions on terminating instances. For on-premises database clusters, you can repurpose the systems that were a part of the cluster.

11 - Reviving an Eon Mode database cluster

If you have terminated your Eon Mode database's cluster, but have not deleted the database's communal storage, you can revive your database.

If you have terminated your Eon Mode database's cluster, but have not deleted the database's communal storage, you can revive your database. Reviving the database restores it to its pre-shutdown state. The revival process requires creating a new database cluster and configuring it to use the database's communal storage location. See Stopping, starting, terminating, and reviving Eon Mode database clusters for more information.

You can also use the revive process to restart a database when its nodes do not have persistent local storage. You may choose to configure the instances in your cloud-based Eon Mode cluster with non-persistent local storage to reduce cost. Cloud providers such as AWS and GCP charge less for instances when they are not required to retain data when you shut them down.

You revive a database using either the Management Console or admintools. The MC and admintools offer different revival methods:

  • The MC always revives onto a newly-provisioned cluster that it creates itself. It cannot revive onto an existing cluster. Use the MC to revive a database when you do not have a cluster already provisioned for your database.

  • admintools only revives onto an existing database cluster. You can manually create a cluster to revive your database. See Installing manually.

    You can also revive a database whose hosts use instance storage where data is not persistently stored between shutdowns. In this case, admintools treats the existing database cluster as a new cluster, because the hosts do not contain the database's catalog data.

  • Currently, only admintools lets you revive just the primary subclusters in a database cluster. This option is useful if you want to revive the minimum number of nodes necessary to start your database. See Reviving Only Primary Subclusters below.

    The MC always revives the entire database cluster.

Reviving using the Management Console

You can use a wizard in the Management Console to provision a new cluster and revive a database onto it from a browser. For details, see the instructions for your cloud environment.

Revive using admintools

You can use admintools to revive your Eon Mode database on an existing cluster.

Cluster requirements

This existing cluster must:

  • Have the same version (or later version) of Vertica installed on it. You can repurpose an existing Vertica cluster whose database you have shut down. Another option is to create a cluster from scratch by manually installing Vertica (see Installing manually).

  • Contain a number of hosts in the cluster that is equal to or greater than either:

    • The total number of nodes that the database cluster had when it shut down.

    • The total number of primary nodes the database cluster had when it shut down. When you supply a cluster that matches the number of primary nodes in the database, admintools revives just the primary nodes.

When reviving, you supply admintools with a list of the hosts in the cluster to revive the database onto. The number of hosts in this list must match either the total number of nodes or the number of primary nodes in the database when it shut down. If the number of nodes you supply does not match either of these values, admintools returns an error.

You do not need to use all of the hosts in the cluster to revive the database. You can revive a database onto a subset of the hosts in the cluster. But you must have at least enough hosts to revive all of the primary nodes.

For example, suppose you want to revive a database that had 16 nodes when it was shut down, with four of those nodes being primary nodes. In that case, you can revive:

  • Just the primary nodes onto a cluster that contains at least four nodes.

  • All of the 16 nodes onto a cluster that contains at least 16 nodes.

You may choose to revive your database onto a cluster with more nodes than is necessary when you want to be able to quickly add new nodes. You may also want to revive just the primary nodes in a database onto a larger cluster. In this case, you can use the extra nodes in the cluster to start one or more secondary subclusters.
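
For example, after reviving just the primary nodes, you might use the spare hosts to create a new secondary subcluster with the admintools db_add_subcluster tool (the host IPs, subcluster name, and password here are placeholders):

$ admintools -t db_add_subcluster -d verticadb -p 'password' \
             -c analytics_2 -s 10.0.0.7,10.0.0.8,10.0.0.9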

Required database information

To revive the database, you must know:

  • The name of the database to revive (note that the database name is case sensitive)

  • The version of Vertica that created the database, so you can use the same or later version

  • The total number of all nodes or the number of primary nodes in the database when it shut down

  • The URL and credentials for the database's communal storage location

  • The user name and password of the database administrator

  • The IP addresses of all hosts in the cluster you want to revive onto

If you do not know what version of Vertica created the database or are unsure how many nodes it had, see Getting Database Details From a Communal Storage Location below.

Required database settings

Before starting the revive process, verify the following conditions are true for your Eon Mode environment:

All environments

  • The uppermost directories of the catalog, data, and depot directories exist on all nodes and are owned by the database administrator (dbadmin).

  • The cluster has no other database running on it.

Azure

If your database does not use Azure managed identities to authenticate with the communal storage blob container, the following values must be set:

  • AzureStorageCredentials

  • AzureStorageEndpointConfig

See Azure Blob Storage object store for details.

S3 (AWS and on-premises)

The required S3 communal storage configuration parameters are set.

GCP

The required GCS communal storage configuration parameters are set.

Getting database details from a communal storage location

To revive a database, you must know:

  • The version of Vertica that created it (so you can use the same or a later version)

  • The total number of nodes (when reviving both primary and secondary nodes) or primary nodes (when just reviving the primary nodes) in the database's cluster when it shut down.

If you do not know these details, you can determine them based on the contents of the communal storage location.

If you are not sure which version of Vertica created the database stored in a communal storage location, examine the cluster_config.json file. This file is stored in the communal storage location in the folder named metadata/databasename. For example, suppose you have a database named mydb stored in the communal storage location s3://mybucket/mydb. Then you can download and examine the file s3://mybucket/mydb/metadata/mydb/cluster_config.json.
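
For example, with the AWS CLI you might download the file for inspection like this (assuming your environment has credentials that can read the bucket):

$ aws s3 cp s3://mybucket/mydb/metadata/mydb/cluster_config.json .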

In the cluster_config.json, the Vertica version that created the database is stored with the JSON key named DatabaseVersion near the top of the file:

{
   "CatalogTruncationVersion" : 804,
   "ClusterLeaseExpiration" : "2020-12-21 21:52:31.005936",
   "Database" : {
      "branch" : "",
      "name" : "verticadb"
   },
   "DatabaseVersion" : "v10.1.0",
   "GlobalSettings" : {
      "TupleMoverServices" : -33,
      "appliedUpgrades" : [
 . . .

In this example, you can revive the storage location using Vertica version 10.1.0 or later.

If you do not know how many nodes or primary nodes the cluster had when it shut down, use the --display-only option of the admintools revive_db tool. Adding this option prevents admintools from reviving the database. Instead, it validates the files in the communal storage and reports details about the nodes that made up the database cluster. Parts of this report show the total number of nodes in the cluster and the number of primary nodes:

$ admintools -t revive_db --display-only --communal-storage-location \
             s3://mybucket/verticadb -d verticadb
Attempting to retrieve file: [s3://mybucket/verticadb/metadata/verticadb/cluster_config.json]

Validated 6-node database verticadb defined at communal storage s3://mybucket/verticadb.

Expected layout of database after reviving from communal storage: s3://mybucket/verticadb

== Communal location details: ==
{
 "communal_storage_url": "s3://mybucket/verticadb",
 "num_shards": "3",
 "depot_path": "/vertica/data",
   . . .
]

Number of primary nodes: 3

You can use grep to find just the relevant lines in the report:

$ admintools -t revive_db --display-only --communal-storage-location \
             s3://mybucket/verticadb -d verticadb | grep  'Validated\|primary nodes'
Validated 6-node database verticadb defined at communal storage s3://mybucket/verticadb.
Number of primary nodes: 3

Creating a parameter file

For Eon Mode deployments that are not on AWS, you must create a configuration file to pass the parameters listed in the previous section to admintools. Traditionally, this file is named auth_params.conf, although you can choose any file name you want.

For on-premises Eon Mode databases, this parameter file is the same one you used when initially installing the database. See the installation instructions for the communal storage solution you are using for your database.

For databases running on Microsoft Azure, the parameter file is only necessary if your database does not use managed identities. This file is the same format that you use to manually install an Eon Mode database. See Manually creating an Eon Mode database on Azure for more information.

To revive an Eon Mode database on GCP manually, create a configuration file to hold the GCSAuth parameter and, optionally, the GCSEnableHttps parameter.

You must supply the GCSAuth parameter to enable Vertica to read from the communal storage location stored in GCS. The value for this parameter is the HMAC access key and secret:

GCSAuth = HMAC_access_key:HMAC_secret_key

See Creating an HMAC Key for more information about HMAC keys.

If your Eon Mode database does not use encryption when accessing communal storage on GCS, then disable HTTPS access by adding the following line to auth_params.conf:

GCSEnableHttps = 0
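
Putting these together, a minimal auth_params.conf for GCP might look like the following (the HMAC access key and secret shown are placeholders):

GCSAuth = 0123456789abcdef:aBcDeFgHiJkLmNoPqRsTuVwXyZ0123456789
GCSEnableHttps = 0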

Running the revive_db tool

Use the admintools revive_db tool to revive the database:

  1. Use SSH to access a cluster host as an administrator.

  2. Depending on your environment, run one of the following admintools command lines:

    • AWS:

      $ admintools -t revive_db \
       --communal-storage-location=s3://communal_store_path \
       -s host1,... -d database_name
      
    • For on-premises and other environments, run the command:

      $ admintools -t revive_db -x auth_params.conf \
        --communal-storage-location=storage-schema://communal_store_path \
        -s host1_ip,... -d database_name
      

This example revives a six-node on-premises database:

$ admintools -t revive_db -x auth_params.conf \
   --communal-storage-location=s3://mybucket/mydir \
   -s 172.16.116.27,172.16.116.28,172.16.116.29,172.16.116.30,\
   172.16.116.31,172.16.116.32 -d VMart

The following example demonstrates reviving a three-node database hosted on GCP:

$ admintools -t revive_db -x auth_params.conf \
--communal-storage-location gs://mybucket/verticadb \
-s 10.142.0.35,10.142.0.38,10.142.0.39 -d VerticaDB

Attempting to retrieve file:
   [gs://mybucket/verticadb/metadata/VerticaDB/cluster_config.json]
Validated 3-node database VerticaDB defined at communal storage
  gs://mybucket/verticadb .
Cluster lease has expired.
Preparation succeeded all hosts
Calculated necessary addresses for all nodes.
Starting to bootstrap nodes. Please wait, databases with a large
  catalog may take a while to initialize.
>>Calling bootstrap on node v_verticadb_node0002 (10.142.0.38)
>>Calling bootstrap on node v_verticadb_node0003 (10.142.0.39)
Load Remote Catalog succeeded on all hosts
Database revived successfully.

Reviving only primary subclusters

You can revive just the primary subclusters in an Eon Mode database. Make the list of hosts you pass to the admintools revive_db tool's --hosts (or -s) argument match the number of primary nodes that were in the database when it shut down. For example, if you have a six-node Eon Mode database that had three primary nodes, you can revive just the primary nodes by supplying three hosts in the --hosts argument:

$ admintools -t revive_db --communal-storage-location=s3://verticadb -d verticadb \
             -x auth_params.conf --hosts node01,node02,node03
Attempting to retrieve file: [s3://verticadb/metadata/verticadb/cluster_config.json]
Consider reviving to only primary nodes: communal storage indicates 6 nodes, while
  3 nodes were specified

Validated 3-node database verticadb defined at communal storage s3://verticadb.
Cluster lease has expired.
Preparation succeeded all hosts

Calculated necessary addresses for all nodes.
Starting to bootstrap nodes. Please wait, databases with a large catalog may take a
  while to initialize.
>>Calling bootstrap on node v_verticadb_node0002 (192.168.56.103)
>>Calling bootstrap on node v_verticadb_node0003 (192.168.56.104)
Load Remote Catalog succeeded on all hosts

Database revived successfully.

In a database where you have revived only the primary nodes, the secondary nodes are down. Their IP addresses are set to 0.0.0.0, so they are not part of the database. For example, querying the NODES system table in the database revived in the previous example shows the secondary nodes are all down:

=> SELECT node_name,node_state,node_address,subcluster_name FROM NODES;
      node_name       | node_state |  node_address  |  subcluster_name
----------------------+------------+----------------+--------------------
 v_verticadb_node0001 | UP         | 192.168.56.102 | default_subcluster
 v_verticadb_node0002 | UP         | 192.168.56.103 | default_subcluster
 v_verticadb_node0003 | UP         | 192.168.56.104 | default_subcluster
 v_verticadb_node0004 | DOWN       | 0.0.0.0        | analytics
 v_verticadb_node0005 | DOWN       | 0.0.0.0        | analytics
 v_verticadb_node0006 | DOWN       | 0.0.0.0        | analytics

Because Vertica considers these unrevived nodes to be down, it may not allow you to remove them or remove their subcluster while they are in their unrevived state. The best way to remove the nodes or the secondary subcluster is to revive them first.

Reviving unrevived secondary subclusters

If you revived just the primary subclusters in your database, you can later choose to revive some or all of the secondary subclusters. Your cluster must have hosts that are not already database nodes, which Vertica can use to revive the unrevived nodes. If your cluster does not have enough of these spare hosts, you can add more. See Adding hosts to a cluster.

You revive a secondary subcluster by using the admintools restart_subcluster tool. In the --hosts argument, supply the list of hosts on which to revive the nodes. The number of hosts in this list must match the number of nodes in the subcluster; you must revive all nodes in the subcluster at the same time. If you pass restart_subcluster a list with fewer or more hosts than the number of nodes defined in the subcluster, it returns an error.

The following example demonstrates reviving the secondary subcluster named analytics shown in the previous examples:

$ admintools -t restart_subcluster -d verticadb --hosts node04,node05,node06 \
             -p 'password' -c analytics
Updating hostnames of nodes in subcluster analytics.
    Replicating configuration to all nodes
    Generating new configuration information and reloading spread
Hostnames of nodes in subcluster analytics updated successfully.
*** Restarting subcluster for database verticadb ***
    Restarting host [192.168.56.105] with catalog [v_verticadb_node0004_catalog]
    Restarting host [192.168.56.106] with catalog [v_verticadb_node0005_catalog]
    Restarting host [192.168.56.107] with catalog [v_verticadb_node0006_catalog]
    Issuing multi-node restart
    Starting nodes:
        v_verticadb_node0004 (192.168.56.105)
        v_verticadb_node0005 (192.168.56.106)
        v_verticadb_node0006 (192.168.56.107)
    Starting Vertica on all nodes. Please wait, databases with a large catalog may take a while to initialize.
    Node Status: v_verticadb_node0004: (DOWN) v_verticadb_node0005: (DOWN) v_verticadb_node0006: (DOWN)
    Node Status: v_verticadb_node0004: (DOWN) v_verticadb_node0005: (DOWN) v_verticadb_node0006: (DOWN)
    Node Status: v_verticadb_node0004: (DOWN) v_verticadb_node0005: (DOWN) v_verticadb_node0006: (DOWN)
    Node Status: v_verticadb_node0004: (INITIALIZING) v_verticadb_node0005: (INITIALIZING) v_verticadb_node0006: (INITIALIZING)
    Node Status: v_verticadb_node0004: (UP) v_verticadb_node0005: (UP) v_verticadb_node0006: (UP)
Syncing catalog on verticadb with 2000 attempts.

12 - Synchronizing metadata

An Eon Mode database maintains its catalog, which contains all database metadata, in communal storage.

An Eon Mode database maintains its catalog, which contains all database metadata, in communal storage. Vertica uses this metadata when it revives the database, so it is important that the catalog is always up to date. Vertica automatically synchronizes the catalog at regular intervals, as specified by the configuration parameter CatalogSyncInterval (by default, every five minutes).

In general, it is not necessary to monitor the synchronization process or change it. One exception applies: before shutting down a database that you intend to revive or replicate, it is good practice to verify that the catalog contains all recent changes, and if necessary synchronize it manually.

Verifying catalog status

You can verify the synchronization status of the database catalog in two ways, depending on whether the database is running.

If the database is running, query and compare these two system tables:

  • CATALOG_SYNC_STATE: Shows how recently each node synchronized its catalog to communal storage, and the version that it synchronized.

  • CATALOG_TRUNCATION_STATUS: Shows the latest synchronization status of the database catalog. Catalog synchronization is up to date when the TRUNCATION_CATALOG_VERSION and CURRENT_CATALOG_VERSION columns match, as shown in the queries after this list.
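
For example, you might run queries like the following to compare versions (a sketch; the CATALOG_TRUNCATION_STATUS column names are taken from the description above):

=> SELECT * FROM CATALOG_SYNC_STATE;
=> SELECT truncation_catalog_version, current_catalog_version
     FROM CATALOG_TRUNCATION_STATUS;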

If the database is not currently running, check the following JSON file on communal storage:

/metadata/database-name/cluster_config.json

The catalog truncation version and timestamp in this file indicate when Vertica last synchronized the database catalog.

Manually synchronizing the database catalog

If necessary, call SYNC_CATALOG to synchronize the catalog immediately with all nodes or a specific node:

=> SELECT sync_catalog();
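
To synchronize the catalog from a single node, pass that node's name (the node name here is a placeholder):

=> SELECT sync_catalog('v_verticadb_node0001');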

Customizing synchronization intervals

By default, Vertica checks for catalog changes every five minutes. Occasionally, you might want to change this setting temporarily. For example, you might set it to a high value in order to take a snapshot of the current bucket contents:

=> ALTER DATABASE DEFAULT SET CatalogSyncInterval = 300;
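
When you no longer need the temporary setting, you can restore the default interval by clearing the parameter (a sketch, assuming you set the value at the database level as shown above):

=> ALTER DATABASE DEFAULT CLEAR PARAMETER CatalogSyncInterval;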