Eon Mode
You can operate your Vertica database in Eon Mode instead of in Enterprise Mode. The two modes differ primarily in how they store data.
These different storage methods lead to a number of important differences between the two modes. In Enterprise Mode, each database node stores a portion of the data and performs a portion of the computation. In Eon Mode, computational processes are separated from a communal (shared) storage layer, which enables rapid scaling of computational resources as demand changes.
For more on how these two modes compare, see Architecture.
1 - Create a database in Eon Mode
Create an Eon Mode database in a cloud environment
The easiest way to create an Eon Mode database in the cloud is to use the MC. The MC can create your database and provision the nodes to run the database at the same time. For specific instructions for your cloud environment, see:
On AWS and Azure, you can also create an Eon Mode database using admintools. For specific instructions for your cloud environment, see:
Create an on-premises Eon Mode database
If you have an on-premises install, you can create an Eon Mode database using admintools. See Eon on-premises storage for a list of object stores that Vertica supports for communal storage. The following topics detail installation instructions for each on-premises communal storage option:
1.1 - Create an Eon Mode database on-premises with FlashBlade
You have two options on how to create an Eon Mode database on premises with Pure Storage FlashBlade as your S3-compatible communal storage:
Step 1: create a bucket and credentials on the Pure Storage FlashBlade
To use a Pure Storage FlashBlade appliance as a communal storage location for an Eon Mode database you must have:
-
The IP address of the FlashBlade appliance. You must also have the connection port number if your FlashBlade is not using the standard port 80 or 443 to access the bucket. All of the nodes in your Vertica cluster must be able to access this IP address. Make sure any firewalls between the FlashBlade appliance and the nodes are configured to allow access.
-
The name of the bucket on the FlashBlade to use for communal storage.
-
An access key and secret key for a user account that has read and write access to the bucket.
See the Pure Storage support site for instructions on how to create the bucket and the access keys needed for a communal storage location.
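Before creating the database, you can optionally confirm that the bucket is reachable from a Vertica node. The following sketch uses the AWS CLI, which is not part of Vertica; the IP address, bucket name, and keys are placeholders for your own values.
$ export AWS_ACCESS_KEY_ID=FlashBlade_Access_Key
$ export AWS_SECRET_ACCESS_KEY=FlashBlade_Secret_Key
$ aws s3 ls s3://verticadbbucket --endpoint-url http://10.10.20.30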
Step 2: install Vertica on your cluster
To install Vertica:
-
Ensure your nodes are configured properly by reviewing all of the content in the Before you install Vertica section.
-
Use the install_vertica
script to verify that your nodes are correctly configured and to install the Vertica binaries on all of your nodes. Follow the steps under Install Vertica using the command line to install Vertica.
Note
These installation steps are the same ones you follow to install Vertica in Enterprise Mode. The difference between Eon Mode and Enterprise Mode on-premises databases is how you create the database, not how you install the Vertica software.
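For reference, a minimal install_vertica invocation might look like the following sketch, run from one node. The host names, RPM path, and option list are placeholders; your environment may require additional options.
$ sudo /opt/vertica/sbin/install_vertica --hosts vnode01,vnode02,vnode03 \
    --rpm /tmp/vertica-x.x.x.rpm --dba-user dbadmin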
Step 3: create an authorization file
Before you create your Eon Mode on-premises database, you must create an authorization file that admintools will use to authenticate with the FlashBlade storage.
-
On the Vertica node where you will run admintools to create your database, use a text editor to create a file. You can name this file anything you wish. In these steps, it is named auth_params.conf. The location of this file isn't important, as long as it is readable by the Linux user you use to create the database (usually, dbadmin).
Important
The auth_params.conf
file contains the secret key to access the bucket containing your Eon Mode database's data. This information is sensitive, and can be used to access the raw data in your database. Be sure this file is not readable by unauthorized users. After you have created your database, you can delete this file.
-
Add the following lines to the file:
awsauth = FlashBlade_Access_Key:FlashBlade_Secret_Key
awsendpoint = FlashBladeIp:FlashBladePort
Note
You do not need to supply a port number in the awsendpoint
setting if you are using the default port for the connection between Vertica and the FlashBlade (80 for an unencrypted connection or 443 for an encrypted connection).
-
If you are not using TLS encryption for the connection between Vertica and the FlashBlade, add the following line to the file:
awsenablehttps = 0
-
Save the file and exit the editor.
This example auth_params.conf
file is for an unencrypted connection between the Vertica cluster and a FlashBlade appliance at IP address 10.10.20.30 using the standard port 80.
awsauth = PIWHSNDGSHVRPIQ:339068001+e904816E02E5fe9103f8MQOEAEHFFVPKBAAL
awsendpoint = 10.10.20.30
awsenablehttps = 0
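Because auth_params.conf contains the secret key, consider restricting its permissions so that only the database administrator can read it. For example, assuming the file is owned by dbadmin:
$ chmod 600 auth_params.conf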
Step 4: choose a depot path on all nodes
Choose or create a directory on each node for the depot storage path. The directory you supply for the depot storage path parameter must:
-
Have the same path on all nodes in the cluster (for example, /home/dbadmin/depot).
-
Be readable and writable by the dbadmin user.
-
Have sufficient storage. By default, Vertica uses 60% of the filesystem space containing the directory for depot storage. You can limit the size of the depot by using the --depot-size
argument in the create_db command. See Configuring your Vertica cluster for Eon Mode for guidelines on choosing a size for your depot.
The admintools create_db tool will attempt to create the depot path for you if it doesn't exist.
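If you prefer to create the depot directory yourself on every node before running create_db, a minimal sketch looks like the following. It assumes passwordless SSH as dbadmin and uses the placeholder host names from the example below.
$ for host in vnode01 vnode02 vnode03; do ssh dbadmin@$host 'mkdir -p /home/dbadmin/depot'; done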
Step 5: create the Eon on-premises database
Use the admintools create_db tool to create the database. You must pass this tool the following arguments:
-x: The path to the auth_params.conf file.
--communal-storage-location: The S3 URL for the bucket on the FlashBlade appliance (usually, this is s3://bucketname).
--depot-path: The absolute path to store the depot on the nodes in the cluster.
--shard-count: The number of shards for the database. This is an integer that is usually either a multiple of the number of nodes in your cluster or an even divisor of it. See Planning for Scaling Your Cluster for more information.
-s: A comma-separated list of the nodes in your database.
-d: The name for your database.
Some common optional arguments include:
-l: The absolute path to the Vertica license file to apply to the new database.
-p: The password for the new database.
--depot-size: The maximum size for the depot. Defaults to 60% of the filesystem containing the depot path. You can specify the size either as a percentage of the filesystem's disk space (integer%) or as a fixed amount of disk space in kilobytes, megabytes, gigabytes, or terabytes (integer{K|M|G|T}). However you specify this value, the depot size cannot exceed 80 percent of the disk space of the filesystem where the depot is stored.
To view all arguments for the create_db tool, run the command:
admintools -t create_db --help
The following example demonstrates creating a three-node database named verticadb, specifying the depot will be stored in the home directory of the dbadmin user.
$ admintools -t create_db -x auth_params.conf \
--communal-storage-location=s3://verticadbbucket \
--depot-path=/home/dbadmin/depot --shard-count=6 \
-s vnode01,vnode02,vnode03 -d verticadb -p 'YourPasswordHere'
Step 6: disable streaming limitations
After creating the database, disable the AWSStreamingConnectionPercentage configuration parameter by setting it to 0. This parameter controls the number of connections to the object store that Vertica uses for streaming reads. In a cloud environment, it prevents streaming data from the object store from consuming all of the available file handles, leaving some free for other object store operations. Because on-premises object stores such as FlashBlade and MinIO have low latency, this limitation is unnecessary.
The following example shows how to disable this parameter using ALTER DATABASE...SET PARAMETER:
=> ALTER DATABASE DEFAULT SET PARAMETER AWSStreamingConnectionPercentage = 0;
ALTER DATABASE
Deciding whether to disable the depot
The FlashBlade object store's performance is fast enough that you may consider disabling the depot in your Vertica database. Disabling the depot lets you provision less local storage on your nodes, but it always reduces query performance. The exact impact depends mainly on the types of workloads you run on your database, and can range from a 30% to a 4000% decrease in query performance. Only consider disabling the depot if doing so yields a significant reduction in your nodes' storage requirements. Before disabling the depot on a production database, always run a proof-of-concept test that executes the same workloads as your production database.
To disable the depot, set the UseDepotForReads configuration parameter to 0. The following example demonstrates disabling this parameter using ALTER DATABASE...SET PARAMETER:
=> ALTER DATABASE DEFAULT SET PARAMETER UseDepotForReads = 0;
ALTER DATABASE
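To confirm the current value of either parameter, you can query the CONFIGURATION_PARAMETERS system table, for example:
=> SELECT parameter_name, current_value FROM configuration_parameters WHERE parameter_name IN ('UseDepotForReads', 'AWSStreamingConnectionPercentage');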
1.2 - Create an Eon Mode database on-premises with HDFS
Step 1: satisfy HDFS environment prerequisites
To use HDFS as a communal storage location for an Eon Mode database you must:
-
Run the WebHDFS service.
-
If using Kerberos, create a Kerberos principal for the Vertica (system) user as described in Kerberos authentication, and grant it read and write access to the location in HDFS where you will place your communal storage. Vertica always uses this system principal to access communal storage.
-
If using High Availability Name Node or swebhdfs, distribute the HDFS configuration files to all Vertica nodes as described in Configuring HDFS access. This step is necessary even though you do not use the hdfs scheme for communal storage.
-
If using swebhdfs (wire encryption) instead of webhdfs, configure the HDFS cluster with certificates trusted by the Vertica hosts and set dfs.encrypt.data.transfer in hdfs-site.xml.
-
Vertica has no additional requirements for encryption at rest. Consult the documentation for your Hadoop distribution for information on how to configure encryption at rest for WebHDFS.
Note
Hadoop currently does not support IPv6 network addresses. Your cluster must use IPv4 addresses to access HDFS. If you choose to use IPv6 network addresses for the hosts in your database cluster, make sure they can access IPv4 addresses. One way to enable this access is to assign your Vertica hosts an IPv4 address in addition to an IPv6 address.
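You can optionally verify that the WebHDFS service is reachable from a Vertica node before creating the database. This sketch uses curl against the WebHDFS REST API; the Name Node host and port are placeholders, and a Kerberos-secured cluster also needs options such as --negotiate -u :.
$ curl -i "http://namenode.hadoop.example.com:50070/webhdfs/v1/?op=LISTSTATUS"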
Step 2: install Vertica on your cluster
To install Vertica:
-
Ensure your nodes are configured properly by reviewing all of the content in the Before you install Vertica section.
-
Use the install_vertica
script to verify that your nodes are correctly configured and to install the Vertica binaries on all of your nodes. Follow the steps under Install Vertica using the command line to install Vertica.
Note
These installation steps are the same ones you follow to install Vertica in Enterprise Mode. The difference between Eon Mode and Enterprise Mode on-premises databases is how you create the database, not how you install the Vertica software.
Step 3: create a bootstrapping file
Before you create your Eon Mode on-premises database, you must create a bootstrapping file to specify parameters that are required for database creation. This step applies if you are using Kerberos, High Availability Name Node, or TLS (wire encryption).
-
On the Vertica node where you will run admintools to create your database, use a text editor to create a file. You can name this file anything you wish. In these steps, it is named bootstrap_params.conf. The location of this file isn't important, as long as it is readable by the Linux user you use to create the database (usually, dbadmin).
-
Add the following lines to the file. HadoopConfDir is typically set to /etc/hadoop/conf; KerberosServiceName is usually set to vertica.
HadoopConfDir = config-path
KerberosServiceName = principal-name
KerberosRealm = realm-name
KerberosKeytabFile = keytab-path
If you are not using HA Name Node, for example in a test environment, you can omit HadoopConfDir and use an explicit Name Node host and port when specifying the location of the communal storage.
-
Save the file and exit the editor.
Step 4: choose a depot path on all nodes
Choose or create a directory on each node for the depot storage path. The directory you supply for the depot storage path parameter must:
-
Have the same path on all nodes in the cluster (for example, /home/dbadmin/depot).
-
Be readable and writable by the dbadmin user.
-
Have sufficient storage. By default, Vertica uses 60% of the filesystem space containing the directory for depot storage. You can limit the size of the depot by using the --depot-size
argument in the create_db command. See Configuring your Vertica cluster for Eon Mode for guidelines on choosing a size for your depot.
The admintools create_db tool will attempt to create the depot path for you if it doesn't exist.
Step 5: create the Eon on-premises database
Use the admintools create_db tool to create the database. You must pass this tool the following arguments:
-x: The path to the bootstrap configuration file (bootstrap_params.conf in the examples in this section).
--communal-storage-location: The webhdfs or swebhdfs URL for the HDFS location. You cannot use the hdfs scheme.
--depot-path: The absolute path to store the depot on the nodes in the cluster.
--shard-count: The number of shards for the database. This is an integer that is usually either a multiple of the number of nodes in your cluster or an even divisor of it. See Planning for Scaling Your Cluster for more information.
-s: A comma-separated list of the nodes in your database.
-d: The name for your database.
Some common optional arguments include:
-l: The absolute path to the Vertica license file to apply to the new database.
-p: The password for the new database.
--depot-size: The maximum size for the depot. Defaults to 60% of the filesystem containing the depot path. You can specify the size either as a percentage of the filesystem's disk space (integer%) or as a fixed amount of disk space in kilobytes, megabytes, gigabytes, or terabytes (integer{K|M|G|T}). However you specify this value, the depot size cannot exceed 80 percent of the disk space of the filesystem where the depot is stored.
To view all arguments for the create_db tool, run the command:
admintools -t create_db --help
The following example demonstrates creating a three-node database named verticadb, specifying the depot will be stored in the home directory of the dbadmin user.
$ admintools -t create_db -x bootstrap_params.conf \
--communal-storage-location=webhdfs://mycluster/verticadb \
--depot-path=/home/dbadmin/depot --shard-count=6 \
-s vnode01,vnode02,vnode03 -d verticadb -p 'YourPasswordHere'
If you are not using HA Name Node, for example in a test environment, you can use an explicit Name Node host and port for --communal-storage-location as in the following example.
$ admintools -t create_db -x bootstrap_params.conf \
--communal-storage-location=webhdfs://namenode.hadoop.example.com:50070/verticadb \
--depot-path=/home/dbadmin/depot --shard-count=6 \
-s vnode01,vnode02,vnode03 -d verticadb -p 'YourPasswordHere'
1.3 - Create an Eon Mode database on-premises with MinIO
Step 1: create a bucket and credentials on MinIO
To use MinIO as a communal storage location for an Eon Mode database, you must have:
-
The IP address and port number of the MinIO cluster. MinIO's default port number is 9000. A Vertica database running in Eon Mode defaults to using port 80 for unencrypted connections and port 443 for TLS encrypted connection. All of the nodes in your Vertica cluster must be able to access the MinIO cluster's IP address. Make sure any firewalls between the MinIO cluster and the nodes are configured to allow access.
-
The name of the bucket on the MinIO cluster to use for communal storage.
-
An access key and secret key for a user account that has read and write access to the bucket.
See the MinIO documentation for instructions on how to create the bucket and the access keys needed for a communal storage location.
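If you have not already created the bucket, one way to do it is with the MinIO client (mc). This is a sketch only; the alias, endpoint, bucket name, and keys are placeholders, and older mc releases use mc config host add instead of mc alias set.
$ mc alias set myminio http://10.20.30.40:9000 MinIO_Access_Key MinIO_Secret_Key
$ mc mb myminio/verticadbbucket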
Step 2: install Vertica on your cluster
To install Vertica:
-
Ensure your nodes are configured properly by reviewing all of the content in the Before you install Vertica section.
-
Use the install_vertica
script to verify that your nodes are correctly configured and to install the Vertica binaries on all of your nodes. Follow the steps under Install Vertica using the command line to install Vertica.
Note
These installation steps are the same ones you follow to install Vertica in Enterprise Mode. The difference between Eon Mode and Enterprise Mode on-premises databases is how you create the database, not how you install the Vertica software.
Step 3: create an authorization file
Before you create your Eon Mode on-premises database, you must create an authorization file that admintools will use to authenticate with the MinIO storage cluster.
-
On the Vertica node where you will run admintools to create your database, use a text editor to create a file. You can name this file anything you wish. In these steps, it is named auth_params.conf. The location of this file isn't important, as long as it is readable by the Linux user you use to create the database (usually, dbadmin).
Important
The auth_params.conf
file contains the secret key to access the bucket containing your Eon Mode database's data. This information is sensitive, and can be used to access the raw data in your database. Be sure this file is not readable by unauthorized users. After you have created your database, you can delete this file.
-
Add the following lines to the file:
awsauth = MinIO_Access_Key:MinIO_Secret_Key
awsendpoint = MinIOIp:MinIOPort
Note
You do not need to supply a port number in the awsendpoint
setting if you configured your MinIO cluster to use the default HTTP ports (80 for an unencrypted connection or 443 for an encrypted connection). MinIO uses port 9000 by default.
-
If you are not using TLS encryption for the connection between Vertica and MinIO, add the following line to the file:
awsenablehttps = 0
-
Save the file and exit the editor.
This example auth_params.conf
file is for an unencrypted connection between the Vertica cluster and a MinIO cluster at IP address 10.20.30.40 using port 9000 (which is the default for MinIO).
awsauth = PIWHSNDGSHVRPIQ:339068001+e904816E02E5fe9103f8MQOEAEHFFVPKBAAL
awsendpoint = 10.20.30.40:9000
awsenablehttps = 0
Step 4: choose a depot path on all nodes
Choose or create a directory on each node for the depot storage path. The directory you supply for the depot storage path parameter must:
-
Have the same path on all nodes in the cluster (for example, /home/dbadmin/depot).
-
Be readable and writable by the dbadmin user.
-
Have sufficient storage. By default, Vertica uses 60% of the filesystem space containing the directory for depot storage. You can limit the size of the depot by using the --depot-size
argument in the create_db command. See Configuring your Vertica cluster for Eon Mode for guidelines on choosing a size for your depot.
The admintools create_db tool will attempt to create the depot path for you if it doesn't exist.
Step 5: create the Eon on-premises database
Use the admintools create_db tool to create the database. You must pass this tool the following arguments:
-x: The path to the auth_params.conf file.
--communal-storage-location: The S3 URL for the bucket on the MinIO cluster (usually, this is s3://bucketname).
--depot-path: The absolute path to store the depot on the nodes in the cluster.
--shard-count: The number of shards for the database. This is an integer that is usually either a multiple of the number of nodes in your cluster or an even divisor of it. See Planning for Scaling Your Cluster for more information.
-s: A comma-separated list of the nodes in your database.
-d: The name for your database.
Some common optional arguments include:
-l: The absolute path to the Vertica license file to apply to the new database.
-p: The password for the new database.
--depot-size: The maximum size for the depot. Defaults to 60% of the filesystem containing the depot path. You can specify the size either as a percentage of the filesystem's disk space (integer%) or as a fixed amount of disk space in kilobytes, megabytes, gigabytes, or terabytes (integer{K|M|G|T}). However you specify this value, the depot size cannot exceed 80 percent of the disk space of the filesystem where the depot is stored.
To view all arguments for the create_db tool, run the command:
admintools -t create_db --help
The following example demonstrates creating a three-node database named verticadb, specifying the depot will be stored in the home directory of the dbadmin user.
$ admintools -t create_db -x auth_params.conf \
--communal-storage-location=s3://verticadbbucket \
--depot-path=/home/dbadmin/depot --shard-count=6 \
-s vnode01,vnode02,vnode03 -d verticadb -p 'YourPasswordHere'
Step 6: disable streaming limitations
After creating the database, disable the AWSStreamingConnectionPercentage configuration parameter by setting it to 0. This parameter controls the number of connections to the object store that Vertica uses for streaming reads. In a cloud environment, it prevents streaming data from the object store from consuming all of the available file handles, leaving some free for other object store operations. Because on-premises object stores such as FlashBlade and MinIO have low latency, this limitation is unnecessary.
The following example shows how to disable this parameter using ALTER DATABASE...SET PARAMETER:
=> ALTER DATABASE DEFAULT SET PARAMETER AWSStreamingConnectionPercentage = 0;
ALTER DATABASE
1.4 - Manually create an Eon Mode database on Azure
Once you have met the cluster and storage requirements for using an Eon Mode database on Azure, you are ready to create an Eon Mode database. Use the admintools create_db tool to create your Eon Mode database.
Creating an authentication file
If your database will use a managed identity to authenticate with the Azure storage container, you do not need to supply any additional configuration information to the create_db
tool.
If your database will not use a managed identity, you must supply create_db with authentication information in a configuration file. The file must contain at least the AzureStorageCredentials parameter, which defines one or more account names and keys Vertica will use to access blob storage. It can also contain an AzureStorageEndpointConfig parameter that defines an alternate endpoint to use instead of the default Azure host name. This option is useful if you are creating a test environment using an Azure storage emulator such as Azurite.
Important
Vertica does not officially support Azure storage emulators as a communal storage location.
The following table defines the values that can be set in these two parameters.
AzureStorageCredentials
Collection of JSON objects, each of which specifies connection credentials for one endpoint. This parameter takes precedence over Azure managed identities.
The collection must contain at least one object and may contain more. Each object must specify at least one of accountName or blobEndpoint, and at least one of accountKey or sharedAccessSignature.
- accountName: If not specified, uses the label of blobEndpoint.
- blobEndpoint: Host name with optional port (host:port). If not specified, uses account.blob.core.windows.net.
- accountKey: Access key for the account or endpoint.
- sharedAccessSignature: Access token for finer-grained access control, if being used by the Azure endpoint.
AzureStorageEndpointConfig
Collection of JSON objects, each of which specifies configuration elements for one endpoint. Each object must specify at least one of accountName or blobEndpoint.
- accountName: If not specified, uses the label of blobEndpoint.
- blobEndpoint: Host name with optional port (host:port). If not specified, uses account.blob.core.windows.net.
- protocol: HTTPS (default) or HTTP.
- isMultiAccountEndpoint: true if the endpoint supports multiple accounts, false otherwise (default is false). To use multiple-account access, you must include the account name in the URI. If a URI path contains an account, this value is assumed to be true unless explicitly set to false.
The authentication configuration file is a text file containing the configuration parameter names and their values. The values are in JSON format. The name of this file is not important. The following examples use the file name auth_params.conf.
The following example is a configuration file for a storage account hosted on Azure. The storage account name is mystore, and the key value is a placeholder. In your own configuration file, you must provide the storage account's access key. You can find this value by right-clicking the storage account in the Azure Storage Explorer and selecting Copy Primary Key.
AzureStorageCredentials=[{"accountName": "mystore", "accountKey": "access-key"}]
The following example shows a configuration file that defines an account for a storage container hosted on the local system using the Azurite storage system. The user account and key are the "well-known" account provided by Azurite by default. Because this configuration uses an alternate storage endpoint, it also defines the AzureStorageEndpointConfig parameter. In addition to reiterating the account name and endpoint definition, this example sets the protocol to the non-encrypted HTTP.
Important
This example wraps the contents of the JSON values for clarity. In an actual configuration file, you cannot wrap these values. They must be on a single line.
AzureStorageCredentials=[{"accountName": "devstoreaccount1", "blobEndpoint": "127.0.0.1:10000",
"accountKey":
"Eby8vdM02xNOcqFlqUwJPLlmEtlCDXJ1OUzFT50uSRZ6IFsuFq2UVErCz4I6tq/K1SZFPTOtr/KBHBeksoGMGw=="
}]
AzureStorageEndpointConfig=[{"accountName": "devstoreaccount1",
"blobEndpoint": "127.0.0.1:10000", "protocol": "http"}]
Creating the Eon Mode database
Use the admintools create_db
tool to create your Eon Mode database. The required arguments you pass to this tool are:
--communal-storage-location: The URI for the storage container Vertica will use for communal storage. This URI must use the azb:// schema. See Azure Blob Storage object store for the format of this URI.
-x: The path to the file containing the authentication parameters Vertica needs to access the communal storage location. This argument is only required if your database will use a storage account name and key to authenticate with the storage container. If it is using a managed identity, you do not need to specify this argument.
--depot-path: The absolute path to store the depot on the nodes in the cluster.
--shard-count: The number of shards for the database. This is an integer that is usually either a multiple of the number of nodes in your cluster or an even divisor of it. See Planning for Scaling Your Cluster for more information.
-s: A comma-separated list of the nodes in your database.
-d: The name for your database.
Some other common optional arguments for create_db are:
-l: The absolute path to the Vertica license file to apply to the new database.
-p: The password for the new database.
--depot-size: The maximum size for the depot. Defaults to 60% of the filesystem containing the depot path. You can specify the size either as a percentage of the filesystem's disk space (integer%) or as a fixed amount of disk space in kilobytes, megabytes, gigabytes, or terabytes (integer{K|M|G|T}). However you specify this value, the depot size cannot exceed 80 percent of the disk space of the filesystem where the depot is stored.
To view all arguments for the create_db tool, run the command:
admintools -t create_db --help
The following example demonstrates creating an Eon Mode database with the following settings:
-
Vertica will use a storage account named mystore.
-
The communal data will be stored in a directory named verticadb located in a storage container named db_blobs.
-
The authentication information Vertica needs to access the storage container is in the file named auth_params.conf in the current directory. The contents of this file are shown in the first example under Creating an Authentication File.
-
The hostnames of the nodes in the cluster are node01 through node03.
$ admintools -t create_db \
--communal-storage-location=azb://mystore/db_blobs/verticadb \
-x auth_params.conf -s node01,node02,node03 \
-d verticadb --depot-path /vertica/depot --shard-count 3 \
-p 'mypassword'
2 - Configuring your Vertica cluster for Eon Mode
Running Vertica in Eon Mode decouples the cluster size from the data volume and lets you configure for your compute needs independently from your storage needs. There are a number of factors you must consider when deciding what sorts of instances to use and how large your Eon Mode cluster will be.
Before you begin
Vertica Eon Mode works both in the cloud and on-premises. As a Vertica administrator setting up your production cluster running in Eon Mode, you must make decisions about the virtual or physical hardware you will use to meet your needs. This topic provides guidelines and best practices for selecting server types and cluster sizes for a Vertica database running in Eon Mode. It assumes that you have a basic understanding of the Eon Mode architecture and concepts such as communal storage, depot, namespaces, and shards. If you need to refresh your understanding, see Eon Mode architecture and Namespaces and shards.
Cluster size overview
Because Eon Mode separates data storage from the computing power of your nodes, choosing a cluster size is more complex than for an Enterprise Mode database. Usually, you choose a base number of nodes that will form one or more primary subclusters. These subclusters contain nodes that are always running in your database. You usually use them for workloads such as data loading and executing DDL statements. You rarely alter the size of these subclusters dynamically. As you need additional compute resources to execute queries, you add one or more subclusters (usually secondary subclusters) of nodes to your database.
When choosing your instances and sizing your cluster for Eon Mode, consider the working data size that your database will be dealing with. This is the amount of data that most of your queries operate on. For example, suppose your database stores sales data. If most of the queries running on your database analyze the last month or two of sales to create reports on sales trends, then your working data size is the amount of data you typically load over a few months.
Choosing instances or physical hardware
Depending on the complexity of your workload and expected concurrency, choose physical or virtual hardware that has sufficient CPU and memory. For production clusters Vertica recommends the following minimum configuration for either virtual or physical nodes in an Eon Mode database:
-
16 cores
-
128 GB RAM
-
2 TB of local storage
Note
The above specifications are just a minimum recommendation. You should consider increasing these specifications based on your workloads and the amount of data you are processing.
You must have a minimum of 3 nodes in an Eon Mode database cluster.
For specific recommendations of instances for cloud-based Eon Mode database, see:
Determining local storage requirements
For both virtual and physical hardware, you must decide how much local storage your nodes need. In Eon Mode, the definitive copy of your database's data resides in the communal storage. This storage is provided by either a cloud-based object store such as AWS S3, or by an on-premises object store, such as a Pure Storage FlashBlade appliance.
Even though your database's data is stored in communal storage, your nodes still need some local storage. A node in an Eon Mode database uses local storage for three purposes:
-
Depot storage: To get the fastest response time for frequently executed queries, provision a depot large enough to hold your working data set after data compression. Divide the working data size by the number of nodes you will have in your subcluster to estimate the size of the depot you need for each node in a subcluster. See Choosing the Number of Shards and the Initial Node Count below to get an idea of how many nodes you want in your initial database subcluster. In cases where you expect to dynamically scale your cluster up, estimate the depot size based on the minimum number of nodes you anticipate having in the subcluster.
Also consider how much data you will load at once when sizing your depot. When loading data, Vertica defaults to writing uncommitted ROS files into the depot before uploading the files to communal storage. If the free space in the depot is not sufficient, Vertica evicts files from the depot to make space for new files.
Your data load fails if the amount of data you try to load in a single transaction is larger than the total size of all the depots in the subcluster. To load more data than there is space in the subcluster's combined depots, set UseDepotForWrites to 0. This configuration parameter tells Vertica to load the data directly into communal storage.
-
Data storage: The data storage location holds files that belong to temporary tables and temporary data from sort operators that spill to disk. When loading data into Vertica, the sort operator may spill to disk. Depending on the size of the load, Vertica may perform the sort in multiple merge phases. The amount of data concurrently loaded into Vertica cannot be larger than the sum of temporary storage location sizes across all nodes divided by 2.
-
Catalog storage. The catalog size depends on the number of database objects per shard and the number of shard subscriptions per node.
Vertica recommends a minimum local storage capacity of 2 TB per node, out of which 60% is reserved for the depot and the other 40% is shared between the catalog and data location. If you determine that you need a depot larger than 1.2 TB per node (which is 60% of 2 TB), add more storage than this minimum recommendation. You can calculate the local storage you need per node using this equation:
local storage per node = (compressed working data size ÷ node count) ÷ 0.6
For example, suppose you have a compressed working data size of 24 TB, and you want an initial primary subcluster of 3 nodes. Using these values in the equation results in 13.33 TB per node: (24 TB ÷ 3) ÷ 0.6 = 13.33 TB.
Choosing the initial node and shard counts
Shards are how Vertica divides the responsibility for the data in communal storage among nodes. The number of shards that data is segmented into depends on the shard count of the namespace to which the data belongs. Each node in a subcluster subscribes to at least one shard in communal storage. During queries, data loads, and other database tasks, each node is responsible for the data in the shards it subscribes to. See Namespaces and shards for more information.
The relation between shards and nodes means that when selecting the number of shards for a namespace, you must consider the number of nodes you will have in your database.
You set the initial shard count of the default_namespace
when you create your database. The shard count of non-default namespaces is set with the CREATE NAMESPACE statement. The initial node count is the number of nodes you will have in your core primary subcluster. The number of shards in a namespace should always be a multiple or divisor of the node count. This ensures that the shards are evenly divided between the nodes. For example, in a six-shard namespace, you should have subclusters that contain two, three, six, or a multiple of six nodes. If the number of shards is not a divisor or multiple of the node count, the shard subscriptions are not spread evenly across the nodes. This leads to some nodes being more heavily loaded than others.
Note
In a database where the number of nodes is a multiple of the number of shards in a namespace, the subcluster uses Elastic Crunch Scaling (ECS) to evenly divide shard coverage among two or more nodes. For more information, see
Using elastic crunch scaling to improve query performance.
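After you create the database, you can check how evenly the shard subscriptions are distributed by querying the NODE_SUBSCRIPTIONS system table. A sketch of such a query:
=> SELECT node_name, COUNT(shard_name) AS shard_subscriptions FROM node_subscriptions GROUP BY node_name ORDER BY node_name;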
The following table shows the recommended initial node count based on the working data size:
Cluster Type    Working Data Size    Initial Node Count
Small           Up to 24 TB          3
Medium          Up to 48 TB          6
Large           Up to 96 TB          12
Extra large     Up to 192 TB         24
Scaling your cluster
There are two strategies you can use when adding nodes to your database. Each of these strategies lets you improve different types of database performance:
-
To increase the performance of complex, long-running queries, add nodes to an existing subcluster. These additional nodes improve the overall performance of these complex queries by splitting the load across more compute nodes. You can add more nodes to a subcluster than you have shards in a namespace. In this case, nodes in the subcluster that subscribe to the same shard will split up the data in the shard when performing a query. See Elasticity for more information.
-
To increase the throughput of multiple short-term queries (often called "dashboard queries"), improve your cluster's parallelism by adding additional subclusters. Subclusters work independently and in parallel on these shorter queries. See Subclusters for more information.
Complex analytic queries perform better on subclusters with more nodes, which means that 6 nodes with 6 shards perform better than 3 nodes with 6 shards. Having more nodes than shards can increase performance further, but the performance gain is not linear. For example, a subcluster containing 12 nodes in a 6-shard namespace is not as efficient as a 12-node subcluster in a 12-shard namespace. Dashboard-type queries operating on smaller data sets may not see much difference between a 3-node subcluster and a 6-node subcluster in a 6-shard namespace.
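When you decide to scale out, one way to add a secondary subcluster is with the admintools db_add_subcluster tool. This is a sketch; the database name, subcluster name, and host names are placeholders.
$ admintools -t db_add_subcluster -d verticadb -c dashboard_subcluster \
    -s vnode04,vnode05,vnode06 -p 'YourPasswordHere'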
Use cases
Let’s look at some use cases to learn how to size your Eon Mode cluster to meet your own particular requirements.
Use case 1: save compute by provisioning when needed, rather than for peak times
This use case highlights increasing query throughput in Eon Mode by scaling a cluster from 6 to 18 nodes with 3 subclusters of 6 nodes each. In this use case, you need to support a highly concurrent, short-query workload on a working data set of 24 TB or less. You create an initial database with 6 nodes and a default_namespace of 6 shards. You scale your database for concurrent throughput on demand by adding one or more subclusters during certain days of the week or for specific date ranges when you are expecting a peak load. You can then shut down or terminate the additional subclusters when your database experiences lower demand. With Vertica in Eon Mode, you save money by provisioning on demand rather than provisioning for the peak times.
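For example, you might shut down a secondary subcluster after the peak period and restart it before the next one. A sketch using admintools, with placeholder database and subcluster names:
$ admintools -t stop_subcluster -d verticadb -c dashboard_subcluster -p 'YourPasswordHere'
$ admintools -t restart_subcluster -d verticadb -c dashboard_subcluster -p 'YourPasswordHere'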
Use case 2: complex analytic workload requires more compute nodes
This use case showcases the idea that complex analytic workloads on large working data sets benefit from high shard count and node count. You create an initial subcluster with 24 nodes and default_namespace
of 24 shards. As needed, you can add an additional 24 nodes to your initial subcluster. These additional nodes enable the subcluster to use elastic crunch scaling to reduce the time it takes to complete complex analytic queries.
If you use multiple namespaces, you can separate the large working data set and any smaller data sets into different namespaces. For details, see Managing namespaces.
Use case 3: workload isolation
This use case showcases the idea of having separate subclusters to isolate ETL and report workloads. You create an initial primary subcluster with 6 nodes and a default_namespace
of 6 shards for servicing ETL workloads. Then add another 6-node secondary subcluster for executing query workloads. To separate the two workloads, you can configure a network load balancer or create connection load balancing policies in Vertica to direct clients to the correct subcluster based on the type of workloads they need to execute.
3 - Migrating an enterprise database to Eon Mode
The MIGRATE_ENTERPRISE_TO_EON function migrates an Enterprise database to Eon Mode. The migration process includes the following stages:
-
Check migration prerequisites
-
Verify compliance
-
Execute the migration
-
Check migration results
-
Activate the Eon database
Tip
When planning how to provision the target Eon database, consider that MIGRATE_ENTERPRISE_TO_EON creates the same number of nodes as the original Enterprise database, and an equal number of segmented shards. Choose the appropriate instance type accordingly.
Migration prerequisites
The following conditions must be true; otherwise, MIGRATE_ENTERPRISE_TO_EON returns with an error:
-
The source Enterprise database version must be ≥10.0.
-
All nodes in the source database must be in an UP state and of type PERMANENT or EPHEMERAL. Verify by querying the NODES system table:
=> SELECT node_name, node_type, node_state FROM nodes;
node_name | node_type | node_state
------------------+-----------+------------
v_vmart_node0001 | PERMANENT | UP
v_vmart_node0002 | PERMANENT | UP
v_vmart_node0003 | PERMANENT | UP
(3 rows)
-
The source database must be configured as an elastic cluster. By default, any database created since Vertica release 9.2.1 is configured as an elastic cluster. To verify whether an Enterprise database is configured as an elastic cluster, query the ELASTIC_CLUSTER system table:
=> SELECT is_enabled FROM elastic_cluster;
is_enabled
------------
t
(1 row)
If the query returns false, call the ENABLE_ELASTIC_CLUSTER function on the Enterprise database.
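For example:
=> SELECT ENABLE_ELASTIC_CLUSTER();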
-
The source Enterprise database must configure Eon parameters as required by the target Eon object store (see Configuration Requirements below).
-
The database must not contain projections that are unsupported by Eon.
Unsupported projections
Eon databases do not support four types of projections, as described below. If MIGRATE_ENTERPRISE_TO_EON finds any of these projection types in the Enterprise database, it rolls back the migration and reports the offending projections or their anchor tables in the migration error log. For example:
The following projections are inconsistent with cluster segmentation. Rebalance them with REBALANCE_CLUSTER() or REBALANCE_TABLE():
Projection(Anchor Table): public.incon1_p1_b0(public.incon1)
Inconsistent with cluster segmentation
For example, nodes were added to the cluster, so the current distribution of projection data is inconsistent with the new cluster segmentation requirements.
Resolution: Rebalance the cluster or table. The error log file lists the names of all tables with problematic projections. You can use these names as arguments to the meta-function REBALANCE_TABLE, or rebalance all projections by calling REBALANCE_CLUSTER.
Does not support elastic segmentation
The projection was created with the NODES option, or in a database where elastic segmentation was disabled.
Resolution: Drop the projection and recreate it with ALL NODES.
Defined with a GROUPED clause
The GROUPED clause consolidates multiple columns in a single ROS container.
Resolution: Drop the projection and recreate it without the GROUPED clause.
Data stored in unbundled storage containers
Found only in Vertica databases that were created before storage container bundling was introduced in version 7.2.
Resolution: Bundle storage containers in the database with the meta-function COMPACT_STORAGE. The error log names all tables with projections that store data in unbundled storage containers. You can use these names as arguments to COMPACT_STORAGE.
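For example, using the anchor table named in the sample error message above:
=> SELECT REBALANCE_TABLE('public.incon1');
To rebalance all projections instead:
=> SELECT REBALANCE_CLUSTER();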
Configuration requirements
Before migration, you must set certain configuration parameters in the source database. The specific parameters depend on the environment of the Eon database.
Important
All parameters must be set at the database level.
S3: AWS, Pure Storage, MinIO
The following requirements apply to all supported cloud and non-cloud (on-premises) S3 environments: AWS, Pure Storage, and MinIO. One exception applies: migration from an Enterprise Mode database on AWS.
Important
If migrating to on-premises communal storage with Pure Storage and MinIO, set AWSEnableHttps to be compatible with the database TLS encryption setup: AWSEnableHttps=1 if using TLS, otherwise 0. If settings are incompatible, the migration returns with an error.
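The exact set of parameters depends on your environment. As a sketch, an on-premises S3 source (Pure Storage or MinIO) without TLS might set values like the following at the database level before migration; the endpoint and keys are placeholders.
=> ALTER DATABASE DEFAULT SET PARAMETER AWSEndpoint = '10.10.20.30:80';
=> ALTER DATABASE DEFAULT SET PARAMETER AWSAuth = 'access-key:secret-key';
=> ALTER DATABASE DEFAULT SET PARAMETER AWSEnableHttps = 0;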
Azure
-
You must use an azb://
schema URI to set the Azure Blob Storage location for communal data storage. See Azure Blob Storage object store for the format of this URI.
-
Select one of the following authentication methods to grant Vertica access to the storage location:
-
Configure managed identities to grant your Azure VMs access to the storage location. This option does not require any configuration within Vertica.
-
Set the AzureStorageCredentials and AzureStorageEndpointConfig configuration parameters at the database level to have Vertica authenticate with the storage location.
See Azure Blob Storage object store for more about the two authentication methods.
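For example, if you are not using a managed identity, you might set the credentials at the database level as in the following sketch, which reuses the placeholder account from the earlier examples.
=> ALTER DATABASE DEFAULT SET PARAMETER AzureStorageCredentials = '[{"accountName": "mystore", "accountKey": "access-key"}]';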
GCP
HDFS
Tip
If using Kerberos authentication, record settings for the following Kerberos configuration parameters. You will need to apply these settings to the migrated Eon database:
-
KerberosServiceName
-
KerberosRealm
-
KerberosKeytabFile
Important
The following restrictions apply to Kerberos authentication:
Compliance verification
Before running migration, check whether the Enterprise source database complies with all migration requirements. You do so by setting the last Boolean argument of MIGRATE_ENTERPRISE_TO_EON to true to indicate that this is a dry run and not an actual migration:
=> SELECT migrate_enterprise_to_eon('s3://dbbucket', '/vertica/depot', true);
If the function encounters any compliance issues, it writes these to the migration error log, migrate_enterprise_to_eon_error.log
, in the database directory.
Migration execution
MIGRATE_ENTERPRISE_TO_EON migrates an Enterprise database to an Eon Mode database. For example:
=> SELECT migrate_enterprise_to_eon('s3://dbbucket', '/vertica/depot', false);
If the last argument is omitted or false, the function executes the migration. MIGRATE_ENTERPRISE_TO_EON runs in the foreground, and until it returns—either with success or an error—it blocks all operations in the same session on the source Enterprise database. If successful, MIGRATE_ENTERPRISE_TO_EON returns with a list of nodes in the migrated database. You can then proceed to revive the migrated Eon database.
Handling interrupted migration
If migration is interrupted before the function returns—for example, the client disconnects, or a network outage occurs—the migration errors out. In this case, call MIGRATE_ENTERPRISE_TO_EON to restart migration.
Communal storage of the target database retains data that was already copied before the error occurred. When you call MIGRATE_ENTERPRISE_TO_EON to resume migration, the function first checks the data on communal storage and only copies unprocessed data from the source database.
Important
When migrating to an Eon database with HDFS communal storage, migration interruptions can leave files in an incomplete state that is visible to users. When you call MIGRATE_ENTERPRISE_TO_EON to resume migration, the function first compares source and target file sizes. If it finds inconsistent sizes for a given file, it truncates the target cluster file and repeats the entire transfer.
Repeating migration
You can repeat migration multiple times to the same communal storage location. This can be useful for backfilling changes that occurred in the source database during the previous migration.
The following constraints apply:
-
You can migrate from only one database to the same communal storage location.
-
After reviving the newly migrated Eon database, you cannot migrate again to its communal storage, unless you first drop the database and then clean up storage.
Monitoring migration
The DATABASE_MIGRATION_STATUS system table displays the progress of a migration in real time, and also stores data from previous migrations. The following example shows data of a migration that is in progress:
=> SELECT node_name, phase, status, bytes_to_transfer, bytes_transferred, communal_storage_location FROM database_migration_status ORDER BY node_name, start_time;
node_name | phase | status | bytes_to_transfer | bytes_transferred | communal_storage_location
------------------+--------------------+-----------+-------------------+------------------+---------------------------
v_vmart_node0001 | Catalog Conversion | COMPLETED | 0 | 0 | s3://verticadbbucket/
v_vmart_node0001 | Data Transfer | COMPLETED | 1134 | 1134 | s3://verticadbbucket/
v_vmart_node0001 | Catalog Transfer | COMPLETED | 3765 | 3765 | s3://verticadbbucket/
v_vmart_node0002 | Catalog Conversion | COMPLETED | 0 | 0 | s3://verticadbbucket/
v_vmart_node0002 | Data Transfer | COMPLETED | 1140 | 1140 | s3://verticadbbucket/
v_vmart_node0002 | Catalog Transfer | COMPLETED | 3766 | 3766 | s3://verticadbbucket/
v_vmart_node0003 | Catalog Conversion | COMPLETED | 0 | 0 | s3://verticadbbucket/
v_vmart_node0003 | Data Transfer | RUNNING | 5272616 | 183955 | s3://verticadbbucket/
Error logging
MIGRATE_ENTERPRISE_TO_EON logs migration-related warnings, errors, and hints in migrate_enterprise_to_eon_error.log
in the database directory. During execution, the function also prints messages to standard output, together with the error log's pathname.
Conversion results
Visible objects in the source database are handled as follows:
-
Global catalog objects: synced to communal storage.
-
Multiple segmented projections in identical buddy projection group: one projection in the group is migrated.
-
Unsegmented projection replicated on only one node: distributed across all nodes.
-
Number of nodes: same number of nodes, and an equal number of segmented shards. You might want to alter the number of shards to better align with the number of nodes in your subclusters. For details, see RESHARD_DATABASE.
-
USER and TEMP storage locations: migrated. Consider evaluating all migrated storage locations for their relevance in an Eon Mode database. For details, see S3 Storage of Temporary Data.
-
DATA and TEMP,DATA storage locations: not migrated. New default DATA and TEMP,DATA locations are on the same path as the depot.
-
Fault groups and storage policies: not migrated.
-
External procedures: not migrated.
-
Catalog objects related to network settings (load balance groups, network addresses, routing rules, subnets, etc.): not migrated.
The depot location is specified in MIGRATE_ENTERPRISE_TO_EON. Default depot size is set to 80% of local file system after revive.
Eon database activation
HDFS prerequisites
If migrating to an Eon database with HDFS communal storage, create a bootstrapping file to use when you revive the new Eon database. The bootstrapping file must be on the same node where the revive operation is launched, and readable by the user who launches the revive operation.
A bootstrapping file is required only if the new Eon database uses one or both of the following:
-
High Availability (HA) NameNodes: Set HadoopConfDir
to the location of the hdfs-site.xml
configuration file—typically, /etc/hadoop/conf
. This file defines the hdfs.nameservices
parameter and individual NameNodes, and must be distributed across all cluster nodes. For details, see Configuring HDFS access.
-
Kerberos authentication: Set the following Kerberos configuration parameters:
-
KerberosServiceName
-
KerberosRealm
-
KerberosKeytabFile
For example, the bootstrapping file for an Eon database with HA and Kerberos authentication must have the following settings:
HadoopConfDir = config-path
KerberosServiceName = principal-name
KerberosRealm = realm-name
KerberosKeytabFile = keytab-path
All migrations
After migration is complete and the Eon database is ready for use, perform these steps:
-
Revive the database using one of the following methods:
- From communal storage on S3 or GCP with Management Console.
- From communal storage on Azure, S3, GCP, or HDFS with admintools.
In the following example, the admintools revive_db
command revives a three-node database that uses S3 communal storage:
admintools -t revive_db \
-x auth_params.conf \
--communal-storage-location=s3://verticadbbucket \
-d VMart \
-s 172.16.116.27,172.16.116.28,172.16.116.29 \
--force
In the next example, revive_db
revives a three-node database that uses HDFS communal storage:
admintools -t revive_db \
-x bootstrap_params.conf \
--communal-storage-location=webhdfs://mycluster/verticadb \
-d verticadb \
-s vnode01,vnode02,vnode03
-
Check the controlmode setting in /opt/vertica/config/admintools.conf. This setting must be compatible with the network messaging requirements of your Eon implementation. For example, S3/AWS (Amazon Cloud) relies on unicast messaging, which is compatible with a controlmode setting of point-to-point (pt2pt). If the source database controlmode setting was broadcast and you migrate to S3/AWS communal storage, you must change controlmode with admintools:
$ admintools -t re_ip -d dbname -T
Important
If controlmode
is set incorrectly, attempts to start the migrated Eon database will fail.
-
Start the Eon Mode database.
-
Call CLEAN_COMMUNAL_STORAGE to remove unneeded data files that might be left over from the migration.
-
If migrating to S3 on-premises communal storage—Pure Storage or MinIO—set the AWSStreamingConnectionPercentage configuration parameter to 0 with ALTER DATABASE...SET PARAMETER.
-
Review the depot storage location size and adjust as needed.
-
Consider re-sharding the Eon Mode database if the number of shards is not optimal. See Choosing the Number of Shards and the Initial Node Count for more information. If needed, use RESHARD_DATABASE to change the number of shards.
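A minimal sketch of the SQL for the cleanup, streaming-parameter, and re-sharding steps above; the shard count of 12 is a placeholder for the value appropriate to your cluster.
=> SELECT CLEAN_COMMUNAL_STORAGE('true');
=> ALTER DATABASE DEFAULT SET PARAMETER AWSStreamingConnectionPercentage = 0;
=> SELECT RESHARD_DATABASE(12);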
4 - Managing namespaces
Eon Mode databases group all schemas and tables in the database into one or more namespaces. Namespaces represent the top level in the Vertica object hierarchy, where every schema—and its constituent tables—belong to a namespace. Each namespace has a distinct shard count that defines the segmentation of its member objects. By default, the database contains a single namespace, default_namespace
, which is created on database creation with the shard count specified during setup.
You can create additional namespaces with the CREATE NAMESPACE statement. When you create a table or schema, you can specify the namespace under which to create the object. If no namespace is specified, the table or schema is created under the default_namespace
. If you have not created any objects under non-default namespaces, you can ignore namespaces when operating your database because Vertica correctly assumes the namespace of all tables and schemas to be default_namespace
.
If you upgrade a database from a Vertica version earlier than 24.1 to version 24.1 or later, Vertica creates the default_namespace
and assigns to it all schemas and tables in your database.
Create namespaces
To create a namespace, use CREATE NAMESPACE, specifying its name and shard count:
=> CREATE NAMESPACE airport SHARD COUNT 12;
CREATE NAMESPACE
If no shard count is provided, the namespace is created with the shard count of the default_namespace
. You can view all namespaces in your database by querying the NAMESPACES system table:
=> SELECT namespace_name, is_default, default_shard_count FROM NAMESPACES;
namespace_name | is_default | default_shard_count
-------------------+------------+---------------------
default_namespace | t | 6
airport | f | 12
(2 rows)
To drop namespaces, use DROP NAMESPACE.
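For example, a minimal sketch that drops the airport namespace created above, before any objects have been added to it:
=> DROP NAMESPACE airport;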
Create schemas and tables under namespaces
When creating schemas and tables, you can specify the namespace in which the object is created. The CREATE TABLE and CREATE SCHEMA statements accept a 3-part naming format, where the top level name identifies the namespace. For example, n.s.t
refers to table t
in schema s
in namespace n
. If no namespace is specified, the schema or table is created in the default_namespace
.
Schema names must be unique in the same namespace, but there can be schemas of the same name in different namespaces:
=> CREATE SCHEMA airline;
CREATE SCHEMA
=> CREATE SCHEMA airport.airline;
CREATE SCHEMA
=> SELECT schema_name, schema_namespace_name FROM SCHEMATA WHERE schema_name = 'airline' OR schema_name = 'airport.airline';
schema_name | schema_namespace_name
-----------------+-----------------------
airline | default_namespace
airport.airline | airport
(2 rows)
All tables created in a given schema exist in the same namespace as that schema. You can use the ALTER TABLE statement to change the schema of a table to another schema in the same namespace, but not to a schema in a different namespace (see the sketch after the following example).
Create a flights
table in the airline
schema under both the airport
and default namespaces:
=> CREATE TABLE airport.airline.flights(
flight_number INT,
leaving_from VARCHAR,
arriving_at VARCHAR,
expected_departure DATE,
gate VARCHAR
);
CREATE TABLE
=> CREATE TABLE airline.flights(
flight_number INT,
leaving_from VARCHAR,
arriving_at VARCHAR,
expected_departure DATE,
gate VARCHAR
);
CREATE TABLE
=> SELECT table_name, table_schema, table_namespace_name FROM TABLES WHERE table_name = 'flights';
table_name | table_schema | table_namespace_name
------------+--------------+----------------------
flights | airline | default_namespace
flights | airline | airport
(2 rows)
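As noted above, ALTER TABLE can move a table to another schema only within the same namespace. The following is a minimal sketch that assumes a hypothetical operations schema already exists in the airport namespace; depending on your Vertica version, the target schema may need to be qualified with its namespace:
=> ALTER TABLE airport.airline.flights SET SCHEMA operations;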
Backing up, restoring, and replicating namespaces
The vbr
backup, restore, and replicate tasks support namespaces. Full-database backup and restore operations work as expected and require no extra configurations for namespaces. However, object-level vbr
tasks have the following requirements for databases using multiple namespaces:
- Specify the namespace of objects when identifying them in either the
vbr
configuration file or command line arguments. If you do not specify an object's namespace, it is assumed to be default_namespace
.
- For
vbr
tasks, namespaces are prefixed with a period. For example, .n.s.t
refers to table t
in schema s
in namespace n
, as shown in the configuration sketch below.
- Object-level restore and replicate
vbr
tasks let you specify a target namespace with the --target-namespace
argument. This is the namespace to which the objects are restored or replicated. vbr
behaves differently depending on whether the target namespace exists:
- Exists:
vbr
attempts to restore or replicate the objects to the existing namespace, which must have the same shard count, shard boundaries, and node subscriptions as the source namespace. If these conditions are not met, the vbr
task fails.
- Nonexistent:
vbr
creates a namespace in the target database with the name specified in --target-namespace
and the shard count of the source namespace, and then replicates or restores the objects to that namespace.
If no target namespace is specified, vbr
attempts to restore or replicate objects to a namespace with the same name as the source namespace.
For details, see Eon Mode database requirements.
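For example, a minimal sketch of how a namespace-prefixed object might appear in the [Misc] section of a vbr configuration file; the table name is illustrative, and all other required vbr settings are omitted:
[Misc]
snapshotName = flights_backup
objects = .airport.airline.flights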
Server-based replication also supports namespaces. The source and target namespaces of the replicated tables must have the same shard count and segmentation. The REPLICATE command accepts an optional target namespace parameter that specifies the namespace to which the tables are replicated in the target cluster. If you do not specify a target namespace, objects are replicated to a namespace with the same name as the source namespace. If no such namespace exists in the target cluster, it is created with the same name and shard count as the source namespace. For details, see REPLICATE.
Naming conventions
Database objects in Vertica can be uniquely identified by a 3-part name. The valid components of an object's 3-part name in an Eon Mode database depend on whether the object is in the default namespace:
- Default namespace: 3-part names can take the form
database
.schema
.object
or namespace
.schema
.object
. If the first part of a 3-part name is a valid database name, Vertica assumes the object is in default_namespace
.
- Non-default namespace: 3-part names must be in the form
namespace
.schema
.object
.
4-part names—database
.namespace
.schema
.object
—are not supported.
To run federated queries across multiple databases, some third-party tools require that 3-part names be in the form database
.schema
.object
. If any objects in the database are in a non-default namespace, these federated queries will fail.
Namespace restrictions
Namespaces currently have the following restrictions:
- You can only use the RESHARD_DATABASE function to reshard the
default_namespace
. If your database has any non-default namespaces, running the function results in an error.
- You can only create, restore, and replicate flex tables to the
default_namespace
.
- Some third-party libraries do not support namespaces. If your workload relies on one of these libraries, Vertica recommends that you do not use non-default namespaces until the third-party library adds namespace support.
5 - Managing subclusters
Subclusters help you organize the nodes in your clusters to isolate workloads and make elastic scaling easier.
Subclusters help you organize the nodes in your clusters to isolate workloads and make elastic scaling easier. See Subclusters for an overview of how subclusters can help you.
See also
5.1 - Creating subclusters
By default, new Eon Mode databases contain a single primary subcluster named default_subcluster.
By default, new Eon Mode databases contain a single primary subcluster named default_subcluster
. This subcluster contains all nodes that are part of the database when you create it. You will often want to create subclusters to separate and manage workloads. You have three options to add subclusters to the database.
To create a subcluster, use the admintools db_add_subcluster
tool:
$ admintools -t db_add_subcluster --help
Usage: db_add_subcluster [options]
Options:
-h, --help show this help message and exit
-d DB, --database=DB Name of database to be modified
-s HOSTS, --hosts=HOSTS
Comma separated list of hosts to add to the subcluster
-p DBPASSWORD, --password=DBPASSWORD
Database password in single quotes
-c SCNAME, --subcluster=SCNAME
Name of the new subcluster for the new node
--is-primary Create primary subcluster
--is-secondary Create secondary subcluster
--control-set-size=CONTROLSETSIZE
Set the number of nodes that will run spread within
the subcluster
--like=CLONESUBCLUSTER
Name of an existing subcluster from which to clone
properties for the new subcluster
--timeout=NONINTERACTIVE_TIMEOUT
set a timeout (in seconds) to wait for actions to
complete ('never') will wait forever (implicitly sets
-i)
-i, --noprompts do not stop and wait for user input(default false).
Setting this implies a timeout of 20 min.
The simplest command adds an empty subcluster. It requires the database name, password, and name for the new subcluster. This example adds a subcluster analytics_cluster
to the verticadb
database:
$ adminTools -t db_add_subcluster -d verticadb -p 'password' -c analytics_cluster
Creating new subcluster 'analytics_cluster'
Subcluster added to verticadb successfully.
By default, admintools creates the subcluster as a secondary subcluster. You can have it create a primary subcluster instead by supplying the --is-primary
argument.
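For example, the following sketch creates a primary subcluster; the subcluster name is arbitrary:
$ admintools -t db_add_subcluster -d verticadb -p 'password' -c analytics_primary --is-primary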
Adding nodes while creating a subcluster
You can also specify one or more hosts for admintools to add to the subcluster as new nodes. These hosts must be part of the cluster but not already part of the database. For example, you can use hosts that you added to the cluster using the MC or admintools, or hosts that remain part of the cluster after you dropped nodes from the database. This example creates a subcluster analytics_cluster
and uses the -s
option to specify the available hosts in the cluster:
$ adminTools -t db_add_subcluster -c analytics_cluster -d verticadb -p 'password' -s 10.0.33.77,10.0.33.181,10.0.33.85
View the subscription status of all nodes in the database with the following query that joins the V_CATALOG.NODES and V_CATALOG.NODE_SUBSCRIPTIONS system tables:
=> SELECT subcluster_name, n.node_name, shard_name, subscription_state FROM
v_catalog.nodes n LEFT JOIN v_catalog.node_subscriptions ns ON (n.node_name
= ns.node_name) ORDER BY 1,2,3;
subcluster_name | node_name | shard_name | subscription_state
-------------------+----------------------+-------------+--------------------
analytics_cluster | v_verticadb_node0004 | replica | ACTIVE
analytics_cluster | v_verticadb_node0004 | segment0001 | ACTIVE
analytics_cluster | v_verticadb_node0004 | segment0003 | ACTIVE
analytics_cluster | v_verticadb_node0005 | replica | ACTIVE
analytics_cluster | v_verticadb_node0005 | segment0001 | ACTIVE
analytics_cluster | v_verticadb_node0005 | segment0002 | ACTIVE
analytics_cluster | v_verticadb_node0006 | replica | ACTIVE
analytics_cluster | v_verticadb_node0006 | segment0002 | ACTIVE
analytics_cluster | v_verticadb_node0006 | segment0003 | ACTIVE
default_subcluster | v_verticadb_node0001 | replica | ACTIVE
default_subcluster | v_verticadb_node0001 | segment0001 | ACTIVE
default_subcluster | v_verticadb_node0001 | segment0003 | ACTIVE
default_subcluster | v_verticadb_node0002 | replica | ACTIVE
default_subcluster | v_verticadb_node0002 | segment0001 | ACTIVE
default_subcluster | v_verticadb_node0002 | segment0002 | ACTIVE
default_subcluster | v_verticadb_node0003 | replica | ACTIVE
default_subcluster | v_verticadb_node0003 | segment0002 | ACTIVE
default_subcluster | v_verticadb_node0003 | segment0003 | ACTIVE
(18 rows)
If you do not include hosts when you create the subcluster, you must manually rebalance the shards in the subcluster when you add nodes at a later time. For more information, see Updating Shard Subscriptions After Adding Nodes.
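For example, if you later add nodes to the empty subcluster, the following is a minimal sketch of the rebalance call, assuming the subcluster is named analytics_cluster:
=> SELECT REBALANCE_SHARDS('analytics_cluster');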
Subclusters and large cluster
Vertica has a feature named large cluster that helps manage broadcast messages as the database cluster grows. It affects adding new subclusters in several ways:
- If you create a subcluster with 16 or more nodes, Vertica automatically enables the large cluster feature. It sets the number of control nodes to the square root of the number of nodes in the subcluster. See Planning a large cluster.
- You can set the number of control nodes in a subcluster by using the
--control-set-size
option in the admintools command line (see the sketch after this list).
- Vertica returns an error if you try to add a new subcluster when the database cluster already has 120 control nodes. Every subcluster must have at least one control node, and the database cannot have more than 120 control nodes. When the database reaches this limit, you must reduce the number of control nodes in other subclusters before you can add a new subcluster. See Changing the number of control nodes and realigning for more information.
- If you attempt to create a subcluster with a number of control nodes that would exceed the 120 control node limit, Vertica warns you and creates the subcluster with fewer control nodes. It adds as many control nodes as it can to the subcluster, which is 120 minus the current count of control nodes in the cluster. For example, suppose you create a 16-node subcluster in a database cluster that already has 118 control nodes. In this case, Vertica warns you and creates the subcluster with just 2 control nodes rather than the default 4.
See Large cluster for more information about the large cluster feature.
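For example, the following sketch creates a subcluster with two control nodes; the subcluster name and host addresses are illustrative:
$ admintools -t db_add_subcluster -d verticadb -p 'password' -c reporting_cluster \
    -s 10.0.33.77,10.0.33.181,10.0.33.85 --control-set-size=2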
5.2 - Duplicating a subcluster
Subclusters have many settings you can tune to get them to work just the way you want.
Subclusters have many settings you can tune to get them to work just the way you want. After you have tuned a subcluster, you may want additional subclusters that are configured the same way. For example, suppose you have a subcluster that you have tuned to perform analytics workloads. To improve query throughput, you can create several more subclusters configured exactly like it. Instead of creating the new subclusters and then manually configuring them from scratch, you can duplicate the existing subcluster (called the source subcluster) to a new subcluster (the target subcluster).
When you create a new subcluster based on another subcluster, Vertica copies most of the source subcluster's settings. See below for a list of the settings that Vertica copies. These settings are both on the node level and the subcluster level.
Note
After you duplicate a subcluster, the target is not connected to the source in any way. Any changes you make to the source subcluster's settings after duplication are not copied to the target. The subclusters are completely independent after duplication.
Requirements for the target subcluster
You must have a set of hosts in your database cluster that you will use as the target of the subcluster duplication. Vertica forms these hosts into a target subcluster that receives most of the settings of the source subcluster. The hosts for the target subcluster must meet the following requirements:
-
They must be part of your database cluster but not part of your database. For example, you can use hosts you have dropped from a subcluster or whose subcluster you have removed. Vertica returns an error if you attempt to duplicate a subcluster onto one or more nodes that are currently participating in the database.
Tip
If you want to duplicate the settings of a subcluster to another subcluster, remove the target subcluster (see
Removing subclusters). Then duplicate the source subcluster onto the hosts of the now-removed target subcluster.
-
The number of nodes you supply for the target subcluster must equal the number of nodes in the source subcluster. When duplicating the subcluster, Vertica performs a 1:1 copy of some node-level settings from each node in the source subcluster to a corresponding node in the target.
-
The RAM and disk allocation for the hosts in the target subcluster should be at least the same as the source nodes. Technically, your target nodes can have less RAM or disk space than the source nodes. However, you will usually see performance issues in the new subcluster because the settings of the original subcluster will not be tuned for the resources of the target subcluster.
You can duplicate a subcluster even if some of the nodes in the source subcluster or hosts in the target are down. If nodes in the target are down, they use the catalog Vertica copied from the source node when they recover.
Duplication of subcluster-level settings
The following table lists the subcluster-level settings that Vertica copies from the source subcluster to the target.
Setting Type |
Setting Details |
Basic subcluster settings |
Whether the subcluster is a primary or secondary subcluster. |
Large cluster settings |
The number of control nodes in the subcluster. |
Resource pool settings |
Vertica creates a new resource pool for every subcluster-specific resource pool in the source subcluster.
Note
Duplicating a subcluster can fail due to subcluster-specific resource pools. If creating the subcluster-specific resource pools leaves less than 25% of the total memory free for the general pool, Vertica stops the duplication and reports an error.
Subcluster-specific resource pool cascade settings are copied from the source subcluster and are applied to the newly-created resource pool for the target subcluster.
Subcluster-level overrides on global resource pools settings such as MEMORYSIZE. See Managing workload resources in an Eon Mode database for more information.
Grants on resource pools are copied from the source subcluster.
|
Connection load balancing settings |
If the source subcluster is part of a subcluster-based load balancing group (you created the load balancing group using CREATE LOAD BALANCE GROUP...WITH SUBCLUSTER) the new subcluster is added to the group. See Creating Connection Load Balance Groups.
Important
Vertica adds the new subcluster to the subcluster-based load balancing group. However, it does not create network addresses for the nodes in the target subcluster. Load balancing policies cannot direct connections to the new subcluster until you create network addresses for the nodes in the target subcluster. See Creating network addresses for the steps you must take.
|
Storage policy settings |
Table and table partition pinning policies are copied from the source to the target subcluster. See Pinning Depot Objects for more information. Any existing storage policies on the target subcluster are dropped before the policies are copied from the source. |
Vertica does not copy the following subcluster settings:
Setting Type |
Setting Details |
Basic subcluster settings |
-
Subcluster name (you must provide a new name for the target subcluster).
-
If the source is the default subcluster, the setting is not copied to the target. Your Vertica database has a single default subcluster. If Vertica copied this value, the source subcluster could no longer be the default.
|
Connection load balancing settings |
Address-based load balancing groups are not duplicated for the target subcluster.
For example, suppose you created a load balancing group for the source subcluster by adding the network addresses of all of the subcluster's nodes. In this case, Vertica does not create a load balancing group for the target subcluster because it does not duplicate the network addresses of the source nodes (see the next section). Because it does not copy the addresses, it cannot create an address-based group.
|
Duplication of node-level settings
When Vertica duplicates a subcluster, it maps each node in the source subcluster to a node in the destination subcluster. Then it copies relevant node-level settings from each individual source node to the corresponding target node.
For example, suppose you have a three-node subcluster consisting of nodes named node01, node02, and node03. The target subcluster has nodes named node04, node05, and node06. In this case, Vertica copies the settings from node01 to node04, from node02 to node05, and from node03 to node06.
The node-level settings that Vertica copies from the source nodes to the target nodes are:
Setting Type |
Setting Details |
Configuration parameters |
Vertica copies the value of configuration parameters that you have set at the node level in the source node to the target node. For example, suppose you set CompressCatalogOnDisk on the source node using the statement:
ALTER NODE node01 SET CompressCatalogOnDisk = 0;
If you then duplicated the subcluster containing node01, the setting is copied to the target node.
|
Eon Mode settings |
|
Storage location settings |
The DATA, TEMP, DEPOT, and USER storage location paths on the source node are duplicated on the target node. When duplicating node-specific paths (such as DATA or DEPOT) the path names are adjusted for the new node name. For example, suppose node 1 has a depot path of /vertica/depot/vmart/v_vmart_node0001_depot . If Vertica duplicates node 1 to node 4, it adjusts the path to /vertica/depot/vmart/v_vmart_node0004_depot .
Important
The directories for these storage locations on the target node must be empty. They must also have the correct file permissions to allow Vertica to read and write to them.
Vertica does not duplicate a storage location if it cannot access its directory on the target node or if the directory is not empty. In this case, the target node will not have the location defined after the duplication process finishes. Admintools does not warn you if any locations were not duplicated.
If you find that storage locations have not been duplicated on one or more target nodes, you must fix the issues with the directories on the target nodes. Then re-run the duplication command.
|
Large cluster settings |
Control node assignments are copied from the source node to the target node:
-
If the source node is a control node, then the target node is made into a control node.
-
If the source node depends on a control node, then the target node becomes a dependent of the corresponding control node in the new subcluster.
|
Vertica does not copy the following node-level settings:
Setting Type |
Setting Details |
Connection load balancing settings |
Network Addresses are not copied. The destination node's network addresses do not depend on the settings of the source node. Therefore, Vertica cannot determine what the target node's addresses should be. |
Depot settings |
Depot-related configuration parameters that can be set on a node level (such as FileDeletionServiceInterval) are not copied from the source node to the target node. |
To duplicate a subcluster, you use the same admintools db_add_subcluster
tool that you use to create a new subcluster (see Creating subclusters). In addition to the required options to create a subcluster (the list of hosts, name for the new subcluster, database name, and so on), you also pass the --like
option with the name of the source subcluster you want to duplicate.
Important
When you use the --like
option, you cannot use the --is-secondary
or --control-set-size
options. Vertica determines whether the new subcluster is secondary and the number of control nodes it contains based on the source subcluster. If you supply these options along with the --like
option, admintools returns an error.
The following examples demonstrate duplicating a three-node subcluster named analytics_1. The first example examines some of the settings in the analytics_1 subcluster:
-
An override of the global TM resource pool's memory size.
-
Its own resource pool named analytics_pool
-
Its membership in a subcluster-based load balancing group named analytics
=> SELECT name, subcluster_name, memorysize FROM SUBCLUSTER_RESOURCE_POOL_OVERRIDES;
name | subcluster_name | memorysize
------+-----------------+------------
tm | analytics_1 | 0%
(1 row)
=> SELECT name, subcluster_name, memorysize, plannedconcurrency
FROM resource_pools WHERE subcluster_name IS NOT NULL;
name | subcluster_name | memorysize | plannedconcurrency
----------------+-----------------+------------+--------------------
analytics_pool | analytics_1 | 70% | 8
(1 row)
=> SELECT * FROM LOAD_BALANCE_GROUPS;
name | policy | filter | type | object_name
-----------+------------+-----------+------------+-------------
analytics | ROUNDROBIN | 0.0.0.0/0 | Subcluster | analytics_1
(1 row)
The following example calls admintool's db_add_subcluster
tool to duplicate the analytics_1 subcluster onto a set of three hosts to create a subcluster named analytics_2.
$ admintools -t db_add_subcluster -d verticadb \
-s 10.11.12.13,10.11.12.14,10.11.12.15 \
-p mypassword --like=analytics_1 -c analytics_2
Creating new subcluster 'analytics_2'
Adding new hosts to 'analytics_2'
Eon database detected, creating new depot locations for newly added nodes
Creating depot locations for 1 nodes
Warning when creating depot location for node: v_verticadb_node0007
WARNING: Target node v_verticadb_node0007 is down, so depot size has been
estimated from depot location on initiator. As soon as the node comes
up, its depot size might be altered depending on its disk size
Eon database detected, creating new depot locations for newly added nodes
Creating depot locations for 1 nodes
Warning when creating depot location for node: v_verticadb_node0008
WARNING: Target node v_verticadb_node0008 is down, so depot size has been
estimated from depot location on initiator. As soon as the node comes
up, its depot size might be altered depending on its disk size
Eon database detected, creating new depot locations for newly added nodes
Creating depot locations for 1 nodes
Warning when creating depot location for node: v_verticadb_node0009
WARNING: Target node v_verticadb_node0009 is down, so depot size has been
estimated from depot location on initiator. As soon as the node comes
up, its depot size might be altered depending on its disk size
Cloning subcluster properties
NOTICE: Nodes in subcluster analytics_1 have network addresses, you
might need to configure network addresses for nodes in subcluster
analytics_2 in order to get load balance groups to work correctly.
Replicating configuration to all nodes
Generating new configuration information and reloading spread
Starting nodes:
v_verticadb_node0007 (10.11.12.81)
v_verticadb_node0008 (10.11.12.209)
v_verticadb_node0009 (10.11.12.186)
Starting Vertica on all nodes. Please wait, databases with a large catalog
may take a while to initialize.
Checking database state for newly added nodes
Node Status: v_verticadb_node0007: (DOWN) v_verticadb_node0008:
(DOWN) v_verticadb_node0009: (DOWN)
Node Status: v_verticadb_node0007: (INITIALIZING) v_verticadb_node0008:
(INITIALIZING) v_verticadb_node0009: (INITIALIZING)
Node Status: v_verticadb_node0007: (UP) v_verticadb_node0008:
(UP) v_verticadb_node0009: (UP)
Syncing catalog on verticadb with 2000 attempts.
Multi-node DB add completed
Nodes added to subcluster analytics_2 successfully.
Subcluster added to verticadb successfully.
Re-running the queries in the first part of the example shows that the settings from analytics_1 have been duplicated in analytics_2:
=> SELECT name, subcluster_name, memorysize FROM SUBCLUSTER_RESOURCE_POOL_OVERRIDES;
name | subcluster_name | memorysize
------+-----------------+------------
tm | analytics_1 | 0%
tm | analytics_2 | 0%
(2 rows)
=> SELECT name, subcluster_name, memorysize, plannedconcurrency
FROM resource_pools WHERE subcluster_name IS NOT NULL;
name | subcluster_name | memorysize | plannedconcurrency
----------------+-----------------+------------+--------------------
analytics_pool | analytics_1 | 70% | 8
analytics_pool | analytics_2 | 70% | 8
(2 rows)
=> SELECT * FROM LOAD_BALANCE_GROUPS;
name | policy | filter | type | object_name
-----------+------------+-----------+------------+-------------
analytics | ROUNDROBIN | 0.0.0.0/0 | Subcluster | analytics_2
analytics | ROUNDROBIN | 0.0.0.0/0 | Subcluster | analytics_1
(2 rows)
As noted earlier, even though the analytics_2 subcluster is part of the analytics load balancing group, its nodes do not have network addresses defined for them. Until you define network addresses for the nodes, Vertica cannot redirect client connections to them.
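For example, a sketch of creating network addresses for the new nodes, using the addresses reported in the output above; the address names are arbitrary:
=> CREATE NETWORK ADDRESS node07 ON v_verticadb_node0007 WITH '10.11.12.81';
=> CREATE NETWORK ADDRESS node08 ON v_verticadb_node0008 WITH '10.11.12.209';
=> CREATE NETWORK ADDRESS node09 ON v_verticadb_node0009 WITH '10.11.12.186';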
5.3 - Adding and removing nodes from subclusters
You will often want to add new nodes to and remove existing nodes from a subcluster.
You will often want to add new nodes to and remove existing nodes from a subcluster. This ability lets you scale your database to respond to changing analytic needs. For more information on how adding nodes to a subcluster affects your database's performance, see Scaling your Eon Mode database.
Adding new nodes to a subcluster
You can add nodes to a subcluster to meet additional workloads. The nodes that you add to the subcluster must already be part of your cluster. These can be hosts that you added to the cluster using the MC or admintools, or hosts that remain part of the cluster after you dropped nodes from the database.
To add new nodes to a subcluster, use the db_add_node
command of admintools:
$ adminTools -t db_add_node -h
Usage: db_add_node [options]
Options:
-h, --help show this help message and exit
-d DB, --database=DB Name of the database
-s HOSTS, --hosts=HOSTS
Comma separated list of hosts to add to database
-p DBPASSWORD, --password=DBPASSWORD
Database password in single quotes
-a AHOSTS, --add=AHOSTS
Comma separated list of hosts to add to database
-c SCNAME, --subcluster=SCNAME
Name of subcluster for the new node
--timeout=NONINTERACTIVE_TIMEOUT
set a timeout (in seconds) to wait for actions to
complete ('never') will wait forever (implicitly sets
-i)
-i, --noprompts do not stop and wait for user input(default false).
Setting this implies a timeout of 20 min.
--compat21 (deprecated) Use Vertica 2.1 method using node names
instead of hostnames
If you do not use the -c
option, Vertica adds new nodes to the default subcluster (set to default_subcluster in new databases). This example adds a new node without specifying the subcluster:
$ adminTools -t db_add_node -p 'password' -d verticadb -s 10.11.12.117
Subcluster not specified, validating default subcluster
Nodes will be added to subcluster 'default_subcluster'
Verifying database connectivity...10.11.12.10
Eon database detected, creating new depot locations for newly added nodes
Creating depots for each node
Generating new configuration information and reloading spread
Replicating configuration to all nodes
Starting nodes
Starting nodes:
v_verticadb_node0004 (10.11.12.117)
Starting Vertica on all nodes. Please wait, databases with a
large catalog may take a while to initialize.
Checking database state
Node Status: v_verticadb_node0004: (DOWN)
Node Status: v_verticadb_node0004: (DOWN)
Node Status: v_verticadb_node0004: (DOWN)
Node Status: v_verticadb_node0004: (DOWN)
Node Status: v_verticadb_node0004: (UP)
Communal storage detected: syncing catalog
Multi-node DB add completed
Nodes added to verticadb successfully.
You will need to redesign your schema to take advantage of the new nodes.
To add nodes to a specific existing subcluster, use the db_add_node
tool's -c
option:
$ adminTools -t db_add_node -s 10.11.12.178 -d verticadb -p 'password' \
-c analytics_subcluster
Subcluster 'analytics_subcluster' specified, validating
Nodes will be added to subcluster 'analytics_subcluster'
Verifying database connectivity...10.11.12.10
Eon database detected, creating new depot locations for newly added nodes
Creating depots for each node
Generating new configuration information and reloading spread
Replicating configuration to all nodes
Starting nodes
Starting nodes:
v_verticadb_node0007 (10.11.12.178)
Starting Vertica on all nodes. Please wait, databases with a
large catalog may take a while to initialize.
Checking database state
Node Status: v_verticadb_node0007: (DOWN)
Node Status: v_verticadb_node0007: (DOWN)
Node Status: v_verticadb_node0007: (DOWN)
Node Status: v_verticadb_node0007: (DOWN)
Node Status: v_verticadb_node0007: (UP)
Communal storage detected: syncing catalog
Multi-node DB add completed
Nodes added to verticadb successfully.
You will need to redesign your schema to take advantage of the new nodes.
Updating shard subscriptions after adding nodes
After you add nodes to a subcluster they do not yet subscribe to shards. You can view the subscription status of all nodes in your database using the following query that joins the V_CATALOG.NODES and V_CATALOG.NODE_SUBSCRIPTIONS system tables:
=> SELECT subcluster_name, n.node_name, shard_name, subscription_state FROM
v_catalog.nodes n LEFT JOIN v_catalog.node_subscriptions ns ON (n.node_name
= ns.node_name) ORDER BY 1,2,3;
subcluster_name | node_name | shard_name | subscription_state
----------------------+----------------------+-------------+--------------------
analytics_subcluster | v_verticadb_node0004 | |
analytics_subcluster | v_verticadb_node0005 | |
analytics_subcluster | v_verticadb_node0006 | |
default_subcluster | v_verticadb_node0001 | replica | ACTIVE
default_subcluster | v_verticadb_node0001 | segment0001 | ACTIVE
default_subcluster | v_verticadb_node0001 | segment0003 | ACTIVE
default_subcluster | v_verticadb_node0002 | replica | ACTIVE
default_subcluster | v_verticadb_node0002 | segment0001 | ACTIVE
default_subcluster | v_verticadb_node0002 | segment0002 | ACTIVE
default_subcluster | v_verticadb_node0003 | replica | ACTIVE
default_subcluster | v_verticadb_node0003 | segment0002 | ACTIVE
default_subcluster | v_verticadb_node0003 | segment0003 | ACTIVE
(12 rows)
You can see that none of the nodes in the newly-added analytics_subcluster have subscriptions.
To update the subscriptions for new nodes, call the REBALANCE_SHARDS function. You can limit the rebalance to the subcluster containing the new nodes by passing its name to the REBALANCE_SHARDS function call. The following example runs rebalance shards to update the analytics_subcluster's subscriptions:
=> SELECT REBALANCE_SHARDS('analytics_subcluster');
REBALANCE_SHARDS
-------------------
REBALANCED SHARDS
(1 row)
=> SELECT subcluster_name, n.node_name, shard_name, subscription_state FROM
v_catalog.nodes n LEFT JOIN v_catalog.node_subscriptions ns ON (n.node_name
= ns.node_name) ORDER BY 1,2,3;
subcluster_name | node_name | shard_name | subscription_state
----------------------+----------------------+-------------+--------------------
analytics_subcluster | v_verticadb_node0004 | replica | ACTIVE
analytics_subcluster | v_verticadb_node0004 | segment0001 | ACTIVE
analytics_subcluster | v_verticadb_node0004 | segment0003 | ACTIVE
analytics_subcluster | v_verticadb_node0005 | replica | ACTIVE
analytics_subcluster | v_verticadb_node0005 | segment0001 | ACTIVE
analytics_subcluster | v_verticadb_node0005 | segment0002 | ACTIVE
analytics_subcluster | v_verticadb_node0006 | replica | ACTIVE
analytics_subcluster | v_verticadb_node0006 | segment0002 | ACTIVE
analytics_subcluster | v_verticadb_node0006 | segment0003 | ACTIVE
default_subcluster | v_verticadb_node0001 | replica | ACTIVE
default_subcluster | v_verticadb_node0001 | segment0001 | ACTIVE
default_subcluster | v_verticadb_node0001 | segment0003 | ACTIVE
default_subcluster | v_verticadb_node0002 | replica | ACTIVE
default_subcluster | v_verticadb_node0002 | segment0001 | ACTIVE
default_subcluster | v_verticadb_node0002 | segment0002 | ACTIVE
default_subcluster | v_verticadb_node0003 | replica | ACTIVE
default_subcluster | v_verticadb_node0003 | segment0002 | ACTIVE
default_subcluster | v_verticadb_node0003 | segment0003 | ACTIVE
(18 rows)
Removing nodes
Your database must meet these requirements before you can remove a node from a subcluster:
-
To remove a node from a primary subcluster, all of the primary nodes in the subcluster must be up, and the database must be able to maintain quorum after the primary node is removed (see Data integrity and high availability in an Eon Mode database). These requirements are necessary because Vertica calls REBALANCE_SHARDS to redistribute shard subscriptions among the remaining nodes in the subcluster. If you attempt to remove a primary node when the database does not meet the requirements, the rebalance shards process waits until either the down nodes recover or a timeout elapses. While it waits, you periodically see a message "Rebalance shards polling iteration number [nn]" indicating that the rebalance process is waiting to complete.
You can remove nodes from a secondary subcluster even when nodes in the subcluster are down.
-
If your database has the large cluster feature enabled, you cannot remove a node if it is the subcluster's last control node and there are nodes that depend on it. See Large cluster for more information.
If there are other control nodes in the subcluster, you can drop a control node. Vertica reassigns the nodes that depend on the node being dropped to other control nodes.
To remove one or more nodes, use admintools's db_remove_node
tool:
$ adminTools -t db_remove_node -p 'password' -d verticadb -s 10.11.12.117
connecting to 10.11.12.10
Waiting for rebalance shards. We will wait for at most 36000 seconds.
Rebalance shards polling iteration number [0], started at [14:56:41], time out at [00:56:41]
Attempting to drop node v_verticadb_node0004 ( 10.11.12.117 )
Shutting down node v_verticadb_node0004
Sending node shutdown command to '['v_verticadb_node0004', '10.11.12.117', '/vertica/data', '/vertica/data']'
Deleting catalog and data directories
Update admintools metadata for v_verticadb_node0004
Eon mode detected. The node v_verticadb_node0004 has been removed from host 10.11.12.117. To remove the
node metadata completely, please clean up the files corresponding to this node, at the communal
location: s3://eonbucket/metadata/verticadb/nodes/v_verticadb_node0004
Reload spread configuration
Replicating configuration to all nodes
Checking database state
Node Status: v_verticadb_node0001: (UP) v_verticadb_node0002: (UP) v_verticadb_node0003: (UP)
Communal storage detected: syncing catalog
When you remove one or more nodes from a subcluster, Vertica automatically rebalances shards in the subcluster. You do not need to manually rebalance shards after removing nodes.
Moving nodes between subclusters
To move a node from one subcluster to another:
-
Remove the node or nodes from the subcluster it is currently a part of.
-
Add the node to the subcluster you want to move it to, as shown in the sketch below.
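For example, a minimal sketch that moves the node at a hypothetical address 10.11.12.117 into analytics_subcluster, using the admintools commands described in this section:
$ admintools -t db_remove_node -d verticadb -p 'password' -s 10.11.12.117
$ admintools -t db_add_node -d verticadb -p 'password' -s 10.11.12.117 -c analytics_subcluster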
5.4 - Managing workloads with subclusters
By default, queries are limited to executing on the nodes in the subcluster that contains the initiator node (the node the client is connected to).
By default, queries are limited to executing on the nodes in the subcluster that contains the initiator node (the node the client is connected to). This example demonstrates executing an explain plan for a query when connected to node 4 of a cluster. Node 4 is part of a subcluster containing nodes 4 through 6. You can see that only the nodes in the subcluster will participate in a query:
=> EXPLAIN SELECT customer_name, customer_state FROM customer_dimension LIMIT 10;
QUERY PLAN
--------------------------------------------------------------------------------
QUERY PLAN DESCRIPTION:
------------------------------
EXPLAIN SELECT customer_name, customer_state FROM customer_dimension LIMIT 10;
Access Path:
+-SELECT LIMIT 10 [Cost: 442, Rows: 10 (NO STATISTICS)] (PATH ID: 0)
| Output Only: 10 tuples
| Execute on: Query Initiator
| +---> STORAGE ACCESS for customer_dimension [Cost: 442, Rows: 10K (NO
STATISTICS)] (PATH ID: 1)
| | Projection: public.customer_dimension_b0
| | Materialize: customer_dimension.customer_name,
customer_dimension.customer_state
| | Output Only: 10 tuples
| | Execute on: v_verticadb_node0004, v_verticadb_node0005,
v_verticadb_node0006
. . .
In Eon Mode, you can override the MEMORYSIZE
, MAXMEMORYSIZE
, and MAXQUERYMEMORYSIZE
settings for built-in global resource pools to fine-tune workloads within a subcluster. See Managing workload resources in an Eon Mode database for more information.
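For example, the following sketch frees Tuple Mover resources on a secondary subcluster by overriding the built-in TM pool's memory settings for that subcluster only; the subcluster name is an example:
=> ALTER RESOURCE POOL tm FOR SUBCLUSTER analytics_cluster MEMORYSIZE '0%' MAXMEMORYSIZE '0%';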
What happens when a subcluster cannot run a query
In order to process queries, each subcluster's nodes must have full coverage of all shards in the database. If the nodes do not have full coverage (which can happen if nodes are down), the subcluster can no longer process queries. This state does not cause the subcluster to shut down. Instead, if you attempt to run a query on a subcluster in this state, you receive error messages telling you that not enough nodes are available to complete the query.
=> SELECT node_name, node_state FROM nodes
WHERE subcluster_name = 'analytics_cluster';
node_name | node_state
----------------------+------------
v_verticadb_node0004 | DOWN
v_verticadb_node0005 | UP
v_verticadb_node0006 | DOWN
(3 rows)
=> SELECT * FROM online_sales.online_sales_fact;
ERROR 9099: Cannot find participating nodes to run the query
Once the down nodes have recovered and the subcluster has full shard coverage, it will be able to process queries.
Controlling where a query runs
You can control where specific types of queries run by controlling which subcluster the clients connect to. The best way to enforce restrictions is to create a set of connection load balancing policies to steer clients from specific IP address ranges to nodes in the correct subcluster.
For example, suppose you have the following database with two subclusters: one for performing data loading, and one for performing analytics.
The data load tasks come from a set of ETL systems in the IP 10.20.0.0/16 address range. Analytics tasks can come from any other IP address. In this case, you can create a set of connection load balancing policies that ensure the ETL systems connect to the data load subcluster and all other connections go to the analytics subcluster.
=> SELECT node_name,node_address,node_address_family,subcluster_name
FROM v_catalog.nodes;
node_name | node_address | node_address_family | subcluster_name
----------------------+--------------+---------------------+--------------------
v_verticadb_node0001 | 10.11.12.10 | ipv4 | load_subcluster
v_verticadb_node0002 | 10.11.12.20 | ipv4 | load_subcluster
v_verticadb_node0003 | 10.11.12.30 | ipv4 | load_subcluster
v_verticadb_node0004 | 10.11.12.40 | ipv4 | analytics_subcluster
v_verticadb_node0005 | 10.11.12.50 | ipv4 | analytics_subcluster
v_verticadb_node0006 | 10.11.12.60 | ipv4 | analytics_subcluster
(6 rows)
=> CREATE NETWORK ADDRESS node01 ON v_verticadb_node0001 WITH '10.11.12.10';
CREATE NETWORK ADDRESS
=> CREATE NETWORK ADDRESS node02 ON v_verticadb_node0002 WITH '10.11.12.20';
CREATE NETWORK ADDRESS
=> CREATE NETWORK ADDRESS node03 ON v_verticadb_node0003 WITH '10.11.12.30';
CREATE NETWORK ADDRESS
=> CREATE NETWORK ADDRESS node04 ON v_verticadb_node0004 WITH '10.11.12.40';
CREATE NETWORK ADDRESS
=> CREATE NETWORK ADDRESS node05 ON v_verticadb_node0005 WITH '10.11.12.50';
CREATE NETWORK ADDRESS
=> CREATE NETWORK ADDRESS node06 ON v_verticadb_node0006 WITH '10.11.12.60';
CREATE NETWORK ADDRESS
=> CREATE LOAD BALANCE GROUP load_subcluster WITH SUBCLUSTER load_subcluster
FILTER '0.0.0.0/0';
CREATE LOAD BALANCE GROUP
=> CREATE LOAD BALANCE GROUP analytics_subcluster WITH SUBCLUSTER
analytics_subcluster FILTER '0.0.0.0/0';
CREATE LOAD BALANCE GROUP
=> CREATE ROUTING RULE etl_systems ROUTE '10.20.0.0/16' TO load_subcluster;
CREATE ROUTING RULE
=> CREATE ROUTING RULE analytic_clients ROUTE '0.0.0.0/0' TO analytics_subcluster;
CREATE ROUTING RULE
Once you have created the load balance policies, you can test them using the DESCRIBE_LOAD_BALANCE_DECISION function.
=> SELECT describe_load_balance_decision('192.168.1.1');
describe_load_balance_decision
--------------------------------
Describing load balance decision for address [192.168.1.1]
Load balance cache internal version id (node-local): [1]
Considered rule [etl_systems] source ip filter [10.20.0.0/16]...
input address does not match source ip filter for this rule.
Considered rule [analytic_clients] source ip filter [0.0.0.0/0]...
input address matches this rule
Matched to load balance group [analytics_cluster] the group has
policy [ROUNDROBIN] number of addresses [3]
(0) LB Address: [10.11.12.181]:5433
(1) LB Address: [10.11.12.205]:5433
(2) LB Address: [10.11.12.192]:5433
Chose address at position [1]
Routing table decision: Success. Load balance redirect to: [10.11.12.205]
port [5433]
(1 row)
=> SELECT describe_load_balance_decision('10.20.1.1');
describe_load_balance_decision
--------------------------------
Describing load balance decision for address [10.20.1.1]
Load balance cache internal version id (node-local): [1]
Considered rule [etl_systems] source ip filter [10.20.0.0/16]...
input address matches this rule
Matched to load balance group [default_cluster] the group has policy
[ROUNDROBIN] number of addresses [3]
(0) LB Address: [10.11.12.10]:5433
(1) LB Address: [10.11.12.20]:5433
(2) LB Address: [10.11.12.30]:5433
Chose address at position [1]
Routing table decision: Success. Load balance redirect to: [10.11.12.20]
port [5433]
(1 row)
Normally, with these policies, all queries run by the ETL system will run on the load subcluster. All other queries will run on the analytics subcluster. There are some cases (especially if a subcluster is down or draining) where a client may connect to a node in another subcluster. For this reason, clients should always verify they are connected to the correct subcluster. See Connection load balancing policies for more information about load balancing policies.
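For example, a client can confirm which subcluster it is connected to with a query such as the following, which assumes the LOCAL_NODE_NAME function is available:
=> SELECT subcluster_name FROM v_catalog.nodes WHERE node_name = LOCAL_NODE_NAME();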
5.5 - Starting and stopping subclusters
Subclusters make it convenient to start and stop a group of nodes as needed.
Subclusters make it convenient to start and stop a group of nodes as needed. You start and stop them with admintools commands or Vertica functions. You can also start and stop subclusters with Management Console.
Starting a subcluster
To start a subcluster, use the admintools command restart_subcluster
:
$ adminTools -t restart_subcluster -h
Usage: restart_subcluster [options]
Options:
-h, --help show this help message and exit
-d DB, --database=DB Name of database whose subcluster is to be restarted
-c SCNAME, --subcluster=SCNAME
Name of subcluster to be restarted
-p DBPASSWORD, --password=DBPASSWORD
Database password in single quotes
--timeout=NONINTERACTIVE_TIMEOUT
set a timeout (in seconds) to wait for actions to
complete ('never') will wait forever (implicitly sets
-i)
-i, --noprompts do not stop and wait for user input(default false).
Setting this implies a timeout of 20 min.
-F, --force Force the nodes in the subcluster to start and auto
recover if necessary
This example starts the subcluster analytics_cluster
:
$ adminTools -t restart_subcluster -c analytics_cluster \
-d verticadb -p password
*** Restarting subcluster for database verticadb ***
Restarting host [10.11.12.192] with catalog [v_verticadb_node0006_catalog]
Restarting host [10.11.12.181] with catalog [v_verticadb_node0004_catalog]
Restarting host [10.11.12.205] with catalog [v_verticadb_node0005_catalog]
Issuing multi-node restart
Starting nodes:
v_verticadb_node0004 (10.11.12.181)
v_verticadb_node0005 (10.11.12.205)
v_verticadb_node0006 (10.11.12.192)
Starting Vertica on all nodes. Please wait, databases with a large
catalog may take a while to initialize.
Node Status: v_verticadb_node0002: (UP) v_verticadb_node0004: (DOWN)
v_verticadb_node0005: (DOWN) v_verticadb_node0006: (DOWN)
Node Status: v_verticadb_node0002: (UP) v_verticadb_node0004: (DOWN)
v_verticadb_node0005: (DOWN) v_verticadb_node0006: (DOWN)
Node Status: v_verticadb_node0002: (UP) v_verticadb_node0004: (DOWN)
v_verticadb_node0005: (DOWN) v_verticadb_node0006: (DOWN)
Node Status: v_verticadb_node0002: (UP) v_verticadb_node0004: (DOWN)
v_verticadb_node0005: (DOWN) v_verticadb_node0006: (DOWN)
Node Status: v_verticadb_node0002: (UP) v_verticadb_node0004: (UP)
v_verticadb_node0005: (UP) v_verticadb_node0006: (UP)
Communal storage detected: syncing catalog
Restart Subcluster result: 1
Stopping a subcluster
You can stop a subcluster gracefully with the function SHUTDOWN_WITH_DRAIN, or immediately with SHUTDOWN_SUBCLUSTER. You can also shut down subclusters with the admintools command
stop_subcluster
.
Graceful shutdown
The SHUTDOWN_WITH_DRAIN function drains a subcluster's client connections before shutting it down. The function first marks all nodes in the specified subcluster as draining. Work from existing user sessions continues on draining nodes, but the nodes refuse new client connections and are excluded from load-balancing operations. A dbadmin user can still connect to draining nodes. For more information about client connection draining, see Drain client connections.
To run the SHUTDOWN_WITH_DRAIN function, you must specify a timeout value. The function's behavior depends on the sign of the timeout value:
- Positive: The nodes drain until either all the existing connections close or the function reaches the runtime limit set by the timeout value. As soon as one of these conditions is met, the function sends a shutdown message to the subcluster and returns.
- Zero: The function immediately closes any active user sessions on the subcluster and then shuts down the subcluster and returns.
- Negative: The function marks the subcluster's nodes as draining and waits to shut down the subcluster until all active user sessions disconnect.
After all nodes in a draining subcluster are down, its nodes are automatically reset to a not draining status.
The following example demonstrates how you can use a positive timeout value to give active user sessions time to finish their work before shutting down the subcluster:
=> SELECT node_name, subcluster_name, is_draining, count_client_user_sessions, oldest_session_user FROM draining_status ORDER BY 1;
node_name | subcluster_name | is_draining | count_client_user_sessions | oldest_session_user
----------------------+--------------------+-------------+----------------------------+---------------------
v_verticadb_node0001 | default_subcluster | f | 0 |
v_verticadb_node0002 | default_subcluster | f | 0 |
v_verticadb_node0003 | default_subcluster | f | 0 |
v_verticadb_node0004 | analytics | f | 1 | analyst
v_verticadb_node0005 | analytics | f | 0 |
v_verticadb_node0006 | analytics | f | 0 |
(6 rows)
=> SELECT SHUTDOWN_WITH_DRAIN('analytics', 300);
NOTICE 0: Draining has started on subcluster (analytics)
NOTICE 0: Begin shutdown of subcluster (analytics)
SHUTDOWN_WITH_DRAIN
--------------------------------------------------------------------------------------------------------------------
Set subcluster (analytics) to draining state
Waited for 3 nodes to drain
Shutdown message sent to subcluster (analytics)
(1 row)
You can query the NODES system table to confirm that the subcluster shut down:
=> SELECT subcluster_name, node_name, node_state FROM nodes;
subcluster_name | node_name | node_state
--------------------+----------------------+------------
default_subcluster | v_verticadb_node0001 | UP
default_subcluster | v_verticadb_node0002 | UP
default_subcluster | v_verticadb_node0003 | UP
analytics | v_verticadb_node0004 | DOWN
analytics | v_verticadb_node0005 | DOWN
analytics | v_verticadb_node0006 | DOWN
(6 rows)
If you want to see more information about the draining and shutdown events, such as whether all user sessions finished their work before the timeout, you can query the dc_draining_events table. In this case, the subcluster still had one active user session when the function reached timeout:
=> SELECT event_type, event_type_name, event_description, event_result, event_result_name FROM dc_draining_events;
event_type | event_type_name | event_description | event_result | event_result_name
------------+------------------------------+---------------------------------------------------------------------+--------------+-------------------
0 | START_DRAIN_SUBCLUSTER | START_DRAIN for SHUTDOWN of subcluster (analytics) | 0 | SUCCESS
2 | START_WAIT_FOR_NODE_DRAIN | Wait timeout is 300 seconds | 4 | INFORMATIONAL
4 | INTERVAL_WAIT_FOR_NODE_DRAIN | 1 sessions remain after 0 seconds | 4 | INFORMATIONAL
4 | INTERVAL_WAIT_FOR_NODE_DRAIN | 1 sessions remain after 60 seconds | 4 | INFORMATIONAL
4 | INTERVAL_WAIT_FOR_NODE_DRAIN | 1 sessions remain after 120 seconds | 4 | INFORMATIONAL
4 | INTERVAL_WAIT_FOR_NODE_DRAIN | 1 sessions remain after 125 seconds | 4 | INFORMATIONAL
4 | INTERVAL_WAIT_FOR_NODE_DRAIN | 1 sessions remain after 180 seconds | 4 | INFORMATIONAL
4 | INTERVAL_WAIT_FOR_NODE_DRAIN | 1 sessions remain after 240 seconds | 4 | INFORMATIONAL
4 | INTERVAL_WAIT_FOR_NODE_DRAIN | 1 sessions remain after 250 seconds | 4 | INFORMATIONAL
4 | INTERVAL_WAIT_FOR_NODE_DRAIN | 1 sessions remain after 300 seconds | 4 | INFORMATIONAL
3 | END_WAIT_FOR_NODE_DRAIN | Wait for drain ended with 1 sessions remaining | 2 | TIMEOUT
5 | BEGIN_SHUTDOWN_AFTER_DRAIN | Staring shutdown of subcluster (analytics) following drain | 4 | INFORMATIONAL
(12 rows)
After you restart the subcluster, you can query the DRAINING_STATUS system table to confirm that the nodes have reset their draining statuses to not draining:
=> SELECT node_name, subcluster_name, is_draining, count_client_user_sessions, oldest_session_user FROM draining_status ORDER BY 1;
node_name | subcluster_name | is_draining | count_client_user_sessions | oldest_session_user
----------------------+--------------------+-------------+----------------------------+---------------------
v_verticadb_node0001 | default_subcluster | f | 0 |
v_verticadb_node0002 | default_subcluster | f | 0 |
v_verticadb_node0003 | default_subcluster | f | 0 |
v_verticadb_node0004 | analytics | f | 0 |
v_verticadb_node0005 | analytics | f | 0 |
v_verticadb_node0006 | analytics | f | 0 |
(6 rows)
To shut down a subcluster immediately, call SHUTDOWN_SUBCLUSTER. The following example shuts down the analytics
subcluster immediately, without checking for active client connections:
=> SELECT SHUTDOWN_SUBCLUSTER('analytics');
SHUTDOWN_SUBCLUSTER
---------------------
Subcluster shutdown
(1 row)
You can use the stop_subcluster
tool to stop a subcluster:
$ adminTools -t stop_subcluster -h
Usage: stop_subcluster [options]
Options:
-h, --help show this help message and exit
-d DB, --database=DB Name of database whose subcluster is to be stopped
-c SCNAME, --subcluster=SCNAME
Name of subcluster to be stopped
-p DBPASSWORD, --password=DBPASSWORD
Database password in single quotes
-n DRAIN_SECONDS, --drain-seconds=DRAIN_SECONDS
Seconds to wait for user connections to close.
Default value is 60 seconds.
When the time expires, connections will be forcibly closed
and the db will shut down.
-F, --force Force the subcluster to shutdown immediately,
even if users are connected.
--timeout=NONINTERACTIVE_TIMEOUT
set a timeout (in seconds) to wait for actions to
complete ('never') will wait forever (implicitly sets
-i)
-i, --noprompts do not stop and wait for user input(default false).
Setting this implies a timeout of 20 min.
By default, stop_subcluster
calls SHUTDOWN_WITH_DRAIN to gracefully shut down the target subcluster. The shutdown process drains client connections from the subcluster before shutting it down.
The -n
(--drain-seconds
) option, which has a default value of 60 seconds, allows you to specify the number of seconds to wait before forcefully closing client connections and shutting down the subcluster. If you set a negative -n
value, the subcluster is marked as draining but is not shut down until all active user sessions disconnect.
In the following example, the subcluster named analytics initially has an active client session, but the session closes before the timeout limit is reached and the subcluster shuts down:
$ admintools -t stop_subcluster -d verticadb -c analytics --password password --drain-seconds 200
--- Subcluster shutdown ---
Verifying subcluster 'analytics'
Node 'v_verticadb_node0004' will shutdown
Node 'v_verticadb_node0005' will shutdown
Node 'v_verticadb_node0006' will shutdown
Connecting to database to begin shutdown of subcluster 'analytics'
Shutdown will use connection draining.
Shutdown will wait for all client sessions to complete, up to 200 seconds
Then it will force a shutdown.
Poller has been running for 0:00:00.000022 seconds since 2022-07-28 12:18:04.891781
------------------------------------------------------------
client_sessions |node_count |node_names
--------------------------------------------------------------
0 |5 |v_verticadb_node0002,v_verticadb_node0004,v_verticadb_node0003,v_verticadb_node0...
1 |1 |v_verticadb_node0005
STATUS: vertica.engine.api.db_client.module is still running on 1 host: nodeIP as of 2022-07-28 12:18:14. See /opt/vertica/log/adminTools.log for full details.
Poller has been running for 0:00:10.383018 seconds since 2022-07-28 12:18:04.891781
...
------------------------------------------------------------
client_sessions |node_count |node_names
--------------------------------------------------------------
0 |3 |v_verticadb_node0002,v_verticadb_node0001,v_verticadb_node0003
down |3 |v_verticadb_node0004,v_verticadb_node0005,v_verticadb_node0006
Stopping poller drain_status because it was canceled
SUCCESS running the shutdown metafunction
Not waiting for processes to completely exit
Shutdown operation was successful
You can use the -F
(or --force
) option to shut down a subcluster immediately, without checking for active user sessions or draining the subcluster:
$ admintools -t stop_subcluster -d verticadb -c analytics --password password -F
--- Subcluster shutdown ---
Verifying subcluster 'analytics'
Node 'v_verticadb_node0004' will shutdown
Node 'v_verticadb_node0005' will shutdown
Node 'v_verticadb_node0006' will shutdown
Connecting to database to begin shutdown of subcluster 'analytics'
Running shutdown metafunction. Not using connection draining
STATUS: vertica.engine.api.db_client.module is still running on 1 host: 192.168.111.31 as of 2022-07-28 13:13:57. See /opt/vertica/log/adminTools.log for full details.
STATUS: vertica.engine.api.db_client.module is still running on 1 host: 192.168.111.31 as of 2022-07-28 13:14:07. See /opt/vertica/log/adminTools.log for full details.
SUCCESS running the shutdown metafunction
Not waiting for processes to completely exit
Shutdown operation was successful
If you want to shut down all subclusters in a database, see Stopping an Eon Mode Database.
5.6 - Altering subcluster settings
There are several settings you can alter on a subcluster using the ALTER SUBCLUSTER statement.
There are several settings you can alter on a subcluster using the ALTER SUBCLUSTER statement. You can also switch a subcluster from a primary to a secondary subcluster, or from a secondary to a primary.
Renaming a subcluster
To rename an existing subcluster, use the ALTER SUBCLUSTER statement's RENAME TO clause:
=> ALTER SUBCLUSTER default_subcluster RENAME TO load_subcluster;
ALTER SUBCLUSTER
=> SELECT DISTINCT subcluster_name FROM subclusters;
subcluster_name
-------------------
load_subcluster
analytics_cluster
(2 rows)
Changing the default subcluster
The default subcluster designates which subcluster Vertica adds nodes to if you do not explicitly specify a subcluster when adding nodes to the database. When you create a new database (or upgrade a database from a version prior to 9.3.0), default_subcluster is the default subcluster. You can find the current default subcluster by querying the is_default column of the SUBCLUSTERS system table.
The following example demonstrates finding the default subcluster, and then changing it to the subcluster named analytics_cluster:
=> SELECT DISTINCT subcluster_name FROM SUBCLUSTERS WHERE is_default = true;
subcluster_name
--------------------
default_subcluster
(1 row)
=> ALTER SUBCLUSTER analytics_cluster SET DEFAULT;
ALTER SUBCLUSTER
=> SELECT DISTINCT subcluster_name FROM SUBCLUSTERS WHERE is_default = true;
subcluster_name
-------------------
analytics_cluster
(1 row)
Converting a subcluster from primary to secondary, or secondary to primary
You usually choose whether a subcluster is primary or secondary when creating it (see Creating subclusters for more information). However, you can switch a subcluster between the two settings after you have created it. You may want to change whether a subcluster is primary or secondary to affect the K-safety of your database. For example, if a single primary subcluster has down nodes that you cannot easily replace, you can promote a secondary subcluster to primary to ensure that losing another primary node does not cause your database to shut down. On the other hand, you may choose to convert a primary subcluster to secondary before eventually shutting it down. This conversion can prevent the database from losing K-safety if the subcluster you are shutting down contains half or more of the total number of primary nodes in the database.
Note
You cannot promote or demote a subcluster containing the
initiator node. You must be connected to a node in a subcluster other than the one you want to promote or demote.
To make a secondary subcluster into a primary subcluster, use the PROMOTE_SUBCLUSTER_TO_PRIMARY function:
=> SELECT DISTINCT subcluster_name, is_primary from subclusters;
subcluster_name | is_primary
-------------------+------------
analytics_cluster | f
load_subcluster | t
(2 rows)
=> SELECT PROMOTE_SUBCLUSTER_TO_PRIMARY('analytics_cluster');
PROMOTE_SUBCLUSTER_TO_PRIMARY
-------------------------------
PROMOTE SUBCLUSTER TO PRIMARY
(1 row)
=> SELECT DISTINCT subcluster_name, is_primary from subclusters;
subcluster_name | is_primary
-------------------+------------
analytics_cluster | t
load_subcluster | t
(2 rows)
Making a primary subcluster into a secondary subcluster is similar. However, unlike promoting a secondary subcluster, several conditions can prevent you from demoting a primary subcluster. Vertica prevents you from making a primary subcluster into a secondary subcluster if any of the following is true:
-
The subcluster contains a critical node.
-
The subcluster is the only primary subcluster in the database. You must have at least one primary subcluster.
-
The initiator node is a member of the subcluster you are trying to demote. You must call DEMOTE_SUBCLUSTER_TO_SECONDARY from another subcluster.
To convert a primary subcluster to secondary, use the DEMOTE_SUBCLUSTER_TO_SECONDARY function:
=> SELECT DISTINCT subcluster_name, is_primary from subclusters;
subcluster_name | is_primary
-------------------+------------
analytics_cluster | t
load_subcluster | t
(2 rows)
=> SELECT DEMOTE_SUBCLUSTER_TO_SECONDARY('analytics_cluster');
DEMOTE_SUBCLUSTER_TO_SECONDARY
--------------------------------
DEMOTE SUBCLUSTER TO SECONDARY
(1 row)
=> SELECT DISTINCT subcluster_name, is_primary from subclusters;
subcluster_name | is_primary
-------------------+------------
analytics_cluster | f
load_subcluster | t
(2 rows)
5.7 - Removing subclusters
Removing a subcluster from the database deletes the subcluster from the Vertica catalog.
Removing a subcluster from the database deletes the subcluster from the Vertica catalog. During the removal, Vertica removes any nodes in the subcluster from the database. These nodes are still part of the database cluster, but are no longer part of the database. If you view your cluster in the MC, you will see these nodes with the status STANDBY. They can be added back to the database by adding them to another subcluster. See Creating subclusters and Adding New Nodes to a Subcluster.
Vertica places several restrictions on removing a subcluster:
Note
Removing a subcluster can fail if the database is repartitioning. If this happens, you will see the error message "Transaction commit aborted because session subscriptions do not match catalog." Wait until the repartitioning is done before removing a subcluster.
To remove a subcluster, use the admintools command line db_remove_subcluster
tool:
$ adminTools -t db_remove_subcluster -h
Usage: db_remove_subcluster [options]
Options:
-h, --help show this help message and exit
-d DB, --database=DB Name of database to be modified
-c SCNAME, --subcluster=SCNAME
Name of subcluster to be removed
-p DBPASSWORD, --password=DBPASSWORD
Database password in single quotes
--timeout=NONINTERACTIVE_TIMEOUT
set a timeout (in seconds) to wait for actions to
complete ('never') will wait forever (implicitly sets
-i)
-i, --noprompts do not stop and wait for user input(default false).
Setting this implies a timeout of 20 min.
--skip-directory-cleanup
Caution: this option will force you to do a manual
cleanup. This option skips directory deletion during
remove subcluster. This is best used in a cloud
environment where the hosts being removed will be
subsequently discarded.
This example removes the subcluster named analytics_cluster:
$ adminTools -t db_remove_subcluster -d verticadb -c analytics_cluster -p 'password'
Found node v_verticadb_node0004 in subcluster analytics_cluster
Found node v_verticadb_node0005 in subcluster analytics_cluster
Found node v_verticadb_node0006 in subcluster analytics_cluster
Found node v_verticadb_node0007 in subcluster analytics_cluster
Waiting for rebalance shards. We will wait for at most 36000 seconds.
Rebalance shards polling iteration number [0], started at [17:09:35], time
out at [03:09:35]
Attempting to drop node v_verticadb_node0004 ( 10.11.12.40 )
Shutting down node v_verticadb_node0004
Sending node shutdown command to '['v_verticadb_node0004', '10.11.12.40',
'/vertica/data', '/vertica/data']'
Deleting catalog and data directories
Update admintools metadata for v_verticadb_node0004
Eon mode detected. The node v_verticadb_node0004 has been removed from
host 10.11.12.40. To remove the node metadata completely, please clean
up the files corresponding to this node, at the communal location:
s3://eonbucket/verticadb/metadata/verticadb/nodes/v_verticadb_node0004
Attempting to drop node v_verticadb_node0005 ( 10.11.12.50 )
Shutting down node v_verticadb_node0005
Sending node shutdown command to '['v_verticadb_node0005', '10.11.12.50',
'/vertica/data', '/vertica/data']'
Deleting catalog and data directories
Update admintools metadata for v_verticadb_node0005
Eon mode detected. The node v_verticadb_node0005 has been removed from
host 10.11.12.50. To remove the node metadata completely, please clean
up the files corresponding to this node, at the communal location:
s3://eonbucket/verticadb/metadata/verticadb/nodes/v_verticadb_node0005
Attempting to drop node v_verticadb_node0006 ( 10.11.12.60 )
Shutting down node v_verticadb_node0006
Sending node shutdown command to '['v_verticadb_node0006', '10.11.12.60',
'/vertica/data', '/vertica/data']'
Deleting catalog and data directories
Update admintools metadata for v_verticadb_node0006
Eon mode detected. The node v_verticadb_node0006 has been removed from
host 10.11.12.60. To remove the node metadata completely, please clean
up the files corresponding to this node, at the communal location:
s3://eonbucket/verticadb/metadata/verticadb/nodes/v_verticadb_node0006
Attempting to drop node v_verticadb_node0007 ( 10.11.12.70 )
Shutting down node v_verticadb_node0007
Sending node shutdown command to '['v_verticadb_node0007', '10.11.12.70',
'/vertica/data', '/vertica/data']'
Deleting catalog and data directories
Update admintools metadata for v_verticadb_node0007
Eon mode detected. The node v_verticadb_node0007 has been removed from
host 10.11.12.70. To remove the node metadata completely, please clean
up the files corresponding to this node, at the communal location:
s3://eonbucket/verticadb/metadata/verticadb/nodes/v_verticadb_node0007
Reload spread configuration
Replicating configuration to all nodes
Checking database state
Node Status: v_verticadb_node0001: (UP) v_verticadb_node0002: (UP)
v_verticadb_node0003: (UP)
Communal storage detected: syncing catalog
6 - Depot management
The nodes of an Eon Mode database fetch data from communal storage as needed to process queries, and cache that data locally on disk.
The nodes of an Eon Mode database fetch data from communal storage as needed to process queries, and cache that data locally on disk. The cached data of all nodes within a subcluster comprises that subcluster's depot. Vertica uses depots to facilitate query execution: when processing a query, Vertica first checks the current depot for the required data. If the data is unavailable, Vertica fetches it from communal storage and saves a copy in the depot to expedite future queries. Vertica also uses the depot for load operations, caching newly loaded data in the depot before uploading it to communal storage.
6.1 - Managing depot caching
You can control how the depot caches data from communal storage and how data is evicted from the depot.
You can control depot caching in several ways:
You can monitor depot activity and settings with several V_MONITOR
system tables, or with the Management Console.
Note
Depot caching is supported only on primary shard subscriber nodes.
Depot gateway parameters
Vertica depots can cache two types of data:
- Queried data: The depot facilitates query execution by fetching queried data from communal storage and caching it in the depot. The cached data remains available until it is evicted to make room for fresher data, or data that is fetched for more recent queries.
- Loaded data: The depot expedites load operations such as COPY by temporarily caching data until it is uploaded to communal storage.
By default, depots are configured to cache both types of data.
Two configuration parameters determine whether a depot caches queried or loaded data:
UseDepotForReads (BOOLEAN)
- If 1 (default), search the depot for the queried data and, if it is not found, fetch the data from communal storage. If 0, bypass the depot and fetch queried data from communal storage.
UseDepotForWrites (BOOLEAN)
- If 1 (default), write loaded data to the depot and then upload files to communal storage. If 0, bypass the depot and write directly to communal storage.
Both parameters can be set at session, user and database levels.
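For example, the following statements sketch how you might set these parameters at the database, user, and session levels. The user name joe is taken from the example below; the syntax mirrors the ALTER DATABASE, ALTER USER, and ALTER SESSION examples used elsewhere in this section:
=> ALTER DATABASE default SET PARAMETER UseDepotForWrites = 1;
=> ALTER USER joe SET PARAMETER UseDepotForWrites = 0;
=> ALTER SESSION SET UseDepotForReads = 0;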
If set at session or user levels, these parameters can be used to segregate read and write activity on the depots of different subclusters. For example, parameters UseDepotForReads and UseDepotForWrites might be set as follows for users joe
and rhonda
:
=> SHOW USER joe ALL;
name | setting
-------------------------+---------
UseDepotForReads | 1
UseDepotForWrites | 0
(2 rows)
=> SHOW USER rhonda ALL;
name | setting
-------------------------+---------
UseDepotForReads | 0
UseDepotForWrites | 1
(2 rows)
Given these user settings, when joe
connects to a Vertica subcluster, his session only uses the current depot to process queries; all load operations are uploaded to communal storage. Conversely, rhonda
's sessions only use the depot to process load operations; all queries must fetch their data from communal storage.
Depot fetching
If a depot is enabled to cache queried data (UseDepotForReads = 1
), you can configure how it fetches data from communal storage with configuration parameter DepotOperationsForQuery. This parameter has three settings:
-
ALL
(default): Fetch file data from communal storage, displacing existing files by evicting them from the depot if necessary.
-
FETCHES
: Fetch file data from communal storage only if space is available; otherwise, read the queried data directly from communal storage.
-
NONE
: Do not fetch file data to the depot; read the queried data directly from communal storage.
You can set fetching behavior at four levels, in ascending order of precedence: database, user, session, and query (with the DEPOT_FETCH hint).
For example, you can set DepotOperationsForQuery at the database level as follows:
=> ALTER DATABASE default SET PARAMETER DepotOperationsForQuery = FETCHES;
ALTER DATABASE
This setting applies to all database depots unless overridden at other levels. For example, the following ALTER USER statement overrides database-level fetching behavior: file data is always fetched to the depot for all queries from user joe
:
=> ALTER USER joe SET PARAMETER DepotOperationsForQuery = ALL;
ALTER USER
Finally, joe
can override his own DepotOperationsForQuery setting by including the DEPOT_FETCH hint in individual queries:
=> SELECT /*+DEPOT_FETCH(NONE)*/ count(*) FROM bar;
Evicting depot data
In general, Vertica evicts data from the depot as needed to provide room for new data and expedite request processing. Before writing new data to the depot, Vertica evaluates it as follows:
- Data fetched from communal storage: Vertica sizes the download and evicts data from the depot accordingly.
- Data uploaded from a DML operation such as COPY: Vertica cannot estimate the total size of the upload before it is complete, so it sizes individual buffers and evicts data from the depot as needed.
In both cases, Vertica evicts objects from the depot in the following order, from highest eviction priority to lowest:
- Least recently used objects with an anti-pinning policy.
- Objects with an anti-pinning policy.
- Least recently used unpinned object evicted for any new object, pinned or unpinned.
- Least recently used pinned object evicted for a new pinned object. Only pinned storage can evict other pinned storage.
Depot eviction policies
Vertica supports two policy types to manage precedence of object eviction from the depot:
- Apply pinning policies to objects so Vertica is less likely to evict them from the depot than other objects.
- Apply anti-pinning policies to objects so Vertica is more likely to evict them than other objects.
You can apply either type of policy on individual subclusters, or on the entire database. Policies can apply at different levels of granularity—table, projection, and partitions. Eviction policies that are set on an individual subcluster have no effect on how other subclusters handle depot object eviction.
Pinning policies
You can set pinning policies on database objects to reduce their exposure to eviction from the depot. Pinning policies can be set on individual subclusters, or on the entire database, and at different levels of granularity—table, projection, and partitions:
By default, pinned objects are queued for download from communal storage as needed to execute a query or DML operation. The SET_DEPOT_PIN_POLICY functions can override this behavior and immediately queue newly pinned objects for download: set the function's last Boolean argument to true. For example:
=> SELECT SET_DEPOT_PIN_POLICY_TABLE ('store.store_orders_fact', 'default_subcluster', true );
Tip
How soon Vertica downloads a pinned object from communal storage depends on a number of factors, including space availability and precedence of other pinned objects that are queued for download. You can force immediate download of queued objects by calling
FINISH_FETCHING_FILES.
Anti-pinning policies
Vertica complements pinning policies with anti-pinning policies. Among all depot-cached objects, Vertica chooses objects with an anti-pinning policy for eviction before all others. Like pinning policies, you can set anti-pinning policies on individual subclusters or on the entire database, and at different levels of granularity—table, projection, and partitions:
In some cases, object-specific anti-pinning might be preferable over depot-wide exclusions, such as setting the depot to be read- or write-only, or excluding specific users from fetching objects to the depot. For example, you might want to set anti-pinning on an infrequently-used table to prevent it from displacing tables that are used more frequently.
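As a sketch of that last case, the following call sets an anti-pinning policy on a rarely queried table for one subcluster. The SET_DEPOT_ANTI_PIN_POLICY_TABLE function name, and the table and subcluster names, are assumptions for illustration; confirm the anti-pinning functions available in your Vertica version:
=> SELECT SET_DEPOT_ANTI_PIN_POLICY_TABLE('archive.history_fact', 'default_subcluster');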
Overlapping eviction policies
If you set multiple eviction policies on a table or projection, Vertica gives precedence to the most recent policy. For example, if you issue an anti-pinning policy on a table that already has a pinning policy, Vertica favors the anti-pinning policy over the pinning policy.
If you issue partition-level eviction policies on the same partitioned table, and the key ranges of these policies overlap, Vertica acts as follows:
-
If the overlapping policies are of the same type—that is, all are either anti-pinning or pinning partition policies—then Vertica collates the key ranges. For example, if you create two anti-pinning partition policies with key ranges of 1-3 and 2-10, Vertica combines them into a single anti-pinning partition policy with a key range of 1-10.
-
If there are overlapping pinning and anti-pinning policies, Vertica favors the newer policy, either truncating or splitting the older policy.
For example, if you create an anti-partition pinning policy and then a pinning policy with key ranges of 1-10 and 5-20, respectively, Vertica favors the newer pinning policy by truncating the earlier anti-pinning policy's key range:
 policy_type | min_value | max_value
-------------+-----------+-----------
 PIN         |         5 |        20
 ANTI_PIN    |         1 |         4
If the new pinning policy's partition range falls inside the range of an older anti-pinning policy, Vertica splits the anti-pinning policy. So, given an existing partition anti-pinning policy with a key range from 1 through 20, a new partition pinning policy with a key range from 5 through 10 splits the anti-pinning policy:
 policy_type | min_value | max_value
-------------+-----------+-----------
 ANTI_PIN    |         1 |         4
 PIN         |         5 |        10
 ANTI_PIN    |        11 |        20
Eviction policy guidelines
Pinning and anti-pinning policies granularly control which objects consume depot space. When depot space is claimed by pinned objects, you guarantee that these objects and their operations take precedence over operations that involve unpinned objects or objects with an anti-pinning policy. However, if you do not create efficient pinning and anti-pinning policies, you might increase eviction frequency and adversely affect overall performance.
To minimize contention over depot usage, consider the following guidelines:
- Pin only those objects that are most active in DML operations and queries.
- Minimize the size of pinned data by setting policies at the smallest effective level. For example, pin only the data of a table's active partition.
- Periodically review eviction policies across all database subclusters, and update as needed to optimize depot usage.
You can also use the configuration parameters UseDepotForReads or UseDepotForWrites to optimize distribution of query and load activity across database subcluster depots.
Clearing depot policies
You clear pinning and anti-pinning policies from objects with the following functions:
Depot warming
On startup, the depots of new nodes are empty, while the depots of restarted nodes often contain stale data that must be refreshed. When depot warming is enabled, a node that is undergoing startup preemptively loads its depot with frequently queried and pinned data. When the node completes startup and begins to execute queries, its depot already contains much of the data it needs to process those queries. This reduces the need to fetch data from communal storage, and expedites query performance accordingly.
Note
Fetching data to a warming depot can delay node startup.
By default, depot warming is disabled (EnableDepotWarmingFromPeers = 0). A node executes depot warming as follows:
- The node checks configuration parameter PreFetchPinnedObjectsToDepotAtStartup. If enabled (set to 1), the node:
- The node checks configuration parameter EnableDepotWarmingFromPeers. If enabled (set to 1), the node:
-
Identifies a peer node in the same subcluster whose depot contents it can copy.
-
After taking into account all pinned objects, calculates how much space remains available in the warming depot.
-
Gets from the peer node a list of the most recently used objects that can fit in the depot.
-
Queues the objects for fetching.
- If BackgroundDepotWarming is enabled (set to 1, default), the node loads queued objects into its depot while it is warming, and continues to do so in the background after the node becomes active and starts executing queries. Otherwise (BackgroundDepotWarming = 0), node activation is deferred until the depot fetches and loads all queued objects.
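For example, a minimal sketch of enabling depot warming at the database level, using the parameters named above and the ALTER DATABASE syntax shown elsewhere in this section:
=> ALTER DATABASE default SET PARAMETER EnableDepotWarmingFromPeers = 1;
=> ALTER DATABASE default SET PARAMETER BackgroundDepotWarming = 1;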
Monitoring the depot
You can monitor depot activity and settings with several V_MONITOR system tables.
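For example, a query along the following lines (assuming the V_MONITOR.DEPOT_FILES system table is available in your version) shows how many files each node currently caches in its depot:
=> SELECT node_name, COUNT(*) AS cached_files
   FROM v_monitor.depot_files
   GROUP BY node_name
   ORDER BY node_name;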
6.2 - Resizing depot caching capacity
Each node in an Eon database caches depot data in a predefined storage location.
Each node in an Eon database caches depot data in a predefined storage location. The storage location path depends on your Vertica installation's filesystem. By default, each node in a cluster can use up to 60 percent of disk space on the storage location's filesystem to cache depot data. You can change caching capacity with ALTER_LOCATION_SIZE by specifying either a fixed size or a percentage of total disk space. The function can specify a single node, a subcluster, or all nodes in the database cluster. You can increase depot caching capacity for each node up to 80 percent.
In the following example, ALTER_LOCATION_SIZE increases depot caching capacity to 80 percent of disk space on the storage location's filesystem. The function supplies an empty string as the second (node-name
) argument, so the change applies to all nodes:
Important
By default, depot caching capacity cannot exceed 80 percent of disk space on the storage location's filesystem; attempts to set it to a higher value return an error. Vertica requires at least 20 percent of disk space for the catalog, Data Collector tables, and temp files.
=> SELECT node_name, location_label, location_path, max_size, disk_percent FROM storage_locations WHERE location_usage = 'DEPOT' ORDER BY node_name;
node_name | location_label | location_path | max_size | disk_percent
------------------+-----------------+-------------------------+-------------+--------------
v_vmart_node0001 | auto-data-depot | /home/dbadmin/verticadb | 36060108800 | 70%
v_vmart_node0002 | auto-data-depot | /home/dbadmin/verticadb | 36059377664 | 70%
v_vmart_node0003 | auto-data-depot | /home/dbadmin/verticadb | 36060108800 | 70%
(3 rows)
=> SELECT alter_location_size('depot', '','80%');
alter_location_size
---------------------
depotSize changed.
(1 row)
=> SELECT node_name, location_label, location_path, max_size, disk_percent FROM storage_locations WHERE location_usage = 'DEPOT' ORDER BY node_name;
node_name | location_label | location_path | max_size | disk_percent
------------------+-----------------+-------------------------+-------------+--------------
v_vmart_node0001 | auto-data-depot | /home/dbadmin/verticadb | 41211552768 | 80%
v_vmart_node0002 | auto-data-depot | /home/dbadmin/verticadb | 41210717184 | 80%
v_vmart_node0003 | auto-data-depot | /home/dbadmin/verticadb | 41211552768 | 80%
(3 rows)
Rescaling depot capacity
When a database is revived on an instance with more or less disk space than it had previously, Vertica evaluates the depot size settings that were previously in effect. If depot size was specified as a percentage of available disk space, Vertica proportionately rescales depot capacity. For example, if depot caching capacity for a given node was set to 70 percent of a 1 TB filesystem (700 GB) and the node is revived on a 2 TB filesystem, the node applies the same 70 percent setting to the new disk space, and depot capacity becomes 1.4 TB. If depot capacity was set to a fixed size, Vertica applies that setting, unless doing so would consume more than 80 percent of available disk space. In that case, Vertica automatically adjusts depot size as needed.
7 - Scaling your Eon Mode database
One of the strengths of an Eon Mode database is its ability to grow or shrink to meet your workload demands.
One of the strengths of an Eon Mode database is its ability to grow or shrink to meet your workload demands. You can add nodes to and remove nodes from your database to meet changing workload demands. For an overview of why you would scale your database and how it affects queries, see Elasticity.
Scaling up your database by starting stopped nodes
The easiest way to scale up your database is to start any stopped nodes:
Scaling up your database by adding nodes
If you do not have stopped nodes in your database, or the stopped nodes are not in the subclusters where you want to add new nodes, then you can add new nodes to the database. In supported environments, you can use the MC to provision and add new nodes to your database in a single step. See Viewing and managing your cluster for more information.
You can also manually add new nodes:
Controlling how Vertica uses your new nodes
New nodes can improve your database's performance in one of two ways:
See Elasticity for details on these performance improvements. You control how the new nodes improve your database's performance by choosing what subclusters you add them to. The following topics explain how to use scaling to improve throughput and query performance.
7.1 - Reshard the default namespace
The initial number of shards is set when you create a namespace.
Important
You can only change the number of shards in the
default_namespace
if it is the only namespace in your database. If your database contains any non-default namespaces, running
RESHARD_DATABASE results in an error.
The initial number of shards for the default_namespace
is set when you create a database. You might choose to change the number of shards in this namespace for the following reasons:
-
Improve large subcluster performance. For example, if you have a 24-node subcluster and a namespace with 6 shards, the subcluster uses Elastic Crunch Scaling (ECS) to split the responsibility for processing the data in each shard among the nodes. Re-sharding the namespace to 24 shards avoids using ECS, which improves performance because ECS is not as efficient as having a 1:1 shard:node ratio. For more information, see Using elastic crunch scaling to improve query performance.
-
Reduce the catalog size. Reducing the number of shards in default_namespace reduces the catalog size.
-
Improve performance after migrating from Enterprise Mode to Eon Mode. When you migrate your database from Enterprise Mode to Eon Mode, the number of shards in the default_namespace
is initially set to the number of nodes that you had in your Enterprise database. This default number of shards might not be ideal. For details, see Choosing the initial node and shard counts.
-
Scale your database effectively. To evenly distribute work among nodes, the number of nodes in the database should be a multiple of the number of shards. You might re-shard the namespace if you plan to scale the subclusters to a size that is incompatible with this guidance. For example, a namespace with 7 shards works best with subclusters that have a multiple of 7 nodes. Choosing a shard count with more divisors, such as 8, gives you greater flexibility in choosing the number of nodes in a subcluster.
You should not re-shard your default_namespace
every time you scale subclusters. While in progress, re-sharding might affect the database's performance. After re-sharding, the storage containers on the subcluster are not immediately aligned with the new shard subscription bounds. This misalignment adds overhead to query execution.
Re-sharding the default_namespace
To re-shard the default_namespace
, call the RESHARD_DATABASE function with the new shard count as the argument. This function takes a global catalog lock, so avoid running it during busy periods or when performing heavy ETL loads. The runtime depends on the size of your catalog.
After RESHARD_DATABASE completes, the nodes in the cluster use the new catalog shard definitions. However, the re-sharding process does not immediately alter the storage containers in communal storage. The shards continue to point to the existing storage containers. For example, if you double the number of shards in the namespace, each storage container now has two associated shards. During queries, each node filters out the data in the storage containers that does not apply to its subscribed shard. This adds a small overhead to the query. Eventually, the Tuple Mover's background reflexive mergeout processes automatically update the storage containers so they align with the new shard definitions. You can call DO_TM_TASK to run a 'RESHARDMERGEOUT' task to have the Tuple Mover realign the storage containers immediately.
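For example, the following sketch invokes that task; omitting DO_TM_TASK's optional table argument applies the task to all tables, or you can supply a table name to limit its scope:
=> SELECT DO_TM_TASK('RESHARDMERGEOUT');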
The following query returns the details of any storage containers that Tuple Mover has not yet realigned:
=> SELECT * FROM storage_containers WHERE original_segment_lower_bound IS NOT NULL AND original_segment_upper_bound IS NOT NULL;
Example
This example demonstrates the re-sharding process and how it affects shard assignments and storage containers. To illustrate the impact of re-sharding, the shard assignment and storage container details are compared before and after re-sharding. The following four queries return information about the namespace's shard count, shard bounds, node subscriptions, and storage container catalog objects:
=> SELECT * FROM NAMESPACES;
namespace_oid | namespace_name | is_default | default_shard_count
-------------------+-------------------+------------+---------------------
45035996273704988 | default_namespace | t | 4
=> SELECT shard_name, lower_hash_bound, upper_hash_bound FROM shards ORDER BY shard_name;
shard_name | lower_hash_bound | upper_hash_bound
------------+------------------+------------------
replica | |
segment0001 | 0 | 1073741825
segment0002 | 1073741826 | 2147483649
segment0003 | 2147483650 | 3221225473
segment0004 | 3221225474 | 4294967295
(5 rows)
=> SELECT node_name, shard_name, is_primary, is_resubscribing, is_participating_primary FROM node_subscriptions;
node_name | shard_name | is_primary | is_resubscribing | is_participating_primary
----------+-------------+------------+------------------+--------------------------
initiator | replica | t | f | t
e0 | replica | f | f | t
e1 | replica | f | f | t
e2 | replica | f | f | t
e0 | segment0002 | t | f | t
e1 | segment0003 | t | f | t
e2 | segment0004 | t | f | t
initiator | segment0001 | t | f | t
(8 rows)
=> SELECT node_name, projection_name, storage_oid, sal_storage_id, total_row_count, deleted_row_count, segment_lower_bound, segment_upper_bound, shard_name FROM storage_containers WHERE projection_name = 't_super';
node_name | projection_name | storage_oid | sal_storage_id | total_row_count | deleted_row_count | segment_lower_bound | segment_upper_bound | shard_name
----------+-----------------+-------------------+--------------------------------------------------+-----------------+-------------------+---------------------+---------------------+-------------
initiator | t_super | 45035996273842990 | 022e836bff54b0aed318df2fe73b5afe00a0000000021b2d | 4 | 0 | 0 | 1073741825 | segment0001
e0 | t_super | 49539595901213486 | 024bbf043c1ca3f5c7a86a423fc7e1e300b0000000021b2d | 3 | 0 | 1073741826 | 2147483649 | segment0002
e1 | t_super | 54043195528583990 | 02dac7dc405a1620c92bae1a17c7bbad00c0000000021b35 | 8 | 0 | 2147483650 | 3221225473 | segment0003
e2 | t_super | 54043195528583992 | 02dac7dc405a1620c92bae1a17c7bbad00c0000000021b31 | 6 | 0 | 3221225474 | 4294967295 | segment0004
(4 rows)
The following call to RESHARD_DATABASE changes the shard count of default_namespace
to eight:
=> SELECT RESHARD_DATABASE(8);
RESHARD_DATABASE
----------------------------------------------------------
The database has been re-sharded from 4 shards to 8 shards
(1 row)
To confirm that the reshard operation was successful, query the NAMESPACES system table:
=> SELECT * FROM NAMESPACES;
namespace_oid | namespace_name | is_default | default_shard_count
-------------------+-------------------+------------+---------------------
45035996273704988 | default_namespace | t | 8
You can use the following query to view the namespace's new shard definitions:
=> SELECT shard_name, lower_hash_bound, upper_hash_bound FROM shards ORDER BY shard_name;
shard_name | lower_hash_bound | upper_hash_bound
-------------+------------------+------------------
replica | |
segment0001 | 0 | 536870913
segment0002 | 536870914 | 1073741825
segment0003 | 1073741826 | 1610612737
segment0004 | 1610612738 | 2147483649
segment0005 | 2147483650 | 2684354561
segment0006 | 2684354562 | 3221225473
segment0007 | 3221225474 | 3758096385
segment0008 | 3758096386 | 4294967295
(9 rows)
The namespace now has eight shards. Because re-sharding cut the boundary range of each shard in half, each shard is responsible for about half as much of the communal storage data.
The following query returns the namespace's new node subscriptions:
=> SELECT node_name, shard_name, is_primary, is_resubscribing, is_participating_primary FROM node_subscriptions;
node_name | shard_name | is_primary | is_resubscribing | is_participating_primary
-----------+-------------+------------+------------------+--------------------------
initiator | replica | t | f | t
e0 | replica | f | f | t
e1 | replica | f | f | t
e2 | replica | f | f | t
initiator | segment0001 | t | f | t
e0 | segment0002 | t | f | t
e1 | segment0003 | t | f | t
e2 | segment0004 | t | f | t
initiator | segment0005 | t | f | t
e0 | segment0006 | t | f | t
e1 | segment0007 | t | f | t
e2 | segment0008 | t | f | t
(12 rows)
After re-sharding, each node now subscribes to two shards instead of one.
You can use the following query to see how re-sharding affected the database's storage container catalog objects:
=> SELECT node_name, projection_name, storage_oid, sal_storage_id, total_row_count, deleted_row_count, segment_lower_bound, segment_upper_bound, shard_name FROM storage_containers WHERE projection_name = 't_super';
node_name | projection_name | storage_oid | sal_storage_id | total_row_count | deleted_row_count | segment_lower_bound | segment_upper_bound | shard_name
----------+-----------------+-------------------+--------------------------------------------------+-----------------+-------------------+---------------------+---------------------+-------------
initiator | t_super | 45035996273843145 | 02dac7dc405a1620c92bae1a17c7bbad00c0000000021b35 | 8 | 0 | 2147483650 | 3221225473 | segment0005
initiator | t_super | 45035996273843149 | 022e836bff54b0aed318df2fe73b5afe00a0000000021b2d | 4 | 0 | 0 | 1073741825 | segment0001
e0 | t_super | 49539595901213641 | 02dac7dc405a1620c92bae1a17c7bbad00c0000000021b35 | 8 | 0 | 2147483650 | 3221225473 | segment0006
e0 | t_super | 49539595901213645 | 022e836bff54b0aed318df2fe73b5afe00a0000000021b2d | 4 | 0 | 0 | 1073741825 | segment0002
e1 | t_super | 54043195528584141 | 02dac7dc405a1620c92bae1a17c7bbad00c0000000021b31 | 6 | 0 | 3221225474 | 4294967295 | segment0007
e1 | t_super | 54043195528584143 | 02dac7dc405a1620c92bae1a17c7bbad00c0000000021b31 | 6 | 0 | 1073741826 | 2147483649 | segment0003
e2 | t_super | 54043195528584137 | 024bbf043c1ca3f5c7a86a423fc7e1e300b0000000021b2d | 3 | 0 | 3221225474 | 4294967295 | segment0008
e2 | t_super | 54043195528584139 | 024bbf043c1ca3f5c7a86a423fc7e1e300b0000000021b2d | 3 | 0 | 1073741826 | 2147483649 | segment0004
(8 rows)
The shards point to storage files with the same sal_storage_id
as before the re-shard. Eventually, the TM's mergeout processes will automatically update the storage containers.
You can query the RESHARDING_EVENTS system table for information about current and historical resharding operations, such as a node's previous shard subscription bounds and the current status of the resharding operation:
=> SELECT node_name, running_status, old_shard_name, old_shard_lower_bound, old_shard_upper_bound FROM RESHARDING_EVENTS;
node_name | running_status | old_shard_name | old_shard_lower_bound | old_shard_upper_bound
----------+-----------------+----------------+---------------------------+-------------------------
e0 | Running | segment0001 | 0 | 1073741825
e0 | Running | segment0002 | 1073741826 | 2147483649
e0 | Running | segment0003 | 2147483650 | 3221225473
e0 | Running | segment0004 | 3221225474 | 4294967295
e1 | Running | segment0001 | 0 | 1073741825
e1 | Running | segment0002 | 1073741826 | 2147483649
e1 | Running | segment0003 | 2147483650 | 3221225473
e1 | Running | segment0004 | 3221225474 | 4294967295
initiator | Running | segment0001 | 0 | 1073741825
initiator | Running | segment0002 | 1073741826 | 2147483649
initiator | Running | segment0003 | 2147483650 | 3221225473
initiator | Running | segment0004 | 3221225474 | 4294967295
(12 rows)
After you re-shard the namespace, you can query the DC_ROSES_CREATED table to track the original ROS containers and DVMiniROSes from which the new storage containers were derived:
=> SELECT node_name, projection_name, storage_oid, old_storage_oid, is_dv FROM DC_ROSES_CREATED;
node_name | projection_name | storage_oid | old_storage_oid | is_dv
-----------+-----------------+-------------------+-------------------+-------
initiator | t_super | 45035996273860625 | 45035996273843149 | f
initiator | t_super | 45035996273860632 | 0 | f
e0 | t_super | 45035996273843149 | 0 | f
(3 rows)
7.2 - Improving query throughput using subclusters
Improving query throughput increases the number of queries your Eon Mode database processes at the same time.
Improving query throughput increases the number of queries your Eon Mode database processes at the same time. Throughput is usually your main concern when your workload consists of many short-running queries, often referred to as "dashboard queries." This term describes the type of workload you see when a large number of users have web-based dashboard pages open to monitor some sort of status. These dashboards tend to update frequently, using simple, short-running queries instead of analytics-heavy, long-running queries.
The best way to improve your database's throughput is to add new subclusters to the database or start any stopped subclusters. Then distribute the client connections among these subclusters using connection load balancing policies. Subclusters independently process queries. By adding more subclusters, you improve your database's parallelism.
For the best performance, make the number of nodes in your subcluster the same as the number of shards in a namespace. If you choose to have fewer nodes than shards, make the number of nodes an even divisor of the number of shards. When the number of shards is divisible by the number of nodes, the data in your namespace is equally divided among the nodes in the subcluster.
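For example, you can compare a namespace's shard count with each subcluster's node count using queries like the following sketch, which relies on the SHARDS and SUBCLUSTERS system tables used elsewhere in this section:
=> SELECT COUNT(*) AS shard_count FROM shards WHERE shard_name <> 'replica';
=> SELECT subcluster_name, COUNT(DISTINCT node_name) AS node_count
   FROM subclusters
   GROUP BY subcluster_name;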
The easiest way of adding subclusters is to use the MC:
-
From the MC home page, click the database you want to add subclusters to.
-
Click Manage.
-
Click Add Subcluster.
-
Follow the steps in the wizard to add the subcluster. Normally, the only items you need to fill in are the subcluster name and the number of instances to add to it.
Note
The MC currently does not support creating instances on all platforms. For those platforms where the MC does not support instances, you can manually add subclusters. See
Creating subclusters for more information.
Distributing clients among the throughput subclusters
To benefit from the added subclusters, clients that execute short-running queries must connect to the nodes in those subclusters. Queries run only on the subcluster that contains the initiator node (the node the client is connected to). Use connection load balancing policies to spread connections across all of the subclusters you created to increase query throughput. See Connection load balancing policies for details.
The following example creates a load balancing policy that spreads client connections across two three-node subclusters named query_pool_a and query_pool_b. This example:
-
Creates network addresses on the six nodes that are in the two subclusters.
-
Creates a load balance group from all the nodes in the two subclusters.
-
Creates the routing rule to redirect all incoming connections to the two subclusters.
=> CREATE NETWORK ADDRESS node04 ON v_verticadb_node0004 WITH '203.0.113.1';
CREATE NETWORK ADDRESS
=> CREATE NETWORK ADDRESS node05 ON v_verticadb_node0005 WITH '203.0.113.2';
CREATE NETWORK ADDRESS
=> CREATE NETWORK ADDRESS node06 ON v_verticadb_node0006 WITH '203.0.113.3';
CREATE NETWORK ADDRESS
=> CREATE NETWORK ADDRESS node07 ON v_verticadb_node0007 WITH '203.0.113.4';
CREATE NETWORK ADDRESS
=> CREATE NETWORK ADDRESS node08 ON v_verticadb_node0008 WITH '203.0.113.5';
CREATE NETWORK ADDRESS
=> CREATE NETWORK ADDRESS node09 ON v_verticadb_node0009 WITH '203.0.113.6';
CREATE NETWORK ADDRESS
=> CREATE LOAD BALANCE GROUP query_subclusters WITH SUBCLUSTER query_pool_a,
query_pool_b FILTER '0.0.0.0/0';
CREATE LOAD BALANCE GROUP
=> CREATE ROUTING RULE query_clients ROUTE '0.0.0.0/0' TO query_subclusters;
CREATE ROUTING RULE
Important
In cloud environments where clients will be connecting from outside the private network, use the external IP address for each node when creating the network addresses. Otherwise, external clients will not be able to connect to the nodes.
If you have a mix of internal and external clients, set up two network addresses for each node and add them to two separate load balancing groups: one for internal clients and another for external. Then create two routing rules: one that routes the internal clients to the internal group, and another that routes the external clients to the external group. Make the routing rule for internal clients apply only to the virtual private networks where your internal clients connect from (for example, 10.0.0.0/8). The routing rule for the external clients can use the 0.0.0.0/0 CIDR range (all IP addresses) as the incoming connection address range. The rules work correctly together because the more-specific internal client routing rule takes precedence over the less-restrictive external client rule.
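The following sketch shows this two-group setup for a single node. The address names, the internal IP address, and the group and rule names are hypothetical; repeat the CREATE NETWORK ADDRESS statements for each node and add the addresses to the appropriate groups:
=> CREATE NETWORK ADDRESS node04_private ON v_verticadb_node0004 WITH '10.0.0.4';
=> CREATE NETWORK ADDRESS node04_public ON v_verticadb_node0004 WITH '203.0.113.1';
=> CREATE LOAD BALANCE GROUP internal_group WITH ADDRESS node04_private;
=> CREATE LOAD BALANCE GROUP external_group WITH ADDRESS node04_public;
=> CREATE ROUTING RULE internal_clients ROUTE '10.0.0.0/8' TO internal_group;
=> CREATE ROUTING RULE external_clients ROUTE '0.0.0.0/0' TO external_group;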
After creating the policy, any client that opts into load balancing is redirected to one of the nodes in the two subclusters. For example, when you connect to node 1 in the cluster (with the IP address 203.0.113.1) using vsql with the -C flag, you see output similar to this:
$ vsql -h 203.0.113.1 -U dbadmin -w mypassword -C
Welcome to vsql, the Vertica Analytic Database interactive terminal.
Type: \h or \? for help with vsql commands
\g or terminate with semicolon to execute query
\q to quit
SSL connection (cipher: ECDHE-RSA-AES256-GCM-SHA384, bits: 256, protocol: TLSv1.2)
INFO: Connected using a load-balanced connection.
INFO: Connected to 203.0.113.7 at port 5433.
=>
Connection load balancing policies take into account nodes that are stopped when picking a node to handle a client connection. If you shut down one or more subclusters to save money during low-demand periods, you do not need to adjust your load balancing policy as long as some of the nodes are still up.
7.3 - Using elastic crunch scaling to improve query performance
You can choose to add nodes to your database to improve the performance of complex long-running analytic queries.
You can choose to add nodes to your database to improve the performance of complex, long-running analytic queries. Spreading the work across more nodes helps these queries run faster.
When you have more nodes in a subcluster than you have shards in a namespace, multiple nodes subscribe to each shard. To involve all of the nodes in the subcluster in queries, the Vertica query optimizer automatically uses a feature called Elastic Crunch Scaling (ECS). This feature splits the responsibility for processing the data in each shard among the nodes that subscribe to it. During a query, each node has less data to process and usually finishes the query faster.
For example, suppose you have a six-node subcluster in a three-shard namespace. In this subcluster, two nodes subscribe to each shard of the namespace. When you execute a query, Vertica assigns each node roughly half of the data in the shard it subscribes to. Because all nodes in the subcluster participate in the query, the query usually finishes faster than if only half the nodes had participated.
ECS lets a subcluster that has more nodes than shards in a namespace act as if the shard count in the namespace were higher. In a three-shard namespace, a six-node subcluster acts as if the namespace has six shards by splitting each shard in half. However, using ECS isn't as efficient as having a higher shard count. In practice, you will see slightly slower query performance on a six-node subcluster in a three-shard database than you would see from a six-node subcluster in a six-shard database.
If the only namespace in your database is default_namespace
, you can call RESHARD_DATABASE to change the number of shards in the namespace. If the new number of shards is greater than or equal to the number of nodes in the subcluster, the subcluster no longer uses ECS. This will generally lead to faster query performance. However, re-sharding produces a larger catalog size and storage containers that are initially misaligned with the new shard definitions. Until the storage containers are realigned, queries must filter out the data in the storage containers that is outside the new shard bounds. This adds a small overhead to queries. For details, see Reshard the default namespace.
You can determine when the optimizer will use ECS in a subcluster by querying the V_CATALOG.SESSION_SUBSCRIPTIONS system table and looking for nodes whose is_collaborating column is TRUE. Subclusters whose node count is less than or equal to the number of shards in a namespace have only participating nodes. Subclusters that have more nodes than a namespace's shard count assign the "extra" nodes the role of collaborators. The difference between the two types of nodes is not important when you are executing queries; the two types just relate to how Vertica organizes the nodes to execute ECS-enabled queries.
This example shows how to get the list of nodes that are participating or collaborating in resolving queries for the current session:
=> SELECT node_name, shard_name, is_collaborating, is_participating
FROM V_CATALOG.SESSION_SUBSCRIPTIONS
WHERE is_participating = TRUE OR is_collaborating = TRUE
ORDER BY shard_name, node_name;
node_name | shard_name | is_collaborating | is_participating
----------------------+-------------+------------------+------------------
v_verticadb_node0004 | replica | f | t
v_verticadb_node0005 | replica | f | t
v_verticadb_node0006 | replica | t | f
v_verticadb_node0007 | replica | f | t
v_verticadb_node0008 | replica | t | f
v_verticadb_node0009 | replica | t | f
v_verticadb_node0007 | segment0001 | f | t
v_verticadb_node0008 | segment0001 | t | f
v_verticadb_node0005 | segment0002 | f | t
v_verticadb_node0009 | segment0002 | t | f
v_verticadb_node0004 | segment0003 | f | t
v_verticadb_node0006 | segment0003 | t | f
(12 rows)
You can see that nodes 4, 5, and 7 are participating, and nodes 6, 8, and 9 are collaborating.
You can also see that ECS is enabled by looking at an EXPLAIN plan for a query. At the top of the plan for an ECS-enabled query is the statement "this query involves non-participating nodes." These non-participating nodes are the collaborating nodes that are splitting the data in the shard with the participating nodes. The plan also lists the nodes taking part in the query.
This example shows an explain plan for an ECS-enabled query in a six-node subcluster in a three-shard namespace:
=> EXPLAIN SELECT sales_quantity, sales_dollar_amount, transaction_type, cc_name
FROM online_sales.online_sales_fact
INNER JOIN online_sales.call_center_dimension
ON (online_sales.online_sales_fact.call_center_key
= online_sales.call_center_dimension.call_center_key
AND sale_date_key = 156)
ORDER BY sales_dollar_amount DESC;
QUERY PLAN
--------------------------------------------------------------------------------
------------------------------
QUERY PLAN DESCRIPTION:
The execution of this query involves non-participating nodes. Crunch scaling
strategy preserves data segmentation
------------------------------
EXPLAIN SELECT sales_quantity, sales_dollar_amount, transaction_type, cc_name
FROM online_sales.online_sales_fact
INNER JOIN online_sales.call_center_dimension
ON (online_sales.online_sales_fact.call_center_key
= online_sales.call_center_dimension.call_center_key
AND sale_date_key = 156)
ORDER BY sales_dollar_amount DESC;
Access Path:
+-SORT [Cost: 6K, Rows: 754K] (PATH ID: 1)
| Order: online_sales_fact.sales_dollar_amount DESC
| Execute on: v_verticadb_node0007, v_verticadb_node0004, v_verticadb_node0005,
| v_verticadb_node0006, v_verticadb_node0008, v_verticadb_node0009
| +---> JOIN MERGEJOIN(inputs presorted) [Cost: 530, Rows: 754K (202 RLE)] (PATH ID: 2)
| | Join Cond: (online_sales_fact.call_center_key = call_center_dimension.call_center_key)
| | Materialize at Output: online_sales_fact.sales_quantity,
| | online_sales_fact.sales_dollar_amount, online_sales_fact.transaction_type
| | Execute on: v_verticadb_node0007, v_verticadb_node0004,
| | v_verticadb_node0005, v_verticadb_node0006, v_verticadb_node0008,
| | v_verticadb_node0009
| | +-- Outer -> STORAGE ACCESS for online_sales_fact [Cost: 13, Rows: 754K (202 RLE)] (PATH ID: 3)
| | | Projection: online_sales.online_sales_fact_DBD_18_seg_vmart_b0
| | | Materialize: online_sales_fact.call_center_key
| | | Filter: (online_sales_fact.sale_date_key = 156)
| | | Execute on: v_verticadb_node0007, v_verticadb_node0004,
| | | v_verticadb_node0005, v_verticadb_node0006, v_verticadb_node0008,
| | | v_verticadb_node0009
| | | Runtime Filter: (SIP1(MergeJoin): online_sales_fact.call_center_key)
| | +-- Inner -> STORAGE ACCESS for call_center_dimension [Cost: 17, Rows: 200] (PATH ID: 4)
| | | Projection: online_sales.call_center_dimension_DBD_16_seg_vmart_b0
| | | Materialize: call_center_dimension.call_center_key, call_center_dimension.cc_name
| | | Execute on: v_verticadb_node0007, v_verticadb_node0004,
v_verticadb_node0005, v_verticadb_node0006, v_verticadb_node0008,
v_verticadb_node0009
. . .
Taking advantage of ECS
To take advantage of ECS, create a secondary subcluster where the number of nodes is a multiple of the number of shards in a namespace. For example, in a 12-shard namespace, create a subcluster that contains a multiple of 12 nodes such as 24 or 36. The number of nodes must be a multiple of the number of shards to evenly distribute the data in the namespace across the nodes in the subcluster. See Subclusters for more information.
Note
Instead of creating a new subcluster, you can add more nodes to an existing secondary subcluster. If possible, make the number of nodes in the subcluster a multiple of the shard count of the relevant namespace.
Once you have created the subcluster, have users connect to it and run their analytic queries. Vertica automatically enables ECS in the subcluster because it has more nodes than there are shards in a namespace.
How the optimizer assigns data responsibilities to nodes
The optimizer has two strategies to choose from when dividing the data in a shard among its subscribing nodes. One strategy is optimized for queries that use data segmentation. Queries that contain a JOIN or GROUP BY clause rely on data segmentation. The other strategy is for queries that do not need segmentation.
By default, the optimizer automatically chooses the strategy to use. For most queries, the automatically-chosen strategy results in faster query performance. For some queries, you may want to manually override the strategy using hints. In a small number of queries, ECS does not help performance. In these cases, you can disable ECS. See Manually choosing an ECS strategy for details.
7.4 - Manually choosing an ECS strategy
When the number of nodes in a subcluster is greater than the number of database shards, the Vertica query optimizer uses elastic crunch scaling (ECS) to involve all nodes in processing queries.
When the number of nodes in a subcluster is greater than the number of database shards, the Vertica query optimizer uses elastic crunch scaling (ECS) to involve all nodes in processing queries. For each shard, the optimizer divides responsibility for processing shard data among its subscribing nodes, using one of the following strategies:
- I/O-optimized: The optimizer divides the list of ROS containers in the shard among the subscribing nodes. Use this strategy when nodes must fetch the data for the query from communal storage rather than the depot. Nodes fetch from communal storage only the ROS containers they need to resolve the query, reducing the amount of data each node transfers. Due to the arbitrary division of data among the nodes, this strategy does not support query optimizations that rely on data segmentation.
- Compute-optimized: The optimizer uses data segmentation to assign portions of the data to each subscribing node. The nodes scan the entire shard, but use sub-segment filtering to find their assigned segments of the data. Use this strategy when most data for the query is in the depot, because nodes must scan the entire contents of the shard. Because this strategy uses data segmentation, it supports optimizations such as local joins that the I/O-optimized strategy cannot.
The optimizer automatically chooses a strategy based on whether the query can take advantage of data segmentation. You can tell which strategy the optimizer chooses for a query by using EXPLAIN. The top of the plan explanation states whether ECS is preserving segmentation. For example, this simple query on a single table does not need to use segmentation, so it uses the I/O-optimized strategy:
=> EXPLAIN SELECT employee_last_name,
employee_first_name,employee_age
FROM employee_dimension
ORDER BY employee_age DESC;
QUERY PLAN
--------------------------------------------------------------------------------
------------------------------
QUERY PLAN DESCRIPTION:
The execution of this query involves non-participating nodes.
Crunch scaling strategy does not preserve data segmentation
------------------------------
. . .
A more complex query using a JOIN results in ECS preserving data segmentation by using the compute-optimized strategy. The query plan tells you that segmentation is preserved:
=> EXPLAIN SELECT sales_quantity, sales_dollar_amount, transaction_type, cc_name
FROM online_sales.online_sales_fact
INNER JOIN online_sales.call_center_dimension
ON (online_sales.online_sales_fact.call_center_key
= online_sales.call_center_dimension.call_center_key
AND sale_date_key = 156)
ORDER BY sales_dollar_amount DESC;
QUERY PLAN
--------------------------------------------------------------------------------
------------------------------
QUERY PLAN DESCRIPTION:
The execution of this query involves non-participating nodes.
Crunch scaling strategy preserves data segmentation
------------------------------
. . .
In most cases, the optimizer chooses the best strategy to use to split the data among the nodes subscribing to the same shard. However, you might occasionally find that some queries perform poorly. In these cases, the query can embed the ECSMODE hint to specify which strategy to use, or even disable ECS.
Setting the ECS strategy for individual queries
You can use the ECSMODE hint in a query to force the optimizer to use a specific ECS strategy (or disable ECS entirely). The ECSMODE hint takes one of the following arguments:
- AUTO: The optimizer chooses the strategy to use. This value is useful only if ECS mode is set at the session level (see Setting the ECS strategy for the session or database).
- IO_OPTIMIZED: Use the I/O-optimized strategy.
- COMPUTE_OPTIMIZED: Use the compute-optimized strategy.
- NONE: Disable use of ECS for this query. Only participating nodes are involved in query execution; collaborating nodes are not.
The following example shows the query plan for a simple single-table query that is forced to use the compute-optimized strategy:
=> EXPLAIN SELECT /*+ECSMode(COMPUTE_OPTIMIZED)*/ employee_last_name,
employee_first_name,employee_age
FROM employee_dimension
ORDER BY employee_age DESC;
QUERY PLAN
--------------------------------------------------------------------------------
------------------------------
QUERY PLAN DESCRIPTION:
The execution of this query involves non-participating nodes.
Crunch scaling strategy preserves data segmentation
------------------------------
. . .
This example disables ECS in a six-node cluster in a three-shard database:
=> EXPLAIN SELECT /*+ECSMode(NONE)*/ employee_last_name,
employee_first_name,employee_age
FROM employee_dimension
ORDER BY employee_age DESC;
QUERY PLAN
--------------------------------------------------------------------------------
------------------------------
QUERY PLAN DESCRIPTION:
------------------------------
EXPLAIN SELECT /*+ECSMode(NONE)*/ employee_last_name,
employee_first_name,employee_age
FROM employee_dimension
ORDER BY employee_age DESC;
Access Path:
+-SORT [Cost: 243, Rows: 10K] (PATH ID: 1)
| Order: employee_dimension.employee_age DESC
| Execute on: v_verticadb_node0007, v_verticadb_node0004, v_verticadb_node0005
| +---> STORAGE ACCESS for employee_dimension [Cost: 71, Rows: 10K] (PATH ID: 2)
| | Projection: public.employee_dimension_DBD_8_seg_vmart_b0
| | Materialize: employee_dimension.employee_first_name,
| | employee_dimension.employee_last_name, employee_dimension.employee_age
| | Execute on: v_verticadb_node0007, v_verticadb_node0004,
| | v_verticadb_node0005
. . .
Note that this query plan lacks the "this query involves non-participating nodes" statement, indicating that it does not use ECS. It also lists just three participating nodes. These nodes are marked as participating in the V_CATALOG.SESSION_SUBSCRIPTIONS system table.
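If you want to check node roles directly, you can query that table. The following query is a sketch; the exact column set of SESSION_SUBSCRIPTIONS can vary by Vertica version, but it typically includes a per-node, per-shard participation flag:
=> SELECT node_name, shard_name, is_participating
   FROM V_CATALOG.SESSION_SUBSCRIPTIONS
   ORDER BY node_name, shard_name;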
Setting the ECS strategy for the session or database
You can use the ECSMode configuration parameter to set the ECS strategy for the current session. This parameter accepts the same values as the ECSMODE hint except NONE, which is valid only for individual queries.
The following example demonstrates using the configuration parameter to force a simple query to use the COMPUTE_OPTIMIZED strategy. It then sets the parameter back to its default value of AUTO:
=> EXPLAIN SELECT employee_first_name,employee_age
FROM employee_dimension ORDER BY employee_age DESC;
QUERY PLAN
--------------------------------------------------------------------------------
------------------------------
QUERY PLAN DESCRIPTION:
The execution of this query involves non-participating nodes.
Crunch scaling strategy does not preserve data segmentation
------------------------------
. . .
=> ALTER SESSION SET ECSMode = 'COMPUTE_OPTIMIZED';
ALTER SESSION
=> EXPLAIN SELECT employee_first_name,employee_age
FROM employee_dimension ORDER BY employee_age DESC;
QUERY PLAN
--------------------------------------------------------------------------------
------------------------------
QUERY PLAN DESCRIPTION:
The execution of this query involves non-participating nodes.
Crunch scaling strategy preserves data segmentation
------------------------------
. . .
=> ALTER SESSION SET ECSMode = 'AUTO';
ALTER SESSION
Individual query hints override the session-level settings. This example sets the session default to use COMPUTE_OPTIMIZED, then restores the default behavior for a query by using the ECSMode hint with the value AUTO:
=> ALTER SESSION SET ECSMode = 'COMPUTE_OPTIMIZED';
ALTER SESSION
=> EXPLAIN SELECT /*+ECSMode(AUTO)*/ employee_first_name,employee_age
FROM employee_dimension ORDER BY employee_age DESC;
QUERY PLAN
-----------------------------------------------------------------------------
------------------------------
QUERY PLAN DESCRIPTION:
The execution of this query involves non-participating nodes.
Crunch scaling strategy does not preserve data segmentation
------------------------------
Note that setting the ECSMode hint to AUTO lets the optimizer pick the I/O-optimized strategy (which does not preserve segmentation) instead of the compute-optimized strategy set at the session level.
You can also set the ECS strategy at the database level using ALTER DATABASE. However, doing so overrides the Vertica optimizer's settings for all users in all subclusters that use ECS. Before setting the ECS strategy at the database level, verify that the majority of queries run by users of the ECS-enabled subclusters actually benefit from overriding the optimizer's default behavior. If not, use the session- or query-level settings to override the optimizer only for the queries that benefit from a specific strategy.
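For example, a statement like the following forces the I/O-optimized strategy database-wide. This is a sketch that assumes the same ECSMode parameter name and ALTER DATABASE syntax shown elsewhere in this guide; clearing the parameter restores the optimizer's default behavior:
=> ALTER DATABASE DEFAULT SET ECSMode = 'IO_OPTIMIZED';
=> ALTER DATABASE DEFAULT CLEAR PARAMETER ECSMode;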
8 - Subcluster sandboxing
Sandboxing enables you to spin off a secondary subcluster from an existing cluster, resulting in two mutually isolated clusters that share the same data but do not interfere with each other.
Sandboxing enables you to spin off secondary subclusters from an existing cluster, resulting in two or more mutually isolated clusters that share the same data but do not interfere with each other. Sandboxed clusters inherit the state of the catalog and data of the main cluster at the time of sandbox creation. As soon as the sandbox is active, the catalogs and data of the clusters are independent and can diverge. Within each cluster, you can perform standard database operations and queries, such as creating new tables or loading libraries, without affecting the other cluster. For example, dropping a table in the sandboxed cluster does not drop the table in the main cluster, and vice versa.
Sandboxes support many use cases, including the following:
- Try out a version of Vertica without needing to spin up a new cluster and reload data. After creating a sandbox, you can upgrade the sandboxed subclusters and test the new Vertica features. To rejoin sandboxed subclusters to the main cluster, you just need to downgrade the Vertica version and perform the necessary unsandboxing tasks.
- Experiment with new features without compromising the consistency of the main cluster. For instance, you could spin off a sandbox and experiment with external tables using data stored by Apache Iceberg.
- Share data with another team by giving them access to a sandboxed cluster. This gives the other team the necessary data access but keeps their changes separate from the main cluster. Anything the other team changes in the sandbox, such as dropping tables, does not propagate back to the main cluster.
After removing the sandbox and performing necessary cleanup tasks, subclusters can rejoin the main cluster.
8.1 - Creating sandboxes
In order to create a sandbox for a secondary subcluster, all nodes in the subcluster must have a status of UP and collectively provide full-subscription coverage for all shards.
In order to create a sandbox for a secondary subcluster, all nodes in the subcluster must have a status of UP and collectively provide full-subscription coverage for all shards.
To sandbox the first subcluster in a sandbox, use the sandbox_subcluster admintools command:
$ adminTools -t sandbox_subcluster -h
Usage: sandbox_subcluster [options]
Options:
-h, --help show this help message and exit
-d DB, --database=DB Name of database to be modified
-p DBPASSWORD, --password=DBPASSWORD
Database password in single quotes
-c SCNAME, --subcluster=SCNAME
Name of subcluster to be sandboxed
-b SBNAME, --sandbox=SBNAME
Name of the sandbox
--timeout=NONINTERACTIVE_TIMEOUT
set a timeout (in seconds) to wait for actions to
complete ('never') will wait forever (implicitly sets
-i)
-i, --noprompts do not stop and wait for user input(default false).
Setting this implies a timeout of 20 min.
At command runtime, under a global catalog lock (GCLX), the nodes in the specified subcluster create a checkpoint of the catalog. When these nodes auto-restart in the sandbox cluster, they form a primary subcluster that uses the data and catalog checkpoint from the main cluster. After the nodes successfully restart, the sandbox cluster and the main cluster are mutually isolated and can diverge.
While the nodes in the main cluster sync their metadata to /path-to-communal-storage/metadata/db_name, the nodes in the sandbox sync to /path-to-communal-storage/metadata/sandbox_name.
Note
Though the sandbox cluster and main cluster are mutually isolated, the nodes in both clusters remain in the same Spread ring. However, if the main cluster and sandbox cluster partition from each other, which can be caused by events like Spread communication problems, the two clusters are not affected.
You can perform standard database operations and queries, such as loading data or creating new tables, in either cluster without affecting the other cluster. For example, dropping a table in the sandbox cluster does not drop the table in the main cluster, and vice versa.
Because both clusters reference the same data files, neither cluster can delete files that existed at the time of sandbox creation. However, the sandbox can remove files that it creates after spin-off from the main cluster. Files in the main cluster are queued for removal, but they are not processed until all active sandboxes are removed.
You cannot nest sandboxes, remove a sandboxed subcluster from the database, or add or remove nodes to an existing sandbox, but you can have multiple individual sandboxes active at the same time.
Adding subclusters to existing sandboxes
You can add additional secondary subclusters to existing sandbox clusters using the SANDBOX_SUBCLUSTER function. Added subclusters must be secondary, cannot be a member of any other sandbox cluster, and all member nodes must have a status of UP.
When sandboxing additional subclusters, you must first call the SANDBOX_SUBCLUSTER function in the sandbox cluster and then in the main cluster, providing the same sandbox cluster and subcluster names in both calls. This makes both clusters aware of the status of the sandboxed subcluster. After the function is called in both clusters, the subcluster automatically restarts, joins the sandbox cluster as a secondary subcluster, and subscribes to the sandbox cluster's shards.
Examples
The following command sandboxes the sc_01 secondary subcluster into a sandbox named sand:
$ admintools -t sandbox_subcluster -d verticadb -p password -c sc_01 -b sand
Validating sandboxing conditions
Sandboxing subcluster sc_01 as sand...
Subcluster 'sc_01' has been sandboxed to 'sand'. It is going to auto-restart and re-form.
Checking for sandboxed nodes to be UP...
Node Status: v_verticadb_node0004: (DOWN) v_verticadb_node0005: (DOWN) v_verticadb_node0006: (DOWN)
Node Status: v_verticadb_node0004: (UP) v_verticadb_node0005: (UP) v_verticadb_node0006: (UP)
Sandboxing complete. Subcluster sc_01 is ready for use
If you query the NODES system table from the main cluster, you can see that the sc_01 nodes have a status of UNKNOWN and are listed as members of the sand sandbox:
=> SELECT node_name, subcluster_name, node_state, sandbox FROM NODES;
node_name | subcluster_name | node_state | sandbox
----------------------+--------------------+------------+---------
v_verticadb_node0001 | default_subcluster | UP |
v_verticadb_node0002 | default_subcluster | UP |
v_verticadb_node0003 | default_subcluster | UP |
v_verticadb_node0004 | sc_01 | UNKNOWN | sand
v_verticadb_node0005 | sc_01 | UNKNOWN | sand
v_verticadb_node0006 | sc_01 | UNKNOWN | sand
v_verticadb_node0007 | sc_02 | UP |
v_verticadb_node0008 | sc_02 | UP |
v_verticadb_node0009 | sc_02 | UP |
(9 rows)
Note
If the sandboxed and main clusters are sufficiently incompatible or are Spread-partitioned from each other, the status of the sandboxed nodes might appear as DOWN in the above query.
When you issue the same query on one of the sandboxed nodes, the table shows that the sandboxed nodes are UP and the nodes from the main cluster are UNKNOWN, confirming that the cluster is successfully sandboxed and isolated from the main cluster:
=> SELECT node_name, subcluster_name, node_state, sandbox FROM NODES;
node_name | subcluster_name | node_state | sandbox
----------------------+--------------------+------------+---------
v_verticadb_node0001 | default_subcluster | UNKNOWN |
v_verticadb_node0002 | default_subcluster | UNKNOWN |
v_verticadb_node0003 | default_subcluster | UNKNOWN |
v_verticadb_node0004 | sc_01 | UP | sand
v_verticadb_node0005 | sc_01 | UP | sand
v_verticadb_node0006 | sc_01 | UP | sand
v_verticadb_node0007 | sc_02 | UNKNOWN |
v_verticadb_node0008 | sc_02 | UNKNOWN |
v_verticadb_node0009 | sc_02 | UNKNOWN |
(9 rows)
You can now perform standard database operations in either cluster without impacting the other cluster. For instance, if you create a machine learning dataset named train_data in the sandboxed subcluster, the new table does not propagate to the main cluster:
--In the sandboxed subcluster
=> CREATE TABLE train_data(time timestamp, Temperature float);
CREATE TABLE
=> COPY train_data FROM LOCAL 'daily-min-temperatures.csv' DELIMITER ',';
Rows Loaded
-------------
3650
(1 row)
=> SELECT * FROM train_data LIMIT 5;
time | Temperature
---------------------+-------------
1981-01-27 00:00:00 | 19.4
1981-02-20 00:00:00 | 15.7
1981-02-27 00:00:00 | 17.5
1981-03-04 00:00:00 | 16
1981-04-24 00:00:00 | 11.5
(5 rows)
--In the main cluster
=> SELECT * FROM train_data LIMIT 5;
ERROR 4566: Relation "train_data" does not exist
Similarly, if you drop a table in the main cluster, the table is not subsequently dropped in the sandboxed cluster:
--In the main cluster
=> SELECT * FROM transaction_data LIMIT 5;
first_name | last_name | store | cost | fraud
------------+-----------+-----------+---------+-------
Adam | Rice | Gembucket | 8757.35 | FALSE
Alan | Gordon | Wrapsafe | 3874.48 | FALSE
Albert | Harvey | Treeflex | 1558.27 | FALSE
Andrea | Bryant | Greenlam | 1148.2 | FALSE
Andrew | Simmons | Home Ing | 8400.03 | FALSE
(5 rows)
=> DROP TABLE transaction_data;
DROP TABLE
--In the sandboxed subcluster
=> SELECT * FROM transaction_data LIMIT 5;
first_name | last_name | store | cost | fraud
------------+-----------+-----------+---------+-------
Adam | Rice | Gembucket | 8757.35 | FALSE
Alan | Gordon | Wrapsafe | 3874.48 | FALSE
Albert | Harvey | Treeflex | 1558.27 | FALSE
Andrea | Bryant | Greenlam | 1148.2 | FALSE
Andrew | Simmons | Home Ing | 8400.03 | FALSE
(5 rows)
To add an additional secondary subcluster to the sandbox cluster, call SANDBOX_SUBCLUSTER in the sandbox cluster and then in the main cluster:
--In the sandbox cluster
=> SELECT SANDBOX_SUBCLUSTER('sand', 'sc_02', '');
sandbox_subcluster
------------------------------------------------------------------------------------------------------------------------------
Subcluster 'sc_02' has been added to 'sand'. When the same command is executed in the main cluster, it can join the sandbox.
(1 row)
--In the main cluster
=> SELECT SANDBOX_SUBCLUSTER('sand', 'sc_02', '');
sandbox_subcluster
-------------------------------------------------------------------------------------------
Subcluster 'sc_02' has been sandboxed to 'sand'. It is going to auto-restart and re-form.
(1 row)
--In the sandbox cluster
=> SELECT node_name, subcluster_name, node_state, sandbox FROM NODES;
node_name | subcluster_name | node_state | sandbox
----------------------+--------------------+------------+---------
v_verticadb_node0001 | default_subcluster | UNKNOWN |
v_verticadb_node0002 | default_subcluster | UNKNOWN |
v_verticadb_node0003 | default_subcluster | UNKNOWN |
v_verticadb_node0004 | sc_01 | UP | sand
v_verticadb_node0005 | sc_01 | UP | sand
v_verticadb_node0006 | sc_01 | UP | sand
v_verticadb_node0007 | sc_02 | UP | sand
v_verticadb_node0008 | sc_02 | UP | sand
v_verticadb_node0009 | sc_02 | UP | sand
(9 rows)
If you decide to upgrade your sandboxed cluster, you can confirm that the main cluster and sandboxed cluster are running two different Vertica versions by comparing their build_info values in the NODES system table:
=> SELECT node_name, subcluster_name, node_state, sandbox, build_info FROM NODES;
node_name | subcluster_name | node_state | sandbox | build_info
----------------------+--------------------+------------+---------+------------
v_verticadb_node0001 | default_subcluster | UP | | v12.0.4-0
v_verticadb_node0002 | default_subcluster | UP | | v12.0.4-0
v_verticadb_node0003 | default_subcluster | UP | | v12.0.4-0
v_verticadb_node0004 | sc_01 | UNKNOWN | sand | v12.0.4-1
v_verticadb_node0005 | sc_01 | UNKNOWN | sand | v12.0.4-1
v_verticadb_node0006 | sc_01 | UNKNOWN | sand | v12.0.4-1
v_verticadb_node0007 | sc_02 | UNKNOWN | sand | v12.0.4-1
v_verticadb_node0008 | sc_02 | UNKNOWN | sand | v12.0.4-1
v_verticadb_node0009 | sc_02 | UNKNOWN | sand | v12.0.4-1
(9 rows)
See also
8.2 - Removing sandboxes
To remove a sandbox from a subcluster and return that subcluster to the main cluster, you can run the unsandbox_subcluster admintools command.
To remove a sandbox's primary subcluster from the sandbox and return it to the main cluster, run the unsandbox_subcluster admintools command:
$ adminTools -t unsandbox_subcluster -h
Usage: unsandbox_subcluster [options]
Options:
-h, --help show this help message and exit
-d DB, --database=DB Name of database to be modified
-p DBPASSWORD, --password=DBPASSWORD
Database password in single quotes
-c SCNAME, --subcluster=SCNAME
Name of subcluster to be un-sandboxed
--timeout=NONINTERACTIVE_TIMEOUT
set a timeout (in seconds) to wait for actions to
complete ('never') will wait forever (implicitly sets
-i)
-i, --noprompts do not stop and wait for user input(default false).
Setting this implies a timeout of 20 min.
Note
If you upgraded the Vertica version of the sandboxed subcluster, you must downgrade the version of the subcluster before rejoining it to the main cluster.
The unsandbox_subcluster command stops the nodes in the sandboxed subcluster, changes the metadata in the main cluster that designates the specified subcluster as sandboxed, wipes the nodes' local catalogs, and then restarts the nodes. After the nodes restart, they rejoin the main cluster and inherit the current state of the main cluster's catalog. The nodes should then be back to their normal state and can be used as expected.
Because the sandbox synced its metadata to the same communal storage location as the main cluster, you must remove the metadata files that were created in the sandbox. You can find those files by replacing the database name in the metadata path with the sandbox name: for instance, /path-to-communal-storage/metadata/sandbox_name instead of /path-to-communal-storage/metadata/db_name. Removing these files helps avoid problems that might arise from reusing the same sandbox name.
If there are no more active sandboxes, the main cluster can resume the processing of data queued for deletion. To remove any data created in the sandbox, you can run the CLEAN_COMMUNAL_STORAGE function.
You can also unsandbox a subcluster using the UNSANDBOX_SUBCLUSTER meta-function, but you must manually stop the nodes, wipe their catalog subdirectories, run the function, and restart the nodes.
Removing a sandbox's secondary subclusters
If your sandbox has additional secondary subclusters, you can remove them from the sandbox using the UNSANDBOX_SUBCLUSTER function. As with the unsandbox_subcluster admintools command, the function requires that the nodes in the specified subcluster are down. Any remaining subclusters in the sandbox cluster continue to operate as normal.
After you call the function in the main cluster and wipe the nodes' catalogs, you can restart the nodes to rejoin them to the main cluster. Vertica recommends that you also call the UNSANDBOX_SUBCLUSTER function in the sandbox cluster. This makes sure that both clusters are aware of the status of the subcluster and that relevant system tables accurately reflect the subcluster's state.
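For example, a call like the following removes the sc_02 subcluster from its sandbox. This is a sketch that assumes UNSANDBOX_SUBCLUSTER takes the subcluster name and an options string; as described above, run it in the main cluster (and again in the sandbox cluster) while the subcluster's nodes are down:
=> SELECT UNSANDBOX_SUBCLUSTER('sc_02', '');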
Examples
The following command unsandboxes the analytics secondary subcluster from the sand sandbox. The command stops the nodes in analytics, wipes the nodes' catalogs, and then restarts the nodes. After the nodes restart, they rejoin the main cluster and are ready for normal use:
$ admintools -t unsandbox_subcluster -d verticadb -p vertica -c analytics
Stopping subcluster nodes for unsandboxing...
Sending signal 'TERM' to ['192.168.111.34', '192.168.111.35', '192.168.111.36']
Successfully sent signal 'TERM' to hosts ['192.168.111.34', '192.168.111.35', '192.168.111.36'].
Details:
Host: 192.168.111.34 - Success - PID: 267860 Signal TERM
Host: 192.168.111.35 - Success - PID: 285917 Signal TERM
Host: 192.168.111.36 - Success - PID: 245272 Signal TERM
Checking for processes to be down
All processes are down.
Details:
Host 192.168.111.34 Success process 267860 is down
Host 192.168.111.35 Success process 285917 is down
Host 192.168.111.36 Success process 245272 is down
Unsandboxing Subcluster analytics...
STATUS: vertica.engine.api.db_client.module is still running on 1 host: 192.168.111.31 as of 2023-03-01 13:23:37. See /opt/vertica/log/adminTools.log for full details.
STATUS: vertica.engine.api.db_client.module is still running on 1 host: 192.168.111.31 as of 2023-03-01 13:23:47. See /opt/vertica/log/adminTools.log for full details.
STATUS: vertica.engine.api.db_client.module is still running on 1 host: 192.168.111.31 as of 2023-03-01 13:23:57. See /opt/vertica/log/adminTools.log for full details.
STATUS: vertica.engine.api.db_client.module is still running on 1 host: 192.168.111.31 as of 2023-03-01 13:24:07. See /opt/vertica/log/adminTools.log for full details.
Subcluster 'analytics' has been unsandboxed. If wiped out and restarted, it should be able to rejoin the cluster.
Removing Catalog directory contents from subcluster nodes...
Catalog cleanup complete!
Restarting unsandboxed nodes to re-join the main cluster...
Restarting host [192.168.111.34] with catalog [v_verticadb_node0004_catalog]
Restarting host [192.168.111.35] with catalog [v_verticadb_node0005_catalog]
Restarting host [192.168.111.36] with catalog [v_verticadb_node0006_catalog]
Issuing multi-node restart
Starting nodes:
v_verticadb_node0004 (192.168.111.34)
v_verticadb_node0005 (192.168.111.35)
v_verticadb_node0006 (192.168.111.36)
Starting Vertica on all nodes. Please wait, databases with a large catalog may take a while to initialize.
Node Status: v_verticadb_node0004: (DOWN) v_verticadb_node0005: (DOWN) v_verticadb_node0006: (DOWN)
Node Status: v_verticadb_node0004: (DOWN) v_verticadb_node0005: (DOWN) v_verticadb_node0006: (DOWN)
Node Status: v_verticadb_node0004: (UP) v_verticadb_node0005: (UP) v_verticadb_node0006: (UP)
Syncing catalog on verticadb with 2000 attempts.
Unsandboxed nodes have restarted successfully and joined the main cluster and are ready to use
When the admintools command completes, you can query the NODES system table to confirm that the previously sandboxed nodes are UP and are no longer members of sand:
=> SELECT node_name, subcluster_name, node_state, sandbox FROM NODES;
node_name | subcluster_name | node_state | sandbox
----------------------+--------------------+------------+---------
v_verticadb_node0001 | default_subcluster | UP |
v_verticadb_node0002 | default_subcluster | UP |
v_verticadb_node0003 | default_subcluster | UP |
v_verticadb_node0004 | sc_02 | UP |
v_verticadb_node0005 | sc_02 | UP |
v_verticadb_node0006 | sc_02 | UP |
(6 rows)
If there are no more active sandboxes, you can run the CLEAN_COMMUNAL_STORAGE function to remove any data created in the sandbox. You should also remove the sandbox's metadata from the shared communal storage location, which can be found at /path-to-communal-storage/metadata/sandbox_name. The following example removes the sandbox's metadata from an S3 bucket and then calls CLEAN_COMMUNAL_STORAGE to clean up any data from the sandbox:
$ aws s3 rm --recursive s3://path-to-communal/metadata/sandbox_name
SELECT CLEAN_COMMUNAL_STORAGE('true');
CLEAN_COMMUNAL_STORAGE
-----------------------------------------------------------------
CLEAN COMMUNAL STORAGE
Total leaked files: 143
Files have been queued for deletion.
Check communal_cleanup_records for more information.
(1 row)
See also
9 - Local caching of storage containers
The Vertica execution engine uses the StorageMerge operator to read data from storage containers in cases where it is important to read container data in its projection-specified sort order.
The Vertica execution engine uses the StorageMerge operator to read data from storage containers in cases where it is important to read container data in its projection-specified sort order. This is particularly useful for operations that must preserve the sort order of the data read from multiple storage containers before merging it into a single storage container. Common operations that enforce sort order include mergeout and some queries with ORDER BY clauses, such as CREATE TABLE...AS statements whose query includes an ORDER BY clause.
The execution engine typically allocates multiple threads to the StorageMerge operator. Each thread is assigned a single Scan operator to open and read container contents. If the number of containers to read is greater than the number of available threads, the execution engine is likely to assign individual Scan operators to multiple containers. In this case, Scan operators might need to switch among different containers and reopen them multiple times before all required data is fetched and assembled. Doing so is especially problematic when reading storage containers on remote filesystems such as S3. The extra overhead incurred by reopening and reading remote storage containers can significantly impact performance and usage costs.
You can configure your database so the execution engine caches on local disk the data of S3 storage containers that require multiple opens. The size of temp space allocated per query to the StorageMerge operator for caching is set by configuration parameter StorageMergeMaxTempCacheMB. By default, this configuration parameter is set to -1 (unlimited). If caching requests exceed temp space limits or available disk space, Vertica caches as much container data as it can, and then reads from S3.
Note
The actual temp space that is allocated is the lesser of these settings:
To turn off caching, set StorageMergeMaxTempCacheMB to 0.
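For example, the following statement (a sketch using the database-level parameter syntax shown elsewhere in this guide) turns off caching; substitute a positive number of megabytes to cap the cache instead:
=> ALTER DATABASE DEFAULT SET StorageMergeMaxTempCacheMB = 0;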
10 - Managing an Eon Mode database in MC
Vertica Management Console (MC), a database health and activity monitoring tool, provides in-browser wizards you can follow to deploy Vertica cluster instances and create an Eon Mode database on them.
Vertica Management Console (MC), a database health and activity monitoring tool, provides in-browser wizards you can follow to deploy Vertica cluster instances and create an Eon Mode database on them. You can also use MC to manage and monitor resources that are specific to Eon Mode:
See also
11 - Stopping and starting an Eon Mode cluster
When running an Eon Mode database in the cloud, you usually want to stop the nodes running your database when you stop the database.
Stopping your Eon Mode database and cluster using the MC
When running an Eon Mode database in the cloud, you usually want to stop the nodes running your database when you stop the database. Stopping your nodes avoids paying for compute resources that are not needed while the database is down.
Note
Instead of shutting down your entire database, you can still save money by shutting down subclusters that aren't being used. This technique lets your database continue to run and load data, while reducing the hourly cost. See Starting and stopping subclusters for more information.
The easiest way to stop both your database and the nodes that run it is to use the MC:
- From the MC home page, click View Your Infrastructure.
- In the row labeled Databases, click the database you want to stop.
- In the popup, click Stop.
- Click OK to confirm you want to stop the database.
- Once your database has stopped, in the row labeled Clusters, click the entry for the cluster running the database you just stopped.
- In the popup, click Manage.
- In the ribbon at the top of the cluster view, click Stop Cluster.
- In the dialog box, check the I would like to stop all instances in the cluster box and click Stop Cluster.
Manually stopping the database and cluster
To manually stop your database and cluster, first stop your database using one of the following methods:
Once you have stopped the database, you can stop your nodes. If you are in a cloud environment, see your cloud provider's documentation for instructions on stopping nodes.
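For example, one common way to stop the database from the command line is the admintools stop_db tool; the database name and password below are placeholders:
$ admintools -t stop_db -d verticadb -p 'password'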
Starting your cluster and database using the MC
To start your database cluster and database:
- From the MC home, click View Infrastructure.
- In the Clusters row, click the cluster that runs the database you want to start.
- In the pop-up, click Manage.
- In the ribbon at the top of the cluster's page, click Start Cluster.
- Check the I would like to start all instances in the cluster box and click Start Cluster.
Starting the cluster automatically starts the database.
Manually starting your cluster and database
To manually start your cluster and database:
- Start the nodes in your database cluster. If you are running in the cloud, see your cloud provider's documentation on how to start instances.
- Connect to one of the nodes in the cluster and use the admintools menus or command line to start your database. See Starting the database for instructions.
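For example, from the command line you could start the database with the admintools start_db tool; the database name and password below are placeholders:
$ admintools -t start_db -d verticadb -p 'password'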
12 - Terminating an Eon Mode database cluster
When you terminate an Eon Mode database's cluster, you free its resources.
When you terminate an Eon Mode database's cluster, you free its resources. In a cloud environment, terminating the cluster deletes the instances that ran the database's nodes. In an on-premises database, terminating the cluster usually means repurposing physical hardware for other uses. See Stopping, starting, terminating, and reviving Eon Mode database clusters for more information.
Terminating an Eon Mode database's cluster does not affect the data it stores. The data remains stored in the communal storage location. As long as you do not delete the communal storage location, you can revive the database onto a new Eon Mode cluster. See Reviving an Eon Mode database cluster for more information.
Important
Vertica persists catalog data in communal storage, updating it every few minutes. Nonetheless, before shutting down your database you should make sure your metadata is up to date on communal storage. To do so, see Synchronizing metadata.
Terminating an Eon Mode cluster using Management Console
Management Console provides the easiest way to terminate an Eon Mode cluster. You must follow a two-step process: first stop the database, then terminate the cluster:
- If you have not yet synchronized the database's catalog, follow the steps in Synchronizing metadata.
- From the Management Console home page, click View Your Infrastructure.
- In the row labeled Databases, click the database whose cluster you want to terminate.
- In the popup, click Stop.
- Click OK to confirm you want to stop the database.
- After the database stops, in the row labeled Clusters, click the entry for the cluster you want to terminate.
- In the popup, click Manage.
- In the ribbon at the top of the cluster view, click Advanced and then select Terminate Cluster.
- In the dialog box:
Manually terminating an Eon Mode cluster
To manually terminate your Eon Mode cluster:
- If you have not yet synchronized the database's catalog, follow the steps in Synchronizing metadata.
- Stop the database using one of the following methods:
- Terminate the database node instances. If you are in a cloud environment, see your cloud provider's documentation for instructions on terminating instances. For on-premises database clusters, you can repurpose the systems that were a part of the cluster.
See also
13 - Reviving an Eon Mode database cluster
If you have terminated your Eon Mode database's cluster, but have not deleted the database's communal storage, you can revive your database.
If you have terminated your Eon Mode database's cluster, but have not deleted the database's communal storage, you can revive your database. Reviving the database restores it to its pre-shutdown state. The revival process requires creating a new database cluster and configuring it to use the database's communal storage location. See Stopping, starting, terminating, and reviving Eon Mode database clusters for more information.
Note
You cannot revive sandboxed subclusters. If you call the revive_db admintools command on a cluster with both sandboxed and unsandboxed subclusters, the nodes in the unsandboxed subclusters start as expected, but the nodes in the sandboxed subclusters remain down. Attempting to revive only a sandboxed subcluster returns an error.
You can also use the revive process to restart a database when its nodes do not have persistent local storage. You may choose to configure the instances in your cloud-based Eon Mode cluster with non-persistent local storage to reduce cost. Cloud providers such as AWS and GCP charge less for instances that are not required to retain data when you shut them down.
You revive a database using either the Management Console or admintools. The MC and admintools offer different revival methods:
- The MC always revives onto a newly provisioned cluster that it creates itself. It cannot revive onto an existing cluster. Use the MC to revive a database when you do not have a cluster already provisioned for your database.
- admintools only revives onto an existing database cluster. You can manually create a cluster to revive your database. See Set up Vertica on-premises. You can also revive a database whose hosts use instance storage, where data is not persistently stored between shutdowns. In this case, admintools treats the existing database cluster as a new cluster, because the hosts do not contain the database's catalog data.
- Currently, only admintools lets you revive just the primary subclusters in a database cluster. This option is useful if you want to revive the minimum number of nodes necessary to start your database. See Reviving only primary subclusters below. The MC always revives the entire database cluster.
Note
You cannot revive a database from a communal storage location that is currently in use by a database running on another cluster. The revive process fails if it detects that a cluster is already running the database. Having two instances of a database running on separate clusters with the same communal storage location leads to data corruption.
Reviving using the Management Console
You can use a wizard in the Management Console to provision a new cluster and revive a database onto it from a browser. For details, see:
You can use admintools to revive your Eon Mode database on an existing cluster.
Cluster requirements
This existing cluster must:
- Have the same version (or a later version) of Vertica installed on it. You can repurpose an existing Vertica cluster whose database you have shut down. Another option is to create a cluster from scratch by manually installing Vertica (see Set up Vertica on-premises).
- Contain a number of hosts that is equal to or greater than either:
  - The total number of nodes that the database cluster had when it shut down.
  - The total number of primary nodes the database cluster had when it shut down. When you supply a cluster that matches the number of primary nodes in the database, admintools revives just the primary nodes.
When reviving, you supply admintools with a list of the hosts in the cluster to revive the database onto. The number of hosts in this list must match either the total number of nodes or the number of primary nodes in the database when it shut down. If the number of nodes you supply does not match either of these values, admintools returns an error.
You do not need to use all of the hosts in the cluster to revive the database. You can revive a database onto a subset of the hosts in the cluster. But you must have at least enough hosts to revive all of the primary nodes.
For example, suppose you want to revive a database that had 16 nodes when it was shut down, with four of those nodes being primary nodes. In that case, you can revive:
- The entire database, by supplying 16 hosts.
- Just the primary nodes, by supplying 4 hosts.
You may choose to revive your database onto a cluster with more nodes than is necessary when you want to quickly add new nodes. You may also want to revive just the primary nodes in a database onto a larger cluster. In this case, you can use the extra nodes in the cluster to start one or more secondary subclusters.
To revive the database, you must know:
- The name of the database to revive (note that the database name is case sensitive)
- The version of Vertica that created the database, so you can use the same or a later version
- The total number of all nodes or the number of primary nodes in the database when it shut down
- The URL and credentials for the database's communal storage location
- The user name and password of the database administrator
- The IP addresses of all hosts in the cluster you want to revive onto
If you do not know what version of Vertica created the database or are unsure how many nodes it had, see Getting Database Details From a Communal Storage Location below.
Required database settings
Before starting the revive process, verify the following conditions are true for your Eon Mode environment:
All environments:
- The uppermost directories of the catalog, data, and depot directories on all nodes exist and are owned by the database dbadmin.
- The cluster has no other database running on it.

Azure: If your database does not use Azure managed identities to authenticate with the communal storage blob container, the required authentication values must be set. See Azure Blob Storage object store for details.

S3 (AWS and on-premises): The required S3 configuration parameters are set. Important: If migrating to an on-premises database, set the configuration parameter AWSEnableHttps to be compatible with the database TLS setup: AWSEnableHttps=1 if using TLS, otherwise 0. If the settings are incompatible, the migration returns an error.

GCP: The required GCS configuration parameters are set.
Getting database details from a communal storage location
To revive a database, you must know:
- The version of Vertica that created it (so you can use the same or a later version)
- The total number of nodes (when reviving both primary and secondary nodes) or primary nodes (when reviving just the primary nodes) in the database's cluster when it shut down
If you do not know these details, you can determine them based on the contents of the communal storage location.
If you are not sure which version of Vertica created the database stored in a communal storage location, examine the cluster_config.json file. This file is stored in the communal storage location in the folder named metadata/databasename. For example, suppose you have a database named mydb stored in the communal storage location s3://mybucket/mydb. Then you can download and examine the file s3://mybucket/mydb/metadata/mydb/cluster_config.json.
In cluster_config.json, the Vertica version that created the database is stored with the JSON key named DatabaseVersion near the top of the file:
{
"CatalogTruncationVersion" : 804,
"ClusterLeaseExpiration" : "2020-12-21 21:52:31.005936",
"Database" : {
"branch" : "",
"name" : "verticadb"
},
"DatabaseVersion" : "v10.1.0",
"GlobalSettings" : {
"TupleMoverServices" : -33,
"appliedUpgrades" : [
. . .
In this example, you can revive the storage location using Vertica version 10.1.0 or later.
If you do not know how many nodes or primary nodes the cluster had when it shut down, use the --display-only option of the admintools revive_db tool. Adding this option prevents admintools from reviving the database. Instead, it validates the files in the communal storage location and reports details about the nodes that made up the database cluster. Parts of this report show the total number of nodes in the cluster and the number of primary nodes:
$ admintools -t revive_db --display-only --communal-storage-location \
s3://mybucket/verticadb -d verticadb
Attempting to retrieve file: [s3://mybucket/verticadb/metadata/verticadb/cluster_config.json]
Validated 6-node database verticadb defined at communal storage s3://mybucket/verticadb.
Expected layout of database after reviving from communal storage: s3://mybucket/verticadb
== Communal location details: ==
{
"communal_storage_url": "s3://mybucket/verticadb",
"num_shards": "3",
"depot_path": "/vertica/data",
. . .
]
Number of primary nodes: 3
You can use grep to find just the relevant lines in the report:
$ admintools -t revive_db --display-only --communal-storage-location \
s3://mybucket/verticadb -d verticadb | grep 'Validated\|primary nodes'
Validated 6-node database verticadb defined at communal storage s3://mybucket/verticadb.
Number of primary nodes: 3
Creating a parameter file
For Eon Mode deployments that are not on AWS, you must create a configuration file to pass the parameters listed in the table in the previous section to admintools. Traditionally, this file is named auth_params.conf, although you can choose any file name you want.
For on-premises Eon Mode databases, this parameter file is the same one you used when initially installing the database. See the following links for instructions on creating a parameter file for the communal storage solution you are using for your database:
For databases running on Microsoft Azure, the parameter file is only necessary if your database does not use managed identities. This file is the same format that you use to manually install an Eon Mode database. See Manually create an Eon Mode database on Azure for more information.
To revive an Eon Mode database on GCP manually, create a configuration file to hold the GCSAuth parameter and optionally, the GCSEnableHttp parameter.
You must supply the GCSAuth parameter to enable Vertica to read from the communal storage location stored in GCS. The value for this parameter is the HMAC access key and secret:
GCSAuth = HMAC_access_key:HMAC_secret_key
See Creating an HMAC Key for more information about HMAC keys.
If your Eon Mode database does not use encryption when accessing communal storage on GCS, then disable HTTPS access by adding the following line to auth_params.conf:
GCSEnableHttps = 0
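Putting these settings together, a minimal auth_params.conf for reviving on GCP without HTTPS might look like the following; the HMAC key and secret are placeholders:
GCSAuth = HMAC_access_key:HMAC_secret_key
GCSEnableHttps = 0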
Use the admintools revive_db tool to revive the database:
- Use SSH to access a cluster host as an administrator.
- Depending on your environment, run one of the following admintools commands:
AWS:
$ admintools -t revive_db \
--communal-storage-location=s3://communal_store_path \
-s host1,... -d database_name
Important
If you revive an on-premises Eon Mode database to AWS, check the controlmode setting in /opt/vertica/config/admintools.conf. This setting must be compatible with the network messaging requirements of your Eon implementation. AWS relies on unicast messaging, which is compatible with a controlmode setting of point-to-point (pt2pt). If the source database controlmode setting was broadcast and you migrate to S3/AWS communal storage, you must change controlmode with admintools:
$ admintools -t re_ip -d dbname -T
On-premises and other environments:
$ admintools -t revive_db -x auth_params.conf \
--communal-storage-location=storage-schema://communal_store_path \
-s host1_ip,... -d database_name
This example revives a six-node on-premises database:
$ admintools -t revive_db -x auth_params.conf \
--communal-storage-location=s3://mybucket/mydir \
-s 172.16.116.27,172.16.116.28,172.16.116.29,172.16.116.30,\
172.16.116.31,172.16.116.32 -d VMart
The following example demonstrates reviving a three-node database hosted on GCP:
$ admintools -t revive_db -x auth_params.conf \
--communal-storage-location gs://mybucket/verticadb \
-s 10.142.0.35,10.142.0.38,10.142.0.39 -d VerticaDB
Attempting to retrieve file:
[gs://mybucket/verticadb/metadata/VerticaDB/cluster_config.json]
Validated 3-node database VerticaDB defined at communal storage
gs://mybucket/verticadb .
Cluster lease has expired.
Preparation succeeded all hosts
Calculated necessary addresses for all nodes.
Starting to bootstrap nodes. Please wait, databases with a large
catalog may take a while to initialize.
>>Calling bootstrap on node v_verticadb_node0002 (10.142.0.38)
>>Calling bootstrap on node v_verticadb_node0003 (10.142.0.39)
Load Remote Catalog succeeded on all hosts
Database revived successfully.
Reviving only primary subclusters
You can revive just the primary subclusters in an Eon Mode database. Make the list of hosts you pass to the admintools revive_db tool's --hosts (or -s) argument match the number of primary nodes that were in the database when it shut down. For example, if you have a six-node Eon Mode database that had three primary nodes, you can revive just the primary nodes by supplying three hosts in the --hosts argument:
$ admintools -t revive_db --communal-storage-location=s3://verticadb -d verticadb \
-x auth_params.conf --hosts node01,node02,node03
Attempting to retrieve file: [s3://verticadb/metadata/verticadb/cluster_config.json]
Consider reviving to only primary nodes: communal storage indicates 6 nodes, while
3 nodes were specified
Validated 3-node database verticadb defined at communal storage s3://verticadb.
Cluster lease has expired.
Preparation succeeded all hosts
Calculated necessary addresses for all nodes.
Starting to bootstrap nodes. Please wait, databases with a large catalog may take a
while to initialize.
>>Calling bootstrap on node v_verticadb_node0002 (192.168.56.103)
>>Calling bootstrap on node v_verticadb_node0003 (192.168.56.104)
Load Remote Catalog succeeded on all hosts
Database revived successfully.
In a database where you have revived only the primary nodes, the secondary nodes are down. Their IP addresses are set to 0.0.0.0, so they are not part of the database. For example, querying the NODES system table in the database revived in the previous example shows that the secondary nodes are all down:
=> SELECT node_name,node_state,node_address,subcluster_name FROM NODES;
node_name | node_state | node_address | subcluster_name
----------------------+------------+----------------+--------------------
v_verticadb_node0001 | UP | 192.168.56.102 | default_subcluster
v_verticadb_node0002 | UP | 192.168.56.103 | default_subcluster
v_verticadb_node0003 | UP | 192.168.56.104 | default_subcluster
v_verticadb_node0004 | DOWN | 0.0.0.0 | analytics
v_verticadb_node0005 | DOWN | 0.0.0.0 | analytics
v_verticadb_node0006 | DOWN | 0.0.0.0 | analytics
Note
Secondary nodes that have not been revived may cause error messages if your database has the large cluster feature enabled. (See Large cluster for more information about the large cluster feature.)
For example, adding a node to a secondary subcluster can fail if the new node would be assigned a control node that has not been revived. In this case, Vertica reports that adding the node failed because the control node has an invalid IP address.
If you encounter errors involving control nodes with invalid IP addresses, consider reviving the unrevived secondary subcluster, as explained below.
Because Vertica considers these unrevived nodes to be down, it may not allow you to remove them or remove their subcluster while they are in their unrevived state. The best way to remove the nodes or the secondary subcluster is to revive them first.
Reviving unrevived secondary subclusters
If you revived just the primary subclusters in your database, you can later choose to revive some or all of the secondary subclusters. Your cluster must have hosts that are not nodes in the database that Vertica can use to revive the unrevived nodes. If your cluster does not have enough of these non-node hosts, you can add more hosts. See Adding hosts to a cluster.
You revive a secondary subcluster by using the admintools restart_subcluster tool. You supply it with the list of hosts in the --hosts argument where the nodes will be revived. The number of hosts in this list must match the number of nodes in the subcluster. You must revive all nodes in the subcluster at the same time. If you pass restart_subcluster a list with fewer or more hosts than the number of nodes defined in the subcluster, it returns an error.
The following example demonstrates reviving the secondary subcluster named analytics shown in the previous examples:
$ admintools -t restart_subcluster -d verticadb --hosts node04,node05,node06 \
-p 'password' -c analytics
Updating hostnames of nodes in subcluster analytics.
Replicating configuration to all nodes
Generating new configuration information and reloading spread
Hostnames of nodes in subcluster analytics updated successfully.
*** Restarting subcluster for database verticadb ***
Restarting host [192.168.56.105] with catalog [v_verticadb_node0004_catalog]
Restarting host [192.168.56.106] with catalog [v_verticadb_node0005_catalog]
Restarting host [192.168.56.107] with catalog [v_verticadb_node0006_catalog]
Issuing multi-node restart
Starting nodes:
v_verticadb_node0004 (192.168.56.105)
v_verticadb_node0005 (192.168.56.106)
v_verticadb_node0006 (192.168.56.107)
Starting Vertica on all nodes. Please wait, databases with a large catalog may take a while to initialize.
Node Status: v_verticadb_node0004: (DOWN) v_verticadb_node0005: (DOWN) v_verticadb_node0006: (DOWN)
Node Status: v_verticadb_node0004: (DOWN) v_verticadb_node0005: (DOWN) v_verticadb_node0006: (DOWN)
Node Status: v_verticadb_node0004: (DOWN) v_verticadb_node0005: (DOWN) v_verticadb_node0006: (DOWN)
Node Status: v_verticadb_node0004: (INITIALIZING) v_verticadb_node0005: (INITIALIZING) v_verticadb_node0006: (INITIALIZING)
Node Status: v_verticadb_node0004: (UP) v_verticadb_node0005: (UP) v_verticadb_node0006: (UP)
Syncing catalog on verticadb with 2000 attempts.
See also
14 - Synchronizing metadata
An Eon Mode database maintains its catalog, which contains all database metadata, in communal storage.
An Eon Mode database maintains its catalog, which contains all database metadata, in communal storage. Vertica uses this metadata when it revives the database, so it is important that the catalog is always up to date. Vertica automatically synchronizes the catalog at regular intervals, as specified by the CatalogSyncInterval configuration parameter (by default, every five minutes).
In general, it is not necessary to monitor the synchronization process or change it. One exception applies: before shutting down a database that you intend to revive or replicate, it is good practice to verify that the catalog contains all recent changes, and if necessary synchronize it manually.
Verifying catalog status
You can verify the synchronization status of the database catalog in two ways, depending on whether the database is running.
If the database is running, query and compare these two system tables:
- CATALOG_SYNC_STATE: Shows how recently each node synchronized its catalog to communal storage, and the version that it synchronized.
- CATALOG_TRUNCATION_STATUS: Shows the latest synchronization status of the database catalog. Catalog synchronization is up to date when the TRUNCATION_CATALOG_VERSION and CURRENT_CATALOG_VERSION columns are the same (see the query sketch after this list).
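The following query is a sketch for making that comparison; the exact column names in CATALOG_TRUNCATION_STATUS may vary slightly by Vertica version:
=> SELECT node_name, truncation_catalog_version, current_catalog_version
   FROM CATALOG_TRUNCATION_STATUS;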
If the database is not currently running, check the following JSON file on communal storage:
/metadata/database-name/cluster_config.json
The catalog truncation version and timestamp in this file indicate when Vertica last synchronized the database catalog.
Manually synchronizing the database catalog
If necessary, call SYNC_CATALOG to synchronize the catalog immediately with all nodes or a specific node:
=> SELECT sync_catalog();
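To synchronize only a specific node, you can pass its name to the function. This is a sketch; the node name is a placeholder:
=> SELECT sync_catalog('v_verticadb_node0001');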
Customizing synchronization intervals
By default, Vertica checks for catalog changes every five minutes. Occasionally, you might want to change this setting temporarily—for example, set it to a high value in order to take a snapshot of the current bucket contents:
=> ALTER DATABASE DEFAULT SET CatalogSyncInterval = 300;
Important
If you set CatalogSyncInterval to a high value for a specific task, be sure to revert it to its default setting immediately after the task is complete:
=> ALTER DATABASE DEFAULT CLEAR PARAMETER CatalogSyncInterval;