Using Vertica on the cloud

Welcome to the Vertica on the Cloud guide. This section explains how you can create Vertica clusters running on different cloud platforms. It does not cover working with existing data stored in the cloud. For information about loading data, see Data load.

This document assumes that you are familiar with the cloud environment on which you will create your Vertica cluster.

1 - Vertica on Amazon Web Services

This section explains how to create and manage Vertica clusters on AWS.

When you launch a cluster on AWS resources and are ready to create your database, consider whether to run it in Eon Mode or Enterprise Mode. The differences between these two modes lie in their architecture, deployment, and scalability:

  • Enterprise Mode stores data locally on the nodes in the database.

  • Eon Mode stores its data in an S3 bucket.


    Eon Mode separates the computational processes from the communal storage layer of your database. This separation lets you elastically vary the number of nodes in your database cluster to adjust to varying workloads.

    Vertica provides CloudFormation Templates (CFTs) through the AWS Marketplace. These CFTs also deploy the Management Console.

See Architecture for more about the differences between the two database modes.

1.1 - Overview of Vertica on Amazon Web Services (AWS)

Vertica clusters on AWS can operate on EC2 instances automatically provisioned using a CloudFormation Template, or manually deployed from Amazon Machine Images (AMIs).

You can create a database in either Eon Mode or Enterprise Mode in a Vertica cluster in AWS.

For more information about Amazon cluster instances and their limitations, see the Amazon documentation.

1.1.1 - CloudFormation templates

Vertica provides Cloud Formation Templates (CFTs) through the AWS Marketplace. After you provide a few parameters to the template, create a stack to automatically provision the AWS resources for your Vertica system.

After creating the stack, in the Management Console (MC) you can create and manage your clusters and databases. See Creating an Eon Mode database in AWS with MC or Creating an Enterprise Mode database in AWS with MC.

1.1.2 - Vertica offerings on AWS

Using the license models and CFTs described in CloudFormation template (CFT) overview, you can install the following Vertica products:

  • Vertica BYOL, Amazon Linux 2.0

  • Vertica by the Hour, Amazon Linux 2.0

  • Vertica BYOL, Red Hat

  • Vertica by the Hour, Red Hat

See Launch MC and AWS resources with a CloudFormation template for information on installing these products.

1.1.3 - Vertica AMI operating systems for AWS

Vertica provides Vertica and Management Console AMIs in the following operating systems.

  • Red Hat 7.4 and later

  • Amazon Linux 2.0 and later

You can use the AMI to deploy MC hosts or cluster hosts.

1.1.4 - Supported AWS instance types

Vertica supports a range of Amazon Web Services instance types, each optimized for different purposes. Choose the instance type that best matches your requirements. The two tables below list the AWS instance types that Vertica supports for Vertica cluster hosts, and for use in MC. For more information, see the Amazon Web Services documentation on instance types and volumes.

Instance types for Vertica cluster hosts

Each Amazon EC2 Instance type natively provides one of the following storage options:

  • Elastic Block Store (EBS) provides durable storage: Data files stored on instance persist after instance is stopped.

  • Instance Store provides temporary storage: Data files stored on instance are lost when instance is stopped.

Optimization | Instance Types Using Only EBS Volumes (Durable) | Instance Types Using Instance Store Volumes (Temporary)
General purpose | m4.4xlarge, m4.10xlarge, m5.4xlarge, m5.8xlarge, m5.12xlarge | m5d.4xlarge, m5d.8xlarge, m5d.12xlarge
Compute | c4.4xlarge, c4.8xlarge, c5.4xlarge, c5.9xlarge | c3.4xlarge, c3.8xlarge, c5d.4xlarge, c5d.9xlarge
Memory | r4.4xlarge, r4.8xlarge, r4.16xlarge, r5.4xlarge, r5.8xlarge, r5.12xlarge | r3.4xlarge, r3.8xlarge, r5d.4xlarge, r5d.8xlarge, r5d.12xlarge
Storage | None | d2.4xlarge, d2.8xlarge, i3.4xlarge, i3.8xlarge, i3.16xlarge, i3en.3xlarge, i3en.6xlarge, i3en.12xlarge

Instance types available for MC hosts

Optimization | Type | Supports EBS Storage (Durable) | Supports Ephemeral Storage (Temporary)
Computing | c4.large | Yes | No
Computing | c4.xlarge | Yes | No
Computing | c5.large | Yes | No
Computing | c5.xlarge | Yes | No

More information

For more information about Amazon cluster instances and their limitations, see Manage Clusters in the Amazon Web Services documentation.

1.1.5 - Choosing AWS Eon Mode instance types

This topic lists the recommended instance types to use in an Eon Mode database running in AWS.

Choose instance types that support ephemeral instance storage or EBS volumes for your depot, depending on cost and availability. It is not mandatory to have an EBS-backed depot, because in Eon Mode, a copy of the data is safely stored in communal storage. Vertica recommends either r4 or i3 instances for production clusters.

The following table can help you decide between instances with ephemeral instance storage and instances with EBS-only storage. Check with AWS for the latest cost per hour.

Storage Type | Instance Type | Pros/Cons
Instance storage | i3.8xlarge | Instance storage offers better performance than EBS-attached storage through multiple EBS volumes. Instance storage can be striped (RAIDed) together to increase throughput and load balance I/O. However, data stored in instance-store volumes is not persistent through instance stops, terminations, or hardware failures.
EBS-only storage | r4.8xlarge with a 600 GB EBS volume attached | Newer instance types from AWS offer only the EBS option, and in most AWS regions it is easier to provision a large number of instances. You can terminate an instance but keep the EBS volume for a faster revive: preserving the EBS volume preserves the depot. Some cached files might have become stale, but they are ignored and evicted; much of the cached data will still be valid, which saves time when the node revives and warms its depot. You can also take advantage of full-volume encryption.

1.1.6 - Vertica AMI sleep c-states

By default, the following instances have their processor C-states set to a value of 1 in the Vertica AMI:

  • c4.8xlarge

  • d2.8xlarge

  • m4.10xlarge

This measure is meant to improve performance by limiting the sleep states that an instance running Vertica uses.

For more information about sleep states, visit the AWS Documentation.

1.1.7 - AWS features supported by Vertica

Vertica supports the following AWS features:

  • Enhanced Networking: Vertica recommends that you use the AWS enhanced networking for optimal performance. For more information, see Enabling Enhanced Networking on Linux Instances in a VPC in the AWS documentation.

  • Command Line Interface: Use the AWS command-line interface (CLI) with your Vertica AMIs. For more information, see What Is the AWS Command Line Interface? in the AWS documentation.

  • Elastic Load Balancing: Use elastic load balancing (ELB) for queries up to one hour. When enabling ELB, configure the timer to 3600 seconds. For more information, see Elastic Load Balancing in the AWS documentation.

1.1.8 - AWS authentication

Amazon defines two ways to control access to AWS resources such as S3: IAM roles, and the combination of an id, a secret, and (optionally) a session token. For long-term access to non-communal storage buckets, use IAM roles to centralize access control: to change an application's access settings, you do not need to change its configuration, you simply alter the IAM role applied to your EC2 instances.

However, for one-time tasks like backing up and restoring the database or loading data to and from non-communal storage buckets, you should use an AWS access key.

Vertica uses both of these authentication methods to support different features and use cases:

  • An Eon Mode database's access to S3 for communal and catalog storage must always use IAM role authentication. IAM roles are the default access control method for AWS resources. Vertica uses this method if you do not configure the legacy access control session parameters.

  • Individual users can read data from S3 storage locations other than the ones Vertica uses for communal storage. For example, users can use COPY to load data into Vertica from an S3 bucket or query an external table stored on S3. If the IAM role assigned to the Vertica nodes does not have access to this external S3 data, the user must set an id, secret, and optionally an access token in session variables to authorize access to it. These session variables override the IAM role set on the server. See S3 parameters for a list of these session parameters.

  • Individual users can export data to S3 using the Vertica Library for AWS. This library cannot use IAM authorization. Users who want to export data to S3 using this library must set id, secret, and optionally access token values in session variables. See Configure the Vertica library for Amazon Web Services for details.

Configuring an IAM role

To configure an IAM role that grants Vertica access to AWS resources, you must complete the following steps (a CLI sketch follows the list):

  1. Create an IAM role to allow EC2 instances to access the specific resources.

  2. Grant that role permission to access your resources.

  3. Attach this IAM role to each EC2 instance in the Vertica cluster.
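
The following AWS CLI sketch illustrates one way to carry out these three steps. It is an example only: the role, policy, profile, bucket, and instance identifiers are placeholders, and the exact S3 permissions your cluster needs depend on your deployment.

# Trust policy that lets EC2 instances assume the role (all names below are placeholders)
$ cat > vertica-ec2-trust.json <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Allow",
    "Principal": { "Service": "ec2.amazonaws.com" },
    "Action": "sts:AssumeRole"
  }]
}
EOF
$ aws iam create-role --role-name VerticaNodeRole \
    --assume-role-policy-document file://vertica-ec2-trust.json

# Grant the role access to a placeholder S3 bucket
$ cat > vertica-s3-access.json <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Allow",
    "Action": ["s3:GetObject", "s3:PutObject", "s3:DeleteObject", "s3:ListBucket"],
    "Resource": ["arn:aws:s3:::my-vertica-communal", "arn:aws:s3:::my-vertica-communal/*"]
  }]
}
EOF
$ aws iam put-role-policy --role-name VerticaNodeRole \
    --policy-name VerticaS3Access --policy-document file://vertica-s3-access.json

# Wrap the role in an instance profile and attach it to each EC2 instance in the cluster
$ aws iam create-instance-profile --instance-profile-name VerticaNodeProfile
$ aws iam add-role-to-instance-profile --instance-profile-name VerticaNodeProfile \
    --role-name VerticaNodeRole
$ aws ec2 associate-iam-instance-profile --instance-id i-0123456789abcdef0 \
    --iam-instance-profile Name=VerticaNodeProfile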

To see an example of IAM roles for a Vertica cluster, look at the roles defined in one of the CloudFormation Templates (CFTs) provided by Vertica. You can download these templates from any of the Vertica entries in the Amazon Marketplace. Under each entry's Usage Information section, click the View CloudFormation Template link, then click Download CloudFormation Template.

For more information about IAM roles, see IAM Roles for Amazon EC2 in the AWS documentation.

1.2 - Installing Vertica with CloudFormation templates

Vertica provides CloudFormation Templates (CFTs) on the AWS Marketplace that allow you to get a cluster up and running quickly. Using the template allows you to automatically provision your AWS resources and launch a Vertica cluster and Management Console, with minimal configuration required.

If you prefer to deploy a VPC, instances, and related resources manually, see Install Vertica with manually deployed AWS resources.

For details about creating an Eon Mode or Enterprise Mode database after you create a cluster with CFTs, see Amazon Web Services in MC.

1.2.1 - CloudFormation template (CFT) overview

With Vertica on AWS, use CloudFormation Templates (CFTs) to easily manage provisioning the AWS resources with a running Vertica system.

To access Vertica CFTs, go to the AWS Marketplace. Licensing models for CFTs are:

  • Bring Your Own License (BYOL): By default, a free Community Edition (CE) license is installed, limited to 3 nodes and 1 TB of data. To go beyond those limits, purchase a Vertica BYOL license.
    Outside of the BYOL CFTs, you can also use the Community Edition without a license file.

  • By the Hour: A pay-as-you-go model where you pay only for the number of hours you use for each node. One advantage of the by-the-hour listing is that all charges appear on your Amazon AWS bill, providing an alternative to purchasing a full Vertica license and eliminating the need to compute potential storage needs in advance.

Available Vertica CFTs are:

  • Management Console with 3 Vertica nodes: The easiest way to deploy Vertica. This CFT deploys an Eon Mode database by default. However, this environment can also be used to create an Enterprise Mode database. For more information, see Creating a database.

  • Deploy Management Console into new VPC: This CFT deploys all required AWS resources and installs the Vertica Management Console (MC). After stack creation completes, log in to the MC to provision a Vertica database cluster.

  • Deploy Management Console into existing VPC: This CFT deploys the Vertica Management Console (MC) in an already-existing VPC and subnet. After stack creation completes, the MC is available. Log in to MC to provision either a Vertica database cluster or an Eon Mode database cluster.

    For this CFT, you must first set up the VPC, subnet, and related network resources. For more information about the correct configuration of these resources for Vertica, see Creating a virtual private cloud and Configuring the network.

For more information

For supported operating systems for these CFTs, see Vertica AMI operating systems for AWS.

For Vertica products available on AWS, see Vertica offerings on AWS.

1.2.2 - Prerequisites for using CFTs

Before you can install Vertica on AWS using CloudFormation Templates (CFTs), verify that you have:

  • AWS account with permissions to create a VPC, subnet, security group, EC2 instances, and IAM roles (For more information about AWS accounts, see the AWS documentation)

  • Amazon key pair for SSH access to an EC2 instance. (See the AWS documentation for key pairs.)

1.2.3 - Launch MC and AWS resources with a CloudFormation template

Launch Management Console (MC) and its associated AWS resources using CloudFormation templates (CFTs) that are available through the AWS Marketplace. For a list of available CFTs, see CloudFormation template (CFT) overview.

Starting in the AWS Marketplace, launch the provisioning instance from which you can install Vertica:

  1. Log in to the AWS Marketplace with an AWS account (see the Prerequisites section above).

  2. Search for "Vertica" in the AWS Marketplace.

  3. Select a Vertica CFT. Each CFT leads you to a product overview page, with pricing estimates. (Also see CloudFormation template (CFT) overview for an overview of available templates and products).

  4. Click Continue to Subscribe.

  5. On the next page, select your launch settings based on your requirements for deployment.

  6. If you have not agreed to Vertica EULA terms on the AWS Marketplace before, click Accept Software Terms to subscribe.

  7. Click Launch with CloudFormation Console. The CloudFormation Console opens.

  8. The CloudFormation Console automatically supplies the URL in the Specify an Amazon S3 template URL field. Click Next.

  9. Follow the CloudFormation workflow and enter the parameters (collectively called a stack).

  10. After confirming the details you have provided for your new stack, click Create. The AWS console brings you to the Stacks page, where you can view the progress of the creation process. The process takes several minutes.

  11. The Outputs tab displays information about accessing your environment after the process completes.

Next, access the Management Console (MC) to deploy your cluster instances and create a database, as described in Access Management Console.

1.2.4 - Access Management Console

You use MC to deploy Vertica cluster instances and create a database. You can also use MC to manage and monitor your databases. You will use Management Console to provision a Vertica cluster and database on the AWS resources you just launched.

  1. On the AWS CloudFormation Stacks page, select your new stack and view the Outputs tab. This tab provides information about accessing your environment, as well as documentation and licensing resources.

  2. Click the Access Management Console URL. This link takes you to the MC login page.

  3. To log in, enter the MC username and password that you created using the CloudFormation Console.

    After login, MC displays the home page, with options to provision a new cluster or database or import existing ones. If you chose a CFT that also creates a database, your new database is also displayed on the home page.

    This page also provides a Resources section with links to online training, blogs, community, and help resources.

You have successfully launched Management Console on AWS resources.

If you have not yet provisioned a Vertica cluster and database, complete the steps in one of the following:

1.2.5 - Creating a virtual private cloud

A Vertica cluster on AWS must be logically located in the same network. This is similar to placing the nodes of an on-premises cluster within the same network. Create a virtual private cloud (VPC) to ensure the nodes in your cluster will be able to communicate with each other within AWS.

Create a single public subnet VPC with the following configurations:

For information about VPCs, including how to create one, visit the AWS documentation.

1.3 - Install Vertica with manually deployed AWS resources

Vertica provides an AMI that you can install on AWS resources that you manually deploy. This section will guide you through configuring your network settings on AWS, launching and preparing EC2 instances using the Vertica AMI, and creating a Vertica cluster on those EC2 instances.

Choose this method of installation if you are familiar with configuring AWS and have many specific AWS configuration needs. (To automatically deploy AWS resources and a Vertica cluster instead, see Installing Vertica with CloudFormation templates.)

1.3.1 - Configure your network

Before you create your cluster, you must configure the network on which Vertica will run. Vertica requires a number of specific network configurations to operate on AWS. You may also have specific network configuration needs beyond the default Vertica settings.

The following sections explain which Amazon EC2 features you need to configure for instance creation.

1.3.1.1 - Create a placement group, key pair, and VPC

Part of configuring your network for AWS is to create the following:

Create a placement group

A placement group is a logical grouping of instances in a single Availability Zone. A placement group is required for your cluster, and all Vertica nodes must be in the same placement group.

Vertica recommends placement groups for applications that benefit from low network latency, high network throughput, or both. To provide the lowest latency and the highest packet-per-second network performance for your placement group, choose an instance type that supports enhanced networking.

For information on creating placement groups, see Placement Groups in the AWS documentation.

Create a key pair

You need a key pair to access your instances using SSH. Create the key pair using the AWS interface and store a copy of your key (*.pem) file on your local machine. When you access an instance, you need to know the local path of your key.
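
If you prefer the AWS CLI to the console, the following sketch shows one way to create a key pair and save it locally; the key name is a placeholder.

$ aws ec2 create-key-pair --key-name vertica-cluster-key \
    --query 'KeyMaterial' --output text > vertica-cluster-key.pem
$ chmod 400 vertica-cluster-key.pem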

Use a key pair to:

  • Authenticate your connection as dbadmin to your instances from outside your cluster.

  • Install and configure Vertica on your AWS instances.

For information on creating a key pair, see Amazon EC2 Key Pairs in the AWS documentation.

Create a virtual private cloud (VPC)

You create a Virtual Private Cloud (VPC) on Amazon so that you can create a network of your EC2 instances. Your instances in the VPC all share the same network and security settings.

A Vertica cluster on AWS must be logically located in the same network. Create a VPC to ensure the nodes in your cluster can communicate with each other in AWS.

Create a single public subnet VPC with the following configurations:

For information on creating a VPC, see Create a Virtual Private Cloud (VPC) in the AWS documentation.
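
As an illustration only, the following CLI sketch creates a VPC and a single public subnet; the CIDR ranges are examples, and the resource IDs shown are placeholders for the IDs returned by the earlier commands.

# Create the VPC and a public subnet (CIDR ranges are examples)
$ aws ec2 create-vpc --cidr-block 10.0.0.0/16
$ aws ec2 create-subnet --vpc-id vpc-0123456789abcdef0 --cidr-block 10.0.0.0/24

# Enable DNS hostnames so instances in the VPC get resolvable names
$ aws ec2 modify-vpc-attribute --vpc-id vpc-0123456789abcdef0 \
    --enable-dns-hostnames "{\"Value\":true}"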

1.3.1.2 - Network ACL settings

Vertica requires the following basic network access control list (ACL) settings on an AWS instance running the Vertica AMI. Vertica recommends that you secure your network with additional ACL settings that are appropriate to your situation.

Inbound Rules

Type | Protocol | Port Range | Use | Source | Allow/Deny
SSH | TCP (6) | 22 | SSH (Optional—for access to your cluster from outside your VPC) | User Specific | Allow
Custom TCP Rule | TCP (6) | 5450 | MC (Optional—for MC running outside of your VPC) | User Specific | Allow
Custom TCP Rule | TCP (6) | 5433 | SQL Clients (Optional—for access to your cluster from SQL clients) | User Specific | Allow
Custom TCP Rule | TCP (6) | 50000 | Rsync (Optional—for backup outside of your VPC) | User Specific | Allow
Custom TCP Rule | TCP (6) | 1024-65535 | Ephemeral Ports (Needed if you use any of the above) | User Specific | Allow
ALL Traffic | ALL | ALL | N/A | 0.0.0.0/0 | Deny

Outbound Rules

Type | Protocol | Port Range | Use | Source | Allow/Deny
Custom TCP Rule | TCP (6) | 0–65535 | Ephemeral Ports | 0.0.0.0/0 | Allow

You can use the entire port range specified in the previous table, or find your specific ephemeral ports by entering the following command:

$ cat /proc/sys/net/ipv4/ip_local_port_range

More information

For detailed information on network ACLs within AWS, refer to Network ACLs in the Amazon documentation.

For detailed information on ephemeral ports within AWS, refer to Ephemeral Ports in the Amazon documentation.

1.3.1.3 - Configure TCP keepalive with AWS network load balancer

AWS supports three types of elastic load balancers (ELBs):

  • Classic Load Balancers

  • Application Load Balancers

  • Network Load Balancers

Vertica strongly recommends the AWS Network Load Balancer (NLB), which provides the best performance with your Vertica database. The Network Load Balancer acts as a proxy between clients (such as JDBC) and Vertica servers. The Classic and Application Load Balancers do not work with Vertica, in Enterprise Mode or Eon Mode.

To avoid timeouts and hangs when connecting to Vertica through the NLB, it is important to understand how AWS NLB handles idle timeouts for connections. For the NLB, AWS sets the idle timeout value to 350 seconds and you cannot change this value. The timeout applies to both connection points.

For a long-running query, if either the client or the server fails to send a timely keepalive, that side of the connection is terminated. This can lead to situations where a JDBC client hangs waiting for results that would never be returned because the server fails to send a keepalive within 350 seconds.

To identify an idle timeout/keepalive issue, run a query like this via a client such as JDBC:

=> SELECT SLEEP(355);

If there’s a problem, one of the following situations occurs:

  • The client connection terminates before 355 seconds. In this case, lower the JDBC keepalive setting so that keepalives are sent less than 350 seconds apart.

  • The client connection doesn’t return a result after 355 seconds. In this case, you need to adjust the server keepalive settings (tcp_keepalive_time and tcp_keepalive_intvl) so that keepalives are sent less than 350 seconds apart (see the sketch after this list).
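
As a rough sketch of the server-side adjustment, you can inspect and lower the kernel keepalive settings with sysctl on each Vertica node. The values shown are examples chosen to stay well under the 350-second NLB idle timeout; persist them in /etc/sysctl.conf if they work for you.

# Inspect the current keepalive settings
$ sysctl net.ipv4.tcp_keepalive_time net.ipv4.tcp_keepalive_intvl

# Example values: first probe after 120s of idle time, then every 30s
$ sudo sysctl -w net.ipv4.tcp_keepalive_time=120
$ sudo sysctl -w net.ipv4.tcp_keepalive_intvl=30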

For detailed information about AWS Network Load Balancers, see What is a Network Load Balancer? in the AWS documentation.

1.3.1.4 - Create and assign an internet gateway

When you create a VPC, an Internet gateway is automatically assigned to it. You can use that gateway, or you can assign your own. If you are using the default Internet gateway, continue with the procedure described in Create a security group.

Otherwise, create an Internet gateway specific to your needs. Associate that internet gateway with your VPC and subnet.

For information about how to create an Internet Gateway, see Internet Gateways in the AWS documentation.
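
For illustration, a CLI sketch of creating and attaching your own gateway might look like the following; the VPC, gateway, and route table IDs are placeholders.

$ aws ec2 create-internet-gateway
$ aws ec2 attach-internet-gateway --internet-gateway-id igw-0123456789abcdef0 \
    --vpc-id vpc-0123456789abcdef0

# Route internet-bound traffic from the subnet's route table through the gateway
$ aws ec2 create-route --route-table-id rtb-0123456789abcdef0 \
    --destination-cidr-block 0.0.0.0/0 --gateway-id igw-0123456789abcdef0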

1.3.1.5 - Assign an elastic IP address

An elastic IP address is an unchanging IP address that you can use to connect to your cluster externally. Vertica recommends you assign a single elastic IP to a node in your cluster. You can then connect to other nodes in your cluster from your primary node using their internal IP addresses dictated by your VPC settings.

Create an elastic IP address. For information, see Elastic IP Addresses in the AWS documentation.
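
A minimal CLI sketch, with placeholder IDs, of allocating an elastic IP and attaching it to the node you will use as your entry point:

$ aws ec2 allocate-address --domain vpc
$ aws ec2 associate-address --instance-id i-0123456789abcdef0 \
    --allocation-id eipalloc-0123456789abcdef0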

1.3.1.6 - Create a security group

The Vertica AMI has specific security group requirements. When you create a Virtual Private Cloud (VPC), AWS automatically creates a default security group and assigns it to the VPC. You can use the default security group, or you can name and assign your own.

Create and name your own security group using the following basic security group settings. You may make additional modifications based on your specific needs.

Inbound

Type | Use | Protocol | Port Range | IP
SSH | | TCP | 22 | The CIDR address range of administrative systems that require SSH access to the Vertica nodes. Make this range as restrictive as possible. You can add multiple rules for separate network ranges, if necessary.
DNS (UDP) | | UDP | 53 | Your private subnet address range (for example, 10.0.0.0/24).
Custom UDP | Spread | UDP | 4803 and 4804 | Your private subnet address range (for example, 10.0.0.0/24).
Custom TCP | Spread | TCP | 4803 | Your private subnet address range (for example, 10.0.0.0/24).
Custom TCP | VSQL/SQL | TCP | 5433 | The CIDR address range of client systems that require access to the Vertica nodes. This range should be as restrictive as possible. You can add multiple rules for separate network ranges, if necessary.
Custom TCP | Inter-node Communication | TCP | 5434 | Your private subnet address range (for example, 10.0.0.0/24).
Custom TCP | | TCP | 5444 | Your private subnet address range (for example, 10.0.0.0/24).
Custom TCP | MC | TCP | 5450 | The CIDR address of client systems that require access to the management console. This range should be as restrictive as possible. You can add multiple rules for separate network ranges, if necessary.
Custom TCP | Rsync | TCP | 50000 | Your private subnet address range (for example, 10.0.0.0/24).
ICMP | Installer | Echo Reply | N/A | Your private subnet address range (for example, 10.0.0.0/24).
ICMP | Installer | Traceroute | N/A | Your private subnet address range (for example, 10.0.0.0/24).

Outbound

Type | Protocol | Port Range | Destination | IP
All TCP | TCP | 0-65535 | Anywhere | 0.0.0.0/0
All ICMP | ICMP | 0-65535 | Anywhere | 0.0.0.0/0
All UDP | UDP | 0-65535 | Anywhere | 0.0.0.0/0

For information about what a security group is, as well as how to create one, see Amazon EC2 Security Groups for Linux Instances in the AWS documentation.
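
For illustration only, the following CLI sketch creates a security group and adds a few of the inbound rules from the table above; the group, VPC, and CIDR values are placeholders, and you would repeat authorize-security-group-ingress for the remaining rows.

$ aws ec2 create-security-group --group-name vertica-cluster-sg \
    --description "Vertica cluster" --vpc-id vpc-0123456789abcdef0

# SSH from an administrative network range (placeholder CIDR)
$ aws ec2 authorize-security-group-ingress --group-id sg-0123456789abcdef0 \
    --protocol tcp --port 22 --cidr 203.0.113.0/24

# Spread and client ports from the private subnet
$ aws ec2 authorize-security-group-ingress --group-id sg-0123456789abcdef0 \
    --protocol udp --port 4803-4804 --cidr 10.0.0.0/24
$ aws ec2 authorize-security-group-ingress --group-id sg-0123456789abcdef0 \
    --protocol tcp --port 5433 --cidr 10.0.0.0/24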

1.3.2 - Deploy AWS instances for your Vertica database cluster

Once you have configured your network, you are ready to create your AWS instances and install Vertica. Follow these procedures to install and run Vertica on AWS.

1.3.2.1 - Configure and launch an instance

After you configure your network settings on AWS, configure and launch the instances onto which you will install Vertica. An Elastic Compute Cloud (EC2) instance without a Vertica AMI is similar to a traditional host. Just like with an on-premises cluster, you must prepare and configure your cluster and network at the hardware level before you can install Vertica.

When you create an EC2 instance on AWS using a Vertica AMI, the instance includes the Vertica software and the recommended configuration. The Vertica AMI acts as a template, requiring fewer configuration steps. Vertica recommends that you use the Vertica AMI as is—without modification.

Configure EC2 instances in AWS

  1. Select the Vertica AMI from the AWS marketplace.

  2. Select the desired fulfillment method.

  3. Configure the following:

Add storage to your instances

Consider the following issues when you add storage to your instances:

  • Add a number of drives equal to the number of physical cores in your instance. For example, for a c3.8xlarge instance, add 16 drives; for an r3.4xlarge, add 8 drives.

  • Do not store your information on the root volume.

  • Amazon EBS provides durable, block-level storage volumes that you can attach to running instances. For guidance on selecting and configuring an Amazon EBS volume type, see Amazon EBS Volume Types in the Amazon Web Services documentation.

Decide whether to configure EBS volumes as a RAID array

You can choose to configure your EBS volumes into a RAID 0 array to improve disk performance. Before doing so, use the vioperf utility to determine whether the performance of the EBS volumes is fast enough without using them in a RAID array. Pass vioperf the path to a mount point for an EBS volume. In this example, an EBS volume is mounted on a directory named /vertica/data:

[dbadmin@ip-10-11-12-13 ~]$ /opt/vertica/bin/vioperf /vertica/data

The minimum required I/O is 20 MB/s read and write per physical processor core on
each node, in full duplex i.e. reading and writing at this rate simultaneously,
concurrently on all nodes of the cluster. The recommended I/O is 40 MB/s per
physical core on each node. For example, the I/O rate for a server node with 2
hyper-threaded six-core CPUs is 240 MB/s required minimum, 480 MB/s recommended.

Using direct io (buffer size=1048576, alignment=512) for directory "/vertica/data"

test      | directory     | counter name        | counter | counter   | counter       | counter       | thread | %CPU  | %IO Wait  | elapsed | remaining
          |               |                     | value   | value (10 | value/core    | value/core    | count  |       |           | time (s)| time (s)
          |               |                     |         | sec avg)  |               | (10 sec avg)  |        |       |           |         |
--------------------------------------------------------------------------------------------------------------------------------------------------------
Write     | /vertica/data | MB/s                | 259     | 259       | 32.375        | 32.375        | 8      | 4     | 11        | 10      | 65
Write     | /vertica/data | MB/s                | 248     | 232       | 31            | 29            | 8      | 4     | 11        | 20      | 55
Write     | /vertica/data | MB/s                | 240     | 234       | 30            | 29.25         | 8      | 4     | 11        | 30      | 45
Write     | /vertica/data | MB/s                | 240     | 233       | 30            | 29.125        | 8      | 4     | 13        | 40      | 35
Write     | /vertica/data | MB/s                | 240     | 233       | 30            | 29.125        | 8      | 4     | 13        | 50      | 25
Write     | /vertica/data | MB/s                | 240     | 232       | 30            | 29            | 8      | 4     | 12        | 60      | 15
Write     | /vertica/data | MB/s                | 240     | 238       | 30            | 29.75         | 8      | 4     | 12        | 70      | 5
Write     | /vertica/data | MB/s                | 240     | 235       | 30            | 29.375        | 8      | 4     | 12        | 75      | 0
ReWrite   | /vertica/data | (MB-read+MB-write)/s| 237+237 | 237+237   | 29.625+29.625 | 29.625+29.625 | 8      | 4     | 22        | 10      | 65
ReWrite   | /vertica/data | (MB-read+MB-write)/s| 235+235 | 234+234   | 29.375+29.375 | 29.25+29.25   | 8      | 4     | 20        | 20      | 55
ReWrite   | /vertica/data | (MB-read+MB-write)/s| 234+234 | 235+235   | 29.25+29.25   | 29.375+29.375 | 8      | 4     | 20        | 30      | 45
ReWrite   | /vertica/data | (MB-read+MB-write)/s| 233+233 | 234+234   | 29.125+29.125 | 29.25+29.25   | 8      | 4     | 18        | 40      | 35
ReWrite   | /vertica/data | (MB-read+MB-write)/s| 233+233 | 234+234   | 29.125+29.125 | 29.25+29.25   | 8      | 4     | 20        | 50      | 25
ReWrite   | /vertica/data | (MB-read+MB-write)/s| 234+234 | 235+235   | 29.25+29.25   | 29.375+29.375 | 8      | 3     | 19        | 60      | 15
ReWrite   | /vertica/data | (MB-read+MB-write)/s| 233+233 | 236+236   | 29.125+29.125 | 29.5+29.5     | 8      | 4     | 21        | 70      | 5
ReWrite   | /vertica/data | (MB-read+MB-write)/s| 232+232 | 236+236   | 29+29         | 29.5+29.5     | 8      | 4     | 21        | 75      | 0
Read      | /vertica/data | MB/s                | 248     | 248       | 31            | 31            | 8      | 4     | 12        | 10      | 65
Read      | /vertica/data | MB/s                | 241     | 236       | 30.125        | 29.5          | 8      | 4     | 15        | 20      | 55
Read      | /vertica/data | MB/s                | 240     | 232       | 30            | 29            | 8      | 4     | 10        | 30      | 45
Read      | /vertica/data | MB/s                | 240     | 232       | 30            | 29            | 8      | 4     | 12        | 40      | 35
Read      | /vertica/data | MB/s                | 240     | 234       | 30            | 29.25         | 8      | 4     | 12        | 50      | 25
Read      | /vertica/data | MB/s                | 238     | 235       | 29.75         | 29.375        | 8      | 4     | 15        | 60      | 15
Read      | /vertica/data | MB/s                | 238     | 232       | 29.75         | 29            | 8      | 4     | 13        | 70      | 5
Read      | /vertica/data | MB/s                | 238     | 238       | 29.75         | 29.75         | 8      | 3     | 9         | 75      | 0
SkipRead  | /vertica/data | seeks/s             | 22909   | 22909     | 2863.62       | 2863.62       | 8      | 0     | 6         | 10      | 65
SkipRead  | /vertica/data | seeks/s             | 21989   | 21068     | 2748.62       | 2633.5        | 8      | 0     | 6         | 20      | 55
SkipRead  | /vertica/data | seeks/s             | 21639   | 20936     | 2704.88       | 2617          | 8      | 0     | 7         | 30      | 45
SkipRead  | /vertica/data | seeks/s             | 21478   | 20999     | 2684.75       | 2624.88       | 8      | 0     | 6         | 40      | 35
SkipRead  | /vertica/data | seeks/s             | 21381   | 20995     | 2672.62       | 2624.38       | 8      | 0     | 5         | 50      | 25
SkipRead  | /vertica/data | seeks/s             | 21310   | 20953     | 2663.75       | 2619.12       | 8      | 0     | 5         | 60      | 15
SkipRead  | /vertica/data | seeks/s             | 21280   | 21103     | 2660          | 2637.88       | 8      | 0     | 8         | 70      | 5
SkipRead  | /vertica/data | seeks/s             | 21272   | 21142     | 2659          | 2642.75       | 8      | 0     | 6         | 75      | 0

If the EBS volume read and write performance (the entries with Read and Write in column 1 of the output) is greater than 20MB/s per physical processor core (columns 6 and 7), you do not need to configure the EBS volumes as a RAID array to meet the minimum requirements to run Vertica. You may still consider configuring your EBS volumes as a RAID array if the performance is less than the optimal 40MB/s per physical core (as is the case in this example).

If you determine you need to configure your EBS volumes as a RAID 0 array, see the AWS documentation topic RAID Configuration on Linux for the steps you need to take.

Security group and access

  1. Choose between your previously configured security group or the default security group.

  2. Configure S3 access for your nodes by creating and assigning an IAM role to your EC2 instance. See AWS authentication for more information.

Launch instances

Verify that your instances are running.

1.3.2.2 - Connect to an instance

Using your private key, take these steps to connect to your cluster through the instance to which you attached an elastic IP address:

  1. As the dbadmin user, type the following command, substituting your ssh key:

    $ ssh -i <ssh key> dbadmin@elasticipaddress
    
  2. Select Instances from the Navigation panel.

  3. Select the instance that is attached to the Elastic IP.

  4. Click Connect.

  5. On Connect to Your Instance, choose one of the following options:

    • A Java SSH Client directly from my browser—Add the path to your private key in the Private key path field, and click Launch SSH Client.

    • Connect with a standalone SSH client—Follow the steps required by your standalone SSH client.

Connect to an instance from windows using putty

If you connect to the instance from the Windows operating system, and plan to use Putty:

  1. Convert your key file using PuTTYgen.

  2. Connect with Putty or WinSCP (connect via the elastic IP), using your converted key (the *.ppk file).

  3. Move your key file (the *.pem file) to the root directory using Putty or WinSCP.

1.3.2.3 - Prepare instances for cluster formation

After you create your instances, you need to prepare them for cluster formation. Prepare your instances by adding your AWS .pem key and your Vertica license.

By default, each AMI includes a Community Edition license. Once Vertica is installed, you can find the license at this location:

/opt/vertica/config/licensing/vertica_community_edition.license.key

  1. As the dbadmin user, copy your *.pem file (from where you saved it locally) onto your primary instance (see the scp example after these steps).

    Depending upon the procedure you use to copy the file, the permissions on the file may change. If permissions change, the install_vertica script fails with a message similar to the following:

    FATAL (19): Failed Login Validation 10.0.3.158, cannot resolve or connect to host as root.
    

    If you receive a failure message, enter the following command to correct permissions on your *.pem file:

    $ chmod 600 /<name-of-pem>.pem
    
  2. Copy your Vertica license over to your primary instance, placing it in your home directory or other known location.
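
One way to copy both files from your local machine is scp, using the same key pair; the paths, file names, and elastic IP below are placeholders.

$ scp -i ~/keys/name-of-pem.pem ~/keys/name-of-pem.pem dbadmin@<elastic-ip>:~/
$ scp -i ~/keys/name-of-pem.pem ~/licenses/vlicense.dat dbadmin@<elastic-ip>:~/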

1.3.2.4 - Change instances on AWS

You can change instance types on AWS. For example, you can downgrade a c3.8xlarge instance to c3.4xlarge. See Supported AWS instance types for a list of valid AWS instances.

When you change AWS instances you may need to:

  • Reconfigure memory settings

  • Reset memory size in a resource pool

  • Reset number of CPUs in a resource pool

Reconfigure memory settings

If you change to an AWS instance type that requires a different amount of memory, you may need to recompute the following and then reset the values:

Reset memory size in a resource pool

If you used absolute memory in a resource pool, you may need to reconfigure the memory using the MEMORYSIZE parameter in ALTER RESOURCE POOL.

Reset number of CPUs in a resource pool

If your new instance requires a different number of CPUs, you may need to reset the CPUAFFINITYSET parameter in ALTER RESOURCE POOL.
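
As a sketch of what those adjustments can look like, run ALTER RESOURCE POOL through vsql. The pool name and values below are examples only; the right values depend on your new instance type.

$ vsql -c "ALTER RESOURCE POOL analytics_pool MEMORYSIZE '8G';"
$ vsql -c "ALTER RESOURCE POOL analytics_pool CPUAFFINITYSET '0-7' CPUAFFINITYMODE SHARED;"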

1.3.2.5 - Configure storage

Vertica recommends that you store information — especially your data and catalog directories — on dedicated Amazon EBS volumes formatted with a supported file system. The /opt/vertica/sbin/configure_software_raid.sh script automates the storage configuration process.

Vertica performance tests Eon Mode with a per-node EBS volume of up to 2TB. For best performance, combine multiple EBS volumes into a RAID 0 array.

For more information about RAID 0 arrays and EBS volumes, see RAID configuration on Linux.

Determining volume names

Because the storage configuration script requires the volume names that you want to configure, you must identify the volumes on your machine. The following command lists the contents of the /dev directory. Search for the volumes that begin with xvd:

$ ls /dev

Combining volumes for storage

The configure_software_raid.sh shell script combines your EBS volumes into a RAID 0 array.

The following steps combine your EBS volumes into RAID 0 with the configure_software_raid.sh script:

  1. Edit the /opt/vertica/sbin/configure_software_raid.sh shell file as follows:

    1. Comment out the safety exit command at the beginning of the script.

    2. Change the sample volume names to your own volume names, which you noted previously. Add more volumes, if necessary.

  2. Run the /opt/vertica/sbin/configure_software_raid.sh shell file. Running this file creates a RAID 0 volume and mounts it to /vertica/data.

  3. Change the owner of the newly created volume to dbadmin with chown (see the example after these steps).

  4. Repeat steps 1-3 for each node on your cluster.
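
A minimal example for step 3, assuming the script's default /vertica/data mount point and the default verticadba group created by the Vertica installer:

$ sudo chown -R dbadmin:verticadba /vertica/data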

1.3.2.6 - Create a cluster

On AWS, use the install_vertica script to combine instances and create a cluster. Check your My Instances page on AWS for a list of current instances and their associated IP addresses. You need these IP addresses when you run install_vertica.

Create a cluster as follows:

  1. While connected to your primary instance, enter the following command to combine your instances into a cluster. Substitute the IP addresses for your instances and include your root *.pem file name.

    $ sudo /opt/vertica/sbin/install_vertica --hosts 10.0.11.164,10.0.11.165,10.0.11.166 \
      --dba-user-password-disabled --point-to-point --data-dir /vertica/data \
      --ssh-identity ~/name-of-pem.pem --license license.file
    
  2. After combining your instances, Vertica recommends deleting your *.pem key from your cluster to reduce security risks. The example below uses the shred command to delete the file:

    $ shred name-of-pem.pem
    
  3. After creating one or more clusters, create your database.

For complete information on the install_vertica script and its parameters, see Installing Vertica with the installation script.

Check open ports manually using the netcat utility

Once your cluster is up and running, you can check ports manually through the command line using the netcat (nc) utility. What follows is an example using the utility to check ports.

Before performing the procedure, choose the private IP addresses of two nodes in your cluster.

The examples given below use nodes with the private IPs:

10.0.11.60 and 10.0.11.61

Install the nc utility on your nodes. Once installed, you can issue commands to check the ports on one node from another node.
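
For example, on the Red Hat and Amazon Linux 2 AMIs the nc binary is typically provided by the nmap-ncat package (package names can vary by distribution):

$ sudo yum install -y nmap-ncat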

  1. To check a TCP port:

    1. Put one node in listen mode and specify the port. The following sample shows how to put IP 10.0.11.60 into listen mode for port 4804.

      [root@ip-10-0-11-60 ~]# nc -l 4804
      
    2. From the other node, run nc specifying the IP address of the node you just put in listen mode, and the same port number.

      [root@ip-10-0-11-61 ~]# nc 10.0.11.60 4804
      
    3. Enter sample text from either node and it should show up on the other node. To cancel after you have checked a port, enter Ctrl+C.

  2. To check a UDP port, use the same nc commands with the -u option:

    [root@ip-10-0-11-60 ~]# nc -u -l 4804
    [root@ip-10-0-11-61 ~]# nc -u 10.0.11.60 4804
    

1.3.2.7 - Use Management Console (MC) on AWS

Management Console (MC) is a database management tool that allows you to view and manage aspects of your cluster. Vertica provides an MC AMI, which you can use with AWS. The MC AMI allows you to create an instance, dedicated to running MC, that you can attach to a new or existing Vertica cluster on AWS. You can create and attach an MC instance to your Vertica on AWS cluster at any time.

For information on requirements and installing MC, see Installing Management Console.

1.3.2.7.1 - Log in to MC and manage your cluster

After you launch your MC instance and configure your security group settings, log in to your database. To do so, use the elastic IP you specified during instance creation.

From this elastic IP, you can manage your Vertica database on AWS using standard MC procedures.

Considerations when using MC on AWS

  • Because MC is already installed on the MC AMI, the MC installation process does not apply.

  • To uninstall MC on AWS, follow the procedures provided in Uninstalling Management Console before terminating the MC Instance.

1.4 - Export data to Amazon S3 using the AWS library

The AWS library is deprecated.

The Vertica library for Amazon Web Services (AWS) is a set of functions and configurable session parameters. Together, they allow you to export delimited data from Vertica to Amazon S3 storage without any third-party scripts or programs.

To use the AWS library, you must have access to an Amazon S3 storage account.

1.4.1 - Configure the Vertica library for Amazon Web Services

You use the Vertica library for Amazon Web Services (AWS) to export data from Vertica to S3. This library does not support IAM authentication. You must configure it to authenticate with S3 by using session parameters containing your AWS access key credentials. You can set your session parameters directly, or you can store your credentials in a table and set them with the AWS_SET_CONFIG function.

Because the AWS library uses session parameters, you must reconfigure the library with each new session.

Set AWS authentication parameters

The following AWS authentication parameters allow you to access AWS and work with the data in your Vertica database:

  • aws_id: The 20-character AWS access key used to authenticate your account.

  • aws_secret: The 40-character AWS secret access key used to authenticate your account.

  • aws_session_token: The AWS temporary security token generated by running the AWS STS command get-session-token. This AWS STS command generates temporary credentials you can use to implement multi-factor authentication for security purposes. See Implementing Multi-factor Authentication.

Implement multi-factor authentication

Implement multi-factor authentication as follows:

  1. Run the AWS STS command get-session-token, which returns output like the following:

    $ aws sts get-session-token
    {
    "Credentials": {
    "SecretAccessKey": "bQid6jNuSWRqUzkIJCFG7c71gDHZY3h7aDSW2DU6",
    "SessionToken":
    "FQoDYXdzEBcaDKM1mWpeu88nDTTFICKsAbaiIDTWe4BTh33tnUvo9F/8mZicKKLLy7WIcpT4FLfr6ltIm242/U2CI9G/
    XdC6eoysUi3UGH7cxdhjxAW4fjgCKKYuNL764N2xn0issmIuJOku3GTDyc4U4iNlWyEng3SlshdiqVlk1It2Mk0isEQXKtx
    F9VgfncDQBxjZUCkYIzseZw5pULa9YQcJOzl+Q2JrdUCWu0iFspSUJPhOguH+wTqiM2XdHL5hcUcomqm41gU=",
    "Expiration": "2018-04-12T01:58:50Z",
    "AccessKeyId": "ASIAJ4ZYGTOSVSLUIN7Q"
     }
    }
    

    For more information on get-session-token, see the AWS documentation.

  2. Using the SecretAccessKey returned from get-session-token, set your temporary aws_secret:

    => ALTER SESSION SET UDPARAMETER FOR awslib aws_secret='bQid6jNuSWRqUzkIJCFG7c71gDHZY3h7aDSW2DU6';
    
  3. Using the SessionToken returned from get-session-token, set your temporary aws_session_token:

    => ALTER SESSION SET UDPARAMETER FOR awslib aws_session_token='FQoDYXdzEBcaDKM1mWpeu88nDTTFICKsAbaiIDTWe4B
    Th33tnUvo9F/8mZicKKLLy7WIcpT4FLfr6ltIm242/U2CI9G/XdC6eoysUi3UGH7cxdhjxAW4fjgCKKYuNL764N2xn0issmIuJOku3GTDy
    c4U4iNlWyEng3SlshdiqVlk1It2Mk0isEQXKtxF9VgfncDQBxjZUCkYIzseZw5pULa9YQcJOzl+Q2JrdUCWu0iFspSUJPhOguH+wTq
    iM2XdHL5hcUcomqm41gU=';
    
  4. Using the AccessKeyId returned from get-session-token, set your temporary aws_id:

    => ALTER SESSION SET UDPARAMETER FOR awslib aws_id='ASIAJ4ZYGTOSVSLUIN7Q';
    

The Expiration value returned indicates when the temporary credentials expire. In this example, expiration occurs on April 12, 2018, at 01:58:50.

These examples show how to implement multi-factor authentication using session parameters. You can use either of the methods described in the next two sections to securely set and store your AWS account credentials.

AWS access key requirements

To communicate with AWS, your access key must have the following permissions:

  • s3:GetObject

  • s3:PutObject

  • s3:ListBucket

For security purposes, Vertica recommends that you create a separate access key with limited permissions specifically for use with the Vertica Library for AWS.
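
The following AWS CLI sketch shows one way to create such a limited access key; the user name, policy name, and bucket are placeholders, and your security requirements may call for a different setup.

$ cat > vertica-export-policy.json <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Allow",
    "Action": ["s3:GetObject", "s3:PutObject", "s3:ListBucket"],
    "Resource": ["arn:aws:s3:::exampleBucket", "arn:aws:s3:::exampleBucket/*"]
  }]
}
EOF
$ aws iam put-user-policy --user-name vertica-export-user \
    --policy-name VerticaS3Export --policy-document file://vertica-export-policy.json

# Generates the aws_id and aws_secret values to use in the session parameters
$ aws iam create-access-key --user-name vertica-export-user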

Configure session parameters directly

These examples show how to set the session parameters for AWS using your own credentials. Parameter values are case sensitive:

  • aws_id: This value is your AWS access key ID.

    => ALTER SESSION SET UDPARAMETER FOR awslib aws_id='AKABCOEXAMPLEPKPXYZQ';
    
  • aws_secret: This value is your AWS secret access key.

    => ALTER SESSION SET UDPARAMETER FOR awslib aws_secret='CEXAMPLE3tEXAMPLE1wEXAMPLEFrFEXAMPLE6+Yz';
    
  • aws_region: This value is the AWS region associated with the S3 bucket you intend to access. Left unconfigured, aws_region will default to us-east-1. It identifies the default server used by Amazon S3.

    => ALTER SESSION SET UDPARAMETER FOR awslib aws_region='us-east-1';
    

When using ALTER SESSION:

  • Using ALTER SESSION to change the values of S3 parameters also changes the values of corresponding UDParameters.

  • Setting a UDParameter changes only the UDParameter.

  • Setting a configuration parameter changes both the AWS parameter and UDParameter.

Configure session parameters using credentials stored in a table

You can place your credentials in a table and secure them with a row-level access policy. You can then call your credentials with the AWS_SET_CONFIG scalar meta-function. This approach allows you to store your credentials on your cluster for future session parameter configuration. You must have dbadmin access to create access policies.

  1. Create a table with columns corresponding to your credentials:

    => CREATE TABLE keychain(accesskey varchar, secretaccesskey varchar);
    
  2. Store your credentials in the corresponding columns:

    => COPY keychain FROM STDIN;
    Enter data to be copied followed by a newline.
    End with a backslash and a period on a line by itself.
    >> AEXAMPLEI5EXAMPLEYXQ|CCEXAMPLEtFjTEXAMPLEiEXAMPLE6+Yz
    >> \.
    
  3. Set a row-level access policy appropriate to your security situation.

  4. With each new session, configure your session parameters by calling the AWS_SET_CONFIG meta-function in a SELECT statement:

    => SELECT AWS_SET_CONFIG('aws_id', accesskey), AWS_SET_CONFIG('aws_secret', secretaccesskey)
       FROM keychain;
     aws_set_config | aws_set_config
    ----------------+----------------
     aws_id         | aws_secret
    (1 row)
    
  5. After you have configured your session parameters, verify them:

    => SHOW SESSION UDPARAMETER ALL;
    

1.4.2 - Export data to Amazon S3 from Vertica

After you configure the library for Amazon Web Services (AWS), you can export Vertica data to Amazon S3 by calling the S3EXPORT() transform function. S3EXPORT() writes data to files, based on the URL you provide. Vertica performs all communication over HTTPS, regardless of the URL type you use. Vertica does not support virtual host style URLs. If you use HTTPS URL constructions, you must use path style URLs.

You can control the output of S3EXPORT() in the following ways:

Adjust the query provided to S3EXPORT

By adjusting the query given to S3EXPORT(), you can export anything from tables to reporting queries.

This example exports a whole table:

=> SELECT S3EXPORT( * USING PARAMETERS url='s3://exampleBucket/object') OVER(PARTITION BEST)
   FROM exampleTable;
 rows | url
------+------------------------------
  606 | https://exampleBucket/object
(1 row)

This example exports the results of a query:

=> SELECT S3EXPORT(customer_name, annual_income USING PARAMETERS url='s3://exampleBucket/object') OVER()
    FROM public.customer_dimension
      WHERE (customer_gender, annual_income) IN
        (SELECT customer_gender, MAX(annual_income)
         FROM public.customer_dimension
         GROUP BY customer_gender);

 rows | url
------+------------------------------
   25 | https://exampleBucket/object
(1 row)

Adjust the partition of your result set with the OVER clause

Use the OVER clause to control your export partitions. Using the OVER() clause without qualification results in a single partition processed by the initiator for all of the query data. This example shows how to call the function with an unqualified OVER() clause:

=> SELECT S3EXPORT(name, company USING PARAMETERS url='s3://exampleBucket/object',
                                                  delimiter=',') OVER()
     FROM exampleTable WHERE company='Vertica';
 rows | url
------+------------------------------
   10 | https://exampleBucket/object
(1 row)

You can also use window clauses, such as window partition clauses and window order clauses, to manage exported objects.

This example shows how you can use a window partition clause to partition S3 objects based on company values:

=> SELECT S3EXPORT(name, company
                    USING PARAMETERS url='s3://exampleBucket/object',
                                     delimiter=',') OVER(PARTITION BY company) AS MEDIAN
      FROM exampleTable;

Adjusting the export chunk size for wide tables

You may encounter the following error when exporting extremely wide tables or tables with long data types such as LONG VARCHAR or LONG VARBINARY:

=> SELECT S3EXPORT( * USING PARAMETERS url='s3://exampleBucket/object') OVER(PARTITION BEST)
   FROM veryWideTable;
ERROR 5861: Error calling setup() in User Function s3export
at [/data/.../S3.cpp:787],
error code: 0, message: The specified buffer of 10485760 bytesRead is too small,
it should be at least 11279701 bytesRead.

Vertica returns this error if the data for a single row overflows the buffer storing the data before export. By default, this buffer is 10MB. You can increase the size of this buffer using the chunksize parameter, which sets the size of the buffer in bytes. This example sets it to around 60MB:

=> SELECT S3EXPORT( * USING PARAMETERS url='s3://exampleBucket/object', chunksize=60485760)
   OVER(PARTITION BEST) FROM veryWideTable;
 rows | url
------+------------------------------
  606 | https://exampleBucket/object
(1 row)

1.5 - Add nodes to a running cluster on the cloud

There are two ways to add nodes to an AWS cluster:

  • Using Management Console

  • Using admintools

When you use MC to add nodes to a cluster in the cloud, MC provisions the instances, adds the new instances to the existing Vertica cluster, and then adds those hosts to the database. However, when you add nodes to a cluster using admintools, you need to execute those steps yourself, as explained in Adding Nodes Using admintools.

Adding nodes using Management Console

In the Vertica Management Console, you can add nodes in several ways, depending on your database mode.

For Eon Mode databases, MC supports subcluster and node management actions on the supported public and private cloud providers.

For Enterprise Mode databases, MC supports these actions:

  • In the cloud on AWS: Add Node action, Add Instance action.

  • On-premises: Add Node action.

Adding nodes in an Eon Mode database

In an Eon Mode database, every node must belong to a subcluster. To add nodes, you always add them to one of the subclusters in the database.

Adding nodes in an Enterprise Mode database on AWS

In an Enterprise Mode database on AWS, to add an instance to your cluster:

  1. On the MC Home page, click View Infrastructure to go to the Infrastructure page. This page lists all the clusters the MC is monitoring.

  2. Click any cluster shown on the Infrastructure page.

  3. Select View or Manage from the dialog that displays, to view its Cluster page. (In a cloud environment, if MC was deployed from a cloud template the button says "Manage". Otherwise, the button says "View".)

  4. Click the Add (+) icon on the Instance List on the Cluster Management page.

    MC adds a node to the selected cluster.

Adding nodes using admintools

This section gives an overview on how to add nodes if you are managing your cluster using admintools. Each main step points to another topic with the complete instructions.

Step 1: before you start

Before you add nodes to a cluster, verify that you have an AWS cluster up and running and that you have:

  • Created a database.

  • Defined a database schema.

  • Loaded data.

  • Run the Database Designer.

  • Connected to your database.

Step 2: launch new instances to add to an existing cluster

Perform the procedure in Configure and launch an instance to create new instances (hosts) that you then will add to your existing cluster. Be sure to choose the same details you chose when you created the original instances (VPC, placement group, subnet, and security group).
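If you prefer to launch the additional instances from the AWS CLI instead of the console, the following command is a hedged sketch. The AMI ID, key pair, subnet, security group, placement group, and instance type shown here are placeholders; use the same values as your existing cluster.

# Launch one additional instance with the same settings as the existing cluster (placeholder IDs)
$ aws ec2 run-instances --image-id ami-0123456789abcdef0 --count 1 \
    --instance-type c5.4xlarge --key-name my-key-pair \
    --subnet-id subnet-0123456789abcdef0 --security-group-ids sg-0123456789abcdef0 \
    --placement "GroupName=my-placement-group"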

Step 3: include new instances as cluster nodes

You need the IP addresses when you run the install_vertica script to include new instances as cluster nodes.

If you are configuring Amazon Elastic Block Store (EBS) volumes, be sure to configure the volumes on the node before you add the node to your cluster.

To add the new instances as nodes to your existing cluster:

  1. Configure and launch your new instances.

  2. Connect to the instance that is assigned to the Elastic IP. See Connect to an instance if you need more information.

  3. Run the Vertica installation script to add the new instances as nodes to your cluster. Specify the internal IP addresses for your instances and your *.pem file name.

    $ sudo /opt/vertica/sbin/install_vertica --add-hosts instance-ip --dba-user-password-disabled \
      --point-to-point --data-dir /vertica/data --ssh-identity ~/name-of-pem.pem
    

Step 4: add the nodes

After you have added the new instances to your existing cluster, add them as nodes to your cluster, as described in Adding nodes to a database.

Step 5: rebalance the database

After you add nodes to a database, always rebalance the database.
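You can rebalance from the Administration Tools menus, or from a vsql session by calling the REBALANCE_CLUSTER function, for example:

=> SELECT REBALANCE_CLUSTER();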

1.6 - Remove nodes from a running AWS cluster

Use the following procedures to remove instances/nodes from an AWS cluster.

Use the following procedures to remove instances/nodes from an AWS cluster.

To avoid data loss, Vertica strongly recommends that you back up your database before removing a node. For details, see Backing up and restoring the database.
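Backups are typically taken with the vbr utility. The following is a minimal sketch, assuming you already have a backup configuration file (here named backup_config.ini) that defines your backup location and hosts:

$ vbr --task backup --config-file backup_config.ini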

In this section

1.6.1 - Remove hosts from the database

Before you remove hosts from the database, verify that you have:

  • Backed up the database.

  • Lowered the K-safety of the database.

To remove a host from the database:

  1. While logged on as dbadmin, launch Administration Tools.

    $ /opt/vertica/bin/admintools

  2. From the Main Menu, select Advanced Menu.

  3. From Advanced Menu, select Cluster Management. Click OK.

  4. From Cluster Management, select Remove Host(s). Click OK.

  5. From Select Database, choose the database from which you plan to remove hosts. Click OK.

  6. Select the host(s) to remove. Click OK.

  7. Click Yes to confirm removal of the hosts.

  8. Click OK. The system displays a message telling you that the hosts have been removed. Automatic rebalancing also occurs.

  9. Click OK to confirm. Administration Tools brings you back to the Cluster Management menu.

1.6.2 - Remove nodes from the cluster

To remove nodes from a cluster, run the update_vertica script and specify:

  • The option --remove-hosts, followed by the IP addresses of the nodes you are removing.

  • The option --ssh-identity, followed by the location and name of your *.pem file.

  • The option --dba-user-password-disabled.

The following example removes one node from the cluster:

$ sudo /opt/vertica/sbin/update_vertica  --remove-hosts 10.0.11.165  --point-to-point  \
  --ssh-identity ~/name-of-pem.pem --dba-user-password-disabled

1.6.3 - Stop the AWS instances (optional)

After you have removed one or more nodes from your cluster, to save costs associated with running instances, you can choose to stop the AWS instances that were previously part of your cluster.

After you have removed one or more nodes from your cluster, to save costs associated with running instances, you can choose to stop the AWS instances that were previously part of your cluster.

To stop an instance in AWS:

  1. On AWS, navigate to your Instances page.

  2. Right-click the instance, and choose Stop.

This step is optional because, after you have removed the node from your Vertica cluster, Vertica no longer sees the node as part of the cluster, even though it is still running within AWS.
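If you prefer the AWS CLI to the console, you can stop an instance with a command like the following, where the instance ID is a placeholder:

$ aws ec2 stop-instances --instance-ids i-0123456789abcdef0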

1.7 - Upgrade Vertica on AWS

Before you upgrade to the latest Vertica version, do the following:

  1. Back up your existing database.

  2. Download the Vertica install packages described in Download and Install the Vertica Install Package.

Upgrade to the latest version of Vertica on AWS

To upgrade to the latest version of Vertica on AWS, follow the instructions in Upgrading Vertica.

If you are setting up a Vertica cluster on AWS for the first time, follow the procedure for installing and running on AWS.

Upgrade Vertica running on AWS

Vertica supports upgrades of Vertica server running on AWS instances created from the Vertica AMI. To upgrade Vertica, follow the instructions provided in Upgrading Vertica.

Make sure to add the following arguments to the upgrade script (a sample command follows this list):

  • --dba-user-password-disabled

  • --point-to-point
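For example, assuming you have downloaded the new Vertica server RPM to the dbadmin home directory, the upgrade command might look like the following sketch; the RPM file name is a placeholder:

$ sudo /opt/vertica/sbin/update_vertica --rpm /home/dbadmin/vertica-<version>.x86_64.rpm \
    --dba-user-password-disabled --point-to-point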

1.8 - Copying and exporting data on AWS: what you need to know

There are common issues that occur when exporting or copying on AWS clusters, as described below.

There are common issues that occur when exporting or copying on AWS clusters, as described below. Except for these specific issues as they relate to AWS, copying and exporting data works as documented in Database export and import.

To copy or export data on AWS:

  1. Verify that all nodes in source and destination clusters have their own elastic IPs (or public IPs) assigned.

    Each node in one cluster must be able to communicate with each node in the other cluster, so each source and destination node needs an elastic IP (or public IP) assigned. If your destination cluster is located within the same VPC as your source cluster, proceed to step 3.

  2. (For non-CloudFormation Template installs) Create an S3 gateway endpoint.

    If you aren't using a CloudFormation Template (CFT) to install Vertica, you must create an S3 gateway endpoint in your VPC. For more information, see the AWS documentation. A sample AWS CLI command for creating the endpoint appears after this list.

    For example, the Vertica CFT has the following VPC endpoint:

    "S3Enpoint" : {
        "Type" : "AWS::EC2::VPCEndpoint",
        "Properties" : {
        "PolicyDocument" : {
            "Version":"2012-10-17",
            "Statement":[{
            "Effect":"Allow",
            "Principal": "*",
            "Action":["*"],
            "Resource":["*"]
            }]
        },
        "RouteTableIds" : [ {"Ref" : "RouteTable"} ],
        "ServiceName" : { "Fn::Join": [ "", [ "com.amazonaws.", { "Ref": "AWS::Region" }, ".s3" ] ] },
        "VpcId" : {"Ref" : "VPC"}
    }
    

  3. Verify that your security group allows the AWS clusters to communicate.

    Check your security groups for both your source and destination AWS clusters. Verify that ports 5433 and 5434 are open. If one of your AWS clusters is on a separate VPC, verify that your network access control list (ACL) allows communication on port 5434.

  4. If there are one or more elastic load balancers (ELBs) between the clusters, verify that port 5433 is open between the ELBs and clusters.

  5. If you use the Vertica client to connect to one or more ELBs, the ELBs only distribute incoming connections; data transmission occurs directly between the clusters.
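As mentioned in step 2, if you are not using a CFT you can create the S3 gateway endpoint yourself. The following is a hedged AWS CLI sketch; the VPC ID, route table ID, and region are placeholders for your own values:

$ aws ec2 create-vpc-endpoint --vpc-id vpc-0123456789abcdef0 \
    --service-name com.amazonaws.us-east-1.s3 \
    --route-table-ids rtb-0123456789abcdef0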

2 - Vertica on Microsoft Azure

You can deploy a Vertica database on the Microsoft Azure Cloud running in either Enterprise Mode or Eon Mode. In Eon Mode, Vertica stores its data communally using Azure block blob storage.

This section explains how to deploy a Vertica database to Microsoft Azure.

For more information about Azure, see the Azure documentation.

2.1 - Deploying Vertica from the Azure Marketplace

Deploy Vertica in the Microsoft Azure Cloud using the Vertica Analytics Platform entry in the Azure Marketplace.

Deploy Vertica in the Microsoft Azure Cloud using the Vertica Analytics Platform entry in the Azure Marketplace. Vertica provides the following deployment options:

  • Eon Mode: Deploy a Management Console (MC) instance, and then provision and create an Eon Mode database from the MC. For cluster and storage requirements, see Eon Mode on Azure prerequisites.

  • Enterprise Mode: Deploy a four-node Enterprise Mode database comprised of one MC instance and three database nodes. This requires an Azure subscription with a minimum of 12 cores for the Vertica Marketplace solution.

    The Enterprise Mode deployment uses the MC primarily as a monitoring tool. For example, you cannot provision and create a database with an Enterprise Mode MC. For information about creating and managing an Enterprise Mode database, see Create a database using administration tools.

Creating a deployment

Eon Mode and Enterprise Mode require much of the same information for deployment. Any information that is not required for both deployment types is clearly marked.

1. selecting the deployment type

  1. Sign in to your Microsoft Azure account. From the Home screen, select Create a resource under Azure services.

  2. Search for Vertica Analytics Platform and select it from the search results.

  3. On the Vertica Analytics Platform page, select one of the following:

    • To deploy an MC instance that can manage an Eon Mode database, select Vertica Data Warehouse, Eon BYOL.

    • To deploy an Enterprise Mode database, select Vertica Analytics Platform.

  4. On the next screen, select Create.

After you select your deployment type, the Basics tab on the Create Vertica Analytics Platform page displays.

2. adding project and instance details on the basics tab

Provide the following information in the Project details and Instance details sections:

  1. Subscription: Azure bills this subscription for the cluster resources.

  2. Resource group: The location to save all of the Azure resources. Create a new resource group or choose an existing one from the dropdown list.

  3. Region: The location where the virtual machine running your MC instance is deployed.

  4. Vertica Management Console User: Eon Mode only. The administrator username for the MC.

  5. SSH public key for OS Access: Provide the SSH public key associated with the Vertica User, for command line access to the virtual machine.

  6. Password for MC Access: Enter a password to log in to Management Console. Note that Management Console requires that you change your password after the initial login.

  7. Confirm password: Reenter the value you entered in Password for MC Access.

  8. Select Next: Virtual Machine Settings >.

3. selecting virtual machine settings

Provide the following information on the Virtual Machine Settings tab:

  1. Management Console VM size: Select Change size to customize the VM settings or select the default. For a list of VM types recommended by use case, see Recommended Azure VM types.

  2. Storage account of Eon DB: Eon Mode only. The storage account associated with the database deployment.

  3. Number of Vertica Cluster nodes: Enterprise Mode only. The number of nodes to deploy in the cluster, in addition to the MC instance.
    The Community Edition (CE) license is automatically applied to the cluster. This license is limited to 1 TB of RAW data and 3 Vertica nodes. If you select more than 3 nodes with a CE license, the initial database is created on the first 3 nodes. For information about upgrading your license, see Managing licenses.

  4. Vertica Node VM size: Enterprise Mode only. Select the VM type to deploy in your cluster. Use the default or select Change size to customize the VM settings. For a list of VM types recommended by use case, see Recommended Azure VM types.

  5. Total RAW storage per node: Enterprise Mode only. Select the amount of storage per node from the dropdown list. Each VM has a set of premium data disks that are configured and presented as a single storage location.

  6. Select Next: Network Settings >.

4. selecting network settings

Provide the following information on the Network Settings tab:

  1. Virtual Network: The virtual network that hosts the Vertica cluster. Create a new virtual network or select an existing one from the dropdown list.
    If you select an existing virtual network, Vertica recommends that you have already created a subnet to use for the deployment.

  2. First subnet: The subnet for the associated Virtual Network. Create a new subnet or select an existing one from the dropdown list.

  3. Public IP Address Resource Name: Each VM is configured with a publicly accessible IP address. This field allows you to specify the resource name for those IP addresses, and whether they are static or dynamic. The first public IP address resource is created exactly as entered, and associated with the Vertica Management Console. Azure appends a number from 1 to 16 to the resource name for each additional Vertica cluster node created. This number associates each VM with a resource.

  4. Domain Name Label for Management Console: Because each VM has a public IP address, each node requires a DNS name. Enter a prefix for the name. The first DNS name is created exactly as entered, and associated with the Vertica Management Console. Azure appends a number from 1 to 16 to the DNS name for each Vertica cluster node created. That number associates each VM with a resource. Azure adds the remaining part of the fully qualified domain name based on the location where you created the cluster.

  5. Select Next: Review + create >.

5. verifying on review + create

As the Review + create page loads, Azure validates your settings. After it passes validation, review your settings. When you are satisfied with your selections, select Create.

Accessing the MC after deployment

After your resources are successfully deployed, you are brought to the Overview page on Home > resources-name > Deployments. You must retrieve your Management Console IP address and username to log in.

  1. From the Overview page, select Outputs in the left navigation.

  2. Copy the vertica management console URL and vertica management console user name.

  3. Paste the vertica management console URL in the browser address bar and press Enter.

  4. Depending on your browser, you might receive a warning of a security risk. If you receive the warning, select the Advanced button and follow the browser's instructions to proceed to the Management Console.

  5. On the Vertica Management Console login page, paste the vertica management console user name, and enter the Password for MC Access that you entered on Basics > Project details when you were deploying your MC instance.

Deleting a resource group

For details about the Azure Resource Manager and deleting a resource group, see the Azure documentation.

2.2 - Manually deploy Vertica on Microsoft Azure

Manually creating a database cluster for your Vertica deployment lets you customize your VMs to meet your specific needs.

Manually creating a database cluster for your Vertica deployment lets you customize your VMs to meet your specific needs. You often want to manually configure your VMs when deploying a Vertica cluster to host an Eon Mode database.

To start creating your Vertica cluster in Azure using manual steps, you first need to create a VM. During the VM creation process, you create and configure the other resources required for your cluster, which are then available for any additional VMs that you create.

The topics in this section explain how to manually deploy Vertica on Azure.

2.2.1 - Recommended Azure VM types

Vertica supports a range of Microsoft Azure virtual machine (VM) types, each optimized for different purposes.

Vertica supports a range of Microsoft Azure virtual machine (VM) types, each optimized for different purposes. Choose the VM type that best matches your performance and price needs as a user.

For the best performance in most common scenarios, use one of the following VMs:

Virtual Machine Type            Virtual Machine Sizes
Memory optimized                DS13_v2, DS14_v2, DS15_v2, D8s_v3, D16s_v3, D32s_v3
High memory and I/O throughput  GS3, GS4, GS5, E8s_v3, E16s_v3, E32s_v3, L8s, L16s, L32s

2.2.2 - Supported Azure operating systems

For best performance, use one of the following operating systems when deploying Vertica on Azure:

  • Red Hat 7.3 or later

  • CentOS 7.3 or later. The Azure Marketplace solution as of this writing (June 2017) is based on CentOS 7.3.1611.

For more information, see Supported platforms.

2.2.3 - Configuring and launching a new instance

An Azure VM is similar to a traditional host.

An Azure VM is similar to a traditional host. Just as with an on-premises cluster, you must prepare and configure the hardware settings for your cluster and network before you install Vertica.

The first steps are:

  1. From the Azure marketplace, select an operating system that Vertica supports.

  2. Select a VM type. See Recommended Azure VM types.

  3. Choose a deployment model. For best results, choose the resource manager deployment model.

Configure network security group

Vertica has specific network security group requirements, as described in Azure network security group requirements.

Create and name your own network security group, following these guidelines.

You must configure SSH as:

  • Protocol: TCP

  • Source port range: Any

  • Destination port range: 22

  • Source: Any

  • Destination: Any

You can make additional modifications, based on your specific requirements.
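If you prefer to script this instead of using the Azure portal, the following Azure CLI sketch creates a network security group and the SSH rule described above. The resource group and NSG names are placeholders:

# Create the network security group (placeholder resource group and NSG names)
$ az network nsg create --resource-group my-vertica-rg --name vertica-nsg

# Allow inbound SSH on port 22 from any source, as described above
$ az network nsg rule create --resource-group my-vertica-rg --nsg-name vertica-nsg \
    --name AllowSSH --priority 1000 --direction Inbound --access Allow \
    --protocol Tcp --source-port-ranges '*' --destination-port-ranges 22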

Add disk containers

Create an Azure storage account, which later contains your cluster storage disk containers.

For optimal throughput, select Premium storage and align the storage to your chosen VM type.

For more information about what a storage account is, and how to create one, refer to About Azure storage accounts.

For an Enterprise Mode database deployment, provision enough space for the data you plan to store on each node.
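You can create the storage account from the Azure portal or from the Azure CLI. The following is a minimal sketch using Premium storage as recommended above; the resource group, account name, and region are placeholders:

$ az storage account create --resource-group my-vertica-rg \
    --name myverticastorage01 --location eastus --sku Premium_LRS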

Configure credentials

Create a password or assign an SSH key pair to use with Vertica.

For information about how to use key pairs in Azure, see How to create and use an SSH public and private key pair for Linux VMs in Azure.

Assign a public IP address

A public IP is an IP address that you can use to connect to your cluster externally. For best results, assign a single static public IP to a node in your cluster. You can then connect to other nodes in your cluster from your primary node using the internal IP addresses that Azure generated when you specified your virtual network settings.

By default, a public IP address is dynamic; it changes every time you shut down the server. You can choose a static IP address, but doing so can add cost to your deployment.

During a VM installation, you cannot set a DNS name. If you use dynamic public IPs, set the DNS name in the public IP resource for each VM after deployment.

For information about public IP addresses, refer to IP address types and allocation methods in Azure.
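If you script your deployment with the Azure CLI, you can create a static public IP with a command like the following; the resource group and IP resource names are placeholders:

$ az network public-ip create --resource-group my-vertica-rg \
    --name vertica-node01-ip --allocation-method Static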

Create additional VMs

If needed, to create additional VMs, repeat the previous instructions in this document.

2.2.4 - Connect to a virtual machine

Before you can connect to any of the VMs you created, you must first make your virtual network externally accessible.

Before you can connect to any of the VMs you created, you must first make your virtual network externally accessible. To do so, you must attach the public IP address you created during network configuration to one of your VMs.

Connect to your VM

To connect to your VM, complete the following tasks:

  1. Connect to your VM using SSH with the public IP address you created in the configuration steps.

  2. Authenticate using the credentials and authentication method you specified during the VM creation process.

Connect to other VMs

Connect to other virtual machines in your virtual network by first using SSH to connect to your publicly connected VM. Then, use SSH again from that VM to connect through the private IP addresses of your other VMs.

If you are using private key authentication, you may need to move your key file to the root directory of your publicly connected VM. Then, use PuTTY or WinSCP to connect to other VMs in your virtual network.

2.2.5 - Prepare the virtual machines

After you create your VMs, you need to prepare them for cluster formation.

After you create your VMs, you need to prepare them for cluster formation.

Add the Vertica license and private key

Prepare your nodes by copying your private key (if you are using one) and your Vertica license to your primary node. These steps assume that the initial user you configured is the DBADMIN user. A sample copy command appears after the following steps.

  1. As the dbadmin user, copy your private key file from where you saved it locally onto your primary node.

    Depending upon the procedure you use to copy the file, the permissions on the file may change. If permissions change, the install_vertica script fails with a message similar to the following:

    Failed Login Validation 10.0.2.158, cannot resolve or connect to host as root.
    

    If you receive a failure message, enter the following command to correct permissions on your private key file:

    $ chmod 600 /<name-of-key>.pem
    
  2. Copy your Vertica license to your primary VM. Save it in your home directory or other known location.
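For example, you might copy both files from your local machine to the primary node with scp; the key file name, license file name, and IP address below are placeholders:

$ scp ~/<name-of-key>.pem ~/<license-file>.dat dbadmin@<primary-node-ip>:~/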

Install software dependencies for Vertica on Azure

In addition to the standard Vertica package dependencies, as the root user, you must install the following packages before you install Vertica on Azure (a sample install command follows this list):

  • pstack

  • mcelog

  • sysstat

  • dialog
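For example, on a yum-based system you might install them as follows. Note that on some distributions pstack is provided by the gdb package, so substitute gdb if yum cannot find a pstack package:

$ sudo yum install pstack mcelog sysstat dialog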

2.2.6 - Configure storage

Use a dedicated Azure storage account for node storage.

Use a dedicated Azure storage account for node storage.

When configuring your storage, make sure to use a supported file system. For details, see Recommended storage format types.

Attach disk containers to virtual machines (VMs)

Using your previously created storage account, attach disk containers to your VMs that are appropriate to your needs.

For best performance, combine multiple storage volumes into RAID-0. For most RAID-0 implementations, attach 6 storage disk containers per VM.

Combine disk containers for storage

If you are using RAID, follow these steps to create a RAID-0 drive on your VMs. The following example shows how you can create a RAID-0 volume named md10 from 6 individual volumes named:

  • sdc

  • sdd

  • sde

  • sdf

  • sdg

  • sdh

  1. Form a RAID-0 volume using the mdadm utility:

    $ mdadm --create /dev/md10 --level 0 --raid-devices=6 \
      /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg /dev/sdh
    
  2. Format the file system to be one that Vertica supports:

    $ mkfs.ext4 /dev/md10
    
  3. Find the UUID on the newly-formed RAID volume using the blkid command. In the output, look for the device you assigned to the RAID volume:

    $ blkid
     . . .
     /dev/md10 : UUID="e7510a6f-2922-4413-b5fa-9dcd725967fd" TYPE="ext4" PARTUUID="fb9b7449-08c3-4231-9ee5-086f7b0c9001"
     . . .
    
  4. The RAID device can be renamed after a reboot. To ensure the filesystem is mounted in a predictable location on your VM, create a directory to use as the mount point to mount the filesystem. For example, you can choose to create a mount point named /data that you will use to store your database's catalog and data (or depot, if you are running Vertica in Eon Mode).

    $ mkdir /data
    
  5. Using a text editor, add an entry to the /etc/fstab file for the UUID of the filesystem and your mount point so it is mounted when the system boots:

    UUID=RAID_UUID mountpoint      ext4    defaults,nofail,nobarrier    0   2
    

    For example, if you have the UUID shown in the previous example and the mount point /data, add the following line to the /etc/fstab file:

    UUID=e7510a6f-2922-4413-b5fa-9dcd725967fd  /data      ext4    defaults,nofail,nobarrier    0   2
    
  6. Mount the RAID filesystem you added to the fstab file. For example, to mount a mount point named /data use the command:

    $ mount /data
    
  7. Create folders for your Vertica data and catalog under your mount point.

    $ mkdir /data/vertica
    $ mkdir /data/vertica/data
    

    If you are planning to run Vertica in Eon Mode, create a directory for the depot instead of data:

    $ mkdir /data/vertica/depot
    

Create a swap file

In addition to storage volumes to store your data, Vertica requires a swap volume or swap file to operate.

Create a swap file or swap volume of at least 2 GB. The following steps show how to create a swap file within Vertica on Azure:

  1. Use the install command to create an empty swap file owned by root with secure permissions:

    $ install -o root -g root -m 0600 /dev/null /swapfile
    
  2. Create the swap file:

    $ dd if=/dev/zero of=/swapfile bs=1024 count=2048k
    
  3. Prepare the swap file using mkswap:

    $ mkswap /swapfile
    
  4. Use swapon to instruct Linux to swap on the swap file:

    $ swapon /swapfile
    
  5. Persist the swap file by appending an entry to /etc/fstab:

    $ echo "/swapfile       swap    swap    auto      0       0" >> /etc/fstab
    

Repeat the volume attachment, combination, and swap file creation procedures for each VM in your cluster.

2.2.7 - Download Vertica

To download the Vertica server appropriate for your operating system and license type, go to www.vertica.com/download/vertica.

To download the Vertica server appropriate for your operating system and license type, go to www.vertica.com/download/vertica.

Run the rpm to extract the files.

After you complete the download and extraction, the next section describes how to use the install_vertica script to form a cluster and install the Vertica database software.

2.2.8 - Form a cluster and install Vertica

Use the install_vertica script to combine two or more individual VMs to form a cluster and install the Vertica database.

Use the install_vertica script to combine two or more individual VMs to form a cluster and install the Vertica database.

Before you start

Before you run the install_vertica script:

  • Check the Virtual Network page for a list of current VMs and their associated private IP addresses.

  • Identify your storage location. The installer assumes that you have mounted your storage to /vertica/data. To create your database's data directory on the mounted RAID drive, provide that location as the value of the --data-dir option when you run the install_vertica script.

Combine virtual machines (VMs)

The following example shows how to combine VMs using the install_vertica script.

  1. While connected to your primary node, construct the following command to combine your nodes into a cluster.

    $ sudo /opt/vertica/sbin/install_vertica --hosts 10.2.0.164,10.2.0.165,10.2.0.166 \
      --dba-user-password-disabled --point-to-point --data-dir /vertica/data \
      --ssh-identity ~/<name-of-private-key>.pem --license <license.file>
    
  2. Substitute the IP addresses for your VMs and include your root key file name, if applicable.

  3. Include the --point-to-point parameter to configure spread to use direct point-to-point communication between all Vertica nodes, as required for clusters on Azure when installing or updating Vertica.

  4. If you are using Vertica Community Edition, which limits you to three nodes, specify -L CE with no license file.

  5. After you combine your nodes, to reduce security risks, keep your key file in a secure place—separate from your cluster—and delete your on-cluster key with the shred command:

    $ shred examplekey.pem
    
  6. Reboot your cluster to complete the cluster formation and Vertica installation.

For complete information on the install_vertica script and its parameters, see Installing Vertica with the installation script.

2.2.9 - After your cluster is up and running

Now that your cluster is configured and running, take these steps:

  1. Log into one of the database nodes using the database administrator account (named dbadmin by default).

  2. Create and start a database (a sample admintools command follows these steps).

  3. Configure your database. See Configuring the database.
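For example, a minimal admintools command to create a three-node Enterprise Mode database is sketched below; the host IP addresses match the earlier install example, and the database name and password are placeholders:

$ admintools -t create_db -s 10.2.0.164,10.2.0.165,10.2.0.166 \
             -d vdb -p 'mypassword'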

2.3 - Eon Mode databases on Azure

You can create an Eon Mode database on a cluster that is hosted on Azure. In this configuration, your database stores its data communally in Azure Blob storage. See Eon Mode to learn more about this database mode.

Eon Mode databases on Azure support some of the encryption features built into Azure Storage. You can use its encryption at rest feature transparently—you do not need to configure Vertica to take advantage of it. You can use Microsoft-managed or customer-managed keys for storage encryption. Vertica does not support Azure Storage's client-side encryption and encryption using customer-provided keys. See the Azure Data Encryption at rest page in the Azure documentation for more information about the encryption at rest features in Azure Storage.

This section explains how you create an Eon Mode database running on Azure cloud.

2.3.1 - Eon Mode on Azure prerequisites

Before you can create an Eon Mode database on Azure, you must have a database cluster and an Azure blob storage container to store your database's data.

Before you can create an Eon Mode database on Azure, you must have a database cluster and an Azure blob storage container to store your database's data.

Cluster requirements

Before you can create an Eon Mode database on Azure, you must provision a cluster to host it. See Configuring your Vertica cluster for Eon Mode for suggestions on choosing VM configurations and the number of nodes your cluster should start with.

Storage requirements

An Eon Mode database on Azure stores its data communally in Azure blob storage. Vertica only supports block blob storage for communal data storage, not append or page blob storage.

You must create a storage path for Vertica to use exclusively. This path can be a blob container or a folder within a blob container. This path must not contain any files. If you attempt to create an Eon Mode database with a container or folder that contains files, admintools returns an error.

You pass Vertica a URI for the storage path using the azb:// schema. See Azure Blob Storage object store for the format of this URI.

You must also configure the storage container so Vertica is authorized to access it. Depending on the authentication method you use, you may need to supply Vertica with credentials to access the container. Vertica can use one of the following methods to authenticate with the blob storage container:

  • Using Azure managed identities. This authentication method is transparent—you do not need to add any authentication configuration information to Vertica. Vertica automatically uses the managed identity bound to the VMs it runs on to authenticate with the blob storage container. See the Azure AD-managed identities for Azure resources documentation page in the Azure documentation for more information.

    If you provide credentials for either of the other two supported authentication methods, Vertica uses them instead of authenticating using a managed identity bound to your VM.

  • Using an account name and access key credentials for a service account that has full access to the blob storage container. In this case, you provide Vertica with the credentials when you create the Eon Mode database. See Creating an Authentication File for details.

  • Using a shared access signature (SAS) that grants Vertica access to the storage container. See Grant limited access to Azure Storage resources using shared access signatures (SAS) in the Azure documentation. See Creating an Authentication File for details.

For details on how Vertica accesses Azure blob storage, see Azure Blob Storage object store.

2.3.2 - Manually creating an Eon Mode database on Azure

Once you have met the cluster and storage requirements for using an Eon Mode database on Azure, you are ready to create an Eon Mode database.

Once you have met the cluster and storage requirements for using an Eon Mode database on Azure, you are ready to create an Eon Mode database. Use the admintools create_db tool to create your Eon Mode database.

Creating an authentication file

If your database will use a managed identity to authenticate with the Azure storage container, you do not need to supply any additional configuration information to the create_db tool.

If your database will not use a managed identity, you must supply create_db with authentication information in a configuration file. It must contain at least the AzureStorageCredentials parameter that defines one or more account names and keys Vertica will use to access blob storage. It can also contain an AzureStorageEndpointConfig parameter that defines an alternate endpoint to use instead of the default Azure host name. This option is useful if you are creating a test environment using an Azure storage emulator such as Azurite.

The following table defines the values that can be set in these two parameters.

AzureStorageCredentials
Collection of JSON objects, each of which specifies connection credentials for one endpoint. This parameter takes precedence over Azure managed identities.

The collection must contain at least one object and may contain more. Each object must specify at least one of accountName or blobEndpoint, and at least one of accountKey or sharedAccessSignature.

  • accountName: If not specified, uses the label of blobEndpoint.
  • blobEndpoint: Host name with optional port (host:port). If not specified, uses account.blob.core.windows.net.
  • accountKey: Access key for the account or endpoint.
  • sharedAccessSignature: Access token for finer-grained access control, if being used by the Azure endpoint.
AzureStorageEndpointConfig
Collection of JSON objects, each of which specifies configuration elements for one endpoint. Each object must specify at least one of accountName or blobEndpoint.
  • accountName: If not specified, uses the label of blobEndpoint.
  • blobEndpoint: Host name with optional port (host:port). If not specified, uses account.blob.core.windows.net.
  • protocol: HTTPS (default) or HTTP.
  • isMultiAccountEndpoint: true if the endpoint supports multiple accounts, false otherwise (default is false). To use multiple-account access, you must include the account name in the URI. If a URI path contains an account, this value is assumed to be true unless explicitly set to false.

The authentication configuration file is a text file containing the configuration parameter names and their values. The values are in a JSON format. The name of this file is not important. The following examples use the file name auth_params.conf.

The following example is a configuration file for a storage account hosted on Azure. The storage account name is mystore, and the key value is a placeholder. In your own configuration file, you must provide the storage account's access key. You can find this value by right-clicking the storage account in the Azure Storage Explorer and selecting Copy Primary Key.

AzureStorageCredentials=[{"accountName": "mystore", "accountKey": "access-key"}]

The following example shows a configuration file that defines an account for a storage container hosted on the local system using the Azurite storage system. The user account and key are the "well-known" account provided by Azurite by default. Because this configuration uses an alternate storage endpoint, it also defines the AzureStorageEndpointConfig parameter. In addition to reiterating the account name and endpoint definition, this example sets the protocol to the non-encrypted HTTP.

AzureStorageCredentials=[{"accountName": "devstoreaccount1", "blobEndpoint": "127.0.0.1:10000 ",
                          "accountKey":
"Eby8vdM02xNOcqFlqUwJPLlmEtlCDXJ1OUzFT50uSRZ6IFsuFq2UVErCz4I6tq/K1SZFPTOtr/KBHBeksoGMGw=="
                        }]

AzureStorageEndpointConfig=[{"accountName": "devstoreaccount1",
                             "blobEndpoint": "127.0.0.1:10000", "protocol": "http"}]

Creating the Eon Mode database

Use the admintools create_db tool to create your Eon Mode database. The required arguments you pass to this tool are:

Argument Description
--communal-storage-location The URI for the storage container Vertica will use for communal storage. This URI must use the azb:// schema. See Azure Blob Storage object store for the format of this URI.
-x The path to the file containing the authentication parameters Vertica needs to access the communal storage location. This argument is only required if your database will use a storage account name and key to authenticate with the storage container. If it is using a managed identity, you do not need to specify this argument.
--depot-path The absolute path to store the depot on the nodes in the cluster.
--shard-count The number of shards for the database. This is an integer number that is usually either a multiple of the number of nodes in your cluster, or an even divisor. See Planning for Scaling Your Cluster for more information.
-s A comma-separated list of the nodes in your database.
-d The name for your database.

Some other common optional arguments for create_db are:

Argument Description
-l The absolute path to the Vertica license file to apply to the new database.
-p The password for the new database.
--depot-size

The maximum size for the depot. Defaults to 60% of the filesystem containing the depot path.

You can specify the size in two ways:

  • integer%: Percentage of filesystem's disk space to allocate.

  • integer{K|M|G|T}: Amount of disk space to allocate for the depot in kilobytes, megabytes, gigabytes, or terabytes.

However you specify this value, the depot size cannot be more than 80 percent of disk space of the file system where the depot is stored.

To view all arguments for the create_db tool, run the command:

admintools -t create_db --help

The following example demonstrates creating an Eon Mode database with the following settings:

  • Vertica will use a storage account named mystore.

  • The communal data will be stored in a directory named verticadb located in a storage container named db_blobs.

  • The authentication information Vertica needs to access the storage container is in the file named auth_params.conf in the current directory. The contents of this file are shown in the first example under Creating an Authentication File.

  • The hostnames of the nodes in the cluster are node01 through node03.

$ admintools -t create_db \
             --communal-storage-location=azb://mystore/db_blobs/verticadb \
             -x auth_params.conf -s node01,node02,node03  \
             -d verticadb --depot-path /vertica/depot --shard-count 3 \
             -p 'mypassword'

2.4 - Azure network security group requirements

Vertica has the following network security group requirements.

Vertica has the following network security group requirements.

For details on security groups and how to create one, see the Azure documentation.

Inbound settings

Name                      Protocol  Source port range  Destination port range  Source  Destination
SSH                       TCP       Any                22                      Any     Any
HTTP                      TCP       Any                80                      Any     Any
HTTPS                     TCP       Any                80                      Any     Any
HTTPS                     TCP       Any                443                     Any     Any
DNS (UDP)                 UDP       Any                53                      Any     Any
Spread                    UDP       Any                4803-4805               Any     Any
Spread                    TCP       Any                4803-4805               Any     Any
VSQL/SQL                  TCP       Any                5433                    Any     Any
Inter-node communication  TCP       Any                5434                    Any     Any
                          TCP       Any                5444                    Any     Any
MC                        TCP       Any                5450                    Any     Any
                          TCP       Any                8080                    Any     Any
                          TCP       Any                48073                   Any     Any
rsync                     TCP       Any                50000                   Any     Any

Outbound settings

Name      Protocol  Source port range  Destination port range  Source  Destination
All TCP   TCP       Any                0-65535                 Any     Any
All ICMP  ICMP      Any                0-65535                 Any     Any
All UDP   UDP       Any                0-65535                 Any     Any

3 - Vertica on Google Cloud Platform

Welcome to the Vertica on Google Cloud Platform guide.

Welcome to the Vertica on Google Cloud Platform guide.

Vertica provides two templates to help you deploy a Vertica database running in either Enterprise Mode or Eon Mode. See Architecture for more information about these modes.

The following topics describe several deployment methods to run Vertica on Google Cloud Platform.

3.1 - Supported GCP machine types

Vertica Analytic Database supports a range of machine types, each optimized for different workloads.

Vertica Analytic Database supports a range of machine types, each optimized for different workloads. When you deploy your Vertica Analytic Database cluster to the Google Cloud Platform (GCP), different machine types are available depending on how you provision your database.

The sections below list the GCP machine types that Vertica supports for Vertica cluster hosts, and for use in Management Console. For details on the configuration of the machine type options, see the Google Cloud documentation's Machine types page.

Machine types available for MC hosts

Vertica supports all N1, N2, E2, M1, M2, and C2 machine types to deploy an instance for running the Vertica Management Console.

Machine types available for Vertica database cluster hosts

Vertica supports all N1, N2, E2, M1, M2, and C2 machine types to deploy cluster hosts.

Machine types for Vertica database cluster hosts provisioned from MC

The table below lists the GCP machine types that Vertica supports when you provision your cluster from Management Console.

Machine Type    Machine Names
N1 standard     n1-standard-16, n1-standard-32, n1-standard-64
N1 high-memory  n1-highmem-16, n1-highmem-32, n1-highmem-64
N2 standard     n2-standard-16, n2-standard-32, n2-standard-48, n2-standard-64
N2 high-memory  n2-highmem-16, n2-highmem-32, n2-highmem-48, n2-highmem-64

3.2 - Deploy Vertica from the Google cloud marketplace

The Vertica entries in the Google Cloud Launcher Marketplace let you quickly deploy a Vertica cluster in the Google Cloud Platform (GCP).

The Vertica entries in the Google Cloud Launcher Marketplace let you quickly deploy a Vertica cluster in the Google Cloud Platform (GCP). Currently, three entries let you select the database mode and the license you want to use:

  • The Enterprise Mode launcher deploys a Vertica database with 3 or more nodes, plus an additional VM running the Management Console (MC). See Deploying an Enterprise Mode database in GCP from the marketplace for more information.

  • The Eon Mode BYOL (bring your own license) launcher deploys a single instance running the MC. You use this MC instance to deploy a Vertica database running on Eon Mode. This database has a community license applied to it initially. You can later upgrade it to a license you have obtained from Vertica. See Deploying an Eon Mode database on GCP for more information.

  • The Eon Mode BTH (by the hour) launcher also deploys a single instance running the MC that you use to deploy a database. This database has a by-the-hour license applied to it. Instead of paying for a license up front, you pay an hourly fee that covers both Vertica and running your instances. The BTH license is automatically applied to all clusters you create using a BTH MC instance. See Deploying an Eon Mode database on GCP for more information. If you choose, you can upgrade this hourly license to a longer-term license purchased from Vertica; to move a BTH cluster to a BYOL license, follow the instructions in Moving a cloud installation from by the hour (BTH) to bring your own license (BYOL).

3.2.1 - Deploying an Enterprise Mode database in GCP from the marketplace

The Vertica Cloud Launcher solution creates a Vertica Enterprise Mode database.

The Vertica Cloud Launcher solution creates a Vertica Enterprise Mode database. The solution includes the Vertica Management Console (MC) as the primary UI for you to get started.

The launcher automatically creates a database named vdb using the Community Edition (CE) license. The CE license is limited to a maximum of 3 nodes. You can tell the launcher to add more than 3 nodes to your deployment. In this case, it uses the first three nodes in the cluster to create the database. The remaining nodes are not part of the database, but are added to your cluster. To add these nodes to your database, you must replace the Community Edition license with a license key you receive from the Software Entitlement support site. See Managing licenses for more information.

After the launcher creates the initial database, it configures the MC to attach to that database automatically.

Configure the Vertica cloud launcher solution

To get started with a deployment of Vertica from the Google Cloud Launcher, search for the Vertica Data Warehouse, Enterprise Mode entry.

Follow these steps:

  1. Verify that your user account has the Editor role and the runtimeconfig.waiters.getIamPolicy permission.

  2. From the listing page, click LAUNCH.

  3. On the New Vertica Analytics Platform deployment page, enter the following information:

    • Deployment name: Each deployment must have a unique name. That name is used as the prefix for the names of all VMs created during the deployment. The deployment name can only contain lowercase characters, numbers, and dashes. The name must start with a lowercase letter and cannot end with a dash.

    • Zone: GCP breaks its cloud data centers into regions and zones. Regions are a collection of zones in the same geographical location. Zones are collections of compute resources, which vary from zone to zone.

      For best results, pick the zone in your designated region that supports the latest Intel CPUs. For a complete listing of regions and zones, including supported processors, see Regions and Zones.

    • Service Account: Service accounts allow automated processes to authenticate with GCP. Select the default service account, identified by project_number-compute@developer.gserviceaccount.com.

    • Under Vertica Management Console, choose the configuration for the virtual machine that will run the Management Console. The Vertica Analytics Platform in Cloud Launcher always deploys the Vertica Management Console (MC) as part of the solution.

      The default machine type for MC is sufficient for most deployments. You can choose another machine type that better suits any additional purposes, such as serving as a target node for backups, data transformation, or additional management tools.

    • Node count for Vertica Cluster: The total number of VMs you want to deploy in the Vertica Cluster. The default is 3.

    • Machine type for Vertica Cluster nodes: The Cloud Launcher builds each node in the cluster using the same machine type. Modify the machine type for your nodes based on the workloads you expect your database to handle. See Supported GCP machine types for more information.

    • Data disk type: GCP offers two types of persistent disk storage: Standard and SSD. The costs associated with Standard are less, but the performance of SSD storage is much better. Vertica recommends you use SSD storage. For more information on Standard and SSD persistent disks, see Storage Options.

    • Disk size in GB: Disk performance is directly tied to the disk size in GCP. The default value of 2000 GBs (2 TB) is the minimum disk size for SSD persistent disks that allows maximum throughput.

      If you select a smaller disk size, the throughput performance decreases. If you select a large disk size, the performance remains the same as the 2 TB option.

    • Network: VMs in GCP must exist on a virtual private cloud (VPC). When you created your GCP account, a default VPC was created. Create additional VPCs to isolate solutions or projects from one another. The Vertica Analytics Platform creates all the nodes in the same VPC.

    • Subnetwork: Just as a GCP account may have multiple VPCs, each VPC may also have multiple subnets. Use additional subnets to group or isolate solutions within the same VPC.

    • Firewall: If you want your MC to be accessible via the internet, check the Allow access to the Management Console from the Internet box. Vertica recommends you protect your MC using a firewall that restricts access to just the IP addresses of users that need to access it. You can enter one or more comma-separated CIDR address ranges.

After you have entered all the required information, click Deploy to begin the deployment process.

Monitor the deployment

After the deployment begins, Google Cloud Launcher automatically opens the Deployment Manager page that displays the status of the deployment. Items that are still being processed have a spinning circle to the left of them and the text is a light gray color. Items that have been created are dark gray in color, with an icon designating that resource type on the left.

After the deployment completes, a green check mark appears next to the deployment name in the upper left-hand section of the screen.

Accessing the cluster after deployment

After the deployment completes, the right-hand section of the screen displays the following information:

  • dbadmin password: A randomly generated password for the dbadmin account on the nodes. For security reasons, change the dbadmin password when you first log in to one of the Vertica cluster nodes.

  • mcadmin password: A randomly generated password for the mcadmin account for accessing the Management Console. For security reasons, change the mcadmin password after you first log in to the MC.

  • Vertica Node 1 IP address: The external IP address for the first node in the Vertica cluster is exposed here so that you can connect to the VM using a standard SSH client. To access the MC, press the Access Vertica MC button in the Get Started section of the dialog box. Copy the mcadmin password and paste it when asked.

For more information on using the MC, see Management Console.

Access the cluster nodes

There are two ways to access the cluster nodes directly:

  • Use GCP's integrated SSH shell by selecting the SSH button in the Get Started section. This shell opens a pop-up in your browser that runs GCP's web-based SSH client. You are automatically logged on as the user you authenticated as in the GCP environment.

    After you have access to the first Vertica cluster node, execute the su dbadmin command, and authenticate using the dbadmin password.

  • In addition, use other standard SSH clients to connect directly to the first Vertica cluster node. Use the Vertica Node 1 IP address listed on the screen as the dbadmin user, and authenticate with the dbadmin password.

    Follow the on-screen directions to log in using the mcadmin account and accept the EULA. After you've been authenticated, access the initial database by clicking the vdb icon (looks like a green cylinder) in the Recent Databases section.

Using a custom service account

In general, you should use the default service account created by the GCP deployment (project_number-compute@developer.gserviceaccount.com), but if you want to use a custom service account:

  • The custom service account must have the Editor role.

  • Individual user accounts must have the Service Account User role on the custom service account.

3.2.2 - Eon Mode databases on GCP

You deploy an Eon Mode database to GCP using Google Cloud Platform Launcher to deploy a Management Console (MC) instance.

You deploy an Eon Mode database to GCP using Google Cloud Platform Launcher to deploy a Management Console (MC) instance. You then use the MC instance to provision and deploy an Eon Mode database.

3.2.2.1 - GCP Eon Mode instance recommendations

When you use the MC to deploy an Eon Mode database to the Google Cloud Platform (GCP), you choose the instance type to deploy as the database's nodes.

When you use the MC to deploy an Eon Mode database to the Google Cloud Platform (GCP), you choose the instance type to deploy as the database's nodes. The default instance settings in the MC are the more conservative option (currently, n1-standard-16). They are sufficient for most workloads. However, you may choose instances with more memory (such as n1-highmem-16) if your queries perform complex joins that may otherwise spill to disk. You can also choose instances with more cores (such as n1-standard-32) if you perform highly complex, compute-intensive analysis. For more information about the available machine types, see Supported GCP machine types.

The more powerful instance you choose, the higher the cost per hour. You need to balance whether you want to use fewer, higher-powered but more expensive instances vs. relying on more lower-powered instances that cost less. Thanks to Eon Mode's elasticity, if you choose to use the less-powerful instances, you can always add more nodes to meet peak demands. When you reduce the number of instances to a minimum during off-peak times, you'll spend less than if you had a similar number of more-powerful instances.

Storage options

The MC's deployment wizard also asks you to select the type of local storage for your instances. You can select different options for each type of local storage that Vertica uses: the catalog, the depot, and temporary space. For all of these storage locations, you choose the type of disks to use (standard vs. SSD). You will see the best performance with SSD disks. However, SSD disks cost more.

For the depot, you also choose whether to use local or persistent disks. The local option is faster, as it resides directly on the virtual machine host. However, whenever you shut down the node, this storage is wiped clean. The persistent storage is slower than the local option, as it is not stored directly on the machine hosting the instance. However, it is not wiped out whenever you shut down the instance. See the Google Cloud documentation's Storage options page for more information.

Which of these options you choose depends on how much depot warming the nodes must perform when starting. If the content of your node's depots change little over time (or you tend to frequently start and stop instances), using persistent storage makes sense. In this case, the depot's warming period will be shorter because most of the data the node needs to participate in queries may still be in its depot when it starts. It will perform fewer fetches of data from communal storage while participating in queries.

If your working data set is rapidly changing or you tend to leave nodes stopped for extended periods of time, your best choice is usually to use local storage. In this scenario, the data in the node's depot when it restarts is usually stale. To participate in queries, the node must fetch much of the data it needs from communal storage, resulting in slower performance until it has warmed its depot. Using local ephemeral storage makes sense here, because you will get the benefit of having faster depot storage. Because your nodes have to warm their depots anyhow, there is less of a downside of having the depot on ephemeral storage.

For general guidelines on scaling your cluster for Eon Mode database, see Configuring your Vertica cluster for Eon Mode.

3.2.2.2 - Eon Mode on GCP prerequisites

Before deploying an Eon Mode database on GCP, you must take several steps:

  • Review the default service account's permissions for your GCP project.

  • Create an HMAC key to use when creating your cluster.

  • Create a communal storage location.

Service account permissions

Service accounts allow automated processes to authenticate with GCP. The Eon Mode database deployment process uses the project's service account for your GCP project to deploy instances. When you create a new project, GCP automatically creates a default service account (identified by project_number-compute@developer.gserviceaccount.com) for the project and grants it the IAM role Editor. See the Google Cloud documentation's Understanding roles for details about this and other IAM roles.

The Editor role lets the service account create resources from the Marketplace. When you create an instance of the Management Console (MC), the MC uses the account to deploy further resources, such as provisioning instances for a database.

For details, see the Google Cloud documentation's Understanding service accounts page.

Permissions and roles

To deploy Vertica on GCP, your user account must have the:

  • Editor role.

  • runtimeconfig.waiters.getIamPolicy permission.

Creating an HMAC key

Vertica uses a hash-based message authentication code (HMAC) key to authenticate requests to access the communal storage location. This key has two parts: an access ID and a secret. When you create an Eon Mode database in GCP, you provide both parts of an HMAC key for the nodes to use to access communal storage.

To create an HMAC key:

  1. Log in to your Google Cloud account.

  2. If the name of the project you will use to create your database does not appear in the top banner, click the dropdown and select the correct project.

  3. In the navigation menu in the upper-left corner, under the Storage heading, click Storage and select Settings.

  4. In the Settings page, click Interoperability.

  5. Scroll to the bottom of the page and find the User account HMAC heading.

  6. Unless you have already set a default project, you see a message stating that you have not yet set a default project for your user account. Click Set project-id as default project to make the current project your default for interoperability.

  7. Under Access keys for your user account, click Create a key.

  8. Your new access key and secret appear in the HMAC key list. You will need them when you create your Eon Mode database. You can copy them to a handy location (such as a text editor) or leave a browser tab open to this page while you use another tab or window to create your database. These keys remain available on this page, so you do not need to worry about saving them elsewhere.

Creating a communal storage location

Your Eon Mode database needs a storage location for its communal storage. Eon Mode databases running on GCP use Google Cloud Storage (GCS) for their communal storage location. When you create your new Eon Mode database, you will supply the MC's wizard with a GCS URL for the storage location.

This location must meet the following criteria (a command-line sketch for creating a suitable bucket follows the list):

  • The URL must include at least a bucket name. You can use one or more levels of folders, as well. For example, the following GCS URLs are valid:

    • gs://verticabucket/mydatabase

    • gs://verticabucket/databases/mydatabase

    • gs://verticabucket

    Multiple databases can share the same bucket, as long as each has its own folder.

  • If provided, the lowest-level folder in the URL must not already exist. For example, in the GCS URL gs://verticabucket/databases/mydatabase, the bucket named verticabucket and the directory named databases must exist. The subdirectory named mydatabase must not exist. The Vertica install process expects to create the final folder itself. If the folder already exists, the installation process fails.

  • The permissions on the bucket must be set to allow the service account read, write, and delete privileges on the bucket. The best role to assign to the user to gain these permissions is Storage Object Admin.

  • To prevent performance issues, the bucket must be in the same region as all of the nodes running the Eon Mode database.

  • If you create the database through the admintools UI, you must set gcsauth as a bootstrap parameter in admintools.conf. For more information on this and other GCP parameters, see Google Cloud Storage parameters.

    [BootstrapParameters]
    gcsauth = ID:secret
    

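The following gsutil sketch creates a bucket that meets these criteria, grants the default service account the Storage Object Admin role on it, and confirms that the final database folder does not already exist. The bucket name, region, and folder names are examples only; substitute your own values:

$ gsutil mb -l us-central1 gs://verticabucket
$ gsutil iam ch serviceAccount:project_number-compute@developer.gserviceaccount.com:roles/storage.objectAdmin gs://verticabucket
$ gsutil ls gs://verticabucket/databases/
# confirm that mydatabase does not appear in the listing
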
3.2.2.3 - Deploying an Eon Mode database on GCP

Once you have taken the steps listed in Eon Mode on GCP prerequisites, you are ready to deploy an Eon Mode database in GCP. This process has two steps: deploy a single-node MC instance, then use the MC to provision and deploy a database. The following topics explain these steps.

3.2.2.3.1 - Deploying an MC instance to GCP for Eon Mode

To deploy an MC instance that is able to deploy Eon Mode databases to GCP:

  1. Log into your GCP account, if you are not currently logged in.

  2. Verify that your user account has the Editor role and the runtimeconfig.waiters.getIamPolicy permission.

  3. Verify that the name of the GCP project you want to use for the deployment appears in the top banner. If it does not, click the down arrow next to the project name and select the correct project.

  4. Click the navigation menu icon in the top left of the page and select Marketplace.

  5. In the Search for solutions box, type Vertica Eon Mode and press enter.

  6. Click the search result for Vertica Data Warehouse, Eon Mode. There are two license options: by the hour (BTH) and bring your own license (BYOL). See Deploy Vertica from the Google cloud marketplace for more information on this license choice.

  7. Click Launch on the license option you prefer.

  8. On the following page, fill in the fields to configure your MC instance:

    • Deployment name identifies your MC deployment in the GCP Deployments page.

    • Zone is the location where the virtual machine running your MC instance will be deployed. Make this the same location where your communal storage bucket is located.

    • Service Account: Service accounts allow automated processes to authenticate with GCP. Select the default service account, identified by project_number-compute@developer.gserviceaccount.com.

    • Machine Type is the virtual hardware configuration of the instance that will run the MC. The default values here are "middle of the road" settings which are sufficient for most use cases. If you are doing a small proof-of-concept deployment, you can choose a less powerful instance to save some money. If you are planning on deploying multiple large databases, consider increasing the count of virtual CPUs and RAM.
      For details about Vertica's default volume configurations, see Eon Mode volume configuration defaults for GCP.

    • User Name for Access to MC is the administrator username for the MC. You can customize this if you want.

    • Network and Subnetwork are the virtual private cloud (VPC) network and subnet within that network you want your MC instance and your Vertica nodes to use. This setting does not affect your MC's external network address. If you want to isolate your Vertica cluster from other GCP instances in your project, create a custom VPC network and optionally a subnet in your GCP project and select them in these fields. See the Google Cloud documentation's VPC network overview page for more information.

    • Firewall enables access to the MC from the internet by opening port 5450 in the firewall. You can choose not to open this port by clearing the I accept opening a port in the firewall (5450) for Vertica box. However, if you do not open the port, your MC instance is accessible only from within the VPC network, which makes reaching it considerably harder.

    • Source IP ranges for MC traffic: If you choose to open the MC for external access, add one or more CIDR address ranges to this box for the network addresses that you want to be able to access the MC.

  9. Click the Deploy button to start the deployment of your MC instance.

The deployment process will take several minutes.

Using a custom service account

In general, you should use the default service account created by the GCP deployment (project_number-compute@developer.gserviceaccount.com), but if you want to use a custom service account:

  • The custom service account must have the Editor role.

  • Individual user accounts must have the Service Account User role on the custom service account.

Connect and log into the MC instance

After the deployment process is finished, the Deployment Manager page for your MC instance contains links to connect to the MC via your browser or ssh.

To connect to the MC instance:

  1. The MC administrator user has a randomly-generated password that you need to log into the MC. Copy the password in the MC Admin Password field to the clipboard.

  2. Click Access Management Console.

  3. A new browser tab or window opens, showing you a page titled Redirection Notice. Click the link for the MC URL to continue to the MC login page.

  4. Your browser will likely show you a security warning. The MC instance uses a self-signed security certificate. Most browsers treat these certificates as a security hazard because they cannot verify their origin. You can safely ignore this warning and continue. In most browsers, click the Advanced button on the warning page, and select the option to proceed. In Chrome, this is a link titled Proceed to xxx.xxx.xxx.xxx (unsafe). In Firefox, it is a button labeled Accept the Risk and Continue.

  5. At the login screen, enter the MC administrator user name into the Username box. This user name is mcadmin, unless you changed the user name in the MC deployment form.

  6. Paste the automatically-generated password you copied from the MC Admin Password field earlier into the Password box.

  7. Click Log In.

Once you have logged into the MC, change the MC administrator account's password.

To change the password:

  1. On the home page of the MC, under the MC Tools section, click MC Settings.

  2. In the left-hand menu, click User Management.

  3. Select the entry for the MC administrator account and click Edit.

  4. Click either the Generate new or Edit password button to change the password. If you click the Generate new button, be sure to save the automatically-generated password in a safe location. If you click Edit password, you are prompted to enter a new password twice.

  5. Click Save to update the password.

Now that you have created your MC instance, you are ready to deploy a Vertica Eon Mode cluster. See Using the MC to provision and create an Eon Mode database in GCP.

3.2.2.3.2 - Using the MC to provision and create an Eon Mode database in GCP

After you deploy an MC instance to GCP, use it to deploy an Eon Mode database.

To use the MC to provision and deploy a new Eon Mode database on GCP:

  1. From the MC home screen, click Create new database to launch the Create a Vertica Cluster on Google Cloud wizard.

  2. On the first page of the wizard enter the following information:

    • Google Cloud Storage HMAC Access Key and HMAC Secret Key: Copy and paste the HMAC access key and secret you created earlier. You find these values on the Interoperability tab of the Storage Settings page. See Eon Mode on GCP prerequisites for details.

    • Zone: This value defaults to the zone containing your MC instance. Make this value the same as the zone containing the Google Cloud Storage bucket that your database will use for communal storage.

    • CIDR Range: The IP address range for clients to whom you want to grant access to your database. Make this range as restrictive as possible to limit access to your database.

  3. Click Next, and supply the following information:

    • Vertica Database Name: the name for your new database. See Creating a database name and password for database name requirements.

    • Vertica Version: select the desired Vertica database version. You can select from the latest hotfix of recent Vertica releases. For each database version, you can also select the operating system.

    • Vertica Database User Name: the name of the database superuser. This name defaults to dbadmin, but you can enter another user name here.

    • Password and Confirm Password: Enter a password for the database superuser account.

    • Database Size: The number of nodes in your initial database. If you specify more than three nodes here, you must supply a valid Vertica license file in the Vertica License field (below).

    • Vertica License: Click Browse to locate and upload your Vertica license key file. If you do not supply a license key file here, the wizard deploys your database with a Vertica Community Edition license. This license has a three-node limit, so the value in the Database Size field cannot be larger than 3 if you do not supply a license. If you use a Community Edition license for your deployment, you can upgrade the license later to expand your cluster and load more than 1TB of data. See Managing licenses for more information.

    • Load example data: Check this box if you want your deployed database to load some example clickstream data. This option is useful if you are testing features and just want some preloaded data in the database to query.

  4. Click Next and supply the following information:

    • Instance Type: the specifications of the virtual machine instances the MC will use to deploy your database nodes. See the Google Cloud documentation's Machine types page for details of each instance type. Also see GCP Eon Mode instance recommendations.

    • Database Depot Path and Disk Type: the local mount point for the depot, and the type and number of local disks dedicated to the depot for each node. You cannot change the mount path for the depot. The disks you select in the Disk Type field are only used to store the depot. On the next page of the wizard, you will configure disks for the catalog and temporary disk space. You will see the best performance when using SSD disks, although at a higher cost. You can choose to use faster local storage for your depot. However, local storage is ephemeral—GCP wipes the disk clean whenever you stop the instance. This means each time you start a node, it will have to warm its depot from scratch, rather than taking advantage of any still-current data in its depot. See the Google Cloud documentation's Storage options page for more information about the local disk options.

    • Volume Size: the amount of disk space available on each disk attached to each node in your cluster. This field shows you the total disk space available per node in your cluster. For the best practices on choosing the amount of disk space for your nodes, see Configuring your Vertica cluster for Eon Mode.

    • Data Segmentation Shards: sets the number of shards in your database. After you set this value, you cannot change it later. See Configuring your Vertica cluster for Eon Mode for recommendations. The default value is based on the Database Size you specified earlier. It is usually sufficient, unless you anticipate greatly expanding your cluster beyond your initial node count.

    • Communal Location: a Google Cloud Storage URL that specifies where to store your database's communal data. See Eon Mode on GCP prerequisites for requirements.

    • Instance IP settings: specify whether the nodes in your database will have static or ephemeral network addresses that are accessible from the internet, or addresses that are only accessible from within the internal virtual network.

  5. Click Next. The wizard validates your communal storage location URL. If there is a problem with the URL you entered, it displays an error message and prompts you to fix the URL.

    After your communal storage URL passes validation, fill in the following information:

    • Database Catalog Path, Disk Type, and Size (GB) per Available Node: the mount point, disk type, and disk size for the local copy of the database catalog on each node. You cannot edit the mount point. You choose the type of local disk to use for the catalog, and its size. You can only choose persistent disk storage for the catalog. SSD drives are faster, but more expensive, than standard disks. The default disk size is adequate for most medium-sized databases. Increase the size if you anticipate maintaining a large database.

    • Database Temp Path, Disk Type, and Size (GB) per Available Node: the mount point, disk type, and disk size for the temporary storage space on each node. You cannot edit the mount point. You choose the type of local disk to use, and its size. You can only choose persistent disk storage for the temporary disk space. SSD drives are faster, but more expensive, than standard disks. The default setting is adequate for most databases. Consider increasing the temporary space if you perform many complex merges that spill to disk.

    • Label Instances: check this box to enable adding labels to your node's instances. Many organizations use labels to organize, track responsibility, and assign costs for instances. See the Google Cloud documentation's Labeling resources page for more information. If you choose to add labels, enter the label name and value, and click Add.

  6. Click Next. Review the summary of all your database settings. If you need to make a correction, use the Back button to step back to previous pages of the wizard.

  7. When you are satisfied with the database settings, check Accept terms and conditions and click Create.

The process of provisioning and creating the database takes several minutes. After it completes successfully, the MC displays a Get Started button. This button leads to a page of useful links for getting started with your new database.

3.3 - Manually deploying an Enterprise Mode database on GCP

Before you create your Vertica cluster in Google Cloud Platform (GCP) using manual steps, you must create a virtual machine (VM) instance from the Compute Engine section of GCP.

Configure and launch a new instance

All VM instances that you create should be launched in the same virtual private cloud (VPC).

To configure and launch a new VM instance, follow these instructions:

  1. From within the Compute Engine section of GCP, from the menu on the left-hand side of the screen, select VM Instances.

    GCP displays all the VM instances that you have created so far.

  2. Select the CREATE INSTANCE link.

  3. Enter a name for the new instance.

  4. Select the zone where you plan to deploy the instance.

    GCP breaks its cloud data centers down by regions and zones. Regions are a collection of zones that are all in the same geographical location. Zones are collections of compute resources, which vary from zone to zone. Always pick the zone in your designated region that supports the latest Intel CPUs.

    For a complete listing of regions and zones, including supported processors, see Regions and Zones.

  5. Select a machine type.

    GCE offers many different types of VM instances. For best results, only deploy Vertica on VM instances with 8 vCPUs or more and at least 30 GB of RAM. A sample gcloud command that creates such an instance appears after this list.

  6. Select the boot disk (image).

    You create VM instances from a public or custom image. If you are deploying Vertica in GCP for the first time, select either the CentOS 7 or RHEL 7 public image. Both images have been tested thoroughly with Vertica.

    For more information about deploying a VM instance, see Creating and Starting an Instance.
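
If you prefer to script instance creation rather than use the console, a gcloud command similar to the following sketch can create an instance that meets these guidelines. The instance name, zone, machine type, image, and boot disk settings are examples only; adjust them for your deployment:

$ gcloud compute instances create vertica-node-01 \
    --zone=us-central1-a \
    --machine-type=n1-standard-16 \
    --image-family=rhel-7 \
    --image-project=rhel-cloud \
    --boot-disk-size=100GB \
    --boot-disk-type=pd-ssd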

After you have configured the VM instance to be used as a Vertica cluster node, GCP allows you to convert that instance into a custom image. Doing so allows you to deploy multiple copies of that VM instance; each copy is identical except for its node name and IP address.

For more information about creating a custom image, see Creating, Deleting, and Deprecating Custom Images.

Connect to a virtual machine

Before you can connect to any of the VMs you created, you must first identify the external IP address. The VM instance section of GCP contains a list of all currently deployed VMs and their associated external IP addresses.

Connect to your VM

To connect to your VM, complete the following tasks:

  1. Connect to your VM using SSH and the external IP address you identified earlier.

  2. Authenticate using the credentials and SSH key that you provided to your GCP account upon creation.

Connect to other VMs

To connect to other virtual machines in your virtual network:

  1. Use SSH to connect to your publicly connected VM.

  2. Use SSH again from that VM to connect through the private IP addresses of your other VMs.

Because GCP forces the use of private key authentication, you may need to move your key file to the root directory of your publicly connected VM. Then, use SSH to connect to other VMs in your virtual network.
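
As an illustration of this two-hop connection, the commands below connect to the publicly reachable VM and then hop to another node over its private address. The key file name, user name, and addresses are examples only:

# from your local machine, connect to the publicly reachable VM
$ ssh -i ~/mykey.pem dbadmin@<public-vm-external-ip>
# from that VM, connect to another node through its private IP address
$ ssh -i ~/mykey.pem dbadmin@10.2.0.165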

Prepare the virtual machines

After you create your VMs, you need to prepare them for cluster formation.

Add the Vertica license and private key

Prepare your nodes by adding your private key (if you are using one) and your Vertica license to them. The following steps assume that the initial user you configured is the DBADMIN user:

  1. As the DBADMIN user, copy your private key file from where you saved it locally onto your primary node.

    Depending upon the procedure you use to copy the file, the permissions on the file may change. If permissions change, the install_vertica script fails with a message similar to the following:

    Failed Login Validation 10.0.2.158, cannot resolve or connect to host as root.
    

    If you see the previous failure message, enter the following command to correct permissions on your private key file:

    $ chmod 600 /<name-of-key>.pem
    
  2. Copy your Vertica license to your primary VM. Save it in your home directory or other known location.
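
As an illustration of these copy steps, the following commands transfer a key file and a license file from your local machine to the primary node. The file names, user name, and address placeholder are examples only:

$ scp -i ~/mykey.pem ~/mykey.pem dbadmin@<primary-node-external-ip>:~/
$ scp -i ~/mykey.pem ~/vertica_license.dat dbadmin@<primary-node-external-ip>:~/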

Install software dependencies for Vertica on GCP

In addition to the Vertica standard package dependencies, as the root user, you must install the following packages before you install Vertica (a sample installation command appears after the list):

  • pstack

  • mcelog

  • sysstat

  • dialog
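
On a RHEL 7 or CentOS 7 image, a single yum command such as the following sketch can install these packages. Package names and availability can vary by distribution and configured repositories (for example, on some systems pstack is provided by the gdb package):

$ sudo yum install -y pstack mcelog sysstat dialog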

Configure storage

For best disk performance in GCP, Vertica recommends SSD persistent disks of at least 2 TB (2000 GB). In GCP, disk performance is directly tied to disk size, and 2000 GB is the minimum SSD persistent disk size that provides maximum throughput.

When configuring your storage, make sure to use a supported file system. See Recommended storage format types for details.
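
As a command-line sketch of provisioning such a disk, the following gcloud commands create a 2000 GB SSD persistent disk and attach it to a node. The disk name, instance name, and zone are examples only, and you still need to partition, format, and mount the disk with a supported file system afterward:

$ gcloud compute disks create vertica-data-01 --size=2000GB --type=pd-ssd --zone=us-central1-a
$ gcloud compute instances attach-disk vertica-node-01 --disk=vertica-data-01 --zone=us-central1-a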

Create a swap file

In addition to storage volumes to store your data, Vertica requires a swap volume or swap file for the setup script to complete.

Create a swap file or swap volume of at least 2 GB. The following steps show how to create a 2 GB swap file on a VM in GCP:

  1. Create an empty swap file with root-only permissions:

    $ install -o root -g root -m 0600 /dev/null /swapfile
    
  2. Write 2 GB of zeros to the swap file:

    $ dd if=/dev/zero of=/swapfile bs=1024 count=2048k
    
  3. Prepare the swap file using mkswap:

    $ mkswap /swapfile
    
  4. Use swapon to instruct Linux to swap on the swap file:

    $ swapon /swapfile
    
  5. Persist the swap file in /etc/fstab:

    $ echo "/swapfile swap swap auto 0 0" >> /etc/fstab
    
  6. Repeat the volume attachment, combination, and swap file creation procedures for each VM in your cluster.

Download Vertica

To download the Vertica server package appropriate for your operating system and license type, follow the steps described in Download and install the Vertica server package.

After you complete the download and extraction, use the install_vertica script to form a cluster and install the Vertica database software, as described in the next section.

Form a cluster and install Vertica

Use the install_vertica script to combine two or more individual VMs to form a cluster and install your Vertica database.

Before you run the install_vertica script, follow these steps:

  1. Check the VM Instances page of the Compute Engine section on GCP to locate a list of current VMs and their associated internal IP addresses.

  2. Identify your storage location on your VMs. The installer assumes that you have mounted your storage to /home/dbadmin. To specify another location, use the --data-dir argument.

The following steps show how to combine virtual machines (VMs) into a cluster using the install_vertica script:

  1. While connected to your primary node, construct the following command to combine your nodes into a cluster.

    $ sudo /opt/vertica/sbin/install_vertica --hosts 10.2.0.164,10.2.0.165,10.2.0.166 --dba-user-password-disabled --point-to-point --data-dir /vertica/data --ssh-identity ~/.pem --license 
    
  2. Substitute the IP addresses for your VMs, and include your root key file name, if applicable.

  3. Include the --point-to-point parameter to configure spread to use direct point-to-point communication among all Vertica nodes, as required for clusters on GCP when installing or updating Vertica.

  4. If you are using Vertica Community Edition, which limits you to three nodes, specify -L CE with no license file.

  5. After you combine your nodes, to reduce security risks, keep your key file in a secure place—separate from your cluster—and delete your on-cluster key with the shred command:

    $ shred examplekey.pem
    

For complete information about the install_vertica script and its parameters, see Installing Vertica with the installation script.

After your cluster is up and running

Now that your cluster is configured and Vertica is running, take these steps:

  1. Create a database. See Creating a database for details.
  2. When you installed Vertica, a database administrator user was created with the DBADMIN role (usually named dbadmin). Use this account to create and start a database.
  3. See Configuring the database for important database configuration steps.

4 - Moving a cloud installation from by the hour (BTH) to bring your own license (BYOL)

Vertica offers two licensing options for some of the entries in the Amazon Web Services Marketplace and Google Cloud Marketplace:

  • Bring Your Own License (BYOL): a long-term license that you obtain through an online licensing portal. These deployments also work with a free Community Edition license. Vertica uses a community license automatically if you do not install a license that you purchased. (For more about Vertica licenses, see Managing licenses and Understanding Vertica licenses.)
  • Vertica by the Hour (BTH): a pay-as-you-go environment where you are charged an hourly fee for both the use of Vertica and the cost of the instances it runs on. The Vertica by the hour deployment offers an alternative to purchasing a term license. If you want to crunch large volumes of data within a short period of time, this option might work better for you. The BTH license is automatically applied to all clusters you create using a BTH MC instance.

If you start out with an hourly license, you can later decide to use a long-term license for your database. The support for an hourly versus a long-term license is built into the instances running your database. To move your database from an hourly license to a long-term license, you must create a new database cluster with a new set of instances.

To move from an hourly to a long-term license, follow these steps:

  1. Purchase a BYOL license. Follow the process described in Obtaining a license key file.

  2. Apply the new license to your database.

  3. Shut down your database.

  4. Create a new database cluster using a BYOL marketplace entry.

  5. Revive your database onto the new cluster.

The exact steps you must take depend on your database mode and your preferred tool for managing your database:

Moving an Eon Mode database from BTH to BYOL using the command line

Follow these steps to move an Eon Mode database from an hourly to a long-term license.

  1. Obtain a long-term BYOL license from the online licensing portal, described in Obtaining a license key file.

  2. Upload the license file to a node in your database. Note the absolute path in the node's filesystem, as you will need this later when installing the license.

  3. Connect to the node you uploaded the license file to in the previous step. Connect to your database using vsql and view the licenses table:

     => SELECT * FROM licenses;

  4. Note the name of the hourly license listed in the NAME column, so you can check if it is still present later.

  5. Install the license in the database using the INSTALL_LICENSE function with the absolute path to the license file you uploaded in step 2:

     => SELECT install_license('absolute path to BYOL license');

  6. View the licenses table again:

     => SELECT * FROM licenses;

     If only the new BYOL license appears in the table, skip to step 8. If the hourly license whose name you noted in step 4 is still in the table, copy the name and proceed to step 7.

  7. Call the DROP_LICENSE function to drop the hourly license:

     => SELECT drop_license('hourly license name');

  8. You will need the path for your cluster's communal storage in a later step. If you do not already know the path, you can find it by executing this query:

     => SELECT location_path FROM V_CATALOG.STORAGE_LOCATIONS
        WHERE sharing_type = 'COMMUNAL';

  9. Synchronize your database's metadata. See Synchronizing metadata. (A sketch of this step appears after this list.)

  10. Shut down the database by calling the SHUTDOWN function:

      => SELECT SHUTDOWN();

  11. You now need to create a new BYOL cluster onto which you will revive your database. Deploy a new cluster, including a new MC instance, using a BYOL entry in the marketplace of your chosen cloud platform.

  12. Revive your database onto the new cluster. For instructions, see Reviving an Eon Mode database cluster. Because you created the new cluster using a BYOL entry in the marketplace, the database uses the BYOL license you applied earlier.

  13. After reviving the database on your new BYOL cluster, terminate the instances for your hourly license cluster and MC. For instructions, see your cloud provider's documentation.
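
For step 9, the following is a minimal sketch of forcing an immediate catalog sync from vsql before you shut down, so that the copy of the metadata in communal storage is current. It assumes the SYNC_CATALOG meta-function is available in your Vertica version, as described in Synchronizing metadata:

=> SELECT sync_catalog();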

Moving an Eon Mode database from BTH to BYOL using the MC

Follow this procedure to move to BYOL and revive your database using MC:

  1. Purchase a long-term BYOL license from the online licensing portal, following the steps detailed in Obtaining a license key file. Save the file to a location on your computer.

  2. You now need to install the new license on your database. Log into MC and click your database in the Recent Databases list.

  3. At the bottom of your database's Overview page, click the License tab.

  4. Under the Installed Licenses list, note the name of the BTH license in the License Name column. You will need this later to check whether it is still present after installing the new long-term license.

  5. In the ribbon at the top of the License History page, click the Install New License button. The Settings: License page opens.

  6. Click the Browse button next to the Upload a new license box.

  7. Locate the license file you obtained in step 1, and click Open.

  8. Click the Apply button on the top right of the page.

  9. Select the checkbox to agree to the EULA terms and click OK.

  10. After Vertica installs the license, click the Close button.

  11. Click the License tab at the bottom of the page.

  12. If only the new long-term license appears in the Installed Licenses list, skip to Step 16. If the by-the-hour license also appears in the list, copy down its name from the License Name column.

  13. You must drop the by-the-hour license before you can proceed. At the bottom of the page, click the Query Execution tab.

  14. In the query editor, enter the following statement:

    SELECT DROP_LICENSE('hourly license name');
    
  15. Click Execute Query. The query should complete indicating that the license has been dropped.

  16. You will need the path for your cluster's communal storage in a later step. If you do not already know the path, you can find this information by executing this query in the Query Execution tab:

    SELECT location_path FROM V_CATALOG.STORAGE_LOCATIONS
       WHERE sharing_type = 'COMMUNAL';
    
  17. Synchronize your database's metadata. See Synchronizing metadata.

  18. You must now stop your by-the-hour database cluster. At the bottom of the page, click the Manage tab.

  19. In the banner at the top of the page, click Stop Database and then click OK to confirm.

  20. From the Amazon Web Services Marketplace or the Google Cloud Marketplace, deploy a new Vertica Management Console using a BYOL entry. Do not deploy a full cluster. You just need an MC deployment.

  21. Log into your new MC instance and revive the database. See Reviving an Eon Mode database on AWS in MC for detailed instructions.

  22. After reviving the database on your new environment, terminate the instances for your hourly license environment. To do so, on the AWS CloudFormation Stacks page, select the hourly environment's stack (its collection of AWS resources) and click Actions > Delete Stack.

Moving an Enterprise Mode database from hourly to BYOL using backup and restore

For an Enterprise Mode database, follow this procedure to move to BYOL by backing up and restoring your database:

  1. Obtain a long-term BYOL license from the online licensing portal, described in Obtaining a license key file.

  2. Upload the license file to a node in your database. Note the absolute path in the node's filesystem, as you will need this later when installing the license.

  3. Connect to the node you uploaded the license file to in the previous step. Connect to your database using vsql and view the licenses table:

     => SELECT * FROM licenses;

  4. Note the name of the hourly license listed in the NAME column, so you can check if it is still present later.

  5. Install the license in the database using the INSTALL_LICENSE function with the absolute path to the license file you uploaded in step 2:

     => SELECT install_license('absolute path to BYOL license');

  6. View the licenses table again:

     => SELECT * FROM licenses;

     If only the new BYOL license appears in the table, skip to step 8. If the hourly license whose name you noted in step 4 is still in the table, copy the name and proceed to step 7.

  7. Call the DROP_LICENSE function to drop the hourly license:

     => SELECT drop_license('hourly license name');

  8. Back up the database. See Backing up and restoring the database.

  9. Deploy a new cluster for your database using one of the BYOL entries in the Amazon Web Services Marketplace.

  10. Restore the database from the backup you created earlier. See Backing up and restoring the database. When you restore the database, it will use the BYOL you loaded earlier.

  11. After restoring the database on your new environment, terminate the instances for your hourly license environment. To do so, on the AWS CloudFormation Stacks page, select the hourly environment's stack (its collection of AWS resources) and click Actions > Delete Stack.

After completing one of these procedures, see Viewing your license status to confirm the license drop and install were successful.
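
For example, assuming you have vsql access to the revived or restored database, the following queries are one way to confirm which license is installed and whether you are in compliance with its terms:

=> SELECT DISPLAY_LICENSE();
=> SELECT GET_COMPLIANCE_STATUS();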

5 - Adjusting Spread Daemon timeouts for virtual environments

You may see Vertica nodes leave the database even though they are still running. This issue can happen on networks that are prone to spikes in latency or in virtual environments where a node's VM may be paused for a short period of time. You can adjust a setting in Vertica to help prevent this issue from occurring.

Vertica relies on spread daemons to pass messages between database nodes. When a node fails to respond to a spread message after a timeout period, Vertica assumes the node is down and starts to remove it from the database.

The default Spread timeout depends on the number of configured Spread segments:

Configured Spread segments | Default timeout
---------------------------+----------------
1                          | 8 seconds
> 1                        | 25 seconds

If network delays or temporary pauses of a VM last longer than the spread timeout period, you may see UP nodes leave the database. In these cases, you can increase the spread timeout to reduce or eliminate instances where UP nodes leave the database.

Azure's memory-preserving updates and spread timeouts

In Azure, you might see running nodes leave the database due to scheduled maintenance. Azure's maintenance down time is usually well-defined. For example, Azure's memory-preserving updates can pause a VM for up to 30 seconds while performing maintenance on the system hosting the VM. This pause does not disrupt the node. It continues normal operation once Azure resumes it. See the Azure documentation's topic on Maintenance for virtual machines in Azure for more information about updates. If Azure pauses a node for longer than the spread timeout period, Vertica interprets the node's inability to respond to a spread message as the node going down, even though it will resume running normally.

Setting the spread timeout

When you know your network or nodes may be unable to respond for a specific amount of time, you can increase the spread timeout period to longer than this time. Adjust the timeout to the period of time the node may be unable to respond, plus an additional 5 seconds as a safety margin.

For example, if you know Azure's memory-preserving maintenance can pause your VMs for up to 30 seconds, set the spread timeout to 35 seconds.

If you do not know exactly how long network or node disruptions can last, you can try increasing the spread timeout gradually, until you see reduced instances of UP nodes leaving the database. Be as conservative with this setting as you can.

You can see the current setting of the spread timeout by querying the system table SPREAD_STATE:

=> SELECT * FROM V_MONITOR.SPREAD_STATE;
    node_name     | token_timeout
------------------+---------------
 v_vmart_node0003 |          8000
 v_vmart_node0001 |          8000
 v_vmart_node0002 |          8000
(3 rows)

You change the spread timeout by calling the meta-function SET_SPREAD_OPTION to set the token timeout to a new value. This value is a string that sets the timeout in milliseconds.

This example sets the timeout to 35 seconds (35000ms):

=> SELECT SET_SPREAD_OPTION( 'TokenTimeout', '35000');
NOTICE 9003:  Spread has been notified about the change
                   SET_SPREAD_OPTION
--------------------------------------------------------
 Spread option 'TokenTimeout' has been set to '35000'.

(1 row)

=> SELECT * FROM V_MONITOR.SPREAD_STATE;
    node_name     | token_timeout
------------------+---------------
 v_vmart_node0001 |         35000
 v_vmart_node0002 |         35000
 v_vmart_node0003 |         35000
(3 rows)
