Welcome to the Vertica on the Cloud guide. This section explains how you can create Vertica clusters running on different cloud platforms. It does not cover working with existing data stored in the cloud. For information about loading data, see Data load.
This document assumes that you are familiar with the cloud environment on which you will create your Vertica cluster.
1 - Vertica on Amazon Web Services
This section explains how to create and manage Vertica clusters on AWS.
When you launch a cluster on AWS resources and are ready to create your database, consider whether to run it in Eon Mode or Enterprise Mode. The differences between these two modes lie in their architecture, deployment, and scalability:
Enterprise Mode stores data locally on the nodes in the database.
Eon Mode stores its data in an S3 bucket.
Eon Mode separates the computational processes from the communal storage layer of your database. This separation lets you elastically vary the number of nodes in your database cluster to adjust to varying workloads.
Vertica provides CloudFormation Templates (CFTs) through the AWS Marketplace. These CFTs also deploy the Management Console.
See Architecture for more about the differences between the two database modes.
In this section
1.1 - Overview of Vertica on Amazon Web Services (AWS)
Vertica clusters on AWS can operate on EC2 instances automatically provisioned using a CloudFormation Template, or manually deployed from Amazon Machine Images (AMIs).
You can create a database in either Eon Mode or Enterprise Mode in a Vertica cluster in AWS.
For more information about Amazon cluster instances and their limitations, see the Amazon documentation.
In this section
1.1.1 - CloudFormation templates
Vertica provides CloudFormation Templates (CFTs) through the AWS Marketplace. After you provide a few parameters to the template, create a stack to automatically provision the AWS resources for your Vertica system.
Vertica provides Vertica and Management Console AMIs in the following operating systems.
Red Hat 7.4 and later
Amazon Linux 2.0 and later
You can use the AMI to deploy MC hosts or cluster hosts.
Important
When using Amazon Linux 2.0, you must compile C++ UDX libraries with a supported gcc version. For more information see Setting up a development environment.
1.1.4 - Supported AWS instance types
Vertica supports a range of Amazon Web Services instance types, each optimized for different purposes. Choose the instance type that best matches your requirements. The two tables below list the AWS instance types that Vertica supports for Vertica cluster hosts, and for use in MC. For more information, see the Amazon Web Services documentation on instance types and volumes.
Instance types for Vertica cluster hosts
Each Amazon EC2 Instance type natively provides one of the following storage options:
Elastic Block Store (EBS) provides durable storage: data files stored on the instance persist after the instance is stopped.
Instance Store provides temporary storage: data files stored on the instance are lost when the instance is stopped.
Important
Instance types that support EBS volumes also support encrypting those volumes.
| Optimization | Instance Types Using Only EBS Volumes (Durable) | Instance Types Using Instance Store Volumes (Temporary) |
| --- | --- | --- |
| General purpose | m4.4xlarge, m4.10xlarge, m5.4xlarge, m5.8xlarge, m5.12xlarge | m5d.4xlarge, m5d.8xlarge, m5d.12xlarge |
| Compute | c4.4xlarge, c4.8xlarge, c5.4xlarge, c5.9xlarge | c3.4xlarge, c3.8xlarge, c5d.4xlarge, c5d.9xlarge |
| Memory | r4.4xlarge, r4.8xlarge, r4.16xlarge, r5.4xlarge, r5.8xlarge, r5.12xlarge | r3.4xlarge, r3.8xlarge, r5d.4xlarge, r5d.8xlarge, r5d.12xlarge |
| Storage | | d2.4xlarge, d2.8xlarge, i3.4xlarge, i3.8xlarge, i3.16xlarge, i3en.3xlarge, i3en.6xlarge, i3en.12xlarge |
Instance types available for MC hosts
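| Optimization | Type | Supports EBS Storage (Durable) | Supports Ephemeral Storage (Temporary) |
| --- | --- | --- | --- |
| Computing | c4.large | Yes | No |
| Computing | c4.xlarge | Yes | No |
| Computing | c5.large | Yes | No |
| Computing | c5.xlarge | Yes | No |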
More information
For more information about Amazon cluster instances and their limitations, see Manage Clusters in the Amazon Web Services documentation.
1.1.5 - Choosing AWS Eon Mode instance types
This topic lists the recommended instance types to use in an Eon Mode database running in AWS.
Choose instance types that support ephemeral instance storage or EBS volumes for your depot, depending on cost and availability. It is not mandatory to have an EBS-backed depot, because in Eon Mode, a copy of the data is safely stored in communal storage. Vertica recommends either r4 or i3 instances for production clusters.
The following table can help you choose between instances with ephemeral instance storage and instances with EBS-only storage. Check with AWS for the latest cost per hour.
| Storage Type | Instance Type | Pros/Cons |
| --- | --- | --- |
| Instance storage | i3.8xlarge | Instance storage offers better performance than EBS attached storage through multiple EBS volumes. Instance storage can be striped (RAIDed) together to increase throughput and load balance I/O. Data stored in instance-store volumes is not persistent through instance stops, terminations, or hardware failures. |
| EBS-only storage | r4.8xlarge with 600 GB EBS volume attached | Newer instance types from AWS have only the EBS option. In most AWS regions, it's easier to provision a large number of instances. You can terminate an instance but leave the EBS volume around for faster revive. Preserving the EBS volume preserves the depot. Some of the cached files might have become stale, and those are ignored and evicted, but much of the cached data is not stale, which saves time when the node revives and warms its depot. You can also take advantage of full-volume encryption. |
Important
If you select instances that use instance store and later terminate those instances, you can lose data. For Eon Mode, MC displays an alert to inform the user of the potential data loss when terminating instances that support instance store.
1.1.6 - Vertica AMI sleep c-states
By default, the following instances have their processor C-states set to a value of 1 in the Vertica AMI:
c4.8xlarge
d2.8xlarge
m4.10xlarge
This measure is meant to improve performance by limiting the sleep states that an instance running Vertica uses.
For more information about sleep states, visit the AWS Documentation.
Command Line Interface: Use the Amazon command-line Interface (CLI) with your Vertica AMIs. For more information, see What Is the AWS Command Line Interface?.
Elastic Load Balancing: Use elastic load balancing (ELB) for queries up to one hour. When enabling ELB, configure the timer to 3600 seconds. For more information see Elastic Load Balancing in the AWS documentation.
1.1.8 - AWS authentication
Amazon defines two ways to control access to AWS resources such as S3: IAM roles and the combination of id, secrets, and (optionally) session tokens. For long-term access to non-communal storage buckets, you should use IAM roles for access control centralization. You do not need to change your application's configuration if you want to change its access settings. You just alter the IAM role applied to your EC2 instances.
Vertica uses both of these authentication methods to support different features and use cases:
An Eon Mode database's access to S3 for communal and catalog storage must always use IAM role authentication. IAM roles are the default access control method for AWS resources. Vertica uses this method if you do not configure the legacy access control session parameters.
Individual users can read data from S3 storage locations other than the ones Vertica uses for communal storage. For example, users can use COPY to load data into Vertica from an S3 bucket or query an external table stored on S3. If the IAM role assigned to the Vertica nodes does not have access to this external S3 data, the user must set an id, secret, and optionally an access token in session variables to authorize access to it. These session variables override the IAM role set on the server. See S3 parameters for a list of these session parameters.
Individual users can export data to S3 using the Vertica Library for AWS. This library cannot use IAM authorization. Users who want to export data to S3 using this library must set id, secret, and optionally access token values in session variables. See Configure the Vertica library for Amazon Web Services for details.
Important
If the database is running in Eon Mode, using id and secret authentication is more complex. In addition to having access to the external S3 data, any id that a user sets must be authorized to read from and write to the S3 storage locations that Vertica uses to store communal and catalog data. The queries that the user executes use this id for all storage requests, not just those for accessing external S3 data. If the id does not have access to the catalog and communal storage, the user cannot execute queries.
Configuring an IAM role
To configure an IAM role that grants Vertica access to AWS resources, you must:
Create an IAM role to allow EC2 instances to access the specific resources.
Grant that role permission to access your resources.
Attach this IAM role to each EC2 instance in the Vertica cluster.
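The first two steps above come down to two JSON policy documents: a trust policy that lets EC2 instances assume the role, and a permissions policy that grants access to the resources. The following Python sketch builds both as plain dictionaries. The bucket name `example-vertica-bucket` and the list of S3 actions are illustrative assumptions, not the set that Vertica requires; compare them with the roles defined in the Vertica CFTs before relying on them.

```python
import json


def ec2_trust_policy():
    """Trust policy that allows EC2 instances to assume the role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": "ec2.amazonaws.com"},
                "Action": "sts:AssumeRole",
            }
        ],
    }


def s3_access_policy(bucket):
    """Permissions policy for one bucket (the action list is an assumption)."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Listing applies to the bucket itself.
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": f"arn:aws:s3:::{bucket}",
            },
            {
                # Object operations apply to the keys inside the bucket.
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:PutObject", "s3:DeleteObject"],
                "Resource": f"arn:aws:s3:::{bucket}/*",
            },
        ],
    }


if __name__ == "__main__":
    print(json.dumps(ec2_trust_policy(), indent=2))
    print(json.dumps(s3_access_policy("example-vertica-bucket"), indent=2))
```

The printed documents can be pasted into the IAM console when you create the role and its policy. Attaching the role to each EC2 instance remains a separate step.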
To see an example of IAM roles for a Vertica cluster, look at the roles defined in one of the CloudFormation Templates provided by Vertica. You can download these templates from any of the Vertica entries in the AWS Marketplace. Under each entry's Usage Information section, click the View CloudFormation Template link, then click Download CloudFormation Template.
1.2 - Installing Vertica with CloudFormation templates
Vertica provides CloudFormation Templates (CFTs) on the AWS Marketplace that allow you to get a cluster up and running quickly. Using the template allows you to automatically provision your AWS resources and launch a Vertica cluster and Management Console, with minimal configuration required.
For details about creating an Eon Mode or Enterprise Mode database after you create a cluster with CFTs, see Amazon Web Services in MC.
1.2.1 - CloudFormation template (CFT) overview
With Vertica on AWS, use CloudFormation Templates (CFTs) to easily manage provisioning the AWS resources with a running Vertica system.
To access Vertica CFTs, go to the AWS Marketplace. Licensing models for CFTs are:
Bring Your Own License (BYOL): By default, a free Community Edition (CE) license is installed, which allows up to 3 nodes and 1 TB of data. To add nodes or capacity, you can purchase a Vertica BYOL license. Outside of the BYOL license on CFTs, you can also access the Community Edition without a license file:
If you are using Management Console, simply leave the license field blank.
By the Hour: A pay-as-you-go model where you pay for only the number of hours you use for each node. One advantage of using the Paid Listing is that all charges appear on your Amazon AWS bill. This offers an alternative to purchasing a full Vertica license. This eliminates the need to compute potential storage needs in advance.
Available Vertica CFTs are:
Management Console with 3 Vertica nodes: The easiest way to deploy Vertica. This CFT deploys an Eon Mode database by default. However, this environment can also be used to create an Enterprise Mode database. For more information, see Creating a database.
Deploy Management Console into new VPC: This CFT deploys all required AWS resources and installs the Vertica Management Console (MC). After stack creation completes, log in to the MC to provision a Vertica database cluster.
Deploy Management Console into existing VPC: This CFT deploys the Vertica Management Console (MC) in an already-existing VPC and subnet. After stack creation completes, the MC is available. Log in to MC to provision either a Vertica database cluster or an Eon Mode database cluster.
For this CFT, you must first set up the VPC, subnet, and related network resources. For more information about the correct configuration of these resources for Vertica, see the following topics in the AWS documentation:
Creating a virtual private cloud
Configuring the network
Before you can install Vertica on AWS using CloudFormation Templates (CFTs), verify that you have:
AWS account with permissions to create a VPC, subnet, security group, EC2 instances, and IAM roles (For more information about AWS accounts, see the AWS documentation)
Starting in the AWS Marketplace, launch the provisioning instance from which you can install Vertica:
Log in to the AWS Marketplace with an AWS account (see the Prerequisites section above).
Search for "Vertica" in the AWS Marketplace.
Select a Vertica CFT. Each CFT leads you to a product overview page, with pricing estimates. (Also see CloudFormation template (CFT) overview for an overview of available templates and products).
Click Continue to Subscribe.
On the next page, select your launch settings based on your requirements for deployment.
If you have not agreed to Vertica EULA terms on the AWS Marketplace before, click Accept Software Terms to subscribe.
Click Launch with CloudFormation Console. The CloudFormation Console opens.
The CloudFormation Console automatically supplies the URL in the Specify an Amazon S3 template URL field. Click Next.
Follow the CloudFormation workflow and enter the parameters (collectively called a stack).
Important
Take note of the username and password you set for Management Console during this step. You cannot recover or reset these credentials after you create the stack.
After confirming the details you have provided for your new stack, click Create. The AWS console brings you to the Stacks page, where you can view the progress of the creation process. The process takes several minutes.
The Outputs tab displays information about accessing your environment after the process completes.
Next, access the Management Console (MC) to deploy your cluster instances and create a database, as described in Access Management Console.
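The information on the Outputs tab can also be read by a script once the stack outputs have been retrieved. The sketch below assumes the outputs are available as a list of `OutputKey`/`OutputValue` dictionaries, which is the shape that CloudFormation returns when a stack is described. The key name `ManagementConsoleURL` and the sample values are hypothetical; check the output names that your template actually defines.

```python
def find_output(outputs, key):
    """Return the OutputValue for `key`, or None if the stack has no such output."""
    for item in outputs:
        if item.get("OutputKey") == key:
            return item.get("OutputValue")
    return None


# Sample data in the stack "Outputs" shape; keys and values are made up.
sample_outputs = [
    {"OutputKey": "ManagementConsoleURL", "OutputValue": "https://203.0.113.10:5450"},
    {"OutputKey": "Documentation", "OutputValue": "https://example.com/docs"},
]

if __name__ == "__main__":
    print(find_output(sample_outputs, "ManagementConsoleURL"))
```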
1.2.4 - Access Management Console
You use MC to deploy Vertica cluster instances and create a database. You can also use MC to manage and monitor your databases. You will use Management Console to provision a Vertica cluster and database on the AWS resources you just launched.
On the AWS CloudFormation Stacks page, select your new stack and view the Outputs tab. This tab provides information about accessing your environment, as well as documentation and licensing resources.
Click the Access Management Console URL. This link takes you to the MC login page.
To log in, enter the MC username and password that you created using the CloudFormation Console.
After login, MC displays the home page, with options to provision a new cluster or database or import existing ones. If you chose a CFT that also creates a database, your new database is also displayed on the home page.
This page also provides a Resources section with links to online training, blogs, community, and help resources.
You have successfully launched Management Console on AWS resources.
If you have not yet provisioned a Vertica cluster and database, complete the steps in one of the following:
A Vertica cluster on AWS must be logically located in the same network. This is similar to placing the nodes of an on-premises cluster within the same network. Create a virtual private cloud (VPC) to ensure the nodes in your cluster will be able to communicate with each other within AWS.
Create a single public subnet VPC with the following configurations:
Assign a Network Access Control List (ACL) that is appropriate to your situation. The default ACL does not provide a high level of security.
Enable DNS resolution and enable DNS hostname support for instances launched in this VPC.
A Vertica cluster must be operated within a single availability zone.
For information about VPCs, including how to create one, visit the AWS documentation.
1.3 - Install Vertica with manually deployed AWS resources
Vertica provides an AMI that you can install on AWS resources that you manually deploy. This section will guide you through configuring your network settings on AWS, launching and preparing EC2 instances using the Vertica AMI, and creating a Vertica cluster on those EC2 instances.
Choose this method of installation if you are familiar with configuring AWS and have many specific AWS configuration needs. (To automatically deploy AWS resources and a Vertica cluster instead, see Installing Vertica with CloudFormation templates.)
1.3.1 - Configure your network
Before you create your cluster, you must configure the network on which Vertica will run. Vertica requires a number of specific network configurations to operate on AWS. You may also have specific network configuration needs beyond the default Vertica settings.
Important
You can create a Vertica database that uses IPv6 for internal communications running on AWS. However, if you do so, you must identify the hosts in your cluster using IP addresses rather than host names. The AWS DNS resolution service is incompatible with IPv6.
The following sections explain which Amazon EC2 features you need to configure for instance creation.
1.3.1.1 - Create a placement group, key pair, and VPC
Part of configuring your network for AWS is to create the following:
A placement group is a logical grouping of instances in a single Availability Zone. Placement Groups are required for clusters and all Vertica nodes must be in the same Placement Group.
Vertica recommends placement groups for applications that benefit from low network latency, high network throughput, or both. To provide the lowest latency, and the highest packet-per-second network performance for your Placement Group, choose an instance type that supports enhanced networking.
For information on creating placement groups, see Placement Groups in the AWS documentation.
Create a key pair
You need a key pair to access your instances using SSH. Create the key pair using the AWS interface and store a copy of your key (*.pem) file on your local machine. When you access an instance, you need to know the local path of your key.
Use a key pair to:
Authenticate your connection as dbadmin to your instances from outside your cluster.
Install and configure Vertica on your AWS instances.
For information on creating a key pair, see Amazon EC2 Key Pairs in the AWS documentation.
Create a virtual private cloud (VPC)
You create a Virtual Private Cloud (VPC) on Amazon so that you can create a network of your EC2 instances. Your instances in the VPC all share the same network and security settings.
A Vertica cluster on AWS must be logically located in the same network. Create a VPC to ensure the nodes in your cluster can communicate with each other in AWS.
Create a single public subnet VPC with the following configurations:
Assign a Network Access Control List (ACL) that is appropriate to your situation.
Enable DNS resolution and enable DNS hostname support for instances launched in this VPC.
Vertica requires the following basic network access control list (ACL) settings on an AWS instance running the Vertica AMI. Vertica recommends that you secure your network with additional ACL settings that are appropriate to your situation.
Inbound Rules
| Type | Protocol | Port Range | Use | Source | Allow/Deny |
| --- | --- | --- | --- | --- | --- |
| SSH | TCP (6) | 22 | SSH (optional, for access to your cluster from outside your VPC) | User Specific | Allow |
| Custom TCP Rule | TCP (6) | 5450 | MC (optional, for MC running outside of your VPC) | User Specific | Allow |
| Custom TCP Rule | TCP (6) | 5433 | SQL Clients (optional, for access to your cluster from SQL clients) | User Specific | Allow |
| Custom TCP Rule | TCP (6) | 50000 | Rsync (optional, for backup outside of your VPC) | User Specific | Allow |
| Custom TCP Rule | TCP (6) | 1024-65535 | Ephemeral Ports (needed if you use any of the above) | User Specific | Allow |
| ALL Traffic | ALL | ALL | N/A | 0.0.0.0/0 | Deny |
Outbound Rules
| Type | Protocol | Port Range | Use | Source | Allow/Deny |
| --- | --- | --- | --- | --- | --- |
| Custom TCP Rule | TCP (6) | 0-65535 | Ephemeral Ports | 0.0.0.0/0 | Allow |
You can use the entire port range specified in the previous table, or find your specific ephemeral ports by entering the following command:
$ cat /proc/sys/net/ipv4/ip_local_port_range
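If you want to use the result of that command in a script, for example to narrow the ephemeral port rule, the file can be read and parsed directly. This is a minimal Python sketch. It assumes a Linux host where the file holds two whitespace-separated integers, and the function names are illustrative.

```python
def parse_port_range(text):
    """Parse the two whitespace-separated integers found in ip_local_port_range."""
    low, high = (int(part) for part in text.split())
    return low, high


def read_local_port_range(path="/proc/sys/net/ipv4/ip_local_port_range"):
    """Read the range from a Linux host; raises OSError where the file is absent."""
    with open(path) as handle:
        return parse_port_range(handle.read())


if __name__ == "__main__":
    # 32768 60999 is a common Linux default, used here only as sample input.
    print(parse_port_range("32768\t60999\n"))
```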
More information
For detailed information on network ACLs within AWS, refer to Network ACLs in the Amazon documentation.
For detailed information on ephemeral ports within AWS, refer to Ephemeral Ports in the Amazon documentation.
1.3.1.3 - Configure TCP keepalive with AWS network load balancer
AWS supports three types of elastic load balancers (ELBs): Classic Load Balancers, Application Load Balancers, and Network Load Balancers.
Vertica strongly recommends the AWS Network Load Balancer (NLB), which provides the best performance with your Vertica database. The Network Load Balancer acts as a proxy between clients (such as JDBC) and Vertica servers. The Classic and Application Load Balancers do not work with Vertica, in Enterprise Mode or Eon Mode.
To avoid timeouts and hangs when connecting to Vertica through the NLB, it is important to understand how AWS NLB handles idle timeouts for connections. For the NLB, AWS sets the idle timeout value to 350 seconds and you cannot change this value. The timeout applies to both connection points.
For a long-running query, if either the client or the server fails to send a timely keepalive, that side of the connection is terminated. This can lead to situations where a JDBC client hangs waiting for results that would never be returned because the server fails to send a keepalive within 350 seconds.
To identify an idle timeout/keepalive issue, run a query like this via a client such as JDBC:
=> SELECT SLEEP(355);
If there’s a problem, one of the following situations occurs:
The client connection terminates before 355 seconds. In this case, lower the JDBC keepalive setting so that keepalives are sent less than 350 seconds apart.
The client connection doesn’t return a result after 355 seconds. In this case, you need to adjust the server keepalive settings (tcp_keepalive_time and tcp_keepalive_intvl) so that keepalives are sent less than 350 seconds apart.
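The rule behind both cases is the same: the delay before the first keepalive probe and the interval between probes must each stay under the 350-second NLB idle timeout. The following Python sketch checks a pair of values against that limit. The function names are illustrative, and reading the kernel settings from `/proc` assumes a Linux host.

```python
NLB_IDLE_TIMEOUT_SECONDS = 350  # fixed by AWS for the Network Load Balancer


def keepalive_ok(keepalive_time, keepalive_intvl, limit=NLB_IDLE_TIMEOUT_SECONDS):
    """True when both the first-probe delay and the probe interval are under the limit."""
    return keepalive_time < limit and keepalive_intvl < limit


def read_sysctl(name, root="/proc/sys/net/ipv4"):
    """Read an integer kernel setting such as tcp_keepalive_time (Linux only)."""
    with open(f"{root}/{name}") as handle:
        return int(handle.read().strip())


if __name__ == "__main__":
    # 7200 and 75 are the usual Linux defaults; 7200 exceeds the NLB limit.
    print(keepalive_ok(7200, 75))  # False
    print(keepalive_ok(300, 75))   # True
```

On a Vertica node, `keepalive_ok(read_sysctl("tcp_keepalive_time"), read_sysctl("tcp_keepalive_intvl"))` would report whether the current server settings fit within the timeout.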
When you create a VPC, an Internet gateway is automatically assigned to it. You can use that gateway, or you can assign your own. If you are using the default Internet gateway, continue with the procedure described in Create a security group.
Otherwise, create an Internet gateway specific to your needs. Associate that internet gateway with your VPC and subnet.
For information about how to create an Internet Gateway, see Internet Gateways in the AWS documentation.
1.3.1.5 - Assign an elastic IP address
An elastic IP address is an unchanging IP address that you can use to connect to your cluster externally. Vertica recommends you assign a single elastic IP to a node in your cluster. You can then connect to other nodes in your cluster from your primary node using their internal IP addresses dictated by your VPC settings.
Create an elastic IP address. For information, see Elastic IP Addresses in the AWS documentation.
1.3.1.6 - Create a security group
The Vertica AMI has specific security group requirements. When you create a Virtual Private Cloud (VPC), AWS automatically creates a default security group and assigns it to the VPC. You can use the default security group, or you can name and assign your own.
Create and name your own security group using the following basic security group settings. You may make additional modifications based on your specific needs.
Inbound
| Type | Use | Protocol | Port Range | IP |
| --- | --- | --- | --- | --- |
| SSH | | TCP | 22 | The CIDR address range of administrative systems that require SSH access to the Vertica nodes. Make this range as restrictive as possible. You can add multiple rules for separate network ranges, if necessary. |
| DNS (UDP) | | UDP | 53 | Your private subnet address range (for example, 10.0.0.0/24). |
| Custom UDP | Spread | UDP | 4803 and 4804 | Your private subnet address range (for example, 10.0.0.0/24). |
| Custom TCP | Spread | TCP | 4803 | Your private subnet address range (for example, 10.0.0.0/24). |
| Custom TCP | VSQL/SQL | TCP | 5433 | The CIDR address range of client systems that require access to the Vertica nodes. This range should be as restrictive as possible. You can add multiple rules for separate network ranges, if necessary. |
| Custom TCP | Inter-node Communication | TCP | 5434 | Your private subnet address range (for example, 10.0.0.0/24). |
| Custom TCP | | TCP | 5444 | Your private subnet address range (for example, 10.0.0.0/24). |
| Custom TCP | MC | TCP | 5450 | The CIDR address of client systems that require access to the management console. This range should be as restrictive as possible. You can add multiple rules for separate network ranges, if necessary. |
| Custom TCP | Rsync | TCP | 50000 | Your private subnet address range (for example, 10.0.0.0/24). |
| ICMP | Installer | Echo Reply | N/A | Your private subnet address range (for example, 10.0.0.0/24). |
| ICMP | Installer | Traceroute | N/A | Your private subnet address range (for example, 10.0.0.0/24). |
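If you script your security group setup, the TCP and UDP rules above can be kept as data and converted into the permission structure that the EC2 API expects for ingress rules. The Python sketch below only builds that structure and makes no AWS calls. The administrative and client CIDR ranges are hypothetical placeholders, the ICMP rules are left out for brevity, and the description given to port 5444 is an assumption because the table does not name its use.

```python
PRIVATE_SUBNET = "10.0.0.0/24"   # example range from the table
ADMIN_CIDR = "198.51.100.0/24"   # hypothetical administrative range
CLIENT_CIDR = "203.0.113.0/24"   # hypothetical client range

# (protocol, first port, last port, source CIDR, description)
RULES = [
    ("tcp", 22, 22, ADMIN_CIDR, "SSH"),
    ("udp", 53, 53, PRIVATE_SUBNET, "DNS"),
    ("udp", 4803, 4804, PRIVATE_SUBNET, "Spread"),
    ("tcp", 4803, 4803, PRIVATE_SUBNET, "Spread"),
    ("tcp", 5433, 5433, CLIENT_CIDR, "VSQL/SQL"),
    ("tcp", 5434, 5434, PRIVATE_SUBNET, "Inter-node communication"),
    ("tcp", 5444, 5444, PRIVATE_SUBNET, "Port 5444"),  # use not named in the table
    ("tcp", 5450, 5450, CLIENT_CIDR, "MC"),
    ("tcp", 50000, 50000, PRIVATE_SUBNET, "Rsync"),
]


def to_ip_permissions(rules):
    """Convert rule tuples into EC2-style IpPermissions dictionaries."""
    return [
        {
            "IpProtocol": protocol,
            "FromPort": first_port,
            "ToPort": last_port,
            "IpRanges": [{"CidrIp": cidr, "Description": description}],
        }
        for protocol, first_port, last_port, cidr, description in rules
    ]


if __name__ == "__main__":
    for permission in to_ip_permissions(RULES):
        print(permission)
```

Keeping the rules in one list makes it easier to review the ranges and tighten them before anything is applied to a real security group.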
Note
In Management Console (MC), the Java IANA discovery process uses port 7 once to detect if an IP address is reachable before the database import operation. Vertica tries port 7 first. If port 7 is blocked, Vertica switches to port 22.
1.3.2 - Deploy AWS instances for your Vertica database cluster
Once you have configured your network, you are ready to create your AWS instances and install Vertica. Follow these procedures to install and run Vertica on AWS.
1.3.2.1 - Configure and launch an instance
After you configure your network settings on AWS, configure and launch the instances onto which you will install Vertica. An Elastic Compute Cloud (EC2) instance without a Vertica AMI is similar to a traditional host. Just like with an on-premises cluster, you must prepare and configure your cluster and network at the hardware level before you can install Vertica.
When you create an EC2 instance on AWS using a Vertica AMI, the instance includes the Vertica software and the recommended configuration. The Vertica AMI acts as a template, requiring fewer configuration steps. Vertica recommends that you use the Vertica AMI as is—without modification.
Consider the following issues when you add storage to your instances:
Add a number of drives equal to the number of physical cores in your instance. For example, add 16 drives for a c3.8xlarge instance and 8 drives for an r3.4xlarge instance.
Do not store your information on the root volume.
Amazon EBS provides durable, block-level storage volumes that you can attach to running instances. For guidance on selecting and configuring an Amazon EBS volume type, see Amazon EBS Volume Types in the Amazon Web Services documentation.
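The drive counts in the first item follow from the number of physical cores, which is the instance's vCPU count divided by the threads per core. A short Python sketch of that arithmetic follows. The vCPU counts come from the AWS instance specifications, and two threads per core is an assumption that holds for hyper-threaded instance types.

```python
def physical_cores(vcpus, threads_per_core=2):
    """Physical cores for an instance, assuming hyper-threading by default."""
    return vcpus // threads_per_core


if __name__ == "__main__":
    # c3.8xlarge has 32 vCPUs and r3.4xlarge has 16 vCPUs.
    print(physical_cores(32))  # 16 drives for c3.8xlarge
    print(physical_cores(16))  # 8 drives for r3.4xlarge
```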
Decide whether to configure EBS volumes as a RAID array
You can choose to configure your EBS volumes into a RAID 0 array to improve disk performance. Before doing so, use the vioperf utility to determine whether the performance of the EBS volumes is fast enough without using them in a RAID array. Pass vioperf the path to a mount point for an EBS volume. In this example, an EBS volume is mounted on a directory named /vertica/data:
If the EBS volume read and write performance (the entries with Read and Write in column 1 of the output) is greater than 20MB/s per physical processor core (columns 6 and 7), you do not need to configure the EBS volumes as a RAID array to meet the minimum requirements to run Vertica. You may still consider configuring your EBS volumes as a RAID array if the performance is less than the optimal 40MB/s per physical core (as is the case in this example).
Note
If your EC2 instance has hyper-threading enabled, vioperf may incorrectly count the number of cores in your system. The 20MB/s throughput per core requirement only applies to physical cores, rather than virtual cores. If your EC2 instance has hyper-threading enabled, divide the counter value (column 4 in the output) by the number of physical cores. See CPU Cores and Threads Per CPU Core Per Instance Type section in the AWS documentation topic Optimizing CPU Options for a list of physical cores in each instance type.
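The per-core check described above is simple arithmetic: divide the throughput counter by the number of physical cores and compare the result with the 20MB/s minimum and the 40MB/s optimum. The Python sketch below does that. The sample reading is made up, and treating both thresholds as inclusive is an assumption.

```python
MINIMUM_MB_PER_CORE = 20  # minimum per physical core stated above
OPTIMAL_MB_PER_CORE = 40  # optimal per physical core stated above


def per_core_throughput(counter_mb_per_sec, physical_cores):
    """Divide the vioperf counter value by the number of physical cores."""
    if physical_cores <= 0:
        raise ValueError("physical_cores must be positive")
    return counter_mb_per_sec / physical_cores


def assess(counter_mb_per_sec, physical_cores):
    """Classify throughput; inclusive thresholds are an assumption."""
    rate = per_core_throughput(counter_mb_per_sec, physical_cores)
    if rate >= OPTIMAL_MB_PER_CORE:
        return "optimal"
    if rate >= MINIMUM_MB_PER_CORE:
        return "meets minimum"
    return "below minimum"


if __name__ == "__main__":
    # Made-up reading: 480 MB/s on an instance with 16 physical cores.
    print(per_core_throughput(480, 16), assess(480, 16))  # 30.0 meets minimum
```

A result of "meets minimum" corresponds to the case where a RAID 0 array is not required but may still be worth considering.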
If you determine you need to configure your EBS volumes as a RAID 0 array, see the AWS documentation topic RAID Configuration on Linux for the steps you need to take.
Security group and access
Choose between your previously configured security group or the default security group.
Configure S3 access for your nodes by creating and assigning an IAM role to your EC2 instance. See AWS authentication for more information.
Launch instances
Verify that your instances are running.
1.3.2.2 - Connect to an instance
Using your private key, take these steps to connect to your cluster through the instance to which you attached an elastic IP address:
As the dbadmin user, type the following command, substituting your ssh key:
Select the instance that is attached to the Elastic IP.
Click Connect.
On Connect to Your Instance, choose one of the following options:
A Java SSH Client directly from my browser: Add the path to your private key in the field Private key path, and click Launch SSH Client.
Connect with a standalone SSH client: Follow the steps required by your standalone SSH client.
Connect to an instance from Windows using PuTTY
If you connect to the instance from the Windows operating system and plan to use PuTTY:
Convert your key file using PuTTYgen.
Connect with PuTTY or WinSCP (connect via the elastic IP), using your converted key (that is, the *.ppk file).
Move your key file (the *.pem file) to the root directory using PuTTY or WinSCP.
1.3.2.3 - Prepare instances for cluster formation
After you create your instances, you need to prepare them for cluster formation. Prepare your instances by adding your AWS .pem key and your Vertica license.
By default, each AMI includes a Community Edition license. Once Vertica is installed, you can find the license at this location:
As the dbadmin user, copy your *.pem file (from where you saved it locally) onto your primary instance.
Depending upon the procedure you use to copy the file, the permissions on the file may change. If permissions change, the install_vertica script fails with a message similar to the following:
FATAL (19): Failed Login Validation 10.0.3.158, cannot resolve or connect to host as root.
If you receive a failure message, enter the following command to correct permissions on your *.pem file:
$ chmod 600 /<name-of-pem>.pem
Copy your Vertica license over to your primary instance, placing it in your home directory or other known location.
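If you script the key copy, you can check the permissions before running install_vertica. The following is a minimal sketch with hypothetical function names. It mirrors the chmod 600 command shown above and assumes a Linux file system.

```python
import os
import stat


def is_owner_only(path):
    """Return True if group and others have no permissions on the file."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return (mode & 0o077) == 0


def restrict_to_owner(path):
    """Equivalent of `chmod 600 <path>`: read/write for the owner only."""
    os.chmod(path, 0o600)
```

Calling restrict_to_owner() on the copied key before the install step is one way to avoid the login validation failure caused by changed permissions.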
1.3.2.4 - Change instances on AWS
You can change instance types on AWS. For example, you can downgrade a c3.8xlarge instance to c3.4xlarge. See Supported AWS instance types for a list of valid AWS instances.
When you change AWS instances you may need to:
Reconfigure memory settings
Reset memory size in a resource pool
Reset number of CPUs in a resource pool
Reconfigure memory settings
If you change to an AWS instance type that requires a different amount of memory, you may need to recompute the following and then reset the values:
You may need root user permissions to reset these values.
Reset memory size in a resource pool
If you used absolute memory in a resource pool, you may need to reconfigure the memory using the MEMORYSIZE parameter in ALTER RESOURCE POOL.
Note
If you set memory size as a percentage when you created the original resource pool, you do not need to change it here.
Reset number of CPUs in a resource pool
If your new instance requires a different number of CPUs, you may need to reset the CPUAFFINITYSET parameter in ALTER RESOURCE POOL.
1.3.2.5 - Configure storage
Vertica recommends that you store information — especially your data and catalog directories — on dedicated Amazon EBS volumes formatted with a supported file system. The /opt/vertica/sbin/configure_software_raid.sh script automates the storage configuration process.
Caution
Do not store information on the root volume because it might result in data loss.
Vertica performance tests Eon Mode with a per-node EBS volume of up to 2TB. For best performance, combine multiple EBS volumes into a RAID 0 array.
Because the storage configuration script requires the volume names that you want to configure, you must identify the volumes on your machine. The following command lists the contents of the /dev directory. Search for the volumes that begin with xvd:
$ ls /dev
Important
Ignore the root volume. Do not include any of your root volumes in the RAID creation process.
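The volume selection described above can be sketched as a small filter over the names in /dev. This is an illustration only: the function names are hypothetical, and it assumes that you already know the name of your root device (for example, xvda) and pass it in explicitly.

```python
def is_root_or_partition(name, root_device):
    """True for the root device itself or one of its numbered partitions."""
    if name == root_device:
        return True
    suffix = name[len(root_device):]
    return name.startswith(root_device) and suffix.isdigit()


def candidate_raid_volumes(dev_entries, root_device):
    """Return /dev paths of xvd* entries, excluding the root volume."""
    selected = []
    for name in sorted(dev_entries):
        if not name.startswith("xvd"):
            continue  # not an EBS-style volume name
        if is_root_or_partition(name, root_device):
            continue  # never include the root volume in the RAID array
        selected.append("/dev/" + name)
    return selected
```

On a node, you could pass os.listdir("/dev") as dev_entries and use the returned names when you edit configure_software_raid.sh.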
Combining volumes for storage
The configure_software_raid.sh shell script combines your EBS volumes into a RAID 0 array.
Caution
Run configure_software_raid.sh in the default setting only if you have a fresh configuration with no existing RAID settings.
If you have existing RAID settings, open the script in a text editor and manually edit the raid_dev value to reflect your current RAID settings. If you have existing RAID settings and you do not edit the script, the script deletes important operating system device files.
Alternatively, use Management Console (MC) to add storage nodes without unwanted changes to operating system device files. For more information, see Managing database clusters.
The following steps combine your EBS volumes into RAID 0 with the configure_software_raid.sh script:
Edit the /opt/vertica/sbin/configure_software_raid.sh shell file as follows:
Comment out the safety exit command at the beginning.
Change the sample volume names to your own volume names, which you noted previously. Add more volumes, if necessary.
Run the /opt/vertica/sbin/configure_software_raid.sh shell file. Running this file creates a RAID 0 volume and mounts it to /vertica/data.
Change the owner of the newly created volume to dbadmin with chown.
Repeat steps 1-3 for each node on your cluster.
1.3.2.6 - Create a cluster
On AWS, use the install_vertica script to combine instances and create a cluster. Check your My Instances page on AWS for a list of current instances and their associated IP addresses. You need these IP addresses when you run install_vertica.
Create a cluster as follows:
While connected to your primary instance, enter the following command to combine your instances into a cluster. Substitute the IP addresses for your instances and include your root *.pem file name.
* If you are using Vertica Community Edition, which limits you to three instances, you can specify `-L CE` with no license file.
* When you issue install_vertica or update_vertica on a Vertica AMI, --point-to-point is the default. This parameter configures Spread to use direct point-to-point communication between all Vertica nodes, which is a requirement for clusters on AWS.
* If you are using IPv6 network addresses to identify the hosts in your cluster, use the --ipv6 flag in your `install_vertica` command. You must also use IP addresses instead of host names, as the AWS DNS server cannot resolve host names to IPv6 addresses.
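The options discussed in the notes above can be assembled programmatically. The following is a hedged sketch, not the documented command: the function name is hypothetical, the script path and the use of sudo are assumptions about a typical Vertica AMI, and the IP addresses and key path are placeholders that you replace with your own.

```python
CE_MAX_INSTANCES = 3  # Community Edition limit stated in the text


def build_install_command(hosts, pem_path, license_arg="CE", ipv6=False):
    """Build an install_vertica argument list for the given instance IPs."""
    if not hosts:
        raise ValueError("at least one host IP address is required")
    if license_arg == "CE" and len(hosts) > CE_MAX_INSTANCES:
        raise ValueError("Community Edition is limited to three instances")
    command = [
        "sudo", "/opt/vertica/sbin/install_vertica",  # assumed install path
        "--hosts", ",".join(hosts),
        "--dba-user-password-disabled",
        "--point-to-point",          # required for clusters on AWS
        "--ssh-identity", pem_path,  # your root *.pem key
        "-L", license_arg,           # "CE" or a path to a license file
    ]
    if ipv6:
        command.append("--ipv6")     # hosts must then be IPv6 addresses
    return command
```

You can pass the returned list to subprocess.run() on the primary instance, or join it into a single line to review before running it by hand.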
After combining your instances, Vertica recommends deleting your *.pem key from your cluster to reduce security risks. The example below uses the shred command to delete the file:
Stopping or rebooting an instance or cluster without first shutting down the database may result in disk or database corruption. To safely shut down and restart your cluster, see Operating the database.
Check open ports manually using the netcat utility
Once your cluster is up and running, you can check ports manually through the command line using the netcat (nc) utility. What follows is an example using the utility to check ports.
Before performing the procedure, choose the private IP addresses of two nodes in your cluster.
The examples given below use nodes with the private IPs:
10.0.11.60 10.0.11.61
Install the nc utility on your nodes. Once installed, you can issue commands to check the ports on one node from another node.
To check a TCP port:
Put one node in listen mode and specify the port. The following sample shows how to put IP 10.0.11.60 into listen mode for port 4804.
[root@ip-10-0-11-60 ~]# nc -l 4804
From the other node, run nc specifying the IP address of the node you just put in listen mode, and the same port number.
[root@ip-10-0-11-61 ~]# nc 10.0.11.60 4804
Enter sample text from either node and it should show up on the other node. To cancel after you have checked a port, enter Ctrl+C.
Note
To check a UDP port, use the same nc commands with the -u option.
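If nc is not available, a similar TCP check can be scripted. The sketch below is an alternative to the nc procedure, not a replacement for it: the function name is hypothetical, it only tests whether a TCP connection succeeds (it does not exchange sample text), and something must already be listening on the target port.

```python
import socket


def is_tcp_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts, and timeouts.
        return False
```

For example, with 10.0.11.60 in listen mode on port 4804, calling is_tcp_port_open("10.0.11.60", 4804) from 10.0.11.61 performs the same reachability test as the second nc command above.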
Management Console (MC) is a database management tool that allows you to view and manage aspects of your cluster. Vertica provides an MC AMI, which you can use with AWS. The MC AMI allows you to create an instance, dedicated to running MC, that you can attach to a new or existing Vertica cluster on AWS. You can create and attach an MC instance to your Vertica on AWS cluster at any time.
1.3.2.7.1 - Log in to MC and manage your cluster
After you launch your MC instance and configure your security group settings, log in to your database. To do so, use the elastic IP you specified during instance creation.
From this elastic IP, you can manage your Vertica database on AWS using standard MC procedures.
Considerations when using MC on AWS
Because MC is already installed on the MC AMI, the MC installation process does not apply.
To uninstall MC on AWS, follow the procedures provided in Uninstalling Management Console before terminating the MC Instance.
1.4 - Export data to Amazon S3 using the AWS library
Deprecated
The AWS library is deprecated. To export delimited data to S3 or any other destination, use EXPORT TO DELIMITED.
The Vertica library for Amazon Web Services (AWS) is a set of functions and configurable session parameters. These parameters allow you to export delimited data from Vertica to Amazon S3 storage without any third-party scripts or programs.
To use the AWS library, you must have access to an Amazon S3 storage account.
1.4.1 - Configure the Vertica library for Amazon Web Services
You use the Vertica library for Amazon Web Services (AWS) to export data from Vertica to S3. This library does not support IAM authentication. You must configure it to authenticate with S3 by using session parameters containing your AWS access key credentials. You can set your session parameters directly, or you can store your credentials in a table and set them with the AWS_SET_CONFIG function.
Because the AWS library uses session parameters, you must reconfigure the library with each new session.
Important
Your AWS access key ID and secret access key are different from your account access credentials. For more information about AWS access keys, see Managing Access Keys for IAM Users in the AWS documentation.
Set AWS authentication parameters
The following AWS authentication parameters allow you to access AWS and work with the data in your Vertica database:
aws_id: The 20-character AWS access key used to authenticate your account.
aws_secret: The 40-character AWS secret access key used to authenticate your account.
aws_session_token: The AWS temporary security token generated by running the AWS STS command get-session-token. This AWS STS command generates temporary credentials you can use to implement multi-factor authentication for security purposes. See Implementing Multi-factor Authentication.
Implement multi-factor authentication
Implement multi-factor authentication as follows:
Run the AWS STS command get-session-token, which returns the following:
For more information on get-session-token, see the AWS documentation.
Using the SecretAccessKey returned from get-session-token, set your temporary aws_secret:
=> ALTER SESSION SET UDPARAMETER FOR awslib aws_secret='bQid6jNuSWRqUzkIJCFG7c71gDHZY3h7aDSW2DU6';
Using the SessionToken returned from get-session-token, set your temporary aws_session_token:
=> ALTER SESSION SET UDPARAMETER FOR awslib aws_session_token='FQoDYXdzEBcaDKM1mWpeu88nDTTFICKsAbaiIDTWe4B
Th33tnUvo9F/8mZicKKLLy7WIcpT4FLfr6ltIm242/U2CI9G/XdC6eoysUi3UGH7cxdhjxAW4fjgCKKYuNL764N2xn0issmIuJOku3GTDy
c4U4iNlWyEng3SlshdiqVlk1It2Mk0isEQXKtxF9VgfncDQBxjZUCkYIzseZw5pULa9YQcJOzl+Q2JrdUCWu0iFspSUJPhOguH+wTq
iM2XdHL5hcUcomqm41gU=';
Using the AccessKeyId returned from get-session-token, set your temporary aws_id:
=> ALTER SESSION SET UDPARAMETER FOR awslib aws_id='ASIAJ4ZYGTOSVSLUIN7Q';
The Expiration value returned indicates when the temporary credentials expire. In this example expiration occurs April 12, 2018 at 01:58:50.
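The mapping from the get-session-token output to the three session parameters can be sketched as follows. This is an illustration with a hypothetical function name. It assumes the JSON layout that the AWS CLI prints for get-session-token (a Credentials object with AccessKeyId, SecretAccessKey, and SessionToken fields) and only builds the statement text; it does not connect to Vertica.

```python
import json

# Session parameter name -> field in the "Credentials" object.
PARAMETER_TO_CREDENTIAL_FIELD = (
    ("aws_id", "AccessKeyId"),
    ("aws_secret", "SecretAccessKey"),
    ("aws_session_token", "SessionToken"),
)


def session_statements(sts_output):
    """Turn get-session-token JSON into ALTER SESSION statements."""
    credentials = json.loads(sts_output)["Credentials"]
    statements = []
    for parameter, field in PARAMETER_TO_CREDENTIAL_FIELD:
        # Double any single quote so the SQL string literal stays valid.
        value = credentials[field].replace("'", "''")
        statements.append(
            "ALTER SESSION SET UDPARAMETER FOR awslib {}='{}';".format(parameter, value)
        )
    return statements
```

Because the generated statements contain plain text credentials, treat them like the credentials themselves and avoid writing them to logs or shared files.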
These examples show how to implement multifactor authentication using session parameters. You can use either of the following methods to securely set and store your AWS account credentials:
Note: To increase security, avoid setting the plain text value of your key directly in the aws_set_config parameter. Instead, store the value in a table protected with an access policy, as described in Configure session parameters using credentials stored in a table.
AWS access key requirements
To communicate with AWS, your access key must have the following permissions:
s3:GetObject
s3:PutObject
s3:ListBucket
For security purposes, Vertica recommends that you create a separate access key with limited permissions specifically for use with the Vertica Library for AWS.
Configure session parameters directly
These examples show how to set the session parameters for AWS using your own credentials. Parameter values are case sensitive:
aws_id: This value is your AWS access key ID.
=> ALTER SESSION SET UDPARAMETER FOR awslib aws_id='AKABCOEXAMPLEPKPXYZQ';
aws_secret: This value is your AWS secret access key.
=> ALTER SESSION SET UDPARAMETER FOR awslib aws_secret='CEXAMPLE3tEXAMPLE1wEXAMPLEFrFEXAMPLE6+Yz';
aws_region: This value is the AWS region associated with the S3 bucket you intend to access. Left unconfigured, aws_region will default to us-east-1. It identifies the default server used by Amazon S3.
=> ALTER SESSION SET UDPARAMETER FOR awslib aws_region='us-east-1';
Using ALTER SESSION to change the values of S3 parameters also changes the values of corresponding UDParameters.
Setting a UDParameter changes only the UDParameter.
Setting a configuration parameter changes both the AWS parameter and UDParameter.
Configure session parameters using credentials stored in a table
You can place your credentials in a table and secure them with a row-level access policy. You can then call your credentials with the AWS_SET_CONFIG scalar meta-function. This approach allows you to store your credentials on your cluster for future session parameter configuration. You must have dbadmin access to create access policies.
Create a table with rows or columns corresponding with your credentials:
Store your credentials in the corresponding columns:
=> COPY keychain FROM STDIN;
Enter data to be copied followed by a newline.
End with a backslash and a period on a line by itself.
>> AEXAMPLEI5EXAMPLEYXQ|CCEXAMPLEtFjTEXAMPLEiEXAMPLE6+Yz
>> \.
After you configure the library for Amazon Web Services (AWS), you can export Vertica data to Amazon S3 by calling the S3EXPORT() transform function. S3EXPORT() writes data to files, based on the URL you provide. Vertica performs all communication over HTTPS, regardless of the URL type you use. Vertica does not support virtual host style URLs. If you use HTTPS URL constructions, you must use path style URLs.
Note
If your S3 bucket contains a period in its path, set the prepend_hash parameter to True.
You can control the output of S3EXPORT() in the following ways:
By adjusting the query given to S3EXPORT(), you can export anything from tables to reporting queries.
This example exports a whole table:
=> SELECT S3EXPORT( * USING PARAMETERS url='s3://exampleBucket/object') OVER(PARTITION BEST)
FROM exampleTable;
rows | url
------+------------------------------
606 | https://exampleBucket/object
(1 row)
This example exports the results of a query:
=> SELECT S3EXPORT(customer_name, annual_income USING PARAMETERS url='s3://exampleBucket/object') OVER()
FROM public.customer_dimension
WHERE (customer_gender, annual_income) IN
(SELECT customer_gender, MAX(annual_income)
FROM public.customer_dimension
GROUP BY customer_gender);
rows | url
------+------------------------------
25 | https://exampleBucket/object
(1 row)
Adjust the partition of your result set with the OVER clause
Use the OVER clause to control your export partitions. Using the OVER() clause without qualification results in a single partition processed by the initiator for all of the query data. This example shows how to call the function with an unqualified OVER() clause:
=> SELECT S3EXPORT(name, company USING PARAMETERS url='s3://exampleBucket/object',
delimiter=',') OVER()
FROM exampleTable WHERE company='Vertica';
rows | url
------+------------------------------
10 | https://exampleBucket/object
(1 row)
This example shows how you can use a window partition clause to partition S3 objects based on company values:
=> SELECT S3EXPORT(name, company
USING PARAMETERS url='s3://exampleBucket/object',
delimiter=',') OVER(PARTITION BY company) AS MEDIAN
FROM exampleTable;
Adjusting the export chunk size for wide tables
You may encounter the following error when exporting extremely wide tables or tables with long data types such as LONG VARCHAR or LONG VARBINARY:
=> SELECT S3EXPORT( * USING PARAMETERS url='s3://exampleBucket/object') OVER(PARTITION BEST)
FROM veryWideTable;
ERROR 5861: Error calling setup() in User Function s3export
at [/data/.../S3.cpp:787],
error code: 0, message: The specified buffer of 10485760 bytesRead is too small,
it should be at least 11279701 bytesRead.
Vertica returns this error if the data for a single row overflows the buffer storing the data before export. By default, this buffer is 10MB. You can increase the size of this buffer using the chunksize parameter, which sets the size of the buffer in bytes. This example sets it to around 60MB:
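As a rough sketch of the arithmetic, the helper below converts a size in megabytes to the byte value that chunksize expects and builds a statement in the same shape as the earlier S3EXPORT examples. The function names are hypothetical, and treating 1 MB as 1024 * 1024 bytes is an assumption; any value larger than the size reported in the error message serves the same purpose.

```python
BYTES_PER_MB = 1024 * 1024  # assumption: binary megabytes


def mb_to_bytes(megabytes):
    """Convert megabytes to the byte count used by the chunksize parameter."""
    return int(megabytes * BYTES_PER_MB)


def export_statement(table, url, chunksize_bytes):
    """Build an S3EXPORT statement that sets an explicit chunksize."""
    return (
        "SELECT S3EXPORT(* USING PARAMETERS url='{url}', chunksize={size}) "
        "OVER(PARTITION BEST) FROM {table};"
    ).format(url=url, size=chunksize_bytes, table=table)
```

With these helpers, mb_to_bytes(60) gives 62914560 bytes, which exceeds the 11279701 bytes that the error message above asks for.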
There are two ways to add nodes to an AWS cluster:
Using Management Console
Using admintools
When you use MC to add nodes to a cluster in the cloud, MC provisions the instances, adds the new instances to the existing Vertica cluster, and then adds those hosts to the database. However, when you add nodes to a cluster using admintools, you need to execute those steps yourself, as explained in Adding Nodes Using admintools.
Adding nodes using Management Console
In the Vertica Management Console, you can add nodes in several ways, depending on your database mode.
For Eon Mode databases, MC supports actions for subcluster and node management for the following public and private cloud providers:
Adding nodes in an Enterprise Mode database on AWS
In an Enterprise Mode database on AWS, to add an instance to your cluster:
On the MC Home page, click View Infrastructure to go to the Infrastructure page. This page lists all the clusters the MC is monitoring.
Click any cluster shown on the Infrastructure page.
Select View or Manage from the dialog that displays, to view its Cluster page. (In a cloud environment, if MC was deployed from a cloud template the button says "Manage". Otherwise, the button says "View".)
Note
You can click the pencil icon beside the cluster name to rename the cluster. Enter a name that is unique within MC.
Click the Add (+) icon on the Instance List on the Cluster Management page.
MC adds a node to the selected cluster.
Adding nodes using admintools
This section gives an overview on how to add nodes if you are managing your cluster using admintools. Each main step points to another topic with the complete instructions.
Step 1: before you start
Before you add nodes to a cluster, verify that you have an AWS cluster up and running and that you have:
Created a database.
Defined a database schema.
Loaded data.
Run the Database Designer.
Connected to your database.
Step 2: launch new instances to add to an existing cluster
Perform the procedure in Configure and launch an instance to create new instances (hosts) that you then will add to your existing cluster. Be sure to choose the same details you chose when you created the original instances (VPC, placement group, subnet, and security group).
Step 3: include new instances as cluster nodes
You need the IP addresses when you run the install_vertica script to include new instances as cluster nodes.
If you are configuring Amazon Elastic Block Store (EBS) volumes, be sure to configure the volumes on the node before you add the node to your cluster.
To add the new instances as nodes to your existing cluster:
Connect to the instance that is assigned to the Elastic IP. See Connect to an instance if you need more information.
Run the Vertica installation script to add the new instances as nodes to your cluster. Specify the internal IP addresses for your instances and your *.pem file name.
After you have added the new instances to your existing cluster, add them as nodes to your database, as described in Adding nodes to a database.
Step 5: rebalance the database
After you add nodes to a database, always rebalance the database.
1.6 - Remove nodes from a running AWS cluster
Use the following procedures to remove instances/nodes from an AWS cluster.
To avoid data loss, Vertica strongly recommends that you back up your database before removing a node. For details, see Backing up and restoring the database.
In this section
1.6.1 - Remove hosts from the database
Before you remove hosts from the database, verify that you have:
Backed up the database.
Lowered the K-safety of the database.
Note
Do not stop the database.
To remove a host from the database:
While logged on as dbadmin, launch Administration Tools.
$ /opt/vertica/bin/admintools
From the Main Menu, select Advanced Menu.
From Advanced Menu, select Cluster Management. Click OK.
From Cluster Management, select Remove Host(s). Click OK.
From Select Database, choose the database from which you plan to remove hosts. Click OK.
Select the host(s) to remove. Click OK.
Click Yes to confirm removal of the hosts.
Note
Enter a password if necessary. Leave blank if there is no password.
Click OK. The system displays a message telling you that the hosts have been removed. Automatic rebalancing also occurs.
Click OK to confirm. Administration Tools brings you back to the Cluster Management menu.
1.6.2 - Remove nodes from the cluster
To remove nodes from a cluster, run the update_vertica script and specify:
The option --remove-hosts, followed by the IP addresses of the nodes you are removing.
The option --ssh-identity, followed by the location and name of your *.pem file.
The option --dba-user-password-disabled.
The following example removes one node from the cluster:
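The three options listed above can be combined as in this sketch. It is an illustration rather than the documented command: the function name is hypothetical, the script path and the use of sudo are assumptions about a typical installation, and the IP address and key path are placeholders.

```python
def build_remove_hosts_command(hosts_to_remove, pem_path):
    """Build an update_vertica argument list that removes the given hosts."""
    if not hosts_to_remove:
        raise ValueError("specify at least one host to remove")
    return [
        "sudo", "/opt/vertica/sbin/update_vertica",  # assumed install path
        "--remove-hosts", ",".join(hosts_to_remove),
        "--ssh-identity", pem_path,
        "--dba-user-password-disabled",
    ]
```

To remove a single node, pass a one-element list such as ["10.0.11.165"] together with the path to your key file.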
After you have removed one or more nodes from your cluster, to save costs associated with running instances, you can choose to stop the AWS instances that were previously part of your cluster.
To stop an instance in AWS:
On AWS, navigate to your Instances page.
Right-click the instance, and choose Stop.
This step is optional because, after you have removed the node from your Vertica cluster, Vertica no longer sees the node as part of the cluster, even though it is still running within AWS.
1.7 - Upgrade Vertica on AWS
Before you upgrade to the latest Vertica version, do the following:
Vertica supports upgrades of Vertica server running on AWS instances created from the Vertica AMI. To upgrade Vertica, follow the instructions provided in Upgrading Vertica.
Make sure to add the following arguments to the upgrade script:
--dba-user-password-disabled
--point-to-point
1.8 - Copying and exporting data on AWS: what you need to know
There are common issues that occur when exporting or copying on AWS clusters, as described below. Except for these specific issues as they relate to AWS, copying and exporting data works as documented in Database export and import.
To copy or export data on AWS:
Verify that all nodes in source and destination clusters have their own elastic IPs (or public IPs) assigned.
If your destination cluster is located within the same VPC as your source cluster, proceed to step 3. Each node in one cluster must be able to communicate with each node in the other cluster. Thus, each source and destination node needs an elastic IP (or public IP) assigned.
(For non-CloudFormation Template installs) Create an S3 gateway endpoint.
If you aren't using a CloudFormation Template (CFT) to install Vertica, you must create an S3 gateway endpoint in your VPC. For more information, see the AWS documentation.
For example, the Vertica CFT has the following VPC endpoint:
Verify that your security group allows the AWS clusters to communicate.
Check your security groups for both your source and destination AWS clusters. Verify that ports 5433 and 5434 are open. If one of your AWS clusters is on a separate VPC, verify that your network access control list (ACL) allows communication on port 5434.
Note
This communication method exports and copies (imports) data across the Internet. You can alternatively use non-public IPs and gateways, or VPN to connect the source and destination clusters.
If there are one or more elastic load balancers (ELBs) between the clusters, verify that port 5433 is open between the ELBs and clusters.
If you use the Vertica client to connect to one or more ELBs, the ELBs only distribute incoming connections. The data transmission path occurs between clusters.
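Because every node in one cluster must reach every node in the other, the number of connections to verify grows quickly. The following sketch enumerates them so that you can feed the list to whatever port check you prefer. The function name is hypothetical, and checking both directions on both ports is an assumption made to be conservative.

```python
from itertools import product

EXPORT_PORTS = (5433, 5434)  # ports named in the security group step


def required_connections(source_nodes, destination_nodes, ports=EXPORT_PORTS):
    """List (from_node, to_node, port) for every cross-cluster node pair."""
    checks = []
    for src, dst in product(source_nodes, destination_nodes):
        for port in ports:
            checks.append((src, dst, port))  # source to destination
            checks.append((dst, src, port))  # destination to source
    return checks
```

For two three-node clusters, this yields 3 x 3 x 2 x 2 = 36 connections to verify.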
2 - Vertica on Microsoft Azure
You can deploy a Vertica database on the Microsoft Azure Cloud running in either Enterprise Mode or Eon Mode. In Eon Mode, Vertica stores its data communally using Azure block blob storage.
This section explains how to deploy a Vertica database to Microsoft Azure.
2.1 - Deploying Vertica from the Azure Marketplace
Deploy Vertica in the Microsoft Azure Cloud using the Vertica Analytics Platform entry in the Azure Marketplace. Vertica provides the following deployment options:
Eon Mode: Deploy a Management Console (MC) instance, and then provision and create an Eon Mode database from the MC. For cluster and storage requirements, see Eon Mode on Azure prerequisites.
Enterprise Mode: Deploy a four-node Enterprise Mode database consisting of one MC instance and three database nodes. This requires an Azure subscription with a minimum of 12 cores for the Vertica Marketplace solution.
The Enterprise Mode deployment uses the MC primarily as a monitoring tool. For example, you cannot provision and create a database with an Enterprise Mode MC. For information about creating and managing an Enterprise Mode database, see Create a database using administration tools.
Creating a deployment
Eon Mode and Enterprise Mode require much of the same information for deployment. Any information that is not required for both deployment types is clearly marked.
1. selecting the deployment type
Sign in to your Microsoft Azure account. From the Home screen, select Create a resource under Azure services.
Search for Vertica Analytics Platform and select it from the search results.
On the Vertica Analytics Platform page, select one of the following:
To deploy an MC instance that can manage an Eon Mode database, select Vertica Data Warehouse, Eon BYOL.
To deploy an Enterprise Mode database, select Vertica Analytics Platform.
On the next screen, select Create.
After you select your deployment type, the Basics tab on the Create Vertica Analytics Platform page displays.
2. adding project and instance details on the basics tab
Provide the following information in the Project details and Instance details sections:
Subscription: Azure bills this subscription for the cluster resources.
Resource group: The location to save all of the Azure resources. Create a new resource group or choose an existing one from the dropdown list.
Region: The location where the virtual machine running your MC instance is deployed.
Vertica Management Console User: Eon Mode only. The administrator username for the MC.
SSH public key for OS Access: Provide the SSH public key associated with the Vertica User, for command line access to the virtual machine.
Password for MC Access: Enter a password to log in to Management Console. Note that Management Console requires that you change your password after the initial login.
Confirm password: Reenter the value you entered in Password for MC Access.
Select Next: Virtual Machine Settings >.
3. selecting virtual machine settings
Provide the following information on the Virtual Machine Settings tab:
Management Console VM size: Select Change size to customize the VM settings or select the default. For a list of VM types recommended by use case, see Recommended Azure VM types.
Storage account of Eon DB: Eon Mode only. The storage account associated with the database deployment.
Number of Vertica Cluster nodes: Enterprise Mode only. The number of nodes to deploy in the cluster, in addition to the MC instance. The Community Edition (CE) license is automatically applied to the cluster. This license is limited to 1 TB of raw data and 3 Vertica nodes. If you select more than 3 nodes with a CE license, the initial database is created on the first 3 nodes. For information about upgrading your license, see Managing licenses.
Vertica Node VM size: Enterprise Mode only. Select the VM type to deploy in your cluster. Use the default or select Change size to customize the VM settings. For a list of VM types recommended by use case, see Recommended Azure VM types.
Total RAW storage per node: Enterprise Mode only. Select the amount of storage per node from the dropdown list. Each VM has a set of premium data disks that are configured and presented as a single storage location.
Select Next: Network Settings >.
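The Community Edition node limit described under Number of Vertica Cluster nodes amounts to a simple cap. The sketch below illustrates it with a hypothetical function name. The behavior for licenses other than CE (no cap) is an assumption, not something the text states.

```python
CE_MAX_NODES = 3  # Community Edition node limit stated in the text


def initial_database_node_count(selected_nodes, license_type="CE"):
    """Return how many of the selected nodes host the initial database."""
    if selected_nodes < 1:
        raise ValueError("select at least one node")
    if license_type == "CE":
        return min(selected_nodes, CE_MAX_NODES)
    return selected_nodes  # assumption: other licenses impose no cap here
```

For example, selecting 5 nodes with the CE license still creates the initial database on 3 nodes.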
4. selecting network settings
Provide the following information on the Network Settings tab:
Virtual Network: The virtual network that hosts the Vertica cluster. Create a new virtual network or select an existing one from the dropdown list. If you select an existing virtual network, Vertica recommends that you have already created a subnet to use for the deployment.
First subnet: The subnet for the associated Virtual Network. Create a new subnet or select an existing one from the dropdown list.
Public IP Address Resource Name: Each VM is configured with a publicly accessible IP address. This field allows you to specify the resource name for those IP addresses, and whether they are static or dynamic. The first public IP address resource is created exactly as entered, and associated with the Vertica Management Console. Azure appends a number from 1 to 16 to the resource name for each additional Vertica cluster node created. This number associates each VM with a resource.
Domain Name Label for Management Console: Because each VM has a public IP address, each node requires a DNS name. Enter a prefix for the name. The first DNS name is created exactly as entered, and associated with the Vertica Management Console. Azure appends a number from 1 to 16 to the DNS name for each Vertica cluster node created. That number associates each VM with a resource. Azure adds the remaining part of the fully qualified domain name based on the location where you created the cluster.
Select Next: Review + create >.
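The naming scheme described for the public IP resource name and the domain name label can be sketched as follows. The function name is hypothetical, and the sketch assumes that the number is appended directly to the name you enter, with no separator; check the resources in your deployment for the exact format.

```python
MAX_CLUSTER_NODES = 16  # highest suffix mentioned in the text


def expected_resource_names(base_name, cluster_nodes):
    """Return the MC name followed by one numbered name per cluster node."""
    if not 0 <= cluster_nodes <= MAX_CLUSTER_NODES:
        raise ValueError("cluster_nodes must be between 0 and 16")
    names = [base_name]  # first resource: created exactly as entered, for MC
    for number in range(1, cluster_nodes + 1):
        names.append("{}{}".format(base_name, number))  # assumed format
    return names
```

A list like this can help you match each VM to its public IP resource and DNS name after the deployment finishes.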
5. verifying on review + create
As the Review + create page loads, Azure validates your settings. After it passes validation, review your settings. When you are satisfied with your selections, select Create.
Accessing the MC after deployment
After your resources are successfully deployed, you are brought to the Overview page on Home > resources-name > Deployments. You must retrieve your Management Console IP address and username to log in.
From the Overview page, select Outputs in the left navigation.
Copy the vertica management console URL and vertica management console user name.
Paste the vertica management console URL in the browser address bar and press Enter.
Depending on your browser, you might receive a warning of a security risk. If you receive the warning, select the Advanced button and follow the browser's instructions to proceed to the Management Console.
On the Vertica Management Console login page, paste the vertica management console user name, and enter the Password for MC Access that you entered on Basics > Project details when you were deploying your MC instance.
Deleting a resource group
For details about the Azure Resource Manager and deleting a resource group, see the Azure documentation.
2.2 - Manually deploy Vertica on Microsoft Azure
Manually creating a database cluster for your Vertica deployment lets you customize your VMs to meet your specific needs. You often want to manually configure your VMs when deploying a Vertica cluster to host an Eon Mode database.
To start creating your Vertica cluster in Azure using manual steps, you first need to create a VM. During the VM creation process, you create and configure the other resources required for your cluster, which are then available for any additional VMs that you create.
The topics in this section explain how to manually deploy Vertica on Azure.
2.2.1 - Recommended Azure VM types
Vertica supports a range of Microsoft Azure virtual machine (VM) types, each optimized for different purposes. Choose the VM type that best matches your performance and price needs as a user.
Note
The GS VMs are not available in all regions, or from the Azure Marketplace.
An Azure VM is similar to a traditional host. Just as with an on-premises cluster, you must prepare and configure the hardware settings for your cluster and network before you install Vertica.
The first steps are:
From the Azure marketplace, select an operating system that Vertica supports.
A public IP is an IP address that you can use to connect to your cluster externally. For best results, assign a single static public IP to a node in your cluster. You can then connect to other nodes in your cluster from your primary node using the internal IP addresses that Azure generated when you specified your virtual network settings.
By default, a public IP address is dynamic; it changes every time you shut down the server. You can choose a static IP address, but doing so can add cost to your deployment.
During a VM installation, you cannot set a DNS name. If you use dynamic public IPs, set the DNS name in the public IP resource for each VM after deployment.
If needed, to create additional VMs, repeat the previous instructions in this document.
2.2.4 - Connect to a virtual machine
Before you can connect to any of the VMs you created, you must first make your virtual network externally accessible. To do so, you must attach the public IP address you created during network configuration to one of your VMs.
Connect to your VM
To connect to your VM, complete the following tasks:
Connect to your VM using SSH with the public IP address you created in the configuration steps.
Authenticate using the credentials and authentication method you specified during the VM creation process.
Connect to other VMs
Connect to other virtual machines in your virtual network by first using SSH to connect to your publicly connected VM. Then, use SSH again from that VM to connect through the private IP addresses of your other VMs.
If you are using private key authentication, you may need to move your key file to the root directory of your publicly connected VM. Then, use PuTTY or WinSCP to connect to other VMs in your virtual network.
2.2.5 - Prepare the virtual machines
After you create your VMs, you need to prepare them for cluster formation.
Add the Vertica license and private key
Prepare your nodes by adding your private key (if you are using one) and your Vertica license. These steps assume that the initial user you configured is the dbadmin user.
As the dbadmin user, copy your private key file from where you saved it locally onto your primary node.
Depending upon the procedure you use to copy the file, the permissions on the file may change. If permissions change, the install_vertica script fails with a message similar to the following:
Failed Login Validation 10.0.2.158, cannot resolve or connect to host as root.
If you receive a failure message, enter the following command to correct permissions on your private key file:
$ chmod 600 /<name-of-key>.pem
Copy your Vertica license to your primary VM. Save it in your home directory or other known location.
Install software dependencies for Vertica on Azure
In addition to the standard Vertica package dependencies, you must install the following packages as the root user before you install Vertica on Azure:
pstack
mcelog
sysstat
dialog
2.2.6 - Configure storage
Use a dedicated Azure storage account for node storage.
Caution
Do not store your information on the root volume, especially your data and catalog directories. Storing information on the root volume may result in data loss.
Using your previously created storage account, attach disk containers to your VMs that are appropriate to your needs.
For best performance, combine multiple storage volumes into RAID-0. For most RAID-0 implementations, attach 6 storage disk containers per VM.
Combine disk containers for storage
If you are using RAID, follow these steps to create a RAID-0 drive on your VMs. The following example shows how you can create a RAID-0 volume named md10 from 6 individual volumes named:
The RAID device can be renamed after a reboot. To ensure the filesystem is mounted in a predictable location on your VM, create a directory to use as the mount point to mount the filesystem. For example, you can choose to create a mount point named /data that you will use to store your database's catalog and data (or depot, if you are running Vertica in Eon Mode).
$ mkdir /data
Using a text editor, add an entry to the /etc/fstab file for the UUID of the filesystem and your mount point so it is mounted when the system boots:
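As a rough illustration of the shape of that entry, the following Python sketch formats one /etc/fstab line from a filesystem UUID and a mount point. The placeholder UUID, the ext4 filesystem type, and the mount options are assumptions made for illustration and do not come from this guide. Substitute the values that apply to your own RAID device.

```python
def fstab_entry(uuid, mount_point, fs_type="ext4", options="defaults,nofail"):
    """Return one /etc/fstab line that mounts a filesystem by UUID.

    fs_type and options are illustrative defaults, not Vertica requirements.
    """
    if not mount_point.startswith("/"):
        raise ValueError("mount point must be an absolute path")
    # Fields: device, mount point, type, options, dump flag, fsck pass order.
    return f"UUID={uuid} {mount_point} {fs_type} {options} 0 2"


# Placeholder UUID: use the one reported for your RAID filesystem.
print(fstab_entry("00000000-0000-0000-0000-000000000000", "/data"))
```

Mounting by UUID rather than by device name matters here because the RAID device can be renamed after a reboot.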
After you complete the download and extraction, the next section describes how to use the install_vertica script to form a cluster and install the Vertica database software.
2.2.8 - Form a cluster and install Vertica
Use the install_vertica script to combine two or more individual VMs to form a cluster and install the Vertica database.
Before you start
Before you run the install_vertica script:
Check the Virtual Network page for a list of current VMs and their associated private IP addresses.
Identify your storage location. The installer assumes that you have mounted your storage to /vertica/data. To specify another location, such as a directory on your mounted RAID drive, provide it as the value of the --data-dir option when you run the install_vertica script.
Caution
Do not store your data on the root drive.
Combine virtual machines (VMs)
The following example shows how to combine VMs using the install_vertica script.
While connected to your primary node, construct the following command to combine your nodes into a cluster.
Substitute the IP addresses for your VMs and include your root key file name, if applicable.
Include the --point-to-point parameter to configure spread to use direct point-to-point communication between all Vertica nodes, as required for clusters on Azure when installing or updating Vertica.
If you are using Vertica Community Edition, which limits you to three nodes, specify -L CE with no license file.
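The guidance above can be sketched as a small Python helper that assembles the command line. This is a minimal sketch, not the documented command: the installer path /opt/vertica/sbin/install_vertica, the --hosts and --ssh-identity options, the example IP addresses, and the key file path are assumptions to verify against the install_vertica help output for your version. Only --data-dir, --point-to-point, and -L CE are taken from this section.

```python
import shlex


def build_install_command(hosts, data_dir="/vertica/data", ssh_identity=None,
                          license_file=None, community_edition=False):
    """Assemble an install_vertica command line as a list of arguments."""
    if len(hosts) < 2:
        raise ValueError("a cluster needs two or more VMs")
    if community_edition and len(hosts) > 3:
        raise ValueError("Community Edition is limited to three nodes")
    cmd = ["sudo", "/opt/vertica/sbin/install_vertica",  # assumed install path
           "--hosts", ",".join(hosts),                   # assumed option name
           "--data-dir", data_dir,
           "--point-to-point"]  # required for clusters on Azure
    if ssh_identity:
        cmd += ["--ssh-identity", ssh_identity]          # assumed option name
    if community_edition:
        cmd += ["-L", "CE"]  # Community Edition takes no license file
    elif license_file:
        cmd += ["-L", license_file]
    return cmd


# Hypothetical private IP addresses and key file location.
command = build_install_command(
    ["10.0.0.4", "10.0.0.5", "10.0.0.6"],
    ssh_identity="/home/dbadmin/examplekey.pem",
    community_edition=True,
)
print(shlex.join(command))
```

Building the argument list first and joining it with shlex.join keeps the printed command safe to paste into a shell.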
After you combine your nodes, to reduce security risks, keep your key file in a secure place—separate from your cluster—and delete your on-cluster key with the shred command:
$ shred examplekey.pem
Important
You need your key file to perform future Vertica updates.
Reboot your cluster to complete the cluster formation and Vertica installation.
You can create an Eon Mode database on a cluster that is hosted on Azure. In this configuration, your database stores its data communally in Azure Blob storage. See Eon Mode to learn more about this database mode.
Eon Mode databases on Azure support some of the encryption features built into Azure Storage. You can use its encryption at rest feature transparently—you do not need to configure Vertica to take advantage of it. You can use Microsoft-managed or customer-managed keys for storage encryption. Vertica does not support Azure Storage's client-side encryption and encryption using customer-provided keys. See the Azure Data Encryption at rest page in the Azure documentation for more information about the encryption at rest features in Azure Storage.
This section explains how you create an Eon Mode database running on Azure cloud.
2.3.1 - Eon Mode on Azure prerequisites
Before you can create an Eon Mode database on Azure, you must have a database cluster and an Azure blob storage container to store your database's data.
Cluster requirements
Before you can create an Eon Mode database on Azure, you must provision a cluster to host it. See Configuring your Vertica cluster for Eon Mode for suggestions on choosing VM configurations and the number of nodes your cluster should start with.
Storage requirements
An Eon Mode database on Azure stores its data communally in Azure blob storage. Vertica only supports block blob storage for communal data storage, not append or page blob storage.
You must create a storage path for Vertica to use exclusively. This path can be a blob container or a folder within a blob container. This path must not contain any files. If you attempt to create an Eon Mode database with a container or folder that contains files, admintools returns an error.
You pass Vertica a URI for the storage path using the azb:// schema. See Azure Blob Storage object store for the format of this URI.
You must also configure the storage container so Vertica is authorized to access it. Depending on the authentication method you use, you may need to supply Vertica with the credentials to access the container. Vertica can use one of the following methods to authenticate with the blob storage container:
Using Azure managed identities. This authentication method is transparent—you do not need to add any authentication configuration information to Vertica. Vertica automatically uses the managed identity bound to the VMs it runs on to authenticate with the blob storage container. See the Azure AD-managed identities for Azure resources documentation page in the Azure documentation for more information.
If you provide credentials for either of the other two supported authentication methods, Vertica uses them instead of authenticating using a managed identity bound to your VM.
Note
If your Azure VMs have more than one managed identity bound to them, you must tell Vertica which identity to use when authenticating with the blob storage container. Vertica gets the identity to use from a tag set on the VMs that it is running on.
On your VMs, create a tag with its key named VerticaManagedIdentityClientId and its value set to the name of a managed identity bound to your VMs. See the Use tags to organize your Azure resources and management hierarchy page in the Azure documentation for more information.
Using an account name and access key credentials for a service account that has full access to the blob storage container. In this case, you provide Vertica with the credentials when you create the Eon Mode database. See Creating an Authentication File for details.
2.3.2 - Manually creating an Eon Mode database on Azure
Once you have met the cluster and storage requirements for using an Eon Mode database on Azure, you are ready to create an Eon Mode database. Use the admintools create_db tool to create your Eon Mode database.
Creating an authentication file
If your database will use a managed identity to authenticate with the Azure storage container, you do not need to supply any additional configuration information to the create_db tool.
If your database will not use a managed identity, you must supply create_db with authentication information in a configuration file. It must contain at least the AzureStorageCredentials parameter that defines one or more account names and keys Vertica will use to access blob storage. It can also contain an AzureStorageEndpointConfig parameter that defines an alternate endpoint to use instead of the default Azure host name. This option is useful if you are creating a test environment using an Azure storage emulator such as Azurite.
Important
Vertica does not officially support Azure storage emulators as a communal storage location.
The following table defines the values that can be set in these two parameters.
AzureStorageCredentials
Collection of JSON objects, each of which specifies connection credentials for one endpoint. This parameter takes precedence over Azure managed identities.
The collection must contain at least one object and may contain more. Each object must specify at least one of accountName or blobEndpoint, and at least one of accountKey or sharedAccessSignature.
accountName: If not specified, uses the label of blobEndpoint.
blobEndpoint: Host name with optional port (host:port). If not specified, uses account.blob.core.windows.net.
accountKey: Access key for the account or endpoint.
sharedAccessSignature: Access token for finer-grained access control, if being used by the Azure endpoint.
AzureStorageEndpointConfig
Collection of JSON objects, each of which specifies configuration elements for one endpoint. Each object must specify at least one of accountName or blobEndpoint.
accountName: If not specified, uses the label of blobEndpoint.
blobEndpoint: Host name with optional port (host:port). If not specified, uses account.blob.core.windows.net.
protocol: HTTPS (default) or HTTP.
isMultiAccountEndpoint: true if the endpoint supports multiple accounts, false otherwise (default is false). To use multiple-account access, you must include the account name in the URI. If a URI path contains an account, this value is assumed to be true unless explicitly set to false.
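The rules for AzureStorageCredentials objects can be checked before you write the file. The following Python sketch is a hypothetical pre-flight check, not part of Vertica. It only tests that the required fields described above are present, and does not contact Azure.

```python
def validate_credentials(entries):
    """Check a list of AzureStorageCredentials objects against the rules above.

    Returns a list of human-readable problems; an empty list means none found.
    """
    problems = []
    if not entries:
        problems.append("collection must contain at least one object")
    for i, entry in enumerate(entries):
        # Each object must identify the endpoint one way or the other.
        if not ("accountName" in entry or "blobEndpoint" in entry):
            problems.append(f"object {i}: needs accountName or blobEndpoint")
        # Each object must carry a key or a shared access signature.
        if not ("accountKey" in entry or "sharedAccessSignature" in entry):
            problems.append(f"object {i}: needs accountKey or sharedAccessSignature")
    return problems


# The key value is a placeholder, not a real access key.
print(validate_credentials([{"accountName": "mystore", "accountKey": "<access-key>"}]))
print(validate_credentials([{"blobEndpoint": "127.0.0.1:10000"}]))
```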
The authentication configuration file is a text file containing the configuration parameter names and their values. The values are in a JSON format. The name of this file is not important. The following examples use the file name auth_params.conf.
The following example is a configuration file for a storage account hosted on Azure. The storage account name is mystore, and the key value is a placeholder. In your own configuration file, you must provide the storage account's access key. You can find this value by right-clicking the storage account in the Azure Storage Explorer and selecting Copy Primary Key.
The following example shows a configuration file that defines an account for a storage container hosted on the local system using the Azurite storage system. The user account and key are the "well-known" account provided by Azurite by default. Because this configuration uses an alternate storage endpoint, it also defines the AzureStorageEndpointConfig parameter. In addition to reiterating the account name and endpoint definition, this example sets the protocol to the non-encrypted HTTP.
Important
This example wraps the contents of the JSON values for clarity. In an actual configuration file, you cannot wrap these values. They must be on a single line.
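One way to avoid accidentally wrapping the JSON values is to generate the file rather than type it. The following Python sketch assumes the file holds one name=value line per parameter, which is an assumption about the layout to confirm against the Vertica reference. The account key is a placeholder, and the Azurite values (account devstoreaccount1 on 127.0.0.1:10000) are that emulator's defaults.

```python
import json


def render_auth_config(credentials, endpoint_config=None):
    """Return the text of an authentication file, one parameter per line."""
    # json.dumps emits no newlines by default, so each value stays on one line.
    lines = ["AzureStorageCredentials=" + json.dumps(credentials)]
    if endpoint_config:
        lines.append("AzureStorageEndpointConfig=" + json.dumps(endpoint_config))
    return "\n".join(lines) + "\n"


# Storage account hosted on Azure; replace the placeholder with your access key.
azure_text = render_auth_config(
    [{"accountName": "mystore", "accountKey": "<access-key>"}]
)

# Azurite emulator on the local system, using unencrypted HTTP.
azurite_text = render_auth_config(
    [{"accountName": "devstoreaccount1", "blobEndpoint": "127.0.0.1:10000",
      "accountKey": "<azurite-well-known-key>"}],
    [{"accountName": "devstoreaccount1", "blobEndpoint": "127.0.0.1:10000",
      "protocol": "http"}],
)
print(azure_text, end="")
print(azurite_text, end="")
```

Writing either string to auth_params.conf gives a file in which every JSON value occupies a single line.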
Use the admintools create_db tool to create your Eon Mode database. The required arguments you pass to this tool are:
| Argument | Description |
| --- | --- |
| --communal-storage-location | The URI for the storage container Vertica will use for communal storage. This URI must use the azb:// schema. See Azure Blob Storage object store for the format of this URI. |
| -x | The path to the file containing the authentication parameters Vertica needs to access the communal storage location. This argument is only required if your database will use a storage account name and key to authenticate with the storage container. If it is using a managed identity, you do not need to specify this argument. |
| --depot-path | The absolute path to store the depot on the nodes in the cluster. |
| --shard-count | The number of shards for the database. This is an integer number that is usually either a multiple of the number of nodes in your cluster, or an even divisor. See Planning for Scaling Your Cluster for more information. |
| -s | A comma-separated list of the nodes in your database. |
| -d | The name for your database. |
Some other common optional arguments for create_db are:
| Argument | Description |
| --- | --- |
| -l | The absolute path to the Vertica license file to apply to the new database. |
| -p | The password for the new database. |
| --depot-size | The maximum size for the depot. Defaults to 60% of the filesystem containing the depot path. You can specify the size in two ways: integer% (percentage of the filesystem's disk space to allocate) or integer{K\|M\|G\|T} (amount of disk space to allocate for the depot in kilobytes, megabytes, gigabytes, or terabytes). However you specify this value, the depot size cannot be more than 80 percent of the disk space of the file system where the depot is stored. |
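To make the two --depot-size formats and the 80 percent cap concrete, the following Python sketch converts a size specification to bytes for a filesystem of a given size. It is an illustration only: treating K, M, G, and T as powers of 1024 is an assumption, and the function name is hypothetical.

```python
import re

# Assumption: suffixes are binary multiples (1K = 1024 bytes).
UNITS = {"K": 1024, "M": 1024**2, "G": 1024**3, "T": 1024**4}


def depot_size_bytes(spec, filesystem_bytes):
    """Convert a --depot-size value to bytes and enforce the 80 percent cap."""
    match = re.fullmatch(r"(\d+)([%KMGT])", spec)
    if match is None:
        raise ValueError(f"unrecognized depot size: {spec!r}")
    number, suffix = int(match.group(1)), match.group(2)
    if suffix == "%":
        size = filesystem_bytes * number // 100
    else:
        size = number * UNITS[suffix]
    if size > filesystem_bytes * 80 // 100:
        raise ValueError("depot size exceeds 80 percent of the filesystem")
    return size


filesystem = 1000 * 1024**3  # hypothetical 1000 GiB depot filesystem
print(depot_size_bytes("60%", filesystem))   # the default percentage
print(depot_size_bytes("500G", filesystem))  # an absolute size
```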
To view all arguments for the create_db tool, run the command:
admintools -t create_db --help
The following example demonstrates creating an Eon Mode database with the following settings:
Vertica will use a storage account named mystore.
The communal data will be stored in a directory named verticadb located in a storage container named db_blobs.
The authentication information Vertica needs to access the storage container is in the file named auth_params.conf in the current directory. The contents of this file are shown in the first example under Creating an Authentication File.
The hostnames of the nodes in the cluster are node01 through node03.
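A command matching the settings listed above can be sketched with a small Python helper. The argument names come from the tables earlier in this section. The database name verticadb, the depot path /vertica/depot, the shard count of 3, and the azb://account/container/path layout of the URI are assumptions for illustration. Check the URI format against Azure Blob Storage object store.

```python
import shlex


def build_create_db_command(name, communal_location, hosts, depot_path,
                            shard_count, auth_file=None, password=None):
    """Assemble an admintools create_db command line as a list of arguments."""
    if not communal_location.startswith("azb://"):
        raise ValueError("communal storage location must use azb://")
    cmd = ["admintools", "-t", "create_db",
           "-d", name,
           "-s", ",".join(hosts),
           "--communal-storage-location", communal_location,
           "--depot-path", depot_path,
           "--shard-count", str(shard_count)]
    if auth_file:  # omit when authenticating with a managed identity
        cmd += ["-x", auth_file]
    if password:
        cmd += ["-p", password]
    return cmd


command = build_create_db_command(
    "verticadb",                         # hypothetical database name
    "azb://mystore/db_blobs/verticadb",  # assumed account/container/path layout
    ["node01", "node02", "node03"],
    "/vertica/depot",                    # hypothetical depot path
    3,                                   # hypothetical shard count
    auth_file="auth_params.conf",
)
print(shlex.join(command))
```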
Vertica has the following network security group requirements.
For details on security groups and how to create one, see the Azure documentation.
Inbound settings
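| Name | Protocol | Source port range | Destination port range | Source | Destination |
| --- | --- | --- | --- | --- | --- |
| SSH | TCP | | 22 | Any | Any |
| HTTP | TCP | | 80 | Any | Any |
| HTTPS | TCP | | 80 | Any | Any |
| HTTPS | TCP | | 443 | Any | Any |
| DNS (UDP) | UDP | | 53 | Any | Any |
| Spread | UDP | | 4803-4805 | Any | Any |
| Spread | TCP | | 4803-4805 | Any | Any |
| VSQL/SQL | TCP | | 5433 | Any | Any |
| Inter-node communication | TCP | | 5434 | Any | Any |
| | TCP | | 5444 | Any | Any |
| MC | TCP | | 5450 | Any | Any |
| | TCP | | 8080 | Any | Any |
| | TCP | | 48073 | Any | Any |
| rsync | TCP | | 50000 | Any | Any |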
Outbound settings
| Name | Protocol | Source port range | Destination port range | Source | Destination |
| --- | --- | --- | --- | --- | --- |
| All TCP | TCP | | 0-65535 | Any | Any |
| All ICMP | ICMP | | 0-65535 | Any | Any |
| All UDP | UDP | | 0-65535 | Any | Any |
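After you apply the rules, you may want to confirm that the inbound TCP ports are reachable from a client machine. The following Python sketch is one hypothetical way to do that with the standard socket module. It covers only a few of the TCP entries from the inbound settings, and a successful connection also requires that a service is listening on the port.

```python
import socket

# A subset of the inbound TCP ports listed above.
REQUIRED_TCP_PORTS = {22: "SSH", 443: "HTTPS", 5433: "VSQL/SQL", 5450: "MC"}


def tcp_port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_ports(host, ports=REQUIRED_TCP_PORTS):
    """Map each port's rule name to whether the port accepted a connection."""
    return {name: tcp_port_open(host, port) for port, name in ports.items()}
```

For example, calling check_ports with the public IP address of your primary node returns a dictionary such as one mapping "SSH" to True when port 22 accepts connections.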
3 - Vertica on Google Cloud Platform
Welcome to the Vertica on Google Cloud Platform guide.
Vertica provides two templates to help you deploy a Vertica database running in either Enterprise Mode or Eon Mode. See Architecture for more information about these modes.
The following topics describe several deployment methods to run Vertica on Google Cloud Platform.
3.1 - Supported GCP machine types
Vertica Analytic Database supports a range of machine types, each optimized for different workloads. When you deploy your Vertica Analytic Database cluster to the Google Cloud Platform (GCP), different machine types are available depending on how you provision your database.
Note
Some machine types are not available across all regions.
The sections below list the GCP machine types that Vertica supports for Vertica cluster hosts, and for use in Management Console. For details on the configuration of the machine type options, see the Google Cloud documentation's Machine types page.
Machine types available for MC hosts
Vertica supports all N1, N2, E2, M1, M2, and C2 machine types to deploy an instance for running the Vertica Management Console.
Tip
In most cases, 8 vCPUs are sufficient when selecting a machine type for running the Management Console.
Machine types available for Vertica database cluster hosts
Vertica supports all N1, N2, E2, M1, M2, and C2 machine types to deploy cluster hosts.
Machine types for Vertica database cluster hosts provisioned from MC
The table below lists the GCP machine types that Vertica supports when you provision your cluster from Management Console.
| Machine Type | Machine Name |
| --- | --- |
| N1 standard | n1-standard-16, n1-standard-32, n1-standard-64 |
| N1 high-memory | n1-highmem-16, n1-highmem-32, n1-highmem-64 |
| N2 standard | n2-standard-16, n2-standard-32, n2-standard-48, n2-standard-64 |
| N2 high-memory | n2-highmem-16, n2-highmem-32, n2-highmem-48, n2-highmem-64 |
3.2 - Deploy Vertica from the Google cloud marketplace
The Vertica entries in the Google Cloud Launcher Marketplace let you quickly deploy a Vertica cluster in the Google Cloud Platform (GCP). Currently, three entries let you select the database mode and the license you want to use:
The Eon Mode BYOL (bring your own license) launcher deploys a single instance running the MC. You use this MC instance to deploy a Vertica database running on Eon Mode. This database has a community license applied to it initially. You can later upgrade it to a license you have obtained from Vertica. See Deploying an Eon Mode database on GCP for more information.
The Eon Mode BTH (by the hour) launcher also deploys a single instance running the MC that you use to deploy a database. This database has a by-the-hour license applied to it. Instead of paying for a license up front, you pay an hourly fee that covers both Vertica and running your instances. The BTH license is automatically applied to all clusters you create using a BTH MC instance. See Deploying an Eon Mode database on GCP for more information. If you choose, you can upgrade this hourly license to a longer-term license you purchase from Vertica. To move a BTH cluster to a BYOL license, follow the instructions in Moving a cloud installation from by the hour (BTH) to bring your own license (BYOL).
Note
Vertica clusters that use IPv6 to identify hosts have not been tested on GCP. Vertica recommends you use IPv4 addresses to identify the hosts in your cluster on GCP.
3.2.1 - Deploying an Enterprise Mode database in GCP from the marketplace
The Vertica Cloud Launcher solution creates a Vertica Enterprise Mode database. The solution includes the Vertica Management Console (MC) as the primary UI for you to get started.
The launcher automatically creates a database named vdb using the Community Edition (CE) license. The CE license is limited to a maximum of 3 nodes. You can tell the launcher to add more than 3 nodes to your deployment. In this case, it uses the first three nodes in the cluster to create the database. The remaining nodes are not part of the database, but are added to your cluster. To add these nodes to your database, you must replace the Community Edition license with a license key you receive from the Software Entitlement support site. See Managing licenses for more information.
After the launcher creates the initial database, it configures the MC to attach to that database automatically.
Configure the Vertica cloud launcher solution
To get started with a deployment of Vertica from the Google Cloud Launcher, search for the Vertica Data Warehouse, Enterprise Mode entry.
Follow these steps:
Verify that your user account has the Editor role and the runtimeconfig.waiters.getIamPolicy permission.
From the listing page, click LAUNCH.
On the New Vertica Analytics Platform deployment page, enter the following information:
Deployment name: Each deployment must have a unique name. That name is used as the prefix for the names of all VMs created during the deployment. The deployment name can only contain lowercase characters, numbers, and dashes. The name must start with a lowercase letter and cannot end with a dash.
Zone: GCP breaks its cloud data centers into regions and zones. Regions are a collection of zones in the same geographical location. Zones are collections of compute resources, which vary from zone to zone.
For best results, pick the zone in your designated region that supports the latest Intel CPUs. For a complete listing of regions and zones, including supported processors, see Regions and Zones.
Service Account: Service accounts allow automated processes to authenticate with GCP. Select the default service account, identified by project_number-compute@developer.gserviceaccount.com.
Under Vertica Management Console, choose the configuration for the virtual machine that will run the Management Console. The Vertica Analytics Platform in Cloud Launcher always deploys the Vertica Management Console (MC) as part of the solution.
The default machine type for MC is sufficient for most deployments. You can choose another machine type that better suits any additional purposes, such as serving as a target node for backups, data transformation, or additional management tools.
Node count for Vertica Cluster: The total number of VMs you want to deploy in the Vertica Cluster. The default is 3.
Note
As mentioned above, the Cloud Launcher automatically deploys the Vertica Community Edition license, which limits the database to 3 nodes and up to 1 TB in raw data. Any additional nodes will be part of your database cluster, but will not be part of your database.
If you intend to use the Community Edition license for your database, leave the setting at 3. Otherwise, you would add nodes that will sit idle and cost you money without being part of your database.
Machine type for Vertica Cluster nodes: The Cloud Launcher builds each node in the cluster using the same machine type. Modify the machine type for your nodes based on the workloads you expect your database to handle. See Supported GCP machine types for more information.
Data disk type: GCP offers two types of persistent disk storage: Standard and SSD. The costs associated with Standard are less, but the performance of SSD storage is much better. Vertica recommends you use SSD storage. For more information on Standard and SSD persistent disks, see Storage Options.
Disk size in GB: Disk performance is directly tied to the disk size in GCP. The default value of 2000 GB (2 TB) is the minimum disk size for SSD persistent disks that allows maximum throughput.
If you select a smaller disk size, the throughput performance decreases. If you select a larger disk size, the performance remains the same as the 2 TB option.
Network: VMs in GCP must exist on a virtual private cloud (VPC). When you created your GCP account, a default VPC was created. Create additional VPCs to isolate solutions or projects from one another. The Vertica Analytics Platform creates all the nodes in the same VPC.
Subnetwork: Just as a GCP account may have multiple VPCs, each VPC may also have multiple subnets. Use additional subnets to group or isolate solutions within the same VPC.
Firewall: If you want your MC to be accessible via the internet, check the Allow access to the Management Console from the Internet box. Vertica recommends you protect your MC using a firewall that restricts access to just the IP addresses of users that need to access it. You can enter one or more comma-separated CIDR address ranges.
After you have entered all the required information, click Deploy to begin the deployment process.
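Two of the fields above have formats that are easy to check before you click Deploy: the deployment name and the firewall's comma-separated CIDR ranges. The following Python sketch is a hypothetical pre-check that encodes only the rules stated in this section. GCP may enforce additional limits, such as a maximum name length, that it does not cover. The example addresses are documentation ranges, not real hosts.

```python
import ipaddress
import re

# Lowercase letters, digits, and dashes; starts with a letter; no trailing dash.
NAME_PATTERN = re.compile(r"[a-z](?:[a-z0-9-]*[a-z0-9])?")


def valid_deployment_name(name):
    """Return True if name follows the deployment name rules described above."""
    return NAME_PATTERN.fullmatch(name) is not None


def parse_cidr_list(text):
    """Parse comma-separated CIDR ranges, raising ValueError on a bad entry."""
    return [ipaddress.ip_network(part.strip())
            for part in text.split(",") if part.strip()]


print(valid_deployment_name("vertica-prod-1"))
print(valid_deployment_name("vertica-"))
print(parse_cidr_list("203.0.113.0/24, 198.51.100.7/32"))
```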
Monitor the deployment
After the deployment begins, Google Cloud Launcher automatically opens the Deployment Manager page that displays the status of the deployment. Items that are still being processed have a spinning circle to the left of them and the text is a light gray color. Items that have been created are dark gray in color, with an icon designating that resource type on the left.
After the deployment completes, a green check mark appears next to the deployment name in the upper left-hand section of the screen.
Accessing the cluster after deployment
After the deployment completes, the right-hand section of the screen displays the following information:
dbadmin password: A randomly generated password for the dbadmin account on the nodes. For security reasons, change the dbadmin password when you first log in to one of the Vertica cluster nodes.
mcadmin password: A randomly generated password for the mcadmin account for accessing the Management Console. For security reasons, change the mcadmin password after you first log in to the MC.
Vertica Node 1 IP address: The external IP address for the first node in the Vertica cluster is exposed here so that you can connect to the VM using a standard SSH client.
To access the MC, press the Access Vertica MC button in the Get Started section of the dialog box. Copy the mcadmin password and paste it when asked.
There are two ways to access the cluster nodes directly:
Use GCP's integrated SSH shell by selecting the SSH button in the Get Started section. This shell opens a pop-up in your browser that runs GCP's web-based SSH client. You are automatically logged on as the user you authenticated as in the GCP environment.
After you have access to the first Vertica cluster node, execute the su dbadmin command, and authenticate using the dbadmin password.
In addition, use other standard SSH clients to connect directly to the first Vertica cluster node. Use the Vertica Node 1 IP address listed on the screen as the dbadmin user, and authenticate with the dbadmin password.
Follow the on-screen directions to log in using the mcadmin account and accept the EULA. After you've been authenticated, access the initial database by clicking the vdb icon (looks like a green cylinder) in the Recent Databases section.
Using a custom service account
In general, you should use the default service account created by the GCP deployment (project_number-compute@developer.gserviceaccount.com), but if you want to use a custom service account:
The custom service account must have the Editor role.
Individual user accounts must have the Service Account User role on the custom service account.
3.2.2 - Eon Mode databases on GCP
You deploy an Eon Mode database to GCP using Google Cloud Platform Launcher to deploy a Management Console (MC) instance. You then use the MC instance to provision and deploy an Eon Mode database.
3.2.2.1 - GCP Eon Mode instance recommendations
When you use the MC to deploy an Eon Mode database to the Google Cloud Platform (GCP), you choose the instance type to deploy as the database's nodes. The default instance settings in the MC are the more conservative option (currently, n1-standard-16). They are sufficient for most workloads. However, you may choose instances with more memory (such as n1-highmem-16) if your queries perform complex joins that may otherwise spill to disk. You can also choose instances with more cores (such as n1-standard-32) if you perform highly complex, compute-intensive analysis. The following links provide additional information about GCP machine type instances and Vertica:
Machine types: Google Cloud's documentation that describes configuration details for each instance option.
The more powerful the instance you choose, the higher the cost per hour. You need to decide whether to use fewer, higher-powered but more expensive instances or a larger number of lower-powered instances that cost less. Thanks to Eon Mode's elasticity, if you choose the less powerful instances, you can always add more nodes to meet peak demands. When you reduce the number of instances to a minimum during off-peak times, you'll spend less than if you had a similar number of more powerful instances.
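To make this trade-off concrete, the following sketch compares the daily cost of an elastic cluster of smaller instances with a fixed cluster of larger ones. The hourly prices, node counts, and peak hours are invented for illustration and are not GCP list prices; substitute figures from your own billing data.

```python
# Hypothetical hourly prices in USD. These are NOT real GCP prices.
SMALL_PRICE = 0.80   # assumed price of a lower-powered instance
LARGE_PRICE = 1.60   # assumed price of a higher-powered instance


def daily_cost(schedule, price_per_hour):
    """Return the cost of one day given a list of (hours, node_count) periods."""
    return sum(hours * nodes * price_per_hour for hours, nodes in schedule)


# Elastic cluster: 6 small nodes for 8 peak hours, 3 small nodes otherwise.
elastic = daily_cost([(8, 6), (16, 3)], SMALL_PRICE)

# Fixed cluster: 3 large nodes running all day.
fixed = daily_cost([(24, 3)], LARGE_PRICE)

print(f"elastic: ${elastic:.2f}/day, fixed: ${fixed:.2f}/day")
```

Under these assumed numbers the elastic cluster is cheaper, but the result depends entirely on your actual prices and on how long your peak periods last.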
Storage options
The MC's deployment wizard also asks you to select the type of local storage for your instances. You can select different options for each type of local storage that Vertica uses: the catalog, the depot, and temporary space. For all of these storage locations, you choose the type of disks to use (standard vs. SSD). You will see the best performance with SSD disks. However, SSD disks cost more.
For the depot, you also choose whether to use local or persistent disks. The local option is faster, as it resides directly on the virtual machine host. However, whenever you shut down the node, this storage is wiped clean. The persistent storage is slower than the local option, as it is not stored directly on the machine hosting the instance. However, it is not wiped out whenever you shut down the instance. See the Google Cloud documentation's Storage options page for more information.
Which of these options you choose depends on how much depot warming the nodes must perform when starting. If the content of your nodes' depots changes little over time (or you tend to frequently start and stop instances), using persistent storage makes sense. In this case, the depot's warming period will be shorter because most of the data the node needs to participate in queries may still be in its depot when it starts. It will perform fewer fetches of data from communal storage while participating in queries.
If your working data set is rapidly changing or you tend to leave nodes stopped for extended periods of time, your best choice is usually to use local storage. In this scenario, the data in the node's depot when it restarts is usually stale. To participate in queries, the node must fetch much of the data it needs from communal storage, resulting in slower performance until it has warmed its depot. Using local ephemeral storage makes sense here, because you will get the benefit of having faster depot storage. Because your nodes have to warm their depots anyway, there is less of a downside to having the depot on ephemeral storage.
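The reasoning above can be summarized as a small decision helper. This is only a sketch: the function name and its two yes/no inputs are hypothetical simplifications, and real deployments will also weigh cost and performance requirements.

```python
def recommend_depot_storage(data_changes_rapidly, long_stops):
    """Suggest 'local' or 'persistent' disks for the depot.

    data_changes_rapidly: the working data set turns over quickly.
    long_stops: nodes tend to stay stopped for extended periods.
    """
    # In both cases the depot is stale at restart and must be warmed from
    # communal storage anyway, so faster ephemeral disks cost little.
    if data_changes_rapidly or long_stops:
        return "local"
    # Otherwise the depot contents are likely still useful after a restart.
    return "persistent"


print(recommend_depot_storage(data_changes_rapidly=False, long_stops=False))
```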
3.2.2.2 - Eon Mode on GCP prerequisites
Before deploying an Eon Mode database on GCP, you must take several steps:
Review the default service account's permissions for your GCP project.
Create an HMAC key to use when creating your cluster.
Create a communal storage location.
Service account permissions
Service accounts allow automated processes to authenticate with GCP. The Eon Mode database deployment process uses your GCP project's service account to deploy instances. When you create a new project, GCP automatically creates a default service account (identified by project_number-compute@developer.gserviceaccount.com) for the project and grants it the IAM role Editor. See the Google Cloud documentation's Understanding roles for details about this and other IAM roles.
The Editor role lets the service account create resources from the Marketplace. When you create an instance of the Management Console (MC), the MC uses the account to deploy further resources, such as provisioning instances for a database.
To deploy Vertica on GCP, your user account must have the:
Editor role.
runtimeconfig.waiters.getIamPolicy permission.
Creating an HMAC key
Vertica uses a hash-based message authentication code (HMAC) key to authenticate requests to access the communal storage location. This key has two parts: an access ID and a secret. When you create an Eon Mode database in GCP, you provide both parts of an HMAC key for the nodes to use to access communal storage.
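To illustrate why the key has two parts, here is a minimal sketch of HMAC signing in general. It is not the request-signing scheme that Google Cloud Storage actually uses; the request string, access ID, and secret are placeholders. The idea is that the access ID tells the server which secret to look up, while the secret itself is never sent, only a signature derived from it.

```python
import hashlib
import hmac


def sign(secret, message):
    """Return a hex HMAC-SHA256 signature of message under secret."""
    return hmac.new(secret.encode(), message.encode(), hashlib.sha256).hexdigest()


def verify(secret, message, signature):
    """Check a signature using a constant-time comparison."""
    return hmac.compare_digest(sign(secret, message), signature)


# Placeholder values; a real key comes from the Interoperability page.
access_id = "EXAMPLE_ACCESS_ID"
secret = "example-secret"
request = "GET /verticabucket/mydatabase"

signature = sign(secret, request)
print(access_id, signature)
```

Anyone who holds the secret can produce valid signatures, which is why the key must be protected.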
To create an HMAC key:
Log in to your Google Cloud account.
If the name of the project you will use to create your database does not appear in the top banner, click the dropdown and select the correct project.
In the navigation menu in the upper-left corner, under the Storage heading, click Storage and select Settings.
In the Settings page, click Interoperability.
Scroll to the bottom of the page and find the User account HMAC heading.
Unless you have already set a default project, you will see the message stating you haven’t set a default project for your user account yet. Click the Set project-id as default project button to choose the current project as your default for interoperability.
Note
The project ID appears in the button label, not the project name.
Under Access keys for your user account, click Create a key.
Your new access key and secret appear in the HMAC key list. You will need them when you create your Eon Mode database. You can copy them to a handy location (such as a text editor) or leave a browser tab open to this page while you use another tab or window to create your database. These keys remain available on this page, so you do not need to worry about saving them elsewhere.
Caution
It is vital that you protect the security of your HMAC key. It can grant others access to your Eon Mode database's communal storage location. This means they could access all of the data in your database. Do not write the HMAC key anyplace where it may be exposed, such as email, shared folders, or similar insecure locations.
Creating a communal storage location
Your Eon Mode database needs a storage location for its communal storage. Eon Mode databases running on GCP use Google Cloud Storage (GCS) for their communal storage location. When you create your new Eon Mode database, you will supply the MC's wizard with a GCS URL for the storage location.
This location needs to meet the following criteria:
The URL must include at least a bucket name. You can use one or more levels of folders, as well. For example, the following GCS URLs are valid:
gs://verticabucket/mydatabase
gs://verticabucket/databases/mydatabase
gs://verticabucket
Multiple databases can share the same bucket, as long as each has its own folder.
If provided, the lowest-level folder in the URL must not already exist. For example, in the GCS URL gs://verticabucket/databases/mydatabase, the bucket named verticabucket and the directory named databases must exist. The subdirectory named mydatabase must not exist. The Vertica install process expects to create the final folder itself. If the folder already exists, the installation process fails.
The permissions on the bucket must be set to allow the service account read, write, and delete privileges on the bucket. The best role to assign to the user to gain these permissions is Storage Object Admin.
To prevent performance issues, the bucket must be in the same region as all of the nodes running the Eon Mode database.
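The URL rules above can be checked before you start the wizard. The following sketch splits a GCS URL into the bucket, the folders that must already exist, and the final folder that must not exist yet. The function name is hypothetical, and the sketch only parses the URL; it does not contact GCS to confirm what actually exists.

```python
from urllib.parse import urlparse


def split_gcs_url(url):
    """Split a gs:// URL into (bucket, existing_folders, new_folder).

    existing_folders must already exist. new_folder is the lowest-level
    folder, which the install process expects to create itself; it is
    None when the URL names only a bucket.
    """
    parsed = urlparse(url)
    if parsed.scheme != "gs" or not parsed.netloc:
        raise ValueError(f"not a GCS URL with a bucket name: {url!r}")
    folders = [part for part in parsed.path.split("/") if part]
    if not folders:
        return parsed.netloc, [], None
    return parsed.netloc, folders[:-1], folders[-1]


for url in ("gs://verticabucket/databases/mydatabase", "gs://verticabucket"):
    print(url, "->", split_gcs_url(url))
```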
If you create the database through the admintools UI, you must set gcsauth as a bootstrap parameter in admintools.conf. For more information on this and other GCP parameters, see Google Cloud Storage parameters.
[BootstrapParameters]
gcsauth = ID:secret
3.2.2.3 - Deploying an Eon Mode database on GCP
Once you have taken the steps listed in Eon Mode on GCP prerequisites, you are ready to deploy an Eon Mode database in GCP. This process has two steps: deploy a single-node MC instance, then use the MC to provision and deploy a database. The following topics explain these steps.
3.2.2.3.1 - Deploying an MC instance to GCP for Eon Mode
To deploy an MC instance that is able to deploy Eon Mode databases to GCP:
Log into your GCP account, if you are not currently logged in.
Verify that your user account has the Editor role and the runtimeconfig.waiters.getIamPolicy permission.
Verify that the name of the GCP project you want to use for the deployment appears in the top banner. If it does not, click the down arrow next to the project name and select the correct project.
Click the navigation menu icon in the top left of the page and select Marketplace.
In the Search for solutions box, type Vertica Eon Mode and press Enter.
Click the search result for Vertica Data Warehouse, Eon Mode. There are two license options: by the hour (BTH) and bring your own license (BYOL). See Deploy Vertica from the Google cloud marketplace for more information on this license choice.
Click Launch on the license option you prefer.
On the following page, fill in the fields to configure your MC instance:
Deployment name identifies your MC deployment in the GCP Deployments page.
Zone is the location where the virtual machine running your MC instance will be deployed. Make this the same location where your communal storage bucket is located.
Service Account: Service accounts allow automated processes to authenticate with GCP. Select the default service account, identified by project_number-compute@developer.gserviceaccount.com.
Machine Type is the virtual hardware configuration of the instance that will run the MC. The default values here are "middle of the road" settings which are sufficient for most use cases. If you are doing a small proof-of-concept deployment, you can choose a less powerful instance to save some money. If you are planning on deploying multiple large databases, consider increasing the count of virtual CPUs and RAM. For details about Vertica's default volume configurations, see Eon Mode volume configuration defaults for GCP.
User Name for Access to MC is the administrator username for the MC. You can customize this if you want.
Network and Subnetwork are the virtual private cloud (VPC) network and subnet within that network you want your MC instance and your Vertica nodes to use. This setting does not affect your MC's external network address. If you want to isolate your Vertica cluster from other GCP instances in your project, create a custom VPC network and optionally a subnet in your GCP project and select them in these fields. See the Google Cloud documentation's VPC network overview page for more information.
Firewall enables access to the MC from the internet by opening port 5450 in the firewall. You can choose to not open this port by clearing the I accept opening a port in the firewall (5450) for Vertica box. However, if you do not open the port in the firewall, your MC instance will only be accessible from within the VPC network. Not opening the port will make accessing your MC instance much harder.
Source IP ranges for MC traffic: If you choose to open the MC for external access, add one or more CIDR address ranges to this box for the network addresses that you want to be able to access the MC.
Caution
Make the address ranges as limited as possible to reduce the chances of unauthorized access to your MC instance.
Click the Deploy button to start the deployment of your MC instance.
The deployment process will take several minutes.
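When choosing values for the Source IP ranges for MC traffic field, it can help to see how many addresses a CIDR range actually admits. The sketch below uses Python's standard ipaddress module. The 256-address threshold is an arbitrary illustrative cutoff, not a Vertica or GCP requirement, and the addresses come from a range reserved for documentation.

```python
import ipaddress


def describe_range(cidr, max_addresses=256):
    """Return (address_count, is_narrow) for a CIDR range.

    max_addresses is an arbitrary illustrative threshold.
    """
    network = ipaddress.ip_network(cidr, strict=True)
    count = network.num_addresses
    return count, count <= max_addresses


for cidr in ["203.0.113.7/32", "203.0.113.0/24", "0.0.0.0/0"]:
    count, narrow = describe_range(cidr)
    print(f"{cidr}: {count} addresses, {'ok' if narrow else 'too broad'}")
```

A /32 range admits a single address, while 0.0.0.0/0 admits every IPv4 address and effectively removes the restriction.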
Using a custom service account
In general, you should use the default service account created by the GCP deployment (project_number-compute@developer.gserviceaccount.com), but if you want to use a custom service account:
The custom service account must have the Editor role.
Individual user accounts must have the Service Account User role on the custom service account.
Connect and log into the MC instance
After the deployment process is finished, the Deployment Manager page for your MC instance contains links to connect to the MC via your browser or ssh.
To connect to the MC instance:
The MC administrator user has a randomly-generated password that you need to log into the MC. Copy the password in the MC Admin Password field to the clipboard.
Click Access Management Console.
A new browser tab or window opens, showing you a page titled Redirection Notice. Click the link for the MC URL to continue to the MC login page.
Your browser will likely show you a security warning. The MC instance uses a self-signed security certificate. Most browsers treat these certificates as a security hazard because they cannot verify their origin. You can safely ignore this warning and continue. In most browsers, click the Advanced button on the warning page, and select the option to proceed. In Chrome, this is a link titled Proceed to xxx.xxx.xxx.xxx (unsafe). In Firefox, it is a button labeled Accept the Risk and Continue.
At the login screen, enter the MC administrator user name into the Username box. This user name is mcadmin, unless you changed the user name in the MC deployment form.
Paste the automatically-generated password you copied from the MC Admin Password field earlier into the Password box.
Click Log In.
Once you have logged into the MC, change the MC administrator account's password.
Caution
The automatically-generated password appears on the MC instance's deployment page and can be revealed in several locations in the deployment logs. Failure to change this password can lead to unauthorized access to your MC instance.
To change the password:
On the home page of the MC, under the MC Tools section, click MC Settings.
In the left-hand menu, click User Management.
Select the entry for the MC administrator account and click Edit.
Click either the Generate new or Edit password button to change the password. If you click the Generate new button, be sure to save the automatically-generated password in a safe location. If you click Edit password, you are prompted to enter a new password twice.
3.2.2.3.2 - Using the MC to provision and create an Eon Mode database in GCP
After you deploy an MC instance to GCP, use it to deploy an Eon Mode database.
Note
Currently, the admintools menu-based interface does not support creating an Eon Mode database on GCP.
To use the MC to provision and deploy a new Eon Mode database on GCP:
From the MC home screen, click Create new database to launch the Create a Vertica Cluster on Google Cloud wizard.
On the first page of the wizard enter the following information:
Google Cloud Storage HMAC Access Key and HMAC Secret Key: Copy and paste the HMAC access key and secret you created earlier. You can find these values on the Interoperability tab of the Storage Settings page. See Eon Mode on GCP prerequisites for details.
Zone: This value defaults to the zone containing your MC instance. Make this value the same as the zone containing the Google Cloud Storage bucket that your database will use for communal storage.
Caution
You will see significant performance issues if you choose different zones for cluster instances, storage, or the MC.
CIDR Range: The IP address range for clients to whom you want to grant access to your database. Make this range as restrictive as possible to limit access to your database.
Vertica Version: select the desired Vertica database version. You can select from the latest hotfix of recent Vertica releases. For each database version, you can also select the operating system.
Vertica Database User Name: the name of the database superuser. This name defaults to dbadmin, but you can enter another user name here.
Password and Confirm Password: Enter a password for the database superuser account.
Database Size: The number of nodes in your initial database. If you specify more than three nodes here, you must supply a valid Vertica license file in the Vertica License field (below).
Vertica License: Click Browse to locate and upload your Vertica license key file. If you do not supply a license key file here, the wizard deploys your database with a Vertica Community Edition license. This license has a three-node limit, so the value in the Database Size field cannot be larger than 3 if you do not supply a license. If you use a Community Edition license for your deployment, you can upgrade the license later to expand your cluster or load more than 1TB of data. See Managing licenses for more information.
Note
This field does not appear if you created your MC instance using a by-the-hour (BTH) launcher. The BTH license is automatically applied to all clusters you create using a BTH MC instance. For a by-the-hour license, cloud vendors charge the customer for licensed Vertica usage along with their cloud infrastructure charges.
Load example data: Check this box if you want your deployed database to load some example clickstream data. This option is useful if you are testing features and just want some preloaded data in the database to query.
Click Next and supply the following information:
Instance Type: the specifications of the virtual machine instances the MC will use to deploy your database nodes. See the Google Cloud documentation's Machine types page for details of each instance type. Also see GCP Eon Mode instance recommendations.
Database Depot Path and Disk Type: the local mount point for the depot, and the type and number of local disks dedicated to the depot for each node. You cannot change the mount path for the depot. The disks you select in the Disk Type field are only used to store the depot. On the next page of the wizard, you will configure disks for the catalog and temporary disk space. You will see the best performance when using SSD disks, although at a higher cost. You can choose to use faster local storage for your depot. However, local storage is ephemeral—GCP wipes the disk clean whenever you stop the instance. This means each time you start a node, it will have to warm its depot from scratch, rather than taking advantage of any still-current data in its depot. See the Google Cloud documentation's Storage options page for more information about the local disk options.
Volume Size: the amount of disk space available on each disk attached to each node in your cluster. This field shows you the total disk space available per node in your cluster. For the best practices on choosing the amount of disk space for your nodes, see Configuring your Vertica cluster for Eon Mode.
Data Segmentation Shards: sets the number of shards in your database. You cannot change this value after you set it. See Configuring your Vertica cluster for Eon Mode for recommendations. The default value is based on the number of nodes you entered in the Database Size field earlier. It is usually sufficient, unless you anticipate greatly expanding your cluster beyond your initial node count.
Communal Location: a Google Cloud Storage URL that specifies where to store your database's communal data. See Eon Mode on GCP prerequisites for requirements.
Instance IP settings: specify whether the nodes in your database will have static or ephemeral network addresses that are accessible from the internet, or addresses that are only accessible from within the internal virtual network.
Click Next. The wizard validates your communal storage location URL. If there is a problem with the URL you entered, it displays an error message and prompts you to fix the URL.
After your communal storage URL passes validation, fill in the following information:
Database Catalog Path, Disk Type, and Size (GB) per Available Node: the mount point, disk type, and disk size for the local copy of the database catalog on each node. You cannot edit the mount point. You choose the type of local disk to use for the catalog, and its size. You can only choose persistent disk storage for the catalog. SSD drives are faster, but more expensive than standard disks. The default setting for the disk size is adequate for most medium-sized databases. Increase the size if you anticipate maintaining a large database.
Database Temp Path, Disk Type, and Size (GB) per Available Node: the mount point, disk type, and disk size for the temporary storage space on each node. You cannot edit the mount point. You choose the type of local disk to use, and its size. You can only choose persistent disk storage for the temporary disk space. SSD drives are faster, but more expensive than standard disks. The default setting is adequate for most databases. Consider increasing the temporary space if you perform many complex merges that spill to disk.
Label Instances: check this box to enable adding labels to your node's instances. Many organizations use labels to organize, track responsibility, and assign costs for instances. See the Google Cloud documentation's Labeling resources page for more information. If you choose to add labels, enter the label name and value, and click Add.
Click Next. Review the summary of all your database settings. If you need to make a correction, use the Back button to step back to previous pages of the wizard.
When you are satisfied with the database settings, check Accept terms and conditions and click Create.
The process of provisioning and creating the database takes several minutes. After it completes successfully, the MC displays a Get Started button. This button leads to a page of useful links for getting started with your new database.
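Two of the wizard's constraints are easy to check before you start: a database with more than three nodes needs a license file, and the communal storage bucket should be in the same region as the nodes. The sketch below is a hypothetical pre-flight check, not part of the MC. It assumes a BYOL MC instance (a BTH instance applies its license automatically), a single-region bucket, and the usual GCP convention that a zone name is the region name plus a letter suffix, such as us-east1-b.

```python
def zone_to_region(zone):
    """Derive a region name from a zone name, e.g. us-east1-b -> us-east1."""
    region, _, suffix = zone.rpartition("-")
    if not region or not suffix:
        raise ValueError(f"unexpected zone name: {zone!r}")
    return region


def check_wizard_inputs(node_count, has_license, zone, bucket_location):
    """Return a list of problems found in the planned wizard inputs."""
    problems = []
    # Community Edition is limited to three nodes.
    if node_count > 3 and not has_license:
        problems.append("more than three nodes requires a license file")
    # Compare case-insensitively; bucket locations are often upper case.
    if zone_to_region(zone).lower() != bucket_location.lower():
        problems.append("bucket is not in the same region as the nodes")
    return problems


print(check_wizard_inputs(4, False, "us-east1-b", "EUROPE-WEST1"))
```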
3.3 - Manually deploying an Enterprise Mode database on GCP
Before you create your Vertica cluster in Google Cloud Platform (GCP) using manual steps, you must create a virtual machine (VM) instance from the Compute Engine section of GCP.
Configure and launch a new instance
All VM instances that you create should be launched in the same virtual private cloud (VPC).
To configure and launch a new VM instance, follow these instructions:
From within the Compute Engine section of GCP, from the menu on the left-hand side of the screen, select VM Instances.
GCP displays all the VM instances that you have created so far.
Select the CREATE INSTANCE link.
Enter a name for the new instance.
Select the zone where you plan to deploy the instance.
GCP breaks its cloud data centers down by regions and zones. Regions are a collection of zones that are all in the same geographical location. Zones are collections of compute resources, which vary from zone to zone. Always pick the zone in your designated region that supports the latest Intel CPUs.
For a complete listing of regions and zones, including supported processors, see Regions and Zones.
Select a machine type.
GCE offers many different types of VM instances. For best results, only deploy Vertica on VM instances with 8 vCPUs or more and at least 30 GB of RAM.
Select the boot disk (image).
You create VM instances from a public or custom image. If you are starting with Vertica in GCP for the first time, select either the CentOS 7 or RHEL 7 public image. Those images have been tested thoroughly with Vertica.
After you have configured the VM instance to be used as a Vertica cluster node, GCP allows you to convert that instance into a custom image. Doing so allows you to deploy multiple versions of that VM instance; each VM instance is identical except for the node name and IP address.
Before you can connect to any of the VMs you created, you must first identify the external IP address. The VM instance section of GCP contains a list of all currently deployed VMs and their associated external IP addresses.
Connect to your VM
To connect to your VM, complete the following tasks:
Connect to your VM using SSH with the external IP address you created in the configuration steps.
Authenticate using the credentials and SSH key that you provided to your GCP account upon creation.
Connect to other VMs
To connect to other virtual machines in your virtual network:
Use SSH to connect to your publicly connected VM.
Use SSH again from that VM to connect through the private IP addresses of your other VMs.
Because GCP forces the use of private key authentication, you may need to move your key file to the root directory of your publicly connected VM. Then, use SSH to connect to other VMs in your virtual network.
Prepare the virtual machines
After you create your VMs, you need to prepare them for cluster formation.
Add the Vertica license and private key
Prepare your cluster by copying your private key (if you are using one) and your Vertica license to your primary node. The following steps assume that the initial user you configured is the DBADMIN user:
As the DBADMIN user, copy your private key file from where you saved it locally onto your primary node.
Depending upon the procedure you use to copy the file, the permissions on the file may change. If permissions change, the install_vertica script fails with a message similar to the following:
Failed Login Validation 10.0.2.158, cannot resolve or connect to host as root.
If you see the previous failure message, enter the following command to correct permissions on your private key file:
$ chmod 600 /<name-of-key>.pem
Copy your Vertica license to your primary VM. Save it in your home directory or other known location.
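If you script the key copy, you can check and repair the key file's permissions at the same time. The sketch below is a Python equivalent of the chmod 600 command shown above; the function name is hypothetical, and you supply the path to your own key file.

```python
import os
import stat


def ensure_private(path):
    """Restrict a key file to owner read/write (mode 600) if needed.

    Returns True when the mode had to be changed.
    """
    mode = stat.S_IMODE(os.stat(path).st_mode)
    if mode == 0o600:
        return False
    os.chmod(path, 0o600)
    return True
```

For example, calling ensure_private on the copied .pem file before running install_vertica leaves an already-correct file untouched and fixes one whose permissions were widened during the copy.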
Install software dependencies for Vertica on GCP
In addition to the Vertica standard package dependencies, as the root user, you must install the following packages before you install Vertica:
pstack
mcelog
sysstat
dialog
Configure storage
For best disk performance in GCP, Vertica recommends SSD persistent storage, configured to at least 2 TB (2000 GB) in size. Disk performance is directly tied to disk size in GCP: 2 TB (2000 GB) is the minimum size at which SSD persistent disks reach maximum throughput.
Caution
Do not store your information on the root volume, especially in your data and catalog directories. Storing information on the root volume may result in data loss.
After you complete the download and extraction, use the install_vertica script to form a cluster and install the Vertica database software, as described in the next section.
Form a cluster and install Vertica
Use the install_vertica script to combine two or more individual VMs to form a cluster and install your Vertica database.
Before you run the install_vertica script, follow these steps:
Check the VM Instances page of the Compute Engine section on GCP to locate a list of current VMs and their associated internal IP addresses.
Identify your storage location on your VMs. The installer assumes that you have mounted your storage to /home/dbadmin. To specify another location, use the --data-dir argument.
Caution
Do not store your data on the root drive.
The following steps show how to combine virtual machines (VMs) into a cluster using the install_vertica script:
While connected to your primary node, construct the following command to combine your nodes into a cluster.
Substitute the IP addresses for your VMs, and include your root key file name, if applicable.
Include the --point-to-point parameter to configure spread to use direct point-to-point communication among all Vertica nodes, as required for clusters on GCP when installing or updating Vertica.
If you are using Vertica Community Edition, which limits you to three nodes, specify -L CE with no license file.
After you combine your nodes, to reduce security risks, keep your key file in a secure place—separate from your cluster—and delete your on-cluster key with the shred command:
$ shred examplekey.pem
Important
You need your key file to perform future Vertica updates.
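As a rough sketch, the install_vertica command described in the first step above could be assembled as follows. Only --point-to-point, --data-dir, and -L CE come from the steps above. The script path, the --hosts and --ssh-identity option names, and all addresses and file paths are assumptions for illustration; confirm them against install_vertica --help for your version. Other options you may need, such as the path to the Vertica package, are omitted.

```python
import shlex


def build_install_command(hosts, data_dir, license_file=None, key_file=None):
    """Assemble an install_vertica command line as a list of arguments."""
    cmd = [
        "sudo", "/opt/vertica/sbin/install_vertica",  # assumed install path
        "--hosts", ",".join(hosts),                   # assumed option name
        "--point-to-point",          # required for clusters on GCP
        "--data-dir", data_dir,      # keep data off the root volume
    ]
    if key_file:
        cmd += ["--ssh-identity", key_file]           # assumed option name
    # Community Edition: pass -L CE with no license file.
    cmd += ["-L", license_file or "CE"]
    return cmd


# Placeholder addresses and paths.
cmd = build_install_command(
    ["10.0.0.2", "10.0.0.3", "10.0.0.4"],
    data_dir="/vertica/data",
    key_file="/home/dbadmin/examplekey.pem",
)
print(shlex.join(cmd))
```

Building the command as a list and joining it with shlex.join keeps each argument correctly quoted if a path ever contains spaces.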
When you installed Vertica, a database administrator user was created with the DBADMIN role (usually named dbadmin). Use this account to create and start a database.
Bring Your Own License (BYOL): a long-term license that you obtain through an online licensing portal. These deployments also work with a free Community Edition license. Vertica uses a community license automatically if you do not install a license that you purchased. (For more about Vertica licenses, see Managing licenses and Understanding Vertica licenses.)
Vertica by the Hour (BTH): a pay-as-you-go environment where you are charged an hourly fee for both the use of Vertica and the cost of the instances it runs on. The Vertica by the hour deployment offers an alternative to purchasing a term license. If you want to crunch large volumes of data within a short period of time, this option might work better for you. The BTH license is automatically applied to all clusters you create using a BTH MC instance.
If you start out with an hourly license, you can later decide to use a long-term license for your database. The support for an hourly versus a long-term license is built into the instances running your database. To move your database from an hourly license to a long-term license, you must create a new database cluster with a new set of instances.
To move from an hourly to a long-term license, follow these steps:
Moving an Eon Mode database from BTH to BYOL using the command line
Follow these steps to move an Eon Mode database from an hourly to a long-term license.
Obtain a long-term BYOL license from the online licensing portal, described in Obtaining a license key file.
Upload the license file to a node in your database. Note the absolute path in the node's filesystem, as you will need this later when installing the license.
Connect to the node you uploaded the license file to in the previous step.
Connect to your database using vsql and view the licenses table:
=> SELECT * FROM licenses;
Note the name of the hourly license listed in the NAME column, so you can check if it is still present later.
Install the license in the database using the INSTALL_LICENSE function with the absolute path to the license file you uploaded in step 2:
=> SELECT install_license('absolute path to BYOL license');
View the licenses table again:
=> SELECT * FROM licenses;
If only the new BYOL license appears in the table, skip to step 8. If the hourly license whose name you noted in step 4 is still in the table, copy the name and proceed to step 7.
Call the DROP_LICENSE function to drop the hourly license:
=> SELECT drop_license('hourly license name');
You will need the path for your cluster's communal storage in a later step. If you do not already know the path, you can find this information by executing this query:
=> SELECT location_path FROM V_CATALOG.STORAGE_LOCATIONS
WHERE sharing_type = 'COMMUNAL';
Shut down the database by calling the SHUTDOWN function:
=> SELECT SHUTDOWN();
You now need to create a new BYOL cluster onto which you will revive your database. Deploy a new cluster including a new MC instance using a BYOL entry in the marketplace of your chosen cloud platform. See:
Your new BYOL cluster must have the same number of primary nodes as your existing hourly license cluster.
Revive your database onto the new cluster. For instructions, see Reviving an Eon Mode database cluster. Because you created the new cluster using a BYOL entry in the marketplace, the database uses the BYOL license you applied earlier.
After reviving the database on your new BYOL cluster, terminate the instances for your hourly license cluster and MC. For instructions, see your cloud provider's documentation.
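The check in steps 6 and 7 above, deciding whether the hourly license still needs to be dropped, can be expressed as a small helper when you automate the procedure. This is a sketch only: the function name and the license names are hypothetical, and it builds the statement text without connecting to a database.

```python
def drop_statement(hourly_name, current_names):
    """Return a DROP_LICENSE statement if the hourly license is still listed.

    hourly_name: the NAME value noted before installing the BYOL license.
    current_names: NAME values from the licenses table after installing it.
    Returns None when only the new license remains.
    """
    if hourly_name not in current_names:
        return None
    # Double any single quotes so the name is a valid SQL string literal.
    escaped = hourly_name.replace("'", "''")
    return f"SELECT drop_license('{escaped}');"


# Hypothetical license names for illustration.
print(drop_statement("hourly_example", ["byol_example", "hourly_example"]))
```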
Moving an Eon Mode database from BTH to BYOL using the MC
Follow this procedure to move to BYOL and revive your database using MC:
Purchase a long-term BYOL license from the online licensing portal, following the steps detailed in Obtaining a license key file. Save the file to a location on your computer.
You now need to install the new license on your database. Log into MC and click your database in the Recent Databases list.
At the bottom of your database's Overview page, click the License tab.
Under the Installed Licenses list, note the name of the BTH license in the License Name column. You will need this later to check whether it is still present after installing the new long-term license.
In the ribbon at the top of the License History page, click the Install New License button. The Settings: License page opens.
Click the Browse button next to the Upload a new license box.
Locate the license file you obtained in step 1, and click Open.
Click the Apply button on the top right of the page.
Select the checkbox to agree to the EULA terms and click OK.
After Vertica installs the license, click the Close button.
Click the License tab at the bottom of the page.
If only the new long-term license appears in the Installed Licenses list, skip to Step 16. If the by-the-hour license also appears in the list, copy down its name from the License Name column.
You must drop the by-the-hour license before you can proceed. At the bottom of the page, click the Query Execution tab.
In the query editor, enter the following statement:
SELECT DROP_LICENSE('hourly license name');
Click Execute Query. The query should complete with a message indicating that the license has been dropped.
You will need the path for your cluster's communal storage in a later step. If you do not already know the path, you can find this information by executing this query in the Query Execution tab:
SELECT location_path FROM V_CATALOG.STORAGE_LOCATIONS
WHERE sharing_type = 'COMMUNAL';
After reviving the database on your new environment, terminate the instances for your hourly license environment. To do so, on the AWS CloudFormation Stacks page, select the hourly environment's stack (its collection of AWS resources) and click Actions > Delete Stack.
Moving an Enterprise Mode database from hourly to BYOL using backup and restore
Note
Currently, AWS is the only platform supported for Enterprise Mode databases using hourly licenses.
In an Enterprise Mode database, follow this procedure to move to BYOL, and then back up and restore your database:
Obtain a long-term BYOL license from the online licensing portal, described in Obtaining a license key file.
Upload the license file to a node in your database. Note the absolute path in the node's filesystem, as you will need this later when installing the license.
Connect to the node you uploaded the license file to in the previous step.
Connect to your database using vsql and view the licenses table:
=> SELECT * FROM licenses;
Note the name of the hourly license listed in the NAME column, so you can check if it is still present later.
Install the license in the database using the INSTALL_LICENSE function with the absolute path to the license file you uploaded in step 2:
=> SELECT install_license('absolute path to BYOL license');
View the licenses table again:
=> SELECT * FROM licenses;
If only the new BYOL license appears in the table, skip to step 8. If the hourly license whose name you noted in step 4 is still in the table, copy the name and proceed to step 7.
Call the DROP_LICENSE function to drop the hourly license:
=> SELECT drop_license('hourly license name');
Restore the database from the backup you created earlier. See Backing up and restoring the database. When you restore the database, it will use the BYOL you loaded earlier.
After restoring the database on your new environment, terminate the instances for your hourly license environment. To do so, on the AWS CloudFormation Stacks page, select the hourly environment's stack (its collection of AWS resources) and click Actions > Delete Stack.
After completing one of these procedures, see Viewing your license status to confirm the license drop and install were successful.
5 - Adjusting Spread Daemon timeouts for virtual environments
You may see Vertica nodes leave the database even though they are still running. This issue can happen on networks that are prone to spikes in latency or in virtual environments where a node's VM may be paused for a short period of time. You can adjust a setting in Vertica to help prevent this issue from occurring.
Vertica relies on spread daemons to pass messages between database nodes. When a node fails to respond to a spread message after a timeout period, Vertica assumes the node is down and starts to remove it from the database.
The default Spread timeout depends on the number of configured Spread segments:
 Configured Spread segments | Default timeout
----------------------------+-----------------
 1                          | 8 seconds
 > 1                        | 25 seconds
If network delays or temporary pauses of a VM last longer than the spread timeout period, you may see UP nodes leave the database. In these cases, you can increase the spread timeout to reduce or eliminate instances where UP nodes leave the database.
Azure's memory-preserving updates and spread timeouts
In Azure, you might see running nodes leave the database due to scheduled maintenance. Azure's maintenance downtime is usually well-defined. For example, Azure's memory-preserving updates can pause a VM for up to 30 seconds while performing maintenance on the system hosting the VM. This pause does not disrupt the node. It continues normal operation once Azure resumes it. See the Azure documentation's topic on Maintenance for virtual machines in Azure for more information about updates. If Azure pauses a node for longer than the spread timeout period, Vertica interprets the node's inability to respond to a spread message as the node going down, even though it will resume running normally.
Note
If you deploy your Vertica cluster using the Azure Marketplace, the spread timeout defaults to 35 seconds. If you manually create your cluster in Azure, the spread timeout defaults to 8 or 25 seconds, as described earlier.
Setting the spread timeout
When you know your network or nodes may be unable to respond for a specific amount of time, you can increase the spread timeout period to longer than this time. Adjust the timeout to the period of time the node may be unable to respond, plus an additional 5 seconds as a safety margin.
For example, if you know Azure's memory-preserving maintenance can pause your VMs for up to 30 seconds, set the spread timeout to 35 seconds.
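As a worked example of this rule, the arithmetic can be checked directly in vsql. This is only a sketch: the 30-second pause is the Azure figure cited above, so substitute the longest pause you expect in your own environment.

```sql
-- (maximum expected pause in seconds + 5-second safety margin) * 1000 ms per second
SELECT (30 + 5) * 1000 AS token_timeout_ms;
-- Returns 35000. SET_SPREAD_OPTION expects this value as a string, in milliseconds.
```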
If you do not know exactly how long network or node disruptions can last, you can try increasing the spread timeout gradually, until you see reduced instances of UP nodes leaving the database. Be as conservative with this setting as you can.
Important
Vertica cannot react to a node going down or being shut down improperly before the timeout period has elapsed. Setting the spread timeout too high can result in longer query restarts if a node goes down.
You can see the current setting of the spread timeout by querying the system table SPREAD_STATE:
=> SELECT * FROM V_MONITOR.SPREAD_STATE;
You change the spread timeout by calling the meta-function SET_SPREAD_OPTION to set the token timeout to a new value. The value is passed as a string and sets the timeout in milliseconds.
Important
Changing spread settings with SET_SPREAD_OPTION has a minor impact on your cluster, which pauses while the new settings are propagated across the entire cluster.
This example sets the timeout to 35 seconds (35000ms):
=> SELECT SET_SPREAD_OPTION( 'TokenTimeout', '35000');
NOTICE 9003: Spread has been notified about the change
SET_SPREAD_OPTION
--------------------------------------------------------
Spread option 'TokenTimeout' has been set to '35000'.
(1 row)
=> SELECT * FROM V_MONITOR.SPREAD_STATE;
node_name | token_timeout
------------------+---------------
v_vmart_node0001 | 35000
v_vmart_node0002 | 35000
v_vmart_node0003 | 35000
(3 rows)
Note
The changes you make to the spread timeout might not take effect immediately. It might take some time before you see the new setting in the system table V_MONITOR.SPREAD_STATE.
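One way to check whether the change has propagated is to look for nodes that still report a different value. This is a minimal sketch, not a documented procedure. It assumes that token_timeout is a numeric column holding milliseconds, as the sample output above suggests, and it uses the 35000 ms value from the example. An empty result would mean that every node reports the new timeout.

```sql
-- List any nodes that have not yet picked up the new timeout.
-- Assumption: 35000 is the value you passed to SET_SPREAD_OPTION.
SELECT node_name, token_timeout
FROM V_MONITOR.SPREAD_STATE
WHERE token_timeout <> 35000;
```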