Vertica on Amazon Web Services
Vertica clusters on AWS can operate on EC2 instances automatically provisioned using a CloudFormation Template (CFT), or manually deployed using Amazon Machine Images (AMIs). For information about these deployment methods, see Deploy Vertica using CloudFormation templates and Manually deploy Vertica on AWS.
You can deploy a Vertica database on AWS running in either Enterprise Mode or Eon Mode. The differences between these two modes lie in their architecture, deployment, and scalability:
- Enterprise Mode stores data locally on the nodes in the database.
- Eon Mode stores its data in an S3 bucket.
Eon Mode separates the computational processes from the communal storage layer of your database. This separation lets you elastically vary the number of nodes in your database cluster to adjust to varying workloads.
Vertica also supports the following AWS features:
- Enhanced Networking: Vertica recommends that you use AWS enhanced networking for optimal performance. For more information, see Enabling Enhanced Networking on Linux Instances in a VPC in the AWS documentation.
- Command Line Interface: Use the AWS Command Line Interface (CLI) with your Vertica AMIs. For more information, see What Is the AWS Command Line Interface?.
- Elastic Load Balancing: Use elastic load balancing (ELB) for queries up to one hour. When enabling ELB, configure the timer to 3600 seconds. For more information, see Elastic Load Balancing in the AWS documentation.
For more information about Amazon cluster instances and their limitations, see the Amazon documentation.
1 - Supported AWS instance types
Vertica supports a range of Amazon Web Services instance types, each optimized for different purposes. Choose the instance type that best matches your requirements. The two tables below list the AWS instance types that Vertica supports for Vertica cluster hosts, and for use in MC. For more information, see the Amazon Web Services documentation on instance types and volumes.
Important
If you plan to use an Amazon Machine Image (AMI) on multiple AWS accounts, make sure to subscribe to the image on all your accounts. This allows you to access an image even when it is delisted from the AWS Marketplace.
Instance types for Vertica cluster hosts
Each Amazon EC2 Instance type natively provides one of the following storage options:
- Elastic Block Store (EBS) provides durable storage: Data files stored on the instance persist after the instance is stopped.
- Instance Store provides temporary storage: Data files stored on the instance are lost when the instance is stopped.
Vertica AMIs can use either the Instance Metadata Service Version 1 (IMDSv1) or the Instance Metadata Service Version 2 (IMDSv2) to authenticate to AWS services, including S3.
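The difference between the two metadata service versions can be sketched in a few lines. This is a minimal illustration, not Vertica code: it only builds the HTTP requests, using the standard AWS metadata endpoint and header names. With IMDSv2 a session token is requested first and sent with every metadata request; with IMDSv1 the metadata request carries no token. The helper names are hypothetical.

```python
import urllib.request
from typing import Optional

# Link-local address of the EC2 instance metadata service.
IMDS_BASE = "http://169.254.169.254/latest"


def build_token_request(ttl_seconds: int = 21600) -> urllib.request.Request:
    """IMDSv2 step 1: a PUT that asks the metadata service for a session token."""
    return urllib.request.Request(
        f"{IMDS_BASE}/api/token",
        method="PUT",
        headers={"X-aws-ec2-metadata-token-ttl-seconds": str(ttl_seconds)},
    )


def build_metadata_request(path: str, token: Optional[str] = None) -> urllib.request.Request:
    """Build a GET for a metadata path.

    With a token the request follows IMDSv2; without one it is an IMDSv1 request.
    """
    headers = {"X-aws-ec2-metadata-token": token} if token else {}
    return urllib.request.Request(f"{IMDS_BASE}/meta-data/{path}", headers=headers)


def fetch(request: urllib.request.Request, timeout: float = 2.0) -> str:
    """Send a request; this only succeeds when run on an EC2 instance."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode()
```

On an EC2 instance, `fetch(build_token_request())` returns a token that can be passed to `build_metadata_request("iam/security-credentials/", token)` to list the role whose temporary credentials the instance can use.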
For more information about storage configuration in AWS, see Configure storage.
Note
Instance types that support EBS volumes support encryption.
Optimization | Instance Types Using Only EBS Volumes (Durable) | Instance Types Using Instance Store Volumes (Temporary)
General purpose | m4.4xlarge, m4.10xlarge, m5.4xlarge, m5.8xlarge, m5.12xlarge | m5d.4xlarge, m5d.8xlarge, m5d.12xlarge
Compute | c4.4xlarge, c4.8xlarge, c5.4xlarge, c5.9xlarge, c6i.4xlarge, c6i.8xlarge, c6i.12xlarge, c6i.16xlarge, c6i.24xlarge, c6i.32xlarge | c3.4xlarge, c3.8xlarge, c5d.4xlarge, c5d.9xlarge
Memory | r4.4xlarge, r4.8xlarge, r4.16xlarge, r5.4xlarge, r5.8xlarge, r5.12xlarge, r6i.4xlarge, r6i.8xlarge, r6i.12xlarge, r6i.16xlarge, r6i.24xlarge, r6i.32xlarge | r3.4xlarge, r3.8xlarge, r5d.4xlarge, r5d.8xlarge, r5d.12xlarge
Storage | | d2.4xlarge, d2.8xlarge, i3.4xlarge, i3.8xlarge, i3.16xlarge, i3en.3xlarge, i3en.6xlarge, i3en.12xlarge, i4i.4xlarge, i4i.8xlarge, i4i.12xlarge, i4i.16xlarge, i4i.24xlarge
Note
By default, the c4.8xlarge, d2.8xlarge, and m4.10xlarge instances have their processor C-states set to a value of 1 in the Vertica AMI. This measure is meant to improve performance by limiting the sleep states that an instance running Vertica uses.
For more information about sleep states, visit the AWS Documentation.
Instance types available for MC hosts
Optimization | Type | Supports EBS Storage (Durable) | Supports Ephemeral Storage (Temporary)
Computing | c4.large | Yes | No
Computing | c4.xlarge | Yes | No
Computing | c5.large | Yes | No
Computing | c5.xlarge | Yes | No
Choosing AWS Eon Mode instance types
When running an Eon Mode database in AWS, choose instance types that support ephemeral instance storage or EBS volumes for your depot, depending on cost and availability. Vertica recommends either r4 or i3 instances for production clusters. It is not mandatory to have an EBS-backed depot, because in Eon Mode, a copy of the data is safely stored in communal storage. However, you must have an EBS-backed catalog for Eon Mode databases.
The following table compares instances with ephemeral instance storage against instances with EBS-only storage to help you choose between them. Check with Amazon Web Services for the latest cost per hour.
Important
If you select instances that use instance store and later terminate those instances, you can lose data. For Eon Mode, MC displays an alert that warns of the potential data loss when you terminate instances that support instance store.
Storage Type | Instance Type | Pros/Cons
Instance storage | i3.8xlarge | Instance storage offers better performance than EBS attached storage through multiple EBS volumes. Instance storage can be striped (RAIDed) together to increase throughput and load balance I/O. Data stored in instance-store volumes is not persistent through instance stops, terminations, or hardware failures.
EBS-only storage | r4.8xlarge with 600 GB EBS volume attached | Newer instance types from AWS have only the EBS option. In most AWS regions, it's easier to provision a large number of instances. You can terminate an instance but keep the EBS volume for a faster revive. Preserving the EBS volume preserves the depot. Some of the cached files might have become stale, but they are ignored and evicted. Much of the cached data is not stale, which saves time when the node revives and warms its depot. You can also take advantage of full-volume encryption.
For more information about Amazon cluster instances and their limitations, see Manage Clusters in the Amazon Web Services documentation.
2 - AWS authentication
Amazon defines two ways to control access to AWS resources such as S3: IAM roles and the combination of id, secrets, and (optionally) session tokens. For long-term access to non-communal storage buckets, you should use IAM roles for access control centralization. You do not need to change your application's configuration if you want to change its access settings. You just alter the IAM role applied to your EC2 instances.
However, for one-time tasks like backing up and restoring the database or loading data to and from non-communal storage buckets, you should use an AWS access key.
Vertica uses both of these authentication methods to support different features and use cases:
- An Eon Mode database's access to S3 for communal and catalog storage must always use IAM role authentication. IAM roles are the default access control method for AWS resources. Vertica uses this method if you do not configure the legacy access control session parameters.
- Individual users can read data from S3 storage locations other than the ones Vertica uses for communal storage. For example, users can use COPY to load data into Vertica from an S3 bucket or query an external table stored on S3. If the IAM role assigned to the Vertica nodes does not have access to this external S3 data, the user must set an id, secret, and optionally an access token in session variables to authorize access to it. These session variables override the IAM role set on the server. See S3 parameters for a list of these session parameters.
- Individual users can export data to S3 using file export. File export cannot use IAM authorization. Users who want to export data to S3 must set id, secret, and optionally access token values in session variables.
Important
If the database is running in Eon Mode, using id and secret authentication is more complex. In addition to having access to the external S3 data, any id that a user sets must be authorized to read from and write to the S3 storage locations that Vertica uses to store communal and catalog data. The queries that the user executes use this id for all storage requests, not just those for accessing external S3 data. If the id does not have access to the catalog and communal storage, the user cannot execute queries.
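As a rough sketch of the session-level approach described above, the following helper builds the SQL statements a user might run before a COPY or export. The parameter names `AWSAuth` and `AWSSessionToken` are assumptions taken to match the S3 parameters reference; confirm them there before relying on this. The helper name and the sample values are hypothetical.

```python
from typing import List, Optional


def s3_session_statements(access_key_id: str,
                          secret_key: str,
                          session_token: Optional[str] = None) -> List[str]:
    """Return ALTER SESSION statements that set S3 credentials for one session.

    Assumes AWSAuth takes an 'id:secret' pair and AWSSessionToken takes the
    optional token (parameter names assumed, not confirmed by this page).
    """
    def quote(value: str) -> str:
        # Double any single quotes so the value is a valid SQL string literal.
        return "'" + value.replace("'", "''") + "'"

    statements = [
        f"ALTER SESSION SET AWSAuth = {quote(access_key_id + ':' + secret_key)};"
    ]
    if session_token:
        statements.append(
            f"ALTER SESSION SET AWSSessionToken = {quote(session_token)};"
        )
    return statements
```

Because these values override the IAM role for every storage request in the session, in Eon Mode the id must also be authorized for the communal and catalog locations.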
Configuring an IAM role
To configure an IAM role that grants Vertica access to AWS resources, you must:
- Create an IAM role to allow EC2 instances to access the specific resources.
- Grant that role permission to access your resources.
- Attach this IAM role to each EC2 instance in the Vertica cluster.
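The first two steps above come down to two JSON policy documents: a trust policy that lets EC2 assume the role, and a permissions policy that names the resources. The sketch below builds both as Python dictionaries. The S3 actions and the bucket name are illustrative assumptions, not the exact set Vertica requires; the roles defined in the Vertica CloudFormation Templates are the authoritative example.

```python
import json


def ec2_trust_policy() -> dict:
    """Trust policy: allows EC2 instances to assume the role (step 1)."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": "ec2.amazonaws.com"},
                "Action": "sts:AssumeRole",
            }
        ],
    }


def s3_bucket_policy(bucket: str) -> dict:
    """Permissions policy for one bucket (step 2); the action list is illustrative."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": f"arn:aws:s3:::{bucket}",
            },
            {
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:PutObject", "s3:DeleteObject"],
                "Resource": f"arn:aws:s3:::{bucket}/*",
            },
        ],
    }


if __name__ == "__main__":
    # "example-vertica-bucket" is a hypothetical bucket name.
    print(json.dumps(ec2_trust_policy(), indent=2))
    print(json.dumps(s3_bucket_policy("example-vertica-bucket"), indent=2))
```

The third step, attaching the role, is done per instance through an instance profile in the EC2 console or API.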
To see an example of IAM roles for a Vertica cluster, look at the roles defined in one of the CloudFormation Templates provided by Vertica. You can download these templates from any of the Vertica entries in the AWS Marketplace. Under each entry's Usage Information section, click the View CloudFormation Template link, then click Download CloudFormation Template.
For more information about IAM roles, see IAM Roles for Amazon EC2 in the AWS documentation.
3 - Deploy Vertica using CloudFormation templates
Vertica provides CloudFormation Templates (CFTs) on the AWS Marketplace that allow you to get a cluster up and running quickly. Using the template allows you to automatically provision your AWS resources and launch a Vertica cluster and Management Console, with minimal configuration required.
If you prefer to deploy a VPC, instances, and related resources manually, see Manually deploy Vertica on AWS.
For details about creating an Eon Mode or Enterprise Mode database after you create a cluster with CFTs, see Amazon Web Services in MC.
3.1 - CloudFormation template (CFT) overview
With Vertica on AWS, use CloudFormation Templates (CFTs) to easily manage provisioning the AWS resources with a running Vertica system. After you provide a few parameters to the template, you can create a stack to automatically provision the AWS resources for your Vertica system.
To access Vertica CFTs, go to the AWS Marketplace.
CFT licensing models
Licensing models for CFTs are:
- Bring Your Own License (BYOL): By default, a free Community Edition (CE) license is installed, limited to 3 nodes and 1 TB. To add nodes or capacity, you can purchase the Vertica BYOL license.
Outside of the BYOL license on CFTs, you can also access the Community Edition without a license file:
- By the Hour: A pay-as-you-go model where you pay for only the number of hours you use for each node. One advantage of using the Paid Listing is that all charges appear on your AWS bill. This model offers an alternative to purchasing a full Vertica license and eliminates the need to compute potential storage needs in advance.
CFT prerequisites
Before you can deploy Vertica on AWS using CloudFormation Templates (CFTs), verify that you have:
- An AWS account with permissions to create a VPC, subnet, security group, EC2 instances, and IAM roles. (For more information about AWS accounts, see the AWS documentation.)
- An Amazon key pair for SSH access to an EC2 instance. (See the AWS documentation for key pairs.)
Supported CFTs and Vertica offerings
Available Vertica CFTs are:
- Management Console with 3 Vertica nodes: The easiest way to deploy Vertica. This CFT deploys an Eon Mode database by default. However, this environment can also be used to create an Enterprise Mode database. For more information, see Creating a database.
- Deploy Management Console into new VPC: This CFT deploys all required AWS resources and installs the Vertica Management Console (MC). After stack creation completes, log in to the MC to provision a Vertica database cluster.
- Deploy Management Console into existing VPC: This CFT deploys the Vertica Management Console (MC) in an already-existing VPC and subnet. After stack creation completes, the MC is available. Log in to MC to provision either a Vertica database cluster or an Eon Mode database cluster.
For this CFT, you must first set up the VPC, subnet, and related network resources. For more information about the correct configuration of these resources for Vertica, see the following topics in the AWS documentation:
Using the license models and supported CFTs, you can deploy the following Vertica products:
See Deploy MC and AWS resources with a CloudFormation template for information on deploying these products.
3.2 - Creating a Virtual Private Cloud
A Vertica cluster on AWS must be logically located in the same network. This is similar to placing the nodes of an on-premises cluster within the same network. Create a virtual private cloud (VPC) to ensure the nodes in your cluster will be able to communicate with each other within AWS.
Create a single public subnet VPC with the following configurations:
Note
A Vertica cluster must be operated within a single availability zone.
For more information about VPCs, including how to create one, see the AWS documentation.
3.3 - Deploy MC and AWS resources with a CloudFormation template
You can deploy Management Console (MC) and its associated AWS resources using CloudFormation templates (CFTs) that are available through the AWS Marketplace. For a list of available CFTs, see CloudFormation template (CFT) overview.
Complete the following to deploy the Vertica MC and related resources in AWS:
- Log in to the AWS Marketplace with an AWS account (see the Prerequisites section above).
- Search for "Vertica" in the AWS Marketplace.
- Select a Vertica CFT. Each CFT leads you to a product overview page, with pricing estimates. (Also see CloudFormation template (CFT) overview for an overview of available templates and products).
- Click Continue to Subscribe.
- On the next page, select your launch settings based on your requirements for deployment.
- If you have not agreed to Vertica EULA terms on the AWS Marketplace before, click Accept Software Terms to subscribe.
- Click Launch with CloudFormation Console. The CloudFormation Console opens.
- The CloudFormation Console automatically supplies the URL in the Specify an Amazon S3 template URL field. Click Next.
- Follow the CloudFormation workflow and enter the parameters (collectively called a stack).
Note
Important: Take note of the username and password you set for Management Console during this step. You cannot recover or reset these credentials after you create the stack.
- After confirming the details you have provided for your new stack, click Create. The AWS console brings you to the Stacks page, where you can view the progress of the creation process. The process takes several minutes.
- The Outputs tab displays information about accessing your environment after the process completes.
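The console workflow above collects a template URL and a set of parameter values. For readers who script stack creation instead, the sketch below assembles the same inputs in the shape that the CloudFormation CreateStack API expects. It does not call AWS. The parameter keys and the template URL in the usage note are hypothetical, since the real keys are defined by each Vertica template, and the `CAPABILITY_IAM` acknowledgement is an assumption based on the templates creating IAM roles.

```python
from typing import Dict, List


def stack_request(stack_name: str,
                  template_url: str,
                  parameters: Dict[str, object]) -> dict:
    """Assemble keyword arguments in the shape CreateStack expects."""
    parameter_list: List[dict] = [
        {"ParameterKey": key, "ParameterValue": str(value)}
        for key, value in sorted(parameters.items())
    ]
    return {
        "StackName": stack_name,
        "TemplateURL": template_url,
        "Parameters": parameter_list,
        # Assumption: needed when the template creates IAM roles.
        "Capabilities": ["CAPABILITY_IAM"],
    }
```

For example, `stack_request("vertica-mc", "https://example.com/template.json", {"KeyName": "my-key"})` returns a dictionary that could be passed as keyword arguments to an SDK call such as boto3's `create_stack`. Record the Management Console credentials you supply, because they cannot be recovered after the stack is created.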
Next, access the Management Console (MC) to deploy your cluster instances and create a database, as described in Access Management Console.
3.4 - Access Management Console
Complete the following steps to access Management Console on your deployed AWS resources:
- On the AWS CloudFormation Stacks page, select your new stack and view the Outputs tab. This tab provides information about accessing your environment, as well as documentation and licensing resources.
- In the ManagementConsole row, select the URL in the Value column to open the MC login page.
- To log in, enter the MC username and password that you created using the CloudFormation Console.
After login, MC displays the home page, with options to provision a new cluster or database or import existing ones. If you chose a CFT that also creates a database, your new database is also displayed on the home page.
This page also provides a Resources section with links to online training, blogs, community, and help resources.
You have successfully launched and connected to Management Console on AWS resources.
If you have not yet provisioned a Vertica cluster and database, complete the steps in one of the following:
4 - Manually deploy Vertica on AWS
Vertica provides tested and pre-configured Amazon Machine Images (AMIs) to deploy cluster hosts or MC hosts on AWS. When you create an EC2 instance on AWS using a Vertica AMI, the instance includes the Vertica software and the recommended configuration. The Vertica AMI acts as a template, requiring fewer configuration steps.
This section will guide you through configuring your network settings on AWS, launching and preparing EC2 instances using the Vertica AMI, and creating a Vertica cluster on those EC2 instances.
Choose this method of installation if you are familiar with configuring AWS and have many specific AWS configuration needs. To automatically deploy AWS resources and a Vertica cluster instead, see Deploy Vertica using CloudFormation templates.
4.1 - Configure your network
Before you deploy your cluster, you must configure the network on which Vertica will run. Vertica requires a number of specific network configurations to operate on AWS. You may also have specific network configuration needs beyond the default Vertica settings.
Important
You can create a Vertica database that uses IPv6 for internal communications running on AWS. However, if you do so, you must identify the hosts in your cluster using IP addresses rather than host names. The AWS DNS resolution service is incompatible with IPv6.
The following sections explain which Amazon EC2 features you need to configure for instance creation.
4.1.1 - Create a placement group, key pair, and VPC
Part of configuring your network for AWS is to create the following:
Create a placement group
A placement group is a logical grouping of instances in a single Availability Zone. Placement groups are required for clusters, and all Vertica nodes must be in the same placement group.
Vertica recommends placement groups for applications that benefit from low network latency, high network throughput, or both. To provide the lowest latency, and the highest packet-per-second network performance for your Placement Group, choose an instance type that supports enhanced networking.
For information on creating placement groups, see Placement Groups in the AWS documentation.
Create a key pair
You need a key pair to access your instances using SSH. Create the key pair using the AWS interface and store a copy of your key (*.pem) file on your local machine. When you access an instance, you need to know the local path of your key.
Use a key pair to:
For information on creating a key pair, see Amazon EC2 Key Pairs in the AWS documentation.
Create a virtual private cloud (VPC)
You create a Virtual Private Cloud (VPC) on Amazon so that you can create a network of your EC2 instances. Your instances in the VPC all share the same network and security settings.
A Vertica cluster on AWS must be logically located in the same network. Create a VPC to ensure the nodes in your cluster can communicate with each other in AWS.
Create a single public subnet VPC with the following configurations:
Note
A Vertica cluster must be operated in a single availability zone.
For information on creating a VPC, see Create a Virtual Private Cloud (VPC) in the AWS documentation.
4.1.2 - Network ACL settings
Vertica requires the following basic network access control list (ACL) settings on an AWS instance running the Vertica AMI. Vertica recommends that you secure your network with additional ACL settings that are appropriate to your situation.
Inbound Rules
Type | Protocol | Port Range | Use | Source | Allow/Deny
SSH | TCP (6) | 22 | SSH (Optional: for access to your cluster from outside your VPC) | User Specific | Allow
Custom TCP Rule | TCP (6) | 5450 | MC (Optional: for MC running outside of your VPC) | User Specific | Allow
Custom TCP Rule | TCP (6) | 5433 | SQL Clients (Optional: for access to your cluster from SQL clients) | User Specific | Allow
Custom TCP Rule | TCP (6) | 50000 | Rsync (Optional: for backup outside of your VPC) | User Specific | Allow
Custom TCP Rule | TCP (6) | 1024-65535 | Ephemeral Ports (Needed if you use any of the above) | User Specific | Allow
ALL Traffic | ALL | ALL | N/A | 0.0.0.0/0 | Deny
Outbound Rules
Type | Protocol | Port Range | Use | Source | Allow/Deny
Custom TCP Rule | TCP (6) | 0–65535 | Ephemeral Ports | 0.0.0.0/0 | Allow
You can use the entire port range specified in the previous table, or find your specific ephemeral ports by entering the following command:
$ cat /proc/sys/net/ipv4/ip_local_port_range
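To compare the host's ephemeral range with an ACL rule programmatically, the same file can be read and parsed. The following is a small sketch; the function names are hypothetical, and the file exists only on Linux hosts.

```python
from typing import Tuple

PORT_RANGE_FILE = "/proc/sys/net/ipv4/ip_local_port_range"


def parse_port_range(text: str) -> Tuple[int, int]:
    """Parse the two whitespace-separated integers the kernel reports."""
    low, high = (int(field) for field in text.split())
    if not 0 <= low <= high <= 65535:
        raise ValueError(f"invalid port range: {low}-{high}")
    return low, high


def read_local_port_range(path: str = PORT_RANGE_FILE) -> Tuple[int, int]:
    """Read the ephemeral port range of the current Linux host."""
    with open(path) as handle:
        return parse_port_range(handle.read())


def covered_by(acl_range: Tuple[int, int], local_range: Tuple[int, int]) -> bool:
    """True when the ACL port range fully contains the host's ephemeral range."""
    return acl_range[0] <= local_range[0] and local_range[1] <= acl_range[1]
```

For example, `covered_by((1024, 65535), read_local_port_range())` checks whether the inbound ephemeral-port rule from the table above already covers the ports this host uses.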
For detailed information on network ACLs within AWS, refer to Network ACLs in the Amazon documentation.
For detailed information on ephemeral ports within AWS, refer to Ephemeral Ports in the Amazon documentation.
4.1.3 - Configure TCP keepalive with AWS network load balancer
AWS supports three types of elastic load balancers (ELBs):
Vertica strongly recommends the AWS Network Load Balancer (NLB), which provides the best performance with your Vertica database. The Network Load Balancer acts as a proxy between clients (such as JDBC) and Vertica servers. The Classic and Application Load Balancers do not work with Vertica, in Enterprise Mode or Eon Mode.
To avoid timeouts and hangs when connecting to Vertica through the NLB, it is important to understand how AWS NLB handles idle timeouts for connections. For the NLB, AWS sets the idle timeout value to 350 seconds and you cannot change this value. The timeout applies to both connection points.
For a long-running query, if either the client or the server fails to send a timely keepalive, that side of the connection is terminated. This can lead to situations where a JDBC client hangs waiting for results that are never returned because the server failed to send a keepalive within 350 seconds.
To identify an idle timeout/keepalive issue, run a query like this via a client such as JDBC:
=> SELECT SLEEP(355);
If there’s a problem, one of the following situations occurs:
- The client connection terminates before 355 seconds. In this case, lower the JDBC keepalive setting so that keepalives are sent less than 350 seconds apart.
- The client connection doesn’t return a result after 355 seconds. In this case, you need to adjust the server keepalive settings (tcp_keepalive_time and tcp_keepalive_intvl) so that keepalives are sent less than 350 seconds apart.
You can adjust the keepalive settings on the server, or you can adjust them in Vertica.
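The rule in this section is that both the idle time before the first keepalive and the interval between keepalives must stay under the 350-second NLB idle timeout. The sketch below checks that rule and applies equivalent settings to an ordinary TCP client socket. It is a generic socket illustration, not the JDBC driver or Vertica configuration; the default values of 300 and 60 seconds are example choices, and the per-socket TCP options are applied only where the platform exposes them.

```python
import socket

NLB_IDLE_TIMEOUT_S = 350  # fixed idle timeout of the AWS Network Load Balancer


def keepalive_beats_timeout(idle_s: int, interval_s: int,
                            limit_s: int = NLB_IDLE_TIMEOUT_S) -> bool:
    """True when keepalives are always sent less than limit_s seconds apart."""
    return 0 < idle_s < limit_s and 0 < interval_s < limit_s


def enable_keepalive(sock: socket.socket, idle_s: int = 300,
                     interval_s: int = 60, probes: int = 5) -> None:
    """Turn on TCP keepalive for one socket with timings under the NLB limit."""
    if not keepalive_beats_timeout(idle_s, interval_s):
        raise ValueError("keepalive timings must be below the NLB idle timeout")
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # These per-socket options exist on Linux; skip any the platform lacks.
    for name, value in (("TCP_KEEPIDLE", idle_s),
                        ("TCP_KEEPINTVL", interval_s),
                        ("TCP_KEEPCNT", probes)):
        option = getattr(socket, name, None)
        if option is not None:
            sock.setsockopt(socket.IPPROTO_TCP, option, value)
```

The common Linux default for tcp_keepalive_time is 7200 seconds, so `keepalive_beats_timeout(7200, 75)` returns `False`, which corresponds to the second failure case described above.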
For detailed information about AWS Network Load Balancers, see the AWS documentation.
4.1.4 - Create and assign an internet gateway
When you create a VPC, an Internet gateway is automatically assigned to it. You can use that gateway, or you can assign your own. If you are using the default Internet gateway, continue with the procedure described in Create a security group.
Otherwise, create an Internet gateway specific to your needs. Associate that Internet gateway with your VPC and subnet.
For information about how to create an Internet Gateway, see Internet Gateways in the AWS documentation.
4.1.5 - Assign an elastic IP address
An elastic IP address is an unchanging IP address that you can use to connect to your cluster externally. Vertica recommends you assign a single elastic IP to a node in your cluster. You can then connect to other nodes in your cluster from your primary node using their internal IP addresses dictated by your VPC settings.
Create an elastic IP address. For information, see Elastic IP Addresses in the AWS documentation.
4.1.6 - Create a security group
The Vertica AMI has specific security group requirements. When you create a Virtual Private Cloud (VPC), AWS automatically creates a default security group and assigns it to the VPC. You can use the default security group, or you can name and assign your own.
Create and name your own security group using the following basic security group settings. You may make additional modifications based on your specific needs.
Inbound
Type | Use | Protocol | Port Range | IP
SSH | | TCP | 22 | The CIDR address range of administrative systems that require SSH access to the Vertica nodes. Make this range as restrictive as possible. You can add multiple rules for separate network ranges, if necessary.
DNS (UDP) | | UDP | 53 | Your private subnet address range (for example, 10.0.0.0/24).
Custom UDP | Spread | UDP | 4803 and 4804 | Your private subnet address range (for example, 10.0.0.0/24).
Custom TCP | Spread | TCP | 4803 | Your private subnet address range (for example, 10.0.0.0/24).
Custom TCP | VSQL/SQL | TCP | 5433 | The CIDR address range of client systems that require access to the Vertica nodes. This range should be as restrictive as possible. You can add multiple rules for separate network ranges, if necessary.
Custom TCP | Inter-node Communication | TCP | 5434 | Your private subnet address range (for example, 10.0.0.0/24).
Custom TCP | | TCP | 5444 | Your private subnet address range (for example, 10.0.0.0/24).
Custom TCP | MC | TCP | 5450 | The CIDR address of client systems that require access to the management console. This range should be as restrictive as possible. You can add multiple rules for separate network ranges, if necessary.
Custom TCP | Rsync | TCP | 50000 | Your private subnet address range (for example, 10.0.0.0/24).
ICMP | Installer | Echo Reply | N/A | Your private subnet address range (for example, 10.0.0.0/24).
ICMP | Installer | Traceroute | N/A | Your private subnet address range (for example, 10.0.0.0/24).
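The TCP and UDP rows of the inbound table can be expressed as data and converted into the `IpPermissions` structure that the EC2 security group API accepts. The sketch below is one way to do that under stated assumptions: the administrative and client CIDR ranges are hypothetical placeholders from documentation address space, the private subnet is the example value from the table, and the two ICMP rows are omitted for brevity. It does not call AWS.

```python
import ipaddress
from typing import List, Tuple

PRIVATE_SUBNET = "10.0.0.0/24"    # example value from the table
ADMIN_CIDR = "203.0.113.0/28"     # hypothetical administrative range
CLIENT_CIDR = "198.51.100.0/24"   # hypothetical client range

# (protocol, first port, last port, source CIDR, description)
Rule = Tuple[str, int, int, str, str]

INBOUND_RULES: List[Rule] = [
    ("tcp", 22, 22, ADMIN_CIDR, "SSH"),
    ("udp", 53, 53, PRIVATE_SUBNET, "DNS"),
    ("udp", 4803, 4804, PRIVATE_SUBNET, "Spread"),
    ("tcp", 4803, 4803, PRIVATE_SUBNET, "Spread"),
    ("tcp", 5433, 5433, CLIENT_CIDR, "VSQL/SQL"),
    ("tcp", 5434, 5434, PRIVATE_SUBNET, "Inter-node communication"),
    ("tcp", 5444, 5444, PRIVATE_SUBNET, "Port 5444"),
    ("tcp", 5450, 5450, CLIENT_CIDR, "MC"),
    ("tcp", 50000, 50000, PRIVATE_SUBNET, "Rsync"),
]


def to_ip_permissions(rules: List[Rule]) -> List[dict]:
    """Convert rule tuples into the IpPermissions list used by the EC2 API."""
    permissions = []
    for protocol, from_port, to_port, cidr, description in rules:
        ipaddress.ip_network(cidr)  # raises ValueError on a malformed CIDR
        permissions.append({
            "IpProtocol": protocol,
            "FromPort": from_port,
            "ToPort": to_port,
            "IpRanges": [{"CidrIp": cidr, "Description": description}],
        })
    return permissions
```

Keeping the rules as data makes it easy to review that no inbound rule uses 0.0.0.0/0 before the list is passed to an SDK call such as boto3's `authorize_security_group_ingress`.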
Note
In Management Console (MC), the Java IANA discovery process uses port 7 once to detect if an IP address is reachable before the database import operation. Vertica tries port 7 first. If port 7 is blocked, Vertica switches to port 22.
Outbound
Type | Protocol | Port Range | Destination | IP
All TCP | TCP | 0-65535 | Anywhere | 0.0.0.0/0
All ICMP | ICMP | 0-65535 | Anywhere | 0.0.0.0/0
All UDP | UDP | 0-65535 | Anywhere | 0.0.0.0/0
For information about what a security group is, as well as how to create one, see Amazon EC2 Security Groups for Linux Instances in the AWS documentation.
4.2 - Deploy AWS instances for your Vertica database cluster
After you Configure your network, you can create AWS instances and deploy Vertica. Follow these procedures to deploy and run Vertica on AWS.
4.2.1 - Configure and launch an instance
After you configure your network settings on AWS, configure and launch the instances where you will install Vertica. An Elastic Compute Cloud (EC2) instance without a Vertica AMI is similar to a traditional host. Just like with an on-premises cluster, you must prepare and configure your cluster and network at the hardware level before you can install Vertica.
When you create an EC2 instance on AWS using a Vertica AMI, the instance includes the Vertica software and the recommended configuration. Vertica recommends that you use the Vertica AMI unmodified. The Vertica AMI acts as a template, requiring fewer configuration steps:
- Choose a Vertica AMI operating system.
- Configure EC2 instances.
- Add storage to instances.
- Optionally, configure EBS volumes as a RAID array.
- Set the security group and S3 access.
- Launch instances and verify they are running.
OpenText provides Vertica and Management Console AMIs on the Red Hat Enterprise Linux 8 operating system.
You can use the AMI to deploy MC hosts or cluster hosts. For more information, see the AWS Marketplace.
- Select a Vertica AMI from the AWS Marketplace. For instance type recommendations for Eon Mode databases, see Choosing AWS Eon Mode instance types.
- Select the desired fulfillment method.
- Configure the following:
Add storage to instances
Consider the following issues when you add storage to your instances:
-
Add a number of drives equal to the number of physical cores in your instance—for example, for a c3.8xlarge instance, 16 drives; for an r3.4xlarge, 8 drives.
-
Do not store your information on the root volume.
-
Amazon EBS provides durable, block-level storage volumes that you can attach to running instances. For guidance on selecting and configuring an Amazon EBS volume type, see Amazon EBS Volume Types.
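The drive-count guidance above can be written as a one-line calculation. This sketch assumes two hardware threads per physical core, which matches the two examples given (a c3.8xlarge has 32 vCPUs and an r3.4xlarge has 16); check the instance type's actual threads-per-core value before applying it elsewhere. The function name is hypothetical.

```python
def recommended_drive_count(vcpus: int, threads_per_core: int = 2) -> int:
    """Return one drive per physical core, derived from the vCPU count."""
    if vcpus <= 0 or threads_per_core <= 0:
        raise ValueError("vcpus and threads_per_core must be positive")
    if vcpus % threads_per_core:
        raise ValueError("vcpus must be a multiple of threads_per_core")
    return vcpus // threads_per_core


if __name__ == "__main__":
    print(recommended_drive_count(32))  # c3.8xlarge example from the list above
    print(recommended_drive_count(16))  # r3.4xlarge example from the list above
```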
You can configure your EBS volumes into a RAID 0 array to improve disk performance. Before doing so, use the vioperf utility to determine whether the performance of the EBS volumes is fast enough without using them in a RAID array. Pass vioperf the path to a mount point for an EBS volume. In this example, an EBS volume is mounted on a directory named /vertica/data:
[dbadmin@ip-10-11-12-13 ~]$ /opt/vertica/bin/vioperf /vertica/data
The minimum required I/O is 20 MB/s read and write per physical processor core on
each node, in full duplex i.e. reading and writing at this rate simultaneously,
concurrently on all nodes of the cluster. The recommended I/O is 40 MB/s per
physical core on each node. For example, the I/O rate for a server node with 2
hyper-threaded six-core CPUs is 240 MB/s required minimum, 480 MB/s recommended.
Using direct io (buffer size=1048576, alignment=512) for directory "/vertica/data"
test | directory | counter name | counter | counter | counter | counter | thread | %CPU | %IO Wait | elapsed | remaining
| | | value | value (10 | value/core | value/core | count | | | time (s)| time (s)
| | | | sec avg) | | (10 sec avg) | | | | |
--------------------------------------------------------------------------------------------------------------------------------------------------------
Write | /vertica/data | MB/s | 259 | 259 | 32.375 | 32.375 | 8 | 4 | 11 | 10 | 65
Write | /vertica/data | MB/s | 248 | 232 | 31 | 29 | 8 | 4 | 11 | 20 | 55
Write | /vertica/data | MB/s | 240 | 234 | 30 | 29.25 | 8 | 4 | 11 | 30 | 45
Write | /vertica/data | MB/s | 240 | 233 | 30 | 29.125 | 8 | 4 | 13 | 40 | 35
Write | /vertica/data | MB/s | 240 | 233 | 30 | 29.125 | 8 | 4 | 13 | 50 | 25
Write | /vertica/data | MB/s | 240 | 232 | 30 | 29 | 8 | 4 | 12 | 60 | 15
Write | /vertica/data | MB/s | 240 | 238 | 30 | 29.75 | 8 | 4 | 12 | 70 | 5
Write | /vertica/data | MB/s | 240 | 235 | 30 | 29.375 | 8 | 4 | 12 | 75 | 0
ReWrite | /vertica/data | (MB-read+MB-write)/s| 237+237 | 237+237 | 29.625+29.625 | 29.625+29.625 | 8 | 4 | 22 | 10 | 65
ReWrite | /vertica/data | (MB-read+MB-write)/s| 235+235 | 234+234 | 29.375+29.375 | 29.25+29.25 | 8 | 4 | 20 | 20 | 55
ReWrite | /vertica/data | (MB-read+MB-write)/s| 234+234 | 235+235 | 29.25+29.25 | 29.375+29.375 | 8 | 4 | 20 | 30 | 45
ReWrite | /vertica/data | (MB-read+MB-write)/s| 233+233 | 234+234 | 29.125+29.125 | 29.25+29.25 | 8 | 4 | 18 | 40 | 35
ReWrite | /vertica/data | (MB-read+MB-write)/s| 233+233 | 234+234 | 29.125+29.125 | 29.25+29.25 | 8 | 4 | 20 | 50 | 25
ReWrite | /vertica/data | (MB-read+MB-write)/s| 234+234 | 235+235 | 29.25+29.25 | 29.375+29.375 | 8 | 3 | 19 | 60 | 15
ReWrite | /vertica/data | (MB-read+MB-write)/s| 233+233 | 236+236 | 29.125+29.125 | 29.5+29.5 | 8 | 4 | 21 | 70 | 5
ReWrite | /vertica/data | (MB-read+MB-write)/s| 232+232 | 236+236 | 29+29 | 29.5+29.5 | 8 | 4 | 21 | 75 | 0
Read | /vertica/data | MB/s | 248 | 248 | 31 | 31 | 8 | 4 | 12 | 10 | 65
Read | /vertica/data | MB/s | 241 | 236 | 30.125 | 29.5 | 8 | 4 | 15 | 20 | 55
Read | /vertica/data | MB/s | 240 | 232 | 30 | 29 | 8 | 4 | 10 | 30 | 45
Read | /vertica/data | MB/s | 240 | 232 | 30 | 29 | 8 | 4 | 12 | 40 | 35
Read | /vertica/data | MB/s | 240 | 234 | 30 | 29.25 | 8 | 4 | 12 | 50 | 25
Read | /vertica/data | MB/s | 238 | 235 | 29.75 | 29.375 | 8 | 4 | 15 | 60 | 15
Read | /vertica/data | MB/s | 238 | 232 | 29.75 | 29 | 8 | 4 | 13 | 70 | 5
Read | /vertica/data | MB/s | 238 | 238 | 29.75 | 29.75 | 8 | 3 | 9 | 75 | 0
SkipRead | /vertica/data | seeks/s | 22909 | 22909 | 2863.62 | 2863.62 | 8 | 0 | 6 | 10 | 65
SkipRead | /vertica/data | seeks/s | 21989 | 21068 | 2748.62 | 2633.5 | 8 | 0 | 6 | 20 | 55
SkipRead | /vertica/data | seeks/s | 21639 | 20936 | 2704.88 | 2617 | 8 | 0 | 7 | 30 | 45
SkipRead | /vertica/data | seeks/s | 21478 | 20999 | 2684.75 | 2624.88 | 8 | 0 | 6 | 40 | 35
SkipRead | /vertica/data | seeks/s | 21381 | 20995 | 2672.62 | 2624.38 | 8 | 0 | 5 | 50 | 25
SkipRead | /vertica/data | seeks/s | 21310 | 20953 | 2663.75 | 2619.12 | 8 | 0 | 5 | 60 | 15
SkipRead | /vertica/data | seeks/s | 21280 | 21103 | 2660 | 2637.88 | 8 | 0 | 8 | 70 | 5
SkipRead | /vertica/data | seeks/s | 21272 | 21142 | 2659 | 2642.75 | 8 | 0 | 6 | 75 | 0
If the EBS volume read and write performance (the entries with Read and Write in column 1 of the output) is greater than 20MB/s per physical processor core (columns 6 and 7), you do not need to configure the EBS volumes as a RAID array to meet the minimum requirements to run Vertica. You may still consider configuring your EBS volumes as a RAID array if the performance is less than the optimal 40MB/s per physical core (as is the case in this example).
Note
If your EC2 instance has hyper-threading enabled, vioperf may incorrectly count the number of cores in your system. The 20MB/s throughput per core requirement applies only to physical cores, not virtual cores. If your EC2 instance has hyper-threading enabled, divide the counter value (column 4 in the output) by the number of physical cores. For the number of physical cores in each instance type, see the CPU Cores and Threads Per CPU Core Per Instance Type section in the AWS documentation topic Optimizing CPU Options.
If you determine you need to configure your EBS volumes as a RAID 0 array, see the AWS documentation topic RAID Configuration on Linux for the steps you need to take.
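The per-core check described above is simple arithmetic, so a small shell helper can make it repeatable. The following is a minimal sketch and is not part of vioperf: the function name classify_throughput is hypothetical, and it assumes you supply the counter value (column 4 of the output) and the number of physical cores yourself.

```shell
# Hypothetical helper, not shipped with Vertica.
# Usage: classify_throughput COUNTER_MBPS PHYSICAL_CORES
# Divides the vioperf counter value by the physical core count and compares
# the result with the 20 MB/s minimum and 40 MB/s recommended rates.
classify_throughput() {
  awk -v mbps="$1" -v cores="$2" 'BEGIN {
    if (cores <= 0) { print "invalid core count"; exit 1 }
    per_core = mbps / cores
    if (per_core >= 40)      verdict = "meets recommended"
    else if (per_core >= 20) verdict = "meets minimum, below recommended"
    else                     verdict = "below minimum"
    printf "%.2f MB/s per core: %s\n", per_core, verdict
  }'
}
```

For example, running classify_throughput 240 8 with the steady-state Write value from the sample output reports 30.00 MB/s per core, which meets the minimum but is below the recommended rate.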
Security group and access
-
Choose between your previously configured security group or the default security group.
-
Configure S3 access for your nodes by creating and assigning an IAM role to your EC2 instance. See AWS authentication for more information.
4.2.2 - Connect to an instance
Using your private key, take these steps to connect to your cluster through the instance to which you attached an elastic IP address:
-
As the dbadmin user, type the following command, substituting your ssh key:
$ ssh -i <ssh key> dbadmin@elasticipaddress
-
Select Instances from the Navigation panel.
-
Select the instance that is attached to the Elastic IP.
-
Click Connect.
-
On Connect to Your Instance, choose one of the following options:
-
A Java SSH Client directly from my browser: Add the path to your private key in the field Private key path, and click Launch SSH Client.
-
Connect with a standalone SSH client: Follow the steps required by your standalone SSH client.
Connect to an instance from Windows using PuTTY
If you connect to the instance from the Windows operating system, and plan to use PuTTY:
-
Convert your key file using PuTTYgen.
-
Connect with PuTTY or WinSCP (connect via the elastic IP), using your converted key (i.e., the *.ppk
file).
-
Move your key file (the *.pem
file) to the root directory using PuTTY or WinSCP.
4.2.3 - Prepare instances for cluster formation
After you create your instances, you need to prepare them for cluster formation. Prepare your instances by adding your AWS .pem
key and your Vertica license.
By default, each AMI includes a Community Edition license. Once Vertica is installed, you can find the license at this location:
/opt/vertica/config/licensing/vertica_community_edition.license.key
-
As the dbadmin user, copy your *.pem
file (from where you saved it locally) onto your primary instance.
Depending upon the procedure you use to copy the file, the permissions on the file may change. If permissions change, the install_vertica
script fails with a message similar to the following:
FATAL (19): Failed Login Validation 10.0.3.158, cannot resolve or connect to host as root.
If you receive a failure message, enter the following command to correct permissions on your *.pem
file:
$ chmod 600 /<name-of-pem>.pem
-
Copy your Vertica license over to your primary instance, placing it in your home directory or other known location.
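Because a copy can silently change the key's permissions, it can help to check them before you run install_vertica. The following is a minimal sketch under stated assumptions: the function name fix_pem_permissions is hypothetical, and the check relies on the permission string that ls -ld prints rather than on any Vertica tool.

```shell
# Hypothetical helper, not shipped with Vertica.
# Usage: fix_pem_permissions /path/to/key.pem
# Resets the key to owner read/write only (600) if it has any other mode,
# then prints the resulting permission string.
fix_pem_permissions() {
  pem_file="$1"
  if [ ! -f "$pem_file" ]; then
    echo "no such file: $pem_file" >&2
    return 1
  fi
  # The first ten characters of ls -ld are the file type and mode bits.
  mode=$(ls -ld "$pem_file" | cut -c1-10)
  if [ "$mode" != "-rw-------" ]; then
    chmod 600 "$pem_file"
  fi
  ls -ld "$pem_file" | cut -c1-10
}
```

A result of -rw------- indicates that the key has the same permissions that chmod 600 sets.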
4.2.4 - Change instances on AWS
You can change instance types on AWS. For example, you can downgrade a c3.8xlarge instance to c3.4xlarge. See Supported AWS instance types for a list of valid AWS instances.
When you change AWS instances you may need to:
-
Reconfigure memory settings
-
Reset memory size in a resource pool
-
Reset number of CPUs in a resource pool
If you change to an AWS instance type that requires a different amount of memory, you may need to recompute the following and then reset the values:
Note
You may need root user permissions to reset these values.
Reset memory size in a resource pool
If you used absolute memory in a resource pool, you may need to reconfigure the memory using the MEMORYSIZE parameter in ALTER RESOURCE POOL.
Note
If you set memory size as a percentage when you created the original resource pool, you do not need to change it here.
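As an illustration of recomputing an absolute memory size, the sketch below scales a pool's memory in proportion to the new instance's total memory and prints the matching ALTER RESOURCE POOL statement. The proportional policy, the function name resize_pool_sql, and the pool name batch_pool are all assumptions made for this example, not Vertica recommendations. Review the printed statement before running it in your SQL client.

```shell
# Hypothetical helper; proportional scaling is an assumed policy.
# Usage: resize_pool_sql POOL_NAME OLD_POOL_MB OLD_INSTANCE_MB NEW_INSTANCE_MB
# Prints an ALTER RESOURCE POOL statement with the rescaled MEMORYSIZE.
resize_pool_sql() {
  pool_name="$1"
  old_pool_mb="$2"
  old_instance_mb="$3"
  new_instance_mb="$4"
  if [ "$old_instance_mb" -le 0 ]; then
    echo "old instance memory must be positive" >&2
    return 1
  fi
  # Integer arithmetic: keep the pool's share of total instance memory.
  new_pool_mb=$(( old_pool_mb * new_instance_mb / old_instance_mb ))
  printf "ALTER RESOURCE POOL %s MEMORYSIZE '%sM';\n" "$pool_name" "$new_pool_mb"
}
```

For example, resize_pool_sql batch_pool 8192 61440 30720 halves an 8192 MB pool when the instance memory drops from 61440 MB to 30720 MB.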
Reset number of CPUs in a resource pool
If your new instance requires a different number of CPUs, you may need to reset the CPUAFFINITYSET parameter in ALTER RESOURCE POOL.
4.2.5 - Configure storage
Vertica recommends that you store information — especially your data and catalog directories — on dedicated Amazon EBS volumes formatted with a supported file system. The /opt/vertica/sbin/configure_software_raid.sh
script automates the storage configuration process.
Caution
Do not store information on the root volume because it might result in data loss.
Vertica performance tests Eon Mode with a per-node EBS volume of up to 2TB. For best performance, combine multiple EBS volumes into a RAID 0 array.
For more information about RAID 0 arrays and EBS volumes, see RAID configuration on Linux.
Determining volume names
Because the storage configuration script requires the volume names that you want to configure, you must identify the volumes on your machine. The following command lists the contents of the /dev
directory. Search for the volumes that begin with xvd
:
$ ls /dev
Important
Ignore the root volume. Do not include any of your root volumes in the RAID creation process.
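To avoid scanning the full /dev listing by eye, you can filter it. The following is a minimal sketch: the function name list_xvd_volumes is hypothetical, and it assumes you already know the name of your root device (shown here as xvda, which may differ on your instance) so that it can be excluded.

```shell
# Hypothetical helper, not shipped with Vertica.
# Usage: list_xvd_volumes [DEV_DIR] [ROOT_DEVICE_NAME]
# Prints every entry in DEV_DIR (default /dev) whose name begins with xvd,
# skipping the root device and its partitions when a root name is given.
list_xvd_volumes() {
  dev_dir="${1:-/dev}"
  root_name="${2:-}"
  for path in "$dev_dir"/xvd*; do
    # An unmatched glob stays literal, so skip anything that does not exist.
    [ -e "$path" ] || continue
    if [ -n "$root_name" ]; then
      case "${path##*/}" in
        "$root_name"*) continue ;;
      esac
    fi
    printf '%s\n' "$path"
  done
}
```

For example, list_xvd_volumes /dev xvda prints the candidate volumes while leaving out xvda and partitions such as xvda1.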
Combining volumes for storage
The configure_software_raid.sh
shell script combines your EBS volumes into a RAID 0 array.
Caution
Run configure_software_raid.sh
in the default setting only if you have a fresh configuration with no existing RAID settings.
If you have existing RAID settings, open the script in a text editor and manually edit the raid_dev
value to reflect your current RAID settings. If you have existing RAID settings and you do not edit the script, the script deletes important operating system device files.
Alternatively, use Management Console (MC) to add storage nodes without unwanted changes to operating system device files. For more information, see Managing database clusters.
The following steps combine your EBS volumes into RAID 0 with the configure_software_raid.sh
script:
-
Edit the /opt/vertica/sbin/configure_software_raid.sh
shell file as follows:
-
Comment out the safety exit
command at the beginning of the script.
-
Change the sample volume names to your own volume names, which you noted previously. Add more volumes, if necessary.
-
Run the /opt/vertica/sbin/configure_software_raid.sh
shell file. Running this file creates a RAID 0 volume and mounts it to /vertica/data
.
-
Change the owner of the newly created volume to dbadmin with chown
.
-
Repeat steps 1-3 for each node on your cluster.
4.2.6 - Create a cluster
On AWS, use the
install_vertica
script to combine instances and create a cluster. Check your My Instances page on AWS for a list of current instances and their associated IP addresses. You need these IP addresses when you run install_vertica
.
Create a cluster as follows:
-
While connected to your primary instance, enter the following command to combine your instances into a cluster. Substitute the IP addresses for your instances and include your root *.pem
file name.
$ sudo /opt/vertica/sbin/install_vertica --hosts 10.0.11.164,10.0.11.165,10.0.11.166 \
--dba-user-password-disabled --point-to-point --data-dir /vertica/data \
--ssh-identity ~/name-of-pem.pem --license license.file
Note
-
If you are using Vertica Community Edition, which limits you to three instances, you can specify -L CE
with no license file.
-
When you run the install_vertica or update_vertica script on a Vertica AMI, --point-to-point is the default. This parameter configures Spread to use direct point-to-point communication between all Vertica nodes, which is a requirement for clusters on AWS.
-
If you are using IPv6 network addresses to identify the hosts in your cluster, use the --ipv6 flag in your install_vertica
command. You must also use IP addresses instead of host names, as the AWS DNS server cannot resolve host names to IPv6 addresses.
-
After combining your instances, Vertica recommends deleting your *.pem
key from your cluster to reduce security risks. The example below uses the shred
command to delete the file:
$ shred name-of-pem.pem
-
After creating one or more clusters, create your database or connect to Management Console on AWS.
For complete information on the install_vertica
script and its parameters, see Install Vertica with the installation script.
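When a cluster has many hosts, assembling the comma-separated --hosts value by hand is error-prone. The following is a minimal sketch that only prints the install_vertica command line for review and does not run it. The function name build_install_command is hypothetical, and the options are limited to those used in the example above.

```shell
# Hypothetical helper; prints the command instead of executing it.
# Usage: build_install_command PEM_FILE LICENSE_FILE HOST_IP [HOST_IP...]
build_install_command() {
  if [ "$#" -lt 3 ]; then
    echo "usage: build_install_command PEM_FILE LICENSE_FILE HOST_IP [HOST_IP...]" >&2
    return 1
  fi
  pem_file="$1"
  license_file="$2"
  shift 2
  # Join the remaining arguments with commas, then drop the trailing comma.
  hosts=$(printf '%s,' "$@")
  hosts=${hosts%,}
  printf 'sudo /opt/vertica/sbin/install_vertica --hosts %s --dba-user-password-disabled --point-to-point --data-dir /vertica/data --ssh-identity %s --license %s\n' \
    "$hosts" "$pem_file" "$license_file"
}
```

For example, build_install_command name-of-pem.pem license.file 10.0.11.164 10.0.11.165 10.0.11.166 prints a command equivalent to the one shown in the procedure.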
Important
Stopping or rebooting an instance or cluster without first shutting down the database may result in disk or database corruption. To safely shut down and restart your cluster, see
Operating the database.
Check open ports manually using the netcat utility
Once your cluster is up and running, you can check ports manually through the command line using the netcat (nc) utility. What follows is an example using the utility to check ports.
Before performing the procedure, choose the private IP addresses of two nodes in your cluster.
The examples given below use nodes with the private IPs:
10.0.11.60 10.0.11.61
Install the nc utility on your nodes. Once installed, you can issue commands to check the ports on one node from another node.
To check a TCP port:
- Put one node in listen mode and specify the port. The following sample shows how to put IP
10.0.11.60
into listen mode for port 4804:
[root@ip-10-0-11-60 ~]# nc -l 4804
- From the other node, run
nc
specifying the IP address of the node you just put in listen mode, and the same port number.
[root@ip-10-0-11-61 ~]# nc 10.0.11.60 4804
-
Enter sample text from either node and it should show up on the other node. To cancel after you have checked a port, enter Ctrl+C.
Note
To check a UDP port, use the same nc
commands with the -u
option.
[root@ip-10-0-11-60 ~]# nc -u -l 4804
[root@ip-10-0-11-61 ~]# nc -u 10.0.11.60 4804
4.2.7 - Management Console on AWS
Management Console (MC) is a database management tool that allows you to view and manage aspects of your cluster. Vertica provides an MC AMI, which you can use with AWS. The MC AMI allows you to create an instance, dedicated to running MC, that you can attach to a new or existing Vertica cluster on AWS. You can create and attach an MC instance to your Vertica on AWS cluster at any time.
After you launch your MC instance and configure your security group settings, you can log in to your database. To do so, use the elastic IP you specified during instance creation.
From this elastic IP, you can manage your Vertica database on AWS using standard MC procedures.
Considerations when using MC on AWS
-
Because MC is already installed on the MC AMI, the MC installation process does not apply.
-
To uninstall MC on AWS, follow the procedures provided in Uninstalling Management Console before terminating the MC Instance.