Vertica on Amazon Web Services

This section explains how to create and manage Vertica clusters on AWS.

When you launch a cluster on AWS resources and are ready to create your database, consider whether to run it in Eon Mode or Enterprise Mode. The differences between these two modes lie in their architecture, deployment, and scalability:

  • Enterprise Mode stores data locally on the nodes in the database.

  • Eon Mode stores its data in an S3 bucket.


    Eon Mode separates the computational processes from the communal storage layer of your database. This separation lets you elastically vary the number of nodes in your database cluster to adjust to varying workloads.

    Vertica provides CloudFormation Templates (CFTs) through the AWS Marketplace. These CFTs also deploy the Management Console.

See Architecture for more about the differences between the two database modes.

1 - Overview of Vertica on Amazon Web Services (AWS)

Vertica clusters on AWS can operate on EC2 instances automatically provisioned using a CloudFormation Template, or manually deployed from Amazon Machine Images (AMIs).

You can create a database in either Eon Mode or Enterprise Mode in a Vertica cluster in AWS.

For more information about Amazon cluster instances and their limitations, see the Amazon documentation.

1.1 - CloudFormation templates

Vertica provides CloudFormation Templates (CFTs) through the AWS Marketplace. After you provide a few parameters to the template, create a stack to automatically provision the AWS resources for your Vertica system.

After creating the stack, in the Management Console (MC) you can create and manage your clusters and databases. See Creating an Eon Mode database in AWS with MC or Creating an Enterprise Mode database in AWS with MC.

1.2 - Vertica offerings on AWS

Using the license models and CFTs described in CloudFormation template (CFT) overview, you can install the following Vertica products:

  • Vertica BYOL, Amazon Linux 2.0

  • Vertica by the Hour, Amazon Linux 2.0

  • Vertica BYOL, Red Hat

  • Vertica by the Hour, Red Hat

See Launch MC and AWS resources with a CloudFormation template for information on installing these products.

1.3 - Vertica AMI operating systems for AWS

Vertica provides Vertica and Management Console AMIs in the following operating systems.

  • Red Hat 7.4 and later

  • Amazon Linux 2.0 and later

You can use the AMI to deploy MC hosts or cluster hosts.

1.4 - Supported AWS instance types

Vertica supports a range of Amazon Web Services instance types, each optimized for different purposes. Choose the instance type that best matches your requirements. The two tables below list the AWS instance types that Vertica supports for Vertica cluster hosts, and for use in MC. For more information, see the Amazon Web Services documentation on instance types and volumes.

Instance types for Vertica cluster hosts

Each Amazon EC2 Instance type natively provides one of the following storage options:

  • Elastic Block Store (EBS) provides durable storage: data files stored on the instance persist after the instance is stopped.

  • Instance Store provides temporary storage: data files stored on the instance are lost when the instance is stopped.

Optimization | Instance Types Using Only EBS Volumes (Durable) | Instance Types Using Instance Store Volumes (Temporary)
General purpose | m4.4xlarge, m4.10xlarge, m5.4xlarge, m5.8xlarge, m5.12xlarge | m5d.4xlarge, m5d.8xlarge, m5d.12xlarge
Compute | c4.4xlarge, c4.8xlarge, c5.4xlarge, c5.9xlarge | c3.4xlarge, c3.8xlarge, c5d.4xlarge, c5d.9xlarge
Memory | r4.4xlarge, r4.8xlarge, r4.16xlarge, r5.4xlarge, r5.8xlarge, r5.12xlarge | r3.4xlarge, r3.8xlarge, r5d.4xlarge, r5d.8xlarge, r5d.12xlarge
Storage | | d2.4xlarge, d2.8xlarge, i3.4xlarge, i3.8xlarge, i3.16xlarge, i3en.3xlarge, i3en.6xlarge, i3en.12xlarge

Instance types available for MC hosts

Optimization | Type | Supports EBS Storage (Durable) | Supports Ephemeral Storage (Temporary)
Computing | c4.large | Yes | No
Computing | c4.xlarge | Yes | No
Computing | c5.large | Yes | No
Computing | c5.xlarge | Yes | No

More information

For more information about Amazon cluster instances and their limitations, see Manage Clusters in the Amazon Web Services documentation.

1.5 - Choosing AWS Eon Mode instance types

This topic lists the recommended instance types to use in an Eon Mode database running in AWS.

Choose instance types that support ephemeral instance storage or EBS volumes for your depot, depending on cost and availability. It is not mandatory to have an EBS-backed depot, because in Eon Mode, a copy of the data is safely stored in communal storage. Vertica recommends either r4 or i3 instances for production clusters.

The following table helps you choose between instances with ephemeral instance storage and instances with EBS-only storage. Check with AWS for the latest cost per hour.

Storage Type | Instance Type | Pros/Cons
Instance storage | i3.8xlarge | Instance storage offers better performance than EBS storage attached through multiple EBS volumes. Instance storage can be striped (RAIDed) together to increase throughput and load balance I/O. Data stored in instance-store volumes is not persistent through instance stops, terminations, or hardware failures.
EBS-only storage | r4.8xlarge with 600 GB EBS volume attached | Newer instance types from AWS have only the EBS option. In most AWS regions, it's easier to provision a large number of instances. You can terminate an instance but keep its EBS volume for a faster revive. Preserving the EBS volume preserves the depot. Some of the cached files might have become stale, and those are ignored and evicted, but much of the cached data is not stale, which saves time when the node revives and warms its depot. You can take advantage of full-volume encryption.

1.6 - Vertica AMI sleep c-states

By default, the following instances have their processor C-states set to a value of 1 in the Vertica AMI:

  • c4.8xlarge

  • d2.8xlarge

  • m4.10xlarge

This measure is meant to improve performance by limiting the sleep states that an instance running Vertica uses.

For more information about sleep states, visit the AWS Documentation.
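
One way to confirm the limit on a running instance is to inspect the kernel boot parameters. The following Python sketch assumes the limit is applied through the intel_idle.max_cstate kernel parameter, which is the mechanism the AWS documentation describes for limiting C-states on Linux; the Vertica AMI might apply the setting differently. The helper name max_cstate_from_cmdline is hypothetical.

```python
import re
from typing import Optional


def max_cstate_from_cmdline(cmdline: str) -> Optional[int]:
    """Return the intel_idle.max_cstate value found in a kernel command line.

    Returns None when the parameter is absent, which means the kernel
    default applies (an assumption about how the limit is configured).
    """
    match = re.search(r"(?:^|\s)intel_idle\.max_cstate=(\d+)(?=\s|$)", cmdline)
    return int(match.group(1)) if match else None


if __name__ == "__main__":
    # /proc/cmdline holds the parameters the running kernel booted with (Linux only).
    try:
        with open("/proc/cmdline") as f:
            print(max_cstate_from_cmdline(f.read()))
    except OSError:
        print("kernel command line not available on this system")
```

A result of 1 would match the default described above; None suggests the parameter is not set on the kernel command line.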

1.7 - AWS features supported by Vertica

Vertica supports the following AWS features:

  • Enhanced Networking: Vertica recommends that you use the AWS enhanced networking for optimal performance. For more information, see Enabling Enhanced Networking on Linux Instances in a VPC in the AWS documentation.

  • Command Line Interface: Use the AWS Command Line Interface (CLI) with your Vertica AMIs. For more information, see What Is the AWS Command Line Interface? in the AWS documentation.

  • Elastic Load Balancing: Use elastic load balancing (ELB) for queries that run up to one hour. When enabling ELB, configure the timer to 3600 seconds. For more information, see Elastic Load Balancing in the AWS documentation.

1.8 - AWS authentication

Amazon defines two ways to control access to AWS resources such as S3: IAM roles, and the combination of an ID, a secret, and (optionally) a session token. For long-term access to non-communal storage buckets, use IAM roles, which centralize access control. To change access settings, you alter the IAM role applied to your EC2 instances; you do not need to change your application's configuration.

However, for one-time tasks like backing up and restoring the database or loading data to and from non-communal storage buckets, you should use an AWS access key.

Vertica uses both of these authentication methods to support different features and use cases:

  • An Eon Mode database's access to S3 for communal and catalog storage must always use IAM role authentication. IAM roles are the default access control method for AWS resources. Vertica uses this method if you do not configure the legacy access control session parameters.

  • Individual users can read data from S3 storage locations other than the ones Vertica uses for communal storage. For example, users can use COPY to load data into Vertica from an S3 bucket or query an external table stored on S3. If the IAM role assigned to the Vertica nodes does not have access to this external S3 data, the user must set an id, secret, and optionally an access token in session variables to authorize access to it. These session variables override the IAM role set on the server. See S3 parameters for a list of these session parameters.

  • Individual users can export data to S3 using the Vertica Library for AWS. This library cannot use IAM authorization. Users who want to export data to S3 using this library must set id, secret, and optionally access token values in session variables. See Configure the Vertica library for Amazon Web Services for details.

Configuring an IAM role

To configure an IAM role that grants Vertica access to AWS resources, you must:

  1. Create an IAM role to allow EC2 instances to access the specific resources.

  2. Grant that role permission to access your resources.

  3. Attach this IAM role to each EC2 instance in the Vertica cluster.

To see an example of IAM roles for a Vertica cluster, look at the roles defined in one of the CloudFormation Templates provided by Vertica. You can download these templates from any of the Vertica entries in the AWS Marketplace. Under each entry's Usage Information section, click the View CloudFormation Template link, then click Download CloudFormation Template.

For more information about IAM roles, see IAM Roles for Amazon EC2 in the AWS documentation.
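
The first two steps of the procedure above can be sketched as two JSON policy documents. The following Python example only builds and prints the documents; it does not call AWS. The bucket name example-vertica-bucket and the list of S3 actions are illustrative assumptions, not Vertica's official requirements. For the authoritative set of permissions, check the roles in the Vertica CloudFormation Templates.

```python
import json


def ec2_trust_policy() -> dict:
    """Trust policy that lets EC2 instances assume the role (step 1)."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": "ec2.amazonaws.com"},
                "Action": "sts:AssumeRole",
            }
        ],
    }


def s3_access_policy(bucket: str) -> dict:
    """Permission policy granting access to one S3 bucket (step 2).

    The action list is an assumption chosen for illustration.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Bucket-level permission: list the bucket's contents.
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": f"arn:aws:s3:::{bucket}",
            },
            {
                # Object-level permissions: read, write, and delete objects.
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:PutObject", "s3:DeleteObject"],
                "Resource": f"arn:aws:s3:::{bucket}/*",
            },
        ],
    }


if __name__ == "__main__":
    print(json.dumps(ec2_trust_policy(), indent=2))
    print(json.dumps(s3_access_policy("example-vertica-bucket"), indent=2))
```

Step 3, attaching the role to each EC2 instance, is done in the AWS console or with AWS tooling and is not shown here.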

2 - Installing Vertica with CloudFormation templates

Vertica provides CloudFormation Templates (CFTs) on the AWS Marketplace that allow you to get a cluster up and running quickly. Using the template allows you to automatically provision your AWS resources and launch a Vertica cluster and Management Console, with minimal configuration required.

If you prefer to deploy a VPC, instances, and related resources manually, see Install Vertica with manually deployed AWS resources.

For details about creating an Eon Mode or Enterprise Mode database after you create a cluster with CFTs, see Amazon Web Services in MC.

2.1 - CloudFormation template (CFT) overview

With Vertica on AWS, use CloudFormation Templates (CFTs) to easily manage provisioning the AWS resources with a running Vertica system.

To access Vertica CFTs, go to the AWS Marketplace. Licensing models for CFTs are:

  • Bring Your Own License (BYOL): By default, a free Community Edition (CE) license is installed, with limits of 3 nodes and 1 TB. To increase the number of nodes or the data size, you can purchase the Vertica BYOL license.
    Outside of the BYOL license on CFTs, you can also access the Community Edition without a license file.

  • By the Hour: A pay-as-you-go model where you pay for only the number of hours you use for each node. One advantage of using the Paid Listing is that all charges appear on your Amazon AWS bill. This offers an alternative to purchasing a full Vertica license. This eliminates the need to compute potential storage needs in advance.

Available Vertica CFTs are:

  • Management Console with 3 Vertica nodes: The easiest way to deploy Vertica. This CFT deploys an Eon Mode database by default. However, this environment can also be used to create an Enterprise Mode database. For more information, see Creating a database.

  • Deploy Management Console into new VPC: This CFT deploys all required AWS resources and installs the Vertica Management Console (MC). After stack creation completes, log in to the MC to provision a Vertica database cluster.

  • Deploy Management Console into existing VPC: This CFT deploys the Vertica Management Console (MC) in an already-existing VPC and subnet. After stack creation completes, the MC is available. Log in to MC to provision either a Vertica database cluster or an Eon Mode database cluster.

    For this CFT, you must first set up the VPC, subnet, and related network resources. For more information about the correct configuration of these resources for Vertica, see the following topics in the AWS documentation:

      • Creating a virtual private cloud

      • Configuring the network

For more information

For supported operating systems for these CFTs, see Vertica AMI operating systems for AWS.

For Vertica products available on AWS, see Vertica offerings on AWS.

2.2 - Prerequisites for using CFTs

Before you can install Vertica on AWS using CloudFormation Templates (CFTs), verify that you have:

  • An AWS account with permissions to create a VPC, subnet, security group, EC2 instances, and IAM roles. (For more information about AWS accounts, see the AWS documentation.)

  • An Amazon key pair for SSH access to an EC2 instance. (See the AWS documentation for key pairs.)

2.3 - Launch MC and AWS resources with a CloudFormation template

Launch Management Console (MC) and its associated AWS resources using CloudFormation templates (CFTs) that are available through the AWS Marketplace. For a list of available CFTs, see CloudFormation template (CFT) overview.

Starting in the AWS Marketplace, launch the provisioning instance from which you can install Vertica:

  1. Log in to the AWS Marketplace with an AWS account (see the Prerequisites section above).

  2. Search for "Vertica" in the AWS Marketplace.

  3. Select a Vertica CFT. Each CFT leads you to a product overview page, with pricing estimates. (Also see CloudFormation template (CFT) overview for an overview of available templates and products).

  4. Click Continue to Subscribe.

  5. On the next page, select your launch settings based on your requirements for deployment.

  6. If you have not agreed to Vertica EULA terms on the AWS Marketplace before, click Accept Software Terms to subscribe.

  7. Click Launch with CloudFormation Console. The CloudFormation Console opens.

  8. The CloudFormation Console automatically supplies the URL in the Specify an Amazon S3 template URL field. Click Next.

  9. Follow the CloudFormation workflow and enter the parameters for your stack. (A stack is the collection of AWS resources that CloudFormation creates from the template.)

  10. After confirming the details you have provided for your new stack, click Create. The AWS console brings you to the Stacks page, where you can view the progress of the creation process. The process takes several minutes.

  11. The Outputs tab displays information about accessing your environment after the process completes.

Next, access the Management Console (MC) to deploy your cluster instances and create a database, as described in Access Management Console.

2.4 - Access Management Console

You use MC to deploy Vertica cluster instances and create a database. You can also use MC to manage and monitor your databases. You will use Management Console to provision a Vertica cluster and database on the AWS resources you just launched.

  1. On the AWS CloudFormation Stacks page, select your new stack and view the Outputs tab. This tab provides information about accessing your environment, as well as documentation and licensing resources.

  2. Click the Access Management Console URL. This link takes you to the MC login page.

  3. To log in, enter the MC username and password that you created using the CloudFormation Console.

    After login, MC displays the home page, with options to provision a new cluster or database or import existing ones. If you chose a CFT that also creates a database, your new database is also displayed on the home page.

    This page also provides a Resources section with links to online training, blogs, community, and help resources.

You have successfully launched Management Console on AWS resources.

If you have not yet provisioned a Vertica cluster and database, complete the steps in one of the following:

2.5 - Creating a virtual private cloud

A Vertica cluster on AWS must be logically located in the same network. This is similar to placing the nodes of an on-premises cluster within the same network. Create a virtual private cloud (VPC) to ensure the nodes in your cluster will be able to communicate with each other within AWS.

Create a single public subnet VPC with the following configurations:

For information about VPCs, including how to create one, visit the AWS documentation.

3 - Install Vertica with manually deployed AWS resources

Vertica provides an AMI that you can install on AWS resources that you manually deploy. This section will guide you through configuring your network settings on AWS, launching and preparing EC2 instances using the Vertica AMI, and creating a Vertica cluster on those EC2 instances.

Choose this method of installation if you are familiar with configuring AWS and have many specific AWS configuration needs. (To automatically deploy AWS resources and a Vertica cluster instead, see Installing Vertica with CloudFormation templates.)

3.1 - Configure your network

Before you create your cluster, you must configure the network on which Vertica will run. Vertica requires a number of specific network configurations to operate on AWS. You may also have specific network configuration needs beyond the default Vertica settings.

The following sections explain which Amazon EC2 features you need to configure for instance creation.

3.1.1 - Create a placement group, key pair, and VPC

Part of configuring your network for AWS is to create the following:

Create a placement group

A placement group is a logical grouping of instances in a single Availability Zone. Placement groups are required for clusters, and all Vertica nodes must be in the same placement group.

Vertica recommends placement groups for applications that benefit from low network latency, high network throughput, or both. To provide the lowest latency and the highest packet-per-second network performance for your placement group, choose an instance type that supports enhanced networking.

For information on creating placement groups, see Placement Groups in the AWS documentation.

Create a key pair

You need a key pair to access your instances using SSH. Create the key pair using the AWS interface and store a copy of your key (*.pem) file on your local machine. When you access an instance, you need to know the local path of your key.

Use a key pair to:

  • Authenticate your connection as dbadmin to your instances from outside your cluster.

  • Install and configure Vertica on your AWS instances.

For information on creating a key pair, see Amazon EC2 Key Pairs in the AWS documentation.

Create a virtual private cloud (VPC)

You create a Virtual Private Cloud (VPC) on Amazon so that you can create a network of your EC2 instances. Your instances in the VPC all share the same network and security settings.

A Vertica cluster on AWS must be logically located in the same network. Create a VPC to ensure the nodes in your cluster can communicate with each other in AWS.

Create a single public subnet VPC with the following configurations:

For information on creating a VPC, see Create a Virtual Private Cloud (VPC) in the AWS documentation.

3.1.2 - Network ACL settings

Vertica requires the following basic network access control list (ACL) settings on an AWS instance running the Vertica AMI. Vertica recommends that you secure your network with additional ACL settings that are appropriate to your situation.

Inbound Rules

Type | Protocol | Port Range | Use | Source | Allow/Deny
SSH | TCP (6) | 22 | SSH (Optional, for access to your cluster from outside your VPC) | User Specific | Allow
Custom TCP Rule | TCP (6) | 5450 | MC (Optional, for MC running outside of your VPC) | User Specific | Allow
Custom TCP Rule | TCP (6) | 5433 | SQL Clients (Optional, for access to your cluster from SQL clients) | User Specific | Allow
Custom TCP Rule | TCP (6) | 50000 | Rsync (Optional, for backup outside of your VPC) | User Specific | Allow
Custom TCP Rule | TCP (6) | 1024-65535 | Ephemeral Ports (Needed if you use any of the above) | User Specific | Allow
ALL Traffic | ALL | ALL | N/A | 0.0.0.0/0 | Deny

Outbound Rules

Type | Protocol | Port Range | Use | Source | Allow/Deny
Custom TCP Rule | TCP (6) | 0-65535 | Ephemeral Ports | 0.0.0.0/0 | Allow

You can use the entire port range specified in the previous table, or find your specific ephemeral ports by entering the following command:

$ cat /proc/sys/net/ipv4/ip_local_port_range
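
The command prints two numbers: the lowest and highest ephemeral port that the host uses. As a minimal Python sketch, the output can be parsed and compared against the 1024-65535 range from the inbound rules table. The function names parse_port_range and within_acl_range are hypothetical, and the sample value in the comment is only an example of what a Linux host might report.

```python
from typing import Tuple


def parse_port_range(text: str) -> Tuple[int, int]:
    """Parse the two whitespace-separated numbers from ip_local_port_range."""
    low, high = (int(part) for part in text.split())
    return low, high


def within_acl_range(ports: Tuple[int, int],
                     acl: Tuple[int, int] = (1024, 65535)) -> bool:
    """Return True if the host's ephemeral ports all fall inside the ACL range."""
    return acl[0] <= ports[0] <= ports[1] <= acl[1]


if __name__ == "__main__":
    try:
        with open("/proc/sys/net/ipv4/ip_local_port_range") as f:
            ports = parse_port_range(f.read())  # for example, "32768 60999"
        print(ports, within_acl_range(ports))
    except OSError:
        print("ip_local_port_range is not available on this system")
```

If the reported range fits inside a narrower window, you can use that window in the ACL rule instead of the full range.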

More information

For detailed information on network ACLs within AWS, refer to Network ACLs in the Amazon documentation.

For detailed information on ephemeral ports within AWS, refer to Ephemeral Ports in the Amazon documentation.

3.1.3 - Configure TCP keepalive with AWS network load balancer

AWS supports three types of elastic load balancers (ELBs):

  • Classic Load Balancers

  • Application Load Balancers

  • Network Load Balancers

Vertica strongly recommends the AWS Network Load Balancer (NLB), which provides the best performance with your Vertica database. The Network Load Balancer acts as a proxy between clients (such as JDBC) and Vertica servers. The Classic and Application Load Balancers do not work with Vertica, in Enterprise Mode or Eon Mode.

To avoid timeouts and hangs when connecting to Vertica through the NLB, it is important to understand how AWS NLB handles idle timeouts for connections. For the NLB, AWS sets the idle timeout value to 350 seconds and you cannot change this value. The timeout applies to both connection points.

For a long-running query, if either the client or the server fails to send a timely keepalive, that side of the connection is terminated. This can lead to situations where a JDBC client hangs waiting for results that would never be returned because the server fails to send a keepalive within 350 seconds.

To identify an idle timeout/keepalive issue, run a query like this via a client such as JDBC:

=> SELECT SLEEP(355);

If there’s a problem, one of the following situations occurs:

  • The client connection terminates before 355 seconds. In this case, lower the JDBC keepalive setting so that keepalives are sent less than 350 seconds apart.

  • The client connection doesn’t return a result after 355 seconds. In this case, you need to adjust the server keepalive settings (tcp_keepalive_time and tcp_keepalive_intvl) so that keepalives are sent less than 350 seconds apart.
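
To check the server side before running the test query, you can compare the kernel keepalive settings with the 350-second NLB idle timeout. The following Python sketch assumes a Linux host, where the settings named above are exposed under /proc/sys/net/ipv4. The function names are hypothetical.

```python
NLB_IDLE_TIMEOUT_S = 350  # fixed idle timeout of the AWS Network Load Balancer


def keepalive_within_timeout(keepalive_time: int,
                             keepalive_intvl: int,
                             idle_timeout: int = NLB_IDLE_TIMEOUT_S) -> bool:
    """Return True if keepalives are sent less than idle_timeout seconds apart.

    keepalive_time is the idle time before the first probe, and
    keepalive_intvl is the gap between later probes. Both must stay
    below the load balancer's idle timeout.
    """
    return keepalive_time < idle_timeout and keepalive_intvl < idle_timeout


def read_ipv4_setting(name: str) -> int:
    """Read an integer setting such as tcp_keepalive_time from /proc (Linux only)."""
    with open(f"/proc/sys/net/ipv4/{name}") as f:
        return int(f.read())


if __name__ == "__main__":
    try:
        time_s = read_ipv4_setting("tcp_keepalive_time")
        intvl_s = read_ipv4_setting("tcp_keepalive_intvl")
        print(time_s, intvl_s, keepalive_within_timeout(time_s, intvl_s))
    except OSError:
        print("keepalive settings are not available on this system")
```

With the common Linux defaults of 7200 and 75 seconds, the check returns False, because the first keepalive would arrive long after the NLB has dropped the idle connection.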

For detailed information about AWS Network Load Balancers, see What is a Network Load Balancer? in the AWS documentation.

3.1.4 - Create and assign an internet gateway

When you create a VPC, an Internet gateway is automatically assigned to it. You can use that gateway, or you can assign your own. If you are using the default Internet gateway, continue with the procedure described in Create a security group.

Otherwise, create an Internet gateway specific to your needs. Associate that internet gateway with your VPC and subnet.

For information about how to create an Internet Gateway, see Internet Gateways in the AWS documentation.

3.1.5 - Assign an elastic IP address

An elastic IP address is an unchanging IP address that you can use to connect to your cluster externally. Vertica recommends you assign a single elastic IP to a node in your cluster. You can then connect to other nodes in your cluster from your primary node using their internal IP addresses dictated by your VPC settings.

Create an elastic IP address. For information, see Elastic IP Addresses in the AWS documentation.

3.1.6 - Create a security group

The Vertica AMI has specific security group requirements. When you create a Virtual Private Cloud (VPC), AWS automatically creates a default security group and assigns it to the VPC. You can use the default security group, or you can name and assign your own.

Create and name your own security group using the following basic security group settings. You may make additional modifications based on your specific needs.

Inbound

Type | Use | Protocol | Port Range | IP
SSH | | TCP | 22 | The CIDR address range of administrative systems that require SSH access to the Vertica nodes. Make this range as restrictive as possible. You can add multiple rules for separate network ranges, if necessary.
DNS (UDP) | | UDP | 53 | Your private subnet address range (for example, 10.0.0.0/24).
Custom UDP | Spread | UDP | 4803 and 4804 | Your private subnet address range (for example, 10.0.0.0/24).
Custom TCP | Spread | TCP | 4803 | Your private subnet address range (for example, 10.0.0.0/24).
Custom TCP | VSQL/SQL | TCP | 5433 | The CIDR address range of client systems that require access to the Vertica nodes. This range should be as restrictive as possible. You can add multiple rules for separate network ranges, if necessary.
Custom TCP | Inter-node Communication | TCP | 5434 | Your private subnet address range (for example, 10.0.0.0/24).
Custom TCP | | TCP | 5444 | Your private subnet address range (for example, 10.0.0.0/24).
Custom TCP | MC | TCP | 5450 | The CIDR address of client systems that require access to the management console. This range should be as restrictive as possible. You can add multiple rules for separate network ranges, if necessary.
Custom TCP | Rsync | TCP | 50000 | Your private subnet address range (for example, 10.0.0.0/24).
ICMP | Installer | Echo Reply | N/A | Your private subnet address range (for example, 10.0.0.0/24).
ICMP | Installer | Traceroute | N/A | Your private subnet address range (for example, 10.0.0.0/24).

Outbound

Type | Protocol | Port Range | Destination | IP
All TCP | TCP | 0-65535 | Anywhere | 0.0.0.0/0
All ICMP | ICMP | 0-65535 | Anywhere | 0.0.0.0/0
All UDP | UDP | 0-65535 | Anywhere | 0.0.0.0/0

For information about what a security group is, as well as how to create one, see Amazon EC2 Security Groups for Linux Instances in the AWS documentation.
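
If you script your security group setup, the TCP and UDP inbound rules above can be expressed as data. The following Python sketch converts them into the IpPermissions structure that the EC2 API uses for ingress rules. It only builds the structure and makes no AWS call. The admin and client CIDR ranges are hypothetical placeholders, the private subnet is the example range from the table, and the two ICMP rules are left out for brevity.

```python
from typing import Dict, List, Tuple

PRIVATE_SUBNET = "10.0.0.0/24"   # example range used in the table above
ADMIN_CIDR = "203.0.113.0/24"    # hypothetical administrative network
CLIENT_CIDR = "198.51.100.0/24"  # hypothetical client network

# (protocol, first port, last port, source CIDR, use)
INBOUND_RULES: List[Tuple[str, int, int, str, str]] = [
    ("tcp", 22, 22, ADMIN_CIDR, "SSH"),
    ("udp", 53, 53, PRIVATE_SUBNET, "DNS"),
    ("udp", 4803, 4804, PRIVATE_SUBNET, "Spread"),
    ("tcp", 4803, 4803, PRIVATE_SUBNET, "Spread"),
    ("tcp", 5433, 5433, CLIENT_CIDR, "VSQL/SQL"),
    ("tcp", 5434, 5434, PRIVATE_SUBNET, "Inter-node communication"),
    ("tcp", 5444, 5444, PRIVATE_SUBNET, ""),  # the table lists no use for this port
    ("tcp", 5450, 5450, CLIENT_CIDR, "MC"),
    ("tcp", 50000, 50000, PRIVATE_SUBNET, "Rsync"),
]


def to_ip_permissions(rules: List[Tuple[str, int, int, str, str]]) -> List[Dict]:
    """Convert rule tuples into EC2-style IpPermissions dictionaries."""
    permissions = []
    for protocol, first, last, cidr, use in rules:
        ip_range = {"CidrIp": cidr}
        if use:
            ip_range["Description"] = use  # description is optional
        permissions.append({
            "IpProtocol": protocol,
            "FromPort": first,
            "ToPort": last,
            "IpRanges": [ip_range],
        })
    return permissions


if __name__ == "__main__":
    for permission in to_ip_permissions(INBOUND_RULES):
        print(permission)
```

A list in this shape is what an SDK call such as boto3's authorize_security_group_ingress accepts as its IpPermissions argument. Replace the placeholder CIDR ranges with your own before using it.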

3.2 - Deploy AWS instances for your Vertica database cluster

Once you have configured your network, you are ready to create your AWS instances and install Vertica. Follow these procedures to install and run Vertica on AWS.

3.2.1 - Configure and launch an instance

After you configure your network settings on AWS, configure and launch the instances onto which you will install Vertica. An Elastic Compute Cloud (EC2) instance without a Vertica AMI is similar to a traditional host. Just like with an on-premises cluster, you must prepare and configure your cluster and network at the hardware level before you can install Vertica.

When you create an EC2 instance on AWS using a Vertica AMI, the instance includes the Vertica software and the recommended configuration. The Vertica AMI acts as a template, requiring fewer configuration steps. Vertica recommends that you use the Vertica AMI as is—without modification.

Configure EC2 instances in AWS

  1. Select the Vertica AMI from the AWS marketplace.

  2. Select the desired fulfillment method.

  3. Configure the following:

Add storage to your instances

Consider the following issues when you add storage to your instances:

  • Add a number of drives equal to the number of physical cores in your instance. For example, add 16 drives for a c3.8xlarge instance, or 8 drives for an r3.4xlarge instance.

  • Do not store your information on the root volume.

  • Amazon EBS provides durable, block-level storage volumes that you can attach to running instances. For guidance on selecting and configuring an Amazon EBS volume type, see Amazon EBS Volume Types in the Amazon Web Services documentation.
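
The drive counts in the first item follow from the instance's vCPU count. As a small worked example in Python, the sketch below assumes two hardware threads (vCPUs) per physical core, which holds for the c3 and r3 instances mentioned above but not for every instance type. The function name drives_for_instance is hypothetical.

```python
def drives_for_instance(vcpus: int, threads_per_core: int = 2) -> int:
    """Return one drive per physical core, derived from the vCPU count.

    Assumes each physical core exposes threads_per_core vCPUs.
    """
    if vcpus <= 0 or threads_per_core <= 0 or vcpus % threads_per_core:
        raise ValueError("vcpus must be a positive multiple of threads_per_core")
    return vcpus // threads_per_core


if __name__ == "__main__":
    print(drives_for_instance(32))  # c3.8xlarge has 32 vCPUs
    print(drives_for_instance(16))  # r3.4xlarge has 16 vCPUs
```

For these two instance types, the results are 16 and 8 drives, matching the example above.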

Decide whether to configure EBS volumes as a RAID array

You can choose to configure your EBS volumes into a RAID 0 array to improve disk performance. Before doing so, use the vioperf utility to determine whether the performance of the EBS volumes is fast enough without using them in a RAID array. Pass vioperf the path to a mount point for an EBS volume. In this example, an EBS volume is mounted on a directory named /vertica/data:

[dbadmin@ip-10-11-12-13 ~]$ /opt/vertica/bin/vioperf /vertica/data

The minimum required I/O is 20 MB/s read and write per physical processor core on
each node, in full duplex i.e. reading and writing at this rate simultaneously,
concurrently on all nodes of the cluster. The recommended I/O is 40 MB/s per
physical core on each node. For example, the I/O rate for a server node with 2
hyper-threaded six-core CPUs is 240 MB/s required minimum, 480 MB/s recommended.

Using direct io (buffer size=1048576, alignment=512) for directory "/vertica/data"

test      | directory     | counter name        | counter | counter   | counter       | counter       | thread | %CPU  | %IO Wait  | elapsed | remaining
          |               |                     | value   | value (10 | value/core    | value/core    | count  |       |           | time (s)| time (s)
          |               |                     |         | sec avg)  |               | (10 sec avg)  |        |       |           |         |
--------------------------------------------------------------------------------------------------------------------------------------------------------
Write     | /vertica/data | MB/s                | 259     | 259       | 32.375        | 32.375        | 8      | 4     | 11        | 10      | 65
Write     | /vertica/data | MB/s                | 248     | 232       | 31            | 29            | 8      | 4     | 11        | 20      | 55
Write     | /vertica/data | MB/s                | 240     | 234       | 30            | 29.25         | 8      | 4     | 11        | 30      | 45
Write     | /vertica/data | MB/s                | 240     | 233       | 30            | 29.125        | 8      | 4     | 13        | 40      | 35
Write     | /vertica/data | MB/s                | 240     | 233       | 30            | 29.125        | 8      | 4     | 13        | 50      | 25
Write     | /vertica/data | MB/s                | 240     | 232       | 30            | 29            | 8      | 4     | 12        | 60      | 15
Write     | /vertica/data | MB/s                | 240     | 238       | 30            | 29.75         | 8      | 4     | 12        | 70      | 5
Write     | /vertica/data | MB/s                | 240     | 235       | 30            | 29.375        | 8      | 4     | 12        | 75      | 0
ReWrite   | /vertica/data | (MB-read+MB-write)/s| 237+237 | 237+237   | 29.625+29.625 | 29.625+29.625 | 8      | 4     | 22        | 10      | 65
ReWrite   | /vertica/data | (MB-read+MB-write)/s| 235+235 | 234+234   | 29.375+29.375 | 29.25+29.25   | 8      | 4     | 20        | 20      | 55
ReWrite   | /vertica/data | (MB-read+MB-write)/s| 234+234 | 235+235   | 29.25+29.25   | 29.375+29.375 | 8      | 4     | 20        | 30      | 45
ReWrite   | /vertica/data | (MB-read+MB-write)/s| 233+233 | 234+234   | 29.125+29.125 | 29.25+29.25   | 8      | 4     | 18        | 40      | 35
ReWrite   | /vertica/data | (MB-read+MB-write)/s| 233+233 | 234+234   | 29.125+29.125 | 29.25+29.25   | 8      | 4     | 20        | 50      | 25
ReWrite   | /vertica/data | (MB-read+MB-write)/s| 234+234 | 235+235   | 29.25+29.25   | 29.375+29.375 | 8      | 3     | 19        | 60      | 15
ReWrite   | /vertica/data | (MB-read+MB-write)/s| 233+233 | 236+236   | 29.125+29.125 | 29.5+29.5     | 8      | 4     | 21        | 70      | 5
ReWrite   | /vertica/data | (MB-read+MB-write)/s| 232+232 | 236+236   | 29+29         | 29.5+29.5     | 8      | 4     | 21        | 75      | 0
Read      | /vertica/data | MB/s                | 248     | 248       | 31            | 31            | 8      | 4     | 12        | 10      | 65
Read      | /vertica/data | MB/s                | 241     | 236       | 30.125        | 29.5          | 8      | 4     | 15        | 20      | 55
Read      | /vertica/data | MB/s                | 240     | 232       | 30            | 29            | 8      | 4     | 10        | 30      | 45
Read      | /vertica/data | MB/s                | 240     | 232       | 30            | 29            | 8      | 4     | 12        | 40      | 35
Read      | /vertica/data | MB/s                | 240     | 234       | 30            | 29.25         | 8      | 4     | 12        | 50      | 25
Read      | /vertica/data | MB/s                | 238     | 235       | 29.75         | 29.375        | 8      | 4     | 15        | 60      | 15
Read      | /vertica/data | MB/s                | 238     | 232       | 29.75         | 29            | 8      | 4     | 13        | 70      | 5
Read      | /vertica/data | MB/s                | 238     | 238       | 29.75         | 29.75         | 8      | 3     | 9         | 75      | 0
SkipRead  | /vertica/data | seeks/s             | 22909   | 22909     | 2863.62       | 2863.62       | 8      | 0     | 6         | 10      | 65
SkipRead  | /vertica/data | seeks/s             | 21989   | 21068     | 2748.62       | 2633.5        | 8      | 0     | 6         | 20      | 55
SkipRead  | /vertica/data | seeks/s             | 21639   | 20936     | 2704.88       | 2617          | 8      | 0     | 7         | 30      | 45
SkipRead  | /vertica/data | seeks/s             | 21478   | 20999     | 2684.75       | 2624.88       | 8      | 0     | 6         | 40      | 35
SkipRead  | /vertica/data | seeks/s             | 21381   | 20995     | 2672.62       | 2624.38       | 8      | 0     | 5         | 50      | 25
SkipRead  | /vertica/data | seeks/s             | 21310   | 20953     | 2663.75       | 2619.12       | 8      | 0     | 5         | 60      | 15
SkipRead  | /vertica/data | seeks/s             | 21280   | 21103     | 2660          | 2637.88       | 8      | 0     | 8         | 70      | 5
SkipRead  | /vertica/data | seeks/s             | 21272   | 21142     | 2659          | 2642.75       | 8      | 0     | 6         | 75      | 0

If the EBS volume read and write performance (the entries with Read and Write in column 1 of the output) is greater than 20 MB/s per physical processor core (columns 6 and 7), you do not need to configure the EBS volumes as a RAID array to meet the minimum requirements to run Vertica. You may still consider configuring your EBS volumes as a RAID array if the performance is less than the recommended 40 MB/s per physical core (as is the case in this example).
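The arithmetic behind these thresholds can be sketched in a few lines of Python. This is an illustrative helper, not part of vioperf: the function names are hypothetical, and the only inputs taken from the output above are the 20 MB/s and 40 MB/s per-core figures and the per-core rates in columns 6 and 7.

```python
# Thresholds quoted in the vioperf banner (MB/s per physical core).
MIN_MBPS_PER_CORE = 20
RECOMMENDED_MBPS_PER_CORE = 40


def node_io_targets(physical_cores):
    """Return (minimum, recommended) node-level I/O in MB/s."""
    return (physical_cores * MIN_MBPS_PER_CORE,
            physical_cores * RECOMMENDED_MBPS_PER_CORE)


def classify_per_core_rate(mbps_per_core):
    """Classify a per-core rate read from columns 6 and 7 of the output."""
    if mbps_per_core < MIN_MBPS_PER_CORE:
        return "below minimum"
    if mbps_per_core < RECOMMENDED_MBPS_PER_CORE:
        return "meets minimum, below recommended"
    return "meets recommended"


# Two six-core CPUs, as in the vioperf banner: 12 physical cores.
print(node_io_targets(12))             # (240, 480)
# Final Write rate per core (10 second average) from the sample run.
print(classify_per_core_rate(29.375))  # meets minimum, below recommended
```

A result of "meets minimum, below recommended" corresponds to the case described above, where RAID 0 is optional but may still be worth considering.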

If you determine that you need to configure your EBS volumes as a RAID 0 array, see the AWS documentation topic RAID Configuration on Linux for the steps you need to take.

Security group and access

  1. Choose between your previously configured security group or the default security group.

  2. Configure S3 access for your nodes by creating and assigning an IAM role to your EC2 instance. See AWS authentication for more information.

Launch instances

Verify that your instances are running.

3.2.2 - Connect to an instance

Using your private key, take these steps to connect to your cluster through the instance to which you attached an elastic IP address:

  1. As the dbadmin user, type the following command, substituting your ssh key:

    $ ssh -i <ssh key> dbadmin@elasticipaddress
    
  2. Select Instances from the Navigation panel.

  3. Select the instance that is attached to the Elastic IP.

  4. Click Connect.

  5. On Connect to Your Instance, choose one of the following options:

    • A Java SSH Client directly from my browser: Add the path to your private key in the field Private key path, and click Launch SSH Client.

    • Connect with a standalone SSH client: Follow the steps required by your standalone SSH client.

Connect to an instance from Windows using PuTTY

If you connect to the instance from the Windows operating system, and plan to use PuTTY:

  1. Convert your key file using PuTTYgen.

  2. Connect with PuTTY or WinSCP (connect via the elastic IP), using your converted key (that is, the *.ppk file).

  3. Move your key file (the *.pem file) to the root directory using PuTTY or WinSCP.

3.2.3 - Prepare instances for cluster formation

After you create your instances, you need to prepare them for cluster formation. Prepare your instances by adding your AWS .pem key and your Vertica license.

By default, each AMI includes a Community Edition license. Once Vertica is installed, you can find the license at this location:

/opt/vertica/config/licensing/vertica_community_edition.license.key

  1. As the dbadmin user, copy your *.pem file (from where you saved it locally) onto your primary instance.

    Depending upon the procedure you use to copy the file, the permissions on the file may change. If permissions change, the install_vertica script fails with a message similar to the following:

    FATAL (19): Failed Login Validation 10.0.3.158, cannot resolve or connect to host as root.
    

    If you receive a failure message, enter the following command to correct permissions on your *.pem file:

    $ chmod 600 /<name-of-pem>.pem
    
  2. Copy your Vertica license over to your primary instance, placing it in your home directory or other known location.
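If you want to check the key file's permissions before running install_vertica, a short Python sketch such as the following may help. The function names are hypothetical and the check assumes a Linux file system; it mirrors the chmod 600 command in step 1.

```python
import os
import stat


def key_is_owner_only(path):
    """Return True if no group or other permission bits are set on the file."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return (mode & (stat.S_IRWXG | stat.S_IRWXO)) == 0


def restrict_key(path):
    """Set owner read/write only, the equivalent of `chmod 600 <path>`."""
    os.chmod(path, stat.S_IRUSR | stat.S_IWUSR)
```

Call key_is_owner_only() with the path to your copied key file, and call restrict_key() on the same path if it returns False.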

3.2.4 - Change instances on AWS

You can change instance types on AWS. For example, you can downgrade a c3.8xlarge instance to c3.4xlarge. See Supported AWS instance types for a list of valid AWS instances.

When you change AWS instances you may need to:

  • Reconfigure memory settings

  • Reset memory size in a resource pool

  • Reset number of CPUs in a resource pool

Reconfigure memory settings

If you change to an AWS instance type that requires a different amount of memory, you may need to recompute the following and then reset the values:

Reset memory size in a resource pool

If you used absolute memory in a resource pool, you may need to reconfigure the memory using the MEMORYSIZE parameter in ALTER RESOURCE POOL.

Reset number of CPUs in a resource pool

If your new instance requires a different number of CPUs, you may need to reset the CPUAFFINITYSET parameter in ALTER RESOURCE POOL.

3.2.5 - Configure storage

Vertica recommends that you store information — especially your data and catalog directories — on dedicated Amazon EBS volumes formatted with a supported file system. The /opt/vertica/sbin/configure_software_raid.sh script automates the storage configuration process.

Vertica performance tests Eon Mode with a per-node EBS volume of up to 2TB. For best performance, combine multiple EBS volumes into a RAID 0 array.

For more information about RAID 0 arrays and EBS volumes, see RAID configuration on Linux.

Determining volume names

Because the storage configuration script requires the volume names that you want to configure, you must identify the volumes on your machine. The following command lists the contents of the /dev directory. Search for the volumes that begin with xvd:

$ ls /dev
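As an alternative to scanning the ls output by eye, the following Python sketch filters the directory listing for names that begin with xvd. The function name is hypothetical, and the xvd prefix is taken from the instructions above; your instances may expose volumes under different names.

```python
import os


def xvd_devices(names):
    """Return sorted /dev paths for entries whose names start with 'xvd'."""
    return sorted("/dev/" + n for n in names if n.startswith("xvd"))


# List matching devices on this machine, if /dev exists.
if os.path.isdir("/dev"):
    print(xvd_devices(os.listdir("/dev")))
```

Note the resulting names; the storage configuration script needs them.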

Combining volumes for storage

The configure_software_raid.sh shell script combines your EBS volumes into a RAID 0 array.

The following steps combine your EBS volumes into RAID 0 with the configure_software_raid.sh script:

  1. Edit the /opt/vertica/sbin/configure_software_raid.sh shell file as follows:

    1. Comment out the safety exit command at the beginning.

    2. Change the sample volume names to your own volume names, which you noted previously. Add more volumes, if necessary.

  2. Run the /opt/vertica/sbin/configure_software_raid.sh shell file. Running this file creates a RAID 0 volume and mounts it to /vertica/data.

  3. Change the owner of the newly created volume to dbadmin with chown.

  4. Repeat steps 1-3 for each node on your cluster.

3.2.6 - Create a cluster

On AWS, use the install_vertica script to combine instances and create a cluster. Check your My Instances page on AWS for a list of current instances and their associated IP addresses. You need these IP addresses when you run install_vertica.

Create a cluster as follows:

  1. While connected to your primary instance, enter the following command to combine your instances into a cluster. Substitute the IP addresses for your instances and include your root *.pem file name.

    $ sudo /opt/vertica/sbin/install_vertica --hosts 10.0.11.164,10.0.11.165,10.0.11.166 \
      --dba-user-password-disabled --point-to-point --data-dir /vertica/data \
      --ssh-identity ~/name-of-pem.pem --license license.file
    
  2. After combining your instances, Vertica recommends deleting your *.pem key from your cluster to reduce security risks. The example below uses the shred command with the -u option to overwrite the file and then delete it:

    $ shred -u name-of-pem.pem
    
  3. After creating one or more clusters, create your database.

For complete information on the install_vertica script and its parameters, see Installing Vertica with the installation script.
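If you script cluster creation, you can assemble the command from step 1 from a list of instance IP addresses. The following Python sketch only builds the argument list; it does not run the installer. The function name is hypothetical, and every flag comes from the example above.

```python
def install_vertica_command(hosts, pem_path, license_path,
                            data_dir="/vertica/data"):
    """Build the install_vertica command from step 1 as an argument list."""
    return [
        "sudo", "/opt/vertica/sbin/install_vertica",
        "--hosts", ",".join(hosts),          # comma-separated, no spaces
        "--dba-user-password-disabled",
        "--point-to-point",
        "--data-dir", data_dir,
        "--ssh-identity", pem_path,
        "--license", license_path,
    ]


cmd = install_vertica_command(
    ["10.0.11.164", "10.0.11.165", "10.0.11.166"],
    "~/name-of-pem.pem", "license.file")
print(" ".join(cmd))
```

Building the host list programmatically helps avoid stray spaces between addresses.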

Check open ports manually using the netcat utility

Once your cluster is up and running, you can check ports manually through the command line using the netcat (nc) utility. What follows is an example using the utility to check ports.

Before performing the procedure, choose the private IP addresses of two nodes in your cluster.

The examples given below use nodes with the private IPs:

10.0.11.60 10.0.11.61

Install the nc utility on your nodes. Once installed, you can issue commands to check the ports on one node from another node.

  1. To check a TCP port:

    1. Put one node in listen mode and specify the port. The following sample shows how to put IP 10.0.11.60 into listen mode for port 4804.

      [root@ip-10-0-11-60 ~]# nc -l 4804
      
    2. From the other node, run nc specifying the IP address of the node you just put in listen mode, and the same port number.

      [root@ip-10-0-11-61 ~]# nc 10.0.11.60 4804
      
    3. Enter sample text from either node and it should show up on the other node. To cancel after you have checked a port, enter Ctrl+C.

  2. To check a UDP port, use the same nc commands with the -u option:

      [root@ip-10-0-11-60 ~]# nc -u -l 4804
      [root@ip-10-0-11-61 ~]# nc -u 10.0.11.60 4804
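If nc is not available on the connecting node, you can approximate the TCP check with Python's standard socket module. This is a minimal sketch with a hypothetical function name: it covers TCP only, and it reports success only if something (for example, nc -l on the other node) is listening on the port.

```python
import socket


def tcp_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

For example, tcp_port_open("10.0.11.60", 4804) run from 10.0.11.61 corresponds to the nc client command in step 1.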
    

3.2.7 - Use Management Console (MC) on AWS

Management Console (MC) is a database management tool that allows you to view and manage aspects of your cluster. Vertica provides an MC AMI, which you can use with AWS. The MC AMI allows you to create an instance, dedicated to running MC, that you can attach to a new or existing Vertica cluster on AWS. You can create and attach an MC instance to your Vertica on AWS cluster at any time.

For information on requirements and installing MC, see Installing Management Console.

3.2.7.1 - Log in to MC and manage your cluster

After you launch your MC instance and configure your security group settings, log in to your database. To do so, use the elastic IP you specified during instance creation.

From this elastic IP, you can manage your Vertica database on AWS using standard MC procedures.

Considerations when using MC on AWS

  • Because MC is already installed on the MC AMI, the MC installation process does not apply.

  • To uninstall MC on AWS, follow the procedures provided in Uninstalling Management Console before terminating the MC Instance.

4 - Export data to Amazon S3 using the AWS library

The AWS library is deprecated.

The Vertica library for Amazon Web Services (AWS) is a set of functions and configurable session parameters. These parameters allow you to export delimited data from Vertica to Amazon S3 storage without any third-party scripts or programs.

To use the AWS library, you must have access to an Amazon S3 storage account.

4.1 - Configure the Vertica library for Amazon Web Services

You use the Vertica library for Amazon Web Services (AWS) to export data from Vertica to S3. This library does not support IAM authentication. You must configure it to authenticate with S3 by using session parameters containing your AWS access key credentials. You can set your session parameters directly, or you can store your credentials in a table and set them with the AWS_SET_CONFIG function.

Because the AWS library uses session parameters, you must reconfigure the library with each new session.

Set AWS authentication parameters

The following AWS authentication parameters allow you to access AWS and work with the data in your Vertica database:

  • aws_id: The 20-character AWS access key used to authenticate your account.

  • aws_secret: The 40-character AWS secret access key used to authenticate your account.

  • aws_session_token: The AWS temporary security token generated by running the AWS STS command get-session-token. This AWS STS command generates temporary credentials you can use to implement multi-factor authentication for security purposes. See Implementing Multi-factor Authentication.

Implement multi-factor authentication

Implement multi-factor authentication as follows:

  1. Run the AWS STS command get-session-token, which returns output similar to the following:

    {
    "Credentials": {
    "SecretAccessKey": "bQid6jNuSWRqUzkIJCFG7c71gDHZY3h7aDSW2DU6",
    "SessionToken":
    "FQoDYXdzEBcaDKM1mWpeu88nDTTFICKsAbaiIDTWe4BTh33tnUvo9F/8mZicKKLLy7WIcpT4FLfr6ltIm242/U2CI9G/
    XdC6eoysUi3UGH7cxdhjxAW4fjgCKKYuNL764N2xn0issmIuJOku3GTDyc4U4iNlWyEng3SlshdiqVlk1It2Mk0isEQXKtx
    F9VgfncDQBxjZUCkYIzseZw5pULa9YQcJOzl+Q2JrdUCWu0iFspSUJPhOguH+wTqiM2XdHL5hcUcomqm41gU=",
    "Expiration": "2018-04-12T01:58:50Z",
    "AccessKeyId": "ASIAJ4ZYGTOSVSLUIN7Q"
     }
    }
    

    For more information on get-session-token, see the AWS documentation.

  2. Using the SecretAccessKey returned from get-session-token, set your temporary aws_secret:

    => ALTER SESSION SET UDPARAMETER FOR awslib aws_secret='bQid6jNuSWRqUzkIJCFG7c71gDHZY3h7aDSW2DU6';
    
  3. Using the SessionToken returned from get-session-token, set your temporary aws_session_token:

    => ALTER SESSION SET UDPARAMETER FOR awslib aws_session_token='FQoDYXdzEBcaDKM1mWpeu88nDTTFICKsAbaiIDTWe4B
    Th33tnUvo9F/8mZicKKLLy7WIcpT4FLfr6ltIm242/U2CI9G/XdC6eoysUi3UGH7cxdhjxAW4fjgCKKYuNL764N2xn0issmIuJOku3GTDy
    c4U4iNlWyEng3SlshdiqVlk1It2Mk0isEQXKtxF9VgfncDQBxjZUCkYIzseZw5pULa9YQcJOzl+Q2JrdUCWu0iFspSUJPhOguH+wTq
    iM2XdHL5hcUcomqm41gU=';
    
  4. Using the AccessKeyId returned from get-session-token, set your temporary aws_id:

    => ALTER SESSION SET UDPARAMETER FOR awslib aws_id='ASIAJ4ZYGTOSVSLUIN7Q';
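Copying three long values by hand is error-prone, so you may prefer to generate the statements from the get-session-token output. The following Python sketch does this; the function name and the sample credential values are hypothetical, and it assumes the values contain no single quotes. Treat the generated statements as secrets.

```python
import json

# Maps awslib session parameters to fields of the get-session-token output.
PARAM_TO_FIELD = {
    "aws_id": "AccessKeyId",
    "aws_secret": "SecretAccessKey",
    "aws_session_token": "SessionToken",
}


def alter_session_statements(sts_json):
    """Return one ALTER SESSION statement per temporary credential."""
    creds = json.loads(sts_json)["Credentials"]
    return [
        "ALTER SESSION SET UDPARAMETER FOR awslib {}='{}';".format(
            param, creds[field])
        for param, field in PARAM_TO_FIELD.items()
    ]


# Placeholder values standing in for real get-session-token output.
sample = json.dumps({"Credentials": {
    "AccessKeyId": "ASIAEXAMPLE",
    "SecretAccessKey": "secretEXAMPLE",
    "SessionToken": "tokenEXAMPLE",
    "Expiration": "2018-04-12T01:58:50Z",
}})
for stmt in alter_session_statements(sample):
    print(stmt)
```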
    

The Expiration value returned indicates when the temporary credentials expire. In this example, the credentials expire on April 12, 2018 at 01:58:50 UTC.

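These examples show how to implement multi-factor authentication using session parameters. To set and store your AWS account credentials securely, use either of the methods described later in this topic: configure session parameters directly, or configure them using credentials stored in a table.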

AWS access key requirements

To communicate with AWS, your access key must have the following permissions:

  • s3:GetObject

  • s3:PutObject

  • s3:ListBucket

For security purposes, Vertica recommends that you create a separate access key with limited permissions specifically for use with the Vertica Library for AWS.
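One way to express these limited permissions is an IAM policy scoped to a single bucket. The Python sketch below builds such a policy document. The function name and the bucket name are placeholders; the document layout follows the standard IAM policy format, in which s3:ListBucket applies to the bucket ARN and the object actions apply to the objects under it.

```python
import json


def awslib_export_policy(bucket):
    """Return an IAM policy document granting the three required S3 actions."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {   # ListBucket applies to the bucket itself.
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": ["arn:aws:s3:::" + bucket],
            },
            {   # GetObject and PutObject apply to objects in the bucket.
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:PutObject"],
                "Resource": ["arn:aws:s3:::" + bucket + "/*"],
            },
        ],
    }


print(json.dumps(awslib_export_policy("exampleBucket"), indent=2))
```

Attach a policy like this to the user whose access key you dedicate to the library, and adjust the bucket name to match your export target.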

Configure session parameters directly

These examples show how to set the session parameters for AWS using your own credentials. Parameter values are case sensitive:

  • aws_id: This value is your AWS access key ID.

    => ALTER SESSION SET UDPARAMETER FOR awslib aws_id='AKABCOEXAMPLEPKPXYZQ';
    
  • aws_secret: This value is your AWS secret access key.

    => ALTER SESSION SET UDPARAMETER FOR awslib aws_secret='CEXAMPLE3tEXAMPLE1wEXAMPLEFrFEXAMPLE6+Yz';
    
  • aws_region: This value is the AWS region associated with the S3 bucket you intend to access. If left unconfigured, aws_region defaults to us-east-1, which identifies the default server used by Amazon S3.

    => ALTER SESSION SET UDPARAMETER FOR awslib aws_region='us-east-1';
    

When using ALTER SESSION:

  • Using ALTER SESSION to change the values of S3 parameters also changes the values of corresponding UDParameters.

  • Setting a UDParameter changes only the UDParameter.

  • Setting a configuration parameter changes both the AWS parameter and UDParameter.

Configure session parameters using credentials stored in a table

You can place your credentials in a table and secure them with a row-level access policy. You can then pass your credentials to the AWS_SET_CONFIG scalar meta-function. This approach allows you to store your credentials on your cluster for future session parameter configuration. You must have dbadmin access to create access policies.

  1. Create a table with columns corresponding to your credentials:

    => CREATE TABLE keychain(accesskey varchar, secretaccesskey varchar);
    
  2. Store your credentials in the corresponding columns:

    => COPY keychain FROM STDIN;
    Enter data to be copied followed by a newline.
    End with a backslash and a period on a line by itself.
    >> AEXAMPLEI5EXAMPLEYXQ|CCEXAMPLEtFjTEXAMPLEiEXAMPLE6+Yz
    >> \.
    
  3. Set a row-level access policy appropriate to your security situation.

  4. With each new session, configure your session parameters by calling the AWS_SET_CONFIG function in a SELECT statement:

    => SELECT AWS_SET_CONFIG('aws_id', accesskey), AWS_SET_CONFIG('aws_secret', secretaccesskey)
       FROM keychain;
     aws_set_config | aws_set_config
    ----------------+----------------
     aws_id         | aws_secret
    (1 row)
    
  5. After you have configured your session parameters, verify them:

    => SHOW SESSION UDPARAMETER ALL;
    

4.2 - Export data to Amazon S3 from Vertica

After you configure the library for Amazon Web Services (AWS), you can export Vertica data to Amazon S3 by calling the S3EXPORT() transform function. S3EXPORT() writes data to files, based on the URL you provide. Vertica performs all communication over HTTPS, regardless of the URL type you use. Vertica does not support virtual host style URLs. If you use HTTPS URL constructions, you must use path style URLs.
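To illustrate the difference between the two URL styles, the following Python sketch formats both for the same bucket and object. The function names are hypothetical, and the host name patterns are the general Amazon S3 regional endpoint forms; they are shown as an assumption for comparison only, so confirm the exact URL form that S3EXPORT() accepts in your environment.

```python
def path_style_url(bucket, key, region="us-east-1"):
    """Path style: the bucket name is the first segment of the URL path."""
    return "https://s3.{}.amazonaws.com/{}/{}".format(region, bucket, key)


def virtual_hosted_style_url(bucket, key, region="us-east-1"):
    """Virtual host style: the bucket name is part of the host name."""
    return "https://{}.s3.{}.amazonaws.com/{}".format(bucket, region, key)


print(path_style_url("exampleBucket", "object"))
print(virtual_hosted_style_url("exampleBucket", "object"))
```

Only the first form is a path style URL; the second is the virtual host style that Vertica does not support.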

You can control the output of S3EXPORT() in the following ways:

Adjust the query provided to S3EXPORT

By adjusting the query given to S3EXPORT(), you can export anything from tables to reporting queries.

This example exports a whole table:

=> SELECT S3EXPORT( * USING PARAMETERS url='s3://exampleBucket/object') OVER(PARTITION BEST)
   FROM exampleTable;
 rows | url
------+------------------------------
  606 | https://exampleBucket/object
(1 row)

This example exports the results of a query:

=> SELECT S3EXPORT(customer_name, annual_income USING PARAMETERS url='s3://exampleBucket/object') OVER()
    FROM public.customer_dimension
      WHERE (customer_gender, annual_income) IN
        (SELECT customer_gender, MAX(annual_income)
         FROM public.customer_dimension
         GROUP BY customer_gender);

 rows | url
------+------------------------------
   25 | https://exampleBucket/object
(1 row)

Adjust the partition of your result set with the OVER clause

Use the OVER clause to control your export partitions. Using the OVER() clause without qualification results in a single partition processed by the initiator for all of the query data. This example shows how to call the function with an unqualified OVER() clause:

=> SELECT S3EXPORT(name, company USING PARAMETERS url='s3://exampleBucket/object',
                                                  delimiter=',') OVER()
     FROM exampleTable WHERE company='Vertica';
 rows | url
------+------------------------------
   10 | https://exampleBucket/object
(1 row)

You can also use window clauses, such as window partition clauses and window order clauses, to manage exported objects.

This example shows how you can use a window partition clause to partition S3 objects based on company values:

=> SELECT S3EXPORT(name, company
                    USING PARAMETERS url='s3://exampleBucket/object',
                                     delimiter=',') OVER(PARTITION BY company) AS MEDIAN
      FROM exampleTable;

Adjusting the export chunk size for wide tables

You may encounter the following error when exporting extremely wide tables or tables with long data types such as LONG VARCHAR or LONG VARBINARY:

=> SELECT S3EXPORT( * USING PARAMETERS url='s3://exampleBucket/object') OVER(PARTITION BEST)
   FROM veryWideTable;
ERROR 5861: Error calling setup() in User Function s3export
at [/data/.../S3.cpp:787],
error code: 0, message: The specified buffer of 10485760 bytesRead is too small,
it should be at least 11279701 bytesRead.

Vertica returns this error if the data for a single row overflows the buffer storing the data before export. By default, this buffer is 10MB. You can increase the size of this buffer using the chunksize parameter, which sets the size of the buffer in bytes. This example sets it to around 60MB:

=> SELECT S3EXPORT( * USING PARAMETERS url='s3://exampleBucket/object', chunksize=60485760)
   OVER(PARTITION BEST) FROM veryWideTable;
 rows | url
------+------------------------------
  606 | https://exampleBucket/object
(1 row)
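To choose a chunksize value, you can read the required size from the error message and add some headroom. The Python sketch below does this; the function name and the 25 percent headroom are assumptions, and the message text is the one shown in the error above.

```python
import re

DEFAULT_CHUNKSIZE = 10485760  # default buffer size reported in the error


def suggested_chunksize(error_message, headroom=1.25):
    """Return a chunksize in bytes above the minimum named in the error text."""
    match = re.search(r"at least (\d+) bytes", error_message)
    if match is None:
        raise ValueError("no required size found in message")
    required = int(match.group(1))
    return max(DEFAULT_CHUNKSIZE, int(required * headroom))


message = ("The specified buffer of 10485760 bytesRead is too small,\n"
           "it should be at least 11279701 bytesRead.")
print(suggested_chunksize(message))  # 14099626
```

Because the error reports the size needed for the row that failed, other rows may be wider; a larger value, such as the roughly 60MB used in the example above, leaves more margin.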

5 - Add nodes to a running cluster on the cloud

There are two ways to add nodes to an AWS cluster:

  • Using Management Console

  • Using admintools

When you use MC to add nodes to a cluster in the cloud, MC provisions the instances, adds the new instances to the existing Vertica cluster, and then adds those hosts to the database. However, when you add nodes to a cluster using admintools, you need to execute those steps yourself, as explained in Adding Nodes Using admintools.

Adding nodes using Management Console

In the Vertica Management Console, you can add nodes in several ways, depending on your database mode.

For Eon Mode databases, MC supports actions for subcluster and node management for the following public and private cloud providers:

For Enterprise Mode databases, MC supports these actions:

  • In the cloud on AWS: Add Node action, Add Instance action.

  • On-premises: Add Node action.

Adding nodes in an Eon Mode database

In an Eon Mode database, every node must belong to a subcluster. To add nodes, you always add them to one of the subclusters in the database:

Adding nodes in an Enterprise Mode database on AWS

In an Enterprise Mode database on AWS, to add an instance to your cluster:

  1. On the MC Home page, click View Infrastructure to go to the Infrastructure page. This page lists all the clusters the MC is monitoring.

  2. Click any cluster shown on the Infrastructure page.

  3. Select View or Manage from the dialog that displays, to view its Cluster page. (In a cloud environment, if MC was deployed from a cloud template the button says "Manage". Otherwise, the button says "View".)

  4. Click the Add (+) icon on the Instance List on the Cluster Management page.

    MC adds a node to the selected cluster.

Adding nodes using admintools

This section gives an overview on how to add nodes if you are managing your cluster using admintools. Each main step points to another topic with the complete instructions.

Step 1: before you start

Before you add nodes to a cluster, verify that you have an AWS cluster up and running and that you have:

  • Created a database.

  • Defined a database schema.

  • Loaded data.

  • Run the Database Designer.

  • Connected to your database.

Step 2: launch new instances to add to an existing cluster

Perform the procedure in Configure and launch an instance to create new instances (hosts) that you then will add to your existing cluster. Be sure to choose the same details you chose when you created the original instances (VPC, placement group, subnet, and security group).

Step 3: include new instances as cluster nodes

You need the IP addresses when you run the install_vertica script to include new instances as cluster nodes.

If you are configuring Amazon Elastic Block Store (EBS) volumes, be sure to configure the volumes on the node before you add the node to your cluster.

To add the new instances as nodes to your existing cluster:

  1. Configure and launch your new instances.

  2. Connect to the instance that is assigned to the Elastic IP. See Connect to an instance if you need more information.

  3. Run the Vertica installation script to add the new instances as nodes to your cluster. Specify the internal IP addresses for your instances and your *.pem file name.

    $ sudo /opt/vertica/sbin/install_vertica --add-hosts instance-ip --dba-user-password-disabled \
      --point-to-point --data-dir /vertica/data --ssh-identity ~/name-of-pem.pem
    

Step 4: add the nodes

After you have added the new instances to your existing cluster, add them as nodes to your database, as described in Adding nodes to a database.

Step 5: rebalance the database

After you add nodes to a database, always rebalance the database.

6 - Remove nodes from a running AWS cluster

Use the following procedures to remove instances/nodes from an AWS cluster.

To avoid data loss, Vertica strongly recommends that you back up your database before removing a node. For details, see Backing up and restoring the database.

In this section

6.1 - Remove hosts from the database

Before you remove hosts from the database, verify that you have:

  • Backed up the database.

  • Lowered the K-safety of the database.

To remove a host from the database:

  1. While logged on as dbadmin, launch Administration Tools.

    $ /opt/vertica/bin/admintools

  2. From the Main Menu, select Advanced Menu.

  3. From Advanced Menu, select Cluster Management. Click OK.

  4. From Cluster Management, select Remove Host(s). Click OK.

  5. From Select Database, choose the database from which you plan to remove hosts. Click OK.

  6. Select the host(s) to remove. Click OK.

  7. Click Yes to confirm removal of the hosts.

  8. Click OK. The system displays a message telling you that the hosts have been removed. Automatic rebalancing also occurs.

  9. Click OK to confirm. Administration Tools brings you back to the Cluster Management menu.

6.2 - Remove nodes from the cluster

To remove nodes from a cluster, run the update_vertica script and specify:

  • The option --remove-hosts, followed by the IP addresses of the nodes you are removing.

  • The option --ssh-identity, followed by the location and name of your *.pem file.

  • The option --dba-user-password-disabled.

The following example removes one node from the cluster:

$ sudo /opt/vertica/sbin/update_vertica  --remove-hosts 10.0.11.165  --point-to-point  \
  --ssh-identity ~/name-of-pem.pem --dba-user-password-disabled

6.3 - Stop the AWS instances (optional)

After you have removed one or more nodes from your cluster, to save costs associated with running instances, you can choose to stop the AWS instances that were previously part of your cluster.

To stop an instance in AWS:

  1. On AWS, navigate to your Instances page.

  2. Right-click the instance, and choose Stop.

This step is optional because, after you have removed the node from your Vertica cluster, Vertica no longer sees the node as part of the cluster, even though it is still running within AWS.

7 - Upgrade Vertica on AWS

Before you upgrade to the latest Vertica version, do the following:

  1. Back up your existing database.

  2. Download the Vertica install packages described in Download and Install the Vertica Install Package.

Upgrade to the latest version of Vertica on AWS

To upgrade to the latest version of Vertica on AWS, follow the instructions in Upgrading Vertica.

If you are setting up a Vertica cluster on AWS for the first time, follow the procedure for installing and running on AWS.

Upgrade Vertica running on AWS

Vertica supports upgrades of Vertica server running on AWS instances created from the Vertica AMI. To upgrade Vertica, follow the instructions provided in Upgrading Vertica.

Make sure to add the following arguments to the upgrade script:

  • --dba-user-password-disabled

  • --point-to-point

8 - Copying and exporting data on AWS: what you need to know

There are common issues that occur when exporting or copying on AWS clusters, as described below. Except for these specific issues as they relate to AWS, copying and exporting data works as documented in Database export and import.

To copy or export data on AWS:

  1. Verify that all nodes in source and destination clusters have their own elastic IPs (or public IPs) assigned.

    Each node in one cluster must be able to communicate with each node in the other cluster, so each source and destination node needs an elastic IP (or public IP) assigned. If your destination cluster is located within the same VPC as your source cluster, proceed to step 3.

  2. (For non-CloudFormation Template installs) Create an S3 gateway endpoint.

    If you aren't using a CloudFormation Template (CFT) to install Vertica, you must create an S3 gateway endpoint in your VPC. For more information, see the AWS documentation.

    For example, the Vertica CFT has the following VPC endpoint:

    "S3Enpoint" : {
        "Type" : "AWS::EC2::VPCEndpoint",
        "Properties" : {
            "PolicyDocument" : {
                "Version":"2012-10-17",
                "Statement":[{
                    "Effect":"Allow",
                    "Principal": "*",
                    "Action":["*"],
                    "Resource":["*"]
                }]
            },
            "RouteTableIds" : [ {"Ref" : "RouteTable"} ],
            "ServiceName" : { "Fn::Join": [ "", [ "com.amazonaws.", { "Ref": "AWS::Region" }, ".s3" ] ] },
            "VpcId" : {"Ref" : "VPC"}
        }
    }
    

  3. Verify that your security group allows the AWS clusters to communicate.

    Check your security groups for both your source and destination AWS clusters. Verify that ports 5433 and 5434 are open. If one of your AWS clusters is on a separate VPC, verify that your network access control list (ACL) allows communication on port 5434.

  4. If there are one or more elastic load balancers (ELBs) between the clusters, verify that port 5433 is open between the ELBs and clusters.

  5. If you use the Vertica client to connect to one or more ELBs, the ELBs only distribute incoming connections. The data transmission path occurs between clusters.