Manually deploy Vertica on AWS

Vertica provides tested and pre-configured Amazon Machine Images (AMIs) to deploy cluster hosts or MC hosts on AWS. When you create an EC2 instance on AWS using a Vertica AMI, the instance includes the Vertica software and the recommended configuration. The Vertica AMI acts as a template, requiring fewer configuration steps.

This section will guide you through configuring your network settings on AWS, launching and preparing EC2 instances using the Vertica AMI, and creating a Vertica cluster on those EC2 instances.

Choose this method of installation if you are familiar with configuring AWS and have many specific AWS configuration needs. To automatically deploy AWS resources and a Vertica cluster instead, see Deploy Vertica using CloudFormation templates.

1 - Configure your network

Before you deploy your cluster, you must configure the network on which Vertica will run. Vertica requires a number of specific network configurations to operate on AWS. You may also have specific network configuration needs beyond the default Vertica settings.

The following sections explain which Amazon EC2 features you need to configure for instance creation.

1.1 - Create a placement group, key pair, and VPC

Part of configuring your network for AWS is to create a placement group, a key pair, and a virtual private cloud (VPC).

Create a placement group

A placement group is a logical grouping of instances in a single Availability Zone. Placement groups are required for clusters, and all Vertica nodes must be in the same placement group.

Vertica recommends placement groups for applications that benefit from low network latency, high network throughput, or both. To get the lowest latency and the highest packet-per-second network performance for your placement group, choose an instance type that supports enhanced networking.

For information on creating placement groups, see Placement Groups in the AWS documentation.

Create a key pair

You need a key pair to access your instances using SSH. Create the key pair using the AWS interface and store a copy of your key (*.pem) file on your local machine. When you access an instance, you need to know the local path of your key.

Use a key pair to:

  • Authenticate your connection as dbadmin to your instances from outside your cluster.

  • Install and configure Vertica on your AWS instances.

For information on creating a key pair, see Amazon EC2 Key Pairs in the AWS documentation.

Create a virtual private cloud (VPC)

You create a Virtual Private Cloud (VPC) on Amazon so that you can create a network of your EC2 instances. Your instances in the VPC all share the same network and security settings.

All nodes of a Vertica cluster on AWS must be logically located in the same network. Create a VPC to ensure the nodes in your cluster can communicate with each other in AWS.

Create a single public subnet VPC with the following configurations:

For information on creating a VPC, see Create a Virtual Private Cloud (VPC) in the AWS documentation.

1.2 - Network ACL settings

Vertica requires the following basic network access control list (ACL) settings on an AWS instance running the Vertica AMI. Vertica recommends that you secure your network with additional ACL settings that are appropriate to your situation.

Inbound Rules

Type | Protocol | Port Range | Use | Source | Allow/Deny
SSH | TCP (6) | 22 | SSH (Optional: for access to your cluster from outside your VPC) | User Specific | Allow
Custom TCP Rule | TCP (6) | 5450 | MC (Optional: for MC running outside of your VPC) | User Specific | Allow
Custom TCP Rule | TCP (6) | 5433 | SQL Clients (Optional: for access to your cluster from SQL clients) | User Specific | Allow
Custom TCP Rule | TCP (6) | 50000 | Rsync (Optional: for backup outside of your VPC) | User Specific | Allow
Custom TCP Rule | TCP (6) | 1024-65535 | Ephemeral Ports (Needed if you use any of the above) | User Specific | Allow
ALL Traffic | ALL | ALL | N/A | 0.0.0.0/0 | Deny

Outbound Rules

Type | Protocol | Port Range | Use | Source | Allow/Deny
Custom TCP Rule | TCP (6) | 0-65535 | Ephemeral Ports | 0.0.0.0/0 | Allow

You can use the entire port range specified in the previous table, or find your specific ephemeral ports by entering the following command:

$ cat /proc/sys/net/ipv4/ip_local_port_range
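As a minimal sketch, the two values reported by that file can be turned into the `low-high` form used in the Port Range column. The function name `format_port_range` is hypothetical, and the sample value in the comment is only illustrative; your instance may report a different range.

```shell
#!/bin/sh
# Hypothetical helper: convert the "low high" pair stored in
# ip_local_port_range into "low-high".
format_port_range() {
    # $1 is the raw file content, e.g. "32768 60999"
    # (the kernel separates the two numbers with whitespace).
    set -- $1
    printf '%s-%s\n' "$1" "$2"
}

# On a running Linux instance, print the kernel's current range.
if [ -r /proc/sys/net/ipv4/ip_local_port_range ]; then
    format_port_range "$(cat /proc/sys/net/ipv4/ip_local_port_range)"
fi
```

The printed range can then be used in place of the broader 1024-65535 rule if you prefer a tighter ACL.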

More information

For detailed information on network ACLs within AWS, refer to Network ACLs in the Amazon documentation.

For detailed information on ephemeral ports within AWS, refer to Ephemeral Ports in the Amazon documentation.

1.3 - Configure TCP keepalive with AWS network load balancer

AWS supports three types of elastic load balancers (ELBs): Classic Load Balancer, Application Load Balancer, and Network Load Balancer.

Vertica strongly recommends the AWS Network Load Balancer (NLB), which provides the best performance with your Vertica database. The Network Load Balancer acts as a proxy between clients (such as JDBC) and Vertica servers. The Classic and Application Load Balancers do not work with Vertica, in Enterprise Mode or Eon Mode.

To avoid timeouts and hangs when connecting to Vertica through the NLB, it is important to understand how AWS NLB handles idle timeouts for connections. For the NLB, AWS sets the idle timeout value to 350 seconds and you cannot change this value. The timeout applies to both connection points.

For a long-running query, if either the client or the server fails to send a timely keepalive, that side of the connection is terminated. This can lead to situations where a JDBC client hangs waiting for results that would never be returned because the server fails to send a keepalive within 350 seconds.

To identify an idle timeout/keepalive issue, run a query like this via a client such as JDBC:

=> SELECT SLEEP(355);

If there’s a problem, one of the following situations occurs:

  • The client connection terminates before 355 seconds. In this case, lower the JDBC keepalive setting so that keepalives are sent less than 350 seconds apart.

  • The client connection doesn’t return a result after 355 seconds. In this case, you need to adjust the server keepalive settings (tcp_keepalive_time and tcp_keepalive_intvl) so that keepalives are sent less than 350 seconds apart.

    You can adjust the keepalive settings on the server, or you can adjust them in Vertica.
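The server-side check can be sketched as follows, assuming a Linux node that exposes the standard kernel settings under /proc/sys/net/ipv4. The helper name `keepalive_ok` and the variable `NLB_IDLE_TIMEOUT` are hypothetical; the 350-second limit is the NLB idle timeout described above.

```shell
#!/bin/sh
# NLB idle timeout in seconds, as stated in this section.
NLB_IDLE_TIMEOUT=350

# Hypothetical helper: succeed only if both the delay before the first
# keepalive probe (tcp_keepalive_time) and the interval between probes
# (tcp_keepalive_intvl) are below the NLB idle timeout.
keepalive_ok() {
    [ "$1" -lt "$NLB_IDLE_TIMEOUT" ] && [ "$2" -lt "$NLB_IDLE_TIMEOUT" ]
}

# On a Linux node, read the current kernel values and report.
if [ -r /proc/sys/net/ipv4/tcp_keepalive_time ]; then
    t=$(cat /proc/sys/net/ipv4/tcp_keepalive_time)
    i=$(cat /proc/sys/net/ipv4/tcp_keepalive_intvl)
    if keepalive_ok "$t" "$i"; then
        echo "keepalive settings ($t s, $i s) are below the NLB timeout"
    else
        echo "keepalive settings ($t s, $i s) need to be lowered"
    fi
fi
```

If the values need to be lowered at the operating-system level, `sysctl -w net.ipv4.tcp_keepalive_time=<seconds>` is one way to do so; the value you choose depends on your environment.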

For detailed information about AWS Network Load Balancers, see the AWS documentation.

1.4 - Create and assign an internet gateway

When you create a VPC, an internet gateway is automatically assigned to it. You can use that gateway, or you can assign your own. If you are using the default internet gateway, continue with the procedure described in Create a security group.

Otherwise, create an internet gateway specific to your needs. Associate that internet gateway with your VPC and subnet.

For information about how to create an internet gateway, see Internet Gateways in the AWS documentation.

1.5 - Assign an elastic IP address

An elastic IP address is an unchanging IP address that you can use to connect to your cluster externally. Vertica recommends you assign a single elastic IP to a node in your cluster. You can then connect to other nodes in your cluster from your primary node using their internal IP addresses dictated by your VPC settings.

Create an elastic IP address. For information, see Elastic IP Addresses in the AWS documentation.

1.6 - Create a security group

The Vertica AMI has specific security group requirements. When you create a Virtual Private Cloud (VPC), AWS automatically creates a default security group and assigns it to the VPC. You can use the default security group, or you can name and assign your own.

Create and name your own security group using the following basic security group settings. You may make additional modifications based on your specific needs.

Inbound

Type | Use | Protocol | Port Range | IP
SSH |  | TCP | 22 | The CIDR address range of administrative systems that require SSH access to the Vertica nodes. Make this range as restrictive as possible. You can add multiple rules for separate network ranges, if necessary.
DNS (UDP) |  | UDP | 53 | Your private subnet address range (for example, 10.0.0.0/24).
Custom UDP | Spread | UDP | 4803 and 4804 | Your private subnet address range (for example, 10.0.0.0/24).
Custom TCP | Spread | TCP | 4803 | Your private subnet address range (for example, 10.0.0.0/24).
Custom TCP | VSQL/SQL | TCP | 5433 | The CIDR address range of client systems that require access to the Vertica nodes. This range should be as restrictive as possible. You can add multiple rules for separate network ranges, if necessary.
Custom TCP | Inter-node Communication | TCP | 5434 | Your private subnet address range (for example, 10.0.0.0/24).
Custom TCP |  | TCP | 5444 | Your private subnet address range (for example, 10.0.0.0/24).
Custom TCP | MC | TCP | 5450 | The CIDR address of client systems that require access to the management console. This range should be as restrictive as possible. You can add multiple rules for separate network ranges, if necessary.
Custom TCP | Rsync | TCP | 50000 | Your private subnet address range (for example, 10.0.0.0/24).
ICMP | Installer | Echo Reply | N/A | Your private subnet address range (for example, 10.0.0.0/24).
ICMP | Installer | Traceroute | N/A | Your private subnet address range (for example, 10.0.0.0/24).

Outbound

Type | Protocol | Port Range | Destination | IP
All TCP | TCP | 0-65535 | Anywhere | 0.0.0.0/0
All ICMP | ICMP | 0-65535 | Anywhere | 0.0.0.0/0
All UDP | UDP | 0-65535 | Anywhere | 0.0.0.0/0

For information about what a security group is, as well as how to create one, see Amazon EC2 Security Groups for Linux Instances in the AWS documentation.

2 - Deploy AWS instances for your Vertica database cluster

After you Configure your network, you can create AWS instances and deploy Vertica. Follow these procedures to deploy and run Vertica on AWS.

2.1 - Configure and launch an instance

After you configure your network settings on AWS, configure and launch the instances where you will install Vertica. An Elastic Compute Cloud (EC2) instance without a Vertica AMI is similar to a traditional host. Just like with an on-premises cluster, you must prepare and configure your cluster and network at the hardware level before you can install Vertica.

When you create an EC2 instance on AWS using a Vertica AMI, the instance includes the Vertica software and the recommended configuration. Vertica recommends that you use the Vertica AMI unmodified. The Vertica AMI acts as a template, requiring fewer configuration steps:

  1. Choose a Vertica AMI operating system.

  2. Configure EC2 instances.

  3. Add storage to instances.

  4. Optionally, configure EBS volumes as a RAID array.

  5. Set the security group and S3 access.

  6. Launch instances and verify they are running.

OpenText provides Vertica and Management Console AMIs on the Red Hat Enterprise Linux 8 operating system.

You can use the AMI to deploy MC hosts or cluster hosts. For more information, see the AWS Marketplace.

Configure EC2 instances in AWS

  1. Select a Vertica AMI from the AWS Marketplace. For instance type recommendations for Eon Mode databases, see Choosing AWS Eon Mode Instance Types.

  2. Select the desired fulfillment method.

  3. Configure the following:

Add storage to instances

Consider the following issues when you add storage to your instances:

  • Add a number of drives equal to the number of physical cores in your instance—for example, for a c3.8xlarge instance, 16 drives; for an r3.4xlarge, 8 drives.

  • Do not store your information on the root volume.

  • Amazon EBS provides durable, block-level storage volumes that you can attach to running instances. For guidance on selecting and configuring an Amazon EBS volume type, see Amazon EBS Volume Types.

Configure EBS volumes as a RAID array

You can configure your EBS volumes into a RAID 0 array to improve disk performance. Before doing so, use the vioperf utility to determine whether the performance of the EBS volumes is fast enough without using them in a RAID array. Pass vioperf the path to a mount point for an EBS volume. In this example, an EBS volume is mounted on a directory named /vertica/data:

[dbadmin@ip-10-11-12-13 ~]$ /opt/vertica/bin/vioperf /vertica/data

The minimum required I/O is 20 MB/s read and write per physical processor core on
each node, in full duplex i.e. reading and writing at this rate simultaneously,
concurrently on all nodes of the cluster. The recommended I/O is 40 MB/s per
physical core on each node. For example, the I/O rate for a server node with 2
hyper-threaded six-core CPUs is 240 MB/s required minimum, 480 MB/s recommended.

Using direct io (buffer size=1048576, alignment=512) for directory "/vertica/data"

test      | directory     | counter name        | counter | counter   | counter       | counter       | thread | %CPU  | %IO Wait  | elapsed | remaining
          |               |                     | value   | value (10 | value/core    | value/core    | count  |       |           | time (s)| time (s)
          |               |                     |         | sec avg)  |               | (10 sec avg)  |        |       |           |         |
--------------------------------------------------------------------------------------------------------------------------------------------------------
Write     | /vertica/data | MB/s                | 259     | 259       | 32.375        | 32.375        | 8      | 4     | 11        | 10      | 65
Write     | /vertica/data | MB/s                | 248     | 232       | 31            | 29            | 8      | 4     | 11        | 20      | 55
Write     | /vertica/data | MB/s                | 240     | 234       | 30            | 29.25         | 8      | 4     | 11        | 30      | 45
Write     | /vertica/data | MB/s                | 240     | 233       | 30            | 29.125        | 8      | 4     | 13        | 40      | 35
Write     | /vertica/data | MB/s                | 240     | 233       | 30            | 29.125        | 8      | 4     | 13        | 50      | 25
Write     | /vertica/data | MB/s                | 240     | 232       | 30            | 29            | 8      | 4     | 12        | 60      | 15
Write     | /vertica/data | MB/s                | 240     | 238       | 30            | 29.75         | 8      | 4     | 12        | 70      | 5
Write     | /vertica/data | MB/s                | 240     | 235       | 30            | 29.375        | 8      | 4     | 12        | 75      | 0
ReWrite   | /vertica/data | (MB-read+MB-write)/s| 237+237 | 237+237   | 29.625+29.625 | 29.625+29.625 | 8      | 4     | 22        | 10      | 65
ReWrite   | /vertica/data | (MB-read+MB-write)/s| 235+235 | 234+234   | 29.375+29.375 | 29.25+29.25   | 8      | 4     | 20        | 20      | 55
ReWrite   | /vertica/data | (MB-read+MB-write)/s| 234+234 | 235+235   | 29.25+29.25   | 29.375+29.375 | 8      | 4     | 20        | 30      | 45
ReWrite   | /vertica/data | (MB-read+MB-write)/s| 233+233 | 234+234   | 29.125+29.125 | 29.25+29.25   | 8      | 4     | 18        | 40      | 35
ReWrite   | /vertica/data | (MB-read+MB-write)/s| 233+233 | 234+234   | 29.125+29.125 | 29.25+29.25   | 8      | 4     | 20        | 50      | 25
ReWrite   | /vertica/data | (MB-read+MB-write)/s| 234+234 | 235+235   | 29.25+29.25   | 29.375+29.375 | 8      | 3     | 19        | 60      | 15
ReWrite   | /vertica/data | (MB-read+MB-write)/s| 233+233 | 236+236   | 29.125+29.125 | 29.5+29.5     | 8      | 4     | 21        | 70      | 5
ReWrite   | /vertica/data | (MB-read+MB-write)/s| 232+232 | 236+236   | 29+29         | 29.5+29.5     | 8      | 4     | 21        | 75      | 0
Read      | /vertica/data | MB/s                | 248     | 248       | 31            | 31            | 8      | 4     | 12        | 10      | 65
Read      | /vertica/data | MB/s                | 241     | 236       | 30.125        | 29.5          | 8      | 4     | 15        | 20      | 55
Read      | /vertica/data | MB/s                | 240     | 232       | 30            | 29            | 8      | 4     | 10        | 30      | 45
Read      | /vertica/data | MB/s                | 240     | 232       | 30            | 29            | 8      | 4     | 12        | 40      | 35
Read      | /vertica/data | MB/s                | 240     | 234       | 30            | 29.25         | 8      | 4     | 12        | 50      | 25
Read      | /vertica/data | MB/s                | 238     | 235       | 29.75         | 29.375        | 8      | 4     | 15        | 60      | 15
Read      | /vertica/data | MB/s                | 238     | 232       | 29.75         | 29            | 8      | 4     | 13        | 70      | 5
Read      | /vertica/data | MB/s                | 238     | 238       | 29.75         | 29.75         | 8      | 3     | 9         | 75      | 0
SkipRead  | /vertica/data | seeks/s             | 22909   | 22909     | 2863.62       | 2863.62       | 8      | 0     | 6         | 10      | 65
SkipRead  | /vertica/data | seeks/s             | 21989   | 21068     | 2748.62       | 2633.5        | 8      | 0     | 6         | 20      | 55
SkipRead  | /vertica/data | seeks/s             | 21639   | 20936     | 2704.88       | 2617          | 8      | 0     | 7         | 30      | 45
SkipRead  | /vertica/data | seeks/s             | 21478   | 20999     | 2684.75       | 2624.88       | 8      | 0     | 6         | 40      | 35
SkipRead  | /vertica/data | seeks/s             | 21381   | 20995     | 2672.62       | 2624.38       | 8      | 0     | 5         | 50      | 25
SkipRead  | /vertica/data | seeks/s             | 21310   | 20953     | 2663.75       | 2619.12       | 8      | 0     | 5         | 60      | 15
SkipRead  | /vertica/data | seeks/s             | 21280   | 21103     | 2660          | 2637.88       | 8      | 0     | 8         | 70      | 5
SkipRead  | /vertica/data | seeks/s             | 21272   | 21142     | 2659          | 2642.75       | 8      | 0     | 6         | 75      | 0

If the EBS volume read and write performance (the entries with Read and Write in column 1 of the output) is greater than 20MB/s per physical processor core (columns 6 and 7), you do not need to configure the EBS volumes as a RAID array to meet the minimum requirements to run Vertica. You may still consider configuring your EBS volumes as a RAID array if the performance is less than the optimal 40MB/s per physical core (as is the case in this example).

If you determine that you need to configure your EBS volumes as a RAID 0 array, see the AWS documentation topic RAID Configuration on Linux for the steps you need to take.
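The arithmetic behind the vioperf thresholds can be sketched as below. The function names `io_targets` and `classify_rate` are hypothetical; the 20 MB/s and 40 MB/s per-core figures come from the vioperf banner shown above, and the sample inputs are taken from that output.

```shell
#!/bin/sh
# Minimum and recommended MB/s per physical core, from the vioperf banner.
MIN_PER_CORE=20
REC_PER_CORE=40

# Hypothetical helper: print the required and recommended node-level
# throughput in MB/s for a given number of physical cores.
io_targets() {
    echo "$(( $1 * MIN_PER_CORE )) $(( $1 * REC_PER_CORE ))"
}

# Hypothetical helper: classify a measured per-core rate, which may be
# fractional, against the two thresholds.
classify_rate() {
    awk -v r="$1" -v lo="$MIN_PER_CORE" -v hi="$REC_PER_CORE" 'BEGIN {
        if (r < lo)      print "below minimum"
        else if (r < hi) print "meets minimum, below recommended"
        else             print "meets recommended"
    }'
}

io_targets 12        # two six-core CPUs, as in the banner's example
classify_rate 29.75  # final Read value/core in the sample output
```

For the sample run, the per-core read rate falls between the two thresholds, which matches the conclusion that a RAID array is optional but may still be worth considering.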

Security group and access

  1. Choose between your previously configured security group or the default security group.

  2. Configure S3 access for your nodes by creating and assigning an IAM role to your EC2 instance. See AWS authentication for more information.

2.2 - Connect to an instance

Using your private key, take these steps to connect to your cluster through the instance to which you attached an elastic IP address:

  1. As the dbadmin user, type the following command, substituting your ssh key:

    $ ssh -i <ssh key> dbadmin@elasticipaddress
    
  2. Select Instances from the Navigation panel.

  3. Select the instance that is attached to the Elastic IP.

  4. Click Connect.

  5. On Connect to Your Instance, choose one of the following options:

    • A Java SSH Client directly from my browser: Add the path to your private key in the field Private key path, and click Launch SSH Client.

    • Connect with a standalone SSH client: Follow the steps required by your standalone SSH client.

Connect to an instance from Windows using PuTTY

If you connect to the instance from the Windows operating system, and plan to use PuTTY:

  1. Convert your key file using PuTTYgen.

  2. Connect with PuTTY or WinSCP (connect via the elastic IP), using your converted key (that is, the *.ppk file).

  3. Move your key file (the *.pem file) to the root directory using PuTTY or WinSCP.

2.3 - Prepare instances for cluster formation

After you create your instances, you need to prepare them for cluster formation. Prepare your instances by adding your AWS .pem key and your Vertica license.

By default, each AMI includes a Community Edition license. Once Vertica is installed, you can find the license at this location:

/opt/vertica/config/licensing/vertica_community_edition.license.key

  1. As the dbadmin user, copy your *.pem file (from where you saved it locally) onto your primary instance.

    Depending upon the procedure you use to copy the file, the permissions on the file may change. If permissions change, the install_vertica script fails with a message similar to the following:

    FATAL (19): Failed Login Validation 10.0.3.158, cannot resolve or connect to host as root.
    

    If you receive a failure message, enter the following command to correct permissions on your *.pem file:

    $ chmod 600 /<name-of-pem>.pem
    
  2. Copy your Vertica license over to your primary instance, placing it in your home directory or other known location.
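To avoid the login validation failure described in step 1, the key's permissions can be checked before running the installer. This is a minimal sketch that assumes GNU `stat` (as found on Red Hat Enterprise Linux); the function name `fix_key_permissions` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: make sure a key file is readable and writable
# only by its owner (mode 600), changing it only when needed, and
# print the resulting mode.
fix_key_permissions() {
    key=$1
    mode=$(stat -c '%a' "$key")   # GNU stat: print octal permissions
    if [ "$mode" != "600" ]; then
        chmod 600 "$key"
    fi
    stat -c '%a' "$key"
}
```

For example, `fix_key_permissions ~/name-of-pem.pem` would leave a correctly protected key untouched and tighten one whose permissions changed during the copy.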

2.4 - Change instances on AWS

You can change instance types on AWS. For example, you can downgrade a c3.8xlarge instance to c3.4xlarge. See Supported AWS instance types for a list of valid AWS instances.

When you change AWS instances you may need to:

  • Reconfigure memory settings

  • Reset memory size in a resource pool

  • Reset number of CPUs in a resource pool

Reconfigure memory settings

If you change to an AWS instance type that requires a different amount of memory, you may need to recompute the following and then reset the values:

Reset memory size in a resource pool

If you used absolute memory in a resource pool, you may need to reconfigure the memory using the MEMORYSIZE parameter in ALTER RESOURCE POOL.
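As an illustration only, the sketch below prints an ALTER RESOURCE POOL statement that sets MEMORYSIZE. The function name `pool_memory_sql`, the pool name `my_pool`, and the size `4G` are all hypothetical values chosen for the example, not recommendations.

```shell
#!/bin/sh
# Hypothetical helper: print an ALTER RESOURCE POOL statement that sets
# MEMORYSIZE. Arguments: pool name, new size with unit (for example 4G).
pool_memory_sql() {
    printf "ALTER RESOURCE POOL %s MEMORYSIZE '%s';\n" "$1" "$2"
}

# Illustrative values only: a pool named my_pool resized to 4G.
pool_memory_sql my_pool 4G
```

Printing the statement first makes it easy to review before running it in a SQL client such as vsql.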

Reset number of CPUs in a resource pool

If your new instance requires a different number of CPUs, you may need to reset the CPUAFFINITYSET parameter in ALTER RESOURCE POOL.

2.5 - Configure storage

Vertica recommends that you store information — especially your data and catalog directories — on dedicated Amazon EBS volumes formatted with a supported file system. The /opt/vertica/sbin/configure_software_raid.sh script automates the storage configuration process.

Vertica performance tests Eon Mode with a per-node EBS volume of up to 2TB. For best performance, combine multiple EBS volumes into a RAID 0 array.

For more information about RAID 0 arrays and EBS volumes, see RAID configuration on Linux.

Determining volume names

Because the storage configuration script requires the volume names that you want to configure, you must identify the volumes on your machine. The following command lists the contents of the /dev directory. Search for the volumes that begin with xvd:

$ ls /dev
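Rather than scanning the full listing by eye, the search can be narrowed to names that begin with xvd. This is a minimal sketch; the function name `list_xvd_volumes` is hypothetical, and it accepts an optional directory argument only so that it can be tried outside /dev.

```shell
#!/bin/sh
# Hypothetical helper: print the full path of every entry in a
# directory (default /dev) whose name begins with "xvd".
list_xvd_volumes() {
    dir=${1:-/dev}
    for path in "$dir"/xvd*; do
        # Skip the unexpanded pattern when nothing matches.
        [ -e "$path" ] && echo "$path"
    done
    return 0
}

list_xvd_volumes
```

The paths it prints are the volume names to note for the storage configuration script.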

Combining volumes for storage

The configure_software_raid.sh shell script combines your EBS volumes into a RAID 0 array. Use it as follows:

  1. Edit the /opt/vertica/sbin/configure_software_raid.sh shell file as follows:

    1. Comment out the safety exit command at the beginning of the file.

    2. Change the sample volume names to your own volume names, which you noted previously. Add more volumes, if necessary.

  2. Run the /opt/vertica/sbin/configure_software_raid.sh shell file. Running this file creates a RAID 0 volume and mounts it to /vertica/data.

  3. Change the owner of the newly created volume to dbadmin with chown.

  4. Repeat steps 1-3 on each node in your cluster.

2.6 - Create a cluster

On AWS, use the install_vertica script to combine instances and create a cluster. Check your My Instances page on AWS for a list of current instances and their associated IP addresses. You need these IP addresses when you run install_vertica.

Create a cluster as follows:

  1. While connected to your primary instance, enter the following command to combine your instances into a cluster. Substitute the IP addresses for your instances and include your root *.pem file name.

    $ sudo /opt/vertica/sbin/install_vertica --hosts 10.0.11.164,10.0.11.165,10.0.11.166 \
      --dba-user-password-disabled --point-to-point --data-dir /vertica/data \
      --ssh-identity ~/name-of-pem.pem --license license.file
    
  2. After combining your instances, Vertica recommends deleting your *.pem key from your cluster to reduce security risks. The example below uses the shred command to overwrite the file's contents; add the -u option if you also want shred to remove the file:

    $ shred name-of-pem.pem
    
  3. After creating one or more clusters, create your database or connect to Management Console on AWS.

For complete information on the install_vertica script and its parameters, see Install Vertica with the installation script.
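The --hosts argument in step 1 is a comma-separated list with no spaces. When the addresses are collected one by one from the instance list, a small helper can assemble that value. The function name `join_hosts` is hypothetical, and the addresses are the sample ones used above.

```shell
#!/bin/sh
# Hypothetical helper: join its arguments with commas, producing the
# value expected by the --hosts option.
join_hosts() {
    old_ifs=$IFS
    IFS=,
    echo "$*"      # "$*" joins arguments with the first character of IFS
    IFS=$old_ifs
}

hosts=$(join_hosts 10.0.11.164 10.0.11.165 10.0.11.166)
echo "--hosts $hosts"
```

The resulting string can be substituted into the install_vertica command line in place of the hard-coded addresses.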

Check open ports manually using the netcat utility

Once your cluster is up and running, you can check ports manually through the command line using the netcat (nc) utility. What follows is an example using the utility to check ports.

Before performing the procedure, choose the private IP addresses of two nodes in your cluster.

The examples given below use nodes with the private IPs:

10.0.11.60 10.0.11.61

Install the nc utility on your nodes. Once installed, you can issue commands to check the ports on one node from another node.

To check a TCP port:

  1. Put one node in listen mode and specify the port. The following sample shows how to put IP 10.0.11.60 into listen mode for port 4804:

    [root@ip-10-0-11-60 ~]# nc -l 4804

  2. From the other node, run nc specifying the IP address of the node you just put in listen mode, and the same port number:

    [root@ip-10-0-11-61 ~]# nc 10.0.11.60 4804

  3. Enter sample text from either node and it should show up on the other node. To cancel after you have checked a port, enter Ctrl+C.

To check a UDP port, use the same nc commands with the -u option:

    [root@ip-10-0-11-60 ~]# nc -u -l 4804
    [root@ip-10-0-11-61 ~]# nc -u 10.0.11.60 4804
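For a quick, non-interactive TCP check, the sketch below swaps nc for bash's built-in /dev/tcp redirection, so it is an alternative technique rather than the procedure above. It assumes `bash` and the coreutils `timeout` command are available; the function name `check_tcp_port` is hypothetical. It covers TCP only, so UDP ports still need the nc -u commands.

```shell
#!/bin/sh
# Hypothetical helper: report whether a TCP port on a host accepts
# connections. Uses bash's /dev/tcp redirection instead of nc, and
# gives up after 3 seconds.
check_tcp_port() {
    if timeout 3 bash -c "exec 3<>/dev/tcp/$1/$2" 2>/dev/null; then
        echo "$1:$2 open"
    else
        echo "$1:$2 closed"
    fi
}

# Example (run from another node in the cluster):
#   check_tcp_port 10.0.11.60 5433
```

A port reports as open only if a process is listening on it and the security group allows the connection, so a closed result can point to either cause.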
    

2.7 - Management Console on AWS

Management Console (MC) is a database management tool that allows you to view and manage aspects of your cluster. Vertica provides an MC AMI, which you can use with AWS. The MC AMI allows you to create an instance, dedicated to running MC, that you can attach to a new or existing Vertica cluster on AWS. You can create and attach an MC instance to your Vertica on AWS cluster at any time.

After you launch your MC instance and configure your security group settings, you can log in to your database. To do so, use the elastic IP you specified during instance creation.

From this elastic IP, you can manage your Vertica database on AWS using standard MC procedures.

Considerations when using MC on AWS

  • Because MC is already installed on the MC AMI, the MC installation process does not apply.

  • To uninstall MC on AWS, follow the procedures provided in Uninstalling Management Console before terminating the MC Instance.