Add the required network inbound and outbound rules to the network ACL associated with the VPC.
Note
A Vertica cluster must be operated within a single availability zone.
For information about VPCs, including how to create one, visit the AWS documentation.
3 - Install Vertica with manually deployed AWS resources
Vertica provides an AMI that you can install on AWS resources that you manually deploy. This section will guide you through configuring your network settings on AWS, launching and preparing EC2 instances using the Vertica AMI, and creating a Vertica cluster on those EC2 instances.
Choose this method of installation if you are familiar with configuring AWS and have many specific AWS configuration needs. (To automatically deploy AWS resources and a Vertica cluster instead, see Installing Vertica with CloudFormation templates.)
3.1 - Configure your network
Before you create your cluster, you must configure the network on which Vertica will run. Vertica requires a number of specific network configurations to operate on AWS. You may also have specific network configuration needs beyond the default Vertica settings.
Important
You can create a Vertica database on AWS that uses IPv6 for internal communications. However, if you do so, you must identify the hosts in your cluster using IP addresses rather than host names: the AWS DNS resolution service is incompatible with IPv6.
The following sections explain which Amazon EC2 features you need to configure for instance creation.
3.1.1 - Create a placement group, key pair, and VPC
Part of configuring your network for AWS is to create the following:
Create a placement group
A placement group is a logical grouping of instances in a single availability zone. Placement groups are required for clusters, and all Vertica nodes must be in the same placement group.

Vertica recommends placement groups for applications that benefit from low network latency, high network throughput, or both. To provide the lowest latency and the highest packet-per-second network performance for your placement group, choose an instance type that supports enhanced networking.
For information on creating placement groups, see Placement Groups in the AWS documentation.
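As a sketch, a cluster placement group can also be created with the AWS CLI; the group name here is an assumed example:

```shell
# "vertica-cluster-pg" is an illustrative name. The "cluster" strategy packs
# instances close together for low latency and high throughput.
aws ec2 create-placement-group \
    --group-name vertica-cluster-pg \
    --strategy cluster
```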
Create a key pair
You need a key pair to access your instances using SSH. Create the key pair using the AWS interface and store a copy of your key (*.pem) file on your local machine. When you access an instance, you need to know the local path of your key.

For information on creating a key pair, see Amazon EC2 Key Pairs in the AWS documentation.
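A sketch of the same step with the AWS CLI; the key name is an assumed example:

```shell
# Create the key pair and save the private key locally.
aws ec2 create-key-pair --key-name vertica-key \
    --query 'KeyMaterial' --output text > vertica-key.pem
# ssh refuses keys with loose permissions, so restrict the file.
chmod 400 vertica-key.pem
```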
Create a virtual private cloud (VPC)
You create a Virtual Private Cloud (VPC) on Amazon so that you can create a network of your EC2 instances. Your instances in the VPC all share the same network and security settings.
A Vertica cluster on AWS must be logically located in the same network. Create a VPC to ensure the nodes in your cluster can communicate with each other in AWS.
Create a single public subnet VPC with the following configurations:
Note
A Vertica cluster must be operated in a single availability zone.
For information on creating a VPC, see Create a Virtual Private Cloud (VPC) in the AWS documentation.
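As a sketch with the AWS CLI, a VPC with a single public subnet in one availability zone might be created like this; the CIDR blocks, VPC ID, and zone are illustrative:

```shell
# Create the VPC, then a single public subnet confined to one availability zone.
aws ec2 create-vpc --cidr-block 10.0.0.0/16
aws ec2 create-subnet --vpc-id vpc-0123456789abcdef0 \
    --cidr-block 10.0.0.0/24 --availability-zone us-east-1a
```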
3.1.2 - Network ACL settings
Vertica requires the following basic network access control list (ACL) settings on an AWS instance running the Vertica AMI. Vertica recommends that you secure your network with additional ACL settings that are appropriate to your situation.
Inbound Rules
| Type | Protocol | Port Range | Use | Source | Allow/Deny |
|------|----------|------------|-----|--------|------------|
| SSH | TCP (6) | 22 | SSH (Optional—for access to your cluster from outside your VPC) | User Specific | Allow |
| Custom TCP Rule | TCP (6) | 5450 | MC (Optional—for MC running outside of your VPC) | User Specific | Allow |
| Custom TCP Rule | TCP (6) | 5433 | SQL Clients (Optional—for access to your cluster from SQL clients) | User Specific | Allow |
| Custom TCP Rule | TCP (6) | 50000 | Rsync (Optional—for backup outside of your VPC) | User Specific | Allow |
| Custom TCP Rule | TCP (6) | 1024-65535 | Ephemeral Ports (Needed if you use any of the above) | User Specific | Allow |
| ALL Traffic | ALL | ALL | N/A | 0.0.0.0/0 | Deny |
Outbound Rules
| Type | Protocol | Port Range | Use | Source | Allow/Deny |
|------|----------|------------|-----|--------|------------|
| Custom TCP Rule | TCP (6) | 0–65535 | Ephemeral Ports | 0.0.0.0/0 | Allow |
You can use the entire port range specified in the previous table, or find your specific ephemeral ports by entering the following command:
$ cat /proc/sys/net/ipv4/ip_local_port_range
For detailed information on network ACLs within AWS, refer to Network ACLs in the Amazon documentation.
For detailed information on ephemeral ports within AWS, refer to Ephemeral Ports in the Amazon documentation.
3.1.3 - Configure TCP keepalive with AWS network load balancer
AWS supports three types of elastic load balancers (ELBs):
- Classic Load Balancers
- Application Load Balancers
- Network Load Balancers
Vertica strongly recommends the AWS Network Load Balancer (NLB), which provides the best performance with your Vertica database. The Network Load Balancer acts as a proxy between clients (such as JDBC) and Vertica servers. The Classic and Application Load Balancers do not work with Vertica in either Enterprise Mode or Eon Mode.
To avoid timeouts and hangs when connecting to Vertica through the NLB, it is important to understand how the AWS NLB handles idle timeouts for connections. For the NLB, AWS sets the idle timeout value to 350 seconds and you cannot change this value. The timeout applies to both the client and server sides of the connection.

For a long-running query, if either the client or the server fails to send a timely keepalive, that side of the connection is terminated. This can lead to situations where a JDBC client hangs waiting for results that are never returned because the server failed to send a keepalive within 350 seconds.
To identify an idle timeout/keepalive issue, run a query like this via a client such as JDBC:
=> SELECT SLEEP(355);
If there’s a problem, one of the following situations occurs:
- The client connection terminates before 355 seconds. In this case, lower the JDBC keepalive setting so that keepalives are sent less than 350 seconds apart.
- The client connection doesn't return a result after 355 seconds. In this case, adjust the server keepalive settings (tcp_keepalive_time and tcp_keepalive_intvl) so that keepalives are sent less than 350 seconds apart.
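For the server-side adjustment, a minimal sketch of kernel keepalive tuning that keeps probes well inside the 350-second window; the specific values are illustrative assumptions, not Vertica-mandated:

```shell
# Send the first keepalive probe after 120 s of idle, then every 30 s.
# Both values are comfortably below the NLB's fixed 350 s idle timeout.
sudo sysctl -w net.ipv4.tcp_keepalive_time=120
sudo sysctl -w net.ipv4.tcp_keepalive_intvl=30
# Persist the settings across reboots (file name is illustrative):
printf 'net.ipv4.tcp_keepalive_time = 120\nnet.ipv4.tcp_keepalive_intvl = 30\n' | \
    sudo tee /etc/sysctl.d/99-vertica-keepalive.conf
```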
For detailed information about AWS Network Load Balancers, see What is a Network Load Balancer? in the AWS documentation.
3.1.4 - Create and assign an internet gateway
When you create a VPC, an Internet gateway is automatically assigned to it. You can use that gateway, or you can assign your own. If you are using the default Internet gateway, continue with the procedure described in Create a security group.
Otherwise, create an Internet gateway specific to your needs. Associate that internet gateway with your VPC and subnet.
For information about how to create an Internet Gateway, see Internet Gateways in the AWS documentation.
3.1.5 - Assign an elastic IP address
An elastic IP address is an unchanging IP address that you can use to connect to your cluster externally. Vertica recommends you assign a single elastic IP to a node in your cluster. You can then connect to other nodes in your cluster from your primary node using their internal IP addresses dictated by your VPC settings.
Create an elastic IP address. For information, see Elastic IP Addresses in the AWS documentation.
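A sketch of that allocation with the AWS CLI; the instance and allocation IDs are placeholders:

```shell
# Allocate an elastic IP in the VPC, then attach it to the primary node.
aws ec2 allocate-address --domain vpc
aws ec2 associate-address --instance-id i-0123456789abcdef0 \
    --allocation-id eipalloc-0123456789abcdef0
```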
3.1.6 - Create a security group
The Vertica AMI has specific security group requirements. When you create a Virtual Private Cloud (VPC), AWS automatically creates a default security group and assigns it to the VPC. You can use the default security group, or you can name and assign your own.
Create and name your own security group using the following basic security group settings. You may make additional modifications based on your specific needs.
Inbound
| Type | Use | Protocol | Port Range | IP |
|------|-----|----------|------------|----|
| SSH | | TCP | 22 | The CIDR address range of administrative systems that require SSH access to the Vertica nodes. Make this range as restrictive as possible. You can add multiple rules for separate network ranges, if necessary. |
| DNS (UDP) | | UDP | 53 | Your private subnet address range (for example, 10.0.0.0/24). |
| Custom UDP | Spread | UDP | 4803 and 4804 | Your private subnet address range (for example, 10.0.0.0/24). |
| Custom TCP | Spread | TCP | 4803 | Your private subnet address range (for example, 10.0.0.0/24). |
| Custom TCP | VSQL/SQL | TCP | 5433 | The CIDR address range of client systems that require access to the Vertica nodes. This range should be as restrictive as possible. You can add multiple rules for separate network ranges, if necessary. |
| Custom TCP | Inter-node Communication | TCP | 5434 | Your private subnet address range (for example, 10.0.0.0/24). |
| Custom TCP | | TCP | 5444 | Your private subnet address range (for example, 10.0.0.0/24). |
| Custom TCP | MC | TCP | 5450 | The CIDR address of client systems that require access to the management console. This range should be as restrictive as possible. You can add multiple rules for separate network ranges, if necessary. |
| Custom TCP | Rsync | TCP | 50000 | Your private subnet address range (for example, 10.0.0.0/24). |
| ICMP | Installer | Echo Reply | N/A | Your private subnet address range (for example, 10.0.0.0/24). |
| ICMP | Installer | Traceroute | N/A | Your private subnet address range (for example, 10.0.0.0/24). |
Note
In Management Console (MC), the Java IANA discovery process uses port 7 once to detect if an IP address is reachable before the database import operation. Vertica tries port 7 first. If port 7 is blocked, Vertica switches to port 22.
Outbound
| Type | Protocol | Port Range | Destination | IP |
|------|----------|------------|-------------|----|
| All TCP | TCP | 0-65535 | Anywhere | 0.0.0.0/0 |
| All ICMP | ICMP | 0-65535 | Anywhere | 0.0.0.0/0 |
| All UDP | UDP | 0-65535 | Anywhere | 0.0.0.0/0 |
For information about what a security group is, as well as how to create one, see Amazon EC2 Security Groups for Linux Instances in the AWS documentation.
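As a sketch, a few of the inbound rules above could be added with the AWS CLI; the security group ID and CIDR ranges are placeholders:

```shell
# Open the client-facing SQL port and the internal Spread ports.
aws ec2 authorize-security-group-ingress --group-id sg-0123456789abcdef0 \
    --protocol tcp --port 5433 --cidr 203.0.113.0/24      # SQL clients
aws ec2 authorize-security-group-ingress --group-id sg-0123456789abcdef0 \
    --protocol udp --port 4803-4804 --cidr 10.0.0.0/24    # Spread (UDP)
aws ec2 authorize-security-group-ingress --group-id sg-0123456789abcdef0 \
    --protocol tcp --port 4803 --cidr 10.0.0.0/24         # Spread (TCP)
```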
3.2 - Deploy AWS instances for your Vertica database cluster
Once you have configured your network, you are ready to create your AWS instances and install Vertica. Follow these procedures to install and run Vertica on AWS.
3.2.1 - Configure and launch an instance
After you configure your network settings on AWS, configure and launch the instances onto which you will install Vertica. An Elastic Compute Cloud (EC2) instance without a Vertica AMI is similar to a traditional host. Just like with an on-premises cluster, you must prepare and configure your cluster and network at the hardware level before you can install Vertica.
When you create an EC2 instance on AWS using a Vertica AMI, the instance includes the Vertica software and the recommended configuration. The Vertica AMI acts as a template, requiring fewer configuration steps. Vertica recommends that you use the Vertica AMI as is—without modification.
1. Select the Vertica AMI from the AWS Marketplace.
2. Select the desired fulfillment method.
3. Configure the following:
Add storage to your instances
Consider the following issues when you add storage to your instances:

- Add a number of drives equal to the number of physical cores in your instance. For example, for a c3.8xlarge instance, add 16 drives; for an r3.4xlarge, add 8 drives.
- Do not store your information on the root volume.
- Amazon EBS provides durable, block-level storage volumes that you can attach to running instances. For guidance on selecting and configuring an Amazon EBS volume type, see Amazon EBS Volume Types in the Amazon Web Services documentation.
You can choose to configure your EBS volumes into a RAID 0 array to improve disk performance. Before doing so, use the vioperf utility to determine whether the performance of the EBS volumes is fast enough without using them in a RAID array. Pass vioperf the path to a mount point for an EBS volume. In this example, an EBS volume is mounted on a directory named /vertica/data:
[dbadmin@ip-10-11-12-13 ~]$ /opt/vertica/bin/vioperf /vertica/data
The minimum required I/O is 20 MB/s read and write per physical processor core on
each node, in full duplex i.e. reading and writing at this rate simultaneously,
concurrently on all nodes of the cluster. The recommended I/O is 40 MB/s per
physical core on each node. For example, the I/O rate for a server node with 2
hyper-threaded six-core CPUs is 240 MB/s required minimum, 480 MB/s recommended.
Using direct io (buffer size=1048576, alignment=512) for directory "/vertica/data"
test | directory | counter name | counter | counter | counter | counter | thread | %CPU | %IO Wait | elapsed | remaining
| | | value | value (10 | value/core | value/core | count | | | time (s)| time (s)
| | | | sec avg) | | (10 sec avg) | | | | |
--------------------------------------------------------------------------------------------------------------------------------------------------------
Write | /vertica/data | MB/s | 259 | 259 | 32.375 | 32.375 | 8 | 4 | 11 | 10 | 65
Write | /vertica/data | MB/s | 248 | 232 | 31 | 29 | 8 | 4 | 11 | 20 | 55
Write | /vertica/data | MB/s | 240 | 234 | 30 | 29.25 | 8 | 4 | 11 | 30 | 45
Write | /vertica/data | MB/s | 240 | 233 | 30 | 29.125 | 8 | 4 | 13 | 40 | 35
Write | /vertica/data | MB/s | 240 | 233 | 30 | 29.125 | 8 | 4 | 13 | 50 | 25
Write | /vertica/data | MB/s | 240 | 232 | 30 | 29 | 8 | 4 | 12 | 60 | 15
Write | /vertica/data | MB/s | 240 | 238 | 30 | 29.75 | 8 | 4 | 12 | 70 | 5
Write | /vertica/data | MB/s | 240 | 235 | 30 | 29.375 | 8 | 4 | 12 | 75 | 0
ReWrite | /vertica/data | (MB-read+MB-write)/s| 237+237 | 237+237 | 29.625+29.625 | 29.625+29.625 | 8 | 4 | 22 | 10 | 65
ReWrite | /vertica/data | (MB-read+MB-write)/s| 235+235 | 234+234 | 29.375+29.375 | 29.25+29.25 | 8 | 4 | 20 | 20 | 55
ReWrite | /vertica/data | (MB-read+MB-write)/s| 234+234 | 235+235 | 29.25+29.25 | 29.375+29.375 | 8 | 4 | 20 | 30 | 45
ReWrite | /vertica/data | (MB-read+MB-write)/s| 233+233 | 234+234 | 29.125+29.125 | 29.25+29.25 | 8 | 4 | 18 | 40 | 35
ReWrite | /vertica/data | (MB-read+MB-write)/s| 233+233 | 234+234 | 29.125+29.125 | 29.25+29.25 | 8 | 4 | 20 | 50 | 25
ReWrite | /vertica/data | (MB-read+MB-write)/s| 234+234 | 235+235 | 29.25+29.25 | 29.375+29.375 | 8 | 3 | 19 | 60 | 15
ReWrite | /vertica/data | (MB-read+MB-write)/s| 233+233 | 236+236 | 29.125+29.125 | 29.5+29.5 | 8 | 4 | 21 | 70 | 5
ReWrite | /vertica/data | (MB-read+MB-write)/s| 232+232 | 236+236 | 29+29 | 29.5+29.5 | 8 | 4 | 21 | 75 | 0
Read | /vertica/data | MB/s | 248 | 248 | 31 | 31 | 8 | 4 | 12 | 10 | 65
Read | /vertica/data | MB/s | 241 | 236 | 30.125 | 29.5 | 8 | 4 | 15 | 20 | 55
Read | /vertica/data | MB/s | 240 | 232 | 30 | 29 | 8 | 4 | 10 | 30 | 45
Read | /vertica/data | MB/s | 240 | 232 | 30 | 29 | 8 | 4 | 12 | 40 | 35
Read | /vertica/data | MB/s | 240 | 234 | 30 | 29.25 | 8 | 4 | 12 | 50 | 25
Read | /vertica/data | MB/s | 238 | 235 | 29.75 | 29.375 | 8 | 4 | 15 | 60 | 15
Read | /vertica/data | MB/s | 238 | 232 | 29.75 | 29 | 8 | 4 | 13 | 70 | 5
Read | /vertica/data | MB/s | 238 | 238 | 29.75 | 29.75 | 8 | 3 | 9 | 75 | 0
SkipRead | /vertica/data | seeks/s | 22909 | 22909 | 2863.62 | 2863.62 | 8 | 0 | 6 | 10 | 65
SkipRead | /vertica/data | seeks/s | 21989 | 21068 | 2748.62 | 2633.5 | 8 | 0 | 6 | 20 | 55
SkipRead | /vertica/data | seeks/s | 21639 | 20936 | 2704.88 | 2617 | 8 | 0 | 7 | 30 | 45
SkipRead | /vertica/data | seeks/s | 21478 | 20999 | 2684.75 | 2624.88 | 8 | 0 | 6 | 40 | 35
SkipRead | /vertica/data | seeks/s | 21381 | 20995 | 2672.62 | 2624.38 | 8 | 0 | 5 | 50 | 25
SkipRead | /vertica/data | seeks/s | 21310 | 20953 | 2663.75 | 2619.12 | 8 | 0 | 5 | 60 | 15
SkipRead | /vertica/data | seeks/s | 21280 | 21103 | 2660 | 2637.88 | 8 | 0 | 8 | 70 | 5
SkipRead | /vertica/data | seeks/s | 21272 | 21142 | 2659 | 2642.75 | 8 | 0 | 6 | 75 | 0
If the EBS volume read and write performance (the entries with Read and Write in column 1 of the output) is greater than 20MB/s per physical processor core (columns 6 and 7), you do not need to configure the EBS volumes as a RAID array to meet the minimum requirements to run Vertica. You may still consider configuring your EBS volumes as a RAID array if the performance is less than the optimal 40MB/s per physical core (as is the case in this example).
Note
If your EC2 instance has hyper-threading enabled, vioperf may incorrectly count the number of cores in your system. The 20MB/s throughput-per-core requirement applies only to physical cores, not virtual cores. If your EC2 instance has hyper-threading enabled, divide the counter value (column 4 in the output) by the number of physical cores. See the CPU Cores and Threads Per CPU Core Per Instance Type section in the AWS documentation topic Optimizing CPU Options for a list of physical cores in each instance type.
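The per-core arithmetic above can be sketched as follows; the counter value and core count are assumed example numbers:

```shell
COUNTER_MB_S=240     # column 4 of the vioperf output (example value)
PHYSICAL_CORES=8     # physical cores, not hyper-threaded vCPUs (example value)
PER_CORE=$(( COUNTER_MB_S / PHYSICAL_CORES ))
echo "${PER_CORE} MB/s per physical core"
if [ "$PER_CORE" -ge 40 ]; then
    echo "meets the recommended 40 MB/s"
elif [ "$PER_CORE" -ge 20 ]; then
    echo "meets the 20 MB/s minimum only"
else
    echo "below the 20 MB/s minimum"
fi
```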
If you determine you need to configure your EBS volumes as a RAID 0 array, see the AWS documentation topic RAID Configuration on Linux for the steps you need to take.
Security group and access
- Choose between your previously configured security group or the default security group.
- Configure S3 access for your nodes by creating and assigning an IAM role to your EC2 instance. See AWS authentication for more information.
Launch instances
Verify that your instances are running.
3.2.2 - Connect to an instance
Using your private key, take these steps to connect to your cluster through the instance to which you attached an elastic IP address:
1. As the dbadmin user, type the following command, substituting your ssh key:

   $ ssh -i <ssh key> dbadmin@<elastic-ip-address>

2. Select Instances from the Navigation panel.

3. Select the instance that is attached to the elastic IP.

4. Click Connect.

5. On Connect to Your Instance, choose one of the following options:

   - A Java SSH Client directly from my browser—Add the path to your private key in the field Private key path, and click Launch SSH Client.
   - Connect with a standalone SSH client—Follow the steps required by your standalone SSH client.
Connect to an instance from Windows using PuTTY

If you connect to the instance from the Windows operating system and plan to use PuTTY:

1. Convert your key file using PuTTYgen.
2. Connect with PuTTY or WinSCP (via the elastic IP), using your converted key (the *.ppk file).
3. Move your key file (the *.pem file) to the root directory using PuTTY or WinSCP.
3.2.3 - Prepare instances for cluster formation
After you create your instances, you need to prepare them for cluster formation. Prepare your instances by adding your AWS .pem key and your Vertica license.
By default, each AMI includes a Community Edition license. Once Vertica is installed, you can find the license at this location:
/opt/vertica/config/licensing/vertica_community_edition.license.key
1. As the dbadmin user, copy your *.pem file (from where you saved it locally) onto your primary instance.

   Depending upon the procedure you use to copy the file, the permissions on the file may change. If permissions change, the install_vertica script fails with a message similar to the following:

   FATAL (19): Failed Login Validation 10.0.3.158, cannot resolve or connect to host as root.

   If you receive a failure message, enter the following command to correct permissions on your *.pem file:

   $ chmod 600 /<name-of-pem>.pem

2. Copy your Vertica license over to your primary instance, placing it in your home directory or other known location.
3.2.4 - Change instances on AWS
You can change instance types on AWS. For example, you can downgrade a c3.8xlarge instance to c3.4xlarge. See Supported AWS instance types for a list of valid AWS instances.
When you change AWS instances you may need to:

- Reconfigure memory settings
- Reset memory size in a resource pool
- Reset number of CPUs in a resource pool
If you change to an AWS instance type that requires a different amount of memory, you may need to recompute the following and then reset the values:
Note
You may need root user permissions to reset these values.
Reset memory size in a resource pool
If you used absolute memory in a resource pool, you may need to reconfigure the memory using the MEMORYSIZE parameter in ALTER RESOURCE POOL.
Note
If you set memory size as a percentage when you created the original resource pool, you do not need to change it here.
Reset number of CPUs in a resource pool
If your new instance requires a different number of CPUs, you may need to reset the CPUAFFINITYSET parameter in ALTER RESOURCE POOL.
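As a sketch of both adjustments, assuming a pool named my_pool and illustrative values, the statements can be issued through vsql:

```shell
# Shrink the pool's absolute memory budget after moving to a smaller instance,
# and pin the pool to the first four CPUs. Pool name and values are assumptions.
vsql -c "ALTER RESOURCE POOL my_pool MEMORYSIZE '4G';"
vsql -c "ALTER RESOURCE POOL my_pool CPUAFFINITYSET '0-3';"
```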
3.2.5 - Configure storage
Vertica recommends that you store information — especially your data and catalog directories — on dedicated Amazon EBS volumes formatted with a supported file system. The /opt/vertica/sbin/configure_software_raid.sh script automates the storage configuration process.
Caution
Do not store information on the root volume because it might result in data loss.
Vertica performance tests Eon Mode with a per-node EBS volume of up to 2TB. For best performance, combine multiple EBS volumes into a RAID 0 array.
For more information about RAID 0 arrays and EBS volumes, see RAID configuration on Linux.
Determining volume names
Because the storage configuration script requires the volume names that you want to configure, you must identify the volumes on your machine. The following command lists the contents of the /dev directory. Search for the volumes that begin with xvd:
$ ls /dev
Important
Ignore the root volume. Do not include any of your root volumes in the RAID creation process.
Combining volumes for storage
The configure_software_raid.sh shell script combines your EBS volumes into a RAID 0 array.
Caution
Run configure_software_raid.sh in the default setting only if you have a fresh configuration with no existing RAID settings.

If you have existing RAID settings, open the script in a text editor and manually edit the raid_dev value to reflect your current RAID settings. If you have existing RAID settings and you do not edit the script, the script deletes important operating system device files.
Alternately, use the Management Console (MC) console to add storage nodes without unwanted changes to operating system device files. For more information, see Managing database clusters.
The following steps combine your EBS volumes into RAID 0 with the configure_software_raid.sh script:

1. Edit the /opt/vertica/sbin/configure_software_raid.sh shell file as follows:

   - Comment out the safety exit command at the beginning.
   - Change the sample volume names to your own volume names, which you noted previously. Add more volumes, if necessary.

2. Run the /opt/vertica/sbin/configure_software_raid.sh shell file. Running this file creates a RAID 0 volume and mounts it to /vertica/data.

3. Change the owner of the newly created volume to dbadmin with chown.

4. Repeat steps 1-3 for each node on your cluster.
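For reference, the RAID 0 assembly the script performs is roughly equivalent to the following manual sequence. Device names, the ext4 choice, and the verticadba group are assumptions; normally you should let the script do this:

```shell
# Combine two EBS volumes into a RAID 0 array, format it, and mount it.
sudo mdadm --create /dev/md0 --level=0 --raid-devices=2 /dev/xvdb /dev/xvdc
sudo mkfs.ext4 /dev/md0
sudo mkdir -p /vertica/data
sudo mount /dev/md0 /vertica/data
sudo chown dbadmin:verticadba /vertica/data
```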
3.2.6 - Create a cluster
On AWS, use the install_vertica script to combine instances and create a cluster. Check your My Instances page on AWS for a list of current instances and their associated IP addresses. You need these IP addresses when you run install_vertica.
Create a cluster as follows:
Create a cluster as follows:

1. While connected to your primary instance, enter the following command to combine your instances into a cluster. Substitute the IP addresses for your instances and include your root *.pem file name.

   $ sudo /opt/vertica/sbin/install_vertica --hosts 10.0.11.164,10.0.11.165,10.0.11.166 \
     --dba-user-password-disabled --point-to-point --data-dir /vertica/data \
     --ssh-identity ~/name-of-pem.pem --license license.file

   Note
   * If you are using Vertica Community Edition, which limits you to three instances, you can specify `-L CE` with no license file.
   * When you issue install_vertica or update_vertica on a Vertica AMI, --point-to-point is the default. This parameter configures Spread to use direct point-to-point communication between all Vertica nodes, which is a requirement for clusters on AWS.
   * If you are using IPv6 network addresses to identify the hosts in your cluster, use the --ipv6 flag in your `install_vertica` command. You must also use IP addresses instead of host names, as the AWS DNS server cannot resolve host names to IPv6 addresses.

2. After combining your instances, Vertica recommends deleting your *.pem key from your cluster to reduce security risks. The example below uses the shred command to delete the file:

   $ shred name-of-pem.pem

3. After creating one or more clusters, create your database.
For complete information on the install_vertica script and its parameters, see Installing Vertica with the installation script.
Important
Stopping or rebooting an instance or cluster without first shutting down the database may result in disk or database corruption. To safely shut down and restart your cluster, see Operating the database.
Check open ports manually using the netcat utility
Once your cluster is up and running, you can check ports manually from the command line using the netcat (nc) utility. The following example uses the utility to check ports.
Before performing the procedure, choose the private IP addresses of two nodes in your cluster.
The examples given below use nodes with the private IPs 10.0.11.60 and 10.0.11.61.
Install the nc utility on your nodes. Once installed, you can issue commands to check the ports on one node from another node.
To check a TCP port:

1. Put one node in listen mode and specify the port. The following sample shows how to put IP 10.0.11.60 into listen mode for port 4804.

   [root@ip-10-0-11-60 ~]# nc -l 4804

2. From the other node, run nc, specifying the IP address of the node you just put in listen mode and the same port number.

   [root@ip-10-0-11-61 ~]# nc 10.0.11.60 4804

3. Enter sample text from either node and it should show up on the other node. To cancel after you have checked a port, enter Ctrl+C.
Note
To check a UDP port, use the same nc commands with the -u option.
[root@ip-10-0-11-60 ~]# nc -u -l 4804
[root@ip-10-0-11-61 ~]# nc -u 10.0.11.60 4804
3.2.7 - Use Management Console (MC) on AWS
Management Console (MC) is a database management tool that allows you to view and manage aspects of your cluster. Vertica provides an MC AMI, which you can use with AWS. The MC AMI allows you to create an instance, dedicated to running MC, that you can attach to a new or existing Vertica cluster on AWS. You can create and attach an MC instance to your Vertica on AWS cluster at any time.
For information on requirements and installing MC, see Installing Management Console.
3.2.7.1 - Log in to MC and manage your cluster
After you launch your MC instance and configure your security group settings, log in to your database. To do so, use the elastic IP you specified during instance creation.
From this elastic IP, you can manage your Vertica database on AWS using standard MC procedures.
Considerations when using MC on AWS
- Because MC is already installed on the MC AMI, the MC installation process does not apply.
- To uninstall MC on AWS, follow the procedures provided in Uninstalling Management Console before terminating the MC instance.
4 - Export data to Amazon S3 using the AWS library
Deprecated
The AWS library is deprecated. To export delimited data to S3 or any other destination, use EXPORT TO DELIMITED.
The Vertica library for Amazon Web Services (AWS) is a set of functions and configurable session parameters. These parameters allow you to export delimited data from Vertica to Amazon S3 storage without any third-party scripts or programs.
To use the AWS library, you must have access to an Amazon S3 storage account.
4.1 - Configure the Vertica library for Amazon Web Services
You use the Vertica library for Amazon Web Services (AWS) to export data from Vertica to S3. This library does not support IAM authentication. You must configure it to authenticate with S3 by using session parameters containing your AWS access key credentials. You can set your session parameters directly, or you can store your credentials in a table and set them with the AWS_SET_CONFIG function.
Because the AWS library uses session parameters, you must reconfigure the library with each new session.
Note
Your AWS access key ID and secret access key are different from your account access credentials. For more information about AWS access keys, see Managing Access Keys for IAM Users in the AWS documentation.
Set AWS authentication parameters
The following AWS authentication parameters allow you to access AWS and work with the data in your Vertica database:
- aws_id: The 20-character AWS access key used to authenticate your account.
- aws_secret: The 40-character AWS secret access key used to authenticate your account.
- aws_session_token: The AWS temporary security token generated by running the AWS STS command get-session-token. This command generates temporary credentials you can use to implement multi-factor authentication for security purposes. See Implementing Multi-factor Authentication.
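With the AWS CLI, the get-session-token call for multi-factor authentication might look like this; the account ID, MFA device name, and token code are placeholders:

```shell
# Request temporary credentials tied to an MFA device.
aws sts get-session-token \
    --serial-number arn:aws:iam::123456789012:mfa/my-user \
    --token-code 123456
```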
Implement multi-factor authentication
Implement multi-factor authentication as follows:
1. Run the AWS STS command get-session-token. This returns the following:

   {
       "Credentials": {
           "SecretAccessKey": "bQid6jNuSWRqUzkIJCFG7c71gDHZY3h7aDSW2DU6",
           "SessionToken": "FQoDYXdzEBcaDKM1mWpeu88nDTTFICKsAbaiIDTWe4BTh33tnUvo9F/8mZicKKLLy7WIcpT4FLfr6ltIm242/U2CI9G/XdC6eoysUi3UGH7cxdhjxAW4fjgCKKYuNL764N2xn0issmIuJOku3GTDyc4U4iNlWyEng3SlshdiqVlk1It2Mk0isEQXKtxF9VgfncDQBxjZUCkYIzseZw5pULa9YQcJOzl+Q2JrdUCWu0iFspSUJPhOguH+wTqiM2XdHL5hcUcomqm41gU=",
           "Expiration": "2018-04-12T01:58:50Z",
           "AccessKeyId": "ASIAJ4ZYGTOSVSLUIN7Q"
       }
   }

   For more information on get-session-token, see the AWS documentation.

2. Using the SecretAccessKey returned from get-session-token, set your temporary aws_secret:

   => ALTER SESSION SET UDPARAMETER FOR awslib aws_secret='bQid6jNuSWRqUzkIJCFG7c71gDHZY3h7aDSW2DU6';

3. Using the SessionToken returned from get-session-token, set your temporary aws_session_token:

   => ALTER SESSION SET UDPARAMETER FOR awslib aws_session_token='FQoDYXdzEBcaDKM1mWpeu88nDTTFICKsAbaiIDTWe4BTh33tnUvo9F/8mZicKKLLy7WIcpT4FLfr6ltIm242/U2CI9G/XdC6eoysUi3UGH7cxdhjxAW4fjgCKKYuNL764N2xn0issmIuJOku3GTDyc4U4iNlWyEng3SlshdiqVlk1It2Mk0isEQXKtxF9VgfncDQBxjZUCkYIzseZw5pULa9YQcJOzl+Q2JrdUCWu0iFspSUJPhOguH+wTqiM2XdHL5hcUcomqm41gU=';

4. Using the AccessKeyId returned from get-session-token, set your temporary aws_id:

   => ALTER SESSION SET UDPARAMETER FOR awslib aws_id='ASIAJ4ZYGTOSVSLUIN7Q';
The Expiration value indicates when the temporary credentials expire. In this example, they expire on April 12, 2018 at 01:58:50 UTC.
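The three ALTER SESSION statements above can be generated mechanically from the JSON that get-session-token returns. The following is a minimal sketch; the credential values are shortened placeholders, not real credentials:

```python
import json

# Hypothetical response from `aws sts get-session-token`; the values
# here are placeholders standing in for real credentials.
response_text = """
{
  "Credentials": {
    "SecretAccessKey": "EXAMPLE-SECRET-KEY",
    "SessionToken": "EXAMPLE-SESSION-TOKEN",
    "Expiration": "2018-04-12T01:58:50Z",
    "AccessKeyId": "EXAMPLE-ACCESS-KEY-ID"
  }
}
"""

creds = json.loads(response_text)["Credentials"]

# Map each field of the STS response to its Vertica session parameter.
params = {
    "aws_id": creds["AccessKeyId"],
    "aws_secret": creds["SecretAccessKey"],
    "aws_session_token": creds["SessionToken"],
}

statements = [
    f"ALTER SESSION SET UDPARAMETER FOR awslib {name}='{value}';"
    for name, value in params.items()
]
for stmt in statements:
    print(stmt)
```

Feed the printed statements to vsql in the session that will access S3.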
These examples show how to implement multi-factor authentication using session parameters. You can use either of the following methods to securely set and store your AWS account credentials:
AWS access key requirements
To communicate with AWS, your access key must have the following permissions:
-
s3:GetObject
-
s3:PutObject
-
s3:ListBucket
For security purposes, Vertica recommends that you create a separate access key with limited permissions specifically for use with the Vertica Library for AWS.
These examples show how to set the session parameters for AWS using your own credentials. Parameter values are case sensitive:
-
aws_id
: This value is your AWS access key ID.
=> ALTER SESSION SET UDPARAMETER FOR awslib aws_id='AKABCOEXAMPLEPKPXYZQ';
-
aws_secret
: This value is your AWS secret access key.
=> ALTER SESSION SET UDPARAMETER FOR awslib aws_secret='CEXAMPLE3tEXAMPLE1wEXAMPLEFrFEXAMPLE6+Yz';
-
aws_region
: This value is the AWS region associated with the S3 bucket you intend to access. If left unconfigured, aws_region defaults to us-east-1, which identifies the default Amazon S3 endpoint.
=> ALTER SESSION SET UDPARAMETER FOR awslib aws_region='us-east-1';
When using ALTER SESSION:
-
Using ALTER SESSION to change the values of S3 parameters also changes the values of corresponding UDParameters.
-
Setting a UDParameter changes only the UDParameter.
-
Setting a configuration parameter changes both the AWS parameter and UDParameter.
You can place your credentials in a table and secure them with a row-level access policy. You can then call your credentials with the AWS_SET_CONFIG scalar meta-function. This approach allows you to store your credentials on your cluster for future session parameter configuration. You must have dbadmin access to create access policies.
-
Create a table with rows or columns corresponding with your credentials:
=> CREATE TABLE keychain(accesskey varchar, secretaccesskey varchar);
-
Store your credentials in the corresponding columns:
=> COPY keychain FROM STDIN;
Enter data to be copied followed by a newline.
End with a backslash and a period on a line by itself.
>> AEXAMPLEI5EXAMPLEYXQ|CCEXAMPLEtFjTEXAMPLEiEXAMPLE6+Yz
>> \.
-
Set a row-level access policy appropriate to your security situation.
-
With each new session, configure your session parameters by calling the AWS_SET_CONFIG meta-function in a SELECT statement:
=> SELECT AWS_SET_CONFIG('aws_id', accesskey), AWS_SET_CONFIG('aws_secret', secretaccesskey)
FROM keychain;
aws_set_config | aws_set_config
----------------+----------------
aws_id | aws_secret
(1 row)
-
After you have configured your session parameters, verify them:
=> SHOW SESSION UDPARAMETER ALL;
4.2 - Export data to Amazon S3 from Vertica
After you configure the library for Amazon Web Services (AWS), you can export Vertica data to Amazon S3 by calling the S3EXPORT() transform function.
After you configure the library for Amazon Web Services (AWS), you can export Vertica data to Amazon S3 by calling the S3EXPORT() transform function. S3EXPORT() writes data to files, based on the URL you provide. Vertica performs all communication over HTTPS, regardless of the URL type you use. Vertica does not support virtual-host-style URLs; if you use HTTPS URL constructions, you must use path-style URLs.
Note
If your S3 bucket contains a period in its path, set the prepend_hash
parameter to True.
You can control the output of S3EXPORT() in the following ways:
Adjust the query provided to S3EXPORT
By adjusting the query given to S3EXPORT(), you can export anything from tables to reporting queries.
This example exports a whole table:
=> SELECT S3EXPORT( * USING PARAMETERS url='s3://exampleBucket/object') OVER(PARTITION BEST)
FROM exampleTable;
rows | url
------+------------------------------
606 | https://exampleBucket/object
(1 row)
This example exports the results of a query:
=> SELECT S3EXPORT(customer_name, annual_income USING PARAMETERS url='s3://exampleBucket/object') OVER()
FROM public.customer_dimension
WHERE (customer_gender, annual_income) IN
(SELECT customer_gender, MAX(annual_income)
FROM public.customer_dimension
GROUP BY customer_gender);
rows | url
------+------------------------------
25 | https://exampleBucket/object
(1 row)
Adjust the partition of your result set with the OVER clause
Use the OVER clause to control your export partitions. Using the OVER() clause without qualification results in a single partition processed by the initiator for all of the query data. This example shows how to call the function with an unqualified OVER() clause:
=> SELECT S3EXPORT(name, company USING PARAMETERS url='s3://exampleBucket/object',
delimiter=',') OVER()
FROM exampleTable WHERE company='Vertica';
rows | url
------+------------------------------
10 | https://exampleBucket/object
(1 row)
You can also use window clauses, such as window partition clauses and window order clauses, to manage exported objects.
This example shows how you can use a window partition clause to partition S3 objects based on company values:
=> SELECT S3EXPORT(name, company
USING PARAMETERS url='s3://exampleBucket/object',
delimiter=',') OVER(PARTITION BY company) AS MEDIAN
FROM exampleTable;
Adjust the export chunk size for wide tables
You may encounter the following error when exporting extremely wide tables or tables with long data types such as LONG VARCHAR or LONG VARBINARY:
=> SELECT S3EXPORT( * USING PARAMETERS url='s3://exampleBucket/object') OVER(PARTITION BEST)
FROM veryWideTable;
ERROR 5861: Error calling setup() in User Function s3export
at [/data/.../S3.cpp:787],
error code: 0, message: The specified buffer of 10485760 bytesRead is too small,
it should be at least 11279701 bytesRead.
Vertica returns this error if the data for a single row overflows the buffer storing the data before export. By default, this buffer is 10MB. You can increase the size of this buffer using the chunksize parameter, which sets the size of the buffer in bytes. This example sets it to 60485760 bytes (about 58MB):
=> SELECT S3EXPORT( * USING PARAMETERS url='s3://exampleBucket/object', chunksize=60485760)
OVER(PARTITION BEST) FROM veryWideTable;
rows | url
------+------------------------------
606 | https://exampleBucket/object
(1 row)
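A practical way to pick a chunksize value is to sum the maximum byte widths of the exported columns and use that sum as a lower bound, since the error above occurs when a single row exceeds the buffer. A minimal sketch with a hypothetical schema:

```python
# Estimate a lower bound for the S3EXPORT chunksize parameter by summing
# the maximum widths of the exported columns (hypothetical schema).
DEFAULT_CHUNKSIZE = 10 * 1024 * 1024  # the 10MB default buffer

column_max_bytes = {
    "id": 8,                 # INT
    "notes": 11_000_000,     # LONG VARCHAR(11000000)
    "created": 32,           # TIMESTAMP rendered as text, with margin
}

required = sum(column_max_bytes.values())
# Keep the default when it is already large enough.
chunksize = max(DEFAULT_CHUNKSIZE, required)
print(f"chunksize={chunksize}")
```

For this hypothetical table the widest possible row is 11000040 bytes, so chunksize must be at least that value, matching the "should be at least N bytesRead" hint in the error message.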
5 - Add nodes to a running cluster on the cloud
There are two ways to add nodes to an AWS cluster:
-
Using Management Console
-
Using admintools
When you use MC to add nodes to a cluster in the cloud, MC provisions the instances, adds the new instances to the existing Vertica cluster, and then adds those hosts to the database. However, when you add nodes to a cluster using admintools, you need to execute those steps yourself, as explained in Adding Nodes Using admintools.
Adding nodes using Management Console
In the Vertica Management Console, you can add nodes in several ways, depending on your database mode.
For Eon Mode databases, MC supports actions for subcluster and node management for the following public and private cloud providers:
Note
Enterprise Mode does not support subclusters.
For Enterprise Mode databases, MC supports these actions:
Note
In the cloud on GCP, Enterprise Mode databases are not supported.
Adding nodes in an Eon Mode database
In an Eon Mode database, every node must belong to a subcluster. To add nodes, you always add them to one of the subclusters in the database:
Adding nodes in an Enterprise Mode database on AWS
In an Enterprise Mode database on AWS, to add an instance to your cluster:
-
On the MC Home page, click View Infrastructure to go to the Infrastructure page. This page lists all the clusters the MC is monitoring.
-
Click any cluster shown on the Infrastructure page.
-
In the dialog that appears, select View or Manage to open the Cluster page. (In a cloud environment, the button is labeled Manage if MC was deployed from a cloud template; otherwise it is labeled View.)
Note
You can click the pencil icon beside the cluster name to rename the cluster. Enter a name that is unique within MC.
-
Click the Add (+) icon on the Instance List on the Cluster Management page.
MC adds a node to the selected cluster.
Adding nodes using admintools
This section gives an overview of how to add nodes if you are managing your cluster using admintools. Each main step points to another topic with the complete instructions.
Step 1: before you start
Before you add nodes to a cluster, verify that you have an AWS cluster up and running and that you have:
-
Created a database.
-
Defined a database schema.
-
Loaded data.
-
Run the Database Designer.
-
Connected to your database.
Step 2: launch new instances to add to an existing cluster
Perform the procedure in Configure and launch an instance to create new instances (hosts) that you then will add to your existing cluster. Be sure to choose the same details you chose when you created the original instances (VPC, placement group, subnet, and security group).
Step 3: include new instances as cluster nodes
You need the IP addresses when you run the install_vertica script to include new instances as cluster nodes.
If you are configuring Amazon Elastic Block Store (EBS) volumes, be sure to configure the volumes on the node before you add the node to your cluster.
To add the new instances as nodes to your existing cluster:
-
Configure and launch your new instances.
-
Connect to the instance that is assigned to the Elastic IP. See Connect to an instance if you need more information.
-
Run the Vertica installation script to add the new instances as nodes to your cluster. Specify the internal IP addresses for your instances and your *.pem file name.
$ sudo /opt/vertica/sbin/install_vertica --add-hosts instance-ip --dba-user-password-disabled \
--point-to-point --data-dir /vertica/data --ssh-identity ~/name-of-pem.pem
Step 4: add the nodes
After you have added the new instances to your existing cluster, add them as nodes to your cluster, as described in Adding nodes to a database.
Step 5: rebalance the database
After you add nodes to a database, always rebalance the database.
6 - Remove nodes from a running AWS cluster
Use the following procedures to remove instances/nodes from an AWS cluster.
To avoid data loss, Vertica strongly recommends that you back up your database before removing a node. For details, see Backing up and restoring the database.
In this section
6.1 - Remove hosts from the database
Before you remove hosts from the database, verify that you have:
Note
Do not stop the database.
To remove a host from the database:
-
While logged on as dbadmin, launch Administration Tools.
$ /opt/vertica/bin/admintools
-
From the Main Menu, select Advanced Menu.
-
From Advanced Menu, select Cluster Management. Click OK.
-
From Cluster Management, select Remove Host(s). Click OK.
-
From Select Database, choose the database from which you plan to remove hosts. Click OK.
-
Select the host(s) to remove. Click OK.
-
Click Yes to confirm removal of the hosts.
Note
Enter a password if necessary. Leave blank if there is no password.
-
Click OK. The system displays a message telling you that the hosts have been removed. Automatic rebalancing also occurs.
-
Click OK to confirm. Administration Tools brings you back to the Cluster Management menu.
6.2 - Remove nodes from the cluster
To remove nodes from a cluster, run the update_vertica script and specify:
-
The option --remove-hosts, followed by the IP addresses of the nodes you are removing.
-
The option --ssh-identity, followed by the location and name of your *.pem file.
-
The option --dba-user-password-disabled.
The following example removes one node from the cluster:
$ sudo /opt/vertica/sbin/update_vertica --remove-hosts 10.0.11.165 --point-to-point \
--ssh-identity ~/name-of-pem.pem --dba-user-password-disabled
6.3 - Stop the AWS instances (optional)
After you have removed one or more nodes from your cluster, to save costs associated with running instances, you can choose to stop the AWS instances that were previously part of your cluster.
To stop an instance in AWS:
-
On AWS, navigate to your Instances page.
-
Right-click the instance, and choose Stop.
This step is optional because, after you have removed the node from your Vertica cluster, Vertica no longer sees the node as part of the cluster, even though it is still running within AWS.
7 - Upgrade Vertica on AWS
Before you upgrade to the latest Vertica version, do the following:
-
Back up your existing database.
-
Download the Vertica install packages described in Download and Install the Vertica Install Package.
Upgrade to the latest version of Vertica on AWS
To upgrade to the latest version of Vertica on AWS, follow the instructions in Upgrading Vertica.
If you are setting up a Vertica cluster on AWS for the first time, follow the procedure for installing and running on AWS.
Upgrade Vertica running on AWS
Vertica supports upgrades of Vertica server running on AWS instances created from the Vertica AMI. To upgrade Vertica, follow the instructions provided in Upgrading Vertica.
Make sure to add the following arguments to the upgrade script:
8 - Copying and exporting data on AWS: what you need to know
There are common issues that occur when exporting or copying on AWS clusters, as described below. Except for these specific issues as they relate to AWS, copying and exporting data works as documented in Database export and import.
To copy or export data on AWS:
-
Verify that all nodes in source and destination clusters have their own elastic IPs (or public IPs) assigned.
Each node in one cluster must be able to communicate with each node in the other cluster, so each source and destination node needs an elastic IP (or public IP) assigned. If your destination cluster is located within the same VPC as your source cluster, proceed to step 3.
-
(For non-CloudFormation Template installs) Create an S3 gateway endpoint.
If you aren't using a CloudFormation Template (CFT) to install Vertica, you must create an S3 gateway endpoint in your VPC. For more information, see the AWS documentation.
For example, the Vertica CFT has the following VPC endpoint:
"S3Endpoint" : {
"Type" : "AWS::EC2::VPCEndpoint",
"Properties" : {
"PolicyDocument" : {
"Version":"2012-10-17",
"Statement":[{
"Effect":"Allow",
"Principal": "*",
"Action":["*"],
"Resource":["*"]
}]
},
"RouteTableIds" : [ {"Ref" : "RouteTable"} ],
"ServiceName" : { "Fn::Join": [ "", [ "com.amazonaws.", { "Ref": "AWS::Region" }, ".s3" ] ] },
"VpcId" : {"Ref" : "VPC"}
}
}
-
Verify that your security group allows the AWS clusters to communicate.
Check your security groups for both your source and destination AWS clusters. Verify that ports 5433 and 5434 are open. If one of your AWS clusters is on a separate VPC, verify that your network access control list (ACL) allows communication on port 5434.
Note
This communication method exports and copies (imports) data across the Internet. Alternatively, you can use non-public IPs and gateways, or a VPN, to connect the source and destination clusters.
-
If there are one or more elastic load balancers (ELBs) between the clusters, verify that port 5433 is open between the ELBs and clusters.
-
If you use the Vertica client to connect to one or more ELBs, the ELBs only distribute the incoming connections; the data itself is transmitted directly between the clusters.
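The port checks in step 3 can be spot-checked from any node with a short TCP probe. The following is a minimal sketch; the destination address is a placeholder for one of your peer-cluster nodes:

```python
import socket

def port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# 5433 is Vertica's client port; 5434 carries cross-cluster
# import/export traffic. The host below is a placeholder.
for port in (5433, 5434):
    status = "open" if port_open("10.0.11.165", port, timeout=1.0) else "closed"
    print(f"port {port}: {status}")
```

Run the probe from a node in each cluster against a node in the other; a "closed" result points at a security group, network ACL, or routing problem between the VPCs.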