You can create Eon Mode and Enterprise Mode Vertica clusters in cloud and on-premises environments:
On the cloud: Deploy Vertica clusters running on Amazon Web Services, Microsoft Azure, or Google Cloud Platform. For setup instructions for each cloud provider, see Set up Vertica on the cloud.
On-premises: Manually install Vertica on your host hardware. For detailed instructions on the manual install process, see Set up Vertica on-premises.
1 - Plan your setup
Before you get started with Vertica, consider your business needs and available resources. Vertica is built to run in a variety of environments depending on your requirements:
You can choose to run Vertica on physical host hardware, or deploy Vertica on the cloud.
Cloud environment
Vertica can run on Amazon Web Services (AWS), Google Cloud Platform (GCP), and Microsoft Azure. You might consider running Vertica on cloud resources for any of the following benefits:
You plan to quickly scale your cluster size up and down to accommodate varying analytic workloads. You can provision more computing resources during peak workloads without incurring the same resource costs during low-demand periods. The Vertica database's Eon Mode is designed for this use case.
You prefer to pay over time (OpEx) for ongoing cloud deployment, rather than the higher up-front cost of buying hardware for on-premises deployment.
You need to reduce the costs, labor, and expertise involved in maintaining physical on-premises hardware (such as accommodating for server purchases, hardware depreciation, software maintenance, power consumption, floor space, and backup infrastructure).
You prefer simpler, faster deployment. Deploying on the cloud eliminates the need for more specific hardware expertise during setup. In addition, on cloud platforms such as AWS and GCP, Vertica offers templates that allow you to deploy a pre-configured set of resources on which Vertica and Management Console are already installed, in just a few steps.
You have highly variable workloads and do not want to pay for idle equipment in a data center when you can rent infrastructure only when you need it.
You are a start-up and don't want to build out a data center until your product or service is proven and growing.
An on-premises environment can provide benefits in cases like the following:
Your business requirements demand keeping sensitive data on-premises.
You prefer to pay a higher up-front cost (CapEx) of buying hardware for on-premises deployment, rather than potentially paying a higher long-term total cost of a cloud deployment.
You cannot rely on continuous access to the internet.
You prefer end-to-end control over your environment, rather than depending on a third-party cloud provider to store your data.
You may have already invested in a data center and suitable hardware for Vertica that you want to capitalize on.
You can create a Vertica database in one of two modes: Eon Mode or Enterprise Mode. The mode determines the database's underlying architecture, how the database cluster scales, and how data is loaded. Eon Mode uses communal storage to separate storage from compute, which lets you scale compute up or down to meet changing workloads. Enterprise Mode is a shared-nothing architecture that is optimized for data stored locally on each node. Database mode does not affect the way you run queries and other everyday tasks while using the database.
After you have decided how you will run Vertica, you can choose which setup method works for your needs.
Install Vertica manually
Manually installing Vertica through the command line works on all platforms. You will first set up a cluster of nodes, then install Vertica.
Manual installation might be right for you if your cluster will have many specific configuration requirements, and you have a database administrator with the expertise to set up the cluster manually on your chosen platform. Manual installation takes more time, but you can configure your cluster to your system's exact needs.
For an on-premises environment, you must install Vertica manually. See Set up Vertica on-premises to get started.
For AWS, Google Cloud Platform, and Microsoft Azure, you have the option to deploy automatically or install manually. See Set up Vertica on the cloud for information on manual installation on each cloud platform.
Note
If you manually install Vertica in a cloud environment, you cannot access the Management Console.
Deploy Vertica automatically or manually
Automatic deployment is available on AWS, GCP, and Microsoft Azure. Manual deployment is only available on AWS through Vertica Amazon Machine Images (AMI), which include the Vertica software and the recommended configuration.
Automatic deployment creates a pre-configured environment consisting of cloud resources on which your cluster can run, with Vertica and Management Console already installed. You can enter a few parameters into a template on your chosen platform and be up and running with Vertica.
This section explains how you can deploy Vertica clusters running on Amazon Web Services (AWS), Microsoft Azure, or Google Cloud Platform (GCP). It assumes that you are familiar with the cloud environment on which you will create your Vertica cluster.
Vertica offers simple, automatic deployment on all three platforms. By setting a few parameters, you can launch a fully functional environment with Vertica and Management Console already installed.
If launching a pre-configured environment doesn't work for your specific needs, you can instead set up your nodes in the cloud and manually install Vertica in order to have more control over your setup. AWS also supports manually deploying a Vertica Amazon Machine Image (AMI) that has Vertica pre-installed and allows for greater control over your environment configuration.
After you set up your environment, you can create a database in either Enterprise Mode or Eon Mode.
Automatic deployment (all cloud platforms)
Vertica offers automatic configuration of resources and quick deployment on the cloud.
AWS
Vertica provides CloudFormation Templates (CFTs) in the AWS Marketplace. You can use a CFT to automatically launch preconfigured AWS resources in minutes, with Vertica and Management Console automatically installed.
Each CFT includes the in-browser Vertica Management Console. When you install Vertica using one of the CFTs, Management Console provides AWS-specific cluster management options, including the ability to quickly create a new cluster and Vertica database.
For GCP, Vertica provides an automated installer that is available from the Google Cloud Marketplace.
Input a few parameters, and the Google Cloud Launcher will deploy the Vertica solution, including your new database. You can create up to a 16-node cluster. The solution includes the Vertica Management Console as the primary UI to get started.
Vertica offers a fully automated cluster deployment from the Microsoft Azure Marketplace. This solution will automatically deploy a Vertica cluster and create an initial database, allowing you to log in to the Vertica Management Console and start using it after deployment has finished.
Manual installation might be the right option for you if you have many specific configuration requirements, and have an administrator who is familiar with setting up and maintaining cloud resources in the environment of your choice. Setup and maintenance may take longer and requires more expertise, but you will have more control over how your cluster is configured.
The process of installing Vertica manually on cloud resources is very similar to doing so with on-premises hardware.
If you manually install the Vertica server on cloud hosts, you cannot access Management Console.
Vertica offers cloud-specific manual installation instructions for GCP and Azure. If you want more control over your AWS cluster configuration, see Manual Deployment. Before you install, make sure to refer to the documentation of the platform you are using in order to set up your cloud resources correctly.
Manual deployment is only available on AWS, which supports Amazon Machine Images (AMI) that include the Vertica software and the recommended configuration. The Vertica AMI acts as a template, requiring fewer configuration steps than a manual installation but still allowing control over your configuration. Vertica provides AMIs for both Management Console and cluster hosts.
Vertica clusters on AWS can operate on EC2 instances automatically provisioned using a CloudFormation Template (CFT), or manually deployed using Amazon Machine Images (AMIs). For information about these deployment methods, see Deploy Vertica using CloudFormation templates and Manually deploy Vertica on AWS.
You can deploy a Vertica database on AWS running in either Enterprise Mode or Eon Mode. The differences between these two modes lie in their architecture, deployment, and scalability:
Enterprise Mode stores data locally on the nodes in the database.
Eon Mode stores its data in an S3 bucket.
Eon Mode separates the computational processes from the communal storage layer of your database. This separation lets you elastically vary the number of nodes in your database cluster to adjust to varying workloads.
Command Line Interface: Use the AWS Command Line Interface (CLI) with your Vertica AMIs. For more information, see What Is the AWS Command Line Interface? in the AWS documentation.
Elastic Load Balancing: Use elastic load balancing (ELB) for queries up to one hour. When enabling ELB, configure the timer to 3600 seconds. For more information see Elastic Load Balancing in the AWS documentation.
For more information about Amazon cluster instances and their limitations, see the Amazon documentation.
In this section
2.1.1 - Supported AWS instance types
Vertica supports a range of Amazon Web Services instance types, each optimized for different purposes. Choose the instance type that best matches your requirements. The two tables below list the AWS instance types that Vertica supports for Vertica cluster hosts, and for use in MC. For more information, see the Amazon Web Services documentation on instance types and volumes.
Important
If you plan to use an Amazon Machine Image (AMI) on multiple AWS accounts, make sure to subscribe to the image on all your accounts. This allows you to access an image even when it is delisted from the AWS Marketplace.
Instance types for Vertica cluster hosts
Each Amazon EC2 Instance type natively provides one of the following storage options:
Elastic Block Store (EBS) provides durable storage: Data files stored on the instance persist after the instance is stopped.
Instance Store provides temporary storage: Data files stored on the instance are lost when the instance is stopped.
Vertica AMIs can use either the Instance Metadata Service Version 1 (IMDSv1) or the Instance Metadata Service Version 2 (IMDSv2) to authenticate to AWS services, including S3.
For more information about storage configuration in AWS, see Configure storage.
Note
Instance types that support EBS volumes also support volume encryption.
Optimization | Instance Types Using Only EBS Volumes (Durable) | Instance Types Using Instance Store Volumes (Temporary)
General purpose | m4.4xlarge, m4.10xlarge, m5.4xlarge, m5.8xlarge, m5.12xlarge | m5d.4xlarge, m5d.8xlarge, m5d.12xlarge
Compute | c4.4xlarge, c4.8xlarge, c5.4xlarge, c5.9xlarge, c6i.4xlarge, c6i.8xlarge, c6i.12xlarge, c6i.16xlarge, c6i.24xlarge, c6i.32xlarge | c3.4xlarge, c3.8xlarge, c5d.4xlarge, c5d.9xlarge
Memory | r4.4xlarge, r4.8xlarge, r4.16xlarge, r5.4xlarge, r5.8xlarge, r5.12xlarge, r6i.4xlarge, r6i.8xlarge, r6i.12xlarge, r6i.16xlarge, r6i.24xlarge, r6i.32xlarge | r3.4xlarge, r3.8xlarge, r5d.4xlarge, r5d.8xlarge, r5d.12xlarge
Storage | None | d2.4xlarge, d2.8xlarge, i3.4xlarge, i3.8xlarge, i3.16xlarge, i3en.3xlarge, i3en.6xlarge, i3en.12xlarge, i4i.4xlarge, i4i.8xlarge, i4i.16xlarge
Note
By default, the c4.8xlarge, d2.8xlarge, and m4.10xlarge instances have their processor C-states set to a value of 1 in the Vertica AMI. This measure is meant to improve performance by limiting the sleep states that an instance running Vertica uses.
For more information about sleep states, visit the AWS Documentation.
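As a rough illustration, the supported cluster-host instance types listed above can be captured in a small lookup that reports whether a type relies on durable EBS volumes or on temporary instance store volumes. The dictionary layout and the function name storage_kind are hypothetical and are not part of any Vertica tool.

```python
from typing import Optional

# Supported instance types for Vertica cluster hosts, transcribed from the
# list above. The grouping keys are illustrative names, not Vertica terms.
SUPPORTED_CLUSTER_INSTANCES = {
    "ebs_only": {
        "m4.4xlarge", "m4.10xlarge", "m5.4xlarge", "m5.8xlarge", "m5.12xlarge",
        "c4.4xlarge", "c4.8xlarge", "c5.4xlarge", "c5.9xlarge",
        "c6i.4xlarge", "c6i.8xlarge", "c6i.12xlarge", "c6i.16xlarge",
        "c6i.24xlarge", "c6i.32xlarge",
        "r4.4xlarge", "r4.8xlarge", "r4.16xlarge",
        "r5.4xlarge", "r5.8xlarge", "r5.12xlarge",
        "r6i.4xlarge", "r6i.8xlarge", "r6i.12xlarge", "r6i.16xlarge",
        "r6i.24xlarge", "r6i.32xlarge",
    },
    "instance_store": {
        "m5d.4xlarge", "m5d.8xlarge", "m5d.12xlarge",
        "c3.4xlarge", "c3.8xlarge", "c5d.4xlarge", "c5d.9xlarge",
        "r3.4xlarge", "r3.8xlarge", "r5d.4xlarge", "r5d.8xlarge", "r5d.12xlarge",
        "d2.4xlarge", "d2.8xlarge",
        "i3.4xlarge", "i3.8xlarge", "i3.16xlarge",
        "i3en.3xlarge", "i3en.6xlarge", "i3en.12xlarge",
        "i4i.4xlarge", "i4i.8xlarge", "i4i.16xlarge",
    },
}


def storage_kind(instance_type: str) -> Optional[str]:
    """Return 'ebs_only', 'instance_store', or None if the type is not listed."""
    for kind, types in SUPPORTED_CLUSTER_INSTANCES.items():
        if instance_type in types:
            return kind
    return None
```

A check like this could run before provisioning to flag an unlisted instance type, or to warn that data on an instance store type is lost when the instance stops.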
Instance types available for MC hosts
Optimization | Type | Supports EBS Storage (Durable) | Supports Ephemeral Storage (Temporary)
Computing | c4.large | Yes | No
Computing | c4.xlarge | Yes | No
Computing | c5.large | Yes | No
Computing | c5.xlarge | Yes | No
Choosing AWS Eon Mode instance types
When running an Eon Mode database in AWS, choose instance types that support ephemeral instance storage or EBS volumes for your depot, depending on cost and availability. Vertica recommends either r4 or i3 instances for production clusters. It is not mandatory to have an EBS-backed depot, because in Eon Mode, a copy of the data is safely stored in communal storage. However, you must have an EBS-backed catalog for Eon Mode databases.
The following table compares instances with ephemeral instance storage and instances with EBS-only storage to help you choose between them. Check with Amazon Web Services for the latest cost per hour.
Important
If you select instances that use instance store, terminating those instances can cause data loss. For Eon Mode, MC displays an alert to warn the user of the potential data loss when terminating instances that support instance store.
Storage Type | Instance Type | Pros/Cons
Instance storage | i3.8xlarge | Instance storage offers better performance than EBS attached storage through multiple EBS volumes. Instance storage can be striped (RAIDed) together to increase throughput and load balance I/O. Data stored in instance-store volumes is not persistent through instance stops, terminations, or hardware failures.
EBS-only storage | r4.8xlarge with 600 GB EBS volume attached | Newer instance types from AWS have only the EBS option. In most AWS regions, it's easier to provision a large number of instances. You can terminate an instance but keep the EBS volume for a faster revive. Preserving the EBS volume preserves the depot. Some of the cached files might be stale, in which case they are ignored and evicted, but much of the cached data is still current, which saves time when the node revives and warms its depot. You can also take advantage of full-volume encryption.
More information
For more information about Amazon cluster instances and their limitations, see Manage Clusters in the Amazon Web Services documentation.
2.1.2 - AWS authentication
Amazon defines two ways to control access to AWS resources such as S3: IAM roles and the combination of id, secrets, and (optionally) session tokens. For long-term access to non-communal storage buckets, you should use IAM roles, which centralize access control. To change access settings, you alter the IAM role applied to your EC2 instances instead of changing your application's configuration.
Vertica uses both of these authentication methods to support different features and use cases:
An Eon Mode database's access to S3 for communal and catalog storage must always use IAM role authentication. IAM roles are the default access control method for AWS resources. Vertica uses this method if you do not configure the legacy access control session parameters.
Individual users can read data from S3 storage locations other than the ones Vertica uses for communal storage. For example, users can use COPY to load data into Vertica from an S3 bucket or query an external table stored on S3. If the IAM role assigned to the Vertica nodes does not have access to this external S3 data, the user must set an id, secret, and optionally an access token in session variables to authorize access to it. These session variables override the IAM role set on the server. See S3 parameters for a list of these session parameters.
Individual users can export data to S3 using file export. File export cannot use IAM authorization. Users who want to export data to S3 must set id, secret, and optionally access token values in session variables.
Important
If the database is running in Eon Mode, using id and secret authentication is more complex. In addition to having access to the external S3 data, any id that a user sets must be authorized to read from and write to the S3 storage locations that Vertica uses to store communal and catalog data. The queries that the user executes use this id for all storage requests, not just those for accessing external S3 data. If the id does not have access to the catalog and communal storage, the user cannot execute queries.
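The rules above can be summarized as a small decision function. This is only a sketch of the logic described in this section: the enum S3Use, the function auth_method, and the returned labels are hypothetical names chosen for illustration, not Vertica parameters.

```python
from enum import Enum


class S3Use(Enum):
    COMMUNAL_STORAGE = "communal_storage"  # Eon Mode communal and catalog storage
    EXTERNAL_READ = "external_read"        # COPY or external tables on other buckets
    FILE_EXPORT = "file_export"            # exporting data to S3


def auth_method(use: S3Use, role_can_access: bool = False) -> str:
    """Pick the authentication method that this section describes for each use.

    role_can_access only matters for EXTERNAL_READ: it states whether the IAM
    role assigned to the Vertica nodes can reach the external S3 data.
    """
    if use is S3Use.COMMUNAL_STORAGE:
        # Communal and catalog storage must always use IAM role authentication.
        return "iam_role"
    if use is S3Use.FILE_EXPORT:
        # File export cannot use IAM authorization.
        return "session_credentials"
    # External reads fall back to session credentials when the role lacks access.
    return "iam_role" if role_can_access else "session_credentials"
```

In an Eon Mode database, any "session_credentials" result carries the extra requirement noted above: the id must also be able to read from and write to the communal and catalog storage locations.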
Configuring an IAM role
To configure an IAM role that grants Vertica access to AWS resources, you must:
Create an IAM role to allow EC2 instances to access the specific resources.
Grant that role permission to access your resources.
Attach this IAM role to each EC2 instance in the Vertica cluster.
To see an example of IAM roles for a Vertica cluster, look at the roles defined in one of the CloudFormation Templates provided by Vertica. You can download these templates from any of the Vertica entries in the AWS Marketplace. Under each entry's Usage Information section, click the View CloudFormation Template link, then click Download CloudFormation Template.
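For orientation, the first two steps above correspond to two JSON policy documents: a trust policy that lets EC2 instances assume the role, and a permissions policy that grants access to a resource. The sketch below builds both as Python dictionaries. The bucket name and the set of S3 actions are assumptions for illustration only; they are not the permissions that Vertica requires, so consult the Vertica-provided templates for the actual role definitions.

```python
import json


def ec2_trust_policy() -> dict:
    """Trust policy that allows EC2 instances to assume the role (step 1)."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": "ec2.amazonaws.com"},
                "Action": "sts:AssumeRole",
            }
        ],
    }


def s3_access_policy(bucket: str) -> dict:
    """Permissions policy for one S3 bucket (step 2).

    The action list is an illustrative read/write set, not a documented
    Vertica minimum.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": f"arn:aws:s3:::{bucket}",
            },
            {
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:PutObject", "s3:DeleteObject"],
                "Resource": f"arn:aws:s3:::{bucket}/*",
            },
        ],
    }


if __name__ == "__main__":
    # "example-vertica-bucket" is a placeholder bucket name.
    print(json.dumps(ec2_trust_policy(), indent=2))
    print(json.dumps(s3_access_policy("example-vertica-bucket"), indent=2))
```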
2.1.3 - Deploy Vertica using CloudFormation templates
Vertica provides CloudFormation Templates (CFTs) on the AWS Marketplace that allow you to get a cluster up and running quickly. Using the template allows you to automatically provision your AWS resources and launch a Vertica cluster and Management Console, with minimal configuration required.
For details about creating an Eon Mode or Enterprise Mode database after you create a cluster with CFTs, see Amazon Web Services in MC.
2.1.3.1 - CloudFormation template (CFT) overview
With Vertica on AWS, use CloudFormation Templates (CFTs) to easily manage provisioning the AWS resources with a running Vertica system. After you provide a few parameters to the template, you can create a stack to automatically provision the AWS resources for your Vertica system.
Bring Your Own License (BYOL): By default, a free Community Edition (CE) license is installed, which allows up to 3 nodes and 1 TB of data. To add nodes or increase data size, you can purchase a Vertica BYOL license. Outside of the BYOL license on CFTs, you can also access the Community Edition without a license file:
If you are using Management Console, simply leave the license field blank.
By the Hour: A pay-as-you-go model in which you pay only for the number of hours you use each node. One advantage of using the Paid Listing is that all charges appear on your AWS bill. This model offers an alternative to purchasing a full Vertica license and eliminates the need to compute potential storage needs in advance.
CFT prerequisites
Before you can deploy Vertica on AWS using CloudFormation Templates (CFTs), verify that you have:
AWS account with permissions to create a VPC, subnet, security group, EC2 instances, and IAM roles (For more information about AWS accounts, see the AWS documentation)
Management Console with 3 Vertica nodes: The easiest way to deploy Vertica. This CFT deploys an Eon Mode database by default. However, this environment can also be used to create an Enterprise Mode database. For more information, see Creating a database.
Deploy Management Console into new VPC: This CFT deploys all required AWS resources and installs the Vertica Management Console (MC). After stack creation completes, log in to the MC to provision a Vertica database cluster.
Deploy Management Console into existing VPC: This CFT deploys the Vertica Management Console (MC) in an already-existing VPC and subnet. After stack creation completes, the MC is available. Log in to MC to provision either a Vertica database cluster or an Eon Mode database cluster.
For this CFT, you must first set up the VPC, subnet, and related network resources. For more information about the correct configuration of these resources for Vertica, see the following topics in the AWS documentation:
A Vertica cluster on AWS must be logically located in the same network. This is similar to placing the nodes of an on-premises cluster within the same network. Create a virtual private cloud (VPC) to ensure the nodes in your cluster will be able to communicate with each other within AWS.
Create a single public subnet VPC with the following configurations:
Assign a Network Access Control List (ACL) that is appropriate to your situation. The default ACL does not provide a high level of security.
Enable DNS resolution and enable DNS hostname support for instances launched in this VPC.
A Vertica cluster must be operated within a single availability zone.
For more information about VPCs, including how to create one, see the AWS documentation.
2.1.3.3 - Deploy MC and AWS resources with a CloudFormation template
You can deploy Management Console (MC) and its associated AWS resources using CloudFormation templates (CFTs) that are available through the AWS Marketplace. For a list of available CFTs, see CloudFormation template (CFT) overview.
Complete the following to deploy the Vertica MC and related resources in AWS:
Log in to the AWS Marketplace with an AWS account (see the Prerequisites section above).
Search for "Vertica" in the AWS Marketplace.
Select a Vertica CFT. Each CFT leads you to a product overview page, with pricing estimates. (Also see CloudFormation template (CFT) overview for an overview of available templates and products).
Click Continue to Subscribe.
On the next page, select your launch settings based on your requirements for deployment.
If you have not agreed to Vertica EULA terms on the AWS Marketplace before, click Accept Software Terms to subscribe.
Click Launch with CloudFormation Console. The CloudFormation Console opens.
The CloudFormation Console automatically supplies the URL in the Specify an Amazon S3 template URL field. Click Next.
Follow the CloudFormation workflow and enter the parameters (collectively called a stack).
Note
Important: Take note of the username and password you set for Management Console during this step. You cannot recover or reset these credentials after you create the stack.
After confirming the details you have provided for your new stack, click Create. The AWS console brings you to the Stacks page, where you can view the progress of the creation process. The process takes several minutes.
The Outputs tab displays information about accessing your environment after the process completes.
Next, access the Management Console (MC) to deploy your cluster instances and create a database, as described in Access Management Console.
2.1.3.4 - Access Management Console
Complete the following steps to access Management Console on your deployed AWS resources:
On the AWS CloudFormation Stacks page, select your new stack and view the Outputs tab. This tab provides information about accessing your environment, as well as documentation and licensing resources.
In the ManagementConsole row, select the URL in the Value column to open the MC login page.
To log in, enter the MC username and password that you created using the CloudFormation Console.
After login, MC displays the home page, with options to provision a new cluster or database or import existing ones. If you chose a CFT that also creates a database, your new database is also displayed on the home page.
This page also provides a Resources section with links to online training, blogs, community, and help resources.
You have successfully launched and connected to Management Console on AWS resources.
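If you prefer to read the stack outputs programmatically instead of through the Outputs tab, the JSON that the AWS CLI command aws cloudformation describe-stacks prints can be parsed as sketched below. The sketch assumes that the ManagementConsole row of the Outputs tab corresponds to an output key with the same name; the function name and the sample URL are placeholders.

```python
import json
from typing import Optional


def management_console_url(describe_stacks_json: str,
                           key: str = "ManagementConsole") -> Optional[str]:
    """Return the value of the given stack output, or None if it is absent.

    describe_stacks_json is the JSON text printed by
    `aws cloudformation describe-stacks --stack-name <name>`.
    """
    data = json.loads(describe_stacks_json)
    for stack in data.get("Stacks", []):
        for output in stack.get("Outputs", []):
            if output.get("OutputKey") == key:
                return output.get("OutputValue")
    return None


# Hand-written sample in the shape of describe-stacks output. The stack name
# and URL are placeholders, not values produced by a real deployment.
SAMPLE = json.dumps({
    "Stacks": [
        {
            "StackName": "example-vertica-stack",
            "Outputs": [
                {"OutputKey": "ManagementConsole",
                 "OutputValue": "https://mc.example.com:5450"},
            ],
        }
    ]
})

if __name__ == "__main__":
    print(management_console_url(SAMPLE))
```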
If you have not yet provisioned a Vertica cluster and database, complete the steps in one of the following:
Vertica provides tested and pre-configured Amazon Machine Images (AMIs) to deploy cluster hosts or MC hosts on AWS. When you create an EC2 instance on AWS using a Vertica AMI, the instance includes the Vertica software and the recommended configuration. The Vertica AMI acts as a template, requiring fewer configuration steps.
This section will guide you through configuring your network settings on AWS, launching and preparing EC2 instances using the Vertica AMI, and creating a Vertica cluster on those EC2 instances.
Choose this method of installation if you are familiar with configuring AWS and have many specific AWS configuration needs. To automatically deploy AWS resources and a Vertica cluster instead, see Deploy Vertica using CloudFormation templates.
2.1.4.1 - Configure your network
Before you deploy your cluster, you must configure the network on which Vertica will run. Vertica requires a number of specific network configurations to operate on AWS. You may also have specific network configuration needs beyond the default Vertica settings.
Important
You can create a Vertica database that uses IPv6 for internal communications running on AWS. However, if you do so, you must identify the hosts in your cluster using IP addresses rather than host names. The AWS DNS resolution service is incompatible with IPv6.
The following sections explain which Amazon EC2 features you need to configure for instance creation.
2.1.4.1.1 - Create a placement group, key pair, and VPC
Part of configuring your network for AWS is to create the following:
Create a placement group
A placement group is a logical grouping of instances in a single Availability Zone. Placement groups are required for clusters, and all Vertica nodes must be in the same placement group.
Vertica recommends placement groups for applications that benefit from low network latency, high network throughput, or both. To provide the lowest latency, and the highest packet-per-second network performance for your Placement Group, choose an instance type that supports enhanced networking.
For information on creating placement groups, see Placement Groups in the AWS documentation.
Create a key pair
You need a key pair to access your instances using SSH. Create the key pair using the AWS interface and store a copy of your key (*.pem) file on your local machine. When you access an instance, you need to know the local path of your key.
Use a key pair to:
Authenticate your connection as dbadmin to your instances from outside your cluster.
Install and configure Vertica on your AWS instances.
For information on creating a key pair, see Amazon EC2 Key Pairs in the AWS documentation.
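As an illustration of how the key file is used, the sketch below builds the SSH command for connecting as dbadmin and restricts the key file to owner read-only, which SSH clients generally require for private keys. The key path and host address are placeholders, and the helper names are hypothetical.

```python
import os
import shlex
import stat


def restrict_key_permissions(key_path: str) -> None:
    """Make the private key readable by its owner only (mode 0400)."""
    os.chmod(key_path, stat.S_IRUSR)


def ssh_command(key_path: str, host: str, user: str = "dbadmin") -> str:
    """Build the shell command that connects to an instance with the key."""
    return shlex.join(["ssh", "-i", key_path, f"{user}@{host}"])


if __name__ == "__main__":
    # Placeholder key path and documentation-range IP address.
    print(ssh_command("/home/me/vertica-key.pem", "203.0.113.10"))
```

Building the command with shlex.join keeps paths that contain spaces or special characters correctly quoted.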
Create a virtual private cloud (VPC)
You create a Virtual Private Cloud (VPC) on Amazon so that you can create a network of your EC2 instances. Your instances in the VPC all share the same network and security settings.
A Vertica cluster on AWS must be logically located in the same network. Create a VPC to ensure the nodes in your cluster can communicate with each other in AWS.
Create a single public subnet VPC with the following configurations:
Assign a Network Access Control List (ACL) that is appropriate to your situation.
Enable DNS resolution and enable DNS hostname support for instances launched in this VPC.
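One way to sanity-check the "same network" requirement is to confirm that every node's private IP address falls inside the subnet's CIDR range. The sketch below uses Python's standard ipaddress module; the function name and the sample addresses are illustrative assumptions.

```python
import ipaddress
from typing import Iterable, List


def hosts_outside_subnet(subnet_cidr: str, host_ips: Iterable[str]) -> List[str]:
    """Return the host IPs that do not belong to the given subnet."""
    network = ipaddress.ip_network(subnet_cidr)
    return [ip for ip in host_ips if ipaddress.ip_address(ip) not in network]


if __name__ == "__main__":
    # Sample private addresses; replace with the addresses of your nodes.
    nodes = ["10.0.0.11", "10.0.0.12", "10.0.1.5"]
    print(hosts_outside_subnet("10.0.0.0/24", nodes))
```

An empty result means that all listed nodes sit in the subnet. Any address that is returned points to an instance launched in a different subnet or VPC.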
Vertica requires the following basic network access control list (ACL) settings on an AWS instance running the Vertica AMI. Vertica recommends that you secure your network with additional ACL settings that are appropriate to your situation.
Inbound Rules
Type | Protocol | Port Range | Use | Source | Allow/Deny
SSH | TCP (6) | 22 | SSH (Optional, for access to your cluster from outside your VPC) | User Specific | Allow
Custom TCP Rule | TCP (6) | 5450 | MC (Optional, for MC running outside of your VPC) | User Specific | Allow
Custom TCP Rule | TCP (6) | 5433 | SQL Clients (Optional, for access to your cluster from SQL clients) | User Specific | Allow
Custom TCP Rule | TCP (6) | 50000 | Rsync (Optional, for backup outside of your VPC) | User Specific | Allow
Custom TCP Rule | TCP (6) | 1024-65535 | Ephemeral Ports (Needed if you use any of the above) | User Specific | Allow
ALL Traffic | ALL | ALL | N/A | 0.0.0.0/0 | Deny
Outbound Rules
Type | Protocol | Port Range | Use | Source | Allow/Deny
Custom TCP Rule | TCP (6) | 0-65535 | Ephemeral Ports | 0.0.0.0/0 | Allow
You can use the entire port range specified in the previous table, or find your specific ephemeral ports by entering the following command:
$ cat /proc/sys/net/ipv4/ip_local_port_range
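The file shown in the command above contains two whitespace-separated integers, the low and high ends of the ephemeral port range. A minimal Python sketch for reading it follows; the function names are hypothetical, and the sample value in the usage note is only an example of what a Linux host might report.

```python
from typing import Tuple

PORT_RANGE_FILE = "/proc/sys/net/ipv4/ip_local_port_range"


def parse_port_range(text: str) -> Tuple[int, int]:
    """Parse the 'low high' pair found in ip_local_port_range."""
    low, high = (int(part) for part in text.split())
    return low, high


def read_local_port_range(path: str = PORT_RANGE_FILE) -> Tuple[int, int]:
    """Read and parse the ephemeral port range on a Linux host."""
    with open(path) as handle:
        return parse_port_range(handle.read())
```

For example, parse_port_range("32768\t60999\n") returns (32768, 60999), which you could then use as the port range of a narrower ACL rule.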
More information
For detailed information on network ACLs within AWS, refer to Network ACLs in the Amazon documentation.
For detailed information on ephemeral ports within AWS, refer to Ephemeral Ports in the Amazon documentation.
2.1.4.1.3 - Configure TCP keepalive with AWS network load balancer
AWS supports three types of elastic load balancers (ELBs): Classic Load Balancers, Application Load Balancers, and Network Load Balancers.
Vertica strongly recommends the AWS Network Load Balancer (NLB), which provides the best performance with your Vertica database. The Network Load Balancer acts as a proxy between clients (such as JDBC) and Vertica servers. The Classic and Application Load Balancers do not work with Vertica, in Enterprise Mode or Eon Mode.
To avoid timeouts and hangs when connecting to Vertica through the NLB, it is important to understand how AWS NLB handles idle timeouts for connections. For the NLB, AWS sets the idle timeout value to 350 seconds and you cannot change this value. The timeout applies to both connection points.
For a long-running query, if either the client or the server fails to send a timely keepalive, that side of the connection is terminated. This can lead to situations where a JDBC client hangs waiting for results that would never be returned because the server fails to send a keepalive within 350 seconds.
To identify an idle timeout/keepalive issue, run a query like this via a client such as JDBC:
=> SELECT SLEEP(355);
If there’s a problem, one of the following situations occurs:
The client connection terminates before 355 seconds. In this case, lower the JDBC keepalive setting so that keepalives are sent less than 350 seconds apart.
The client connection doesn’t return a result after 355 seconds. In this case, you need to adjust the server keepalive settings (tcp_keepalive_time and tcp_keepalive_intvl) so that keepalives are sent less than 350 seconds apart.
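To check the server side before running the test query, you can compare the kernel keepalive settings with the 350-second NLB idle timeout. The sketch below reads tcp_keepalive_time and tcp_keepalive_intvl from /proc on a Linux host and applies the rule stated above, that keepalives must be sent less than 350 seconds apart. The function names are hypothetical.

```python
import os
from typing import Tuple

NLB_IDLE_TIMEOUT_SECONDS = 350  # fixed NLB idle timeout described above


def keepalive_within_nlb_timeout(keepalive_time: int,
                                 keepalive_intvl: int,
                                 idle_timeout: int = NLB_IDLE_TIMEOUT_SECONDS) -> bool:
    """True if both the first keepalive and the repeats arrive before the timeout."""
    return keepalive_time < idle_timeout and keepalive_intvl < idle_timeout


def read_server_keepalive(proc_dir: str = "/proc/sys/net/ipv4") -> Tuple[int, int]:
    """Read (tcp_keepalive_time, tcp_keepalive_intvl) in seconds on Linux."""
    values = []
    for name in ("tcp_keepalive_time", "tcp_keepalive_intvl"):
        with open(os.path.join(proc_dir, name)) as handle:
            values.append(int(handle.read().strip()))
    return values[0], values[1]
```

With the usual Linux defaults of 7200 and 75 seconds, keepalive_within_nlb_timeout(7200, 75) returns False, because the first keepalive would be sent long after the NLB has dropped the idle connection.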
For detailed information about AWS Network Load Balancers, see the AWS documentation.
2.1.4.1.4 - Create and assign an internet gateway
When you create a VPC, an Internet gateway is automatically assigned to it. You can use that gateway, or you can assign your own. If you are using the default Internet gateway, continue with the procedure described in Create a security group.
Otherwise, create an Internet gateway specific to your needs. Associate that internet gateway with your VPC and subnet.
For information about how to create an Internet Gateway, see Internet Gateways in the AWS documentation.
2.1.4.1.5 - Assign an elastic IP address
An elastic IP address is an unchanging IP address that you can use to connect to your cluster externally. Vertica recommends you assign a single elastic IP to a node in your cluster. You can then connect to other nodes in your cluster from your primary node using their internal IP addresses dictated by your VPC settings.
Create an elastic IP address. For information, see Elastic IP Addresses in the AWS documentation.
2.1.4.1.6 - Create a security group
The Vertica AMI has specific security group requirements. When you create a Virtual Private Cloud (VPC), AWS automatically creates a default security group and assigns it to the VPC. You can use the default security group, or you can name and assign your own.
Create and name your own security group using the following basic security group settings. You may make additional modifications based on your specific needs.
Inbound
Type | Use | Protocol | Port Range | IP
SSH | | TCP | 22 | The CIDR address range of administrative systems that require SSH access to the Vertica nodes. Make this range as restrictive as possible. You can add multiple rules for separate network ranges, if necessary.
DNS (UDP) | | UDP | 53 | Your private subnet address range (for example, 10.0.0.0/24).
Custom UDP | Spread | UDP | 4803 and 4804 | Your private subnet address range (for example, 10.0.0.0/24).
Custom TCP | Spread | TCP | 4803 | Your private subnet address range (for example, 10.0.0.0/24).
Custom TCP | VSQL/SQL | TCP | 5433 | The CIDR address range of client systems that require access to the Vertica nodes. This range should be as restrictive as possible. You can add multiple rules for separate network ranges, if necessary.
Custom TCP | Inter-node Communication | TCP | 5434 | Your private subnet address range (for example, 10.0.0.0/24).
Custom TCP | | TCP | 5444 | Your private subnet address range (for example, 10.0.0.0/24).
Custom TCP | MC | TCP | 5450 | The CIDR address of client systems that require access to the management console. This range should be as restrictive as possible. You can add multiple rules for separate network ranges, if necessary.
Custom TCP | Rsync | TCP | 50000 | Your private subnet address range (for example, 10.0.0.0/24).
ICMP | Installer | Echo Reply | N/A | Your private subnet address range (for example, 10.0.0.0/24).
ICMP | Installer | Traceroute | N/A | Your private subnet address range (for example, 10.0.0.0/24).
Note
In Management Console (MC), the Java IANA discovery process uses port 7 once to detect if an IP address is reachable before the database import operation. Vertica tries port 7 first. If port 7 is blocked, Vertica switches to port 22.
2.1.4.2 - Deploy AWS instances for your Vertica database cluster
After you Configure your network, you can create AWS instances and deploy Vertica. Follow these procedures to deploy and run Vertica on AWS.
2.1.4.2.1 - Configure and launch an instance
After you configure your network settings on AWS, configure and launch the instances where you will install Vertica. An Elastic Compute Cloud (EC2) instance without a Vertica AMI is similar to a traditional host. Just like with an on-premises cluster, you must prepare and configure your cluster and network at the hardware level before you can install Vertica.
When you create an EC2 instance on AWS using a Vertica AMI, the instance includes the Vertica software and the recommended configuration. Vertica recommends that you use the Vertica AMI unmodified. The Vertica AMI acts as a template, requiring fewer configuration steps:
Consider the following issues when you add storage to your instances:
Add a number of drives equal to the number of physical cores in your instance—for example, for a c3.8xlarge instance, 16 drives; for an r3.4xlarge, 8 drives.
Do not store your information on the root volume.
Amazon EBS provides durable, block-level storage volumes that you can attach to running instances. For guidance on selecting and configuring an Amazon EBS volume type, see Amazon EBS Volume Types.
Configure EBS volumes as a RAID array
You can configure your EBS volumes into a RAID 0 array to improve disk performance. Before doing so, use the vioperf utility to determine whether the performance of the EBS volumes is fast enough without using them in a RAID array. Pass vioperf the path to a mount point for an EBS volume. In this example, an EBS volume is mounted on a directory named /vertica/data:
If the EBS volume read and write performance (the entries with Read and Write in column 1 of the output) is greater than 20MB/s per physical processor core (columns 6 and 7), you do not need to configure the EBS volumes as a RAID array to meet the minimum requirements to run Vertica. You may still consider configuring your EBS volumes as a RAID array if the performance is less than the optimal 40MB/s per physical core (as is the case in this example).
Note
If your EC2 instance has hyper-threading enabled, vioperf may incorrectly count the number of cores in your system. The 20MB/s throughput per core requirement only applies to physical cores, rather than virtual cores. If your EC2 instance has hyper-threading enabled, divide the counter value (column 4 in the output) by the number of physical cores. See the CPU Cores and Threads Per CPU Core Per Instance Type section in the AWS documentation topic Optimizing CPU Options for a list of physical cores in each instance type.
If you determine you need to configure your EBS volumes as a RAID 0 array, see the AWS documentation topic RAID Configuration on Linux for the steps you need to take.
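The per-core comparison described above can be sketched as a small shell helper. This is only an illustration under stated assumptions: the function name classify_io is hypothetical, the first argument is the total throughput in MB/s that you read from the vioperf counter value, the second is the number of physical cores, and both are whole numbers. The 20MB/s and 40MB/s thresholds are the ones given in the text.

```shell
#!/bin/sh
# classify_io TOTAL_MBPS PHYSICAL_CORES  (hypothetical helper)
# Compares total throughput against the per-core thresholds from the text:
# more than 20MB/s per physical core meets the minimum, 40MB/s is optimal.
# Multiplying the threshold by the core count avoids integer-division rounding.
classify_io() {
    total=$1
    cores=$2
    if [ "$total" -ge $((40 * cores)) ]; then
        echo "optimal"
    elif [ "$total" -gt $((20 * cores)) ]; then
        echo "meets-minimum"
    else
        echo "below-minimum"
    fi
}

# Example: 200 MB/s total on an instance with 8 physical cores (25 MB/s per core).
classify_io 200 8
```

A result of below-minimum suggests configuring the volumes as a RAID 0 array; meets-minimum leaves that as an optional improvement.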
Security group and access
Choose between your previously configured security group or the default security group.
Configure S3 access for your nodes by creating and assigning an IAM role to your EC2 instance. See AWS authentication for more information.
2.1.4.2.2 - Connect to an instance
Using your private key, take these steps to connect to your cluster through the instance to which you attached an elastic IP address:
As the dbadmin user, type the following command, substituting your ssh key:
Select the instance that is attached to the Elastic IP.
Click Connect.
On Connect to Your Instance, choose one of the following options:
A Java SSH Client directly from my browser: Add the path to your private key in the field Private key path, and click Launch SSH Client.
Connect with a standalone SSH client: Follow the steps required by your standalone SSH client.
Connect to an instance from Windows using PuTTY
If you connect to the instance from the Windows operating system and plan to use PuTTY:
Convert your key file using PuTTYgen.
Connect with PuTTY or WinSCP (connect via the elastic IP), using your converted key (that is, the *.ppk file).
Move your key file (the *.pem file) to the root directory using PuTTY or WinSCP.
2.1.4.2.3 - Prepare instances for cluster formation
After you create your instances, you need to prepare them for cluster formation. Prepare your instances by adding your AWS .pem key and your Vertica license.
By default, each AMI includes a Community Edition license. Once Vertica is installed, you can find the license at this location:
As the dbadmin user, copy your *pem file (from where you saved it locally) onto your primary instance.
Depending upon the procedure you use to copy the file, the permissions on the file may change. If permissions change, the install_vertica script fails with a message similar to the following:
FATAL (19): Failed Login Validation 10.0.3.158, cannot resolve or connect to host as root.
If you receive a failure message, enter the following command to correct permissions on your *pem file:
$ chmod 600 /<name-of-pem>.pem
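As a hedged sketch, the permission check and fix can be combined so that the key file is only modified when needed. The function name ensure_private_key_mode is hypothetical, and the check assumes the standard ls -l permission string format.

```shell
#!/bin/sh
# ensure_private_key_mode KEY_FILE  (hypothetical helper)
# Prints "unchanged" if the key is already readable and writable only by
# its owner (mode 600); otherwise applies chmod 600 and prints "fixed".
ensure_private_key_mode() {
    key=$1
    # The first 10 characters of `ls -ld` output are the type and permission bits.
    mode=$(ls -ld "$key" | cut -c1-10)
    if [ "$mode" != "-rw-------" ]; then
        chmod 600 "$key"
        echo "fixed"
    else
        echo "unchanged"
    fi
}

# Usage (the path is a placeholder for your own key file):
#   ensure_private_key_mode /path/to/your-key.pem
```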
Copy your Vertica license over to your primary instance, placing it in your home directory or other known location.
2.1.4.2.4 - Change instances on AWS
You can change instance types on AWS. For example, you can downgrade a c3.8xlarge instance to c3.4xlarge. See Supported AWS instance types for a list of valid AWS instances.
When you change AWS instances you may need to:
Reconfigure memory settings
Reset memory size in a resource pool
Reset number of CPUs in a resource pool
Reconfigure memory settings
If you change to an AWS instance type that requires a different amount of memory, you may need to recompute the following and then reset the values:
You may need root user permissions to reset these values.
Reset memory size in a resource pool
If you used absolute memory in a resource pool, you may need to reconfigure the memory using the MEMORYSIZE parameter in ALTER RESOURCE POOL.
Note
If you set memory size as a percentage when you created the original resource pool, you do not need to change it here.
Reset number of CPUs in a resource pool
If your new instance requires a different number of CPUs, you may need to reset the CPUAFFINITYSET parameter in ALTER RESOURCE POOL.
2.1.4.2.5 - Configure storage
Vertica recommends that you store information — especially your data and catalog directories — on dedicated Amazon EBS volumes formatted with a supported file system. The /opt/vertica/sbin/configure_software_raid.sh script automates the storage configuration process.
Caution
Do not store information on the root volume because it might result in data loss.
Vertica performance tests Eon Mode with a per-node EBS volume of up to 2TB. For best performance, combine multiple EBS volumes into a RAID 0 array.
Because the storage configuration script requires the volume names that you want to configure, you must identify the volumes on your machine. The following command lists the contents of the /dev directory. Search for the volumes that begin with xvd:
$ ls /dev
Important
Ignore the root volume. Do not include any of your root volumes in the RAID creation process.
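The volume search can be narrowed with a short filter. The following is a sketch only: the function name list_xvd_volumes is hypothetical, and it assumes you already know the device name of your root volume (xvda in the example call, which may differ on your instance).

```shell
#!/bin/sh
# list_xvd_volumes DEV_DIR ROOT_NAME  (hypothetical helper)
# Lists entries in DEV_DIR whose names begin with "xvd", skipping the root
# volume and its partitions (any name that starts with ROOT_NAME).
list_xvd_volumes() {
    dev_dir=$1
    root_name=$2
    for path in "$dev_dir"/xvd*; do
        # If nothing matches, the glob stays literal and fails this test.
        [ -e "$path" ] || continue
        name=$(basename "$path")
        case "$name" in
            "$root_name"*) continue ;;
        esac
        echo "$path"
    done
}

# Example call; assumes the root volume is /dev/xvda.
list_xvd_volumes /dev xvda
```

Review the output before passing any of the listed names to the storage configuration script.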
Combining volumes for storage
The configure_software_raid.sh shell script combines your EBS volumes into a RAID 0 array.
Caution
Run configure_software_raid.sh in the default setting only if you have a fresh configuration with no existing RAID settings.
If you have existing RAID settings, open the script in a text editor and manually edit the raid_dev value to reflect your current RAID settings. If you have existing RAID settings and you do not edit the script, the script deletes important operating system device files.
Alternatively, use Management Console (MC) to add storage nodes without unwanted changes to operating system device files. For more information, see Managing database clusters.
The following steps combine your EBS volumes into RAID 0 with the configure_software_raid.sh script:
1. Edit the /opt/vertica/sbin/configure_software_raid.sh shell file as follows:
   a. Comment out the safety exit command at the beginning.
   b. Change the sample volume names to your own volume names, which you noted previously. Add more volumes, if necessary.
2. Run the /opt/vertica/sbin/configure_software_raid.sh shell file. Running this file creates a RAID 0 volume and mounts it to /vertica/data.
3. Change the owner of the newly created volume to dbadmin with chown.
4. Repeat steps 1-3 for each node on your cluster.
2.1.4.2.6 - Create a cluster
On AWS, use the install_vertica script to combine instances and create a cluster. Check your My Instances page on AWS for a list of current instances and their associated IP addresses. You need these IP addresses when you run install_vertica.
Create a cluster as follows:
While connected to your primary instance, enter the following command to combine your instances into a cluster. Substitute the IP addresses for your instances and include your root *.pem file name.
If you are using Vertica Community Edition, which limits you to three instances, you can specify -L CE with no license file.
When you run the install_vertica or update_vertica script on a Vertica AMI, --point-to-point is the default. This parameter configures Spread to use direct point-to-point communication between all Vertica nodes, which is a requirement for clusters on AWS.
If you are using IPv6 network addresses to identify the hosts in your cluster, use the --ipv6 flag in your install_vertica command. You must also use IP addresses instead of host names, as the AWS DNS server cannot resolve host names to IPv6 addresses.
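The points above can be pulled together in a dry-run sketch that only prints a candidate command and does not run the installer. Several parts are assumptions: the helper name build_install_cmd is hypothetical, the IP addresses and key path are placeholders, and the --hosts and --ssh-identity options are install_vertica options that this section does not show, so confirm them against the installer's help output for your version.

```shell
#!/bin/sh
# build_install_cmd HOSTS KEY_FILE LICENSE  (hypothetical helper)
#   HOSTS    comma-separated private IP addresses of the instances
#   KEY_FILE path to the root *.pem key file
#   LICENSE  path to a license file, or CE for Community Edition
# Prints the command instead of running it, so you can review it first.
build_install_cmd() {
    hosts=$1
    key=$2
    license=$3
    echo "sudo /opt/vertica/sbin/install_vertica --hosts $hosts --point-to-point --ssh-identity $key -L $license"
}

# Example with placeholder values for a three-instance Community Edition cluster.
build_install_cmd "10.0.11.60,10.0.11.61,10.0.11.62" "/home/dbadmin/example-key.pem" "CE"
```

On a Vertica AMI, --point-to-point is already the default, so spelling it out here is redundant but harmless.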
After combining your instances, Vertica recommends deleting your *.pem key from your cluster to reduce security risks. The example below uses the shred command to delete the file:
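A minimal sketch of the key removal, assuming the GNU coreutils shred utility is available on the node. The function name remove_key is hypothetical, and the path in the usage comment is a placeholder.

```shell
#!/bin/sh
# remove_key KEY_FILE  (hypothetical helper)
# Overwrites the key file and then removes it. The -u option tells shred
# to delete the file after overwriting its contents.
remove_key() {
    key=$1
    if [ ! -f "$key" ]; then
        echo "no such file: $key" >&2
        return 1
    fi
    shred -u "$key"
}

# Usage (the path is a placeholder for your own key file):
#   remove_key /path/to/your-key.pem
```

Keep a copy of the key in a secure location outside the cluster before removing it, because the overwritten file cannot be recovered.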
Stopping or rebooting an instance or cluster without first shutting down the database may result in disk or database corruption. To safely shut down and restart your cluster, see Operating the database.
Check open ports manually using the netcat utility
Once your cluster is up and running, you can check ports manually through the command line using the netcat (nc) utility. What follows is an example using the utility to check ports.
Before performing the procedure, choose the private IP addresses of two nodes in your cluster.
The examples given below use nodes with the private IPs:
10.0.11.60 10.0.11.61
Install the nc utility on your nodes. Once installed, you can issue commands to check the ports on one node from another node.
To check a TCP port:
Put one node in listen mode and specify the port. The following sample shows how to put IP 10.0.11.60 into listen mode for port 4804:
[root@ip-10-0-11-60 ~]# nc -l 4804
From the other node, run nc specifying the IP address of the node you just put in listen mode, and the same port number.
[root@ip-10-0-11-61 ~]# nc 10.0.11.60 4804
Enter sample text from either node and it should show up on the other node. To cancel after you have checked a port, enter Ctrl+C.
Note
To check a UDP port, use the same nc commands with the -u option.
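To repeat the procedure above for several ports, a small helper can print the pair of nc commands for each port. This sketch only prints the commands and opens no connections. The function name print_tcp_checks is hypothetical, and the example ports are TCP ports taken from the security group settings earlier in this document.

```shell
#!/bin/sh
# print_tcp_checks LISTENER_IP PORT...  (hypothetical helper)
# For each port, prints the command to run on the listening node and the
# command to run on the other node, mirroring the manual procedure above.
print_tcp_checks() {
    listener_ip=$1
    shift
    for port in "$@"; do
        echo "listener $listener_ip: nc -l $port"
        echo "peer: nc $listener_ip $port"
    done
}

# Example: commands for checking three TCP ports on node 10.0.11.60.
print_tcp_checks 10.0.11.60 4803 5433 5434
```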
Management Console (MC) is a database management tool that allows you to view and manage aspects of your cluster. Vertica provides an MC AMI, which you can use with AWS. The MC AMI allows you to create an instance, dedicated to running MC, that you can attach to a new or existing Vertica cluster on AWS. You can create and attach an MC instance to your Vertica on AWS cluster at any time.
After you launch your MC instance and configure your security group settings, you can log in to your database. To do so, use the elastic IP you specified during instance creation.
From this elastic IP, you can manage your Vertica database on AWS using standard MC procedures.
Considerations when using MC on AWS
Because MC is already installed on the MC AMI, the MC installation process does not apply.
To uninstall MC on AWS, follow the procedures provided in Uninstalling Management Console before terminating the MC Instance.
Vertica supports automatic deployment on Azure through the Microsoft Azure Marketplace, or manual installation and deployment on Azure VMs.
You can deploy a Vertica database on the Microsoft Azure Cloud running in either Enterprise Mode or Eon Mode. In Eon Mode, Vertica stores its data communally using Azure block blob storage.
This section explains how to deploy a Vertica database to Microsoft Azure.
2.2.1 - Recommended Azure VM types and operating systems
Recommended Azure VM types
Vertica supports a range of Microsoft Azure virtual machine (VM) types, each optimized for different purposes. Choose the VM type that best matches your performance and price needs as a user.
Note
The GS VMs are not available in all regions, or from the Azure Marketplace.
Before you can create an Eon Mode database on Azure, you must have a database cluster and an Azure blob storage container to store your database's data.
You can create an Eon Mode database on a cluster that is hosted on Azure. In this configuration, your database stores its data communally in Azure Blob storage. See Eon Mode to learn more about this database mode.
Before you can create an Eon Mode database on Azure, you must provision a cluster to host it. See Configuring your Vertica cluster for Eon Mode for suggestions on choosing VM configurations and the number of nodes your cluster should start with.
Storage requirements
An Eon Mode database on Azure stores its data communally in Azure blob storage. Vertica only supports block blob storage for communal data storage, not append or page blob storage.
You must create a storage path for Vertica to use exclusively. This path can be a blob container or a folder within a blob container. This path must not contain any files. If you attempt to create an Eon Mode database with a container or folder that contains files, admintools returns an error.
You pass Vertica a URI for the storage path using the azb:// schema. See Azure Blob Storage object store for the format of this URI.
You must also configure the storage container so Vertica is authorized to access it. Depending on the authentication method you use, you may need to supply Vertica with the credentials to access the container. Vertica can use one of the following methods to authenticate with the blob storage container:
Using Azure managed identities. This authentication method is transparent—you do not need to add any authentication configuration information to Vertica. Vertica automatically uses the managed identity bound to the VMs it runs on to authenticate with the blob storage container. See the Azure AD-managed identities for Azure resources documentation page in the Azure documentation for more information.
If you provide credentials for either of the other two supported authentication methods, Vertica uses them instead of authenticating using a managed identity bound to your VM.
Note
If your Azure VMs have more than one managed identity bound to them, you must tell Vertica which identity to use when authenticating with the blob storage container. Vertica gets the identity to use from a tag set on the VMs that it is running on.
On your VMs, create a tag with its key named VerticaManagedIdentityClientId and its value set to the name of a managed identity bound to your VMs. See the Use tags to organize your Azure resources and management hierarchy page in the Azure documentation for more information.
Using an account name and access key credentials for a service account that has full access to the blob storage container. In this case, you provide Vertica with the credentials when you create the Eon Mode database. See Creating an Authentication File for details.
Eon Mode databases on Azure support some of the encryption features built into Azure Storage. You can use its encryption at rest feature transparently—you do not need to configure Vertica to take advantage of it. You can use Microsoft-managed or customer-managed keys for storage encryption. Vertica does not support Azure Storage's client-side encryption and encryption using customer-provided keys. See the Azure Data Encryption at rest page in the Azure documentation for more information about the encryption at rest features in Azure Storage.
2.2.3 - Deploy Vertica from the Azure Marketplace
Deploy Vertica in the Microsoft Azure Cloud using the Vertica Data Warehouse entry in the Azure Marketplace. Vertica provides the following deployment options:
Eon Mode: Deploy a Management Console (MC) instance, and then provision and create an Eon Mode database from the MC. For cluster and storage requirements, see Eon Mode on Azure prerequisites.
Enterprise Mode: Deploy a four-node Enterprise Mode database comprised of one MC instance and three database nodes. This requires an Azure subscription with a minimum of 12 cores for the Vertica Marketplace solution.
The Enterprise Mode deployment uses the MC primarily as a monitoring tool. For example, you cannot provision and create a database with an Enterprise Mode MC. For information about creating and managing an Enterprise Mode database, see Create a database using administration tools.
Create a deployment
Eon Mode and Enterprise Mode require much of the same information for deployment. Any information that is not required for both deployment types is clearly marked.
1. Select the deployment type
Sign in to your Microsoft Azure account. From the Home screen, select Create a resource under Azure services.
Search for Vertica Data Warehouse and select it from the search results.
On the Vertica Data Warehouse page, select one of the following:
To deploy an MC instance that can manage an Eon Mode database, select Vertica Data Warehouse, Eon BYOL.
To deploy an Enterprise Mode database, select Vertica Data Warehouse, Enterprise BYOL.
On the next screen, select Create.
After you select your deployment type, the Basics tab on the Create Vertica Data Warehouse page displays.
2. Add project and instance details on the basics tab
Provide the following information in the Project details and Instance details sections:
Subscription: Azure bills this subscription for the cluster resources.
Resource group: The location to save all of the Azure resources. Create a new resource group or choose an existing one from the dropdown list.
Region: The location where the virtual machine running your MC instance is deployed.
Vertica Management Console User: Eon Mode only. The administrator username for the MC.
SSH public key for OS Access: Provide the SSH public key associated with the Vertica User, for command line access to the virtual machine.
Password for MC Access: Enter a password to log in to Management Console. Note that Management Console requires that you change your password after the initial login.
Confirm password: Reenter the value you entered in Password for MC Access.
Select Next: Virtual Machine Settings >.
3. Select virtual machine settings
Provide the following information on the Virtual Machine Settings tab:
Management Console VM size: Select Change size to customize the VM settings or select the default. For a list of VM types recommended by use case, see Recommended Azure VM types and operating systems.
Storage account of Eon DB: Eon Mode only. The storage account associated with the database deployment.
Number of Vertica Cluster nodes: Enterprise Mode only. The number of nodes to deploy in the cluster, in addition to the MC instance. The Community Edition (CE) license is automatically applied to the cluster. This license is limited to 1 TB of RAW data and 3 Vertica nodes. If you select more than 3 nodes with a CE license, the initial database is created on the first 3 nodes. For information about upgrading your license, see Managing licenses.
Vertica Node VM size: Enterprise Mode only. Select the VM type to deploy in your cluster. Use the default or select Change size to customize the VM settings. For a list of VM types recommended by use case, see Recommended Azure VM types and operating systems.
Total RAW storage per node: Enterprise Mode only. Select the amount of storage per node from the dropdown list. Each VM has a set of premium data disks that are configured and presented as a single storage location.
Select Next: Network Settings >.
4. Select network settings
Provide the following information on the Network Settings tab:
Virtual Network: The virtual network that hosts the Vertica cluster. Create a new virtual network or select an existing one from the dropdown list. If you select an existing virtual network, Vertica recommends that you first create a subnet in it to use for the deployment.
First subnet: The subnet for the associated Virtual Network. Create a new subnet or select an existing one from the dropdown list.
Public IP Address Resource Name: Each VM is configured with a publicly accessible IP address. This field allows you to specify the resource name for those IP addresses, and whether they are static or dynamic. The first public IP address resource is created exactly as entered, and associated with the Vertica Management Console. Azure appends a number from 1 to 16 to the resource name for each additional Vertica cluster node created. This number associates each VM with a resource.
Domain Name Label for Management Console: Because each VM has a public IP address, each node requires a DNS name. Enter a prefix for the name. The first DNS name is created exactly as entered, and associated with the Vertica Management Console. Azure appends a number from 1 to 16 to the DNS name for each Vertica cluster node created. That number associates each VM with a resource. Azure adds the remaining part of the fully qualified domain name based on the location where you created the cluster.
Select Next: Review + create >.
5. Verify on review + create
As the Review + create page loads, Azure validates your settings. After it passes validation, review your settings. When you are satisfied with your selections, select Create.
Access the MC after deployment
After your resources are successfully deployed, you are brought to the Overview page on Home > resources-name > Deployments. You must retrieve your Management Console IP address and username to log in.
From the Overview page, select Outputs in the left navigation.
Copy the Vertica Management Console URL and Vertica Management Console user name.
Paste the Vertica Management Console URL in the browser address bar and press Enter.
Depending on your browser, you might receive a warning of a security risk. If you receive the warning, select the Advanced button and follow the browser's instructions to proceed to the Management Console.
On the Vertica Management Console login page, paste the Vertica Management Console user name, and enter the Password for MC Access that you entered on Basics > Project details when you were deploying your MC instance.
Delete a resource group
For details about the Azure Resource Manager and deleting a resource group, see the Azure documentation.
2.2.4 - Manually deploy Vertica on Microsoft Azure
Manually creating a database cluster for your Vertica deployment lets you customize your VMs to meet your specific needs. You often want to manually configure your VMs when deploying a Vertica cluster to host an Eon Mode database.
To start creating your Vertica cluster in Azure using manual steps, you first need to create a VM. During the VM creation process, you create and configure the other resources required for your cluster, which are then available for any additional VMs that you create.
2.2.4.1 - Configure and launch a new instance
An Azure VM is similar to a traditional host. Just as with an on-premises cluster, you must prepare and configure the hardware settings for your cluster and network before you install Vertica.
The first steps are:
From the Azure marketplace, select an operating system that Vertica supports.
A public IP is an IP address that you can use to connect to your cluster externally. For best results, assign a single static public IP to a node in your cluster. You can then connect to other nodes in your cluster from your primary node using the internal IP addresses that Azure generated when you specified your virtual network settings.
By default, a public IP address is dynamic; it changes every time you shut down the server. You can choose a static IP address, but doing so can add cost to your deployment.
During a VM installation, you cannot set a DNS name. If you use dynamic public IPs, set the DNS name in the public IP resource for each VM after deployment.
If needed, to create additional VMs, repeat the previous instructions in this document.
2.2.4.2 - Connect to a virtual machine
Before you can connect to any of the VMs you created, you must first make your virtual network externally accessible. To do so, you must attach the public IP address you created during network configuration to one of your VMs.
Connect to your VM
To connect to your VM, complete the following tasks:
Connect to your VM using SSH with the public IP address you created in the configuration steps.
Authenticate using the credentials and authentication method you specified during the VM creation process.
Connect to other VMs
Connect to other virtual machines in your virtual network by first using SSH to connect to your publicly connected VM. Then, use SSH again from that VM to connect through the private IP addresses of your other VMs.
If you are using private key authentication, you may need to move your key file to the root directory of your publicly connected VM. Then, use PuTTY or WinSCP to connect to other VMs in your virtual network.
2.2.4.3 - Prepare the virtual machines
After you create your VMs, you need to prepare them for cluster formation.
Add the Vertica license and private key
Prepare your nodes by adding your private key (if you are using one) and your Vertica license. These steps assume that the initial user you configured is the DBADMIN user.
As the dbadmin user, copy your private key file from where you saved it locally onto your primary node.
Depending upon the procedure you use to copy the file, the permissions on the file may change. If permissions change, the install_vertica script fails with a message similar to the following:
Failed Login Validation 10.0.2.158, cannot resolve or connect to host as root.
If you receive a failure message, enter the following command to correct permissions on your private key file:
$ chmod 600 /<name-of-key>.pem
Copy your Vertica license to your primary VM. Save it in your home directory or other known location.
Install software dependencies for Vertica on Azure
In addition to the Vertica standard Package dependencies, as the root user, you must install the following packages before you install Vertica on Azure:
pstack
mcelog
sysstat
dialog
2.2.4.4 - Configure storage
Use a dedicated Azure storage account for node storage.
Caution
Do not store your information on the root volume, especially your data and catalog directories. Storing information on the root volume may result in data loss.
When configuring your storage, make sure to use a supported file system. For details, see File system.
Attach disk containers to virtual machines (VMs)
Using your previously created storage account, attach disk containers to your VMs that are appropriate to your needs.
For best performance, combine multiple storage volumes into RAID-0. For most RAID-0 implementations, attach 6 storage disk containers per VM.
Combine disk containers for storage
If you are using RAID, follow these steps to create a RAID-0 drive on your VMs. The following example shows how you can create a RAID-0 volume named md10 from 6 individual volumes named:
The RAID device can be renamed after a reboot. To ensure the filesystem is mounted in a predictable location on your VM, create a directory to use as the mount point to mount the filesystem. For example, you can choose to create a mount point named /data that you will use to store your database's catalog and data (or depot, if you are running Vertica in Eon Mode).
$ mkdir /data
Using a text editor, add an entry to the /etc/fstab file for the UUID of the filesystem and your mount point so it is mounted when the system boots:
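As a rough sketch of the RAID and fstab steps, the following prints the kind of commands and fstab line involved rather than running them. The member device names, the ext4 file system, the example UUID, and the fstab options are all assumptions for illustration. Check the actual device names on your VM and use a file system that Vertica supports.

```shell
# Hypothetical member devices; list the six disks actually attached to your VM.
raid_members="/dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg /dev/sdh"

# Print (do not run) the commands that would build and format the RAID-0 device.
echo "mdadm --create /dev/md10 --level=0 --raid-devices=6 $raid_members"
echo "mkfs.ext4 /dev/md10"

# Print an /etc/fstab line for a filesystem UUID and mount point.
# The ext4 default and the "defaults,nofail 0 2" fields are illustrative choices.
fstab_entry() {
    printf 'UUID=%s %s %s defaults,nofail 0 2\n' "$1" "$2" "${3:-ext4}"
}

# Hypothetical UUID; on a real VM, read it with: blkid -s UUID -o value /dev/md10
example_uuid="0a1b2c3d-1111-2222-3333-444455556666"
entry="$(fstab_entry "$example_uuid" /data)"
echo "$entry"
```

Referring to the file system by UUID rather than by device name keeps the mount stable even if the RAID device is renamed after a reboot.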
After you complete the download and extraction, use the install_vertica script to form a cluster and install the Vertica database software, as described in the next section.
Before you start
Before you run the install_vertica script:
Check the Virtual Network page for a list of current VMs and their associated private IP addresses.
Identify your storage location. The installer assumes that you have mounted your storage to /vertica/data. To create your database's data directory on the mounted RAID drive or in another location, provide its path as the value of the --data-dir option when you run the install_vertica script.
Caution
Do not store your data on the root drive.
Combine virtual machines (VMs)
The following example shows how to combine VMs using the install_vertica script.
While connected to your primary node, construct the following command to combine your nodes into a cluster.
Substitute the IP addresses for your VMs and include your root key file name, if applicable.
Include the --point-to-point parameter to configure spread to use direct point-to-point communication between all Vertica nodes, as required for clusters on Azure when installing or updating Vertica.
If you are using Vertica Community Edition, which limits you to three nodes, specify -L CE with no license file.
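A command that follows these guidelines might look like the sketch below. It only assembles and prints the command so you can review it before running it. The IP addresses, key file path, and license file path are hypothetical placeholders, and the --dba-user-password-disabled option is an assumption that suits key-based logins.

```shell
# Hypothetical values; substitute your own.
hosts="10.2.0.164,10.2.0.165,10.2.0.166"   # private IP addresses of your VMs
key_file="$HOME/examplekey.pem"            # root key file, if you use one
license_file="$HOME/license.dat"           # license copied to the primary node
data_dir="/vertica/data"                   # mounted storage location

# Assemble the command for review; run it yourself once the values are correct.
install_cmd="sudo /opt/vertica/sbin/install_vertica --hosts $hosts --dba-user-password-disabled --point-to-point --data-dir $data_dir --ssh-identity $key_file --license $license_file"
echo "$install_cmd"
```

For Community Edition, replace the --license argument and its file name with -L CE.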
After you combine your nodes, to reduce security risks, keep your key file in a secure place—separate from your cluster—and delete your on-cluster key with the shred command:
$ shred examplekey.pem
Important
You need your key file to perform future Vertica updates.
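Note that GNU shred, by default, overwrites a file's contents but leaves the file itself in place. The -u option also removes the file after overwriting it. The following minimal sketch shows this on a throwaway file (not your real key) and assumes GNU coreutils:

```shell
# Demonstrate on a throwaway file so the sketch is safe to run anywhere.
scratch_key="$(mktemp)"
echo "not a real key" > "$scratch_key"

# -u removes the file after overwriting it; without -u the file is left in place.
shred -u "$scratch_key"

if [ -e "$scratch_key" ]; then
    key_removed="no"
else
    key_removed="yes"
fi
echo "removed: $key_removed"
```

Confirm that you have a copy of the key stored elsewhere before you remove the on-cluster copy.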
Reboot your cluster to complete the cluster formation and Vertica installation.
Vertica supports automatic deployment on Google Cloud Platform (GCP) through the Google Cloud Launcher, or manual installation and deployment on GCP machines.
You can deploy a Vertica database on GCP running in either Enterprise Mode or Eon Mode. In Eon Mode, Vertica stores its data communally using Google Cloud Storage (GCS).
This section explains how to deploy a Vertica database to GCP.
Vertica Analytic Database supports a range of machine types, each optimized for different workloads. When you deploy your Vertica Analytic Database cluster to the Google Cloud Platform (GCP), different machine types are available depending on how you provision your database.
Note
Some machine types are not available across all regions.
The sections below list the GCP machine types that Vertica supports for Vertica cluster hosts, and for use in Management Console. For details on the configuration of the machine type options, see the Google Cloud documentation's Machine types page.
Machine types available for MC hosts
Vertica supports all N1, N2, E2, M1, M2, and C2 machine types to deploy an instance for running the Vertica Management Console.
Tip
In most cases, 8 vCPUs are sufficient when selecting a machine type for running the Management Console.
Machine types available for Vertica database cluster hosts
Vertica supports all N1, N2, E2, M1, M2, and C2 machine types to deploy cluster hosts.
Machine types for Vertica database cluster hosts provisioned from MC
The table below lists the GCP machine types that Vertica supports when you provision your cluster from Management Console.
Machine Type: Machine Names
N1 standard: n1-standard-16, n1-standard-32, n1-standard-64
N1 high-memory: n1-highmem-16, n1-highmem-32, n1-highmem-64
N2 standard: n2-standard-16, n2-standard-32, n2-standard-48, n2-standard-64
N2 high-memory: n2-highmem-16, n2-highmem-32, n2-highmem-48, n2-highmem-64
2.3.2 - Deploy Vertica from the Google cloud marketplace
The Vertica entries in the Google Cloud Launcher Marketplace let you quickly deploy a Vertica cluster in the Google Cloud Platform (GCP). Currently, three entries let you select the database mode and the license you want to use:
The Eon Mode BYOL (bring your own license) launcher deploys a single instance running the MC. You use this MC instance to deploy a Vertica database running on Eon Mode. This database has a community license applied to it initially. You can later upgrade it to a license you have obtained from Vertica. See Deploy an MC instance in GCP for Eon Mode for more information.
The Eon Mode BTH (by the hour) launcher also deploys a single instance running the MC that you use to deploy a database. This database has a by-the-hour license applied to it. Instead of paying for a license up front, you pay an hourly fee that covers both Vertica and running your instances. The BTH license is automatically applied to all clusters you create using a BTH MC instance. See Deploy an MC instance in GCP for Eon Mode for more information. If you choose, you can upgrade this hourly license to a longer-term license you purchase from Vertica. To move a BTH cluster to a BYOL license, follow the instructions in Moving a cloud installation from by the hour (BTH) to bring your own license (BYOL).
Note
Vertica clusters that use IPv6 to identify hosts have not been tested on GCP. Vertica recommends you use IPv4 addresses to identify the hosts in your cluster on GCP.
2.3.2.1 - Eon Mode on GCP prerequisites
Before deploying an Eon Mode database on GCP, you must take several steps:
Review the default service account's permissions for your GCP project.
Create an HMAC key to use when creating your cluster.
Create a communal storage location.
Service account permissions
Service accounts allow automated processes to authenticate with GCP. The Eon Mode database deployment process uses your GCP project's service account to deploy instances. When you create a new project, GCP automatically creates a default service account (identified by project_number-compute@developer.gserviceaccount.com) for the project and grants it the IAM role Editor. See the Google Cloud documentation's Understanding roles for details about this and other IAM roles.
The Editor role lets the service account create resources from the Marketplace. When you create an instance of the Management Console (MC), the MC uses the account to deploy further resources, such as provisioning instances for a database.
To deploy Vertica on GCP, your user account must have the:
Editor role.
runtimeconfig.waiters.getIamPolicy permission.
Creating an HMAC key
Vertica uses a hash-based message authentication code (HMAC) key to authenticate requests to access the communal storage location. This key has two parts: an access ID and a secret. When you create an Eon Mode database in GCP, you provide both parts of an HMAC key for the nodes to use to access communal storage.
To create an HMAC key:
Log in to your Google Cloud account.
If the name of the project you will use to create your database does not appear in the top banner, click the dropdown and select the correct project.
In the navigation menu in the upper-left corner, under the Storage heading, click Storage and select Settings.
In the Settings page, click Interoperability.
Scroll to the bottom of the page and find the User account HMAC heading.
Unless you have already set a default project, you will see the message stating you haven’t set a default project for your user account yet. Click the Set project-id as default project button to choose the current project as your default for interoperability.
Note
The project ID appears in the button label, not the project name.
Under Access keys for your user account, click Create a key.
Your new access key and secret appear in the HMAC key list. You will need them when you create your Eon Mode database. You can copy them to a handy location (such as a text editor) or leave a browser tab open to this page while you use another tab or window to create your database. These keys remain available on this page, so you do not need to worry about saving them elsewhere.
Caution
It is vital that you protect the security of your HMAC key. It can grant others access to your Eon Mode database's communal storage location. This means they could access all of the data in your database. Do not write the HMAC key anyplace where it may be exposed, such as email, shared folders, or similar insecure locations.
Creating a communal storage location
Your Eon Mode database needs a storage location for its communal storage. Eon Mode databases running on GCP use Google Cloud Storage (GCS) for their communal storage location. When you create your new Eon Mode database, you will supply the MC's wizard with a GCS URL for the storage location.
This location needs to meet the following criteria:
The URL must include at least a bucket name. You can use one or more levels of folders, as well. For example, the following GCS URLs are valid:
gs://verticabucket/mydatabase
gs://verticabucket/databases/mydatabase
gs://verticabucket
Multiple databases can share the same bucket, as long as each has its own folder.
If provided, the lowest-level folder in the URL must not already exist. For example, in the GCS URL gs://verticabucket/databases/mydatabase, the bucket named verticabucket and the directory named databases must exist. The subdirectory named mydatabase must not exist. The Vertica install process expects to create the final folder itself. If the folder already exists, the installation process fails.
The permissions on the bucket must be set to allow the service account read, write, and delete privileges on the bucket. The best role to assign to the user to gain these permissions is Storage Object Admin.
To prevent performance issues, the bucket must be in the same region as all of the nodes running the Eon Mode database.
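Before you enter a storage URL in the MC wizard, a quick format check can catch simple typos. The following sketch uses a hypothetical helper named is_gcs_url. It checks only the form of the URL (a gs:// scheme, a bucket name, and no empty folder names). It does not contact GCS, so it cannot confirm that the bucket exists or that the final folder is absent.

```shell
# Return 0 if the argument looks like gs://bucket or gs://bucket/folder[/folder...].
# This checks form only; it does not contact GCS or verify that folders exist.
is_gcs_url() {
    case "$1" in
        gs://*) rest="${1#gs://}" ;;
        *) return 1 ;;
    esac
    rest="${rest%/}"                 # tolerate one trailing slash
    case "$rest" in
        ""|/*|*//*) return 1 ;;      # empty bucket name or empty folder name
    esac
    return 0
}

for url in gs://verticabucket/mydatabase gs://verticabucket gs:// verticabucket/db; do
    if is_gcs_url "$url"; then
        echo "ok:  $url"
    else
        echo "bad: $url"
    fi
done
```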
If you create the database through the admintools UI, you must set gcsauth as a bootstrap parameter in admintools.conf. For more information on this and other GCP parameters, see Google Cloud Storage parameters.
[BootstrapParameters]
gcsauth = ID:secret
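One way to add this section without duplicating it is sketched below. The sketch writes to a scratch file instead of your real admintools.conf, and the access ID and secret are hypothetical placeholders. Substitute the values of your own HMAC key and the actual path of the file on your system.

```shell
# Work on a scratch file; the real file's location depends on your installation.
conf_file="$(mktemp)"
hmac_id="EXAMPLE_ACCESS_ID"      # hypothetical placeholder
hmac_secret="EXAMPLE_SECRET"     # hypothetical placeholder

# Append the section only if it is not already present.
if ! grep -q '^\[BootstrapParameters\]' "$conf_file"; then
    printf '[BootstrapParameters]\ngcsauth = %s:%s\n' "$hmac_id" "$hmac_secret" >> "$conf_file"
fi
cat "$conf_file"
```

Because the file then contains the HMAC secret, treat it with the same care as the key itself.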
2.3.2.2 - Deploy an Enterprise Mode database in GCP from the marketplace
The Vertica Cloud Launcher solution creates a Vertica Enterprise Mode database. The solution includes the Vertica Management Console (MC) as the primary UI for you to get started.
The launcher automatically creates a database named vdb using the Community Edition (CE) license. The CE license is limited to a maximum of 3 nodes. You can tell the launcher to add more than 3 nodes to your deployment. In this case, it uses the first three nodes in the cluster to create the database. The remaining nodes are not part of the database, but are added to your cluster. To add these nodes to your database, you must replace the Community Edition license with a license key you receive from the Software Entitlement support site. See Managing licenses for more information.
After the launcher creates the initial database, it configures the MC to attach to that database automatically.
Configure the Vertica cloud launcher solution
To get started with a deployment of Vertica from the Google Cloud Launcher, search for the Vertica Data Warehouse, Enterprise Mode entry.
Follow these steps:
Verify that your user account has the Editor role and the runtimeconfig.waiters.getIamPolicy permission.
From the listing page, click LAUNCH.
On the New Vertica Analytics Platform deployment page, enter the following information:
Deployment name: Each deployment must have a unique name. That name is used as the prefix for the names of all VMs created during the deployment. The deployment name can only contain lowercase characters, numbers, and dashes. The name must start with a lowercase letter and cannot end with a dash.
Zone: GCP breaks its cloud data centers into regions and zones. Regions are a collection of zones in the same geographical location. Zones are collections of compute resources, which vary from zone to zone.
For best results, pick the zone in your designated region that supports the latest Intel CPUs. For a complete listing of regions and zones, including supported processors, see Regions and Zones.
Service Account: Service accounts allow automated processes to authenticate with GCP. Select the default service account, identified by project_number-compute@developer.gserviceaccount.com.
Under Vertica Management Console, choose the configuration for the virtual machine that will run the Management Console. The Vertica Analytics Platform in Cloud Launcher always deploys the Vertica Management Console (MC) as part of the solution.
The default machine type for MC is sufficient for most deployments. You can choose another machine type that better suits any additional purposes, such as serving as a target node for backups, data transformation, or additional management tools.
Node count for Vertica Cluster: The total number of VMs you want to deploy in the Vertica Cluster. The default is 3.
Note
As mentioned above, the Cloud Launcher automatically deploys the Vertica Community Edition license, which limits the database to 3 nodes and up to 1 TB in raw data. Any additional nodes will be part of your database cluster, but will not be part of your database.
If you intend to use the Community Edition license for your database, leave the setting at 3. Otherwise, you would add nodes that will sit idle and cost you money without being part of your database.
Machine type for Vertica Cluster nodes: The Cloud Launcher builds each node in the cluster using the same machine type. Modify the machine type for your nodes based on the workloads you expect your database to handle. See Supported GCP machine types for more information.
Data disk type: GCP offers two types of persistent disk storage: Standard and SSD. The costs associated with Standard are less, but the performance of SSD storage is much better. Vertica recommends you use SSD storage. For more information on Standard and SSD persistent disks, see Storage Options.
Disk size in GB: Disk performance is directly tied to the disk size in GCP. The default value of 2000 GBs (2 TB) is the minimum disk size for SSD persistent disks that allows maximum throughput.
If you select a smaller disk size, the throughput performance decreases. If you select a larger disk size, the performance remains the same as the 2 TB option.
Network: VMs in GCP must exist on a virtual private cloud (VPC). When you created your GCP account, a default VPC was created. Create additional VPCs to isolate solutions or projects from one another. The Vertica Analytics Platform creates all the nodes in the same VPC.
Subnetwork: Just as a GCP account may have multiple VPCs, each VPC may also have multiple subnets. Use additional subnets to group or isolate solutions within the same VPC.
Firewall: If you want your MC to be accessible via the internet, check the Allow access to the Management Console from the Internet box. Vertica recommends you protect your MC using a firewall that restricts access to just the IP addresses of users that need to access it. You can enter one or more comma-separated CIDR address ranges.
After you have entered all the required information, click Deploy to begin the deployment process.
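The deployment name rules listed above (lowercase letters, numbers, and dashes only, starting with a lowercase letter and not ending with a dash) can be checked before you fill in the form. The following sketch uses a hypothetical helper named is_valid_deployment_name. It covers only the rules stated here, not any length limit that GCP may also apply.

```shell
# Return 0 if the name uses only lowercase letters, digits, and dashes,
# starts with a lowercase letter, and does not end with a dash.
is_valid_deployment_name() {
    printf '%s\n' "$1" | grep -Eq '^[a-z]([a-z0-9-]*[a-z0-9])?$'
}

for name in vertica-prod-1 Vertica 1vertica vertica-; do
    if is_valid_deployment_name "$name"; then
        echo "valid:   $name"
    else
        echo "invalid: $name"
    fi
done
```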
Monitor the deployment
After the deployment begins, Google Cloud Launcher automatically opens the Deployment Manager page that displays the status of the deployment. Items that are still being processed have a spinning circle to the left of them and the text is a light gray color. Items that have been created are dark gray in color, with an icon designating that resource type on the left.
After the deployment completes, a green check mark appears next to the deployment name in the upper left-hand section of the screen.
Accessing the cluster after deployment
After the deployment completes, the right-hand section of the screen displays the following information:
dbadmin password: A randomly generated password for the dbadmin account on the nodes. For security reasons, change the dbadmin password when you first log in to one of the Vertica cluster nodes.
mcadmin password: A randomly generated password for the mcadmin account for accessing the Management Console. For security reasons, change the mcadmin password after you first log in to the MC.
Vertica Node 1 IP address: The external IP address for the first node in the Vertica cluster is exposed here so that you can connect to the VM using a standard SSH client.
To access the MC, press the Access Vertica MC button in the Get Started section of the dialog box. Copy the mcadmin password and paste it when asked.
There are two ways to access the cluster nodes directly:
Use GCP's integrated SSH shell by selecting the SSH button in the Get Started section. This shell opens a pop-up in your browser that runs GCP's web-based SSH client. You are automatically logged on as the user you authenticated as in the GCP environment.
After you have access to the first Vertica cluster node, execute the su dbadmin command, and authenticate using the dbadmin password.
In addition, use other standard SSH clients to connect directly to the first Vertica cluster node. Use the Vertica Node 1 IP address listed on the screen as the dbadmin user, and authenticate with the dbadmin password.
Follow the on-screen directions to log in using the mcadmin account and accept the EULA. After you've been authenticated, access the initial database by clicking the vdb icon (looks like a green cylinder) in the Recent Databases section.
Using a custom service account
In general, you should use the default service account created by the GCP deployment (project_number-compute@developer.gserviceaccount.com), but if you want to use a custom service account:
The custom service account must have the Editor role.
Individual user accounts must have the Service Account User role on the custom service account.
2.3.2.3 - Deploy an MC instance in GCP for Eon Mode
To deploy an Eon Mode database to GCP using Google Cloud Platform Launcher, you must deploy a Management Console (MC) instance. You then use the MC instance to provision and deploy an Eon Mode database.
Once you have taken the steps listed in Eon Mode on GCP prerequisites, you are ready to deploy an Eon Mode database in GCP. To deploy an MC instance that is able to deploy Eon Mode databases to GCP:
Log in to your GCP account, if you are not currently logged in.
Verify that your user account has the Editor role and the runtimeconfig.waiters.getIamPolicy permission.
Verify that the name of the GCP project you want to use for the deployment appears in the top banner. If it does not, click the down arrow next to the project name and select the correct project.
Click the navigation menu icon in the top left of the page and select Marketplace.
In the Search for solutions box, type Vertica Eon Mode and press enter.
Click the search result for Vertica Data Warehouse, Eon Mode. There are two license options: by the hour (BTH) and bring your own license (BYOL). See Deploy Vertica from the Google cloud marketplace for more information on this license choice.
Click Launch on the license option you prefer.
On the following page, fill in the fields to configure your MC instance:
Deployment name identifies your MC deployment in the GCP Deployments page.
Zone is the location where the virtual machine running your MC instance will be deployed. Make this the same location where your communal storage bucket is located.
Service Account: Service accounts allow automated processes to authenticate with GCP. Select the default service account, identified by project_number-compute@developer.gserviceaccount.com.
Machine Type is the virtual hardware configuration of the instance that will run the MC. The default values here are "middle of the road" settings which are sufficient for most use cases. If you are doing a small proof-of-concept deployment, you can choose a less powerful instance to save some money. If you are planning on deploying multiple large databases, consider increasing the count of virtual CPUs and RAM. For details about Vertica's default volume configurations, see Eon Mode volume configuration defaults for GCP.
User Name for Access to MC is the administrator username for the MC. You can customize this if you want.
Network and Subnetwork are the virtual private cloud (VPC) network and subnet within that network you want your MC instance and your Vertica nodes to use. This setting does not affect your MC's external network address. If you want to isolate your Vertica cluster from other GCP instances in your project, create a custom VPC network and optionally a subnet in your GCP project and select them in these fields. See the Google Cloud documentation's VPC network overview page for more information.
Firewall enables access to the MC from the internet by opening port 5450 in the firewall. You can choose to not open this port by clearing the I accept opening a port in the firewall (5450) for Vertica box. However, if you do not open the port in the firewall, your MC instance will only be accessible from within the VPC network. Not opening the port will make accessing your MC instance much harder.
Source IP ranges for MC traffic: If you choose to open the MC for external access, add one or more CIDR address ranges to this box for the network addresses that you want to be able to access the MC.
Caution
Make the address ranges as limited as possible to reduce the chances of unauthorized access to your MC instance.
Click the Deploy button to start the deployment of your MC instance.
The deployment process will take several minutes.
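Because a malformed entry in the Source IP ranges field is easy to miss, you may want to check your list of CIDR ranges before you paste it into the form. The following sketch uses two hypothetical helpers, is_ipv4_cidr and all_cidrs_valid. It handles IPv4 ranges only and checks syntax, not whether a range is suitably narrow.

```shell
# Return 0 if the argument is an IPv4 CIDR range such as 203.0.113.0/24.
is_ipv4_cidr() {
    printf '%s\n' "$1" | grep -Eq '^([0-9]{1,3}\.){3}[0-9]{1,3}/([0-9]|[12][0-9]|3[0-2])$' || return 1
    addr="${1%/*}"
    old_ifs="$IFS"; IFS=.
    set -- $addr                     # split the address into its four octets
    IFS="$old_ifs"
    for octet in "$@"; do
        [ "$octet" -le 255 ] || return 1
    done
    return 0
}

# Return 0 only if every entry in a comma-separated list is a valid range.
all_cidrs_valid() {
    list="$(printf '%s' "$1" | tr -d ' ')"   # ignore spaces around commas
    old_ifs="$IFS"; IFS=,
    set -- $list
    IFS="$old_ifs"
    [ "$#" -gt 0 ] || return 1
    for range in "$@"; do
        is_ipv4_cidr "$range" || return 1
    done
    return 0
}

ranges="203.0.113.0/24, 198.51.100.7/32"
if all_cidrs_valid "$ranges"; then
    echo "ranges look valid"
else
    echo "at least one range is malformed" >&2
fi
```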
Using a custom service account
In general, you should use the default service account created by the GCP deployment (project_number-compute@developer.gserviceaccount.com), but if you want to use a custom service account:
The custom service account must have the Editor role.
Individual user accounts must have the Service Account User role on the custom service account.
Connect and log into the MC instance
After the deployment process is finished, the Deployment Manager page for your MC instance contains links to connect to the MC via your browser or ssh.
To connect to the MC instance:
The MC administrator user has a randomly-generated password that you need to log into the MC. Copy the password in the MC Admin Password field to the clipboard.
Click Access Management Console.
A new browser tab or window opens, showing you a page titled Redirection Notice. Click the link for the MC URL to continue to the MC login page.
Your browser will likely show you a security warning. The MC instance uses a self-signed security certificate. Most browsers treat these certificates as a security hazard because they cannot verify their origin. You can safely ignore this warning and continue. In most browsers, click the Advanced button on the warning page, and select the option to proceed. In Chrome, this is a link titled Proceed to xxx.xxx.xxx.xxx (unsafe). In Firefox, it is a button labeled Accept the Risk and Continue.
At the login screen, enter the MC administrator user name into the Username box. This user name is mcadmin, unless you changed the user name in the MC deployment form.
Paste the automatically-generated password you copied from the MC Admin Password field earlier into the Password box.
Click Log In.
Once you have logged into the MC, change the MC administrator account's password.
Caution
The automatically-generated password appears on the MC instance's deployment page and can be revealed in several locations in the deployment logs. Failure to change this password can lead to unauthorized access to your MC instance.
To change the password:
On the home page of the MC, under the MC Tools section, click MC Settings.
In the left-hand menu, click User Management.
Select the entry for the MC administrator account and click Edit.
Click either the Generate new or Edit password button to change the password. If you click the Generate new button, be sure to save the automatically-generated password in a safe location. If you click Edit password, you are prompted to enter a new password twice.
2.3.3 - Manually deploy an Enterprise Mode database on GCP
Before you create your Vertica cluster in Google Cloud Platform (GCP) using manual steps, you must create a virtual machine (VM) instance from the Compute Engine section of GCP.
Configure and launch a new instance
All VM instances that you create should be launched in the same virtual private cloud (VPC).
To configure and launch a new VM instance, follow these instructions:
From within the Compute Engine section of GCP, from the menu on the left-hand side of the screen, select VM Instances.
GCP displays all the VM instances that you have created so far.
Click CREATE INSTANCE.
Enter a name for the new instance.
Select the zone where you plan to deploy the instance.
GCP breaks its cloud data centers down by regions and zones. Regions are a collection of zones that are all in the same geographical location. Zones are collections of compute resources, which vary from zone to zone. Always pick the zone in your designated region that supports the latest Intel CPUs.
For a complete listing of regions and zones, including supported processors, see Regions and Zones.
Select a machine type.
GCE offers many different types of VM instances. For best results, only deploy Vertica on VM instances with 8 vCPUs or more and at least 30 GB of RAM.
Select the boot disk (image).
You create VM instances from a public or custom image. If you are starting with Vertica in GCP for the first time, select either the CentOS 7 or RHEL 7 public image. Those images have been tested thoroughly with Vertica.
After you have configured the VM instance to be used as a Vertica cluster node, GCP allows you to convert that instance into a custom image. Doing so allows you to deploy multiple versions of that VM instance; each VM instance is identical except for the node name and IP address.
Before you can connect to any of the VMs you created, you must first identify the external IP address. The VM instance section of GCP contains a list of all currently deployed VMs and their associated external IP addresses.
Connect to your VM
To connect to your VM, complete the following tasks:
Connect to your VM using SSH with the external IP address you created in the configuration steps.
Authenticate using the credentials and SSH key that you provided to your GCP account upon creation.
Connect to other VMs
To connect to other virtual machines in your virtual network:
Use SSH to connect to your publicly connected VM.
Use SSH again from that VM to connect through the private IP addresses of your other VMs.
Because GCP forces the use of private key authentication, you may need to move your key file to the root directory of your publicly connected VM. Then, use SSH to connect to other VMs in your virtual network.
Prepare the virtual machines
After you create your VMs, you need to prepare them for cluster formation.
Add the Vertica license and private key
Prepare your nodes by adding your private key (if you are using one) and your Vertica license. The following steps assume that the initial user you configured is the DBADMIN user:
As the DBADMIN user, copy your private key file from where you saved it locally onto your primary node.
Depending upon the procedure you use to copy the file, the permissions on the file may change. If permissions change, the install_vertica script fails with a message similar to the following:
Failed Login Validation 10.0.2.158, cannot resolve or connect to host as root.
If you see the previous failure message, enter the following command to correct permissions on your private key file:
$ chmod 600 /<name-of-key>.pem
Copy your Vertica license to your primary VM. Save it in your home directory or other known location.
Install software dependencies for Vertica on GCP
In addition to the Vertica standard Package dependencies, as the root user, you must install the following packages before you install Vertica:
pstack
mcelog
sysstat
dialog
Configure storage
For best disk performance in GCP, Vertica recommends customers use SSD persistent storage, configured to at least 2TB (2000 GB) in size. Disk performance is directly tied to the disk size in GCP. 2000 GBs (2TB) is the minimum disk size for SSD persistent disks that allows maximum throughput.
Caution
Do not store your information on the root volume, especially in your data and catalog directories. Storing information on the root volume may result in data loss.
When configuring your storage, make sure to use a supported file system. For details, see File system.
Create a swap file
In addition to storage volumes to store your data, Vertica requires a swap volume or swap file for the setup script to complete.
Create a swap file or swap volume of at least 2 GB. The following steps show how to create a swap file within Vertica on GCP:
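As a sketch of what creating a 2 GB swap file typically involves on Linux, the following prints the commands rather than running them, because they require root and change system state. The helper name swap_commands, the /swapfile path, and the fstab options are assumptions. Adapt them to your own VM.

```shell
# Print (do not run) the commands to create and enable a swap file.
# $1 = path of the swap file, $2 = size in GB.
swap_commands() {
    size_mb=$(( $2 * 1024 ))
    echo "dd if=/dev/zero of=$1 bs=1M count=$size_mb"   # allocate the file
    echo "chmod 600 $1"                                 # restrict access to root
    echo "mkswap $1"                                    # format it as swap
    echo "swapon $1"                                    # enable it now
    echo "echo '$1 swap swap defaults 0 0' >> /etc/fstab"   # enable it at boot
}

swap_plan="$(swap_commands /swapfile 2)"
echo "$swap_plan"
```

Review the printed commands, then run them as root on each VM that needs swap space.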
After you complete the download and extraction, use the install_vertica script to form a cluster and install the Vertica database software, as described in the next section.
Form a cluster and install Vertica
Use the install_vertica script to combine two or more individual VMs to form a cluster and install your Vertica database.
Before you run the install_vertica script, follow these steps:
Check the VM Instances page of the Compute Engine section on GCP to locate a list of current VMs and their associated internal IP addresses.
Identify your storage location on your VMs. The installer assumes that you have mounted your storage to /home/dbadmin. To specify another location, use the --data-dir argument.
Caution
Do not store your data on the root drive.
The following steps show how to combine virtual machines (VMs) into a cluster using the install_vertica script:
While connected to your primary node, construct the following command to combine your nodes into a cluster.
Substitute the IP addresses for your VMs, and include your root key file name, if applicable.
Include the --point-to-point parameter to configure spread to use direct point-to-point communication among all Vertica nodes, as required for clusters on GCP when installing or updating Vertica.
If you are using Vertica Community Edition, which limits you to three nodes, specify -L CE with no license file.
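For a Community Edition cluster, a command following these guidelines might be assembled as in the sketch below, which also checks the three-node limit first. The internal IP addresses and key file path are hypothetical, and the --dba-user-password-disabled option is an assumption that suits key-based logins. The sketch omits --data-dir, so the installer's assumed storage location applies.

```shell
# Hypothetical values; substitute your own.
hosts="10.128.0.2,10.128.0.3,10.128.0.4"   # internal IP addresses of your VMs
key_file="$HOME/examplekey.pem"            # root key file, if you use one

# Count the hosts, because Community Edition is limited to three nodes.
host_count="$(printf '%s\n' "$hosts" | tr ',' '\n' | grep -c .)"

if [ "$host_count" -gt 3 ]; then
    echo "Community Edition supports at most 3 nodes; got $host_count" >&2
else
    # Assemble the command for review; run it yourself once the values are correct.
    ce_install_cmd="sudo /opt/vertica/sbin/install_vertica --hosts $hosts --dba-user-password-disabled --point-to-point --ssh-identity $key_file -L CE"
    echo "$ce_install_cmd"
fi
```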
After you combine your nodes, to reduce security risks, keep your key file in a secure place—separate from your cluster—and delete your on-cluster key with the shred command:
$ shred examplekey.pem
Important
You need your key file to perform future Vertica updates.
When you installed Vertica, a database administrator user was created with the DBADMIN role (usually named dbadmin). Use this account to create and start a database.
This section discusses the procedure for installing Vertica manually in an on-premises environment.
3.1 - Installation overview and checklist
Carefully review and complete the installation tasks in all sections of this topic.
Important notes
Vertica supports only one running database per cluster.
Vertica supports installation on one, two, or multiple nodes. The steps for Installing Vertica are the same, no matter how many nodes are in the cluster.
Only one instance of Vertica can be running on a host at any time.
To run the install_vertica script, or to add, update, or delete nodes, you must be logged in as root, or use sudo as a user with all privileges. You must run the script for all installations, including upgrades and single-node installations.
Installation scenarios
The three main scenarios for installing Vertica on hosts are:
A single node install, where Vertica is installed on a single host as a localhost process. This form of install cannot be expanded to more hosts later on and is typically used for development or evaluation purposes.
Installing to a cluster of physical host hardware. This is the most common scenario when deploying Vertica in a testing or production environment.
Installing to a local cluster of virtual host hardware. This is similar to installing on physical hosts, but with network configuration differences.
Before you install
Before You Install Vertica describes how to construct a hardware platform and prepare Linux for Vertica installation.
These preliminary steps are broken into two categories:
Configuring Hardware and Installing Linux
Configuring the Network
Install or upgrade Vertica
Once you have completed the steps in the Before You Install Vertica section, you are ready to run the install script.
After You Install Vertica describes subsequent steps to take after you've run the installation script. Some of the steps can be skipped based on your needs:
Install the license key.
Verify that kernel and user parameters are correctly set.
Install the vsql client application on non-cluster hosts.
Resolve any SLES 11.3 issues during spread configuration.
Use the Vertica documentation online, or download and install Vertica documentation. Find the online documentation and documentation packages to download at https://docs.vertica.com/latest.
Complete all of the tasks in this section before you install Vertica. When you have completed this section, proceed to Install Vertica using the command line.
3.2.1 - Platform and hardware requirements and recommendations
Hardware recommendations
The Vertica Analytics Platform is based on a massively parallel processing (MPP), shared-nothing architecture, in which the query processing workload is divided among all nodes of the Vertica database. OpenText highly recommends using a homogeneous hardware configuration for your Vertica cluster; that is, each node of the cluster should be similar in CPU, clock speed, number of cores, memory, and operating system version.
Note that OpenText has not tested Vertica on clusters made up of nodes with disparate hardware specifications. While it is expected that a Vertica database would functionally work in a mixed hardware configuration, performance will be limited to that of the slowest node in the cluster.
Vertica performs best on processors with higher clock frequency. When possible, choose a faster processor with fewer cores as opposed to a slower processor with more cores.
Tests performed both internally and by customers have shown performance differences between processor architectures even when accounting for differences in core count and clock frequency. When possible, compare platforms by installing Vertica and running experiments using your data and workloads. Consider testing on cloud platforms that offer VMs running on different processor architectures, even if you intend to deploy your Vertica database on premises.
You must verify that your servers meet the platform requirements described in Supported Platforms. The Supported Platforms topics detail supported versions for the following:
OS for Server and Management Console (MC)
Supported Browsers for MC
Supported File Systems
Important
Deploy Vertica as the only active process on each host—other than Linux processes or software explicitly approved by Vertica. Vertica cannot be co-located with other software. Remove or disable all non-essential applications from cluster hosts.
Install the latest vendor-specific system software
Install the latest vendor drivers for your hardware.
Data storage recommendations
All internal drives connect to a single RAID controller.
The RAID array should form one hardware RAID device as a contiguous /data volume.
Install Perl
Before you perform the cluster installation, install Perl 5 on all the target hosts. Perl is available for download from www.perl.org.
Validation utilities
Vertica provides several validation utilities that validate the performance on prospective hosts. The utilities are installed when you install the Vertica RPM, but you can use them before you run the install_vertica script. See Validation scripts for more details on running the utilities and verifying that your hosts meet the recommended requirements.
Verify sudo
Vertica uses the sudo command during installation and some administrative tasks. Ensure that sudo is available on all hosts with the following command:
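The command is not reproduced here. One common way to check, sketched under the assumption that a POSIX shell is available on each host, is to look the command up with command -v. The helper name require_command is hypothetical. Run it on every host in the cluster.

```shell
# Report whether a command is available on this host.
require_command() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1: found at $(command -v "$1")"
    else
        echo "$1: not found" >&2
        return 1
    fi
}

require_command sudo || echo "Install sudo before running install_vertica."
```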
When you use sudo to install Vertica, the user that performs the installation must have privileges on all nodes in the cluster.
Configuring sudo with privileges for the individual commands can be a tedious and error-prone process; thus, the Vertica documentation does not include every possible sudo command that you can include in the sudoers file. Instead, Vertica recommends that you temporarily elevate the sudo user to have all privileges for the duration of the install.
Note
See the sudoers and visudo man pages for the details on how to write/modify a sudoers file.
To allow root sudo access on all commands as any user on any machine, use visudo as root to edit the /etc/sudoers file and add this line:
## Allow root to run any commands anywhere
root ALL=(ALL) ALL
After the installation completes, remove (or reset) sudo privileges to the pre-installation settings.
BASH shell requirements
All shell scripts included in Vertica must run under the BASH shell. If you are on a Debian system, then the default shell can be DASH. DASH is not supported. Change the shell for root and for the dbadmin user to BASH with the chsh command.
For example:
# getent passwd | grep root
root:x:0:0:root:/root:/bin/dash
# chsh
Changing shell for root.
New shell [/bin/dash]: /bin/bash
Shell changed.
Then, as root, change the symbolic link for /bin/sh from /bin/dash to /bin/bash:
# rm /bin/sh
# ln -s /bin/bash /bin/sh
Log out and back in for the change to take effect.
3.2.2 - Communal storage for on-premises Eon Mode databases
If you create an Eon Mode database, you must plan for your use of communal storage to store your database's data. Communal storage is based on a shared storage, such as AWS S3 or Pure Storage FlashBlade servers.
Whatever communal storage platform you use, you must ensure that it is durable (protected against data loss). The data in your Eon Mode database is only as safe as the communal storage that contains it. Most cloud providers' object stores come with guaranteed redundancy to prevent data loss. When you install an Eon Mode database on-premises, you may have to take additional steps to prevent data loss.
Planning communal storage capacity for on-premises databases
Most cloud providers do not limit the amount of data you can store in their object stores. The only real limit is your budget; storing more data costs more money.
When you create an Eon Mode database on-premises, your storage is limited to the size of your communal storage. Unlike the cloud, you must plan ahead for the amount of storage you will need. For example, if you have a Pure Storage FlashBlade installation with three 8TB blades, then in theory, your database can grow up to 24TB. In practice, you need to account for other uses of your object store, as well as factors such as data compression and space consumed by unreaped ROS containers (storage containers no longer used by Vertica but not yet deleted by the object store).
The following calculator helps you determine the size for your communal storage needs, based on your estimated data size and additional uses of your communal storage. The values with white backgrounds in the Value column are editable. Change them to reflect your environment.
Note
The calculator currently does not work in mobile browsers. Please use a desktop browser to view the calculator.
3.2.3 - Configure the network
This group of steps involves configuring the network. These steps differ depending on your installation scenario. A single node installation requires little network configuration, because the single instance of the Vertica server does not need to communicate with other nodes in a cluster. For cluster install scenarios, you must make several decisions regarding your configuration.
Vertica supports server configuration with multiple network interfaces. For example, you might want to use one as a private network interface for internal communication among cluster hosts (the ones supplied via the --hosts option to install_vertica) and a separate one for client connections.
Important
Vertica performs best when all nodes are on the same subnet and have the same broadcast address for one or more interfaces. A cluster that has nodes on more than one subnet can experience lower performance due to the network latency associated with a multi-subnet system at high network utilization levels.
Important notes
Network configuration is exactly the same for single nodes as for multi-node clusters, with one special exception. If you install Vertica on a single host machine that is to remain a permanent single-node configuration (such as for development or Proof of Concept), you can install Vertica using localhost or the loopback IP (typically 127.0.0.1) as the value for --hosts. Do not use the hostname localhost in a node definition if you are likely to add nodes to the configuration later.
If you are using a host with multiple network interfaces, configure Vertica to use the address which is assigned to the NIC that is connected to the other cluster hosts.
Use a dedicated gigabit switch. If you do not, performance could be severely affected.
Do not use DHCP dynamically-assigned IP addresses for the private network. Use only static addresses or permanently-leased DHCP addresses.
Choose IPv4 or IPv6 addresses for host identification and communications
Vertica supports using either IPv4 or IPv6 IP addresses for identifying the hosts in a database cluster. Vertica uses a single address to identify a host in the database cluster. All the IP addresses used to identify hosts in the cluster must use the same IP family.
The hosts in your database cluster can have both IPv4 and IPv6 network addresses assigned to them. Only one of these addresses is used to identify the node within the cluster. You can use the other addresses to handle client connections or connections to other systems.
You tell Vertica which address family to use when you install it. By default, Vertica uses IPv4 addresses for hosts. If you want the nodes in your database to use IPv6 addresses, add the --ipv6 option to the arguments you pass to the install_vertica script.
Note
You cannot change the address family a database cluster uses after you create it. For example, suppose you created a Vertica database using IPv4 addresses to identify the hosts in your cluster. Then you cannot later change the hosts to use an IPv6 address for internal communications.
In most cases, the address family you select does not impact how your database functions. However, there are a few exceptions:
Use IPv4 addresses to identify the nodes in your cluster if you want to use the Management Console to manage your database. Currently, the MC does not support databases that use IPv6 addresses.
If you select IPv6 addressing for your cluster, it automatically uses point-to-point networking mode.
Currently, AWS is the only cloud platform on which Vertica supports IPv6 addressing. To use IPv6 on AWS, you must identify cluster hosts using IP addresses instead of host names. The AWS DNS does not support resolving host names to IPv6.
If you only assign IPv6 addresses to the hosts in your database cluster, you may have problems interfacing to other systems that do not support IPv6.
Part of the information you pass to the install script is the list of hosts it will use to form the Vertica cluster. If you use host names in this list instead of IP addresses, ensure that the host names resolve to the IP address family you want to use for your cluster. For example, if you want your cluster to use IPv6 addresses, ensure your DNS or /etc/hosts file resolves the host names to IPv6 addresses.
You can configure DNS to return both IPv4 and IPv6 addresses for a host name. In this case, the installer uses the IPv4 address unless you supply the --ipv6 argument. If you use /etc/hosts for host name resolution (which is the best practice), host names cannot resolve to both IPv4 and IPv6 addresses.
Optionally run spread on a separate control network
If your query workloads are network intensive, you can use the --control-network parameter with the install_vertica script (see Install Vertica with the installation script) to allow spread communications to be configured on a subnet that is different from other Vertica data communications.
The --control-network parameter accepts either the default value or a broadcast network IP address (for example, 192.168.10.255).
Configure SSH
Verify that root can use Secure Shell (SSH) to log in (ssh) to all hosts that are included in the cluster. SSH (SSH client) is a program for logging into a remote machine and for running commands on a remote machine.
If you do not already have SSH installed on all hosts, log in as root on each host and install it before installing Vertica. You can download a free version of the SSH connectivity tools from OpenSSH.
Make sure that /dev/pts is mounted. Installing Vertica on a host that is missing the mount point /dev/pts could result in the following error when you create a database:
TIMEOUT ERROR: Could not login with SSH. Here is what SSH said:Last login: Sat Dec 15 18:05:35 2007 from v_vmart_node0001
Allow passwordless SSH access for the dbadmin user
The dbadmin user must be authorized for passwordless ssh. In typical installs, you won't need to change anything; however, if you set up your system to disallow passwordless login, you'll need to enable it for the dbadmin user. See Enable secure shell (SSH) logins.
3.2.3.1 - Reserved ports
The install_vertica script checks that required ports are open and available to Vertica. The installer reports any issues with identifier N0020.
You can also verify that ports required by Vertica are not in use by running the following command as the root user and comparing it with the ports required, as shown below:
$ ss -atupn
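To make the comparison less manual, the sketch below filters ss -atupn style output for a list of ports and prints the ones that already appear as local ports. The helper name ports_in_use is hypothetical, and the port list in the usage line is only an illustration: 5434 is the intra-cluster port named in this section, and 5433 is inferred from it as the default client port. Substitute the full list of ports that your Vertica version reserves.

```shell
# Read `ss -atupn`-style output on stdin and print each requested port that
# appears as a local port. The port list is supplied by the caller.
ports_in_use() {
    awk -v ports="$1" '
        BEGIN { n = split(ports, want, ",") }
        NR > 1 {
            # Column 5 is the local address:port; the port follows the last colon.
            m = split($5, parts, ":")
            port = parts[m]
            for (i = 1; i <= n; i++)
                if (port == want[i]) seen[port] = 1
        }
        END {
            for (i = 1; i <= n; i++)
                if (want[i] in seen) print want[i]
        }
    '
}

# Usage (as root): ss -atupn | ports_in_use 5433,5434
```

An empty result means none of the listed ports is currently bound on the host.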
Firewall requirements
Vertica requires several ports to be open on the local network. Vertica does not recommend placing a firewall between nodes (all nodes should be behind a firewall), but if you must use a firewall between nodes, ensure the following ports are available:
Intra- and inter-cluster communication. Vertica opens the Vertica client port +1 (5434 by default) for intra-cluster communication, such as during a plan. If the port +1 from the default client port is not available, then Vertica opens a random port for intra-cluster communication.
Port used to connect to MC from a web browser and allows communication from nodes to the MC application/web server. See Connecting to Management Console.
Vertica requires multiple ports be open between nodes. You may use a firewall (IP Tables) on Red Hat/CentOS and Ubuntu/Debian based systems. Note that firewall use is not supported on SuSE systems and that SuSE systems must disable the firewall. The installer reports issues found with your IP tables configuration with the identifiers N0010 (for systems that use IP Tables) and N011 (for SuSE systems).
The installer checks the IP tables configuration and issues a warning if there are any configured rules or chains. The installer does not detect if the configuration may conflict with Vertica. It is your responsibility to verify that your firewall allows traffic for Vertica as described in Reserved ports.
Note
The installer does not check NAT entries in iptables.
You can modify your firewall to allow for Vertica network traffic, or you can disable the firewall if your network is secure. Note that firewalls are not supported for Vertica systems running on SuSE.
Important
You may encounter the N0010 issue even when the firewall is disabled. If this occurs, you can work around this issue and install Vertica by ignoring installer WARN messages. To do this, install (or update) with a failure threshold of FAIL. For example, /opt/vertica/sbin/install_vertica --failure-threshold FAIL <other install options...>.
Red Hat and CentOS systems:
To disable the system firewall, run the following command as root or sudo:
To disable iptables on Ubuntu, run the following command:
$ sudo ufw disable
SuSE systems
The firewall must be disabled on SUSE systems. To disable the firewall on SuSE systems, run the following command:
# /sbin/SuSEfirewall2 off
3.2.4 - Operating system configuration overview
This topic provides a high-level overview of the OS settings required for Vertica. Each item provides a link to additional details about the setting and detailed steps on making the configuration change. The installer tests for all of these settings and provides hints, warnings, and failures if the current configuration does not meet Vertica requirements.
Before you install the operating system
The following sections detail system settings that must be configured when you install the operating system. These settings cannot be easily changed after the operating system is installed.
Configuration
Description
Supported Platforms
Verify that your servers meet the platform requirements described in Supported Platforms. Unsupported operating systems are detected by the installer.
The installer generates one of the following issue identifiers if it detects an unsupported operating system:
[S0320] - Fedora OS is not supported.
[S0321] - The version of Red Hat/CentOS is not supported.
[S0322] - The version of Ubuntu/Debian is not supported.
[S0323] - The operating system could not be determined. The unknown operating system is not supported because it does not match the list of supported operating systems.
[S0324] - The version of Red Hat is not supported.
LVM
Vertica Analytic Database supports Linux Volume Manager (LVM) on all supported operating systems. For information on LVM requirements and restrictions, see the section, Vertica Support for LVM.
File system
Choose the storage format type based on deployment requirements. Vertica recommends the following storage format types where applicable:
ext3
ext4
NFS for backup
XFS
Amazon S3 Standard, Azure Blob Storage, or Google Cloud Storage for communal storage and related backup tasks when running in Eon Mode
Note
For the Vertica I/O profile, the ext4 file system is considerably faster than ext3.
The storage format type at your backup and temporary directory locations must support fcntl lockf (POSIX) file locking.
Swap Space
A 2GB swap partition is required, regardless of the amount of RAM installed on your system. Larger swap space is acceptable, but unnecessary. Partition the remaining disk space in a single partition under "/". If you do not have the required 2GB swap partition, the installer reports this issue with identifier S0180.
You typically define the swap partition when you install Linux. See your platform’s documentation for details on configuring the swap partition.
Note
Do not place a swap file on a disk containing the Vertica data files. If a host has only two disks (boot and data), put the swap file on the boot disk.
Disk Block Size
The disk block size for the Vertica data and catalog directories should be 4096 bytes, the default on ext4 and XFS file systems. You set the disk block size when you format your file system. If you change the block size, you will need to reformat the disk.
Memory
Vertica requires that your hosts have a minimum of 1GB of RAM per logical processor. If your hosts do not meet this requirement, the installer reports this issue with the identifier S0190. For performance reasons, you typically require more RAM than the minimum.
In addition to the individual host RAM requirement, the installer also reports a hint if the hosts in your cluster do not have identical amounts of RAM. Ensuring your hosts have the same amount of RAM helps prevent performance issues if one or more nodes has less RAM than the other nodes in your database.
Note
In an Eon Mode database, after you create the initial cluster, you can configure subclusters that have different hardware specifications (including RAM) than the initial primary subcluster the installer creates.
Automatically configured operating system settings
These general OS settings are automatically made by the installer if they do not meet Vertica requirements. You can prevent the installer from automatically making these configuration changes by using the --no-system-configuration parameter for the install_vertica script.
The disk readahead must be at least 2048, with a high of 8192. Set this high limit only with the help of Vertica support. The specific value depends on your hardware configuration.
The installer automatically creates a user with the correct settings. If you specify a user with --dba-user, then the user must conform to the requirements for the Vertica system user.
The TZ environment variable must be set and valid for the database administration user.
3.2.5 - Automatically configured operating system settings
These general Operating System settings are automatically made by the installer. You can prevent the installer from automatically making these configuration changes by using the --no-system-configuration parameter for the install_vertica script.
3.2.5.1 - Sysctl
During installation, Vertica attempts to automatically change various OS level settings. The installer may not change values on your system if they exceed the threshold required by the installer. You can prevent the installer from automatically making these configuration changes by using the --no-system-configuration parameter for the install_vertica script.
To permanently edit certain settings and prevent them from reverting on reboot, use sysctl.
The sysctl settings relevant to the installation of Vertica include:
For example, to set the parameter and value for fs.file-max to meet Vertica requirements, enter:
fs.file-max = 65536
Save your changes, and close the /etc/sysctl.conf file.
As the root user, reload the config file:
# sysctl -p
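If you prefer to script the edit rather than open the file by hand, the following sketch replaces an existing entry for a key, or appends one if none exists. The helper name set_sysctl_entry is hypothetical and is not part of Vertica or sysctl. Try it on a scratch copy of the file before pointing it at /etc/sysctl.conf, and run sysctl -p afterwards to load the change.

```shell
# Set "key = value" in a sysctl-style file, replacing any existing
# uncommented entry for the same key or appending a new one.
set_sysctl_entry() {
    key=$1
    value=$2
    conf=${3:-/etc/sysctl.conf}
    tmp=$(mktemp) || return 1
    # Copy every line except an existing entry for this key.
    awk -v k="$key" '
        { line = $0; sub(/^[ \t]+/, "", line); split(line, kv, /[ \t]*=/) }
        kv[1] != k { print }
    ' "$conf" > "$tmp"
    echo "$key = $value" >> "$tmp"
    # Write back in place so the file keeps its owner and permissions.
    cat "$tmp" > "$conf"
    rm -f "$tmp"
}

# Usage (as root): set_sysctl_entry fs.file-max 65536 /etc/sysctl.conf
```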
Identifying settings added by the installer
You can see whether the installer has added a setting by opening the /etc/sysctl.conf file:
# vi /etc/sysctl.conf
If the installer has added a setting, the following line appears:
# The following 1 line added by Vertica tools. 2015-02-23 13:20:29
parameter = value
3.2.5.2 - Nice limits configuration
The Vertica system user (dbadmin by default) must be able to raise and lower the priority of Vertica processes. To do this, the nice option in the /etc/security/limits.conf file must include an entry for the dbadmin user. The installer reports this issue with the identifier: S0010.
The installer automatically configures the correct setting if the default value does not meet system requirements. If an issue occurs when setting this value, or you use the --no-system-configuration argument to the installer and the current setting is incorrect, then the installer reports this as an issue.
Note
Vertica never raises priority above the default level of 0. However, Vertica does lower the priority of certain Vertica threads and needs to be able to raise the priority of these threads back up to the default level. This setting allows Vertica to raise the priorities back to the default level.
All systems
To set the Nice Limit configuration for the dbadmin user, edit /etc/security/limits.conf and add the following line. Replace dbadmin with the name of your system user.
dbadmin - nice 0
3.2.5.3 - min_free_kbytes setting
This topic details how to update the min_free_kbytes setting so that it is within the range supported by Vertica. The installer reports this issue with the identifier: S0050 if the setting is too low, or S0051 if the setting is too high.
The vm.min_free_kbytes setting configures the page reclaim thresholds. When this number is increased, the system starts reclaiming memory earlier; when it is lowered, it starts reclaiming memory later. The default min_free_kbytes is calculated at boot time based on the number of pages of physical RAM available on the system.
The setting must be whichever value is the greatest from the following options:
The installer automatically configures the correct setting if the default value does not meet system requirements. If an issue occurs when setting this value, or you use the --no-system-configuration argument to the installer and the current setting is incorrect, then the installer reports this as an issue.
All systems
To manually set min_free_kbytes:
Determine the current/default setting with the following command:
$ sysctl vm.min_free_kbytes
If the result of the previous command is No such file or directory or the default value is less than 4096, then run these commands to determine the correct value:
Edit or add the current value of vm.min_free_kbytes in /etc/sysctl.conf with the value from the output of the previous command.
# The min_free_kbytes setting
vm.min_free_kbytes=16132
Run sysctl -p to apply the changes in sysctl.conf immediately.
Note
These steps must be repeated for each node in the cluster.
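The commands for calculating the value are not reproduced above. The sketch below rests on an assumption that should be checked against the Vertica documentation for your version: that the candidate value is the square root of (MemTotal in kB × 16), rounded down, and never less than 4096. That assumption is at least consistent with the example value 16132 shown above, which corresponds to a host with roughly 16 GB of RAM. The helper name min_free_kbytes_for is hypothetical. Compare the result with the current system default and use whichever is greater.

```shell
# Compute a candidate vm.min_free_kbytes value from MemTotal (in kB).
# Assumption: candidate = floor(sqrt(MemTotal_kB * 16)), with a floor of 4096.
min_free_kbytes_for() {
    awk -v mem_kb="$1" 'BEGIN {
        v = int(sqrt(mem_kb * 16))
        if (v < 4096) v = 4096
        print v
    }'
}

# Usage on a live host:
#   min_free_kbytes_for "$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)"
```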
3.2.5.4 - User max open files limit
This topic details how to change the user max open-files limit setting to meet Vertica requirements. The installer reports this issue with the identifier S0060.
The installer automatically configures the correct setting if the default value does not meet system requirements. If an issue occurs when setting this value, or you use the --no-system-configuration argument to the installer and the current setting is incorrect, then the installer reports this as an issue.
Vertica requires that the dbadmin user not be limited when opening files. The open file limit per user is calculated as follows:
user max open files = greater of { 65536, RAM in MB }
As a dbadmin user, you can determine the open file limit by running ulimit -n. For example:
$ ulimit -n
65536
To manually set the limit, edit /etc/security/limits.conf and edit/add the nofile setting for the user who is configured as the database administrator—by default, dbadmin. For example:
dbadmin - nofile 65536
The setting must be no less than 65536, but not greater than the system value of fs.nr_open. For example, the default value of fs.nr_open on Red Hat Enterprise Linux 9 is 1048576.
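The rule in this section can be sketched as a small calculation: take the greater of 65536 and the host RAM in MB, then cap the result at fs.nr_open. The helper name recommended_nofile is hypothetical, and the sample values are illustrative only.

```shell
# Suggest a per-user open-file limit: the greater of 65536 and RAM in MB,
# capped at the system value of fs.nr_open.
recommended_nofile() {
    ram_mb=$1
    nr_open=$2
    limit=65536
    if [ "$ram_mb" -gt "$limit" ]; then limit=$ram_mb; fi
    if [ "$limit" -gt "$nr_open" ]; then limit=$nr_open; fi
    echo "$limit"
}

recommended_nofile 32768 1048576     # 32 GB host: prints 65536
recommended_nofile 262144 1048576    # 256 GB host: prints 262144
```

On a live host, sysctl -n fs.nr_open supplies the second argument.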
This topic details how to modify the limit for the number of open files on your system so that it meets Vertica requirements. The installer reports this issue with the identifier: S0120.
Vertica opens many files. Some platforms have global limits on the number of open files. The open file limit must be set sufficiently high so as not to interfere with database operations.
The recommended value is at least the amount of memory in MB, but not less than 65536.
The installer automatically configures the correct setting if the default value does not meet system requirements. If an issue occurs when setting this value, or you use the --no-system-configuration argument to the installer and the current setting is incorrect, then the installer reports this as an issue.
All systems
To manually set the open file limit:
Run /sbin/sysctl fs.file-max to determine the current limit.
If the limit is not 65536 or the amount of system memory in MB (whichever is higher), then edit or add fs.file-max=max number of files to /etc/sysctl.conf.
# Controls the maximum number of open files
fs.file-max=65536
Run sysctl -p to apply the changes in sysctl.conf immediately.
Note
These steps will need to be replicated for each node in the cluster.
3.2.5.6 - Pam limits
This topic details how to enable the "su" pam_limits.so module required by Vertica. The installer reports issues with the setting with the identifier: S0070.
On some systems the pam module called pam_limits.so is not set in the file /etc/pam.d/su. When it is not set, it prevents the conveying of limits (such as open file descriptors) to any command started with su -.
In particular, the Vertica init script would fail to start Vertica because it calls the Administration Tools to start a database with the su - command. This problem was first noticed on Debian systems, but the configuration could be missing on other Linux distributions. See the pam_limits man page for more details.
The installer automatically configures the correct setting if the default value does not meet system requirements. If an issue occurs when setting this value, or you use the --no-system-configuration argument to the installer and the current setting is incorrect, then the installer reports this as an issue.
All systems
To manually configure this setting, append the following line to the /etc/pam.d/su file:
session required pam_limits.so
See the pam_limits man page for more details: man pam_limits.
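Before appending the line, it can help to check whether an active entry is already present, so that repeated runs do not add duplicates. A minimal sketch, with a hypothetical helper name has_pam_limits:

```shell
# Succeed if the given PAM file already has an active (uncommented)
# "session required pam_limits.so" line.
has_pam_limits() {
    grep -Eq '^[[:space:]]*session[[:space:]]+required[[:space:]]+pam_limits\.so' "${1:-/etc/pam.d/su}"
}

# Usage (as root), appending the line only when it is missing:
#   has_pam_limits /etc/pam.d/su || echo 'session required pam_limits.so' >> /etc/pam.d/su
```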
3.2.5.7 - pid_max setting
This topic explains how to change pid_max to a supported value. The value of pid_max should be
where num-user-proc is the size of memory in megabytes.
The minimum value for pid_max is 524288.
If your pid_max value is too low, the installer reports this problem and indicates the minimum value.
The installer automatically configures the correct setting if the default value does not meet system requirements. If an issue occurs when setting this value, or you use the --no-system-configuration argument to the installer and the current setting is incorrect, then the installer reports this as an issue.
All systems
To change the pid_max value:
# sysctl -w kernel.pid_max=524288
3.2.5.8 - User address space limits
This topic details how to modify the Linux address space limit for the dbadmin user so that it meets Vertica requirements. The address space setting controls the maximum number of threads and processes for each user. If this setting does not meet the requirements then the installer reports this issue with the identifier: S0090.
The installer automatically configures the correct setting if the default value does not meet system requirements. If an issue occurs when setting this value, or you use the --no-system-configuration argument to the installer and the current setting is incorrect, then the installer reports this as an issue.
The address space available to the dbadmin user must not be reduced via user limits and must be set to unlimited.
All systems
To manually set the address space limit:
Run ulimit -v as the dbadmin user to determine the current limit.
If the limit is not unlimited, then add the following line to /etc/security/limits.conf. Replace dbadmin with your database admin user.
dbadmin - as unlimited
3.2.5.9 - User file size limit
This topic details how to modify the file size limit for files on your system so that it meets Vertica requirements. The installer reports this issue with the identifier: S0100.
The installer automatically configures the correct setting if the default value does not meet system requirements. If an issue occurs when setting this value, or you use the --no-system-configuration argument to the installer and the current setting is incorrect, then the installer reports this as an issue.
The file size limit for the dbadmin user must not be reduced via user limits and must be set to unlimited.
All systems
To manually set the file size limit:
Run ulimit -f as the dbadmin user to determine the current limit.
If the limit is not unlimited, then edit/add the following line to /etc/security/limits.conf. Replace dbadmin with your database admin user.
dbadmin - fsize unlimited
3.2.5.10 - User process limit
This topic details how to change the user process limit so that it meets Vertica requirements. The installer reports this issue with the identifier: S0110.
The installer automatically configures the correct setting if the default value does not meet system requirements. If an issue occurs when setting this value, or you use the --no-system-configuration argument to the installer and the current setting is incorrect, then the installer reports this as an issue.
The user process limit must be high enough to allow for the many threads opened by Vertica. The recommended limit is the amount of RAM in MB and must be at least 1024.
All systems
To manually set the user process limit:
Run ulimit -u as the dbadmin user to determine the current limit.
If the limit is not the amount of memory in MB on the server, then edit/add the following line to /etc/security/limits.conf. Replace 4096 with the amount of system memory, in MB, on the server.
dbadmin - nproc 4096
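The value for the nproc line can be derived from the MemTotal figure in /proc/meminfo. The following is a minimal sketch under that assumption: the function name recommended_nproc is hypothetical, and MemTotal is slightly lower than the installed RAM, so treat the result as an approximation.

```shell
# Hypothetical helper: recommended nproc limit for a given MemTotal in kB.
recommended_nproc() {
  mem_kb=$1
  mem_mb=$((mem_kb / 1024))      # recommended limit is the RAM size in MB
  if [ "$mem_mb" -lt 1024 ]; then
    mem_mb=1024                  # the limit must be at least 1024
  fi
  printf '%s\n' "$mem_mb"
}

# On a live host:
#   mem_kb=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
#   echo "dbadmin - nproc $(recommended_nproc "$mem_kb")"
```

The printed line has the same form as the limits.conf entry shown above, with the host's own value in place of 4096.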
3.2.5.11 - Maximum memory maps configuration
This topic details how to modify the limit for the number of memory maps a process can have on your system so that it meets Vertica requirements. The installer reports this issue with the identifier: S0130.
The installer automatically configures the correct setting if the default value does not meet system requirements. If an issue occurs when setting this value, or you use the --no-system-configuration argument to the installer and the current setting is incorrect, then the installer reports this as an issue.
Vertica uses a lot of memory while processing and can approach the default limit for memory maps per process.
The recommended value is at least the amount of memory on the system in KB / 16, but not less than 65536.
All systems
To manually set the memory map limit:
Run /sbin/sysctl vm.max_map_count to determine the current limit.
If the limit is not 65536 or the amount of system memory in KB / 16 (whichever is higher), then edit/add the following line to /etc/sysctl.conf. Replace 65536 with the value for your system.
# The following 1 line added by Vertica tools. 2014-03-07 13:20:31
vm.max_map_count=65536
Run sysctl -p to apply the changes in sysctl.conf immediately.
Note
These steps will need to be replicated for each node in the cluster.
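The recommended vm.max_map_count value is simple arithmetic on the amount of system memory. The following sketch works it out from a memory size in KB; the function name recommended_max_map_count is hypothetical, and reading MemTotal from /proc/meminfo is an assumption about where the memory figure comes from.

```shell
# Hypothetical helper: memory in KB / 16, but never less than 65536.
recommended_max_map_count() {
  mem_kb=$1
  value=$((mem_kb / 16))
  if [ "$value" -lt 65536 ]; then
    value=65536
  fi
  printf '%s\n' "$value"
}

# On a live host:
#   mem_kb=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
#   echo "vm.max_map_count=$(recommended_max_map_count "$mem_kb")"
```

For example, a host with 16 GB of memory (16777216 KB) yields 1048576, while a host with 512 MB falls back to the 65536 minimum.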
3.2.6 - Manually configured operating system settings
The topics in this section detail general Operating System settings that must be set manually.
Persisting operating system settings
To prevent manually set Operating System settings from reverting on reboot, you should configure some of these settings in the /etc/rc.local script. This script contains commands and scripts that run each time the system is booted.
Important
On reboot, SUSE systems use the /etc/init.d/after.local file rather than /etc/rc.local.
Vertica uses settings in /etc/rc.local to set the following functionality:
Enter a script or command. For example, to configure transparent hugepages to meet Vertica requirements, enter the following:
echo never > /sys/kernel/mm/redhat_transparent_hugepage/enabled
Important
On some Ubuntu/Debian systems, the last line in /etc/rc.local must be exit 0. All additions to /etc/rc.local must precede this line.
Save your changes, and close /etc/rc.local.
If you use Red Hat 7.0 or CentOS 7.0 or higher, run the following command as root or sudo:
$ chmod +x /etc/rc.d/rc.local
On reboot, the command runs during startup. You can also run the command manually as the root user, if you want it to take effect immediately.
Disabling tuning system service
If you use Red Hat 7.0 or CentOS 7.0 or higher, make sure the tuning system service does not start when Vertica reboots. Turning off tuning prevents monitoring of your OS and any tuning of your OS based on this monitoring. Tuning also enables THP silently, which can cause issues in other areas such as read ahead.
Run the following command as sudo or root:
$ chkconfig tuned off
3.2.6.1 - SUSE control groups configuration
On SuSE 12, the installer checks the control group (cgroup) setting for the cgroups that Vertica may run under:
verticad
vertica_agent
sshd
The installer verifies that the pids.max resource is large enough for all the threads that Vertica creates. We check the contents of:
If these files exist and they fail to include the value max, the installation stops and the installer returns a failure message (code S0340).
If these files do not exist, they are created automatically when systemd runs the verticad and vertica_agent startup scripts. However, the site's cgroup configuration process manages their default values. Vertica does not change the defaults.
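The pass/fail rule described above can be expressed as a small shell function. This is only a sketch of the rule as stated in this topic, not the installer's actual code; the function name pids_max_check is hypothetical.

```shell
# Hypothetical helper: returns 0 when the rule above would pass, 1 otherwise.
pids_max_check() {
  pids_file=$1
  # A missing file is not a failure; it is created later with site defaults.
  [ -e "$pids_file" ] || return 0
  # An existing file must contain the value max.
  grep -qx 'max' "$pids_file"
}

# Usage on a SUSE host:
#   for svc in verticad vertica_agent sshd; do
#     pids_max_check "/sys/fs/cgroup/pids/system.slice/$svc.service/pids.max" \
#       || echo "$svc: pids.max is not set to max"
#   done
```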
Pre-installation configuration
Before installing Vertica, configure your system as follows:
# Create the following directories:
sudo mkdir /sys/fs/cgroup/pids/system.slice/verticad.service/
sudo mkdir /sys/fs/cgroup/pids/system.slice/vertica_agent.service/
# sshd service dir should already exist, so don't need to create it
# Set pids.max values:
sudo sh -c 'echo "max" > /sys/fs/cgroup/pids/system.slice/verticad.service/pids.max'
sudo sh -c 'echo "max" > /sys/fs/cgroup/pids/system.slice/vertica_agent.service/pids.max'
sudo sh -c 'echo "max" > /sys/fs/cgroup/pids/system.slice/sshd.service/pids.max'
Persisting configuration for restart
After installation, you can configure control groups for subsequent reboots of the Vertica database. You do so by editing the configuration file /etc/init.d/after.local and adding the commands shown earlier.
Note
Because after.local is executed as root, it can omit sudo commands.
3.2.6.2 - Cron required for scheduled jobs
Admintools uses the Linux cron package to schedule jobs that regularly rotate the database logs. Without this package installed, the database logs will never be rotated. The lack of rotation can lead to a significant consumption of storage for logs. On busy clusters, Vertica can produce hundreds of gigabytes of logs per day.
cron is installed by default on most Linux distributions, but it may not be present on some SUSE 12 systems.
To install cron, run this command:
$ sudo zypper install cron
3.2.6.3 - Disk readahead
Vertica requires that Disk Readahead be set to at least 2048. The installer reports this issue with the identifier: S0020.
Note
These commands must be executed with root privileges and assume the blockdev program is in /sbin.
The blockdev program operates on whole devices, and not individual partitions. You cannot set the readahead value to different settings on the same device. If you run blockdev against a partition, for example: /dev/sda1, then the setting is still applied to the entire /dev/sda device. For instance, running /sbin/blockdev --setra 2048 /dev/sda1 also causes /dev/sda2 through /dev/sdaN to use a readahead value of 2048.
Red Hat/CentOS and SUSE based systems
For each drive in the Vertica system, Vertica recommends that you set the readahead value to at least 2048 for most deployments. The command immediately changes the readahead value for the specified disk. The second line adds the command to /etc/rc.local so that the setting is applied each time the system is booted. Note that some deployments may require a higher value and the setting can be set as high as 8192, under guidance of support.
Note
For systems that do not support /etc/rc.local, use the equivalent startup script that is run after the destination runlevel has been reached. For example SUSE uses /etc/init.d/after.local.
The following example sets the readahead value of the drive sda to 2048:
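A minimal sketch of the two steps described above, assuming the drive is /dev/sda and the startup script is /etc/rc.local. The helper name persist_readahead is hypothetical; it only guards against adding the same line to the startup script twice.

```shell
# Hypothetical helper: append a blockdev command for one device to a startup
# script, unless the identical line is already there.
persist_readahead() {
  device=$1            # for example /dev/sda
  rc_file=$2           # for example /etc/rc.local
  line="/sbin/blockdev --setra 2048 $device"
  grep -qxF -- "$line" "$rc_file" 2>/dev/null || printf '%s\n' "$line" >> "$rc_file"
}

# Usage, as root:
#   /sbin/blockdev --setra 2048 /dev/sda        # takes effect immediately
#   persist_readahead /dev/sda /etc/rc.local    # reapplied at each boot
```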
If you are using Red Hat 7.0 or CentOS 7.0 or higher, run the following command as root or sudo:
$ chmod +x /etc/rc.d/rc.local
Ubuntu and Debian systems
For each drive in the Vertica system, set the readahead value to 2048. Run the command once in your shell, then add the command to /etc/rc.local so that the setting is applied each time the system is booted. Note that on Ubuntu systems, the last line in rc.local must be "exit 0", so you must manually add the following line to /etc/rc.local before the last line with exit 0.
Note
For systems that do not support /etc/rc.local, use the equivalent startup script that is run after the destination runlevel has been reached. For example SuSE uses /etc/init.d/after.local.
/sbin/blockdev --setra 2048 /dev/sda
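Placing the command before the final exit 0 can also be scripted. The following is a sketch only: the function name add_before_exit0 is hypothetical, and it assumes the last line that reads exactly exit 0 is the one that ends the script.

```shell
# Hypothetical helper: add a command to an rc.local-style script, placing it
# before the last "exit 0" line when one exists.
add_before_exit0() {
  rc_file=$1; line=$2
  grep -qxF -- "$line" "$rc_file" && return 0     # already present
  tmp=$(mktemp) || return 1
  awk -v add="$line" '
    { lines[NR] = $0 }
    /^exit 0[ \t]*$/ { last = NR }                # remember the last "exit 0"
    END {
      for (i = 1; i <= NR; i++) {
        if (i == last) print add
        print lines[i]
      }
      if (!last) print add                        # no "exit 0" found: append
    }' "$rc_file" > "$tmp" && cat "$tmp" > "$rc_file"
  status=$?
  rm -f "$tmp"
  return "$status"
}

# Usage, as root:
#   add_before_exit0 /etc/rc.local '/sbin/blockdev --setra 2048 /dev/sda'
```

The file is rewritten with cat rather than mv so that its existing owner and permissions are kept.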
3.2.6.4 - I/O scheduling
Vertica requires that I/O Scheduling be set to deadline or noop. The installer checks what scheduler the system is using, reporting an unsupported scheduler issue with identifier: S0150. If the installer cannot detect the type of scheduler in use (typically if your system is using a RAID array), it reports that issue with identifier: S0151.
If your system is not using a RAID array, then complete the following steps to change your system to a supported I/O Scheduler. If you are using a RAID array, then consult your RAID vendor documentation for the best performing scheduler for your hardware.
Configure the I/O scheduler
The Linux kernel can use several different I/O schedulers to prioritize disk input and output. Most Linux distributions use the Completely Fair Queuing (CFQ) scheme by default, which gives input and output requests equal priority. This scheduler is efficient on systems running multiple tasks that need equal access to I/O resources. However, it can create a bottleneck when used on Vertica drives containing the catalog and data directories, because it gives write requests equal priority to read requests, and its per-process I/O queues can penalize processes making more requests than other processes.
Instead of the CFQ scheduler, configure your hosts to use either the Deadline or NOOP I/O scheduler for the drives containing the catalog and data directories:
The Deadline scheduler gives priority to read requests over write requests. It also imposes a deadline on all requests. After reaching the deadline, such requests gain priority over all other requests. This scheduling method helps prevent processes from becoming starved for I/O access. The Deadline scheduler is best used on physical media drives (disks using spinning platters), since it attempts to group requests for adjacent sectors on a disk, lowering the time the drive spends seeking.
The NOOP scheduler uses a simple FIFO approach, placing all input and output requests into a single queue. This scheduler is best used on solid state drives (SSDs). Because SSDs do not have a physical read head, no performance penalty exists when accessing non-adjacent sectors.
Failure to use one of these schedulers for the Vertica drives containing the catalog and data directories can result in slower database performance. Other drives on the system (such as the drive containing swap space, log files, or the Linux system files) can still use the default CFQ scheduler (although you should always use the NOOP scheduler for SSDs).
You can set your disk device scheduler by writing the name of the scheduler to a file in the /sys directory or using a kernel boot parameter.
Changing the scheduler through the /sys directory
You can view and change the scheduler Linux uses for I/O requests to a single drive using a virtual file under the /sys directory. The name of the file that controls the scheduler a block device uses is:
/sys/block/deviceName/queue/scheduler
Where deviceName is the name of the disk device, such as sda or cciss\!c0d1 (the first disk on an OpenText RAID array). Viewing the contents of this file shows you all of the possible settings for the scheduler. The currently-selected scheduler is surrounded by square brackets:
To change the scheduler, write the name of the scheduler you want the device to use to its scheduler file. You must have root privileges to write to this file. For example, to set the sda drive to use the deadline scheduler, run the following command as root:
Changing the scheduler immediately affects the I/O requests for the device. The Linux kernel starts using the new scheduler for all of the drive's input and output requests.
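As a sketch of how the scheduler file is read and written, the following helper extracts the bracketed name from the file's contents. The function name active_scheduler is hypothetical, sda is an example device name, and the string "noop deadline [cfq]" in the usage note is only an illustration of the file format described above.

```shell
# Hypothetical helper: print the active scheduler from the contents of a
# /sys/block/deviceName/queue/scheduler file (the name in square brackets).
active_scheduler() {
  printf '%s\n' "$1" | sed -n 's/.*\[\([^]]*\)\].*/\1/p'
}

# On a live host:
#   active_scheduler "$(cat /sys/block/sda/queue/scheduler)"
# To switch sda to the deadline scheduler, as root:
#   echo deadline > /sys/block/sda/queue/scheduler
```

Given the contents "noop deadline [cfq]", the helper prints cfq.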
Note
While tests show that changing the scheduler settings while Vertica is running does not cause problems, Vertica recommends shutting down any running database before you change the I/O scheduler or make any other changes to the system configuration.
Changes to the I/O scheduler made through the /sys directory only last until the system is rebooted, so you need to add the commands that change the I/O scheduler to a startup script (such as those stored in /etc/init.d, or through a command in /etc/rc.local). You also need to use a separate command for each drive on the system whose scheduler you want to change.
For example, you can make the configuration take effect immediately and also add it to rc.local so that it is used on subsequent reboots.
Note
For systems that do not support /etc/rc.local, use the equivalent startup script that is run after the destination runlevel has been reached. For example SuSE uses /etc/init.d/after.local.
On some Ubuntu/Debian systems, the last line in rc.local must be "exit 0". So you must manually add the following line to /etc/rc.local before the last line with exit 0.
You may prefer to use this method of setting the I/O scheduler over using a boot parameter if your system has a mix of solid-state and physical media drives, or has many drives that do not store Vertica catalog and data directories.
If you are using Red Hat 7.0 or CentOS 7.0 or higher, run the following command as root or sudo:
$ chmod +x /etc/rc.d/rc.local
Changing the scheduler with a boot parameter
Use the elevator kernel boot parameter to change the default scheduler used by all disks on your system. This is the best method to use if most or all of the drives on your hosts are of the same type (physical media or SSD) and will contain catalog or data files. You can also use the boot parameter to change the default to the scheduler the majority of the drives on the system need, then use the /sys files to change individual drives to another I/O scheduler. The format of the elevator boot parameter is:
elevator=schedulerName
Where schedulerName is deadline, noop, or cfq. You set the boot parameter using your bootloader (grub or grub2 on most recent Linux distributions). See your distribution's documentation for details on how to add a kernel boot parameter.
3.2.6.5 - Enabling or disabling transparent hugepages
You can modify transparent hugepages to meet Vertica configuration requirements:
For Red Hat/CentOS and SUSE 15.1, Vertica provides recommended settings to optimize your system performance by workload.
For all other systems, you must disable transparent hugepages or set them to madvise. The installer reports this issue with the identifier: S0310.
Recommended settings by workload for Red Hat/CentOS and SUSE 15.1
Vertica recommends transparent hugepages settings to optimize performance by workload. The following table contains recommendations for systems that primarily run concurrent queries (such as short-running dashboard queries), or sequential SELECT or load (COPY) queries:
Operating System | Concurrent | Sequential | Important Notes
Red Hat and CentOS | Disable | Enable |
SUSE 15.1 | Disable | Enable |
Additionally, Vertica recommends the following khugepaged settings to optimize for each workload:
Concurrent Workloads: Disable khugepaged with the following command:
Enabling transparent hugepages on Red Hat/CentOS and SUSE 15.1
Determine if transparent hugepages is enabled. To do so, run the following command.
cat /sys/kernel/mm/transparent_hugepage/enabled
[always] madvise never
The setting returned in brackets is your current setting.
For systems that do not support /etc/rc.local, use the equivalent startup script that is run after the destination runlevel has been reached. For example SuSE uses /etc/init.d/after.local.
You can enable transparent hugepages by editing /etc/rc.local and adding the following script:
if test -f /sys/kernel/mm/transparent_hugepage/enabled; then
echo always > /sys/kernel/mm/transparent_hugepage/enabled
fi
You must reboot your system for the setting to take effect, or, as root, run the following echo line to proceed with the install without rebooting:
If you are using Red Hat 7.0 or CentOS 7.0 or higher, run the following command as root or sudo:
$ chmod +x /etc/rc.d/rc.local
Disabling transparent hugepages on other systems
Note
SUSE did not offer transparent hugepage support in its initial 11.0 release. However, subsequent SUSE service packs do include support for transparent hugepages.
To determine if transparent hugepages is enabled, run the following command.
cat /sys/kernel/mm/transparent_hugepage/enabled
[always] madvise never
The setting returned in brackets is your current setting. Depending on your platform OS, the madvise setting may not be displayed.
You can disable transparent hugepages one of two ways:
Edit your boot loader (for example /etc/grub.conf). Typically, you add the following to the end of the kernel line. However, consult the documentation for your system before editing your bootloader configuration.
transparent_hugepage=never
Edit /etc/rc.local (on systems that support rc.local) and add the following script.
if test -f /sys/kernel/mm/transparent_hugepage/enabled; then
echo never > /sys/kernel/mm/transparent_hugepage/enabled
fi
For systems that do not support /etc/rc.local, use the equivalent startup script that is run after the destination runlevel has been reached. For example SuSE uses /etc/init.d/after.local.
Regardless of which approach you choose, you must reboot your system for the setting to take effect, or run the following echo line as root to proceed with the install without rebooting:
echo never > /sys/kernel/mm/transparent_hugepage/enabled
3.2.6.6 - Check for swappiness
The swappiness kernel parameter defines the amount, and how often, the kernel copies RAM contents to a swap space. Vertica recommends a value of 0. The installer reports any swappiness issues with identifier S0112.
You can check the swappiness value by running the following command:
$ cat /proc/sys/vm/swappiness
To set the swappiness value add or update the following line in /etc/sysctl.conf:
vm.swappiness = 0
This also ensures that the value persists after a reboot.
If necessary, you can change the swappiness value at runtime by logging in as root and running the following:
$ echo 0 > /proc/sys/vm/swappiness
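The "add or update" step for /etc/sysctl.conf can be scripted so that an existing vm.swappiness line is replaced rather than duplicated. The following is a minimal sketch; the function name set_sysctl_value is hypothetical, and it assumes one "key = value" assignment per line.

```shell
# Hypothetical helper: set "key = value" in a sysctl-style file, replacing an
# existing assignment of the same key or appending one if none exists.
set_sysctl_value() {
  conf_file=$1; key=$2; value=$3
  key_re=$(printf '%s' "$key" | sed 's/\./\\./g')   # escape dots for the regex
  if grep -Eq "^[[:space:]]*${key_re}[[:space:]]*=" "$conf_file"; then
    tmp=$(mktemp) || return 1
    sed -E "s|^[[:space:]]*${key_re}[[:space:]]*=.*|${key} = ${value}|" "$conf_file" > "$tmp" \
      && cat "$tmp" > "$conf_file"
    status=$?
    rm -f "$tmp"
    return "$status"
  fi
  printf '%s = %s\n' "$key" "$value" >> "$conf_file"
}

# Usage, as root:
#   set_sysctl_value /etc/sysctl.conf vm.swappiness 0
#   sysctl -p        # apply the file without rebooting
```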
3.2.6.7 - Enabling network time protocol (NTP)
Important
Data damage and performance issues might occur if you change host NTP settings while the database is running. Before you change the NTP settings, stop the database. If you cannot stop the database, stop the Vertica process of each host and change the NTP settings one host at a time.
The network time protocol (NTP) daemon must be running on all of the hosts in the cluster so that their clocks are synchronized. The spread daemon relies on all of the nodes to have their clocks synchronized for timing purposes. If your nodes do not have NTP running, the installation can fail with a spread configuration error or other errors.
Note
Different Linux distributions refer to the NTP daemon in different ways. For example, SUSE and Debian/Ubuntu refer to it as ntp, while CentOS and Red Hat refer to it as ntpd. If the following commands produce errors, try using the other NTP daemon reference name.
Verify that NTP is running
To verify that your hosts are configured to run the NTP daemon on startup, enter the following command:
$ chkconfig --list ntpd
Debian and Ubuntu do not support chkconfig, but they do offer an optional package. You can install this package with the command sudo apt-get install sysv-rc-conf. To verify that your hosts are configured to run the NTP daemon on startup with the sysv-rc-conf utility, enter the following command:
$ sysv-rc-conf --list ntpd
The chkconfig command can produce an error similar to ntpd: unknown service. If you get this error, verify that your Linux distribution refers to the NTP daemon as ntpd rather than ntp. If it does not, you need to install the NTP daemon package before you can configure it. Consult your Linux documentation for instructions on how to locate and install packages.
If the NTP daemon is installed, your output should resemble the following:
ntp 0:off 1:off 2:on 3:on 4:off 5:on 6:off
The output indicates the runlevels where the daemon runs. Verify that the current runlevel of the system (usually 3 or 5) has the NTP daemon set to on. If you do not know the current runlevel, you can find it using the runlevel command:
$ runlevel
N 3
Configure NTP for Red Hat 6/CentOS 6 and SLES
If your system is based on Red Hat 6/CentOS 6 or SUSE Linux Enterprise Server, use the service and chkconfig utilities to start NTP and have it start at startup.
$ /sbin/service ntpd restart
$ /sbin/chkconfig ntpd on
Red Hat 6/CentOS 6: NTP uses the default time servers at ntp.org. You can change the default NTP servers by editing /etc/ntp.conf.
SLES: By default, no time servers are configured. You must edit /etc/ntp.conf after the install completes and add time servers.
Configure NTP for Ubuntu and Debian
By default, the NTP daemon is not installed on some Ubuntu and Debian systems. First, install NTP, and then start the NTP process. You can change the default NTP servers by editing /etc/ntp.conf.
To verify that the Network Time Protocol Daemon (NTPD) is operating correctly, issue the following command on all nodes in the cluster.
For Red Hat 6/CentOS 6 and SLES:
$ /usr/sbin/ntpq -c rv | grep stratum
For Ubuntu and Debian:
$ ntpq -c rv | grep stratum
A stratum level of 16 indicates that NTP is not synchronizing correctly.
If a stratum level of 16 is detected, wait 15 minutes and issue the command again. It may take this long for the NTP server to stabilize.
If NTP continues to detect a stratum level of 16, verify that the NTP port (UDP Port 123) is open on all firewalls between the cluster and the remote machine to which you are attempting to synchronize.
Red Hat documentation related to NTP
The preceding links were current as of the last publication of the Vertica documentation and could change between releases.
3.2.6.8 - Enabling chrony or ntpd for Red Hat and CentOS systems
Before you can install Vertica, you must enable one of the following on your system for clock synchronization:
chrony
NTPD
You must enable and activate the Network Time Protocol (NTP) before installation. Otherwise, the installer reports this issue with the identifier S0030.
For information on installing and using chrony, see the information below. For information on NTPD see Enabling network time protocol (NTP). For more information about chrony, see Using chrony in the Red Hat documentation.
Install chrony
The chrony suite consists of:
chronyd - the daemon for clock synchronization.
chronyc - the command-line utility for configuring chronyd .
chrony is installed by default on some versions of Red Hat/CentOS 7. However, if chrony is not installed on your system, you must download it. To download chrony, run the following command as sudo or root:
# dnf install chrony
Verify that chrony is running
To view the status of the chronyd daemon, run the following command:
$ systemctl status chronyd
If chrony is running, an output similar to the following appears:
chronyd.service - NTP client/server
Loaded: loaded (/usr/lib/systemd/system/chronyd.service; enabled)
Active: active (running) since Mon 2015-07-06 16:29:54 EDT; 15s ago
Main PID: 2530 (chronyd)
CGroup: /system.slice/chronyd.service
└─2530 /usr/sbin/chronyd -u chrony
If chrony is not running, execute the following command as sudo or root. This command also causes chrony to run at boot time:
# systemctl enable chronyd
Verify that chrony is operating correctly
To verify that the chrony daemon is operating correctly, issue the following command on all nodes in the cluster:
$ chronyc tracking
An output similar to the following appears:
Reference ID : 198.247.63.98 (time01.website.org)
Stratum : 3
Ref time (UTC) : Thu Jul 9 14:58:01 2015
System time : 0.000035685 seconds slow of NTP time
Last offset : -0.000151098 seconds
RMS offset : 0.000279871 seconds
Frequency : 2.085 ppm slow
Residual freq : -0.013 ppm
Skew : 0.185 ppm
Root delay : 0.042370 seconds
Root dispersion : 0.022658 seconds
Update interval : 1031.0 seconds
Leap status : Normal
A stratum level of 16 indicates that chrony is not synchronizing correctly. If chrony continues to detect a stratum level of 16, verify that the UDP port 323 is open. This port must be open on all firewalls between the cluster and the remote machine to which you are attempting to synchronize.
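The stratum check can be automated across nodes by parsing the Stratum line of the chronyc tracking output shown above. The following is a sketch under that assumption; the function names chrony_stratum and chrony_synced are hypothetical, and the only rule applied is the one stated in this topic (a stratum of 16 means chrony is not synchronizing).

```shell
# Hypothetical helper: read `chronyc tracking` output on stdin and print the
# stratum value.
chrony_stratum() {
  awk -F':' '/^Stratum/ { gsub(/[ \t]/, "", $2); print $2; exit }'
}

# Hypothetical helper: succeed only when a stratum was found and it is not 16.
chrony_synced() {
  stratum=$(chrony_stratum)
  [ -n "$stratum" ] && [ "$stratum" -ne 16 ]
}

# Usage on each node:
#   chronyc tracking | chrony_synced || echo "chrony is not synchronizing"
```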
3.2.6.9 - SELinux configuration
Vertica does not support SELinux except when SELinux is running in permissive mode. If the installer detects that SELinux is installed but cannot determine the mode, it reports this issue with the identifier: S0080. If the mode can be determined, and the mode is not permissive, then the issue is reported with the identifier: S0081.
Red Hat and SUSE systems
You can either disable SELinux or change it to use permissive mode.
To disable SELinux:
Edit /etc/selinux/config and change setting for SELinux to disabled (SELINUX=disabled). This disables SELinux at boot time.
As root/sudo, type setenforce 0 to disable SELinux immediately.
To change SELinux to use permissive mode:
Edit /etc/selinux/config and change setting for SELINUX to permissive (SELINUX=Permissive).
As root/sudo, type setenforce Permissive to switch to permissive mode immediately.
Ubuntu and Debian systems
You can either disable SELinux or change it to use permissive mode.
To disable SELinux:
Edit /selinux/config and change setting for SELinux to disabled (SELINUX=disabled). This disables SELinux at boot time.
As root/sudo, type setenforce 0 to disable SELinux immediately.
To change SELinux to use permissive mode:
Edit /selinux/config and change setting for SELinux to permissive (SELINUX=Permissive).
As root/sudo, type setenforce Permissive to switch to permissive mode immediately.
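Before running the installer, you can check which mode the SELinux configuration file requests at boot. The following is a sketch only: the function names selinux_config_mode and selinux_mode_supported are hypothetical, and the path to the configuration file is passed in because it differs by platform, as described above.

```shell
# Hypothetical helper: print the SELINUX= value from a config file, lowercased.
selinux_config_mode() {
  sed -n 's/^[[:space:]]*SELINUX=[[:space:]]*\([A-Za-z]*\).*/\1/p' "$1" \
    | tail -n 1 | tr '[:upper:]' '[:lower:]'
}

# Hypothetical helper: succeed when the configured mode is permissive or
# disabled, the two options this topic describes.
selinux_mode_supported() {
  case $(selinux_config_mode "$1") in
    permissive|disabled) return 0 ;;
    *) return 1 ;;
  esac
}

# Usage:
#   selinux_mode_supported /etc/selinux/config || echo "SELinux needs reconfiguring"
```

This reads only the boot-time setting in the file; it does not report the mode that is currently in effect.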
3.2.6.10 - CPU frequency scaling
This topic details the various CPU frequency scaling methods supported by Vertica. In general, if you do not require CPU frequency scaling, then disable it so as not to impact system performance.
Important
Your systems may use significantly more energy when frequency scaling is disabled.
The installer allows CPU frequency scaling to be enabled when the cpufreq scaling governor is set to performance. If the cpu scaling governor is set to ondemand, and ignore_nice_load is 1 (true), then the installer fails with the error S0140. If the cpu scaling governor is set to ondemand and ignore_nice_load is 0 (false), then the installer warns with the identifier S0141.
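The three outcomes in the preceding paragraph can be summarized as a small decision function. This is a sketch of the rules as stated here, not the installer's code; the function name cpufreq_installer_result and its output strings are hypothetical, and combinations the paragraph does not mention are reported as "not covered".

```shell
# Hypothetical helper: map a governor and ignore_nice_load value to the
# installer outcome described above.
cpufreq_installer_result() {
  governor=$1; ignore_nice_load=$2
  case "$governor" in
    performance)
      echo "ok" ;;
    ondemand)
      case "$ignore_nice_load" in
        1) echo "fail S0140" ;;
        0) echo "warn S0141" ;;
        *) echo "not covered" ;;
      esac ;;
    *)
      echo "not covered" ;;
  esac
}

# On a live host, the governor for the first CPU can typically be read with:
#   cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
```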
CPU frequency scaling is a hardware and software feature that helps computers conserve energy by slowing the processor when the system load is low, and speeding it up again when the system load increases. This feature can impact system performance, since raising the CPU frequency in response to higher system load does not occur instantly. Always disable this feature on the Vertica database hosts to prevent it from interfering with performance.
You disable CPU scaling in your host's system BIOS. There may be multiple settings in your host's BIOS that you need to adjust in order to completely disable CPU frequency scaling. Consult your host hardware's documentation for details on entering the system BIOS and disabling CPU frequency scaling.
If you cannot disable CPU scaling through the system BIOS, you can limit the impact of CPU scaling by disabling the scaling through the Linux kernel or setting the CPU frequency governor to always run the CPU at full speed.
Caution
This method is not reliable, as some hardware platforms may ignore the kernel settings. For more information, see Vertica Hardware Guide.
The method you use to disable frequency scaling depends on the CPU scaling method being used in the Linux kernel. See your Linux distribution's documentation for instructions on disabling scaling in the kernel or changing the CPU governor.
3.2.6.11 - Enabling or disabling defrag
You can modify the defrag utility to meet Vertica configuration requirements, or to optimize your system performance by workload.
On all Red Hat/CentOS systems, you must disable the defrag utility to meet Vertica configuration requirements.
For SUSE 15.1, Vertica recommends that you enable defrag for optimized performance.
Recommended settings by workload for Red Hat/CentOS and SUSE 15.1
Vertica recommends defrag settings to optimize performance by workload. The following table contains recommendations for systems that primarily run concurrent queries (such as short-running dashboard queries), or sequential SELECT or load (COPY) queries:
For Ubuntu versions 18.04 and higher, run apt-get install rasdaemon instead of apt-get install mcelog.
SuSE systems
To install the required tools on SuSE systems, run the following commands as sudo or root.
# zypper install sysstat
# zypper install mcelog
There is no individual SuSE package for pstack/gstack. However, the gdb package contains gstack, so you could optionally install gdb instead, or build pstack/gstack from source. To install the gdb package:
# zypper install gdb
3.2.7 - System user configuration
The following tasks pertain to the configuration of the system user required by Vertica.
3.2.7.1 - System user requirements
Vertica has specific requirements for the system user that runs and manages Vertica. If you specify a user during install, but the user does not exist, then the installer reports this issue with the identifier: S0200.
System user requirement details
Vertica requires a system user to own database files and run database processes and administration scripts. By default, the install script automatically configures and creates this user for you with the username dbadmin. See Linux users created by Vertica for details on the default user created by the install script. If you decide to manually create your own system user, then you must create the user before you run the install script. If you manually create the user:
Note
Instances of dbadmin and verticadba are placeholders for the names you choose if you do not use the default values.
the user must have the same username and password on all nodes
the user must use the BASH shell as the user's default shell. If not, then the installer reports this issue with identifier [S0240].
the user must be in the verticadba group (for example: usermod -a -G verticadba userNameHere). If not, the installer reports this issue with identifier [S0220].
Note
You must create a verticadba group on all nodes. If you do not, then the installer reports the issue with identifier [S0210].
the user's login group must be either verticadba or a group with the same name as the user (for example, the home group for dbadmin is dbadmin). You can check the groups for a user with the id command. For example: id dbadmin. The "gid" group is the user's primary group. If this is not configured correctly then the installer reports this issue with the identifier [S0230]. Vertica recommends that you use verticadba as the user's primary login group. For example: usermod -g verticadba userNameHere. If the user's primary group is not verticadba as suggested, then the installer reports this with HINT [S0231].
the user must have a home directory. If not, then the installer reports this issue with identifier [S0260].
the user's home directory must be owned by the user. If not, then the installer reports the issue with identifier [S0270].
the system must be aware of the user's home directory (you can set it with the usermod command: usermod -m -d /path/to/new/home/dir userNameHere). If this is not configured correctly then the installer reports the issue with [S0250].
the user's home directory must be owned by the dbadmin's primary group (use the chown and chgrp commands if necessary). If this is not configured correctly, then the installer reports the issue with identifier [S0280].
the user's home directory should have secure permissions. Specifically, it should not be writable by the group or by other users. Ideally, the group and other permissions shown by ls should be "---" (nothing) or "r-x" (read and execute). If this is not configured as suggested, then the installer reports this with HINT [S0290].
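As an illustration of the home directory permission requirement, the following sketch checks that a directory is not writable by its group or by others. The helper name home_perms_secure is hypothetical, and the stat options assume GNU coreutils. The commented commands show one possible way to create the default group and user by hand.

```shell
# One possible way to create the group and user manually (run as root),
# using the default names as placeholders:
#   groupadd verticadba
#   useradd -m -g verticadba -s /bin/bash dbadmin

# home_perms_secure DIR
# Succeeds when DIR is not writable by its group or by others.
# Assumes GNU stat; BSD stat takes different options.
home_perms_secure() {
    mode=$(stat -c '%A' "$1") || return 2
    # mode looks like "drwxr-x---": characters 6 and 9 are the
    # group and other write bits.
    group_w=$(printf '%s' "$mode" | cut -c6)
    other_w=$(printf '%s' "$mode" | cut -c9)
    [ "$group_w" = "-" ] && [ "$other_w" = "-" ]
}

# Example: check the current user's home directory.
if home_perms_secure "${HOME:-.}"; then
    echo "home directory permissions look secure"
else
    echo "home directory is group- or world-writable (see HINT S0290)"
fi
```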
3.2.7.2 - TZ environment variable
This topic details how to set or change the TZ environment variable and update your tzdata package. If this variable is not set, then the installer reports this issue with the identifier: S0305.
Before installing Vertica, update the tzdata package for your system and set the default time zone for your database administrator account by specifying the TZ environment variable. If your database administrator is being created by the install_vertica script, then set the TZ variable after you have installed Vertica.
Update tzdata package
The tzdata package is a public-domain time zone database that is pre-installed on most Linux systems. The tzdata package is updated periodically for time-zone changes across the world. You should update to the latest tzdata package before installing or updating Vertica.
Update your tzdata package with the following command:
Red Hat-based systems: yum update tzdata
Debian and Ubuntu systems: apt-get install tzdata
Setting the default time zone
When a client receives the result set of a SQL query, all rows contain data adjusted, if necessary, to the same time zone. That time zone is the default time zone of the initiator node unless the client explicitly overrides it using the SQL SET TIME ZONE command described in the SQL Reference Manual. The default time zone of any node is controlled by the TZ environment variable. If TZ is undefined, the operating system time zone is used.
Important
The TZ variable must be set to the same value on all nodes in the cluster.
If your operating system timezone is not set to the desired timezone of the database then make sure that the Linux environment variable TZ is set to the desired value on all cluster hosts.
The installer returns a warning if the TZ variable is not set. If your operating system timezone is appropriate for your database, then the operating system timezone is used and the warning can be safely ignored.
Setting the time zone on a host
Important
If you explicitly set the TZ environment variable at a command line before you start the Administration Tools, the setting does not take effect. The Administration Tools use SSH to start copies on the other nodes, and each time SSH is used, the TZ variable for the startup command is reset. For TZ to take effect, set it in the .profile or .bashrc files on all nodes in the cluster.
You can set the time zone several different ways, depending on the Linux distribution or the system administrator’s preferences.
To set the system time zone on Red Hat and SUSE Linux systems, edit:
/etc/sysconfig/clock
To set the TZ variable, edit /etc/profile, /home/dbadmin/.bashrc, or /home/dbadmin/.bash_profile and add the following line (for example, for the US Eastern Time Zone):
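A minimal sketch of such a line, assuming the IANA zone name America/New_York for US Eastern Time (substitute the zone your database requires):

```shell
# Example line for /etc/profile, ~/.bashrc, or ~/.bash_profile.
# America/New_York is the IANA zone name for US Eastern Time, used here
# only as an example value.
export TZ="America/New_York"

# Confirm the value that the shell passes to child processes.
echo "TZ is set to: $TZ"
```

Remember that the same value must be set on every node in the cluster.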
This topic details how to set or change the LANG environment variable. The LANG environment variable controls the locale of the host. If this variable is not set, then the installer reports this issue with the identifier: S0300. If this variable is not set to a valid value, then the installer reports this issue with the identifier: S0301.
Set the host locale
Each host has a system setting for the Linux environment variable LANG. LANG determines the locale category for native language, local customs, and coded character set in the absence of the LC_ALL and other LC_ environment variables. LANG can be used by applications to determine which language to use for error messages and instructions, collating sequences, date formats, and so forth.
To change the LANG setting for the database administrator, edit /etc/profile, /home/dbadmin/.bashrc, or /home/dbadmin/.bash_profile on all cluster hosts and set the environment variable; for example:
export LANG=en_US.UTF-8
The LANG setting controls the following in Vertica:
OS-level errors and warnings, for example, "file not found" during COPY operations.
The LANG setting does not control the following:
Vertica-specific error and warning messages. These are always in English at this time.
Collation of results returned by SQL issued to Vertica. This must be done using a database parameter instead. See Implement locales for international data sets section for details.
Note
If the LC_ALL environment variable is set, it supersedes the setting of LANG.
3.2.7.4 - Package dependencies
For successful Vertica installation, you must first install three packages on all nodes in your cluster before installing the database platform.
which—Required for Vertica operating system integration and for validating installations.
dialog—Required for interactivity with Administration Tools.
Installing the required packages
The procedure you follow to install the required packages depends on the operating system on which your node or cluster is running. See your operating system's documentation for detailed information on installing packages.
For CentOS/Red Hat Systems—Typically, you manage packages on Red Hat and CentOS systems using the yum utility.
Run the following yum commands to install each of the package dependencies. The yum utility guides you through the installation:
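As a sketch, assuming that the package names match the command names listed above (which and dialog), the following reports which of them are missing from the PATH and prints, without running, a yum command for each. The helper name missing_cmds is hypothetical.

```shell
# missing_cmds NAME...
# Prints each NAME that is not found on the PATH, one per line.
missing_cmds() {
    for cmd in "$@"; do
        command -v "$cmd" >/dev/null 2>&1 || echo "$cmd"
    done
}

# Print (do not run) an install command for each missing dependency.
# Assumption: the package name is the same as the command name.
for pkg in $(missing_cmds which dialog); do
    echo "yum install $pkg"
done
```

Run the printed commands as root or with sudo on every node in the cluster.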
This section describes how to install the Vertica software on a cluster of nodes. It assumes that you have already performed the tasks in Before You Install Vertica, and that you have a Vertica license key.
Be sure that you download the RPM for the correct operating system and architecture.
Vertica supports two-node clusters with zero fault tolerance (K=0 safety). This means that you can add a node to a single-node cluster, as long as the installation node (the node upon which you build) is not the loopback node (localhost/127.0.0.1).
The installer performs platform verification tests that prevent the install from continuing if the platform requirements are not met. These tests ensure that your platform meets the hardware and software requirements for Vertica. You can simply run the installer and view a list of the failures and warnings to determine which configuration changes you must make.
3.3.1 - Download and install the Vertica server package
To download and install the Vertica server package:
Click the Support tab and select Customer Downloads.
Log into the portal to download the install package. Be sure the package you download matches the operating system and the machine architecture on which you intend to install it.
If you installed a previous version of Vertica on any of the hosts in the cluster, use the Administration tools to shut down any running database.
The database must stop normally; you cannot upgrade a database that requires recovery.
If you are using sudo, skip to the next step. If you are root, log in to the Administration Host as root (or log in as another user and switch to root).
$ su - root
password: root-password
#
Caution
When installing Vertica using an existing user as the dba, you must exit all UNIX terminal sessions for that user after setup completes and log in again to ensure that group privileges are applied correctly.
Use one of the following commands to run the RPM package installer:
If you are root and installing an RPM:
# rpm -Uvh pathname
Note
When installing a Vertica RPM, you might see an unexpected warning about a SHA256 signature. This warning indicates that you need to import a GPG key, which is necessary only for versions after 10.0. You can download keys from the Security section of your chosen release on the Vertica Client Drivers page. After downloading the key, you can import it with the following command:
# rpm --import RPM-GPG-KEY-VERTICA
If you are using sudo and installing an RPM:
$ sudo rpm -Uvh pathname
If you are using Debian:
$ sudo dpkg -i pathname
where pathname is the Vertica package file you downloaded.
Note
If the package installer reports multiple dependency problems, or you receive the error "ERROR: You're attempting to install the wrong RPM for this operating system", then you are trying to install the wrong Vertica server package.
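The choice between the commands above depends only on the package type. As a small sketch, the hypothetical helper install_cmd_for prints the matching command for a package file based on its extension; prefix the printed command with sudo if you are not root. The file name in the example is made up for illustration.

```shell
# install_cmd_for PACKAGE_FILE
# Prints the package-installer command for the file, based on its extension.
install_cmd_for() {
    case "$1" in
        *.rpm) echo "rpm -Uvh $1" ;;
        *.deb) echo "dpkg -i $1" ;;
        *)     echo "unrecognized package type: $1" >&2; return 1 ;;
    esac
}

# Hypothetical file name, for illustration only.
install_cmd_for /tmp/vertica-example.rpm
```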
After you install the Vertica RPM, you can use several validation scripts to help determine whether your hosts and network can properly handle the processing and network traffic required by Vertica.
3.3.2 - Linux users created by Vertica
This topic describes the Linux accounts that the installer creates and configures so Vertica can run. When you install Vertica, the installation script optionally creates the following Linux user and group:
dbadmin—Administrative user
verticadba—Group for DBA users
dbadmin and verticadba are the default names. If you want to change what these Linux accounts are called, you can do so using the installation script. See Install Vertica with the installation script for details.
Dbadmin privileges
The Linux dbadmin user owns the database catalog and data storage on disk. When you run the install script, Vertica creates this user on each node in the database cluster. It also adds dbadmin to the Linux dbadmin and verticadba groups, and configures the account as follows:
Configures and authorizes dbadmin for passwordless SSH between all cluster nodes. SSH must be installed and configured to allow passwordless logins. See Enable secure shell (SSH) logins.
Sets the dbadmin user's default shell to /bin/bash, which is required to run scripts such as install_vertica and the Administration Tools.
Provides read-write-execute permissions on the following directories:
/opt/vertica/*
/home/dbadmin—the default directory for database data and catalog files (configurable through the install script)
Note
The Vertica installation script also creates a Vertica database superuser named dbadmin. They share the same name, but they are not the same; one is a Linux user and the other is a Vertica user. See Database administration user for information about the database superuser.
After you install Vertica
Root or sudo privileges are not required to start or run Vertica after the installation process completes.
The dbadmin user can log in and perform Vertica tasks, such as creating a database, installing/changing the license key, or installing drivers. If dbadmin wants database directories in a location that differs from the default, the root user (or a user with sudo privileges) must create the requested directories and change ownership to the dbadmin user.
Vertica prevents administration from users other than the dbadmin user (or the user name you specified during the installation process if not dbadmin). Only this user can run Administration Tools.
Vertica provides several validation utilities that can be used prior to deploying Vertica to help determine if your hosts and network can properly handle the processing and network traffic required by Vertica. These utilities can also be used if you are encountering performance issues and need to troubleshoot the issue.
After you install the Vertica RPM, you have access to the following scripts in /opt/vertica/bin:
Vcpuperf - a CPU performance test used to verify your CPU performance.
Vioperf - an Input/Output test used to verify the speed and consistency of your hard drives.
Vnetperf - a Network test used to test the latency and throughput of your network between hosts.
These utilities can be run at any time, but are well suited to use before running the install_vertica script.
3.3.3.1 - Vcpuperf
The vcpuperf utility measures your server's CPU processing speed and compares it against benchmarks for common server CPUs. The utility performs a CPU test and measures the time it takes to complete the test. The lower the number scored on the test, the better the performance of the CPU.
The vcpuperf utility also checks the high and low load times to determine if CPU throttling is enabled. If a server's low-load computation time is significantly longer than the high-load computation time, CPU throttling may be enabled. CPU throttling is a power-saving feature. However, CPU throttling can reduce the performance of your server. Vertica recommends disabling CPU throttling to enhance server performance.
Syntax
vcpuperf [-q]
Options
-q
Run in quiet mode. Quiet mode displays only the CPU Time, Real Time, and high and low load times.
Returns
CPU Time: the amount of time it took the CPU to run the test.
Real Time: the total time for the test to execute.
High load time: The amount of time to run the load test while simulating a high CPU load.
Low load time: The amount of time to run the load test while simulating a low CPU load.
Example
The following example shows a CPU that is running slightly slower than the expected time on a Xeon 5670 CPU that has CPU throttling enabled.
[root@node1 bin]# /opt/vertica/bin/vcpuperf
Compiled with: 4.1.2 20080704 (Red Hat 4.1.2-52)
Expected time on Core 2, 2.53GHz: ~9.5s
Expected time on Nehalem, 2.67GHz: ~9.0s
Expected time on Xeon 5670, 2.93GHz: ~8.0s
This machine's time:
CPU Time: 8.540000s
Real Time:8.710000s
Some machines automatically throttle the CPU to save power.
This test can be done in <100 microseconds (60-70 on Xeon 5670, 2.93GHz).
Low load times much larger than 100-200us or much larger than the corresponding high load time
indicate low-load throttling, which can adversely affect small query / concurrent performance.
This machine's high load time: 67 microseconds.
This machine's low load time: 208 microseconds.
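The comparison of high and low load times can be scripted. The following is a rough sketch, not part of vcpuperf: the helper names are hypothetical, and the thresholds (a low load time above 200 microseconds, or more than twice the high load time) are assumptions chosen to approximate the "much larger" wording in the output above.

```shell
# load_time KIND   (KIND is "high" or "low")
# Reads vcpuperf output on stdin and prints the reported time in microseconds.
load_time() {
    sed -n "s/.*machine's $1 load time: *\([0-9][0-9]*\) microseconds.*/\1/p"
}

# throttling_suspected HIGH_US LOW_US
# Succeeds when the low-load time suggests CPU throttling.
# Assumed thresholds: low > 200 us, or low > 2x the high-load time.
throttling_suspected() {
    [ "$2" -gt 200 ] || [ "$2" -gt $(( $1 * 2 )) ]
}

# Values taken from the example output above.
if throttling_suspected 67 208; then
    echo "low-load throttling suspected"
fi

# With live output, the times could be extracted like this:
#   /opt/vertica/bin/vcpuperf > vcpuperf.out
#   high=$(load_time high < vcpuperf.out)
#   low=$(load_time low < vcpuperf.out)
```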
3.3.3.2 - Vioperf
The vioperf utility quickly tests the performance of your host's input and output subsystem. The utility performs the following tests:
sequential write
sequential rewrite
sequential read
skip read (read non-contiguous data blocks)
The utility verifies that the host reads the same bytes that it wrote and prints its output to STDOUT. The utility also logs the output to a JSON formatted file.
For data in HDFS, the utility tests reads but not writes.
The minimum required I/O is 20 MB/s read/write per physical processor core on each node, in full duplex (reading and writing) simultaneously, concurrently on all nodes of the cluster.
Note
Vertica supports some AWS instance types that do not meet these minimum I/O requirements. However, all supported AWS instances types, regardless of vioperf performance, can be used as Vertica cluster hosts. See Supported AWS instance types for a list of all supported AWS instance types.
The recommended I/O is 40 MB/s per physical core on each node.
For example, a node with 2 hyper-threaded six-core CPUs (12 physical cores) requires a minimum I/O rate of 240 MB/s (12 × 20 MB/s). Vertica recommends 480 MB/s (12 × 40 MB/s).
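The same arithmetic applies to any core count. A minimal sketch, using a hypothetical helper name io_requirements and the 20 MB/s and 40 MB/s per-core figures stated above:

```shell
# io_requirements PHYSICAL_CORES
# Prints "<minimum> <recommended>" I/O rates in MB/s for one node,
# using 20 MB/s and 40 MB/s per physical core.
io_requirements() {
    echo "$(( $1 * 20 )) $(( $1 * 40 ))"
}

io_requirements 12   # prints "240 480"
```

Count physical cores only; hyper-threads do not add to the requirement.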
Disk space vioperf needs
vioperf requires about 4.5 GB to run.
Options
--help
Prints a help message and exits.
--duration
The length of time vioperf runs performance tests. The default is 5 minutes. Specify the interval in seconds, minutes, or hours with a unit suffix (for example, --duration=60s).
--log-interval
The interval at which the log file reports summary information. The default interval is 10 seconds. This option uses the same interval notation as --duration.
--log-file
The path and name where log file contents are written, in JSON. If not specified, then vioperf creates a file named resultsdate-time.JSON in the current directory.
--condense-log
Directs vioperf to write the log file contents in condensed format, one JSON entry per line, rather than as indented JSON syntax.
--thread-count=<N>
The number of execution threads to use. By default, vioperf uses all threads available on the host machine.
--max-buffer-size=<SIZE>
The maximum size of the in-memory buffer to use for reads or writes. Specify the units with any of these suffixes:
Bytes: b, byte, bytes.
Kilobytes: k, kb, kilobyte, kilobytes.
Megabytes: m, mb, megabyte, megabytes.
Gigabytes: g, gb, gigabyte, gigabytes.
--preserve-files
Directs vioperf to keep the files it writes. This parameter is ignored for HDFS tests, which are read-only. Inspecting the files can help diagnose write-related failures.
--disable-crc
Directs vioperf to ignore CRC checksums when validating writes. Verifying checksums can add overhead, particularly when running vioperf on slower processors. This parameter is ignored for HDFS tests.
--disable-direct-io
When reading from or writing to a local file system, vioperf goes directly to disk by default, bypassing the operating system's page cache. Using direct I/O allows vioperf to measure performance quickly without having to fill the cache.
Disabling this behavior can produce more realistic performance results but slows down the operation of vioperf.
--debug
Directs vioperf to report verbose error messages.
<DIR>
Zero or more directories to test. If you do not specify a directory, vioperf tests the current directory. To test the performance of each disk, specify different directories mounted on different disks.
To test reads from a directory on HDFS:
Use a URL in the hdfs scheme that points to a single directory (not a path) containing files at least 10MB in size. For best results, use 10GB files and verify that there is at least one file per vioperf thread.
If you do not specify a host and port, set the HADOOP_CONF_DIR environment variable to a path including the Hadoop configuration files. This value is the same value that you use for the HadoopConfDir configuration parameter in Vertica. For more information see Configuring HDFS access.
If the HDFS cluster uses Kerberos, set the HADOOP_USER_NAME environment variable to a Kerberos principal.
Returns
The utility returns the following information:
test
The test being run (Write, ReWrite, Read, or Skip Read)
directory
The directory in which the test is being run.
counter name
The counter type of the test being run. Can be either MB/s or Seeks per second.
counter value
The value of the counter in MB/s or Seeks per second across all threads. This measurement represents the bandwidth at the exact time of measurement. Contrast with counter value (10 sec avg).
counter value (10 sec avg)
The average amount of data in MB/s, or the average number of Seeks per second, for the test being run over the interval specified with --log-interval. The default interval is 10 seconds. The counter value (10 sec avg) is the average bandwidth since the last log message, across all threads.
counter value/core
The counter value divided by the number of cores.
counter value/core (10 sec avg)
The counter value (10 sec avg) divided by the number of cores.
thread count
The number of threads used to run the test.
%CPU
The available CPU percentage used during this test.
%IO Wait
The CPU percentage in I/O Wait state during this test. I/O wait state is the time working processes are blocked while waiting for I/O operations to complete.
elapsed time
The amount of time taken for a particular test. If you run the test multiple times, elapsed time increases the next time the test is run.
remaining time
The time remaining until the next test. Based on the --duration option, each of the tests is run at least once. If the test set is run multiple times, then remaining time is how much longer the test will run. The remaining time value is cumulative. Its total is added to elapsed time each time the same test is run again.
Example
Invoking vioperf from a terminal outputs the following message and sample results:
[dbadmin@v_vmart_node0001 ~]$ /opt/vertica/bin/vioperf --duration=60s
The minimum required I/O is 20 MB/s read and write per physical processor core on each node, in full duplex
i.e. reading and writing at this rate simultaneously, concurrently on all nodes of the cluster.
The recommended I/O is 40 MB/s per physical core on each node.
For example, the I/O rate for a server node with 2 hyper-threaded six-core CPUs is 240 MB/s required minimum, 480 MB/s recommended.
Using direct io (buffer size=1048576, alignment=512) for directory "/home/dbadmin"
test | directory | counter name | counter value | counter value (10 sec avg) | counter value/core | counter value/core (10 sec avg) | thread count | %CPU | %IO Wait | elapsed time (s)| remaining time (s)
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Write | /home/dbadmin | MB/s | 420 | 420 | 210 | 210 | 2 | 89 | 10 | 10 | 5
Write | /home/dbadmin | MB/s | 412 | 396 | 206 | 198 | 2 | 89 | 9 | 15 | 0
ReWrite | /home/dbadmin | (MB-read+MB-write)/s | 150+150 | 150+150 | 75+75 | 75+75 | 2 | 58 | 40 | 10 | 5
ReWrite | /home/dbadmin | (MB-read+MB-write)/s | 158+158 | 172+172 | 79+79 | 86+86 | 2 | 64 | 33 | 15 | 0
Read | /home/dbadmin | MB/s | 194 | 194 | 97 | 97 | 2 | 69 | 26 | 10 | 5
Read | /home/dbadmin | MB/s | 192 | 190 | 96 | 95 | 2 | 71 | 27 | 15 | 0
SkipRead | /home/dbadmin | seeks/s | 659 | 659 | 329.5 | 329.5 | 2 | 2 | 85 | 10 | 5
SkipRead | /home/dbadmin | seeks/s | 677 | 714 | 338.5 | 357 | 2 | 2 | 59 | 15 | 0
Note
When evaluating performance for minimum and recommended I/O, include the Write and Read values in your evaluation. ReWrite and SkipRead values are not relevant to determining minimum and recommended I/O.
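The evaluation described in the note can be sketched as a filter over the vioperf table. The helper name check_vioperf is hypothetical. The sketch compares the "counter value/core" column (the sixth field) of each Write and Read row with the 20 MB/s minimum and 40 MB/s recommendation. Whether the core count vioperf divides by equals the number of physical cores on your host is an assumption to verify.

```shell
# check_vioperf
# Reads vioperf table output on stdin. For each Write and Read row, prints
# the test name, the per-core rate, and a verdict against 20 and 40 MB/s.
# ReWrite and SkipRead rows are skipped, as the note above advises.
check_vioperf() {
    awk -F'|' '
        # Trim the padding around the test name in the first column.
        { gsub(/^[ \t]+|[ \t]+$/, "", $1) }
        $1 == "Write" || $1 == "Read" {
            per_core = $6 + 0
            if (per_core >= 40)      verdict = "recommended"
            else if (per_core >= 20) verdict = "minimum"
            else                     verdict = "below-minimum"
            print $1, per_core, verdict
        }
    '
}

# One row from the example output above:
printf '%s\n' 'Read | /home/dbadmin | MB/s | 194 | 194 | 97 | 97 | 2 | 69 | 26 | 10 | 5' | check_vioperf

# With live output:
#   /opt/vertica/bin/vioperf --duration=60s | check_vioperf
```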
3.3.3.3 - Vnetperf
The vnetperf utility measures network performance of database hosts, as well as network latency and throughput for TCP and UDP protocols.
Caution
This utility incurs high network load, which degrades database performance. Do not use this utility on a Vertica production database.
This utility helps identify the following issues:
Low throughput for all hosts or one
High latency for all hosts or one
Bottlenecks between one or more hosts or subnets
Too-low limit on the number of TCP connections that can be established simultaneously
High rates of network packet loss
Syntax
vnetperf [options] [tests]
Options
--condense
Condenses the log into one JSON entry per line, instead of indented JSON syntax.
--collect-logs
Collects test log files from each host.
--datarate rate
Limits throughput to this rate in MB/s. A rate of 0 loops the tests through several different rates.
Default: 0
--duration seconds
Time limit for each test to run in seconds.
Default: 1
--hosts host-name[,...]
Comma-separated list of host names or IP addresses on which to run the tests. The list must not contain embedded spaces.
--hosts file
File that specifies the hosts on which to run the tests. If you omit this option, then vnetperf tries to access admintools to identify cluster hosts.
--identity-file file
If using passwordless SSH/SCP access between hosts, then specify the key file used to gain access to the hosts.
--ignore-bad-hosts
If set, runs tests on reachable hosts even if some hosts are not reachable. If you omit this option and a host is unreachable, then no tests are run on any hosts.
--log-dir directory
If --collect-logs is set, specifies the directory in which to place the collected logs.
Comma-delimited list of port numbers to use. If only one port number is specified, then the next two numbers in sequence are also used.
Default: 14159,14160,14161
--scp-options 'scp-args'
Specifies one or more standard SCP command line arguments. SCP is used to copy test binaries over to the target hosts.
--ssh-options 'ssh-args'
Specifies one or more standard SSH command line arguments. SSH is used to issue test commands on the target hosts.
--tmp-dir directory
Specifies the temporary directory for vnetperf, where directory must have execute permission on all hosts and must not include the unsupported characters ", `, or '.
Default: /tmp (execute permission required)
--vertica-install directory
Indicates that Vertica is installed on each of the hosts, so vnetperf uses test binaries on the target system rather than copying them over with SCP.
Tests
vnetperf can specify one or more of the following tests. If no test is specified, vnetperf runs all tests. Test results are printed for each host.
Test: latency
Description: Measures latency from the host that is running the script to other hosts. Hosts with unusually high latency should be investigated further.
Results:
Round trip time latency for each host in milliseconds.
Clock skew: the difference in time shown by the clock on the target host relative to the host running the utility.
Test: tcp-throughput
Description: Tests TCP throughput among hosts.
Results:
Date/time and test name
Rate limit in MB/s
Tested node
Sent and received data in MB/s and bytes
Duration of the test in seconds
Test: udp-throughput
Description: Tests UDP throughput among hosts.
Recommended network performance
Maximum recommended RTT (round-trip time) latency is 1000 microseconds. Ideal RTT latency is 200 microseconds or less. Vertica recommends that clock skew be less than 1 second.
Minimum recommended throughput is 100 MB/s. Ideal throughput is 800 MB/s or more.
Note
UDP throughput can be lower; multiple network switches can adversely affect performance.
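The recommended thresholds above can be applied to measured values with a small sketch such as the following. The helper names and the verdict labels are hypothetical. Both helpers take whole numbers, and the latency helper expects microseconds, so convert vnetperf's millisecond figures first.

```shell
# rate_latency RTT_MICROSECONDS
# ideal: 200 us or less; acceptable: up to 1000 us; otherwise too-high.
rate_latency() {
    if   [ "$1" -le 200 ];  then echo "ideal"
    elif [ "$1" -le 1000 ]; then echo "acceptable"
    else                         echo "too-high"
    fi
}

# rate_throughput MB_PER_SECOND
# ideal: 800 MB/s or more; acceptable: at least 100 MB/s; otherwise too-low.
rate_throughput() {
    if   [ "$1" -ge 800 ]; then echo "ideal"
    elif [ "$1" -ge 100 ]; then echo "acceptable"
    else                        echo "too-low"
    fi
}

rate_latency 150      # prints "ideal"
rate_throughput 450   # prints "acceptable"
```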
3.3.4 - Install Vertica with the installation script
You can run the installation script after you install the Vertica package. The installation script runs on a single node, using a Bash shell. The script copies the Vertica package to all other hosts (identified by the --hosts argument) in your planned cluster.
Tip
To speed up the installation, you can provide a local copy of the RPM to each node in the cluster before running the install script. This allows the installer to bypass the time-consuming process of copying the RPM to the nodes. For details, see --no-rpm-copy.
The installation script runs several tests on each of the target hosts to verify that the hosts meet system and performance requirements for a Vertica node. The installation script modifies some operating system configuration settings to meet these requirements. Other settings cannot be modified by the installation script and must be manually reconfigured. For details on operating system configuration settings, see Manually configured operating system settings and Automatically configured operating system settings.
Note
The installation script sets up passwordless ssh for the admin user across all hosts. If passwordless ssh is already set up, the installation script verifies that it functions correctly.
3.3.4.1 - Install on a FIPS 140-2 enabled machine
Vertica supports the implementation of the Federal Information Processing Standard 140-2 (FIPS). You enable FIPS mode in the operating system.
Note
Enabling FIPS on the operating system occurs outside of Vertica.
During installation, the install_vertica script detects whether the host is operating in FIPS mode. The installer searches for the file /proc/sys/crypto/fips_enabled and examines its content. If the file exists and contains '1', the host is operating in FIPS mode and the following message appears:
/proc/sys/crypto/fips_enabled exists and contains '1', this is a FIPS system
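You can run the same kind of check by hand before installing. The following is a minimal sketch, not the installer's actual code; the helper name fips_enabled is hypothetical, and the optional file argument exists only to make the function easy to try against a test file.

```shell
# fips_enabled [FILE]
# Succeeds when FILE (default: /proc/sys/crypto/fips_enabled) is readable
# and contains 1.
fips_enabled() {
    f=${1:-/proc/sys/crypto/fips_enabled}
    [ -r "$f" ] && [ "$(cat "$f")" = "1" ]
}

if fips_enabled; then
    echo "this is a FIPS system"
else
    echo "FIPS mode is not enabled"
fi
```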
Important
On certain systems where the libssl and libcrypto libraries do not have versioning information, when starting Vertica, you may see the message
No version information available
This message is benign and you can ignore it.
To implement FIPS 140-2 on your Vertica Analytic Database, you need to configure both the server and the client you are using. To see the detailed configuration steps, go to Implementing FIPS 140-2.
Symbolic links for OpenSSL
On some non-FIPS systems, versioning anomalies can occur when you install a new version of OpenSSL. Sometimes, the default OpenSSL build procedure produces libraries with versions named 1.0.0. For Vertica to recognize that a library has a higher version number, the library name with a higher version number must be provided. As part of the Vertica installation, symbolic links are created to the appropriate OpenSSL files. The steps are as follows:
The RPM installer places two OpenSSL library files in /opt/vertica/lib:
libssl.so.1.1
libcrypto.so.1.1
The install_vertica script creates two symbolic links in /opt/vertica/lib:
libssl.so
libcrypto.so
The symbolic links point to libssl.so.1.1 and libcrypto.so.1.1, which the RPM installer placed in /opt/vertica/lib.
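After installation, you can confirm where the two links point. The following is a small sketch under the assumption that the links are ordinary symbolic links in the directory named above; the function name and the directory argument are hypothetical.

```shell
# Sketch: list where the libssl.so and libcrypto.so links point.
# The directory argument is an assumption for testing; the default is the
# location named in the text.
show_openssl_links() {
    lib_dir="${1:-/opt/vertica/lib}"
    for name in libssl.so libcrypto.so; do
        if [ -L "$lib_dir/$name" ]; then
            echo "$name -> $(readlink "$lib_dir/$name")"
        else
            echo "$name is not a symbolic link in $lib_dir" >&2
            return 1
        fi
    done
}

show_openssl_links || echo "OpenSSL links not found (has install_vertica run?)"
```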
3.3.4.2 - Specifying disk storage location during installation
You can specify the disk storage location when you:
When you install Vertica, the --data-dir parameter in the install_vertica script lets you specify a directory to contain database data and catalog files. The script defaults to the database administrator's default home directory, /home/dbadmin.
Important
Replace this default with a directory that has adequate space to hold your data and catalog files.
Requirements
The data and catalog directory must exist on each node in the cluster.
The directory on each node must be owned by the database administrator.
Catalog and data path names must contain only alphanumeric characters and cannot have leading space characters. Failure to comply with these restrictions will result in database creation failure.
Vertica refuses to overwrite a directory if it appears to be in use by another database. Therefore, if you created a database for evaluation purposes, dropped the database, and want to reuse the database name, make sure that the disk storage location previously used has been completely cleaned up. See Managing storage locations for details.
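The first two requirements above can be checked on each node before you run the installer. This is a rough sketch that assumes GNU stat (as on typical Linux hosts); the function name and the example path are hypothetical.

```shell
# Sketch: confirm a candidate data directory exists and is owned by the
# database administrator. Uses GNU stat (-c), as found on Linux hosts.
check_data_dir() {
    dir="$1"
    owner="${2:-dbadmin}"
    if [ ! -d "$dir" ]; then
        echo "$dir does not exist" >&2
        return 1
    fi
    actual=$(stat -c '%U' "$dir")
    if [ "$actual" != "$owner" ]; then
        echo "$dir is owned by $actual, expected $owner" >&2
        return 1
    fi
    echo "$dir ok"
}

# /vertica/data is a hypothetical path; substitute your own.
check_data_dir /vertica/data dbadmin || true
```

Run the check on every node, because the directory must exist on each one.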
3.3.4.3 - Perform a basic install
For all installation options, see install_vertica options.
As root (or sudo), run the install script. The script must be run in a Bash shell as root or as a user with sudo privileges. You can configure many options when running the install script. See Basic Installation Parameters for the required options.
If the installer fails due to any requirements not being met, you can correct the issue and then rerun the installer with the same command line options.
If you place install_vertica in a location other than /opt/vertica, create a symlink from that location to /opt/vertica. Create this symlink on all cluster nodes, otherwise the database will not start.
When prompted for a password to log into the other nodes, provide the requested password. Doing so allows the installation of the package and system configuration on the other cluster nodes.
If you are root, this is the root password.
If you are using sudo, this is the sudo user password.
The password does not echo on the command line. For example:
Vertica Database 24.3.x Installation Tool
Please enter password for root@host01:password
If the dbadmin user, or the user specified in the argument --dba-user, does not exist, then the install script prompts for the password for the user. Provide the password. For example:
Enter password for new UNIX user dbadmin:password
Retype new UNIX password for user dbadmin:password
Carefully examine any warnings or failures returned by install_vertica and correct the problems.
For example, insufficient RAM, insufficient network throughput, and too high readahead settings on the file system could cause performance problems later on. Additionally, LANG warnings, if not resolved, can cause database startup to fail and issues with VSQL. The system LANG attributes must be UTF-8 compatible. After you fix the problems, rerun the install script.
When installation is successful, disconnect from the Administration host, as instructed by the script. Then, complete the required post-installation steps.
At this point, root privileges are no longer needed and the database administrator can perform any remaining steps.
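As an illustration of the first step, the sketch below assembles a basic install command from the three required options so that it can be reviewed before it is run as root. The script location /opt/vertica/sbin/install_vertica, the host names, and the package path are assumptions; substitute your own values.

```shell
# Sketch: assemble the install command so it can be reviewed before it is
# run as root. The script path, host names, and package path are placeholders.
INSTALLER="/opt/vertica/sbin/install_vertica"   # assumed script location
HOSTS="host01,host02,host03"
PACKAGE="/tmp/vertica-version.RHEL8.x86_64.rpm"
DBA_USER="dbadmin"

install_command() {
    echo "$INSTALLER --hosts $1 --rpm $2 --dba-user $3"
}

install_command "$HOSTS" "$PACKAGE" "$DBA_USER"
```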
3.3.4.4 - install_vertica options
The following tables describe install_vertica script options. Most options have long and short forms—for example, --hosts and -s.
Required
install_vertica requires the following options:
--hosts / -s
--rpm / -r | --deb | --no-rpm-copy
--dba-user username | -u username
Required only if installing using sudo or upgrading versions.
If upgrading an existing installation of Vertica, use the same host names used previously.
IP addresses or hostnames must be for unique hosts. Do not list the same host using multiple IP addresses/hostnames.
Note
Vertica stores only IP addresses in its configuration files. If you provide host names, they are converted to IP addresses when the script runs.
--rpm package-name | -r package-name | --deb package-name
Path and name of the Vertica RPM or Debian package. For example:
--rpm /tmp/vertica-version.RHEL8.x86_64.rpm
For Debian and Ubuntu installs, provide the name of the Debian package:
--deb /tmp/vertica_10.1_amd64.deb
The install package must be provided if you install or upgrade the Vertica server package on multiple nodes where the nodes do not have the latest server package installed, or if you are adding a new node. You do not need to provide the server package if you have a local copy of the RPM on each node and call the install script with the --no-rpm-copy option. Unless you provide the --no-rpm-copy option, the install_vertica and update_vertica scripts serially copy the server package to the other nodes and install the package.
Tip
If installing or upgrading a large number of nodes, consider manually installing the package on all nodes before running the install/upgrade script. The script runs faster if it does not need to serially upload and install the package on each node.
--no-rpm-copy
Installer does not copy the RPM to the nodes in the cluster. The RPM must be present on each node specified by --hosts, and you must provide the path to the local RPM files with the --rpm-path option (defaults to /tmp/dbRPM.rpm). If you specify this option, you do not need to provide the --rpm option.
--dba-user username | -u username
Name of the database superuser account to create. Only this account can run the Administration Tools. If you omit this parameter, then the default administrator account name is dbadmin.
This parameter is optional for new installations done as root; it must be specified when upgrading or when installing using sudo. If upgrading, use this parameter to specify the same account name that you used previously. If installing using sudo, username must already exist.
If you manually create the user, modify the user's .bashrc file to include the line: PATH=/opt/vertica/bin:$PATH so Vertica tools such as vsql and admintools can be easily started by this user.
The following install_vertica options are not required. Many of them enable greater control over the installation process.
--help
Display help for this script.
--accept-eula | -Y
Silently accepts the EULA agreement. On multi-node installations, this option is propagated across the cluster at the end of the installation, at the same time as the Administration Tools metadata.
Combine this option with --license (-L) to activate your license.
--add-hosts hostlist | -A hostlist
Comma-separated list of hosts to add to an existing Vertica cluster.
--add-hosts modifies an existing installation of Vertica by adding a host to the database cluster and then reconfiguring spread. This is useful for improving system performance, or making the database K-safe.
If spread is configured in your installation to use point-to-point communication within the existing cluster, you must also specify --point-to-point (-T) when you add a new host; otherwise, the new host automatically uses UDP broadcast traffic, resulting in cluster communication problems that prevent Vertica from running properly. For example:
--add-hosts host01
--add-hosts 192.168.233.101
You can also use this option with the update_vertica script. For details, see Adding nodes.
--broadcast | -U
Configures spread to use UDP broadcast traffic between nodes on the subnet. This is the default setting. Up to 80 spread daemons are supported by broadcast traffic. You can exceed the 80-node limit by using large cluster mode, which does not install a spread daemon on each node.
When changing the configuration from --point-to-point to --broadcast, you must also specify --control-network.
--clean
Forcibly cleans previously stored configuration files. Use this option if you need to change the hosts that are included in your cluster. Only use this option when no database is defined.
This option is not supported by the update_vertica script.
--config-file file | -z file
Use the properties file created earlier with --record-config. This properties file contains key/value settings that map to install_vertica options.
--control-network { IPaddress | default }
IPaddress: A broadcast network IP address that enables configuration of spread communications on a subnet different from other Vertica data communications.
default
Important
IPaddress must match the subnet for at least some database nodes. If the address does not match the subnet of any database node, then the installer displays an error and stops. If the provided address matches some, but not all of the node's subnets, the installer displays a warning, but installation continues.
Optimally, the value for --control-network matches all node subnets.
You can also use this option to force a cluster-wide spread reconfiguration when changing spread-related options.
--data-dir directory
Do not use a shared directory over more than one host for this setting. Data and catalog directories must be distinct for each node. Multiple nodes must not be allowed to write to the same data or catalog directory.
Default: /home/dbadmin
--dba-group group | -g group
UNIX group for DBA users.
Default: verticadba
--dba-user-home directory | -l directory
Home directory for the database administrator.
Default: /home/dbadmin
--dba-user-password password | -p password
Password for the database administrator account. If omitted, the script prompts for a password and does not echo the input.
--dba-user-password-disabled
Disables the password for --dba-user. This argument stops the installer from prompting for a password for --dba-user. You can assign a password later using standard user management tools such as passwd.
--failure-threshold [threshold-arg]
Stops the installation when the specified failure threshold is encountered, where threshold-arg is one of the following:
HINT: Stop the install if a HINT or greater issue is encountered during the installation tests. HINT configurations are settings you should make, but the database runs with no significant negative consequences if you omit the setting.
WARN: Stop the installation if a WARN or greater issue is encountered. WARN issues might affect database performance. However, for environments where high-level performance is not a priority—for example, testing—WARN issues can be ignored.
FAIL: Stop the installation if a FAIL or greater issue is encountered. FAIL issues can have severely negative performance consequences and possible later processing issues if not addressed. However, Vertica can start even if FAIL issues are ignored.
HALT: Stop the installation if a HALT or greater issue is encountered. The database might be unable to start if you choose this option. This option is not supported in production environments.
NONE: Do not stop the installation. The database might be unable to start if you choose this option. This option is not supported in production environments.
Default: WARN
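The "or greater" wording above implies an ordering of HINT, WARN, FAIL, HALT from least to most severe. The following sketch models that comparison to show which issues stop an install at a given threshold. It is an illustration of the rule as described, not the installer's code, and the function names are hypothetical.

```shell
# Sketch: map each issue level to a rank (ordering inferred from the text).
severity_rank() {
    case "$1" in
        HINT) echo 1 ;;
        WARN) echo 2 ;;
        FAIL) echo 3 ;;
        HALT) echo 4 ;;
        *) return 1 ;;
    esac
}

# should_stop THRESHOLD ISSUE: succeeds when an issue of level ISSUE
# would stop an install run with --failure-threshold THRESHOLD.
should_stop() {
    if [ "$1" = "NONE" ]; then
        return 1
    fi
    [ "$(severity_rank "$2")" -ge "$(severity_rank "$1")" ]
}

if should_stop WARN FAIL; then echo "WARN threshold: FAIL issue stops the install"; fi
if ! should_stop FAIL WARN; then echo "FAIL threshold: WARN issue is ignored"; fi
```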
--ipv4
Hosts in the cluster are identified by IPv4 network addresses. This is the default behavior.
--ipv6
Hosts in the cluster are identified by IPv6 network addresses, required if the --hosts list specifies IPv6 addresses. This option automatically enables the --point-to-point option.
--large-cluster [ num-control-nodes | default ]
Enables the large cluster feature, where a subset of nodes called control nodes connect to spread to send and receive broadcast messages. Consider using this option for a cluster with more than 50 nodes in Enterprise Mode. Vertica automatically enables this feature if you install onto 120 or more nodes in Enterprise Mode, or 16 or more nodes in Eon Mode.
Supply this option with one of the following arguments:
num-control-nodes: Sets the number of control nodes in the new database to the smaller of this value or the number of hosts listed in --hosts. This value is applied differently in Enterprise Mode and Eon Mode:
Enterprise Mode: Sets the number of control nodes in the entire cluster.
Eon Mode: Sets the number of control nodes in the initial default subcluster. This value must be between 1 and 120, inclusive.
default: Vertica sets the number of control nodes to the square root of the total number of cluster nodes listed in --hosts (-s).
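To get a feel for the default argument, the sketch below computes the square root of a node count. The text does not say how a non-integer result is rounded, so the value is shown unrounded; the function name is hypothetical.

```shell
# Sketch: square root of the node count, shown unrounded because the
# rounding rule is not stated in the text.
default_control_nodes() {
    LC_ALL=C awk -v n="$1" 'BEGIN { printf "%.2f\n", sqrt(n) }'
}

default_control_nodes 100   # prints 10.00
```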
--license { license-file | CE } | -L { license-file | CE }
Silently and automatically deploys the license key to /opt/vertica/config/share. On multi-node installations, the --license option also applies the license to all nodes declared by --hosts. To activate your license, combine this option with the --accept-eula option. If you do not use the --accept-eula option, you are asked to accept the EULA when you connect to your database. After you accept the EULA, your license is activated.
If specified with CE, this option automatically deploys the Community Edition license key, which is included in your download.
--no-system-configuration
Installer makes no changes to system properties. By default, the installer makes system configuration changes that meet server requirements.
If you use this option, the installer posts warnings or failures for configuration settings that do not meet requirements that it otherwise configures automatically.
This option has no effect on creating or updating user accounts.
--parallel-no-prompts
Installs the server binary package (.rpm or .deb) on the hosts in parallel without prompting for confirmation. This option reduces the installation time, especially on large clusters. If omitted, the install script installs the package on one host at a time.
This option requires that the installer use passwordless ssh to connect to the hosts. It has no effect if the installer is not using passwordless ssh.
--point-to-point | -T
Configures spread to use direct point-to-point communication between all Vertica nodes. Use this option if nodes are not located on the same subnet. Also use this option for all virtual environment installations, whether or not virtual servers are on the same subnet.
Up to 80 spread daemons are supported by point-to-point communication. You can exceed the 80-node limit by using large cluster mode, which does not install a spread daemon on each node.
This option is automatically enabled by the --ipv6 option.
Important
When changing the configuration from --broadcast to --point-to-point, you must also specify --control-network.
--record-config filename | -B filename
File name used with command line options to create a properties file that can be used with --config-file. This option creates the properties file and exits; it does not affect installation.
--remove-hosts hostlist | -R hostlist
Comma-separated list of hosts to remove from an existing Vertica cluster. After removing the specified hosts, spread is reconfigured on the cluster.
This option is useful for removing an obsolete or over-provisioned system.
If you use --point-to-point (-T) to configure spread to use direct point-to-point communication within the existing cluster, you must also use it when you remove a host; otherwise, the hosts automatically use UDP broadcast traffic, resulting in cluster communication problems that prevent Vertica from running properly.
The update_vertica script (see Removing hosts from a cluster) calls install_vertica to update the installation. You can use either script with this option.
--rpm-path rpm-filepath
Only used in conjunction with --no-rpm-copy, identifies the path to the local copy of the RPM on all nodes specified by --hosts.
Default: /tmp/dbRPM.rpm
--spread-logging | -w
Configures spread to output logging to /opt/vertica/log/spread_hostname.log. This option does not apply to upgrades.
Note
Enable spread logging only if requested by Vertica technical support.
--ssh-identity file | -i file
The root private-key file to use if passwordless ssh was already configured between the hosts. Before using this option, verify that normal SSH works without a password. The file can be a private key file (for example, id_rsa) or a PEM file. Do not use with the --ssh-password (-P) option.
You can supply the key in either of the following ways:
Provide an SSH private key that is not password protected. You cannot run the install_vertica script with the sudo command when using this method.
Provide a password-protected private key and use an SSH-Agent. Note that sudo typically resets environment variables when it is invoked. Specifically, the SSH_AUTH_SOCK variable required by the SSH-Agent may be reset. Therefore, configure your system to maintain SSH_AUTH_SOCK or invoke install_vertica using a method similar to the following:
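One way to keep the agent socket variable (spelled SSH_AUTH_SOCK by OpenSSH) across an environment reset is to pass it explicitly on the command line. The sketch below uses env -i as a stand-in for sudo's reset so that it can run without privileges. The socket path is a placeholder, the installer path in the final comment is an assumption, and whether sudo accepts a variable passed this way depends on the local sudoers policy.

```shell
# Sketch: env -i stands in for sudo's environment reset.
# Placeholder socket path so the example runs without a live agent.
SSH_AUTH_SOCK="${SSH_AUTH_SOCK:-/tmp/example-agent.sock}"
export SSH_AUTH_SOCK

# The variable is lost when the environment is reset:
env -i /bin/sh -c 'echo "reset: ${SSH_AUTH_SOCK:-unset}"'

# The variable survives when it is passed explicitly:
env -i SSH_AUTH_SOCK="$SSH_AUTH_SOCK" /bin/sh -c 'echo "passed: $SSH_AUTH_SOCK"'

# The same pattern with sudo (installer options elided):
#   sudo SSH_AUTH_SOCK="$SSH_AUTH_SOCK" /opt/vertica/sbin/install_vertica ...
```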
--ssh-password password | -P password
The password to use by default for each cluster host. If you omit this option and also omit --ssh-identity (-i), then the script prompts for the password as necessary and does not echo input.
Do not use this option together with --ssh-identity (-i).
Important
If you run the install_vertica script as root, specify the root password.
If you run the install_vertica script with the sudo command, specify the password of the user who runs install_vertica, not the root password. For example, if the dbadmin user runs install_vertica with sudo and has the password dbapasswd, then specify the password as dbapasswd.
Temporary directory used for administrative purposes. If it is a directory within /opt/vertica, then it is created by the installer. Otherwise, the directory should already exist on all nodes in the cluster. The location should allow dbadmin write privileges.
Note
This is not a temporary data location for the database.
Default: /tmp
3.3.5 - Install Vertica silently
This section describes how to create a properties file that lets you install and deploy Vertica-based applications quickly and without much manual intervention.
--record-config file_name
[Required] Accepts a file name, which when used in conjunction with command line options, creates a properties file that can be used with the --config-file option during setup. This flag creates the properties file and exits; it has no impact on installation.
--license { license_file | CE }
Silently and automatically deploys the license key to /opt/vertica/config/share. On multi-node installations, the --license option also applies the license to all nodes declared in the --hosts host_list.
If specified with CE, automatically deploys the Community Edition license key, which is included in your download. You do not need to specify a license file.
--accept-eula
Silently accepts the EULA agreement during setup.
--dba-user-password password
The password for the Database Superuser account; if not supplied, the script prompts for the password and does not echo the input.
--ssh-password password
The root password to use by default for each cluster host; if not supplied, the script prompts for the password if and when necessary and does not echo the input.
--hosts host_list
A comma-separated list of hostnames or IP addresses to include in the cluster; do not include space characters in the list.
--config-file file_name accepts an existing properties file created by --record-config file_name. This properties file contains key/value parameters that map to values in the install_vertica script, many with boolean arguments that default to false.
The command for a single-node install might look like this:
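The two-step flow (record a properties file, then install from it) can be sketched as follows. The commands are only printed so that they can be reviewed first. The script location, the properties file name, the host name, and the package path are all assumptions for illustration.

```shell
# Sketch: print the record and replay commands for a silent install.
INSTALLER="/opt/vertica/sbin/install_vertica"   # assumed script location
CONFIG="/tmp/vertica-install.properties"        # hypothetical file name

# Step 1: write the properties file (this step does not install anything).
record_command() {
    echo "$INSTALLER --record-config $CONFIG --accept-eula --license CE --hosts $1 --rpm $2"
}

# Step 2: run the installation from the recorded properties file.
replay_command() {
    echo "$INSTALLER --config-file $CONFIG"
}

record_command "host01" "/tmp/vertica-version.RHEL8.x86_64.rpm"
replay_command
```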
If you did not supply a --ssh-password password parameter to the properties file, you are prompted to provide the requested password to allow installation of the RPM/DEB and system configuration of the other cluster nodes. If you are root, this is the root password. If you are using sudo, this is the sudo user password. The password does not echo on the command line.
Note
If you are root on a single-node installation, you are not prompted for a password.
If you did not supply a --dba-user-password password parameter to the properties file, you are prompted to provide the database administrator account password.
The installation script creates a new Linux user account (dbadmin by default) with the password that you provide.
Carefully examine any warnings produced by install_vertica and correct the problems if possible. For example, insufficient RAM, insufficient network throughput, and too high readahead settings on the file system could cause performance problems later on.
Note
You can redirect any warning outputs to a separate file, instead of having them display on the system. Use your platform's standard redirection mechanisms. For example: install_vertica [options] > /tmp/file 2>&1.
Disconnect from the Administration Host as instructed by the script. This is required to:
Set certain system parameters correctly.
Function as the Vertica database administrator.
At this point, Linux root privileges are no longer needed. The database administrator can perform the remaining steps.
Note
When creating a new database, the database administrator might want to use different data or catalog locations than those created by the installation script. In that case, a Linux administrator might need to create those directories and change their ownership to the database administrator.
The administrative account must be able to use Secure Shell (SSH) to log in (ssh) to all hosts without specifying a password. The shell script install_vertica does this automatically. This section describes how to do it manually if necessary.
If you do not already have SSH installed on all hosts, log in as root on each host and install it now. You can download a free version of the SSH connectivity tools from OpenSSH.
Log in to the Vertica administrator account (dbadmin in this example).
Make your home directory (~) writable only by yourself. Choose one of:
$ chmod 700 ~
or
$ chmod 755 ~
where:
700 includes:
400 read by owner
200 write by owner
100 execute by owner
755 includes:
400 read by owner
200 write by owner
100 execute by owner
040 read by group
010 execute by group
004 read by anybody (other)
001 execute by anybody
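The mode values are octal sums of the individual permission bits listed above. A short sketch (the helper name is hypothetical) adds the bits and prints the resulting mode:

```shell
# Sketch: add octal permission bits and print the resulting mode.
mode_sum() {
    total=0
    for bit in "$@"; do
        total=$(( total + 0$bit ))   # the leading 0 marks the value as octal
    done
    printf '%o\n' "$total"
}

mode_sum 400 200 100                   # prints 700
mode_sum 400 200 100 040 010 004 001   # prints 755
```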
Change to your home directory:
$ cd ~
Generate a private key/public key pair:
$ ssh-keygen -t rsa
Generating public/private rsa key pair.
Enter file in which to save the key (/home/dbadmin/.ssh/id_rsa):
Created directory '/home/dbadmin/.ssh'.
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /home/dbadmin/.ssh/id_rsa.
Your public key has been saved in /home/dbadmin/.ssh/id_rsa.pub.
Make your .ssh directory readable and writable only by yourself:
$ chmod 700 ~/.ssh
Change to the .ssh directory:
$ cd ~/.ssh
Copy the file id_rsa.pub onto the file authorized_keys2.
$ cp id_rsa.pub authorized_keys2
Make the files in your .ssh directory readable and writable only by yourself:
$ chmod 600 ~/.ssh/*
For each cluster host:
$ scp -r ~/.ssh <host>:.
Connect to each cluster host. The first time you ssh to a new remote machine, you could get a message similar to the following:
$ ssh dev0
Warning: Permanently added 'dev0,192.168.1.92' (RSA) to the list of known hosts.
This message appears only the first time you ssh to a particular remote host.
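Once the keys are distributed, you can confirm that each host accepts a login without a password. The sketch below relies on the OpenSSH BatchMode option, which makes ssh fail instead of prompting. The function name, the SSH_CMD override (added only so the loop can be exercised without a network), and the host names are assumptions.

```shell
# Sketch: verify passwordless login to each host.
# SSH_CMD is a hook added for testing; it defaults to ssh.
check_passwordless_ssh() {
    for host in "$@"; do
        if ${SSH_CMD:-ssh} -o BatchMode=yes -o ConnectTimeout=5 "$host" true; then
            echo "$host: ok"
        else
            echo "$host: passwordless login failed" >&2
            return 1
        fi
    done
}

# Example (host names are placeholders):
#   check_passwordless_ssh host01 host02 host03
```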
The tasks described in this section are optional and are provided for your convenience. When you have completed this section, proceed to one of the following:
After you install Vertica, install drivers on the client systems from which you plan to access your databases. Vertica supplies drivers for ADO.NET, JDBC, ODBC, OLE DB, Perl, and Python. For instructions on installing these drivers, see Client drivers.
Install the license key
If you did not supply the -L parameter during setup, or if you did not bypass the -L parameter for a silent install, the first time you log in as the Database Superuser and run the Vertica Administration Tools or Management Console, Vertica requires you to install a license key.
Follow the instructions in Managing licenses in the Administrator's Guide.
Create a database
To get started using Vertica immediately after installation, create a database. You can use either the Administration Tools or the Management Console. To create a database using MC, refer to Creating a database using MC. For instructions on creating a database with admintools, see Creating a database.
Install vsql client application on non-cluster hosts
You can use the Vertica vsql executable image on a non-cluster Linux host to connect to a Vertica database.
On Red Hat, CentOS, and SUSE systems, you can install the client driver RPM, which includes the vsql executable. See Installing the vsql client for details.
If the non-cluster host is running the same version of Linux as the cluster, copy the image file to the remote system. For example:
$ scp host01:/opt/vertica/bin/vsql .
$ ./vsql
If the non-cluster host is running a different distribution or version of Linux than your cluster hosts, you must install the Vertica server RPM in order to get vsql:
Download the appropriate RPM package by browsing to the Vertica website. On the Support tab, select Customer Downloads.
If the system you used to download the RPM is not the non-cluster host, transfer the file to the non-cluster host.
Log into the non-cluster host as root and install the RPM package using the command:
# rpm -Uvh filename
Where filename is the package you downloaded. Note that you do not have to run the install_vertica script on the non-cluster host to use vsql.
You can upgrade your database from its current Vertica version to any higher version. Before upgrading, make sure that you have performed a full database backup and have tested the new version in an environment that closely resembles your production database.
Tip
If you want to test out a new version of Vertica for an Eon Mode database without spinning up a new cluster, you can sandbox a secondary subcluster and then upgrade the subcluster within the sandbox. By sandboxing the subcluster, you can try out the new version of Vertica stress-free and continue to use your main cluster as usual. After confirming that the upgrade works and performs as expected, you can downgrade the sandboxed subcluster, remove the sandbox, and proceed to upgrade the main cluster.
Be sure to read the Release Notes and New Features for the Vertica version to which you intend to upgrade. Documentation and release notes for the current Vertica version are available in the RPM and at https://docs.vertica.com/latest, which also provides access to documentation for earlier versions.
Before you upgrade the Vertica database, perform the following steps:
Verify that you have enough RAM available to run the upgrade. The upgrade requires approximately three times the amount of memory your database catalog uses.
You can calculate catalog memory usage on all nodes by querying system table RESOURCE_POOL_STATUS:
=> SELECT node_name, pool_name, memory_size_kb FROM resource_pool_status WHERE pool_name = 'metadata';
Perform a full database backup. This precautionary measure allows you to restore the current version if the upgrade is unsuccessful.
Determine whether you are using any third-party user-defined extension libraries (UDxs). UDx libraries that are compiled (such as those developed using C++ or Java) may need to be recompiled with a new version of the Vertica SDK libraries to be compatible with the new version of Vertica. See UDx library compatibility with new server versions.
Note that any user or role with the same name as a predefined role is renamed to OLD_n_name, where n is an integer that increments from zero until the resulting name is unique and name is the previous name of the user or role.
If you're upgrading from Vertica 9.2.x and have set the PasswordMinCharChange or PasswordMinLifeTime system-level security parameters, take note of their current values. You will have to set these parameters again, this time at the PROFILE-level, to reproduce your configuration. To view the current values for these parameters, run the following query:
=> SELECT parameter_name, current_value FROM CONFIGURATION_PARAMETERS
WHERE parameter_name IN ('PasswordMinCharChange', 'PasswordMinLifeTime');
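For the RAM check in the first step above, the "approximately three times" rule can be turned into a quick comparison on each node. This sketch assumes a Linux host that exposes MemAvailable in /proc/meminfo; the function name and the sample catalog size are hypothetical, and the multiplier is the approximate figure given above rather than an exact requirement.

```shell
# Sketch: compare available memory with roughly three times the catalog size.
# upgrade_ram_ok CATALOG_KB AVAILABLE_KB
upgrade_ram_ok() {
    required_kb=$(( $1 * 3 ))
    [ "$2" -ge "$required_kb" ]
}

catalog_kb=4194304   # example memory_size_kb value, not real query output
available_kb=$(awk '/^MemAvailable:/ { print $2 }' /proc/meminfo 2>/dev/null || true)

if upgrade_ram_ok "$catalog_kb" "${available_kb:-0}"; then
    echo "enough memory for the upgrade estimate"
else
    echo "less than 3x catalog memory available"
fi
```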
The Vertica installer checks the target platform as it runs, and stops whenever it determines the platform fails to meet an installation requirement. Before you update the server package on your systems, manually verify that your platform meets all hardware and software requirements (see Platform and hardware requirements and recommendations).
By default, the installer stops on all warnings. You can configure the level where the installer stops installation, through the installation parameter --failure-threshold. If you set the failure threshold to FAIL, the installer ignores warnings and stops only on failures.
Caution
Changing the failure threshold lets you immediately upgrade and bring up the Vertica database. However, Vertica cannot fully optimize performance until you correct all warnings.
4.1.2 - Checking catalog storage space
Use the commands documented here to determine how much catalog space is available before upgrading. This helps you determine how much space the updated catalog may take up.
Compare how much space the catalog currently uses against space that is available in the same directory:
Use the du command to determine how much space the catalog directory currently uses:
$ du -s -BG v_vmart_node0001_catalog
2G v_vmart_node0001_catalog
Determine how much space is available in the same directory:
$ df -BG v_vmart_node0001_catalog
Filesystem 1G-blocks Used Available Use% Mounted on
/dev/sda2 48G 19G 26G 43% /
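The two measurements above can be gathered side by side with a short helper. This is a minimal sketch using du -sk and df -Pk, which report kilobytes rather than the gigabyte units shown above; the function names are hypothetical.

```shell
# Sketch: report catalog usage next to the free space on the same filesystem.
catalog_used_kb()  { du -sk "$1" | awk '{ print $1 }'; }
catalog_avail_kb() { df -Pk "$1" | awk 'NR == 2 { print $4 }'; }

report_catalog_space() {
    used=$(catalog_used_kb "$1")
    avail=$(catalog_avail_kb "$1")
    echo "used=${used}KB available=${avail}KB"
}

# Point this at your catalog directory, for example v_vmart_node0001_catalog.
report_catalog_space .
```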
4.1.3 - Verify license compliance for ORC and Parquet data
If you are upgrading from a version before 9.1.0 and:
Your database has external tables based on ORC or Parquet files (whether stored locally on the Vertica cluster or on a Hadoop cluster)
Your Vertica license has a raw data allowance
follow the steps in this topic before upgrading.
Background
Vertica licenses can include a raw data allowance. Since 2016, Vertica licenses have allowed you to use ORC and Parquet data in external tables. This data has always counted against any raw data allowance in your license. Previously, the audit of data in ORC and Parquet format was handled manually. Because this audit was not automated, the total amount of data in your native tables and external tables could exceed your licensed allowance for some time before being spotted.
Starting in version 9.1.0, Vertica automatically audits ORC and Parquet data in external tables. This auditing begins soon after you install or upgrade to version 9.1.0. If your Vertica license includes a raw data allowance and you have data in external tables based on Parquet or ORC files, review your license compliance before upgrading to Vertica 9.1.x. Verifying your database is compliant with your license terms avoids having your database become non-compliant soon after you upgrade.
Verifying your ORC and Parquet usage complies with your license terms
To verify your data usage is compliant with your license, run the following query as the database administrator:
SELECT (database_size_bytes + file_size_bytes) <= license_size_bytes
"license_compliant?"
FROM (SELECT database_size_bytes,
license_size_bytes FROM license_audits
WHERE audited_data='Total'
ORDER BY audit_end_timestamp DESC LIMIT 1) dbs,
(SELECT sum(total_file_size_bytes) file_size_bytes
FROM external_table_details
WHERE source_format IN ('ORC', 'PARQUET')) ets;
This query returns one of three results:
If you do not have any external data in ORC or Parquet format, the query returns 0 rows:
license_compliant?
--------------------
(0 rows)
In this case, you can proceed with your upgrade.
If you have data in external tables based on ORC or Parquet format, and that data does not cause your database to exceed your raw data allowance, the query returns t:
license_compliant?
--------------------
t
(1 row)
In this case, you can proceed with your upgrade.
If the data in your external tables based on ORC and Parquet causes your database to exceed your raw data allowance, the query returns f:
license_compliant?
--------------------
f
(1 row)
In this case, resolve the compliance issue before you upgrade. See below for more information.
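The comparison that the query performs can be checked by hand with plain arithmetic. The sketch below mirrors the expression (database_size_bytes + file_size_bytes) <= license_size_bytes. The function name license_compliant and the byte figures are hypothetical and serve only as an illustration.

```shell
# Mirror the comparison made by the query: native data plus ORC/Parquet
# external data must not exceed the licensed allowance. All values in bytes.
license_compliant() {
    database_size_bytes=$1
    file_size_bytes=$2
    license_size_bytes=$3
    if [ $((database_size_bytes + file_size_bytes)) -le "$license_size_bytes" ]; then
        echo t
    else
        echo f
    fi
}

# Hypothetical figures: 600 GB native, 300 GB external, 1 TB allowance
license_compliant 600000000000 300000000000 1000000000000   # prints t
# Same database with 500 GB of external data: 1.1 TB exceeds the allowance
license_compliant 600000000000 500000000000 1000000000000   # prints f
```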
Resolving non-compliance
If the query in the previous section indicates that your database is not in compliance with your license, resolve this issue before upgrading. There are two ways you can bring your database into compliance:
Delete data (either from ORC and Parquet-based external tables or Vertica native tables) to bring your data size into compliance with your license. Always back up any data you are about to delete from Vertica. Dropping external tables is a less disruptive way to reduce the size of your database, because the data is not lost: it remains in the files that your external table is based on.
Note
You can still choose to upgrade your database if it is not compliant. However, soon after you upgrade, you will begin getting warnings that your database is out of compliance. See Managing license warnings and limits for more information.
4.1.4 - Backing up and restoring grants
After an upgrade, if the prototypes of UDx libraries change, Vertica will drop the grants on those libraries since they aren't technically the same function anymore. To resolve these types of issues, it's best practice to back up the grants on these libraries so you can restore them after the upgrade.
Save the following SQL to a file named user_ddl.sql. It creates a view named user_ddl which contains the grants on all objects in the database.
CREATE OR REPLACE VIEW user_ddl AS
(
SELECT 0 as grant_order,
name principal_name,
'CREATE ROLE "' || name || '"' || ';' AS sql,
'NONE' AS object_type,
'NONE' AS object_name
FROM v_internal.vs_roles vr
WHERE NOT vr.predefined_role -- Exclude system roles
AND ldapdn = '' -- Limit to NON-LDAP created roles
)
UNION ALL
(
SELECT 1, -- CREATE USERs
user_name,
'CREATE USER "' || user_name || '"' ||
DECODE(is_locked, TRUE, ' ACCOUNT LOCK', '') ||
DECODE(grace_period, 'undefined', '', ' GRACEPERIOD ''' || grace_period || '''') ||
DECODE(idle_session_timeout, 'unlimited', '', ' IDLESESSIONTIMEOUT ''' || idle_session_timeout || '''') ||
DECODE(max_connections, 'unlimited', '', ' MAXCONNECTIONS ' || max_connections || ' ON ' || connection_limit_mode) ||
DECODE(memory_cap_kb, 'unlimited', '', ' MEMORYCAP ''' || memory_cap_kb || 'K''') ||
DECODE(profile_name, 'default', '', ' PROFILE ' || profile_name) ||
DECODE(resource_pool, 'general', '', ' RESOURCE POOL ' || resource_pool) ||
DECODE(run_time_cap, 'unlimited', '', ' RUNTIMECAP ''' || run_time_cap || '''') ||
DECODE(search_path, '', '', ' SEARCH_PATH ' || search_path) ||
DECODE(temp_space_cap_kb, 'unlimited', '', ' TEMPSPACECAP ''' || temp_space_cap_kb || 'K''') || ';' AS sql,
'NONE' AS object_type,
'NONE' AS object_name
FROM v_catalog.users
WHERE NOT is_super_user -- Exclude database superuser
AND ldap_dn = '' -- Limit to NON-LDAP created users
)
UNION ALL
(
SELECT 2, -- GRANTs
grantee,
'GRANT ' || REPLACE(TRIM(BOTH ' ' FROM words), '*', '') ||
CASE
WHEN object_type = 'RESOURCEPOOL' THEN ' ON RESOURCE POOL '
WHEN object_type = 'STORAGELOCATION' THEN ' ON LOCATION '
WHEN object_type = 'CLIENTAUTHENTICATION' THEN 'AUTHENTICATION '
WHEN object_type IN ('DATABASE', 'LIBRARY', 'MODEL', 'SEQUENCE', 'SCHEMA') THEN ' ON ' || object_type || ' '
WHEN object_type = 'PROCEDURE' THEN (SELECT ' ON ' || CASE REPLACE(procedure_type, 'User Defined ', '')
WHEN 'Transform' THEN 'TRANSFORM FUNCTION '
WHEN 'Aggregate' THEN 'AGGREGATE FUNCTION '
WHEN 'Analytic' THEN 'ANALYTIC FUNCTION '
ELSE UPPER(REPLACE(procedure_type, 'User Defined ', '')) || ' '
END
FROM vs_procedures
WHERE proc_oid = object_id)
WHEN object_type = 'ROLE' THEN ''
ELSE ' ON '
END ||
NVL2(object_schema, object_schema || '.', '') || CASE WHEN object_type = 'STORAGELOCATION' THEN (SELECT '''' || location_path || ''' ON ' || node_name FROM storage_locations WHERE location_id = object_id) ELSE object_name END ||
CASE
WHEN object_type = 'PROCEDURE' THEN (SELECT CASE WHEN procedure_argument_types = '' OR procedure_argument_types = 'Any' THEN '()' ELSE '(' || procedure_argument_types || ')' END
FROM vs_procedures
WHERE proc_oid = object_id)
ELSE ''
END ||
' TO ' || grantee ||
CASE WHEN INSTR(words, '*') > 0 THEN ' WITH GRANT OPTION' ELSE '' END
|| ';',
object_type,
object_name
FROM (SELECT grantee, object_type, object_schema, object_name, object_id,
v_txtindex.StringTokenizerDelim(DECODE(privileges_description, '', ',' , privileges_description), ',')
OVER (PARTITION BY grantee, object_type, object_schema, object_name, object_id)
FROM v_catalog.grants) foo
ORDER BY CASE REPLACE(TRIM(BOTH ' ' FROM words), '*', '') WHEN 'USAGE' THEN 1 ELSE 2 END
)
UNION ALL
(
SELECT 3, -- Default ROLEs
user_name,
'ALTER USER "' || user_name || '"' ||
DECODE(default_roles, '', '', ' DEFAULT ROLE ' || REPLACE(default_roles, '*', '')) || ';' ,
'NONE' AS object_type,
'NONE' AS object_name
FROM v_catalog.users
WHERE default_roles <> ''
)
UNION ALL -- GRANTs WITH ADMIN OPTION
(
SELECT 4, user_name, 'GRANT ' || REPLACE(TRIM(BOTH ' ' FROM words), '*', '') || ' TO ' || user_name || ' WITH ADMIN OPTION;',
'NONE' AS object_type ,
'NONE' AS object_name
FROM (SELECT user_name, v_txtindex.StringTokenizerDelim(DECODE(all_roles, '', ',', all_roles), ',') OVER (PARTITION BY user_name)
FROM v_catalog.users
WHERE all_roles <> '') foo
WHERE INSTR(words, '*') > 0
)
UNION ALL
(
SELECT 5, 'public', 'ALTER SCHEMA ' || name || ' DEFAULT ' || CASE WHEN defaultinheritprivileges THEN 'INCLUDE PRIVILEGES;' ELSE 'EXCLUDE PRIVILEGES;' END, 'SCHEMA', name
FROM v_internal.vs_schemata
WHERE NOT issys -- Exclude system schemas
)
UNION ALL
(
SELECT 6, 'public', 'ALTER DATABASE ' || database_name || ' SET disableinheritedprivileges = ' || current_value || ';',
'DATABASE', database_name
FROM v_internal.vs_configuration_parameters
CROSS JOIN v_catalog.databases
WHERE parameter_name = 'DisableInheritedPrivileges'
)
UNION ALL -- TABLE PRIV INHERITANCE
(
SELECT 7, 'public' , 'ALTER TABLE ' || table_schema || '.' || table_name ||
CASE WHEN inheritprivileges THEN ' INCLUDE PRIVILEGES;' ELSE ' EXCLUDE PRIVILEGES;' END,
'TABLE' AS object_type,
table_schema || '.' || table_name AS object_name
FROM v_internal.vs_tables
JOIN v_catalog.tables ON (table_id = oid)
)
UNION ALL -- VIEW PRIV INHERITANCE
(
SELECT 8, 'public', 'ALTER VIEW ' || table_schema || '.' || table_name || CASE WHEN inherit_privileges THEN ' INCLUDE PRIVILEGES;' ELSE ' EXCLUDE PRIVILEGES; ' END,
'TABLE' AS object_type, table_schema || '.' || table_name AS object_name
FROM v_catalog.views
)
UNION ALL
(
SELECT 9, owner_name, 'ALTER TABLE ' || table_schema || '.' || table_name || ' OWNER TO ' || owner_name || ';',
'TABLE', table_schema || '.' || table_name
FROM v_catalog.tables
)
UNION ALL
(
SELECT 10, owner_name, 'ALTER VIEW ' || table_schema || '.' || table_name || ' OWNER TO ' || owner_name || ';', 'TABLE',
table_schema || '.' || table_name
FROM v_catalog.views
);
From the Linux command line, run the script in the user_ddl.sql file:
$ vsql -f user_ddl.sql
CREATE VIEW
Connect to Vertica using vsql.
Export the content of the user_ddl's sql column ordered on the grant_order column to a file:
=> \o pre-upgrade.txt
=> SELECT sql FROM user_ddl ORDER BY grant_order ASC;
=> \o
To restore any missing grants, run the remaining grants in grants-list.txt, if any:
=> \i 'grants-list.txt'
Note
Attempting to restore grants to users with the ANY keyword triggers the following error:
ERROR 4856: Syntax error at or near "Any" at character
To avoid this error, use () instead of (ANY) as shown in the following example:
=> GRANT EXECUTE ON FUNCTION public.MapLookup() TO public;
GRANT PRIVILEGE
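The procedure above exports grants to pre-upgrade.txt but restores them from grants-list.txt. One way to produce grants-list.txt is to repeat the export after the upgrade and keep only the statements that no longer appear. The sketch below assumes a second export named post-upgrade.txt (a hypothetical file name). The helper names statements and missing_grants are also hypothetical.

```shell
# Print only the SQL statements in a vsql export: trim surrounding
# whitespace and keep lines that end with a semicolon, which drops the
# column header, the separator line, and the "(N rows)" footer.
statements() {
    sed -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//' "$1" | grep ';$'
}

# List statements present in the pre-upgrade export but absent from the
# post-upgrade export.
missing_grants() {
    pre=$1
    post=$2
    tmp_post=$(mktemp)
    statements "$post" > "$tmp_post"
    # -F fixed strings, -x whole-line match, -v invert, -f patterns from file:
    # print each pre-upgrade statement that has no identical post-upgrade line
    statements "$pre" | grep -Fxv -f "$tmp_post"
    rm -f "$tmp_post"
}

# Usage:
#   missing_grants pre-upgrade.txt post-upgrade.txt > grants-list.txt
```

Review the resulting file before running it, because it lists every statement that differs, not only library grants.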
4.1.5 - Nonsequential FIPS database upgrades
As of Vertica 10.1.1, FIPS support has been reinstated. Prior to this, the last version to support FIPS was Vertica 9.2.x. If you are upgrading from 9.2.x and want to maintain your FIPS certification, you must first perform a direct upgrade from 9.2.x to 10.1.1 before performing further upgrades.
The following procedure performs a direct upgrade from Vertica 9.2.x running on RHEL 6.x to Vertica 10.1.1 on RHEL 8.1.
Important
If you have any questions or want additional guidance for performing this upgrade, contact Vertica Support.
Create a full backup of your Vertica 9.2.x database. This example uses the configuration file fullRestore.ini.
If you acquired your RHEL 8.1 cluster by reimaging or using a different cluster, you must restore your database.
$ vbr -c /tmp/fullRestore.ini -t restore
If you encounter the following warning, you can safely ignore it.
Warning: Vertica versions do not match: v9.2.1-xx -> v10.1.1-xxxxxxxx. This operation may not be supported.
Start the Vertica 10.1.1 database to trigger the upgrade. This should be the first time you've started your database since shutting it down in step 2.
$ admintools -t start_db -d fips_db
4.2 - Upgrade Vertica
Important
Before running the upgrade script, be sure to review the tasks described in Before you upgrade.
To upgrade your database to a new Vertica version, complete the following steps:
Perform a full backup of your existing database. This precautionary measure lets you restore from the backup, if the upgrade is unsuccessful. If the upgrade fails, you can reinstall the previous version of Vertica and restore your database to that version.
On each host where an additional package is installed, such as the R language pack, uninstall it. For example:
rpm -e vertica-R-lang
Important
If you omit this step and do not uninstall additional packages, the Vertica server package fails to install in the next step.
Make sure you are logged in as root or as a user with sudo privileges, and use one of the following commands to run the package installer:
If you are root and installing an RPM:
# rpm -Uvh pathname
If you are using sudo and installing an RPM:
$ sudo rpm -Uvh pathname
If you are using Debian:
$ sudo dpkg -i pathname
On the same node on which you just installed the RPM, run update_vertica as root or sudo. This installs the RPM on all the hosts in the cluster. For example:
You can upgrade the Vertica server running on AWS instances created from a Vertica AMI. To upgrade the Vertica server on these AWS instances, you need to add the --dba-user-password-disabled and --point-to-point arguments to the upgrade script.
The following requirements and restrictions apply:
The DBADMIN user must be able to read the RPM or DEB file when upgrading. Some upgrade scripts are run as the DBADMIN user, and that user must be able to read the RPM or DEB file.
Use the same options that you used when you last installed or upgraded the database. You can find these options in /opt/vertica/config/admintools.conf, on the install_opts line. For details on all options, see Install Vertica with the installation script.
Caution
If you omit any previous options, their default settings are restored. If you do so, or if you change any options, the upgrade script uses the new settings to reconfigure the cluster. This can cause issues with the upgraded database.
Omit the --hosts/-s host-list parameter. The upgrade script automatically identifies cluster hosts.
If the root user is not in /etc/sudoers, an error appears. The installer reports this issue with S0311. See the Sudoers Manual for more information.
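To reuse the options from the last install or upgrade, you can read them from admintools.conf. The following is a minimal sketch that assumes the line has the form "install_opts = <options>". The function name previous_install_opts is hypothetical.

```shell
# Print the options recorded on the install_opts line of admintools.conf.
# Assumption: the line has the form "install_opts = <options>".
previous_install_opts() {
    conf=${1:-/opt/vertica/config/admintools.conf}
    # Strip the key, the equals sign, and surrounding whitespace; print the rest
    sed -n 's/^[[:space:]]*install_opts[[:space:]]*=[[:space:]]*//p' "$conf"
}

# Usage:
#   previous_install_opts
# Remove any --hosts/-s host-list option from the result by hand before
# passing the remaining options to the upgrade script.
```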
Start the database. The start-up scripts analyze the database and perform necessary data and catalog updates for the new version.
If Vertica issues a warning stating that one or more packages cannot be installed, run the admintools install_package command with the --force-reinstall option to force reinstallation of the packages. For details, see Reinstalling packages.
When the upgrade completes, the database automatically restarts.
Note
Any user or role with the same name as a predefined role is renamed to OLD_n_name, where n is an integer that increments from zero until the resulting name is unique and name is the previous name of the user or role.
Manually restart any nodes that fail to start.
Perform another database backup.
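The renaming rule in the note above (OLD_n_name, with n counting up from zero until the name is unique) can be illustrated as follows. This is only a sketch of the naming scheme: Vertica performs the rename itself, the function name renamed_principal is hypothetical, and the case-sensitive comparison is a simplification.

```shell
# Return the name a conflicting user or role would receive: OLD_n_name with
# the smallest n, counting from zero, that is not already taken.
# Usage: renamed_principal NAME [EXISTING_NAME...]
renamed_principal() {
    name=$1
    shift
    n=0
    while :; do
        candidate="OLD_${n}_${name}"
        taken=no
        for existing in "$@"; do
            if [ "$existing" = "$candidate" ]; then
                taken=yes
            fi
        done
        if [ "$taken" = no ]; then
            break
        fi
        n=$((n + 1))
    done
    echo "$candidate"
}

renamed_principal dbadmin                 # prints OLD_0_dbadmin
renamed_principal dbadmin OLD_0_dbadmin   # prints OLD_1_dbadmin
```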
Upgrade duration
Upgrade duration depends on the average in-memory size of catalogs across all cluster nodes. For every 20 GB, you can expect the upgrade to last between one and two hours.
You can calculate catalog memory usage on all nodes by querying system table RESOURCE_POOL_STATUS:
=> SELECT node_name, pool_name, memory_size_kb FROM resource_pool_status WHERE pool_name = 'metadata';
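The rule of thumb above can be turned into a rough estimate from the memory_size_kb values that the query returns. The sketch below assumes that duration scales linearly with average catalog size, which the text implies but does not state. The function name estimate_upgrade_hours and the sample values are hypothetical.

```shell
# Estimate upgrade duration from per-node catalog sizes (memory_size_kb).
# Applies one to two hours per 20 GB of average in-memory catalog size.
# Arguments: one memory_size_kb value per node.
estimate_upgrade_hours() {
    printf '%s\n' "$@" | awk '
        { total += $1; nodes++ }
        END {
            avg_gb = total / nodes / 1024 / 1024   # KB -> GB
            printf "avg_catalog_gb=%.1f min_hours=%.1f max_hours=%.1f\n",
                   avg_gb, avg_gb / 20, avg_gb / 20 * 2
        }'
}

# Hypothetical three-node cluster with 38 GB, 40 GB, and 42 GB catalogs
estimate_upgrade_hours 39845888 41943040 44040192
# prints avg_catalog_gb=40.0 min_hours=2.0 max_hours=4.0
```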
Post-upgrade tasks
After you complete the upgrade, review post-upgrade tasks in After you upgrade.
4.3 - After you upgrade
After you finish upgrading the Vertica server package on your cluster, a number of tasks remain.
Required tasks
If you created projections in earlier releases with pre-aggregated data (for example, LAPs and TopK projections) and the projections were partitioned with a GROUP BY clause, you must rebuild these projections.
If you're upgrading from Vertica 9.2.x and have set the PasswordMinCharChange or PasswordMinLifeTime system-level security parameters, set them again at the PROFILE-level.
4.3.1 - Rebuilding partitioned projections with pre-aggregated data
If you created projections in earlier (pre-10.0.x) releases with pre-aggregated data (for example, LAPs and TopK projections) and the anchor tables were partitioned with a GROUP BY clause, their ROS containers are liable to be corrupted from various DML and ILM operations. In this case, you must rebuild the projections:
Run the meta-function REFRESH on the database. If REFRESH detects problematic projections, it returns with failure messages. For example:
=> SELECT REFRESH();
REFRESH
-----------------------------------------------------------------------------------------------------
Refresh completed with the following outcomes:
Projection Name: [Anchor Table] [Status] [ Refresh Method] [Error Count]
"public"."store_sales_udt_sum": [store_sales] [failed: Drop and recreate projection] [] [1]
"public"."product_sales_largest": [store_sales] [failed: Drop and recreate projection] [] [1]
"public"."store_sales_recent": [store_sales] [failed: Drop and recreate projection] [] [1]
(1 row)
Vertica also logs messages to vertica.log:
2020-07-07 11:28:41.618 Init Session:0x7fabbbfff700-a00000000005b5 [Txn] <INFO> Begin Txn: a00000000005b5 'Refresh: Evaluating which projection to refresh'
2020-07-07 11:28:41.640 Init Session:0x7fabbbfff700-a00000000005b5 [Refresh] <INFO> Storage issues detected, unable to refresh projection 'store_sales_recent'. Drop and recreate this projection, then refresh.
2020-07-07 11:28:41.641 Init Session:0x7fabbbfff700-a00000000005b5 [Refresh] <INFO> Storage issues detected, unable to refresh projection 'product_sales_largest'. Drop and recreate this projection, then refresh.
2020-07-07 11:28:41.641 Init Session:0x7fabbbfff700-a00000000005b5 [Refresh] <INFO> Storage issues detected, unable to refresh projection 'store_sales_udt_sum'. Drop and recreate this projection, then refresh.
Drop the projections, then recreate them as defined in the exported DDL.
Run REFRESH. Vertica rebuilds the projections with new storage containers.
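When many projections are affected, you can collect their names from vertica.log instead of reading the REFRESH output by eye. The following is a minimal sketch that relies on the message text shown above. The function name projections_to_rebuild is hypothetical, and the log path is whatever your installation uses.

```shell
# List projections that vertica.log reports as needing to be dropped and
# recreated, one name per line, without duplicates.
projections_to_rebuild() {
    # Keep only the quoted projection name from each matching log line
    sed -n "s/.*Storage issues detected, unable to refresh projection '\([^']*\)'.*/\1/p" "$1" | sort -u
}

# Usage:
#   projections_to_rebuild /path/to/vertica.log
```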
4.3.2 - Verifying catalog memory consumption
Vertica versions ≥ 9.2 significantly reduce how much memory database catalogs consume. After you upgrade, check catalog memory consumption on each node to verify that the upgrade refactored catalogs correctly. If memory consumption for a given catalog is as large as or larger than it was in the earlier database, restart the host node.
Known issues
Certain operations might significantly inflate catalog memory consumption. For example:
You created a backup on a 9.1.1 database and restored objects from the backup to a new database of version ≥ 9.2.
You replicated objects from a 9.1.1 database to a database of version ≥ 9.2.
To refactor database catalogs and reduce their memory footprint, restart the database.
4.3.3 - Reinstalling packages
In most cases, Vertica automatically reinstalls all default packages when you restart your database for the first time after running the upgrade script. Occasionally, however, one or more packages might fail to reinstall correctly.
To verify that Vertica succeeded in reinstalling all packages:
Restart the database after upgrading.
Enter a correct password.
If any packages failed to reinstall, Vertica issues a message that specifies the uninstalled packages. You can manually reinstall the packages with admintools or the HTTPS service:
To reinstall with admintools, run the install_package command with the option --force-reinstall:
4.3.4 - Writing bundle metadata to the catalog
Vertica internally stores physical table data in bundles together with metadata on the bundle contents. The query optimizer uses bundle metadata to look up and fetch the data it needs for a given query.
Vertica stores bundle metadata in the database catalog. This is especially beneficial in Eon mode: instead of fetching this metadata from remote (S3) storage, the optimizer can find it in the local catalog. This minimizes S3 reads, and facilitates faster query planning and overall execution.
Vertica writes bundle metadata to the catalog on two events:
Any DML operation that changes table content, such as INSERT, UPDATE, or COPY. Vertica writes bundle metadata to the catalog on the new or changed table data. DML operations have no effect on bundle metadata for existing table data.
Invocations of function UPDATE_STORAGE_CATALOG, as an argument to Vertica meta-function DO_TM_TASK, on existing data. You can narrow the scope of the catalog update operation to a specific projection or table. If no scope is specified, the operation is applied to the entire database.
Important
After upgrading to any Vertica version ≥ 9.2.1, you only need to call UPDATE_STORAGE_CATALOG once on existing data. Bundle metadata on all new or updated data is always written automatically to the catalog.
For example, the following DO_TM_TASK call writes bundle metadata on all projections in table store.store_sales_fact:
You can query system table STORAGE_BUNDLE_INFO_STATISTICS to determine which projections have invalid bundle metadata in the database catalog. For example, results from the following query show that the database catalog has invalid metadata for projections inventory_fact_b0 and inventory_fact_b1:
Updating the database catalog with UPDATE_STORAGE_CATALOG is recommended only for Eon users. Enterprise users are unlikely to see measurable performance improvements from this update.
Calls to UPDATE_STORAGE_CATALOG can incur considerable overhead, as the update process typically requires numerous and expensive S3 reads. Vertica advises against running this operation on the entire database. Instead, consider an incremental approach:
Call UPDATE_STORAGE_CATALOG on a single large fact table. You can use performance metrics to estimate how much time updating other files will require.
Identify which tables are subject to frequent queries and prioritize catalog updates accordingly.
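One way to follow the incremental approach is to generate one DO_TM_TASK statement per table, in priority order, and run them one at a time. The sketch below only prints the statements. The function name update_storage_catalog_statements and the table public.inventory_fact are hypothetical, and the task name 'update_storage_catalog' is assumed to be the DO_TM_TASK argument that invokes UPDATE_STORAGE_CATALOG.

```shell
# Print one DO_TM_TASK statement per table, in the order given, so that the
# catalog update can be run table by table rather than on the whole database.
update_storage_catalog_statements() {
    for table in "$@"; do
        printf "SELECT DO_TM_TASK('update_storage_catalog', '%s');\n" "$table"
    done
}

# Start with a single large fact table, then frequently queried tables.
# store.store_sales_fact comes from the text; the second name is hypothetical.
update_storage_catalog_statements store.store_sales_fact public.inventory_fact
```

Printing the statements first lets you review them, time the first table, and then decide how to schedule the rest.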
4.3.5 - Upgrading the streaming data scheduler utility
If you have integrated Vertica with a streaming data application, such as Apache Kafka, you must update the streaming data scheduler utility after you update Vertica.
From a command prompt, enter the following command:
To uninstall Vertica, perform the following steps for each host in the cluster:
Choose a host machine and log in as root (or log in as another user and switch to root).
$ su - root
password: root-password
Find the name of the package that is installed:
RPM:
# rpm -qa | grep vertica
DEB:
# dpkg -l | grep vertica
Remove the package:
RPM:
# rpm -e package
DEB:
# dpkg -r package
Note
If you want to delete the configuration file used with your installation, you can choose to delete the /opt/vertica/ directory and all subdirectories using this command:
# rm -rf /opt/vertica/
Perform the following steps for each client system:
Delete the JDBC driver jar file.
Delete ODBC driver data source names.
Delete the ODBC driver software:
In Windows, go to Start > Control Panel > Add or Remove Programs.