Installation
This section explains how to prepare for and install the Vertica server. This guide also provides instructions for installing the Vertica Management Console.
For information about installing client drivers, see Client drivers.
Prerequisites
-
This document assumes that you have become familiar with the concepts discussed in Architecture.
-
To perform the procedures described in this document, you must have root password or sudo access (for all commands) for all nodes in your cluster.
1 - Planning your installation
Before you get started with Vertica, consider your business needs and available resources. Vertica is built to run in a variety of environments and can be installed using different methods, depending on your requirements. These factors determine which installation path to follow.
1.1 - Choosing an on-premises or cloud environment
You can choose to run Vertica on physical host hardware, or deploy Vertica on the cloud.
On-premises environment
Do you have access to on-premises hardware on which to install Vertica? On-premises hardware can provide benefits in cases like the following:
-
Your business requirements demand keeping sensitive data on-premises.
-
You prefer to pay a higher up-front cost (CapEx) of buying hardware for on-premises deployment, rather than potentially paying a higher long-term total cost of a cloud deployment.
-
You cannot rely on continuous access to the internet.
-
You prefer end-to-end control over your environment, rather than depending on a third-party cloud provider to store your data.
-
You may have already invested in a data center and suitable hardware for Vertica that you want to capitalize on.
If you plan to install Vertica in an on-premises environment, this section of the documentation walks you through preparation and installation: Installing manually.
Cloud environment
Vertica can run on Amazon Web Services (AWS), Google Cloud Platform (GCP), and Microsoft Azure. You might consider running Vertica on cloud resources for any of the following benefits:
-
You plan to quickly scale your cluster size up and down to accommodate varying analytic workloads. You can provision more computing resources during peak workloads without incurring the same resource costs during low-demand periods. The Vertica database's Eon Mode is designed for this use case.
-
You prefer to pay over time (OpEx) for ongoing cloud deployment, rather than the higher up-front cost of buying hardware for on-premises deployment.
-
You need to reduce the costs, labor, and expertise involved in maintaining physical on-premises hardware (such as accommodating for server purchases, hardware depreciation, software maintenance, power consumption, floor space, and backup infrastructure).
-
You prefer simpler, faster deployment. Installing on the cloud eliminates the need for more specific hardware expertise during setup. In addition, on cloud platforms such as AWS and GCP, Vertica offers templates that allow you to deploy a pre-configured set of resources, on which Vertica and Management Console are already installed, in just a few steps.
-
You have very variable workloads and you do not want to pay for idle equipment in a data center when you can simply rent infrastructure when you need it.
-
You are a start-up and don't want to build out a data center until your product or service is proven and growing.
If you plan to install Vertica on the cloud, first see Installing in the cloud.
1.2 - Choosing a database mode
You can create a Vertica database in one of two modes: Eon Mode or Enterprise Mode. The mode determines the database's underlying architecture, such as the way Vertica stores data, how the database cluster scales, and how data is loaded; the mode cannot be changed after database creation. Database mode does not affect the way you run queries and other everyday tasks while using the database.
For an in-depth explanation of Enterprise Mode and Eon Mode, see Architecture.
1.3 - Choosing an installation method
After you have decided how you will run Vertica, you can choose which installation method works for your needs.
Installing Vertica manually
Manually installing Vertica through the command line works on all platforms. You will first set up a cluster of nodes, then install Vertica.
Manual installation might be right for you if your cluster will have many specific configuration requirements, and you have a database administrator with the expertise to set up the cluster manually on your chosen platform. Manual installation takes more time, but you can configure your cluster to your system's exact needs.
For an on-premises environment, you must install Vertica manually. See Installing manually to get started.
For Amazon AWS, Google Cloud Platform, and Microsoft Azure, you have the option to install automatically or manually. See Installing in the cloud for information on manual installation on each cloud platform.
Installing Vertica automatically
Automatic installation is available on Amazon AWS, Google Cloud Platform, and Microsoft Azure.
Automatic installation deploys a pre-configured environment consisting of cloud resources on which your cluster can run, with Vertica and Management Console already installed. Enter a few parameters into a template on your chosen platform to quickly get up and running with Vertica.
In addition, when you deploy automatically with AWS, Management Console provides AWS-specific cluster management capabilities, including a cluster creation wizard that spins up AWS cluster nodes and creates a Vertica database on them.
For Amazon AWS, Google Cloud Platform, and Microsoft Azure, you have the option to install automatically or manually. See Installing in the cloud for information on automatic installation on each cloud platform.
1.4 - Planning Eon Mode communal storage
If you choose to install your database using Eon Mode, you must plan for your use of communal storage to store your database's data. Communal storage is based on an object store, such as AWS S3 or Pure Storage FlashBlade servers.
Whatever object storage platform you use, you must ensure that it is durable (protected against data loss). The data in your Eon Mode database is only as safe as the object store that contains it. Most cloud providers' object stores come with guaranteed redundancy to prevent data loss. When you install an Eon Mode database on-premises, you may have to take additional steps to prevent data loss.
Planning communal storage capacity for on-premises databases
Most cloud providers do not limit the amount of data you can store in their object stores. The only real limit is your budget; storing more data costs more money.
When you deploy an Eon Mode database on-premises, your storage is limited to the size of your object store. Unlike the cloud, you must plan ahead for the amount of storage you will need. For example, if you have a Pure Storage FlashBlade installation with three 8TB blades, then in theory your database can grow up to 24TB. In practice, you need to account for other uses of your object store, as well as factors such as data compression and space consumed by unreaped ROS containers (storage containers no longer used by Vertica but not yet deleted by the object store).
The following calculator helps you determine the size for your communal storage needs, based on your estimated data size and additional uses of your communal storage. The values with white backgrounds in the Value column are editable. Change them to reflect your environment.
Note
The calculator currently does not work in mobile browsers. Please use a desktop browser to view the calculator.
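If the calculator is unavailable, you can sketch the estimate yourself. The formula, compression ratio, and overhead factor below are illustrative assumptions, not Vertica's official sizing method; adjust them to reflect your environment:

```shell
# Rough sizing sketch for Eon Mode communal storage.
# Assumed formula: required = (raw data / compression ratio) * (1 + overhead)
raw_tb=10            # estimated raw data size, in TB (adjust for your environment)
compression=2        # assumed average compression ratio
overhead=0.5         # assumed 50% extra for backups, unreaped ROS containers, other uses
required=$(awk -v r="$raw_tb" -v c="$compression" -v o="$overhead" \
  'BEGIN { printf "%.1f", (r / c) * (1 + o) }')
echo "estimated communal storage needed: $required TB"
```

With these example values, 10TB of raw data at 2:1 compression plus 50% overhead works out to 7.5TB of communal storage.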
2 - Installing in the cloud
You can spin up a Vertica cluster in minutes using Amazon Web Services, Microsoft Azure, or Google Cloud Platform.
Vertica offers simple, automatic deployment on all three platforms: just input a few parameters, then launch a fully functional environment with Vertica and Management Console already installed.
If launching a pre-configured environment doesn't work with your specific needs, you can instead set up your nodes in the cloud and manually install Vertica in order to have more control over your setup.
You can create a database in either Enterprise Mode or Eon Mode. Eon Mode databases are supported on AWS environments only, and are optimized for easier scalability on the cloud. Enterprise Mode is also supported on AWS environments, as well as all other platforms that Vertica is compatible with.
Automatic installation
Vertica offers automatic configuration of resources and quick deployment on the cloud.
AWS:
Vertica provides CloudFormation Templates (CFTs) in the AWS Marketplace. Use a CFT to automatically launch preconfigured AWS resources in minutes, with Vertica and Management Console also automatically installed.
Each CFT includes the in-browser Vertica Management Console. When you install Vertica using one of the CFTs, Management Console provides AWS-specific cluster management options, including the ability to quickly create a new cluster and Vertica database.
To deploy Vertica on AWS automatically, see Installing Vertica with CloudFormation templates.
After deployment, you can create an Eon Mode or Enterprise Mode cluster and database using Management Console.
Also refer to the official AWS documentation.
Google Cloud Platform:
For GCP, Vertica provides an automated installer that is available from the Google Cloud Marketplace.
Input a few parameters, and the Google Cloud Launcher will deploy the Vertica solution, including your new database. You can create up to a 16-node cluster. The solution includes the Vertica Management Console as the primary UI for you to get started.
To deploy Vertica on GCP automatically, see Deploy Vertica from the Google cloud marketplace in the GCP section of the Vertica documentation.
Also refer to the official Google Cloud Platform documentation.
Microsoft Azure:
Vertica offers a fully automated cluster deployment from the Microsoft Azure Marketplace. This solution will automatically deploy a Vertica cluster and create an initial database, allowing you to log in to the Vertica Management Console and start using it once deployment has finished.
To deploy Vertica on Azure automatically, see Deploying Vertica from the Azure Marketplace in the Microsoft Azure section of the Vertica documentation.
Also refer to the official Microsoft Azure documentation.
Manual installation
Manual installation might be the right option for you if you have many specific configuration requirements, and have an administrator who is familiar with setting up and maintaining cloud resources in the environment of your choice. Setup and maintenance may take longer, and requires more expertise, but you will have more control over how your cluster is configured.
The process of installing Vertica manually on cloud resources is very similar to doing so with on-premises hardware.
See the guide to manual installation of Vertica here: Installing manually
However, it is important to consider your platform when preparing your environment for installation. Vertica offers cloud-specific documentation with details about how each platform works with Vertica. Before you install, also refer to the documentation of the platform you are using to set up your cloud resources correctly.
AWS:
To install Vertica on AWS manually, see the AWS section of the Vertica documentation: Vertica on Amazon Web Services
Refer to the official AWS documentation for in-depth details for how to set up your AWS resources.
Google Cloud Platform:
To install Vertica on GCP manually, see the GCP section of the Vertica documentation: Vertica on Google Cloud Platform
Refer to the official Google Cloud Platform documentation for more detail on setting up your GCP resources.
Microsoft Azure:
To install Vertica on Azure manually, see the Azure section of the Vertica documentation: Vertica on Microsoft Azure
Refer to the official Microsoft Azure documentation for more detail on setting up your Azure resources.
3 - Installing manually
This section discusses the procedure for installing Vertica manually. You can manually install Vertica on-premises or in a cloud environment, running in either Eon Mode or Enterprise Mode.
You usually perform a manual install when installing on-premises. Most cloud environments offer an automated way to install Vertica. See Installing in the cloud for additional resources specific to cluster configuration on your chosen cloud platform.
3.1 - Installation overview and checklist
This page provides an overview of installation tasks. Carefully review and follow the instructions in all sections in this topic.
Important notes
-
Vertica supports only one running database per cluster.
-
Vertica supports installation on one, two, or multiple nodes. The steps for Installing Vertica are the same, no matter how many nodes are in the cluster.
-
Prerequisites listed in Before You Install Vertica are required for all Vertica configurations.
-
Only one instance of Vertica can be running on a host at any time.
-
To run the install_vertica script, as well as to add, update, or delete nodes, you must be logged in as root, or use sudo as a user with all privileges. You must run the script for all installations, including upgrades and single-node installations.
Installation scenarios
The four main scenarios for installing Vertica on hosts are:
-
A single node install, where Vertica is installed on a single host as a localhost process. This form of install cannot be expanded to more hosts later on and is typically used for development or evaluation purposes.
-
Installing to a cluster of physical host hardware. This is the most common scenario when deploying Vertica in a testing or production environment.
-
Installing on Amazon Web Services (AWS). You can install by creating a Vertica cluster using a CloudFormation template and step-by-step wizards in MC, or manually deploy using an Amazon Machine Image (AMI) where Vertica is installed when you create your instances. Eon Mode databases are currently only supported on AWS resources. For the AWS-specific installation procedure, see Deploy AWS instances for your Vertica database cluster.
-
Installing to a local cluster of virtual host hardware. This is similar to installing on physical hosts, but with network configuration differences.
Before you install
Before You Install Vertica describes how to construct a hardware platform and prepare Linux for Vertica installation.
These preliminary steps are broken into two categories:
Install or upgrade Vertica
Once you have completed the steps in the Before You Install Vertica section, you are ready to run the install script.
Installing Vertica describes how to:
-
Back up any existing databases.
-
Download and install the Vertica RPM package.
-
Install a cluster using the install_vertica script.
-
[Optional] Create a properties file that lets you install Vertica silently.
Note
This guide provides additional manual procedures in case you encounter installation problems.
Post-installation tasks
After You Install Vertica describes subsequent steps to take after you've run the installation script. Some of the steps can be skipped based on your needs:
-
Install the license key.
-
Verify that kernel and user parameters are correctly set.
-
Install the vsql client application on non-cluster hosts.
-
Resolve any SLES 11.3 issues during spread configuration.
-
Use the Vertica documentation online, or download and install Vertica documentation. Find the online documentation and documentation packages to download at https://docs.vertica.com/latest.
-
Install client drivers.
-
Extend your installation with Vertica packages.
-
Install or upgrade the Management Console.
3.2 - About Linux users created by Vertica and their privileges
This topic describes the Linux accounts that the installer creates and configures so Vertica can run. When you install Vertica, the installation script optionally creates the following Linux user and group:
dbadmin (the user) and verticadba (the group) are the default names. If you want to change what these Linux accounts are called, you can do so using the installation script. See Installing Vertica with the installation script for details.
Before you install Vertica
See the following topics for more information:
When you install Vertica
The Linux dbadmin user owns the database catalog and data storage on disk. When you run the install script, Vertica creates this user on each node in the database cluster. It also adds dbadmin to the Linux dbadmin and verticadba groups, and configures the account as follows:
-
Configures and authorizes dbadmin for passwordless SSH between all cluster nodes. SSH must be installed and configured to allow passwordless logins. See Enable secure shell (SSH) logins.
-
Sets the dbadmin user's BASH shell to /bin/bash, required to run scripts such as install_vertica and the Administration tools.
-
Provides read-write-execute permissions on the following directories:
Note
The Vertica installation script also creates a Vertica database superuser named dbadmin. They share the same name, but they are not the same: one is a Linux user and the other is a Vertica user. See Database administration user for information about the database superuser.
After you install Vertica
Root or sudo privileges are not required to start or run Vertica after the installation process completes.
The dbadmin user can log in and perform Vertica tasks, such as creating a database, installing/changing the license key, or installing drivers. If dbadmin wants database directories in a location that differs from the default, the root user (or a user with sudo privileges) must create the requested directories and change ownership to the dbadmin user.
Vertica prevents administration from users other than the dbadmin user (or the user name you specified during the installation process if not dbadmin). Only this user can run Administration Tools.
See also
3.3 - Before you install Vertica
Complete all of the tasks in this section before you install Vertica. When you have completed this section, proceed to Installing Vertica.
3.3.1 - Platform requirements and recommendations
You must verify that your servers meet the platform requirements described in Supported Platforms. The Supported Platforms topics detail supported versions for the following:
Install the latest vendor-specific system software
Install the latest vendor drivers for your hardware.
Data storage recommendations
Install Perl
Before you perform the cluster installation, install Perl 5 on all the target hosts. Perl is available for download from www.perl.org.
Validation utilities
Vertica provides several validation utilities that validate the performance of prospective hosts. The utilities are installed when you install the Vertica RPM, but you can run them before you run the install_vertica script. See Validation scripts for more details on running the utilities and verifying that your hosts meet the recommended requirements.
3.3.1.1 - General hardware and OS requirements and recommendations
Hardware recommendations
The Vertica Analytics Platform is based on a massively parallel processing (MPP), shared-nothing architecture, in which the query processing workload is divided among all nodes of the Vertica database. OpenText highly recommends using a homogeneous hardware configuration for your Vertica cluster; that is, each node of the cluster should be similar in CPU, clock speed, number of cores, memory, and operating system version.
Note that OpenText has not tested Vertica on clusters made up of nodes with disparate hardware specifications. While it is expected that a Vertica database would functionally work in a mixed hardware configuration, performance will be limited to that of the slowest node in the cluster.
Vertica performs best on processors with higher clock frequency. When possible, choose a faster processor with fewer cores as opposed to a slower processor with more cores.
Tests performed both internally and by customers have shown performance differences between processor architectures even when accounting for differences in core count and clock frequency. When possible, compare platforms by installing Vertica and running experiments using your data and workloads. Consider testing on cloud platforms that offer VMs running on different processor architectures, even if you intend to deploy your Vertica database on premises.
Detailed hardware recommendations are available in Recommendations for Sizing Vertica Nodes and Clusters (formerly the Vertica Hardware Planning Guide).
Important
Deploy Vertica as the only active process on each host, aside from Linux processes or software explicitly approved by Vertica. Vertica cannot be colocated with other software. Remove or disable all non-essential applications from cluster hosts.
You must verify that your servers meet the platform requirements described in Vertica server and Management Console.
Verify sudo
Vertica uses the sudo command during installation and some administrative tasks. Ensure that sudo is available on all hosts with the following command:
# which sudo
/usr/bin/sudo
If sudo is not installed on all hosts, follow the instructions in How to Enable sudo on Red Hat Enterprise Linux.
When you use sudo to install Vertica, the user that performs the installation must have privileges on all nodes in the cluster.
Configuring sudo with privileges for the individual commands can be a tedious and error-prone process; thus, the Vertica documentation does not include every possible sudo command that you can include in the sudoers file. Instead, Vertica recommends that you temporarily elevate the sudo user to have all privileges for the duration of the install.
Note
See the sudoers and visudo man pages for the details on how to write/modify a sudoers file.
To allow root sudo access on all commands as any user on any machine, use visudo as root to edit the /etc/sudoers file and add this line:
## Allow root to run any commands anywhere
root ALL=(ALL) ALL
After the installation completes, remove (or reset) sudo privileges to the pre-installation settings.
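Before and after adjusting the sudoers file, you can confirm the installing user's access with a quick non-interactive check on each host (a sketch; run it as the user who will perform the install):

```shell
# Check whether the current user can run commands through sudo without a
# password prompt (-n makes sudo fail instead of prompting).
if sudo -n true 2>/dev/null; then
  sudo_ok=yes
else
  sudo_ok=no
fi
echo "passwordless sudo: $sudo_ok"
```

If the result is "no", revisit the sudoers configuration with visudo before running the install script.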
3.3.1.2 - BASH shell requirements
All shell scripts included in Vertica must run under the BASH shell. If you are on a Debian system, the default shell can be DASH. DASH is not supported. Change the shell for root and for the dbadmin user to BASH with the chsh command.
For example:
# getent passwd | grep root
root:x:0:0:root:/root:/bin/dash
# chsh
Changing shell for root.
New shell [/bin/dash]: /bin/bash
Shell changed.
Then, as root, change the symbolic link for /bin/sh from /bin/dash to /bin/bash:
# rm /bin/sh
# ln -s /bin/bash /bin/sh
Log out and back in for the change to take effect.
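To confirm the change took effect on a host, a quick check of where /bin/sh points (a sketch; the remediation hint simply restates the chsh and relink steps above):

```shell
# Report which shell /bin/sh resolves to; on Debian it is often DASH,
# which Vertica does not support.
target=$(readlink -f /bin/sh)
echo "/bin/sh resolves to: $target"
case "$target" in
  */bash) echo "OK: /bin/sh is BASH" ;;
  *)      echo "Not BASH: change it with chsh, then relink /bin/sh" ;;
esac
```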
3.3.2 - Configuring the network
This group of steps involves configuring the network. These steps differ depending on your installation scenario. A single node installation requires little network configuration, because the single instance of the Vertica server does not need to communicate with other nodes in a cluster. For cluster and cloud install scenarios, you must make several decisions regarding your configuration.
Vertica supports server configuration with multiple network interfaces. For example, you might want to use one as a private network interface for internal communication among cluster hosts (the ones supplied via the --hosts option to install_vertica) and a separate one for client connections.
Important
Vertica performs best when all nodes are on the same subnet and have the same broadcast address for one or more interfaces. A cluster that has nodes on more than one subnet can experience lower performance due to the network latency associated with a multi-subnet system at high network utilization levels.
Important notes
-
Network configuration is exactly the same for single nodes as for multi-node clusters, with one special exception. If you install Vertica on a single host machine that is to remain a permanent single-node configuration (such as for development or Proof of Concept), you can install Vertica using localhost or the loopback IP (typically 127.0.0.1) as the value for --hosts. Do not use the hostname localhost in a node definition if you are likely to add nodes to the configuration later.
-
If you are using a host with multiple network interfaces, configure Vertica to use the address which is assigned to the NIC that is connected to the other cluster hosts.
-
Use a dedicated gigabit switch. If you do not, performance could be severely affected.
-
Do not use DHCP dynamically-assigned IP addresses for the private network. Use only static addresses or permanently-leased DHCP addresses.
Choose IPv4 or IPv6 addresses for host identification and communications
Vertica supports using either IPv4 or IPv6 IP addresses for identifying the hosts in a database cluster. Vertica uses a single address to identify a host in the database cluster. All the IP addresses used to identify hosts in the cluster must use the same IP family.
The hosts in your database cluster can have both IPv4 and IPv6 network addresses assigned to them. Only one of these addresses is used to identify the node within the cluster. You can use the other addresses to handle client connections or connections to other systems.
You tell Vertica which address family to use when you install it. By default, Vertica uses IPv4 addresses for hosts. If you want the nodes in your database to use IPv6 addresses, add the --ipv6 option to the arguments you pass to the install_vertica script.
Note
You cannot change the address family a database cluster uses after you create it. For example, suppose you created a Vertica database using IPv4 addresses to identify the hosts in your cluster. Then you cannot later change the hosts to use an IPv6 address for internal communications.
In most cases, the address family you select does not impact how your database functions. However, there are a few exceptions:
-
Use IPv4 addresses to identify the nodes in your cluster if you want to use the Management Console to manage your database. Currently, the MC does not support databases that use IPv6 addresses.
-
If you select IPv6 addressing for your cluster, it automatically uses point-to-point networking mode.
-
Currently, AWS is the only cloud platform on which Vertica supports IPv6 addressing. To use IPv6 on AWS, you must identify cluster hosts using IP addresses instead of host names. The AWS DNS does not support resolving host names to IPv6.
-
If you only assign IPv6 addresses to the hosts in your database cluster, you may have problems interfacing to other systems that do not support IPv6.
Part of the information you pass to the install script is the list of hosts it will use to form the Vertica cluster. If you use host names in this list instead of IP addresses, ensure that the host names resolve to the IP address family you want to use for your cluster. For example, if you want your cluster to use IPv6 addresses, ensure your DNS or /etc/hosts file resolves the host names to IPv6 addresses.
You can configure DNS to return both IPv4 and IPv6 addresses for a host name. In this case, the installer uses the IPv4 address unless you supply the --ipv6 argument. If you use /etc/hosts for host name resolution (which is the best practice), host names cannot resolve to both IPv4 and IPv6 addresses.
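A quick way to see which address families a host name resolves to is getent, which consults the same resolver order (DNS or /etc/hosts) the system uses. The host name below is a placeholder; repeat the check for each cluster host:

```shell
# Check which address families a host name resolves to; all cluster hosts
# must resolve to the same family you choose at install time.
host=localhost          # placeholder: replace with each cluster host name
v4=$(getent ahostsv4 "$host" 2>/dev/null | awk '{print $1}' | sort -u | head -1)
v6=$(getent ahostsv6 "$host" 2>/dev/null | awk '{print $1}' | sort -u | head -1)
echo "IPv4: ${v4:-none}"
echo "IPv6: ${v6:-none}"
```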
Optionally run spread on a separate control network
If your query workloads are network intensive, you can use the --control-network parameter with the install_vertica script (see Installing Vertica with the installation script) to allow spread communications to be configured on a subnet that is different from other Vertica data communications.
The --control-network parameter accepts either the default value or a broadcast network IP address (for example, 192.168.10.255).
-
Verify that root can use Secure Shell (SSH) to log in to all hosts that are included in the cluster. SSH is a program for logging in to a remote machine and running commands on it.
-
If you do not already have SSH installed on all hosts, log in as root on each host and install it before installing Vertica. You can download a free version of the SSH connectivity tools from OpenSSH.
-
Make sure that /dev/pts is mounted. Installing Vertica on a host that is missing the mount point /dev/pts could result in the following error when you create a database:
TIMEOUT ERROR: Could not login with SSH. Here is what SSH said:Last login: Sat Dec 15 18:05:35 2007 from v_vmart_node0001
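To check the /dev/pts mount point ahead of time, a simple sketch using util-linux's mountpoint utility:

```shell
# Verify that /dev/pts is mounted before creating a database.
if mountpoint -q /dev/pts 2>/dev/null; then
  pts_status="mounted"
else
  pts_status="NOT mounted"
fi
echo "/dev/pts is $pts_status"
```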
Allow passwordless SSH access for the dbadmin user
The dbadmin user must be authorized for passwordless ssh. In typical installs, you won't need to change anything; however, if you set up your system to disallow passwordless login, you'll need to enable it for the dbadmin user. See Enable secure shell (SSH) logins.
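A quick way to confirm passwordless access is to attempt a BatchMode connection, which fails instead of prompting for a password. The host names below are placeholders; run this as dbadmin against your actual cluster hosts:

```shell
# As dbadmin, verify passwordless SSH to every cluster host; BatchMode
# makes ssh fail rather than prompt for a password.
hosts="node01 node02 node03"    # placeholder host names: replace with your cluster hosts
for host in $hosts; do
  if ssh -o BatchMode=yes -o ConnectTimeout=5 "$host" true 2>/dev/null; then
    echo "$host: passwordless SSH OK"
  else
    echo "$host: passwordless SSH FAILED"
  fi
done
```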
3.3.2.1 - Ensure ports are available
Verify that ports required by Vertica are not in use by running the following command as the root user and comparing it with the ports required, as shown below:
netstat -atupn
If you are using a Red Hat 7/CentOS 7 system, use the following command instead:
ss -atupn
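Rather than eyeballing the full socket listing, you can check the default Vertica, MC, and spread ports directly. This sketch filters the ss output per port; run it as root on each host:

```shell
# Check whether the default Vertica ports are already bound on this host.
for port in 5433 5434 5444 5450 4803 4804 6543; do
  if ss -lntu 2>/dev/null | awk '{print $5}' | grep -q ":$port\$"; then
    echo "port $port: IN USE"
  else
    echo "port $port: free"
  fi
done
```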
Firewall requirements
Vertica requires several ports to be open on the local network. Vertica does not recommend placing a firewall between nodes (all nodes should be behind the same firewall), but if you must use a firewall between nodes, ensure the following ports are available:
| Port | Protocol | Service | Notes |
|------|----------|---------|-------|
| 22 | TCP | sshd | Required by Administration tools and the Management Console Cluster Installation wizard. |
| 5433 | TCP | Vertica | Vertica client (vsql, ODBC, JDBC, etc) port. |
| 5434 | TCP | Vertica | Intra- and inter-cluster communication. Vertica opens the Vertica client port +1 (5434 by default) for intra-cluster communication, such as during a plan. If the port +1 from the default client port is not available, then Vertica opens a random port for intra-cluster communication. |
| 5433 | UDP | Vertica | Vertica spread monitoring and MC cluster import. |
| 5444 | TCP | Vertica Management Console | MC-to-node and node-to-node (agent) communications port. See Changing MC or agent ports. |
| 5450 | TCP | Vertica Management Console | Port used to connect to MC from a web browser and allows communication from nodes to the MC application/web server. See Connecting to Management Console. |
| 4803 | TCP | Spread | Client connections. |
| 8443 | TCP | HTTPS | Reserved. |
| 4803 | UDP | Spread | Daemon to daemon connections. |
| 4804 | UDP | Spread | Daemon to daemon connections. |
| 6543 | UDP | Spread | Monitor to daemon connection. |
3.3.3 - Operating system configuration task overview
This topic provides a high-level overview of the OS settings required for Vertica.
This topic provides a high-level overview of the OS settings required for Vertica. Each item provides a link to additional details about the setting and detailed steps on making the configuration change. The installer tests for all of these settings and provides hints, warnings, and failures if the current configuration does not meet Vertica requirements.
Before you install the operating system
| Configuration | Description |
|---------------|-------------|
| Supported Platforms | Verify that your servers meet the platform requirements described in Supported Platforms. Unsupported operating systems are detected by the installer. |
| LVM | Vertica Analytic Database supports Linux Volume Manager (LVM) on all supported operating systems. For information on LVM requirements and restrictions, see the section Vertica Support for LVM. |
| File system | Choose the storage format type based on deployment requirements (see Recommended storage format types). For the Vertica I/O profile, the ext4 file system is considerably faster than ext3. The storage format type at your backup and temporary directory locations must support fcntl lockf (POSIX) file locking. |
| Swap Space | A 2 GB swap partition is required. Partition the remaining disk space in a single partition under "/". |
| Disk Block Size | The disk block size for the Vertica data and catalog directories should be 4096 bytes, the default for ext4 file systems. |
| Memory | For more information on sizing your hardware, see the Vertica Hardware Planning Guide. |
Firewall considerations
| Configuration | Description |
|---------------|-------------|
| Firewall/Ports | Firewalls, if present, must be configured so as not to interfere with Vertica. |
General operating system configuration - automatically configured by the installer
These general OS settings are automatically made by the installer if they do not meet Vertica requirements. You can prevent the installer from automatically making these configuration changes by using the --no-system-configuration parameter for the install_vertica script.
| Configuration | Description |
|---------------|-------------|
| Nice Limits | The database administration user must be able to nice processes back to the default level of 0. |
| min_free_kbytes | The vm.min_free_kbytes setting in /etc/sysctl.conf must be configured sufficiently high. The specific value depends on your hardware configuration. |
| User Open Files Limit | The open file limit for the dbadmin user must be at least 65536 or the amount of RAM in MB, whichever is greater. |
| System Open File Limits | The maximum number of files open on the system must be at least the amount of memory in MB, but not less than 65536. |
| Pam Limits | /etc/pam.d/su must contain the line `session required pam_limits.so`. This allows the conveying of limits to commands run with the `su -` command. |
| Address Space Limits | The address space limits (as setting) defined in /etc/security/limits.conf must be unlimited for the database administrator. |
| File Size Limits | The file size limits (fsize setting) defined in /etc/security/limits.conf must be unlimited for the database administrator. |
| User Process Limits | The nproc setting defined in /etc/security/limits.conf must be 1024 or the amount of memory in MB, whichever is greater. |
| Maximum Memory Maps | The vm.max_map_count setting in /etc/sysctl.conf must be 65536 or the amount of memory in KB / 16, whichever is greater. |
General operating system configuration - manual configuration
The following general OS settings must be configured manually.
| Configuration | Description |
|---------------|-------------|
| Disk Readahead | The disk readahead must be at least 2048, with an upper bound of 8192. Set this upper bound only with the help of Vertica support. The specific value depends on your hardware configuration. |
| NTP Services | The NTP daemon must be enabled and running, with the exception of Red Hat 7 and CentOS 7 systems. |
| chrony | For Red Hat 7 and CentOS 7 systems, chrony must be enabled and running. |
| SELinux | SELinux must be disabled or run in permissive mode. |
| CPU Frequency Scaling | Vertica recommends that you disable CPU frequency scaling. Important: your systems may use significantly more energy when CPU frequency scaling is disabled. |
| Transparent Hugepages | For Red Hat 7, CentOS 7, and Amazon Linux 2.0, Transparent Hugepages must be set to always. For all other operating systems, Transparent Hugepages must be disabled or set to madvise. |
| I/O Scheduler | The I/O scheduler for disks used by Vertica must be set to deadline or noop. |
| Support Tools | Several optional packages can be installed to assist Vertica support when troubleshooting your system. |
System user requirements
The following tasks pertain to the configuration of the system user required by Vertica.
| Configuration | Required Setting(s) |
|---------------|---------------------|
| System User Requirements | The installer automatically creates a user with the correct settings. If you specify a user with --dba-user, then the user must conform to the requirements for the Vertica system user. |
| LANG Environment Settings | The LANG environment variable must be set and valid for the database administration user. |
| TZ Environment Settings | The TZ environment variable must be set and valid for the database administration user. |
3.3.3.1 - Operating system prerequisites
The topics in this section detail system settings that must be configured when you install the operating system.
The topics in this section detail system settings that must be configured when you install the operating system. These settings cannot be easily changed after the operating system is installed.
3.3.3.1.1 - Supported platforms
The Vertica installer checks the type of operating system that is installed.
The Vertica installer checks the type of operating system that is installed. If the operating system does not meet one of the supported operating systems (See Vertica server and Management Console), or the operating system cannot be determined, then the installer halts.
The installer generates one of the following issue identifiers if it detects an unsupported operating system:
- [S0320] - Fedora OS is not supported.
- [S0321] - The version of Red Hat/CentOS is not supported.
- [S0322] - The version of Ubuntu/Debian is not supported.
- [S0323] - The operating system could not be determined. The unknown operating system is not supported because it does not match the list of supported operating systems.
- [S0324] - The version of Red Hat is not supported.
3.3.3.1.2 - Recommended storage format types
Choose the storage format type based on deployment requirements. Vertica recommends the following storage format types where applicable:
Note
For the Vertica I/O profile, the ext4 file system is considerably faster than ext3.
The storage format type at your backup and temporary directory locations must support fcntl lockf (POSIX) file locking.
3.3.3.1.3 - Swap space requirements
Vertica requires at least a 2 GB swap partition, regardless of the amount of RAM installed on your system.
Vertica requires at least a 2 GB swap partition, regardless of the amount of RAM installed on your system. The installer reports this issue with identifier S0180.
For typical installations, Vertica recommends that you partition your system with a 2 GB primary partition for swap, regardless of the amount of installed RAM. Larger swap space is acceptable but unnecessary.
Note
Do not place a swap file on a disk containing the Vertica data files. If a host has only two disks (boot and data), put the swap file on the boot disk.
If you do not have at least a 2 GB swap partition then you may experience performance issues when running Vertica.
You typically define the swap partition when you install Linux. See your platform’s documentation for details on configuring the swap partition.
3.3.3.1.4 - Disk block size requirements
Vertica recommends that your disk block size be 4096 bytes, the default on ext4 and XFS file systems.
Vertica recommends that your disk block size be 4096 bytes, the default on ext4 and XFS file systems.
You set the disk block size when you format your file system. If you change the block size, you will need to reformat the disk.
For more information, see Recommended storage format types.
3.3.3.1.5 - Memory requirements
Vertica requires that your hosts have a minimum of 1GB of RAM per logical processor.
Individual host requirements
Vertica requires that your hosts have a minimum of 1GB of RAM per logical processor. If your hosts do not meet this requirement, the installer reports this issue with the identifier S0190.
For performance reasons, you typically require more RAM than the minimum. For more information on sizing your hardware, see the Vertica Knowledge Base Hardware documents.
RAM should be identical on all hosts
In addition to the individual host RAM requirement, the installer also reports a hint if the hosts in your cluster do not have identical amounts of RAM. Ensuring that your hosts have the same amount of RAM helps prevent performance issues if one or more nodes have less RAM than the other nodes in your database.
Note
In an Eon Mode database, after you create the initial cluster, you can configure subclusters that have different hardware specifications (including RAM) than the initial primary subcluster the installer creates.
3.3.3.2 - Firewall considerations
Vertica requires multiple ports be open between nodes.
Vertica requires multiple ports be open between nodes. You may use a firewall (iptables) on Red Hat/CentOS and Ubuntu/Debian based systems. Firewall use is not supported on SuSE systems; SuSE systems must disable the firewall. The installer reports issues found with your iptables configuration with the identifiers N0010 (for systems that use iptables) and N011 (for SuSE systems).
The installer checks the IP tables configuration and issues a warning if there are any configured rules or chains. The installer does not detect if the configuration may conflict with Vertica. It is your responsibility to verify that your firewall allows traffic for Vertica as described in Ensure ports are available.
Note
The installer does not check NAT entries in iptables.
You can modify your firewall to allow for Vertica network traffic, or you can disable the firewall if your network is secure. Note that firewalls are not supported for Vertica systems running on SuSE.
Important
You may encounter the N0010 issue even when the firewall is disabled. If this occurs, you can work around this issue and install Vertica by ignoring installer WARN messages. To do this, install (or update) with a failure threshold of FAIL. For example: /opt/vertica/sbin/install_vertica --failure-threshold FAIL <other install options...>
Red Hat 6 and CentOS 6 systems
For details on how to configure iptables and allow specific ports to be open, see the platform-specific documentation for your platform:
To disable iptables, run the following commands as root or sudo:
# service iptables save
# service iptables stop
# chkconfig iptables off
To disable iptables if you are using the IPv6 version of iptables, run the following commands as root or sudo:
# service ip6tables save
# service ip6tables stop
# chkconfig ip6tables off
Red Hat 7 and CentOS 7 systems
To disable the system firewall, run the following commands as root or sudo:
# systemctl mask firewalld
# systemctl disable firewalld
# systemctl stop firewalld
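If you prefer to keep firewalld enabled rather than disabling it, the required ports from the table in Ensure ports are available can be opened instead. The following is a sketch, printed as a dry run so it is safe to test; remove the leading `echo` and run as root to actually apply the rules:

```shell
# Dry-run sketch: print the firewall-cmd invocations that would open
# Vertica's required ports. Remove the "echo" prefix to apply (as root).
tcp_ports='5433 5434 5444 5450 4803 8443'
udp_ports='5433 4803 4804 6543'
for p in $tcp_ports; do
  echo firewall-cmd --permanent --add-port=${p}/tcp
done
for p in $udp_ports; do
  echo firewall-cmd --permanent --add-port=${p}/udp
done
echo firewall-cmd --reload
```

Port 22 (sshd) is typically allowed by the default firewalld configuration and is therefore omitted here.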
Ubuntu and debian systems
For details on how to configure iptables and allow specific ports to be open, see the platform-specific documentation for your platform:
Note
Ubuntu uses the ufw program to manage iptables.
To disable iptables on Debian, run the following command as root or sudo:
/etc/init.d/iptables stop
update-rc.d -f iptables remove
To disable iptables on Ubuntu, run the following command:
sudo ufw disable
SuSE systems
The firewall must be disabled on SUSE systems. To disable the firewall on SuSE systems, run the following command:
/sbin/SuSEfirewall2 off
3.3.3.3 - Port availability
The script checks that required ports are open and available to Vertica.
The install_vertica script checks that required ports are open and available to Vertica. The installer reports any issues with identifier N0020.
The following table lists the ports required by Vertica.
| Port | Protocol | Service | Notes |
|------|----------|---------|-------|
| 22 | TCP | sshd | Required by Administration tools and the Management Console Cluster Installation wizard. |
| 5433 | TCP | Vertica | Vertica client (vsql, ODBC, JDBC, etc) port. |
| 5434 | TCP | Vertica | Intra- and inter-cluster communication. Vertica opens the Vertica client port +1 (5434 by default) for intra-cluster communication, such as during a plan. If the port +1 from the default client port is not available, then Vertica opens a random port for intra-cluster communication. |
| 5433 | UDP | Vertica | Vertica spread monitoring and MC cluster import. |
| 5444 | TCP | Vertica Management Console | MC-to-node and node-to-node (agent) communications port. See Changing MC or agent ports. |
| 5450 | TCP | Vertica Management Console | Port used to connect to MC from a web browser and allows communication from nodes to the MC application/web server. See Connecting to Management Console. |
| 4803 | TCP | Spread | Client connections. |
| 8443 | TCP | HTTPS | Reserved. |
| 4803 | UDP | Spread | Daemon-to-daemon connections. |
| 4804 | UDP | Spread | Daemon-to-daemon connections. |
| 6543 | UDP | Spread | Monitor-to-daemon connection. |
3.3.3.4 - General operating system configuration - automatically configured by the installer
These general Operating System settings are automatically made by the installer if they do not meet Vertica requirements.
These general Operating System settings are automatically made by the installer if they do not meet Vertica requirements. You can prevent the installer from automatically making these configuration changes by using the --no-system-configuration parameter for the install_vertica script.
3.3.3.4.1 - Sysctl
During installation, Vertica attempts to automatically change various OS level settings.
During installation, Vertica attempts to automatically change various OS-level settings. The installer may not change values on your system if they exceed the threshold required by the installer. You can prevent the installer from automatically making these configuration changes by using the --no-system-configuration parameter for the install_vertica script.
To permanently edit certain settings and prevent them from reverting on reboot, use sysctl.
The sysctl settings relevant to the installation of Vertica include:
Permanently changing settings with sysctl:
1. As the root user, open the /etc/sysctl.conf file:

   # vi /etc/sysctl.conf

2. Enter a parameter and value:

   parameter = value

   For example, to set the parameter and value for fs.file-max to meet Vertica requirements, enter:

   fs.file-max = 65536

3. Save your changes, and close the /etc/sysctl.conf file.

4. As the root user, reload the config file:

   # sysctl -p
Identifying settings added by the installer
You can see whether the installer has added a setting by opening the /etc/sysctl.conf file:
# vi /etc/sysctl.conf
If the installer has added a setting, the following line appears:
# The following 1 line added by Vertica tools. 2015-02-23 13:20:29
parameter = value
3.3.3.4.2 - Nice limits configuration
The Vertica system user (dbadmin by default) must be able to raise and lower the priority of Vertica processes.
The Vertica system user (dbadmin by default) must be able to raise and lower the priority of Vertica processes. To do this, the nice option in the /etc/security/limits.conf file must include an entry for the dbadmin user. The installer reports this issue with the identifier S0010.
The installer automatically configures the correct setting if the default value does not meet system requirements. If an issue occurs when setting this value, or you use the --no-system-configuration argument to the installer and the current setting is incorrect, then the installer reports this as an issue.
Note
Vertica never raises priority above the default level of 0. However, Vertica does lower the priority of certain Vertica threads and needs to be able to raise the priority of these threads back up to the default level. This setting allows Vertica to raise the priorities back to the default level.
All systems
To set the nice limit configuration for the dbadmin user, edit /etc/security/limits.conf and add the following line. Replace dbadmin with the name of your system user.
dbadmin - nice 0
3.3.3.4.3 - min_free_kbytes setting
This topic details how to update the min_free_kbytes setting so that it is within the range supported by Vertica.
This topic details how to update the min_free_kbytes setting so that it is within the range supported by Vertica. The installer reports this issue with the identifier: S0050 if the setting is too low, or S0051 if the setting is too high.
The vm.min_free_kbytes setting configures the page reclaim thresholds. When this value is increased, the system starts reclaiming memory earlier; when it is lowered, the system starts reclaiming memory later. The default min_free_kbytes is calculated at boot time based on the number of pages of physical RAM available on the system.
The setting must be whichever value is the greatest from the following options:
- The default value configured by the system
- 4096
- The result of running the commands:

  $ memtot=`grep MemTotal /proc/meminfo | awk '{printf "%.0f",$2}'`
  $ echo "scale=0;sqrt ($memtot*16)" | bc
The installer automatically configures the correct setting if the default value does not meet system requirements. If an issue occurs when setting this value, or you use the --no-system-configuration argument to the installer and the current setting is incorrect, then the installer reports this as an issue.
All systems
To manually set min_free_kbytes:
1. Determine the current/default setting with the following command:

   $ sysctl vm.min_free_kbytes

2. If the result of the previous command is No such file or directory or the default value is less than 4096, then run these commands to determine the correct value:

   $ memtot=`grep MemTotal /proc/meminfo | awk '{printf "%.0f",$2}'`
   $ echo "scale=0;sqrt ($memtot*16)" | bc

3. Edit or add the current value of vm.min_free_kbytes in /etc/sysctl.conf with the value from the output of the previous command.

   # The min_free_kbytes setting
   vm.min_free_kbytes=16132

4. Run sysctl -p to apply the changes in sysctl.conf immediately.
Note
These steps must be repeated for each node in the cluster.
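The selection rule above (greatest of the system default, 4096, and sqrt(MemTotal*16)) can be sketched as a small script. This is a sketch, not the installer's own check; it uses awk's sqrt in place of bc, which may not be installed on minimal systems:

```shell
# Sketch: pick the greatest of (current value, 4096, sqrt(MemTotal*16))
# as the recommended vm.min_free_kbytes value.
memtot=$(awk '/MemTotal/ {print $2}' /proc/meminfo)
calc=$(awk -v m="$memtot" 'BEGIN {printf "%d", sqrt(m*16)}')
cur=$(sysctl -n vm.min_free_kbytes 2>/dev/null || echo 0)
rec=$calc
if [ "$rec" -lt 4096 ]; then rec=4096; fi
if [ "$cur" -gt "$rec" ]; then rec=$cur; fi
echo "recommended vm.min_free_kbytes: $rec"
```

If the recommended value exceeds the current setting, update /etc/sysctl.conf as described in the steps above.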
3.3.3.4.4 - User max open files limit
This topic details how to change the user max open-files limit setting to meet Vertica requirements.
This topic details how to change the user max open-files limit setting to meet Vertica requirements. The installer reports this issue with the identifier S0060.
The installer automatically configures the correct setting if the default value does not meet system requirements. If an issue occurs when setting this value, or you use the --no-system-configuration argument to the installer and the current setting is incorrect, then the installer reports this as an issue.
Vertica requires that the dbadmin user not be limited when opening files. The open file limit per user is calculated as follows:
user max open files = greater of { ≥ 65536 | ≤ RAM-MBs }
As the dbadmin user, you can determine the open file limit by running ulimit -n. For example:

$ ulimit -n
65536

To manually set the limit, edit /etc/security/limits.conf and edit/add the nofile setting for the user who is configured as the database administrator (by default, dbadmin). For example:

dbadmin - nofile 65536

The setting must be no less than 65536, and no greater than the system value of fs.nr_open. For example, the default fs.nr_open value on Red Hat Enterprise Linux 9 is 1048576.
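The "greater of 65536 or RAM in MB" rule can be checked with a short script. This is a sketch; run it as the dbadmin user so that ulimit -n reflects that user's limit:

```shell
# Sketch: required nofile value is the greater of 65536 and RAM in MB,
# compared with the current per-user open file limit.
ram_mb=$(awk '/MemTotal/ {printf "%d", $2/1024}' /proc/meminfo)
need=$(( ram_mb > 65536 ? ram_mb : 65536 ))
cur=$(ulimit -n)
echo "required=$need current=$cur"
```

If the current value is below the required one, add the nofile entry to /etc/security/limits.conf as shown above.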
3.3.3.4.5 - System max open files limit
This topic details how to modify the limit for the number of open files on your system so that it meets Vertica requirements.
This topic details how to modify the limit for the number of open files on your system so that it meets Vertica requirements. The installer reports this issue with the identifier: S0120.
Vertica opens many files. Some platforms have global limits on the number of open files. The open file limit must be set sufficiently high so as not to interfere with database operations.
The recommended value is at least the amount of memory in MB, but not less than 65536.
The installer automatically configures the correct setting if the default value does not meet system requirements. If an issue occurs when setting this value, or you use the --no-system-configuration argument to the installer and the current setting is incorrect, then the installer reports this as an issue.
All systems
To manually set the open file limit:
1. Run /sbin/sysctl fs.file-max to determine the current limit.

2. If the limit is not 65536 or the amount of system memory in MB (whichever is higher), then edit or add fs.file-max=<max number of files> to /etc/sysctl.conf.

   # Controls the maximum number of open files
   fs.file-max=65536

3. Run sysctl -p to apply the changes in sysctl.conf immediately.
Note
These steps must be repeated for each node in the cluster.
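The comparison in step 2 can be scripted as follows. This is a sketch under the stated rule (memory in MB, but not less than 65536):

```shell
# Sketch: compute the recommended fs.file-max and compare with the
# currently configured kernel-wide value.
mem_mb=$(awk '/MemTotal/ {printf "%d", $2/1024}' /proc/meminfo)
want=$(( mem_mb > 65536 ? mem_mb : 65536 ))
cur=$(cat /proc/sys/fs/file-max)
echo "required=$want current=$cur"
```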
3.3.3.4.6 - Pam limits
This topic details how to enable the "su" pam_limits.so module required by Vertica.
This topic details how to enable the "su" pam_limits.so module required by Vertica. The installer reports issues with the setting with the identifier: S0070.
On some systems the pam module called pam_limits.so is not set in the file /etc/pam.d/su. When it is not set, it prevents the conveying of limits (such as open file descriptors) to any command started with su -.
In particular, the Vertica init script would fail to start Vertica because it calls the Administration Tools to start a database with the su - command. This problem was first noticed on Debian systems, but the configuration could be missing on other Linux distributions. See the pam_limits man page for more details.
The installer automatically configures the correct setting if the default value does not meet system requirements. If an issue occurs when setting this value, or you use the --no-system-configuration
argument to the installer and the current setting is incorrect, then the installer reports this as an issue.
All systems
To manually configure this setting, append the following line to the /etc/pam.d/su file:
session required pam_limits.so
See the pam_limits man page for more details: man pam_limits.
3.3.3.4.7 - pid_max setting
This topic explains how to change pid_max to a supported value.
This topic explains how to change pid_max to a supported value. The value of pid_max should be:

pid_max = num-user-proc + 2**15 = num-user-proc + 32768

where num-user-proc is the size of memory in megabytes.

The minimum value for pid_max is 524288.

If your pid_max value is too low, the installer reports this problem and indicates the minimum value.
The installer automatically configures the correct setting if the default value does not meet system requirements. If an issue occurs when setting this value, or you use the --no-system-configuration
argument to the installer and the current setting is incorrect, then the installer reports this as an issue.
All systems
To change the pid_max value:
# sysctl -w kernel.pid_max=524288
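The formula above translates to a small check. This is a sketch, with num-user-proc taken as the memory size in MB per the definition above:

```shell
# Sketch: required kernel.pid_max = memory-in-MB + 32768, with a floor
# of 524288, compared against the running kernel value.
mem_mb=$(awk '/MemTotal/ {printf "%d", $2/1024}' /proc/meminfo)
need=$(( mem_mb + 32768 ))
if [ "$need" -lt 524288 ]; then need=524288; fi
cur=$(cat /proc/sys/kernel/pid_max)
echo "required=$need current=$cur"
```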
3.3.3.4.8 - User address space limits
This topic details how to modify the Linux address space limit for the dbadmin user so that it meets Vertica requirements.
This topic details how to modify the Linux address space limit for the dbadmin user so that it meets Vertica requirements. The address space setting controls the maximum number of threads and processes for each user. If this setting does not meet the requirements then the installer reports this issue with the identifier: S0090.
The installer automatically configures the correct setting if the default value does not meet system requirements. If an issue occurs when setting this value, or you use the --no-system-configuration argument to the installer and the current setting is incorrect, then the installer reports this as an issue.
The address space available to the dbadmin user must not be reduced via user limits and must be set to unlimited.
All systems
To manually set the address space limit:
1. Run ulimit -v as the dbadmin user to determine the current limit.

2. If the limit is not unlimited, then add the following line to /etc/security/limits.conf. Replace dbadmin with your database admin user.

   dbadmin - as unlimited
3.3.3.4.9 - User file size limit
This topic details how to modify the file size limit for files on your system so that it meets Vertica requirements.
This topic details how to modify the file size limit for files on your system so that it meets Vertica requirements. The installer reports this issue with the identifier: S0100.
The installer automatically configures the correct setting if the default value does not meet system requirements. If an issue occurs when setting this value, or you use the --no-system-configuration argument to the installer and the current setting is incorrect, then the installer reports this as an issue.
The file size limit for the dbadmin user must not be reduced via user limits and must be set to unlimited.
All systems
To manually set the file size limit:
1. Run ulimit -f as the dbadmin user to determine the current limit.

2. If the limit is not unlimited, then edit/add the following line to /etc/security/limits.conf. Replace dbadmin with your database admin user.

   dbadmin - fsize unlimited
3.3.3.4.10 - User process limit
This topic details how to change the user process limit so that it meets Vertica requirements. The installer reports this issue with the identifier S0110.
This topic details how to change the user process limit so that it meets Vertica requirements. The installer reports this issue with the identifier S0110.
The installer automatically configures the correct setting if the default value does not meet system requirements. If an issue occurs when setting this value, or you use the --no-system-configuration argument to the installer and the current setting is incorrect, then the installer reports this as an issue.
The user process limit must be high enough to allow for the many threads opened by Vertica. The recommended limit is the amount of RAM in MB and must be at least 1024.
All systems
To manually set the user process limit:
1. Run ulimit -u as the dbadmin user to determine the current limit.

2. If the limit is not the amount of memory in MB on the server, then edit/add the following line to /etc/security/limits.conf. Replace 4096 with the amount of system memory, in MB, on the server.

   dbadmin - nproc 4096
3.3.3.4.11 - Maximum memory maps configuration
This topic details how to modify the limit for the number of memory maps a process can have on your system so that it meets Vertica requirements.
This topic details how to modify the limit for the number of memory maps a process can have on your system so that it meets Vertica requirements. The installer reports this issue with the identifier S0130.
The installer automatically configures the correct setting if the default value does not meet system requirements. If an issue occurs when setting this value, or you use the --no-system-configuration argument to the installer and the current setting is incorrect, then the installer reports this as an issue.
Vertica uses a lot of memory while processing and can approach the default limit for memory maps per process.
The recommended value is at least the amount of memory on the system in KB / 16, but not less than 65536.
All systems
To manually set the memory map limit:
1. Run /sbin/sysctl vm.max_map_count to determine the current limit.

2. If the limit is not 65536 or the amount of system memory in KB / 16 (whichever is higher), then edit/add the following line to /etc/sysctl.conf. Replace 65536 with the value for your system.

   # The following 1 line added by Vertica tools. 2014-03-07 13:20:31
   vm.max_map_count=65536

3. Run sysctl -p to apply the changes in sysctl.conf immediately.
Note
These steps must be repeated for each node in the cluster.
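The comparison in step 2 can be scripted as follows (a sketch under the stated rule: memory in KB / 16, floor 65536):

```shell
# Sketch: recommended vm.max_map_count is memory-in-KB / 16, but not
# less than 65536, compared with the current kernel value.
mem_kb=$(awk '/MemTotal/ {print $2}' /proc/meminfo)
want=$(( mem_kb / 16 ))
if [ "$want" -lt 65536 ]; then want=65536; fi
cur=$(cat /proc/sys/vm/max_map_count)
echo "required=$want current=$cur"
```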
3.3.3.5 - General operating system configuration - manual configuration
The following general Operating System settings must be configured manually.
The following general Operating System settings must be configured manually.
3.3.3.5.1 - Persisting operating system settings
Vertica requires that you manually configure several general operating system settings.
Vertica requires that you manually configure several general operating system settings. You should configure some of these settings in the /etc/rc.local script, to prevent them from reverting on reboot. This script contains scripts and commands that run each time the system is booted.
Important
On reboot, SUSE systems use the /etc/init.d/after.local file rather than /etc/rc.local.
Vertica uses settings in /etc/rc.local to set the following functionality:
Editing /etc/rc.local
1. As the root user, open /etc/rc.local:

   # vi /etc/rc.local

2. Enter a script or command. For example, to configure transparent hugepages to meet Vertica requirements, enter the following:

   echo never > /sys/kernel/mm/redhat_transparent_hugepage/enabled

   Important: on some Ubuntu/Debian systems, the last line in /etc/rc.local must be exit 0. All additions to /etc/rc.local must precede this line.

3. Save your changes, and close /etc/rc.local.

4. If you use Red Hat 7.0 or CentOS 7.0 or higher, run the following command as root or sudo:

   $ chmod +x /etc/rc.d/rc.local
On reboot, the command runs during startup. You can also run the command manually as the root user, if you want it to take effect immediately.
Disabling tuning system service
If you use Red Hat 7.0 or CentOS 7.0 or higher, make sure the tuning system service does not start on boot. Turning off tuning prevents it from monitoring your OS and tuning your OS based on that monitoring. The tuning service also silently enables transparent hugepages, which can cause issues in other areas such as readahead.
Run the following command as sudo or root:
$ chkconfig tuned off
3.3.3.5.2 - SUSE control groups configuration
On SuSE 12, the installer checks the control group (cgroup) settings for the cgroups that Vertica may run under.
On SuSE 12, the installer checks the control group (cgroup) setting for the cgroups that Vertica may run under:
- verticad
- vertica_agent
- sshd
The installer verifies that the pids.max resource is large enough for all the threads that Vertica creates. The installer checks the contents of:

- /sys/fs/cgroup/pids/system.slice/verticad.service/pids.max
- /sys/fs/cgroup/pids/system.slice/vertica_agent.service/pids.max
- /sys/fs/cgroup/pids/system.slice/sshd.service/pids.max
If these files exist and they fail to include the value max, the installation stops and the installer returns a failure message (code S0340).
If these files do not exist, they are created automatically when systemd runs the verticad and vertica_agent startup scripts. However, their default values are determined by your site's cgroup configuration process; Vertica does not change the defaults.
Pre-installation configuration
Before installing Vertica, configure your system as follows:
# Create the following directories:
sudo mkdir /sys/fs/cgroup/pids/system.slice/verticad.service/
sudo mkdir /sys/fs/cgroup/pids/system.slice/vertica_agent.service/
# sshd service dir should already exist, so don't need to create it
# Set pids.max values:
sudo sh -c 'echo "max" > /sys/fs/cgroup/pids/system.slice/verticad.service/pids.max'
sudo sh -c 'echo "max" > /sys/fs/cgroup/pids/system.slice/vertica_agent.service/pids.max'
sudo sh -c 'echo "max" > /sys/fs/cgroup/pids/system.slice/sshd.service/pids.max'
Persisting configuration for restart
After installation, you can configure control groups for subsequent reboots of the Vertica database. You do so by editing configuration file /etc/init.d/after.local
and adding the commands shown earlier.
Note
Because after.local
is executed as root, it can omit sudo
commands.
3.3.3.5.3 - Cron required for scheduled jobs
Admintools uses the Linux cron package to schedule jobs that regularly rotate the database logs.
Admintools uses the Linux cron
package to schedule jobs that regularly rotate the database logs. Without this package installed, the database logs will never be rotated. The lack of rotation can lead to a significant consumption of storage for logs. On busy clusters, Vertica can produce hundreds of gigabytes of logs per day.
cron
is installed by default on most Linux distributions, but it may not be present on some SUSE 12 systems.
To install cron
, run this command:
$ sudo zypper install cron
3.3.3.5.4 - Disk readahead
This topic details how to change Disk Readahead to a supported value.
This topic details how to change Disk Readahead to a supported value. Vertica requires that Disk Readahead be set to at least 2048. The installer reports this issue with the identifier: S0020.
Note
-
These commands must be executed with root privileges and assume that the blockdev program is in /sbin
.
-
The blockdev program operates on whole devices, not on individual partitions. You cannot set different readahead values for partitions on the same device. If you run blockdev against a partition, for example /dev/sda1, the setting is applied to the entire /dev/sda device. For instance, running /sbin/blockdev --setra 2048 /dev/sda1
also causes /dev/sda2 through /dev/sdaN to use a readahead value of 2048.
RedHat/CentOS and SuSE based systems
For each drive in the Vertica system, Vertica recommends that you set the readahead value to at least 2048 for most deployments. The first command below immediately changes the readahead value for the specified disk. The second command adds the setting to /etc/rc.local
so that it is applied each time the system is booted. Note that some deployments may require a higher value; under the guidance of support, the setting can be raised as high as 8192.
Note
For systems that do not support /etc/rc.local
, use the equivalent startup script that is run after the destination runlevel has been reached. For example SUSE uses /etc/init.d/after.local
.
The following example sets the readahead value of the drive sda to 2048:
$ /sbin/blockdev --setra 2048 /dev/sda
$ echo '/sbin/blockdev --setra 2048 /dev/sda' >> /etc/rc.local
If you are using Red Hat 7.0 or CentOS 7.0 or higher, run the following command as root or sudo:
$ chmod +x /etc/rc.d/rc.local
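As a unit check (based on the blockdev(8) convention that --setra counts 512-byte sectors, while the kernel's /sys/block/<device>/queue/read_ahead_kb file reports the same setting in kibibytes), the recommended 2048-sector readahead corresponds to 1024 KB:

```shell
# blockdev --setra takes 512-byte sectors; the kernel exposes the same
# setting in KB via /sys/block/<device>/queue/read_ahead_kb.
sectors=2048
echo $(( sectors * 512 / 1024 ))    # prints 1024 (KB)
```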
Ubuntu and debian systems
For each drive in the Vertica system, set the readahead value to 2048. Run the command once in your shell, then add the command to /etc/rc.local
so that the setting is applied each time the system is booted. Note that on Ubuntu systems, the last line in rc.local must be "exit 0
", so you must manually add the following line to /etc/rc.local
before the exit 0
line.
Note
For systems that do not support /etc/rc.local
, use the equivalent startup script that is run after the destination runlevel has been reached. For example SuSE uses /etc/init.d/after.local
.
/sbin/blockdev --setra 2048 /dev/sda
3.3.3.5.5 - I/O scheduling
This topic details how to change I/O Scheduling to a supported scheduler.
This topic details how to change I/O Scheduling to a supported scheduler. Vertica requires that I/O Scheduling be set to deadline
or noop
. The installer checks what scheduler the system is using, reporting an unsupported scheduler issue with identifier: S0150. If the installer cannot detect the type of scheduler in use (typically if your system is using a RAID array), it reports that issue with identifier: S0151.
If your system is not using a RAID array, then complete the following steps to change your system to a supported I/O Scheduler. If you are using a RAID array, then consult your RAID vendor documentation for the best performing scheduler for your hardware.
The Linux kernel can use several different I/O schedulers to prioritize disk input and output. Most Linux distributions use the Completely Fair Queuing (CFQ) scheme by default, which gives input and output requests equal priority. This scheduler is efficient on systems running multiple tasks that need equal access to I/O resources. However, it can create a bottleneck when used on Vertica drives containing the catalog and data directories, because it gives write requests equal priority to read requests, and its per-process I/O queues can penalize processes making more requests than other processes.
Instead of the CFQ scheduler, configure your hosts to use either the Deadline or NOOP I/O scheduler for the drives containing the catalog and data directories:
-
The Deadline scheduler gives priority to read requests over write requests. It also imposes a deadline on all requests. After reaching the deadline, such requests gain priority over all other requests. This scheduling method helps prevent processes from becoming starved for I/O access. The Deadline scheduler is best used on physical media drives (disks using spinning platters), since it attempts to group requests for adjacent sectors on a disk, lowering the time the drive spends seeking.
-
The NOOP scheduler uses a simple FIFO approach, placing all input and output requests into a single queue. This scheduler is best used on solid state drives (SSDs). Because SSDs do not have a physical read head, no performance penalty exists when accessing non-adjacent sectors.
Failure to use one of these schedulers for the Vertica drives containing the catalog and data directories can result in slower database performance. Other drives on the system (such as the drive containing swap space, log files, or the Linux system files) can still use the default CFQ scheduler (although you should always use the NOOP scheduler for SSDs).
There are two ways for you to set the scheduler used by your disk devices:
-
Write the name of the scheduler to a file in the /sys
directory.
--or--
-
Use a kernel boot parameter.
You can view and change the scheduler Linux uses for I/O requests to a single drive using a virtual file under the /sys
directory. The name of the file that controls the scheduler a block device uses is:
/sys/block/deviceName/queue/scheduler
Where deviceName
is the name of the disk device, such as sda
or cciss!c0d1
(the first disk on an OpenText RAID array). Viewing the contents of this file shows you all of the possible settings for the scheduler. The currently selected scheduler is surrounded by square brackets:
# cat /sys/block/sda/queue/scheduler
noop deadline [cfq]
To change the scheduler, write the name of the scheduler you want the device to use to its scheduler file. You must have root privileges to write to this file. For example, to set the sda drive to use the deadline scheduler, run the following command as root:
# echo deadline > /sys/block/sda/queue/scheduler
# cat /sys/block/sda/queue/scheduler
noop [deadline] cfq
Changing the scheduler immediately affects the I/O requests for the device. The Linux kernel starts using the new scheduler for all of the drive's input and output requests.
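Because the active scheduler is the bracketed entry, a script can extract it to verify each data drive. The following sed one-liner is an illustrative sketch (current_scheduler is an invented helper name):

```shell
# Print the bracketed (active) scheduler from a
# /sys/block/<device>/queue/scheduler line read on stdin.
current_scheduler() {
    sed -n 's/.*\[\([^]]*\)\].*/\1/p'
}

echo 'noop deadline [cfq]' | current_scheduler    # prints cfq
```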
Note
While tests show that changing the scheduler settings while Vertica is running does not cause problems, Vertica recommends shutting down any running database before you change the I/O scheduler or make any other changes to the system configuration.
Changes to the I/O scheduler made through the /sys
directory only last until the system is rebooted, so you need to add the commands that change the I/O scheduler to a startup script (such as those stored in /etc/init.d
, or though a command in /etc/rc.local
). You also need to use a separate command for each drive on the system whose scheduler you want to change.
For example, the following commands make the configuration take effect immediately and add it to rc.local so that it is applied on subsequent reboots:
Note
For systems that do not support /etc/rc.local
, use the equivalent startup script that is run after the destination runlevel has been reached. For example SuSE uses /etc/init.d/after.local
.
echo deadline > /sys/block/sda/queue/scheduler
echo 'echo deadline > /sys/block/sda/queue/scheduler' >> /etc/rc.local
Note
On some Ubuntu/Debian systems, the last line in rc.local must be "exit 0
", so you must manually add the preceding echo lines to /etc/rc.local
before the exit 0
line.
You may prefer to use this method of setting the I/O scheduler over using a boot parameter if your system has a mix of solid-state and physical media drives, or has many drives that do not store Vertica catalog and data directories.
If you are using Red Hat 7.0 or CentOS 7.0 or higher, run the following command as root or sudo:
$ chmod +x /etc/rc.d/rc.local
Use the elevator
kernel boot parameter to change the default scheduler used by all disks on your system. This is the best method to use if most or all of the drives on your hosts are of the same type (physical media or SSD) and will contain catalog or data files. You can also use the boot parameter to change the default to the scheduler the majority of the drives on the system need, then use the /sys
files to change individual drives to another I/O scheduler. The format of the elevator boot parameter is:
elevator=schedulerName
Where schedulerName
is deadline
, noop
, or cfq
. You set the boot parameter using your bootloader (grub or grub2 on most recent Linux distributions). See your distribution's documentation for details on how to add a kernel boot parameter.
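On grub2-based distributions, this typically means editing /etc/default/grub and regenerating the boot configuration. The fragment below is a sketch only; exact file locations and the regeneration command vary by distribution, so consult your distribution's documentation:

```shell
# Sketch for grub2 systems (paths vary by distribution):
# in /etc/default/grub, append the parameter to the kernel command line,
# keeping your existing options in place of "..."
GRUB_CMDLINE_LINUX="... elevator=deadline"
# then regenerate the boot configuration, for example on Red Hat/CentOS:
# grub2-mkconfig -o /boot/grub2/grub.cfg
```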
3.3.3.5.6 - Enabling or disabling transparent hugepages
You can modify transparent hugepages to meet Vertica configuration requirements.
You can modify transparent hugepages to meet Vertica configuration requirements:
-
For Red Hat 7/CentOS 7 and Amazon Linux 2.0, you must enable transparent hugepages. The installer reports this issue with the identifier: S0312.
-
For Red Hat 8/CentOS 8 and SUSE 15.1, Vertica provides recommended settings to optimize your system performance by workload.
-
For all other systems, you must disable transparent hugepages or set them to madvise
. The installer reports this issue with the identifier: S0310.
Recommended settings by workload for red hat 8/CentOS 8 and SUSE 15.1
Vertica recommends transparent hugepages settings to optimize performance by workload. The following table contains recommendations for systems that primarily run concurrent queries (such as short-running dashboard queries), or sequential SELECT or load (COPY) queries:
Operating System        | Concurrent | Sequential | Important Notes
Red Hat 8.0/CentOS 8.0  | Disable    | Enable     |
SUSE 15.1               | Disable    | Enable     | See the khugepaged settings below.

Additionally, Vertica recommends the following khugepaged settings to optimize for each workload:

Concurrent workloads: disable khugepaged with the following command:
echo 0 > /sys/kernel/mm/transparent_hugepage/khugepaged/defrag

Sequential workloads: enable khugepaged with the following command:
echo 1 > /sys/kernel/mm/transparent_hugepage/khugepaged/defrag
See Enabling or disabling defrag for additional settings that optimize your system performance by workload.
Enabling transparent hugepages on red hat 7/8, CentOS 7/8, SUSE 15.1, and Amazon Linux 2.0
Determine if transparent hugepages is enabled. To do so, run the following command.
cat /sys/kernel/mm/transparent_hugepage/enabled
[always] madvise never
The setting returned in brackets is your current setting.
For systems that do not support /etc/rc.local
, use the equivalent startup script that is run after the destination runlevel has been reached. For example SuSE uses /etc/init.d/after.local
.
You can enable transparent hugepages by editing /etc/rc.local
and adding the following script:
if test -f /sys/kernel/mm/transparent_hugepage/enabled; then
echo always > /sys/kernel/mm/transparent_hugepage/enabled
fi
You must reboot your system for the setting to take effect, or, as root, run the following echo line to proceed with the install without rebooting:
# echo always > /sys/kernel/mm/transparent_hugepage/enabled
If you are using Red Hat 7.0 or CentOS 7.0 or higher, run the following command as root or sudo:
$ chmod +x /etc/rc.d/rc.local
Disabling transparent hugepages on other systems
Note
SUSE did not offer transparent hugepage support in its initial 11.0 release. However, subsequent SUSE service packs do include support for transparent hugepages.
To determine if transparent hugepages is enabled, run the following command.
cat /sys/kernel/mm/transparent_hugepage/enabled
[always] madvise never
The setting returned in brackets is your current setting. Depending on your platform OS, the madvise
setting may not be displayed.
You can disable transparent hugepages one of two ways:
-
Edit your boot loader (for example /etc/grub.conf
). Typically, you add the following to the end of the kernel line. However, consult the documentation for your system before editing your bootloader configuration.
transparent_hugepage=never
-
Edit /etc/rc.local
(on systems that support rc.local) and add the following script.
if test -f /sys/kernel/mm/transparent_hugepage/enabled; then
echo never > /sys/kernel/mm/transparent_hugepage/enabled
fi
For systems that do not support /etc/rc.local
, use the equivalent startup script that is run after the destination runlevel has been reached. For example SuSE uses /etc/init.d/after.local
.
Regardless of which approach you choose, you must reboot your system for the setting to take effect, or run the following echo line as root to proceed with the install without rebooting:
echo never > /sys/kernel/mm/transparent_hugepage/enabled
3.3.3.5.7 - Check for swappiness
The swappiness kernel parameter defines how much, and how often, the kernel copies RAM contents to swap space.
The swappiness kernel parameter defines how much, and how often, the kernel copies RAM contents to swap space. Vertica recommends a value of 0. The installer reports any swappiness issues with the identifier S0112.
You can check the swappiness value by running the following command:
$ cat /proc/sys/vm/swappiness
To set the swappiness value add or update the following line in /etc/sysctl.conf
:
vm.swappiness = 0
This also ensures that the value persists after a reboot.
If necessary, you can change the swappiness value at runtime by logging in as root and running the following:
$ echo 0 > /proc/sys/vm/swappiness
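A small sketch of the corresponding health check follows (check_swappiness is a hypothetical helper; on a live system you would point it at /proc/sys/vm/swappiness):

```shell
# Warn when a swappiness value read from a file is not the
# recommended 0. check_swappiness is a hypothetical helper.
check_swappiness() {
    val=$(cat "$1" 2>/dev/null)
    if [ "$val" = "0" ]; then
        echo "swappiness OK"
    else
        echo "swappiness is '$val'; Vertica recommends 0 (S0112)"
        return 1
    fi
}

check_swappiness /proc/sys/vm/swappiness || true
```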
3.3.3.5.8 - Enabling network time protocol (NTP)
Data damage and performance issues might occur if you change host NTP settings while the database is running.
Important
Data damage and performance issues might occur if you change host NTP settings while the database is running. Before you change the NTP settings, stop the database. If you cannot stop the database, stop the Vertica process on each host and change the NTP settings one host at a time.
For details, see Stopping Vertica on host.
The network time protocol (NTP) daemon must be running on all of the hosts in the cluster so that their clocks are synchronized. The spread daemon relies on all of the nodes to have their clocks synchronized for timing purposes. If your nodes do not have NTP running, the installation can fail with a spread configuration error or other errors.
Note
Different Linux distributions refer to the NTP daemon in different ways. For example, SUSE and Debian/Ubuntu refer to it as ntp
, while CentOS and Red Hat refer to it as ntpd
. If the following commands produce errors, try using the other NTP daemon reference name.
Verify that NTP is running
To verify that your hosts are configured to run the NTP daemon on startup, enter the following command:
$ chkconfig --list ntpd
Debian and Ubuntu do not support chkconfig
, but they do offer an optional package. You can install this package with the command sudo apt-get install sysv-rc-conf
. To verify that your hosts are configured to run the NTP daemon on startup with the sysv-rc-conf
utility, enter the following command:
$ sysv-rc-conf --list ntpd
The chkconfig
command can produce an error similar to ntpd: unknown service
. If you get this error, verify that your Linux distribution refers to the NTP daemon as ntpd
rather than ntp
. If it does not, you need to install the NTP daemon package before you can configure it. Consult your Linux documentation for instructions on how to locate and install packages.
If the NTP daemon is installed, your output should resemble the following:
ntp 0:off 1:off 2:on 3:on 4:off 5:on 6:off
The output indicates the runlevels where the daemon runs. Verify that the current runlevel of the system (usually 3 or 5) has the NTP daemon set to on
. If you do not know the current runlevel, you can find it using the runlevel
command:
$ runlevel
N 3
If your system is based on Red Hat 6/CentOS 6 or SUSE Linux Enterprise Server, use the service
and chkconfig
utilities to start NTP and have it start at startup.
$ /sbin/service ntpd restart
$ /sbin/chkconfig ntpd on
-
Red Hat 6/CentOS 6—NTP uses the default time servers at ntp.org. You can change the default NTP servers by editing /etc/ntp.conf
.
-
SLES—By default, no time servers are configured. You must edit /etc/ntp.conf
after the install completes and add time servers.
By default, the NTP daemon is not installed on some Ubuntu and Debian systems. First, install NTP, and then start the NTP process. You can change the default NTP servers by editing /etc/ntp.conf
as shown:
$ sudo apt-get install ntp
$ sudo /etc/init.d/ntp reload
Verify that NTP is operating correctly
To verify that the Network Time Protocol Daemon (NTPD) is operating correctly, issue the following command on all nodes in the cluster.
For Red Hat 6/CentOS 6 and SLES:
$ /usr/sbin/ntpq -c rv | grep stratum
For Ubuntu and Debian:
$ ntpq -c rv | grep stratum
A stratum level of 16 indicates that NTP is not synchronizing correctly.
If a stratum level of 16 is detected, wait 15 minutes and issue the command again. It may take this long for the NTP server to stabilize.
If NTP continues to detect a stratum level of 16, verify that the NTP port (UDP Port 123) is open on all firewalls between the cluster and the remote machine to which you are attempting to synchronize.
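The grep stratum filter above isolates the stratum=N field of the ntpq -c rv output. For scripted monitoring, you can extract the numeric value; the snippet below is an illustrative sketch (stratum_of is an invented helper, and the sample line is typical, not verbatim, readvar output):

```shell
# Extract the numeric stratum from an `ntpq -c rv` style line.
stratum_of() {
    sed -n 's/.*stratum=\([0-9]*\).*/\1/p'
}

# Sample (typical, not verbatim) readvar output:
line='associd=0 status=0615 leap_none, sync_ntp, stratum=3, precision=-23'
s=$(echo "$line" | stratum_of)
if [ "$s" -ge 16 ]; then
    echo "NTP is not synchronizing correctly (stratum $s)"
else
    echo "NTP OK (stratum $s)"
fi
```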
3.3.3.5.9 - Enabling chrony or ntpd for red hat 7/CentOS 7 systems
Before you can install Vertica, you must enable one of the following on your system for clock synchronization: chrony or NTPD.
Before you can install Vertica, you must enable one of the following on your system for clock synchronization: chrony or NTPD.
You must enable and activate the time synchronization service before installation. Otherwise, the installer reports this issue with the identifier S0030.
For information on installing and using chrony, see the information below. For information on NTPD see Enabling network time protocol (NTP). For more information about chrony, see Using chrony in the Red Hat documentation.
Install chrony
The chrony suite consists of chronyd, a daemon that runs in user space, and chronyc, a command-line program for monitoring and adjusting chronyd.
chrony is installed by default on some versions of Red Hat/CentOS 7. However, if chrony is not installed on your system, you must download it. To download chrony, run the following command as sudo or root:
# yum install chrony
Verify that chrony is running
To view the status of the chronyd daemon, run the following command:
$ systemctl status chronyd
If chrony is running, an output similar to the following appears:
chronyd.service - NTP client/server
Loaded: loaded (/usr/lib/systemd/system/chronyd.service; enabled)
Active: active (running) since Mon 2015-07-06 16:29:54 EDT; 15s ago
Main PID: 2530 (chronyd)
CGroup: /system.slice/chronyd.service
└─2530 /usr/sbin/chronyd -u chrony
If chrony is not running, execute the following commands as sudo or root. The first command causes chrony to run at boot time; the second starts it immediately:
# systemctl enable chronyd
# systemctl start chronyd
Verify that chrony is operating correctly
To verify that the chrony daemon is operating correctly, issue the following command on all nodes in the cluster:
$ chronyc tracking
An output similar to the following appears:
Reference ID : 198.247.63.98 (time01.website.org)
Stratum : 3
Ref time (UTC) : Thu Jul 9 14:58:01 2015
System time : 0.000035685 seconds slow of NTP time
Last offset : -0.000151098 seconds
RMS offset : 0.000279871 seconds
Frequency : 2.085 ppm slow
Residual freq : -0.013 ppm
Skew : 0.185 ppm
Root delay : 0.042370 seconds
Root dispersion : 0.022658 seconds
Update interval : 1031.0 seconds
Leap status : Normal
A stratum level of 16 indicates that chrony is not synchronizing correctly. If chrony continues to detect a stratum level of 16, verify that the UDP port 323 is open. This port must be open on all firewalls between the cluster and the remote machine to which you are attempting to synchronize.
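For scripted checks, you can pull the stratum value out of the chronyc tracking output; chrony_stratum below is an invented helper name, shown as a sketch:

```shell
# Extract the stratum value from `chronyc tracking` output lines.
chrony_stratum() {
    awk -F':' '/^Stratum/ { gsub(/[[:space:]]/, "", $2); print $2 }'
}

printf 'Reference ID    : 198.247.63.98\nStratum         : 3\n' | chrony_stratum    # prints 3
```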
3.3.3.5.10 - SELinux configuration
Vertica does not support SELinux except when SELinux is running in permissive mode.
Vertica does not support SELinux except when SELinux is running in permissive mode. If it detects that SELinux is installed and the mode cannot be determined, the installer reports this issue with the identifier S0080. If the mode can be determined and the mode is not permissive, the issue is reported with the identifier S0081.
Red hat and SUSE systems
You can either disable SELinux or change it to use permissive mode.
To disable SELinux:
-
Edit /etc/selinux/config
and change the SELINUX setting to disabled (SELINUX=disabled
). This disables SELinux at boot time.
-
As root/sudo, type setenforce 0
to disable SELinux immediately.
To change SELinux to use permissive mode:
-
Edit /etc/selinux/config
and change the SELINUX setting to permissive (SELINUX=Permissive
).
-
As root/sudo, type setenforce Permissive
to switch to permissive mode immediately.
Ubuntu and debian systems
You can either disable SELinux or change it to use permissive mode.
To disable SELinux:
-
Edit /selinux/config
and change the SELINUX setting to disabled (SELINUX=disabled
). This disables SELinux at boot time.
-
As root/sudo, type setenforce 0
to disable SELinux immediately.
To change SELinux to use permissive mode:
-
Edit /selinux/config
and change the SELINUX setting to permissive (SELINUX=Permissive
).
-
As root/sudo, type setenforce Permissive
to switch to permissive mode immediately.
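The mode check that the installer performs can be sketched as follows (selinux_ok is a hypothetical helper; on a live system you would pass it the output of getenforce):

```shell
# Classify an SELinux mode string against Vertica's requirements.
# selinux_ok is a hypothetical helper, not Vertica installer code.
selinux_ok() {
    case "$1" in
        Permissive|Disabled) echo "SELinux mode $1: OK for Vertica"; return 0 ;;
        Enforcing)           echo "SELinux is enforcing: unsupported (S0081)"; return 1 ;;
        *)                   echo "SELinux mode cannot be determined (S0080)"; return 1 ;;
    esac
}

selinux_ok Permissive    # prints: SELinux mode Permissive: OK for Vertica
```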
3.3.3.5.11 - CPU frequency scaling
This topic details the various CPU frequency scaling methods supported by Vertica.
This topic details the various CPU frequency scaling methods supported by Vertica. In general, if you do not require CPU frequency scaling, then disable it so as not to impact system performance.
Important
Your systems may use significantly more energy when frequency scaling is disabled.
The installer allows CPU frequency scaling to be enabled when the cpufreq scaling governor is set to performance
. If the cpu scaling governor is set to ondemand, and ignore_nice_load
is 1 (true), then the installer fails with the error S0140. If the cpu scaling governor is set to ondemand and ignore_nice_load
is 0 (false), then the installer warns with the identifier S0141.
CPU frequency scaling is a hardware and software feature that helps computers conserve energy by slowing the processor when the system load is low, and speeding it up again when the system load increases. This feature can impact system performance, since raising the CPU frequency in response to higher system load does not occur instantly. Always disable this feature on the Vertica database hosts to prevent it from interfering with performance.
You disable CPU scaling in your host's system BIOS. There may be multiple settings in your host's BIOS that you need to adjust in order to completely disable CPU frequency scaling. Consult your host hardware's documentation for details on entering the system BIOS and disabling CPU frequency scaling.
If you cannot disable CPU scaling through the system BIOS, you can limit the impact of CPU scaling by disabling the scaling through the Linux kernel or setting the CPU frequency governor to always run the CPU at full speed.
Caution
This method is not reliable, as some hardware platforms may ignore the kernel settings. For more information, see
Vertica Hardware Guide.
The method you use to disable frequency depends on the CPU scaling method being used in the Linux kernel. See your Linux distribution's documentation for instructions on disabling scaling in the kernel or changing the CPU governor.
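To confirm the governor setting, you can count CPUs whose governor is not performance. The helper below reads governor names on stdin (an illustrative sketch; on a live system you would feed it the output of cat /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor):

```shell
# Count governor entries (one per line on stdin) that are not
# "performance". A nonzero count means frequency scaling may engage.
non_performance() {
    grep -vc '^performance$'
}

printf 'performance\nondemand\nperformance\n' | non_performance    # prints 1
```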
3.3.3.5.12 - Enabling or disabling defrag
You can modify the defrag utility to meet Vertica configuration requirements, or to optimize your system performance by workload.
You can modify the defrag utility to meet Vertica configuration requirements, or to optimize your system performance by workload.
On all Red Hat/CentOS systems, you must disable the defrag utility to meet Vertica configuration requirements.
Note
The steps to disable defrag on Red Hat 6/CentOS 6 systems differ from those used to disable defrag on Red Hat 7/CentOS 7 and Red Hat 8/CentOS 8.
For SUSE 15.1, Vertica recommends that you enable defrag for optimized performance.
Recommended settings by workload for red hat 8/CentOS 8 and SUSE 15.1
Vertica recommends defrag settings to optimize performance by workload. The following table contains recommendations for systems that primarily run concurrent queries (such as short-running dashboard queries), or sequential SELECT or load (COPY) queries:
Operating System |
Concurrent |
Sequential |
Red Hat 8.0/CentOS 8.0 |
Disable |
Disable |
SUSE 15.1 |
Enable |
Enable |
See Enabling or disabling transparent hugepages for additional settings that optimize your system performance by workload.
Disabling defrag on red hat 6/CentOS 6 systems
-
Determine if defrag is enabled by running the following command:
cat /sys/kernel/mm/redhat_transparent_hugepage/defrag
[always] madvise never
The setting returned in brackets is your current setting. If you are not using madvise
or never
as your defrag setting, then you must disable defrag.
-
Edit /etc/rc.local,
and add the following script:
if test -f /sys/kernel/mm/redhat_transparent_hugepage/enabled; then
echo never > /sys/kernel/mm/redhat_transparent_hugepage/defrag
fi
You must reboot your system for the setting to take effect, or run the following echo line to proceed with the install without rebooting:
# echo never > /sys/kernel/mm/redhat_transparent_hugepage/defrag
Disabling defrag on red hat 7/CentOS 7, red hat 8/CentOS 8, and SUSE 15.1
-
Determine if defrag is enabled by running the following command:
cat /sys/kernel/mm/transparent_hugepage/defrag
[always] madvise never
The setting returned in brackets is your current setting. If you are not using madvise
or never
as your defrag setting, then you must disable defrag.
-
Edit /etc/rc.local,
and add the following script:
if test -f /sys/kernel/mm/transparent_hugepage/enabled; then
echo never > /sys/kernel/mm/transparent_hugepage/defrag
fi
You must reboot your system for the setting to take effect, or run the following echo line to proceed with the install without rebooting:
# echo never > /sys/kernel/mm/transparent_hugepage/defrag
-
If you are using Red Hat 7.0/CentOS 7.0 or Red Hat 8.0/CentOS 8.0, run the following command as root or sudo:
$ chmod +x /etc/rc.d/rc.local
Enabling defrag on red hat 7/8, CentOS 7/8, and SUSE 15.1
-
Determine if defrag is enabled by running the following command:
cat /sys/kernel/mm/transparent_hugepage/defrag
always madvise [never]
The setting returned in brackets is your current setting. If you are not using madvise
or always
as your defrag setting, then you must enable defrag.
-
Edit /etc/rc.local,
and add the following script:
if test -f /sys/kernel/mm/transparent_hugepage/enabled; then
echo always > /sys/kernel/mm/transparent_hugepage/defrag
fi
You must reboot your system for the setting to take effect, or run the following echo line to proceed with the install without rebooting:
# echo always > /sys/kernel/mm/transparent_hugepage/defrag
-
If you are using Red Hat 7.0/CentOS 7.0 or Red Hat 8.0/CentOS 8.0, run the following command as root or sudo:
$ chmod +x /etc/rc.d/rc.local
3.3.3.5.13 - Support tools
Vertica suggests that you install the following tools so support can assist in troubleshooting your system if any issues arise.
Vertica suggests that you install the following tools so support can assist in troubleshooting your system if any issues arise:
-
pstack (or gstack) package. Identified by issue S0040 when not installed.
- On Red Hat 7 and CentOS 7 systems, the pstack package is installed as part of the gdb package.
-
mcelog package. Identified by issue S0041 when not installed.
-
sysstat package. Identified by issue S0045 when not installed.
Red hat 6 and CentOS 6 systems
To install the required tools on Red Hat 6 and CentOS 6 systems, run the following commands as sudo or root:
yum install pstack
yum install mcelog
yum install sysstat
Red hat 7 and CentOS 7 systems
To install the required tools on Red Hat 7/CentOS 7 systems, run the following commands as sudo or root:
yum install gdb
yum install mcelog
yum install sysstat
Ubuntu and debian systems
To install the required tools on Ubuntu and Debian systems, run the following commands as sudo or root:
apt-get install pstack
apt-get install mcelog
apt-get install sysstat
Important
For Ubuntu versions 18.04 and higher, run apt-get install rasdaemon
instead of apt-get install mcelog
.
SuSE systems
To install the required tools on SuSE systems, run the following commands as sudo or root.
zypper install sysstat
zypper install mcelog
There is no individual SuSE package for pstack/gstack. However, the gdb package contains gstack, so you could optionally install gdb instead, or build pstack/gstack from source. To install the gdb package:
zypper install gdb
3.3.3.6 - System user configuration
The following tasks pertain to the configuration of the system user required by Vertica.
The following tasks pertain to the configuration of the system user required by Vertica.
3.3.3.6.1 - System user requirements
Vertica has specific requirements for the system user that runs and manages Vertica.
Vertica has specific requirements for the system user that runs and manages Vertica. If you specify a user during install, but the user does not exist, then the installer reports this issue with the identifier: S0200.
System user requirement details
Vertica requires a system user to own database files and run database processes and administration scripts. By default, the install script automatically configures and creates this user for you with the username dbadmin. See About Linux users created by Vertica and their privileges for details on the default user created by the install script. If you decide to manually create your own system user, then you must create the user before you run the install script. If you manually create the user:
Note
Instances of dbadmin
and verticadba
are placeholders for the names you choose if you do not use the default values.
-
the user must have the same username and password on all nodes
-
the user must use the BASH shell as the user's default shell. If not, then the installer reports this issue with identifier [S0240].
-
the user must be in the verticadba group (for example: usermod -a -G verticadba
userNameHere
). If not, the installer reports this issue with identifier [S0220].
Note
You must create a verticadba group on all nodes. If you do not, then the installer reports the issue with identifier [S0210].
-
the user's login group must be either verticadba or a group with the same name as the user (for example, the home group for dbadmin is dbadmin). You can check the groups for a user with the id command (for example: id dbadmin). The "gid" group is the user's primary group. If this is not configured correctly, then the installer reports this issue with the identifier [S0230]. Vertica recommends that you use verticadba as the user's primary login group (for example: usermod -g verticadba userNameHere). If the user's primary group is not verticadba as suggested, then the installer reports this with HINT [S0231].
-
the user must have a home directory. If not, then the installer reports this issue with identifier [S0260].
-
the user's home directory must be owned by the user. If not, then the installer reports the issue with identifier [S0270].
-
the system must be aware of the user's home directory (you can set it with the usermod command: usermod -m -d /path/to/new/home/dir userNameHere). If this is not configured correctly, then the installer reports the issue with [S0250].
-
the user's home directory must be owned by the dbadmin's primary group (use the chown and chgrp commands if necessary). If this is not configured correctly, then the installer reports the issue with identifier [S0280].
-
the user's home directory should have secure permissions. Specifically, it should not be writable by the group or by others. Ideally, when viewed with ls, the permissions for group and others are "---" (nothing) or "r-x" (read and execute). If this is not configured as suggested, then the installer reports this with HINT [S0290].
3.3.3.6.2 - TZ environment variable
This topic details how to set or change the TZ environment variable and update your tzdata package.
This topic details how to set or change the TZ environment variable and update your tzdata package. If this variable is not set, then the installer reports this issue with the identifier: S0305.
Before installing Vertica, update the tzdata package for your system and set the default time zone for your database administrator account by specifying the TZ
environmental variable. If your database administrator is being created by the install_vertica
script, then set the TZ
variable after you have installed Vertica.
Update tzdata package
The tzdata package is a public-domain time zone database that is pre-installed on most Linux systems. The tzdata package is updated periodically for time-zone changes across the world. OpenText recommends that you update to the latest tzdata package before installing or updating Vertica.
Update your tzdata package with the following command:
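The command itself is elided in this rendering. As a hedged sketch (the package-manager invocations are standard practice, not text from this guide), the following prints the appropriate update command for the host's package manager rather than running it:

```shell
# Print the tzdata update command for this host's package manager (dry run).
tzdata_update_cmd() {
    if command -v yum > /dev/null 2>&1; then
        echo "yum update tzdata"                       # RHEL/CentOS
    elif command -v apt-get > /dev/null 2>&1; then
        echo "apt-get install --only-upgrade tzdata"   # Debian/Ubuntu
    else
        echo "tzdata: update with your distribution's package manager"
    fi
}
tzdata_update_cmd
```

Run the printed command as root (or with sudo) on every node before installing Vertica.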
Setting the default time zone
When a client receives the result set of a SQL query, all rows contain data adjusted, if necessary, to the same time zone. That time zone is the default time zone of the initiator node unless the client explicitly overrides it using the SQL SET TIME ZONE command described in the SQL Reference Manual. The default time zone of any node is controlled by the TZ environment variable. If TZ is undefined, the operating system time zone is used.
Important
The TZ
variable must be set to the same value on all nodes in the cluster.
If your operating system timezone is not set to the desired timezone of the database then make sure that the Linux environment variable TZ
is set to the desired value on all cluster hosts.
The installer returns a warning if the TZ variable is not set. If your operating system timezone is appropriate for your database, then the operating system timezone is used and the warning can be safely ignored.
Setting the time zone on a host
Important
If you explicitly set the
TZ
environment variable at a command line before you start the
Administration tools, the current setting will not take effect. The Administration Tools uses SSH to start copies on the other nodes, so each time SSH is used, the
TZ
variable for the startup command is reset.
TZ
must be set in the
.profile
or
.bashrc
files on all nodes in the cluster to take effect properly.
You can set the time zone several different ways, depending on the Linux distribution or the system administrator’s preferences.
-
To set the system time zone on Red Hat and SUSE Linux systems, edit:
/etc/sysconfig/clock
-
To set the TZ variable, edit /etc/profile, /home/dbadmin/.bashrc, or /home/dbadmin/.bash_profile and add the following line (for example, for the US Eastern Time Zone):
export TZ="America/New_York"
For details on which timezone names are recognized by Vertica, see the appendix: Using time zones with Vertica.
3.3.3.6.3 - LANG environment variable settings
This topic details how to set or change the LANG environment variable.
This topic details how to set or change the LANG environment variable. The LANG environment variable controls the locale of the host. If this variable is not set, then the installer reports this issue with the identifier: S0300. If this variable is not set to a valid value, then the installer reports this issue with the identifier: S0301.
Set the host locale
Each host has a system setting for the Linux environment variable LANG
. LANG
determines the locale category for native language, local customs, and coded character set in the absence of the LC_ALL
and other LC_ environment variables. LANG
can be used by applications to determine which language to use for error messages and instructions, collating sequences, date formats, and so forth.
To change the LANG setting for the database administrator, edit /etc/profile, /home/dbadmin/.bashrc, or /home/dbadmin/.bash_profile on all cluster hosts and set the environment variable; for example:
export LANG=en_US.UTF-8
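Before setting LANG, you can confirm that the target locale is actually installed on the host. This is a hedged check (the locale name is an example); if nothing matches, install the locale with your distribution's tools first:

```shell
# List installed locales and look for the one you plan to use (example: en_US).
locale -a 2>/dev/null | grep -i "en_US" || echo "en_US locales not installed"
```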
The LANG
setting controls the following in Vertica:
The LANG
setting does not control the following:
-
Vertica specific error and warning messages. These are always in English at this time.
-
Collation of results returned by SQL issued to Vertica. This must be done using a database parameter instead. See Implement locales for international data sets section for details.
Note
If the LC_ALL
environment variable is set, it supersedes the setting of LANG
.
3.3.3.6.4 - Package dependencies
For successful Vertica installation, you must first install three packages on all nodes in your cluster before installing the database platform.
For successful Vertica installation, you must first install three packages on all nodes in your cluster before installing the database platform.
The required packages are:
-
openssh—Required for Administration tools connectivity between nodes.
-
which—Required for Vertica operating system integration and for validating installations.
-
dialog—Required for interactivity with Administration Tools.
Installing the required packages
The procedure you follow to install the required packages depends on the operating system on which your node or cluster is running. See your operating system's documentation for detailed information on installing packages.
-
For CentOS/Red Hat Systems—Typically, you manage packages on Red Hat and CentOS systems using the yum utility.
Run the following yum commands to install each of the package dependencies. The yum utility guides you through the installation:
# yum install openssh
# yum install which
# yum install dialog
-
For Debian/Ubuntu Systems—Typically, you use the apt-get utility to manage packages on Debian and Ubuntu systems.
Run the following apt-get commands to install each of the package dependencies. The apt-get utility guides you through the installation:
# apt-get install openssh
# apt-get install which
# apt-get install dialog
3.3.4 - Specifying disk storage location during installation
You can specify the disk storage location when you:
You can specify the disk storage location when you:
Specifying disk storage location when you install
When you install Vertica, the --data-dir
parameter in the install_vertica script lets you specify a directory to contain database data and catalog files. The script defaults to the database administrator's default home directory /home/dbadmin
.
Important
Replace this default with a directory that has adequate space to hold your data and catalog files.
Requirements
-
The data and catalog directory must exist on each node in the cluster.
-
The directory on each node must be owned by the database administrator
-
Catalog and data path names must contain only alphanumeric characters and cannot have leading space characters. Failure to comply with these restrictions will result in database creation failure.
-
Vertica refuses to overwrite a directory if it appears to be in use by another database. Therefore, if you created a database for evaluation purposes, dropped the database, and want to reuse the database name, make sure that the disk storage location previously used has been completely cleaned up. See Managing storage locations for details.
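The requirements above can be checked with a short script on each node before you run the installer. The path and DBA username below are stand-ins for your own values:

```shell
# Stand-in values: use your real data directory and dbadmin account.
DATA_DIR="/tmp/verticadata"
DBA_USER="$(id -un)"
mkdir -p "$DATA_DIR"
# The directory exists on this node:
[ -d "$DATA_DIR" ] && echo "directory exists"
# The directory is owned by the database administrator:
[ "$(stat -c %U "$DATA_DIR")" = "$DBA_USER" ] && echo "owned by $DBA_USER"
# The final path component is alphanumeric with no leading space:
basename "$DATA_DIR" | grep -Eq '^[A-Za-z0-9]+$' && echo "name is valid"
```

Repeat the check on every node in the cluster; the installer's --data-dir value must satisfy all three conditions everywhere.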
3.4 - Installing using the command line
Although Vertica supports installation on one node, two nodes, and multiple nodes, this section describes how to install the Vertica software on a cluster of nodes.
Although Vertica supports installation on one node, two nodes, and multiple nodes, this section describes how to install the Vertica software on a cluster of nodes. It assumes that you have already performed the tasks in Before You Install Vertica, and that you have a Vertica license key.
To install Vertica, complete the following tasks:
-
Download and install the Vertica server package
-
Installing Vertica with the installation script
Special notes
-
Downgrade installations are not supported.
-
Be sure that you download the RPM for the correct operating system and architecture.
-
Vertica supports two-node clusters with zero fault tolerance (K=0 safety). This means that you can add a node to a single-node cluster, as long as the installation node (the node upon which you build) is not the loopback node (localhost/127.0.0.1).
-
The Version 7.0 installer introduces new platform verification tests that prevent the install from continuing if the platform requirements are not met by your system. Manually verify that your system meets the requirements in Before you install Vertica on your systems. These tests ensure that your platform meets the hardware and software requirements for Vertica. Previous versions documented these requirements, but the installer did not verify all of the settings. If this is a fresh install, then you can simply run the installer and view a list of the failures and warnings to determine which configuration changes you must make.
3.4.1 - Download and install the Vertica server package
To download and install the Vertica server package:
To download and install the Vertica server package:
-
Use a Web browser to go to the Vertica website.
-
Click the Support tab and select Customer Downloads.
-
Log into the portal to download the install package.
Be sure the package you download matches the operating system and the machine architecture on which you intend to install it.
-
Transfer the installation package to the Administration host.
-
If you installed a previous version of Vertica on any of the hosts in the cluster, use the Administration tools to shut down any running database.
The database must stop normally; you cannot upgrade a database that requires recovery.
-
If you are using sudo, skip to the next step. If you are root, log in to the Administration Host as root (or log in as another user and switch to root).
$ su - root
password: root-password
#
Caution
When installing Vertica using an existing user as the dba, you must exit all UNIX terminal sessions for that user after setup completes and log in again to ensure that group privileges are applied correctly.
After Vertica is installed, you no longer need root privileges. To verify sudo, see General hardware and OS requirements and recommendations.
-
Use one of the following commands to run the RPM package installer:
where pathname is the Vertica package file you downloaded.
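The installer commands themselves are elided in this rendering. Based on standard RPM and Debian package practice (an assumption, not text from this guide), the invocations are typically rpm -Uvh and dpkg -i; the sketch below echoes them so it is safe to run anywhere:

```shell
# Placeholder package paths -- substitute the file you downloaded.
RPM_PKG="/tmp/vertica-x.y.z.x86_64.rpm"
DEB_PKG="/tmp/vertica_x.y_amd64.deb"
echo "rpm -Uvh $RPM_PKG"   # RHEL/CentOS, run as root
echo "dpkg -i $DEB_PKG"    # Debian/Ubuntu, run as root
```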
Note
If the package installer reports multiple dependency problems, or you receive the error "ERROR: You're attempting to install the wrong RPM for this operating system", then you are trying to install the wrong Vertica server package.
3.4.2 - Installing Vertica with the installation script
Run the installation script after you install the Vertica package.
Run the installation script after you install the Vertica package. The installation script runs on a single node, using a Bash shell. It copies the Vertica package to all other hosts (identified by the --hosts
argument) in your planned cluster.
The installation script runs several tests on each of the target hosts to verify that the hosts meet system and performance requirements for a Vertica node. The installation script modifies some operating system configuration settings to meet these requirements. Other settings cannot be modified by the installation script and must be manually reconfigured.
Note
The installation script sets up passwordless ssh for the admin user across all hosts. If passwordless ssh is already set up, the installation script verifies that it functions correctly.
3.4.2.1 - Perform a basic install
For all installation options, see install_vertica options.
For all installation options, see install_vertica options.
-
As root (or sudo) run the install script. The script must be run by a BASH shell as root or as a user with sudo privileges. You can configure many options when running the install script. See Basic Installation Parameters below for the complete list of options.
If the installer fails due to any requirements not being met, you can correct the issue and then rerun the installer with the same command line options.
To perform a basic installation:
Important
If you place install_vertica
in a location other than /opt/vertica
, create a symlink from that location to /opt/vertica
. Create this symlink on all cluster nodes, otherwise the database will not start.
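Putting the basic parameters together, a minimal invocation might look like the following. The hostnames and package path are placeholders, and the sketch echoes the command rather than executing it, since install_vertica requires root:

```shell
# Placeholder hosts and package path -- substitute your own values.
HOSTS="node01,node02,node03"
PKG="/tmp/vertica-x.y.z.x86_64.rpm"
# Echo the command rather than running it (run the real command as root or with sudo):
echo "/opt/vertica/sbin/install_vertica --hosts $HOSTS --rpm $PKG --dba-user dbadmin"
```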
Basic installation parameters
Option |
Description |
--hosts host_list |
A comma-separated list of host names or IP addresses to include in the cluster. The list must not include embedded spaces. For example:
* `--hosts node01,node02,node03`
* `--hosts 127.0.0.1`
* `--hosts 192.168.233.101,192.168.233.102,192.168.233.103`
* `--hosts fd95:ff5d:5549:bdb0::1,fd95:ff5d:5549:bdb0::2,fd95:ff5d:5549:bdb0::3`
Note
Vertica stores only IP addresses in its configuration files. If you provide host names, they are converted to IP addresses when the script runs.
|
--rpm package_name --deb package_name |
The path and name of the Vertica RPM package. For example:
--rpm /tmp/vertica-10.1.1-0.x86_64.RHEL6.rpm
For Debian and Ubuntu installs, provide the name of the Debian package. For example:
--deb /tmp/vertica_10.1_amd64.deb
|
--dba-user dba_username |
The name of the Database Superuser system account to create. Only this account can run the Administration Tools. If you omit the --dba-user parameter, then the default database administrator account name is dbadmin .
This parameter is optional for new installations done as root but must be specified when upgrading or when installing using sudo. If upgrading, use the -u parameter to specify the same DBA account name that you used previously. If installing using sudo, the user must already exist.
Note
If you manually create the user, modify the user's .bashrc file to include the line: PATH=/opt/vertica/bin:$PATH so that the Vertica tools such as vsql and admintools can be easily started by the dbadmin user.
|
-
When prompted for a password to log into the other nodes, provide the requested password. Doing so allows the installation of the package and system configuration on the other cluster nodes.
-
If you are root, this is the root password.
-
If you are using sudo, this is the sudo user password.
The password does not echo on the command line. For example:
Vertica Database 11.1.x Installation Tool
Please enter password for root@host01:password
-
If the dbadmin user, or the user specified in the argument --dba-user
, does not exist, then the install script prompts for the password for the user. Provide the password. For example:
Enter password for new UNIX user dbadmin:password
Retype new UNIX password for user dbadmin:password
-
Carefully examine any warnings or failures returned by
install_vertica
and correct the problems.
For example, insufficient RAM, insufficient network throughput, and too high readahead settings on the file system could cause performance problems later on. Additionally, LANG warnings, if not resolved, can cause database startup to fail and issues with VSQL. The system LANG attributes must be UTF-8 compatible. After you fix the problems, rerun the install script.
-
When installation is successful, disconnect from the Administration host, as instructed by the script. Then, complete the required post-installation steps.
At this point, root privileges are no longer needed and the database administrator can perform any remaining steps.
3.4.2.2 - Install on a FIPS 140-2 enabled machine
Vertica supports the implementation of the Federal Information Processing Standard 140-2 (FIPS).
Vertica supports the implementation of the Federal Information Processing Standard 140-2 (FIPS). You enable FIPS mode in the operating system.
Note
Enabling FIPS on the operating system occurs outside of Vertica.
During installation, the install_vertica script detects whether the host is operating in FIPS mode. The installer searches for the file /proc/sys/crypto/fips_enabled and examines its content. If the file exists and contains '1', the host is operating in FIPS mode and the following message appears:
/proc/sys/crypto/fips_enabled exists and contains '1', this is a FIPS system
Important
On certain systems where the libssl and libcrypto libraries do not have versioning information, when starting Vertica, you may see the message
No version information available
This message is benign and you can ignore it.
3.4.2.3 - Create symbolic links for OpenSSL
As part of the Vertica installation, symbolic links are created to the appropriate OpenSSL files.
As part of the Vertica installation, symbolic links are created to the appropriate OpenSSL files. The steps are as follows:
-
The RPM installer places two OpenSSL library files in /opt/vertica/lib:
-
libssl.so.1.1
-
libcrypto.so.1.1
-
The install_vertica script creates two symbolic links in /opt/vertica/lib. These links point to libssl.so.1.1 and libcrypto.so.1.1, which the RPM installer placed in /opt/vertica/lib.
To implement FIPS 140-2 on your Vertica Analytic Database, you need to configure both the server and the client you are using. To see the detailed configuration steps, go to Implementing FIPS 140-2.
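You can confirm the libraries and links on an installed node with a read-only check. On a machine without Vertica, the sketch simply reports that the files are absent:

```shell
# Look for the OpenSSL libraries and symbolic links under /opt/vertica/lib.
for f in libssl libcrypto; do
    ls -l /opt/vertica/lib/${f}* 2>/dev/null || echo "$f links not found (is Vertica installed on this node?)"
done
```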
3.4.2.4 - install_vertica options
The table below describes all script options.
The table below describes all
install_vertica
script options. Most options have a long and short form—for example, --hosts
and -s
.
install_vertica
minimally requires two options:
-
--hosts
/ -s
-
--rpm
/ -r
| --deb
For example:
# /opt/vertica/sbin/install_vertica --hosts node0001,node0002,node0003 \
--rpm /tmp/vertica-10.1.1-0.x86_64.RHEL6.rpm
For details on minimal installation requirements, see Perform a basic install.
Option |
Description |
--help |
Display help for this script. |
--accept-eula -Y |
Silently accepts the EULA agreement. On multi-node installations, this option is propagated across the cluster at the end of the installation, at the same time as the Administration Tools metadata.
Combine this option with --license (-L ) to activate your license.
|
--add-hosts host-list -A host-list |
A comma-separated list of hosts to add to an existing Vertica cluster.
--add-hosts modifies an existing installation of Vertica by adding a host to the database cluster and then reconfiguring spread. This is useful for improving system performance, or making the database K-safe.
Important
If you used --point-to-point (-T ) to configure spread to use direct point-to-point communication within the existing cluster, you must also use it when you add a new host; otherwise, the new host automatically uses UDP broadcast traffic, resulting in cluster communication problems that prevent Vertica from running properly. For example:
--add-hosts host01
--add-hosts 192.168.233.101
You can also use this option with the update_vertica script. For details, see Adding nodes.
|
--broadcast -U |
Specifies that Vertica use UDP broadcast traffic by spread between nodes on the subnet. This is the default. No more than 80 spread daemons are supported with broadcast traffic. It is possible to have more than 80 nodes by using large cluster mode, which does not install a spread daemon on each node (see Large cluster).
Do not combine this option with --point-to-point (-T ).
Important
When changing the configuration from --broadcast (-U ) (the default) to --point-to-point (-T ) or vice-versa, you must also specify --control-network (-S ).
|
--clean |
Forcibly cleans previously stored configuration files. Use this option if you need to change the hosts that are included in your cluster. Only use this option when no database is defined.
This option cannot be combined with update_vertica .
|
--config-file file -z file |
Accepts an existing properties file created by --record-config . This properties file contains key/value settings that map to options in the
install_vertica script, many with Boolean arguments that default to false. |
--control-network { bcast-address | default } -S { bcast-address | default } |
Set to one of the following arguments:
bcast-address : A broadcast network IP address that enables configuration of spread communications on a subnet different from other Vertica data communications.
Important
bcast-address must match the subnet for at least some of the nodes in the database. If the address does not match the subnet of any node in the database, then the installer displays an error and stops. If the provided address matches some, but not all, of the nodes' subnets, the installer displays a warning, but installation continues.
Ideally, the value for --control-network should match all node subnets.
You can also use this option to force a cluster-wide spread reconfiguration when changing spread related options.
|
--data-dir data-directory -d data-directory |
Specifies the directory for database data and catalog files. For more information, see Specifying disk storage location during installation.
Caution
Do not use a shared directory over more than one host for this setting. Data and catalog directories must be distinct for each node. Multiple nodes must not be allowed to write to the same data or catalog directory.
Default: /home/dbadmin
|
--dba-group group -g group |
The UNIX group for DBA users.
Default: verticadba .
|
--dba-user dba-username -u dba-username |
The name of the database superuser system account to create. Only this account can run the Administration Tools. If you omit this option, then the default database administrator account name is dbadmin .
This option is optional for new installations done as root but must be specified when upgrading or when installing using sudo. If upgrading, use this option to specify the same DBA account name that you used previously. If installing using sudo, dba-username must already exist.
Note
If you manually create the user, modify the user's .bashrc file to include the line: PATH=/opt/vertica/bin:$PATH so Vertica tools such as vsql and admintools can be easily started by the dbadmin user.
|
--dba-user-home dba-home-directory -l dba-home-directory |
The home directory for the database administrator.
Default: /home/dbadmin .
|
--dba-user-password dba-password -p dba-password |
The password for the database administrator account. If not supplied, the script prompts for a password and does not echo the input. |
--dba-user-password-disabled |
Disables the password for --dba-user . This argument stops the installer from prompting for a password for --dba-user . You can assign a password later using standard user management tools such as passwd . |
--failure-threshold [ threshold-arg ] |
Stops the installation when the specified failure threshold is encountered, where threshold-arg can be one of the following:
-
HINT : Stop the install if a HINT or greater issue is encountered during the installation tests. HINT configurations are settings you should make, but the database runs with no significant negative consequences if you omit the setting.
-
WARN : Stop the installation if a WARN or greater issue is encountered. WARN issues may affect the performance of the database. However, for basic testing purposes or Community Edition users, WARN issues can be ignored if extreme performance is not required.
-
FAIL : Stop the installation if a FAIL or greater issue is encountered. FAIL issues can have severely negative performance consequences and possible later processing issues if not addressed. However, Vertica can start even if FAIL issues are ignored.
-
HALT : Stop the installation if a HALT or greater issue is encountered. The database may not be able to start if you choose this option. Not supported in production environments.
-
NONE : Do not stop the installation. The database may not start. Not supported in production environments.
Default: WARN
|
--hosts host-list -s host-list |
A comma-separated list of host names or IP addresses to include in the cluster, where host-list must not include spaces. For example:
--hosts host01,host02,host03
-s 192.168.233.101,192.168.233.102,192.168.233.103
The following requirements apply:
-
If upgrading an existing installation of Vertica, use the same host names used previously.
-
IP addresses or hostnames must be for unique hosts. Do not list the same host using multiple IP addresses/hostnames.
|
--ipv4 |
Hosts in the cluster are identified by IPv4 network addresses. This is the default behavior. |
--ipv6 |
Hosts in the cluster are identified by IPv6 network addresses. You must specify this option when you pass IPv6 addresses in the --hosts list. If you use host names in the --hosts option, the names must resolve to IPv6 addresses. This option automatically enables the --point-to-point option. |
--large-cluster [ num-control-nodes | default] |
Enables the large cluster feature, where a subset of nodes called control nodes connect to Spread to send and receive broadcast messages. Consider using this option for a cluster with more than 50 nodes in Enterprise Mode. Vertica automatically enables this feature if you install onto 120 or more nodes in Enterprise Mode, or 16 or more nodes in Eon Mode.
Supply this option with one of the following arguments:
num-control-nodes : Sets the number of control nodes in the new database. For Enterprise Mode, sets the number of control nodes in the entire cluster. In Eon Mode, sets the number of control nodes in the initial default subcluster. This value must be between 1 and 120, inclusive.
Note
Vertica sets the number of control nodes for the database to the value you specify here or the number of nodes in the --hosts option list, whichever is less.
default : Vertica sets the number of control nodes to the square root of the total number of cluster nodes listed in --hosts (-s ).
See Enable Large Cluster When Installing Vertica for more information.
Default: default
|
--license { licensefile | CE } -L { licensefile | CE } |
Silently and automatically deploys the license key to /opt/vertica/config/share . On multi-node installations, the --license option also applies the license to all nodes declared in the --hosts host_list . To activate your license, combine this option with the --accept-eula option. If you do not use the --accept-eula option, you are asked to accept the EULA when you connect to your database. After you accept the EULA, your license is activated.
If specified with CE , automatically deploys the Community Edition license key, which is included in your download. You do not need to specify a license file.
For example:
--license CE
--license /tmp/vlicense.dat
|
--no-system-configuration |
Specifies that the installer makes no changes to system properties. By default, the installer makes system configuration changes to meet server requirements.
If you use this option, the installer posts warnings or failures for configuration settings that do not meet requirements that it otherwise configures automatically.
Note
This option has no effect on creating or updating user accounts.
|
--point-to-point -T |
Configures spread to use direct point-to-point communication between all Vertica nodes. Use this option if your nodes are not located on the same subnet. Also use this option for all virtual environment installations, whether the virtual servers are on the same subnet or not.
The maximum number of spread daemons supported in point-to-point communication in Vertica is 80. It is possible to have more than 80 nodes by using large cluster mode, which does not install a spread daemon on each node.
Do not combine this option with --broadcast (-U ).
This option is automatically enabled when you enable the --ipv6 option.
Important
When changing the configuration from --broadcast (-U ) (the default) to --point-to-point (-T ) or vice-versa, you must also specify --control-network (-S ).
|
--record-config filename -B filename |
Accepts a file name, which when used in conjunction with command line options, creates a properties file that can be used with --config-file (-z ). This option creates the properties file and exits; it does not affect installation. |
--remove-hosts host-list -R host-list |
A comma-separated list of hosts to remove from an existing Vertica cluster.
--remove-hosts modifies an existing installation of Vertica by removing a host from the database cluster and then reconfiguring the spread. This is useful for removing an obsolete or over-provisioned system. For example:
--remove-hosts host01
-R 192.168.233.101
Notes:
-
If you used --point-to-point (-T ) to configure spread to use direct point-to-point communication within the existing cluster, you must also use it when you remove a host; otherwise, the hosts automatically use UDP broadcast traffic, resulting in cluster communication problems that prevent Vertica from running properly.
-
The update_vertica script described in Removing nodes calls the install_vertica script to perform the update to the installation. You can use the
install_vertica or update_vertica script with this option.
|
--rpm package-name -r package-name --deb package-name |
The name of the RPM or Debian package. For example:
--rpm
vertica-12.0.x.x86_64.RHEL6.rpm
The install package must be provided if installing or upgrading multiple nodes and the nodes do not have the latest server package installed, or if you are adding a new node. The install_vertica and update_vertica scripts serially copy the server package to the other nodes and install the package.
Tip
If installing or upgrading a large number of nodes, consider manually installing the package on all nodes before running the upgrade script, as the script runs faster if it does not need to serially upload and install the package on each node.
|
--spread-logging -w |
Configures spread to output logging to /opt/vertica/log/spread_hostname.log . This option does not apply to upgrades.
Note
Do not enable spread logging unless so directed by Vertica technical support.
|
--ssh-identity file -i file |
The root private-key file to use if passwordless ssh has already been configured between the hosts. Verify that normal SSH works without a password before using this option. The file can be a private key file (for example, id_rsa) or a PEM file. Do not use with the --ssh-password (-P ) option.
Vertica accepts the following:
-
An SSH private key that is not password protected. You cannot run the
install_vertica script with the sudo command when using this method.
-
A password-protected private key used with an SSH agent. Note that sudo typically resets environment variables when it is invoked. Specifically, the SSH_AUTH_SOCK variable required by the SSH agent may be reset. Therefore, configure your system to maintain SSH_AUTH_SOCK, or invoke install_vertica using a method similar to the following: sudo SSH_AUTH_SOCK=$SSH_AUTH_SOCK /opt/vertica/sbin/install_vertica ...
|
--ssh-password password -P password |
The password to use by default for each cluster host. If you omit this option, and you also omit specifying --ssh-identity (-i ), then the script prompts for the password as necessary and does not echo input.
Do not use this option together with --ssh-identity (-i ).
Important
Specify the password as follows:
-
If you run the
install_vertica script as root, specify the root password:
# /opt/vertica/sbin/
install_vertica -P root-passwd
-
If you run the
install_vertica script with the sudo command, specify the password of the user who runs
install_vertica , not the root password.
For example, if user dbadmin runs
install_vertica with sudo and has the password dbapasswd , then specify the password as dbapasswd :
$ sudo /opt/vertica/sbin/
install_vertica -P dbapasswd
|
--temp-dir directory |
The temporary directory used for administrative purposes. If it is a directory within /opt/vertica , then it is created by the installer. Otherwise, the directory must already exist on all nodes in the cluster. The location must be writable by the dbadmin user.
Note
This is not a temporary data location for the database.
Default: /tmp
|
3.4.3 - Installing Vertica silently
This section describes how to create a properties file that lets you install and deploy Vertica-based applications quickly and without much manual intervention.
This section describes how to create a properties file that lets you install and deploy Vertica-based applications quickly and without much manual intervention.
Install the properties file:
-
Download and install the Vertica install package, as described in Installing Vertica.
-
Create the properties file that enables non-interactive setup by supplying the parameters you want Vertica to use. For example:
The following command assumes a multi-node setup:
# /opt/vertica/sbin/install_vertica --record-config file_name --license /tmp/license.txt --accept-eula \
# --dba-user-password password --ssh-password password --hosts host_list --rpm package_name
The following command assumes a single-node setup:
# /opt/vertica/sbin/install_vertica --record-config file_name --license /tmp/license.txt --accept-eula \
# --dba-user-password password
Option |
Description |
--record-config file_name |
[Required] Accepts a file name, which when used in conjunction with command line options, creates a properties file that can be used with the --config-file option during setup. This flag creates the properties file and exits; it has no impact on installation. |
--license { license_file | CE } |
Silently and automatically deploys the license key to /opt/vertica/config/share. On multi-node installations, the --license option also applies the license to all nodes declared in the --hosts host_list .
If specified with CE, automatically deploys the Community Edition license key, which is included in your download. You do not need to specify a license file.
|
--accept-eula |
Silently accepts the EULA agreement during setup. |
--dba-user-password password |
The password for the Database Superuser account; if not supplied, the script prompts for the password and does not echo the input. |
--ssh-password password |
The root password to use by default for each cluster host; if not supplied, the script prompts for the password if and when necessary and does not echo the input. |
--hosts host_list |
A comma-separated list of hostnames or IP addresses to include in the cluster; do not include space characters in the list.
Examples:
--hosts host01,host02,host03
--hosts 192.168.233.101,192.168.233.102,192.168.233.103
|
--rpm package_name
--deb package_name
|
The name of the RPM or Debian package that contained this script.
Example:
--rpm
vertica-12.0.x.x86_64.RHEL6.rpm
This parameter is required on multi-node installations if the RPM or DEB package is not already installed on the other hosts.
|
See Installing Vertica with the installation script for the complete set of installation parameters.
Tip
Supply the parameters to the properties file once only. You can then install Vertica using just the --config-file
parameter, as described below.
- Use one of the following commands to run the installation script.
-
If you are root:
/opt/vertica/sbin/install_vertica --config-file file_name
-
If you are using sudo:
$ sudo /opt/vertica/sbin/install_vertica --config-file file_name
--config-file file_name accepts an existing properties file created by --record-config file_name . This properties file contains key/value parameters that map to values in the install_vertica script, many with Boolean arguments that default to false.
The command for a single-node install might look like this:
# /opt/vertica/sbin/install_vertica --config-file /tmp/vertica-inst.prp
- If you did not supply a
--ssh-password
password parameter to the properties file, you are prompted to provide the requested password to allow installation of the RPM/DEB and system configuration of the other cluster nodes. If you are root, this is the root password. If you are using sudo, this is the sudo user password. The password does not echo on the command line.
Note
If you are root on a single-node installation, you are not prompted for a password.
- If you did not supply a
--dba-user-password
password parameter to the properties file, you are prompted to provide the database administrator account password.
The installation script creates a new Linux user account (dbadmin by default) with the password that you provide.
- Carefully examine any warnings produced by
install_vertica
and correct the problems if possible. For example, insufficient RAM, insufficient network throughput, and overly high readahead settings on the file system can cause performance problems later on.
Note
You can redirect warning output to a separate file, instead of having it display on the system. Use your platform's standard redirection mechanism. For example: install_vertica
[options]
2> /tmp/file
- Optionally perform the following steps:
- Disconnect from the Administration Host as instructed by the script. This is required to:
At this point, Linux root privileges are no longer needed. The database administrator can perform the remaining steps.
Note
When creating a new database, the database administrator might want to use different data or catalog locations than those created by the installation script. In that case, a Linux administrator might need to create those directories and change their ownership to the database administrator.
If you supplied the --license
and --accept-eula
parameters to the properties file, then proceed to Getting started and then see Configuring the database.
Otherwise:
-
Log in to the Database Superuser account on the administration host.
-
Accept the End User License Agreement and install the license key you downloaded previously as described in Install the license key.
-
Proceed to Getting started and then see Configuring the database.
Notes
accept_eula = True
license_file = /tmp/license.txt
record_to = file_name
root_password = password
vertica_dba_group = verticadba
vertica_dba_user = dbadmin
vertica_dba_user_password = password
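The recorded properties file shown above is plain key/value text. As a minimal sketch (with hypothetical placeholder values, written to /tmp rather than a real install location), you can create such a file with a heredoc and read a single key back out:

```shell
# Sketch: write a sample properties file in the same key/value format
# that --record-config produces. All values are hypothetical placeholders.
cat > /tmp/vertica-inst.prp <<'EOF'
accept_eula = True
license_file = /tmp/license.txt
vertica_dba_group = verticadba
vertica_dba_user = dbadmin
EOF

# Read a single key back out with simple "key = value" parsing.
dba_user=$(awk -F' = ' '$1 == "vertica_dba_user" {print $2}' /tmp/vertica-inst.prp)
echo "$dba_user"
```

On a real cluster you would pass the recorded file to the installer with `--config-file` instead of parsing it yourself; the snippet only illustrates the file format.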
3.4.4 - Installing Vertica on Amazon Web Services (AWS)
Beginning with Vertica 6.1.x, you can use Vertica on AWS by utilizing a pre-configured Amazon Machine Image (AMI).
Beginning with Vertica 6.1.x, you can use Vertica on AWS by utilizing a pre-configured Amazon Machine Image (AMI). For details on installing and configuring a cluster on AWS, refer to Installing and Running Vertica on AWS.
4 - Installing Vertica for Eon Mode on-premises
You can install Vertica in your own network (also known as "on-premises") and have it run in Eon Mode.
You can install Vertica in your own network (also known as "on-premises") and have it run in Eon Mode. See Eon Mode concepts for more information about Eon Mode. In an Eon Mode on-premises configuration, Vertica uses an object store hosting on your network for communal storage. See Eon on-premises storage for a list of the object stores that Vertica supports for communal storage.
Installing Vertica for an Eon Mode on-premises deployment follows the same steps you follow to install Vertica for an on-premises Enterprise Mode deployment. The actual difference between the two comes when you create the database.
4.1 - Installing an Eon Mode database on premises with FlashBlade
You have two options for installing an Eon Mode database on premises with Pure Storage FlashBlade as your S3-compatible communal storage.
You have two options for installing an Eon Mode database on premises with Pure Storage FlashBlade as your S3-compatible communal storage:
Step 1: create a bucket and credentials on the Pure Storage FlashBlade
To use a Pure Storage FlashBlade appliance as a communal storage location for an Eon Mode database you must have:
-
The IP address of the FlashBlade appliance. You must also have the connection port number if your FlashBlade is not using the standard port 80 or 443 to access the bucket. All of the nodes in your Vertica cluster must be able to access this IP address. Make sure any firewalls between the FlashBlade appliance and the nodes are configured to allow access.
-
The name of the bucket on the FlashBlade to use for communal storage.
-
An access key and secret key for a user account that has read and write access to the bucket.
See the Pure Storage support site for instructions on how to create the bucket and the access keys needed for a communal storage location.
Step 2: install Vertica on your cluster
To install Vertica:
-
Ensure your nodes are configured properly by reviewing all of the content in the Before you install Vertica section.
-
Use the install_vertica
script to verify that your nodes are correctly configured and to install the Vertica binaries on all of your nodes. Follow the steps under Installing using the command line to install Vertica.
Note
These installation steps are the same ones you follow to install Vertica in Enterprise Mode. The difference between Eon Mode and Enterprise Mode on-premises databases is how you create the database, not how you install the Vertica software.
Step 3: create an authorization file
Before you create your Eon Mode on-premises database, you must create an authorization file that admintools will use to authenticate with the FlashBlade storage.
-
On the Vertica node where you will run admintools to create your database, use a text editor to create a file. You can name this file anything you wish. In these steps, it is named auth_params.conf
. The location of this file isn't important, as long as it is readable by the Linux user you use to create the database (usually, dbadmin).
Important
The auth_params.conf
file contains the secret key to access the bucket containing your Eon Mode database's data. This information is sensitive, and can be used to access the raw data in your database. Be sure this file is not readable by unauthorized users. After you have created your database, you can delete this file.
-
Add the following lines to the file:
awsauth = FlashBlade_Access_Key:FlashBlade_Secret_Key
awsendpoint = FlashBladeIp:FlashBladePort
Note
You do not need to supply a port number in the awsendpoint
setting if you are using the default port for the connection between Vertica and the FlashBlade (80 for an unencrypted connection or 443 for an encrypted connection).
-
If you are not using TLS encryption for the connection between Vertica and the FlashBlade, add the following line to the file:
awsenablehttps = 0
-
Save the file and exit the editor.
This example auth_params.conf
file is for an unencrypted connection between the Vertica cluster and a FlashBlade appliance at IP address 10.10.20.30 using the standard port 80.
awsauth = PIWHSNDGSHVRPIQ:339068001+e904816E02E5fe9103f8MQOEAEHFFVPKBAAL
awsendpoint = 10.10.20.30
awsenablehttps = 0
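Because auth_params.conf contains the secret key, it helps to create it with restrictive permissions from the start. The following sketch uses hypothetical placeholder credentials and a /tmp path; substitute your own values and location:

```shell
# Sketch: create auth_params.conf so that only the owner can read it.
# The access key, secret key, and endpoint IP are hypothetical placeholders.
umask 077
cat > /tmp/auth_params.conf <<'EOF'
awsauth = EXAMPLE_ACCESS_KEY:EXAMPLE_SECRET_KEY
awsendpoint = 10.10.20.30
awsenablehttps = 0
EOF
chmod 600 /tmp/auth_params.conf   # owner read/write only
ls -l /tmp/auth_params.conf
```

Remember that you can delete this file after the database is created.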
Step 4: choose a depot path on all nodes
Choose or create a directory on each node for the depot storage path. The directory you supply for the depot storage path parameter must:
-
Have the same path on all nodes in the cluster (for example, /home/dbadmin/depot
).
-
Be readable and writable by the dbadmin user.
-
Have sufficient storage. By default, Vertica uses 60% of the filesystem space containing the directory for depot storage. You can limit the size of the depot by using the --depot-size
argument in the create_db command. See Configuring your Vertica cluster for Eon Mode for guidelines on choosing a size for your depot.
The admintools create_db tool will attempt to create the depot path for you if it doesn't exist.
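The checks above can be sketched in shell. The depot path here is illustrative only, and the 60% figure merely mirrors the documented default (Vertica performs the real sizing itself):

```shell
# Sketch: verify a candidate depot path is writable and estimate the
# default depot size (60% of the filesystem containing it).
depot=/tmp/depot        # hypothetical; e.g. /home/dbadmin/depot on a real node
mkdir -p "$depot"
[ -w "$depot" ] || echo "depot path is not writable"

# Total size (KB) of the filesystem containing the depot path (POSIX df output).
fs_kb=$(df -Pk "$depot" | awk 'NR==2 {print $2}')
default_depot_kb=$(( fs_kb * 60 / 100 ))
echo "default depot size: ${default_depot_kb} KB"
```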
Step 5: create the Eon on-premises database
Use the admintools create_db tool to create the database. You must pass this tool the following arguments:
Argument |
Description |
-x |
The path to the auth_params.conf file. |
--communal-storage-location |
The S3 URL for the bucket on the FlashBlade appliance (usually, this is s3://bucketname). |
--depot-path |
The absolute path to store the depot on the nodes in the cluster. |
--shard-count |
The number of shards for the database. This integer is usually either a multiple of the number of nodes in your cluster, or an even divisor of it. See Planning for Scaling Your Cluster for more information. |
-s |
A comma-separated list of the nodes in your database. |
-d |
The name for your database. |
Some common optional arguments include:
Argument |
Description |
-l |
The absolute path to the Vertica license file to apply to the new database. |
-p |
The password for the new database. |
--depot-size |
The maximum size for the depot. Defaults to 60% of the filesystem containing the depot path.
You can specify the size in two ways:
-
integer % : Percentage of filesystem's disk space to allocate.
-
integer {K|M|G|T} : Amount of disk space to allocate for the depot in kilobytes, megabytes, gigabytes, or terabytes.
However you specify this value, the depot size cannot exceed 80 percent of the disk space of the file system where the depot is stored.
|
To view all arguments for the create_db tool, run the command:
admintools -t create_db --help
The following example demonstrates creating a three-node database named verticadb, specifying the depot will be stored in the home directory of the dbadmin user.
$ admintools -t create_db -x auth_params.conf \
--communal-storage-location=s3://verticadbbucket \
--depot-path=/home/dbadmin/depot --shard-count=6 \
-s vnode01,vnode02,vnode03 -d verticadb -p 'YourPasswordHere'
Step 6: disable streaming limitations
After creating the database, disable the AWSStreamingConnectionPercentage configuration parameter. This parameter controls the number of connections to the object store that Vertica uses for streaming reads. In a cloud environment, it helps avoid having streaming data from the object store use up all of the available file handles, leaving some available for other object store operations. Because of the low latency of on-premises object stores such as FlashBlade, this setting is unnecessary for an Eon Mode on-premises install. Set it to 0 to disable it.
The following example shows how to disable this parameter using ALTER DATABASE...SET PARAMETER:
=> ALTER DATABASE DEFAULT SET PARAMETER AWSStreamingConnectionPercentage = 0;
ALTER DATABASE
Deciding whether to disable the depot
The FlashBlade object store's performance is fast enough that you may consider disabling the depot in your Vertica database. If you disable the depot, you can get by with less local storage on your nodes. However, there is always a performance impact of disabling the depot. The exact impact depends mainly on the types of workloads you run on your database. The performance impact can range from a 30% to 4000% decrease in query performance. Only consider disabling the depot if you will see a significant benefit from reducing the storage requirements of your nodes. Before disabling the depot on a production database, always run a proof of concept test that executes the same workloads as your production database.
To disable the depot, set the UseDepotForReads configuration parameter to 0. The following example demonstrates disabling this parameter using ALTER DATABASE...SET PARAMETER:
=> ALTER DATABASE DEFAULT SET PARAMETER UseDepotForReads = 0;
ALTER DATABASE
4.2 - Installing Eon Mode on-premises with communal storage on MinIO
To use MinIO as a communal storage location for an Eon Mode database, you must satisfy several prerequisites.
Step 1: create a bucket and credentials on MinIO
To use MinIO as a communal storage location for an Eon Mode database, you must have:
-
The IP address and port number of the MinIO cluster. MinIO's default port number is 9000. A Vertica database running in Eon Mode defaults to using port 80 for unencrypted connections and port 443 for TLS-encrypted connections. All of the nodes in your Vertica cluster must be able to access the MinIO cluster's IP address. Make sure any firewalls between the MinIO cluster and the nodes are configured to allow access.
-
The name of the bucket on the MinIO cluster to use for communal storage.
-
An access key and secret key for a user account that has read and write access to the bucket.
See the MinIO documentation for instructions on how to create the bucket and the access keys needed for a communal storage location.
Step 2: install Vertica on your cluster
To install Vertica:
-
Ensure your nodes are configured properly by reviewing all of the content in the Before you install Vertica section.
-
Use the install_vertica
script to verify that your nodes are correctly configured and to install the Vertica binaries on all of your nodes. Follow the steps under Installing using the command line to install Vertica.
Note
These installation steps are the same ones you follow to install Vertica in Enterprise Mode. The difference between Eon Mode and Enterprise Mode on-premises databases is how you create the database, not how you install the Vertica software.
Step 3: create an authorization file
Before you create your Eon Mode on-premises database, you must create an authorization file that admintools will use to authenticate with the MinIO storage cluster.
-
On the Vertica node where you will run admintools to create your database, use a text editor to create a file. You can name this file anything you wish. In these steps, it is named auth_params.conf
. The location of this file isn't important, as long as it is readable by the Linux user you use to create the database (usually, dbadmin).
Important
The auth_params.conf
file contains the secret key to access the bucket containing your Eon Mode database's data. This information is sensitive, and can be used to access the raw data in your database. Be sure this file is not readable by unauthorized users. After you have created your database, you can delete this file.
-
Add the following lines to the file:
awsauth = MinIO_Access_Key:MinIO_Secret_Key
awsendpoint = MinIOIp:MinIOPort
Note
You do not need to supply a port number in the awsendpoint
setting if you configured your MinIO cluster to use the default HTTP ports (80 for an unencrypted connection or 443 for an encrypted connection). MinIO uses port 9000 by default.
-
If you are not using TLS encryption for the connection between Vertica and MinIO, add the following line to the file:
awsenablehttps = 0
-
Save the file and exit the editor.
This example auth_params.conf
file is for an unencrypted connection between the Vertica cluster and a MinIO cluster at IP address 10.20.30.40 using port 9000 (which is the default for MinIO).
awsauth = PIWHSNDGSHVRPIQ:339068001+e904816E02E5fe9103f8MQOEAEHFFVPKBAAL
awsendpoint = 10.20.30.40:9000
awsenablehttps = 0
Step 4: choose a depot path on all nodes
Choose or create a directory on each node for the depot storage path. The directory you supply for the depot storage path parameter must:
-
Have the same path on all nodes in the cluster (for example, /home/dbadmin/depot
).
-
Be readable and writable by the dbadmin user.
-
Have sufficient storage. By default, Vertica uses 60% of the filesystem space containing the directory for depot storage. You can limit the size of the depot by using the --depot-size
argument in the create_db command. See Configuring your Vertica cluster for Eon Mode for guidelines on choosing a size for your depot.
The admintools create_db tool will attempt to create the depot path for you if it doesn't exist.
Step 5: create the Eon on-premises database
Use the admintools create_db tool to create the database. You must pass this tool the following arguments:
Argument |
Description |
-x |
The path to the auth_params.conf file. |
--communal-storage-location |
The S3 URL for the bucket on the MinIO cluster (usually, this is s3://bucketname). |
--depot-path |
The absolute path to store the depot on the nodes in the cluster. |
--shard-count |
The number of shards for the database. This integer is usually either a multiple of the number of nodes in your cluster, or an even divisor of it. See Planning for Scaling Your Cluster for more information. |
-s |
A comma-separated list of the nodes in your database. |
-d |
The name for your database. |
Some common optional arguments include:
Argument |
Description |
-l |
The absolute path to the Vertica license file to apply to the new database. |
-p |
The password for the new database. |
--depot-size |
The maximum size for the depot. Defaults to 60% of the filesystem containing the depot path.
You can specify the size in two ways:
-
integer % : Percentage of filesystem's disk space to allocate.
-
integer {K|M|G|T} : Amount of disk space to allocate for the depot in kilobytes, megabytes, gigabytes, or terabytes.
However you specify this value, the depot size cannot exceed 80 percent of the disk space of the file system where the depot is stored.
|
To view all arguments for the create_db tool, run the command:
admintools -t create_db --help
The following example demonstrates creating a three-node database named verticadb, specifying the depot will be stored in the home directory of the dbadmin user.
$ admintools -t create_db -x auth_params.conf \
--communal-storage-location=s3://verticadbbucket \
--depot-path=/home/dbadmin/depot --shard-count=6 \
-s vnode01,vnode02,vnode03 -d verticadb -p 'YourPasswordHere'
Step 6: disable streaming limitations
After creating the database, disable the AWSStreamingConnectionPercentage configuration parameter. This parameter controls the number of connections to the object store that Vertica uses for streaming reads. In a cloud environment, it helps avoid having streaming data from the object store use up all of the available file handles, leaving some available for other object store operations. Because of the low latency of on-premises object stores such as MinIO, this setting is unnecessary for an Eon Mode on-premises install. Set it to 0 to disable it.
The following example shows how to disable this parameter using ALTER DATABASE...SET PARAMETER:
=> ALTER DATABASE DEFAULT SET PARAMETER AWSStreamingConnectionPercentage = 0;
ALTER DATABASE
4.3 - Installing Eon Mode on-premises with communal storage on HDFS
To use HDFS as a communal storage location for an Eon Mode database, you must satisfy several prerequisites.
Step 1: satisfy HDFS environment prerequisites
To use HDFS as a communal storage location for an Eon Mode database you must:
-
Run the WebHDFS service.
-
If using Kerberos, create a Kerberos principal for the Vertica (system) user as described in Kerberos authentication, and grant it read and write access to the location in HDFS where you will place your communal storage. Vertica always uses this system principal to access communal storage.
-
If using High Availability Name Node or swebhdfs, distribute the HDFS configuration files to all Vertica nodes as described in Configuring HDFS access. This step is necessary even though you do not use the hdfs scheme for communal storage.
-
If using swebhdfs (wire encryption) instead of webhdfs, configure the HDFS cluster with certificates trusted by the Vertica hosts and set dfs.encrypt.data.transfer in hdfs-site.xml.
-
Vertica has no additional requirements for encryption at rest. Consult the documentation for your Hadoop distribution for information on how to configure encryption at rest for WebHDFS.
Note
Hadoop currently does not support IPv6 network addresses. Your cluster must use IPv4 addresses to access HDFS. If you choose to use IPv6 network addresses for the hosts in your database cluster, make sure they can access IPv4 addresses. One way to enable this access is to assign your Vertica hosts an IPv4 address in addition to an IPv6 address.
Step 2: install Vertica on your cluster
To install Vertica:
-
Ensure your nodes are configured properly by reviewing all of the content in the Before you install Vertica section.
-
Use the install_vertica
script to verify that your nodes are correctly configured and to install the Vertica binaries on all of your nodes. Follow the steps under Installing using the command line to install Vertica.
Note
These installation steps are the same ones you follow to install Vertica in Enterprise Mode. The difference between Eon Mode and Enterprise Mode on-premises databases is how you create the database, not how you install the Vertica software.
Step 3: create a bootstrapping file
Before you create your Eon Mode on-premises database, you must create a bootstrapping file to specify parameters that are required for database creation. This step applies if you are using Kerberos, High Availability Name Node, or TLS (wire encryption).
-
On the Vertica node where you will run admintools to create your database, use a text editor to create a file. You can name this file anything you wish. In these steps, it is named bootstrap_params.conf
. The location of this file isn't important, as long as it is readable by the Linux user you use to create the database (usually, dbadmin).
-
Add the following lines to the file. HadoopConfDir is typically set to /etc/hadoop/conf
; KerberosServiceName is usually set to vertica
.
HadoopConfDir = config-path
KerberosServiceName = principal-name
KerberosRealm = realm-name
KerberosKeytabFile = keytab-path
If you are not using HA Name Node, for example in a test environment, you can omit HadoopConfDir and use an explicit Name Node host and port when specifying the location of the communal storage.
-
Save the file and exit the editor.
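The bootstrap file is plain key/value text. As a minimal sketch, the following writes one with hypothetical realm, principal, and keytab values (substitute the values for your Kerberos environment):

```shell
# Sketch: write a bootstrap_params.conf for an HDFS communal storage
# location. Realm, service name, and keytab path are hypothetical.
cat > /tmp/bootstrap_params.conf <<'EOF'
HadoopConfDir = /etc/hadoop/conf
KerberosServiceName = vertica
KerberosRealm = EXAMPLE.COM
KerberosKeytabFile = /etc/vertica.keytab
EOF
grep -c ' = ' /tmp/bootstrap_params.conf
```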
Step 4: choose a depot path on all nodes
Choose or create a directory on each node for the depot storage path. The directory you supply for the depot storage path parameter must:
-
Have the same path on all nodes in the cluster (for example, /home/dbadmin/depot
).
-
Be readable and writable by the dbadmin user.
-
Have sufficient storage. By default, Vertica uses 60% of the filesystem space containing the directory for depot storage. You can limit the size of the depot by using the --depot-size
argument in the create_db command. See Configuring your Vertica cluster for Eon Mode for guidelines on choosing a size for your depot.
The admintools create_db tool will attempt to create the depot path for you if it doesn't exist.
Step 5: create the Eon on-premises database
Use the admintools create_db tool to create the database. You must pass this tool the following arguments:
Argument |
Description |
-x |
The path to the bootstrap configuration file (bootstrap_params.conf in the examples in this section). |
--communal-storage-location |
The webhdfs or swebhdfs URL for the HDFS location. You cannot use the hdfs scheme. |
--depot-path |
The absolute path to store the depot on the nodes in the cluster. |
--shard-count |
The number of shards for the database. This integer is usually either a multiple of the number of nodes in your cluster, or an even divisor of it. See Planning for Scaling Your Cluster for more information. |
-s |
A comma-separated list of the nodes in your database. |
-d |
The name for your database. |
Some common optional arguments include:
Argument |
Description |
-l |
The absolute path to the Vertica license file to apply to the new database. |
-p |
The password for the new database. |
--depot-size |
The maximum size for the depot. Defaults to 60% of the filesystem containing the depot path.
You can specify the size in two ways:
-
integer % : Percentage of filesystem's disk space to allocate.
-
integer {K|M|G|T} : Amount of disk space to allocate for the depot in kilobytes, megabytes, gigabytes, or terabytes.
However you specify this value, the depot size cannot exceed 80 percent of the disk space of the file system where the depot is stored.
|
To view all arguments for the create_db tool, run the command:
admintools -t create_db --help
The following example demonstrates creating a three-node database named verticadb, specifying the depot will be stored in the home directory of the dbadmin user.
$ admintools -t create_db -x bootstrap_params.conf \
--communal-storage-location=webhdfs://mycluster/verticadb \
--depot-path=/home/dbadmin/depot --shard-count=6 \
-s vnode01,vnode02,vnode03 -d verticadb -p 'YourPasswordHere'
If you are not using HA Name Node, for example in a test environment, you can use an explicit Name Node host and port for --communal-storage-location as in the following example.
$ admintools -t create_db -x bootstrap_params.conf \
--communal-storage-location=webhdfs://namenode.hadoop.example.com:50070/verticadb \
--depot-path=/home/dbadmin/depot --shard-count=6 \
-s vnode01,vnode02,vnode03 -d verticadb -p 'YourPasswordHere'
5 - Troubleshooting the Vertica install
The tasks described in this section are performed automatically by the install_vertica script and are described in Installing Using the Command Line.
The tasks described in this section are performed automatically by the install_vertica
script, as described in Installing using the command line. If you did not encounter any installation problems, proceed to the Administrator's guide for instructions on how to configure and operate a database.
5.1 - Validation scripts
Vertica provides several validation utilities that can be used prior to deploying Vertica to help determine if your hosts and network can properly handle the processing and network traffic required by Vertica.
Vertica provides several validation utilities that can be used prior to deploying Vertica to help determine if your hosts and network can properly handle the processing and network traffic required by Vertica. These utilities can also be used if you are encountering performance issues and need to troubleshoot the issue.
After you install the Vertica RPM, you have access to the following scripts in /opt/vertica/bin
:
-
vcpuperf - a CPU performance test that measures your processor's speed.
-
vioperf - an input/output test that measures the speed and consistency of your storage.
-
vnetperf - a network test that measures the latency and throughput of your network between hosts.
These utilities can be run at any time, but are well suited to use before running the install_vertica script.
5.1.1 - Vcpuperf
The vcpuperf utility measures your server's CPU processing speed and compares it against benchmarks for common server CPUs.
The vcpuperf utility measures your server's CPU processing speed and compares it against benchmarks for common server CPUs. The utility performs a CPU test and measures the time it takes to complete the test. The lower the number scored on the test, the better the performance of the CPU.
The vcpuperf utility also checks the high and low load times to determine if CPU throttling is enabled. If a server's low-load computation time is significantly longer than the high-load computation time, CPU throttling may be enabled. CPU throttling is a power-saving feature. However, CPU throttling can reduce the performance of your server. Vertica recommends disabling CPU throttling to enhance server performance.
Syntax
vcpuperf [-q]
Option
Option |
Description |
-q |
Run in quiet mode. Quiet mode displays only the CPU Time, Real Time, and high and low load times. |
Returns
- CPU Time: the amount of time it took the CPU to run the test.
- Real Time: the total time for the test to execute.
- High load time: the amount of time to run the load test while simulating a high CPU load.
- Low load time: the amount of time to run the load test while simulating a low CPU load.
Example
The following example shows a CPU that is running slightly slower than the expected time on a Xeon 5670 CPU that has CPU throttling enabled.
[root@node1 bin]# /opt/vertica/bin/vcpuperf
Compiled with: 4.1.2 20080704 (Red Hat 4.1.2-52)
Expected time on Core 2, 2.53GHz: ~9.5s
Expected time on Nehalem, 2.67GHz: ~9.0s
Expected time on Xeon 5670, 2.93GHz: ~8.0s
This machine's time:
CPU Time: 8.540000s
Real Time:8.710000s
Some machines automatically throttle the CPU to save power.
This test can be done in <100 microseconds (60-70 on Xeon 5670, 2.93GHz).
Low load times much larger than 100-200us or much larger than the corresponding high load time
indicate low-load throttling, which can adversely affect small query / concurrent performance.
This machine's high load time: 67 microseconds.
This machine's low load time: 208 microseconds.
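The throttling heuristic this output describes can be sketched as a small check. This is an illustration only, not part of vcpuperf; the function name and the exact thresholds are assumptions based on the guidance in the sample output (low-load times much larger than 100-200 microseconds, or much larger than the high-load time, suggest throttling):

```shell
# Hypothetical helper: flag likely low-load CPU throttling from vcpuperf-style
# load times, both given in microseconds. Thresholds follow the guidance in the
# sample output above.
throttling_suspected() {
  high_us=$1
  low_us=$2
  if [ "$low_us" -gt 200 ] && [ "$low_us" -gt $(( high_us * 2 )) ]; then
    echo "yes"
  else
    echo "no"
  fi
}

# The example machine reports a 67us high load time and a 208us low load time,
# so throttling is suspected:
throttling_suspected 67 208   # prints "yes"
```

Applied to the sample output above, the 208-microsecond low-load time exceeds both thresholds, which matches the document's conclusion that this machine has CPU throttling enabled.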
5.1.2 - Vioperf
The vioperf utility quickly tests the performance of your host's input and output subsystem. The utility performs the following tests: sequential write, sequential rewrite, sequential read, and skip read (reading non-contiguous data blocks).

The utility verifies that the host reads the same bytes that it wrote and prints its output to STDOUT. The utility also logs the output to a JSON formatted file.

For data in HDFS, the utility tests reads but not writes.
Syntax
vioperf [--help] [--duration=<INTERVAL>] [--log-interval=<INTERVAL>]
[--log-file=<FILE>] [--condense-log] [--thread-count=<N>] [--max-buffer-size=<SIZE>]
[--preserve-files] [--disable-crc] [--disable-direct-io] [--debug]
[<DIR>*]
- The minimum required I/O is 20 MB/s read/write per physical processor core on each node, in full duplex (reading and writing simultaneously), concurrently on all nodes of the cluster.
- The recommended I/O is 40 MB/s per physical core on each node.
- For example, the I/O rate for a node with 2 hyper-threaded six-core CPUs (12 physical cores) is 240 MB/s required minimum, 480 MB/s recommended.
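Both rates scale linearly with the physical core count, so the arithmetic in the example can be sketched as follows (the helper names are illustrative, not part of Vertica):

```shell
# Illustrative helpers: minimum (20 MB/s) and recommended (40 MB/s) I/O
# throughput per physical core, multiplied by the node's physical core count.
min_io_mbps()         { echo $(( $1 * 20 )); }
recommended_io_mbps() { echo $(( $1 * 40 )); }

# A node with 2 hyper-threaded six-core CPUs has 12 physical cores:
echo "minimum:     $(min_io_mbps 12) MB/s"          # 240 MB/s
echo "recommended: $(recommended_io_mbps 12) MB/s"  # 480 MB/s
```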
Disk space vioperf needs
vioperf requires about 4.5 GB to run.
Options

| Option | Description |
|---|---|
| --help | Prints a help message and exits. |
| --duration | The length of time vioperf runs performance tests. The default is 5 minutes. Specify the interval in seconds, minutes, or hours with any of these suffixes:<br>- Seconds: s, sec, secs, second, seconds. Example: --duration=60sec<br>- Minutes: m, min, mins, minute, minutes. Example: --duration=10min<br>- Hours: h, hr, hrs, hour, hours. Example: --duration=1hrs |
| --log-interval | The interval at which the log file reports summary information. The default interval is 10 seconds. This option uses the same interval notation as --duration. |
| --log-file | The path and name where log file contents are written, in JSON. If not specified, then vioperf creates a file named results date-time.JSON in the current directory. |
| --condense-log | Directs vioperf to write the log file contents in condensed format, one JSON entry per line, rather than as indented JSON syntax. |
| --thread-count=<N> | The number of execution threads to use. By default, vioperf uses all threads available on the host machine. |
| --max-buffer-size=<SIZE> | The maximum size of the in-memory buffer to use for reads or writes. Specify the units with any of these suffixes:<br>- Bytes: b, byte, bytes<br>- Kilobytes: k, kb, kilobyte, kilobytes<br>- Megabytes: m, mb, megabyte, megabytes<br>- Gigabytes: g, gb, gigabyte, gigabytes |
| --preserve-files | Directs vioperf to keep the files it writes. This parameter is ignored for HDFS tests, which are read-only. Inspecting the files can help diagnose write-related failures. |
| --disable-crc | Directs vioperf to ignore CRC checksums when validating writes. Verifying checksums can add overhead, particularly when running vioperf on slower processors. This parameter is ignored for HDFS tests. |
| --disable-direct-io | When reading from or writing to a local file system, vioperf goes directly to disk by default, bypassing the operating system's page cache. Using direct I/O allows vioperf to measure performance quickly without having to fill the cache.<br>Disabling this behavior can produce more realistic performance results but slows down the operation of vioperf. |
| --debug | Directs vioperf to report verbose error messages. |
| <DIR> | Zero or more directories to test. If you do not specify a directory, vioperf tests the current directory. To test the performance of each disk, specify different directories mounted on different disks.<br>To test reads from a directory on HDFS:<br>- Use a URL in the hdfs scheme that points to a single directory (not a path) containing files at least 10MB in size. For best results, use 10GB files and verify that there is at least one file per vioperf thread.<br>- If you do not specify a host and port, set the HADOOP_CONF_DIR environment variable to a path including the Hadoop configuration files. This is the same value that you use for the HadoopConfDir configuration parameter in Vertica. For more information, see Configuring HDFS access.<br>- If the HDFS cluster uses Kerberos, set the HADOOP_USER_NAME environment variable to a Kerberos principal. |
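The interval notation shared by --duration and --log-interval can be modeled with a small converter. This is a sketch, not part of vioperf; the function name is an assumption for illustration:

```shell
# Sketch: convert vioperf-style interval notation (60sec, 10min, 1hrs, ...)
# into seconds. Unrecognized suffixes are reported as errors.
to_seconds() {
  n=${1%%[a-z]*}      # numeric part, e.g. "60" from "60sec"
  unit=${1##*[0-9]}   # suffix part,  e.g. "sec" from "60sec"
  case "$unit" in
    s|sec|secs|second|seconds)  echo "$n" ;;
    m|min|mins|minute|minutes)  echo $(( n * 60 )) ;;
    h|hr|hrs|hour|hours)        echo $(( n * 3600 )) ;;
    *) echo "unknown unit: $unit" >&2; return 1 ;;
  esac
}

to_seconds 60sec   # 60
to_seconds 10min   # 600
to_seconds 1hrs    # 3600
```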
Returns
The utility returns the following information:
| Heading | Description |
|---|---|
| test | The test being run (Write, ReWrite, Read, or Skip Read). |
| directory | The directory in which the test is being run. |
| counter name | The counter type of the test being run. Can be either MB/s or Seeks per second. |
| counter value | The value of the counter in MB/s or Seeks per second across all threads. This measurement represents the bandwidth at the exact time of measurement. Contrast with counter value (avg). |
| counter value (10 sec avg) | The average amount of data in MB/s, or the average number of Seeks per second, for the test being run in the duration specified with --log-interval. The default interval is 10 seconds. The counter value (avg) is the average bandwidth since the last log message, across all threads. |
| counter value/core | The counter value divided by the number of cores. |
| counter value/core (10 sec avg) | The counter value (10 sec avg) divided by the number of cores. |
| thread count | The number of threads used to run the test. |
| %CPU | The available CPU percentage used during this test. |
| %IO Wait | The CPU percentage in I/O Wait state during this test. I/O wait state is the time working processes are blocked while waiting for I/O operations to complete. |
| elapsed time | The amount of time taken for a particular test. If you run the test multiple times, elapsed time increases the next time the test is run. |
| remaining time | The time remaining until the next test. Based on the --duration option, each of the tests is run at least once. If the test set is run multiple times, then remaining time is how much longer the test will run. The remaining time value is cumulative. Its total is added to elapsed time each time the same test is run again. |
Example
Invoking vioperf from a terminal outputs the following message and sample results:
[dbadmin@v_vmart_node0001 ~]$ /opt/vertica/bin/vioperf --duration=60s
The minimum required I/O is 20 MB/s read and write per physical processor core on each node, in full duplex
i.e. reading and writing at this rate simultaneously, concurrently on all nodes of the cluster.
The recommended I/O is 40 MB/s per physical core on each node.
For example, the I/O rate for a server node with 2 hyper-threaded six-core CPUs is 240 MB/s required minimum, 480 MB/s recommended.
Using direct io (buffer size=1048576, alignment=512) for directory "/home/dbadmin"
test | directory | counter name | counter value | counter value (10 sec avg) | counter value/core | counter value/core (10 sec avg) | thread count | %CPU | %IO Wait | elapsed time (s)| remaining time (s)
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Write | /home/dbadmin | MB/s | 420 | 420 | 210 | 210 | 2 | 89 | 10 | 10 | 5
Write | /home/dbadmin | MB/s | 412 | 396 | 206 | 198 | 2 | 89 | 9 | 15 | 0
ReWrite | /home/dbadmin | (MB-read+MB-write)/s | 150+150 | 150+150 | 75+75 | 75+75 | 2 | 58 | 40 | 10 | 5
ReWrite | /home/dbadmin | (MB-read+MB-write)/s | 158+158 | 172+172 | 79+79 | 86+86 | 2 | 64 | 33 | 15 | 0
Read | /home/dbadmin | MB/s | 194 | 194 | 97 | 97 | 2 | 69 | 26 | 10 | 5
Read | /home/dbadmin | MB/s | 192 | 190 | 96 | 95 | 2 | 71 | 27 | 15 | 0
SkipRead | /home/dbadmin | seeks/s | 659 | 659 | 329.5 | 329.5 | 2 | 2 | 85 | 10 | 5
SkipRead | /home/dbadmin | seeks/s | 677 | 714 | 338.5 | 357 | 2 | 2 | 59 | 15 | 0
Note
When evaluating performance for minimum and recommended I/O, include the Write and Read values in your evaluation. ReWrite and SkipRead values are not relevant to determining minimum and recommended I/O.
5.1.3 - Vnetperf
The vnetperf utility measures network performance of database hosts, as well as network latency and throughput for TCP and UDP protocols.
Caution
This utility incurs high network load, which degrades database performance. Do not use this utility on a Vertica production database.
This utility helps identify the following issues:
- Low throughput for all hosts or one
- High latency for all hosts or one
- Bottlenecks between one or more hosts or subnets
- Too-low limit on the number of TCP connections that can be established simultaneously
- High rates of network packet loss
Syntax
vnetperf [options] [tests]
Options

| Option | Description |
|---|---|
| --condense | Condenses the log into one JSON entry per line, instead of indented JSON syntax. |
| --collect-logs | Collects test log files from each host. |
| --datarate rate | Limits throughput to this rate in MB/s. A rate of 0 loops the tests through several different rates. Default: 0 |
| --duration seconds | Time limit for each test to run in seconds. Default: 1 |
| --hosts host-name[,...] | Comma-separated list of host names or IP addresses on which to run the tests. The list must not contain embedded spaces. |
| --hosts file | File that specifies the hosts on which to run the tests. If you omit this option, then vnetperf tries to access admintools to identify cluster hosts. |
| --identity-file file | If using passwordless SSH/SCP access between hosts, specifies the key file used to gain access to the hosts. |
| --ignore-bad-hosts | If set, runs tests on reachable hosts even if some hosts are not reachable. If you omit this option and a host is unreachable, then no tests are run on any hosts. |
| --log-dir directory | If --collect-logs is set, specifies the directory in which to place the collected logs. Default: logs.netperf.<timestamp> |
| --log-level level | Log level to use. Default: WARN |
| --list-tests | Lists the tests that vnetperf can run. |
| --output-file file | The file to which JSON results are written. Default: results.<timestamp>.json |
| --ports port#[,...] | Comma-delimited list of port numbers to use. If only one port number is specified, then the next two numbers in sequence are also used. Default: 14159,14160,14161 |
| --scp-options 'scp-args' | Specifies one or more standard SCP command line arguments. SCP is used to copy test binaries over to the target hosts. |
| --ssh-options 'ssh-args' | Specifies one or more standard SSH command line arguments. SSH is used to issue test commands on the target hosts. |
| --tmp-dir directory | Specifies the temporary directory for vnetperf, where directory must have execute permission on all hosts and does not include the unsupported characters " (double quote), ` (backtick), or ' (single quote). Default: /tmp (execute permission required) |
| --vertica-install directory | Indicates that Vertica is installed on each of the hosts, so vnetperf uses test binaries on the target system rather than copying them over with SCP. |
Tests
You can specify one or more of the following tests. If no test is specified, vnetperf runs all tests. Test results are printed for each host.
| Test | Description | Results |
|---|---|---|
| latency | Measures latency from the host that is running the script to other hosts. Hosts with unusually high latency should be investigated further. | Maximum recommended RTT (round-trip time) latency is 1000 microseconds. Ideal RTT latency is 200 microseconds or less. Vertica recommends that clock skew be less than 1 second. |
| tcp-throughput | Tests TCP throughput among hosts. | Minimum recommended throughput is 100 MB/s. Ideal throughput is 800 MB/s or more. |
| udp-throughput | Tests UDP throughput among hosts. | Minimum recommended throughput is 100 MB/s. Ideal throughput is 800 MB/s or more. Note: UDP throughput can be lower; multiple network switches can adversely affect performance. |
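The latency guidance can be turned into a quick triage check for the per-host RTT values that vnetperf reports. This is an illustrative sketch, not part of the utility; the function name is an assumption:

```shell
# Hypothetical helper: classify a measured round-trip latency (in microseconds)
# against the guidance above: <=200us ideal, <=1000us within the recommended
# maximum, anything larger worth investigating.
classify_rtt() {
  if   [ "$1" -le 200 ];  then echo "ideal"
  elif [ "$1" -le 1000 ]; then echo "acceptable"
  else                         echo "investigate"
  fi
}

classify_rtt 49    # ideal
classify_rtt 272   # acceptable
classify_rtt 1500  # investigate
```

The first two sample values match the latency column of the example output below.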
Example
$ vnetperf latency tcp-throughput
The maximum recommended rtt latency is 2 milliseconds. The ideal rtt latency is 200 microseconds or less. It is recommended that clock skew be kept to under 1 second.
test | date | node | index | rtt latency (us) | clock skew (us)
-------------------------------------------------------------------------------------------------------------------------
latency | 2022-03-29_10:23:55,739 | 10.20.100.247 | 0 | 49 | 3
latency | 2022-03-29_10:23:55,739 | 10.20.100.248 | 1 | 272 | -702
latency | 2022-03-29_10:23:55,739 | 10.20.100.249 | 2 | 245 | 1037
The minimum recommended throughput is 100 MB/s. Ideal throughput is 800 MB/s or more. Note: UDP numbers may be lower, multiple network switches may reduce performance results.
date | test | rate limit (MB/s) | node | MB/s (sent) | MB/s (rec) | bytes (sent) | bytes (rec) | duration (s)
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
2022-03-29_10:23:55,742 | tcp-throughput | 32 | 10.20.100.247 | 30.579 | 30.579 | 32112640 | 32112640 | 1.00151
2022-03-29_10:23:55,742 | tcp-throughput | 32 | 10.20.100.248 | 30.5791 | 30.5791 | 32112640 | 32112640 | 1.0015
2022-03-29_10:23:55,742 | tcp-throughput | 32 | 10.20.100.249 | 30.5791 | 30.5791 | 32112640 | 32112640 | 1.0015
2022-03-29_10:23:55,742 | tcp-throughput | 32 | average | 30.579 | 30.579 | 32112640 | 32112640 | 1.0015
2022-03-29_10:23:57,749 | tcp-throughput | 64 | 10.20.100.247 | 61.0952 | 61.0952 | 64094208 | 64094208 | 1.00049
2022-03-29_10:23:57,749 | tcp-throughput | 64 | 10.20.100.248 | 61.096 | 61.096 | 64094208 | 64094208 | 1.00048
2022-03-29_10:23:57,749 | tcp-throughput | 64 | 10.20.100.249 | 61.0952 | 61.0952 | 64094208 | 64094208 | 1.00049
2022-03-29_10:23:57,749 | tcp-throughput | 64 | average | 61.0955 | 61.0955 | 64094208 | 64094208 | 1.00048
2022-03-29_10:23:59,753 | tcp-throughput | 128 | 10.20.100.247 | 122.131 | 122.131 | 128122880 | 128122880 | 1.00046
2022-03-29_10:23:59,753 | tcp-throughput | 128 | 10.20.100.248 | 122.132 | 122.132 | 128122880 | 128122880 | 1.00046
2022-03-29_10:23:59,753 | tcp-throughput | 128 | 10.20.100.249 | 122.132 | 122.132 | 128122880 | 128122880 | 1.00046
2022-03-29_10:23:59,753 | tcp-throughput | 128 | average | 122.132 | 122.132 | 128122880 | 128122880 | 1.00046
2022-03-29_10:24:01,757 | tcp-throughput | 256 | 10.20.100.247 | 243.819 | 244.132 | 255754240 | 256081920 | 1.00036
2022-03-29_10:24:01,757 | tcp-throughput | 256 | 10.20.100.248 | 244.125 | 243.282 | 256049152 | 255164416 | 1.00025
2022-03-29_10:24:01,757 | tcp-throughput | 256 | 10.20.100.249 | 244.172 | 243.391 | 256114688 | 255295488 | 1.00032
2022-03-29_10:24:01,757 | tcp-throughput | 256 | average | 244.039 | 243.601 | 255972693 | 255513941 | 1.00031
2022-03-29_10:24:03,761 | tcp-throughput | 512 | 10.20.100.247 | 337.232 | 485.247 | 355893248 | 512098304 | 1.00645
2022-03-29_10:24:03,761 | tcp-throughput | 512 | 10.20.100.248 | 446.16 | 231.001 | 467894272 | 242253824 | 1.00013
2022-03-29_10:24:03,761 | tcp-throughput | 512 | 10.20.100.249 | 349.667 | 409.961 | 368476160 | 432013312 | 1.00497
2022-03-29_10:24:03,761 | tcp-throughput | 512 | average | 377.686 | 375.403 | 397421226 | 395455146 | 1.00385
2022-03-29_10:24:05,772 | tcp-throughput | 640 | 10.20.100.247 | 328.279 | 509.256 | 383975424 | 595656704 | 1.11548
2022-03-29_10:24:05,772 | tcp-throughput | 640 | 10.20.100.248 | 505.626 | 217.217 | 532250624 | 228655104 | 1.00389
2022-03-29_10:24:05,772 | tcp-throughput | 640 | 10.20.100.249 | 390.355 | 474.89 | 410812416 | 499777536 | 1.00365
2022-03-29_10:24:05,772 | tcp-throughput | 640 | average | 408.087 | 400.454 | 442346154 | 441363114 | 1.04101
2022-03-29_10:24:07,892 | tcp-throughput | 768 | 10.20.100.247 | 300.5 | 426.762 | 318734336 | 452657152 | 1.01154
2022-03-29_10:24:07,892 | tcp-throughput | 768 | 10.20.100.248 | 268.252 | 402.891 | 283017216 | 425066496 | 1.00616
2022-03-29_10:24:07,892 | tcp-throughput | 768 | 10.20.100.249 | 510.569 | 243.649 | 535592960 | 255590400 | 1.00042
2022-03-29_10:24:07,892 | tcp-throughput | 768 | average | 359.774 | 357.767 | 379114837 | 377771349 | 1.00604
2022-03-29_10:24:09,911 | tcp-throughput | 1024 | 10.20.100.247 | 304.545 | 444.261 | 334987264 | 488669184 | 1.049
2022-03-29_10:24:09,911 | tcp-throughput | 1024 | 10.20.100.248 | 422.246 | 192.773 | 474284032 | 216530944 | 1.07121
2022-03-29_10:24:09,911 | tcp-throughput | 1024 | 10.20.100.249 | 353.206 | 446.809 | 378732544 | 479100928 | 1.0226
2022-03-29_10:24:09,911 | tcp-throughput | 1024 | average | 359.999 | 361.281 | 396001280 | 394767018 | 1.0476
2022-03-29_10:24:11,988 | tcp-throughput | 2048 | 10.20.100.247 | 343.324 | 414.559 | 387710976 | 468156416 | 1.07697
2022-03-29_10:24:11,988 | tcp-throughput | 2048 | 10.20.100.248 | 292.44 | 246.254 | 308314112 | 259620864 | 1.00544
2022-03-29_10:24:11,988 | tcp-throughput | 2048 | 10.20.100.249 | 437.559 | 405.02 | 459145216 | 425000960 | 1.00072
2022-03-29_10:24:11,988 | tcp-throughput | 2048 | average | 357.774 | 355.278 | 385056768 | 384259413 | 1.02771
JSON results available at: ./results.2022-03-29_10:23:51,548.json
5.2 - Enable secure shell (SSH) logins
The administrative account must be able to use Secure Shell (SSH) to log in (ssh) to all hosts without specifying a password. The shell script install_vertica does this automatically. This section describes how to do it manually if necessary.
- If you do not already have SSH installed on all hosts, log in as root on each host and install it now. You can download a free version of the SSH connectivity tools from OpenSSH.
- Log in to the Vertica administrator account (dbadmin in this example).
- Make your home directory (~) writable only by yourself. Choose one of:

  $ chmod 700 ~

  or

  $ chmod 755 ~

  where:
| 700 includes | 755 includes |
|---|---|
| 400 read by owner<br>200 write by owner<br>100 execute by owner | 400 read by owner<br>200 write by owner<br>100 execute by owner<br>040 read by group<br>010 execute by group<br>004 read by anybody (other)<br>001 execute by anybody |
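Each octal mode is simply the sum (bitwise OR) of the individual permission bits listed above, which you can verify in any shell:

```shell
# The octal modes compose from per-class permission bits:
owner=$(( 0400 | 0200 | 0100 ))               # read+write+execute by owner
group_other=$(( 0040 | 0010 | 0004 | 0001 ))  # read/execute by group and others

printf '700 = %o\n' "$owner"                  # 700
printf '755 = %o\n' $(( owner | group_other ))   # 755
```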
- Change to your home directory:

  $ cd ~
- Generate a private key/public key pair:

  $ ssh-keygen -t rsa
  Generating public/private rsa key pair.
Enter file in which to save the key (/home/dbadmin/.ssh/id_rsa):
Created directory '/home/dbadmin/.ssh'.
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /home/dbadmin/.ssh/id_rsa.
Your public key has been saved in /home/dbadmin/.ssh/id_rsa.pub.
- Make your .ssh directory readable and writable only by yourself:
$ chmod 700 ~/.ssh
- Change to the .ssh directory:
$ cd ~/.ssh
- Copy the file id_rsa.pub onto the file authorized_keys2:

  $ cp id_rsa.pub authorized_keys2
- Make the files in your .ssh directory readable and writable only by yourself:
$ chmod 600 ~/.ssh/*
- For each cluster host:
$ scp -r ~/.ssh <host>:.
- Connect to each cluster host. The first time you ssh to a new remote machine, you could get a message similar to the following:

  $ ssh dev0
  Warning: Permanently added 'dev0,192.168.1.92' (RSA) to the list of known hosts.
This message appears only the first time you ssh to a particular remote host.
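The local permission and key-copy steps above can be collected into one helper. This is a sketch, not a Vertica-provided script; the function name is illustrative, and it assumes a key pair was already generated with ssh-keygen -t rsa:

```shell
# Sketch: apply the local permission and key-distribution steps from this
# section to a given home directory. Assumes $1/.ssh/id_rsa.pub already exists
# (generated with: ssh-keygen -t rsa).
prepare_ssh_dir() {
  home=$1
  chmod 700 "$home"        # home directory writable only by owner
  chmod 700 "$home/.ssh"   # .ssh readable and writable only by owner
  cp "$home/.ssh/id_rsa.pub" "$home/.ssh/authorized_keys2"
  chmod 600 "$home/.ssh"/* # key files private to the owner
}

# Afterward, copy the directory to each cluster host:
#   scp -r "$home/.ssh" <host>:.
```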
See also
6 - After you install Vertica
The tasks described in this section are optional and are provided for your convenience. When you have completed this section, proceed to one of the following:
The topics in this section apply to a Vertica cluster installed on-premises. For information on creating Vertica clusters that run on various cloud platforms, see Using Vertica on the cloud.
6.1 - Install the license key
If you did not supply the -L parameter during setup, or if you did not bypass the -L parameter for a silent install, the first time you log in as the Database Superuser and run the Vertica Administration tools or Management Console, Vertica requires you to install a license key.
Follow the instructions in Managing licenses in the Administrator's Guide.
6.2 - Optionally install vsql client application on non-cluster hosts
You can use the Vertica vsql executable image on a non-cluster Linux host to connect to a Vertica database.
- On Red Hat, CentOS, and SUSE systems, you can install the client driver RPM, which includes the vsql executable. See Installing the client RPM on Red Hat and SUSE for details.
- If the non-cluster host is running the same version of Linux as the cluster, copy the image file to the remote system. For example:

  $ scp host01:/opt/vertica/bin/vsql .
  $ ./vsql

- If the non-cluster host is running a different distribution or version of Linux than your cluster hosts, you must install the Vertica server RPM in order to get vsql:

  1. Download the appropriate RPM package by browsing to the Vertica website. On the Support tab, select Customer Downloads.
  2. If the system you used to download the RPM is not the non-cluster host, transfer the file to the non-cluster host.
  3. Log in to the non-cluster host as root and install the RPM package using the command:

     # rpm -Uvh filename

     where filename is the package you downloaded. Note that you do not have to run the install_vertica script on the non-cluster host to use vsql.
Notes

- Use the same Command-line options that you would on a cluster host.
- You cannot run vsql on a Cygwin bash shell (Windows). Use ssh to connect to a cluster host, then run vsql.

vsql is also available for additional platforms. See Installing the vsql client.
6.3 - Installing client drivers
After you install Vertica, install drivers on the client systems from which you plan to access your databases.
After you install Vertica, install drivers on the client systems from which you plan to access your databases. Vertica supplies drivers for ADO.NET, JDBC, ODBC, OLE DB, Perl, and Python. For instructions on installing these drivers, see Client drivers.
6.4 - Database modes
You can create a database in Enterprise Mode or Eon Mode. After you create a database, the functionality is largely the same regardless of the mode. The differences in these two modes lay in their architecture, deployment, and scalability.
Enterprise Mode database architecture distributes data across local nodes, and works on-premises or in the cloud. Consider creating the database in this mode on a cluster of predetermined size, which is good for running large queries quickly. Because it persistently stores its data locally, you do not need access to communal storage on Amazon S3 to use an Enterprise Mode database. Enterprise Mode concepts includes an overview of how data storage works in a database running in Enterprise Mode.
Eon Mode database architecture leverages the flexibility of EC2 instances and the persistence of Amazon S3. Eon Mode databases are ideal when you want to frequently scale up your cluster in order to run many short, concurrent queries. Because an Eon Mode database stores its data in a persistent location outside of its local nodes, you can rapidly adjust the size of your cluster without interrupting ongoing workloads when you do so. (See Eon Mode for more about Eon Mode database concepts.)
Separating the computational processes of Vertica from its storage layer is what allows you to scale your Eon Mode database up quickly as your workload changes; in Eon Mode, a scaled up cluster means the database can increase the number of queries you can run concurrently. You can only run Eon Mode on Amazon Web Services.
Running Vertica in Eon Mode might be a good choice in the following situations:
- You are deploying Vertica in the AWS cloud.
- You have variable workloads that sometimes require a number of short, simultaneous queries.
- You need to elastically scale your database resources.
You can install Vertica with Eon Mode using an Amazon CloudFormation template and in-browser wizards provided by Vertica Management Console. See Vertica on Amazon Web Services and Creating an Eon Mode database in AWS with MC for more information.
6.5 - Creating a database
To get started using Vertica immediately after installation, create a database. You can use either the Administration Tools or the Management Console. To create a database using MC, refer to Creating a database using MC.
For a more detailed walk through of database creation steps, see Creating a database.
Follow these steps to begin creating a database using the Administration Tools for the first time after installing Vertica.
1. Log in as the database administrator, and type admintools to bring up the Administration Tools.

2. When the EULA (end-user license agreement) window opens, type accept to proceed. A window displays, requesting the location of the license key file you downloaded from the Vertica Web site. The default path is /tmp/vlicense.dat.

3. If you are using the Vertica Community Edition, click OK without entering a license key.

   If you are using the Vertica Premium Edition, type the absolute path to your license key (for example, /tmp/vlicense.dat) and click OK.

4. From the Administration Tools Main Menu, click Configuration Menu, and then click OK.

5. Click Create Database, and click OK to start the database creation wizard.
For a detailed walkthrough of database creation for Enterprise Mode and Eon Mode databases, see Creating a database.
See also
7 - Upgrading Vertica
The process of upgrading your database with a new Vertica version includes:
Click on the above links for detailed instructions.
7.1 - Upgrade paths
Upgrades are generally incremental: you must upgrade to each intermediate major and minor release. For example, you upgrade from Vertica 9.0 to 10.1 in the following steps:
1. Vertica 9.0 to 9.1
2. Vertica 9.1 to 9.2
3. Vertica 9.2 to 9.3
4. Vertica 9.3 to 10.0
5. Vertica 10.0 to 10.1
Note
You can skip service pack releases. For example, the preceding upgrade path omits releases 9.0.1 and 9.1.1.
If you're upgrading from a FIPS-enabled Vertica 9.2.x database to 10.1.1 and want to maintain your FIPS certification, you must perform a direct upgrade. For instructions, see Nonsequential FIPS database upgrades.
Be sure to read the Release Notes and New Features for each version in your path. Documentation for the current Vertica version is available in the RPM and at https://docs.vertica.com/latest. The same URL also provides access to documentation for earlier versions.
For guidance on upgrading from unsupported versions, contact Vertica Technical Support.
7.1.1 - Nonsequential FIPS database upgrades
As of Vertica 10.1.1, FIPS support has been reinstated. Prior to this, the last version to support FIPS was Vertica 9.2.x. Vertica upgrades are typically sequential, but if you are upgrading from 9.2.x and want to maintain your FIPS certification, you must first perform a direct, nonsequential upgrade from 9.2.x to 10.1.1 before performing the standard sequential upgrades from 10.1.1 to 11.1.x.
The following procedure performs a direct upgrade from Vertica 9.2.x running on RHEL 6.x to Vertica 10.1.1 on RHEL 8.1.
Important
If you have any questions or want additional guidance for performing this upgrade, contact
Vertica Support.
- Create a full backup of your Vertica 9.2.x database. This example uses the configuration file fullRestore.ini:

  $ vbr --config-file=/tmp/fullRestore.ini -t init
  $ vbr --config-file=/tmp/fullRestore.ini -t backup
[Transmission]
concurrency_backup = 1
port_rsync = 50000
encrypt = False
serviceAccessPass = rsyncpw
hardLinkLocal = False
checksum = False
total_bwlimit_restore = 0
serviceAccessUser = rsyncuser
total_bwlimit_backup = 0
concurrency_restore = 1
[Misc]
snapshotName = full_restore
restorePointLimit = 1
retryDelay = 1
objects =
retryCount = 0
tempDir = /tmp/vbr
[Mapping]
v_fips_db_node0001 = 198.51.100.0:/home/release/backup/
v_fips_db_node0002 = 198.51.100.1:/home/release/backup/
v_fips_db_node0003 = 198.51.100.2:/home/release/backup/
[Database]
dbPort = 5433
dbPromptForPassword = False
dbUser =
dbPassword =
dbName = fips_db
- Shut down the database gracefully. Do not start the database until instructed.
- Acquire a RHEL 8.1 cluster with one of the following methods:

  - Upgrade in place
  - Reimage your machines
  - Use a completely different RHEL 8.1 cluster

- Enable FIPS on your RHEL 8.1 machines and reboot:

  $ fips-mode-setup --enable

- Install Vertica 10.1.1 on the RHEL 8.1 cluster:

  $ install_vertica --hosts node0001,node0002,node0003 \
    --rpm /tmp/vertica-10.1.1-0/x86_64.RHEL8.rpm

- If you acquired your RHEL 8.1 cluster by reimaging or using a different cluster, you must restore your database:

  $ vbr -c /tmp/fullRestore.ini -t restore

  If you encounter the following warning, you can safely ignore it:

  Warning: Vertica versions do not match: v9.2.1-xx -> v10.1.1-xxxxxxxx. This operation may not be supported.

- Start the Vertica 10.1.1 database to trigger the upgrade. This should be the first time you have started your database since shutting it down earlier:

  $ admintools -t start_db -d fips_db
7.2 - Before you upgrade
Before you upgrade the Vertica database, perform the following steps:
-
Verify that you have enough RAM available to run the upgrade. The upgrade requires approximately three times the amount of memory your database catalog uses.
You can calculate catalog memory usage on all nodes by querying system table RESOURCE_POOL_STATUS:
=> SELECT node_name, pool_name, memory_size_kb FROM resource_pool_status WHERE pool_name = 'metadata';
-
Perform a full database backup. This precautionary measure allows you to restore the current version if the upgrade is unsuccessful.
-
Perform a backup of your grants.
-
Verify platform requirements for the new version.
-
Determine whether you are using any third-party user-defined extension libraries (UDxs). UDx libraries that are compiled (such as those developed using C++ or Java) may need to be recompiled with a new version of the Vertica SDK libraries to be compatible with the new version of Vertica. See UDx library compatibility with new server versions.
-
Check catalog storage space.
-
If you're upgrading from Vertica 9.2.x and have set the PasswordMinCharChange
or PasswordMinLifeTime
system-level security parameters, take note of their current values. You will have to set these parameters again, this time at the PROFILE-level, to reproduce your configuration. To view the current values for these parameters, run the following query:
=> SELECT parameter_name, current_value FROM configuration_parameters
   WHERE parameter_name IN ('PasswordMinCharChange', 'PasswordMinLifeTime');
After you complete these tasks, shut down the database gracefully.
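The memory check in the first step can be scripted. The following is a minimal shell sketch for a single Linux node, using a hypothetical catalog size in place of the value returned by RESOURCE_POOL_STATUS:

```shell
# Sketch: apply the ~3x catalog rule of thumb on one node.
# catalog_kb is a hypothetical 'metadata' pool size in KB.
catalog_kb=4194304                      # example: 4 GB catalog
required_kb=$((catalog_kb * 3))         # upgrade needs ~3x catalog memory
available_kb=$(awk '/MemAvailable/ {print $2}' /proc/meminfo)
echo "required: ${required_kb} KB, available: ${available_kb} KB"
if [ "$available_kb" -ge "$required_kb" ]; then
    echo "enough memory to upgrade"
else
    echo "free memory before upgrading"
fi
```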
7.2.1 - Verifying platform requirements
The Vertica installer checks the target platform as it runs, and stops whenever it determines the platform fails to meet an installation requirement. Before you update the server package on your systems, manually verify that your platform meets all hardware and software requirements (see Platform requirements and recommendations).
By default, the installer stops on all warnings. You can configure the level at which the installer stops installation through the installation parameter --failure-threshold
. If you set the failure threshold to FAIL
, the installer ignores warnings and stops only on failures.
Caution
Changing the failure threshold lets you immediately upgrade and bring up the Vertica database. However, Vertica cannot fully optimize performance until you correct all warnings.
7.2.2 - Checking catalog storage space
Use the commands documented here to determine how much catalog space is available before upgrading. This helps you determine how much space the updated catalog may take up.
Compare how much space the catalog currently uses against space that is available in the same directory:
-
Use the du
command to determine how much space the catalog directory currently uses:
$ du -s -BG v_vmart_node0001_catalog
2G v_vmart_node0001_catalog
-
Determine how much space is available in the same directory:
$ df -BG v_vmart_node0001_catalog
Filesystem 1G-blocks Used Available Use% Mounted on
/dev/sda2 48G 19G 26G 43% /
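The two commands above can be combined into a quick headroom check. This is a minimal shell sketch; CATALOG_DIR is a placeholder for your node's catalog directory:

```shell
# Sketch: compare catalog usage against free space in the same
# filesystem. CATALOG_DIR is a placeholder; point it at your node's
# catalog directory (for example, v_vmart_node0001_catalog).
CATALOG_DIR=${CATALOG_DIR:-.}
used_kb=$(du -sk "$CATALOG_DIR" | awk '{print $1}')
avail_kb=$(df -Pk "$CATALOG_DIR" | awk 'NR==2 {print $4}')
echo "catalog uses ${used_kb} KB; ${avail_kb} KB available"
if [ "$avail_kb" -ge "$used_kb" ]; then
    echo "sufficient headroom for the upgraded catalog"
fi
```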
7.2.3 - Verify license compliance for ORC and Parquet data
If you are upgrading from a version before 9.1.0, your license includes a raw data allowance, and you have data in external tables based on ORC or Parquet files, follow the steps in this topic before upgrading.
Background
Vertica licenses can include a raw data allowance. Since 2016, Vertica licenses have allowed you to use ORC and Parquet data in external tables. This data has always counted against any raw data allowance in your license. Previously, the audit of data in ORC and Parquet format was handled manually. Because this audit was not automated, the total amount of data in your native tables and external tables could exceed your licensed allowance for some time before being spotted.
Starting in version 9.1.0, Vertica automatically audits ORC and Parquet data in external tables. This auditing begins soon after you install or upgrade to version 9.1.0. If your Vertica license includes a raw data allowance and you have data in external tables based on Parquet or ORC files, review your license compliance before upgrading to Vertica 9.1.x. Verifying your database is compliant with your license terms avoids having your database become non-compliant soon after you upgrade.
Verifying your ORC and Parquet usage complies with your license terms
To verify your data usage is compliant with your license, run the following query as the database administrator:
SELECT (database_size_bytes + file_size_bytes) <= license_size_bytes
"license_compliant?"
FROM (SELECT database_size_bytes,
license_size_bytes FROM license_audits
WHERE audited_data='Total'
ORDER BY audit_end_timestamp DESC LIMIT 1) dbs,
(SELECT sum(total_file_size_bytes) file_size_bytes
FROM external_table_details
WHERE source_format IN ('ORC', 'PARQUET')) ets;
This query returns one of three values:
-
If you do not have any external data in ORC or Parquet format, the query returns 0 rows:
license_compliant?
--------------------
(0 rows)
In this case, you can proceed with your upgrade.
-
If you have data in external tables based on ORC or Parquet format, and that data does not cause your database to exceed your raw data allowance, the query returns t:
license_compliant?
--------------------
t
(1 row)
In this case, you can proceed with your upgrade.
-
If the data in your external tables based on ORC and Parquet causes your database to exceed your raw data allowance, the query returns f:
license_compliant?
--------------------
f
(1 row)
In this case, resolve the compliance issue before you upgrade. See below for more information.
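Stripped of its subqueries, the compliance query reduces to a single comparison. The following shell sketch shows that arithmetic with hypothetical byte counts:

```shell
# Sketch: the compliance check boils down to this comparison.
# All three sizes below are hypothetical example values.
database_size_bytes=500000000000   # native table data, from license_audits
file_size_bytes=120000000000       # ORC/Parquet external data
license_size_bytes=1000000000000   # licensed raw data allowance
if [ $((database_size_bytes + file_size_bytes)) -le "$license_size_bytes" ]; then
    echo t   # compliant: safe to proceed with the upgrade
else
    echo f   # non-compliant: resolve before upgrading
fi
```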
Resolving non-compliance
If the query in the previous section indicates that your database is not in compliance with your license, resolve the issue before upgrading. There are two ways you can bring your database into compliance:
-
Contact Vertica to upgrade your license to a larger data size allowance. See Obtaining a license key file.
-
Delete data (either from ORC- and Parquet-based external tables or from Vertica native tables) to bring your data size into compliance with your license. Always back up any data that you are about to delete from Vertica. Dropping external tables is a less disruptive way to reduce the size of your database, as the data is not lost—it is still in the files that your external table is based on.
Note
You can still choose to upgrade your database if it is not compliant. However, soon after you upgrade, you will begin getting warnings that your database is out of compliance. See
Managing license warnings and limits for more information.
7.2.4 - Backing up and restoring grants
After an upgrade, if the prototypes of UDx libraries change, Vertica will drop the grants on those libraries since they aren't technically the same function anymore. To resolve these types of issues, it's best practice to back up the grants on these libraries so you can restore them after the upgrade.
-
Save the following SQL to a file named user_ddl.sql
. It creates a view named user_ddl which contains the grants on all objects in the database.
CREATE OR REPLACE VIEW user_ddl AS
(
SELECT 0 as grant_order,
name principal_name,
'CREATE ROLE "' || name || '"' || ';' AS sql,
'NONE' AS object_type,
'NONE' AS object_name
FROM v_internal.vs_roles vr
WHERE NOT vr.predefined_role -- Exclude system roles
AND ldapdn = '' -- Limit to NON-LDAP created roles
)
UNION ALL
(
SELECT 1, -- CREATE USERs
user_name,
'CREATE USER "' || user_name || '"' ||
DECODE(is_locked, TRUE, ' ACCOUNT LOCK', '') ||
DECODE(grace_period, 'undefined', '', ' GRACEPERIOD ''' || grace_period || '''') ||
DECODE(idle_session_timeout, 'unlimited', '', ' IDLESESSIONTIMEOUT ''' || idle_session_timeout || '''') ||
DECODE(max_connections, 'unlimited', '', ' MAXCONNECTIONS ' || max_connections || ' ON ' || connection_limit_mode) ||
DECODE(memory_cap_kb, 'unlimited', '', ' MEMORYCAP ''' || memory_cap_kb || 'K''') ||
DECODE(profile_name, 'default', '', ' PROFILE ' || profile_name) ||
DECODE(resource_pool, 'general', '', ' RESOURCE POOL ' || resource_pool) ||
DECODE(run_time_cap, 'unlimited', '', ' RUNTIMECAP ''' || run_time_cap || '''') ||
DECODE(search_path, '', '', ' SEARCH_PATH ' || search_path) ||
DECODE(temp_space_cap_kb, 'unlimited', '', ' TEMPSPACECAP ''' || temp_space_cap_kb || 'K''') || ';' AS sql,
'NONE' AS object_type,
'NONE' AS object_name
FROM v_catalog.users
WHERE NOT is_super_user -- Exclude database superuser
AND ldap_dn = '' -- Limit to NON-LDAP created users
)
UNION ALL
(
SELECT 2, -- GRANTs
grantee,
'GRANT ' || REPLACE(TRIM(BOTH ' ' FROM words), '*', '') ||
CASE
WHEN object_type = 'RESOURCEPOOL' THEN ' ON RESOURCE POOL '
WHEN object_type = 'STORAGELOCATION' THEN ' ON LOCATION '
WHEN object_type = 'CLIENTAUTHENTICATION' THEN 'AUTHENTICATION '
WHEN object_type IN ('DATABASE', 'LIBRARY', 'MODEL', 'SEQUENCE', 'SCHEMA') THEN ' ON ' || object_type || ' '
WHEN object_type = 'PROCEDURE' THEN (SELECT ' ON ' || CASE REPLACE(procedure_type, 'User Defined ', '')
WHEN 'Transform' THEN 'TRANSFORM FUNCTION '
WHEN 'Aggregate' THEN 'AGGREGATE FUNCTION '
WHEN 'Analytic' THEN 'ANALYTIC FUNCTION '
ELSE UPPER(REPLACE(procedure_type, 'User Defined ', '')) || ' '
END
FROM vs_procedures
WHERE proc_oid = object_id)
WHEN object_type = 'ROLE' THEN ''
ELSE ' ON '
END ||
NVL2(object_schema, object_schema || '.', '') || CASE WHEN object_type = 'STORAGELOCATION' THEN (SELECT '''' || location_path || ''' ON ' || node_name FROM storage_locations WHERE location_id = object_id) ELSE object_name END ||
CASE
WHEN object_type = 'PROCEDURE' THEN (SELECT CASE WHEN procedure_argument_types = '' OR procedure_argument_types = 'Any' THEN '()' ELSE '(' || procedure_argument_types || ')' END
FROM vs_procedures
WHERE proc_oid = object_id)
ELSE ''
END ||
' TO ' || grantee ||
CASE WHEN INSTR(words, '*') > 0 THEN ' WITH GRANT OPTION' ELSE '' END
|| ';',
object_type,
object_name
FROM (SELECT grantee, object_type, object_schema, object_name, object_id,
v_txtindex.StringTokenizerDelim(DECODE(privileges_description, '', ',' , privileges_description), ',')
OVER (PARTITION BY grantee, object_type, object_schema, object_name, object_id)
FROM v_catalog.grants) foo
ORDER BY CASE REPLACE(TRIM(BOTH ' ' FROM words), '*', '') WHEN 'USAGE' THEN 1 ELSE 2 END
)
UNION ALL
(
SELECT 3, -- Default ROLEs
user_name,
'ALTER USER "' || user_name || '"' ||
DECODE(default_roles, '', '', ' DEFAULT ROLE ' || REPLACE(default_roles, '*', '')) || ';' ,
'NONE' AS object_type,
'NONE' AS object_name
FROM v_catalog.users
WHERE default_roles <> ''
)
UNION ALL -- GRANTs WITH ADMIN OPTION
(
SELECT 4, user_name, 'GRANT ' || REPLACE(TRIM(BOTH ' ' FROM words), '*', '') || ' TO ' || user_name || ' WITH ADMIN OPTION;',
'NONE' AS object_type ,
'NONE' AS object_name
FROM (SELECT user_name, v_txtindex.StringTokenizerDelim(DECODE(all_roles, '', ',', all_roles), ',') OVER (PARTITION BY user_name)
FROM v_catalog.users
WHERE all_roles <> '') foo
WHERE INSTR(words, '*') > 0
)
UNION ALL
(
SELECT 5, 'public', 'ALTER SCHEMA ' || name || ' DEFAULT ' || CASE WHEN defaultinheritprivileges THEN 'INCLUDE PRIVILEGES;' ELSE 'EXCLUDE PRIVILEGES;' END, 'SCHEMA', name
FROM v_internal.vs_schemata
WHERE NOT issys -- Exclude system schemas
)
UNION ALL
(
SELECT 6, 'public', 'ALTER DATABASE ' || database_name || ' SET disableinheritedprivileges = ' || current_value || ';',
'DATABASE', database_name
FROM v_internal.vs_configuration_parameters
CROSS JOIN v_catalog.databases
WHERE parameter_name = 'DisableInheritedPrivileges'
)
UNION ALL -- TABLE PRIV INHERITANCE
(
SELECT 7, 'public' , 'ALTER TABLE ' || table_schema || '.' || table_name ||
CASE WHEN inheritprivileges THEN ' INCLUDE PRIVILEGES;' ELSE ' EXCLUDE PRIVILEGES;' END,
'TABLE' AS object_type,
table_schema || '.' || table_name AS object_name
FROM v_internal.vs_tables
JOIN v_catalog.tables ON (table_id = oid)
)
UNION ALL -- VIEW PRIV INHERITANCE
(
SELECT 8, 'public', 'ALTER VIEW ' || table_schema || '.' || table_name || CASE WHEN inherit_privileges THEN ' INCLUDE PRIVILEGES;' ELSE ' EXCLUDE PRIVILEGES; ' END,
'TABLE' AS object_type, table_schema || '.' || table_name AS object_name
FROM v_catalog.views
)
UNION ALL
(
SELECT 9, owner_name, 'ALTER TABLE ' || table_schema || '.' || table_name || ' OWNER TO ' || owner_name || ';',
'TABLE', table_schema || '.' || table_name
FROM v_catalog.tables
)
UNION ALL
(
SELECT 10, owner_name, 'ALTER VIEW ' || table_schema || '.' || table_name || ' OWNER TO ' || owner_name || ';', 'TABLE',
table_schema || '.' || table_name
FROM v_catalog.views
);
-
From the Linux command line, run the script in the user_ddl.sql
file:
$ vsql -f user_ddl.sql
CREATE VIEW
-
Connect to Vertica using vsql.
-
Export the content of the user_ddl view's sql column, ordered by the grant_order column, to a file:
=> \o pre-upgrade.txt
=> SELECT sql FROM user_ddl ORDER BY grant_order ASC;
=> \o
-
Upgrade Vertica.
-
After the upgrade, run the same query, saving the output to a different file:
=> \o post-upgrade.txt
=> SELECT sql FROM user_ddl ORDER BY grant_order ASC;
=> \o
-
Create a diff between pre-upgrade.txt
and post-upgrade.txt
. This collects the missing grants into grants-list.txt
.
$ diff pre-upgrade.txt post-upgrade.txt > grants-list.txt
-
To restore any missing grants, run the remaining grants in grants-list.txt
, if any:
=> \i 'grants-list.txt'
Note
Attempting to restore grants to users with the ANY keyword triggers the following error:
ERROR 4856: Syntax error at or near "Any" at character
To avoid this error, use () instead of (ANY) as shown in the following example:
=> GRANT EXECUTE ON FUNCTION public.MapLookup() TO public;
GRANT PRIVILEGE
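The diff-and-restore workflow above can be tried end to end on small stand-in files. Note that raw diff output includes diff markers, so this sketch strips the leading < marker to leave runnable SQL; the file contents here are hypothetical:

```shell
# Sketch: how the diff step isolates grants that existed before the
# upgrade but are missing afterward. Both files are tiny hypothetical
# stand-ins for pre-upgrade.txt and post-upgrade.txt.
cd "$(mktemp -d)"
printf 'GRANT USAGE ON SCHEMA s1 TO alice;\nGRANT EXECUTE ON FUNCTION f1() TO bob;\n' > pre-upgrade.txt
printf 'GRANT USAGE ON SCHEMA s1 TO alice;\n' > post-upgrade.txt
# diff marks lines unique to pre-upgrade.txt with '<'; stripping that
# prefix leaves runnable SQL in grants-list.txt.
diff pre-upgrade.txt post-upgrade.txt | sed -n 's/^< //p' > grants-list.txt
cat grants-list.txt
```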
7.3 - Upgrade Vertica
Important
Before running the upgrade script, be sure to review the tasks described in
Before you upgrade.
Repeat this procedure for each version in your upgrade path:
-
Perform a full backup of your existing database. This precautionary measure lets you restore from the backup, if the upgrade is unsuccessful. If the upgrade fails, you can reinstall the previous version of Vertica and restore your database to that version.
If your upgrade path includes multiple versions, create a full backup with the first upgrade. For each subsequent upgrade, you can perform incremental backups. However, Vertica recommends full backups before each upgrade if disk space and time allow.
-
Use admintools to stop the database.
-
On each host where an additional package is installed, such as the R language pack, uninstall it. For example:
rpm -e vertica-R-lang
Important
If you omit this step and do not uninstall additional packages, the Vertica server package fails to install in the next step.
-
Log in as root or use sudo, and run one of the following commands to upgrade the server package:
- If you are root and installing an RPM:
# rpm -Uvh pathname
- If you are using sudo and installing an RPM:
$ sudo rpm -Uvh pathname
- If you are using sudo and installing a Debian package:
$ sudo dpkg -i pathname
-
On the same node on which you just installed the RPM, run update_vertica
as root or sudo. This installs the RPM on all the hosts in the cluster. For example:
Red Hat or CentOS
# /opt/vertica/sbin/update_vertica --rpm /home/dbadmin/vertica-12.0.x.x86_64.RHEL6.rpm --dba-user mydba
Debian
# /opt/vertica/sbin/update_vertica --deb /home/dbadmin/vertica-amd64.deb --dba-user mydba
The following requirements and restrictions apply:
-
The DBADMIN user must be able to read the RPM or DEB file when upgrading, because some upgrade scripts run as the DBADMIN user.
-
Use the same options that you used when you last installed or upgraded the database. You can find these options in /opt/vertica/config/admintools.conf
, on the install_opts
line. For details on all options, see Installing Vertica with the installation script.
Caution
If you omit any previous options, their default settings are restored. If you do so, or if you change any options, the upgrade script uses the new settings to reconfigure the cluster. This can cause issues with the upgraded database.
-
Omit the --hosts/-s
host-list
parameter. The upgrade script automatically identifies cluster hosts.
-
If the root user is not in /etc/sudoers, an error appears. The installer reports this issue with S0311. See the Sudoers Manual for more information.
-
Start the database. The start-up scripts analyze the database and perform necessary data and catalog updates for the new version.
If Vertica issues a warning stating that one or more packages cannot be installed, run the admintools install_package tool with the --force-reinstall
option to force reinstallation of the packages. For details, see Reinstalling packages.
-
When the upgrade is complete, the database automatically restarts.
Note
Manually restart any nodes that fail to start up.
-
Perform another database backup.
Upgrade duration
Upgrade duration depends on the average in-memory catalog size across all cluster nodes. For every 20 GB, you can expect the upgrade to last between one and two hours.
You can calculate catalog memory usage on all nodes by querying system table RESOURCE_POOL_STATUS:
=> SELECT node_name, pool_name, memory_size_kb FROM resource_pool_status WHERE pool_name = 'metadata';
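As a quick sanity check, the 20 GB guideline translates into a simple estimate. This shell sketch uses a hypothetical average catalog size:

```shell
# Sketch: rough duration estimate from the 1-2 hours per 20 GB
# guideline. catalog_gb is a hypothetical average in-memory catalog size.
catalog_gb=60
low_hours=$((catalog_gb / 20))          # ~1 hour per 20 GB
high_hours=$((catalog_gb * 2 / 20))     # ~2 hours per 20 GB
echo "estimated upgrade time: ${low_hours}-${high_hours} hours"
```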
Post-upgrade tasks
After you complete the upgrade, review post-upgrade tasks in After you upgrade.
7.4 - After you upgrade
After you finish upgrading the Vertica server package on your cluster, a number of tasks remain.
Required tasks
Optional tasks
7.4.1 - Rebuilding partitioned projections with pre-aggregated data
If you created projections in earlier (pre-10.0.x) releases with pre-aggregated data (for example, LAPs and TopK projections) and the anchor tables were partitioned with a GROUP BY clause, their ROS containers are liable to be corrupted from various DML and ILM operations. In this case, you must rebuild the projections:
-
Run the meta-function REFRESH on the database. If REFRESH detects problematic projections, it returns with failure messages. For example:
=> SELECT REFRESH();
REFRESH
-----------------------------------------------------------------------------------------------------
Refresh completed with the following outcomes:
Projection Name: [Anchor Table] [Status] [ Refresh Method] [Error Count]
"public"."store_sales_udt_sum": [store_sales] [failed: Drop and recreate projection] [] [1]
"public"."product_sales_largest": [store_sales] [failed: Drop and recreate projection] [] [1]
"public"."store_sales_recent": [store_sales] [failed: Drop and recreate projection] [] [1]
(1 row)
Vertica also logs messages to vertica.log
:
2020-07-07 11:28:41.618 Init Session:0x7fabbbfff700-a000000000005b5 [Txn] <INFO> Begin Txn: a000000000005b5 'Refresh: Evaluating which projection to refresh'
2020-07-07 11:28:41.640 Init Session:0x7fabbbfff700-a000000000005b5 [Refresh] <INFO> Storage issues detected, unable to refresh projection 'store_sales_recent'. Drop and recreate this projection, then refresh.
2020-07-07 11:28:41.641 Init Session:0x7fabbbfff700-a000000000005b5 [Refresh] <INFO> Storage issues detected, unable to refresh projection 'product_sales_largest'. Drop and recreate this projection, then refresh.
2020-07-07 11:28:41.641 Init Session:0x7fabbbfff700-a000000000005b5 [Refresh] <INFO> Storage issues detected, unable to refresh projection 'store_sales_udt_sum'. Drop and recreate this projection, then refresh.
-
Export the DDL of these projections with EXPORT_OBJECTS or EXPORT_TABLES.
-
Drop the projections, then recreate them as defined in the exported DDL.
-
Run REFRESH. Vertica rebuilds the projections with new storage containers.
7.4.2 - Verifying catalog memory consumption
Vertica versions ≥ 9.2 significantly reduce how much memory database catalogs consume. After you upgrade, check catalog memory consumption on each node to verify that the upgrade refactored catalogs correctly. If memory consumption for a given catalog is as large as or larger than it was in the earlier database, restart the host node.
Known issues
Certain operations might significantly inflate catalog memory consumption.
To refactor database catalogs and reduce their memory footprint, restart the database.
7.4.3 - Reinstalling packages
In most cases, Vertica automatically reinstalls all default packages when you restart your database for the first time after running the upgrade script. Occasionally, however, one or more packages might fail to reinstall correctly.
To verify that Vertica succeeded in reinstalling all packages:
-
Restart the database after upgrading.
-
Enter a correct password.
If any packages failed to reinstall, Vertica issues a message that specifies the uninstalled packages. In this case, run the admintools command install_package
with the option --force-reinstall
:
$ admintools -t install_package -d db-name -p password -P pkg-spec --force-reinstall
Options
Option | Function
-d db-name or --dbname=db-name | Database name
-p password or --password=pword | Database administrator password
-P pkg-spec or --package=pkg-spec | Specifies which packages to install, where pkg-spec is one of the following:
-
The name of a package—for example, flextable
-
all: all available packages
-
default: all default packages that are currently installed
--force-reinstall | Force installation of a package even if it is already installed.
Examples
Force reinstallation of default packages:
$ admintools -t install_package -d VMart -p 'password' -P default --force-reinstall
Force reinstallation of one package, flextable
:
$ admintools -t install_package -d VMart -p 'password' -P flextable --force-reinstall
7.4.4 - Writing bundle metadata to the catalog
Vertica internally stores physical table data in bundles together with metadata on the bundle contents. The query optimizer uses bundle metadata to look up and fetch the data it needs for a given query.
Vertica stores bundle metadata in the database catalog. This is especially beneficial in Eon mode: instead of fetching this metadata from remote (S3) storage, the optimizer can find it in the local catalog. This minimizes S3 reads, and facilitates faster query planning and overall execution.
Vertica writes bundle metadata to the catalog on two events:
-
Any DML operation that changes table content, such as INSERT
, UPDATE
, or COPY
. Vertica writes bundle metadata to the catalog on the new or changed table data. DML operations have no effect on bundle metadata for existing table data.
-
Invocations of function UPDATE_STORAGE_CATALOG
, as an argument to Vertica meta-function
DO_TM_TASK
, on existing data. You can narrow the scope of the catalog update operation to a specific projection or table. If no scope is specified, the operation is applied to the entire database.
Important
After upgrading to any Vertica version ≥ 9.2.1, you only need to call UPDATE_STORAGE_CATALOG
once on existing data. Bundle metadata on all new or updated data is always written automatically to the catalog.
For example, the following DO_TM_TASK
call writes bundle metadata on all projections in table store.store_sales_fact
:
=> SELECT DO_TM_TASK ('update_storage_catalog', 'store.store_sales_fact');
do_tm_task
-------------------------------------------------------------------------------
Task: update_storage_catalog
(Table: store.store_sales_fact) (Projection: store.store_sales_fact_b0)
(Table: store.store_sales_fact) (Projection: store.store_sales_fact_b1)
(1 row)
You can query system table
STORAGE_BUNDLE_INFO_STATISTICS
to determine which projections have invalid bundle metadata in the database catalog. For example, results from the following query show that the database catalog has invalid metadata for projections inventory_fact_b0
and inventory_fact_b1
:
=> SELECT node_name, projection_name, total_ros_count, ros_without_bundle_info_count
FROM v_monitor.storage_bundle_info_statistics where ros_without_bundle_info_count > 0
ORDER BY projection_name, node_name;
node_name | projection_name | total_ros_count | ros_without_bundle_info_count
------------------+-------------------+-----------------+-------------------------------
v_vmart_node0001 | inventory_fact_b0 | 1 | 1
v_vmart_node0002 | inventory_fact_b0 | 1 | 1
v_vmart_node0003 | inventory_fact_b0 | 1 | 1
v_vmart_node0001 | inventory_fact_b1 | 1 | 1
v_vmart_node0002 | inventory_fact_b1 | 1 | 1
v_vmart_node0003 | inventory_fact_b1 | 1 | 1
(6 rows)
Best practices
Updating the database catalog with UPDATE_STORAGE_CATALOG
is recommended only for Eon users. Enterprise users are unlikely to see measurable performance improvements from this update.
Calls to UPDATE_STORAGE_CATALOG
can incur considerable overhead, as the update process typically requires numerous and expensive S3 reads. Vertica advises against running this operation on the entire database. Instead, consider an incremental approach:
-
Call UPDATE_STORAGE_CATALOG
on a single large fact table. You can use performance metrics to estimate how much time updating other files will require.
-
Identify which tables are subject to frequent queries and prioritize catalog updates accordingly.
7.4.5 - Upgrading the streaming data scheduler utility
If you have integrated Vertica with a streaming data application, such as Apache Kafka, you must update the streaming data scheduler utility after you update Vertica.
From a command prompt, enter the following command:
/opt/vertica/packages/kafka/bin/vkconfig scheduler --upgrade --upgrade-to-schema schema_name
Running the upgrade task more than once has no effect.
For more information on the Scheduler utility, refer to Scheduler tool options.
8 - Uninstalling Vertica
For each host in the cluster:
-
Choose a host machine and log in as root (or log in as another user and switch to root).
$ su - root
password: root-password
-
Find the name of the package that is installed:
RPM
# rpm -qa | grep vertica
DEB
# dpkg -l | grep vertica
-
Remove the package:
RPM
# rpm -e package
DEB
# dpkg -r package
Note
If you want to delete the configuration file used with your installation, you can delete the /opt/vertica/ directory and all subdirectories with this command:
# rm -rf /opt/vertica/
For each client system:
-
Delete the JDBC driver jar file.
-
Delete ODBC driver data source names.
-
Delete the ODBC driver software:
-
In Windows, go to Start > Control Panel > Add or Remove Programs.
-
Locate Vertica.
-
Click Remove.
9 - Upgrading your operating system on nodes in your Vertica cluster
If you need to upgrade the operating system on the nodes in your Vertica cluster, check with the documentation for your Linux distribution to make sure they support the particular upgrade you are planning.
For example, the following articles provide information about upgrading Red Hat:
After you confirm that you can perform the upgrade, follow the steps at Best Practices for Upgrading the Operating System on Nodes in a Vertica Cluster.
10 - Using time zones with Vertica
Vertica uses the public-domain tz database (time zone database), which contains code and data that represent the history of local time for locations around the globe. This database organizes time zone and daylight saving time data by partitioning the world into timezones whose clocks all agree on timestamps that are later than the POSIX Epoch (1970-01-01 00:00:00 UTC). Each timezone has a unique identifier. Identifiers typically follow the convention area
/
location
, where area
is a continent or ocean, and location
is a specific location within the area—for example, Africa/Cairo, America/New_York, and Pacific/Honolulu.
Important
IANA acknowledges that 1970 is an arbitrary cutoff. It notes the problems that would come with moving the cutoff earlier "due to the wide variety of local practices before computer timekeeping became prevalent." IANA's own description of the tz database suggests that users should regard historical dates and times, especially those that predate the POSIX epoch date, with a healthy measure of skepticism. For details, see
Theory and pragmatics of the tz code and data.
Vertica uses the TZ
environment variable (if set) on each node for the default current time zone. Otherwise, Vertica uses the operating system time zone.
The TZ
variable can be set by the operating system during login (see /etc/profile
, /etc/profile.d
, or /etc/bashrc
) or by the user in .profile
, .bashrc
or .bash-profile
. TZ
must be set to the same value on each node when you start Vertica.
The following command returns the current time zone for your database:
=> SHOW TIMEZONE;
name | setting
----------+------------------
timezone | America/New_York
(1 row)
You can also set the time zone for a single session with SET TIME ZONE.
Conversion and storage of date/time data
There is no database default time zone. TIMESTAMPTZ (TIMESTAMP WITH TIMEZONE) data is converted from the current local time and stored as GMT/UTC (Greenwich Mean Time/Coordinated Universal Time).
When TIMESTAMPTZ data is used, data is converted back to the current local time zone, which might be different from the local time zone where the data was stored. This conversion takes into account daylight saving time (summer time), depending on the year and date to determine when daylight saving time begins and ends.
TIMESTAMP WITHOUT TIMEZONE data stores the timestamp as given, and retrieves it exactly as given. The current time zone is ignored. The same is true for TIME WITHOUT TIMEZONE. For TIME WITH TIMEZONE (TIMETZ), however, the current time zone setting is stored along with the given time, and that time zone is used on retrieval.
Note
Vertica recommends that you use TIMESTAMPTZ, not TIMETZ.
Querying date/time data
TIMESTAMPTZ uses the current time zone on both input and output, as in the following example:
=> CREATE TEMP TABLE s (tstz TIMESTAMPTZ);
=> SET TIMEZONE TO 'America/New_York';
=> INSERT INTO s VALUES ('2009-02-01 00:00:00');
=> INSERT INTO s VALUES ('2009-05-12 12:00:00');
=> SELECT tstz AS 'Local timezone', tstz AT TIMEZONE 'America/New_York' AS 'America/New_York',
tstz AT TIMEZONE 'GMT' AS 'GMT' FROM s;
Local timezone | America/New_York | GMT
------------------------+---------------------+---------------------
2009-02-01 00:00:00-05 | 2009-02-01 00:00:00 | 2009-02-01 05:00:00
2009-05-12 12:00:00-04 | 2009-05-12 12:00:00 | 2009-05-12 16:00:00
(2 rows)
The -05
in the Local time zone column shows that the data is displayed in EST, while -04
indicates EDT. The other two columns show the TIMESTAMP WITHOUT TIMEZONE at the specified time zone.
The next example shows what happens if the current time zone is changed to GMT:
=> SET TIMEZONE TO 'GMT';
=> SELECT tstz AS 'Local timezone', tstz AT TIMEZONE 'America/New_York' AS
'America/New_York', tstz AT TIMEZONE 'GMT' AS 'GMT' FROM s;
Local timezone | America/New_York | GMT
------------------------+---------------------+---------------------
2009-02-01 05:00:00+00 | 2009-02-01 00:00:00 | 2009-02-01 05:00:00
2009-05-12 16:00:00+00 | 2009-05-12 12:00:00 | 2009-05-12 16:00:00
(2 rows)
The +00 in the Local time zone column indicates that TIMESTAMPTZ is displayed in GMT.
The approach of using TIMESTAMPTZ fields to record events captures the GMT of the event, as expressed in terms of the local time zone. Later, it allows for easy conversion to any other time zone, either by setting the local time zone or by specifying an explicit AT TIMEZONE clause.
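You can observe the same store-as-UTC, convert-on-display behavior outside the database. This shell sketch assumes GNU date and an installed tz database:

```shell
# Sketch: one UTC instant rendered in two zones, mirroring how
# TIMESTAMPTZ stores GMT/UTC and converts on display.
utc='2009-02-01 05:00:00 UTC'
TZ=America/New_York date -d "$utc" '+%Y-%m-%d %H:%M:%S %Z'
TZ=GMT date -d "$utc" '+%Y-%m-%d %H:%M:%S %Z'
```

The first command prints the instant as local New York time (EST in February); the second prints the same instant in GMT, just as the SELECT examples above do.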
The following example shows how TIMESTAMP WITHOUT TIMEZONE fields work in Vertica.
=> CREATE TEMP TABLE tnoz (ts TIMESTAMP);
=> INSERT INTO tnoz VALUES('2009-02-01 00:00:00');
=> INSERT INTO tnoz VALUES('2009-05-12 12:00:00');
=> SET TIMEZONE TO 'GMT';
=> SELECT ts AS 'No timezone', ts AT TIMEZONE 'America/New_York' AS
'America/New_York', ts AT TIMEZONE 'GMT' AS 'GMT' FROM tnoz;
No timezone | America/New_York | GMT
---------------------+------------------------+------------------------
2009-02-01 00:00:00 | 2009-02-01 05:00:00+00 | 2009-02-01 00:00:00+00
2009-05-12 12:00:00 | 2009-05-12 16:00:00+00 | 2009-05-12 12:00:00+00
(2 rows)
The +00 at the end of a timestamp indicates that the result is TIMESTAMP WITH TIMEZONE in GMT (the current time zone). The America/New_York column shows what the GMT time was when you recorded the time, assuming you read a normal clock in the America/New_York time zone. This shows that if it is midnight in the America/New_York time zone, then it is 5 am GMT.
Note
00:00:00 Sunday February 1, 2009 in America/New_York converts to 05:00:00 Sunday February 1, 2009 in GMT.
The GMT column displays the GMT time, assuming the input data was captured in GMT.
If you don't set the time zone to GMT, and you use another time zone, for example America/New_York, then the results display in America/New_York with a -05 or -04 offset, showing the difference between that time zone and GMT.
=> SET TIMEZONE TO 'America/New_York';
=> SHOW TIMEZONE;
name | setting
----------+------------------
timezone | America/New_York
(1 row)
=> SELECT ts AS 'No timezone', ts AT TIMEZONE 'America/New_York' AS
'America/New_York', ts AT TIMEZONE 'GMT' AS 'GMT' FROM tnoz;
No timezone | America/New_York | GMT
---------------------+------------------------+------------------------
2009-02-01 00:00:00 | 2009-02-01 00:00:00-05 | 2009-01-31 19:00:00-05
2009-05-12 12:00:00 | 2009-05-12 12:00:00-04 | 2009-05-12 08:00:00-04
(2 rows)
In this case, the GMT column is the interesting one: it returns the local New York time for each timestamp, on the assumption that the data was captured in GMT.
See also