Install Vertica using the command line
This section describes how to install the Vertica software on a cluster of nodes. It assumes that you have already performed the tasks in Before You Install Vertica, and that you have a Vertica license key.
To install Vertica, complete the following tasks:
-
Download and install the Vertica server package
-
Install Vertica with the installation script
Special notes
-
Downgrade installations are not supported.
-
Be sure that you download the RPM for the correct operating system and architecture.
-
Vertica supports two-node clusters with zero fault tolerance (K=0 safety). This means that you can add a node to a single-node cluster, as long as the installation node (the node upon which you build) is not the loopback node (localhost/127.0.0.1).
-
The installer performs platform verification tests that prevent the install from continuing if the platform requirements are not met. These tests ensure that your platform meets the hardware and software requirements for Vertica. You can simply run the installer and view a list of the failures and warnings to determine which configuration changes you must make.
1 - Download and install the Vertica server package
To download and install the Vertica server package:
-
Use a Web browser to go to the Vertica website.
-
Click the Support tab and select Customer Downloads.
-
Log into the portal to download the install package.
Be sure the package you download matches the operating system and the machine architecture on which you intend to install it.
-
Transfer the installation package to the Administration host.
-
If you installed a previous version of Vertica on any of the hosts in the cluster, use the Administration tools to shut down any running database.
The database must stop normally; you cannot upgrade a database that requires recovery.
-
If you are using sudo, skip to the next step. If you are root, log in to the Administration Host as root (or log in as another user and switch to root).
$ su - root
password: root-password
#
Caution
When installing Vertica using an existing user as the dba, you must exit all UNIX terminal sessions for that user after setup completes and log in again to ensure that group privileges are applied correctly.
After Vertica is installed, you no longer need root privileges. To verify sudo, see Platform and hardware requirements and recommendations.
-
Use one of the following commands to run the RPM package installer:
where pathname is the Vertica package file you downloaded.
Note
If the package installer reports multiple dependency problems, or you receive the error "ERROR: You're attempting to install the wrong RPM for this operating system", then you are trying to install the wrong Vertica server package.
After you install the Vertica RPM, you can use several Validation scripts to help determine if your hosts and network can properly handle the processing and network traffic required by Vertica.
2 - Linux users created by Vertica
This topic describes the Linux accounts that the installer creates and configures so Vertica can run. When you install Vertica, the installation script optionally creates the following Linux user and group:
dbadmin and verticadba are the default names. If you want to change what these Linux accounts are called, you can do so using the installation script. See Install Vertica with the installation script for details.
Dbadmin privileges
The Linux dbadmin user owns the database catalog and data storage on disk. When you run the install script, Vertica creates this user on each node in the database cluster. It also adds dbadmin to the Linux dbadmin and verticadba groups, and configures the account as follows:
-
Configures and authorizes dbadmin for passwordless SSH between all cluster nodes. SSH must be installed and configured to allow passwordless logins. See Enable secure shell (SSH) logins.
-
Sets the dbadmin user's login shell to /bin/bash, which is required to run scripts such as install_vertica and the Administration tools.
-
Provides read-write-execute permissions on the following directories:
Note
The Vertica installation script also creates a Vertica database superuser named dbadmin. They share the same name, but they are not the same; one is a Linux user and the other is a Vertica user. See Database administration user for information about the database superuser.
After you install Vertica
Root or sudo privileges are not required to start or run Vertica after the installation process completes.
The dbadmin user can log in and perform Vertica tasks, such as creating a database, installing/changing the license key, or installing drivers. If dbadmin wants database directories in a location that differs from the default, the root user (or a user with sudo privileges) must create the requested directories and change ownership to the dbadmin user.
Vertica prevents administration from users other than the dbadmin user (or the user name you specified during the installation process if not dbadmin). Only this user can run Administration Tools.
3 - Validation scripts
Vertica provides several validation utilities that can be used prior to deploying Vertica to help determine if your hosts and network can properly handle the processing and network traffic required by Vertica. These utilities can also be used if you are encountering performance issues and need to troubleshoot the issue.
After you install the Vertica RPM, you have access to the following scripts in /opt/vertica/bin:
-
Vcpuperf - a CPU performance test used to verify your CPU performance.
-
Vioperf - an Input/Output test used to verify the speed and consistency of your hard drives.
-
Vnetperf - a Network test used to test the latency and throughput of your network between hosts.
These utilities can be run at any time, but are well suited to use before running the install_vertica script.
3.1 - Vcpuperf
The vcpuperf utility measures your server's CPU processing speed and compares it against benchmarks for common server CPUs. The utility performs a CPU test and measures the time it takes to complete the test. The lower the number scored on the test, the better the performance of the CPU.
The vcpuperf utility also checks the high and low load times to determine if CPU throttling is enabled. If a server's low-load computation time is significantly longer than the high-load computation time, CPU throttling may be enabled. CPU throttling is a power-saving feature. However, CPU throttling can reduce the performance of your server. Vertica recommends disabling CPU throttling to enhance server performance.
Syntax
vcpuperf [-q]
Options
-q
- Run in quiet mode. Quiet mode displays only the CPU Time, Real Time, and high and low load times.
Returns
-
CPU Time: the amount of time it took the CPU to run the test.
-
Real Time: the total time for the test to execute.
-
High load time: The amount of time to run the load test while simulating a high CPU load.
-
Low load time: The amount of time to run the load test while simulating a low CPU load.
Example
The following example shows a CPU that is running slightly slower than the expected time on a Xeon 5670 CPU that has CPU throttling enabled.
[root@node1 bin]# /opt/vertica/bin/vcpuperf
Compiled with: 4.1.2 20080704 (Red Hat 4.1.2-52)
Expected time on Core 2, 2.53GHz: ~9.5s
Expected time on Nehalem, 2.67GHz: ~9.0s
Expected time on Xeon 5670, 2.93GHz: ~8.0s
This machine's time:
CPU Time: 8.540000s
Real Time:8.710000s
Some machines automatically throttle the CPU to save power.
This test can be done in <100 microseconds (60-70 on Xeon 5670, 2.93GHz).
Low load times much larger than 100-200us or much larger than the corresponding high load time
indicate low-load throttling, which can adversely affect small query / concurrent performance.
This machine's high load time: 67 microseconds.
This machine's low load time: 208 microseconds.
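The throttling check described above can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the two cut-offs (a 200-microsecond absolute limit and a 2x ratio against the high-load time) are assumptions standing in for the tool's "much larger than" wording, not values that vcpuperf itself applies.

```python
import re


def parse_load_times(output):
    """Return the high and low load times (microseconds) found in vcpuperf output."""
    times = {}
    for kind in ("high", "low"):
        match = re.search(kind + r" load time:\s*(\d+)\s*microseconds", output)
        if match:
            times[kind] = int(match.group(1))
    return times


def looks_throttled(high_us, low_us, abs_limit_us=200, ratio_limit=2.0):
    """Flag low-load throttling; both limits are illustrative assumptions."""
    return low_us > abs_limit_us or low_us > ratio_limit * high_us


# The two load-time lines from the example output above.
sample = (
    "This machine's high load time: 67 microseconds.\n"
    "This machine's low load time: 208 microseconds.\n"
)
times = parse_load_times(sample)
print(times, looks_throttled(times["high"], times["low"]))
```

With the example values (67 and 208 microseconds), both assumed limits are exceeded, which matches the statement that this host has CPU throttling enabled.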
3.2 - Vioperf
The vioperf utility quickly tests the performance of your host's input and output subsystem. The utility runs Write, ReWrite, Read, and Skip Read tests.
The utility verifies that the host reads the same bytes that it wrote and prints its output to STDOUT. The utility also logs the output to a JSON formatted file.
For data in HDFS, the utility tests reads but not writes.
Syntax
vioperf [--help] [--duration=<INTERVAL>] [--log-interval=<INTERVAL>]
[--log-file=<FILE>] [--condense-log] [--thread-count=<N>] [--max-buffer-size=<SIZE>]
[--preserve-files] [--disable-crc] [--disable-direct-io] [--debug]
[<DIR>*]
-
The minimum required I/O is 20 MB/s read/write per physical processor core on each node, in full duplex (reading and writing) simultaneously, concurrently on all nodes of the cluster.
Note
Vertica supports some AWS instance types that do not meet these minimum I/O requirements. However, all supported AWS instance types, regardless of vioperf performance, can be used as Vertica cluster hosts. See Supported AWS instance types for a list of all supported AWS instance types.
-
The recommended I/O is 40 MB/s per physical core on each node.
-
The minimum required I/O rate for a node with 2 hyper-threaded six-core CPUs (12 physical cores) is 240 MB/s. Vertica recommends 480 MB/s.
For example, the I/O rate for a node with 2 hyper-threaded six-core CPUs (12 physical cores) is 240 MB/s required minimum, 480 MB/s recommended.
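The per-core figures above scale linearly with the number of physical cores. A minimal sketch of that arithmetic follows; the function name is hypothetical, and only the 20 MB/s and 40 MB/s rates come from the text.

```python
MIN_MB_PER_CORE = 20          # minimum required I/O per physical core
RECOMMENDED_MB_PER_CORE = 40  # recommended I/O per physical core


def io_targets(physical_cores):
    """Return (minimum, recommended) node I/O rates in MB/s."""
    if physical_cores < 1:
        raise ValueError("a node needs at least one physical core")
    return (physical_cores * MIN_MB_PER_CORE,
            physical_cores * RECOMMENDED_MB_PER_CORE)


# Two six-core CPUs: count the 12 physical cores, not the hyper-threads.
print(io_targets(12))  # (240, 480)
```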
Disk space vioperf needs
vioperf requires about 4.5 GB to run.
Options
--help
- Prints a help message and exits.
--duration
- The length of time vioperf runs performance tests. The default is 5 minutes. Specify the interval in seconds, minutes, or hours with any of these suffixes:
-
Seconds: s, sec, secs, second, seconds. Example: --duration=60sec
-
Minutes: m, min, mins, minute, minutes. Example: --duration=10min
-
Hours: h, hr, hrs, hour, hours. Example: --duration=1hrs
--log-interval
- The interval at which the log file reports summary information. The default interval is 10 seconds. This option uses the same interval notation as --duration.
--log-file
- The path and name where log file contents are written, in JSON. If not specified, then vioperf creates a file named results<date-time>.JSON in the current directory.
--condense-log
- Directs vioperf to write the log file contents in condensed format, one JSON entry per line, rather than as indented JSON syntax.
--thread-count=<N>
- The number of execution threads to use. By default, vioperf uses all threads available on the host machine.
--max-buffer-size=<SIZE>
- The maximum size of the in-memory buffer to use for reads or writes. Specify the units with any of these suffixes:
-
Bytes: b, byte, bytes.
-
Kilobytes: k, kb, kilobyte, kilobytes.
-
Megabytes: m, mb, megabyte, megabytes.
-
Gigabytes: g, gb, gigabyte, gigabytes.
--preserve-files
- Directs vioperf to keep the files it writes. This parameter is ignored for HDFS tests, which are read-only. Inspecting the files can help diagnose write-related failures.
--disable-crc
- Directs vioperf to ignore CRC checksums when validating writes. Verifying checksums can add overhead, particularly when running vioperf on slower processors. This parameter is ignored for HDFS tests.
--disable-direct-io
- When reading from or writing to a local file system, vioperf goes directly to disk by default, bypassing the operating system's page cache. Using direct I/O allows vioperf to measure performance quickly without having to fill the cache.
Disabling this behavior can produce more realistic performance results but slows down the operation of vioperf.
--debug
- Directs vioperf to report verbose error messages.
<DIR>
- Zero or more directories to test. If you do not specify a directory, vioperf tests the current directory. To test the performance of each disk, specify different directories mounted on different disks.
To test reads from a directory on HDFS:
-
Use a URL in the hdfs scheme that points to a single directory (not a path) containing files at least 10MB in size. For best results, use 10GB files and verify that there is at least one file per vioperf thread.
-
If you do not specify a host and port, set the HADOOP_CONF_DIR environment variable to a path including the Hadoop configuration files. This value is the same value that you use for the HadoopConfDir configuration parameter in Vertica. For more information see Configuring HDFS access.
-
If the HDFS cluster uses Kerberos, set the HADOOP_USER_NAME environment variable to a Kerberos principal.
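The interval notation shared by --duration and --log-interval can be illustrated with a small parser. This is a sketch, not vioperf's own parsing code: the function name is hypothetical, and it assumes whole-number values, lowercase suffixes, and no space between the number and the suffix, matching the examples given for --duration.

```python
import re

# Suffixes listed for --duration, mapped to their length in seconds.
_SUFFIX_SECONDS = {}
for _names, _factor in (
    (("s", "sec", "secs", "second", "seconds"), 1),
    (("m", "min", "mins", "minute", "minutes"), 60),
    (("h", "hr", "hrs", "hour", "hours"), 3600),
):
    for _name in _names:
        _SUFFIX_SECONDS[_name] = _factor


def interval_to_seconds(text):
    """Convert an interval such as '10min' to seconds."""
    match = re.fullmatch(r"(\d+)([a-z]+)", text)
    if match is None or match.group(2) not in _SUFFIX_SECONDS:
        raise ValueError("unrecognized interval: " + repr(text))
    return int(match.group(1)) * _SUFFIX_SECONDS[match.group(2)]


for example in ("60sec", "10min", "1hrs"):
    print(example, "=", interval_to_seconds(example), "seconds")
```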
Returns
The utility returns the following information:
test
- The test being run (Write, ReWrite, Read, or Skip Read)
directory
- The directory in which the test is being run.
counter name
- The counter type of the test being run. Can be either MB/s or Seeks per second.
counter value
- The value of the counter in MB/s or Seeks per second across all threads. This measurement represents the bandwidth at the exact time of measurement. Contrast with counter value (avg).
counter value (10 sec avg)
- The average amount of data in MB/s, or the average number of Seeks per second, for the test being run in the duration specified with --log-interval. The default interval is 10 seconds. The counter value (avg) is the average bandwidth since the last log message, across all threads.
counter value/core
- The counter value divided by the number of cores.
counter value/core (10 sec avg)
- The counter value (10 sec avg) divided by the number of cores.
thread count
- The number of threads used to run the test.
%CPU
- The available CPU percentage used during this test.
%IO Wait
- The CPU percentage in I/O Wait state during this test. I/O wait state is the time working processes are blocked while waiting for I/O operations to complete.
elapsed time
- The amount of time taken for a particular test. If you run the test multiple times, elapsed time increases the next time the test is run.
remaining time
- The time remaining until the next test. Based on the --duration option, each of the tests is run at least once. If the test set is run multiple times, then remaining time is how much longer the test will run. The remaining time value is cumulative. Its total is added to elapsed time each time the same test is run again.
Example
Invoking vioperf from a terminal outputs the following message and sample results:
[dbadmin@v_vmart_node0001 ~]$ /opt/vertica/bin/vioperf --duration=60s
The minimum required I/O is 20 MB/s read and write per physical processor core on each node, in full duplex
i.e. reading and writing at this rate simultaneously, concurrently on all nodes of the cluster.
The recommended I/O is 40 MB/s per physical core on each node.
For example, the I/O rate for a server node with 2 hyper-threaded six-core CPUs is 240 MB/s required minimum, 480 MB/s recommended.
Using direct io (buffer size=1048576, alignment=512) for directory "/home/dbadmin"
test | directory | counter name | counter value | counter value (10 sec avg) | counter value/core | counter value/core (10 sec avg) | thread count | %CPU | %IO Wait | elapsed time (s)| remaining time (s)
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Write | /home/dbadmin | MB/s | 420 | 420 | 210 | 210 | 2 | 89 | 10 | 10 | 5
Write | /home/dbadmin | MB/s | 412 | 396 | 206 | 198 | 2 | 89 | 9 | 15 | 0
ReWrite | /home/dbadmin | (MB-read+MB-write)/s | 150+150 | 150+150 | 75+75 | 75+75 | 2 | 58 | 40 | 10 | 5
ReWrite | /home/dbadmin | (MB-read+MB-write)/s | 158+158 | 172+172 | 79+79 | 86+86 | 2 | 64 | 33 | 15 | 0
Read | /home/dbadmin | MB/s | 194 | 194 | 97 | 97 | 2 | 69 | 26 | 10 | 5
Read | /home/dbadmin | MB/s | 192 | 190 | 96 | 95 | 2 | 71 | 27 | 15 | 0
SkipRead | /home/dbadmin | seeks/s | 659 | 659 | 329.5 | 329.5 | 2 | 2 | 85 | 10 | 5
SkipRead | /home/dbadmin | seeks/s | 677 | 714 | 338.5 | 357 | 2 | 2 | 59 | 15 | 0
Note
When evaluating performance for minimum and recommended I/O, include the Write and Read values in your evaluation. ReWrite and SkipRead values are not relevant to determining minimum and recommended I/O.
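As the note says, only the Write and Read rows matter for the minimum and recommended I/O check. The sketch below filters those rows from pipe-delimited vioperf output and grades the "counter value/core" column against the 20 MB/s and 40 MB/s figures. The function name is hypothetical, and it assumes that the per-core column corresponds to physical cores, which you should confirm for your hosts.

```python
def grade_io_rows(lines, minimum=20.0, recommended=40.0):
    """Grade Write and Read rows by their 'counter value/core' column (MB/s)."""
    results = []
    for line in lines:
        fields = [field.strip() for field in line.split("|")]
        # Skip the header, separator, ReWrite, and SkipRead rows.
        if len(fields) < 6 or fields[0] not in ("Write", "Read"):
            continue
        per_core = float(fields[5])  # sixth column: counter value/core
        if per_core >= recommended:
            grade = "recommended"
        elif per_core >= minimum:
            grade = "minimum"
        else:
            grade = "below minimum"
        results.append((fields[0], per_core, grade))
    return results


# Three rows copied from the example output above.
sample_rows = [
    "Write | /home/dbadmin | MB/s | 420 | 420 | 210 | 210 | 2 | 89 | 10 | 10 | 5",
    "ReWrite | /home/dbadmin | (MB-read+MB-write)/s | 150+150 | 150+150 | 75+75 | 75+75 | 2 | 58 | 40 | 10 | 5",
    "Read | /home/dbadmin | MB/s | 194 | 194 | 97 | 97 | 2 | 69 | 26 | 10 | 5",
]
for row in grade_io_rows(sample_rows):
    print(row)
```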
3.3 - Vnetperf
The vnetperf utility measures network performance of database hosts, as well as network latency and throughput for TCP and UDP protocols.
Caution
This utility incurs high network load, which degrades database performance. Do not use this utility on a Vertica production database.
This utility helps identify the following issues:
-
Low throughput for all hosts or one
-
High latency for all hosts or one
-
Bottlenecks between one or more hosts or subnets
-
Too-low limit on the number of TCP connections that can be established simultaneously
-
High rates of network packet loss
Syntax
vnetperf [options] [tests]
Options
--condense
- Condenses the log into one JSON entry per line, instead of indented JSON syntax.
--collect-logs
- Collects test log files from each host.
--datarate rate
- Limits throughput to this rate in MB/s. A rate of 0 loops the tests through several different rates.
Default: 0
--duration seconds
- Time limit for each test to run in seconds.
Default: 1
--hosts host-name[,...]
- Comma-separated list of host names or IP addresses on which to run the tests. The list must not contain embedded spaces.
--hosts file
- File that specifies the hosts on which to run the tests. If you omit this option, then vnetperf tries to access admintools to identify cluster hosts.
--identity-file file
- If using passwordless SSH/SCP access between hosts, then specify the key file used to gain access to the hosts.
--ignore-bad-hosts
- If set, runs tests on reachable hosts even if some hosts are not reachable. If you omit this option and a host is unreachable, then no tests are run on any hosts.
--log-dir directory
- If --collect-logs is set, specifies the directory in which to place the collected logs.
Default: logs.netperf.<timestamp>
--log-level level
- Log level to use, one of the following:
Default: WARN
--list-tests
- Lists the tests that vnetperf can run.
--output-file file
- The file to which JSON results are written.
Default: results.<timestamp>.json
--ports port#[,...]
- Comma-delimited list of port numbers to use. If only one port number is specified, then the next two numbers in sequence are also used.
Default: 14159,14160,14161
--scp-options 'scp-args'
- Specifies one or more standard SCP command line arguments. SCP is used to copy test binaries over to the target hosts.
--ssh-options 'ssh-args'
- Specifies one or more standard SSH command line arguments. SSH is used to issue test commands on the target hosts.
--tmp-dir directory
- Specifies the temporary directory for vnetperf, where directory must have execute permission on all hosts, and does not include the unsupported characters ", `, or '.
Default: /tmp (execute permission required)
--vertica-install directory
- Indicates that Vertica is installed on each of the hosts, so vnetperf uses test binaries on the target system rather than copying them over with SCP.
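The --ports rule (a single port implies the next two in sequence) can be illustrated as follows. This is a sketch of the documented behavior with a hypothetical function name, not vnetperf's implementation.

```python
DEFAULT_PORTS = "14159,14160,14161"


def expand_ports(spec=DEFAULT_PORTS):
    """Return the list of ports implied by a --ports value."""
    ports = [int(part) for part in spec.split(",")]
    if len(ports) == 1:
        # A single port also claims the next two numbers in sequence.
        ports = [ports[0], ports[0] + 1, ports[0] + 2]
    return ports


print(expand_ports("5000"))  # [5000, 5001, 5002]
print(expand_ports())        # [14159, 14160, 14161]
```

A helper like this can be handy when checking firewall rules, because it shows every port that a given --ports value would need open.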
Tests
You can specify one or more of the following tests. If no test is specified, vnetperf runs all tests. Test results are printed for each host.
Test | Description | Results
latency | Measures latency from the host that is running the script to other hosts. Hosts with unusually high latency should be investigated further. |
tcp-throughput | Tests TCP throughput among hosts. |
udp-throughput | Tests UDP throughput among hosts. |
-
Maximum recommended RTT (round-trip time) latency is 1000 microseconds. Ideal RTT latency is 200 microseconds or less. Vertica recommends that clock skew be less than 1 second.
-
Minimum recommended throughput is 100 MB/s. Ideal throughput is 800 MB/s or more.
Note
UDP throughput can be lower; multiple network switches can adversely affect performance.
Example
$ vnetperf latency tcp-throughput
The maximum recommended rtt latency is 2 milliseconds. The ideal rtt latency is 200 microseconds or less. It is recommended that clock skew be kept to under 1 second.
test | date | node | index | rtt latency (us) | clock skew (us)
-------------------------------------------------------------------------------------------------------------------------
latency | 2022-03-29_10:23:55,739 | 10.20.100.247 | 0 | 49 | 3
latency | 2022-03-29_10:23:55,739 | 10.20.100.248 | 1 | 272 | -702
latency | 2022-03-29_10:23:55,739 | 10.20.100.249 | 2 | 245 | 1037
The minimum recommended throughput is 100 MB/s. Ideal throughput is 800 MB/s or more. Note: UDP numbers may be lower, multiple network switches may reduce performance results.
date | test | rate limit (MB/s) | node | MB/s (sent) | MB/s (rec) | bytes (sent) | bytes (rec) | duration (s)
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
2022-03-29_10:23:55,742 | tcp-throughput | 32 | 10.20.100.247 | 30.579 | 30.579 | 32112640 | 32112640 | 1.00151
2022-03-29_10:23:55,742 | tcp-throughput | 32 | 10.20.100.248 | 30.5791 | 30.5791 | 32112640 | 32112640 | 1.0015
2022-03-29_10:23:55,742 | tcp-throughput | 32 | 10.20.100.249 | 30.5791 | 30.5791 | 32112640 | 32112640 | 1.0015
2022-03-29_10:23:55,742 | tcp-throughput | 32 | average | 30.579 | 30.579 | 32112640 | 32112640 | 1.0015
2022-03-29_10:23:57,749 | tcp-throughput | 64 | 10.20.100.247 | 61.0952 | 61.0952 | 64094208 | 64094208 | 1.00049
2022-03-29_10:23:57,749 | tcp-throughput | 64 | 10.20.100.248 | 61.096 | 61.096 | 64094208 | 64094208 | 1.00048
2022-03-29_10:23:57,749 | tcp-throughput | 64 | 10.20.100.249 | 61.0952 | 61.0952 | 64094208 | 64094208 | 1.00049
2022-03-29_10:23:57,749 | tcp-throughput | 64 | average | 61.0955 | 61.0955 | 64094208 | 64094208 | 1.00048
2022-03-29_10:23:59,753 | tcp-throughput | 128 | 10.20.100.247 | 122.131 | 122.131 | 128122880 | 128122880 | 1.00046
2022-03-29_10:23:59,753 | tcp-throughput | 128 | 10.20.100.248 | 122.132 | 122.132 | 128122880 | 128122880 | 1.00046
2022-03-29_10:23:59,753 | tcp-throughput | 128 | 10.20.100.249 | 122.132 | 122.132 | 128122880 | 128122880 | 1.00046
2022-03-29_10:23:59,753 | tcp-throughput | 128 | average | 122.132 | 122.132 | 128122880 | 128122880 | 1.00046
2022-03-29_10:24:01,757 | tcp-throughput | 256 | 10.20.100.247 | 243.819 | 244.132 | 255754240 | 256081920 | 1.00036
2022-03-29_10:24:01,757 | tcp-throughput | 256 | 10.20.100.248 | 244.125 | 243.282 | 256049152 | 255164416 | 1.00025
2022-03-29_10:24:01,757 | tcp-throughput | 256 | 10.20.100.249 | 244.172 | 243.391 | 256114688 | 255295488 | 1.00032
2022-03-29_10:24:01,757 | tcp-throughput | 256 | average | 244.039 | 243.601 | 255972693 | 255513941 | 1.00031
2022-03-29_10:24:03,761 | tcp-throughput | 512 | 10.20.100.247 | 337.232 | 485.247 | 355893248 | 512098304 | 1.00645
2022-03-29_10:24:03,761 | tcp-throughput | 512 | 10.20.100.248 | 446.16 | 231.001 | 467894272 | 242253824 | 1.00013
2022-03-29_10:24:03,761 | tcp-throughput | 512 | 10.20.100.249 | 349.667 | 409.961 | 368476160 | 432013312 | 1.00497
2022-03-29_10:24:03,761 | tcp-throughput | 512 | average | 377.686 | 375.403 | 397421226 | 395455146 | 1.00385
2022-03-29_10:24:05,772 | tcp-throughput | 640 | 10.20.100.247 | 328.279 | 509.256 | 383975424 | 595656704 | 1.11548
2022-03-29_10:24:05,772 | tcp-throughput | 640 | 10.20.100.248 | 505.626 | 217.217 | 532250624 | 228655104 | 1.00389
2022-03-29_10:24:05,772 | tcp-throughput | 640 | 10.20.100.249 | 390.355 | 474.89 | 410812416 | 499777536 | 1.00365
2022-03-29_10:24:05,772 | tcp-throughput | 640 | average | 408.087 | 400.454 | 442346154 | 441363114 | 1.04101
2022-03-29_10:24:07,892 | tcp-throughput | 768 | 10.20.100.247 | 300.5 | 426.762 | 318734336 | 452657152 | 1.01154
2022-03-29_10:24:07,892 | tcp-throughput | 768 | 10.20.100.248 | 268.252 | 402.891 | 283017216 | 425066496 | 1.00616
2022-03-29_10:24:07,892 | tcp-throughput | 768 | 10.20.100.249 | 510.569 | 243.649 | 535592960 | 255590400 | 1.00042
2022-03-29_10:24:07,892 | tcp-throughput | 768 | average | 359.774 | 357.767 | 379114837 | 377771349 | 1.00604
2022-03-29_10:24:09,911 | tcp-throughput | 1024 | 10.20.100.247 | 304.545 | 444.261 | 334987264 | 488669184 | 1.049
2022-03-29_10:24:09,911 | tcp-throughput | 1024 | 10.20.100.248 | 422.246 | 192.773 | 474284032 | 216530944 | 1.07121
2022-03-29_10:24:09,911 | tcp-throughput | 1024 | 10.20.100.249 | 353.206 | 446.809 | 378732544 | 479100928 | 1.0226
2022-03-29_10:24:09,911 | tcp-throughput | 1024 | average | 359.999 | 361.281 | 396001280 | 394767018 | 1.0476
2022-03-29_10:24:11,988 | tcp-throughput | 2048 | 10.20.100.247 | 343.324 | 414.559 | 387710976 | 468156416 | 1.07697
2022-03-29_10:24:11,988 | tcp-throughput | 2048 | 10.20.100.248 | 292.44 | 246.254 | 308314112 | 259620864 | 1.00544
2022-03-29_10:24:11,988 | tcp-throughput | 2048 | 10.20.100.249 | 437.559 | 405.02 | 459145216 | 425000960 | 1.00072
2022-03-29_10:24:11,988 | tcp-throughput | 2048 | average | 357.774 | 355.278 | 385056768 | 384259413 | 1.02771
JSON results available at: ./results.2022-03-29_10:23:51,548.json
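The latency rows in output like the above can be compared against the recommended values with a short script. The sketch below is illustrative: the function name is hypothetical, the 200-microsecond ideal and 1-second clock skew limits come from the recommendations in this section, and the 1000-microsecond maximum is a parameter because the tool's own banner in the example quotes 2 milliseconds.

```python
def classify_latency(lines, ideal_us=200, max_us=1000, max_skew_us=1_000_000):
    """Return (node, grade, skew_ok) for each latency row of vnetperf output."""
    report = []
    for line in lines:
        fields = [field.strip() for field in line.split("|")]
        # Latency rows have six columns and start with the test name.
        if len(fields) != 6 or fields[0] != "latency":
            continue
        node, rtt_us, skew_us = fields[2], int(fields[4]), int(fields[5])
        if rtt_us <= ideal_us:
            grade = "ideal"
        elif rtt_us <= max_us:
            grade = "acceptable"
        else:
            grade = "high"
        report.append((node, grade, abs(skew_us) < max_skew_us))
    return report


# The three latency rows from the example output above.
latency_rows = [
    "latency | 2022-03-29_10:23:55,739 | 10.20.100.247 | 0 | 49 | 3",
    "latency | 2022-03-29_10:23:55,739 | 10.20.100.248 | 1 | 272 | -702",
    "latency | 2022-03-29_10:23:55,739 | 10.20.100.249 | 2 | 245 | 1037",
]
for entry in classify_latency(latency_rows):
    print(entry)
```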
4 - Install Vertica with the installation script
You can run the installation script after you install the Vertica package. The installation script runs on a single node, using a Bash shell. The script copies the Vertica package to all other hosts (identified by the --hosts argument) in your planned cluster.
Tip
To speed up the installation, you can provide a local copy of the RPM to each node in the cluster before running the install script. This allows the installer to bypass the time-consuming process of copying the RPM to the nodes. For details, see --no-rpm-copy.
The installation script runs several tests on each of the target hosts to verify that the hosts meet system and performance requirements for a Vertica node. The installation script modifies some operating system configuration settings to meet these requirements. Other settings cannot be modified by the installation script and must be manually reconfigured. For details on operating system configuration settings, see Manually configured operating system settings and Automatically configured operating system settings.
Note
The installation script sets up passwordless ssh for the admin user across all hosts. If passwordless ssh is already set up, the installation script verifies that it functions correctly.
4.1 - Install on a FIPS 140-2 enabled machine
Vertica supports the implementation of the Federal Information Processing Standard 140-2 (FIPS). You enable FIPS mode in the operating system.
Note
Enabling FIPS on the operating system occurs outside of Vertica.
During installation, the install_vertica script detects whether the host is operating in FIPS mode. The installer searches for the file /proc/sys/crypto/fips_enabled and examines its content. If the file exists and contains '1', the host is operating in FIPS mode and the following message appears:
/proc/sys/crypto/fips_enabled exists and contains '1', this is a FIPS system
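You can run the same kind of check yourself before installing. The sketch below mirrors the described test (the file exists and contains '1'); the function name is hypothetical, and this is not the installer's own code.

```python
from pathlib import Path


def fips_enabled(flag_file="/proc/sys/crypto/fips_enabled"):
    """Return True if the flag file exists and contains '1'."""
    try:
        return Path(flag_file).read_text().strip() == "1"
    except OSError:
        # Missing or unreadable file: treat the host as not in FIPS mode.
        return False


print("FIPS mode:", fips_enabled())
```

The path is a parameter only so that the function can be exercised against a test file; on a real host the default is the one that matters.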
Important
On certain systems where the libssl and libcrypto libraries do not have versioning information, when starting Vertica, you may see the message
No version information available
This message is benign and you can ignore it.
To implement FIPS 140-2 on your Vertica Analytic Database, you need to configure both the server and the client you are using. To see the detailed configuration steps, go to Implementing FIPS 140-2.
Symbolic links for OpenSSL
On some non-FIPS systems, versioning anomalies can occur when you install a new version of OpenSSL. Sometimes, the default OpenSSL build procedure produces libraries with versions named 1.0.0. For Vertica to recognize that a library has a higher version number, the library name with a higher version number must be provided. As part of the Vertica installation, symbolic links are created to the appropriate OpenSSL files. The steps are as follows:
-
The RPM installer places two OpenSSL library files in /opt/vertica/lib:
-
libssl.so.1.1
-
libcrypto.so.1.1
-
The install_vertica script creates two symbolic links in /opt/vertica/lib:
-
The symbolic links point to libssl.so.1.1 and libcrypto.so.1.1, which the RPM installer placed in /opt/vertica/lib.
4.2 - Specifying disk storage location during installation
You can specify the disk storage location when you:
Specifying disk storage location when you install
When you install Vertica, the --data-dir parameter in the install_vertica script lets you specify a directory to contain database data and catalog files. The script defaults to the database administrator's default home directory /home/dbadmin.
Important
Replace this default with a directory that has adequate space to hold your data and catalog files.
Requirements
-
The data and catalog directory must exist on each node in the cluster.
-
The directory on each node must be owned by the database administrator.
-
Catalog and data path names must contain only alphanumeric characters and cannot have leading space characters. Failure to comply with these restrictions will result in database creation failure.
-
Vertica refuses to overwrite a directory if it appears to be in use by another database. Therefore, if you created a database for evaluation purposes, dropped the database, and want to reuse the database name, make sure that the disk storage location previously used has been completely cleaned up. See Managing storage locations for details.
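Some of these requirements can be checked on each node before you run the installer. The following sketch covers only three of them (the directory exists, it is owned by the expected user, and the path has no leading space); the function name is hypothetical, and the character restrictions and the in-use check are left to Vertica.

```python
import os


def storage_dir_problems(path, owner_uid):
    """Return a list of problems found with a planned data/catalog directory."""
    problems = []
    if path != path.lstrip():
        problems.append("path has leading whitespace")
    if not os.path.isdir(path):
        problems.append("directory does not exist")
    elif os.stat(path).st_uid != owner_uid:
        problems.append("directory is not owned by the expected user")
    return problems


# Example: check the current directory against the current user's UID.
# For the database administrator, pass that account's UID instead
# (for example, from pwd.getpwnam("dbadmin").pw_uid on a host where it exists).
print(storage_dir_problems(os.getcwd(), os.getuid()))
```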
4.3 - Perform a basic install
For all installation options, see install_vertica options.
-
As root (or sudo) run the install script. The script must be run by a BASH shell as root or as a user with sudo privileges. You can configure many options when running the install script. See Basic Installation Parameters for the required options.
If the installer fails due to any requirements not being met, you can correct the issue and then rerun the installer with the same command line options.
To perform a basic installation:
-
As root:
# /opt/vertica/sbin/install_vertica --hosts host_list --rpm package_name \
--dba-user dba_username --parallel-no-prompts
-
Using sudo:
$ sudo /opt/vertica/sbin/install_vertica --hosts host_list --rpm package_name \
--dba-user dba_username --parallel-no-prompts
Important
If you place install_vertica in a location other than /opt/vertica, create a symlink from that location to /opt/vertica. Create this symlink on all cluster nodes, otherwise the database will not start.
-
When prompted for a password to log into the other nodes, provide the requested password. Doing so allows the installation of the package and system configuration on the other cluster nodes.
-
If you are root, this is the root password.
-
If you are using sudo, this is the sudo user password.
The password does not echo on the command line. For example:
Vertica Database 24.2.x Installation Tool
Please enter password for root@host01:password
-
If the dbadmin user, or the user specified in the argument --dba-user, does not exist, then the install script prompts for the password for the user. Provide the password. For example:
Enter password for new UNIX user dbadmin:password
Retype new UNIX password for user dbadmin:password
-
Carefully examine any warnings or failures returned by install_vertica and correct the problems.
For example, insufficient RAM, insufficient network throughput, and excessively high readahead settings on the file system can cause performance problems later. Additionally, unresolved LANG warnings can cause database startup to fail and can cause problems with vsql. The system LANG attributes must be UTF-8 compatible. After you fix the problems, rerun the install script.
-
When installation is successful, disconnect from the Administration host, as instructed by the script. Then, complete the required post-installation steps.
At this point, root privileges are no longer needed and the database administrator can perform any remaining steps.
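As an illustration of the steps above, the following sketch checks that a locale string is UTF-8 compatible and prints the basic install command for review instead of running it. Both function names are hypothetical helpers, not part of Vertica, and the host list, package path, and user name are placeholders that you supply.

```shell
# Returns 0 when the given locale string names a UTF-8 locale,
# for example en_US.UTF-8 or en_US.utf8.
lang_is_utf8() {
    case $(printf '%s' "$1" | tr '[:upper:]' '[:lower:]') in
        *.utf-8*|*.utf8*) return 0 ;;
        *) return 1 ;;
    esac
}

# Prints the basic install command so that it can be reviewed first.
# Arguments: host list, package path, DBA user name (all placeholders).
print_basic_install() {
    printf '/opt/vertica/sbin/install_vertica --hosts %s --rpm %s --dba-user %s --parallel-no-prompts\n' \
        "$1" "$2" "$3"
}
```

A possible use is to run `lang_is_utf8 "$LANG"` on each node before the install, then run the printed command as root or with sudo.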
4.4 - install_vertica options
The following tables describe install_vertica script options. Most options have long and short forms, for example --hosts and -s.
Required
install_vertica
requires the following options:
-
--hosts
/ -s
-
--rpm
/ -r
| --deb
| --no-rpm-copy
-
--dba-user username | -u username
Required only if installing using sudo or upgrading versions.
For example:
# /opt/vertica/sbin/install_vertica --hosts node0001,node0002,node0003 --rpm /tmp/vertica-version.RHEL8.x86_64.rpm
--hosts
hostlist
-s
hostlist
- Comma-separated list of host names or IP addresses to include in the cluster. The list must not include embedded spaces. For example:
-
--hosts node01,node02,node03
-
--hosts 192.168.233.101,192.168.233.102,192.168.233.103
-
--hosts fd95:ff5d:5549:bdb0::1,fd95:ff5d:5549:bdb0::2,fd95:ff5d:5549:bdb0::3
The following requirements apply:
-
If upgrading an existing installation of Vertica, use the same host names used previously.
-
IP addresses or hostnames must be for unique hosts. Do not list the same host using multiple IP addresses/hostnames.
Note
Vertica stores only IP addresses in its configuration files. If you provide host names, they are converted to IP addresses when the script runs.
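The host list rules above (no embedded spaces, no host listed twice) can be checked before calling the installer. This is a minimal sketch with a hypothetical function name. It compares entries as text only, so it cannot detect one host that is listed under both a host name and an IP address.

```shell
# Returns 0 if the comma-separated host list has no spaces,
# no empty entries, and no entry that appears more than once.
check_host_list() {
    list=$1
    case $list in
        ""|*" "*) return 1 ;;        # empty list, or contains a space
        ,*|*,|*,,*) return 1 ;;      # empty entry
    esac
    # uniq -d prints only the entries that occur more than once.
    dupes=$(printf '%s\n' "$list" | tr ',' '\n' | sort | uniq -d)
    [ -z "$dupes" ]
}
```

For example, `check_host_list node01,node02,node03` succeeds, while `check_host_list "node01, node02"` and `check_host_list node01,node01` fail.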
--rpm
package-name
-r
package-name
--deb
package-name
- Path and name of the Vertica RPM or Debian package. For example:
--rpm /tmp/vertica-version.RHEL8.x86_64.rpm
For Debian and Ubuntu installs, provide the name of the Debian package:
--deb /tmp/vertica_10.1_amd64.deb
The install package must be provided if you install or upgrade the Vertica server package on multiple nodes where the nodes do not have the latest server package installed, or if you are adding a new node. You do not need to provide the server package if you have a local copy of the RPM on each node and call the install script with the --no-rpm-copy option. Unless you provide the --no-rpm-copy option, the install_vertica and update_vertica scripts serially copy the server package to the other nodes and install the package.
Tip
If installing or upgrading a large number of nodes, consider manually installing the package on all nodes before running the install/upgrade script. The script runs faster if it does not need to serially upload and install the package on each node.
--no-rpm-copy
- Installer does not copy the RPM to the nodes in the cluster. The RPM must be present on each node specified by --hosts, and you must provide the path to the local RPM files with the --rpm-path option (defaults to /tmp/dbRPM.rpm). If you specify this option, you do not need to provide the --rpm option.
--dba-user
username
-u
username
- Name of the database superuser account to create. Only this account can run the Administration Tools. If you omit this parameter, then the default administrator account name is dbadmin.
This parameter is optional for new installations done as root; it must be specified when upgrading or when installing using sudo. If upgrading, use this parameter to specify the same account name that you used previously. If installing using sudo, username must already exist.
If you manually create the user, modify the user's .bashrc file to include the line PATH=/opt/vertica/bin:$PATH so that this user can easily start Vertica tools such as vsql and admintools.
For details on a minimal installation procedure, see Perform a basic install.
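For a manually created administrator account, the .bashrc change described above can be made so that repeated runs do not add the line twice. This is a minimal sketch. The function name is hypothetical, and the file path is passed in as an argument rather than assumed.

```shell
# Appends the PATH line to the given file unless an identical line
# is already present.
add_vertica_path() {
    rc_file=$1
    # Single quotes keep $PATH literal so that it is expanded at login,
    # not when this function runs.
    line='PATH=/opt/vertica/bin:$PATH'
    touch "$rc_file" || return 1
    # -x matches whole lines only, -F treats the line as a fixed string.
    grep -qxF "$line" "$rc_file" || printf '%s\n' "$line" >> "$rc_file"
}
```

A typical call would be `add_vertica_path /home/dbadmin/.bashrc`, run as the user who owns the file.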
Optional
The following
install_vertica
options are not required. Many of them enable greater control over the installation process.
--help
- Display help for this script.
--accept-eula -Y
- Silently accepts the EULA agreement. On multi-node installations, this option is propagated across the cluster at the end of the installation, at the same time as the Administration Tools metadata.
Combine this option with --license (-L) to activate your license.
--add-hosts
hostlist
-A
hostlist
- Comma-separated list of hosts to add to an existing Vertica cluster.
--add-hosts modifies an existing installation of Vertica by adding a host to the database cluster and then reconfiguring spread. This is useful for improving system performance, or making the database K-safe.
If spread is configured in your installation to use point-to-point communication within the existing cluster, you must also use it when you add a new host; otherwise, the new host automatically uses UDP broadcast traffic, resulting in cluster communication problems that prevent Vertica from running properly. For example:
--add-hosts host01
--add-hosts 192.168.233.101
You can also use this option with the update_vertica script. For details, see Adding nodes.
--broadcast -U
- Configures spread to use UDP broadcast traffic between nodes on the subnet. This is the default setting. Up to 80 spread daemons are supported by broadcast traffic. You can exceed the 80-node limit by using large cluster mode, which does not install a spread daemon on each node.
Do not combine this option with --point-to-point.
Important
When changing the configuration from --point-to-point to --broadcast, you must also specify --control-network.
--clean
- Forcibly cleans previously stored configuration files. Use this option if you need to change the hosts that are included in your cluster. Only use this option when no database is defined.
This option is not supported by the update_vertica script.
--config-file
file
-z
file
- Use the properties file created earlier with --record-config. This properties file contains key/value settings that map to install_vertica options.
--control-network { IPaddress | default } -S { IPaddress | default }
- Set to one of the following arguments:
IPaddress
: A broadcast network IP address that enables configuration of spread communications on a subnet different from other Vertica data communications.
default
Important
IPaddress must match the subnet for at least some database nodes. If the address does not match the subnet of any database node, then the installer displays an error and stops. If the provided address matches some, but not all, of the nodes' subnets, the installer displays a warning, but installation continues.
Optimally, the value for --control-network matches all node subnets.
You can also use this option to force a cluster-wide spread reconfiguration when changing spread-related options.
--data-dir
directory
-d
directory
- Directory for database data and catalog files. For details, see Specifying disk storage location during installation and Managing storage locations.
Caution
Do not use a shared directory over more than one host for this setting. Data and catalog directories must be distinct for each node. Multiple nodes must not be allowed to write to the same data or catalog directory.
Default: /home/dbadmin
--dba-group
group
-g
group
- UNIX group for DBA users.
Default: verticadba
--dba-user-home
directory
-l
directory
- Home directory for the database administrator.
Default: /home/dbadmin
--dba-user-password
password
-p
password
- Password for the database administrator account. If omitted, the script prompts for a password and does not echo the input.
--dba-user-password-disabled
- Disables the password for --dba-user. This argument stops the installer from prompting for a password for --dba-user. You can assign a password later using standard user management tools such as passwd.
--failure-threshold [
threshold-arg
]
- Stops the installation when the specified failure threshold is encountered, where
threshold-arg
is one of the following:
HINT
: Stop the install if a HINT or greater issue is encountered during the installation tests. HINT configurations are settings you should make, but the database runs with no significant negative consequences if you omit the setting.
WARN
: Stop the installation if a WARN or greater issue is encountered. WARN issues might affect database performance. However, for environments where high-level performance is not a priority—for example, testing—WARN issues can be ignored.
FAIL
: Stop the installation if a FAIL or greater issue is encountered. FAIL issues can have severely negative performance consequences and possible later processing issues if not addressed. However, Vertica can start even if FAIL issues are ignored.
HALT
: Stop the installation if a HALT or greater issue is encountered. The database might be unable to start if you choose this option. This option is not supported in production environments.
NONE
: Do not stop the installation. The database might be unable to start if you choose this option. This option is not supported in production environments.
Default: WARN
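The threshold comparison described above can be pictured as an ordered scale, HINT < WARN < FAIL < HALT, with NONE meaning that the installation never stops. The following sketch only illustrates that ordering. It is not the installer's implementation, and both function names are hypothetical.

```shell
# Maps a severity name to its rank on the scale HINT < WARN < FAIL < HALT.
# Unknown names print nothing and return 1.
severity_rank() {
    case $1 in
        HINT) echo 1 ;;
        WARN) echo 2 ;;
        FAIL) echo 3 ;;
        HALT) echo 4 ;;
        *) return 1 ;;
    esac
}

# Returns 0 (stop) when an issue of the given severity meets or exceeds
# the threshold. A threshold of NONE never stops. Returns 2 on bad input.
should_stop() {
    threshold=$1
    issue=$2
    if [ "$threshold" = NONE ]; then
        return 1
    fi
    t=$(severity_rank "$threshold") || return 2
    i=$(severity_rank "$issue") || return 2
    [ "$i" -ge "$t" ]
}
```

With the default threshold, `should_stop WARN FAIL` succeeds (the installation stops), while `should_stop WARN HINT` does not.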
--ipv4
- Hosts in the cluster are identified by IPv4 network addresses. This is the default behavior.
--ipv6
- Hosts in the cluster are identified by IPv6 network addresses. This option is required if the --hosts list specifies IPv6 addresses. This option automatically enables the --point-to-point option.
--large-cluster [
num-control-nodes
| default ]
- Enables the large cluster feature, where a subset of nodes called control nodes connect to spread to send and receive broadcast messages. Consider using this option for a cluster with more than 50 nodes in Enterprise Mode. Vertica automatically enables this feature if you install onto 120 or more nodes in Enterprise Mode, or 16 or more nodes in Eon Mode.
Supply this option with one of the following arguments:
-
num-control-nodes
: Sets the number of control nodes in the new database to the smaller of this value or the value of --hosts
. This value is applied differently in Enterprise Mode and Eon Mode:
- Enterprise Mode: Sets the number of control nodes in the entire cluster.
- Eon Mode: Sets the number of control nodes in the initial default subcluster. This value must be between 1 and 120, inclusive.
-
default
: Vertica sets the number of control nodes to the square root of the total number of cluster nodes listed in --hosts
(-s
).
For details, see Enable Large Cluster When Installing Vertica.
Default: default
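The two numeric rules above can be sketched as follows: the control node count is the smaller of the requested value and the number of hosts, and the feature is enabled automatically at 120 or more nodes in Enterprise Mode or 16 or more in Eon Mode. The function names and the mode keywords are hypothetical. The square-root default is left out because the text does not state how it is rounded.

```shell
# Prints the number of control nodes: the smaller of the requested
# value and the number of hosts in the cluster.
control_node_count() {
    requested=$1
    host_count=$2
    if [ "$requested" -lt "$host_count" ]; then
        echo "$requested"
    else
        echo "$host_count"
    fi
}

# Returns 0 when the large cluster feature is enabled automatically.
# mode is "enterprise" or "eon" (keywords chosen for this sketch).
large_cluster_auto_enabled() {
    mode=$1
    nodes=$2
    case $mode in
        enterprise) [ "$nodes" -ge 120 ] ;;
        eon) [ "$nodes" -ge 16 ] ;;
        *) return 2 ;;
    esac
}
```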
--license { license-file | CE } -L { license-file | CE }
- Silently and automatically deploys the license key to /opt/vertica/config/share. On multi-node installations, the --license option also applies the license to all nodes declared by --hosts. To activate your license, combine this option with the --accept-eula option. If you do not use the --accept-eula option, you are asked to accept the EULA when you connect to your database. After you accept the EULA, your license is activated.
If specified with CE, this option automatically deploys the Community Edition license key, which is included in your download.
--no-system-configuration
- Installer makes no changes to system properties. By default, the installer makes system configuration changes that meet server requirements.
If you use this option, the installer posts warnings or failures for configuration settings that do not meet requirements that it otherwise configures automatically.
This option has no effect on creating or updating user accounts.
--parallel-no-prompts
- Installs the server binary package (.rpm or .deb) on the hosts in parallel without prompting for confirmation. This option reduces the installation time, especially on large clusters. If omitted, the install script installs the package on one host at a time.
This option requires that the installer use passwordless ssh to connect to the hosts. It has no effect if the installer is not using passwordless ssh.
--point-to-point -T
- Configures spread to use direct point-to-point communication between all Vertica nodes. Use this option if nodes are not located on the same subnet. Also use this option for all virtual environment installations, whether or not virtual servers are on the same subnet.
Up to 80 spread daemons are supported by point-to-point communication. You can exceed the 80-node limit by using large cluster mode, which does not install a spread daemon on each node.
Do not combine this option with --broadcast.
This option is automatically enabled by the --ipv6 option.
Important
When changing the configuration from --broadcast to --point-to-point, you must also specify --control-network.
--record-config
filename
-B
filename
- File name used with command line options to create a properties file that can be used with --config-file. This option creates the properties file and exits; it does not affect installation.
--remove-hosts
hostlist
-R
hostlist
- Comma-separated list of hosts to remove from an existing Vertica cluster. After removing the specified hosts, spread is reconfigured on the cluster.
This option is useful for removing an obsolete or over-provisioned system.
If you use --point-to-point (-T) to configure spread to use direct point-to-point communication within the existing cluster, you must also use it when you remove a host; otherwise, the hosts automatically use UDP broadcast traffic, resulting in cluster communication problems that prevent Vertica from running properly.
The update_vertica script (see Removing hosts from a cluster) calls install_vertica to update the installation. You can use either script with this option.
--rpm-path
rpm-filepath
- Used only in conjunction with --no-rpm-copy. Identifies the path to the local copy of the RPM on all nodes specified by --hosts.
Default: /tmp/dbRPM.rpm
--spread-logging -w
- Configures spread to output logging to /opt/vertica/log/spread_hostname.log. This option does not apply to upgrades.
Note
Enable spread logging only if requested by Vertica technical support.
--ssh-identity
file
-i
file
- The root private-key file to use if passwordless ssh was already configured between the hosts. Before using this option, verify that normal SSH works without a password. The file can be a private key file (for example, id_rsa) or a PEM file. Do not use with the --ssh-password (-P) option.
Vertica accepts either of the following:
-
An SSH private key that is not password protected. You cannot run the install_vertica script with the sudo command when using this method.
-
A password-protected private key used with an SSH agent. Note that sudo typically resets environment variables when it is invoked. Specifically, the SSH_AUTH_SOCK variable required by the SSH agent may be reset. Therefore, configure your system to maintain SSH_AUTH_SOCK or invoke install_vertica using a method similar to the following:
sudo SSH_AUTH_SOCK=$SSH_AUTH_SOCK /opt/vertica/sbin/install_vertica ...
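Before invoking the installer through sudo with an agent-held key, it can help to confirm that the agent socket is available in the calling shell. OpenSSH names the agent socket variable SSH_AUTH_SOCK. The helpers below are a hypothetical sketch; the second one only prints the command rather than running it.

```shell
# Returns 0 when SSH_AUTH_SOCK is set and refers to an existing socket.
agent_socket_available() {
    [ -n "${SSH_AUTH_SOCK:-}" ] && [ -S "$SSH_AUTH_SOCK" ]
}

# Prints a sudo invocation that passes the agent socket through.
# Any arguments are appended unchanged as install_vertica options.
print_sudo_install() {
    printf 'sudo SSH_AUTH_SOCK=%s /opt/vertica/sbin/install_vertica' "${SSH_AUTH_SOCK:-}"
    for arg in "$@"; do
        printf ' %s' "$arg"
    done
    printf '\n'
}
```

If `agent_socket_available` fails, start an agent and add the key to it before you run the printed command.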
--ssh-password
password
-P
password
- The password to use by default for each cluster host. If you omit this option and also omit --ssh-identity (-i), then the script prompts for the password as necessary and does not echo input.
Do not use this option together with --ssh-identity (-i).
Important
If you run the install_vertica script as root, specify the root password:
# /opt/vertica/sbin/install_vertica -P root-passwd
If you run the install_vertica script with the sudo command, specify the password of the user who runs install_vertica, not the root password. For example, if the dbadmin user runs install_vertica with sudo and has the password dbapasswd, then specify the password as dbapasswd:
$ sudo /opt/vertica/sbin/install_vertica -P dbapasswd
--temp-dir
directory
- Temporary directory used for administrative purposes. If it is a directory within /opt/vertica, then it is created by the installer. Otherwise, the directory should already exist on all nodes in the cluster. The location should allow dbadmin write privileges.
Note
This is not a temporary data location for the database.
Default: /tmp
5 - Install Vertica silently
This section describes how to create a properties file that lets you install and deploy Vertica-based applications quickly and without much manual intervention.
Install the properties file:
-
Download and install the Vertica install package, as described in Download and install the Vertica server package.
-
Create the properties file that enables non-interactive setup by supplying the parameters you want Vertica to use. For example:
The following command assumes a multi-node setup:
# /opt/vertica/sbin/install_vertica --record-config file_name --license /tmp/license.txt --accept-eula \
  --dba-user-password password --ssh-password password --hosts host_list --rpm package_name
The following command assumes a single-node setup:
# /opt/vertica/sbin/install_vertica --record-config file_name --license /tmp/license.txt --accept-eula \
  --dba-user-password password
| Option | Description |
| --- | --- |
| --record-config file_name | [Required] Accepts a file name which, when used in conjunction with command line options, creates a properties file that can be used with the --config-file option during setup. This flag creates the properties file and exits; it has no impact on installation. |
| --license { license_file \| CE } | Silently and automatically deploys the license key to /opt/vertica/config/share. On multi-node installations, the --license option also applies the license to all nodes declared in the --hosts host_list. If specified with CE, automatically deploys the Community Edition license key, which is included in your download. You do not need to specify a license file. |
| --accept-eula | Silently accepts the EULA agreement during setup. |
| --dba-user-password password | The password for the Database Superuser account; if not supplied, the script prompts for the password and does not echo the input. |
| --ssh-password password | The root password to use by default for each cluster host; if not supplied, the script prompts for the password if and when necessary and does not echo the input. |
| --hosts host_list | A comma-separated list of hostnames or IP addresses to include in the cluster; do not include space characters in the list. Examples: --hosts host01,host02,host03 and --hosts 192.168.233.101,192.168.233.102,192.168.233.103 |
| --rpm package_name, --deb package_name | The name of the RPM or Debian package that contained this script. For example: --rpm vertica-version.RHEL8.x86_64.rpm. This parameter is required on multi-node installations if the RPM or DEB package is not already installed on the other hosts. |
See install_vertica options for the complete set of installation parameters.
Tip
Supply the parameters to the properties file only once. You can then install Vertica using just the --config-file parameter, as described below.
-
Use one of the following commands to run the installation script.
# /opt/vertica/sbin/install_vertica --config-file file_name
$ sudo /opt/vertica/sbin/install_vertica --config-file file_name
--config-file file_name accepts an existing properties file created by --record-config file_name. This properties file contains key/value parameters that map to values in the install_vertica script, many with boolean arguments that default to false.
The command for a single-node install might look like this:
# /opt/vertica/sbin/install_vertica --config-file /tmp/vertica-inst.prp
-
If you did not supply a --ssh-password password parameter to the properties file, you are prompted to provide the requested password to allow installation of the RPM/DEB and system configuration of the other cluster nodes. If you are root, this is the root password. If you are using sudo, this is the sudo user password. The password does not echo on the command line.
Note
If you are root on a single-node installation, you are not prompted for a password.
-
If you did not supply a --dba-user-password password parameter to the properties file, you are prompted to provide the database administrator account password.
The installation script creates a new Linux user account (dbadmin by default) with the password that you provide.
-
Carefully examine any warnings produced by install_vertica and correct the problems if possible. For example, insufficient RAM, insufficient network throughput, and excessively high readahead settings on the file system can cause performance problems later.
Note
You can redirect any warning output to a separate file instead of displaying it on the system. Use your platform's standard redirection mechanisms. For example: install_vertica [options] > /tmp/file 2>&1.
-
Optionally perform the following steps:
-
Disconnect from the Administration Host as instructed by the script. This is required to:
At this point, Linux root privileges are no longer needed. The database administrator can perform the remaining steps.
Note
When creating a new database, the database administrator might want to use different data or catalog locations than those created by the installation script. In that case, a Linux administrator might need to create those directories and change their ownership to the database administrator.
If you supplied the --license and --accept-eula parameters to the properties file, then proceed to Getting started and then see Configuring the database.
Otherwise:
-
Log in to the Database Superuser account on the administration host.
-
Accept the End User License Agreement and install the license key you downloaded previously as described in Install the License Key.
-
Proceed to Getting started and then see Configuring the database.
Notes
The following is an example of the contents of a properties file:
accept_eula = True
license_file = /tmp/license.txt
record_to = file_name
root_password = password
vertica_dba_group = verticadba
vertica_dba_user = dbadmin
vertica_dba_user_password = password
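To inspect a recorded properties file before using it with --config-file, a small reader like the one below can help. It is a sketch that assumes the file consists of "key = value" lines like the sample above; the exact file grammar is not specified here. The function name is hypothetical, and the key is used as a regular expression, so it should contain only letters, digits, and underscores.

```shell
# Prints the value stored for a key in a "key = value" properties file.
# Returns 1 if the key is not present.
get_property() {
    key=$1
    file=$2
    # Take the first line whose key matches exactly, allowing spaces
    # around the key and the "=" sign.
    line=$(grep -E "^[[:space:]]*$key[[:space:]]*=" "$file" | head -n 1)
    [ -n "$line" ] || return 1
    # Drop everything up to the first "=", then trim surrounding spaces.
    printf '%s\n' "${line#*=}" | sed 's/^[[:space:]]*//; s/[[:space:]]*$//'
}
```

For example, `get_property vertica_dba_user file_name` prints the administrator name that was recorded in the file.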
6 - Enable secure shell (SSH) logins
The administrative account must be able to use Secure Shell (SSH) to log in (ssh) to all hosts without specifying a password. The shell script install_vertica does this automatically. This section describes how to do it manually if necessary.
-
If you do not already have SSH installed on all hosts, log in as root on each host and install it now. You can download a free version of the SSH connectivity tools from OpenSSH.
-
Log in to the Vertica administrator account (dbadmin in this example).
-
Make your home directory (~) writable only by yourself. Choose one of:
$ chmod 700 ~
or
$ chmod 755 ~
where:
| 700 includes | 755 includes |
| --- | --- |
| 400 read by owner | 400 read by owner |
| 200 write by owner | 200 write by owner |
| 100 execute by owner | 100 execute by owner |
| | 040 read by group |
| | 010 execute by group |
| | 004 read by anybody (other) |
| | 001 execute by anybody |
-
Change to your home directory:
$ cd ~
-
Generate a private key/public key pair:
$ ssh-keygen -t rsa
Generating public/private rsa key pair.
Enter file in which to save the key (/home/dbadmin/.ssh/id_rsa):
Created directory '/home/dbadmin/.ssh'.
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /home/dbadmin/.ssh/id_rsa.
Your public key has been saved in /home/dbadmin/.ssh/id_rsa.pub.
-
Make your .ssh directory readable and writable only by yourself:
$ chmod 700 ~/.ssh
-
Change to the .ssh directory:
$ cd ~/.ssh
-
Copy the file id_rsa.pub onto the file authorized_keys2.
$ cp id_rsa.pub authorized_keys2
-
Make the files in your .ssh directory readable and writable only by yourself:
$ chmod 600 ~/.ssh/*
-
For each cluster host:
-
Connect to each cluster host. The first time you ssh to a new remote machine, you could get a message similar to the following:
$ ssh dev0
Warning: Permanently added 'dev0,192.168.1.92' (RSA) to the list of known hosts.
This message appears only the first time you ssh to a particular remote host.
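The manual steps above can be gathered into one function, as in the following sketch. The function name is hypothetical, the home directory is passed as an argument, and a key pair is generated only if none exists. The ssh-keygen call uses an empty passphrase, which is an assumption made here so that logins do not prompt.

```shell
# Prepares key-based login for the account that owns home_dir,
# following the manual steps above.
setup_ssh_keys() {
    home_dir=$1
    ssh_dir=$home_dir/.ssh
    # Home directory writable only by its owner (covers both 700 and 755).
    chmod go-w "$home_dir" || return 1
    mkdir -p "$ssh_dir" || return 1
    chmod 700 "$ssh_dir" || return 1
    if [ ! -f "$ssh_dir/id_rsa" ]; then
        # -N '' sets an empty passphrase (assumption); -f names the key file.
        ssh-keygen -t rsa -N '' -f "$ssh_dir/id_rsa" || return 1
    fi
    cp "$ssh_dir/id_rsa.pub" "$ssh_dir/authorized_keys2" || return 1
    # Files in .ssh readable and writable only by the owner.
    chmod 600 "$ssh_dir"/*
}
```

Run it as the administrator account, for example `setup_ssh_keys /home/dbadmin`, and then connect to each cluster host once to confirm that no password is requested.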
See also