SQL reference

Vertica offers a robust set of SQL elements that allow you to manage and analyze massive volumes of data quickly and reliably, including:

  • Language elements such as keywords, operators, expressions, predicates, and hints

  • Data types including complex types

  • Functions including Vertica-specific functions that take advantage of Vertica's unique column-store architecture

  • SQL statements that let you write robust queries to quickly return large volumes of data

1 - System limits

This section describes system limits on the size and number of objects in a Vertica database. In most cases, computer memory and disk drive are the limiting factors.

Item Maximum
Nodes 128 (without Vertica assistance)
Database size

Dependent on maximum disk configuration, approximately:

numFiles*platformFileSize

Table size

Smaller of:

  • 2^64 rows per node

  • 2^63 bytes per column

Row size

(2^31) - 1

Row size is approximately the sum of its maximum column sizes. For example, a VARCHAR(80) has a maximum size of 80 bytes.

Key size Dependent on row size
Tables/projections per database Dependent on physical RAM, as the catalog must fit in memory.
Concurrent connections per node

Dependent on physical RAM (or threads per process), typically 1024

Default: 50

Concurrent connections per cluster Dependent on physical RAM of a single node (or threads per process), typically 1024
Columns per table/view 9800
Arguments per function call 9800
Rows per load 2^63
ROS containers per projection

1024

See Minimizing partitions.

Length of fixed-length column 65000 bytes
Length of variable-length column 32 MB
Length of basic names 128 bytes. Basic names include table names, column names, etc.
Query length None
Depth of nesting subqueries None in FROM, WHERE, and HAVING clauses

2 - Language elements

The following topics provide detailed descriptions of the language elements and conventions of Vertica SQL.

2.1 - Keywords

Keywords are words that have a specific meaning in the SQL language. Every SQL statement contains one or more keywords. Although SQL is not case-sensitive with respect to keywords, they are generally shown in uppercase letters throughout this documentation for readability purposes.

Reserved words and keywords

Many keywords are also reserved words.

Vertica recommends that you not use reserved words as names for objects, or as identifiers. Including reserved words can make your SQL statements confusing. Reserved words that are used as names for objects or identifiers must be enclosed in double-quotes.
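
For instance, a minimal sketch of quoting reserved words so they can be used as a table name and a column name (the object names here are purely illustrative):

=> CREATE TABLE "all" ("union" INT);
CREATE TABLE
=> SELECT "union" FROM "all";
 union
-------
(0 rows)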

Non-reserved keywords

Non-reserved keywords have a special meaning in some contexts, but can be used as identifiers in others. You can use non-reserved keywords as aliases—for example, SOURCE:

=> SELECT my_node AS SOURCE FROM nodes;

Viewing the list of reserved and non-reserved keywords

To view the current list of Vertica reserved and non-reserved words, query system table KEYWORDS. Vertica lists keywords alphabetically and identifies them as reserved (R) or non-reserved (N).

For example, the following query gets all reserved keywords that begin with B:

=> SELECT * FROM keywords WHERE reserved = 'R' AND keyword ilike 'B%';
 keyword | reserved
---------+----------
 BETWEEN | R
 BIGINT  | R
 BINARY  | R
 BIT     | R
 BOOLEAN | R
 BOTH    | R
(6 rows)

2.2 - Identifiers

Identifiers (names) of objects such as schema, table, projection, column names, and so on, can be up to 128 bytes in length.

Unquoted identifiers

Unquoted SQL identifiers must begin with one of the following:

  • Non-Unicode letters: A–Z or a-z

  • Underscore (_)

Subsequent characters in an identifier can be any combination of the following:

  • Non-Unicode letters: A–Z or a-z

  • Underscore (_)

  • Digits (0–9)

  • Unicode letters (letters with diacriticals or not in the Latin alphabet), unsupported for model names

  • Dollar sign ($), unsupported for model names

Quoted identifiers

Identifiers enclosed in double quote (") characters can contain any character. If you want to include a double quote, you need a pair of them; for example """". You can use names that would otherwise be invalid—for example, names that include only numeric characters ("123") or contain space characters, punctuation marks, and SQL or Vertica-reserved keywords. For example:

CREATE SEQUENCE "my sequence!";

Double quotes are required for non-alphanumerics and SQL keywords such as "1time", "Next week" and "Select".
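
For instance, a minimal sketch of an identifier that embeds a double quote by doubling it (the table name is illustrative only):

=> CREATE TABLE "my ""favorite"" table" (c INT);
CREATE TABLE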

Case sensitivity

Identifiers are not case-sensitive. Thus, identifiers "ABC", "ABc", and "aBc" are synonymous, as are ABC, ABc, and aBc.

Non-ASCII characters

Vertica accepts non-ASCII UTF-8 Unicode characters for table names, column names, and other identifiers, extending the cases where upper/lower case distinctions are ignored (case-folded) to all alphabets, including Latin, Cyrillic, and Greek.

For example, the following CREATE TABLE statement uses the ß (German eszett) in the table name:

=> CREATE TABLE straße(x int, y int);
   CREATE TABLE

Identifiers are stored as created

SQL identifiers, such as table and column names, are not converted to lowercase. They are stored as created, and references to them are resolved using case-insensitive compares. For example, the following statement creates table ALLCAPS.

=> CREATE TABLE ALLCAPS(c1 varchar(30));
=> INSERT INTO ALLCAPS values('upper case');

The following statements are variations of the same query:

=> SELECT * FROM ALLCAPS;
=> SELECT * FROM allcaps;
=> SELECT * FROM "allcaps";

The three queries all return the same result:

     c1
------------
 upper case
(1 row)

Note that Vertica returns an error if you try to create table AllCaps:

=> CREATE TABLE AllCaps(c1 varchar(30));
   ROLLBACK:  table "AllCaps" already exists

See QUOTE_IDENT for additional information.

2.3 - Literals

Literals are numbers or strings used in SQL as constants. Literals are included in the select-list, along with expressions and built-in functions and can also be constants.

Vertica provides support for number-type literals (integers and numerics), string literals, VARBINARY string literals, and date/time literals. The various string literal formats are discussed in this section.

2.3.1 - Number-type literals

Vertica supports three types of numbers: integers, numerics, and floats.

  • Integers are whole numbers less than 2^63 and must be digits.

  • Numerics are whole numbers larger than 2^63 or that include a decimal point with a precision and a scale. Numerics can contain exponents. Numbers that begin with 0x are hexadecimal numerics.

Numeric-type values can also be generated using casts from character strings. This is a more general syntax. See the Examples section below, as well as Data type coercion operators (CAST).

Syntax

digits
digits.[digits] | [digits].digits
digits e[+-]digits | [digits].digits e[+-]digits | digits.[digits] e[+-]digits

Parameters

digits
One or more numeric characters, 0 through 9
e
Exponent marker

Notes

  • At least one digit must follow the exponent marker (e), if e is present.
  • There cannot be any spaces or other characters embedded in the constant.
  • Leading plus (+) or minus (–) signs are not considered part of the constant; they are unary operators applied to the constant.

  • In most cases a numeric-type constant is automatically coerced to the most appropriate type depending on context. When necessary, you can force a numeric value to be interpreted as a specific data type by casting it as described in Data type coercion operators (CAST).

  • Floating point literals are not supported. If you specifically need to specify a float, you can cast as described in Data type coercion operators (CAST).

  • Vertica follows the IEEE specification for floating point, including NaN (not a number) and Infinity (Inf).

  • A NaN is not greater than and at the same time not less than anything, even itself. In other words, comparisons always return false whenever a NaN is involved.

  • Dividing INTEGERS (x / y) yields a NUMERIC result. You can use the // operator to truncate the result to a whole number.

Examples

The following are examples of number-type literals:

42
3.5
4.
.001
5e2
1.925e-3

Scientific notation:

=> SELECT NUMERIC '1e10';
 ?column?
-------------
 10000000000
(1 row)

BINARY scaling:

=> SELECT NUMERIC '1p10';
 ?column?
----------
     1024
(1 row)
=> SELECT FLOAT 'Infinity';
  ?column?
----------
 Infinity
(1 row)

The following examples illustrate using the / and // operators to divide integers:

=> SELECT 40/25;
 ?column?
----------------------
 1.600000000000000000
(1 row)
=> SELECT 40//25;
 ?column?
----------
        1
(1 row)

See also

Data type coercion

2.3.2 - String literals

String literals are string values surrounded by single or double quotes. Double-quoted strings are subject to backslash escapes, but single-quoted strings do not require a backslash, except for \' and \\.

You can embed single quotes and backslashes into single-quoted strings.

To include other backslash (escape) sequences, such as \t (tab), you must use the double-quoted form.

Precede single-quoted strings with a space between the string and its preceding word, since single quotes are allowed in identifiers.

2.3.2.1 - Character string literals

Character string literals are a sequence of characters from a predefined character set, enclosed by single quotes.

Syntax

'character-seq'

Parameters

character-seq
Arbitrary sequence of characters

Embedded single quotes

If a character string literal includes a single quote, it must be doubled. For example:

=> SELECT 'Chester''s gorilla';
  ?column?
-------------------
Chester's gorilla
(1 row)

Standard-conforming strings and escape characters

Vertica uses standard-conforming strings as specified in the SQL standard, so backslashes are treated as string literals and not escape characters.

Examples

=> SELECT 'This is a string';
  ?column?
------------------
 This is a string
(1 row)
=> SELECT 'This \is a string';
   WARNING:  nonstandard use of escape in a string literal at character 8
   HINT:  Use the escape string syntax for escapes, e.g., E'\r\n'.
     ?column?
------------------
 This is a string
(1 row)
=> SELECT E'This \is a string';
     ?column?
------------------
 This is a string
(1 row)
=> SELECT E'This is a \n new line';
       ?column?
----------------------
 This is a
 new line
(1 row)
=> SELECT 'String''s characters';
      ?column?
--------------------
 String's characters
(1 row)

2.3.2.2 - Dollar-quoted string literals

Dollar-quoted string literals are rarely used, but are provided here for your convenience.

The standard syntax for specifying string literals can be difficult to understand. To allow more readable queries in such situations, Vertica SQL provides dollar quoting. Dollar quoting is not part of the SQL standard, but it is often a more convenient way to write complicated string literals than the standard-compliant single quote syntax.

Syntax

$$characters$$

Parameters

characters
Arbitrary sequence of characters bounded by paired dollar signs ($$)

Dollar-quoted string content is treated as a literal. Single quote, backslash, and dollar sign characters have no special meaning within a dollar-quoted string.

Notes

A dollar-quoted string that follows a keyword or identifier must be separated from the preceding word by whitespace; otherwise, the dollar-quoting delimiter is taken as part of the preceding identifier.

Examples

=> SELECT $$Fred's\n car$$;
     ?column?
-------------------
 Fred's\n car
(1 row)

=> SELECT 'SELECT 'fact';';
ERROR:  syntax error at or near "';'" at character 21
LINE 1: SELECT 'SELECT 'fact';';

=> SELECT 'SELECT $$fact';$$;
   ?column?
---------------
 SELECT $$fact
(1 row)

=> SELECT 'SELECT ''fact'';';
    ?column?
----------------
 SELECT 'fact';
(1 row)

2.3.2.3 - Unicode string literals

A hexit is a hexadecimal digit (0-9, a-f).

Syntax

U&'characters' [ UESCAPE '<Unicode escape character>' ]

Parameters

characters
Arbitrary sequence of UTF-8 characters bounded by single quotes (')
Unicode escape character
A single character from the source language character set other than a hexit, plus sign (+), quote ('), double quote ("), or white space

Using standard conforming strings

With StandardConformingStrings enabled, Vertica supports SQL standard Unicode character string literals (the character set is UTF-8 only).

Before you enter a Unicode character string literal, enable standard conforming strings, either for the current session with SET STANDARD_CONFORMING_STRINGS TO ON or for all sessions by setting the StandardConformingStrings configuration parameter to 1.

See also Extended String Literals.

Examples

To enter a Unicode character in hexadecimal, such as the Russian phrase for "thank you," use the following syntax:

=> SET STANDARD_CONFORMING_STRINGS TO ON;
=> SELECT U&'\0441\043F\0430\0441\0438\0431\043E' as 'thank you';
 thank you
-----------
 спасибо
(1 row)

To enter the German word mude (where u is really u-umlaut) in hexadecimal:

=> SELECT U&'m\00fcde';
?column?
----------
müde
(1 row)
=> SELECT 'ü';
?column?
----------
ü
(1 row)

To enter the LINEAR B IDEOGRAM B240 WHEELED CHARIOT in hexadecimal:

=> SELECT E'\xF0\x90\x83\x8C';
?column?
----------
(wheeled chariot character)
(1 row)

2.3.2.4 - VARBINARY string literals

VARBINARY string literals allow you to specify hexadecimal or binary digits in a string literal.

Syntax

X'<hexadecimal digits>'
B'<binary digits>'

Parameters

X or x
Specifies hexadecimal digits. The <hexadecimal digits> string must be enclosed in single quotes (').
B or b
Specifies binary digits. The <binary digits> string must be enclosed in single quotes (').

Examples

=> SELECT X'abcd';
 ?column?
----------
 \253\315
(1 row)

=> SELECT B'101100';
 ?column?
----------
 ,
(1 row)

2.3.2.5 - Extended string literals

Syntax

E'characters'

Parameters

characters
Arbitrary sequence of characters bounded by single quotes (')

You can use C-style backslash sequences in extended string literals, which are an extension to the SQL standard. You specify an extended string literal by writing the letter E as a prefix (before the opening single quote); for example:

E'extended character string\n'

Within an extended string, the backslash character (\) starts a C-style backslash sequence, in which the combination of backslash and following character or numbers represent a special byte value, as shown in the following list. Any other character following a backslash is taken literally; for example, to include a backslash character, write two backslashes (\\).

  • \\ is a backslash

  • \b is a backspace

  • \f is a form feed

  • \n is a newline

  • \r is a carriage return

  • \t is a tab

  • \x##, where ## is a 1- or 2-digit hexadecimal number; for example, \x09 is a tab

  • \###, where ### is a 1-, 2-, or 3-digit octal number representing a byte with the corresponding code (see the example after this list).
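
For example, the following sketch combines a hexadecimal escape and an octal escape; the byte values are chosen for illustration and, assuming default output formatting, both print as ordinary characters:

=> SELECT E'\x41\102';
 ?column?
----------
 AB
(1 row)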

When an extended string literal is concatenated across lines, write only E before the first opening quote:

=> SELECT E'first part o'
    'f a long line';
         ?column?
---------------------------
 first part of a long line
(1 row)

Two adjacent single quotes are used as one single quote:

=> SELECT 'Aren''t string literals fun?';
          ?column?
-----------------------------
 Aren't string literals fun?
(1 row)

Standard conforming strings and escape characters

When interpreting commands, such as those entered in vsql or in queries passed via JDBC or ODBC, Vertica uses standard conforming strings as specified in the SQL standard. In standard conforming strings, backslashes are treated as string literals (ordinary characters), not escape characters.

The following options are available, but Vertica recommends that you migrate your application to use standard conforming strings at your earliest convenience, after warnings have been addressed.

  • To treat backslashes as escape characters, set configuration parameter StandardConformingStrings to 0.

  • To enable standard conforming strings permanently, set the StandardConformingStrings parameter to '1', as described below.

  • To enable standard conforming strings per session, use SET STANDARD_CONFORMING_STRINGS TO ON. This treats backslashes as literal characters for the current session only.

Identifying strings that are not standard conforming

The following procedure can be used to identify nonstandard conforming strings in your application so that you can convert them into standard conforming strings:

  1. Be sure the StandardConformingStrings parameter is off, as described in Internationalization parameters.

    => ALTER DATABASE DEFAULT SET StandardConformingStrings = 0;
    
  2. If necessary, turn on the EscapeStringWarning parameter.

    => ALTER DATABASE DEFAULT SET EscapeStringWarning = 1;
    

    Vertica now returns a warning each time it encounters an escape string within a string literal. For example, Vertica interprets the \n in the following example as a new line:

    => SELECT 'a\nb';
       WARNING:  nonstandard use of escape in a string literal at character 8
       HINT:  Use the escape string syntax for escapes, e.g., E'\r\n'.
     ?column?
    ----------
     a
    b
    (1 row)
    

    When StandardConformingStrings is ON, the string is interpreted as four characters: a \ n b.

    Modify each string that Vertica flags by extending it as in the following example:

    E'a\nb'
    

    Or if the string has quoted single quotes, double them; for example, 'one'' double'.

  3. Turn on the StandardConformingStrings parameter for all sessions:

    => ALTER DATABASE DEFAULT SET StandardConformingStrings = 1;
    

Doubled single quotes

This section discusses vsql inputs that are not passed on to the server. Vertica recognizes two consecutive single quotes within a string literal as one single quote character. For example, given the input 'You''re here!', Vertica treats the consecutive quotes as one quote and returns the following:

=> SELECT 'You''re here!';
   ?column?
--------------
 You're here!
(1 row)

This is the SQL standard representation and is preferred over the form 'You\'re here!', because backslashes are no longer parsed as escape characters by default. To escape the quote with a backslash, you must use the extended string syntax:

=> SELECT (E'You\'re here!');
   ?column?
--------------
 You're here!
(1 row)

This behavior change introduces a potential incompatibility in the use of the vsql meta-command \set, which automatically concatenates its arguments. For example:

\set file  '\''  `pwd`  '/file.txt'  '\''
\echo :file

vsql takes the four arguments and outputs the following:

'/home/vertica/file.txt'

Vertica parses the adjacent single quotes as follows:

\set file  '\'''pwd''/file.txt''\''
\echo :file
'/home/vertica/file.txt''

Note the extra single quote at the end. This is due to the pair of adjacent single quotes together with the backslash-quoted single quote.

The extra quote can be resolved either as in the first example above, or by combining the literals as follows:

\set file '\''`pwd`'/file.txt'''
\echo :file
'/home/vertica/file.txt'

In either case the backslash-quoted single quotes should be changed to doubled single quotes as follows:

\set file '''' `pwd` '/file.txt'''

Additional examples

=> SELECT 'This \is a string';
     ?column?
------------------
 This \is a string
(1 row)

=> SELECT E'This \is a string';
     ?column?
------------------
 This is a string
(1 row)

=> SELECT E'This is a \n new line';
       ?column?
----------------------
 This is a
 new line
(1 row)

=> SELECT 'String''s characters';
      ?column?
--------------------
 String's characters
(1 row)

2.3.3 - Date/time literals

Date or time literal input must be enclosed in single quotes. Input is accepted in almost any reasonable format, including ISO 8601, SQL-compatible, traditional POSTGRES, and others.

Vertica handles date/time input more flexibly than the SQL standard requires. The exact parsing rules of date/time input and for the recognized text fields including months, days of the week, and time zones are described in Date/time expressions.

2.3.3.1 - Time zone values

Vertica attempts to be compatible with the SQL standard definitions for time zones. However, the SQL standard has an odd mix of date and time types and capabilities. Obvious problems are:

  • Although the DATE type does not have an associated time zone, the TIME type can. Time zones in the real world have little meaning unless associated with a date as well as a time, since the offset can vary through the year with daylight-saving time boundaries.

  • Vertica assumes your local time zone for any data type containing only date or time.

  • The default time zone is specified as a constant numeric offset from UTC. It is therefore not possible to adapt to daylight-saving time when doing date/time arithmetic across DST boundaries.

To address these difficulties, OpenText recommends using Date/Time types that contain both date and time when you use time zones. OpenText recommends that you do not use the type TIME WITH TIME ZONE, even though it is supported for legacy applications and for compliance with the SQL standard.

Time zones and time-zone conventions are influenced by political decisions, not just earth geometry. Time zones around the world became somewhat standardized during the 1900's, but continue to be prone to arbitrary changes, particularly with respect to daylight-savings rules.

Vertica currently supports daylight-savings rules over the time period 1902 through 2038, corresponding to the full range of conventional UNIX system time. Times outside that range are taken to be in "standard time" for the selected time zone, no matter what part of the year in which they occur.

Example Description
PST Pacific Standard Time
-8:00 ISO-8601 offset for PST
-800 ISO-8601 offset for PST
-8 ISO-8601 offset for PST
zulu Military abbreviation for UTC
z Short form of zulu

2.3.3.2 - Day of the week names

The following tokens are recognized as names of days of the week:

Day Abbreviations
SUNDAY SUN
MONDAY MON
TUESDAY TUE, TUES
WEDNESDAY WED, WEDS
THURSDAY THU, THUR, THURS
FRIDAY FRI
SATURDAY SAT

2.3.3.3 - Month names

The following tokens are recognized as names of months:

Month Abbreviations
JANUARY JAN
FEBRUARY FEB
MARCH MAR
APRIL APR
MAY MAY
JUNE JUN
JULY JUL
AUGUST AUG
SEPTEMBER SEP, SEPT
OCTOBER OCT
NOVEMBER NOV
DECEMBER DEC

2.3.3.4 - Interval literal

A literal that represents a time span.

Syntax

[ @ ] [-] { quantity subtype-unit }[...] [ AGO ]

Parameters

@
Ignored
- (minus)
Specifies a negative interval value.
quantity
Integer numeric constant
subtype-unit
See Interval subtype units for valid values. Subtype units must be specified for year-month intervals; they are optional for day-time intervals.
AGO
Specifies a negative interval value. AGO and - (minus) are synonymous.

Notes

  • The amounts of different units are implicitly added up with appropriate sign accounting.

  • The boundaries of an interval constant are:

    • 9223372036854775807 usec to -9223372036854775807 usec

    • 296533 years 3 mons 21 days 04:00:54.775807 to -296533 years -3 mons -21 days -04:00:54.775807

  • The range of an interval constant is +/- 2^63 - 1 microseconds.

  • In Vertica, interval fields are additive and accept large floating-point numbers.

Examples

See Specifying interval input.

2.3.3.4.1 - Interval subtype units

The following tables list subtype units that you can specify in an interval literal, divided into major categories:

Year-month subtype units

Subtypes Units Notes
Millennium mil, millennium, millennia, mils, millenniums
Century c, cent, century, centuries
Decade dec, decs, decade, decades
Year a Julian year: 365.25 days
ka Julian kilo-year: 365250 days
y, yr, yrs, year, years Calendar year: 365 days
Quarter q, qtr, qtrs, quarter, quarters
Month m, mon, mons, months, month Vertica can interpret m as minute or month, depending on context. See Processing m Input below.
Week w, wk, week, wks, weeks

Day-time subtype units

Subtypes Units Notes
Day d, day, days
Hour h, hr, hrs, hour, hours
Minute m, min, mins, minute, minutes Vertica can interpret input unit m as minute or month, depending on context. See Processing m Input below.
Second s, sec, secs, second, seconds
Millisecond ms, msec, msecs, msecond, mseconds, millisecond, milliseconds
Microsecond us, usec, usecs, usecond, useconds, microseconds, microsecond

Processing m input

Vertica uses context to interpret the input unit m as months or minutes. For example, the following command creates a one-column table with an interval value:

=> CREATE TABLE int_test(i INTERVAL YEAR TO MONTH);

Given the following INSERT statement, Vertica interprets the interval literal 1y 6m as 1 year 6 months:

=> INSERT INTO int_test VALUES('1y 6m');
 OUTPUT
--------
      1
(1 row)
=> COMMIT;
COMMIT
=> SET INTERVALSTYLE TO UNITS;
SET
=> SELECT * FROM int_test;
        i
-----------------
 1 year 6 months
(1 row)

The following ALTER TABLE statement adds a DAY TO MINUTE interval column to table int_test:

=>  ALTER TABLE int_test ADD COLUMN x INTERVAL DAY TO MINUTE;
ALTER TABLE

The next INSERT statement sets the first and second columns to 3y 20m and 1y 6m, respectively. In this case, Vertica interprets the m input literals in two ways:

  • For column i, Vertica interprets the m input as months, and displays 4 years 8 months.

  • For column x, Vertica interprets the m input as minutes. Because the interval is defined as DAY TO MINUTE, it converts the inserted input value 1y 6m to 365 days 6 minutes:

=> INSERT INTO int_test VALUES ('3y 20m', '1y 6m');
 OUTPUT
--------
      1
(1 row)

=> SELECT * FROM int_test;
        i         |        x
------------------+-----------------
 1 year 6 months  |
 4 years 8 months | 365 days 6 mins
(2 rows)

2.3.3.4.2 - Interval qualifier

Specifies how to interpret and format an interval literal for output, and optionally sets precision. Interval qualifiers are composed of one or two units:

unit [ TO unit ] [ (p) ]

where:

  • unit specifies a day-time or year-month subtype.

  • p specifies precision, an integer between 0 and 6.

If an interval omits an interval qualifier, the default is DAY TO SECOND(6).
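
For example, an interval literal entered without a qualifier is interpreted as DAY TO SECOND(6); the output shown here assumes the default interval style:

=> SELECT INTERVAL '1 12:59:10';
  ?column?
------------
 1 12:59:10
(1 row)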

Interval qualifiers are divided into two categories:

Day-time interval qualifiers

Qualifier Description
DAY Unconstrained
DAY TO HOUR Span of days and hours
DAY TO MINUTE Span of days and minutes
DAY TO SECOND [(p)] Span of days, hours, minutes, seconds, and fractions of a second.
HOUR Hours within days
HOUR TO MINUTE Span of hours and minutes
HOUR TO SECOND [(p)] Span of hours and seconds
MINUTE Minutes within hours
MINUTE TO SECOND [(p)] Span of minutes and seconds
SECOND [(p)] Seconds within minutes

Year-month interval qualifiers

Qualifier Description
YEAR Unconstrained
MONTH Months within year
YEAR TO MONTH Span of years and months

Examples

See Controlling interval format.

2.4 - Operators

Operators are logical, mathematical, and equality symbols used in SQL to evaluate, compare, or calculate values.

2.4.1 - Bitwise operators

Bitwise operators perform bit manipulations on INTEGER and BINARY/VARBINARY data types:

Operator Description Example Result
& AND 12 & 4 4
| OR 32 | 3 35
# XOR 17 # 5 20
~ NOT ~1 -2
<< Bitwise shift left 1 << 4 16
>> Bitwise shift right 8 >> 2 2

The bitwise shift operators (<< and >>) are invalid for BINARY/VARBINARY data types.

String argument handling

String arguments must be explicitly cast as BINARY or VARBINARY data types for all bitwise operators. For example:

=> SELECT 'xyz'::VARBINARY & 'zyx'::VARBINARY AS AND;
 AND
-----
 xyx
(1 row)

=> SELECT 'xyz'::VARBINARY | 'zyx'::VARBINARY AS OR;
 OR
-----
 zyz
(1 row)

Bitwise operators treat all string arguments as equal in length. If the arguments have different lengths, the operator function right-pads the smaller string with one or more zero bytes to equal the length of the larger string.

For example, the following statement ANDs unequal strings xyz and zy. Vertica right-pads string zy with one zero byte. The last character in the result is represented accordingly, as \000:

=> SELECT 'xyz'::VARBINARY & 'zy'::VARBINARY AS AND;
  AND
--------
 xy\000
(1 row)

2.4.2 - Logical operators

Vertica supports the logical operators AND, OR, and NOT:

  • AND evaluates to true when both of the conditions joined by AND are true.

  • OR evaluates to true when either condition is true.

  • NOT negates the result of any Boolean expression.

AND and OR are commutative—that is, you can switch left and right operands without affecting the result. However, the order of evaluation of sub-expressions is not defined. To force evaluation order, use a CASE construct.

Logic

SQL uses a three-valued Boolean logic where NULL represents "unknown":

  • true AND NULL = NULL

  • true OR NULL = true

  • false AND NULL = false

  • false OR NULL = NULL

  • NULL AND NULL = NULL

  • NULL OR NULL = NULL

  • NOT NULL = NULL

2.4.3 - Comparison operators

Comparison operators are available for all data types where comparison makes sense. All comparison operators are binary operators that return values of true, false, or NULL (unknown).

Operator Description Binary function
< less than binary_lt
> greater than binary_gt
<= less than or equal to binary_le
>= greater than or equal to binary_ge
=, <=> equal binary_eq
!=, <> not equal (unsupported for correlated subqueries) binary_ne

NULL handling

Comparison operators return NULL (unknown) if either or both operands are null. One exception applies: <=> returns true if both operands are NULL, and false if one operand is NULL.
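
For example, the following sketch contrasts = and <=> on NULL operands; the column aliases are illustrative, and a NULL result displays as an empty field:

=> SELECT (NULL = NULL) AS eq, (NULL <=> NULL) AS null_safe_eq, (1 <=> NULL) AS mixed;
 eq | null_safe_eq | mixed
----+--------------+-------
    | t            | f
(1 row)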

Collections

When comparing collections, null collections are ordered last. Otherwise, collections are compared element by element until there is a mismatch, and then they are ordered based on the non-matching elements. If all elements are equal up to the length of the shorter one, then the shorter one is ordered first.
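
A minimal sketch of these ordering rules, assuming one-dimensional arrays of a primitive type and Boolean output rendered as t/f:

=> SELECT ARRAY[1,2] < ARRAY[1,3] AS mismatch_ordering,
          ARRAY[1,2] < ARRAY[1,2,3] AS shorter_first;
 mismatch_ordering | shorter_first
-------------------+---------------
 t                 | t
(1 row)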

2.4.4 - Data type coercion operators (CAST)

Data type coercion (casting) passes an expression value to an input conversion routine for a specified data type, resulting in a constant of the indicated type. In Vertica, data type coercion can be invoked by an explicit cast request that uses one of the following constructs:

Syntax

SELECT CAST ( expression AS data-type )
SELECT expression::data-type
SELECT data-type 'string'

Parameters

expression An expression of any type
data-type A supported SQL data type to convert expression to.

Truncation

If a binary value is cast (implicitly or explicitly) to a binary type with a smaller length, the value is silently truncated. For example:

=> SELECT 'abcd'::BINARY(2);
 ?column?
----------
 ab
(1 row)

Similarly, if a character value is cast (implicitly or explicitly) to a character value with a smaller length, the value is silently truncated. For example:

=> SELECT 'abcd'::CHAR(3);
 ?column?
----------
 abc
(1 row)

Binary casting and resizing

Vertica supports only casts and resize operations as follows:

  • BINARY to and from VARBINARY

  • VARBINARY to and from LONG VARBINARY

  • BINARY to and from LONG VARBINARY

On binary data that contains a value with fewer bytes than the target column, values are right-extended with the zero byte '\0' to the full width of the column. Trailing zeros on variable-length binary values are not right-extended:

=> SELECT 'ab'::BINARY(4), 'ab'::VARBINARY(4), 'ab'::LONG VARBINARY(4);
  ?column?  | ?column? | ?column?
------------+----------+----------
 ab\000\000 | ab       | ab
(1 row)

Automatic coercion

The explicit type cast can be omitted if there is no ambiguity as to the type the constant must be. For example, when a constant is assigned directly to a column, it is automatically coerced to the column's data type.
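
For example, in the following sketch the integer constant 2 is automatically coerced to FLOAT when it is inserted (the table name is illustrative):

=> CREATE TABLE coerce_demo (f FLOAT);
CREATE TABLE
=> INSERT INTO coerce_demo VALUES (2);
 OUTPUT
--------
      1
(1 row)
=> SELECT f FROM coerce_demo;
 f
---
 2
(1 row)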

Examples

=> SELECT CAST((2 + 2) AS VARCHAR);
 ?column?
----------
 4
(1 row)

=> SELECT (2 + 2)::VARCHAR;
 ?column?
----------
 4
(1 row)

=> SELECT INTEGER '123';
 ?column?
----------
      123
(1 row)

=> SELECT (2 + 2)::LONG VARCHAR;
 ?column?
----------
 4
(1 row)

=> SELECT '2.2' + 2;
   ERROR:  invalid input syntax for integer: "2.2"

=> SELECT FLOAT '2.2' + 2;
 ?column?
----------
      4.2
(1 row)

2.4.4.1 - Cast failures

When you invoke data type coercion (casting) by an explicit cast and the cast fails, the result returns either an error or NULL. Cast failures commonly occur when you try to cast conflicting conversions, such as coercing a VARCHAR expression that contains letters to an integer.

When a cast fails, the result returned depends on the data type.

Data type Cast failure default
date, time NULL
literals error
all other types error

Enabling strict time casts

You can enable all cast failures to result in an error, including those for date/time data types. Doing so lets you see the reason why a cast failed. To return an error instead of NULL, set the configuration parameter EnableStrictTimeCasts to 1:

ALTER SESSION SET EnableStrictTimeCasts=1;

By default, EnableStrictTimeCasts is set to 0. Thus, the following attempt to cast a VARCHAR to a TIME data type returns NULL:


=> SELECT current_value FROM configuration_parameters WHERE parameter_name ilike '%EnableStrictTimeCasts%';
 current_value
---------------
 0
(1 row)

=> CREATE TABLE mytable (a VARCHAR);
CREATE TABLE
=> INSERT INTO mytable VALUES('one');
OUTPUT
--------
1
(1 row)

=> INSERT INTO mytable VALUES('1');
OUTPUT
--------
1
(1 row)

=> COMMIT;
COMMIT
=> SELECT a::time FROM mytable;
 a
---


(2 rows)

If EnableStrictTimeCasts is enabled, the cast failure returns an error:


=> ALTER SESSION SET EnableStrictTimeCasts=1;
ALTER SESSION
=> SELECT a::time FROM mytable;
ERROR 3679:  Invalid input syntax for time: "1"

Returning all cast failures as NULL

To explicitly cast an expression to a requested data type, use the following construct:

SELECT expression::data-type

Using this command to cast any values to a conflicting data type returns the following error:

=> SELECT 'one'::time;
ERROR 3679:  Invalid input syntax for time: "one"

Vertica also supports the use of the coercion operator ::!, which is useful when you want to return:

  • NULL instead of an error for any non-date/time data types

  • NULL instead of an error after setting EnableStrictTimeCasts

Returning all cast failures as NULL allows those expressions that succeed during the cast to appear in the result. Those expressions that fail during the cast, however, have a NULL value.

The following example queries mytable using the coercion operator ::!. The query returns NULL where column a contains the string one, and returns 1 where the column contains 1. Query results are identical no matter how EnableStrictTimeCasts is set:

=> SELECT current_value FROM configuration_parameters WHERE parameter_name ilike '%EnableStrictTimeCasts%';
 current_value
---------------
 0
(1 row)

=> SELECT a::!int FROM mytable;
a
---

1
(2 rows)

ALTER SESSION SET EnableStrictTimeCasts=1;
ALTER SESSION
=> SELECT a::!int FROM mytable;
 a
---

 1
(2 rows)

You can use ::! for casts of arrays and sets. The cast resolves each element individually, producing NULL for elements that cannot be cast.
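
A minimal sketch, assuming a table array_demo whose ARRAY[VARCHAR] column tags holds the value ['1','two','3'] (the table, column, and displayed formatting are illustrative):

=> SELECT tags::!ARRAY[INT] FROM array_demo;
    tags
------------
 [1,null,3]
(1 row)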

Note that this functionality only applies to table data. It does not work on inline constant casts and in expressions automatically reduced to such. For example,

SELECT constant ::! FLOAT from (select 'some string' as constant) a;

results in ERROR 2826: Could not convert "some string" to a float8. However, the following returns cast failures as NULL as described:

SELECT string_field ::! float FROM (SELECT 'a string' as string_field UNION ALL SELECT 'another string' ) a;

2.4.5 - Date/time operators

Vertica supports usage of arithmetic operators on DATE/TIME operands:

  • + (addition)
  • - (subtraction)
  • * (multiplication)
  • / (division)

Examples

The operators described below that take TIME or TIMESTAMP input have two variants:

  • Operators that take TIME WITH TIME ZONE or TIMESTAMP WITH TIME ZONE input.

  • Operators that take TIME WITHOUT TIME ZONE or TIMESTAMP WITHOUT TIME ZONE input.

For brevity, these variants are not shown separately.

The + and * operators come in commutative pairs—for example, both DATE + INTEGER and INTEGER + DATE. Only one of each pair is shown.

Example Result
DATE '2001-09-28' + INTEGER '7' DATE '2001-10-05'
DATE '2001-09-28' + INTERVAL '1 HOUR' TIMESTAMP '2001-09-28 01:00:00'
DATE '2001-09-28' + TIME '03:00' TIMESTAMP '2001-09-28 03:00:00'
INTERVAL '1 DAY' + INTERVAL '1 HOUR' INTERVAL '1 DAY 01:00:00'
TIMESTAMP '2001-09-28 01:00' + INTERVAL '23 HOURS' TIMESTAMP '2001-09-29 00:00:00'
TIME '01:00' + INTERVAL '3 HOURS' TIME '04:00:00'
- INTERVAL '23 HOURS' INTERVAL '-23:00:00'
DATE '2001-10-01' - DATE '2001-09-28' INTEGER '3'
DATE '2001-10-01' - INTEGER '7' DATE '2001-09-24'
DATE '2001-09-28' - INTERVAL '1 HOUR' TIMESTAMP '2001-09-27 23:00:00'
TIME '05:00' - TIME '03:00' INTERVAL '02:00:00'
TIME '05:00' - INTERVAL '2 HOURS' TIME '03:00:00'
TIMESTAMP '2001-09-28 23:00' - INTERVAL '23 HOURS' TIMESTAMP '2001-09-28 00:00:00'
INTERVAL '1 DAY' - INTERVAL '1 HOUR' INTERVAL '1 DAY -01:00:00'
TIMESTAMP '2001-09-29 03:00' - TIMESTAMP '2001-09-27 12:00' INTERVAL '1 DAY 15:00:00'
900 * INTERVAL '1 SECOND' INTERVAL '00:15:00'
21 * INTERVAL '1 DAY' INTERVAL '21 DAYS'
DOUBLE PRECISION '3.5' * INTERVAL '1 HOUR' INTERVAL '03:30:00'
INTERVAL '1 HOUR' / DOUBLE PRECISION '1.5' INTERVAL '00:40:00'

2.4.6 - Mathematical operators

Mathematical operators are provided for many data types.

Operator Description Example Result
! Factorial 5 ! 120
+ Addition 2 + 3 5
- Subtraction 2 - 3 -1
* Multiplication 2 * 3 6
/ Division (integer division produces NUMERIC results). 4 / 2 2.00...
// With integer division, returns an INTEGER rather than a NUMERIC. 117.32 // 2.5 46
% Modulo (remainder). For details, see MOD. 5 % 4 1
^ Exponentiation 2.0 ^ 3.0 8
|/ Square root |/ 25.0 5
||/ Cube root ||/ 27.0 3
!! Factorial (prefix operator) !! 5 120
@ Absolute value @ -5.0 5

Factorial operator support

Vertica supports use of factorial operators on positive and negative floating point (DOUBLE PRECISION) numbers and integers. For example:

=> SELECT 4.98!;
   ?column?
------------------
 115.978600750905
(1 row)

Factorial is defined in terms of the gamma function, where (-1)! = Infinity and the other negative integers are undefined. For example:

(–4)! = NaN
–(4!) = –24

Factorial is defined as follows for all complex numbers z:

z! = gamma(z+1)

For details, see Abramowitz and Stegun: Handbook of Mathematical Functions.

2.4.7 - NULL operators

To check whether a value is or is not NULL, use the following equivalent constructs:

Standard:

[expression IS NULL | expression IS NOT NULL]

Non-standard:

[expression ISNULL | expression NOTNULL]

Do not write expression = NULL: NULL represents an unknown value, and two unknown values are not necessarily equal. This behavior conforms to the SQL standard.
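
For example, assuming a table t with a nullable column x (both names are hypothetical), the first query never returns rows because the comparison evaluates to NULL, while the second returns the rows whose value is unknown:

=> SELECT x FROM t WHERE x = NULL;   -- comparison is never true
=> SELECT x FROM t WHERE x IS NULL;  -- matches rows where x is NULL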

2.4.8 - String concatenation operators

To concatenate two strings on a single line, use the concatenation operator (two consecutive vertical bars).

Syntax

string || string

Parameters

string Expression of type CHAR or VARCHAR

Notes

  • || is used to concatenate expressions and constants. The expressions are cast to VARCHAR if possible, otherwise to VARBINARY, and must both be one or the other.
  • Two consecutive strings within a single SQL statement on separate lines are automatically concatenated

Examples

The following example is a single string written on two lines:

=> SELECT E'xx'
-> '\\';
 ?column?
----------
 xx\
(1 row)

The following examples show two strings concatenated:

=> SELECT E'xx' ||
-> '\\';
 ?column?
----------
 xx\\
(1 row)

=> SELECT 'auto' || 'mobile';
 ?column?
----------
 automobile
(1 row)

=> SELECT 'auto'
-> 'mobile';
 ?column?
----------
 automobile
(1 row)

=> SELECT 1 || 2;
 ?column?
----------
 12
(1 row)


=> SELECT '1' || '2';
 ?column?
----------
 12
(1 row)
=> SELECT '1'
-> '2';
 ?column?
----------
 12
(1 row)

2.5 - Expressions

SQL expressions are the components of a query that compare a value or values against other values. They can also perform calculations. An expression found inside a SQL statement is usually in the form of a conditional statement.

Some functions also use Lambda functions.

Operator precedence

The following table shows operator precedence in decreasing (high to low) order.

When an expression includes more than one operator, specify the order of operation using parentheses, rather than relying on operator precedence, as shown in the example after the table.

Operator/Element Associativity Description
. left table/column name separator
:: left typecast
[ ] left array element selection
- right unary minus
^ left exponentiation
* / % left multiplication, division, modulo
+ - left addition, subtraction
IS IS TRUE, IS FALSE, IS UNKNOWN, IS NULL
IN set membership
BETWEEN range containment
OVERLAPS time interval overlap
LIKE string pattern matching
< > less than, greater than
= right equality, assignment
NOT right logical negation
AND left logical conjunction
OR left logical disjunction
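
For example, the following query shows how parentheses override the default precedence of * over +:

=> SELECT 2 + 3 * 4, (2 + 3) * 4;
 ?column? | ?column?
----------+----------
       14 |       20
(1 row)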

Expression evaluation rules

The order of evaluation of subexpressions is not defined. In particular, the inputs of an operator or function are not necessarily evaluated left-to-right or in any other fixed order. To force evaluation in a specific order, use a CASE construct. For example, this is an untrustworthy way of trying to avoid division by zero in a WHERE clause:

=> SELECT x, y WHERE x <> 0 AND y/x > 1.5; --- unsafe

But this is safe:

=> SELECT x, y
   WHERE
     CASE
       WHEN x <> 0 THEN y/x > 1.5
       ELSE false
     END;

A CASE construct used in this fashion defeats optimization attempts, so use it only when necessary. (In this particular example, it would be best to avoid the issue by writing y > 1.5*x instead.)

Limits to SQL expressions

Expressions are limited by the available stack. Vertica requires at least 100KB of free stack. If this limit is exceeded then the error "The query contains an expression that is too complex to analyze" might be thrown. Adding physical memory and/or increasing the value of ulimit -s can increase the available stack and prevent the error.

Analytic expressions have a maximum recursion depth of 2000. If this limit is exceeded then the error "The query contains an expression that is too complex to analyze" might be thrown. This limit cannot be increased.

2.5.1 - Aggregate expressions

An aggregate expression applies an aggregate function across the rows or groups of rows selected by a query.

An aggregate expression only can appear in the select list or HAVING clause of a SELECT statement. It is invalid in other clauses such as WHERE, because those clauses are evaluated before the results of aggregates are formed.

Syntax

An aggregate expression has the following format:

aggregate-function ( [ * ] [ ALL | DISTINCT ] expression )

Parameters

aggregate-function A Vertica function that aggregates data over groups of rows from a query result set.
ALL | DISTINCT

Specifies which input rows to process:

  • ALL (default): Invokes aggregate-function across all input rows where expression evaluates to a non-null value.

  • DISTINCT: Invokes aggregate-function across all input rows where expression evaluates to a unique non-null value.

expression A value expression that does not itself contain an aggregate expression.

Examples

The AVG aggregate function returns the average income from the customer_dimension table:

=> SELECT AVG(annual_income) FROM customer_dimension;
 AVG
--------------
 2104270.6485
(1 row)

The following example shows how to use the COUNT aggregate function with the DISTINCT keyword to return all distinct values of evaluating the expression x+y for all inventory_fact records.

=> SELECT COUNT (DISTINCT date_key + product_key) FROM inventory_fact;
COUNT
-------
21560
(1 row)

2.5.2 - CASE expressions

The CASE expression is a generic conditional expression that can be used wherever an expression is valid. It is similar to CASE and IF/THEN/ELSE statements in other languages.

Syntax (form 1)

CASE
  WHEN condition THEN result
  [ WHEN condition THEN result ]
  ...
  [ ELSE result ]
END

Parameters

condition An expression that returns a Boolean (true/false) result. If the result is false, subsequent WHEN clauses are evaluated in the same way.
result Specifies the value to return when the associated condition is true.
ELSE result If no condition is true then the value of the CASE expression is the result in the ELSE clause. If the ELSE clause is omitted and no condition matches, the result is NULL.

Syntax (form 2)

CASE expression
 WHEN value THEN result
 [ WHEN value THEN result ]
 ...
 [ ELSE result ]
END

Parameters

expression An expression that is evaluated and compared to all the value specifications in WHEN clauses until one is found that is equal.
value Specifies a value to compare to the expression.
result Specifies the value to return when the expression is equal to the specified value.
ELSE result Specifies the value to return when the expression is not equal to any value; if no ELSE clause is specified, the value returned is null.

Notes

The data types of all result expressions must be convertible to a single output type.

Examples

The following examples show two uses of the CASE statement.

=> SELECT * FROM test;
 a
---
 1
 2
 3
=> SELECT a,
     CASE WHEN a=1 THEN 'one'
          WHEN a=2 THEN 'two'
          ELSE 'other'
     END
   FROM test;
 a | case
---+-------
 1 | one
 2 | two
 3 | other
=> SELECT a,
     CASE a WHEN 1 THEN 'one'
            WHEN 2 THEN 'two'
            ELSE 'other'
     END
   FROM test;
 a | case
---+-------
 1 | one
 2 | two
 3 | other

Special example

A CASE expression does not evaluate subexpressions that are not needed to determine the result. You can use this behavior to avoid division-by-zero errors:

=> SELECT x FROM T1 WHERE
      CASE WHEN x <> 0 THEN y/x > 1.5
      ELSE false
    END;

2.5.3 - Column references

Syntax

[[[database.]schema.]table-name.]column-name

Parameters

schema Database and schema. The default schema is public. If you specify a database, it must be the current database.
table-name

One of the following:

  • Name of a table

  • Table alias defined in the query's FROM clause

column-name A column name that is unique among all queried tables.

Restrictions

A column reference cannot contain any spaces.

2.5.4 - Comments

A comment is an arbitrary sequence of characters beginning with two consecutive hyphen characters and extending to the end of the line. For example:

   -- This is a standard SQL comment

A comment is removed from the input stream before further syntax analysis and is effectively replaced by white space.

Alternatively, C-style block comments can be used where the comment begins with /* and extends to the matching occurrence of */.

   /* multiline comment
    * with nesting: /* nested block comment */
    */

These block comments nest, as specified in the SQL standard. Unlike C, you can comment out larger blocks of code that might contain existing block comments.

2.5.5 - Date/time expressions

Vertica uses an internal heuristic parser for all date/time input support. Dates and times are input as strings, and are broken up into distinct fields with a preliminary determination of what kind of information might be in the field. Each field is interpreted and either assigned a numeric value, ignored, or rejected. The parser contains internal lookup tables for all textual fields, including months, days of the week, and time zones.

Vertica parses date/time type inputs as follows:

  1. Break the input string into tokens and categorize each token as a string, time, time zone, or number.

  2. Numeric token contains:

    • colon (:) — Parse as a time string, include all subsequent digits and colons.

    • dash (-), slash (/), or two or more dots (.) — Parse as a date string which might have a text month.

    • Numeric only — Parse as a single field or an ISO 8601 concatenated date (19990113 for January 13, 1999) or time (141516 for 14:15:16).

  3. Token starts with a plus (+) or minus (–): Parse as a time zone or a special field.

  4. Token is a text string: match up with possible strings.

    • Perform a binary-search table lookup for the token as either a special string (for example, today), day (for example, Thursday), month (for example, January), or noise word (for example, at, on).

    • Set field values and bit mask for fields. For example, set year, month, day for today, and additionally hour, minute, second for now.

    • If not found, do a similar binary-search table lookup to match the token with a time zone.

    • If still not found, throw an error.

  5. Token is a number or number field:

    • If eight or six digits, and if no other date fields were previously read, interpret as a "concatenated date" (19990118 or 990118). The interpretation is YYYYMMDD or YYMMDD; see the example after this list.

    • If token is three digits and a year was already read, interpret as day of year.

    • If four or six digits and a year was already read, interpret as a time (HHMM or HHMMSS).

    • If three or more digits and no date fields were found yet, interpret as a year (this forces yy-mm-dd ordering of the remaining date fields).

    • Otherwise the date field ordering is assumed to follow the DateStyle setting: mm-dd-yy, dd-mm-yy, or yy-mm-dd. Throw an error if a month or day field is found to be out of range.

  6. If BC is specified: negate the year and add one for internal storage. (In the Vertica implementation, 1 BC = year zero.)

  7. If BC is not specified, and year field is two digits in length: adjust the year to four digits. If field is less than 70, add 2000, otherwise add 1900.
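
For example, the following sketch shows an eight-digit numeric token parsed as a concatenated date (assuming the default date output format):

=> SELECT DATE '19990118';
  ?column?
------------
 1999-01-18
(1 row)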

Month day year ordering

For some formats, ordering of month, day, and year in date input is ambiguous and there is support for specifying the expected ordering of these fields.

Special date/time values

Vertica supports several special date/time values for convenience, as shown below. All of these values need to be written in single quotes when used as constants in SQL statements.

The values INFINITY and -INFINITY are specially represented inside the system and are displayed the same way. The others are simply notational shorthands that are converted to ordinary date/time values when read. (In particular, NOW and related strings are converted to a specific time value as soon as they are read.)

String Valid Data Types Description
epoch DATE, TIMESTAMP 1970-01-01 00:00:00+00 (UNIX SYSTEM TIME ZERO)
INFINITY TIMESTAMP Later than all other time stamps
-INFINITY TIMESTAMP Earlier than all other time stamps
NOW DATE, TIME, TIMESTAMP

Current transaction's start time

Note: NOW is not the same as the NOW function.

TODAY DATE, TIMESTAMP Midnight today
TOMORROW DATE, TIMESTAMP Midnight tomorrow
YESTERDAY DATE, TIMESTAMP Midnight yesterday
ALLBALLS TIME 00:00:00.00 UTC

The following SQL-compatible functions can also be used to obtain the current time value for the corresponding data type: CURRENT_DATE, CURRENT_TIME, CURRENT_TIMESTAMP, LOCALTIME, and LOCALTIMESTAMP. The latter four accept an optional precision specification. (See Date/time functions.) However, these functions are SQL functions and are not recognized as data input strings.

2.5.6 - NULL value

NULL is a reserved keyword used to indicate that a data value is unknown. It is the ASCII abbreviation for NULL characters (\0).

Usage in expressions

Vertica does not treat an empty string as a NULL value. An expression must specify NULL to indicate that a column value is unknown.
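
For example, an empty string tests as not NULL; the Boolean output shown as f reflects vsql's default rendering:

=> SELECT '' IS NULL;
 ?column?
----------
 f
(1 row)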

The following considerations apply to using NULL in expressions:

  • NULL is not greater than, less than, equal to, or not equal to any other expression. Use the Boolean predicate IS NULL to determine whether an expression value is NULL.

  • You can write queries with expressions that contain the <=> operator for NULL=NULL joins. See Equi-joins and non equi-joins.

  • Vertica accepts NULL characters ('\0') in constant strings and does not remove null characters from VARCHAR fields on input or output.

Projection ordering of NULL data

Vertica sorts NULL values in projection columns as follows:

Column data type NULL values placed at...
NUMERIC, INTEGER, DATE, TIME, TIMESTAMP, INTERVAL Beginning of sorted column (NULLS FIRST)
FLOAT, STRING, BOOLEAN End of sorted column (NULLS LAST)

See also

NULL-handling functions

2.6 - Lambda functions

Some SQL functions have arguments that are lambda functions. A lambda function is an unnamed inline function that is evaluated by the containing SQL function and returns a value.

Syntax

Lambda with one argument:

argument -> expression

Lambda with more than one argument:

(argument, ...) -> expression

Arguments

argument Name to use for an input value for the expression. The name cannot be a reserved keyword, the name of an argument to a parent or nested lambda, or a column name or alias.
expression Expression that uses the input arguments and returns a result to the containing SQL function. See the documentation of individual SQL functions for restrictions on return values. For example, some functions require a Boolean result.

Examples

The ARRAY_FIND function returns the first index that matches the element being searched for. Instead of a literal element, you can write a lambda function that returns a Boolean. The lambda function is applied to each element in the array until a match is found or all elements have been tested. In the following example, each person in the table has an array of email addresses, and the function locates fake addresses:

=> CREATE TABLE people (id INT, name VARCHAR, email ARRAY[VARCHAR,5]);

=> SELECT name, ARRAY_FIND(email, e -> REGEXP_LIKE(e,'example.com','i'))
   AS 'example.com'
   FROM people;
     name      | example.com
---------------+-------------
 Alice Adams   |           1
 Bob Adams     |           1
 Carol Collins |           0
 Dave Jones    |           0
(4 rows)

The argument e represents the individual element, and the body of the lambda expression is the regular-expression comparison. The input table has four rows; in each row, the lambda function is called once per array element.

In the following example, a schedules table includes an array of events, where each event is a ROW with several fields:

=> CREATE TABLE schedules
       (guest VARCHAR,
       events ARRAY[ROW(e_date DATE, e_name VARCHAR, price NUMERIC(8,2))]);

You can use the CONTAINS function with a lambda expression to find people who have more than one event on the same day. The second argument, idx, is the index of the current element:

=> SELECT guest FROM schedules
WHERE CONTAINS(events, (e, idx) ->
                       (idx < ARRAY_LENGTH(events) - 1)
                       AND (e.e_date = events[idx + 1].e_date));
    guest
-------------
 Alice Adams
(1 row)

2.7 - Predicates

Predicates are truth-tests.

Predicates are truth-tests. If the predicate test is true, it returns a value. Each predicate is evaluated per row, so that when the predicate is part of an entire table SELECT statement, the statement can return multiple results.

Predicates consist of a set of parameters and arguments. For example, in the following WHERE clause:

WHERE name = 'Smith'
  • name = 'Smith' is the predicate

  • 'Smith' is an expression

2.7.1 - ANY and ALL

ANY and ALL are logical operators that let you make comparisons on subqueries that return one or more rows.

ANY and ALL are logical operators that let you make comparisons on subqueries that return one or more rows. Both operators must be preceded by a comparison operator and followed by a subquery:

expression comparison-operator { ANY | ALL } (subquery)
  • ANY returns true if the comparison between expression and any value returned by subquery evaluates to true.
  • ALL returns true only if the comparison between expression and all values returned by subquery evaluates to true.

Equivalent operators

You can use the following operators instead of ANY or ALL:

This operator...    Is equivalent to...
SOME                ANY
IN                  = ANY
NOT IN              <> ALL
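
For example, the following two queries are equivalent and return the same rows (compare with the = ANY example under ANY subqueries below, which uses the same t1 and t2 tables):

=> SELECT c1 FROM t1 WHERE c1 IN (SELECT c1 FROM t2) ORDER BY c1;
=> SELECT c1 FROM t1 WHERE c1 = ANY (SELECT c1 FROM t2) ORDER BY c1;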

NULL handling

Vertica supports multicolumn <> ALL subqueries where the columns are not marked NOT NULL. If any column contains a NULL value, Vertica returns a run-time error.

Vertica does not support ANY subqueries that are nested in another expression if any column values are NULL.

Examples

Examples below use the following tables and data:

=> SELECT * FROM t1 ORDER BY c1;
 c1 | c2
----+-----
  1 | cab
  1 | abc
  2 | fed
  2 | def
  3 | ihg
  3 | ghi
  4 | jkl
  5 | mno
(8 rows)
=> SELECT * FROM t2 ORDER BY c1;
 c1 | c2
----+-----
  1 | abc
  2 | fed
  3 | jkl
  3 | stu
  3 | zzz
(5 rows)

ANY subqueries

Subqueries that use the ANY keyword return true when any value retrieved in the subquery matches the value of the left-hand expression.

ANY subquery within an expression:

=> SELECT c1, c2 FROM t1 WHERE COALESCE((t1.c1 > ANY (SELECT c1 FROM t2)));
 c1 | c2
----+-----
  2 | fed
  2 | def
  3 | ihg
  3 | ghi
  4 | jkl
  5 | mno
(6 rows)

ANY noncorrelated subqueries without aggregates:

=> SELECT c1 FROM t1 WHERE c1 = ANY (SELECT c1 FROM t2) ORDER BY c1;
 c1
----
  1
  1
  2
  2
  3
  3
(6 rows)

ANY noncorrelated subqueries with aggregates:

=> SELECT c1, c2 FROM t1 WHERE c1 <> ANY (SELECT MAX(c1) FROM t2) ORDER BY c1;
 c1 | c2
----+-----
  1 | cab
  1 | abc
  2 | fed
  2 | def
  4 | jkl
  5 | mno
(6 rows)

=> SELECT c1 FROM t1 GROUP BY c1 HAVING c1 <> ANY (SELECT MAX(c1) FROM t2) ORDER BY c1;
 c1
----
  1
  2
  4
  5
(4 rows)

ANY noncorrelated subqueries with aggregates and a GROUP BY clause:

=> SELECT c1, c2 FROM t1 WHERE c1 <> ANY (SELECT MAX(c1) FROM t2 GROUP BY c2) ORDER BY c1;
 c1 | c2
----+-----
  1 | cab
  1 | abc
  2 | fed
  2 | def
  3 | ihg
  3 | ghi
  4 | jkl
  5 | mno
(8 rows)

ANY noncorrelated subqueries with a GROUP BY clause:

=> SELECT c1, c2 FROM t1 WHERE c1 <=> ANY (SELECT c1 FROM t2 GROUP BY c1) ORDER BY c1;
 c1 | c2
----+-----
  1 | cab
  1 | abc
  2 | fed
  2 | def
  3 | ihg
  3 | ghi
(6 rows)

ANY correlated subqueries with no aggregates or GROUP BY clause:

=> SELECT c1, c2 FROM t1 WHERE c1 >= ANY (SELECT c1 FROM t2 WHERE t2.c2 = t1.c2) ORDER BY c1;
 c1 | c2
----+-----
  1 | abc
  2 | fed
  4 | jkl
(3 rows)

ALL subqueries

A subquery that uses the ALL keyword returns true when all values retrieved by the subquery match the left-hand expression, otherwise it returns false.

ALL noncorrelated subqueries without aggregates:

=> SELECT c1, c2 FROM t1 WHERE c1 >= ALL (SELECT c1 FROM t2) ORDER BY c1;
 c1 | c2
----+-----
  3 | ihg
  3 | ghi
  4 | jkl
  5 | mno
(4 rows)

ALL noncorrelated subqueries with aggregates:

=> SELECT c1, c2 FROM t1 WHERE c1 = ALL (SELECT MAX(c1) FROM t2) ORDER BY c1;
 c1 | c2
----+-----
  3 | ihg
  3 | ghi
(2 rows)

=> SELECT c1 FROM t1 GROUP BY c1 HAVING c1 <> ALL (SELECT MAX(c1) FROM t2) ORDER BY c1;
 c1
----
  1
  2
  4
  5
(4 rows)

ALL noncorrelated subqueries with aggregates and a GROUP BY clause:

=> SELECT c1, c2 FROM t1 WHERE c1 <= ALL (SELECT MAX(c1) FROM t2 GROUP BY c2) ORDER BY c1;
 c1 | c2
----+-----
  1 | cab
  1 | abc
(2 rows)

ALL noncorrelated subqueries with a GROUP BY clause:

=> SELECT c1, c2 FROM t1 WHERE c1 <> ALL (SELECT c1 FROM t2 GROUP BY c1) ORDER BY c1;
 c1 | c2
----+-----
  4 | jkl
  5 | mno
(2 rows)

2.7.2 - BETWEEN

Checks whether an expression is within the range of two other expressions, inclusive.

Checks whether an expression is within the range of two other expressions, inclusive. All expressions must be of the same or compatible data types.

Syntax

WHERE a BETWEEN x AND y

Equivalent predicates

The following BETWEEN predicates can be rewritten in conventional SQL with logical operators AND and OR.

This BETWEEN predicate...          Is equivalent to...
WHERE a BETWEEN x AND y            WHERE a >= x AND a <= y
WHERE a NOT BETWEEN x AND y        WHERE a < x OR a > y
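
For example, the following minimal illustration shows both forms with literal values; the first expression returns true and the second returns false:

=> SELECT (5 BETWEEN 1 AND 10) AS in_range,
          (5 NOT BETWEEN 1 AND 10) AS out_of_range;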

Examples

The BETWEEN predicate can be especially useful for querying date ranges, as shown in the following examples:

=> SELECT NOW()::DATE;
    NOW
------------
 2022-12-15
(1 row)

=> CREATE TABLE t1 (a INT, b varchar(12), c DATE);
CREATE TABLE
=> INSERT INTO t1 VALUES
    (0,'today',NOW()),
    (1,'today+1',NOW()+1),
    (2,'today+2',NOW()+2),
    (3,'today+3',NOW()+3),
    (4,'today+4',NOW()+4),
    (5,'today+5',NOW()+5),
    (6,'today+6',NOW()+6);
 OUTPUT
--------
      7
(1 row)

=> COMMIT;
COMMIT
=> SELECT * FROM t1;
 a |    b    |     c
---+---------+------------
 0 | today   | 2022-12-15
 1 | today+1 | 2022-12-16
 2 | today+2 | 2022-12-17
 3 | today+3 | 2022-12-18
 4 | today+4 | 2022-12-19
 5 | today+5 | 2022-12-20
 6 | today+6 | 2022-12-21
(7 rows)

=> SELECT * FROM t1 WHERE c BETWEEN '2022-12-17' AND '2022-12-20';
 a |    b    |     c
---+---------+------------
 2 | today+2 | 2022-12-17
 3 | today+3 | 2022-12-18
 4 | today+4 | 2022-12-19
 5 | today+5 | 2022-12-20
(4 rows)

Use the NOW and INTERVAL keywords to query a date range:

=> SELECT * FROM t1 WHERE c BETWEEN NOW()::DATE AND NOW()::DATE + INTERVAL '2 days';
 a |    b    |     c
---+---------+------------
 0 | today   | 2022-12-15
 1 | today+1 | 2022-12-16
 2 | today+2 | 2022-12-17
(3 rows)

2.7.3 - Boolean

Retrieves rows where the value of an expression is true, false, or unknown (NULL).

Retrieves rows where the value of an expression is true, false, or unknown (NULL).

Syntax

expression IS [NOT] TRUE
expression IS [NOT] FALSE
expression IS [NOT] UNKNOWN

Notes

  • NULL input is treated as the value UNKNOWN.
  • IS UNKNOWN and IS NOT UNKNOWN are effectively the same as the NULL predicate, except that the input expression does not have to be a single column value. To check a single column value for NULL, use the NULL predicate.
  • Do not confuse the Boolean predicate with Boolean operators or the Boolean data type, which can have only two values: true and false.
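
For example, the following minimal sketch (assuming a hypothetical table flags with a single BOOLEAN column containing true, false, and NULL) shows how the Boolean predicate treats NULL as unknown, while an ordinary equality comparison does not match it:

=> CREATE TABLE flags (b BOOLEAN);
=> INSERT INTO flags VALUES (true), (false), (NULL);
=> SELECT b FROM flags WHERE b IS NOT TRUE;   -- returns the false and NULL rows
=> SELECT b FROM flags WHERE b = FALSE;       -- returns only the false row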

2.7.4 - EXISTS

EXISTS and NOT EXISTS predicates compare an expression against a subquery.

EXISTS and NOT EXISTS predicates compare an expression against a subquery:

  • EXISTS returns true if the subquery returns one or more rows.
  • NOT EXISTS returns true if the subquery returns no rows.

Syntax

expression [ NOT ] EXISTS ( subquery )

Usage

EXISTS results depend only on whether any records are returned, not on the contents of those records. Because the subquery output is usually of no interest, EXISTS tests are commonly written in one of the following ways:

EXISTS (SELECT 1 WHERE...)
EXISTS (SELECT * WHERE...)

In the first case, the subquery returns 1 for every record it finds. For example, the following query retrieves a list of all customers whose store purchases were greater than 550 dollars:

=> SELECT customer_key, customer_name, customer_state
   FROM public.customer_dimension WHERE EXISTS
     (SELECT 1 FROM store.store_sales_fact
      WHERE customer_key = public.customer_dimension.customer_key
      AND sales_dollar_amount > 550)
   AND customer_state = 'MA' ORDER BY customer_key;
 customer_key |     customer_name      | customer_state
--------------+------------------------+----------------
            2 | Anna G. Li              | CA
            4 | Daniel I. Fortin        | TX
            7 | David H. Greenwood      | MA
            8 | Wendy S. Young          | IL
            9 | Theodore X. Brown       | MA
      ...
        49902 | Amy Q. Pavlov          | MA
        49922 | Doug C. Carcetti       | MA
        49930 | Theodore G. McNulty    | MA
        49979 | Ben Z. Miller          | MA
(1058 rows)

EXISTS versus IN

Whether you use EXISTS or IN subqueries depends on which predicates you select in outer and inner query blocks. For example, the following query gets a list of all the orders placed by all stores on January 2, 2007 for vendors with records in the vendor table:

=> SELECT store_key, order_number, date_ordered
   FROM store.store_orders_fact WHERE EXISTS
     (SELECT 1 FROM public.vendor_dimension vd JOIN store.store_orders_fact ord ON vd.vendor_key = ord.vendor_key)
   AND date_ordered = '2007-01-02';
 store_key | order_number | date_ordered
-----------+--------------+--------------
       114 |       271071 | 2007-01-02
        19 |       290888 | 2007-01-02
       132 |        58942 | 2007-01-02
       232 |         9286 | 2007-01-02
       126 |       224474 | 2007-01-02
       196 |        63482 | 2007-01-02
  ...
       196 |        83327 | 2007-01-02
       138 |       278373 | 2007-01-02
       179 |       293586 | 2007-01-02
       155 |       213413 | 2007-01-02
(506 rows)

The above query looks for existence of the vendor and date ordered. To return a particular value, rather than simple existence, the query looks for orders placed by the vendor who got the best deal on January 2, 2007:

=> SELECT store_key, order_number, date_ordered, vendor_name
   FROM store.store_orders_fact ord JOIN public.vendor_dimension vd ON ord.vendor_key = vd.vendor_key
   WHERE vd.deal_size IN (SELECT MAX(deal_size) FROM public.vendor_dimension) AND date_ordered = '2007-01-02';
 store_key | order_number | date_ordered |     vendor_name
-----------+--------------+--------------+----------------------
        50 |        99234 | 2007-01-02   | Everything Wholesale
        81 |       200802 | 2007-01-02   | Everything Wholesale
       115 |        13793 | 2007-01-02   | Everything Wholesale
       204 |        41842 | 2007-01-02   | Everything Wholesale
       133 |       169025 | 2007-01-02   | Everything Wholesale
       163 |       208580 | 2007-01-02   | Everything Wholesale
        29 |       154972 | 2007-01-02   | Everything Wholesale
       145 |       236790 | 2007-01-02   | Everything Wholesale
       249 |        54838 | 2007-01-02   | Everything Wholesale
         7 |       161536 | 2007-01-02   | Everything Wholesale
(10 rows)

See also

IN

2.7.5 - IN

Checks whether a single value is found (or not found) within a set of values.

Checks whether a single value is found (or not found) within a set of values.

Syntax

(column-list) [ NOT ] IN ( values-list )

Arguments

column-list
One or more comma-delimited columns in the queried tables.
values-list
Comma-delimited list of constant values to find in the column-list columns. Each values-list value maps to a column-list column according to their order in values-list and column-list, respectively. Column/value pairs must have compatible data types.

You can specify multiple sets of values as follows:

( (values-list), (values-list)[,...] )

Null handling

Vertica supports multicolumn NOT IN subqueries where the columns are not marked NOT NULL. If one of the columns is found to contain a NULL value during query execution, Vertica returns a run-time error.

Similarly, IN subqueries nested within another expression are not supported if any column values are NULL. For example, if in the following statement column x from either table contains a NULL value, Vertica returns a run-time error:

=> SELECT * FROM t1 WHERE (x IN (SELECT x FROM t2)) IS FALSE;
   ERROR: NULL value found in a column used by a subquery

Examples

The following SELECT statement queries all data in table t11.

=> SELECT * FROM t11 ORDER BY pk;
 pk | col1 | col2 | SKIP_ME_FLAG
----+------+------+--------------
  1 |    2 |    3 | t
  2 |    3 |    4 | t
  3 |    4 |    5 | f
  4 |    5 |    6 | f
  5 |    6 |    7 | t
  6 |      |    8 | f
  7 |    8 |      | t
(7 rows)

The following query specifies an IN predicate to find all rows in t11 where columns col1 and col2 contain values of (2,3) or (6,7):

=> SELECT * FROM t11 WHERE (col1, col2) IN ((2,3), (6,7)) ORDER BY pk;
 pk | col1 | col2 | SKIP_ME_FLAG
----+------+------+--------------
  1 |    2 |    3 | t
  5 |    6 |    7 | t
(2 rows)

The following query uses the VMart schema to illustrate the use of outer expressions referring to different inner expressions:

=> SELECT product_description, product_price FROM product_dimension
   WHERE (product_dimension.product_key, product_dimension.product_key) IN
      (SELECT store.store_orders_fact.order_number,
         store.store_orders_fact.quantity_ordered
       FROM store.store_orders_fact);
     product_description     | product_price
-----------------------------+---------------
 Brand #73 wheechair         |           454
 Brand #72 box of candy      |           326
 Brand #71 vanilla ice cream |           270
(3 rows)

2.7.6 - INTERPOLATE

Joins two event series using some ordered attribute.

Joins two event series using some ordered attribute. Event series joins let you compare values from two series directly, rather than having to normalize the series to the same measurement interval.

An event series join is an extension of a regular outer join. The difference between expressing a regular outer join and an event series join is the INTERPOLATE predicate, which is used in the ON clause (see Examples below). Instead of padding the non-preserved side with null values when there is no match, the event series join pads the non-preserved side with the previous/next values from the table.

Interpolated values come from the table that contains the null, not from the other table. Vertica does not guarantee that the output contains no null values. If there is no previous/next value for a mismatched row, that row is padded with nulls.

Syntax

expression1 INTERPOLATE { PREVIOUS | NEXT } VALUE expression2

Arguments

expression1,expression2

A column reference from one of the tables specified in the FROM clause.

The columns can be of any data type. Because event series are time-based, the type is typically DATE/TIME or TIMESTAMP.

{ PREVIOUS | NEXT } VALUE

Pads the non-preserved side with the previous/next values when there is no match. If PREVIOUS is applied to the first row (or NEXT to the last row), the row is padded with null values.

Input rows are sorted in ascending logical order of the join column.

Notes

  • Data is logically partitioned on the table in which it resides, based on other ON clause equality predicates.

  • Event series join requires that the joined tables are both sorted on columns in equality predicates, in any order, followed by the INTERPOLATED column. If data is already sorted in this order, then an explicit sort is avoided, which can improve query performance. For example, given the following tables:

    ask: exchange, stock, ts, price
    bid: exchange, stock, ts, price

    In the query that follows:

    • ask is sorted on exchange, stock (or the reverse), ts

    • bid is sorted on exchange, stock (or the reverse), ts

    SELECT ask.price - bid.price, ask.ts, ask.stock, ask.exchange
    FROM ask FULL OUTER JOIN bid
       ON ask.stock = bid.stock AND ask.exchange =
       bid.exchange AND ask.ts INTERPOLATE PREVIOUS
       VALUE bid.ts;
    

Restrictions

  • Only one INTERPOLATE expression is allowed per join.
  • INTERPOLATE expressions are used only with ANSI SQL-99 syntax (the ON clause), which is already true for full outer joins.
  • INTERPOLATE can be used with equality predicates only.
  • The AND operator is supported but not the OR and NOT operators.
  • Expressions and implicit or explicit casts are not supported, but subqueries are allowed.

Semantics

When you write an event series join in place of a normal join, values are evaluated as follows (using the schema in the examples below):

  • t is the outer, preserved table.
  • t1 is the inner, non-preserved table.
  • For each row in outer table t, the ON clause predicates are evaluated for each combination of each row in the inner table t1.
  • If the ON clause predicates evaluate to true for any combination of rows, those combination rows are produced at the output.
  • If the ON clause is false for all combinations, a single output row is produced with the values of the row from t, along with the columns of t1 chosen from the t1 row with the greatest t1.y value such that t1.y < t.x. If no such row is found, the row is padded with nulls.

In the case of a full outer join, all values from both tables are preserved.

Examples

The examples that follow use this simple schema.

CREATE TABLE t(x TIME);
CREATE TABLE t1(y TIME);
INSERT INTO t VALUES('12:40:23');
INSERT INTO t VALUES('13:40:25');
INSERT INTO t VALUES('13:45:00');
INSERT INTO t VALUES('14:49:55');
INSERT INTO t1 VALUES('12:40:23');
INSERT INTO t1 VALUES('14:00:00');
COMMIT;

Normal full outer join

=> SELECT * FROM t FULL OUTER JOIN t1 ON t.x = t1.y;

Notice the null rows from the non-preserved table:

    x     |    y
----------+----------
 12:40:23 | 12:40:23
 13:40:25 |
 13:45:00 |
 14:49:55 |
          | 14:00:00
(5 rows)

Full outer join with interpolation

=> SELECT * FROM t FULL OUTER JOIN t1 ON t.x INTERPOLATE PREVIOUS VALUE t1.y;

In this case, the rows with no entry point are padded with values from the previous row.

    x     |    y
----------+----------
 12:40:23 | 12:40:23
 13:40:25 | 12:40:23
 13:45:00 | 12:40:23
 14:49:55 | 12:40:23
 13:40:25 | 14:00:00
(5 rows)

Likewise, interpolate next is also supported:

=> SELECT * FROM t FULL OUTER JOIN t1 ON t.x INTERPOLATE NEXT VALUE t1.y;

In this case, the rows with no entry point are padded with values from the next row.

    x     |    y
----------+----------
 12:40:23 | 12:40:23
 13:40:25 | 14:00:00
 13:45:00 | 14:00:00
 14:49:55 |
 14:49:55 | 14:00:00
(5 rows)

Normal left outer join

=> SELECT * FROM t LEFT OUTER JOIN t1 ON t.x = t1.y;

Again, there are nulls in the non-preserved table

    x     |    y
----------+----------
 12:40:23 | 12:40:23
 13:40:25 |
 13:45:00 |
 14:49:55 |
(4 rows)

Left outer join with interpolation

=> SELECT * FROM t LEFT OUTER JOIN t1 ON t.x INTERPOLATE PREVIOUS VALUE t1.y;

Nulls have been padded with interpolated values.

    x     |    y
----------+----------
 12:40:23 | 12:40:23
 13:40:25 | 12:40:23
 13:45:00 | 12:40:23
 14:49:55 | 14:00:00
(4 rows)

Likewise, interpolate next is also supported:

=> SELECT * FROM t LEFT OUTER JOIN t1 ON t.x INTERPOLATE NEXT VALUE t1.y;

Nulls have been padded with interpolated values here as well.

    x     |    y
----------+----------
 12:40:23 | 12:40:23
 13:40:25 | 14:00:00
 13:45:00 | 14:00:00
 14:49:55 |
 (4 rows)

Inner joins

For inner joins, there is no difference between a regular inner join and an event series inner join. Since null values are eliminated from the result set, there is nothing to interpolate.

A regular inner join returns only the single matching row at 12:40:23:

=> SELECT * FROM t INNER JOIN t1 ON t.x = t1.y;
    x     |    y
----------+----------
 12:40:23 | 12:40:23
(1 row)

An event series inner join finds the same single-matching row at 12:40:23:

=> SELECT * FROM t INNER JOIN t1 ON t.x INTERPOLATE PREVIOUS VALUE t1.y;
    x     |    y
----------+----------
 12:40:23 | 12:40:23
(1 row)

See also

Event series joins

2.7.6.1 - Join predicate

Specifies the columns on which records from two or more tables are joined.

Specifies the columns on which records from two or more tables are joined. You can connect multiple join predicates with logical operators AND, OR, and NOT.

Syntax

ON column-ref = column-ref [ {AND | OR | NOT } column-ref = column-ref ]...

Parameters

column-ref Specifies a column in a queried table. For best performance, do not join on LONG VARBINARY and LONG VARCHAR columns.
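
For example, the following minimal sketch (assuming hypothetical tables orders and customers that share customer_key and region columns) connects two join predicates with AND:

=> SELECT o.order_number, c.customer_name
   FROM orders o JOIN customers c
   ON o.customer_key = c.customer_key AND o.region = c.region;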

See also

Joins

2.7.7 - LIKE

Retrieves rows where a string expression—typically a column—matches the specified pattern or, if qualified by ANY or ALL, set of patterns.

Retrieves rows where a string expression—typically a column—matches the specified pattern or, if qualified by ANY or ALL, set of patterns. Patterns can contain one or more wildcard characters.

If an ANY or ALL pattern is qualified with NOT, the negation is pushed down to each clause. NOT LIKE ANY (a, b) is equivalent to NOT LIKE a OR NOT LIKE b. See the examples.

Syntax

string-expression [ NOT ] { LIKE | ILIKE | LIKEB | ILIKEB }
   { pattern | { ANY | SOME | ALL } ( pattern,... ) } [ ESCAPE 'char' ]

Arguments

string-expression
String expression, typically a column, to test for instances of the specified pattern or patterns.
NOT
Returns true if the LIKE predicate returns false and vice-versa. When used with ANY or ALL, applies to each value individually.
LIKE | ILIKE | LIKEB | ILIKEB
Type of comparison:
  • LIKE: Complies with the SQL standard. Case-sensitive; operates on UTF-8 character strings; exact behavior depends on collation parameters such as strength. LIKE is stable for character strings, but immutable for binary strings.
  • ILIKE: Same as LIKE but case-insensitive.
  • LIKEB: Performs case-sensitive byte-at-a-time ASCII comparisons, immutable for character and binary strings.
  • ILIKEB: Same as LIKEB but case-insensitive.
pattern
A pattern to test against the expression. Pattern strings can contain the following wildcard characters:
  • _ (underscore): Match any single character.

  • % (percent): Match any string of zero or more characters.

ANY | SOME | ALL
Apply a comma-delimited list of patterns, where:
  • ANY and SOME return true if any pattern matches, equivalent to logical OR. These options are synonyms.

  • ALL returns true only if all patterns match, equivalent to logical AND.

ESCAPE char
Escape character, by default backslash (\), used to escape reserved characters: wildcard characters (underscore and percent), and the escape character itself.

This option is enforced only for non-default collations; it is currently unsupported with ANY/ALL pattern matching.
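
For example, escaping the underscore wildcard with the default backslash escape character makes it match only a literal underscore (a minimal illustration; the first expression returns true, the second false):

=> SELECT ('a_b' LIKE 'a\_b') AS literal_underscore,
          ('axb' LIKE 'a\_b') AS any_single_character;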

Substitute symbols

You can substitute the following symbols for LIKE and its variants:

Symbol Equivalent to:
~~ LIKE
~# LIKEB
~~* ILIKE
~#* ILIKEB
!~~ NOT LIKE
!~# NOT LIKEB
!~~* NOT ILIKE
!~#* NOT ILIKEB

Pattern matching

LIKE and its variants require that the entire string expression match the specified patterns. To match a sequence of characters anywhere within a string, the pattern must start and end with a percent sign.

LIKE does not ignore trailing white space characters. If the data values to match end with an indeterminate amount of white space, append the wildcard character % to pattern.

Locale dependencies

In the default locale, LIKE and ILIKE handle UTF-8 character-at-a-time, locale-insensitive comparisons. ILIKE handles language-independent case-folding.

In non-default locales, LIKE and ILIKE perform locale-sensitive string comparisons, including some automatic normalization, using the same algorithm as the = operator on VARCHAR types.

ESCAPE expressions evaluate to exactly one octet—or one UTF-8 character for non-default locales.

Examples

Basic pattern matching

The following query searches for names with a common prefix:

=> SELECT name FROM people WHERE name LIKE 'Ann%';
   name
-----------
 Ann
 Ann Marie
 Anna
(3 rows)

LIKE ANY/ALL

LIKE operators support the keywords ANY and ALL, which let you specify multiple patterns to test against a string expression. For example, the following query finds all names that begin or end with the letter 'A':

=> SELECT name FROM people WHERE name LIKE ANY ('A%', '%a');
   name
-----------
 Alice
 Ann
 Ann Marie
 Anna
 Roberta
(5 rows)

LIKE ANY usage is equivalent to individual LIKE conditions combined with OR:

=> SELECT name FROM people WHERE name LIKE 'A%' OR name LIKE '%a';
   name
-----------
 Alice
 Ann
 Ann Marie
 Anna
 Roberta
(5 rows)

Similarly, LIKE ALL is equivalent to individual LIKE conditions combined with AND.

NOT LIKE ANY/ALL

You can use NOT with LIKE ANY or LIKE ALL. NOT does not negate the LIKE expression; instead it negates each clause.

Consider a table with the following contents:

=> SELECT name FROM people;
   name
-----------
 Alice
 Ann
 Ann Marie
 Anna
 Richard
 Rob
 Robert
 Roberta
(8 rows)

In the following query, NOT LIKE ANY ('A%', '%a') is equivalent to NOT LIKE 'A%' OR NOT LIKE '%a', so the only result that is eliminated is Anna, which matches both patterns:

=> SELECT name FROM people WHERE name NOT LIKE ANY ('A%', '%a');
   name
-----------
 Alice
 Ann
 Ann Marie
 Richard
 Rob
 Robert
 Roberta
(7 rows)

--- same results:
=> SELECT name FROM people WHERE name NOT LIKE 'A%' OR name NOT LIKE '%a';

NOT LIKE ALL eliminates results that satisfy any pattern:

=> SELECT name FROM people WHERE name NOT LIKE ALL ('A%', '%a');
  name
---------
 Richard
 Rob
 Robert
(3 rows)

--- same results:
=> SELECT name FROM people WHERE name NOT LIKE 'A%' AND name NOT LIKE '%a';

Pattern matching in locales

The following example illustrates pattern matching in locales.

=> \locale default
INFO 2567:  Canonical locale: 'en_US'
Standard collation: 'LEN_KBINARY'
English (United States)
=> CREATE TABLE src(c1 VARCHAR(100));
=> INSERT INTO src VALUES (U&'\00DF'); --The sharp s (ß)
=> INSERT INTO src VALUES ('ss');
=> COMMIT;

Querying the src table in the default locale returns both ss and sharp s.

=> SELECT * FROM src;
 c1
----
 ß
 ss
(2 rows)

The following query combines pattern-matching predicates to return the results from column c1:

=> SELECT c1, c1 = 'ss' AS equality, c1 LIKE 'ss'
   AS LIKE, c1 ILIKE 'ss' AS ILIKE FROM src;
 c1 | equality | LIKE | ILIKE
----+----------+------+-------
 ß  | f        | f    | f
 ss | t        | t    | t
(2 rows)

The next query specifies unicode format for c1:

=> SELECT c1, c1 = U&'\00DF' AS equality,
   c1 LIKE U&'\00DF' AS LIKE,
   c1 ILIKE U&'\00DF' AS ILIKE from src;
 c1 | equality | LIKE | ILIKE
----+----------+------+-------
 ß  | t        | t    | t
 ss | f        | f    | f
(2 rows)

Now change the locale to German with a strength of 1 (ignore case and accents):

=> \locale LDE_S1
INFO 2567:  Canonical locale: 'de'
Standard collation: 'LDE_S1'
German  Deutsch
=> SELECT c1, c1 = 'ss' AS equality,
c1 LIKE 'ss' as LIKE, c1 ILIKE 'ss' AS ILIKE from src;
 c1 | equality | LIKE | ILIKE
----+----------+------+-------
 ß  | t        | t    | t
 ss | t        | t    | t
(2 rows)

This example illustrates binary data types with pattern-matching predicates:

=> CREATE TABLE t (c BINARY(1));
CREATE TABLE
=> INSERT INTO t VALUES (HEX_TO_BINARY('0x00')), (HEX_TO_BINARY('0xFF'));
 OUTPUT
--------
      2
(1 row)

=> COMMIT;
COMMIT
=> SELECT TO_HEX(c) FROM t;
 TO_HEX
--------
 00
 ff
(2 rows)

=> SELECT * FROM t;
  c
------
 \000
 \377
(2 rows)

=> SELECT c, c = '\000', c LIKE '\000', c ILIKE '\000' from t;
  c   | ?column? | ?column? | ?column?
------+----------+----------+----------
 \000 | t        | t        | t
 \377 | f        | f        | f
(2 rows)

=> SELECT c, c = '\377', c LIKE '\377', c ILIKE '\377' FROM t;
  c   | ?column? | ?column? | ?column?
------+----------+----------+----------
 \000 | f        | f        | f
 \377 | t        | t        | t
(2 rows)

2.7.8 - NULL

Tests for null values.

Tests for null values. The expression can be a column name, literal, or function.

Syntax

value-expression IS [ NOT ] NULL

Examples

Column name:

=> SELECT date_key FROM date_dimension WHERE date_key IS NOT NULL;
 date_key
----------
        1
      366
     1462
     1097
        2
        3
        6
        7
        8
...

Function:

=> SELECT MAX(household_id) IS NULL FROM customer_dimension;
 ?column?
----------
 f
(1 row)

Literal:

=> SELECT 'a' IS NOT NULL;
 ?column?
----------
 t
(1 row)

2.8 - Hints

Hints are directives that you embed within a query or directed query.

Hints are directives that you embed within a query or directed query. They conform to the following syntax:

/*+hint-name[, hint-name]...*/

Hints are bracketed by comment characters /*+ and */, which can enclose multiple comma-delimited hints. For example:

SELECT /*+syntactic_join,verbatim*/

Restrictions

When embedding hints in a query, be aware of the following restrictions:

  • Do not embed spaces in the comment characters /* and */.

  • In general, spaces are allowed before and after the plus (+) character and hint-name; however, some third-party tools do not support spaces embedded inside /*+.

Supported hints

Vertica supports the following hints:

General hints

Hint Description
ALLNODES Qualifies an EXPLAIN statement to request a query plan that assumes all nodes are active.
EARLY_MATERIALIZATION Specifies early materialization of a table for the current query.
ENABLE_WITH_CLAUSE_MATERIALIZATION Enables and disables WITH clause materialization for a specific query.
LABEL Labels a query so you can identify it for profiling and debugging.
SKIP_STATISTICS Directs the optimizer to produce a query plan that incorporates only minimal statistics.

Eon Mode hints

Hint Description
DEPOT_FETCH Specifies whether a query fetches data to the depot from communal storage when the depot lacks data for this query.
ECSMODE Specifies the elastic crunch scaling (ECS) strategy for dividing shard data among its subscribers.

Join hints

Hint Description
SYNTACTIC_JOIN Enforces join order and enables other join hints.
DISTRIB Sets the input operations for a distributed join to broadcast, resegment, local, or filter.
GBYTYPE Specifies which algorithm—GROUPBY HASH or GROUPBY PIPELINED—the Vertica query optimizer should use to implement a GROUP BY clause.
JTYPE Enforces the join type: merge or hash join.
UTYPE Specifies how to combine UNION ALL input.

Projection hints

Hint Description
PROJS Specifies one or more projections to use for a queried table.
SKIP_PROJS Specifies which projections to avoid using for a queried table.

Directed query hints

The following hints are only supported by directed queries:

Hint Description
:c Marks a query constant that must be included in an input query; otherwise, that input query is disqualified from using the directed query.
:v Maps an input query constant to one or more annotated query constants.
VERBATIM Enforces execution of an annotated query exactly as written.

2.8.1 - :c

In a directed query, marks a query constant that must be included in an input query; otherwise, that input query is disqualified from using the directed query.

In a directed query, marks a query constant that must be included in an input query; otherwise, that input query is disqualified from using the directed query.

Syntax

/*+:c*/

Usage

By default, optimizer-generated directed queries set ignore constant (:v) hints on predicate constants. You can override this behavior by setting the :c hint on input query constants that must not be ignored. For example, the following statement creates a directed query that can be used only for input queries where the join predicate constant is the same as in the original input query—8:

=> CREATE DIRECTED QUERY OPTIMIZER simpleJoin_KeepPredicateConstant SELECT * FROM S JOIN T ON S.a = T.b WHERE S.a = 8 /*+:c*/;
CREATE DIRECTED QUERY
=> ACTIVATE DIRECTED QUERY simpleJoin_KeepPredicateConstant;

See also

Conserving Predicate Constants in Directed Queries

2.8.2 - :v

In a directed query, marks an input query constant that the optimizer ignores when it considers whether to use the directed query for a given query.

In a directed query, marks an input query constant that the optimizer ignores when it considers whether to use the directed query for a given query. Use this hint to create a directed query that can be used for multiple variants of an input query.

Vertica also supports IGNORECONST as an alias of :v . Optimizer-generated directed queries automatically mark predicate constants in input and annotated queries with :v hints.

For details, see Ignoring constants in directed queries.

Syntax

/*+:v(arg)*/
/*+IGNORECONST(arg)*/
arg
Integer argument that is used in the directed query to pair each input query :v hint with one or more annotated query :v hints.

Examples

See Ignoring constants in directed queries.

2.8.3 - ALLNODES

Qualifies an EXPLAIN statement to request a query plan that assumes all nodes are active.

Qualifies an EXPLAIN statement to request a query plan that assumes all nodes are active. If you omit this hint, the EXPLAIN statement produces a query plan that takes into account any nodes that are currently down.

Syntax

EXPLAIN /*+ALLNODES*/

Examples

In the following example, the ALLNODES hint requests a query plan that assumes all nodes are active.

QUERY PLAN DESCRIPTION:
 ------------------------------

 Opt Vertica Options
 --------------------
 PLAN_ALL_NODES_ACTIVE


 EXPLAIN /*+ALLNODES*/ select * from Emp_Dimension;

 Access Path:
 +-STORAGE ACCESS for Emp_Dimension [Cost: 125, Rows: 10K (NO STATISTICS)] (PATH ID: 1)
 |  Projection: public.Emp_Dimension_b0
 |  Materialize: Emp_Dimension.Employee_key, Emp_Dimension.Employee_gender, Emp_Dimension.Courtesy_title, Emp_Dimension.Employee_first_name, Emp_Dimension.Employee_middle_initial, Emp_Dimension.Employee_last_name, Emp_Dimension.Employee_age, Emp_Dimension.Employee_birthdate, Emp_Dimension.Employee_street, Emp_Dimension.Employee_city, Emp_Dimension.Employee_state, Emp_Dimension.Employee_region, Emp_Dimension.Employee_position
 |  Execute on: All Nodes

2.8.4 - DEPOT_FETCH

Specifies whether a query fetches data to the depot from communal storage when the depot lacks data for this query.

Eon Mode only

Specifies whether a query fetches data to the depot from communal storage when the depot lacks data for this query. This hint overrides configuration parameter DepotOperationsForQuery.

Syntax

SELECT /*+DEPOT_FETCH (option)*/

Arguments

option
Specifies behavior when the depot does not contain queried file data, one of the following:
  • ALL (default): Fetch file data from communal storage and, if necessary, displace existing files by evicting them from the depot.

  • FETCHES: Fetch file data from communal storage only if space is available; otherwise, read the queried data directly from communal storage.

  • NONE: Do not fetch file data to the depot, read the queried data directly from communal storage.

Examples

SELECT /*+DEPOT_FETCH(All)*/ count(*) FROM bar;
SELECT /*+DEPOT_FETCH(FETCHES)*/ count(*) FROM bar;
SELECT /*+DEPOT_FETCH(NONE)*/ count(*) FROM bar;

2.8.5 - DISTRIB

The DISTRIB hint specifies to the optimizer how to distribute join key data in order to implement a join.

The DISTRIB hint specifies to the optimizer how to distribute join key data in order to implement a join. If a specified distribution is not feasible, the optimizer ignores the hint and throws a warning.

The following requirements apply:

  • Queries that include the DISTRIB hint must also include the SYNTACTIC_JOIN hint. Otherwise, the optimizer ignores the DISTRIB hint and throws a warning.

  • Join syntax must conform with ANSI SQL-92 join conventions.

Syntax

JOIN /*+DISTRIB(outer-join, inner-join)*/

Arguments

outer-join
inner-join
Specifies how to distribute data on the outer and inner joins:
  • L (local): Inner and outer join keys are identically segmented on each node, join locally.

  • R (resegment): Inner and outer join keys are not identically segmented. Resegment join-key data before implementing the join.

  • B (broadcast): Inner and outer join keys are not identically segmented. Broadcast data of this join key to other nodes before implementing the join.

  • F (filter): Join table is unsegmented. Filter data as needed by the other join key before implementing the join.

  • A (any): Let the optimizer choose the distribution method that it considers to be most cost-effective.

Examples

In the following query, the join is qualified with a DISTRIB hint of /*+DISTRIB(L,R)*/. This hint tells the optimizer to resegment data of join key stores.store_key before joining it to the sales.store_key data:

SELECT /*+SYNTACTIC_JOIN*/ sales.store_key, stores.store_name, sales.product_description, sales.sales_quantity, sales.sale_date
FROM (store.storeSales AS sales JOIN /*+DISTRIB(L,R),JTYPE(H)*/ store.store_dimension AS stores ON (sales.store_key = stores.store_key))
WHERE (sales.sale_date = '2014-12-01'::date) ORDER BY sales.store_key, sales.sale_date;

2.8.6 - EARLY_MATERIALIZATION

Specifies early materialization of a table for the current query.

Specifies early materialization of a table for the current query. A query can include this hint for any number of tables. Typically, the query optimizer delays materialization until late in the query execution process. This hint overrides any choices that the optimizer otherwise would make.

This hint can be useful in cases where late materialization of join inputs precludes other optimizations—for example, pushing aggregation down the joins, or using live aggregate projections. In these cases, qualifying a join input with EARLY_MATERIALIZATION can enable the optimizations.

Syntax

table-name [ [AS] alias ] /*+EARLY_MATERIALIZATION*/
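
For example, the following minimal sketch (assuming hypothetical tables orders and customers) requests early materialization of the orders join input:

=> SELECT c.customer_name, SUM(o.quantity)
   FROM orders o /*+EARLY_MATERIALIZATION*/
   JOIN customers c ON o.customer_key = c.customer_key
   GROUP BY c.customer_name;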

2.8.7 - ECSMODE

Sets the ECS strategy that the optimizer uses when it divides responsibility for processing shard data among subscriber nodes.

Eon Mode only

Sets the ECS strategy that the optimizer uses when it divides responsibility for processing shard data among subscriber nodes. This hint is applied only if the subcluster uses elastic crunch scaling (ECS).

Syntax

SELECT /*+ECSMODE(option)*/

Arguments

option
Specifies the strategy to use when dividing shard data among its subscribing nodes, one of the following:
  • AUTO: The optimizer chooses the strategy to use, useful only if ECS mode is set at the session level (see Setting the ECS Strategy for the Session or Database).

  • IO_OPTIMIZED: Use I/O-optimized strategy.

  • COMPUTE_OPTIMIZED: Use compute-optimized strategy.

  • NONE: Disable use of ECS for this query. Only participating nodes are involved in query execution; collaborating nodes are not.

Example

The following example shows the query plan for a simple single-table query that is forced to use the compute-optimized strategy:

=> EXPLAIN SELECT /*+ECSMode(COMPUTE_OPTIMIZED)*/ employee_last_name,
             employee_first_name,employee_age
             FROM employee_dimension
             ORDER BY employee_age DESC;

                                   QUERY PLAN
--------------------------------------------------------------------------------
 ------------------------------
 QUERY PLAN DESCRIPTION:
 The execution of this query involves non-participating nodes.
 Crunch scaling strategy preserves data segmentation
 ------------------------------
. . .

2.8.8 - ENABLE_WITH_CLAUSE_MATERIALIZATION

Enables materialization of all queries in the current WITH clause. Otherwise, materialization is set by configuration parameter WithClauseMaterialization, by default set to 0 (disabled). If WithClauseMaterialization is disabled, materialization is automatically cleared when the primary query of the WITH clause returns. For details, see Materialization of WITH clause.

Syntax

WITH /*+ENABLE_WITH_CLAUSE_MATERIALIZATION*/
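
For example, the following minimal sketch (assuming a hypothetical store_sales table) materializes the WITH clause query for the duration of the statement:

=> WITH /*+ENABLE_WITH_CLAUSE_MATERIALIZATION*/ regional_sales AS (
     SELECT store_region, SUM(sales_quantity) AS total_quantity
     FROM store_sales
     GROUP BY store_region)
   SELECT store_region, total_quantity
   FROM regional_sales
   ORDER BY total_quantity DESC;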

2.8.9 - GBYTYPE

Specifies which algorithm—GROUPBY HASH or GROUPBY PIPELINED—the Vertica query optimizer should use to implement a GROUP BY clause.

Specifies which algorithm—GROUPBY HASH or GROUPBY PIPELINED—the Vertica query optimizer should use to implement a GROUP BY clause. If both algorithms are valid for this query, the query optimizer chooses the specified algorithm over the algorithm that the query optimizer might otherwise choose in its query plan.

For more information about both algorithms, see GROUP BY implementation options.

Syntax

GROUP BY /*+GBYTYPE( HASH | PIPE )*/

Arguments

HASH
Use the GROUPBY HASH algorithm.
PIPE
Use the GROUPBY PIPELINED algorithm.

Examples

See Controlling GROUPBY Algorithm Choice.
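
A minimal sketch of the hint's placement (assuming a hypothetical sales table):

=> SELECT store_key, SUM(sales_quantity)
   FROM sales
   GROUP BY /*+GBYTYPE(PIPE)*/ store_key;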

2.8.10 - JFMT

Specifies how to size VARCHAR column data when joining tables on those columns, and buffer that data accordingly.

Specifies how to size VARCHAR column data when joining tables on those columns, and buffer that data accordingly. The JFMT hint overrides the default behavior that is set by configuration parameter JoinDefaultTupleFormat, which can be set at database and session levels.

For more information, see Joining variable length string data.

Syntax

JOIN /*+JFMT(format-type)*/

Arguments

format-type
Specifies how to format VARCHAR column data when joining tables on those columns, and buffers the data accordingly. Set to one of the following:
  • f (fixed): Use join column metadata to size column data to a fixed length, and buffer accordingly.

  • v (variable): Use the actual length of join column data, so buffer size varies for each join.

For example:

SELECT /*+SYNTACTIC_JOIN*/ s.store_region, SUM(e.vacation_days) TotalVacationDays
   FROM public.employee_dimension e
   JOIN /*+JFMT(f)*/ store.store_dimension s ON s.store_region=e.employee_region
   GROUP BY s.store_region ORDER BY TotalVacationDays;

Requirements

  • Queries that include the JFMT hint must also include the SYNTACTIC_JOIN hint. Otherwise, the optimizer ignores the JFMT hint and throws a warning.

  • Join syntax must conform with ANSI SQL-92 join conventions.

2.8.11 - JTYPE

Specifies the join algorithm as hash or merge.

Specifies the join algorithm as hash or merge.

Use the JTYPE hint to specify the algorithm the optimizer uses to join table data. If the specified algorithm is not feasible, the optimizer ignores the hint and throws a warning.

Syntax

JOIN /*+JTYPE(join-type)*/

Arguments

join-type
One of the following:
  • H: Hash join

  • M: Merge join, valid only if both join inputs are already sorted on the join columns, otherwise Vertica ignores it and throws a warning. The optimizer relies upon the query or DDL to verify whether input data is sorted, rather than the actual runtime order of the data.

  • FM: Forced merge join. Before performing the merge, the optimizer re-sorts the join inputs. Join columns must be of the same type and precision or scale, except that string columns can have different lengths.

A value of FM is valid only for simple join conditions. For example:

=> SELECT /*+SYNTACTIC_JOIN*/ * FROM x JOIN /*+JTYPE(FM)*/ y ON x.c1 = y.c1;

Requirements

  • Queries that include the JTYPE hint must also include the SYNTACTIC_JOIN hint. Otherwise, the optimizer ignores the JTYPE hint and throws a warning.

  • Join syntax must conform with ANSI SQL-92 join conventions.

2.8.12 - LABEL

Assigns a label to a statement so it can easily be identified to evaluate performance and debug problems.

Assigns a label to a statement so it can easily be identified to evaluate performance and debug problems.

LABEL hints are valid in DML and query statements such as SELECT, INSERT, UPDATE, DELETE, and MERGE.

Syntax

statement-name /*+LABEL (label-string)*/

Arguments

label-string
A string that is up to 128 octets long. If enclosed with single quotes, label-string can contain embedded spaces.

Examples

See Labeling statements.
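
A minimal sketch of a labeled query (the label string is arbitrary):

=> SELECT /*+LABEL ('my_query')*/ COUNT(*) FROM customer_dimension;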

2.8.13 - PROJS

Specifies one or more projections to use for a queried table.

Specifies one or more projections to use for a queried table.

The PROJS hint can specify multiple projections; the optimizer determines which ones are valid and uses the one that is most cost-effective for the queried table. If no hinted projection is valid, the query returns a warning and ignores projection hints.

Syntax

FROM table-name /*+PROJS( [[database.]schema.]projection[,...] )*/

Arguments

[database.]schema

Database and schema. The default schema is public. If you specify a database, it must be the current database.

projection
The projection to use. You can specify a list of comma-delimited projections.

Examples

The employee_dimension table has two projections: segmented superprojection public.employee_dimension, which includes all table columns; and the unsegmented projection public.employee_dimension_rep, which includes a subset of the columns:


=> SELECT export_objects('','employee_dimension');
                                                      export_objects
--------------------------------------------------------------------------------------------------------------------------

CREATE TABLE public.employee_dimension
(
    employee_key int NOT NULL,
    employee_gender varchar(8),
    courtesy_title varchar(8),
    employee_first_name varchar(64),
    employee_middle_initial varchar(8),
    employee_last_name varchar(64),
    employee_age int,
    hire_date date,
    employee_street_address varchar(256),
    employee_city varchar(64),
    employee_state char(2),
    employee_region char(32),
    job_title varchar(64),
    reports_to int,
    salaried_flag int,
    annual_salary int,
    hourly_rate float,
    vacation_days int,
    CONSTRAINT C_PRIMARY PRIMARY KEY (employee_key) DISABLED
);

CREATE PROJECTION public.employee_dimension
...
AS
 SELECT employee_dimension.employee_key,
        employee_dimension.employee_gender,
        employee_dimension.courtesy_title,
        employee_dimension.employee_first_name,
        employee_dimension.employee_middle_initial,
        employee_dimension.employee_last_name,
        employee_dimension.employee_age,
        employee_dimension.hire_date,
        employee_dimension.employee_street_address,
        employee_dimension.employee_city,
        employee_dimension.employee_state,
        employee_dimension.employee_region,
        employee_dimension.job_title,
        employee_dimension.reports_to,
        employee_dimension.salaried_flag,
        employee_dimension.annual_salary,
        employee_dimension.hourly_rate,
        employee_dimension.vacation_days
 FROM public.employee_dimension
 ORDER BY employee_dimension.employee_key
SEGMENTED BY hash(employee_dimension.employee_key) ALL NODES KSAFE 1;

CREATE PROJECTION public.employee_dimension_rep
...
AS
 SELECT employee_dimension.employee_key,
        employee_dimension.employee_gender,
        employee_dimension.employee_first_name,
        employee_dimension.employee_middle_initial,
        employee_dimension.employee_last_name,
        employee_dimension.employee_age,
        employee_dimension.employee_street_address,
        employee_dimension.employee_city,
        employee_dimension.employee_state,
        employee_dimension.employee_region
 FROM public.employee_dimension
 ORDER BY employee_dimension.employee_key
UNSEGMENTED ALL NODES;

SELECT MARK_DESIGN_KSAFE(1);

(1 row)

The following query selects all table columns from employee_dimension and includes the PROJS hint, which specifies both projections. public.employee_dimension_rep does not include all columns in the queried table, so the optimizer cannot use it. The segmented projection includes all table columns so the optimizer uses it, as verified by the following query plan:

=> EXPLAIN SELECT * FROM employee_dimension /*+PROJS('public.employee_dimension_rep', 'public.employee_dimension')*/;

QUERY PLAN DESCRIPTION:
------------------------------
EXPLAIN SELECT * FROM employee_dimension /*+PROJS('public.employee_dimension_rep', 'public.employee_dimension')*/;

 Access Path:
 +-STORAGE ACCESS for employee_dimension [Cost: 177, Rows: 10K (NO STATISTICS)] (PATH ID: 1)
 |  Projection: public.employee_dimension_b0

2.8.14 - SKIP_PROJS

Specifies which projections to avoid using for a queried table.

Specifies which projections to avoid using for a queried table. If SKIP_PROJS excludes all available projections that are valid for the query, the optimizer issues a warning and ignores the projection hints.

Syntax

FROM table-name /*+SKIP_PROJS( [[database.]schema.]projection[,...] )*/

Arguments

[database.]schema

Database and schema. The default schema is public. If you specify a database, it must be the current database.

projection
A projection to skip. You can specify a list of comma-delimited projections.

Examples

In this example, the EXPLAIN output shows that the optimizer uses the projection public.employee_dimension_b0 for a given query:


QUERY PLAN DESCRIPTION:
------------------------------

 EXPLAIN SELECT employee_last_name, employee_first_name, employee_city, job_title FROM employee_dimension;

 Access Path:
 +-STORAGE ACCESS for employee_dimension [Cost: 59, Rows: 10K (NO STATISTICS)] (PATH ID: 1)
 |  Projection: public.employee_dimension_b0
 |  Materialize: employee_dimension.employee_first_name, employee_dimension.employee_last_name, employee_dimension.employee_city, employee_dimension.job_title
 |  Execute on: All Nodes

You can use the SKIP_PROJS hint to avoid using this projection. If another projection is available that is valid for this query, the optimizer uses it instead:


QUERY PLAN DESCRIPTION:
------------------------------

EXPLAIN SELECT employee_last_name, employee_first_name, employee_city, job_title FROM employee_dimension /*+SKIP_PROJS('public.employee_dimension')*/;

 Access Path:
 +-STORAGE ACCESS for employee_dimension [Cost: 156, Rows: 10K (NO STATISTICS)] (PATH ID: 1)
 |  Projection: public.employee_dimension_super
 |  Materialize: employee_dimension.employee_first_name, employee_dimension.employee_last_name, employee_dimension.emplo
yee_city, employee_dimension.job_title
 |  Execute on: Query Initiator

2.8.15 - SKIP_STATISTICS

Directs the optimizer to produce a query plan that incorporates only the minimal statistics that are collected by ANALYZE_ROW_COUNT.

Directs the optimizer to produce a query plan that incorporates only the minimal statistics that are collected by ANALYZE_ROW_COUNT. The optimizer ignores other statistics that would otherwise be used, that are generated by ANALYZE_STATISTICS and ANALYZE_STATISTICS_PARTITION. This hint is especially useful when used in queries on small tables, where the amount of time required to collect full statistics is often greater than actual execution time.

Syntax

SELECT /*+SKIP_STAT[ISTIC]S*/

EXPLAIN output

EXPLAIN returns the following output for a query that includes SKIP_STATISTICS (using its shortened form SKIP_STATS):


=> EXPLAIN SELECT /*+ SKIP_STATS*/ customer_key, customer_name, customer_gender, customer_city||', '||customer_state, customer_age
    FROM customer_dimension WHERE customer_region = 'East' AND customer_age > 60;

 QUERY PLAN DESCRIPTION:
 ------------------------------

 EXPLAIN SELECT /*+ SKIP_STATS*/ customer_key, customer_name, customer_gender, customer_city||', '||customer_state,
 customer_age FROM customer_dimension WHERE customer_region = 'East' AND customer_age > 60;

 Access Path:
 +-STORAGE ACCESS for customer_dimension [Cost: 2K, Rows: 10K (STATISTICS SKIPPED)] (PATH ID: 1)
 |  Projection: public.customer_dimension_b0
 |  Materialize: public.customer_dimension.customer_age, public.customer_dimension.customer_key, public.customer_dimensi
on.customer_name, public.customer_dimension.customer_gender, public.customer_dimension.customer_city, public.customer_di
mension.customer_state
 |  Filter: (public.customer_dimension.customer_region = 'East')
 |  Filter: (public.customer_dimension.customer_age > 60)
 |  Execute on: All Nodes
...

2.8.16 - SYNTACTIC_JOIN

Enforces join order and enables other join hints.

Enforces join order and enables other join hints.

In order to achieve optimal performance, the optimizer often overrides a query's specified join order. By including the SYNTACTIC_JOIN hint, you can ensure that the optimizer enforces the query's join order exactly as specified. One requirement applies: the join syntax must conform with ANSI SQL-92 conventions.

The SYNTACTIC_JOIN hint must immediately follow SELECT. If the annotated query includes another hint that must also follow SELECT, such as VERBATIM, combine the two hints together. For example:

SELECT /*+ syntactic_join,verbatim*/

Syntax

SELECT /*+SYN[TACTIC]_JOIN*/

Examples

In the following examples, the optimizer produces different plans for two queries that differ only by including or excluding the SYNTACTIC_JOIN hint.

Excludes SYNTACTIC_JOIN:

EXPLAIN SELECT sales.store_key, stores.store_name, products.product_description, sales.sales_quantity, sales.sale_date
FROM (store.store_sales sales JOIN products ON sales.product_key=products.product_key)
JOIN store.store_dimension stores ON sales.store_key=stores.store_key
WHERE sales.sale_date='2014-12-01' order by sales.store_key, sales.sale_date;

 Access Path:
 +-SORT [Cost: 14K, Rows: 100K (NO STATISTICS)] (PATH ID: 1)
 |  Order: sales.store_key ASC, sales.sale_date ASC
 |  Execute on: All Nodes
 | +---> JOIN HASH [Cost: 11K, Rows: 100K (NO STATISTICS)] (PATH ID: 2) Outer (RESEGMENT)(LOCAL ROUND ROBIN) Inner (RESEGMENT)
 | |      Join Cond: (sales.product_key = products.product_key)
 | |      Materialize at Input: sales.store_key, sales.product_key, sales.sale_date, sales.sales_quantity
 | |      Execute on: All Nodes
 | | +-- Outer -> JOIN HASH [Cost: 1K, Rows: 100K (NO STATISTICS)] (PATH ID: 3)
 | | |      Join Cond: (sales.store_key = stores.store_key)
 | | |      Execute on: All Nodes
 | | | +-- Outer -> STORAGE ACCESS for sales [Cost: 1K, Rows: 100K (NO STATISTICS)] (PATH ID: 4)
 | | | |      Projection: store.store_sales_b0
 | | | |      Materialize: sales.store_key
 | | | |      Filter: (sales.sale_date = '2014-12-01'::date)
 | | | |      Execute on: All Nodes
 | | | |      Runtime Filter: (SIP1(HashJoin): sales.store_key)
 | | | +-- Inner -> STORAGE ACCESS for stores [Cost: 34, Rows: 250] (PATH ID: 5)
 | | | |      Projection: store.store_dimension_DBD_10_rep_VMartDesign_node0001
 | | | |      Materialize: stores.store_key, stores.store_name
 | | | |      Execute on: All Nodes
 | | +-- Inner -> STORAGE ACCESS for products [Cost: 3K, Rows: 60K (NO STATISTICS)] (PATH ID: 6)
 | | |      Projection: public.products_b0
 | | |      Materialize: products.product_key, products.product_description
 | | |      Execute on: All Nodes

Includes SYNTACTIC_JOIN:

EXPLAIN SELECT /*+SYNTACTIC_JOIN*/ sales.store_key, stores.store_name, products.product_description, sales.sales_quantity, sales.sale_date
FROM (store.store_sales sales JOIN products ON sales.product_key=products.product_key)
JOIN store.store_dimension stores ON sales.store_key=stores.store_key
WHERE sales.sale_date='2014-12-01' order by sales.store_key, sales.sale_date;

 Access Path:
 +-SORT [Cost: 11K, Rows: 100K (NO STATISTICS)] (PATH ID: 1)
 |  Order: sales.store_key ASC, sales.sale_date ASC
 |  Execute on: All Nodes
 | +---> JOIN HASH [Cost: 8K, Rows: 100K (NO STATISTICS)] (PATH ID: 2)
 | |      Join Cond: (sales.store_key = stores.store_key)
 | |      Execute on: All Nodes
 | | +-- Outer -> JOIN HASH [Cost: 7K, Rows: 100K (NO STATISTICS)] (PATH ID: 3) Outer (BROADCAST)(LOCAL ROUND ROBIN)
 | | |      Join Cond: (sales.product_key = products.product_key)
 | | |      Execute on: All Nodes
 | | |      Runtime Filter: (SIP1(HashJoin): sales.store_key)
 | | | +-- Outer -> STORAGE ACCESS for sales [Cost: 2K, Rows: 100K (NO STATISTICS)] (PATH ID: 4)
 | | | |      Projection: store.store_sales_b0
 | | | |      Materialize: sales.sale_date, sales.store_key, sales.product_key, sales.sales_quantity
 | | | |      Filter: (sales.sale_date = '2014-12-01'::date)
 | | | |      Execute on: All Nodes
 | | | +-- Inner -> STORAGE ACCESS for products [Cost: 3K, Rows: 60K (NO STATISTICS)] (PATH ID: 5)
 | | | |      Projection: public.products_b0
 | | | |      Materialize: products.product_key, products.product_description
 | | | |      Execute on: All Nodes
 | | +-- Inner -> STORAGE ACCESS for stores [Cost: 34, Rows: 250] (PATH ID: 6)
 | | |      Projection: store.store_dimension_DBD_10_rep_VMartDesign_node0001
 | | |      Materialize: stores.store_key, stores.store_name
 | | |      Execute on: All Nodes

2.8.17 - UTYPE

Specifies how to combine UNION ALL input.

Specifies how to combine UNION ALL input.

Syntax

UNION ALL /*+UTYPE(union-type)*/

Arguments

union-type
One of the following values:
  • U: Concatenates UNION ALL input (default).

  • M: Merges UNION ALL input in the same sort order as the source query results. This option requires all input from the source queries to use the same sort order; otherwise, Vertica throws a warning and concatenates the UNION ALL input.

Requirements

Queries that include the UTYPE hint must also include the SYNTACTIC_JOIN hint. Otherwise, the optimizer ignores the UTYPE hint and throws a warning.
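
Example

The following sketch shows UTYPE(M) combined with SYNTACTIC_JOIN, as required. The east_orders and west_orders tables are hypothetical, and their results are assumed to be produced in the same sort order so that the optimizer can merge rather than concatenate the UNION ALL input:

=> SELECT /*+SYNTACTIC_JOIN*/ order_id, order_date FROM east_orders
   UNION ALL /*+UTYPE(M)*/
   SELECT order_id, order_date FROM west_orders;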

2.8.18 - VERBATIM

Enforces execution of an annotated query exactly as written.

Enforces execution of an annotated query exactly as written.

VERBATIM directs the optimizer to create a query plan that incorporates all hints in an annotated query. Furthermore, it directs the optimizer not to apply its own plan development processing on query plan components that pertain to those hints.

Usage of this hint varies between optimizer-generated and custom directed queries, as described below.

Syntax

SELECT /*+ VERBATIM*/

Requirements

The VERBATIM hint must immediately follow SELECT. If the annotated query includes another hint that must also follow SELECT, such as SYNTACTIC_JOIN, combine the two hints together. For example:

SELECT /*+ syntactic_join,verbatim*/

Optimizer-generated directed queries

The optimizer always includes the VERBATIM hint in the annotated queries that it generates for directed queries. For example, given the following CREATE DIRECTED QUERY OPTIMIZER statement:


=> CREATE DIRECTED QUERY OPTIMIZER getStoreSales SELECT sales.store_key, stores.store_name, sales.product_description, sales.sales_quantity, sales.sale_date FROM store.storesales sales JOIN store.store_dimension stores ON sales.store_key=stores.store_key WHERE sales.sale_date='2014-12-01' /*+IGNORECONST(1)*/ AND stores.store_name='Store1' /*+IGNORECONST(2)*/ ORDER BY sales.store_key, sales.sale_date;
CREATE DIRECTED QUERY

The optimizer generates an annotated query that includes the VERBATIM hint:

=> SELECT query_name, annotated_query FROM V_CATALOG.DIRECTED_QUERIES WHERE query_name = 'getStoreSales';
-[ RECORD 1 ]---+------
query_name      | getStoreSales
annotated_query | SELECT /*+ syntactic_join,verbatim*/ sales.store_key AS store_key, stores.store_name AS store_name, sales.product_description AS product_description, sales.sales_quantity AS sales_quantity, sales.sale_date AS sale_date
FROM (store.storeSales AS sales/*+projs('store.storeSales')*/ JOIN /*+Distrib(L,L),JType(H)*/ store.store_dimension AS stores/*+projs('store.store_dimension_DBD_10_rep_VMartDesign')*/  ON (sales.store_key = stores.store_key))
WHERE (sales.sale_date = '2014-12-01'::date /*+IgnoreConst(1)*/) AND (stores.store_name = 'Store1'::varchar(6) /*+IgnoreConst(2)*/)
ORDER BY 1 ASC, 5 ASC

When the optimizer uses this directed query, it produces a query plan that is equivalent to the query plan that it used when it created the directed query:

=> ACTIVATE DIRECTED QUERY getStoreSales;
ACTIVATE DIRECTED QUERY

=> EXPLAIN SELECT sales.store_key, stores.store_name, sales.product_description, sales.sales_quantity, sales.sale_date FROM store.storesales sales JOIN store.store_dimension stores ON sales.store_key=stores.store_key WHERE sales.sale_date='2014-12-04' AND stores.store_name='Store14' ORDER BY sales.store_key, sales.sale_date;

 QUERY PLAN DESCRIPTION:
 ------------------------------

 EXPLAIN SELECT sales.store_key, stores.store_name, sales.product_description, sales.sales_quantity, sales.sale_date FROM store.storesales sales JOIN store.store_dimension stores ON sales.store_key=stores.store_key WHERE sales.sale_date='2014-12-04' AND stores.store_name='Store14' ORDER BY sales.store_key, sales.sale_date;

 The following active directed query(query name: getStoreSales) is being executed:
 SELECT /*+syntactic_join,verbatim*/ sales.store_key, stores.store_name, sales.product_description, sales.sales_quantity, sales.sale_date
FROM (store.storeSales sales/*+projs('store.storeSales')*/ JOIN /*+Distrib('L', 'L'), JType('H')*/store.store_dimension stores
/*+projs('store.store_dimension_DBD_10_rep_VMartDesign')*/ ON ((sales.store_key = stores.store_key))) WHERE ((sales.sale_date = '2014-12-04'::date)
AND (stores.store_name = 'Store14'::varchar(7))) ORDER BY sales.store_key, sales.sale_date

 Access Path:
 +-JOIN HASH [Cost: 463, Rows: 622 (NO STATISTICS)] (PATH ID: 2)
 |  Join Cond: (sales.store_key = stores.store_key)
 |  Materialize at Output: sales.sale_date, sales.sales_quantity, sales.product_description
 |  Execute on: All Nodes
 | +-- Outer -> STORAGE ACCESS for sales [Cost: 150, Rows: 155K (NO STATISTICS)] (PATH ID: 3)
 | |      Projection: store.storeSales_b0
 | |      Materialize: sales.store_key
 | |      Filter: (sales.sale_date = '2014-12-04'::date)
 | |      Execute on: All Nodes
 | |      Runtime Filter: (SIP1(HashJoin): sales.store_key)
 | +-- Inner -> STORAGE ACCESS for stores [Cost: 35, Rows: 2] (PATH ID: 4)
 | |      Projection: store.store_dimension_DBD_10_rep_VMartDesign_node0001
 | |      Materialize: stores.store_name, stores.store_key
 | |      Filter: (stores.store_name = 'Store14')
 | |      Execute on: All Nodes

Custom directed queries

The VERBATIM hint is included in a custom directed query only if you explicitly include it in the annotated query that you write for that directed query. When the optimizer uses that directed query, it respects the VERBATIM hint and creates a query plan accordingly.

If you omit the VERBATIM hint when you create a custom directed query, the hint is not stored with the annotated query. When the optimizer uses that directed query, it applies its own plan development processing on the annotated query before it generates a query plan. This query plan might not be equivalent to the query plan that the optimizer would have generated for the Vertica version in which the directed query was created.

2.9 - Window clauses

When used with an analytic function, window clauses specify how to partition and sort function input, as well as how to frame input with respect to the current row.

When used with an analytic function, window clauses specify how to partition and sort function input, as well as how to frame input with respect to the current row. The PARTITION ROW and PARTITION LEFT JOIN window clauses support single-row partitions for single-phase transform functions rather than analytic functions.

2.9.1 - Window partition clause

When specified, a window partition clause divides the rows of the function input based on user-provided expressions.

When specified, a window partition clause divides the rows of the function input based on user-provided expressions. If no expression is provided, the partition clause can improve query performance by using parallelism. If you do not specify a window partition clause, all input rows are treated as a single partition.

Window partitioning is similar to the GROUP BY clause. However, PARTITION BEST and PARTITION NODES may only be used with analytic functions and return only one result per input row, while PARTITION ROW and PARTITION LEFT JOIN can be used for single-phase transform functions and return multiple values per input row.

When used with analytic functions, results are computed per partition and start over again (reset) at the beginning of each subsequent partition.

Syntax

{ PARTITION BY expression[,...] 
  | PARTITION BEST 
  | PARTITION NODES 
  | PARTITION ROW 
  | PARTITION LEFT JOIN }

Arguments

PARTITION BY expression
Expression on which to partition input rows, where expression can be a column, constant, or an arbitrary expression formed on columns. Use PARTITION BY for functions with specific partitioning requirements.
PARTITION BEST
Use parallelism to improve performance for multi-threaded queries across multiple nodes.

OVER(PARTITION BEST) provides the best performance on multi-threaded queries across multiple nodes.

The following considerations apply to using PARTITION BEST:

  • Use PARTITION BEST for analytic functions that have no partitioning requirements and are thread safe—for example, a one-to-many transform.

  • Do not use PARTITION BEST on user-defined transform functions (UDTFs) that are not thread-safe. Doing so can produce an error or incorrect results. If a UDTF is not thread safe, use PARTITION NODES.

PARTITION NODES
Use parallelism to improve performance for single-threaded queries across multiple nodes.

OVER(PARTITION NODES) provides the best performance on single-threaded queries across multiple nodes.

PARTITION ROW, PARTITION LEFT JOIN
Use to feed input partitions of exactly one row. If used, any arbitrary expression may be used in the query target list alongside the UDTF. PARTITION LEFT JOIN returns a row of NULLs if an input row would otherwise produce no output.

May not be used for analytic functions or multi-phase transform functions. Note that only one PARTITION ROW transform function is allowed in the target list for each level of the query.

Examples

See Window partitioning.
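
For instance, the following sketch (using the store.store_sales table from the earlier join examples) computes a per-store total; SUM is evaluated separately within each store_key partition (output omitted):

=> SELECT store_key, sales_quantity,
          SUM(sales_quantity) OVER (PARTITION BY store_key) AS store_total
   FROM store.store_sales;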

2.9.2 - Window order clause

Specifies how to sort rows that are supplied to an analytic function.

Specifies how to sort rows that are supplied to an analytic function. If the OVER clause also includes a window partition clause, rows are sorted within each partition.

The window order clause only specifies order within a window result set. The query can have its own ORDER BY clause outside the OVER clause. This has precedence over the window order clause and orders the final result set.

A window order clause also creates a default window frame if none is explicitly specified.

Syntax

ORDER BY { expression [ ASC | DESC [ NULLS { FIRST | LAST | AUTO } ] ]
  }[,...]

Arguments

expression
A column, constant, or arbitrary expression formed on columns on which to sort input rows.
ASC | DESC
Sort order: ascending (default) or descending.
NULLS {FIRST | LAST | AUTO}
How to position nulls. NULLS AUTO means to choose the positioning that is most efficient for this query.

ASC defaults to NULLS LAST and DESC defaults to NULLS FIRST.

If you omit all sort qualifiers, Vertica uses ASC NULLS LAST.

For more information, see NULL sort order and Runtime sorting of NULL values in analytic functions.

Examples

See Window ordering.
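
For instance, the following sketch (again using the store.store_sales table from earlier examples) numbers each store's sales from newest to oldest; rows are sorted within each store_key partition by sale_date, with nulls placed last (output omitted):

=> SELECT store_key, sale_date,
          ROW_NUMBER() OVER (PARTITION BY store_key ORDER BY sale_date DESC NULLS LAST) AS sale_rank
   FROM store.store_sales;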

2.9.3 - Window frame clause

Specifies a window frame, which comprises a set of rows relative to the row that is currently being evaluated by an analytic function.

Specifies a window frame, which comprises a set of rows relative to the row that is currently being evaluated by an analytic function. After the function processes that row and its window, Vertica advances the current row and adjusts the window boundaries accordingly. If the OVER clause also specifies a partition, Vertica also checks that window boundaries do not cross partition boundaries. This process repeats until the function evaluates the last row of the last partition.

Syntax

{ ROWS | RANGE } { BETWEEN start-point AND end-point | start-point }

start-point | end-point:

  { UNBOUNDED {PRECEDING | FOLLOWING}
    | CURRENT ROW
    | constant-value  {PRECEDING | FOLLOWING}}

Arguments

ROWS | RANGE
Whether to interpret window frame dimensions as physical (ROWS) or logical (RANGE) offsets from the current row. See ROWS versus RANGE below for details.
BETWEEN start-point AND end-point
First and last rows of the window, where start-point and end-point can be one of the following:
  • UNBOUNDED {PRECEDING | FOLLOWING}: The current partition's first (PRECEDING) or last (FOLLOWING) row.

  • CURRENT ROW: The current row or value.

  • constant-value {PRECEDING | FOLLOWING}: A constant value or expression that evaluates to a constant value. This value is interpreted as either a physical or logical offset from the current row, depending on whether you use ROWS or RANGE. See ROWS versus RANGE for other restrictions.

start-point must resolve to a row or value that is less than or equal to end-point.

start-point
If ROWS or RANGE specifies only a start point, Vertica uses the current row as the end point and creates the window frame accordingly. In this case, start-point must resolve to a row that is less than or equal to the current row.

Requirements

In order to specify a window frame, the OVER clause must also specify a window order (ORDER BY) clause. If the OVER clause omits a window frame, the function creates a default window that extends from the current row to the first row in the current partition. This is equivalent to the following clause:

RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW

ROWS versus RANGE

The window frame's offset from the current row can be physical or logical:

  • ROWS (physical): the start and end points are relative to the current row. If either is a constant value, it must evaluate to a positive integer.

  • RANGE (logical): the start and end points represent a logical offset, such as time. The range value must match the window order (ORDER BY) clause data type: NUMERIC, DATE/TIME, FLOAT or INTEGER.

When setting constant values for ROWS, the constant must evaluate to a positive INTEGER.

When setting constant values for RANGE, the following requirements apply:

  • The constant must evaluate to a positive numeric value or INTERVAL literal.

  • If the constant evaluates to a NUMERIC value, the ORDER BY column type must be a NUMERIC data type.

  • If the constant evaluates to an INTERVAL DAY TO SECOND subtype, the ORDER BY column type must be one of the following: TIMESTAMP, TIME, DATE, or INTERVAL DAY TO SECOND.

  • If the constant evaluates to an INTERVAL YEAR TO MONTH, the ORDER BY column type must be one of the following: TIMESTAMP, DATE, or INTERVAL YEAR TO MONTH.

  • The window order clause can specify only one expression.

Examples

See Window framing.
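
For instance, the following sketch (using the store.store_sales table from earlier examples) computes a three-row moving average: for each row, the frame comprises the current row and the two physical rows that precede it in sale_date order (output omitted):

=> SELECT sale_date, sales_quantity,
          AVG(sales_quantity) OVER (ORDER BY sale_date
              ROWS BETWEEN 2 PRECEDING AND CURRENT ROW) AS moving_avg
   FROM store.store_sales;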

2.9.4 - Window name clause

Defines a named window that specifies window partition and order clauses for an analytic function.

Defines a named window that specifies window partition and order clauses for an analytic function. This window is specified in the function's OVER clause. Named windows can be useful when you write queries that invoke multiple analytic functions with similar OVER clauses, such as functions that use the same partition (PARTITION BY) clauses.

Syntax

WINDOW window-name AS ( window-partition-clause [window-order-clause] )

Arguments

WINDOW window-name
A window name that is unique within the same query.
window-partition-clause [window-order-clause]

Clauses to invoke when an OVER clause references this window.

If the window definition omits a window order clause, the OVER clause can specify its own order clause.

Requirements

  • A WINDOW clause cannot include a window frame clause.

  • Each WINDOW clause within the same query must have a unique name.

  • A WINDOW clause can reference another window that is already named. For example, the following query names window w1 before w2. Thus, the WINDOW clause that defines w2 can reference w1:

    => SELECT RANK() OVER(w1 ORDER BY sal DESC), RANK() OVER w2
       FROM EMP WINDOW w1 AS (PARTITION BY deptno), w2 AS (w1 ORDER BY sal);
    

Examples

See Named windows.

See also

Analytic functions

3 - Data types

The following table summarizes the internal data types that Vertica supports.

The following table summarizes the internal data types that Vertica supports. It also shows the default placement of null values in projections. The Size column lists uncompressed bytes.

Data Type Size / bytes Description NULL Sorting
Binary
BINARY 1 to 65,000 Fixed-length binary string NULLS LAST
VARBINARY (synonyms: BYTEA, RAW) 1 to 65,000 Variable-length binary string NULLS LAST
LONG VARBINARY 1 to 32,000,000 Long variable-length binary string NULLS LAST
Boolean
BOOLEAN 1 True or False or NULL NULLS LAST
Character / Long
CHAR 1 to 65,000 Fixed-length character string NULLS LAST
VARCHAR 1 to 65,000 Variable-length character string NULLS LAST
LONG VARCHAR 1 to 32,000,000 Long variable-length character string NULLS LAST
Date/Time
DATE 8 A month, day, and year NULLS FIRST
TIME 8 A time of day without timezone NULLS FIRST
TIME WITH TIMEZONE 8 A time of day with timezone NULLS FIRST
TIMESTAMP (synonyms: DATETIME, SMALLDATETIME) 8 A date and time without timezone NULLS FIRST
TIMESTAMP WITH TIMEZONE 8 A date and time with timezone NULLS FIRST
INTERVAL 8 The difference between two points in time NULLS FIRST
INTERVAL DAY TO SECOND 8 An interval measured in days and seconds NULLS FIRST
INTERVAL YEAR TO MONTH 8 An interval measured in years and months NULLS FIRST
Approximate Numeric
DOUBLE PRECISION 8 Signed 64-bit IEEE floating point number, requiring 8 bytes of storage NULLS LAST
FLOAT 8 Signed 64-bit IEEE floating point number, requiring 8 bytes of storage NULLS LAST
FLOAT(n) 8 Signed 64-bit IEEE floating point number, requiring 8 bytes of storage NULLS LAST
FLOAT8 8 Signed 64-bit IEEE floating point number, requiring 8 bytes of storage NULLS LAST
REAL 8 Signed 64-bit IEEE floating point number, requiring 8 bytes of storage NULLS LAST
Exact Numeric
INTEGER 8 Signed 64-bit integer, requiring 8 bytes of storage NULLS FIRST
INT 8 Signed 64-bit integer, requiring 8 bytes of storage NULLS FIRST
BIGINT 8 Signed 64-bit integer, requiring 8 bytes of storage NULLS FIRST
INT8 8 Signed 64-bit integer, requiring 8 bytes of storage NULLS FIRST
SMALLINT 8 Signed 64-bit integer, requiring 8 bytes of storage NULLS FIRST
TINYINT 8 Signed 64-bit integer, requiring 8 bytes of storage NULLS FIRST
DECIMAL 8+ 8 bytes for the first 18 digits of precision, plus 8 bytes for each additional 19 digits NULLS FIRST
NUMERIC 8+ 8 bytes for the first 18 digits of precision, plus 8 bytes for each additional 19 digits NULLS FIRST
NUMBER 8+ 8 bytes for the first 18 digits of precision, plus 8 bytes for each additional 19 digits NULLS FIRST
MONEY 8+ 8 bytes for the first 18 digits of precision, plus 8 bytes for each additional 19 digits NULLS FIRST
Spatial
GEOMETRY 1 to 10,000,000 Coordinates expressed as (x,y) pairs, defined in the Cartesian plane. NULLS LAST
GEOGRAPHY 1 to 10,000,000 Coordinates expressed in longitude/latitude angular values, measured in degrees NULLS LAST
UUID
UUID 16 Stores universally unique identifiers (UUIDs). NULLS FIRST
Complex
ARRAY 1 to 32,000,000 Collection of values of a primitive or complex type.

Native array: same as the element type

Non-native array: cannot be used to order projections

ROW 1 to 32,000,000 Structure of property-value pairs. Cannot be used to order projections
SET 1 to 32,000,000 Collection of unique values of a primitive type. Same as the primitive type

3.1 - Binary data types (BINARY and VARBINARY)

Store raw-byte data, such as IP addresses, up to 65000 bytes.

Store raw-byte data, such as IP addresses, up to 65000 bytes. The BINARY and BINARY VARYING (VARBINARY) data types are collectively referred to as binary string types and the values of binary string types are referred to as binary strings. A binary string is a sequence of octets or bytes.

BYTEA and RAW are synonyms for VARBINARY.

Syntax

BINARY ( length )
{ VARBINARY | BINARY VARYING | BYTEA | RAW } ( max-length )

Arguments

length, max-length
The length of the string or column width, in bytes (octets).

BINARY and VARBINARY data types

BINARY and VARBINARY data types have the following attributes:

  • BINARY: A fixed-width string of length bytes, where the number of bytes is declared as an optional specifier to the type. If length is omitted, the default is 1. Where necessary, values are right-extended to the full width of the column with the zero byte. For example:

    => SELECT TO_HEX('ab'::BINARY(4));
      to_hex
    ----------
     61620000
    
  • VARBINARY: A variable-width string up to a length of max-length bytes, where the maximum number of bytes is declared as an optional specifier to the type. The default is the default attribute size, which is 80, and the maximum length is 65000 bytes. VARBINARY values are not extended to the full width of the column. For example:

    => SELECT TO_HEX('ab'::VARBINARY(4));
     to_hex
    --------
     6162
    

Input formats

You can use several formats when working with binary values. The hexadecimal format is generally the most straightforward and is emphasized in Vertica documentation.

Binary values can also be represented in octal format by prefixing the value with a backslash '\'.

You can also input values represented by printable characters. For example, the hexadecimal value '0x61' can also be represented by the symbol a.

See Data load.

On input, strings are translated from:

  • Hexadecimal representation to binary with the HEX_TO_BINARY function.

  • Bitstring representation to binary with the BITSTRING_TO_BINARY function.

Both functions take a VARCHAR argument and return a VARBINARY value.

Output formats

Like the input format, the output format is a hybrid of octal codes and printable ASCII characters. A byte in the range of printable ASCII characters (the range [0x20, 0x7e]) is represented by the corresponding ASCII character, with the exception of the backslash ('\'), which is escaped as '\\'. All other byte values are represented by their corresponding octal values. For example, the bytes {97,92,98,99}, which in ASCII are {a,\,b,c}, are translated to text as 'a\\bc'.

Binary operators and functions

The binary operators &, ~, |, and # have special behavior for binary data types, as described in Bitwise operators.

The following aggregate functions are supported for binary data types: BIT_AND, BIT_OR, BIT_XOR, MAX, and MIN.

BIT_AND, BIT_OR, and BIT_XOR are bit-wise operations that are applied to each non-null value in a group, while MAX and MIN are byte-wise comparisons of binary values.

Like their binary operator counterparts, if the values in a group vary in length, the aggregate functions treat the values as though they are all equal in length by extending shorter values with zero bytes to the full width of the column. For example, given a group containing the values 'ff', null, and 'f', a binary aggregate ignores the null value and treats the value 'f' as 'f0'. Also, like their binary operator counterparts, these aggregate functions operate on VARBINARY types explicitly and operate on BINARY types implicitly through casts. See Data type coercion operators (CAST).

Binary versus character data types

The BINARY and VARBINARY binary types are similar to the CHAR and VARCHAR character data types, respectively. They differ as follows:

  • Binary data types contain byte strings (a sequence of octets or bytes).

  • Character data types contain character strings (text).

  • The lengths of binary data types are measured in bytes, while character data types are measured in characters.

Examples

The following example shows HEX_TO_BINARY and TO_HEX usage.

Table t and its projection are created with binary columns:

=> CREATE TABLE t (c BINARY(1));
=> CREATE PROJECTION t_p (c) AS SELECT c FROM t;

Insert minimum byte and maximum byte values:

=> INSERT INTO t values(HEX_TO_BINARY('0x00'));
=> INSERT INTO t values(HEX_TO_BINARY('0xFF'));

Binary values can then be formatted in hex on output using the TO_HEX function:

=> SELECT TO_HEX(c) FROM t;
 to_hex
 --------
  00
  ff
 (2 rows)

The BIT_AND, BIT_OR, and BIT_XOR functions are interesting when operating on a group of values. The example that follows uses table t with a single column of VARBINARY data type:

=> CREATE TABLE t ( c VARBINARY(2) );
=> INSERT INTO t values(HEX_TO_BINARY('0xFF00'));
=> INSERT INTO t values(HEX_TO_BINARY('0xFFFF'));
=> INSERT INTO t values(HEX_TO_BINARY('0xF00F'));

Query table t to see column c output:

=> SELECT TO_HEX(c) FROM t;
 TO_HEX
--------
 ff00
 ffff
 f00f
(3 rows)

Now issue the bitwise AND operation. Because these are aggregate functions, an implicit GROUP BY operation is performed on results using (ff00&(ffff)&f00f):

=> SELECT TO_HEX(BIT_AND(c)) FROM t;
 TO_HEX
--------
f000
(1 row)

Issue the bitwise OR operation on (ff00|(ffff)|f00f):

=> SELECT TO_HEX(BIT_OR(c)) FROM t;
 TO_HEX
--------
ffff
(1 row)

Issue the bitwise XOR operation on (ff00#(ffff)#f00f):

=> SELECT TO_HEX(BIT_XOR(c)) FROM t;
 TO_HEX
--------
f0f0
(1 row)

3.2 - Boolean data type

Vertica provides the standard SQL type BOOLEAN, which has two states: true and false.

Vertica provides the standard SQL type BOOLEAN, which has two states: true and false. The third state in SQL boolean logic is unknown, which is represented by the NULL value.

Syntax

BOOLEAN

Parameters

Valid literal data values for input are:

TRUE 't' 'true' 'y' 'yes' '1' 1
FALSE 'f' 'false' 'n' 'no' '0' 0

Notes

  • Do not confuse the BOOLEAN data type with Logical operators.

  • The keywords TRUE and FALSE are preferred and are SQL-compliant.

  • A Boolean value of NULL appears last (largest) in ascending order.

  • All other values must be enclosed in single quotes.

  • Boolean values are output using the letters t and f.
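
Example

The following illustrative casts show how accepted input literals map to Boolean output values ('yes' yields true, 'no' yields false, and NULL remains unknown):

=> SELECT 'yes'::BOOLEAN AS b1, 'no'::BOOLEAN AS b2, NULL::BOOLEAN AS b3;
 b1 | b2 | b3
----+----+----
 t  | f  |
(1 row)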


3.3 - Character data types (CHAR and VARCHAR)

Stores strings of letters, numbers, and symbols.

Stores strings of letters, numbers, and symbols. The CHARACTER (CHAR) and CHARACTER VARYING (VARCHAR) data types are collectively referred to as character string types, and the values of character string types are known as character strings.

Character data can be stored as fixed-length or variable-length strings. Fixed-length strings are right-extended with spaces on output; variable-length strings are not extended.

String literals in SQL statements must be enclosed in single quotes.

Syntax

{ CHAR | CHARACTER } [ (octet-length) ]
{ VARCHAR | CHARACTER VARYING } [ (octet-length) ]

Arguments

octet-length
Length of the string or column width, declared in bytes (octets).

This argument is optional.

CHAR versus VARCHAR data types

The following differences apply to CHAR and VARCHAR data:

  • CHAR is conceptually a fixed-length, blank-padded string. Trailing blanks (spaces) are removed on input and are restored on output. The default length is 1, and the maximum length is 65000 octets (bytes).

  • VARCHAR is a variable-length character data type. The default length is 80, and the maximum length is 65000 octets. For string values longer than 65000, use Long data types. Values can include trailing spaces.

Normally, you use VARCHAR for all string data. Use CHAR when you need fixed-width string output. For example, you can use CHAR columns for data to be transferred to a legacy system that requires fixed-width strings.

Setting maximum length

When you define character columns, specify the maximum size of any string to be stored in a column. For example, to store strings up to 24 octets in length, use one of the following definitions:

CHAR(24)    --- fixed-length
VARCHAR(24) --- variable-length

The maximum length parameter for VARCHAR and CHAR data types refers to the number of octets that can be stored in that field, not the number of characters (Unicode code points). When using multibyte UTF-8 characters, the fields must be sized to accommodate from 1 to 4 octets per character, depending on the data. If the data loaded into a VARCHAR or CHAR column exceeds the specified maximum size for that column, data is truncated on UTF-8 character boundaries to fit within the specified size. See COPY.
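
For instance, the following illustrative query contrasts the character count and octet count of a string that contains one multibyte UTF-8 character:

=> SELECT LENGTH('voilà') AS characters, OCTET_LENGTH('voilà') AS octets;
 characters | octets
------------+--------
          5 |      6
(1 row)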

Due to compression in Vertica, the cost of overestimating the length of these fields is incurred primarily at load time and during sorts.

NULL versus NUL

NULL and NUL differ as follows:

  • NUL represents a character whose ASCII/Unicode code is 0, sometimes qualified "ASCII NUL".

  • NULL means no value, and is true of a field (column) or constant, not of a character.

CHAR, LONG VARCHAR, and VARCHAR string data types accept ASCII NUL values.

In ascending sorts, NULL appears last (largest).

For additional information about NULL ordering, see NULL sort order.

The following example casts the input string containing NUL values to VARCHAR:

=> SELECT 'vert\0ica'::CHARACTER VARYING AS VARCHAR;
 VARCHAR
---------
 vert\0ica
(1 row)

The result contains 9 characters:

=> SELECT LENGTH('vert\0ica'::CHARACTER VARYING);
 length
--------
      9
(1 row)

If you use an extended string literal, the length is 8 characters:

=> SELECT E'vert\0ica'::CHARACTER VARYING AS VARCHAR;
 VARCHAR
---------
 vertica
(1 row)
=> SELECT LENGTH(E'vert\0ica'::CHARACTER VARYING);
 LENGTH
--------
      8
(1 row)

3.4 - Date/time data types

Vertica supports the full set of SQL date and time data types.

Vertica supports the full set of SQL date and time data types.

The following rules apply to all date/time data types:

  • All have a size of 8 bytes.

  • A date/time value of NULL is smallest relative to all other date/time values.

  • Vertica uses Julian dates for all date/time calculations, which can correctly predict and calculate any date more recent than 4713 BC to far into the future, based on the assumption that the average length of the year is 365.2425 days.

  • All the date/time data types accept the special literal value NOW to specify the current date and time. For example:

    => SELECT TIMESTAMP 'NOW';
             ?column?
    ---------------------------
     2020-09-23 08:23:50.42325
    (1 row)
    
  • By default, Vertica rounds with a maximum precision of six decimal places. You can substitute an integer between 0 and 6 for p to specify your preferred level of precision.

The following table lists specific attributes of date/time data types:

Name Description Low Value High Value Resolution
DATE Dates only (no time of day) ~ 25e+15 BC ~ 25e+15 AD 1 day
TIME [(p)] Time of day only (no date) 00:00:00.00 23:59:60.999999 1 μs
TIMETZ [(p)] Time of day only, with time zone 00:00:00.00+14 23:59:59.999999-14 1 μs
TIMESTAMP [(p)] Both date and time, without time zone 290279-12-22 19:59:05.224194 BC 294277-01-09 04:00:54.775806 AD 1 μs
TIMESTAMPTZ [(p)]* Both date and time, with time zone 290279-12-22 19:59:05.224194 BC UTC 294277-01-09 04:00:54.775806 AD UTC 1 μs
INTERVAL DAY TO SECOND [(p)] Time intervals -106751991 days 04:00:54.775807 +106751991 days 04:00:54.775807 1 μs
INTERVAL YEAR TO MONTH Time intervals ~ -768e15 yrs ~ 768e15 yrs 1 month

Time zone abbreviations for input

Vertica recognizes the files in /opt/vertica/share/timezonesets as date/time input values and defines the default list of strings accepted in the AT TIME ZONE zone parameter. The names are not necessarily used for date/time output—output is driven by the official time zone abbreviations associated with the currently selected time zone parameter setting.

3.4.1 - DATE

Consists of a month, day, and year.

Consists of a month, day, and year. The following limits apply:

  • Lowest value: ~ 25e+15 BC

  • Highest value: ~ 25e+15 AD

  • Resolution: 1 DAY

See SET DATESTYLE for information about ordering.

Syntax

DATE

Examples

Example Description
January 8, 1999 Unambiguous in any datestyle input mode
1999-01-08 ISO 8601; January 8 in any mode (recommended format)
1/8/1999 January 8 in MDY mode; August 1 in DMY mode
1/18/1999 January 18 in MDY mode; rejected in other modes
01/02/03 January 2, 2003 in MDY mode
February 1, 2003 in DMY mode
February 3, 2001 in YMD mode
1999-Jan-08 January 8 in any mode
Jan-08-1999 January 8 in any mode
08-Jan-1999 January 8 in any mode
99-Jan-08 January 8 in YMD mode, else error
08-Jan-99 January 8, except error in YMD mode
Jan-08-99 January 8, except error in YMD mode
19990108 ISO 8601; January 8, 1999 in any mode
990108 ISO 8601; January 8, 1999 in any mode
1999.008 Year and day of year
J2451187 Julian day
January 8, 99 BC Year 99 before the Common Era

3.4.2 - DATETIME

DATETIME is an alias for TIMESTAMP.

DATETIME is an alias for TIMESTAMP/TIMESTAMPTZ.

3.4.3 - INTERVAL

Measures the difference between two points in time.

Measures the difference between two points in time. Intervals can be positive or negative. The INTERVAL data type is SQL:2008 compliant, and supports interval qualifiers that are divided into two major subtypes:

  • Year-month: Span of years and months

  • Day-time: Span of days, hours, minutes, seconds, and fractional seconds

Intervals are represented internally as some number of microseconds and printed as up to 60 seconds, 60 minutes, 24 hours, 30 days, 12 months, and as many years as necessary. You can control the output format of interval units with SET INTERVALSTYLE and SET DATESTYLE.

Syntax

INTERVAL 'interval-literal' [ interval-qualifier ] [ (p) ]

Parameters

interval-literal
A character string that expresses an interval, conforming to this format:
[-] { quantity subtype-unit }[...] [ AGO ]

For details, see Interval literal.

interval-qualifier
Optionally specifies how to interpret and format an interval literal for output, and, optionally, sets precision. If omitted, the default is DAY TO SECOND(6). For details, see Interval qualifier.
p
Specifies precision of the seconds field, where p is an integer between 0 and 6. For details, see Specifying interval precision.

Default: 6

Limits

Name Low Value High Value Resolution
INTERVAL DAY TO SECOND [(p)] -106751991 days 04:00:54.775807 +106751991 days 04:00:54.775807 1 microsecond
INTERVAL YEAR TO MONTH ~ -768e15 yrs ~ 768e15 yrs 1 month
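
Examples

The following illustrative statements show a day-time literal and a year-month literal, assuming the default INTERVALSTYLE (PLAIN):

=> SELECT INTERVAL '1 day 6 hours';
 ?column?
----------
 1 06:00
(1 row)

=> SELECT INTERVAL '2 years 3 months' YEAR TO MONTH;
 ?column?
----------
 2-3
(1 row)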

3.4.3.1 - Setting interval unit display

SET INTERVALSTYLE and SET DATESTYLE control the output format of interval units.

SET INTERVALSTYLE and SET DATESTYLE control the output format of interval units.

Omitting interval units

To omit interval units from the output, set INTERVALSTYLE to PLAIN. This is the default setting, which conforms with the SQL:2008 standard:

=> SET INTERVALSTYLE TO PLAIN;
SET
=> SELECT INTERVAL '3 2';
?column?
----------
 3 02:00

When INTERVALSTYLE is set to PLAIN, units are omitted from the output, even if the query specifies input units:

=> SELECT INTERVAL '3 days 2 hours';
 ?column?
----------
 3 02:00

If DATESTYLE is set to SQL, Vertica conforms with SQL:2008 standard and always omits interval units from output:


=> SET DATESTYLE TO SQL;
SET
=> SET INTERVALSTYLE TO UNITS;
SET
=> SELECT INTERVAL '3 2';
 ?column?
----------
 3 02:00

Displaying interval units

To enable display of interval units, DATESTYLE must be set to ISO. You can then display interval units by setting INTERVALSTYLE to UNITS:


=> SET DATESTYLE TO ISO;
SET
=> SET INTERVALSTYLE TO UNITS;
SET
=> SELECT INTERVAL '3 2';
 ?column?
----------------
 3 days 2 hours

Checking INTERVALSTYLE and DATESTYLE settings

Use SHOW statements to check INTERVALSTYLE and DATESTYLE settings:

=> SHOW INTERVALSTYLE;
     name      | setting
---------------+---------
 intervalstyle | units
=> SHOW DATESTYLE;
   name    | setting
-----------+----------
 datestyle | ISO, MDY

3.4.3.2 - Specifying interval input

Interval values are expressed through interval literals.

Interval values are expressed through interval literals. An interval literal is composed of one or more interval fields, where each field represents a span of days and time, or years and months, as follows:

[-] { quantity subtype-unit }[...] [AGO]

Using subtype units

Subtype units are optional for day-time intervals; they must be specified for year-month intervals.

For example, the first statement below implicitly specifies days and time; the second statement explicitly identifies day and time units. Both statements return the same result:


=> SET INTERVALSTYLE TO UNITS;
=> SELECT INTERVAL '1 12:59:10:05';
      ?column?
--------------------
 1 day 12:59:10.005
(1 row)

=> SELECT INTERVAL '1 day 12 hours 59 min 10 sec 5 milliseconds';
      ?column?
--------------------
 1 day 12:59:10.005
(1 row)

The following two statements add 28 days and 4 weeks to the current date, respectively. The intervals in both cases are equal and the statements return the same result. However, in the first statement, the interval literal omits the subtype (implicitly days); in the second statement, the interval literal must include the subtype unit weeks:

=> SELECT CURRENT_DATE;
  ?column?
------------
 2016-08-15
(1 row)

=> SELECT CURRENT_DATE + INTERVAL '28';
      ?column?
---------------------
 2016-09-12 00:00:00
(1 row)

=> SELECT CURRENT_DATE + INTERVAL '4 weeks';
      ?column?
---------------------
 2016-09-12 00:00:00
(1 row)

An interval literal can include day-time and year-month fields. For example, the following statement adds an interval of 4 years, 4 weeks, 4 days and 14 hours to the current date. The years and weeks fields must include subtype units; the days and hours fields omit them:

=> SELECT CURRENT_DATE + INTERVAL '4 years 4 weeks 4 14';
      ?column?
---------------------
 2020-09-15 14:00:00
(1 row)

Omitting subtype units

You can specify quantities of days, hours, minutes, and seconds without specifying units. Vertica recognizes colons in interval literals as part of the timestamp:

=> SELECT INTERVAL '1 4 5 6';
 ?column?
------------
 1 day 04:05:06
=> SELECT INTERVAL '1 4:5:6';
 ?column?
------------
 1 day 04:05:06
=> SELECT INTERVAL '1 day 4 hour 5 min 6 sec';
 ?column?
------------
 1 day 04:05:06

If Vertica cannot determine the units, it applies the quantity to any missing units based on the interval qualifier. In the next two examples, Vertica uses the default interval qualifier (DAY TO SECOND(6)) and assigns the trailing 1 to days, since it has already processed hours, minutes, and seconds in the output:

=> SELECT INTERVAL '4:5:6 1';
 ?column?
------------
 1 day 04:05:06
=> SELECT INTERVAL '1 4:5:6';
 ?column?
------------
 1 day 04:05:06

In the next two examples, Vertica recognizes 4:5 as hours:minutes. The remaining values in the interval literal are assigned to the missing units: 1 is assigned to days and 2 is assigned to seconds.

=> SELECT INTERVAL '4:5 1 2';
  ?column?
------------
 1 day 04:05:02
=> SELECT INTERVAL '1 4:5 2';
 ?column?
------------
 1 day 04:05:02

Specifying the interval qualifier can change how Vertica interprets 4:5:

=> SELECT INTERVAL '4:5' MINUTE TO SECOND;
  ?column?
------------
 00:04:05

3.4.3.3 - Controlling interval format

Interval qualifiers specify a range of options that Vertica uses to interpret and format an interval literal.

Interval qualifiers specify a range of options that Vertica uses to interpret and format an interval literal. The interval qualifier can also specify precision. Each interval qualifier is composed of one or two units:

unit[p] [ TO unit[p] ]

where:

  • unit specifies a day-time or year-month subtype.

  • p specifies precision, an integer between 0 and 6. In general, precision only applies to SECOND units. The default precision for SECOND is 6. For details, see Specifying interval precision.

If an interval omits an interval qualifier, Vertica uses the default DAY TO SECOND(6).

Interval qualifier categories

Interval qualifiers belong to one of the following categories:

  • Year-month: Span of years and months

  • Day-time: Span of days, hours, minutes, seconds, and fractional seconds

Year-Month

Vertica supports two year-month subtypes: YEAR and MONTH.

In the following example, YEAR TO MONTH qualifies the interval literal 1 2 to indicate a span of 1 year and two months:

=> SELECT interval '1 2' YEAR TO MONTH;
 ?column?
----------
 1-2
(1 row)

If you omit the qualifier, Vertica uses the default interval qualifier DAY TO SECOND and returns a different result:

=> SELECT interval '1 2';
 ?column?
----------
 1 02:00
(1 row)

The following example uses the interval qualifier YEAR. In this case, Vertica extracts only the year from the interval literal 1y 10m :

=> SELECT INTERVAL '1y 10m' YEAR;
 ?column?
----------
 1
(1 row)

In the next example, the interval qualifier MONTH converts the same interval literal to months:

=> SELECT INTERVAL '1y 10m' MONTH;
 ?column?
----------
 22
(1 row)

Day-time

Vertica supports four day-time subtypes: DAY, HOUR, MINUTE, and SECOND.

In the following example, the interval qualifier DAY TO SECOND(4) qualifies the interval literal 1h 3m 6s 5msecs 57us. The qualifier also sets precision on seconds to 4:

=> SELECT INTERVAL '1h 3m 6s 5msecs 57us' DAY TO SECOND(4);
   ?column?
---------------
 01:03:06.0051
(1 row)

If no interval qualifier is specified, Vertica uses the default subtype DAY TO SECOND(6), regardless of how you specify the interval literal. For example, as an extension to SQL:2008, both of the following commands return 910 days:

=> SELECT INTERVAL '2-6';
    ?column?
-----------------
 910
=> SELECT INTERVAL '2 years 6 months';
 ?column?
-----------------
 910

An interval qualifier can extract other values from the input parameters. For example, the following command extracts the HOUR value from the interval literal 3 days 2 hours:

=> SELECT INTERVAL '3 days 2 hours' HOUR;
 ?column?
----------
 74

The primary day/time (DAY TO SECOND) and year/month (YEAR TO MONTH) subtype ranges can be restricted to more specific range of types by an interval qualifier. For example, HOUR TO MINUTE is a limited form of day/time interval, which can be used to express time zone offsets.

=> SELECT INTERVAL '1 3' HOUR to MINUTE;
 ?column?
---------------
 01:03

hh:mm:ss and hh:mm formats are used only when at least two of the fields specified in the interval qualifier are non-zero and there are no more than 23 hours or 59 minutes:

=> SELECT INTERVAL '2 days 12 hours 15 mins' DAY TO MINUTE;
  ?column?
--------------
 2 12:15
=> SELECT INTERVAL '15 mins 20 sec' MINUTE TO SECOND;
 ?column?
----------
 15:20
=> SELECT INTERVAL '1 hour 15 mins 20 sec' MINUTE TO SECOND;
 ?column?
-----------------
 75:20

3.4.3.4 - Specifying interval precision

In general, interval precision only applies to seconds.

In general, interval precision only applies to seconds. If no precision is explicitly specified, Vertica rounds precision to a maximum of six decimal places. For example:

=> SELECT INTERVAL '2 hours 4 minutes 3.709384766 seconds' DAY TO SECOND;
    ?column?
-----------------
 02:04:03.709385
(1 row)

Vertica lets you specify interval precision in two ways:

  • After the INTERVAL keyword

  • After the SECOND unit of an interval qualifier, one of the following:

    • DAY TO SECOND

    • HOUR TO SECOND

    • MINUTE TO SECOND

    • SECOND

For example, the following statements use both methods to set precision, and return identical results:

=> SELECT INTERVAL(4) '2 hours 4 minutes 3.709384766 seconds' DAY TO SECOND;
   ?column?
---------------
 02:04:03.7094
(1 row)

=> SELECT INTERVAL '2 hours 4 minutes 3.709384766 seconds' DAY TO SECOND(4);
   ?column?
---------------
 02:04:03.7094
(1 row)

If the same statement specifies precision more than once, Vertica uses the lesser precision. For example, the following statement specifies precision twice: the INTERVAL keyword specifies precision of 1, while the interval qualifier SECOND specifies precision of 2. Vertica uses the lesser precision of 1:

=> SELECT INTERVAL(1) '1.2467' SECOND(2);
 ?column?
----------
 1.2 secs

Setting precision on interval table columns

If you create a table with an interval column, the following restrictions apply to the column definition:

  • You can set precision on the INTERVAL keyword only if you omit specifying an interval qualifier. If you try to set precision on the INTERVAL keyword and include an interval qualifier, Vertica returns an error.

  • You can set precision only on the last unit of an interval qualifier. For example:

    CREATE TABLE public.testint2
    (
        i INTERVAL HOUR TO SECOND(3)
    );
    

    If you specify precision on another unit, Vertica discards it when it saves the table definition.

3.4.3.5 - Fractional seconds in interval units

Vertica supports intervals in milliseconds (hh:mm:ss:ms), where 01:02:03:25 represents 1 hour, 2 minutes, 3 seconds, and 025 milliseconds.

Vertica supports intervals in milliseconds (hh:mm:ss:ms), where 01:02:03:25 represents 1 hour, 2 minutes, 3 seconds, and 025 milliseconds. Milliseconds are converted to fractional seconds as in the following example, which returns 1 day, 2 hours, 3 minutes, 4 seconds, and 25.5 milliseconds:

=> SELECT INTERVAL '1 02:03:04:25.5';
  ?column?
------------
 1 day 02:03:04.0255

Vertica allows fractional minutes. The fractional minutes are rounded into seconds:

=> SELECT INTERVAL '10.5 minutes';
  ?column?
------------
 00:10:30
=> select interval '10.659 minutes';
  ?column?
-------------
 00:10:39.54
=> select interval '10.3333333333333 minutes';
 ?column?
----------
 00:10:20

Considerations

  • An INTERVAL can include only the subset of units that you need; however, year/month intervals represent calendar years and months with no fixed number of days, so year/month interval values cannot include days, hours, or minutes. When year/month values are specified for day/time intervals, the intervals extension assumes 30 days per month and 365 days per year. Since the length of a given month or year varies, day/time intervals are never output as months or years, only as days, hours, minutes, and so on.

  • Day/time and year/month intervals are logically independent and cannot be combined with or compared to each other. In the following example, an interval-literal that contains DAYS cannot be combined with the YEAR TO MONTH type:

    => SELECT INTERVAL '1 2 3' YEAR TO MONTH;
    ERROR 3679:  Invalid input syntax for interval year to month: "1 2 3"
    
  • Vertica accepts intervals up to 2^63 – 1 microseconds or months (about 18 digits).

  • INTERVAL YEAR TO MONTH can be used in an analytic RANGE window when the ORDER BY column type is TIMESTAMP/TIMESTAMP WITH TIMEZONE, or DATE. TIME/TIME WITH TIMEZONE is not supported.

  • You can use INTERVAL DAY TO SECOND when the ORDER BY column type is TIMESTAMP/TIMESTAMP WITH TIMEZONE, DATE, and TIME/TIME WITH TIMEZONE.

Examples

Examples in this section assume that INTERVALSTYLE is set to PLAIN, so results omit subtype units. Interval values that omit an interval qualifier default to DAY TO SECOND(6).

=> SELECT INTERVAL '00:2500:00';
 ?column?
----------
 1 17:40
(1 row)

=> SELECT INTERVAL '2500' MINUTE TO SECOND;
 ?column?
----------
 2500
(1 row)

=> SELECT INTERVAL '2500' MINUTE;
 ?column?
----------
 2500
(1 row)

=> SELECT INTERVAL '28 days 3 hours' HOUR TO SECOND;
 ?column?
----------
 675:00
(1 row)

=> SELECT INTERVAL(3) '28 days 3 hours';
 ?column?
----------
 28 03:00
(1 row)

=> SELECT INTERVAL(3) '28 days 3 hours 1.234567';
    ?column?
-----------------
 28 03:01:14.074
(1 row)

=> SELECT INTERVAL(3) '28 days 3 hours 1.234567 sec';
    ?column?
-----------------
 28 03:00:01.235
(1 row)

=> SELECT INTERVAL(3) '28 days 3.3 hours' HOUR TO SECOND;
 ?column?
----------
 675:18
(1 row)

=> SELECT INTERVAL(3) '28 days 3.35 hours' HOUR TO SECOND;
 ?column?
----------
 675:21
(1 row)

=> SELECT INTERVAL(3) '28 days 3.37 hours' HOUR TO SECOND;
 ?column?
-----------
 675:22:12
(1 row)

=> SELECT INTERVAL '1.234567 days' HOUR TO SECOND;
   ?column?
---------------
 29:37:46.5888
(1 row)

=> SELECT INTERVAL '1.23456789 days' HOUR TO SECOND;
    ?column?
-----------------
 29:37:46.665696
(1 row)

=> SELECT INTERVAL(3) '1.23456789 days' HOUR TO SECOND;
   ?column?
--------------
 29:37:46.666
(1 row)

=> SELECT INTERVAL(3) '1.23456789 days' HOUR TO SECOND(2);
  ?column?
-------------
 29:37:46.67
(1 row)

=> SELECT INTERVAL(3) '01:00:01.234567' as "one hour+";
  one hour+
--------------
 01:00:01.235
(1 row)

=> SELECT INTERVAL(3) '01:00:01.234567' = INTERVAL(3) '01:00:01.234567';
 ?column?
----------
 t
(1 row)

=> SELECT INTERVAL(3) '01:00:01.234567' = INTERVAL '01:00:01.234567';
 ?column?
----------
 f
(1 row)

=> SELECT INTERVAL(3) '01:00:01.234567' = INTERVAL '01:00:01.234567'
HOUR TO SECOND(3);
 ?column?
----------
 t
(1 row)

=> SELECT INTERVAL(3) '01:00:01.234567' = INTERVAL '01:00:01.234567'
MINUTE TO SECOND(3);
 ?column?
----------
 t
(1 row)

=> SELECT INTERVAL '255 1.1111' MINUTE TO SECOND(3);
  ?column?
------------
 255:01.111
(1 row)

=> SELECT INTERVAL '@ - 5 ago';
 ?column?
----------
 5
(1 row)

=> SELECT INTERVAL '@ - 5 minutes ago';
 ?column?
----------
 00:05
(1 row)

=> SELECT INTERVAL '@ 5 minutes ago';
 ?column?
----------
 -00:05
(1 row)

=> SELECT INTERVAL '@ ago -5 minutes';
 ?column?
----------
 00:05
(1 row)

=> SELECT DATE_PART('month', INTERVAL '2-3' YEAR TO MONTH);
 DATE_PART
-----------
         3
(1 row)

=> SELECT FLOOR((TIMESTAMP '2005-01-17 10:00'
  - TIMESTAMP '2005-01-01')
/ INTERVAL '7');
 FLOOR
-------
     2
(1 row)

3.4.3.6 - Processing signed intervals

In the SQL:2008 standard, a minus sign before an interval-literal or as the first character of the interval-literal negates the entire literal, not just the first component.

In the SQL:2008 standard, a minus sign before an interval-literal or as the first character of the interval-literal negates the entire literal, not just the first component. In Vertica, a leading minus sign negates the entire interval, not just the first component. The following commands both return the same value:

=> SELECT INTERVAL '-1 month - 1 second';
 ?column?
----------
 -29 days 23:59:59

=> SELECT INTERVAL -'1 month - 1 second';
 ?column?
----------
 -29 days 23:59:59

Use one of the following commands instead to return the intended result:

=> SELECT INTERVAL -'1 month 1 second';
 ?column?
----------
 -30 days 1 sec
=> SELECT INTERVAL -'30 00:00:01';
 ?column?
----------
 -30 days 1 sec

Two negatives together return a positive:

=> SELECT INTERVAL -'-1 month - 1 second';
 ?column?
----------
 29 days 23:59:59
=> SELECT INTERVAL -'-1 month 1 second';
 ?column?
----------
 30 days 1 sec

You can use the year-month syntax with no spaces. Vertica allows the input of negative months but requires two negatives when paired with years.

=> SELECT INTERVAL '3-3' YEAR TO MONTH;
 ?column?
----------
 3 years 3 months
=> SELECT INTERVAL '3--3' YEAR TO MONTH;
 ?column?
----------
 2 years 9 months

When the interval-literal looks like a year/month type, but the type is day/second, or vice versa, Vertica reads the interval-literal from left to right, where number-number is years-months, and number <space> <signed number> is whatever the units specify. Vertica processes the following command as (–) 1 year 1 month = (–) 365 + 30 = –395 days:

=> SELECT INTERVAL '-1-1' DAY TO HOUR;
 ?column?
----------
 -395 days

If you insert a space in the interval-literal, Vertica processes it based on the subtype DAY TO HOUR: (–) 1 day – 1 hour = (–) 24 – 1 = –23 hours:

=> SELECT INTERVAL '-1 -1' DAY TO HOUR;
 ?column?
----------
 -23 hours

Two negatives together returns a positive, so Vertica processes the following command as (–) 1 year – 1 month = (–) 365 – 30 = –335 days:

=> SELECT INTERVAL '-1--1' DAY TO HOUR;
  ?column?
----------
 -335 days

If you omit the value after the hyphen, Vertica assumes 0 months and processes the following command as 1 year 0 months –1 day = 365 + 0 – 1 = 364 days:

=> SELECT INTERVAL '1- -1' DAY TO HOUR;
  ?column?
----------
 364 days

3.4.3.7 - Casting with intervals

You can use CAST to convert strings to intervals, and vice versa.

You can use CAST to convert strings to intervals, and vice versa.

String to interval

You cast a string to an interval as follows:

CAST( [ INTERVAL[(p)] ] [-] ] interval-literal AS INTERVAL[(p)] interval-qualifier )

For example:

=> SELECT CAST('3700 sec' AS INTERVAL);
 ?column?
----------
 01:01:40

You can cast intervals within day-time or the year-month subtypes but not between them:

=> SELECT CAST(INTERVAL '4440' MINUTE as INTERVAL);
  ?column?
----------
 3 days 2 hours
=> SELECT CAST(INTERVAL -'01:15' as INTERVAL MINUTE);
 ?column?
----------
 -75 mins

Interval to string

You cast an interval to a string as follows:

CAST( (SELECT interval ) AS VARCHAR[(n)] )

For example:

=> SELECT CONCAT(
  'Tomorrow at this time: ',
  CAST((SELECT INTERVAL '24 hours') + CURRENT_TIMESTAMP(0) AS VARCHAR));
                    CONCAT
-----------------------------------------------
 Tomorrow at this time: 2016-08-17 08:41:23-04
(1 row)

3.4.3.8 - Operations with intervals

If you divide an interval by an interval, you get a FLOAT:.

If you divide an interval by an interval, you get a FLOAT:

=> SELECT INTERVAL '28 days 3 hours' HOUR(4) / INTERVAL '27 days 3 hours' HOUR(4);
 ?column?
------------
 1.036866359447

An INTERVAL divided by FLOAT returns an INTERVAL:

=> SELECT INTERVAL '3' MINUTE / 1.5;
 ?column?
------------
 2 mins

INTERVAL MODULO (remainder) INTERVAL returns an INTERVAL:

=> SELECT INTERVAL '28 days 3 hours' HOUR % INTERVAL '27 days 3 hours' HOUR;
  ?column?
------------
 24 hours

If you add INTERVAL and TIME, the result is TIME, modulo 24 hours:

=> SELECT INTERVAL '1' HOUR + TIME '1:30';
 ?column?
------------
 02:30:00

3.4.4 - SMALLDATETIME

SMALLDATETIME is an alias for TIMESTAMP.

SMALLDATETIME is an alias for TIMESTAMP/TIMESTAMPTZ.

3.4.5 - TIME/TIMETZ

Stores the specified time of day.

Stores the specified time of day. TIMETZ is the same as TIME WITH TIME ZONE: both data types store the UTC offset of the specified time.

Syntax

TIME [ (p) ] [ { WITHOUT | WITH } TIME ZONE ] 'input-string' [ AT TIME ZONE zone ]

Parameters

p
Optional precision value that specifies the number of fractional digits retained in the seconds field, an integer value between 0 and 6. If you omit specifying precision, Vertica returns up to 6 fractional digits.
WITHOUT TIME ZONE
Ignore any time zone in the input string and use a value without a time zone (default).
WITH TIME ZONE
Convert the time to UTC. If the input string includes a time zone, use its UTC offset for the conversion. If the input string omits a time zone, Vertica uses the UTC offset of the time zone that is configured for your system.
input-string
See Input String below.
AT TIME ZONE zone
See TIME AT TIME ZONE and TIMESTAMP AT TIME ZONE.

TIMETZ vs. TIMESTAMPTZ

TIMETZ and TIMESTAMPTZ are not parallel SQL constructs. TIMESTAMPTZ records a time and date in GMT, converting from the specified TIME ZONE. TIMETZ records the specified time and the specified time zone, in minutes, from GMT.

Limits

Name Low Value High Value Resolution
TIME [p] 00:00:00.00 23:59:60.999999 1 µs
TIME [p] WITH TIME ZONE 00:00:00.00+14 23:59:59.999999-14 1 µs

Input string

A TIME input string can be set to any of the formats shown below:

Example Description
04:05:06.789 ISO 8601
04:05:06 ISO 8601
04:05 ISO 8601
040506 ISO 8601
04:05 AM Same as 04:05; AM does not affect value
04:05 PM Same as 16:05
04:05:06.789-8 ISO 8601
04:05:06-08:00 ISO 8601
04:05-08:00 ISO 8601
040506-08 ISO 8601
04:05:06 PST Time zone specified by name

Data type coercion

You can cast a TIME or TIMETZ value to a TIMESTAMP. This returns the local date and time as follows:

=> SELECT (TIME '3:01am')::TIMESTAMP;
       ?column?
---------------------
 2012-08-30 03:01:00
(1 row)

=> SELECT (TIMETZ '3:01am')::TIMESTAMP;
      ?column?
---------------------
 2012-08-22 03:01:00
(1 row)

Casting the same TIME or TIMETZ value to a TIMESTAMPTZ returns the local date and time, appended with the UTC offset—in this example, -05:

=> SELECT (TIME '3:01am')::TIMESTAMPTZ;
        ?column?
------------------------
 2016-12-08 03:01:00-05
(1 row)

3.4.6 - TIME AT TIME ZONE

Converts the specified TIME to the time in another time zone.

Converts the specified TIME to the time in another time zone.

Syntax

TIME [WITH TIME ZONE] 'input-string' AT TIME ZONE 'zone'

Parameters

WITH TIME ZONE
Converts the input string to UTC, using the UTC offset for the specified time zone. If the input string omits a time zone, Vertica uses the UTC offset of the time zone that is configured for your system, and converts the input string accordingly
zone
Specifies the time zone to use in the conversion, either as a literal or interval that specifies UTC offset:
  • AT TIME ZONE INTERVAL 'utc-offset'

  • AT TIME ZONE 'time-zone-literal'

For details, see Specifying Time Zones below.

Specifying time zones

You can specify time zones in two ways:

  • A string literal such as America/Chicago or PST

  • An interval that specifies a UTC offset—for example, INTERVAL '-08:00'

It is generally good practice to specify time zones with literals that indicate a geographic location. Vertica makes the necessary seasonal adjustments, and thereby avoids inconsistent results. For example, the following queries are issued when daylight time is in effect. Because Denver's UTC offset during daylight time is -06, the second and third queries (one using the America/Denver literal, one using INTERVAL '-06:00') return the same result:

=> SELECT CURRENT_TIME(0) "EDT";
     EDT
-------------
 12:34:35-04
(1 row)

=> SELECT CURRENT_TIME(0) AT TIME ZONE 'America/Denver' "Mountain Time";
 Mountain Time
---------------
 10:34:35-06
(1 row)
=> SELECT CURRENT_TIME(0) AT TIME ZONE INTERVAL '-06:00' "Mountain Time";
 Mountain Time
---------------
 10:34:35-06
(1 row)

If you use a UTC offset in a similar query when standard time is in effect, you must adjust it accordingly (for Denver time, to -07); otherwise, Vertica returns a different (and erroneous) result:

=> SELECT CURRENT_TIME(0) "EST";
     EST
-------------
 14:18:22-05
(1 row)

=> SELECT CURRENT_TIME(0) AT TIME ZONE INTERVAL '-06:00' "Mountain Time";
 Mountain Time
---------------
 13:18:22-06
(1 row)

You can show and set the session's time zone with SHOW TIMEZONE and SET TIME ZONE, respectively:

=> SHOW TIMEZONE;
   name   |     setting
----------+------------------
 timezone | America/New_York
(1 row)

=> SELECT CURRENT_TIME(0) "Eastern Daylight Time";
 Eastern Daylight Time
-----------------------
 12:18:24-04
(1 row)

=> SET TIMEZONE 'America/Los_Angeles';
SET

=> SELECT CURRENT_TIME(0) "Pacific Daylight Time";
 Pacific Daylight Time
-----------------------
 09:18:24-07
(1 row)

Time zone literals

To view the default list of valid literals, see the files in the following directory:

opt/vertica/share/timezonesets

For example:

$ cat Antarctica.txt
...
# src/timezone/tznames/Antarctica.txt
#

AWST    28800    # Australian Western Standard Time
                 #     (Antarctica/Casey)
                 #     (Australia/Perth)
...

NZST    43200    # New Zealand Standard Time
                 #     (Antarctica/McMurdo)
                 #     (Pacific/Auckland)
ROTT   -10800    # Rothera Time
                 #     (Antarctica/Rothera)
SYOT    10800    # Syowa Time
                 #     (Antarctica/Syowa)
VOST    21600    # Vostok time
                 #     (Antarctica/Vostok)

Examples

The following example assumes that local time is EST (Eastern Standard Time). The query converts the specified time to MST (mountain standard time):

=> SELECT CURRENT_TIME(0);
  timezone
-------------
 10:10:56-05
(1 row)

=> SELECT TIME '10:10:56' AT TIME ZONE 'America/Denver' "Denver Time";
 Denver Time
-------------
 08:10:56-07
(1 row)

The next example adds a time zone literal to the input string—in this case, Europe/Vilnius—and converts the time to MST:

=> SELECT TIME '09:56:13 Europe/Vilnius' AT TIME ZONE 'America/Denver' "Denver Time";
 Denver Time
-------------
 00:56:13-07
(1 row)

3.4.7 - TIMESTAMP/TIMESTAMPTZ

Stores the specified date and time.

Stores the specified date and time. TIMESTAMPTZ is the same as TIMESTAMP WITH TIME ZONE: both data types store the UTC offset of the specified time.

TIMESTAMP is an alias for DATETIME and SMALLDATETIME.

Syntax

TIMESTAMP [ (p) ] [ { WITHOUT | WITH } TIME ZONE ] 'input-string' [AT TIME ZONE zone ]
TIMESTAMPTZ [ (p) ] 'input-string' [ AT TIME ZONE zone ]

Parameters

p
Optional precision value that specifies the number of fractional digits retained in the seconds field, an integer value between 0 and 6. If you omit specifying precision, Vertica returns up to 6 fractional digits.
WITHOUT TIME ZONE
WITH TIME ZONE

Specifies whether to include a time zone with the stored value:

  • WITHOUT TIME ZONE (default): Specifies that input-string does not include a time zone. If the input string contains a time zone, Vertica ignores this qualifier. Instead, it conforms to WITH TIME ZONE behavior.

  • WITH TIME ZONE: Specifies to convert input-string to UTC, using the UTC offset for the specified time zone. If the input string omits a time zone, Vertica uses the UTC offset of the time zone that is configured for your system.

input-string
See Input String below.
AT TIME ZONE zone
See TIMESTAMP AT TIME ZONE.

Limits

In the following table, values are rounded. See Date/time data types for more detail.

Name Low Value High Value Resolution
TIMESTAMP [ (p) ] [ WITHOUT TIME ZONE ] 290279 BC 294277 AD 1 µs
TIMESTAMP [ (p) ] WITH TIME ZONE 290279 BC 294277 AD 1 µs

Input string

The date/time input string concatenates a date and a time. The input string can include a time zone, specified as a literal such as America/Chicago, or as a UTC offset.

The following list represents typical date/time input variations:

  • 1999-01-08 04:05:06

  • 1999-01-08 04:05:06 -8:00

  • January 8 04:05:06 1999 PST

The input string can also specify the calendar era, either AD (default) or BC. If you omit the calendar era, Vertica assumes the current calendar era (AD). The calendar era typically follows the time zone; however, the input string can include it in various locations. For example, the following queries return the same results:

=> SELECT TIMESTAMP WITH TIME ZONE 'March 1, 44 12:00 CET BC ' "Caesar's Time of Death EST";
 Caesar's Time of Death EST
----------------------------
 0044-03-01 06:00:00-05 BC
(1 row)

=> SELECT TIMESTAMP WITH TIME ZONE 'March 1, 44 12:00 BC CET' "Caesar's Time of Death EST";
 Caesar's Time of Death EST
----------------------------
 0044-03-01 06:00:00-05 BC
(1 row)

Examples

=> SELECT (TIMESTAMP '2014-01-17 10:00' - TIMESTAMP '2014-01-01');
 ?column?
----------
 16 10:00
(1 row)

=> SELECT (TIMESTAMP '2014-01-17 10:00' - TIMESTAMP '2014-01-01') / 7;
     ?column?
-------------------
 2 08:17:08.571429
(1 row)

=> SELECT TIMESTAMP '2009-05-29 15:21:00.456789'-TIMESTAMP '2009-05-28';
     ?column?
-------------------
 1 15:21:00.456789
(1 row)

=> SELECT (TIMESTAMP '2009-05-29 15:21:00.456789'-TIMESTAMP '2009-05-28')(3);
    ?column?
----------------
 1 15:21:00.457
(1 row)

=> SELECT '2017-03-18 07:00'::TIMESTAMPTZ(0) + INTERVAL '1.5 day';
        ?column?
------------------------
 2017-03-19 19:00:00-04
(1 row)

=> SELECT (TIMESTAMP '2014-01-17 10:00' - TIMESTAMP '2014-01-01') day;
 ?column?
----------
 16
(1 row)

=> SELECT cast((TIMESTAMP '2014-01-17 10:00' - TIMESTAMP '2014-01-01')
day as integer) / 7;
       ?column?
----------------------
 2.285714285714285714
(1 row)

=> SELECT floor((TIMESTAMP '2014-01-17 10:00' - TIMESTAMP '2014-01-01')
/ interval '7');
 floor
-------
     2
(1 row)

=> SELECT (TIMESTAMP '2009-05-29 15:21:00.456789'-TIMESTAMP '2009-05-28') second;
   ?column?
---------------
 141660.456789
(1 row)

=> SELECT (TIMESTAMP '2012-05-29 15:21:00.456789'-TIMESTAMP '2009-01-01') year;
 ?column?
----------
 3
(1 row)

=> SELECT (TIMESTAMP '2012-05-29 15:21:00.456789'-TIMESTAMP '2009-01-01') month;
 ?column?
----------
 40
(1 row)

=> SELECT (TIMESTAMP '2012-05-29 15:21:00.456789'-TIMESTAMP '2009-01-01')
year to month;
 ?column?
----------
 3-4
(1 row)

=> SELECT (TIMESTAMP '2012-05-29 15:21:00.456789'-TIMESTAMP '2009-01-01')
second(3);
   ?column?
---------------
 107536860.457
(1 row)

=> SELECT (TIMESTAMP '2012-05-29 15:21:00.456789'-TIMESTAMP '2009-01-01') minute;
 ?column?
----------
 1792281
(1 row)

=> SELECT (TIMESTAMP '2012-05-29 15:21:00.456789'-TIMESTAMP '2009-01-01')
minute to second(3);
    ?column?
----------------
 1792281:00.457
(1 row)

=> SELECT TIMESTAMP 'infinity';
 ?column?
----------
 infinity
(1 row)

3.4.8 - TIMESTAMP AT TIME ZONE

Converts the specified TIMESTAMP or TIMESTAMPTZ (TIMESTAMP WITH TIMEZONE) to another time zone.

Converts the specified TIMESTAMP or TIMESTAMPTZ (TIMESTAMP WITH TIMEZONE) to another time zone. Vertica executes AT TIME ZONE differently, depending on whether the date input is a TIMESTAMP or TIMESTAMPTZ. See TIMESTAMP versus TIMESTAMPTZ Behavior below.

Syntax

timestamp-clause AT TIME ZONE 'zone'

Parameters

[timestamp-clause](/en/sql-reference/data-types/datetime-data-types/timestamptimestamptz/)
Specifies the timestamp to convert, either TIMESTAMP or TIMESTAMPTZ.

For details, see TIMESTAMP/TIMESTAMPTZ.

AT TIME ZONE zone
Specifies the time zone to use in the timestamp conversion, where zone is a literal or interval that specifies a UTC offset:
  • AT TIME ZONE INTERVAL 'utc-offset'

  • AT TIME ZONE 'time-zone-literal'

For details, see Specifying Time Zones below.

TIMESTAMP versus TIMESTAMPTZ behavior

How Vertica interprets AT TIME ZONE depends on whether the date input is a TIMESTAMP or TIMESTAMPTZ:

Date input Action
TIMESTAMP

If the input string specifies no time zone, Vertica performs two actions:

  1. Converts the input string to the time zone of the AT TIME ZONE argument.

  2. Returns the time for the current session's time zone.

If the input string includes a time zone, Vertica implicitly casts it to a TIMESTAMPTZ and converts it accordingly (see TIMESTAMPTZ below).

For example, the following statement specifies a TIMESTAMP with no time zone. Vertica executes the statement as follows:

  1. Converts the input string to PDT (Pacific Daylight Time).

  2. Returns that time in the local time zone, which is three hours later:

=> SHOW TIMEZONE;
   name   |     setting
----------+------------------
 timezone | America/New_York
(1 row)

=> SELECT TIMESTAMP '2017-3-14 5:30' AT TIME ZONE 'PDT';
        timezone
------------------------
 2017-03-14 08:30:00-04
(1 row)
TIMESTAMPTZ

Vertica converts the input string to the time zone of the AT TIME ZONE argument and returns that time.

For example, the following statement specifies a TIMESTAMPTZ data type. The input string omits any time zone expression, so Vertica assumes the input string to be in local time zone (America/New_York) and returns the time of the AT TIME ZONE argument:

=> SHOW TIMEZONE;
   name   |     setting
----------+------------------
 timezone | America/New_York
(1 row)

=> SELECT TIMESTAMP WITH TIME ZONE '2001-02-16 20:38:40' AT TIME ZONE 'America/Denver';
      timezone
---------------------
 2001-02-16 18:38:40
(1 row)

The input string in the next statement explicitly specifies a time zone, so Vertica coerces the TIMESTAMP to a TIMESTAMPTZ and returns the time of the AT TIME ZONE argument:

=> SELECT TIMESTAMP '2001-02-16 20:38:40 America/Mexico_City' AT TIME ZONE 'Asia/Tokyo';
      timezone
---------------------
 2001-02-17 11:38:40
(1 row)

Specifying time zones

You can specify time zones in two ways:

  • A string literal such as America/Chicago or PST

  • An interval that specifies a UTC offset—for example, INTERVAL '-08:00'

It is generally good practice to specify time zones with literals that indicate a geographic location. Vertica makes the necessary seasonal adjustments, and thereby avoids inconsistent results. For example, the following two queries are issued when daylight time is in effect. Because Denver's UTC offset during daylight time is -06, both queries return the same results:

=> SELECT TIMESTAMPTZ '2017-03-16 09:56:13' AT TIME ZONE 'America/Denver' "Denver Time";
     Denver Time
---------------------
 2017-03-16 07:56:13
(1 row)

=> SELECT TIMESTAMPTZ '2017-03-16 09:56:13' AT TIME ZONE INTERVAL '-06:00' "Denver Time";
     Denver Time
---------------------
 2017-03-16 07:56:13
(1 row)

If you use a UTC offset in a similar query when standard time is in effect, you must adjust it accordingly (for Denver time, to -07); otherwise, Vertica returns a different (and erroneous) result:

=> SELECT TIMESTAMPTZ '2017-01-16 09:56:13' AT TIME ZONE 'America/Denver' "Denver Time";
     Denver Time
---------------------
 2017-01-16 07:56:13
(1 row)

=> SELECT TIMESTAMPTZ '2017-01-16 09:56:13' AT TIME ZONE INTERVAL '-06:00' "Denver Time";
     Denver Time
---------------------
 2017-01-16 08:56:13
(1 row)

You can show and set the session's time zone with SHOW TIMEZONE and SET TIME ZONE, respectively:

=> SHOW TIMEZONE;
   name   |     setting
----------+------------------
 timezone | America/New_York
(1 row)

=> SELECT CURRENT_TIMESTAMP(0) "Eastern Daylight Time";
 Eastern Daylight Time
------------------------
 2017-03-20 12:18:24-04
(1 row)

=> SET TIMEZONE 'America/Los_Angeles';
SET

=> SELECT CURRENT_TIMESTAMP(0) "Pacific Daylight Time";
 Pacific Daylight Time
------------------------
 2017-03-20 09:18:24-07
(1 row)

Time zone literals

To view the default list of valid literals, see the files in the following directory:

opt/vertica/share/timezonesets

For example:

$ cat Antarctica.txt
...
# src/timezone/tznames/Antarctica.txt
#

AWST    28800    # Australian Western Standard Time
                 #     (Antarctica/Casey)
                 #     (Australia/Perth)
...

NZST    43200    # New Zealand Standard Time
                 #     (Antarctica/McMurdo)
                 #     (Pacific/Auckland)
ROTT   -10800    # Rothera Time
                 #     (Antarctica/Rothera)
SYOT    10800    # Syowa Time
                 #     (Antarctica/Syowa)
VOST    21600    # Vostok time
                 #     (Antarctica/Vostok)

3.5 - Long data types

Store data up to 32000000 octets.

Store data up to 32000000 octets. Vertica supports two long data types:

  • LONG VARBINARY: Variable-length raw-byte data, such as spatial data. LONG VARBINARY values are not extended to the full width of the column.

  • LONG VARCHAR: Variable-length strings, such as log files and unstructured data. LONG VARCHAR values are not extended to the full width of the column.

Use LONG data types only when you need to store data greater than the maximum size of VARBINARY and VARCHAR data types (65 KB). Long data can include unstructured data, online comments or posts, or small log files.

Flex tables have a default LONG VARBINARY __raw__ column, with a NOT NULL constraint. For more information, see Flex tables.

Syntax

LONG VARBINARY [(max-length)]
LONG VARCHAR [(octet-length)]

Parameters

max-length
Length of the byte string or column width, declared in bytes (octets), up to 32000000.

Default: 1 MB

octet-length
Length of the string or column width, declared in bytes (octets), up to 32000000.

Default: 1 MB

Optimized performance

For optimal performance of LONG data types, Vertica recommends that you:

  • Use LONG data types as storage-only containers; Vertica supports operations on the content of LONG data types, but does not support all of the operations that VARCHAR and VARBINARY support.

  • Use VARBINARY and VARCHAR data types, instead of their LONG counterparts, whenever possible. VARBINARY and VARCHAR data types are more flexible and have a wider range of operations.

  • Do not sort, segment, or partition projections on LONG data type columns.

  • Do not add constraints, such as a primary key, to any LONG VARBINARY or LONG VARCHAR columns.

  • Do not join or aggregate any LONG data type columns.

Examples

The following example creates a table user_comments with a LONG VARCHAR column and inserts data into it:

=> CREATE TABLE user_comments
                  (id INTEGER,
                  username VARCHAR(200),
                  time_posted TIMESTAMP,
                  comment_text LONG VARCHAR(200000));
=> INSERT INTO user_comments VALUES
                  (1,
                  'User1',
                  TIMESTAMP '2013-06-25 12:47:32.62',
                  'The weather tomorrow will be cold and rainy and then
                  on the day after, the sun will come and the temperature
                  will rise dramatically.');

3.6 - Numeric data types

Numeric data types are numbers stored in database columns.

Numeric data types are numbers stored in database columns. These data types are typically grouped by:

  • Exact numeric types, values where the precision and scale need to be preserved. The exact numeric types are INTEGER, BIGINT, DECIMAL, NUMERIC, NUMBER, and MONEY.

  • Approximate numeric types, values where the precision needs to be preserved and the scale can be floating. The approximate numeric types are DOUBLE PRECISION, FLOAT, and REAL.

Implicit casts from INTEGER, FLOAT, and NUMERIC to VARCHAR are not supported. If you need that functionality, write an explicit cast using one of the following forms:

CAST(numeric-expression AS data-type)
numeric-expression::data-type

For example, you can cast a float to an integer as follows:

=> SELECT(FLOAT '123.5')::INT;
 ?column?
----------
      124
(1 row)
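
Similarly, converting a numeric value to a string requires an explicit cast. A minimal sketch (output formatting may vary slightly by version):

=> SELECT 123.45::VARCHAR;
 ?column?
----------
 123.45
(1 row)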

String-to-numeric data type conversions accept formats of quoted constants for scientific notation, binary scaling, hexadecimal, and combinations of numeric-type literals:

  • Scientific notation:

    => SELECT FLOAT '1e10';
      ?column?
    -------------
     10000000000
    (1 row)
    
  • BINARY scaling:

    => SELECT NUMERIC '1p10';
     ?column?
    ----------
         1024
    (1 row)
    
  • hexadecimal:

    => SELECT NUMERIC '0x0abc';
     ?column?
    ----------
         2748
    (1 row)
    

3.6.1 - DOUBLE PRECISION (FLOAT)

Vertica supports the numeric data type DOUBLE PRECISION, which is the IEEE-754 8-byte floating point type, along with most of the usual floating point operations.

Vertica supports the numeric data type DOUBLE PRECISION, which is the IEEE-754 8-byte floating point type, along with most of the usual floating point operations.

Syntax

[ DOUBLE PRECISION | FLOAT | FLOAT(n) | FLOAT8 | REAL ]

Parameters

Double precision is an inexact, variable-precision numeric type. In other words, some values cannot be represented exactly and are stored as approximations. Thus, input and output operations involving double precision might show slight discrepancies.

  • All of the DOUBLE PRECISION data types are synonyms for 64-bit IEEE FLOAT.

  • The n in FLOAT(n) must be between 1 and 53, inclusive, but a 53-bit fraction is always used. See the IEEE-754 standard for details.

  • For exact numeric storage and calculations (money for example), use NUMERIC.

  • Floating point calculations depend on the behavior of the underlying processor, operating system, and compiler.

  • Comparing two floating-point values for equality might not work as expected.

  • While Vertica treats decimal values as FLOAT internally, if a column is defined as FLOAT then you cannot read decimal values from ORC and Parquet files. In those formats, FLOAT and DECIMAL are different types.
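
The caveat above about comparing floating-point values for equality can be shown with a small sketch; because 0.1 and 0.2 have no exact binary representation, their sum does not compare equal to 0.3:

=> SELECT 0.1::FLOAT + 0.2::FLOAT = 0.3::FLOAT;
 ?column?
----------
 f
(1 row)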

Values

COPY accepts floating-point data in the following format:

  • Optional leading white space

  • An optional plus ("+") or minus sign ("-")

  • A decimal number, a hexadecimal number, an infinity, a NAN, or a null value

Decimal Number

A decimal number consists of a non-empty sequence of decimal digits possibly containing a radix character (decimal point "."), optionally followed by a decimal exponent. A decimal exponent consists of an "E" or "e", followed by an optional plus or minus sign, followed by a non-empty sequence of decimal digits, and indicates multiplication by a power of 10.

Hexadecimal Number

A hexadecimal number consists of a "0x" or "0X" followed by a non-empty sequence of hexadecimal digits possibly containing a radix character, optionally followed by a binary exponent. A binary exponent consists of a "P" or "p", followed by an optional plus or minus sign, followed by a non-empty sequence of decimal digits, and indicates multiplication by a power of 2. At least one of radix character and binary exponent must be present.

Infinity

An infinity is either INF or INFINITY, disregarding case.

NaN (Not A Number)

A NaN is NAN (disregarding case) optionally followed by a sequence of characters enclosed in parentheses. The character string specifies the value of NAN in an implementation-dependent manner. (The Vertica internal representation of NAN is 0xfff8000000000000LL on x86 machines.)

When writing infinity or NAN values as constants in a SQL statement, enclose them in single quotes. For example:

=> UPDATE table SET x = 'Infinity'

A NaN example follows.

=> SELECT CBRT('Nan'); -- cube root
 CBRT 
------
 NaN
(1 row)
=> SELECT 'Nan' > 1.0;
 ?column?
----------
 f
(1 row)

Null Value

The load file format of a null value is user defined, as described in the COPY command. The Vertica internal representation of a null value is 0x7fffffffffffffffLL. The interactive format is controlled by the vsql printing option null. For example:

\pset null '(null)'

The default option is not to print anything.

Rules

  • -0 == +0

  • 1/0 = Infinity

  • 0/0 == Nan

  • NaN != anything (even NaN)

To search for NaN column values, use the following predicate:

... WHERE column != column

This is necessary because WHERE column = 'Nan' cannot be true by definition.

Sort order (ascending)

  • NaN

  • -Inf

  • numbers

  • +Inf

  • NULL

Notes

  • NULL appears last (largest) in ascending order.

  • All overflows in floats generate +/-infinity or NaN, per the IEEE floating point standard.

3.6.2 - INTEGER

A signed 8-byte (64-bit) data type.

A signed 8-byte (64-bit) data type.

Syntax

[ INTEGER | INT | BIGINT | INT8 | SMALLINT | TINYINT ]

Parameters

INT, INTEGER, INT8, SMALLINT, TINYINT, and BIGINT are all synonyms for the same signed 64-bit integer data type. Automatic compression techniques are used to conserve disk space in cases where the full 64 bits are not required.

Notes

  • The range of values is –2^63+1 to 2^63-1.

  • 2^63 = 9,223,372,036,854,775,808 (19 digits).

  • The value –2^63 is reserved to represent NULL.

  • NULL appears first (smallest) in ascending order.

  • Vertica does not have an explicit 4-byte (32-bit integer) or smaller types. Vertica's encoding and compression automatically eliminate the storage overhead of values that fit in less than 64 bits.

Restrictions

  • The JDBC type INTEGER is 4 bytes and is not supported by Vertica. Use BIGINT instead.

  • Vertica does not support the SQL/JDBC types NUMERIC, SMALLINT, or TINYINT.

  • Vertica does not check for overflow (positive or negative) except in the aggregate function SUM(). If you encounter overflow when using SUM, use SUM_FLOAT(), which converts to floating point.

See also

Data Type Coercion Chart

3.6.3 - NUMERIC

Numeric data types store fixed-point numeric data.

Numeric data types store fixed-point numeric data. For example, a value of $123.45 can be stored in a NUMERIC(5,2) field. Note that the first number, the precision, specifies the total number of digits.

Syntax

numeric-type [ ( precision[, scale] ) ]

Parameters

numeric-type
One of the following:
  • NUMERIC

  • DECIMAL

  • NUMBER

  • MONEY

precision
An unsigned integer that specifies the total number of significant digits that the data type stores, where precision is ≤ 1024. If omitted, the default precision depends on the numeric type that you specify. If you assign a value that exceeds precision, Vertica returns an error.

If a data type's precision is ≤ 18, performance is equivalent to an INTEGER data type, regardless of scale. When possible, Vertica recommends using a precision ≤ 18.

scale
An unsigned integer that specifies the maximum number of digits to the right of the decimal point to store. scale must be ≤ precision. If omitted, the default scale depends on the numeric type that you specify. If you assign a value with more decimal digits than scale, the value is rounded to scale digits.

When using ALTER to modify the data type of a numeric column, scale cannot be changed.
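
For example, casting a value with five decimal digits to NUMERIC(5,2) rounds it to two decimal digits, while a value whose integer part exceeds the declared precision is rejected with an error. A minimal sketch of the rounding case:

=> SELECT 1.23456::NUMERIC(5,2);
 ?column?
----------
     1.23
(1 row)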

Default precision and scale

NUMERIC, DECIMAL, NUMBER, and MONEY differ in their default precision and scale values:

Type Precision Scale
NUMERIC 37 15
DECIMAL 37 15
NUMBER 38 0
MONEY 18 4

Supported encoding

Vertica supports the following encoding for numeric data types:

  • Precision ≤ 18: AUTO, BLOCK_DICT, BLOCKDICT_COMP, COMMONDELTA_COMP, DELTAVAL, GCDDELTA, and RLE

  • Precision > 18: AUTO, BLOCK_DICT, BLOCKDICT_COMP, RLE

For details, see Encoding types.

Numeric versus integer and floating data types

Numeric data types are exact data types that store values of a specified precision and scale, expressed with a number of digits before and after a decimal point. This contrasts with the Vertica integer and floating data types:

  • DOUBLE PRECISION (FLOAT) supports ~15 digits, variable exponent, and represents numeric values approximately. It can be less precise than NUMERIC data types.

  • INTEGER supports ~18 digits, whole numbers only.

The NUMERIC data type is preferred for non-integer constants, because it is always exact. For example:

=> SELECT 1.1 + 2.2 = 3.3;
?column?
----------
 t
(1 row)

=>  SELECT 1.1::float + 2.2::float = 3.3::float;
 ?column?
----------
 f
(1 row)

Numeric operations

Supported numeric operations include the following:

NUMERIC divide operates directly on numeric values, without converting to floating point. The result has at least 18 decimal places and is rounded.

NUMERIC mod (including %) operates directly on numeric values, without converting to floating point. The result has the same scale as the numerator and never needs rounding.

Some complex operations used with numeric data types result in an implicit cast to FLOAT. When using SQRT, STDDEV, transcendental functions such as LOG, and TO_CHAR/TO_NUMBER formatting, the result is always FLOAT.

Examples

The following series of commands creates a table that contains a numeric data type and then performs some mathematical operations on the data:

=> CREATE TABLE num1 (id INTEGER, amount NUMERIC(8,2));

Insert some values into the table:

=> INSERT INTO num1 VALUES (1, 123456.78);

Query the table:

=> SELECT * FROM num1;
  id  |  amount
------+-----------
    1 | 123456.78
(1 row)

The following example returns the NUMERIC column, amount, from table num1:

=> SELECT amount FROM num1;
  amount
-----------
 123456.78
(1 row)

The following syntax adds one (1) to the amount:

=> SELECT amount+1 AS 'amount' FROM num1;
  amount
-----------
 123457.78
(1 row)

The following syntax multiplies the amount column by 2:

=> SELECT amount*2 AS 'amount' FROM num1;
  amount
-----------
 246913.56
(1 row)

The following syntax returns a negative number for the amount column:

=> SELECT -amount FROM num1;
?column?
------------
-123456.78
(1 row)

The following syntax returns the absolute value of the amount argument:

=> SELECT ABS(amount) FROM num1;
ABS
-----------
123456.78
(1 row)

The following syntax casts the NUMERIC amount as a FLOAT data type:

=> SELECT amount::float FROM num1;
  amount
-----------
 123456.78
(1 row)

See also

Mathematical functions

3.6.4 - Numeric data type overflow

Vertica does not check for overflow (positive or negative) except in the aggregate function SUM().

Vertica does not check for overflow (positive or negative) except in the aggregate function SUM(). If you encounter overflow when using SUM, use SUM_FLOAT() which converts to floating point.

For a detailed discussion of how Vertica handles overflow when you use the functions SUM, SUM_FLOAT, and AVG with numeric data types, see Numeric data type overflow with SUM, SUM_FLOAT, and AVG. The discussion includes directives for turning off silent numeric overflow and setting precision for numeric data types.

Dividing by zero returns an error:

=> SELECT 0/0;
ERROR 3117:  Division by zero

=> SELECT 0.0/0;
ERROR 3117:  Division by zero

=> SELECT 0 // 0;
ERROR 3117:  Division by zero

=> SELECT 200.0/0;
ERROR 3117:  Division by zero

=> SELECT 116.43 // 0;
ERROR 3117:  Division by zero

Dividing zero as a FLOAT by zero returns NaN:

=> SELECT 0.0::float/0;
 ?column?
----------
      NaN
=> SELECT 0.0::float//0;
 ?column?
----------
      NaN

Dividing a non-zero FLOAT by zero returns Infinity:

=> SELECT 2.0::float/0;
 ?column?
----------
Infinity
=> SELECT 200.0::float//0;
?column?
----------
Infinity

Add, subtract, and multiply operations ignore overflow. Sum and average operations use 128-bit arithmetic internally. SUM() reports an error if the final result overflows, suggesting the use of SUM_FLOAT(INT), which converts the 128-bit sum to a FLOAT. For example:

=> CREATE TEMP TABLE t (i INT);
=> INSERT INTO t VALUES (1<<62);
=> INSERT INTO t VALUES (1<<62);
=> INSERT INTO t VALUES (1<<62);
=> INSERT INTO t VALUES (1<<62);
=> INSERT INTO t VALUES (1<<62);
=> SELECT SUM(i) FROM t;
   ERROR: sum() overflowed
   HINT: try sum_float() instead
=> SELECT SUM_FLOAT(i) FROM t;
     SUM_FLOAT
---------------------
 2.30584300921369e+19

3.6.5 - Numeric data type overflow with SUM, SUM_FLOAT, and AVG

When you use the SUM, SUM_FLOAT, and AVG functions (aggregate and analytic) to query a numeric column, overflow can occur.

When you use the SUM, SUM_FLOAT, and AVG functions (aggregate and analytic) to query a numeric column, overflow can occur. How Vertica responds to that overflow depends on the settings of two configuration parameters:

  • AllowNumericOverflow (Boolean, default 1) allows numeric overflow. Vertica does not implicitly extend precision of numeric data types.
  • NumericSumExtraPrecisionDigits (integer, default 6) determines whether to return an overflow error if a result exceeds the specified precision. This parameter is ignored if AllowNumericOverflow is set to 1 (true).

Vertica also allows numeric overflow when you use SUM or SUM_FLOAT to query pre-aggregated data. See Impact on Pre-Aggregated Data Projections below.

Default overflow handling

With numeric columns, Vertica internally works with multiples of 18 digits. If specified precision is less than 18—for example, x(12,0)—Vertica allows overflow up to and including the first multiple of 18. In some situations, if you sum a column, you can exceed the number of digits Vertica internally reserves for the result. In this case, Vertica allows silent overflow.

Turning off silent numeric overflow

You can turn off silent numeric overflow by setting AllowNumericOverflow to 0. In this case, Vertica checks the value of configuration parameter NumericSumExtraPrecisionDigits. By default, this parameter is set to 6, which means that Vertica internally adds extra digit places beyond a column's DDL-specified precision. Adding extra precision digits enables Vertica to consistently return results that overflow the column's precision. However, crossing into the second multiple of 18 internally can adversely impact performance.

For example, if AllowNumericOverflow is set to 0:

  • Column x is defined as x(12,0) and NumericSumExtraPrecisionDigits is set to 6: Vertica internally stays within the first multiple of 18 digits and no additional performance impact occurs (a).
  • Column x is defined as x(2,0) and NumericSumExtraPrecisionDigits is set to 20: Vertica internally crosses a threshold into the second multiple of 18. In this case, performance is significantly affected (2a). Performance beyond the second multiple of 18 continues to be 2a.
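
Both parameters are set at the database level. The following is a minimal sketch of disabling silent overflow; the ALTER DATABASE syntax shown is assumed, and the extra-precision value can be adjusted as needed:

=> ALTER DATABASE DEFAULT SET PARAMETER AllowNumericOverflow = 0;
ALTER DATABASE
=> ALTER DATABASE DEFAULT SET PARAMETER NumericSumExtraPrecisionDigits = 6;
ALTER DATABASE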

Impact on pre-aggregated data projections

Vertica supports silent numeric overflow for queries that use SUM or SUM_FLOAT on projections with pre-aggregated data such as live aggregate or Top-K projections. To turn off silent numeric overflow for these queries:

  1. Set AllowNumericOverflow to 0.

  2. Set NumericSumExtraPrecisionDigits to the desired number of implicit digits. Alternatively, use the default setting of 6.

  3. Drop and re-create the affected projections.

If you turn off silent numeric overflow, be aware that overflow can sometimes cause rollback or errors:

  • Overflow occurs during load operations, such as COPY, MERGE, or INSERT:

    Vertica aggregates data before loading the projection with data. If overflow occurs while data is aggregated, Vertica rolls back the load operation.

  • Overflow occurs after load, while Vertica sums existing data.

    Vertica computes the sum of existing data separately from the computation that it does during data load. If the projection selects a column with SUM or SUM_FLOAT and overflow occurs, Vertica produces an error message. This response resembles the way Vertica produces an error for a query that uses SUM or SUM_FLOAT.

  • Overflow occurs during mergeout.

    Vertica logs a message during mergeout if overflow occurs while Vertica computes a final sum during the mergeout operation. If an error occurs, Vertica marks the projection as out of date and disqualifies it from further mergeout operations.

3.7 - Spatial data types

Vertica supports two spatial data types.

Vertica supports two spatial data types. These data types store two- and three-dimensional spatial objects in a table column:

  • GEOMETRY: Spatial object with coordinates expressed as (x,y) pairs, defined in the Cartesian plane. All calculations use Cartesian coordinates.
  • GEOGRAPHY: Spatial object defined as on the surface of a perfect sphere, or a spatial object in the WGS84 coordinate system. Coordinates are expressed in longitude/latitude angular values, measured in degrees. All calculations are in meters. For perfect sphere calculations, the sphere has a radius of 6371 kilometers, which approximates the shape of the earth.

The maximum size of a GEOMETRY or GEOGRAPHY data type is 10,000,000 bytes (10 MB). You cannot use either data type as a table's primary key.

Syntax

GEOMETRY [ (length) ]
GEOGRAPHY [ (length) ]

Parameters

length
The maximum amount of spatial data that a GEOMETRY or GEOGRAPHY column can store, up to 10 MB.

Default: 1 MB

3.8 - UUID data type

Stores universally unique identifiers (UUIDs).

Stores universally unique identifiers (UUIDs). UUIDs are 16-byte (128-bit) numbers used to uniquely identify records. To generate UUIDs, Vertica provides the function UUID_GENERATE, which returns UUIDs based on high-quality randomness from /dev/urandom.

Syntax

UUID

UUID input and output formats

UUIDs support input of case-insensitive string literal formats, as specified by RFC 4122. In general, a UUID is written as a sequence of hexadecimal digits, in several groups optionally separated by hyphens, for a total of 32 digits representing 128 bits.

The following input formats are valid:

6bbf0744-74b4-46b9-bb05-53905d4538e7
{6bbf0744-74b4-46b9-bb05-53905d4538e7}
6BBF074474B446B9BB0553905D4538E7
6BBf-0744-74B4-46B9-BB05-5390-5D45-38E7

On output, Vertica always uses the following format:

xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx

For example, the following table defines column cust_id as a UUID:

=> CREATE TABLE public.Customers
(
    cust_id uuid,
    lname varchar(36),
    fname varchar(24)
);

The following input for cust_id uses several valid formats:

=>  COPY Customers FROM STDIN;
Enter data to be copied followed by a newline.
End with a backslash and a period on a line by itself.
>> {cede66b7-3d29-4da6-b700-871fc0ac57be}|Kearney|Thomas
>> 34462732ed5649838f3be735b0c32d50|Pham|Duc
>> 9fb0-1de0-1d63-4d09-9415-90e0-b4e9-3b9a|Steinberg|Jeremy
>> \.

On querying this table, Vertica formats all cust_id data in the same way:

=> SELECT cust_id, fname, lname FROM Customers;
                cust_id                | fname  |   lname
--------------------------------------+--------+-----------
 9fb01de0-1d63-4d09-9415-90e0b4e93b9a | Jeremy | Steinberg
 34462732-ed56-4983-8f3b-e735b0c32d50 | Duc    | Pham
 cede66b7-3d29-4da6-b700-871fc0ac57be | Thomas | Kearney
(3 rows)

Generating UUIDs

You can use the Vertica function UUID_GENERATE to automatically generate UUIDs that uniquely identify table records. For example:

=> INSERT INTO Customers SELECT UUID_GENERATE(),'Rostova','Natasha';
 OUTPUT
--------
      1
(1 row)

=> COMMIT;
COMMIT
=> SELECT cust_id, fname, lname FROM Customers;
               cust_id                |  fname  |   lname
--------------------------------------+---------+-----------
 9fb01de0-1d63-4d09-9415-90e0b4e93b9a | Jeremy  | Steinberg
 34462732-ed56-4983-8f3b-e735b0c32d50 | Duc     | Pham
 cede66b7-3d29-4da6-b700-871fc0ac57be | Thomas  | Kearney
 9aad6757-fe1b-473a-a109-b89b7b358c69 | Natasha | Rostova
(4 rows)

NULL input and output

The following string is reserved as NULL for UUID columns:

00000000-0000-0000-0000-000000000000

Vertica always renders NULL as blank.

The following COPY statements insert NULL values into the UUID column, explicitly and implicitly:

=>  COPY Customers FROM STDIN NULL AS 'null';
Enter data to be copied followed by a newline.
End with a backslash and a period on a line by itself.
>> null|Doe|Jane
>> 00000000-0000-0000-0000-000000000000|Man|Nowhere
>> \.
=>  COPY Customers FROM STDIN;
>> |Doe|John
>> \.

In all cases, Vertica renders NULL as blank:

=> SELECT cust_id, fname, lname FROM Customers WHERE cust_id IS NULL;
 cust_id |  fname  | lname
---------+---------+-------
         | Nowhere | Man
         | Jane    | Doe
         | John    | Doe
(3 rows)

Usage restrictions

UUID data types only support relational operators and functions that are also supported by CHAR and VARCHAR data types—for example, MIN, MAX, and COUNT. UUID data types do not support mathematical operators or functions, such as SUM and AVG.
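
For example, string-style aggregates work on the Customers table shown earlier, while numeric aggregates such as SUM or AVG on cust_id are rejected. The results below assume the rows loaded in this section:

=> SELECT MIN(cust_id), COUNT(cust_id) FROM Customers;
                 MIN                  | COUNT
--------------------------------------+-------
 34462732-ed56-4983-8f3b-e735b0c32d50 |     4
(1 row)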

3.9 - Data type coercion

Vertica supports two types of data type casting.

Vertica supports two types of data type casting:

  • Implicit casting: The expression automatically converts the data from one type to another.

  • Explicit casting: A SQL statement specifies the target data type for the conversion.

Implicit casting

The ANSI SQL-92 standard supports implicit casting among similar data types:

  • Number types

  • CHAR, VARCHAR, LONG VARCHAR

  • BINARY, VARBINARY, LONG VARBINARY

Vertica supports two types of nonstandard implicit casts of scalar types:

  • From CHAR to FLOAT, to match the one from VARCHAR to FLOAT. The following example converts the CHAR '3' to a FLOAT so it can add the number 4.33 to the FLOAT result of the second expression:

    => SELECT '3'::CHAR + 4.33::FLOAT;
     ?column?
    ----------
       7.33
    (1 row)
    
  • Between DATE and TIMESTAMP. The following example converts a DATE to a TIMESTAMP and calculates the time 6 hours, 6 minutes, and 6 seconds back from 12:00 AM:

    => SELECT DATE('now') - INTERVAL '6:6:6';
          ?column?
    ---------------------
     2013-07-30 17:53:54
    (1 row)
    

When there is no ambiguity about the data type of an expression value, it is implicitly coerced to match the expected data type. In the following statement, the quoted string constant '2' is implicitly coerced into an INTEGER value so that it can be the operand of an arithmetic operator (addition):

=> SELECT 2 + '2';
 ?column?
----------
        4
(1 row)

A concatenate operation explicitly takes arguments of any data type. In the following example, the concatenate operation implicitly coerces the arithmetic expression 2 + 2 and the INTEGER constant 2 to VARCHAR values so that they can be concatenated.

=> SELECT 2 + 2 || 2;
 ?column?
----------
 42
(1 row)

Another example is to first get today's date:

=> SELECT DATE 'now';
  ?column?
------------
 2013-07-31
(1 row)

The following command converts DATE to a TIMESTAMP and adds a day and a half to the results by using INTERVAL:

=> SELECT DATE 'now' + INTERVAL '1 12:00:00';
      ?column?
---------------------
 2013-07-31 12:00:00
(1 row)

Most implicit casts stay within their relational family and go in one direction, from less detailed to more detailed. For example:

  • DATE to TIMESTAMP/TZ

  • INTEGER to NUMERIC to FLOAT

  • CHAR to FLOAT

  • CHAR to VARCHAR

  • CHAR and/or VARCHAR to FLOAT

  • CHAR to LONG VARCHAR

  • VARCHAR to LONG VARCHAR

  • BINARY to VARBINARY

  • BINARY to LONG VARBINARY

  • VARBINARY to LONG VARBINARY

More specifically, data type coercion works in this manner in Vertica:

Conversion Notes
INT8 > FLOAT8 Implicit, can lose significance
FLOAT8 > INT8 Explicit, rounds
VARCHAR <-> CHAR Implicit, adjusts trailing spaces
VARBINARY <-> BINARY Implicit, adjusts trailing NULs
VARCHAR > LONG VARCHAR Implicit, adjusts trailing spaces
VARBINARY > LONG VARBINARY Implicit, adjusts trailing NULs

No other types cast to or from LONG VARBINARY, VARBINARY, or BINARY. In the following list, <any> means one of these types: INT8, FLOAT8, DATE, TIME, TIMETZ, TIMESTAMP, TIMESTAMPTZ, INTERVAL:

  • <any> -> VARCHAR—implicit

  • VARCHAR -> <any>—explicit, except that VARCHAR->FLOAT is implicit

  • <any> <-> CHAR—explicit

  • DATE -> TIMESTAMP/TZ—implicit

  • TIMESTAMP/TZ -> DATE—explicit, loses time-of-day

  • TIME -> TIMETZ—implicit, adds local timezone

  • TIMETZ -> TIME—explicit, loses timezone

  • TIME -> INTERVAL—implicit, day to second with days=0

  • INTERVAL -> TIME—explicit, truncates non-time parts

  • TIMESTAMP <-> TIMESTAMPTZ—implicit, adjusts to local timezone

  • TIMESTAMP/TZ -> TIME—explicit, truncates non-time parts

  • TIMESTAMPTZ -> TIMETZ—explicit

  • VARBINARY -> LONG VARBINARY—implicit

  • LONG VARBINARY -> VARBINARY—explicit

  • VARCHAR -> LONG VARCHAR—implicit

  • LONG VARCHAR -> VARCHAR—explicit

String-to-numeric data type conversions accept formats of quoted constants for scientific notation, binary scaling, hexadecimal, and combinations of numeric-type literals:

  • Scientific notation:

    => SELECT FLOAT '1e10';
      ?column?
    -------------
     10000000000
    (1 row)
    
  • BINARY scaling:

    => SELECT NUMERIC '1p10';
     ?column?
    ----------
         1024
    (1 row)
    
  • hexadecimal:

    => SELECT NUMERIC '0x0abc';
     ?column?
    ----------
         2748
    (1 row)
    

Complex types

Collections (arrays and sets) can be cast implicitly and explicitly. Casting a collection casts each element of the collection. You can, for example, cast an ARRAY[VARCHAR] to an ARRAY[INT] or a SET[DATE] to SET[TIMESTAMPTZ]. You can cast between arrays and sets.
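
For example, casting an ARRAY[VARCHAR] literal to ARRAY[INT] casts each element individually. This is a minimal sketch; the output column name and formatting may vary by version:

=> SELECT ARRAY['1','2','3']::ARRAY[INT];
  array
---------
 [1,2,3]
(1 row)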

When casting to a bounded native array, inputs that are too long are truncated. When casting to a non-native array (an array containing complex data types, including other arrays), if the new bounds are too small for the data, the cast fails.

Rows (structs) can be cast implicitly and explicitly. Casting a ROW casts each field value. You can specify new field names in the cast or specify only the field types to use the existing field names.

Casting can increase the storage needed for a column. For example, if you cast an array of INT to an array of VARCHAR(50), each element takes more space and thus the array takes more space. If the difference is extreme or the array has many elements, this could mean that the array no longer fits within the space allotted for the column. In this case the operation reports an error and fails.

Examples

The following example casts three strings as NUMERICs:

=> SELECT NUMERIC '12.3e3', '12.3p10'::NUMERIC, CAST('0x12.3p-10e3' AS NUMERIC);
 ?column? | ?column? |     ?column?
----------+----------+-------------------
    12300 |  12595.2 | 17.76123046875000
(1 row)

This example casts a VARBINARY string into a LONG VARBINARY data type:

=> SELECT B'101111000'::LONG VARBINARY;
 ?column?
----------
 \001x
(1 row)

The following example concatenates a CHAR with a LONG VARCHAR, resulting in a LONG VARCHAR:

=> \set s ''''`cat longfile.txt`''''
=> SELECT length ('a' || :s ::LONG VARCHAR);
 length
----------
 65002
(1 row)

The following example casts a combination of NUMERIC and INTEGER data into a NUMERIC result:

=> SELECT (18. + 3./16)/1024*1000;
                ?column?
-----------------------------------------
 17.761230468750000000000000000000000000
(1 row)

3.10 - Data type coercion chart

The following table defines all possible type conversions that Vertica supports.

Conversion types

The following table defines all possible type conversions that Vertica supports. The data types in the first column of the table are the inputs to convert, and the remaining columns indicate the result for the different conversion types.

Source Data Type Implicit Explicit Assignment Assignment without numeric meaning Conversion without explicit casting
BOOLEAN INTEGER, LONG VARCHAR, VARCHAR, CHAR
INTEGER BOOLEAN, NUMERIC, FLOAT INTERVAL DAY/SECOND, INTERVAL YEAR/MONTH LONG VARCHAR, VARCHAR, CHAR
NUMERIC FLOAT INTEGER LONG VARCHAR, VARCHAR, CHAR NUMERIC
FLOAT INTEGER, NUMERIC LONG VARCHAR, VARCHAR, CHAR
LONG VARCHAR FLOAT, CHAR BOOLEAN, INTEGER, NUMERIC, VARCHAR, TIMESTAMP, TIMESTAMPTZ, DATE, TIME, TIMETZ, INTERVAL DAY/SECOND, INTERVAL YEAR/MONTH, LONG VARBINARY LONG VARCHAR
VARCHAR CHAR, FLOAT, LONG VARCHAR BOOLEAN, INTEGER, NUMERIC, TIMESTAMP, TIMESTAMPTZ, DATE, TIME, TIMETZ, UUID, BINARY, VARBINARY, INTERVAL DAY/SECOND, INTERVAL YEAR/MONTH VARCHAR
CHAR FLOAT, LONG VARCHAR, VARCHAR BOOLEAN, INTEGER, NUMERIC, TIMESTAMP, TIMESTAMPTZ, DATE, TIME, TIMETZ, UUID (CHAR length ≥ 36), BINARY, VARBINARY, INTERVAL DAY/SECOND, INTERVAL YEAR/MONTH CHAR
TIMESTAMP TIMESTAMPTZ LONG CHAR, VARCHAR, CHAR, DATE, TIME TIMESTAMP
TIMESTAMPTZ TIMESTAMP LONG CHAR, VARCHAR, CHAR, DATE, TIME, TIMETZ TIMESTAMPTZ
DATE TIMESTAMP LONG CHAR, VARCHAR, CHAR, TIMESTAMPTZ
TIME TIMETZ TIMESTAMP, TIMESTAMPTZ, INTERVAL DAY/SECOND LONG CHAR, VARCHAR, CHAR TIME
TIMETZ TIMESTAMP, TIMESTAMPTZ LONG CHAR, VARCHAR, CHAR, TIME TIMETZ
INTERVAL DAY/SECOND TIME INTEGER, LONG CHAR, VARCHAR, CHAR INTERVAL DAY/SECOND
INTERVAL YEAR/MONTH INTEGER, LONG CHAR, VARCHAR, CHAR INTERVAL YEAR/MONTH
LONG VARBINARY VARBINARY LONG VARBINARY
VARBINARY LONG VARBINARY, BINARY VARBINARY
BINARY VARBINARY BINARY
UUID CHAR(36), VARCHAR UUID

Implicit and explicit conversion

Vertica supports data type conversion of values without explicit casting, such as NUMERIC(10,6) -> NUMERIC(18,4). Implicit data type conversion occurs automatically when converting values of different, but compatible, types to the target column's data type. For example, when adding values (INTEGER + NUMERIC), the result is implicitly cast to a NUMERIC type to accommodate the prominent type in the statement. Depending on the input data types, different precision and scale can occur.

An explicit type conversion must occur when the source data cannot be cast implicitly to the target column's data type.

Assignment conversion

In data assignment conversion, coercion implicitly occurs when values are assigned to database columns in an INSERT or UPDATE...SET statement. For example, in a statement that includes INSERT...VALUES('2.5'), where the target column data type is NUMERIC(18,5), a cast from VARCHAR to the column data type is inferred.

In an assignment without numeric meaning, the value is subject to CHAR/VARCHAR/LONG VARCHAR comparisons.
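
A minimal sketch of assignment conversion; the table name conv_demo is hypothetical:

=> CREATE TABLE conv_demo (n NUMERIC(18,5));
CREATE TABLE
=> INSERT INTO conv_demo VALUES ('2.5');
 OUTPUT
--------
      1
(1 row)

=> SELECT n FROM conv_demo;
    n
---------
 2.50000
(1 row)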

3.11 - Complex types

Complex types such as structures (also known as rows), arrays, and maps are composed of primitive types and sometimes other complex types.

Complex types such as structures (also known as rows), arrays, and maps are composed of primitive types and sometimes other complex types. Complex types can be used in the following ways:

  • Arrays and rows (in any combination) can be used as column data types in both native and external tables.

  • Sets of primitive element types can be used as column data types in native and external tables.

  • Arrays and rows, but not combinations of them, can be created as literals, for example to use in query expressions.

The MAP type is a legacy type. To represent maps, use ARRAY[ROW].

If a flex table has a real column that uses a complex type, the values from that column are not included in the __raw__ column. For more information, see Loading Data into Flex Table Real Columns.

3.11.1 - ARRAY

Represents array data.

Represents array data. There are two types of arrays in Vertica:

  • Native array: a one-dimensional array of a primitive type. Native arrays are tracked in the TYPES system table and used in native tables.

  • Non-native array: all other supported arrays, including arrays that contain other arrays (multi-dimensional arrays) or structs (ROWs). Non-native arrays have some usage restrictions. Non-native arrays are tracked in the COMPLEX_TYPES system table.

Both types of arrays operate in the same way, but they have different OIDs.

Arrays can be bounded, meaning they specify a maximum element count, or unbounded. Unbounded arrays have a maximum binary size, which can be set explicitly or defaulted. See Limits on Element Count and Collection Size.

Selected parsers support using COPY to load arrays. See the documentation of individual parsers for more information.

Syntax

In column definitions:


ARRAY[data_type, max_elements] |
ARRAY[data_type](max_size) |
ARRAY[data_type]

In literals:

ARRAY[value[, ...] ]

Restrictions

  • Native arrays support only data of primitive types, for example, int, UUID, and so on.

  • Array dimensionality is enforced. A column cannot contain arrays of varying dimensions. For example, a column that contains a three-dimensional array can only contain other three-dimensional arrays; it cannot simultaneously include a one-dimensional array. However, the arrays in a column can vary in size, where one array can contain four elements while another contains ten.

  • Array bounds, if specified, are enforced for all operations that load or alter data. Unbounded arrays may have as many elements as will fit in the allotted binary size.

  • An array has a maximum binary size. If this size is not set when the array is defined, a default value is used.

  • Arrays do not support LONG types (like LONG VARBINARY or LONG VARCHAR) or user-defined types (like Geometry).

Syntax for column definition

Arrays used in column definitions can be either bounded or unbounded. Bounded arrays must specify a maximum number of elements. Unbounded arrays can specify a maximum binary size (in bytes) for the array, or the value of DefaultArrayBinarySize is used. You can specify a bound or a binary size but not both. For more information about these values, see Limits on Element Count and Collection Size.

Type Syntax Semantics
Bounded array

ARRAY[data_type, max_elements]

Example:

ARRAY[VARCHAR(50),100]

Can contain no more than max_elements elements. Attempting to add more is an error.

Has a binary size of the size of the data type multiplied by the maximum number of elements (possibly rounded up).

Unbounded array with maximum binary size

ARRAY[data_type](max_size)

Example:

ARRAY[VARCHAR(50)](32000)

Can contain as many elements as fit in max_size. Ignores the value of DefaultArrayBinarySize.
Unbounded array with default binary size

ARRAY[data_type]

Example:

ARRAY[VARCHAR(50)]

Can contain as many elements as fit in the default binary size.

Equivalent to:

ARRAY[data_type](DefaultArrayBinarySize)

The following example defines a table for customers using an unbounded array:

=> CREATE TABLE customers (id INT, name VARCHAR, email ARRAY[VARCHAR(50)]);

The following example uses a bounded array for customer email addresses and an unbounded array for order history:

=> CREATE TABLE customers (id INT, name VARCHAR, email ARRAY[VARCHAR(50),5], orders ARRAY[INT]);

The following example uses an array that has ROW elements:

=> CREATE TABLE orders(
  orderid INT,
  accountid INT,
  shipments ARRAY[
    ROW(
      shipid INT,
      address ROW(
        street VARCHAR,
        city VARCHAR,
        zip INT
        ),
      shipdate DATE
    )
  ]
 );

To declare a multi-dimensional array, use nesting. For example, ARRAY[ARRAY[int]] specifies a two-dimensional array.
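
For example, the following sketch defines a column holding two-dimensional arrays of integers, bounded in both dimensions (the table and column names are illustrative):

=> CREATE TABLE grids (id INT, cells ARRAY[ARRAY[INT,10],10]);
CREATE TABLE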

Syntax for direct construction (literals)

Use the ARRAY keyword to construct an array value. The following example creates an array of integer values.

=> SELECT ARRAY[1,2,3];
 array
-------
 [1,2,3]
(1 row)

You can nest an array inside another array, as in the following example.

=> SELECT ARRAY[ARRAY[1],ARRAY[2]];
   array
-----------
 [[1],[2]]
(1 row)

If an array of arrays contains no null elements and no function calls, you can abbreviate the syntax:

=> SELECT ARRAY[[1,2],[3,4]];
     array
---------------
 [[1,2],[3,4]]
(1 row)

-- not valid:
=> SELECT ARRAY[[1,2],null,[3,4]];

ERROR 4856:  Syntax error at or near "null" at character 20
LINE 1: SELECT ARRAY[[1,2],null,[3,4]];
                           ^

Array literals can contain elements of all scalar types, ROW, and ARRAY. ROW elements must all have the same set of fields:

=> SELECT ARRAY[ROW(1,2),ROW(1,3)];
               array
-----------------------------------
 [{"f0":1,"f1":2},{"f0":1,"f1":3}]
(1 row)

=> SELECT ARRAY[ROW(1,2),ROW(1,3,'abc')];
ERROR 3429:  For 'ARRAY', types ROW(int,int) and ROW(int,int,unknown) are inconsistent

Because the elements are known at the time you directly construct an array, these arrays are implicitly bounded.

You can use ARRAY literals in comparisons, as in the following example:

=> SELECT id.name, id.num, GPA FROM students
   WHERE major = ARRAY[ROW('Science','Physics')];
 name  | num | GPA
-------+-----+-----
 bob   | 121 | 3.3
 carol | 123 | 3.4
(2 rows)

Output format

Queries of array columns return JSON format, with the values shown in comma-separated lists in brackets. The following example shows a query that includes array columns.

=> SELECT cust_custkey,cust_custstaddress,cust_custcity,cust_custstate from cust;
cust_custkey |               cust_custstaddress                      |                cust_custcity                | cust_custstate
--------------+-------------------------------------------------------+---------------------------------------------+----------------
      342176 | ["668 SW New Lane","518 Main Ave","7040 Campfire Dr"] | ["Winchester","New Hyde Park","Massapequa"] | ["VA","NY","NY"]
      342799 | ["2400 Hearst Avenue","3 Cypress Street"]             | ["Berkeley","San Antonio"]                  | ["CA","TX"]
      342845 | ["336 Boylston Street","180 Clarkhill Rd"]            | ["Boston","Amherst"]                        | ["MA","MA"]
      342321 | ["95 Fawn Drive"]                                     | ["Allen Park"]                              | ["MI"]
      342989 | ["5 Thompson St"]                                     | ["Massillon"]                               | ["OH"]
(5 rows)

Note that JSON format escapes some characters that would not be escaped in native VARCHARs. For example, if you insert "c:\users\data" into an array, the JSON output for that value is "c:\\users\\data".

Element access

Arrays are 0-indexed. The first element's ordinal position is 0, second is 1, and so on.

You can access (dereference) elements from an array by index:

=> SELECT (ARRAY['a','b','c','d','e'])[1];
array
-------
b
(1 row)

To specify a range, use the format start:end. The end of the range is non-inclusive.

=> SELECT(ARRAY['a','b','c','d','e','f','g'])[1:4];
array
---------
["b","c","d"]
(1 row)

To dereference an element from a multi-dimensional array, put each index in brackets:

=> SELECT(ARRAY[ARRAY[1,2],ARRAY[3,4]])[0][0];
 array
-------
 1
(1 row)

Out-of-bound index references return NULL.
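
For example, referencing index 10 of a three-element array returns NULL, which vsql renders as blank by default:

=> SELECT (ARRAY['a','b','c'])[10];
 array
-------

(1 row)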

Limits on element count and collection size

When declaring a collection type for a table column, you can limit either the number of elements or the total binary size of the collection. During query processing, Vertica always reserves the maximum memory needed for the column, based on either the element count or the binary size. If this size is much larger than your data actually requires, setting one of these limits can improve query performance by reducing the amount of memory that must be reserved for the column.

You can change the bounds of a collection, including changing between bounded and unbounded collections, by casting. See Casting.

A bounded collection specifies a maximum element count. A value in a bounded collection column may contain fewer elements, but it may not contain more. Any attempt to insert more elements into a bounded collection than the declared maximum is an error. A bounded collection has a binary size that is the product of the data-type size and the maximum number of elements, possibly rounded up.

An unbounded collection specifies a binary size in bytes, explicitly or implicitly. It may contain as many elements as can fit in that binary size.

If a nested array specifies bounds for all dimensions, Vertica sets a single bound that is the product of the bounds. In the following example, the inner and outer arrays each have a bound of 10, but only a total element count of 100 is enforced.

ARRAY[ARRAY[INT,10],10]

If a nested array specifies a bound for only the outer collection, it is treated as the total bound. The previous example is equivalent to the following:

ARRAY[ARRAY[INT],100]

You must either specify bounds for all nested collections or specify a bound only for the outer one. For any other distribution of bounds, Vertica treats the collection as unbounded.

Instead of specifying a bound, you can specify a maximum binary size for an unbounded collection. The binary size acts as an absolute limit, regardless of how many elements the collection contains. Collections that do not specify a maximum binary size use the value of DefaultArrayBinarySize. This size is set at the time the collection is defined and is not affected by later changes to the value of DefaultArrayBinarySize.

You cannot set a maximum binary size for a bounded collection, only an unbounded one.
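The following sketch shows the three variants in a single table definition; the table and column names are hypothetical placeholders:

=> CREATE TABLE inventory (
     sku     VARCHAR,
     bin_ids ARRAY[INT, 10],        -- bounded: at most 10 elements
     notes   ARRAY[VARCHAR](32000), -- unbounded, explicit maximum binary size of 32000 bytes
     tags    ARRAY[VARCHAR]         -- unbounded, uses DefaultArrayBinarySize
   );
CREATE TABLE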

You can change the bounds or the binary size of an array column using ALTER TABLE as in the following example:

=> ALTER TABLE cust ALTER COLUMN orders SET DATA TYPE ARRAY[INTEGER](100);

If the change reduces the size of the collection and would result in data loss, the change fails.

Comparisons

All collections support equality (=), inequality (<>), and null-safe equality (<=>). 1D collections also support comparison operators (<, <=, >, >=) between collections of the same type (arrays or sets). Comparisons follow these rules:

  • A null collection is ordered last.

  • Non-null collections are compared element by element, using the ordering rules of the element's data type. The relative order of the first pair of non-equal elements determines the order of the two collections.

  • If all elements in both collections are equal up to the length of the shorter collection, the shorter collection is ordered before the longer one.

  • If all elements in both collections are equal and the collections are of equal length, the collections are equal.
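For illustration, a hedged sketch of these rules using 1D array literals; the first result follows from comparing the second elements, and the second from the shorter-collection rule:

=> SELECT ARRAY[1,2,3] < ARRAY[1,3,0];
 ?column?
----------
 t
(1 row)

=> SELECT ARRAY[1,2] < ARRAY[1,2,0];
 ?column?
----------
 t
(1 row)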

Null-handling

Null semantics for collections are consistent with normal columns in most regards. See NULL sort order for more information on null-handling.

The null-safe equality operator (<=>) behaves differently from equality (=) when the collection is null rather than empty. Comparing a collection to NULL strictly returns null:

=> SELECT ARRAY[1,3] = NULL;
?column?
----------

(1 row)

=> SELECT ARRAY[1,3] <=> NULL;
 ?column?
----------
 f
(1 row)

In the following example, the grants column in the table is null for employee 99:

=> SELECT grants = NULL FROM employees WHERE id=99;
 ?column?
----------

(1 row)

=> SELECT grants <=> NULL FROM employees WHERE id=99;
 ?column?
----------
 t
(1 row)

Empty collections are not null and behave as expected:

=> SELECT ARRAY[]::ARRAY[INT] = ARRAY[]::ARRAY[INT];
 ?column?
----------
 t
(1 row)

Collections are compared element by element. If a comparison depends on a null element, the result is unknown (null), not false. For example, ARRAY[1,2,null]=ARRAY[1,2,null] and ARRAY[1,2,null]=ARRAY[1,2,3] both return null, but ARRAY[1,2,null]=ARRAY[1,4,null] returns false because the second elements do not match.
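A hedged sketch of the comparisons described above (the unknown result displays as blank):

=> SELECT ARRAY[1,2,null] = ARRAY[1,2,3];
 ?column?
----------

(1 row)

=> SELECT ARRAY[1,2,null] = ARRAY[1,4,null];
 ?column?
----------
 f
(1 row)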

Casting

Casting an array casts each element of the array. You can therefore cast between data types following the same rules as for casts of scalar values.

You can cast both literal arrays and array columns explicitly:

=> SELECT ARRAY['1','2','3']::ARRAY[INT];
  array
---------
[1,2,3]
(1 row)

You can change the bound of an array or set by casting. When casting to a bounded native array, inputs that are too long are truncated. When casting to a non-native array (an array containing complex data types including other arrays), if the new bounds are too small for the data the cast fails:

=> SELECT ARRAY[1,2,3]::ARRAY[VARCHAR,2];
   array
-----------
 ["1","2"]
(1 row)

=> SELECT ARRAY[ARRAY[1,2,3],ARRAY[4,5,6]]::ARRAY[ARRAY[VARCHAR,2],2];
ERROR 9227:  Output array isn't big enough
DETAIL:  Type limit is 4 elements, but value has 6 elements

If you cast to a bounded multi-dimensional array, you must specify the bounds at all levels:

=> SELECT ARRAY[ARRAY[1,2,3],ARRAY[4,5,6]]::ARRAY[ARRAY[VARCHAR,5],10];
             array
-------------------------------
 [["1","2","3"],["4","5","6"]]
(1 row)

=> SELECT ARRAY[ARRAY[1,2,3],ARRAY[4,5,6]]::ARRAY[ARRAY[VARCHAR,2]];
WARNING 9753:  Collection type bound will not be used
DETAIL:  A bound was provided for an inner dimension, but not for an outer dimension
             array
-------------------------------
 [["1","2","3"],["4","5","6"]]
(1 row)

Assignment casts and implicit casts work the same way as for scalars:

=> CREATE TABLE transactions (tid INT, prod_ids ARRAY[VARCHAR,100], quantities ARRAY[INT,100]);
CREATE TABLE

=> INSERT INTO transactions VALUES (12345, ARRAY['p1265', 'p4515'], ARRAY[15,2]);
 OUTPUT
--------
      1
(1 row)

=> CREATE TABLE txreport (prod_ids ARRAY[VARCHAR(12),100], quants ARRAY[VARCHAR(32),100]);
CREATE TABLE

=> INSERT INTO txreport SELECT prod_ids, quantities FROM transactions;
 OUTPUT
--------
      1
(1 row)

=> SELECT * FROM txreport;
     prod_ids      |   quants
-------------------+------------
 ["p1265","p4515"] | ["15","2"]
(1 row)

You can perform explicit casts, but not implicit casts, between the ARRAY and SET types (native arrays only). If the collection is unbounded and the data type does not change, the binary size is preserved. For example, if you cast an ARRAY[INT] to a SET[INT], the set has the same binary size as the array.

If you cast from one element type to another, the resulting collection uses the default binary size. If this would cause the data not to fit, the cast fails.

You cannot cast from an array to an array with a different dimensionality, for example from a two-dimensional array to a one-dimensional array.

Functions and operators

See Collection functions for a comprehensive list of functions that can be used to manipulate arrays and sets.

Collections can be used in the following ways:

Collections cannot be used in the following ways:

  • As part of an IN or NOT IN expression.

  • As partition columns when creating tables.

  • With ANALYZE_STATISTICS or TopK projections.

  • Non-native arrays only: ORDER BY, PARTITION BY, DEFAULT, SET USING, or constraints.

3.11.2 - MAP

Represents map data in external tables in the Parquet, ORC, and Avro formats only.

Represents map data in external tables in the Parquet, ORC, and Avro formats only. A MAP must use only primitive types and may not contain other complex types. You can use the MAP type in a table definition to consume columns in the data, but you cannot query those columns.

A superior alternative to MAP is ARRAY[ROW]. An array of rows can use all supported complex types and can be queried. This is the representation that INFER_TABLE_DDL suggests. For Avro data, the ROW must have fields named key and value.

Within a single table you must define all map columns using the same approach, MAP or ARRAY[ROW].

Syntax

In column definitions:

MAP<key,value>

Map input format for column definition

In a column definition in an external table, a MAP consists of a key-value pair, specified as types. The table in the following example defines a map of product IDs to names.

=> CREATE EXTERNAL TABLE store (storeID INT, inventory MAP<INT,VARCHAR(100)>)
    AS COPY FROM '...' PARQUET;
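By contrast, the following sketch shows the ARRAY[ROW] alternative described above for the same data. The field names key and value are shown for consistency with the Avro requirement, although this hypothetical example assumes Parquet data:

=> CREATE EXTERNAL TABLE store (storeID INT,
    inventory ARRAY[ROW(key INT, value VARCHAR(100))])
    AS COPY FROM '...' PARQUET;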

3.11.3 - ROW

Represents structured data (structs).

Represents structured data (structs). A ROW can contain fields of any primitive or complex type supported by Vertica.

Syntax

In column definitions:

ROW([field] type[, ...])

If the field name is omitted, Vertica generates names starting with "f0".

In literals:

ROW(value [AS field] [, ...]) [AS name(field[, ...])]

Syntax for column definition

In a column definition, a ROW consists of one or more comma-separated pairs of field names and types. In the following example, the Parquet data file contains a struct for the address, which is read as a ROW in an external table:

=> CREATE EXTERNAL TABLE customers (name VARCHAR,
    address ROW(street VARCHAR, city VARCHAR, zipcode INT))
    AS COPY FROM '...' PARQUET;

ROWs can be nested; a field can have a type of ROW:

=> CREATE TABLE employees(
    employeeID INT,
    personal ROW(
      name VARCHAR,
      address ROW(street VARCHAR, city VARCHAR, zipcode INT),
      taxID INT),
    department VARCHAR);

ROWs can contain arrays:

=> CREATE TABLE customers(
  name VARCHAR,
  contact ROW(
    street VARCHAR,
    city VARCHAR,
    zipcode INT,
    email ARRAY[VARCHAR]
  ),
  accountid INT );

When loading data, the primitive types in the table definition must match those in the data. The ROW structure must also match; a ROW must contain all and only the fields in the struct in the data.

Restrictions on ROW columns

ROW columns have several restrictions:

  • Maximum nesting depth is 100.
  • Vertica tables support up to 9800 columns and fields. The ROW itself is not counted, only its fields.
  • ROW columns cannot use any constraints (such as NOT NULL) or defaults.
  • ROW fields cannot be auto_increment or setof.
  • ROW definition must include at least one field.
  • ROW is a reserved keyword within a ROW definition, but is permitted as the name of a table or column.
  • Tables containing ROW columns cannot also contain IDENTITY, default, SET USING, or named sequence columns.

Syntax for direct construction (literals)

In a literal, such as a value in a comparison operation, a ROW consists of one or more values. If you omit field names in the ROW expression, Vertica generates them automatically. If you do not coerce types, Vertica infers the types from the data values.

=> SELECT ROW('Amy',2,false);
                row
--------------------------------------------
 {"f0":"Amy","f1":2,"f2":false}
(1 row)

You can use an AS clause to name the ROW and its fields:

=> SELECT ROW('Amy',2,false) AS student(name, id, current);
               student
--------------------------------------------
 {"name":"Amy","id":2,"current":false}
(1 row)

You can also name individual fields using AS. This query produces the same output as the previous one:

=> SELECT ROW('Amy' AS name, 2 AS id, false AS current) AS student;

You do not need to name all fields.
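For example, this hedged sketch names only the first field; the remaining fields receive generated names, as described above:

=> SELECT ROW('Amy' AS name, 2, false) AS student;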

In an array of ROW elements, if you use AS to name fields and the names differ among the elements, Vertica uses the right-most names for all elements:

=> SELECT ARRAY[ROW('Amy' AS name, 2 AS id),ROW('Fred' AS first_name, 4 AS id)];
                           array
------------------------------------------------------------
 [{"first_name":"Amy","id":2},{"first_name":"Fred","id":4}]
(1 row)

You can coerce types explicitly:

=> SELECT ROW('Amy',2.5::int,false::varchar);
               row
------------------------------------------
 {"f0":"Amy","f1":3,"f2":"f"}
(1 row)

Escape single quotes in literal inputs using single quotes, as in the following example:

=> SELECT ROW('Howard''s house',2,false);
                      row
---------------------------------------------------
 {"f0":"Howard's house","f1":2,"f2":false}
(1 row)

You can use fields of all scalar types, ROW, and ARRAY, as in the following example:

=> SELECT id.name, major, GPA FROM students
   WHERE id = ROW('alice',119, ARRAY['alice@example.com','ap16@cs.example.edu']);
 name  |               major                | GPA
-------+------------------------------------+-----
 alice | [{"school":"Science","dept":"CS"}] | 3.8
(1 row)

Output format

ROW values are output in JSON format as in the following example.

=> CREATE EXTERNAL TABLE customers (name VARCHAR,
    address ROW(street VARCHAR, city VARCHAR, zipcode INT))
    AS COPY FROM '...' PARQUET;

=> SELECT address FROM customers WHERE address.city ='Pasadena';
                              address
--------------------------------------------------------------------
 {"street":"100 Main St Apt 4B","city":"Pasadena","zipcode":91001}
 {"street":"100 Main St Apt 4A","city":"Pasadena","zipcode":91001}
 {"street":"23 Fifth Ave Apt 8C","city":"Pasadena","zipcode":91001}
 {"street":"15 Raymond Dr","city":"Pasadena","zipcode":91003}
(4 rows)

The following table specifies the mappings from Vertica data types to JSON data types.

Vertica Type JSON Type
Integer Integer
Float Numeric
Numeric Numeric
Boolean Boolean
All others String

Comparisons

ROW supports equality (=), inequality (<>), and null-safe equality (<=>) between inputs that have the same set of fields. ROWs that contain only primitive types, including nested ROWs of primitive types, also support comparison operators (<, <=, >, >=).

Two ROWs are equal if and only if all fields are equal. Vertica compares fields in order until an inequality is found or all fields have been compared. The evaluation of the first non-equal field determines which ROW is greater:

=> SELECT ROW(1, 'joe') > ROW(2, 'bob');
?column?
----------
f
(1 row)

Comparisons between ROWs with different schemas fail:

=> SELECT ROW(1, 'joe') > ROW(2, 'bob', 123);
ERROR 5162:  Unequal number of entries in row expressions

If the result of a comparison depends on a null field, the result is null:

=>  select row(1, null, 3) = row(1, 2, 3);
 ?column?
----------

(1 row)

Null-handling

If a struct exists but a field value is null, Vertica assigns NULL as its value in the ROW. A struct where all fields are null is treated as a ROW with null fields. If the struct itself is null, Vertica reads the ROW as NULL.

Casting

Casting a ROW casts each field. You can therefore cast between data types following the same rules as for casts of scalar values.

The following example casts the contact ROW in the customers table, changing the zipcode field from INT to VARCHAR and adding a bound to the array:

=> SELECT contact::ROW(VARCHAR,VARCHAR,VARCHAR,ARRAY[VARCHAR,20]) FROM customers;
                                                         contact
---------------------------------------------------------------------------------------------------------------------
 {"street":"911 San Marcos St","city":"Austin","zipcode":"73344","email":["missy@mit.edu","mcooper@cern.gov"]}
 {"street":"100 Main St Apt 4B","city":"Pasadena","zipcode":"91001","email":["shelly@meemaw.name","cooper@caltech.edu"]}
 {"street":"100 Main St Apt 4A","city":"Pasadena","zipcode":"91001","email":["hofstadter@caltech.edu"]}
 {"street":"23 Fifth Ave Apt 8C","city":"Pasadena","zipcode":"91001","email":[]}
 {"street":null,"city":"Pasadena","zipcode":"91001","email":["raj@available.com"]}

(6 rows)

You can specify new field names to change them in the output:

=> SELECT contact::ROW(str VARCHAR, city VARCHAR, zip VARCHAR, email ARRAY[VARCHAR,20]) FROM customers;
                                                     contact
------------------------------------------------------------------------------------------------------------------
 {"str":"911 San Marcos St","city":"Austin","zip":"73344","email":["missy@mit.edu","mcooper@cern.gov"]}
 {"str":"100 Main St Apt 4B","city":"Pasadena","zip":"91001","email":["shelly@meemaw.name","cooper@caltech.edu"]}
 {"str":"100 Main St Apt 4A","city":"Pasadena","zip":"91001","email":["hofstadter@caltech.edu"]}
 {"str":"23 Fifth Ave Apt 8C","city":"Pasadena","zip":"91001","email":[]}
 {"str":null,"city":"Pasadena","zip":"91001","email":["raj@available.com"]}
(6 rows)

Supported operators and predicates

ROW values may be used in queries in the following ways:

  • INNER and OUTER JOIN

  • Comparisons, IN, BETWEEN (non-nullable filters only)

  • IS NULL, IS NOT NULL

  • CASE

  • GROUP BY, ORDER BY

  • SELECT DISTINCT

  • Arguments to user-defined scalar, transform, and analytic functions

The following operators and predicates are not supported for ROW values:

  • Math operators

  • Type coercion of whole rows (coercion of field values is supported)

  • BITWISE, LIKE

  • MLA (ROLLUP, CUBE, GROUPING SETS)

  • Aggregate functions including MAX, MIN, and SUM

  • Set operators including UNION, UNION ALL, MINUS, and INTERSECT

COUNT is not supported for ROWs returned from user-defined scalar functions, but is supported for ROW columns and literals.

In comparison operations (including implicit comparisons like ORDER BY), a ROW literal is treated as the sequence of its field values. For example, the following two statements are equivalent:

GROUP BY ROW(zipcode, city)
GROUP BY zipcode, city

Using rows in views and subqueries

You can use ROW columns to construct views and in subqueries. Consider employee and customer tables with the following definitions:

=> CREATE EXTERNAL TABLE customers(name VARCHAR,
        address ROW(street VARCHAR, city VARCHAR, zipcode INT), accountID INT)
    AS COPY FROM '...' PARQUET;

=> CREATE EXTERNAL TABLE employees(employeeID INT,
        personal ROW(name VARCHAR,
        address ROW(street VARCHAR, city VARCHAR, zipcode INT),
        taxID INT), department VARCHAR)
    AS COPY FROM '...' PARQUET;

The following example creates a view and queries it.

=> CREATE VIEW neighbors (num_neighbors, area(city, zipcode))
AS SELECT count(*), ROW(address.city, address.zipcode)
FROM customers GROUP BY address.city, address.zipcode;
CREATE VIEW

=> SELECT employees.personal.name, neighbors.area FROM neighbors, employees
WHERE employees.personal.address.zipcode=neighbors.area.zipcode AND neighbors.num_neighbors > 1;
        name        |                area
--------------------+-------------------------------------
 Sheldon Cooper     | {"city":"Pasadena","zipcode":91001}
 Leonard Hofstadter | {"city":"Pasadena","zipcode":91001}
(2 rows)

3.11.4 - SET

Represents a collection of unordered, unique elements.

Represents a collection of unordered, unique elements. Sets may contain only primitive types. In sets, unlike in arrays, element position is not meaningful.

Sets do not support LONG types (like LONG VARBINARY or LONG VARCHAR) or user-defined types (like Geometry).

If you populate a set from an array, Vertica sorts the values and removes duplicate elements. If you do not care about element position and plan to run queries that check for the presence of specific elements (find, contains), using a set could improve query performance.

Sets can be bounded, meaning they specify a maximum element count, or unbounded. Unbounded sets have a maximum binary size, which can be set explicitly or defaulted. See Limits on Element Count and Collection Size.

Syntax

In column definitions:

SET[data_type, max_elements] |
SET[data_type](max_size) |
SET[data_type]

In literals:

SET[value[, ...] ]

Restrictions

  • Sets support only data of primitive (scalar) types.

  • Bounds, if specified, are enforced for all operations that load or alter data. Unbounded sets may have as many elements as will fit in the allotted binary size.

  • A set has a maximum binary size. If this size is not set when the set is defined, a default value is used.

Syntax for column definition

Sets used in column definitions can be either bounded or unbounded. Bounded sets must specify a maximum number of elements. Unbounded sets can specify a maximum binary size for the set, or the value of DefaultArrayBinarySize is used. You can specify a bound or a binary size but not both. For more information about these values, see Limits on Element Count and Collection Size.

Type Syntax Semantics
Bounded set

SET[data_type, max_elements]

Example:

SET[VARCHAR(50),100]

Can contain no more than max_elements elements. Attempting to add more is an error.

Has a binary size of the size of the data type multiplied by the maximum number of elements (possibly rounded up).

Unbounded set with maximum size

SET[data_type](max_size)

Example:

SET[VARCHAR(50)](32000)

Can contain as many elements as fit in max_size. Ignores the value of DefaultArrayBinarySize.
Unbounded set

SET[data_type]

Example:

SET[VARCHAR(50)]

Can contain as many elements as fit in the default binary size.

Equivalent to:

SET[data_type](DefaultArrayBinarySize)

The following example defines a table with an unbounded set column.

=> CREATE TABLE users
(
user_id INTEGER,
display_name VARCHAR,
email_addrs SET[VARCHAR]
);

When you load array data into a column defined as a set, the array data is automatically converted to a set.
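For illustration, a hedged sketch using the users table above; the inserted addresses are placeholders, and on load the duplicate element is removed and the elements are sorted:

=> INSERT INTO users VALUES
   (1, 'A. Aspen', ARRAY['aspen@example.com','aspen@example.com','a.aspen@example.org']);
 OUTPUT
--------
      1
(1 row)

=> SELECT email_addrs FROM users WHERE user_id = 1;
                 email_addrs
----------------------------------------------
 ["a.aspen@example.org","aspen@example.com"]
(1 row)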

Syntax for direct construction (literals)

Use the SET keyword to construct a set value. Literal set values are contained in brackets. For example, to create a set of INT, you would do the following:

=> SELECT SET[1,2,3];
  set
-------
 [1,2,3]
(1 row)

You can explicitly convert an array to a set by casting, as in the following example:

=> SELECT ARRAY[1, 5, 2, 6, 3, 0, 6, 4]::SET[INT];
     set
-----------------
[0,1,2,3,4,5,6]
(1 row)

Notice that duplicate elements have been removed and the elements have been sorted.

Because the elements are known at the time you directly construct a set, these sets are implicitly bounded.

Output format

Sets are shown in a JSON-like format, with comma-separated elements contained in brackets (like arrays). In the following example, the email_addrs column is a set.

=> SELECT custkey,email_addrs FROM customers LIMIT 4;
 custkey |                           email_addrs
---------+------------------------------------------------------------------------
 342176  | ["joe.smith@example.com"]
 342799  | ["bob@example,com","robert.jones@example.com"]
 342845  | ["br92@cs.example.edu"]
 342321  | ["789123@example-isp.com","sjohnson@eng.example.com","sara@johnson.example.name"]

Limits on element count and collection size

When declaring a collection type for a table column, you can limit either the number of elements or the total binary size of the collection. During query processing, Vertica always reserves the maximum memory needed for the column, based on either the element count or the binary size. If this size is much larger than your data actually requires, setting one of these limits can improve query performance by reducing the amount of memory that must be reserved for the column.

You can change the bounds of a collection, including changing between bounded and unbounded collections, by casting. See Casting.

A bounded collection specifies a maximum element count. A value in a bounded collection column may contain fewer elements, but it may not contain more. Any attempt to insert more elements into a bounded collection than the declared maximum is an error. A bounded collection has a binary size that is the product of the data-type size and the maximum number of elements, possibly rounded up.

An unbounded collection specifies a binary size in bytes, explicitly or implicitly. It may contain as many elements as can fit in that binary size.

Instead of specifying a bound, you can specify a maximum binary size for an unbounded collection. The binary size acts as an absolute limit, regardless of how many elements the collection contains. Collections that do not specify a maximum binary size use the value of DefaultArrayBinarySize. This size is set at the time the collection is defined and is not affected by later changes to the value of DefaultArrayBinarySize.

You cannot set a maximum binary size for a bounded collection, only an unbounded one.

Comparisons

All collections support equality (=), inequality (<>), and null-safe equality (<=>). 1D collections also support comparison operators (<, <=, >, >=) between collections of the same type (arrays or sets). Comparisons follow these rules:

  • A null collection is ordered last.

  • Non-null collections are compared element by element, using the ordering rules of the element's data type. The relative order of the first pair of non-equal elements determines the order of the two collections.

  • If all elements in both collections are equal up to the length of the shorter collection, the shorter collection is ordered before the longer one.

  • If all elements in both collections are equal and the collections are of equal length, the collections are equal.

Null handling

Null semantics for collections are consistent with normal columns in most regards. See NULL sort order for more information on null-handling.

The null-safe equality operator (<=>) behaves differently from equality (=) when the collection is null rather than empty. Comparing a collection to NULL strictly returns null:

=> SELECT ARRAY[1,3] = NULL;
?column?
----------

(1 row)

=> SELECT ARRAY[1,3] <=> NULL;
 ?column?
----------
 f
(1 row)

In the following example, the grants column in the table is null for employee 99:

=> SELECT grants = NULL FROM employees WHERE id=99;
 ?column?
----------

(1 row)

=> SELECT grants <=> NULL FROM employees WHERE id=99;
 ?column?
----------
 t
(1 row)

Empty collections are not null and behave as expected:

=> SELECT ARRAY[]::ARRAY[INT] = ARRAY[]::ARRAY[INT];
 ?column?
----------
 t
(1 row)

Collections are compared element by element. If a comparison depends on a null element, the result is unknown (null), not false. For example, ARRAY[1,2,null]=ARRAY[1,2,null] and ARRAY[1,2,null]=ARRAY[1,2,3] both return null, but ARRAY[1,2,null]=ARRAY[1,4,null] returns false because the second elements do not match.

Casting

Casting a set casts each element of the set. You can therefore cast between data types following the same rules as for casts of scalar values.

You can cast both literal sets and set columns explicitly:

=> SELECT SET['1','2','3']::SET[INT];
   set
---------
[1,2,3]
(1 row)

=> CREATE TABLE transactions (tid INT, prod_ids SET[VARCHAR], quantities SET[VARCHAR(32)]);

=> INSERT INTO transactions VALUES (12345, SET['p1265', 'p4515'], SET['15','2']);

=> SELECT quantities :: SET[INT] FROM transactions;
 quantities
------------
   [15,2]
(1 row)

Assignment casts and implicit casts work the same way as for scalars.

You can perform explicit casts, but not implicit casts, between ARRAY and SET types. If the collection is unbounded and the data type does not change, the binary size is preserved. For example, if you cast an ARRAY[INT] to a SET[INT], the set has the same binary size as the array.

When casting an array to a set, Vertica first casts each element and then sorts the set and removes duplicates. If two source values are cast to the same target value, one of them will be removed. For example, if you cast an array of FLOAT to a set of INT, two values in the array might be rounded to the same integer and then be treated as duplicates. This also happens if the array contains more than one value that is cast to NULL.
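For example, a hedged sketch in which two numeric values round to the same integer and are therefore collapsed into a single set element:

=> SELECT ARRAY[1.4, 1.6, 2.1]::SET[INT];
  set
-------
 [1,2]
(1 row)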

If you cast from one element type to another, the resulting collection uses the default binary size. If this would cause the data not to fit, the cast fails.

Functions and operators

See Collection functions for a comprehensive list of functions that can be used to manipulate arrays and sets.

Collections can be used in the following ways:

Collections cannot be used in the following ways:

  • As part of an IN or NOT IN expression.

  • As partition columns when creating tables.

  • With ANALYZE_STATISTICS or TopK projections.

  • Non-native arrays only: ORDER BY, PARTITION BY, DEFAULT, SET USING, or constraints.

3.12 - Data type mappings between Vertica and Oracle

Oracle uses proprietary data types for all main data types, such as VARCHAR, INTEGER, FLOAT, DATE.

Oracle uses proprietary data types for all main data types, such as VARCHAR, INTEGER, FLOAT, DATE. Before migrating a database from Oracle to Vertica, first convert the schema to minimize errors and time spent fixing erroneous data issues.

The following table compares the behavior of Oracle data types to Vertica data types.

Oracle Vertica Notes

NUMBER

(no explicit precision)

INTEGER

In Oracle, the NUMBER data type with no explicit precision stores each number N as an integer M, together with a scale S. The scale can range from -84 to 127, while the precision of M is limited to 38 digits. Thus:

N = M * 10^S

When precision is specified, precision/scale applies to all entries in the column. If omitted, the scale defaults to 0.

For the common case—Oracle NUMBER with no explicit precision used to store only integer values—the Vertica INTEGER data type is the most appropriate and the fastest equivalent data type. However, INTEGER is limited to a little less than 19 digits, with a scale of 0: [-9223372036854775807, +9223372036854775807].

NUMERIC

If an Oracle column contains integer values outside of the range [-9223372036854775807, +9223372036854775807], then use the Vertica data type NUMERIC(p,0) where p is the maximum number of digits required to represent values of the source data.

If the data is exact with fractional places—for example dollar amounts—Vertica recommends NUMERIC(p,s) where p is the precision (total number of digits) and s is the maximum scale (number of decimal places).

Vertica conforms to standard SQL, which requires that p ≥ s and s ≥ 0. Vertica's NUMERIC data type is most effective for p=18, and increasingly expensive for p=37, 58, 67, etc., where p ≤ 1024.

FLOAT Even though no explicit scale is specified for an Oracle NUMBER column, Oracle allows non-integer values, each with its own scale. If the data stored in the column is approximate, Vertica recommends using the Vertica data type FLOAT, which is standard IEEE floating point, like ORACLE BINARY_DOUBLE.

NUMBER(P,0)

P ≤ 18

INTEGER For Oracle NUMBER data types with 0 scale and a precision less than or equal to 18, use the Vertica INTEGER data type.

NUMBER(P,0)

P > 18

NUMERIC(p,0) In the rare case where an Oracle column specifies precision greater than 18, use the Vertica data type NUMERIC(p, 0), where p = P.

NUMBER(P,S)

All cases other than above

NUMERIC(p,s)
  • When P ≥ S and S ≥ 0, use p = P and s = S, unless the data allows reducing P or using FLOAT as discussed above.
  • If S > P, use p = S, s = S.
  • If S < 0, use p = P – S, s = 0.
FLOAT
NUMERIC(P,S) Rarely used in Oracle, see notes for Oracle NUMBER.
DECIMAL(P,S) Synonym for Oracle NUMERIC.
BINARY_FLOAT FLOAT Same as FLOAT(53) or DOUBLE PRECISION
BINARY_DOUBLE FLOAT Same as FLOAT(53) or DOUBLE PRECISION
RAW VARBINARY

Maximum sizes compared:

  • Oracle RAW data type: 2000 bytes

  • Vertica VARBINARY: 65000 bytes

LONG RAW LONG VARBINARY

Maximum sizes compared:

  • Oracle’s LONG RAW is 2GB

  • Vertica LONG VARBINARY is 32M bytes/octets (~30MB)

CHAR(n) CHAR(n)

Maximum sizes compared:

  • Oracle CHAR: 2000 bytes

  • Vertica CHAR : 65000 bytes

NCHAR(n) CHAR(n*3) Vertica supports national characters with CHAR(n) as a variable-length, UTF-8-encoded Unicode character string. UTF-8 represents ASCII in 1 byte, most European characters in 2 bytes, and most oriental and Middle Eastern characters in 3 bytes.
VARCHAR2(n) VARCHAR(n)

Maximum sizes compared:

  • Oracle VARCHAR2: 4000 bytes

  • Vertica VARCHAR: 65000 bytes

NVARCHAR2(n) VARCHAR(n*3) See notes for NCHAR.
DATE TIMESTAMP Oracle’s DATE is different from the SQL standard DATE data type implemented by Vertica. Oracle’s DATE includes the time (no fractional seconds), while Vertica DATE data types include only date as per the SQL standard.
DATE
TIMESTAMP TIMESTAMP TIMESTAMP defaults to six places—that is, to microseconds.
TIMESTAMP WITH TIME ZONE TIMESTAMP WITH TIME ZONE TIME ZONE defaults to the currently SET or system time zone.
INTERVAL YEAR TO MONTH INTERVAL YEAR TO MONTH As per the SQL standard, you can qualify Vertica INTERVAL data types with the YEAR TO MONTH subtype.
INTERVAL DAY TO SECOND INTERVAL DAY TO SECOND The default subtype for Vertica INTERVAL data types is DAY TO SECOND.
CLOB LONG VARCHAR

You can store a CLOB (character large object) or BLOB (binary large object) value in a table or in an external location. The maximum size of a CLOB or BLOB is 128 TB.

You can store Vertica LONG data types only in LONG VARCHAR and LONG VARBINARY columns. The maximum size of LONG data types is 32M bytes.

BLOB LONG VARBINARY
LONG LONG VARCHAR

Oracle recommends using CLOB and BLOB data types instead of LONG and LONG RAW data types.

An Oracle table can contain only one LONG column. The maximum size of a LONG or LONG RAW data type is 2 GB.

LONG RAW LONG VARBINARY

4 - Configuration parameters

Vertica supports a wide variety of configuration parameters that affect many facets of database behavior. These parameters can be set with the appropriate ALTER statements at one or more levels, listed here in descending order of precedence:

  1. User (ALTER USER)

  2. Session (ALTER SESSION)

  3. Node (ALTER NODE)

  4. Database (ALTER DATABASE)

Not all parameters can be set at all levels. Consult the documentation of individual parameters for restrictions.
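For illustration, a hedged sketch of setting a parameter at two different levels; both parameters are documented later in this section, the exact form can vary by Vertica version, and you should confirm the allowed levels for each parameter in CONFIGURATION_PARAMETERS:

=> ALTER DATABASE DEFAULT SET PARAMETER MaxClientSessions = 100;
=> ALTER SESSION SET DivideZeroByZeroThrowsError = 0;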

You can query the CONFIGURATION_PARAMETERS system table to obtain the current settings for all user-accessible parameters. For example, the following query returns settings for partitioning parameters: their current and default values, which levels they can be set at, and whether changes require a database restart to take effect:

=> SELECT parameter_name, current_value, default_value, allowed_levels, change_requires_restart
      FROM configuration_parameters  WHERE parameter_name ILIKE '%partitioncount%';
    parameter_name    | current_value | default_value | allowed_levels | change_requires_restart
----------------------+---------------+---------------+----------------+-------------------------
 MaxPartitionCount    | 1024          | 1024          | NODE, DATABASE | f
 ActivePartitionCount | 1             | 1             | NODE, DATABASE | f
(2 rows)

4.1 - General parameters

The following parameters configure basic database operations.

The following parameters configure basic database operations. Query the CONFIGURATION_PARAMETERS system table to determine what levels (node, session, user, database) are valid for a given parameter.

ApplyEventsDuringSALCheck
Boolean, specifies whether Vertica uses catalog events to filter out dropped corrupt partitions during node startup. Dropping corrupt partitions can speed node recovery.

When disabled (0), Vertica reports corrupt partitions, but takes no action. Leaving corrupt partitions in place can reset the current projection checkpoint epoch to the epoch before the corruption occurred.

This parameter has no effect on unpartitioned tables.

Default: 0

ApportionedFileMinimumPortionSizeKB
Specifies the minimum portion size (in kilobytes) for use with apportioned file loads. Vertica apportions a file load across multiple nodes only if:
  • The load can be divided into portions at least equaling this value.

  • EnableApportionedFileLoad and EnableApportionLoad are set to 1 (enabled).

See also EnableApportionLoad and EnableApportionedFileLoad.

Default: 1024

BlockedSocketGracePeriod
Sets how long a session socket remains blocked while awaiting client input or output for a given query. See Handling session socket blocking.

Default: None (Socket blocking can continue indefinitely.)

CatalogCheckpointPercent
Specifies the threshold at which a checkpoint is created for the database catalog.

By default, this parameter is set to 50 (percent), so when transaction logs reach 50% of the size of the last checkpoint, Vertica adds a checkpoint. Each checkpoint demarcates all changes to the catalog since the last checkpoint.

Default: 50 (percent)

ClusterSequenceCacheMode
Boolean, specifies whether the initiator node requests cache for other nodes in a cluster, and then sends cache to other nodes along with the execution plan, one of the following:
  • 1 (enabled): Initiator node requests cache.

  • 0 (disabled): All nodes request their own cache.

See Distributing sequences.

Default: 1 (enabled)

CompressCatalogOnDisk
Whether to compress the size of the catalog on disk, one of the following:
  • 0: Do not compress.

  • 1: Compress checkpoints, but not logs.

  • 2: Compress checkpoints and logs.

This parameter is most effective if the catalog disk partition is small (<50 GB) and the metadata is large (hundreds of tables, partitions, or nodes).

Default: 1

CompressNetworkData
Boolean, specifies whether to compress all data sent over the internal network when enabled (set to 1). This compression speeds up network traffic at the expense of added CPU load. If the network is throttling database performance, enable compression to correct the issue.

Default: 0

CopyFaultTolerantExpressions
Boolean, indicates whether to report record rejections during transformations and proceed (true) or abort COPY operations if a transformation fails.

Default: 0 (false)

CopyFromVerticaWithIdentity
Allows COPY FROM VERTICA and EXPORT TO VERTICA to load values into IDENTITY columns. The destination IDENTITY column is not incremented automatically. To disable the default behavior, set this parameter to 0 (zero).

Default: 1

DatabaseHeartbeatInterval
Determines the interval (in seconds) at which each node performs a health check and communicates a heartbeat. If a node does not receive a message within five times the specified interval, the node is evicted from the cluster. Setting the interval to 0 disables the feature.

See Automatic eviction of unhealthy nodes.

Default: 120

DataLoaderDefaultRetryLimit
Default number of times a data loader retries failed loads. Changing this value changes the retry limit for existing data loaders that do not specify another limit.

Default: 3

DefaultArrayBinarySize
The maximum binary size, in bytes, for an unbounded collection, if a maximum size is not specified at creation time.

Default: 65000

DefaultResourcePoolForUsers
Resource pool that is assigned to the profile of a new user, whether created in Vertica or LDAP. This pool is also assigned to users when their assigned resource pool is dropped.

You can set DefaultResourcePoolForUsers only to a global resource pool; attempts to set it to a subcluster resource pool return an error.

For details, see User resource allocation.

Default: GENERAL

DefaultTempTableLocal
Boolean, specifies whether CREATE TEMPORARY TABLE creates a local or global temporary table, one of the following:
  • 0: Create global temporary table.

  • 1: Create local temporary table.

For details, see Creating temporary tables.

Default: 0

DivideZeroByZeroThrowsError
Boolean, specifies whether to return an error if a division by zero operation is requested:
  • 0: Return 0.

  • 1: Returns an error.

Default: 1

EnableApportionedChunkingInDefaultLoadParser
Boolean, specifies whether to enable the built-in parser for delimited files to take advantage of both apportioned load and cooperative parse for potentially better performance.

Default: 1 (enable)

EnableApportionedFileLoad
Boolean, specifies whether to enable automatic apportioning across nodes of file loads using COPY FROM VERTICA. Vertica attempts to apportion the load if:
  • This parameter and EnableApportionLoad are both enabled.

  • The parser supports apportioning.

  • The load is divisible into portion sizes of at least the value of ApportionedFileMinimumPortionSizeKB.

Setting this parameter does not guarantee that loads will be apportioned, but disabling it guarantees that they will not be.

See Distributing a load.

Default: 1 (enable)

EnableApportionLoad
Boolean, specifies whether to enable automatic apportioning across nodes of data loads using COPY...WITH SOURCE. Vertica attempts to apportion the load if:
  • This parameter is enabled.

  • The source and parser both support apportioning.

Setting this parameter does not guarantee that loads will be apportioned, but disabling it guarantees that they will not be.

For details, see Distributing a load.

Default: 1 (enable)

EnableBetterFlexTypeGuessing
Boolean, specifies whether to enable more accurate type guessing when assigning data types to non-string keys in a flex table __raw__ column with COMPUTE_FLEXTABLE_KEYS or COMPUTE_FLEXTABLE_KEYS_AND_BUILD_VIEW. If this parameter is disabled (0), Vertica uses a limited set of Vertica data type assignments.

Default: 1 (enable)

EnableCooperativeParse
Boolean, specifies whether to implement multi-threaded parsing capabilities on a node. You can use this parameter for both delimited and fixed-width loads.

Default: 1 (enable)

EnableForceOuter
Boolean, specifies whether Vertica uses a table's force_outer value to implement a join. For more information, see Controlling join inputs.

Default: 0 (forced join inputs disabled)

EnableMetadataMemoryTracking
Boolean, specifies whether to enable Vertica to track memory used by database metadata in the METADATA resource pool.

Default: 1 (enable)

EnableResourcePoolCPUAffinity
Boolean, specifies whether Vertica aligns queries to the resource pool of the processing CPU. When disabled (0), queries run on any CPU, regardless of the CPU_AFFINITY_SET of the resource pool.

Default: 1

EnableStrictTimeCasts
Specifies whether all cast failures result in an error.

Default: 0 (disable)

EnableUniquenessOptimization
Boolean, specifies whether to enable query optimization that is based on guaranteed uniqueness of column values. Columns that can be guaranteed to include unique values include:

Default: 1 (enable)

EnableWithClauseMaterialization
Superseded by WithClauseMaterialization.
ExternalTablesExceptionsLimit
Determines the maximum number of COPY exceptions and rejections allowed when a SELECT statement references an external table. Set to -1 to remove any exceptions limit. See Querying external tables.

Default: 100

FailoverToStandbyAfter
Specifies the length of time that an active standby node waits before taking the place of a failed node.

This parameter is set to an interval literal.

Default: None

FencedUDxMemoryLimitMB
Sets the maximum amount of memory, in megabytes (MB), that a fenced-mode UDF can use. If a UDF attempts to allocate more memory than this limit, that attempt triggers an exception. For more information, see Fenced and unfenced modes.

Default: -1 (no limit)

FlexTableDataTypeGuessMultiplier
Specifies a multiplier that the COMPUTE_FLEXTABLE_KEYS and COMPUTE_FLEXTABLE_KEYS_AND_BUILD_VIEW functions use when assigning a data type and column width for the flex keys table. Both functions assign each key a data type, and multiply the longest key value by this factor to estimate column width. This value is not used to calculate the width of any real columns in a flex table.

Default: 2.0

FlexTableRawSize
Specifies the default column width for the __raw__ column of new flex tables, a value between 1 and 32000000, inclusive.

Default: 130000

ForceUDxFencedMode
When enabled (1), forces all UDxs that support fenced mode to run in fenced mode even if their definitions specify NOT FENCED.

File exporters cannot be run in fenced mode.

Default: 0

HTTPServerPortOffset
Controls the offset for the HTTPS port. The default HTTPS port is 8443, the sum of the client port (5433) and default HTTPServerPortOffset (3010).

Default: 3010

HTTPSlowRequestThreshold
Threshold in milliseconds for logging HTTP requests in UDFS_STATISTICS. Requests that take longer than this value are logged; faster ones are not.

Default: 500

IcebergPathMapping
For external tables using Iceberg data, a map of URI prefixes used by Iceberg to prefixes accessible to Vertica. The value is a JSON object:
'{"source-prefix":"new-prefix"[,...]}'

Specify prefixes only (up through port number), not complete paths.

IdleTimeoutInitializingConnectionsMs
The length of time (in milliseconds) that the server waits before timing out, during each step in connection initialization. After connection initialization, the session is created.

Default: 10000

JavaBinaryForUDx
Sets the full path to the Java executable that Vertica uses to run Java UDxs. See Installing Java on Vertica hosts.
JoinDefaultTupleFormat
Specifies how to size VARCHAR column data when joining tables on those columns, and buffers accordingly, one of the following:
  • fixed: Use join column metadata to size column data to a fixed length, and buffer accordingly.

  • variable: Use the actual length of join column data, so buffer size varies for each join.

Default: fixed

KeepAliveIdleTime
Length (in seconds) of the idle period before the first TCP keepalive probe is sent to ensure that the client is still connected. If set to 0, Vertica uses the kernel's tcp_keepalive_time parameter setting.

Default: 0

KeepAliveProbeCount
Number of consecutive keepalive probes that must go unacknowledged by the client before the client connection is considered lost and closed. If set to 0, Vertica uses the kernel's tcp_keepalive_probes parameter setting.

Default: 0

KeepAliveProbeInterval
Time interval (in seconds) between keepalive probes. If set to 0, Vertica uses the kernel's tcp_keepalive_intvl parameter setting.

Default: 0

LockTimeout
Specifies in seconds how long a table waits to acquire a lock.

Default: 300

LoadSourceStatisticsLimit
Specifies the maximum number of sources per load operation that are profiled in the LOAD_SOURCES system table. Set it to 0 to disable profiling.

Default: 256

MaxBundleableROSSizeKB
Specifies the minimum size, in kilobytes, of an independent ROS file. Vertica bundles storage container ROS files below this size into a single file. Bundling improves the performance of any file-intensive operations, including backups, restores, and mergeouts.

If you set this parameter to a value of 0, Vertica bundles .fdb and .pidx files without bundling other storage container files.

Default: 1024

MaxClientSessions
Determines the maximum number of client sessions that can run on a single node of the database. The default value allows for five additional administrative logins. These logins prevent DBAs from being locked out of the system if non-dbadmin users reach the login limit.

Default: 50 user logins and 5 additional administrative logins

ObjectStoreGlobStrategy
For partitioned external tables in object stores, the strategy to use for expanding globs before pruning partitions:
  • Flat: COPY fetches a list of all full object names with a given prefix, which can incur latency if partitions are numerous or deeply nested.

  • Hierarchical: COPY fetches object names one partition layer at a time, allowing earlier pruning but possibly requiring more calls to the object store when queries are not selective or there are not many partition directory levels.

For details, see Partitions on Object Stores.

Default: Flat

ObjStoreUploadParallelism
Integer (maximum 1000), size of the thread pool used when writing files to S3. Using more threads parallelizes uploads, but uses more memory. Each node has a single thread pool used for all file writes.

This configuration parameter can be set only at the database level.

Default: 0 (disabled)

ParquetMetadataCacheSizeMB
Size of the cache used for metadata when reading Parquet data. The cache uses local TEMP storage.

Default: 4096

PatternMatchingUseJit
Boolean, specifies whether to enable just-in-time compilation (to machine code) of regular expression pattern matching functions used in queries. Enabling this parameter can usually improve pattern matching performance on large tables. The Perl Compatible Regular Expressions (PCRE) pattern-match library evaluates regular expressions. Restart the database for this parameter to take effect.

See also Regular expression functions.

Default: 1 (enable)

PatternMatchStackAllocator
Boolean, specifies whether to override the stack memory allocator for the pattern-match library. The Perl Compatible Regular Expressions (PCRE) pattern-match library evaluates regular expressions. Restart the database for this parameter to take effect.

See also Regular expression functions.

Default: 1 (enable override)

TerraceRoutingFactor
Specifies whether to enable or disable terrace routing on any Enterprise Mode large cluster that implements rack-based fault groups.
  • To enable, set as follows:
    where:
      • numRackNodes: Number of nodes in a rack
      • numRacks: Number of racks in the cluster
  • To disable, set to a value so large that terrace routing cannot be enabled for the largest clusters—for example, 1000.

For details, see Terrace routing.

Default: 2

TransactionIsolationLevel
Changes the isolation level for the database. After modification, Vertica uses the new transaction level for every new session. Existing sessions and their transactions continue to use the original isolation level.

See also Change transaction isolation levels.

Default: READ COMMITTED

TransactionMode
Whether transactions are in read/write or read-only modes. Read/write is the default. Existing sessions and their transactions continue to use the original isolation level.

Default: READ WRITE

UDFSSlowRequestThreshold
Threshold in milliseconds for logging events in UDFS_EVENTS. Requests that take longer than this value are logged; faster ones are not.

Default: 1000

UDxFencedBlockTimeout
Number of seconds to wait for output before aborting a UDx running in fenced mode (see Fenced and unfenced modes). If the server aborts a UDx for this reason, it produces an error message similar to "ERROR 3399: Failure in UDx RPC call: timed out in receiving a UDx message". If you see this error frequently, you can increase this limit. UDxs running in fenced mode do not run in the server process, so increasing this value does not impede server performance.

Default: 60

UseLocalTzForParquetTimestampConversion
Boolean, specifies whether to do timezone conversion when reading Parquet files. Hive version 1.2.1 introduced an option to localize timezones when writing Parquet files. Previously it wrote them in UTC and Vertica adjusted the value when reading the files.

Set to 0 if Hive already adjusted the timezones.

Default: 1 (enable conversion)

UseServerIdentityOverUserIdentity
Boolean, specifies whether to ignore user-supplied credentials for non-Linux file systems and always use a USER storage location to govern access to data. See Creating a Storage Location for USER Access.

Default: 0 (disable)

WithClauseMaterialization
Boolean, specifies whether to enable materialization of WITH clause results. When materialization is enabled (1), Vertica evaluates each WITH clause once and stores results in a temporary table.

For WITH queries with complex types, temp relations are disabled.

Default: 0 (disable)

WithClauseRecursionLimit
Specifies the maximum number of times a WITH RECURSIVE clause iterates over the content of its own result set before it exits. For details, see WITH clause recursion.

Default: 8

4.2 - Azure parameters

Use the following parameters to configure reading from Azure blob storage.

Use the following parameters to configure reading from Azure blob storage. For more information about reading data from Azure, see Azure Blob Storage object store.

For external tables using highly partitioned data in an object store, see the ObjectStoreGlobStrategy configuration parameter and Partitions on Object Stores.

Query the CONFIGURATION_PARAMETERS system table to determine what levels (node, session, user, database) are valid for a given parameter.
AzureStorageCredentials
Collection of JSON objects, each of which specifies connection credentials for one endpoint. This parameter takes precedence over Azure managed identities.

The collection must contain at least one object and may contain more. Each object must specify at least one of accountName or blobEndpoint, and at least one of accountKey or sharedAccessSignature.

  • accountName: If not specified, uses the label of blobEndpoint.
  • blobEndpoint: Host name with optional port (host:port). If not specified, uses account.blob.core.windows.net.
  • accountKey: Access key for the account or endpoint.
  • sharedAccessSignature: Access token for finer-grained access control, if being used by the Azure endpoint.
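For illustration, a hedged sketch of setting this parameter at the session level; the account name and key are placeholders, not real credentials, and you should confirm the allowed levels in CONFIGURATION_PARAMETERS:

=> ALTER SESSION SET AzureStorageCredentials =
   '[{"accountName": "myaccount", "accountKey": "access-key-value"}]';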
AzureStorageEndpointConfig
Collection of JSON objects, each of which specifies configuration elements for one endpoint. Each object must specify at least one of accountName or blobEndpoint.
  • accountName: If not specified, uses the label of blobEndpoint.
  • blobEndpoint: Host name with optional port (host:port). If not specified, uses account.blob.core.windows.net.
  • protocol: HTTPS (default) or HTTP.
  • isMultiAccountEndpoint: true if the endpoint supports multiple accounts, false otherwise (default is false). To use multiple-account access, you must include the account name in the URI. If a URI path contains an account, this value is assumed to be true unless explicitly set to false.

4.3 - Constraints parameters

The following configuration parameters control how Vertica evaluates and enforces constraints.

The following configuration parameters control how Vertica evaluates and enforces constraints. All parameters are set at the database level through ALTER DATABASE.

Three of these parameters—EnableNewCheckConstraintsByDefault, EnableNewPrimaryKeysByDefault, and EnableNewUniqueKeysByDefault—can be used to enforce CHECK, PRIMARY KEY, and UNIQUE constraints, respectively. For details, see Constraint enforcement.
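For example, a hedged sketch that turns on database-level enforcement of new PRIMARY KEY constraints:

=> ALTER DATABASE DEFAULT SET EnableNewPrimaryKeysByDefault = 1;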

EnableNewCheckConstraintsByDefault
Boolean parameter, set to 0 or 1:
  • 0: Disable enforcement of new CHECK constraints except where the table DDL explicitly enables them.

  • 1 (default): Enforce new CHECK constraints except where the table DDL explicitly disables them.

EnableNewPrimaryKeysByDefault
Boolean parameter, set to 0 or 1:
  • 0 (default): Disable enforcement of new PRIMARY KEY constraints except where the table DDL explicitly enables them.

  • 1: Enforce new PRIMARY KEY constraints except where the table DDL explicitly disables them.

EnableNewUniqueKeysByDefault
Boolean parameter, set to 0 or 1:
  • 0 (default): Disable enforcement of new UNIQUE constraints except where the table DDL explicitly enables them.

  • 1: Enforce new UNIQUE constraints except where the table DDL explicitly disables them.

MaxConstraintChecksPerQuery
Sets the maximum number of constraints that ANALYZE_CONSTRAINTS can handle with a single query:
  • -1 (default): No maximum set, ANALYZE_CONSTRAINTS uses a single query to evaluate all constraints within the specified scope.

  • Integer > 0: The maximum number of constraints per query. If the number of constraints to evaluate exceeds this value, ANALYZE_CONSTRAINTS handles it with multiple queries.

For details, see Distributing Constraint Analysis.

4.4 - Database Designer parameters

The following table describes the parameters for configuring the Vertica Database Designer.

The following table describes the parameters for configuring the Vertica Database Designer.

DBDCorrelationSampleRowCount
Minimum number of table rows at which Database Designer discovers and records correlated columns.

Default: 4000

DBDLogInternalDesignProcess
Enables or disables Database Designer logging.

Default: 0 (False)

DBDUseOnlyDesignerResourcePool
Enables use of the DBD pool by the Vertica Database Designer.

When set to false, design processing is mostly contained by the user's resource pool, but might spill over into some system resource pools for less-intensive tasks.

Default: 0 (False)

4.5 - Eon Mode parameters

The following parameters configure how the database operates when running in Eon Mode.

The following parameters configure how the database operates when running in Eon Mode. Query the CONFIGURATION_PARAMETERS system table to determine what levels (node, session, user, database) are valid for a given parameter.

BackgroundDepotWarming
Specifies background depot warming behavior:
  • 1: The depot loads objects while it is warming, and continues to do so in the background after the node becomes active and starts executing queries.

  • 0: Node activation is deferred until the depot fetches and loads all queued objects

For details, see Depot Warming.

Default: 1

CatalogSyncInterval
Specifies in minutes how often the transaction log sync service syncs metadata to communal storage. If you change this setting, Vertica restarts the interval count.

Default: 5

DelayForDeletes
Specifies in hours how long to wait before deleting a file from communal storage. Vertica first deletes a file from the depot. After the specified time interval, the delete also occurs in communal storage.

Default: 0. Deletes the file from communal storage as soon as it is not in use by shard subscribers.

DepotOperationsForQuery
Specifies behavior when the depot does not contain queried file data, one of the following:
  • ALL (default): Fetch file data from communal storage, evicting existing files from the depot if necessary to make room.

  • FETCHES: Fetch file data from communal storage only if space is available; otherwise, read the queried data directly from communal storage.

  • NONE: Do not fetch file data to the depot, read the queried data directly from communal storage.

You can also specify query-level behavior with the hint DEPOT_FETCH.
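
For example, a sketch that limits depot fetches for the current session and bypasses the depot entirely for one query with the DEPOT_FETCH hint (the table name is a placeholder):
=> ALTER SESSION SET DepotOperationsForQuery = 'FETCHES';
=> SELECT /*+DEPOT_FETCH(NONE)*/ COUNT(*) FROM sales_fact;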

ECSMode
String parameter that sets the strategy Vertica uses when dividing the data in a shard among subscribing nodes during an ECS-enabled query, one of the following:
  • AUTO: Optimizer automatically determines the strategy to use.

  • COMPUTE_OPTIMIZED: Force use of the compute-optimized strategy.

  • IO_OPTIMIZED: Force use of the I/O-optimized strategy.

For details, see Manually choosing an ECS strategy.

Default: AUTO

ElasticKSafety
Boolean parameter that controls whether Vertica adjusts shard subscriptions due to the loss of a primary node:
  • 1: When a primary node is lost, Vertica subscribes other primary nodes to the down node's shard subscriptions. This action helps reduce the chances of the database going into read-only mode due to loss of shard coverage.
  • 0: Vertica does not change shard subscriptions in reaction to the loss of a primary node.

Default: 1

For details, see Maintaining Shard Coverage.

EnableDepotWarmingFromPeers
Boolean parameter, specifies whether Vertica warms a node depot while the node is starting up and not ready to process queries:
  • 1: Warm the depot while a node comes up.

  • 0: Warm the depot only after the node is up.

For details, see Depot Warming.

Default: 0

FileDeletionServiceInterval
Specifies in seconds the interval between each execution of the reaper cleaner service task.

Default: 60 seconds

MaxDepotSizePercent
An integer value that specifies the maximum size of the depot as a percentage of disk size.

Default: 80

PreFetchPinnedObjectsToDepotAtStartup
If enabled (set to 1), a warming depot fetches objects that are pinned on its subcluster. For details, see Depot Warming.

Default: 0

ReaperCleanUpTimeoutAtShutdown
Specifies in seconds how long Vertica waits for the reaper to delete files from communal storage before shutting down. If set to a negative value, Vertica shuts down without waiting for the reaper.

Default: 300

StorageMergeMaxTempCacheMB
The size of temp space allocated per query to the StorageMerge operator for caching the data of S3 storage containers.

For details, see Local caching of storage containers.

UseCommunalStorageForBatchDepotWarming
Boolean parameter, specifies where a node retrieves data when warming its depot:
  • 1: Retrieve data from communal storage.

  • 0: Retrieve data from a peer.

Default: 1

UseDepotForReads
Boolean parameter, specifies whether Vertica accesses the depot to answer queries, or accesses only communal storage:
  • 1: Vertica first searches the depot for the queried data; if not there, Vertica fetches the data from communal storage for this and future queries.

  • 0: Vertica bypasses the depot and always obtains queried data from communal storage.

Default: 1

UseDepotForWrites
Boolean parameter, specifies whether Vertica writes loaded data to the depot and then uploads files to communal storage:
  • 1: Write loaded data to the depot, upload files to communal storage.

  • 0: Bypass the depot and always write directly to communal storage.

Default: 1

UsePeerToPeerDataTransfer
Boolean parameter, specifies whether Vertica pushes loaded data to other shard subscribers:
  • 1: Send loaded data to all shard subscribers.

  • 0: Do not push data to other shard subscribers.

Default: 0

4.6 - Epoch management parameters

The following table describes the epoch management parameters for configuring Vertica.

Query the CONFIGURATION_PARAMETERS system table to determine what levels (node, session, user, database) are valid for a given parameter.
AdvanceAHMInterval
Determines how frequently (in seconds) Vertica checks the history retention status.

AdvanceAHMInterval cannot be set to a value that is less than the EpochMapInterval.

Default: 180 (seconds)

AHMBackupManagement
Blocks the advancement of the Ancient History Mark (AHM). When this parameter is enabled, the AHM epoch cannot be later than the epoch of your latest full backup. If you advance the AHM to purge and delete data, do not enable this parameter.

Default: 0

EpochMapInterval
Determines the granularity of mapping between epochs and time available to historical queries. When a historical query's AT TIME T request is issued, Vertica maps it to an epoch within a granularity of EpochMapInterval seconds. It similarly affects the time reported for Last Good Epoch during Failure recovery. Note that it does not affect the internal precision of epochs themselves.

Default: 180 (seconds)

HistoryRetentionTime
Determines how long deleted data is saved (in seconds) as an historical reference. When the specified time since the deletion has passed, you can purge the data. Use the -1 setting if you prefer to use HistoryRetentionEpochs to determine which deleted data can be purged.

Default: 0 (Data saved when nodes are down.)

HistoryRetentionEpochs
Specifies the number of historical epochs to save, and therefore, the amount of deleted data.

Unless you have a reason to limit the number of epochs, Vertica recommends that you specify the time over which deleted data is saved.

If you specify both History parameters, HistoryRetentionTime takes precedence. Setting both parameters to -1 preserves all historical data.

See Setting a purge policy.

Default: -1 (disabled)
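
For example, a sketch of a time-based purge policy that keeps one day of deleted data before it becomes eligible for purging (an illustrative value):
=> ALTER DATABASE DEFAULT SET HistoryRetentionTime = 86400;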

4.7 - Google Cloud Storage parameters

Use the following parameters to configure reading from Google Cloud Storage (GCS) using COPY FROM. For more information about reading data from GCS, see Specifying where to load data from.

For external tables using highly partitioned data in an object store, see the ObjectStoreGlobStrategy configuration parameter and Partitions on Object Stores.

Query the CONFIGURATION_PARAMETERS system table to determine what levels (node, session, user, database) are valid for a given parameter.
GCSAuth
An ID and secret key to authenticate to GCS. You can set parameters globally and for the current session with ALTER DATABASE...SET PARAMETER and ALTER SESSION...SET PARAMETER, respectively. For extra security, do not store credentials in the database; instead, set them for the current session with ALTER SESSION. For example:
=> ALTER SESSION SET GCSAuth='ID:secret';

If you use a shared credential, set it in the database with ALTER DATABASE.

GCSEnableHttps
Specifies whether to use the HTTPS protocol when connecting to GCS. This parameter can be set only at the database level with ALTER DATABASE...SET PARAMETER.

Default: 1 (enabled)

GCSEndpoint
The connection endpoint address.

Default: storage.googleapis.com

4.8 - Hadoop parameters

The following table describes general parameters for configuring integration with Apache Hadoop. See Apache Hadoop integration for more information.

Query the CONFIGURATION_PARAMETERS system table to determine what levels (node, session, user, database) are valid for a given parameter.
EnableHDFSBlockInfoCache
Boolean, whether to distribute block location metadata collected during planning on the initiator to all database nodes for execution. Distributing this metadata reduces name node accesses, and thus load, but can degrade database performance somewhat in deployments where the name node isn't contended. This performance effect is because the data must be serialized and distributed. Enable distribution if protecting the name node is more important than query performance; usually this applies to large HDFS clusters where name node contention is already an issue.

Default: 0 (disabled)

HadoopConfDir
Directory path containing the XML configuration files copied from Hadoop. The same path must be valid on every Vertica node. You can use the VERIFY_HADOOP_CONF_DIR meta-function to test that the value is set correctly. Setting this parameter is required to read data from HDFS.

For all Vertica users, the files are accessed by the Linux user under which the Vertica server process runs.

When you set this parameter, previously-cached configuration information is flushed.

You can set this parameter at the session level. Doing so overrides the database value; it does not append to it. For example:

  
=> ALTER SESSION SET HadoopConfDir='/test/conf:/hadoop/hcat/conf';

To append, get the current value and include it on the new path after your additions. Setting this parameter at the session level does not change how the files are accessed.

Default: obtained from environment if possible

HadoopFSAuthentication
How (or whether) to use Kerberos authentication with HDFS. By default, if KerberosKeytabFile is set, Vertica uses that credential for both Vertica and HDFS. Usually this is the desired behavior. However, if you are using a Kerberized Vertica cluster with a non-Kerberized HDFS cluster, set this parameter to "none" to indicate that Vertica should not use the Vertica Kerberos credential to access HDFS.

Default: "keytab" if KerberosKeytabFile is set, otherwise "none"

HadoopFSBlockSizeBytes
Block size to write to HDFS. Larger files are divided into blocks of this size.

Default: 64MB

HadoopFSNNOperationRetryTimeout
Number of seconds a metadata operation (such as list directory) waits for a response before failing. Accepts float values for millisecond precision.

Default: 6 seconds

HadoopFSReadRetryTimeout
Number of seconds a read operation waits before failing. Accepts float values for millisecond precision. If you are confident that your file system will fail more quickly, you can improve performance by lowering this value.

Default: 180 seconds

HadoopFSReplication
Number of replicas HDFS makes. This is independent of the replication that Vertica does to provide K-safety. Do not change this setting unless directed otherwise by Vertica support.

Default: 3

HadoopFSRetryWaitInterval
Initial number of seconds to wait before retrying read, write, and metadata operations. Accepts float values for millisecond precision. The retry interval increases exponentially with every retry.

Default: 3 seconds

HadoopFSTokenRefreshFrequency
How often, in seconds, to refresh the Hadoop tokens used to hold Kerberos tickets (see Token expiration).

Default: 0 (refresh when token expires)

HadoopFSWriteRetryTimeout
Number of seconds a write operation waits before failing. Accepts float values for millisecond precision. If you are confident that your file system will fail more quickly, you can improve performance by lowering this value.

Default: 180 seconds

HadoopImpersonationConfig
Session parameter specifying the delegation token or Hadoop user for HDFS access. See HadoopImpersonationConfig format for information about the value of this parameter and Proxy users and delegation tokens for more general context.
HDFSUseWebHDFS
Boolean, whether URLs in the hdfs scheme use WebHDFS instead of LibHDFS++ to access data.

Default: 1 (enabled)

WebhdfsClientCertConf
mTLS configurations for accessing one or more WebHDFS servers. The value is a JSON string; each member has the following properties:
  • nameservice: WebHDFS name service

  • authority: host:port

  • certName: name of a certificate defined by CREATE CERTIFICATE

nameservice and authority are mutually exclusive.

For example:

=> ALTER SESSION SET WebhdfsClientCertConf =
    '[{"authority" : "my.authority.com:50070", "certName" : "myCert"},
      {"nameservice" : "prod", "certName" : "prodCert"}]';

HCatalog Connector parameters

The following table describes the parameters for configuring the HCatalog Connector. See Using the HCatalog Connector for more information.

EnableHCatImpersonation
Boolean, whether the HCatalog Connector uses (impersonates) the current Vertica user when accessing Hive. If impersonation is enabled, the HCatalog Connector uses the Kerberos credentials of the logged-in Vertica user to access Hive data. Disable impersonation if you are using an authorization service to manage access without also granting users access to the underlying files. For more information, see Configuring security.

Default: 1 (enabled)

HCatalogConnectorUseHiveServer2
Boolean, whether Vertica internally uses HiveServer2 instead of WebHCat to get metadata from Hive.

Default: 1 (enabled)

HCatalogConnectorUseLibHDFSPP
Boolean, whether the HCatalog Connector uses the hdfs scheme instead of webhdfs to access data.

Default: 1 (enabled)

HCatConnectionTimeout
The number of seconds the HCatalog Connector waits for a successful connection to the HiveServer2 (or WebHCat) server before returning a timeout error.

Default: 0 (Wait indefinitely)

HCatSlowTransferLimit
Lowest transfer speed (in bytes per second) that the HCatalog Connector allows when retrieving data from the HiveServer2 (or WebHCat) server. In some cases, the data transfer rate from the server to Vertica is below this threshold. In such cases, after the number of seconds specified in the HCatSlowTransferTime parameter pass, the HCatalog Connector cancels the query and closes the connection.

Default: 65536

HCatSlowTransferTime
Number of seconds the HCatalog Connector waits before testing whether the data transfer from the server is too slow. See the HCatSlowTransferLimit parameter.

Default: 60
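
For example, to tolerate transfer rates down to 128 KB per second but test for slowness after only 30 seconds (illustrative values):
=> ALTER DATABASE DEFAULT SET HCatSlowTransferLimit = 131072;
=> ALTER DATABASE DEFAULT SET HCatSlowTransferTime = 30;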

4.9 - Internationalization parameters

The following table describes internationalization parameters for configuring Vertica.

Query the CONFIGURATION_PARAMETERS system table to determine what levels (node, session, user, database) are valid for a given parameter.
DefaultIntervalStyle
Sets the default interval style to use. If set to 0 (default), the interval is in PLAIN style (the SQL standard), no interval units on output. If set to 1, the interval is in UNITS on output. This parameter does not take effect until the database is restarted.

Default: 0

DefaultSessionLocale
Sets the default session startup locale for the database. This parameter does not take effect until the database is restarted.

Default: en_US@collation=binary

EscapeStringWarning
Issues a warning when backslashes are used in a string literal. This can help locate backslashes that are being treated as escape characters so they can be fixed to follow the SQL standard-conforming string syntax instead.

Default: 1

StandardConformingStrings
Determines whether character string literals treat backslashes (\) as string literals or escape characters. When set to -1, backslashes are treated as string literals; when set to 0, backslashes are treated as escape characters.

Default: -1

4.10 - Kafka user-defined session parameters

Set Vertica user-defined session parameters to configure Kafka SSL when not using a scheduler, using ALTER SESSION SET UDPARAMETER. The kafka-prefixed parameters configure SSL authentication for Kafka. For details, see TLS/SSL encryption with Kafka.

Query the CONFIGURATION_PARAMETERS system table to determine what levels (node, session, user, database) are valid for a given parameter.
kafka_SSL_CA
The contents of the certificate authority certificate. For example:
=> ALTER SESSION SET UDPARAMETER kafka_SSL_CA='MIIBOQIBAAJBAIOL';

Default: none

kafka_SSL_Certificate
The contents of the SSL certificate. For example:
=> ALTER SESSION SET UDPARAMETER kafka_SSL_Certificate='XrM07O4dV/nJ5g';

This parameter is optional when the Kafka server's parameter ssl.client.auth is set to none or requested.

Default: none

kafka_SSL_PrivateKey_secret
The private key used to encrypt the session. Vertica does not log this information. For example:
=> ALTER SESSION SET UDPARAMETER kafka_SSL_PrivateKey_secret='A60iThKtezaCk7F';

This parameter is optional when the Kafka server's parameter ssl.client.auth is set to none or requested.

Default: none

kafka_SSL_PrivateKeyPassword_secret
The password used to create the private key. Vertica does not log this information.

For example:

ALTER SESSION SET UDPARAMETER kafka_SSL_PrivateKeyPassword_secret='secret';

This parameter is optional when the Kafka server's parameter ssl.client.auth is set to none or requested.

Default: none

kafka_Enable_SSL
Enables SSL authentication for Vertica-Kafka integration. For example:
=> ALTER SESSION SET UDPARAMETER kafka_Enable_SSL=1;
  

Default: 0

MaxSessionUDParameterSize
Sets the maximum length of a value in a user-defined session parameter. For example:
=> ALTER SESSION SET MaxSessionUDParameterSize = 2000;
  

Default: 1000

See also: User-defined session parameters

4.11 - Kerberos parameters

The following parameters let you configure the Vertica principal for Kerberos authentication and specify the location of the Kerberos keytab file.

Query the CONFIGURATION_PARAMETERS system table to determine what levels (node, session, user, database) are valid for a given parameter.
KerberosEnableKeytabPermissionCheck
Whether the Vertica server verifies permissions on the keytab file. By default, the Vertica server verifies these permissions.

In a hybrid Kubernetes cluster, set this parameter to 0 so that there is no permissions check. Otherwise, Kerberos authentication fails because the keytab file is stored in a Secret, and the VerticaDB operator cannot verify permissions on a Secret.

Default: 1

KerberosHostname
Instance or host name portion of the Vertica Kerberos principal. For example:
vertica/host@EXAMPLE.COM

If you omit the optional KerberosHostname parameter, Vertica uses the return value from the function gethostname(). Assuming each cluster node has a different host name, those nodes will each have a different principal, which you must manage in that node's keytab file.

KerberosKeytabFile
Location of the keytab file that contains credentials for the Vertica Kerberos principal. By default, this file is located in /etc. For example:
KerberosKeytabFile=/etc/krb5.keytab
KerberosRealm
Realm portion of the Vertica Kerberos principal. A realm is the authentication administrative domain and is usually formed in uppercase letters. For example:
vertica/host@EXAMPLE.COM
KerberosServiceName
Service name portion of the Vertica Kerberos principal. By default, this parameter is vertica. For example:
vertica/host@EXAMPLE.COM

Default: vertica
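
For example, a minimal sketch that sets the principal components and keytab location at the database level (the host name, realm, and keytab path are placeholders):
=> ALTER DATABASE DEFAULT SET KerberosServiceName = 'vertica';
=> ALTER DATABASE DEFAULT SET KerberosHostname = 'host.example.com';
=> ALTER DATABASE DEFAULT SET KerberosRealm = 'EXAMPLE.COM';
=> ALTER DATABASE DEFAULT SET KerberosKeytabFile = '/etc/krb5.keytab';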

KerberosTicketDuration
Lifetime of the ticket retrieved from performing a kinit. The default is 0 (zero) which disables this parameter.

If you omit setting this parameter, the lifetime is determined by the default Kerberos configuration.

4.12 - Machine learning parameters

You use machine learning parameters to configure various aspects of machine learning functionality in Vertica.

MaxModelSizeKB
Sets the maximum size of models that can be imported. The sum of the size of files specified in the metadata.json file determines the model size. The unit of this parameter is KBytes. The native Vertica model (category=VERTICA_MODELS) is exempted from this limit. If you can export the model from Vertica, and the model is not altered while outside Vertica, you can import it into Vertica again.

The MaxModelSizeKB parameter can be set only by a superuser and only at the database level. It is visible only to a superuser. Its default value is 4GB, and its valid range is between 1KB and 64GB (inclusive).

Examples:

To set this parameter to 3KB:

=> ALTER DATABASE DEFAULT SET MaxModelSizeKB = 3;

To set this parameter to 64GB (the maximum allowed):

=> ALTER DATABASE DEFAULT SET MaxModelSizeKB = 67108864;

To reset this parameter to the default value:

=> ALTER DATABASE DEFAULT CLEAR MaxModelSizeKB;

Default: 4GB

4.13 - Memory management parameters

The following table describes parameters for managing Vertica memory usage.

Query the CONFIGURATION_PARAMETERS system table to determine what levels (node, session, user, database) are valid for a given parameter.
MemoryPollerIntervalSec
Specifies in seconds how often the Vertica memory poller checks whether Vertica memory usage is below the thresholds of several configuration parameters (see below):
  • MemoryPollerMallocBloatThreshold

  • MemoryPollerReportThreshold

  • MemoryPollerTrimThreshold

Default: 2

MemoryPollerMallocBloatThreshold
Threshold of glibc memory bloat.

The memory poller calls glibc function malloc_info() to obtain the amount of free memory in malloc. It then compares MemoryPollerMallocBloatThreshold—by default, set to 0.3—with the following expression:

free-memory-in-malloc / RSS

If this expression evaluates to a value higher than MemoryPollerMallocBloatThreshold, the memory poller calls glibc function malloc_trim(). This function reclaims free memory from malloc and returns it to the operating system. Details on calls to malloc_trim() are written to system table MEMORY_EVENTS.

To disable polling of this threshold, set the parameter to 0.

Default: 0.3

MemoryPollerReportThreshold
Threshold of memory usage that determines whether the Vertica memory poller writes a report.

The memory poller compares MemoryPollerReportThreshold with the following expression:

RSS / available-memory

When this expression evaluates to a value higher than MemoryPollerReportThreshold (by default, 0.93), the memory poller writes a report to MemoryReport.log, in the Vertica working directory. This report includes information about Vertica memory pools, how much memory is consumed by individual queries and sessions, and so on. The memory poller also logs the report as an event in system table MEMORY_EVENTS, where it sets EVENT_TYPE to MEMORY_REPORT.

To disable polling of this threshold, set the parameter to 0.

Default: 0.93

MemoryPollerTrimThreshold
Threshold for the memory poller to start checking whether to trim glibc-allocated memory.

The memory poller compares MemoryPollerTrimThreshold (by default, 0.83) with the following expression:

RSS / available-memory

If this expression evaluates to a value higher than MemoryPollerTrimThreshold, then the memory poller starts checking the next threshold—set in MemoryPollerMallocBloatThreshold—for glibc memory bloat.

To disable polling of this threshold, set the parameter to 0. Doing so also disables polling of MemoryPollerMallocBloatThreshold.

Default: 0.83
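
For example, to check for glibc memory bloat more aggressively by lowering the trim threshold (an illustrative value):
=> ALTER DATABASE DEFAULT SET MemoryPollerTrimThreshold = 0.7;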

4.14 - Monitoring parameters

The following parameters control options for monitoring the Vertica database.

Query the CONFIGURATION_PARAMETERS system table to determine what levels (node, session, user, database) are valid for a given parameter.
EnableDataCollector
Enables the Data Collector, which is the Workload Analyzer's internal diagnostics utility. Affects all sessions on all nodes. To disable, set to 0.

Default: 1 (enabled)

EnableLogRotate
Enables and disables the timed LogRotate service.

Default: 1 (enabled)

LogRotateInterval
Interval literal, how often the LogRotate service runs.

Default: 8h (runs every 8 hours)

LogRotateMaxAge
Interval literal, the maximum age of rotated logs (.gz) to keep. Rotated logs with an age greater than this value are removed. For details, see Rotating log files.

Default: 7 days

LogRotateMaxSize
String, the maximum size of a log (.log) file. The string has format integer{K|M|G|T}, where K is kibibytes, M is mebibytes, G is gibibytes, and T is tebibytes. Logs larger than this value are rotated. For details, see Rotating log files.

Default: 100M
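
For example, a sketch of a rotation policy that rotates every 12 hours, keeps rotated logs for 30 days, and caps log files at 500 mebibytes (illustrative values):
=> ALTER DATABASE DEFAULT SET LogRotateInterval = '12h';
=> ALTER DATABASE DEFAULT SET LogRotateMaxAge = '30 days';
=> ALTER DATABASE DEFAULT SET LogRotateMaxSize = '500M';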

SnmpTrapDestinationsList
Defines where Vertica sends traps for SNMP. See Configuring reporting for SNMP. For example:
=> ALTER DATABASE DEFAULT SET SnmpTrapDestinationsList = 'localhost 162 public';

Default: none

SnmpTrapsEnabled
Enables event trapping for SNMP. See Configuring reporting for SNMP.

Default: 0

SnmpTrapEvents
Defines which events Vertica traps through SNMP. See Configuring reporting for SNMP. For example:
=> ALTER DATABASE DEFAULT SET SnmpTrapEvents = 'Low Disk Space, Recovery Failure';
  

Default: Low Disk Space, Read Only File System, Loss of K Safety, Current Fault Tolerance at Critical Level, Too Many ROS Containers, Node State Change, Recovery Failure, Stale Checkpoint, and CRC Mismatch.

SyslogEnabled
Enables event trapping for syslog. See Configuring reporting for syslog.

Default: 0 (disabled)

SyslogEvents
Defines events that generate a syslog entry. See Configuring reporting for syslog. For example:
=> ALTER DATABASE DEFAULT SET SyslogEvents = 'Low Disk Space, Recovery Failure';
  

Default: none

SyslogFacility
Defines which SyslogFacility Vertica uses. See Configuring reporting for syslog.

Default: user

4.15 - Numeric precision parameters

The following configuration parameters let you configure numeric precision for numeric data types. For more about using these parameters, see Numeric data type overflow with SUM, SUM_FLOAT, and AVG.

Query the CONFIGURATION_PARAMETERS system table to determine what levels (node, session, user, database) are valid for a given parameter.
AllowNumericOverflow
Boolean, set to one of the following:
  • 1 (true): Allows silent numeric overflow. Vertica does not implicitly extend precision of numeric data types. Vertica ignores the value of NumericSumExtraPrecisionDigits.

  • 0 (false): Vertica produces an overflow error if a result exceeds the precision set by NumericSumExtraPrecisionDigits.

Default: 1 (true)

NumericSumExtraPrecisionDigits
An integer between 0 and 20, inclusive. Vertica produces an overflow error if a result exceeds the specified precision. This parameter setting only applies if AllowNumericOverflow is set to 0 (false).

Default: 6 (places beyond the DDL-specified precision)
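
For example, to enable strict overflow checking and allow 10 extra digits of precision for sums (illustrative values):
=> ALTER DATABASE DEFAULT SET AllowNumericOverflow = 0;
=> ALTER DATABASE DEFAULT SET NumericSumExtraPrecisionDigits = 10;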

4.16 - Profiling parameters

The following table describes the profiling parameters for configuring Vertica. See Profiling database performance for more information on profiling queries.

Query the CONFIGURATION_PARAMETERS system table to determine what levels (node, session, user, database) are valid for a given parameter.
GlobalEEProfiling
Enables profiling for query execution runs in all sessions on all nodes.

Default: 0

GlobalQueryProfiling
Enables query profiling for all sessions on all nodes.

Default: 0

GlobalSessionProfiling
Enables session profiling for all sessions on all nodes.

Default: 0

SaveDCEEProfileThresholdUS
Sets in microseconds the query duration threshold for saving profiling information to system tables QUERY_CONSUMPTION and EXECUTION_ENGINE_PROFILES. You can set this parameter to a maximum value of 2147483647 (2^31-1, or ~35.79 minutes).

Default: 60000000 (60 seconds)
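
For example, to save profiling information for any query that runs longer than one second (an illustrative value):
=> ALTER DATABASE DEFAULT SET SaveDCEEProfileThresholdUS = 1000000;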

4.17 - Projection parameters

The following configuration parameters help you manage projections.

Query the CONFIGURATION_PARAMETERS system table to determine what levels (node, session, user, database) are valid for a given parameter.
AnalyzeRowCountInterval
Specifies how often Vertica checks the number of projection rows and whether the threshold set by ARCCommitPercentage has been crossed.

For more information, see Collecting database statistics.

Default: 86400 seconds (24 hours)

ARCCommitPercentage
Sets the threshold percentage of difference between the last-recorded aggregate projection row count and current row count for a given table. When the difference exceeds this threshold, Vertica updates the catalog with the current row count.

Default: 3 (percent)

ContainersPerProjectionLimit
Specifies how many ROS containers Vertica creates per projection before ROS pushback occurs.

Default: 1024

MaxAutoSegColumns
Specifies the number of columns (0-1024) to use in an auto-projection's hash segmentation clause. Set to 0 to use all columns.

Default: 8

MaxAutoSortColumns
Specifies the number of columns (0-1024) to use in an auto-projection's sort expression. Set to 0 to use all columns.

Default: 8

RebalanceQueryStorageContainers
By default, prior to performing a rebalance, Vertica performs a system table query to compute the size of all projections involved in the rebalance task. This query enables Vertica to optimize the rebalance to most efficiently utilize available disk space. This query can, however, significantly increase the time required to perform the rebalance.

By disabling the system table query, you can reduce the time required to perform the rebalance. If your nodes are low on disk space, disabling the query increases the chance that a node runs out of disk space. In that situation, the rebalance fails.

Default: 1 (enable)

RewriteQueryForLargeDim
If enabled (1), Vertica rewrites a SET USING or DEFAULT USING query during a REFRESH_COLUMNS operation by reversing the inner and outer join between the target and source tables. Doing so can optimize refresh performance in cases where the source data is in a table that is larger than the target table.

Default: 0

SegmentAutoProjection
Determines whether auto-projections are segmented if the table definition omits a segmentation clause. You can set this parameter at database and session scopes.

Default: 1 (create segmented auto projections)

4.18 - S3 parameters

Use the following parameters to configure reading from S3 file systems and on-premises storage with S3-compatible APIs, using COPY.

Use the following parameters to configure reading from S3 file systems and on-premises storage with S3-compatible APIs. For more information about reading data from S3, see S3 Object Store.

For external tables using highly partitioned data in an object store, see the ObjectStoreGlobStrategy configuration parameter and Partitions on Object Stores.

Query the CONFIGURATION_PARAMETERS system table to determine what levels (node, session, user, database) are valid for a given parameter.

AWSAuth
ID and secret key for authentication. For extra security, do not store credentials in the database; use ALTER SESSION...SET PARAMETER to set this value for the current session only. If you use a shared credential, you can set it in the database with ALTER DATABASE...SET PARAMETER. For example:
=> ALTER SESSION SET AWSAuth='ID:secret';

In AWS, these arguments are named AccessKeyID and SecretAccessKey.

To use admintools create_db or revive_db for Eon Mode on-premises, create a configuration file called auth_params.conf with these settings:

AWSAuth = key:secret
AWSEndpoint = IP:port

AWSCAFile

File name of the TLS server certificate bundle to use. If set, this parameter overrides the Vertica default CA bundle path specified in the SystemCABundlePath parameter.

=> ALTER DATABASE DEFAULT SET AWSCAFile = '/etc/ssl/ca-bundle.pem';

Default: system-dependent

AWSCAPath

Path Vertica uses to look up TLS server certificate bundles. If set, this parameter overrides the Vertica default CA bundle path specified in the SystemCABundlePath parameter.

=> ALTER DATABASE DEFAULT SET AWSCAPath = '/etc/ssl/';

Default: system-dependent

AWSEnableHttps

Boolean, specifies whether to use the HTTPS protocol when connecting to S3. This parameter can be set only at the database level with ALTER DATABASE. If you choose not to use TLS, this parameter must be set to 0.

Default: 1 (enabled)

AWSEndpoint
Endpoint to use when interpreting S3 URLs, set as follows.
  • AWS: hostname_or_ip:port_number.

  • AWS with a FIPS-compliant S3 Endpoint: S3_hostname and enable virtual addressing:

    AWSEndpoint = s3-fips.dualstack.us-east-1.amazonaws.com
    S3EnableVirtualAddressing = 1
    
  • On-premises/Pure: IP address of the Pure Storage server. If using admintools create_db or revive_db, create configuration file auth_params.conf and include these settings:

    awsauth = key:secret
    awsendpoint = IP:port
    
  • When AWSEndpoint is not set, the default behavior is to use virtual-hosted request URLs.

Default: s3.amazonaws.com

AWSLogLevel
Log level, one of the following:
  • OFF

  • FATAL

  • ERROR

  • WARN

  • INFO

  • DEBUG

  • TRACE

Default: ERROR

AWSRegion
AWS region containing the S3 bucket from which to read files. This parameter can only be configured with one region at a time. If you need to access buckets in multiple regions, change the parameter each time you change regions.

If you do not set the correct region, you might experience a delay before queries fail because Vertica retries several times before giving up.

Default: us-east-1

AWSSessionToken
Temporary security token generated by running the get-session-token command, which generates temporary credentials you can use to configure multi-factor authentication.

Set this parameter in a user session using ALTER SESSION. You can set this parameter at the database level, but be aware that session tokens are temporary. When the token expires, any attempt to access AWS fails.

AWSStreamingConnectionPercentage
Controls the number of connections to the communal storage that Vertica uses for streaming reads. In a cloud environment, this setting helps prevent streaming data from communal storage using up all available file handles. It leaves some file handles available for other communal storage operations.

Due to the low latency of on-premises object stores, this option is unnecessary for an Eon Mode database that uses on-premises communal storage. In this case, disable the parameter by setting it to 0.
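
For example, to disable this limit for an Eon Mode database that uses on-premises communal storage:
=> ALTER DATABASE DEFAULT SET AWSStreamingConnectionPercentage = 0;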

S3BucketConfig
A JSON array of objects specifying per-bucket configuration overrides. Each property other than the bucket name has a corresponding configuration parameter (shown in parentheses). If both the database-level parameter and its equivalent in S3BucketConfig are set, the value in S3BucketConfig takes precedence.

Properties:

  • bucket: Name of the bucket

  • region (AWSRegion): Name of the region

  • protocol (AWSEnableHttps): Connection protocol, either http or https

  • endpoint (AWSEndpoint): Endpoint URL or IP address

  • enableVirtualAddressing (S3EnableVirtualAddressing): Whether to rewrite the S3 URL to use a virtual hosted path

  • requesterPays (S3RequesterPays): Whether requester (instead of bucket owner) pays the cost of accessing data on the bucket

  • serverSideEncryption (S3ServerSideEncryption): Encryption algorithm if using SSE-S3 or SSE-KMS, one of AES256, aws:kms, or an empty string

  • sseCustomerAlgorithm (S3SseCustomerAlgorithm): Encryption algorithm if using SSE-C; must be AES256

  • sseCustomerKey (S3SseCustomerKey): Key if using SSE-C encryption, either 32-character plaintext or 44-character base64-encoded

  • sseKmsKeyId (S3SseKmsKeyId): Key ID if using SSE-KMS encryption

  • proxy (S3Proxy): HTTP(S) proxy string

The configuration properties for a given bucket might differ based on its type. For example, the following S3BucketConfig is for an AWS bucket AWSBucket and a Pure Storage bucket PureStorageBucket. AWSBucket doesn't specify an endpoint, so Vertica uses the value of AWSEndpoint, which defaults to s3.amazonaws.com:

ALTER DATABASE DEFAULT SET S3BucketConfig=
'[
    {
        "bucket": "AWSBucket",
        "region": "us-east-2",
        "protocol": "https",
        "requesterPays": true
    },
    {
        "bucket": "PureStorageBucket",
        "endpoint": "pure.mycorp.net:1234",
        "protocol": "http",
        "enableVirtualAddressing": false
    }
]';
S3BucketCredentials
Contains credentials for accessing an S3 bucket. Each property in S3BucketCredentials has an equivalent parameter (shown in parentheses). When set, S3BucketCredentials takes precedence over both AWSAuth and AWSSessionToken.

Providing credentials for more than one bucket authenticates to them simultaneously, allowing you to perform cross-endpoint joins, export from one bucket to another, etc.

Properties:

  • bucket: Name of the bucket

  • accessKey: Access key for the bucket (the ID in AWSAuth)

  • secretAccessKey: Secret access key for the bucket (the secret in AWSAuth)

  • sessionToken: Session token, only used when S3BucketCredentials is set at the session level (AWSSessionToken)

For example, the following S3BucketCredentials is for an AWS bucket AWSBucket and a Pure Storage bucket PureStorageBucket and sets all possible properties:

ALTER SESSION SET S3BucketCredentials='
[
    {
        "bucket": "AWSBucket",
        "accessKey": "<AK0>",
        "secretAccessKey": "<SAK0>",
        "sessionToken": "1234567890"
    },
    {
        "bucket": "PureStorageBucket",
        "accessKey": "<AK1>",
        "secretAccessKey": "<SAK1>"
    }
]';

This parameter is only visible to the superuser. Users can set this parameter at the session level with ALTER SESSION.

S3EnableVirtualAddressing
Boolean, specifies whether to rewrite S3 URLs to use virtual-hosted paths. For example, if you use AWS, the S3 URLs change to bucketname.s3.amazonaws.com instead of s3.amazonaws.com/bucketname. This configuration setting takes effect only when you have specified a value for AWSEndpoint.

If you set AWSEndpoint to a FIPS-compliant S3 Endpoint, you must enable S3EnableVirtualAddressing in auth_params.conf:

AWSEndpoint = s3-fips.dualstack.us-east-1.amazonaws.com
S3EnableVirtualAddressing = 1

The value of this parameter does not affect how you specify S3 paths.

Default: 0 (disabled)

S3Proxy
HTTP(S) proxy settings, if needed, a string in the following format:
http[s]://[user:password@]host[:port]

Default: "" (empty string)

S3RequesterPays
Boolean, specifies whether requester (instead of bucket owner) pays the cost of accessing data on the bucket. When true, the bucket owner is only responsible for paying the cost of storing the data, rather than all costs associated with the bucket; this parameter must be set in order to access S3 buckets configured as Requester Pays buckets. By setting this property to true, you are accepting the charges for accessing data. If not specified, the default value is false.
S3ServerSideEncryption
String, encryption algorithm to use when reading or writing to S3. The value depends on which type of encryption at rest is configured for S3:
  • AES256: Use for SSE-S3 encryption

  • aws:kms: Use for SSE-KMS encryption

  • Empty string (""): No encryption

SSE-C encryption does not use this parameter. Instead, see S3SseCustomerAlgorithm.

For details on using SSE parameters, see S3 object store.

Default: "" (no encryption)

S3SseCustomerAlgorithm
String, the encryption algorithm to use when reading or writing to S3 using SSE-C encryption. The only supported values are AES256 and "".

For SSE-S3 and SSE-KMS, instead use S3ServerSideEncryption.

Default: "" (no encryption)

S3SseCustomerKey
If using SSE-C encryption, the client key for S3 access.
S3SseKmsKeyId
If using SSE-KMS encryption, the key identifier (not the key) to pass to the Key Management Server. Vertica must have permission to use the key, which is managed through KMS.
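
For example, a sketch of an SSE-KMS configuration, where the key identifier is a placeholder:
=> ALTER DATABASE DEFAULT SET S3ServerSideEncryption = 'aws:kms';
=> ALTER DATABASE DEFAULT SET S3SseKmsKeyId = 'my-kms-key-id';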

4.19 - SNS parameters

The following parameters configure Amazon Simple Notification Service (SNS) notifiers. These parameters can only be set at the database level and some, if unset, fall back to their equivalent S3 parameters.

Notifiers must be disabled and then reenabled for these parameters to take effect:

=> ALTER NOTIFIER sns_notifier DISABLE;
=> ALTER NOTIFIER sns_notifier ENABLE;
SNSAuth (falls back to AWSAuth)
ID and secret key for authentication, equivalent to the AccessKeyID and SecretAccessKey in AWS. For example:
=> ALTER DATABASE DEFAULT SET SNSAuth='myID:mySecret';

Default: "" (empty string)

SNSCAFile (falls back to AWSCAFile)

File name of the TLS server certificate bundle to use. If set, this parameter overrides the Vertica default CA bundle path specified in the SystemCABundlePath parameter.

=> ALTER DATABASE DEFAULT SET SNSCAFile = '/etc/ssl/ca-bundle.pem';

Default: "" (empty string)

SNSCAPath (falls back to AWSCAPath)

Path Vertica uses to look up TLS server certificate bundles. If set, this parameter overrides the Vertica default CA bundle path specified in the SystemCABundlePath parameter.

=> ALTER DATABASE DEFAULT SET SNSCAPath = '/etc/ssl/';

Default: "" (empty string)

SNSEnableHttps

Boolean, specifies whether to use the HTTPS protocol when connecting to SNS. This parameter can be set only at the database level with ALTER DATABASE. If you choose not to use TLS, this parameter must be set to 0.

Default: 1 (enabled)

SNSEndpoint
URL of the SNS API endpoint. If this parameter is set to an empty string and the region is specified (either by SNSRegion or its fallback to AWSRegion), Vertica automatically infers the appropriate endpoint.

If you use FIPS, you must manually specify a FIPS-compliant endpoint.

Default: "" (empty string)

SNSRegion (falls back to AWSRegion)
AWS region for the SNSEndpoint. This parameter can only be configured with one region at a time.

Default: "" (empty string)

4.20 - SQS parameters

The following parameters configure Amazon Simple Queue Service (SQS) access.

The following parameters configure access to Amazon Simple Queue Service (SQS) queues. SQS is used with CREATE DATA LOADER to automatically load data from a queue into Vertica. These parameters can be set at the database or session level and some, if unset, fall back to their equivalent S3 parameters.

SQSAuth (falls back to AWSAuth)
ID and secret key for authentication, equivalent to the AccessKeyID and SecretAccessKey in AWS. For example:
=> ALTER DATABASE DEFAULT SET SQSAuth='myID:mySecret';

Default: "" (empty string)

SQSQueueCredentials
Per-queue credentials for accessing SQS queues, a JSON array. When set, these values take precedence over SQSAuth.

Properties:

  • accessKey: Access key for the queue (the ID in AWSAuth)

  • secretAccessKey: Secret access key for the queue (the secret in AWSAuth)

  • sessionToken: AWS session token, only used when SQSQueueCredentials is set at the session level

The following example sets credentials for two queues:

=> ALTER SESSION SET SQSQueueCredentials='[
      {"queue":"queue0","accessKey":"<AK0>","secretAccessKey":"<SAK0>","sessionToken":"1234567890"},
      {"queue":"queue1","accessKey":"foo","secretAccessKey":"bar"}]';

Default: "" (empty string)

SQSSessionToken
Temporary, session-level SQS access token. Set this parameter in a user session using ALTER SESSION.

If you set a session token at the session level, you must also set SQSAuth at the session level, even if it is already set at the database level.

In general, SQSSessionToken should be set only at the session level. While you can set this parameter at the database level, session tokens are ephemeral, so when the database-level token expires, you must set it again either at the session or database level.

4.21 - Security parameters

Use these client authentication configuration parameters and general security parameters to configure TLS.

Query the CONFIGURATION_PARAMETERS system table to determine what levels (node, session, user, database) are valid for a given parameter.

Database parameters

DataSSLParams
This parameter has been deprecated. Use the data_channel TLS Configuration instead.

Enables encryption using SSL on the data channel. The value of this parameter is a comma-separated list of the following:

  • An SSL certificate (chainable)

  • The corresponding SSL private key

  • The SSL CA (Certificate Authority) certificate.

You should set EncryptSpreadComm before setting this parameter.

In the following example, the SSL Certificate contains two certificates, where the certificate for the non-root CA verifies the certificate for the cluster. This is called an SSL Certificate Chain.

=> ALTER DATABASE DEFAULT SET PARAMETER DataSSLParams =
'----BEGIN CERTIFICATE-----<certificate for Cluster>-----END CERTIFICATE-----
-----BEGIN CERTIFICATE-----<certificate for non-root CA>-----END CERTIFICATE-----,
-----BEGIN RSA PRIVATE KEY-----<private key for Cluster>-----END RSA PRIVATE KEY-----,
-----BEGIN CERTIFICATE-----<certificate for public CA>-----END CERTIFICATE-----';
DefaultIdleSessionTimeout
Indicates a default session timeout value for all users where IDLESESSIONTIMEOUT is not set. For example:
=> ALTER DATABASE DEFAULT SET defaultidlesessiontimeout = '300 secs';
  
DHParams

String, a Diffie-Hellman group of at least 2048 bits in the form:

 -----BEGIN DH PARAMETERS-----...-----END DH PARAMETERS-----

You can generate your own or use the pre-calculated Modular Exponential (MODP) Diffie-Hellman groups specified in RFC 3526.

Changes to this parameter do not take effect until you restart the database.

Default: RFC 3526 2048-bit MODP Group 14:

  
-----BEGIN DH PARAMETERS-----MIIBCAKCAQEA///////////
JD9qiIWjCNMTGYouA3BzRKQJOCIpnzHQCC76mOxObIlFKCHmONAT
d75UZs806QxswKwpt8l8UN0/hNW1tUcJF5IW1dmJefsb0TELppjf
tawv/XLb0Brft7jhr+1qJn6WunyQRfEsf5kkoZlHs5Fs9wgB8uKF
jvwWY2kg2HFXTmmkWP6j9JM9fg2VdI9yjrZYcYvNWIIVSu57VKQd
wlpZtZww1Tkq8mATxdGwIyhghfDKQXkYuNs474553LBgOhgObJ4O
i7Aeij7XFXfBvTFLJ3ivL9pVYFxg5lUl86pVq5RXSJhiY+gUQFXK
OWoqsqmj//////////wIBAg==-----END DH PARAMETERS-----
  
DisableInheritedPrivileges
Boolean, whether to disable privilege inheritance at the database level (enabled by default).
DoUserSpecificFilteringInSysTables
Boolean, specifies whether a non-superuser can view details of another user:
  • 0: Users can view details of other users.

  • 1: Users can only view details about themselves.

Default: 0

EnableAllRolesOnLogin
Boolean, specifies whether to automatically enable all roles granted to a user on login:
  • 0: Do not automatically enable roles

  • 1: Automatically enable roles. With this setting, users do not need to run SET ROLE.

Default: 0 (disable)

EnabledCipherSuites
Specifies which SSL cipher suites to use for secure client-server communication. Changes to this parameter apply only to new connections.

Default: Vertica uses the Microsoft Schannel default cipher suites. For more information, see the Schannel documentation.

EnableConnectCredentialForwarding
When using CONNECT TO VERTICA, whether to use the password of the current user to authenticate to a target database.

Default: 0 (disable)

EnableOAuth2JITCleanup
If enabled, users created by just-in-time OAuth provisioning are automatically dropped if the user does not log in after the number of days specified by OAuth2UserExpiredInterval.

To view provisioned users, see USERS.

Default: 0 (disable)

EncryptSpreadComm
Enables Spread encryption on the control channel, set to one of the following strings:
  • vertica: Specifies that Vertica generates the Spread encryption key for the database cluster.

  • aws-kms|key-name, where key-name is a named key in the AWS Key Management Service (KMS). On database restart, Vertica fetches the named key from the KMS instead of generating its own.

If the parameter is empty, Spread communication is unencrypted. In general, you should enable this parameter before modifying other security parameters.

Enabling this parameter requires database restart.
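
For example, to have Vertica generate the Spread encryption key for the cluster (a restart is still required for the change to take effect):
=> ALTER DATABASE DEFAULT SET EncryptSpreadComm = 'vertica';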

GlobalHeirUsername
A string that specifies which user inherits objects after their owners are dropped. This setting ensures preservation of data that would otherwise be lost.

Set this parameter to one of the following string values:

  • Empty string: Objects of dropped users are removed from the database.

  • username: Reassigns objects of dropped users to username. If username does not exist, Vertica creates that user and sets GlobalHeirUsername to it.

  • <auto>: Reassigns objects of dropped LDAP or just-in-time-provisioned users to the dbadmin user. The brackets (< and >) are required for this option.

For more information about usage, see Examples.

Default: <auto>

ImportExportTLSMode
When using CONNECT TO VERTICA to connect to another Vertica cluster for import or export, specifies the degree of stringency for using TLS. Possible values are:
  • PREFER: Try TLS but fall back to plaintext if TLS fails.

  • REQUIRE: Use TLS and fail if the server does not support TLS.

  • VERIFY_CA: Require TLS (as with REQUIRE), and also validate the other server's certificate using the CA specified by the "server" TLS Configuration's CA certificates (in this case, "ca_cert" and "ica_cert"):

    => SELECT name, certificate, ca_certificate, mode FROM tls_configurations WHERE name = 'server';
      name  |   certificate    |   ca_certificate   |   mode
    --------+------------------+--------------------+-----------
     server | server_cert      | ca_cert,ica_cert   | VERIFY_CA
    (1 row)
    
  • VERIFY_FULL: Require TLS and validate the certificate (as with VERIFY_CA), and also validate the server certificate's hostname.

  • REQUIRE_FORCE, VERIFY_CA_FORCE, and VERIFY_FULL_FORCE: Same behavior as REQUIRE, VERIFY_CA, and VERIFY_FULL, respectively, and cannot be overridden by CONNECT TO VERTICA.

Default: PREFER
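
For example, to require TLS with CA validation for all import and export connections, without allowing CONNECT TO VERTICA to override the setting:
=> ALTER DATABASE DEFAULT SET ImportExportTLSMode = 'VERIFY_CA_FORCE';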

InternodeTLSConfig
The TLS Configuration to use for internode encryption.

For example:

=> ALTER DATABASE DEFAULT SET InternodeTLSConfig = my_tls_config;

Default: data_channel

LDAPAuthTLSConfig
The TLS Configuration to use for TLS with LDAP authentication.

For example:

=> ALTER DATABASE DEFAULT SET LDAPAuthTLSConfig = my_tls_config;

Default: ldapauth

LDAPLinkTLSConfig
The TLS Configuration to use for TLS for the LDAP Link service.

For example:

=> ALTER DATABASE DEFAULT SET LDAPLinkTLSConfig = my_tls_config;

Default: ldaplink

OAuth2JITClient

The client/application name that contains Vertica roles in the Keycloak identity provider. The Vertica roles under resource_access.OAuth2JITClient.roles are automatically granted to and set as default roles for users created by just-in-time provisioning. Roles that do not exist in Vertica are ignored.

For details, see Just-in-time user provisioning.

Default: vertica

OAuth2JITRolesClaimName
Specifies an IdP client with one or more roles that correspond to a Vertica database role. During just-in-time user provisioning, each client role that matches an existing Vertica role is automatically assigned to the user. Roles that do not exist in Vertica are ignored.

For authentication records that use JWT validation, Vertica retrieves the client roles from the access token.

To retrieve client roles for authentication records that use IdP validation, Vertica sends a request to either the introspect_url or the userinfo_url set in the authentication record. This parameter requires the full path to the roles claim in that request. For example, the full path to the roles claim in a Keycloak request uses the following format:

resource_access.clientName.roles

For details, see Just-in-time user provisioning.

Default: resource_access.vertica.roles

OAuth2JITForbiddenRoles
Vertica database roles that you do not want Vertica to automatically assign to users during just-in-time provisioning. This parameter accepts one or more roles as a comma-separated list.
OAuth2JITGroupsClaimName
The name of the IdP group whose name or role mappings correspond to Vertica database roles. During just-in-time user provisioning, each group name or group role mapping that matches an existing Vertica role is automatically assigned to the user. Roles that do not exist in Vertica are ignored.

For authentication records that use JWT validation, Vertica retrieves the JIT-provisioned user's groups from the access token.

To retrieve the JIT-provisioned user's groups for authentication records that use IdP validation, Vertica sends a request to either the introspect_url or the userinfo_url set in the authentication record.

Default: groups

OAuth2UserExpiredInterval
If EnableOAuth2JITCleanup is enabled, users created by just-in-time OAuth provisioning are automatically dropped after not logging in for the number of days specified by OAuth2UserExpiredInterval. The number of days the user has not logged in is calculated relative to the LAST_LOGIN_TIME column in the USERS system table.

Default: 14

PasswordLockTimeUnit
The time units for which an account is locked by PASSWORD_LOCK_TIME after FAILED_LOGIN_ATTEMPTS, one of the following:
  • 'd': days (default)

  • 'h': hours

  • 'm': minutes

  • 's': seconds

For example, to configure the default profile to lock user accounts for 30 minutes after three unsuccessful login attempts:

  
=> ALTER DATABASE DEFAULT SET PasswordLockTimeUnit = 'm'
=> ALTER PROFILE DEFAULT LIMIT PASSWORD_LOCK_TIME 30;
  
RequireFIPS
Boolean, specifies whether the FIPS mode is enabled:
  • 0 (disable)

  • 1 (enable)

On startup, Vertica automatically sets this parameter from the contents of the file crypto.fips_enabled. You cannot modify this parameter.

For details, see FIPS compliance for the Vertica server.

Default: 0

SecurityAlgorithm
Sets the algorithm for the function that hash authentication uses, one of the following:
  • SHA512

  • MD5

For example:

=> ALTER DATABASE DEFAULT SET SecurityAlgorithm = 'SHA512';

Default: SHA512

ServerTLSConfig
The TLS Configuration to use for client-server TLS.

For example:

=> ALTER DATABASE DEFAULT SET ServerTLSConfig = my_tls_config;

Default: server

SystemCABundlePath
The absolute path to a certificate bundle of trusted CAs. This CA bundle is used when establishing TLS connections to external services such as AWS or Azure through their respective SDKs and libcurl. The CA bundle file must be in the same location on all nodes.

If this parameter is empty, Vertica searches the "standard" paths for the CA bundles, which differ between distributions:

  • Red Hat-based: /etc/pki/tls/certs/ca-bundle.crt
  • Debian-based: /etc/ssl/certs/ca-certificates.crt
  • SUSE: /var/lib/ca-certificates/ca-bundle.pem

Example:

=> ALTER DATABASE DEFAULT SET SystemCABundlePath = 'path/to/ca_bundle.pem';
  

Default: Empty

TLS parameters

To set your Vertica database's TLSMode, private key, server certificate, and CA certificate(s), see TLS configurations. In versions prior to 11.0.0, these parameters were known as EnableSSL, SSLPrivateKey, SSLCertificate, and SSLCA, respectively.

Examples

Set the database parameter GlobalHeirUsername:

=> \du
      List of users
 User name | Is Superuser
-----------+--------------
 Joe       | f
 SuzyQ     | f
 dbadmin   | t
(3 rows)

=> ALTER DATABASE DEFAULT SET PARAMETER GlobalHeirUsername='SuzyQ';
ALTER DATABASE
=>  \c - Joe
You are now connected as user "Joe".
=> CREATE TABLE t1 (a int);
CREATE TABLE

=> \c
You are now connected as user "dbadmin".
=> \dt t1
             List of tables
 Schema | Name | Kind  | Owner | Comment
--------+------+-------+-------+---------
 public | t1   | table | Joe   |
(1 row)

=> DROP USER Joe;
NOTICE 4927:  The Table t1 depends on User Joe
ROLLBACK 3128:  DROP failed due to dependencies
DETAIL:  Cannot drop User Joe because other objects depend on it
HINT:  Use DROP ... CASCADE to drop the dependent objects too
=> DROP USER Joe CASCADE;
DROP USER
=> \dt t1
             List of tables
 Schema | Name | Kind  | Owner | Comment
--------+------+-------+-------+---------
 public | t1   | table | SuzyQ |
(1 row)

4.22 - Stored procedure parameters

The following parameters configure the behavior of stored procedures and triggers.

EnableNestedStoredProcedures
Boolean, whether to enable nested stored procedures.

Default: 0

EnableStoredProcedureScheduler
Boolean, whether to enable the scheduler. For details, see Scheduled execution.

Default: 1

PLvSQLCoerceNull
Boolean, whether to allow NULL-to-false type coercion to improve compatibility with PLpgSQL. For details, see PL/pgSQL to PL/vSQL migration guide.

Default: 0

4.23 - Text search parameters

You can configure Vertica for text search using the following parameter.

TextIndexMaxTokenLength
Controls the maximum size of a token in a text index.

For example:

=> ALTER DATABASE database_name SET PARAMETER TextIndexMaxTokenLength=760;

If the parameter is set to a value greater than 65000 characters, then the tokenizer truncates the token at 65000 characters.

Default: 128 (characters)

4.24 - Tuple mover parameters

These parameters control how the Tuple Mover operates.

Query the CONFIGURATION_PARAMETERS system table to determine what levels (node, session, user, database) are valid for a given parameter.
ActivePartitionCount
Sets the number of active partitions. The active partitions are those most recently created. For example:
=> ALTER DATABASE DEFAULT SET ActivePartitionCount = 2;

For information about how the Tuple Mover treats active and inactive partitions during a mergeout operation, see Partition mergeout.

Default: 1

CancelTMTimeout
When partition, copy table, and rebalance operations encounter a conflict with an internal Tuple Mover job, those operations attempt to cancel the conflicting Tuple Mover job. This parameter specifies the amount of time, in seconds, that the blocked operation waits for the Tuple Mover cancellation to take effect. If the operation is unable to cancel the Tuple Mover job within the limit specified by this parameter, the operation displays an error and rolls back.

Default: 300
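
For example, to let blocked operations wait up to 10 minutes for a conflicting Tuple Mover job to cancel (the value is illustrative):

=> ALTER DATABASE DEFAULT SET CancelTMTimeout = 600;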

EnableTMOnRecoveringNode
Boolean, specifies whether Tuple Mover performs mergeout activities on nodes with a node state of RECOVERING. Enabling Tuple Mover reduces the number of ROS containers generated during recovery. Having fewer than 1024 ROS containers per projection allows Vertica to maintain optimal recovery performance.

Default: 1 (enabled)

MaxMrgOutROSSizeMB
Specifies in MB the maximum size of ROS containers that are candidates for mergeout operations. The Tuple Mover avoids merging ROS containers that are larger than this setting.

Default: -1 (no maximum limit)
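
For example, a possible setting that excludes ROS containers larger than 5 GB from mergeout (the value is illustrative):

=> ALTER DATABASE DEFAULT SET MaxMrgOutROSSizeMB = 5120;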

MergeOutInterval
Specifies in seconds how frequently the Tuple Mover checks the mergeout request queue for pending requests:
  1. If the queue contains mergeout requests, the Tuple Mover does nothing and goes back to sleep.

  2. If the queue is empty, the Tuple Mover:

    • Processes pending storage location move requests.

    • Checks for new unqueued purge requests and adds them to the queue.

    It then goes back to sleep.

Default: 600
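
For example, to have the Tuple Mover check the mergeout request queue every five minutes (the value is illustrative):

=> ALTER DATABASE DEFAULT SET MergeOutInterval = 300;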

PurgeMergeoutPercent
Specifies as a percentage the threshold of deleted records in a ROS container that invokes an automatic mergeout operation, to purge those records. Vertica only counts the number of 'aged-out' delete vectors—that is, delete vectors that are as 'old' or older than the ancient history mark (AHM) epoch.

This threshold applies to all ROS containers for non-partitioned tables. It also applies to ROS containers of all inactive partitions. In both cases, aged-out delete vectors are permanently purged from the ROS container.

Default: 20 (percent)
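
For example, to trigger an automatic purge only when at least half of a ROS container consists of aged-out deleted records (the value is illustrative):

=> ALTER DATABASE DEFAULT SET PurgeMergeoutPercent = 50;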

5 - File systems and object stores

Vertica supports access to several file systems and object stores in addition to the Linux file system. The reference pages in this section provide information on URI syntax, configuration parameters, and authentication.

Vertica accesses the file systems in this section in one of two ways:

  • If user-provided credentials are present, Vertica uses them to access the storage. Note that on HDFS, user credentials are always present because Vertica accesses HDFS using the Vertica user identity.

  • If user-provided credentials are not present, or if the UseServerIdentityOverUserIdentity configuration parameter is set, Vertica checks for a configured USER storage location. When access is managed through USER storage locations, Vertica uses the server credential to access the file system. For more information about USER storage locations, see CREATE LOCATION.

Not all file systems are supported in all contexts. See the documentation of specific features for the file systems those features support.

5.1 - Azure Blob Storage object store

Azure has several interfaces for accessing data. Vertica reads and always writes Block Blobs in Azure Storage. Vertica can read external data created using ADLS Gen2, and data that Vertica exports can be read using ADLS Gen2.

URI format

One of the following:

  • azb://account/container/path
  • azb://[account@]host[:port]/container/path

In the first version, a URI like 'azb://myaccount/mycontainer/path' treats the first token after the '//' as the account name. In the second version, host is required and account is optional.

The following rules apply to the second form:

  • If account is not specified, the first label of the host is used. For example, if the URI is 'azb://myaccount.blob.core.windows.net/mycontainer/my/object', then 'myaccount' is used for account.
  • If account is not specified and host has a single label and no port, the endpoint is host.blob.core.windows.net. Otherwise, the endpoint is the host and port specified in the URI.

The protocol (HTTP or HTTPS) is specified in the AzureStorageEndpointConfig configuration parameter.

Authentication

If you are using Azure managed identities, no further configuration in Vertica is needed. If your Azure storage uses multiple managed identities, you must tag the one to be used. Vertica looks for an Azure tag with a key of VerticaManagedIdentityClientId, the value of which must be the client_id attribute of the managed identity to be used. If you update the Azure tag, call AZURE_TOKEN_CACHE_CLEAR.
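
For example, after updating the tag you might clear the cached token as follows (a minimal sketch, assuming the function is called without arguments):

=> SELECT AZURE_TOKEN_CACHE_CLEAR();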

If you are not using managed identities, use the AzureStorageCredentials configuration parameter to provide credentials to Azure. If loading data, you can set the parameter at the session level. If using Eon Mode communal storage on Azure, you must set this configuration parameter at the database level.

In Azure you must also grant access to the containers for the identities used from Vertica.

Configuration parameters

The following database configuration parameters apply to the Azure blob file system. You can set parameters at different levels with the appropriate ALTER statement, such as ALTER SESSION...SET PARAMETER. Query the CONFIGURATION_PARAMETERS system table to determine what levels (node, session, user, database) are valid for a given parameter.

For external tables using highly partitioned data in an object store, see the ObjectStoreGlobStrategy configuration parameter and Partitions on Object Stores.

AzureStorageCredentials
Collection of JSON objects, each of which specifies connection credentials for one endpoint. This parameter takes precedence over Azure managed identities.

The collection must contain at least one object and may contain more. Each object must specify at least one of accountName or blobEndpoint, and at least one of accountKey or sharedAccessSignature.

  • accountName: If not specified, uses the label of blobEndpoint.
  • blobEndpoint: Host name with optional port (host:port). If not specified, uses account.blob.core.windows.net.
  • accountKey: Access key for the account or endpoint.
  • sharedAccessSignature: Access token for finer-grained access control, if being used by the Azure endpoint.
AzureStorageEndpointConfig
Collection of JSON objects, each of which specifies configuration elements for one endpoint. Each object must specify at least one of accountName or blobEndpoint.
  • accountName: If not specified, uses the label of blobEndpoint.
  • blobEndpoint: Host name with optional port (host:port). If not specified, uses account.blob.core.windows.net.
  • protocol: HTTPS (default) or HTTP.
  • isMultiAccountEndpoint: true if the endpoint supports multiple accounts, false otherwise (default is false). To use multiple-account access, you must include the account name in the URI. If a URI path contains an account, this value is assumed to be true unless explicitly set to false.

Examples

The following examples use these values for the configuration parameters. AzureStorageCredentials contains sensitive information and is set at the session level in this example.

=> ALTER SESSION SET AzureStorageCredentials =
    '[{"accountName": "myaccount", "accountKey": "REAL_KEY"},
      {"accountName": "myaccount", "blobEndpoint": "localhost:8080", "accountKey": "TEST_KEY"}]';

=> ALTER DATABASE default SET AzureStorageEndpointConfig =
    '[{"accountName": "myaccount", "blobEndpoint": "localhost:8080", "protocol": "http"}]';

The following example creates an external table using data from Azure. The URI specifies an account name of "myaccount".

=> CREATE EXTERNAL TABLE users (id INT, name VARCHAR(20))
    AS COPY FROM 'azb://myaccount/mycontainer/my/object/*';

Vertica uses AzureStorageEndpointConfig and the account name to produce the following location for the files:

https://myaccount.blob.core.windows.net/mycontainer/my/object/*

Data is accessed using the REAL_KEY credential.

If the URI in the COPY statement is instead azb://myaccount.blob.core.windows.net/mycontainer/my/object, then the resulting location is https://myaccount.blob.core.windows.net/mycontainer/my/object, again using the REAL_KEY credential.

However, if the URI in the COPY statement is azb://myaccount@localhost:8080/mycontainer/my/object, then the host and port specify a different endpoint: http://localhost:8080/myaccount/mycontainer/my/object. This endpoint is configured to use a different credential, TEST_KEY.

5.2 - Google Cloud Storage (GCS) object store

File system using the Google Cloud Storage platform.

URI format

gs://bucket/path

Authentication

To access data in Google Cloud Storage (GCS) you must first do the following tasks:

  • Create a default project, obtain a developer key, and enable S3 interoperability mode as described in the GCS documentation.

  • Set the GCSAuth configuration parameter as in the following example.

    => ALTER SESSION SET GCSAuth='id:secret';
    

Configuration parameters

The following database configuration parameters apply to the GCS file system. You can set parameters at different levels with the appropriate ALTER statement, such as ALTER SESSION...SET PARAMETER. Query the CONFIGURATION_PARAMETERS system table to determine what levels (node, session, user, database) are valid for a given parameter. For information about all parameters related to GCS, see Google Cloud Storage parameters.

For external tables using highly partitioned data in an object store, see the ObjectStoreGlobStrategy configuration parameter and Partitions on Object Stores.

GCSAuth
An ID and secret key to authenticate to GCS. For extra security, do not store credentials in the database; instead, use ALTER SESSION...SET PARAMETER to set this value for the current session only.
GCSEnableHttps
Boolean, whether to use the HTTPS protocol when connecting to GCS, can be set only at the database level with ALTER DATABASE...SET PARAMETER.

Default: 1 (enabled)

GCSEndpoint
The connection endpoint address.

Default: storage.googleapis.com

Examples

The following example loads data from GCS:

=> ALTER SESSION SET GCSAuth='my_id:my_secret_key';

=> COPY t FROM 'gs://DataLake/clicks.parquet' PARQUET;

5.3 - HDFS file system

HDFS is the Hadoop Distributed File System. You can use the webhdfs and swebhdfs schemes to access data through the WebHDFS service. Vertica also supports the hdfs scheme, which by default uses WebHDFS. To have hdfs URIs use the deprecated LibHDFS++ package, set the HDFSUseWebHDFS configuration parameter to 0 (disabled).

If you specify a webhdfs URI but the Hadoop HTTP policy (dfs.http.policy) is set to HTTPS_ONLY, Vertica automatically uses swebhdfs instead.

If you use LibHDFS++, the WebHDFS service must still be available because Vertica falls back to WebHDFS for operations not supported by LibHDFS++.

URI format

URIs in the webhdfs, swebhdfs, and hdfs schemes all have two formats, depending on whether you specify a name service or the host and port of a name node:

  • [[s]web]hdfs://[nameservice]/path
  • [[s]web]hdfs://namenode-host:port/path

Characters may be URL-encoded (%NN where NN is a two-digit hexadecimal number) but are not required to be, except that the '%' character must be encoded.

To use the default name service specified in the HDFS configuration files, omit nameservice. Use this shorthand only for reading external data, not for creating a storage location.

Always specify a name service or host explicitly when using Vertica with more than one HDFS cluster. The name service or host name must be globally unique. Using [web]hdfs:/// could produce unexpected results because Vertica uses the first value of fs.defaultFS that it finds.

Authentication

Vertica can use Kerberos authentication with Cloudera or Hortonworks HDFS clusters. See Accessing kerberized HDFS data.

For loading and exporting data, Vertica can access HDFS clusters protected by mTLS through the swebhdfs scheme. You must create a certificate and key and set the WebhdfsClientCertConf configuration parameter.

You can use CREATE KEY and CREATE CERTIFICATE to create temporary, session-scoped values if you specify the TEMPORARY keyword. Temporary keys and certificates are stored in memory, not on disk.

The WebhdfsClientCertConf configuration parameter holds client credentials for one or more HDFS clusters. The value is a JSON string listing name services or authorities and their corresponding keys. You can set the configuration parameter at the session or database level. Setting the parameter at the database level has the following additional requirements:

  • The UseServerIdentityOverUserIdentity configuration parameter must be set to 1 (true).

  • The user must be dbadmin or must have access to the user storage location on HDFS.

The following example shows how to use mTLS. The key and certificate values themselves are not shown, just the beginning and end markers:

=> CREATE TEMPORARY KEY client_key TYPE 'RSA'
   AS '-----BEGIN PRIVATE KEY-----...-----END PRIVATE KEY-----';

=> CREATE TEMPORARY CERTIFICATE client_cert
   AS '-----BEGIN CERTIFICATE-----...-----END CERTIFICATE-----' key client_key;

=> ALTER SESSION SET WebhdfsClientCertConf =
   '[{"authority": "my.hdfs.namenode1:50088", "certName": "client_cert"}]';

=> COPY people FROM 'swebhdfs://my.hdfs.namenode1:50088/path/to/file/1.txt';
Rows Loaded
-------------
1
(1 row)

To configure access to more than one HDFS cluster, define the keys and certificates and then include one object per cluster in the value of WebhdfsClientCertConf:

=> ALTER SESSION SET WebhdfsClientCertConf =
    '[{"authority" : "my.authority.com:50070", "certName" : "myCert"},
      {"nameservice" : "prod", "certName" : "prodCert"}]';

Configuration parameters

The following database configuration parameters apply to the HDFS file system. You can set parameters at different levels with the appropriate ALTER statement, such as ALTER SESSION...SET PARAMETER. Query the CONFIGURATION_PARAMETERS system table to determine what levels (node, session, user, database) are valid for a given parameter. For information about all parameters related to Hadoop, see Hadoop parameters.

EnableHDFSBlockInfoCache
Boolean, whether to distribute block location metadata collected during planning on the initiator to all database nodes for execution, reducing name node contention. Disabled by default.
HadoopConfDir
Directory path containing the XML configuration files copied from Hadoop. The same path must be valid on every Vertica node. The files are accessed by the Linux user under which the Vertica server process runs.
HadoopImpersonationConfig
Session parameter specifying the delegation token or Hadoop user for HDFS access. See HadoopImpersonationConfig format for information about the value of this parameter and Proxy users and delegation tokens for more general context.
HDFSUseWebHDFS
Boolean. If true (the default), URIs in the hdfs scheme are treated as if they were in the webhdfs scheme. If false, Vertica uses LibHDFS++ where possible, though some operations can still use WebHDFS if not supported by LibHDFS++.
WebhdfsClientCertConf
mTLS configurations for accessing one or more WebHDFS servers, a JSON string. Each object must specify either a nameservice or authority field and a certName field. See Authentication.

Configuration files

The path specified in HadoopConfDir must include a directory containing the files listed in the following table. Vertica reads these files at database start time. If you do not set a value, Vertica looks for the files in /etc/hadoop/conf.

If a property is not defined, Vertica uses the defaults shown in the table. If no default is specified for a property, the configuration files must specify a value.

File            Property                                                              Default
core-site.xml   fs.defaultFS                                                          none
                hadoop.proxyuser.username.users (for doAs users)                      none
                hadoop.proxyuser.username.hosts (for doAs users)                      none
hdfs-site.xml   dfs.client.failover.max.attempts                                      15
                dfs.client.failover.sleep.base.millis                                 500
                dfs.client.failover.sleep.max.millis                                  15000
                dfs.nameservices (for HA NN)                                          none
                dfs.namenode.http-address or dfs.namenode.https-address (WebHDFS)     none
                dfs.datanode.http.address or dfs.datanode.https.address (WebHDFS)     none
                dfs.http.policy (WebHDFS)                                             HTTP_ONLY

If using High Availability (HA) Name Nodes, the individual name nodes must also be defined in hdfs-site.xml.

To verify that Vertica can find configuration files in HadoopConfDir, use the VERIFY_HADOOP_CONF_DIR function.

To test access through the hdfs scheme, use the HDFS_CLUSTER_CONFIG_CHECK function.

For more information about testing your configuration, see Verifying HDFS configuration.

To reread the configuration files, use the CLEAR_HDFS_CACHES function.
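
For example, a minimal sketch of running these checks, assuming each function is called without arguments:

=> SELECT VERIFY_HADOOP_CONF_DIR();
=> SELECT HDFS_CLUSTER_CONFIG_CHECK();
=> SELECT CLEAR_HDFS_CACHES();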

Name nodes and name services

You can access HDFS data using the default name node by not specifying a name node or name service:

=> COPY users FROM 'webhdfs:///data/users.csv';

Vertica uses the fs.defaultFS Hadoop configuration parameter to find the name node. (It then uses that name node to locate the data.) You can instead specify a host and port explicitly using the following format:

webhdfs://nn-host:nn-port/

The specified host is the name node, not an individual data node. If you are using High Availability (HA) Name Nodes, you should not use an explicit host because high availability is provided through name services instead.

If the HDFS cluster uses High Availability Name Nodes or defines name services, use the name service instead of the host and port, in the format webhdfs://nameservice/. The name service you specify must be defined in hdfs-site.xml.

The following example shows how you can use a name service, hadoopNS:

=> CREATE EXTERNAL TABLE users (id INT, name VARCHAR(20))
    AS COPY FROM 'webhdfs://hadoopNS/data/users.csv';

If you are using Vertica to access data from more than one HDFS cluster, always use explicit name services or hosts in the URL. Using the /// shorthand could produce unexpected results because Vertica uses the first value of fs.defaultFS that it finds. To access multiple HDFS clusters, you must use host and service names that are globally unique. See Configuring HDFS access for more information.

5.4 - S3 object store

File systems using the S3 protocol, including AWS, Pure Storage, and MinIO.

URI format

s3://bucket/path

For AWS, specify the region using the AWSRegion configuration parameter, not the URI. If the region is incorrect, you might experience a delay before the load fails because Vertica retries several times before giving up. The default region is us-east-1.

Authentication

For AWS:

  • To access S3 you must create an IAM role and grant that role permission to access your S3 resources.

  • By default, bucket access is restricted to the communal storage bucket. Use an AWS access key to load data from non-communal storage buckets.

  • Either set the AWSAuth configuration parameter to provide credentials or create a USER storage location for the S3 path (see CREATE LOCATION) and grant users access.

  • You can use AWS STS temporary session tokens to load data. Because they are session tokens, do not use them for access to storage locations.

  • You can configure S3 buckets individually with the per-bucket parameters S3BucketConfig and S3BucketCredentials. For details, see Per-bucket S3 configurations.

Configuration parameters

The following database configuration parameters apply to the S3 file system. You can set parameters at different levels with the appropriate ALTER statement, such as ALTER SESSION...SET PARAMETER. Query the CONFIGURATION_PARAMETERS system table to determine what levels (node, session, user, database) are valid for a given parameter.

You can configure individual buckets using the S3BucketConfig and S3BucketCredentials parameters instead of the global parameters.

For external tables using highly partitioned data in an object store, see the ObjectStoreGlobStrategy configuration parameter and Partitions on Object Stores.

The following descriptions are summaries. For details about all parameters specific to S3, see S3 parameters.

AWSAuth
An ID and secret key for authentication. AWS calls these AccessKeyID and SecretAccessKey. For extra security, do not store credentials in the database; use ALTER SESSION...SET PARAMETER to set this value for the current session only.
AWSCAFile
The file name of the TLS server certificate bundle to use. You must set a value when installing a CA certificate on a SUSE Linux Enterprise Server.
AWSCAPath
The path Vertica uses to look up TLS server certificates. You must set a value when installing a CA certificate on a SUSE Linux Enterprise Server.
AWSEnableHttps
Boolean, whether to use the HTTPS protocol when connecting to S3. Can be set only at the database level. You can set the protocol for individual buckets using S3BucketConfig.

Default: 1 (enabled)

AWSEndpoint
String, the endpoint host for all S3 URLs, set as follows:
  • AWS: hostname_or_IP:port. Do not include the scheme (http(s)).

  • AWS with a FIPS-compliant S3 Endpoint: Hostname of a FIPS-compliant S3 endpoint. You must also enable S3EnableVirtualAddressing.

  • On-premises/Pure: IP address of the Pure Storage server.

If not set, Vertica uses virtual-hosted request URLs.

Default: 's3.amazonaws.com'

AWSLogLevel
The log level, one of: OFF, FATAL, ERROR, WARN, INFO, DEBUG, or TRACE.

Default: ERROR
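
For example, to capture more detail while troubleshooting S3 requests (the level shown is illustrative):

=> ALTER DATABASE DEFAULT SET AWSLogLevel = 'INFO';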

AWSRegion
The AWS region containing the S3 bucket from which to read files. This parameter can only be configured with one region at a time. Failure to set the correct region can lead to a delay before queries fail.

Default: 'us-east-1'

AWSSessionToken
A temporary security token generated by running the get-session-token command, used to configure multi-factor authentication.
AWSStreamingConnectionPercentage
In Eon Mode, the number of connections to the communal storage to use for streaming reads. In a cloud environment, this setting helps prevent streaming data from using up all available file handles. This setting is unnecessary when using on-premises object stores because of their lower latency.
S3BucketConfig
A JSON array of objects specifying per-bucket configuration overrides. Each property other than the bucket name has a corresponding configuration parameter (shown in parentheses). If both the database-level parameter and its equivalent in S3BucketConfig are set, the value in S3BucketConfig takes precedence.

Properties:

  • bucket: Bucket name

  • region (AWSRegion)

  • protocol (AWSEnableHttps): Connection protocol, one of http or https

  • endpoint (AWSEndpoint)

  • enableVirtualAddressing (S3EnableVirtualAddressing): Boolean, whether to rewrite the S3 URL to use a virtual hosted path

  • requesterPays (S3RequesterPays)

  • serverSideEncryption (S3ServerSideEncryption)

  • sseCustomerAlgorithm (S3SseCustomerAlgorithm)

  • sseCustomerKey (S3SseCustomerKey)

  • sseKmsKeyId (S3SseKmsKeyId)

  • proxy (S3Proxy)

S3BucketCredentials
A JSON object specifying per-bucket credentials. Each property other than the bucket name has a corresponding configuration parameter. If both the database-level parameter and its equivalent in S3BucketCredentials are set, the value in S3BucketCredentials takes precedence.

Properties:

  • bucket: Bucket name

  • accessKey: Access key for the bucket (the ID in AWSAuth)

  • secretAccessKey: Secret access key for the bucket (the secret in AWSAuth)

  • sessionToken: Session token, only used when S3BucketCredentials is set at the session level (AWSSessionToken)

This parameter is only visible to superusers. Users can set this parameter at the session level with ALTER SESSION.

S3EnableVirtualAddressing
Boolean, whether to rewrite S3 URLs to use virtual-hosted paths (disabled by default). This configuration setting takes effect only when you have specified a value for AWSEndpoint.

If you set AWSEndpoint to a FIPS-compliant S3 endpoint, you must enable S3EnableVirtualAddressing.

The value of this parameter does not affect how you specify S3 paths.

S3Proxy
HTTP(S) proxy settings, if needed, a string in the following format: http[s]://[user:password@]host[:port].
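
For example, a possible proxy setting; the host name and port are placeholders:

=> ALTER DATABASE DEFAULT SET S3Proxy = 'http://proxy.example.com:3128';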
S3RequesterPays
Boolean, whether requester (instead of bucket owner) pays the cost of accessing data on the bucket.
S3ServerSideEncryption
String, encryption algorithm to use when reading or writing to S3. Supported values are AES256 (for SSE-S3), aws:kms (for SSE-KMS), and an empty string (for no encryption). See Server-Side Encryption.

Default: "" (no encryption)

S3SseCustomerAlgorithm
String, the encryption algorithm to use when reading or writing to S3 using SSE-C encryption. The only supported values are AES256 and "". For SSE-S3 and SSE-KMS, instead use S3ServerSideEncryption.

Default: "" (no encryption)

S3SseCustomerKey
If using SSE-C encryption, the client key for S3 access.
S3SseKmsKeyId
If using SSE-KMS encryption, the key identifier (not the key) to pass to the Key Management Service. Vertica must have permission to use the key, which is managed through KMS.

Write performance

By default, Vertica performs writes using a single thread, but a single write usually includes multiple files or parts of files. For writes to S3, you can use a larger thread pool to perform writes in parallel. This thread pool is used for all file writes to S3, including file exports and writes to communal storage.

The size of the thread pool is controlled by the ObjStoreUploadParallelism configuration parameter. Each node has a single thread pool used for all file writes. In general, one or two threads per concurrent writer produces good results.
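
For example, a node that typically runs two concurrent writers might use a pool of four threads (the value is illustrative; tune for your workload):

=> ALTER DATABASE DEFAULT SET ObjStoreUploadParallelism = 4;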

Server-side encryption

By default, Vertica reads and writes S3 data that is not encrypted. If the S3 bucket uses server-side encryption (SSE), you can configure Vertica to access it. S3 supports three types of server-side encryption: SSE-S3, SSE-KMS, and SSE-C.

Vertica must also have read or write permissions (depending on the operation) on the bucket.

SSE-S3

With SSE-S3, the S3 service manages encryption keys. Reads do not require additional configuration. To write to S3, the client (Vertica, in this case) must specify only the encryption algorithm.

If the S3 bucket is configured with the default encryption settings, Vertica can read and write data to it with no further changes. If the bucket does not use the default encryption settings, set the S3ServerSideEncryption configuration parameter or the serverSideEncryption field in S3BucketConfig to AES256.
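
For example, a minimal sketch of enabling SSE-S3 at the database level:

=> ALTER DATABASE DEFAULT SET S3ServerSideEncryption = 'AES256';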

SSE-KMS

With SSE-KMS, encryption keys are managed by the Key Management Service (KMS). The client must supply a KMS key identifier (not the actual key) when writing data. For all operations, the client must have permission to use the KMS key. These permissions are managed in KMS, not in Vertica.

To use SSE-KMS:

  • Set the S3ServerSideEncryption configuration parameter or the serverSideEncryption field in S3BucketConfig to aws:kms.

  • Set the S3SseKmsKeyId configuration parameter or the sseKmsKeyId field in S3BucketConfig to the key ID.
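
For example, a minimal sketch using the database-level parameters; the key ID is a placeholder:

=> ALTER DATABASE DEFAULT SET S3ServerSideEncryption = 'aws:kms';
=> ALTER DATABASE DEFAULT SET S3SseKmsKeyId = '1234abcd-12ab-34cd-56ef-1234567890ab';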

SSE-C

With SSE-C, the client manages encryption keys and provides them to S3 for each operation.

To use SSE-C:

  • Set the S3SseCustomerAlgorithm configuration parameter or the sseCustomerAlgorithm field in S3BucketConfig to AES256.

  • Set the S3SseCustomerKey configuration parameter or the sseCustomerKey field in S3BucketConfig to the access key. The value can be either a 32-character plaintext key or a 44-character base64-encoded key.
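
For example, a minimal sketch that sets the SSE-C parameters at the session level so the customer key is not stored in the catalog; the key value is a placeholder, and you should query CONFIGURATION_PARAMETERS to confirm that session-level settings are allowed for these parameters:

=> ALTER SESSION SET S3SseCustomerAlgorithm = 'AES256';
=> ALTER SESSION SET S3SseCustomerKey = '<32-character-plaintext-or-44-character-base64-key>';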

HTTP(S) logging (AWS)

When a request to AWS fails or is retried, or Vertica detects a performance degradation, Vertica logs the event in the UDFS_EVENTS system table. The event type for these records is HttpRequestAttempt, and the description is a JSON object with more details. For example:

=> SELECT filesystem, event, description FROM UDFS_EVENTS;
 filesystem |        event        |                                                                 description
------------+---------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------
 S3         |  HttpRequestAttempt | {"AttemptStartTimeGMT":"2024-01-18 16:45:52.663","RetryCount":0,"HttpMethod":"GET","URIString":"s3://mybucket/s3_2.dat","AmzRequestId":"79104EXAMPLEB723","AmzId2":"IOWQ4fDEXAMPLEQM+ey7N9WgVhSnQ6JEXAMPLEZb7hSQDASK+Jd1vEXAMPLEa3Km","HttpStatusCode":404,"HttpRequestLatency":2,"SdkExceptionMessage":"No response body."}
 S3         |  HttpRequestAttempt | {"AttemptStartTimeGMT":"2024-01-18 16:46:02.663","RetryCount":1,"HttpMethod":"GET","URIString":"s3://mybucket/s3_2.dat","AmzRequestId":"79104EXAMPLEB791","AmzId2":"JPXQ4fDEXAMPLEQM+ey7N9WgVhSnQ6JEXAMPLEZb7hSQDASK+Jd1vEXAMPLEa4Ln","HttpStatusCode":206,"HttpRequestLatency":1}
(2 rows)

To extract fields from this description, use the MAPJSONEXTRACTOR function:

=> SELECT filesystem, event, 
     (MapJSONExtractor(description))['RetryCount'] as RetryCount, 
     (MapJSONExtractor(description))['HttpStatusCode'] as HttpStatusCode
   FROM UDFS_EVENTS;
 filesystem |        event        | RetryCount | HttpStatusCode
------------+---------------------+------------+-----------------
 S3         |  HttpRequestAttempt |          0 |            404
 S3         |  HttpRequestAttempt |          1 |            206
(2 rows)

For details on AWS-specific fields such as AmzRequestId, see the AWS documentation.

The UDFS_STATISTICS system table records the accumulated duration for all HTTP requests in the TOTAL_REQUEST_DURATION_MS column. This table also provides computed values for average duration and throughput.

Examples

The following example sets a database-wide AWS region and credentials:

=> ALTER DATABASE DEFAULT SET AWSRegion='us-west-1';
=> ALTER DATABASE DEFAULT SET AWSAuth = 'myaccesskeyid123456:mysecretaccesskey123456789012345678901234';

The following example loads data from S3. You can use a glob if all files in the glob can be loaded together. In the following example, the datalake bucket contains only ORC files.

=> COPY t FROM 's3://datalake/*' ORC;

You can specify a list of comma-separated S3 buckets as in the following example. All buckets must be in the same region. To load from more than one region, use separate COPY statements and change the value of AWSRegion between calls.

=> COPY t FROM 's3://AWS_Data_1/sales.parquet', 's3://AWS_Data_2/sales.parquet' PARQUET;

The following example creates a user storage location and a role, so that users without their own S3 credentials can read data from S3 using the server credential.

   --- set database-level credential (once):
=> ALTER DATABASE DEFAULT SET AWSAuth = 'myaccesskeyid123456:mysecretaccesskey123456789012345678901234';

=> CREATE LOCATION 's3://datalake' SHARED USAGE 'USER' LABEL 's3user';

=> CREATE ROLE ExtUsers;
   --- Assign users to this role using GRANT (Role).

=> GRANT READ ON LOCATION 's3://datalake' TO ExtUsers;

The configuration properties for a given bucket may differ based on its type. The following S3BucketConfig setting is for an AWS bucket (AWSBucket) and a Pure Storage bucket (PureStorageBucket). AWSBucket doesn't specify an endpoint, so Vertica uses the AWSEndpoint configuration parameter, which defaults to s3.amazonaws.com:

=> ALTER DATABASE DEFAULT SET S3BucketConfig=
'[
    {
        "bucket": "AWSBucket",
        "region": "us-east-2",
        "protocol": "https",
        "requesterPays": true,
        "serverSideEncryption": "aes256"
    },
    {
        "bucket": "PureStorageBucket",
        "endpoint": "pure.mycorp.net:1234",
        "protocol": "http",
        "enableVirtualAddressing": false
    }
]';

The following example sets S3BucketCredentials for these two buckets:

=> ALTER SESSION SET S3BucketCredentials='
[
    {
        "bucket": "AWSBucket",
        "accessKey": "<AK0>",
        "secretAccessKey": "<SAK0>",
        "sessionToken": "1234567890"
    },
    {
        "bucket": "PureStorageBucket",
        "accessKey": "<AK1>",
        "secretAccessKey": "<SAK1>"
    }
]';

The following example sets an STS temporary session token. Vertica uses the session token to access S3 with the specified credentials and bypasses checking for a USER storage location.

$ aws sts get-session-token
  {
    "Credentials": {
        "AccessKeyId": "ASIAJZQNDVS727EHDHOQ",
        "SecretAccessKey": "F+xnpkHbst6UPorlLGj/ilJhO5J2n3Yo7Mp4vYvd",
        "SessionToken": "FQoDYXdzEKv//////////wEaDMWKxakEkCyuDH0UjyKsAe6/3REgW5VbWtpuYyVvSnEK1jzGPHi/jPOPNT7Kd+ftSnD3qdaQ7j28SUW9YYbD50lcXikz/HPlusPuX9sAJJb7w5oiwdg+ZasIS/+ejFgCzLeNE3kDAzLxKKsunvwuo7EhTTyqmlLkLtIWu9zFykzrR+3Tl76X7EUMOaoL31HOYsVEL5d9I9KInF0gE12ZB1yN16MsQVxpSCavOFHQsj/05zbxOQ4o0erY1gU=",
        "Expiration": "2018-07-18T05:56:33Z"
    }
  }

$ vsql
=> ALTER SESSION SET AWSAuth = 'ASIAJZQNDVS727EHDHOQ:F+xnpkHbst6UPorlLGj/ilJhO5J2n3Yo7Mp4vYvd';
=> ALTER SESSION SET AWSSessionToken = 'FQoDYXdzEKv//////////wEaDMWKxakEkCyuDH0UjyKsAe6/3REgW5VbWtpuYyVvSnEK1jzGPHi/jPOPNT7Kd+ftSnD3qdaQ7j28SUW9YYbD50lcXikz/HPlusPuX9sAJJb7w5oiwdg+ZasIS/+ejFgCzLeNE3kDAzLxKKsunvwuo7EhTTyqmlLkLtIWu9zFykzrR+3Tl76X7EUMOaoL31HOYsVEL5d9I9KInF0gE12ZB1yN16MsQVxpSCavOFHQsj/05zbxOQ4o0erY1gU=';

See also

Per-Bucket S3 Configurations

5.4.1 - Per-bucket S3 configurations

You can manage configurations and credentials for individual buckets with the S3BucketConfig and S3BucketCredentials configuration parameters. These parameters each take a JSON object, whose respective properties behave like the related S3 configuration parameters.

For example, you can create a different configuration for each of your S3 buckets by setting S3BucketConfig at the database level with ALTER DATABASE. The following S3BucketConfig specifies several common bucket properties:

=> ALTER DATABASE DEFAULT SET S3BucketConfig='
[
    {
        "bucket": "exampleAWS",
        "region": "us-east-2",
        "protocol": "https",
        "requesterPays": true
    },
    {
        "bucket": "examplePureStorage",
        "endpoint": "pure.mycorp.net:1234",
        "protocol": "http",
        "enableVirtualAddressing": false
    }
]';

Users can then access a bucket by setting S3BucketCredentials at the session level with ALTER SESSION. The following S3BucketCredentials specifies all properties and authenticates to both exampleAWS and examplePureStorage simultaneously:

=> ALTER SESSION SET S3BucketCredentials='
[
    {
        "bucket": "exampleAWS",
        "accessKey": "<AK0>",
        "secretAccessKey": "<SAK0>",
        "sessionToken": "1234567890"
    },
    {
        "bucket": "examplePureStorage",
        "accessKey": "<AK1>",
        "secretAccessKey": "<SAK1>",
    }
]';

The recommended usage is as follows:

  • Define in your S3 storage system one set of credentials per principal, per storage system.

  • It is often most convenient to set S3BucketConfig once at the database level and have users authenticate by setting S3BucketCredentials at the session level.

  • To access buckets outside those configured at the database level, set both S3BucketConfig and S3BucketCredentials at the session level.

If you cannot define credentials for your S3 storage, you can set S3BucketCredentials or AWSAuth at the database level with ALTER DATABASE, but this comes with certain drawbacks:

  • Storing credentials statically in another location (in this case, in the Vertica catalog) always incurs additional risk.

  • This increases overhead for the dbadmin, who needs to create user storage locations and grant access to each user or role.

  • Users share one set of credentials, increasing the potential impact if the credentials are compromised.

Precedence of per-bucket and standard parameters

Vertica uses the following rules to determine the effective set of properties for an S3 connection:

  • If set, S3BucketCredentials takes priority over its standard parameters. S3BucketCredentials is checked first at the session level and then at the database level.

  • The level/source of the S3 credential parameters determines the source of the S3 configuration parameters:

    • If credentials come from the session level, then the configuration can come from either the session or database level (with the session level taking priority).

    • If your credentials come from the database level, then the configuration can only come from the database level.

  • If S3BucketConfig is set, it takes priority over its standard parameters. If an S3BucketConfig property isn't specified, Vertica falls back to the missing property's equivalent parameter. For example, if S3BucketConfig specifies every property except protocol, Vertica falls back to the standard parameter AWSEnableHttps.

Examples

Multiple buckets

This example configures a real Amazon S3 bucket AWSBucket and a Pure Storage bucket PureStorageBucket with S3BucketConfig.

AWSBucket does not specify an endpoint or protocol, so Vertica falls back to AWSEndpoint (defaults to s3.amazonaws.com) and AWSEnableHttps (defaults to 1).

In this example environment, access to the PureStorageBucket is over a secure network, so HTTPS is disabled:

=> ALTER DATABASE DEFAULT SET S3BucketConfig='
[
    {
        "bucket": "AWSBucket",
        "region": "us-east-2"
    },
    {
        "bucket": "PureStorageBucket",
        "endpoint": "pure.mycorp.net:1234",
        "protocol": "http",
        "enableVirtualAddressing": false
    }
]';

Bob can then set S3BucketCredentials at the session level to authenticate to AWSBucket:

=> ALTER SESSION SET S3BucketCredentials='
[
    {
        "bucket": "AWSBucket",
        "accessKey": "<AK0>",
        "secretAccessKey": "<SAK0>",
        "sessionToken": "1234567890"
    }
]';

Similarly, Alice can authenticate to PureStorageBucket:

=> ALTER SESSION SET S3BucketCredentials='
[
    {
        "bucket": "PureStorageBucket",
        "accessKey": "<AK1>",
        "secretAccessKey": "<SAK1>"
    }
]';

Charlie provides credentials for both AWSBucket and PureStorageBucket and authenticates to them simultaneously. This allows him to perform cross-endpoint joins, export from one bucket to another, etc.

=> ALTER SESSION SET S3BucketCredentials='
[
    {
        "bucket": "AWSBucket",
        "accessKey": "<AK0>",
        "secretAccessKey": "<SAK0>",
        "sessionToken": "1234567890"
    },
    {
        "bucket": "PureStorageBucket",
        "accessKey": "<AK1>",
        "secretAccessKey": "<SAK1>"
    }
]';

S3 server-side encryption

S3 has three types of server-side encryption: SSE-S3, SSE-KMS, and SSE-C. The following example configures access using SSE-KMS:

=> ALTER DATABASE DEFAULT SET S3BucketConfig='
[
    {
        "bucket": "AWSBucket",
        "region": "us-east-2",
        "serverSideEncryption": "aws:kms",
        "sseKmsKeyId": "1234abcd-12ab-34cd-56ef-1234567890ab"
    }
]';

For more information, see Server-Side Encryption.

Non-amazon S3 storage with AWSEndpoint and S3BucketConfig

If AWSEndpoint is set to a non-Amazon S3 bucket like Pure Storage or MinIO and you want to configure S3BucketConfig for a real Amazon S3 bucket, the following requirements apply:

  • If your real Amazon S3 region is not us-east-1 (the default), you must specify the region.

  • Set endpoint to an empty string ("").

In this example, AWSEndpoint is set to a Pure Storage bucket.

=> ALTER DATABASE DEFAULT SET AWSEndpoint='pure.mycorp.net:1234';

To configure S3BucketConfig for a real Amazon S3 bucket realAmazonS3Bucket in region "us-east-2":

=> ALTER DATABASE DEFAULT SET S3BucketConfig='
[
    {
        "bucket": "realAmazonS3Bucket",
        "region": "us-east-2",
        "endpoint": ""
    }
]';

6 - Functions

Functions return information from the database. This section describes functions that Vertica supports. Except for meta-functions, you can use a function anywhere an expression is allowed.

Meta-functions usually access the internal state of Vertica. They can be used in a top-level SELECT statement only, and the statement cannot contain other clauses such as FROM or WHERE. Meta-functions are labeled on their reference pages.
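
For example, a meta-function is typically invoked in a bare SELECT with no other clauses; the table name here is hypothetical:

=> SELECT ANALYZE_STATISTICS('public.store_sales');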

The Behavior Type section on each reference page categorizes the function's return behavior as one or more of the following:

  • Immutable (invariant): When run with a given set of arguments, immutable functions always produce the same result, regardless of environment or session settings such as locale.
  • Stable: When run with a given set of arguments, stable functions produce the same result within a single query or scan operation. However, a stable function can produce different results when issued under different environments or at different times, such as change of locale and time zone—for example, SYSDATE.
  • Volatile: Regardless of their arguments or environment, volatile functions can return a different result with each invocation—for example, UUID_GENERATE.

List of all functions

The following list contains all Vertica SQL functions.

A

ABS
Returns the absolute value of the argument. [Mathematical functions]
ACOS
Returns a DOUBLE PRECISION value representing the trigonometric inverse cosine of the argument. [Mathematical functions]
ACOSH
Returns a DOUBLE PRECISION value that represents the inverse (arc) hyperbolic cosine of the function argument. [Mathematical functions]
ACTIVE_SCHEDULER_NODE
Returns the active scheduler node. [Stored procedure functions]
ADD_MONTHS
Adds the specified number of months to a date and returns the sum as a DATE. [Date/time functions]
ADVANCE_EPOCH
Manually closes the current epoch and begins a new epoch. [Epoch functions]
AGE_IN_MONTHS
Returns the difference in months between two dates, expressed as an integer. [Date/time functions]
AGE_IN_YEARS
Returns the difference in years between two dates, expressed as an integer. [Date/time functions]
ALTER_LOCATION_LABEL
Adds a label to a storage location, or changes or removes an existing label. [Storage functions]
ALTER_LOCATION_SIZE
Resizes the depot on one node, all nodes in a subcluster, or all nodes in the database. [Eon Mode functions]
ALTER_LOCATION_USE
Alters the type of data that a storage location holds. [Storage functions]
ANALYZE_CONSTRAINTS
Analyzes and reports on constraint violations within the specified scope. [Table functions]
ANALYZE_CORRELATIONS
This function is deprecated and will be removed in a future release. [Table functions]
ANALYZE_EXTERNAL_ROW_COUNT
Calculates the exact number of rows in an external table. [Statistics management functions]
ANALYZE_STATISTICS
Collects and aggregates data samples and storage information from all nodes that store projections associated with the specified table. [Statistics management functions]
ANALYZE_STATISTICS_PARTITION
Collects and aggregates data samples and storage information for a range of partitions in the specified table. [Statistics management functions]
ANALYZE_WORKLOAD
Runs Workload Analyzer, a utility that analyzes system information held in system tables. [Workload management functions]
APPLY_AVG
Returns the average of all elements in a collection with numeric values. [Collection functions]
APPLY_BISECTING_KMEANS
Applies a trained bisecting k-means model to an input relation, and assigns each new data point to the closest matching cluster in the trained model. [Transformation functions]
APPLY_COUNT (ARRAY_COUNT)
Returns the total number of non-null elements in a collection. [Collection functions]
APPLY_COUNT_ELEMENTS (ARRAY_LENGTH)
Returns the total number of elements in a collection, including NULLs. [Collection functions]
APPLY_IFOREST
Applies an isolation forest (iForest) model to an input relation. [Transformation functions]
APPLY_INVERSE_PCA
Inverts the APPLY_PCA-generated transform back to the original coordinate system. [Transformation functions]
APPLY_INVERSE_SVD
Transforms the data back to the original domain. [Transformation functions]
APPLY_KMEANS
Assigns each row of an input relation to a cluster center from an existing k-means model. [Transformation functions]
APPLY_KPROTOTYPES
Assigns each row of an input relation to a cluster center from an existing k-prototypes model. [Transformation functions]
APPLY_MAX
Returns the largest non-null element in a collection. [Collection functions]
APPLY_MIN
Returns the smallest non-null element in a collection. [Collection functions]
APPLY_NORMALIZE
A UDTF function that applies the normalization parameters saved in a model to a set of specified input columns. [Transformation functions]
APPLY_ONE_HOT_ENCODER
A user-defined transform function (UDTF) that loads the one hot encoder model and writes out a table that contains the encoded columns. [Transformation functions]
APPLY_PCA
Transforms the data using a PCA model. [Transformation functions]
APPLY_SUM
Computes the sum of all elements in a collection of numeric values (INTEGER, FLOAT, NUMERIC, or INTERVAL). [Collection functions]
APPLY_SVD
Transforms the data using an SVD model. [Transformation functions]
APPROXIMATE_COUNT_DISTINCT
Returns the number of distinct non-NULL values in a data set. [Aggregate functions]
APPROXIMATE_COUNT_DISTINCT_OF_SYNOPSIS
Calculates the number of distinct non-NULL values from the synopsis objects created by APPROXIMATE_COUNT_DISTINCT_SYNOPSIS. [Aggregate functions]
APPROXIMATE_COUNT_DISTINCT_SYNOPSIS
Summarizes the information of distinct non-NULL values and materializes the result set in a VARBINARY or LONG VARBINARY synopsis object. [Aggregate functions]
APPROXIMATE_COUNT_DISTINCT_SYNOPSIS_MERGE
Aggregates multiple synopses into one new synopsis. [Aggregate functions]
APPROXIMATE_MEDIAN [aggregate]
Computes the approximate median of an expression over a group of rows. [Aggregate functions]
APPROXIMATE_PERCENTILE [aggregate]
Computes the approximate percentile of an expression over a group of rows. [Aggregate functions]
APPROXIMATE_QUANTILES
Computes an array of weighted, approximate percentiles of a column within some user-specified error. [Aggregate functions]
ARGMAX [analytic]
This function is patterned after the mathematical function argmax(f(x)), which returns the value of x that maximizes f(x). [Analytic functions]
ARGMAX_AGG
Takes two arguments target and arg, where both are columns or column expressions in the queried dataset. [Aggregate functions]
ARGMIN [analytic]
This function is patterned after the mathematical function argmin(f(x)), which returns the value of x that minimizes f(x). [Analytic functions]
ARGMIN_AGG
Takes two arguments target and arg, where both are columns or column expressions in the queried dataset. [Aggregate functions]
ARIMA
Creates and trains an autoregressive integrated moving average (ARIMA) model from a time series with consistent timesteps. [Machine learning algorithms]
ARRAY_CAT
Concatenates two arrays of the same element type and dimensionality. [Collection functions]
ARRAY_CONTAINS
Returns true if the specified element is found in the array and false if not. [Collection functions]
ARRAY_DIMS
Returns the dimensionality of the input array. [Collection functions]
ARRAY_FIND
Returns the ordinal position of a specified element in an array, or -1 if not found. [Collection functions]
ASCII
Converts the first character of a VARCHAR datatype to an INTEGER. [String functions]
ASIN
Returns a DOUBLE PRECISION value representing the trigonometric inverse sine of the argument. [Mathematical functions]
ASINH
Returns a DOUBLE PRECISION value that represents the inverse (arc) hyperbolic sine of the function argument. [Mathematical functions]
ATAN
Returns a DOUBLE PRECISION value representing the trigonometric inverse tangent of the argument. [Mathematical functions]
ATAN2
Returns a DOUBLE PRECISION value representing the trigonometric inverse tangent of the arithmetic dividend of the arguments. [Mathematical functions]
ATANH
Returns a DOUBLE PRECISION value that represents the inverse hyperbolic tangent of the function argument. [Mathematical functions]
AUDIT
Returns the raw data size (in bytes) of a database, schema, or table as it is counted in an audit of the database size. [License functions]
AUDIT_FLEX
Returns the estimated ROS size of __raw__ columns, equivalent to the export size of the flex data in the audited objects. [License functions]
AUDIT_LICENSE_SIZE
Triggers an immediate audit of the database size to determine if it is in compliance with the raw data storage allowance included in your Vertica licenses. [License functions]
AUDIT_LICENSE_TERM
Triggers an immediate audit to determine if the Vertica license has expired. [License functions]
AUTOREGRESSOR
Creates an autoregressive (AR) model from a stationary time series with consistent timesteps that can then be used for prediction via PREDICT_AR. [Machine learning algorithms]
AVG [aggregate]
Computes the average (arithmetic mean) of an expression over a group of rows. [Aggregate functions]
AVG [analytic]
Computes an average of an expression in a group within a window. [Analytic functions]
AZURE_TOKEN_CACHE_CLEAR
Clears the cached access token for Azure. [Cloud functions]

B

BACKGROUND_DEPOT_WARMING
Vertica version 10.0.0 removes support for foreground depot warming. [Eon Mode functions]
BALANCE
Returns a view with an equal distribution of the input data based on the response_column. [Data preparation]
BISECTING_KMEANS
Executes the bisecting k-means algorithm on an input relation. [Machine learning algorithms]
BIT_AND
Takes the bitwise AND of all non-null input values. [Aggregate functions]
BIT_LENGTH
Returns the length of the string expression in bits (bytes * 8) as an INTEGER. [String functions]
BIT_OR
Takes the bitwise OR of all non-null input values. [Aggregate functions]
BIT_XOR
Takes the bitwise XOR of all non-null input values. [Aggregate functions]
BITCOUNT
Returns the number of one-bits (sometimes referred to as set-bits) in the given VARBINARY value. [String functions]
BITSTRING_TO_BINARY
Translates the given VARCHAR bitstring representation into a VARBINARY value. [String functions]
BOOL_AND [aggregate]
Processes Boolean values and returns a Boolean value result. [Aggregate functions]
BOOL_AND [analytic]
Returns the Boolean value of an expression within a window. [Analytic functions]
BOOL_OR [aggregate]
Processes Boolean values and returns a Boolean value result. [Aggregate functions]
BOOL_OR [analytic]
Returns the Boolean value of an expression within a window. [Analytic functions]
BOOL_XOR [aggregate]
Processes Boolean values and returns a Boolean value result. [Aggregate functions]
BOOL_XOR [analytic]
Returns the Boolean value of an expression within a window. [Analytic functions]
BTRIM
Removes the longest string consisting only of specified characters from the start and end of a string. [String functions]
BUILD_FLEXTABLE_VIEW
Creates, or re-creates, a view for a default or user-defined keys table, ignoring any empty keys. [Flex data functions]

C

CALENDAR_HIERARCHY_DAY
Groups DATE partition keys into a hierarchy of years, months, and days. [Partition functions]
CANCEL_DEPOT_WARMING
Cancels depot warming on a node. [Eon Mode functions]
CANCEL_DRAIN_SUBCLUSTER
Cancels the draining of a subcluster or subclusters. [Eon Mode functions]
CANCEL_REBALANCE_CLUSTER
Stops any rebalance task that is currently in progress or is waiting to execute. [Cluster functions]
CANCEL_REFRESH
Cancels refresh-related internal operations initiated by START_REFRESH and REFRESH. [Session functions]
CBRT
Returns the cube root of the argument. [Mathematical functions]
CEILING
Rounds the returned value up to the next whole number. [Mathematical functions]
CHANGE_CURRENT_STATEMENT_RUNTIME_PRIORITY
Changes the run-time priority of an active query. [Workload management functions]
CHANGE_MODEL_STATUS
Changes the status of a registered model. [Model management]
CHANGE_RUNTIME_PRIORITY
Changes the run-time priority of a query that is actively running. [Workload management functions]
CHARACTER_LENGTH
Returns the string length in characters, stripping padding from CHAR expressions but not from VARCHAR expressions. [String functions]
CHI_SQUARED
Computes the conditional chi-Square independence test on two categorical variables to find the likelihood that the two variables are independent. [Data preparation]
CHR
Converts the first character of an INTEGER datatype to a VARCHAR. [String functions]
CLEAN_COMMUNAL_STORAGE
Marks for deletion invalid data in communal storage, often data that leaked due to an event where Vertica cleanup mechanisms failed. [Eon Mode functions]
CLEAR_CACHES
Clears the Vertica internal cache files. [Storage functions]
CLEAR_DATA_COLLECTOR
Clears all memory and disk records from Data Collector tables and logs, and resets collection statistics in system table DATA_COLLECTOR. [Data Collector functions]
CLEAR_DATA_DEPOT
Deletes the specified depot data. [Eon Mode functions]
CLEAR_DEPOT_ANTI_PIN_POLICY_PARTITION
Removes an anti-pinning policy from the specified partition. [Eon Mode functions]
CLEAR_DEPOT_ANTI_PIN_POLICY_PROJECTION
Removes an anti-pinning policy from the specified projection. [Eon Mode functions]
CLEAR_DEPOT_ANTI_PIN_POLICY_TABLE
Removes an anti-pinning policy from the specified table. [Eon Mode functions]
CLEAR_DEPOT_PIN_POLICY_PARTITION
Clears a depot pinning policy from the specified table or projection partitions. [Eon Mode functions]
CLEAR_DEPOT_PIN_POLICY_PROJECTION
Clears a depot pinning policy from the specified projection. [Eon Mode functions]
CLEAR_DEPOT_PIN_POLICY_TABLE
Clears a depot pinning policy from the specified table. [Eon Mode functions]
CLEAR_DIRECTED_QUERY_USAGE
Resets the counter in the DIRECTED_QUERY_STATUS table. [Directed queries functions]
CLEAR_FETCH_QUEUE
Removes all entries or entries for a specific transaction from the queue of fetch requests of data from the communal storage. [Eon Mode functions]
CLEAR_HDFS_CACHES
Clears the configuration information copied from HDFS and any cached connections. [Hadoop functions]
CLEAR_OBJECT_STORAGE_POLICY
Removes a user-defined storage policy from the specified database, schema or table. [Storage functions]
CLEAR_PROFILING
Clears from memory data for the specified profiling type. [Profiling functions]
CLEAR_PROJECTION_REFRESHES
Clears information projection refresh history from system table PROJECTION_REFRESHES. [Projection functions]
CLEAR_RESOURCE_REJECTIONS
Clears the content of the RESOURCE_REJECTIONS and DISK_RESOURCE_REJECTIONS system tables. [Database functions]
CLOCK_TIMESTAMP
Returns a value of type TIMESTAMP WITH TIMEZONE that represents the current system-clock time. [Date/time functions]
CLOSE_ALL_RESULTSETS
Closes all result set sessions within Multiple Active Result Sets (MARS) and frees the MARS storage for other result sets. [Client connection functions]
CLOSE_ALL_SESSIONS
Closes all external sessions except the one that issues this function. [Session functions]
CLOSE_RESULTSET
Closes a specific result set within Multiple Active Result Sets (MARS) and frees the MARS storage for other result sets. [Client connection functions]
CLOSE_SESSION
Interrupts the specified external session, rolls back the current transaction if any, and closes the socket. [Session functions]
CLOSE_USER_SESSIONS
Stops the session for a user, rolls back any transaction currently running, and closes the connection. [Session functions]
COALESCE
Returns the value of the first non-null expression in the list. [NULL-handling functions]
COLLATION
Applies a collation to two or more strings. [String functions]
COMPACT_STORAGE
Bundles existing data (.fdb) and index (.pidx) files into the .gt file format. [Database functions]
COMPUTE_FLEXTABLE_KEYS
Computes the virtual columns (keys and values) from flex table VMap data. [Flex data functions]
COMPUTE_FLEXTABLE_KEYS_AND_BUILD_VIEW
Combines the functionality of BUILD_FLEXTABLE_VIEW and COMPUTE_FLEXTABLE_KEYS to compute virtual columns (keys) from the VMap data of a flex table and construct a view. [Flex data functions]
CONCAT
Concatenates two strings and returns a varchar data type. [String functions]
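For example:

=> SELECT CONCAT('Ver', 'tica');  -- 'Vertica'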
CONDITIONAL_CHANGE_EVENT [analytic]
Assigns an event window number to each row, starting from 0, and increments by 1 when the result of evaluating the argument expression on the current row differs from that on the previous row. [Analytic functions]
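For example, assuming a hypothetical sensor_log table with ts and status columns, the event number increments each time status differs from the previous row:

=> SELECT ts, status,
       CONDITIONAL_CHANGE_EVENT(status) OVER (ORDER BY ts) AS status_change
   FROM sensor_log;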
CONDITIONAL_TRUE_EVENT [analytic]
Assigns an event window number to each row, starting from 0, and increments the number by 1 when the result of the boolean argument expression evaluates true. [Analytic functions]
CONFUSION_MATRIX
Computes the confusion matrix of a table with observed and predicted values of a response variable. [Model evaluation]
CONTAINS
Returns true if the specified element is found in the collection and false if not. [Collection functions]
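For example, with an inline array literal:

=> SELECT CONTAINS(ARRAY[200, 800, 2500], 800);  -- t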
COPY_PARTITIONS_TO_TABLE
Copies partitions from one table to another. [Partition functions]
COPY_TABLE
Copies one table to another. [Table functions]
CORR
Returns the Pearson correlation coefficient of a set of expression pairs as DOUBLE PRECISION. [Aggregate functions]
CORR_MATRIX
Takes an input relation with numeric columns, and calculates the Pearson Correlation Coefficient between each pair of its input columns. [Data preparation]
COS
Returns a DOUBLE PRECISION value that represents the trigonometric cosine of the passed parameter. [Mathematical functions]
COSH
Returns a DOUBLE PRECISION value that represents the hyperbolic cosine of the passed parameter. [Mathematical functions]
COT
Returns a DOUBLE PRECISION value representing the trigonometric cotangent of the argument. [Mathematical functions]
COUNT [aggregate]
Returns as a BIGINT the number of rows in each group where the expression is not NULL. [Aggregate functions]
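For example, assuming a hypothetical store_orders table, rows with a NULL order_id are not counted:

=> SELECT COUNT(order_id) FROM store_orders;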
COUNT [analytic]
Counts occurrences within a group within a window. [Analytic functions]
COVAR_POP
Returns the population covariance for a set of expression pairs. [Aggregate functions]
COVAR_SAMP
Returns the sample covariance for a set of expression pairs. [Aggregate functions]
CROSS_VALIDATE
Performs k-fold cross-validation on a learning algorithm using an input relation, and grid search for hyperparameters. [Model evaluation]
CUME_DIST [analytic]
Calculates the cumulative distribution, or relative rank, of the current row with regard to other rows in the same partition within a window. [Analytic functions]
CURRENT_DATABASE
Returns the name of the current database, equivalent to DBNAME. [System information functions]
CURRENT_DATE
Returns the date (date-type value) on which the current transaction started. [Date/time functions]
CURRENT_LOAD_SOURCE
When called within the scope of a COPY statement, returns the file name or path part used for the load. [System information functions]
CURRENT_SCHEMA
Returns the name of the current schema. [System information functions]
CURRENT_SESSION
Returns the ID of the current client session. [System information functions]
CURRENT_TIME
Returns a value of type TIME WITH TIMEZONE that represents the start of the current transaction. [Date/time functions]
CURRENT_TIMESTAMP
Returns a value of type TIMESTAMP WITH TIMEZONE that represents the start of the current transaction. [Date/time functions]
CURRENT_TRANS_ID
Returns the ID of the transaction currently in progress. [System information functions]
CURRENT_USER
Returns a VARCHAR containing the name of the user who initiated the current database connection. [System information functions]
CURRVAL
Returns the last value across all nodes that was set by NEXTVAL on this sequence in the current session. [Sequence functions]
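For example, with a hypothetical sequence my_seq:

=> CREATE SEQUENCE my_seq;
=> SELECT NEXTVAL('my_seq');  -- 1
=> SELECT CURRVAL('my_seq');  -- 1, the last value NEXTVAL returned in this session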

D

DATA_COLLECTOR_HELP
Returns online usage instructions about the Data Collector, the V_MONITOR.DATA_COLLECTOR system table, and the Data Collector control functions. [Data Collector functions]
DATE
Converts the input value to a DATE data type. [Date/time functions]
DATE_PART
Extracts a sub-field such as year or hour from a date/time expression, equivalent to the SQL-standard function EXTRACT. [Date/time functions]
DATE_TRUNC
Truncates date and time values to the specified precision. [Date/time functions]
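For example, truncating to month precision:

=> SELECT DATE_TRUNC('month', '2024-05-17 10:30:00'::TIMESTAMP);  -- 2024-05-01 00:00:00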
DATEDIFF
Returns the time span between two dates, in the intervals specified. [Date/time functions]
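For example, counting day boundaries between two dates:

=> SELECT DATEDIFF('day', '2024-01-01'::DATE, '2024-01-15'::DATE);  -- 14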
DAY
Returns as an integer the day of the month from the input value. [Date/time functions]
DAYOFMONTH
Returns the day of the month as an integer. [Date/time functions]
DAYOFWEEK
Returns the day of the week as an integer, where Sunday is day 1. [Date/time functions]
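For example, 2024-01-07 fell on a Sunday:

=> SELECT DAYOFWEEK('2024-01-07'::DATE);  -- 1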
DAYOFWEEK_ISO
Returns the ISO 8601 day of the week as an integer, where Monday is day 1. [Date/time functions]
DAYOFYEAR
Returns the day of the year as an integer, where January 1 is day 1. [Date/time functions]
DAYS
Returns the integer value of the specified date, where 1 AD is 1. [Date/time functions]
DBNAME (function)
Returns the name of the current database, equivalent to CURRENT_DATABASE. [