SQL reference
Vertica offers a robust set of SQL elements that allow you to manage and analyze massive volumes of data quickly and reliably, including:
- Language elements such as keywords, operators, expressions, predicates, and hints
- Data types, including complex types
- Functions, including Vertica-specific functions that take advantage of Vertica's unique column-store architecture
- SQL statements that let you write robust queries to quickly return large volumes of data
1 - System limits
This section describes system limits on the size and number of objects in a Vertica database. In most cases, computer memory and disk drive are the limiting factors.
Item | Maximum
Nodes | 128 (without Vertica assistance)
Database size | Dependent on maximum disk configuration; approximately numFiles * platformFileSize
Table size | Smaller of: 2^64 rows per node, or 2^63 bytes per column
Row size | (2^31) - 1 bytes. Row size is approximately the sum of its maximum column sizes. For example, a VARCHAR(80) has a maximum size of 80 bytes.
Key size | Dependent on row size
Tables/projections per database | Dependent on physical RAM, as the catalog must fit in memory.
Concurrent connections per node | Dependent on physical RAM (or threads per process), typically 1024. Default: 50
Concurrent connections per cluster | Dependent on physical RAM of a single node (or threads per process), typically 1024
Columns per table/view | 9800
Arguments per function call | 9800
Rows per load | 2^63
ROS containers per projection | 1024. See Minimizing partitions.
Length of fixed-length column | 65000 bytes
Length of variable-length column | 32 MB
Length of basic names | 128 bytes. Basic names include table names, column names, etc.
Query length | None
Depth of nesting subqueries | None in FROM, WHERE, and HAVING clauses
2 - Language elements
The following topics provide detailed descriptions of the language elements and conventions of Vertica SQL.
2.1 - Keywords
Keywords are words that have a specific meaning in the SQL language. Every SQL statement contains one or more keywords. Although SQL is not case-sensitive with respect to keywords, they are generally shown in uppercase letters throughout this documentation for readability purposes.
Note
If you use a keyword as the name of an identifier or an alias in your SQL statements, you may have to qualify the keyword with AS or double-quotes. Vertica requires AS or double-quotes for certain reserved and non-reserved words to prevent confusion with expression syntax, or where the use of a word would be ambiguous.
Reserved words and keywords
Many keywords are also reserved words.
Vertica recommends that you not use reserved words as names for objects, or as identifiers. Including reserved words can make your SQL statements confusing. Reserved words that are used as names for objects or identifiers must be enclosed in double-quotes.
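For example, the following sketch (the table and column names are illustrative) double-quotes the reserved word LIMIT to use it as a column name:
=> CREATE TABLE reserved_demo ("limit" INT);
CREATE TABLE
=> SELECT "limit" FROM reserved_demo;
 limit
-------
(0 rows)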
Note
All reserved words are also keywords, but Vertica can add reserved words that are not keywords. A reserved word can simply be a word that is reserved for future use.
Non-reserved keywords
Non-reserved keywords have a special meaning in some contexts, but can be used as identifiers in others. You can use non-reserved keywords as aliases—for example, SOURCE:
=> SELECT my_node AS SOURCE FROM nodes;
Note
Vertica uses several non-reserved keywords in directed queries to specify special join types. You can use these keywords as table aliases only if they are double-quoted; in other contexts, the double-quotes can be omitted:
- ANTI
- NULLAWARE
- SEMI
- SEMIALL
- UNI
Viewing the list of reserved and non-reserved keywords
To view the current list of Vertica reserved and non-reserved words, query system table KEYWORDS. Vertica lists keywords alphabetically and identifies them as reserved (R) or non-reserved (N).
For example, the following query gets all reserved keywords that begin with B:
=> SELECT * FROM keywords WHERE reserved = 'R' AND keyword ilike 'B%';
keyword | reserved
---------+----------
BETWEEN | R
BIGINT | R
BINARY | R
BIT | R
BOOLEAN | R
BOTH | R
(6 rows)
2.2 - Identifiers
Identifiers (names) of objects such as schema, table, projection, column names, and so on, can be up to 128 bytes in length.
Unquoted identifiers
Unquoted SQL identifiers must begin with one of the following:
- Non-Unicode letters: A–Z or a–z
- Underscore (_)
Subsequent characters in an identifier can be any combination of the following:
- Non-Unicode letters: A–Z or a–z
- Underscore (_)
- Digits (0–9)
- Unicode letters (letters with diacriticals or not in the Latin alphabet), unsupported for model names
- Dollar sign ($), unsupported for model names
Caution
The SQL standard does not support dollar sign in identifiers, so usage can cause application portability problems.
Quoted identifiers
Note
Quoted identifiers are not supported for model names.
Identifiers enclosed in double quote (") characters can contain any character. If you want to include a double quote, you need a pair of them; for example """". You can use names that would otherwise be invalid—for example, names that include only numeric characters ("123") or contain space characters, punctuation marks, and SQL or Vertica-reserved keywords. For example:
CREATE SEQUENCE "my sequence!";
Double quotes are required for non-alphanumeric names and SQL keywords such as "1time", "Next week", and "Select".
Case sensitivity
Identifiers are not case-sensitive. Thus, identifiers "ABC", "ABc", and "aBc" are synonymous, as are ABC, ABc, and aBc.
Non-ASCII characters
Vertica accepts non-ASCII UTF-8 Unicode characters for table names, column names, and other identifiers, extending the cases where upper/lower case distinctions are ignored (case-folded) to all alphabets, including Latin, Cyrillic, and Greek.
For example, the following CREATE TABLE statement uses the ß (German eszett) in the table name:
=> CREATE TABLE straße(x int, y int);
CREATE TABLE
Identifiers are stored as created
SQL identifiers, such as table and column names, are not converted to lowercase. They are stored as created, and references to them are resolved using case-insensitive compares. For example, the following statement creates table ALLCAPS.
=> CREATE TABLE ALLCAPS(c1 varchar(30));
=> INSERT INTO ALLCAPS values('upper case');
The following statements are variations of the same query:
=> SELECT * FROM ALLCAPS;
=> SELECT * FROM allcaps;
=> SELECT * FROM "allcaps";
The three queries all return the same result:
c1
------------
upper case
(1 row)
Note that Vertica returns an error if you try to create table AllCaps:
=> CREATE TABLE AllCaps(c1 varchar(30));
ROLLBACK: table "AllCaps" already exists
See QUOTE_IDENT for additional information.
2.3 - Literals
Literals are numbers or strings used in SQL as constants. They can appear in the select-list along with expressions and built-in functions.
Vertica provides support for number-type literals (integers and numerics), string literals, VARBINARY string literals, and date/time literals. The various string literal formats are discussed in this section.
2.3.1 - Number-type literals
Vertica supports three types of numbers: integers, numerics, and floats.
- Integers are whole numbers less than 2^63 and must be digits.
- Numerics are whole numbers larger than 2^63, or that include a decimal point with a precision and a scale. Numerics can contain exponents. Numbers that begin with 0x are hexadecimal numerics.
Numeric-type values can also be generated using casts from character strings. This is a more general syntax. See the Examples section below, as well as Data type coercion operators (CAST).
Syntax
digits
digits.[digits] | [digits].digits
digits e[+-]digits | [digits].digits e[+-]digits | digits.[digits] e[+-]digits
Parameters
digits - One or more numeric characters, 0 through 9
e - Exponent marker
Notes
- At least one digit must follow the exponent marker (e), if e is present.
- There cannot be any spaces or other characters embedded in the constant.
- Leading plus (+) or minus (–) signs are not considered part of the constant; they are unary operators applied to the constant.
- In most cases a numeric-type constant is automatically coerced to the most appropriate type depending on context. When necessary, you can force a numeric value to be interpreted as a specific data type by casting it as described in Data type coercion operators (CAST).
- Floating point literals are not supported. If you specifically need to specify a float, you can cast as described in Data type coercion operators (CAST).
- Vertica follows the IEEE specification for floating point, including NaN (not a number) and Infinity (Inf).
- A NaN is not greater than and at the same time not less than anything, even itself. In other words, comparisons always return false whenever a NaN is involved.
- Dividing INTEGERS (x / y) yields a NUMERIC result. You can use the // operator to truncate the result to a whole number.
Examples
The following are examples of number-type literals:
42
3.5
4.
.001
5e2
1.925e-3
Scientific notation:
=> SELECT NUMERIC '1e10';
?column?
-------------
10000000000
(1 row)
BINARY scaling:
=> SELECT NUMERIC '1p10';
?column?
----------
1024
(1 row)
=> SELECT FLOAT 'Infinity';
?column?
----------
Infinity
(1 row)
The following examples illustrate using the / and // operators to divide integers:
=> SELECT 40/25;
?column?
----------------------
1.600000000000000000
(1 row)
=> SELECT 40//25;
?column?
----------
1
(1 row)
See also
Data type coercion
2.3.2 - String literals
String literals are string values surrounded by single or double quotes. Double-quoted strings are subject to backslash processing, but single-quoted strings do not require a backslash, except for \' and \\.
You can embed single quotes and backslashes into single-quoted strings.
To include other backslash (escape) sequences, such as \t (tab), you must use the double-quoted form.
Precede single-quoted strings with a space between the string and its preceding word, since single quotes are allowed in identifiers.
2.3.2.1 - Character string literals
Character string literals are a sequence of characters from a predefined character set, enclosed by single quotes.
Syntax
'character-seq'
Parameters
character-seq - Arbitrary sequence of characters
Embedded single quotes
If a character string literal includes a single quote, it must be doubled. For example:
=> SELECT 'Chester''s gorilla';
?column?
-------------------
Chester's gorilla
(1 row)
Vertica uses standard-conforming strings as specified in the SQL standard, so backslashes are treated as string literals and not escape characters.
Note
Earlier versions of Vertica did not use standard conforming strings, and backslashes were always considered escape sequences. To revert to this older behavior, set the configuration parameter StandardConformingStrings to 0. You can also use the EscapeStringWarning parameter to locate backslashes that have been incorporated into string literals, in order to remove them.
Examples
=> SELECT 'This is a string';
?column?
------------------
This is a string
(1 row)
=> SELECT 'This \is a string';
WARNING: nonstandard use of escape in a string literal at character 8
HINT: Use the escape string syntax for escapes, e.g., E'\r\n'.
?column?
------------------
This is a string
(1 row)
=> SELECT E'This \is a string';
?column?
------------------
This is a string
=> SELECT E'This is a \n new line';
?column?
----------------------
This is a
new line
(1 row)
=> SELECT 'String''s characters';
?column?
--------------------
String's characters
(1 row)
2.3.2.2 - Dollar-quoted string literals
Dollar-quoted string literals are rarely used, but are provided here for your convenience.
The standard syntax for specifying string literals can be difficult to understand. To allow more readable queries in such situations, Vertica SQL provides dollar quoting. Dollar quoting is not part of the SQL standard, but it is often a more convenient way to write complicated string literals than the standard-compliant single quote syntax.
Syntax
$$characters$$
Parameters
characters - Arbitrary sequence of characters bounded by paired dollar signs ($$)
Dollar-quoted string content is treated as a literal. Single quote, backslash, and dollar sign characters have no special meaning within a dollar-quoted string.
Notes
A dollar-quoted string that follows a keyword or identifier must be separated from the preceding word by whitespace; otherwise, the dollar-quoting delimiter is taken as part of the preceding identifier.
Examples
=> SELECT $$Fred's\n car$$;
?column?
-------------------
Fred's\n car
(1 row)
=> SELECT 'SELECT 'fact';';
ERROR: syntax error at or near "';'" at character 21
LINE 1: SELECT 'SELECT 'fact';';
=> SELECT 'SELECT $$fact';$$;
?column?
---------------
SELECT $$fact
(1 row)
=> SELECT 'SELECT ''fact'';';
?column?
----------------
SELECT 'fact';
(1 row)
2.3.2.3 - Unicode string literals
A hexit is a hexadecimal digit (0-9, a-f).
Syntax
U&'characters' [ UESCAPE '<Unicode escape character>' ]
Parameters
characters - Arbitrary sequence of UTF-8 characters bounded by single quotes (')
Unicode escape character - A single character from the source language character set other than a hexit, plus sign (+), quote ('), double quote ("), or white space
With StandardConformingStrings enabled, Vertica supports SQL standard Unicode character string literals (the character set is UTF-8 only).
Before you enter a Unicode character string literal, enable standard conforming strings, as shown in the first example below. See also Extended string literals.
Examples
To enter a Unicode character in hexadecimal, such as the Russian phrase for "thank you," use the following syntax:
=> SET STANDARD_CONFORMING_STRINGS TO ON;
=> SELECT U&'\0441\043F\0430\0441\0438\0431\043E' as 'thank you';
thank you
-----------
спасибо
(1 row)
To enter the German word mude (where u is really u-umlaut) in hexadecimal:
=> SELECT U&'m\00fcde';
?column?
----------
müde
(1 row)
=> SELECT 'ü';
?column?
----------
ü
(1 row)
To enter the LINEAR B IDEOGRAM B240 WHEELED CHARIOT in hexadecimal:
=> SELECT E'\xF0\x90\x83\x8C';
?column?
----------
(wheeled chariot character)
(1 row)
Note
Not all fonts support the wheeled chariot character.
2.3.2.4 - VARBINARY string literals
VARBINARY string literals allow you to specify hexadecimal or binary digits in a string literal.
Syntax
X'hexadecimal-digits'
B'binary-digits'
Parameters
X or x - Specifies hexadecimal digits. The hexadecimal-digits string must be enclosed in single quotes (').
B or b - Specifies binary digits. The binary-digits string must be enclosed in single quotes (').
Examples
=> SELECT X'abcd';
?column?
----------
\253\315
(1 row)
=> SELECT B'101100';
?column?
----------
,
(1 row)
2.3.2.5 - Extended string literals
Syntax
E'characters'
Parameters
characters - Arbitrary sequence of characters bounded by single quotes (')
You can use C-style backslash sequences in extended string literals, which are an extension to the SQL standard. You specify an extended string literal by writing the letter E as a prefix (before the opening single quote); for example:
E'extended character string\n'
Within an extended string, the backslash character (\) starts a C-style backslash sequence, in which the combination of backslash and following character or numbers represents a special byte value, as shown in the following list. Any other character following a backslash is taken literally; for example, to include a backslash character, write two backslashes (\\).
- \\ is a backslash
- \b is a backspace
- \f is a form feed
- \n is a newline
- \r is a carriage return
- \t is a tab
- \x##, where ## is a 1 or 2-digit hexadecimal number; for example, \x07 is a bell character
- \###, where ### is a 1, 2, or 3-digit octal number representing a byte with the corresponding code.
When an extended string literal is concatenated across lines, write only E before the first opening quote:
=> SELECT E'first part o'
'f a long line';
?column?
---------------------------
first part of a long line
(1 row)
Two adjacent single quotes are used as one single quote:
=> SELECT 'Aren''t string literals fun?';
?column?
-----------------------------
Aren't string literals fun?
(1 row)
When interpreting commands, such as those entered in vsql or in queries passed via JDBC or ODBC, Vertica uses standard conforming strings as specified in the SQL standard. In standard conforming strings, backslashes are treated as string literals (ordinary characters), not escape characters.
Note
Text read in from files or streams (such as data inserted using the COPY statement) is not treated as literal strings. The COPY command defines its own escape characters for the data it reads. See the COPY statement documentation for details.
The following options are available, but Vertica recommends that you migrate your application to use standard conforming strings at your earliest convenience, after warnings have been addressed.
- To treat backslashes as escape characters, set configuration parameter StandardConformingStrings to 0.
- To enable standard conforming strings permanently, set the StandardConformingStrings parameter to '1', as described below.
- To enable standard conforming strings per session, use SET STANDARD_CONFORMING_STRINGS TO ON, which treats backslashes as ordinary (literal) characters for the current session only.
The following procedure can be used to identify nonstandard conforming strings in your application so that you can convert them into standard conforming strings:
- Be sure the StandardConformingStrings parameter is off, as described in Internationalization parameters.
=> ALTER DATABASE DEFAULT SET StandardConformingStrings = 0;
Note
Vertica recommends that you migrate your application to use standard conforming strings.
- If necessary, turn on the EscapeStringWarning parameter.
=> ALTER DATABASE DEFAULT SET EscapeStringWarning = 1;
Vertica now returns a warning each time it encounters an escape string within a string literal. For example, Vertica interprets the \n in the following example as a new line:
=> SELECT 'a\nb';
WARNING: nonstandard use of escape in a string literal at character 8
HINT: Use the escape string syntax for escapes, e.g., E'\r\n'.
?column?
----------
a
b
(1 row)
When StandardConformingStrings is ON, the string is interpreted as four characters: a \ n b.
Modify each string that Vertica flags by extending it as in the following example:
E'a\nb'
Or, if the string has quoted single quotes, double them; for example, 'one'' double'.
- Turn on the StandardConformingStrings parameter for all sessions:
=> ALTER DATABASE DEFAULT SET StandardConformingStrings = 1;
Doubled single quotes
This section discusses vsql inputs that are not passed on to the server.
Vertica recognizes two consecutive single quotes within a string literal as one single quote character. For example, the following input, 'You''re here!', ignores the second consecutive quote and returns the following:
=> SELECT 'You''re here!';
?column?
--------------
You're here!
(1 row)
This is the SQL standard representation and is preferred over the form 'You\'re here!', because backslashes are no longer parsed as escape characters. To use the backslash form, you must escape the backslash in an extended string literal:
=> SELECT (E'You\'re here!');
?column?
--------------
You're here!
(1 row)
This behavior change introduces a potential incompatibility in the use of the vsql meta-command \set
, which automatically concatenates its arguments. For example:
\set file '\'' `pwd` '/file.txt' '\''
\echo :file
vsql takes the four arguments and outputs the following:
'/home/vertica/file.txt'
Vertica parses the adjacent single quotes as follows:
\set file '\'''pwd''/file.txt''\''\echo :file
'/home/vertica/file.txt''
Note the extra single quote at the end. This is due to the pair of adjacent single quotes together with the backslash-quoted single quote.
The extra quote can be resolved either as in the first example above, or by combining the literals as follows:
\set file '\''`pwd`'/file.txt'''
\echo :file
'/home/vertica/file.txt'
In either case the backslash-quoted single quotes should be changed to doubled single quotes as follows:
\set file '''' `pwd` '/file.txt'''
Additional examples
=> SELECT 'This \is a string';
?column?
------------------
This \is a string
(1 row)
=> SELECT E'This \is a string';
?column?
------------------
This is a string
=> SELECT E'This is a \n new line';
?column?
----------------------
This is a
new line
(1 row)
=> SELECT 'String''s characters';
?column?
--------------------
String's characters
(1 row)
2.3.3 - Date/time literals
Date or time literal input must be enclosed in single quotes. Input is accepted in almost any reasonable format, including ISO 8601, SQL-compatible, traditional POSTGRES, and others.
Vertica handles date/time input more flexibly than the SQL standard requires. The exact parsing rules of date/time input and for the recognized text fields including months, days of the week, and time zones are described in Date/time expressions.
2.3.3.1 - Time zone values
Vertica attempts to be compatible with the SQL standard definitions for time zones. However, the SQL standard has an odd mix of date and time types and capabilities. Obvious problems are:
- Although the DATE type does not have an associated time zone, the TIME type can. Time zones in the real world have little meaning unless associated with a date as well as a time, since the offset can vary through the year with daylight-saving time boundaries.
- Vertica assumes your local time zone for any data type containing only date or time.
- The default time zone is specified as a constant numeric offset from UTC. It is therefore not possible to adapt to daylight-saving time when doing date/time arithmetic across DST boundaries.
To address these difficulties, OpenText recommends using date/time types that contain both date and time when you use time zones. OpenText recommends that you do not use the type TIME WITH TIME ZONE, even though it is supported for legacy applications and for compliance with the SQL standard.
Time zones and time-zone conventions are influenced by political decisions, not just earth geometry. Time zones around the world became somewhat standardized during the 1900s, but continue to be prone to arbitrary changes, particularly with respect to daylight-savings rules.
Vertica currently supports daylight-savings rules over the time period 1902 through 2038, corresponding to the full range of conventional UNIX system time. Times outside that range are taken to be in "standard time" for the selected time zone, no matter what part of the year in which they occur.
Example | Description
PST | Pacific Standard Time
-8:00 | ISO-8601 offset for PST
-800 | ISO-8601 offset for PST
-8 | ISO-8601 offset for PST
zulu | Military abbreviation for UTC
z | Short form of zulu
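The following sketch (the session time zone is an assumption) shows a time zone abbreviation attached to a TIMESTAMP WITH TIME ZONE literal; the value is displayed in the session's time zone:
=> SET TIME ZONE TO 'America/New_York';
SET
=> SELECT TIMESTAMP WITH TIME ZONE '2001-02-16 20:38:40 zulu' AS ts;
           ts
------------------------
 2001-02-16 15:38:40-05
(1 row)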
2.3.3.2 - Day of the week names
The following tokens are recognized as names of days of the week:
Day | Abbreviations
SUNDAY | SUN
MONDAY | MON
TUESDAY | TUE, TUES
WEDNESDAY | WED, WEDS
THURSDAY | THU, THUR, THURS
FRIDAY | FRI
SATURDAY | SAT
2.3.3.3 - Month names
The following tokens are recognized as names of months:
Month | Abbreviations
JANUARY | JAN
FEBRUARY | FEB
MARCH | MAR
APRIL | APR
MAY | MAY
JUNE | JUN
JULY | JUL
AUGUST | AUG
SEPTEMBER | SEP, SEPT
OCTOBER | OCT
NOVEMBER | NOV
DECEMBER | DEC
2.3.3.4 - Interval literal
A literal that represents a time span.
Syntax
[ @ ] [-] { quantity subtype-unit }[...] [ AGO ]
Parameters
@ - Ignored
- (minus) - Specifies a negative interval value.
quantity - Integer numeric constant
subtype-unit - See Interval subtype units for valid values. Subtype units must be specified for year-month intervals; they are optional for day-time intervals.
AGO - Specifies a negative interval value. AGO and - (minus) are synonymous.
Notes
- The amounts of different units are implicitly added up with appropriate sign accounting.
- The range of an interval constant is +/- 2^63 - 1 microseconds.
- In Vertica, interval fields are additive and accept large floating-point numbers.
Examples
See Specifying interval input.
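For example, a minimal illustration (output format depends on the session's interval style settings):
=> SELECT INTERVAL '1 12:59:10';
 ?column?
------------
 1 12:59:10
(1 row)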
2.3.3.4.1 - Interval subtype units
The following tables list the subtype units that you can specify in an interval literal, divided into major categories:
Year-month subtype units
Subtypes | Units | Notes
Millennium | mil, millennium, millennia, mils, millenniums |
Century | c, cent, century, centuries |
Decade | dec, decs, decade, decades |
Year | a | Julian year: 365.25 days
 | ka | Julian kilo-year: 365250 days
 | y, yr, yrs, year, years | Calendar year: 365 days
Quarter | q, qtr, qtrs, quarter, quarters |
Month | m, mon, mons, months, month | Vertica can interpret m as minute or month, depending on context. See Processing m input below.
Week | w, wk, week, wks, weeks |
Day-time subtype units
Subtypes | Units | Notes
Day | d, day, days |
Hour | h, hr, hrs, hour, hours |
Minute | m, min, mins, minute, minutes | Vertica can interpret input unit m as minute or month, depending on context. See Processing m input below.
Second | s, sec, secs, second, seconds |
Millisecond | ms, msec, msecs, msecond, mseconds, millisecond, milliseconds |
Microsecond | us, usec, usecs, usecond, useconds, microseconds, microsecond |
Processing m input
Vertica uses context to interpret the input unit m as months or minutes. For example, the following command creates a one-column table with an interval value:
=> CREATE TABLE int_test(i INTERVAL YEAR TO MONTH);
Given the following INSERT statement, Vertica interprets the interval literal 1y 6m
as 1 year 6 months:
=> INSERT INTO int_test VALUES('1y 6m');
OUTPUT
--------
1
(1 row)
=> COMMIT;
COMMIT
=> SET INTERVALSTYLE TO UNITS;
SET
=> SELECT * FROM int_test;
i
-----------------
1 year 6 months
(1 row)
The following ALTER TABLE statement adds a DAY TO MINUTE
interval column to table int_test
:
=> ALTER TABLE int_test ADD COLUMN x INTERVAL DAY TO MINUTE;
ALTER TABLE
The next INSERT statement sets the first and second columns to 3y 20m and 1y 6m, respectively. In this case, Vertica interprets the m input literals in two ways:
- For column i, Vertica interprets the m input as months, and displays 4 years 8 months.
- For column x, Vertica interprets the m input as minutes. Because the interval is defined as DAY TO MINUTE, it converts the inserted input value 1y 6m to 365 days 6 minutes:
=> INSERT INTO int_test VALUES ('3y 20m', '1y 6m');
OUTPUT
--------
1
(1 row)
=> SELECT * FROM int_test;
i | x
------------------+-----------------
1 year 6 months |
4 years 8 months | 365 days 6 mins
(2 rows)
2.3.3.4.2 - Interval qualifier
Specifies how to interpret and format an interval literal for output, and optionally sets precision. Interval qualifiers are composed of one or two units:
unit [ TO unit ] [ (p) ]
where unit specifies an interval unit and (p) optionally specifies the number of fractional digits for SECOND units.
If an interval omits an interval qualifier, the default is DAY TO SECOND(6).
Interval qualifiers are divided into two categories:
Day-time interval qualifiers
Qualifier | Description
DAY | Unconstrained
DAY TO HOUR | Span of days and hours
DAY TO MINUTE | Span of days and minutes
DAY TO SECOND [(p)] | Span of days, hours, minutes, seconds, and fractions of a second
HOUR | Hours within days
HOUR TO MINUTE | Span of hours and minutes
HOUR TO SECOND [(p)] | Span of hours and seconds
MINUTE | Minutes within hours
MINUTE TO SECOND [(p)] | Span of minutes and seconds
SECOND [(p)] | Seconds within minutes
Year-month interval qualifiers
YEAR - Unconstrained
MONTH - Months within year
YEAR TO MONTH - Span of years and months
Note
Vertica also supports INTERVALYM, which is an alias for INTERVAL YEAR TO MONTH. Thus, the following two statements are equivalent:
=> SELECT INTERVALYM '1 2';
?column?
----------
1-2
(1 row)
=> SELECT INTERVAL '1 2' YEAR TO MONTH;
?column?
----------
1-2
(1 row)
Examples
See Controlling interval format.
2.4 - Operators
Operators are logical, mathematical, and equality symbols used in SQL to evaluate, compare, or calculate values.
2.4.1 - Bitwise operators
Bitwise operators perform bit manipulations on INTEGER and BINARY/VARBINARY data types:
Operator | Description | Example | Result
& | AND | 12 & 4 | 4
| | OR | 32 | 3 | 35
# | XOR | 17 # 5 | 20
~ | NOT | ~1 | -2
<< † | Bitwise shift left | 1 << 4 | 16
>> † | Bitwise shift right | 8 >> 2 | 2
† Invalid for BINARY/VARBINARY data types
String argument handling
String arguments must be explicitly cast as BINARY or VARBINARY data types for all bitwise operators. For example:
=> SELECT 'xyz'::VARBINARY & 'zyx'::VARBINARY AS AND;
AND
-----
xyx
(1 row)
=> SELECT 'xyz'::VARBINARY | 'zyx'::VARBINARY AS OR;
OR
-----
zyz
(1 row)
Bitwise operators treat all string arguments as equal in length. If the arguments have different lengths, the operator function right-pads the smaller string with one or more zero bytes to equal the length of the larger string.
For example, the following statement ANDs unequal strings xyz and zy. Vertica right-pads string zy with one zero byte. The last character in the result is represented accordingly, as \000:
=> SELECT 'xyz'::VARBINARY & 'zy'::VARBINARY AS AND;
AND
--------
xy\000
(1 row)
2.4.2 - Logical operators
Vertica supports the logical operators AND, OR, and NOT:
- AND evaluates to true when both of the conditions joined by AND are true.
- OR evaluates to true when either condition is true.
- NOT negates the result of any Boolean expression.
AND and OR are commutative—that is, you can switch left and right operands without affecting the result. However, the order of evaluation of sub-expressions is not defined. To force evaluation order, use a CASE construct.
Caution
Do not confuse Boolean operators with the Boolean predicate or Boolean data type, which can have only two values: true and false.
Logic
SQL uses a three-valued Boolean logic where NULL represents "unknown":
- true AND NULL = NULL
- true OR NULL = true
- false AND NULL = false
- false OR NULL = NULL
- NULL AND NULL = NULL
- NULL OR NULL = NULL
- NOT NULL = NULL
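A brief illustration of this logic (the column aliases are illustrative):
=> SELECT (true AND NULL) AS true_and_null, (true OR NULL) AS true_or_null;
 true_and_null | true_or_null
---------------+--------------
               | t
(1 row)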
2.4.3 - Comparison operators
Comparison operators are available for all data types where comparison makes sense. All comparison operators are binary operators that return values of true, false, or NULL (unknown).
Operator | Description | Binary function
< | less than | binary_lt
> | greater than | binary_gt
<= | less than or equal to | binary_le
>= | greater than or equal to | binary_ge
=, <=> | equal | binary_eq
!=, <> | not equal (unsupported for correlated subqueries) | binary_ne
Note
Do not use the negation operator (!) with the <=> operator. Instead, use != to test for inequality. !, except as part of !=, is the factorial operator.
NULL handling
Comparison operators return NULL (unknown) if either or both operands are null. One exception applies: <=> returns true if both operands are NULL, and false if one operand is NULL.
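For example, a minimal sketch showing the difference between = and <=> when both operands are NULL:
=> SELECT NULL::INT = NULL::INT AS equals, NULL::INT <=> NULL::INT AS null_safe_equals;
 equals | null_safe_equals
--------+------------------
        | t
(1 row)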
Collections
When comparing collections, null collections are ordered last. Otherwise, collections are compared element by element until there is a mismatch, and then they are ordered based on the non-matching elements. If all elements are equal up to the length of the shorter one, then the shorter one is ordered first.
2.4.4 - Data type coercion operators (CAST)
Data type coercion (casting) passes an expression value to an input conversion routine for a specified data type, resulting in a constant of the indicated type. In Vertica, data type coercion can be invoked by an explicit cast request that uses one of the following constructs:
Syntax
SELECT CAST ( expression AS data-type )
SELECT expression::data-type
SELECT data-type 'string'
Parameters
expression | An expression of any type
data-type | A SQL data type, supported by Vertica, to which expression is converted
Truncation
If a binary value is cast (implicitly or explicitly) to a binary type with a smaller length, the value is silently truncated. For example:
=> SELECT 'abcd'::BINARY(2);
?column?
----------
ab
(1 row)
Similarly, if a character value is cast (implicitly or explicitly) to a character value with a smaller length, the value is silently truncated. For example:
=> SELECT 'abcd'::CHAR(3);
?column?
----------
abc
(1 row)
Binary casting and resizing
Vertica supports only casts and resize operations as follows:
-
BINARY to and from VARBINARY
-
VARBINARY to and from LONG VARBINARY
-
BINARY to and from LONG VARBINARY
On binary data that contains a value with fewer bytes than the target column, values are right-extended with the zero byte '\0'
to the full width of the column. Trailing zeros on variable-length binary values are not right-extended:
=> SELECT 'ab'::BINARY(4), 'ab'::VARBINARY(4), 'ab'::LONG VARBINARY(4);
?column? | ?column? | ?column?
------------+----------+----------
ab\000\000 | ab | ab
(1 row)
Automatic coercion
The explicit type cast can be omitted if there is no ambiguity as to the type the constant must be. For example, when a constant is assigned directly to a column, it is automatically coerced to the column's data type.
Examples
=> SELECT CAST((2 + 2) AS VARCHAR);
?column?
----------
4
(1 row)
=> SELECT (2 + 2)::VARCHAR;
?column?
----------
4
(1 row)
=> SELECT INTEGER '123';
?column?
----------
123
(1 row)
=> SELECT (2 + 2)::LONG VARCHAR;
?column?
----------
4
(1 row)
=> SELECT '2.2' + 2;
ERROR: invalid input syntax for integer: "2.2"
=> SELECT FLOAT '2.2' + 2;
?column?
----------
4.2
(1 row)
2.4.4.1 - Cast failures
When you invoke data type coercion (casting) by an explicit cast and the cast fails, the result returns either an error or NULL. Cast failures commonly occur when you try to cast conflicting conversions, such as coercing a VARCHAR expression that contains letters to an integer.
When a cast fails, the result returned depends on the data type.
Data type | Cast failure default
date, time | NULL
literals | error
all other types | error
Enabling strict time casts
You can enable all cast failures to result in an error, including those for date/time data types. Doing so lets you see the reason why some or all of the casts failed. To return an error instead of NULL, set the configuration parameter EnableStrictTimeCasts to 1:
ALTER SESSION SET EnableStrictTimeCasts=1;
By default, EnableStrictTimeCasts is set to 0. Thus, the following attempt to cast a VARCHAR to a TIME data type returns NULL:
=> SELECT current_value from configuration_parameters WHERE parameter_name ilike '%EnableStrictTimeCasts%';
current_value
---------------
0
(1 row)
=> CREATE TABLE mytable (a VARCHAR);
CREATE TABLE
=> INSERT INTO mytable VALUES('one');
OUTPUT
--------
1
(1 row)
=> INSERT INTO mytable VALUES('1');
OUTPUT
--------
1
(1 row)
=> COMMIT;
COMMIT
=> SELECT a::time FROM mytable;
a
---
(2 rows)
If EnableStrictTimeCasts is enabled, the cast failure returns an error:
=> ALTER SESSION SET EnableStrictTimeCasts=1;
ALTER SESSION
=> SELECT a::time FROM mytable;
ERROR 3679: Invalid input syntax for time: "1"
Returning all cast failures as NULL
To explicitly cast an expression to a requested data type, use the following construct:
SELECT expression::data-type
Using this command to cast any values to a conflicting data type returns the following error:
=> SELECT 'one'::time;
ERROR 3679: Invalid input syntax for time: "one"
Vertica also supports the use of the coercion operator ::!, which is useful when you want to return all cast failures as NULL.
Returning all cast failures as NULL allows those expressions that succeed during the cast to appear in the result. Those expressions that fail during the cast, however, have a NULL value.
The following example queries mytable using the coercion operator ::!. The query returns NULL where column a contains the string one, and returns 1 where the column contains 1. Query results are identical no matter how EnableStrictTimeCasts is set:
=> SELECT current_value FROM configuration_parameters WHERE parameter_name ilike '%EnableStrictTimeCasts%';
current_value
---------------
0
(1 row)
=> SELECT a::!int FROM mytable;
a
---
1
(2 rows)
=> ALTER SESSION SET EnableStrictTimeCasts=1;
ALTER SESSION
=> SELECT a::!int FROM mytable;
a
---
1
(2 rows)
You can use ::! for casts of arrays and sets. The cast resolves each element individually, producing NULL for elements that cannot be cast.
Note that this functionality only applies to table data. It does not work on inline constant casts and in expressions automatically reduced to such. For example, the following statement:
SELECT constant ::! FLOAT from (select 'some string' as constant) a;
results in ERROR 2826: Could not convert "some string" to a float8. However, the following returns cast failures as NULL as described:
SELECT string_field ::! float FROM (SELECT 'a string' as string_field UNION ALL SELECT 'another string' ) a;
2.4.5 - Date/time operators
Vertica supports usage of arithmetic operators on DATE/TIME operands:
- + (addition)
- - (subtraction)
- * (multiplication)
- / (division)
Examples
The operators described below that take TIME or TIMESTAMP input have two variants: one that takes TIME WITH TIME ZONE or TIMESTAMP WITH TIME ZONE, and one that takes TIME WITHOUT TIME ZONE or TIMESTAMP WITHOUT TIME ZONE. For brevity, these variants are not shown separately.
The + and * operators come in commutative pairs—for example, both DATE + INTEGER and INTEGER + DATE. Only one of each pair is shown.
Example | Result Type | Result
DATE '2001-09-28' + INTEGER '7' | DATE | '2001-10-05'
DATE '2001-09-28' + INTERVAL '1 HOUR' | TIMESTAMP | '2001-09-28 01:00:00'
DATE '2001-09-28' + TIME '03:00' | TIMESTAMP | '2001-09-28 03:00:00'
INTERVAL '1 DAY' + INTERVAL '1 HOUR' | INTERVAL | '1 DAY 01:00:00'
TIMESTAMP '2001-09-28 01:00' + INTERVAL '23 HOURS' | TIMESTAMP | '2001-09-29 00:00:00'
TIME '01:00' + INTERVAL '3 HOURS' | TIME | '04:00:00'
- INTERVAL '23 HOURS' | INTERVAL | '-23:00:00'
DATE '2001-10-01' - DATE '2001-09-28' | INTEGER | '3'
DATE '2001-10-01' - INTEGER '7' | DATE | '2001-09-24'
DATE '2001-09-28' - INTERVAL '1 HOUR' | TIMESTAMP | '2001-09-27 23:00:00'
TIME '05:00' - TIME '03:00' | INTERVAL | '02:00:00'
TIME '05:00' - INTERVAL '2 HOURS' | TIME | '03:00:00'
TIMESTAMP '2001-09-28 23:00' - INTERVAL '23 HOURS' | TIMESTAMP | '2001-09-28 00:00:00'
INTERVAL '1 DAY' - INTERVAL '1 HOUR' | INTERVAL | '1 DAY -01:00:00'
TIMESTAMP '2001-09-29 03:00' - TIMESTAMP '2001-09-27 12:00' | INTERVAL | '1 DAY 15:00:00'
900 * INTERVAL '1 SECOND' | INTERVAL | '00:15:00'
21 * INTERVAL '1 DAY' | INTERVAL | '21 DAYS'
DOUBLE PRECISION '3.5' * INTERVAL '1 HOUR' | INTERVAL | '03:30:00'
INTERVAL '1 HOUR' / DOUBLE PRECISION '1.5' | INTERVAL | '00:40:00'
2.4.6 - Mathematical operators
Mathematical operators are provided for many data types.
Operator | Description | Example | Result
! | Factorial | 5 ! | 120
+ | Addition | 2 + 3 | 5
- | Subtraction | 2 - 3 | -1
* | Multiplication | 2 * 3 | 6
/ | Division (integer division produces NUMERIC results) | 4 / 2 | 2.00...
// | With integer division, returns an INTEGER rather than a NUMERIC | 117.32 // 2.5 | 46
% | Modulo (remainder). For details, see MOD. | 5 % 4 | 1
^ | Exponentiation | 2.0 ^ 3.0 | 8
|/ | Square root | |/ 25.0 | 5
||/ | Cube root | ||/ 27.0 | 3
!! | Factorial (prefix operator) | !! 5 | 120
@ | Absolute value | @ -5.0 | 5
Factorial operator support
Vertica supports use of factorial operators on positive and negative floating point (DOUBLE PRECISION) numbers and integers. For example:
=> SELECT 4.98!;
?column?
------------------
115.978600750905
(1 row)
Factorial is defined in terms of the gamma function, where (-1)! = Infinity and the factorials of the other negative integers are undefined. For example:
(-4)! = NaN
-(4!) = -24
Factorial is defined as follows for all complex numbers z:
z! = gamma(z+1)
For details, see Abramowitz and Stegun: Handbook of Mathematical Functions.
2.4.7 - NULL operators
To check whether a value is or is not NULL, use the following equivalent constructs:
Standard:
[expression IS NULL | expression IS NOT NULL]
Non-standard:
[expression ISNULL | expression NOTNULL]
Do not write expression = NULL: NULL represents an unknown value, and two unknown values are not necessarily equal. This behavior conforms to the SQL standard.
Note
Some applications might expect that expression = NULL returns true if expression evaluates to NULL. In this case, modify the application to comply with the SQL standard.
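For example, a minimal sketch showing the standard and non-standard forms returning the same result:
=> SELECT NULL::INT IS NULL AS standard_form, NULL::INT ISNULL AS nonstandard_form;
 standard_form | nonstandard_form
---------------+------------------
 t             | t
(1 row)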
2.4.8 - String concatenation operators
To concatenate two strings on a single line, use the concatenation operator (two consecutive vertical bars).
Syntax
string || string
Parameters
string | Expression of type CHAR or VARCHAR
Notes
- || is used to concatenate expressions and constants. The expressions are cast to VARCHAR if possible, otherwise to VARBINARY, and must both be one or the other.
- Two consecutive strings within a single SQL statement on separate lines are automatically concatenated.
Examples
The following example is a single string written on two lines:
=> SELECT E'xx'
-> '\\';
?column?
----------
xx\
(1 row)
The following examples show two strings concatenated:
=> SELECT E'xx' ||
-> '\\';
?column?
----------
xx\\
(1 row)
=> SELECT 'auto' || 'mobile';
?column?
----------
automobile
(1 row)
=> SELECT 'auto'
-> 'mobile';
?column?
----------
automobile
(1 row)
=> SELECT 1 || 2;
?column?
----------
12
(1 row)
=> SELECT '1' || '2';
?column?
----------
12
(1 row)
=> SELECT '1'
-> '2';
?column?
----------
12
(1 row)
2.5 - Expressions
SQL expressions are the components of a query that compare a value or values against other values. They can also perform calculations. An expression found inside a SQL statement is usually in the form of a conditional statement.
Some functions also use Lambda functions.
Operator precedence
The following table shows operator precedence in decreasing (high to low) order.
When an expression includes more than one operator, specify the order of operation using parentheses, rather than relying on operator precedence.
Operator/Element | Associativity | Description
. | left | table/column name separator
:: | left | typecast
[ ] | left | array element selection
- | right | unary minus
^ | left | exponentiation
* / % | left | multiplication, division, modulo
+ - | left | addition, subtraction
IS | | IS TRUE, IS FALSE, IS UNKNOWN, IS NULL
IN | | set membership
BETWEEN | | range containment
OVERLAPS | | time interval overlap
LIKE | | string pattern matching
< > | | less than, greater than
= | right | equality, assignment
NOT | right | logical negation
AND | left | logical conjunction
OR | left | logical disjunction
Expression evaluation rules
The order of evaluation of subexpressions is not defined. In particular, the inputs of an operator or function are not necessarily evaluated left-to-right or in any other fixed order. To force evaluation in a specific order, use a CASE construct. For example, this is an untrustworthy way of trying to avoid division by zero in a WHERE clause:
=> SELECT x, y WHERE x <> 0 AND y/x > 1.5; --- unsafe
But this is safe:
=> SELECT x, y
WHERE
CASE
WHEN x <> 0 THEN y/x > 1.5
ELSE false
END;
A CASE construct used in this fashion defeats optimization attempts, so use it only when necessary. (In this particular example, it would be best to avoid the issue by writing y > 1.5*x
instead.)
Limits to SQL expressions
Expressions are limited by the available stack. Vertica requires at least 100KB of free stack. If this limit is exceeded then the error "The query contains an expression that is too complex to analyze" might be thrown. Adding physical memory and/or increasing the value of ulimit -s
can increase the available stack and prevent the error.
Analytic expressions have a maximum recursion depth of 2000. If this limit is exceeded then the error "The query contains an expression that is too complex to analyze" might be thrown. This limit cannot be increased.
2.5.1 - Aggregate expressions
An aggregate expression applies an aggregate function across the rows or groups of rows selected by a query.
An aggregate expression can appear only in the select list or HAVING clause of a SELECT statement. It is invalid in other clauses such as WHERE, because those clauses are evaluated before the results of aggregates are formed.
Syntax
An aggregate expression has the following format:
aggregate-function ( [ * ] [ ALL | DISTINCT ] expression )
Parameters
aggregate-function - A Vertica function that aggregates data over groups of rows from a query result set.
ALL | DISTINCT - Specifies which input rows to process:
- ALL (default): Invokes aggregate-function across all input rows where expression evaluates to a non-null value.
- DISTINCT: Invokes aggregate-function across all input rows where expression evaluates to a unique non-null value.
expression - A value expression that does not itself contain an aggregate expression.
Examples
The AVG aggregate function returns the average income from the customer_dimension table:
=> SELECT AVG(annual_income) FROM customer_dimension;
AVG
--------------
2104270.6485
(1 row)
The following example shows how to use the COUNT aggregate function with the DISTINCT keyword to return the number of distinct values of the expression date_key + product_key across all inventory_fact records.
=> SELECT COUNT (DISTINCT date_key + product_key) FROM inventory_fact;
COUNT
-------
21560
(1 row)
2.5.2 - CASE expressions
The CASE expression is a generic conditional expression that can be used wherever an expression is valid. It is similar to CASE and IF/THEN/ELSE statements in other languages.
CASE
WHEN condition THEN result
[ WHEN condition THEN result ]
...
[ ELSE result ]
END
Parameters
condition | An expression that returns a Boolean (true/false) result. If the result is false, subsequent WHEN clauses are evaluated in the same way.
result | Specifies the value to return when the associated condition is true.
ELSE result | If no condition is true, then the value of the CASE expression is the result in the ELSE clause. If the ELSE clause is omitted and no condition matches, the result is NULL.
CASE expression
WHEN value THEN result
[ WHEN value THEN result ]
...
[ ELSE result ]
END
Parameters
expression | An expression that is evaluated and compared to all the value specifications in WHEN clauses until one is found that is equal.
value | Specifies a value to compare to the expression.
result | Specifies the value to return when the expression is equal to the specified value.
ELSE result | Specifies the value to return when the expression is not equal to any value; if no ELSE clause is specified, the value returned is null.
Notes
The data types of all result expressions must be convertible to a single output type.
Examples
The following examples show two uses of the CASE expression.
=> SELECT * FROM test;
a
---
1
2
3
=> SELECT a,
CASE WHEN a=1 THEN 'one'
WHEN a=2 THEN 'two'
ELSE 'other'
END
FROM test;
a | case
---+-------
1 | one
2 | two
3 | other
=> SELECT a,
CASE a WHEN 1 THEN 'one'
WHEN 2 THEN 'two'
ELSE 'other'
END
FROM test;
a | case
---+-------
1 | one
2 | two
3 | other
Special example
A CASE expression does not evaluate subexpressions that are not needed to determine the result. You can use this behavior to avoid division-by-zero errors:
=> SELECT x FROM T1 WHERE
CASE WHEN x <> 0 THEN y/x > 1.5
ELSE false
END;
2.5.3 - Column references
A column reference cannot contain any spaces.
Syntax
[[[database.]schema.]table-name.]column-name
Parameters
schema | Database and schema. The default schema is public. If you specify a database, it must be the current database.
table-name | One of the following:
column-name | A column name that is unique among all queried tables.
Restrictions
A column reference cannot contain any spaces.
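For example, a fully qualified column reference (assuming the customer_dimension table used in earlier examples, in the public schema):
=> SELECT public.customer_dimension.customer_name FROM public.customer_dimension;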
2.5.4 - Comments
A comment is an arbitrary sequence of characters beginning with two consecutive hyphen characters and extending to the end of the line. For example:
-- This is a standard SQL comment
A comment is removed from the input stream before further syntax analysis and is effectively replaced by white space.
Alternatively, C-style block comments can be used, where the comment begins with /* and extends to the matching occurrence of */.
/* multiline comment
* with nesting: /* nested block comment */
*/
These block comments nest, as specified in the SQL standard. Unlike C, you can comment out larger blocks of code that might contain existing block comments.
2.5.5 - Date/time expressions
Vertica uses an internal heuristic parser for all date/time input support. Dates and times are input as strings, and are broken up into distinct fields with a preliminary determination of what kind of information might be in the field. Each field is interpreted and either assigned a numeric value, ignored, or rejected. The parser contains internal lookup tables for all textual fields, including months, days of the week, and time zones.
Vertica parses date/time type inputs as follows:
- Break the input string into tokens and categorize each token as a string, time, time zone, or number.
- If the numeric token contains:
  - A colon (:): parse as a time string, and include all subsequent digits and colons.
  - A dash (-), slash (/), or two or more dots (.): parse as a date string, which might have a text month.
  - Numbers only: parse as a single field, or as an ISO 8601 concatenated date (19990113 for January 13, 1999) or time (141516 for 14:15:16).
- If the token starts with a plus (+) or minus (-): parse as a time zone or a special field.
- If the token is a text string, match it up with possible strings:
  - Perform a binary-search table lookup for the token as either a special string (for example, today), day (for example, Thursday), month (for example, January), or noise word (for example, at, on).
  - Set field values and bit mask for fields. For example, set year, month, day for today, and additionally hour, minute, second for now.
  - If not found, do a similar binary-search table lookup to match the token with a time zone.
  - If still not found, throw an error.
- If the token is a number or number field:
  - If there are eight or six digits, and if no other date fields were previously read, interpret as a "concatenated date" (19990118 or 990118). The interpretation is YYYYMMDD or YYMMDD.
  - If the token is three digits and a year was already read, interpret as day of year.
  - If four or six digits and a year was already read, interpret as a time (HHMM or HHMMSS).
  - If three or more digits and no date fields were found yet, interpret as a year (this forces yy-mm-dd ordering of the remaining date fields).
  - Otherwise the date field ordering is assumed to follow the DateStyle setting: mm-dd-yy, dd-mm-yy, or yy-mm-dd. Throw an error if a month or day field is found to be out of range.
- If BC is specified: negate the year and add one for internal storage. (In the Vertica implementation, 1 BC = year zero.)
- If BC is not specified, and the year field is two digits in length: adjust the year to four digits. If the field value is less than 70, add 2000; otherwise, add 1900.
Tip
Gregorian years AD 1–99 can be entered as 4 digits with leading zeros; for example, 0099 = AD 99.
Month day year ordering
For some formats, ordering of month, day, and year in date input is ambiguous and there is support for specifying the expected ordering of these fields.
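For example, a minimal sketch using the DATESTYLE session parameter to make the expected field ordering explicit (the setting and output shown are illustrative):
=> SET DATESTYLE TO ISO, DMY;
SET
=> SELECT DATE '07/08/2023' AS d;
     d
------------
 2023-08-07
(1 row)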
Special date/time values
Vertica supports several special date/time values for convenience, as shown below. All of these values need to be written in single quotes when used as constants in SQL statements.
The values INFINITY and -INFINITY are specially represented inside the system and are displayed the same way. The others are simply notational shorthands that are converted to ordinary date/time values when read. (In particular, NOW and related strings are converted to a specific time value as soon as they are read.)
String | Valid Data Types | Description
epoch | DATE, TIMESTAMP | 1970-01-01 00:00:00+00 (UNIX system time zero)
INFINITY | TIMESTAMP | Later than all other time stamps
-INFINITY | TIMESTAMP | Earlier than all other time stamps
NOW | DATE, TIME, TIMESTAMP | Current transaction's start time. Note: NOW is not the same as the NOW function.
TODAY | DATE, TIMESTAMP | Midnight today
TOMORROW | DATE, TIMESTAMP | Midnight tomorrow
YESTERDAY | DATE, TIMESTAMP | Midnight yesterday
ALLBALLS | TIME | 00:00:00.00 UTC
The following SQL-compatible functions can also be used to obtain the current time value for the corresponding data type: CURRENT_DATE, CURRENT_TIME, CURRENT_TIMESTAMP, LOCALTIME, and LOCALTIMESTAMP.
The latter four accept an optional precision specification. (See Date/time functions.) However, these functions are SQL functions and are not recognized as data input strings.
2.5.6 - NULL value
NULL is a reserved keyword used to indicate that a data value is unknown. It is the ASCII abbreviation for NULL characters (\0).
Usage in expressions
Vertica does not treat an empty string as a NULL value. An expression must specify NULL to indicate that a column value is unknown.
The following considerations apply to using NULL in expressions:
- NULL is not greater than, less than, equal to, or not equal to any other expression. Use the IS NULL or IS NOT NULL predicates to determine whether an expression value is NULL.
- You can write queries with expressions that contain the <=> operator for NULL=NULL joins. See Equi-joins and non equi-joins.
- Vertica accepts NULL characters ('\0') in constant strings and does not remove null characters from VARCHAR fields on input or output.
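For example, an empty string is not NULL (a minimal illustration):
=> SELECT '' IS NULL AS empty_is_null, NULL IS NULL AS null_is_null;
 empty_is_null | null_is_null
---------------+--------------
 f             | t
(1 row)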
Projection ordering of NULL data
Vertica sorts NULL values in projection columns as follows:
See also
NULL-handling functions
2.6 - Lambda functions
Some SQL functions have arguments that are lambda functions. A lambda function is an unnamed inline function that is evaluated by the containing SQL function and returns a value.
Syntax
Lambda with one argument:
argument -> expression
Lambda with more than one argument:
(argument, ...) -> expression
Arguments
argument
- Name to use for an input value for the expression. The name cannot be a reserved keyword, the name of an argument to a parent or nested lambda, or a column name or alias.
expression
- Expression that uses the input arguments and returns a result to the containing SQL function. See the documentation of individual SQL functions for restrictions on return values. For example, some functions require a Boolean result.
Examples
The ARRAY_FIND function returns the first index that matches the element being searched for. Instead of a literal element, you can write a lambda function that returns a Boolean. The lambda function is applied to each element in the array until a match is found or all elements have been tested. In the following example, each person in the table has an array of email addresses, and the function locates fake addresses:
=> CREATE TABLE people (id INT, name VARCHAR, email ARRAY[VARCHAR,5]);
=> SELECT name, ARRAY_FIND(email, e -> REGEXP_LIKE(e,'example.com','i'))
AS 'example.com'
FROM people;
name | example.com
---------------+-------------
Alice Adams | 1
Bob Adams | 1
Carol Collins | 0
Dave Jones | 0
(4 rows)
The argument e
represents the individual element, and the body of the lambda expression is the regular-expression comparison. The input table has four rows; in each row, the lambda function is called once per array element.
In the following example, a schedules table includes an array of events, where each event is a ROW with several fields:
=> CREATE TABLE schedules
(guest VARCHAR,
events ARRAY[ROW(e_date DATE, e_name VARCHAR, price NUMERIC(8,2))]);
You can use the CONTAINS function with a lambda expression to find people who have more than one event on the same day. The second argument, idx
, is the index of the current element:
=> SELECT guest FROM schedules
WHERE CONTAINS(events, (e, idx) ->
(idx < ARRAY_LENGTH(events) - 1)
AND (e.e_date = events[idx + 1].e_date));
guest
-------------
Alice Adams
(1 row)
2.7 - Predicates
Predicates are truth-tests.
Predicates are truth-tests. If the predicate test is true, it returns a value. Each predicate is evaluated per row, so that when the predicate is part of an entire table SELECT statement, the statement can return multiple results.
Predicates consist of a set of parameters and arguments. For example, in the following WHERE clause:
WHERE name = 'Smith'
name = 'Smith' is the predicate: name is a parameter, = is an operator, and 'Smith' is an argument.
2.7.1 - ANY and ALL
ANY and ALL are logical operators that let you make comparisons on subqueries that return one or more rows.
ANY and ALL are logical operators that let you make comparisons on subqueries that return one or more rows. Both operators must be preceded by a comparison operator and followed by a subquery:
expression comparison-operator { ANY | ALL } (subquery)
- ANY returns true if the comparison between expression and any value returned by subquery evaluates to true.
- ALL returns true only if the comparison between expression and all values returned by subquery evaluates to true.
Equivalent operators
You can use the following operators instead of ANY or ALL:
This operator... | Is equivalent to:
-----------------|-------------------
SOME | ANY
IN | = ANY
NOT IN | <> ALL
NULL handling
Vertica supports multicolumn <> ALL subqueries where the columns are not marked NOT NULL
. If any column contains a NULL value, Vertica returns a run-time error.
Vertica does not support ANY subqueries that are nested in another expression if any column values are NULL.
Examples
Examples below use the following tables and data:
=> SELECT * FROM t1 ORDER BY c1;
c1 | c2
----+-----
1 | cab
1 | abc
2 | fed
2 | def
3 | ihg
3 | ghi
4 | jkl
5 | mno
(8 rows)
=> SELECT * FROM t2 ORDER BY c1;
c1 | c2
----+-----
1 | abc
2 | fed
3 | jkl
3 | stu
3 | zzz
(5 rows)
ANY subqueries
Subqueries that use the ANY keyword return true when any value retrieved in the subquery matches the value of the left-hand expression.
ANY subquery within an expression:
=> SELECT c1, c2 FROM t1 WHERE COALESCE((t1.c1 > ANY (SELECT c1 FROM t2)));
c1 | c2
----+-----
2 | fed
2 | def
3 | ihg
3 | ghi
4 | jkl
5 | mno
(6 rows)
ANY noncorrelated subqueries without aggregates:
=> SELECT c1 FROM t1 WHERE c1 = ANY (SELECT c1 FROM t2) ORDER BY c1;
c1
----
1
1
2
2
3
3
(6 rows)
ANY noncorrelated subqueries with aggregates:
=> SELECT c1, c2 FROM t1 WHERE c1 <> ANY (SELECT MAX(c1) FROM t2) ORDER BY c1;
c1 | c2
----+-----
1 | cab
1 | abc
2 | fed
2 | def
4 | jkl
5 | mno
(6 rows)
=> SELECT c1 FROM t1 GROUP BY c1 HAVING c1 <> ANY (SELECT MAX(c1) FROM t2) ORDER BY c1;
c1
----
1
2
4
5
(4 rows)
ANY noncorrelated subqueries with aggregates and a GROUP BY clause:
=> SELECT c1, c2 FROM t1 WHERE c1 <> ANY (SELECT MAX(c1) FROM t2 GROUP BY c2) ORDER BY c1;
c1 | c2
----+-----
1 | cab
1 | abc
2 | fed
2 | def
3 | ihg
3 | ghi
4 | jkl
5 | mno
(8 rows)
ANY noncorrelated subqueries with a GROUP BY clause:
=> SELECT c1, c2 FROM t1 WHERE c1 <=> ANY (SELECT c1 FROM t2 GROUP BY c1) ORDER BY c1;
c1 | c2
----+-----
1 | cab
1 | abc
2 | fed
2 | def
3 | ihg
3 | ghi
(6 rows)
ANY correlated subqueries with no aggregates or GROUP BY clause:
=> SELECT c1, c2 FROM t1 WHERE c1 >= ANY (SELECT c1 FROM t2 WHERE t2.c2 = t1.c2) ORDER BY c1;
c1 | c2
----+-----
1 | abc
2 | fed
4 | jkl
(3 rows)
ALL subqueries
A subquery that uses the ALL keyword returns true when all values retrieved by the subquery match the left-hand expression, otherwise it returns false.
ALL noncorrelated subqueries without aggregates:
=> SELECT c1, c2 FROM t1 WHERE c1 >= ALL (SELECT c1 FROM t2) ORDER BY c1;
c1 | c2
----+-----
3 | ihg
3 | ghi
4 | jkl
5 | mno
(4 rows)
ALL noncorrelated subqueries with aggregates:
=> SELECT c1, c2 FROM t1 WHERE c1 = ALL (SELECT MAX(c1) FROM t2) ORDER BY c1;
c1 | c2
----+-----
3 | ihg
3 | ghi
(2 rows)
=> SELECT c1 FROM t1 GROUP BY c1 HAVING c1 <> ALL (SELECT MAX(c1) FROM t2) ORDER BY c1;
c1
----
1
2
4
5
(4 rows)
ALL noncorrelated subqueries with aggregates and a GROUP BY clause:
=> SELECT c1, c2 FROM t1 WHERE c1 <= ALL (SELECT MAX(c1) FROM t2 GROUP BY c2) ORDER BY c1;
c1 | c2
----+-----
1 | cab
1 | abc
(2 rows)
ALL noncorrelated subqueries with a GROUP BY clause:
=> SELECT c1, c2 FROM t1 WHERE c1 <> ALL (SELECT c1 FROM t2 GROUP BY c1) ORDER BY c1;
c1 | c2
----+-----
4 | jkl
5 | mno
(2 rows)
2.7.2 - BETWEEN
Checks whether an expression is within the range of two other expressions, inclusive.
Checks whether an expression is within the range of two other expressions, inclusive. All expressions must be of the same or compatible data types.
Syntax
WHERE a BETWEEN x AND y
Equivalent predicates
The following BETWEEN predicates can be rewritten in conventional SQL with logical operators AND and OR.
This BETWEEN predicate... | Is equivalent to...
--------------------------|---------------------
WHERE a BETWEEN x AND y | WHERE a >= x AND a <= y
WHERE a NOT BETWEEN x AND y | WHERE a < x OR a > y
Examples
The BETWEEN predicate can be especially useful for querying date ranges, as shown in the following examples:
=> SELECT NOW()::DATE;
NOW
------------
2022-12-15
(1 row)
=> CREATE TABLE t1 (a INT, b varchar(12), c DATE);
CREATE TABLE
=> INSERT INTO t1 VALUES
(0,'today',NOW()),
(1,'today+1',NOW()+1),
(2,'today+2',NOW()+2),
(3,'today+3',NOW()+3),
(4,'today+4',NOW()+4),
(5,'today+5',NOW()+5),
(6,'today+6',NOW()+6);
OUTPUT
--------
7
(1 row)
=> COMMIT;
COMMIT
=> SELECT * FROM t1;
a | b | c
---+---------+------------
0 | today | 2022-12-15
1 | today+1 | 2022-12-16
2 | today+2 | 2022-12-17
3 | today+3 | 2022-12-18
4 | today+4 | 2022-12-19
5 | today+5 | 2022-12-20
6 | today+6 | 2022-12-21
(7 rows)
=> SELECT * FROM t1 WHERE c BETWEEN '2022-12-17' AND '2022-12-20';
a | b | c
---+---------+------------
2 | today+2 | 2022-12-17
3 | today+3 | 2022-12-18
4 | today+4 | 2022-12-19
5 | today+5 | 2022-12-20
(4 rows)
Use the NOW and INTERVAL keywords to query a date range:
=> SELECT * FROM t1 WHERE c BETWEEN NOW()::DATE AND NOW()::DATE + INTERVAL '2 days';
a | b | c
---+---------+------------
0 | today | 2022-12-15
1 | today+1 | 2022-12-16
2 | today+2 | 2022-12-17
(3 rows)
2.7.3 - Boolean
Retrieves rows where the value of an expression is true, false, or unknown (NULL).
Retrieves rows where the value of an expression is true, false, or unknown (NULL).
Syntax
expression IS [NOT] TRUE
expression IS [NOT] FALSE
expression IS [NOT] UNKNOWN
Notes
- NULL input is treated as the value UNKNOWN.
- IS UNKNOWN and IS NOT UNKNOWN are effectively the same as the NULL predicate, except that the input expression does not have to be a single column value. To check a single column value for NULL, use the NULL predicate.
- Do not confuse the Boolean predicate with Boolean operators or the Boolean data type, which can have only two values: true and false.
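A minimal sketch of the Boolean predicate, using literal comparisons (a comparison involving NULL evaluates to unknown):
=> SELECT (1 < 2) IS TRUE AS is_true,
          (1 < 2) IS NOT FALSE AS is_not_false,
          (NULL < 2) IS UNKNOWN AS is_unknown;
 is_true | is_not_false | is_unknown
---------+--------------+------------
 t       | t            | t
(1 row)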
2.7.4 - EXISTS
EXISTS and NOT EXISTS predicates compare an expression against a subquery.
EXISTS and NOT EXISTS predicates compare an expression against a subquery:
- EXISTS returns true if the subquery returns one or more rows.
- NOT EXISTS returns true if the subquery returns no rows.
Syntax
expression [ NOT ] EXISTS ( subquery )
Usage
EXISTS results only depend on whether any or no records are returned, and not on the contents of those records. Because the subquery output is usually of no interest, EXISTS tests are commonly written in one of the following ways:
EXISTS (SELECT 1 WHERE...)
EXISTS (SELECT * WHERE...)
In the first case, the subquery returns 1 for every record found by the subquery. For example, the following query retrieves a list of all customers whose store purchases were greater than 550 dollars:
=> SELECT customer_key, customer_name, customer_state
FROM public.customer_dimension WHERE EXISTS
(SELECT 1 FROM store.store_sales_fact
WHERE customer_key = public.customer_dimension.customer_key
AND sales_dollar_amount > 550)
AND customer_state = 'MA' ORDER BY customer_key;
customer_key | customer_name | customer_state
--------------+------------------------+----------------
2 | Anna G. Li | CA
4 | Daniel I. Fortin | TX
7 | David H. Greenwood | MA
8 | Wendy S. Young | IL
9 | Theodore X. Brown | MA
...
49902 | Amy Q. Pavlov | MA
49922 | Doug C. Carcetti | MA
49930 | Theodore G. McNulty | MA
49979 | Ben Z. Miller | MA
(1058 rows)
EXISTS versus IN
Whether you use EXISTS or IN subqueries depends on which predicates you select in outer and inner query blocks. For example, the following query gets a list of all the orders placed by all stores on January 2, 2007 for vendors with records in the vendor table:
=> SELECT store_key, order_number, date_ordered
FROM store.store_orders_fact WHERE EXISTS
(SELECT 1 FROM public.vendor_dimension vd JOIN store.store_orders_fact ord ON vd.vendor_key = ord.vendor_key)
AND date_ordered = '2007-01-02';
store_key | order_number | date_ordered
-----------+--------------+--------------
114 | 271071 | 2007-01-02
19 | 290888 | 2007-01-02
132 | 58942 | 2007-01-02
232 | 9286 | 2007-01-02
126 | 224474 | 2007-01-02
196 | 63482 | 2007-01-02
...
196 | 83327 | 2007-01-02
138 | 278373 | 2007-01-02
179 | 293586 | 2007-01-02
155 | 213413 | 2007-01-02
(506 rows)
The above query looks for existence of the vendor and date ordered. To return a particular value, rather than simple existence, the query looks for orders placed by the vendor who got the best deal on January 2, 2007:
=> SELECT store_key, order_number, date_ordered, vendor_name
FROM store.store_orders_fact ord JOIN public.vendor_dimension vd ON ord.vendor_key = vd.vendor_key
WHERE vd.deal_size IN (SELECT MAX(deal_size) FROM public.vendor_dimension) AND date_ordered = '2007-01-02';
store_key | order_number | date_ordered | vendor_name
-----------+--------------+--------------+----------------------
50 | 99234 | 2007-01-02 | Everything Wholesale
81 | 200802 | 2007-01-02 | Everything Wholesale
115 | 13793 | 2007-01-02 | Everything Wholesale
204 | 41842 | 2007-01-02 | Everything Wholesale
133 | 169025 | 2007-01-02 | Everything Wholesale
163 | 208580 | 2007-01-02 | Everything Wholesale
29 | 154972 | 2007-01-02 | Everything Wholesale
145 | 236790 | 2007-01-02 | Everything Wholesale
249 | 54838 | 2007-01-02 | Everything Wholesale
7 | 161536 | 2007-01-02 | Everything Wholesale
(10 rows)
See also
IN
2.7.5 - IN
Checks whether a single value is found (or not found) within a set of values.
Checks whether a single value is found (or not found) within a set of values.
Syntax
(column-list) [ NOT ] IN ( values-list )
Arguments
column-list
- One or more comma-delimited columns in the queried tables.
values-list
- Comma-delimited list of constant values to find in the column-list columns. Each values-list value maps to a column-list column according to their order in values-list and column-list, respectively. Column/value pairs must have compatible data types.
You can specify multiple sets of values as follows:
( ( values-list ), ( values-list )[,...] )
Null handling
Vertica supports multicolumn NOT IN subqueries where the columns are not marked NOT NULL. If one of the columns is found to contain a NULL value during query execution, Vertica returns a run-time error.
Similarly, IN subqueries nested within another expression are not supported if any column values are NULL. For example, if in the following statement column x
from either table contains a NULL value, Vertica returns a run-time error:
=> SELECT * FROM t1 WHERE (x IN (SELECT x FROM t2)) IS FALSE;
ERROR: NULL value found in a column used by a subquery
EXISTS versus IN
Whether you use EXISTS or IN subqueries depends on which predicates you select in outer and inner query blocks. For example, the following query gets a list of all the orders placed by all stores on January 2, 2007 for vendors with records in the vendor table:
=> SELECT store_key, order_number, date_ordered
FROM store.store_orders_fact WHERE EXISTS
(SELECT 1 FROM public.vendor_dimension vd JOIN store.store_orders_fact ord ON vd.vendor_key = ord.vendor_key)
AND date_ordered = '2007-01-02';
store_key | order_number | date_ordered
-----------+--------------+--------------
114 | 271071 | 2007-01-02
19 | 290888 | 2007-01-02
132 | 58942 | 2007-01-02
232 | 9286 | 2007-01-02
126 | 224474 | 2007-01-02
196 | 63482 | 2007-01-02
...
196 | 83327 | 2007-01-02
138 | 278373 | 2007-01-02
179 | 293586 | 2007-01-02
155 | 213413 | 2007-01-02
(506 rows)
The above query looks for existence of the vendor and date ordered. To return a particular value, rather than simple existence, the query looks for orders placed by the vendor who got the best deal on January 2, 2007:
=> SELECT store_key, order_number, date_ordered, vendor_name
FROM store.store_orders_fact ord JOIN public.vendor_dimension vd ON ord.vendor_key = vd.vendor_key
WHERE vd.deal_size IN (SELECT MAX(deal_size) FROM public.vendor_dimension) AND date_ordered = '2007-01-02';
store_key | order_number | date_ordered | vendor_name
-----------+--------------+--------------+----------------------
50 | 99234 | 2007-01-02 | Everything Wholesale
81 | 200802 | 2007-01-02 | Everything Wholesale
115 | 13793 | 2007-01-02 | Everything Wholesale
204 | 41842 | 2007-01-02 | Everything Wholesale
133 | 169025 | 2007-01-02 | Everything Wholesale
163 | 208580 | 2007-01-02 | Everything Wholesale
29 | 154972 | 2007-01-02 | Everything Wholesale
145 | 236790 | 2007-01-02 | Everything Wholesale
249 | 54838 | 2007-01-02 | Everything Wholesale
7 | 161536 | 2007-01-02 | Everything Wholesale
(10 rows)
Examples
The following SELECT statement queries all data in table t11
.
=> SELECT * FROM t11 ORDER BY pk;
pk | col1 | col2 | SKIP_ME_FLAG
----+------+------+--------------
1 | 2 | 3 | t
2 | 3 | 4 | t
3 | 4 | 5 | f
4 | 5 | 6 | f
5 | 6 | 7 | t
6 | | 8 | f
7 | 8 | | t
(7 rows)
The following query specifies an IN
predicate, to find all rows in t11
where columns col1
and col2
contain values of (2,3)
or (6,7)
:
=> SELECT * FROM t11 WHERE (col1, col2) IN ((2,3), (6,7)) ORDER BY pk;
pk | col1 | col2 | SKIP_ME_FLAG
----+------+------+--------------
1 | 2 | 3 | t
5 | 6 | 7 | t
(2 rows)
The following query uses the VMart schema to illustrate the use of outer expressions referring to different inner expressions:
=> SELECT product_description, product_price FROM product_dimension
WHERE (product_dimension.product_key, product_dimension.product_key) IN
(SELECT store.store_orders_fact.order_number,
store.store_orders_fact.quantity_ordered
FROM store.store_orders_fact);
product_description | product_price
-----------------------------+---------------
Brand #73 wheechair | 454
Brand #72 box of candy | 326
Brand #71 vanilla ice cream | 270
(3 rows)
2.7.6 - INTERPOLATE
Joins two event series using some ordered attribute.
Joins two event series using some ordered attribute. Event series joins let you compare values from two series directly, rather than having to normalize the series to the same measurement interval.
An event series join is an extension of a regular outer join. The difference between expressing a regular outer join and an event series join is the INTERPOLATE predicate, which is used in the ON clause (see Examples below). Instead of padding the non-preserved side with null values when there is no match, the event series join pads the non-preserved side with the previous/next values from the table.
Interpolated values come from the table that contains the null, not from the other table. Vertica does not guarantee that the output contains no null values. If there is no previous/next value for a mismatched row, that row is padded with nulls.
Syntax
expression1 INTERPOLATE { PREVIOUS | NEXT } VALUE expression2
Arguments
expression1, expression2
- A column reference from one of the tables specified in the FROM clause. The columns can be of any data type. Because event series are time-based, the type is typically DATE/TIME or TIMESTAMP.
{ PREVIOUS | NEXT } VALUE
- Pads the non-preserved side with the previous/next values when there is no match. If PREVIOUS is called on the first row (or NEXT on the last row), the row is padded with null values. Input rows are sorted in ascending logical order of the join column.
Note
An ORDER BY clause, if used, does not determine the input order; it only determines query output order.
Notes
- Data is logically partitioned on the table in which it resides, based on other ON clause equality predicates.
- Event series join requires that the joined tables are both sorted on columns in equality predicates, in any order, followed by the INTERPOLATED column. If data is already sorted in this order, then an explicit sort is avoided, which can improve query performance. For example, given the following tables:
ask: exchange, stock, ts, price
bid: exchange, stock, ts, price
In the query that follows:
- ask is sorted on exchange, stock (or the reverse), ts
- bid is sorted on exchange, stock (or the reverse), ts
SELECT ask.price - bid.price, ask.ts, ask.stock, ask.exchange
FROM ask FULL OUTER JOIN bid
ON ask.stock = bid.stock AND ask.exchange = bid.exchange
AND ask.ts INTERPOLATE PREVIOUS VALUE bid.ts;
Restrictions
- Only one INTERPOLATE expression is allowed per join.
- INTERPOLATE expressions are used only with ANSI SQL-99 syntax (the ON clause), which is already true for full outer joins.
- INTERPOLATE can be used with equality predicates only.
- The AND operator is supported but not the OR and NOT operators.
- Expressions and implicit or explicit casts are not supported, but subqueries are allowed.
Semantics
When you write an event series join in place of a normal join, values are evaluated as follows (using the schema in the examples below):
In the case of a full outer join, all values from both tables are preserved.
Examples
The examples that follow use this simple schema.
CREATE TABLE t(x TIME);
CREATE TABLE t1(y TIME);
INSERT INTO t VALUES('12:40:23');
INSERT INTO t VALUES('13:40:25');
INSERT INTO t VALUES('13:45:00');
INSERT INTO t VALUES('14:49:55');
INSERT INTO t1 VALUES('12:40:23');
INSERT INTO t1 VALUES('14:00:00');
COMMIT;
Normal full outer join
=> SELECT * FROM t FULL OUTER JOIN t1 ON t.x = t1.y;
Notice the null rows from the non-preserved table:
x | y
----------+----------
12:40:23 | 12:40:23
13:40:25 |
13:45:00 |
14:49:55 |
| 14:00:00
(5 rows)
Full outer join with interpolation
=> SELECT * FROM t FULL OUTER JOIN t1 ON t.x INTERPOLATE PREVIOUS VALUE t1.y;
In this case, the rows with no entry point are padded with values from the previous row.
x | y
----------+----------
12:40:23 | 12:40:23
13:40:25 | 12:40:23
13:45:00 | 12:40:23
14:49:55 | 12:40:23
13:40:25 | 14:00:00
(5 rows)
Likewise, interpolate next is also supported:
=> SELECT * FROM t FULL OUTER JOIN t1 ON t.x INTERPOLATE NEXT VALUE t1.y;
In this case, the rows with no entry point are padded with values from the next row.
x | y
----------+----------
12:40:23 | 12:40:23
13:40:25 | 14:00:00
13:45:00 | 14:00:00
14:49:55 |
14:49:55 | 14:00:00
(5 rows)
Normal left outer join
=> SELECT * FROM t LEFT OUTER JOIN t1 ON t.x = t1.y;
Again, there are nulls in the non-preserved table:
x | y
----------+----------
12:40:23 | 12:40:23
13:40:25 |
13:45:00 |
14:49:55 |
(4 rows)
Left outer join with interpolation
=> SELECT * FROM t LEFT OUTER JOIN t1 ON t.x INTERPOLATE PREVIOUS VALUE t1.y;
Nulls have been padded with interpolated values.
x | y
----------+----------
12:40:23 | 12:40:23
13:40:25 | 12:40:23
13:45:00 | 12:40:23
14:49:55 | 14:00:00
(4 rows)
Likewise, interpolate next is also supported:
=> SELECT * FROM t LEFT OUTER JOIN t1 ON t.x INTERPOLATE NEXT VALUE t1.y;
Nulls have been padded with interpolated values here as well.
x | y
----------+----------
12:40:23 | 12:40:23
13:40:25 | 14:00:00
13:45:00 | 14:00:00
14:49:55 |
(4 rows)
Inner joins
For inner joins, there is no difference between a regular inner join and an event series inner join. Since null values are eliminated from the result set, there is nothing to interpolate.
A regular inner join returns only the single matching row at 12:40:23:
=> SELECT * FROM t INNER JOIN t1 ON t.x = t1.y;
x | y
----------+----------
12:40:23 | 12:40:23
(1 row)
An event series inner join finds the same single-matching row at 12:40:23:
=> SELECT * FROM t INNER JOIN t1 ON t.x INTERPOLATE PREVIOUS VALUE t1.y;
x | y
----------+----------
12:40:23 | 12:40:23
(1 row)
See also
Event series joins
2.7.6.1 - Join predicate
Specifies the columns on which records from two or more tables are joined.
Specifies the columns on which records from two or more tables are joined. You can connect multiple join predicates with logical operators AND
, OR
, and NOT
.
Syntax
ON column-ref = column-ref [ {AND | OR | NOT } column-ref = column-ref ]...
Parameters
column-ref
- Specifies a column in a queried table. For best performance, do not join on LONG VARBINARY and LONG VARCHAR columns.
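For example, a minimal sketch of a join predicate that combines two equality conditions with AND (the tables and columns are hypothetical):
=> SELECT o.order_number, c.customer_name
   FROM store_orders o JOIN customers c          -- hypothetical tables
   ON o.customer_key = c.customer_key AND o.region = c.region;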
See also
Joins
2.7.7 - LIKE
Retrieves rows where a string expression—typically a column—matches the specified pattern or, if qualified by ANY or ALL, set of patterns.
Retrieves rows where a string expression—typically a column—matches the specified pattern or, if qualified by ANY or ALL, set of patterns. Patterns can contain one or more wildcard characters.
If an ANY or ALL pattern is qualified with NOT, the negation is pushed down to each clause. NOT LIKE ANY (a, b) is equivalent to NOT LIKE a OR NOT LIKE b. See the examples.
Syntax
string-expression [ NOT ] { LIKE | ILIKE | LIKEB | ILIKEB }
{ pattern | { ANY | SOME | ALL } ( pattern,... ) } [ ESCAPE 'char' ]
Arguments
string-expression
- String expression, typically a column, to test for instances of the specified pattern or patterns.
NOT
- Returns true if the LIKE predicate returns false and vice-versa. When used with ANY or ALL, applies to each value individually.
LIKE | ILIKE | LIKEB | ILIKEB
- Type of comparison:
  - LIKE: Complies with the SQL standard. Case-sensitive, operates on UTF-8 character strings; exact behavior depends on collation parameters such as strength. LIKE is stable for character strings, but immutable for binary strings.
  - ILIKE: Same as LIKE but case-insensitive.
  - LIKEB: Performs case-sensitive byte-at-a-time ASCII comparisons, immutable for character and binary strings.
  - ILIKEB: Same as LIKEB but case-insensitive.
pattern
- A pattern to test against the expression. Pattern strings can contain the following wildcard characters: underscore (_), which matches any single character, and percent sign (%), which matches zero or more characters.
ANY | SOME | ALL
- Apply a comma-delimited list of patterns, where:
  - ANY and SOME return true if any pattern matches, equivalent to logical OR. These options are synonyms.
  - ALL returns true only if all patterns match, equivalent to logical AND.
ESCAPE char
- Escape character, by default backslash (\), used to escape reserved characters: the wildcard characters (underscore and percent) and the escape character itself.
This option is enforced only for non-default collations; it is currently unsupported with ANY/ALL pattern matching.
Note
Backslash is not valid for binary data type characters. To embed an escape character for binary data types, use ESCAPE to specify a valid binary character.
Substitute symbols
You can substitute the following symbols for LIKE and its variants:
Note
ESCAPE usage is not valid for these symbols.
Symbol | Equivalent to:
-------|----------------
~~ | LIKE
~# | LIKEB
~~* | ILIKE
~#* | ILIKEB
!~~ | NOT LIKE
!~# | NOT LIKEB
!~~* | NOT ILIKE
!~#* | NOT ILIKEB
Pattern matching
LIKE and its variants require that the entire string expression match the specified patterns. To match a sequence of characters anywhere within a string, the pattern must start and end with a percent sign.
LIKE does not ignore trailing white space characters. If the data values to match end with an indeterminate amount of white space, append the wildcard character %
to pattern
.
Locale dependencies
In the default locale, LIKE and ILIKE perform UTF-8 character-at-a-time, locale-insensitive comparisons. ILIKE performs language-independent case-folding.
In non-default locales, LIKE and ILIKE perform locale-sensitive string comparisons, including some automatic normalization, using the same algorithm as the =
operator on VARCHAR types.
ESCAPE expressions evaluate to exactly one octet—or one UTF-8 character for non-default locales.
Examples
Basic pattern matching
The following query searches for names with a common prefix:
=> SELECT name FROM people WHERE name LIKE 'Ann%';
name
-----------
Ann
Ann Marie
Anna
(3 rows)
LIKE ANY/ALL
LIKE operators support the keywords ANY and ALL, which let you specify multiple patterns to test against a string expression. For example, the following query finds all names that begin or end with the letter 'A':
=> SELECT name FROM people WHERE name LIKE ANY ('A%', '%a');
name
-----------
Alice
Ann
Ann Marie
Anna
Roberta
(5 rows)
LIKE ANY usage is equivalent to individual LIKE conditions combined with OR:
=> SELECT name FROM people WHERE name LIKE 'A%' OR name LIKE '%a';
name
-----------
Alice
Ann
Ann Marie
Anna
Roberta
(5 rows)
Similarly, LIKE ALL is equivalent to individual LIKE conditions combined with AND.
NOT LIKE ANY/ALL
You can use NOT with LIKE ANY or LIKE ALL. NOT does not negate the LIKE expression; instead it negates each clause.
Consider a table with the following contents:
=> SELECT name FROM people;
name
-----------
Alice
Ann
Ann Marie
Anna
Richard
Rob
Robert
Roberta
(8 rows)
In the following query, NOT LIKE ANY ('A%', '%a')
is equivalent to NOT LIKE 'A%' OR NOT LIKE '%a'
, so the only result that is eliminated is Anna
, which matches both patterns:
=> SELECT name FROM people WHERE name NOT LIKE ANY ('A%', '%a');
name
-----------
Alice
Ann
Ann Marie
Richard
Rob
Robert
Roberta
(7 rows)
--- same results:
=> SELECT name FROM people WHERE name NOT LIKE 'A%' OR name NOT LIKE '%a';
NOT LIKE ALL eliminates results that satisfy any pattern:
=> SELECT name FROM people WHERE name NOT LIKE ALL ('A%', '%a');
name
---------
Richard
Rob
Robert
(3 rows)
--- same results:
=> SELECT name FROM people WHERE name NOT LIKE 'A%' AND name NOT LIKE '%a';
Pattern matching in locales
The following example illustrates pattern matching in locales.
=> \locale default
INFO 2567: Canonical locale: 'en_US'
Standard collation: 'LEN_KBINARY'
English (United States)
=> CREATE TABLE src(c1 VARCHAR(100));
=> INSERT INTO src VALUES (U&'\00DF'); --The sharp s (ß)
=> INSERT INTO src VALUES ('ss');
=> COMMIT;
Querying the src
table in the default locale returns both ss and sharp s.
=> SELECT * FROM src;
c1
----
ß
ss
(2 rows)
The following query combines pattern-matching predicates to return the results from column c1
:
=> SELECT c1, c1 = 'ss' AS equality, c1 LIKE 'ss'
AS LIKE, c1 ILIKE 'ss' AS ILIKE FROM src;
c1 | equality | LIKE | ILIKE
----+----------+------+-------
ß | f | f | f
ss | t | t | t
(2 rows)
The next query specifies unicode format for c1
:
=> SELECT c1, c1 = U&'\00DF' AS equality,
c1 LIKE U&'\00DF' AS LIKE,
c1 ILIKE U&'\00DF' AS ILIKE from src;
c1 | equality | LIKE | ILIKE
----+----------+------+-------
ß | t | t | t
ss | f | f | f
(2 rows)
Now change the locale to German with a strength of 1 (ignore case and accents):
=> \locale LDE_S1
INFO 2567: Canonical locale: 'de'
Standard collation: 'LDE_S1'
German Deutsch
=> SELECT c1, c1 = 'ss' AS equality,
c1 LIKE 'ss' as LIKE, c1 ILIKE 'ss' AS ILIKE from src;
c1 | equality | LIKE | ILIKE
----+----------+------+-------
ß | t | t | t
ss | t | t | t
(2 rows)
This example illustrates binary data types with pattern-matching predicates:
=> CREATE TABLE t (c BINARY(1));
CREATE TABLE
=> INSERT INTO t VALUES (HEX_TO_BINARY('0x00')), (HEX_TO_BINARY('0xFF'));
OUTPUT
--------
2
(1 row)
=> COMMIT;
COMMIT
=> SELECT TO_HEX(c) FROM t;
TO_HEX
--------
00
ff
(2 rows)
=> SELECT * FROM t;
c
------
\000
\377
(2 rows)
=> SELECT c, c = '\000', c LIKE '\000', c ILIKE '\000' from t;
c | ?column? | ?column? | ?column?
------+----------+----------+----------
\000 | t | t | t
\377 | f | f | f
(2 rows)
=> SELECT c, c = '\377', c LIKE '\377', c ILIKE '\377' FROM t;
c | ?column? | ?column? | ?column?
------+----------+----------+----------
\000 | f | f | f
\377 | t | t | t
(2 rows)
2.7.8 - NULL
Tests for null values.
Tests for null values. The expression can be a column name, literal, or function.
Syntax
value-expression IS [ NOT ] NULL
Examples
Column name:
=> SELECT date_key FROM date_dimension WHERE date_key IS NOT NULL;
date_key
----------
1
366
1462
1097
2
3
6
7
8
...
Function:
=> SELECT MAX(household_id) IS NULL FROM customer_dimension;
?column?
----------
f
(1 row)
Literal:
=> SELECT 'a' IS NOT NULL;
?column?
----------
t
(1 row)
2.8 - Hints
Hints are directives that you embed within a query or directed query.
Hints are directives that you embed within a query or directed query. They conform to the following syntax:
/*+hint-name[, hint-name]...*/
Hints are bracketed by comment characters /*+
and */
, which can enclose multiple comma-delimited hints. For example:
SELECT /*+syntactic_join,verbatim*/
Restrictions
When embedding hints in a query, be aware of the following restrictions:
- Do not embed spaces in the comment characters /* and */.
- In general, spaces are allowed before and after the plus (+) character and hint-name; however, some third-party tools do not support spaces embedded inside /*+.
Supported hints
Vertica supports the following hints:
General hints
Eon Mode hints
Join hints
Hint | Description
-----|------------
SYNTACTIC_JOIN | Enforces join order and enables other join hints.
DISTRIB | Sets the input operations for a distributed join to broadcast, resegment, local, or filter.
GBYTYPE | Specifies which algorithm—GROUPBY HASH or GROUPBY PIPELINED—the Vertica query optimizer should use to implement a GROUP BY clause.
JTYPE | Enforces the join type: merge or hash join.
UTYPE | Specifies how to combine UNION ALL input.
Projection hints
Hint | Description
-----|------------
PROJS | Specifies one or more projections to use for a queried table.
SKIP_PROJS | Specifies which projections to avoid using for a queried table.
Directed query hints
The following hints are only supported by directed queries:
Hint | Description
-----|------------
:c | Marks a query constant that must be included in an input query; otherwise, that input query is disqualified from using the directed query.
:v | Maps an input query constant to one or more annotated query constants.
VERBATIM | Enforces execution of an annotated query exactly as written.
2.8.1 - :c
In a directed query, marks a query constant that must be included in an input query; otherwise, that input query is disqualified from using the directed query.
In a directed query, marks a query constant that must be included in an input query; otherwise, that input query is disqualified from using the directed query.
Syntax
/*+:c*/
Usage
By default, optimizer-generated directed queries set ignore constant (:v
) hints on predicate constants. You can override this behavior by setting the :c
hint on input query constants that must not be ignored. For example, the following statement creates a directed query that can be used only for input queries where the join predicate constant is the same as in the original input query—8
:
=> CREATE DIRECTED QUERY OPTIMIZER simpleJoin_KeepPredicateConstant SELECT * FROM S JOIN T ON S.a = T.b WHERE S.a = 8 /*+:c*/;
CREATE DIRECTED QUERY
=> ACTIVATE DIRECTED QUERY simpleJoin_KeepPredicateConstant;
See also
Conserving Predicate Constants in Directed Queries
2.8.2 - :v
In a directed query, marks an input query constant that the optimizer ignores when it considers whether to use the directed query for a given query.
In a directed query, marks an input query constant that the optimizer ignores when it considers whether to use the directed query for a given query. Use this hint to create a directed query that can be used for multiple variants of an input query.
Vertica also supports IGNORECONST
as an alias of :v
. Optimizer-generated directed queries automatically mark predicate constants in input and annotated queries with :v
hints.
For details, see Ignoring constants in directed queries.
Syntax
/*+:v(arg)*/
/*+IGNORECONST(arg)*/
arg
- Integer argument that is used in the directed query to pair each input query :v hint with one or more annotated query :v hints.
Examples
See Ignoring constants in directed queries.
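For illustration, a minimal sketch modeled on the :c example above (the directed query name and hint argument are arbitrary). Because the predicate constant is marked with :v, the directed query can be used for input queries with any constant in that position:
=> CREATE DIRECTED QUERY OPTIMIZER simpleJoin_IgnoreConstant  -- arbitrary name
   SELECT * FROM S JOIN T ON S.a = T.b WHERE S.a = 8 /*+:v(1)*/;
CREATE DIRECTED QUERY
=> ACTIVATE DIRECTED QUERY simpleJoin_IgnoreConstant;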
2.8.3 - ALLNODES
Qualifies an EXPLAIN statement to request a query plan that assumes all nodes are active.
Qualifies an EXPLAIN statement to request a query plan that assumes all nodes are active. If you omit this hint, the EXPLAIN statement produces a query plan that takes into account any nodes that are currently down.
Syntax
EXPLAIN /*+ALLNODES*/
Examples
In the following example, the ALLNODES
hint requests a query plan that assumes all nodes are active.
QUERY PLAN DESCRIPTION:
------------------------------
Opt Vertica Options
--------------------
PLAN_ALL_NODES_ACTIVE
EXPLAIN /*+ALLNODES*/ select * from Emp_Dimension;
Access Path:
+-STORAGE ACCESS for Emp_Dimension [Cost: 125, Rows: 10K (NO STATISTICS)] (PATH ID: 1)
| Projection: public.Emp_Dimension_b0
| Materialize: Emp_Dimension.Employee_key, Emp_Dimension.Employee_gender, Emp_Dimension.Courtesy_title, Emp_Dimension.Employee_first_name, Emp_Dimension.Employee_middle_initial, Emp_Dimension.Employee_last_name, Emp_Dimension.Employee_age, Emp_Dimension.Employee_birthdate, Emp_Dimension.Employee_street, Emp_Dimension.Employee_city, Emp_Dimension.Employee_state, Emp_Dimension.Employee_region, Emp_Dimension.Employee_position
| Execute on: All Nodes
2.8.4 - DEPOT_FETCH
Specifies whether a query fetches data to the depot from communal storage when the depot lacks data for this query.
Eon Mode only
Specifies whether a query fetches data to the depot from communal storage when the depot lacks data for this query. This hint overrides configuration parameter DepotOperationsForQuery.
Syntax
SELECT /*+DEPOT_FETCH (option)*/
Arguments
option
- Specifies behavior when the depot does not contain queried file data, one of the following:
  - ALL (default): Fetch file data from communal storage; if necessary, displace existing files by evicting them from the depot.
  - FETCHES: Fetch file data from communal storage only if space is available; otherwise, read the queried data directly from communal storage.
  - NONE: Do not fetch file data to the depot; read the queried data directly from communal storage.
Examples
SELECT /*+DEPOT_FETCH(All)*/ count(*) FROM bar;
SELECT /*+DEPOT_FETCH(FETCHES)*/ count(*) FROM bar;
SELECT /*+DEPOT_FETCH(NONE)*/ count(*) FROM bar;
2.8.5 - DISTRIB
The DISTRIB hint specifies to the optimizer how to distribute join key data in order to implement a join.
The DISTRIB hint specifies to the optimizer how to distribute join key data in order to implement a join. If a specified distribution is not feasible, the optimizer ignores the hint and throws a warning.
The following requirements apply:
-
Queries that include the DISTRIB hint must also include the SYNTACTIC_JOIN hint. Otherwise, the optimizer ignores the DISTRIB hint and throws a warning.
-
Join syntax must conform with ANSI SQL-92 join conventions.
Syntax
JOIN /*+DISTRIB(outer-join, inner-join)*/
Arguments
outer-join, inner-join
- Specifies how to distribute data on the outer and inner joins:
  - L (local): Inner and outer join keys are identically segmented on each node, join locally.
  - R (resegment): Inner and outer join keys are not identically segmented. Resegment join-key data before implementing the join.
  - B (broadcast): Inner and outer join keys are not identically segmented. Broadcast data of this join key to other nodes before implementing the join.
  - F (filter): Join table is unsegmented. Filter data as needed by the other join key before implementing the join.
  - A (any): Let the optimizer choose the distribution method that it considers to be most cost-effective.
Examples
In the following query, the join is qualified with a DISTRIB hint of /*+DISTRIB(L,R)*/. This hint tells the optimizer to resegment data of join key stores.store_key before joining it to the sales.store_key data:
SELECT /*+SYNTACTIC_JOIN*/ sales.store_key, stores.store_name, sales.product_description, sales.sales_quantity, sales.sale_date
FROM (store.storeSales AS sales JOIN /*+DISTRIB(L,R),JTYPE(H)*/ store.store_dimension AS stores ON (sales.store_key = stores.store_key))
WHERE (sales.sale_date = '2014-12-01'::date) ORDER BY sales.store_key, sales.sale_date;
2.8.6 - EARLY_MATERIALIZATION
Specifies early materialization of a table for the current query.
Specifies early materialization of a table for the current query. A query can include this hint for any number of tables. Typically, the query optimizer delays materialization until late in the query execution process. This hint overrides any choices that the optimizer otherwise would make.
This hint can be useful in cases where late materialization of join inputs precludes other optimizations—for example, pushing aggregation down the joins, or using live aggregate projections. In these cases, qualifying a join input with EARLY_MATERIALIZATION
can enable the optimizations.
Syntax
table-name [ [AS] alias ] /*+EARLY_MATERIALIZATION*/
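For example, a minimal sketch (using the VMart-style tables that appear elsewhere in this section); the hint qualifies one join input so that its columns are materialized before the join rather than afterward:
SELECT s.store_name, SUM(f.sales_quantity)
FROM store.store_sales_fact f
JOIN store.store_dimension s /*+EARLY_MATERIALIZATION*/
    ON f.store_key = s.store_key
GROUP BY s.store_name;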
2.8.7 - ECSMODE
Sets the ECS strategy that the optimizer uses when it divides responsibility for processing shard data among subscriber nodes.
Eon Mode only
Sets the ECS strategy that the optimizer uses when it divides responsibility for processing shard data among subscriber nodes. This hint is applied only if the subcluster uses elastic crunch scaling (ECS).
Syntax
SELECT /*+ECSMODE(option)*/
Arguments
option
- Specifies the strategy to use when dividing shard data among its subscribing nodes, one of the following:
  - AUTO: The optimizer chooses the strategy to use, useful only if ECS mode is set at the session level (see Setting the ECS Strategy for the Session or Database).
  - IO_OPTIMIZED: Use I/O-optimized strategy.
  - COMPUTE_OPTIMIZED: Use compute-optimized strategy.
  - NONE: Disable use of ECS for this query. Only participating nodes are involved in query execution; collaborating nodes are not.
Example
The following example shows the query plan for a simple single-table query that is forced to use the compute-optimized strategy:
=> EXPLAIN SELECT /*+ECSMode(COMPUTE_OPTIMIZED)*/ employee_last_name,
employee_first_name,employee_age
FROM employee_dimension
ORDER BY employee_age DESC;
QUERY PLAN
--------------------------------------------------------------------------------
------------------------------
QUERY PLAN DESCRIPTION:
The execution of this query involves non-participating nodes.
Crunch scaling strategy preserves data segmentation
------------------------------
. . .
2.8.8 - ENABLE_WITH_CLAUSE_MATERIALIZATION
Enables materialization of all queries in the current WITH clause. Otherwise, materialization is set by configuration parameter WithClauseMaterialization, by default set to 0 (disabled). If WithClauseMaterialization is disabled, materialization is automatically cleared when the primary query of the WITH clause returns. For details, see
Materialization of WITH clause.
Syntax
WITH /*+ENABLE_WITH_CLAUSE_MATERIALIZATION*/
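For example, a minimal sketch (the CTE name is arbitrary; the tables are the VMart-style tables used elsewhere in this section):
WITH /*+ENABLE_WITH_CLAUSE_MATERIALIZATION*/ regional_sales AS (  -- arbitrary CTE name
    SELECT s.store_region, SUM(f.sales_quantity) AS total_qty
    FROM store.store_sales_fact f
    JOIN store.store_dimension s ON f.store_key = s.store_key
    GROUP BY s.store_region
)
SELECT store_region, total_qty
FROM regional_sales
ORDER BY total_qty DESC;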
2.8.9 - GBYTYPE
Specifies which algorithm—GROUPBY HASH or GROUPBY PIPELINED—the Vertica query optimizer should use to implement a GROUP BY clause.
Specifies which algorithm—GROUPBY HASH or GROUPBY PIPELINED—the Vertica query optimizer should use to implement a GROUP BY clause. If both algorithms are valid for this query, the query optimizer chooses the specified algorithm over the algorithm that the query optimizer might otherwise choose in its query plan.
Note
Vertica uses the GROUPBY PIPELINED algorithm only if the query and one of its projections comply with GROUPBY PIPELINED
requirements. Otherwise, Vertica issues a warning and uses GROUPBY HASH.
For more information about both algorithms, see GROUP BY implementation options.
Syntax
GROUP BY /*+GBYTYPE( HASH | PIPE )*/
Arguments
HASH
- Use the GROUPBY HASH algorithm.
PIPE
- Use the GROUPBY PIPELINED algorithm.
Examples
See Controlling GROUPBY Algorithm Choice.
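For instance, a minimal sketch that forces the hash algorithm (using the VMart-style customer_dimension table referenced elsewhere in this section):
SELECT customer_state, COUNT(*)
FROM customer_dimension
GROUP BY /*+GBYTYPE(HASH)*/ customer_state;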
2.8.10 - JFMT
Specifies how to size VARCHAR column data when joining tables on those columns, and buffer that data accordingly.
Specifies how to size VARCHAR column data when joining tables on those columns, and buffer that data accordingly. The JFMT hint overrides the default behavior that is set by configuration parameter JoinDefaultTupleFormat, which can be set at database and session levels.
For more information, see Joining variable length string data.
Syntax
JOIN /*+JFMT(format-type)*/
Arguments
format-type
- Specifies how to format VARCHAR column data when joining tables on those columns, and buffers the data accordingly. Set to one of the following:
  - f (fixed): Use join column metadata to size column data to a fixed length, and buffer accordingly.
  - v (variable): Use the actual length of join column data, so buffer size varies for each join.
For example:
SELECT /*+SYNTACTIC_JOIN*/ s.store_region, SUM(e.vacation_days) TotalVacationDays
FROM public.employee_dimension e
JOIN /*+JFMT(f)*/ store.store_dimension s ON s.store_region=e.employee_region
GROUP BY s.store_region ORDER BY TotalVacationDays;
Requirements
-
Queries that include the JFMT
hint must also include the SYNTACTIC_JOIN
hint. Otherwise, the optimizer ignores the JFMT
hint and throws a warning.
-
Join syntax must conform with ANSI SQL-92 join conventions.
2.8.11 - JTYPE
Specifies the join algorithm as hash or merge.
Specifies the join algorithm as hash or merge.
Use the JTYPE hint to specify the algorithm the optimizer uses to join table data. If the specified algorithm is not feasible, the optimizer ignores the hint and throws a warning.
Syntax
JOIN /*+JTYPE(join-type)*/
Arguments
join-type
- One of the following:
  - H: Hash join
  - M: Merge join, valid only if both join inputs are already sorted on the join columns; otherwise Vertica ignores it and throws a warning. The optimizer relies upon the query or DDL to verify whether input data is sorted, rather than the actual runtime order of the data.
  - FM: Forced merge join. Before performing the merge, the optimizer re-sorts the join inputs. Join columns must be of the same type and precision or scale, except that string columns can have different lengths.
A value of FM is valid only for simple join conditions. For example:
=> SELECT /*+SYNTACTIC_JOIN*/ * FROM x JOIN /*+JTYPE(FM)*/ y ON x.c1 = y.c1;
Requirements
-
Queries that include the JTYPE hint must also include the SYNTACTIC_JOIN hint. Otherwise, the optimizer ignores the JTYPE hint and throws a warning.
-
Join syntax must conform with ANSI SQL-92 join conventions.
2.8.12 - LABEL
Assigns a label to a statement so it can easily be identified to evaluate performance and debug problems.
Assigns a label to a statement so it can easily be identified to evaluate performance and debug problems.
LABEL hints are valid in the following statements:
Syntax
statement-name /*+LABEL (label-string)*/
Arguments
label-string
- A string that is up to 128 octets long. If enclosed with single quotes, label-string can contain embedded spaces.
Examples
See Labeling statements.
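For illustration, a minimal sketch (the label text is arbitrary); the label can later be used to identify the statement in monitoring tables such as QUERY_PROFILES:
SELECT /*+LABEL ('east_region_sales')*/ SUM(sales_dollar_amount)  -- arbitrary label
FROM store.store_sales_fact;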
2.8.13 - PROJS
Specifies one or more projections to use for a queried table.
Specifies one or more projections to use for a queried table.
The PROJS hint can specify multiple projections; the optimizer determines which ones are valid and uses the one that is most cost-effective for the queried table. If no hinted projection is valid, the query returns a warning and ignores projection hints.
Syntax
FROM table-name /*+PROJS( [[database.]schema.]projection[,...] )*/
Arguments
[database.]schema
- Database and schema. The default schema is public. If you specify a database, it must be the current database.
projection
- The projection to use. You can specify a list of comma-delimited projections.
Examples
The employee_dimension
table has two projections: segmented superprojection public.employee_dimension
, which includes all table columns; and the unsegmented projection public.employee_dimension_rep
, which includes a subset of the columns:
=> SELECT export_objects('','employee_dimension');
export_objects
--------------------------------------------------------------------------------------------------------------------------
CREATE TABLE public.employee_dimension
(
employee_key int NOT NULL,
employee_gender varchar(8),
courtesy_title varchar(8),
employee_first_name varchar(64),
employee_middle_initial varchar(8),
employee_last_name varchar(64),
employee_age int,
hire_date date,
employee_street_address varchar(256),
employee_city varchar(64),
employee_state char(2),
employee_region char(32),
job_title varchar(64),
reports_to int,
salaried_flag int,
annual_salary int,
hourly_rate float,
vacation_days int,
CONSTRAINT C_PRIMARY PRIMARY KEY (employee_key) DISABLED
);
CREATE PROJECTION public.employee_dimension
...
AS
SELECT employee_dimension.employee_key,
employee_dimension.employee_gender,
employee_dimension.courtesy_title,
employee_dimension.employee_first_name,
employee_dimension.employee_middle_initial,
employee_dimension.employee_last_name,
employee_dimension.employee_age,
employee_dimension.hire_date,
employee_dimension.employee_street_address,
employee_dimension.employee_city,
employee_dimension.employee_state,
employee_dimension.employee_region,
employee_dimension.job_title,
employee_dimension.reports_to,
employee_dimension.salaried_flag,
employee_dimension.annual_salary,
employee_dimension.hourly_rate,
employee_dimension.vacation_days
FROM public.employee_dimension
ORDER BY employee_dimension.employee_key
SEGMENTED BY hash(employee_dimension.employee_key) ALL NODES KSAFE 1;
CREATE PROJECTION public.employee_dimension_rep
...
AS
SELECT employee_dimension.employee_key,
employee_dimension.employee_gender,
employee_dimension.employee_first_name,
employee_dimension.employee_middle_initial,
employee_dimension.employee_last_name,
employee_dimension.employee_age,
employee_dimension.employee_street_address,
employee_dimension.employee_city,
employee_dimension.employee_state,
employee_dimension.employee_region
FROM public.employee_dimension
ORDER BY employee_dimension.employee_key
UNSEGMENTED ALL NODES;
SELECT MARK_DESIGN_KSAFE(1);
(1 row)
The following query selects all table columns from employee_dimension
and includes the PROJS hint, which specifies both projections. public.employee_dimension_rep
does not include all columns in the queried table, so the optimizer cannot use it. The segmented projection includes all table columns so the optimizer uses it, as verified by the following query plan:
=> EXPLAIN SELECT * FROM employee_dimension /*+PROJS('public.employee_dimension_rep', 'public.employee_dimension')*/;
QUERY PLAN DESCRIPTION:
------------------------------
EXPLAIN SELECT * FROM employee_dimension /*+PROJS('public.employee_dimension_rep', 'public.employee_dimension')*/;
Access Path:
+-STORAGE ACCESS for employee_dimension [Cost: 177, Rows: 10K (NO STATISTICS)] (PATH ID: 1)
| Projection: public.employee_dimension_b0
2.8.14 - SKIP_PROJS
Specifies which projections to avoid using for a queried table.
Specifies which projections to avoid using for a queried table. If SKIP_PROJS
excludes all available projections that are valid for the query, the optimizer issues a warning and ignores the projection hints.
Syntax
FROM table-name /*+SKIP_PROJS( [[database.]schema.]projection[,...] )*/
Arguments
[database.]schema
- Database and schema. The default schema is public. If you specify a database, it must be the current database.
projection
- A projection to skip. You can specify a list of comma-delimited projections.
Examples
In this example, the EXPLAIN output shows that the optimizer uses the projection public.employee_dimension_b0
for a given query:
QUERY PLAN DESCRIPTION:
------------------------------
EXPLAIN SELECT employee_last_name, employee_first_name, employee_city, job_title FROM employee_dimension;
Access Path:
+-STORAGE ACCESS for employee_dimension [Cost: 59, Rows: 10K (NO STATISTICS)] (PATH ID: 1)
| Projection: public.employee_dimension_b0
| Materialize: employee_dimension.employee_first_name, employee_dimension.employee_last_name, employee_dimension.employee_city, employee_dimension.job_title
| Execute on: All Nodes
You can use the SKIP_PROJS
hint to avoid using this projection. If another projection is available that is valid for this query, the optimizer uses it instead:
QUERY PLAN DESCRIPTION:
------------------------------
EXPLAIN SELECT employee_last_name, employee_first_name, employee_city, job_title FROM employee_dimension /*+SKIP_PROJS('public.employee_dimension')*/;
Access Path:
+-STORAGE ACCESS for employee_dimension [Cost: 156, Rows: 10K (NO STATISTICS)] (PATH ID: 1)
| Projection: public.employee_dimension_super
| Materialize: employee_dimension.employee_first_name, employee_dimension.employee_last_name, employee_dimension.emplo
yee_city, employee_dimension.job_title
| Execute on: Query Initiator
2.8.15 - SKIP_STATISTICS
Directs the optimizer to produce a query plan that incorporates only the minimal statistics that are collected by ANALYZE_ROW_COUNT.
Directs the optimizer to produce a query plan that incorporates only the minimal statistics that are collected by ANALYZE_ROW_COUNT. The optimizer ignores other statistics that would otherwise be used, such as those generated by ANALYZE_STATISTICS and ANALYZE_STATISTICS_PARTITION. This hint is especially useful for queries on small tables, where the amount of time required to collect full statistics is often greater than actual execution time.
Syntax
SELECT /*+SKIP_STAT[ISTIC]S*/
EXPLAIN output
EXPLAIN
returns the following output for a query that includes SKIP_STATISTICS
(using its shortened form SKIP_STATS
):
=> EXPLAIN SELECT /*+ SKIP_STATS*/ customer_key, customer_name, customer_gender, customer_city||', '||customer_state, customer_age
FROM customer_dimension WHERE customer_region = 'East' AND customer_age > 60;
QUERY PLAN DESCRIPTION:
------------------------------
EXPLAIN SELECT /*+ SKIP_STATS*/ customer_key, customer_name, customer_gender, customer_city||', '||customer_state,
customer_age FROM customer_dimension WHERE customer_region = 'East' AND customer_age > 60;
Access Path:
+-STORAGE ACCESS for customer_dimension [Cost: 2K, Rows: 10K (STATISTICS SKIPPED)] (PATH ID: 1)
| Projection: public.customer_dimension_b0
| Materialize: public.customer_dimension.customer_age, public.customer_dimension.customer_key, public.customer_dimensi
on.customer_name, public.customer_dimension.customer_gender, public.customer_dimension.customer_city, public.customer_di
mension.customer_state
| Filter: (public.customer_dimension.customer_region = 'East')
| Filter: (public.customer_dimension.customer_age > 60)
| Execute on: All Nodes
...
2.8.16 - SYNTACTIC_JOIN
Enforces join order and enables other join hints.
Enforces join order and enables other join hints.
In order to achieve optimal performance, the optimizer often overrides a query's specified join order. By including the SYNTACTIC_JOIN
hint, you can ensure that the optimizer enforces the query's join order exactly as specified. One requirement applies: the join syntax must conform with ANSI SQL-92 conventions.
The SYNTACTIC_JOIN
hint must immediately follow SELECT
. If the annotated query includes another hint that must also follow SELECT
, such as VERBATIM
, combine the two hints together. For example:
SELECT /*+ syntactic_join,verbatim*/
Syntax
SELECT /*+SYN[TACTIC]_JOIN*/
Examples
In the following examples, the optimizer produces different plans for two queries that differ only by including or excluding the SYNTACTIC_JOIN
hint.
Excludes SYNTACTIC_JOIN:
EXPLAIN SELECT sales.store_key, stores.store_name, products.product_description, sales.sales_quantity, sales.sale_date
FROM (store.store_sales sales JOIN products ON sales.product_key=products.product_key)
JOIN store.store_dimension stores ON sales.store_key=stores.store_key
WHERE sales.sale_date='2014-12-01' order by sales.store_key, sales.sale_date;
Access Path:
+-SORT [Cost: 14K, Rows: 100K (NO STATISTICS)] (PATH ID: 1)
| Order: sales.store_key ASC, sales.sale_date ASC
| Execute on: All Nodes
| +---> JOIN HASH [Cost: 11K, Rows: 100K (NO STATISTICS)] (PATH ID: 2) Outer (RESEGMENT)(LOCAL ROUND ROBIN) Inner (RESEGMENT)
| | Join Cond: (sales.product_key = products.product_key)
| | Materialize at Input: sales.store_key, sales.product_key, sales.sale_date, sales.sales_quantity
| | Execute on: All Nodes
| | +-- Outer -> JOIN HASH [Cost: 1K, Rows: 100K (NO STATISTICS)] (PATH ID: 3)
| | | Join Cond: (sales.store_key = stores.store_key)
| | | Execute on: All Nodes
| | | +-- Outer -> STORAGE ACCESS for sales [Cost: 1K, Rows: 100K (NO STATISTICS)] (PATH ID: 4)
| | | | Projection: store.store_sales_b0
| | | | Materialize: sales.store_key
| | | | Filter: (sales.sale_date = '2014-12-01'::date)
| | | | Execute on: All Nodes
| | | | Runtime Filter: (SIP1(HashJoin): sales.store_key)
| | | +-- Inner -> STORAGE ACCESS for stores [Cost: 34, Rows: 250] (PATH ID: 5)
| | | | Projection: store.store_dimension_DBD_10_rep_VMartDesign_node0001
| | | | Materialize: stores.store_key, stores.store_name
| | | | Execute on: All Nodes
| | +-- Inner -> STORAGE ACCESS for products [Cost: 3K, Rows: 60K (NO STATISTICS)] (PATH ID: 6)
| | | Projection: public.products_b0
| | | Materialize: products.product_key, products.product_description
| | | Execute on: All Nodes
Includes SYNTACTIC_JOIN:
EXPLAIN SELECT /*+SYNTACTIC_JOIN*/ sales.store_key, stores.store_name, products.product_description, sales.sales_quantity, sales.sale_date
FROM (store.store_sales sales JOIN products ON sales.product_key=products.product_key)
JOIN store.store_dimension stores ON sales.store_key=stores.store_key
WHERE sales.sale_date='2014-12-01' order by sales.store_key, sales.sale_date;
Access Path:
+-SORT [Cost: 11K, Rows: 100K (NO STATISTICS)] (PATH ID: 1)
| Order: sales.store_key ASC, sales.sale_date ASC
| Execute on: All Nodes
| +---> JOIN HASH [Cost: 8K, Rows: 100K (NO STATISTICS)] (PATH ID: 2)
| | Join Cond: (sales.store_key = stores.store_key)
| | Execute on: All Nodes
| | +-- Outer -> JOIN HASH [Cost: 7K, Rows: 100K (NO STATISTICS)] (PATH ID: 3) Outer (BROADCAST)(LOCAL ROUND ROBIN)
| | | Join Cond: (sales.product_key = products.product_key)
| | | Execute on: All Nodes
| | | Runtime Filter: (SIP1(HashJoin): sales.store_key)
| | | +-- Outer -> STORAGE ACCESS for sales [Cost: 2K, Rows: 100K (NO STATISTICS)] (PATH ID: 4)
| | | | Projection: store.store_sales_b0
| | | | Materialize: sales.sale_date, sales.store_key, sales.product_key, sales.sales_quantity
| | | | Filter: (sales.sale_date = '2014-12-01'::date)
| | | | Execute on: All Nodes
| | | +-- Inner -> STORAGE ACCESS for products [Cost: 3K, Rows: 60K (NO STATISTICS)] (PATH ID: 5)
| | | | Projection: public.products_b0
| | | | Materialize: products.product_key, products.product_description
| | | | Execute on: All Nodes
| | +-- Inner -> STORAGE ACCESS for stores [Cost: 34, Rows: 250] (PATH ID: 6)
| | | Projection: store.store_dimension_DBD_10_rep_VMartDesign_node0001
| | | Materialize: stores.store_key, stores.store_name
| | | Execute on: All Nodes
2.8.17 - UTYPE
Specifies how to combine UNION ALL input.
Specifies how to combine UNION ALL input.
Syntax
UNION ALL /*+UTYPE(union-type)*/
Arguments
union-type
- One of the following values:
  - U: Concatenates UNION ALL input (default).
  - M: Merges UNION ALL input in the same sort order as the source query results. This option requires all input from the source queries to use the same sort order; otherwise, Vertica throws a warning and concatenates the UNION ALL input.
Note
The optimizer relies upon the query or DDL to verify whether input data is sorted, rather than the actual runtime order of the data.
Requirements
Queries that include the UTYPE hint must also include the SYNTACTIC_JOIN hint. Otherwise, the optimizer ignores the UTYPE hint and throws a warning.
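For example, the following sketch (using hypothetical tables t1 and t2 with compatible columns) shows where the two hints are placed. Whether the merge succeeds depends on whether both source query results share the same sort order, for example through identically sorted projections; otherwise Vertica warns and concatenates the input:
=> SELECT /*+SYNTACTIC_JOIN*/ a, b FROM t1
   UNION ALL /*+UTYPE(M)*/
   SELECT a, b FROM t2;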
2.8.18 - VERBATIM
Enforces execution of an annotated query exactly as written.
Enforces execution of an annotated query exactly as written.
VERBATIM directs the optimizer to create a query plan that incorporates all hints in an annotated query. Furthermore, it directs the optimizer not to apply its own plan development processing to query plan components that pertain to those hints.
Usage of this hint varies between optimizer-generated and custom directed queries, as described below.
Syntax
SELECT /*+ VERBATIM*/
Requirements
The VERBATIM hint must immediately follow SELECT. If the annotated query includes another hint that must also follow SELECT, such as SYNTACTIC_JOIN, combine the two hints in the same comment. For example:
SELECT /*+ syntactic_join,verbatim*/
Optimizer-generated directed queries
The optimizer always includes the VERBATIM hint in the annotated queries that it generates for directed queries. For example, given the following CREATE DIRECTED QUERY OPTIMIZER statement:
=> CREATE DIRECTED QUERY OPTIMIZER getStoreSales SELECT sales.store_key, stores.store_name, sales.product_description, sales.sales_quantity, sales.sale_date FROM store.storesales sales JOIN store.store_dimension stores ON sales.store_key=stores.store_key WHERE sales.sale_date='2014-12-01' /*+IGNORECONST(1)*/ AND stores.store_name='Store1' /*+IGNORECONST(2)*/ ORDER BY sales.store_key, sales.sale_date;
CREATE DIRECTED QUERY
The optimizer generates an annotated query that includes the VERBATIM hint:
=> SELECT query_name, annotated_query FROM V_CATALOG.DIRECTED_QUERIES WHERE query_name = 'getStoreSales';
-[ RECORD 1 ]---+------
query_name | getStoreSales
annotated_query | SELECT /*+ syntactic_join,verbatim*/ sales.store_key AS store_key, stores.store_name AS store_name, sales.product_description AS product_description, sales.sales_quantity AS sales_quantity, sales.sale_date AS sale_date
FROM (store.storeSales AS sales/*+projs('store.storeSales')*/ JOIN /*+Distrib(L,L),JType(H)*/ store.store_dimension AS stores/*+projs('store.store_dimension_DBD_10_rep_VMartDesign')*/ ON (sales.store_key = stores.store_key))
WHERE (sales.sale_date = '2014-12-01'::date /*+IgnoreConst(1)*/) AND (stores.store_name = 'Store1'::varchar(6) /*+IgnoreConst(2)*/)
ORDER BY 1 ASC, 5 ASC
When the optimizer uses this directed query, it produces a query plan that is equivalent to the query plan that it used when it created the directed query:
=> ACTIVATE DIRECTED QUERY getStoreSales;
ACTIVATE DIRECTED QUERY
=> EXPLAIN SELECT sales.store_key, stores.store_name, sales.product_description, sales.sales_quantity, sales.sale_date FROM store.storesales sales JOIN store.store_dimension stores ON sales.store_key=stores.store_key WHERE sales.sale_date='2014-12-04' AND stores.store_name='Store14' ORDER BY sales.store_key, sales.sale_date;
QUERY PLAN DESCRIPTION:
------------------------------
EXPLAIN SELECT sales.store_key, stores.store_name, sales.product_description, sales.sales_quantity, sales.sale_date FROM store.storesales sales JOIN store.store_dimension stores ON sales.store_key=stores.store_key WHERE sales.sale_date='2014-12-04' AND stores.store_name='Store14' ORDER BY sales.store_key, sales.sale_date;
The following active directed query(query name: getStoreSales) is being executed:
SELECT /*+syntactic_join,verbatim*/ sales.store_key, stores.store_name, sales.product_description, sales.sales_quantity, sales.sale_date
FROM (store.storeSales sales/*+projs('store.storeSales')*/ JOIN /*+Distrib('L', 'L'), JType('H')*/store.store_dimension stores
/*+projs('store.store_dimension_DBD_10_rep_VMartDesign')*/ ON ((sales.store_key = stores.store_key))) WHERE ((sales.sale_date = '2014-12-04'::date)
AND (stores.store_name = 'Store14'::varchar(7))) ORDER BY sales.store_key, sales.sale_date
Access Path:
+-JOIN HASH [Cost: 463, Rows: 622 (NO STATISTICS)] (PATH ID: 2)
| Join Cond: (sales.store_key = stores.store_key)
| Materialize at Output: sales.sale_date, sales.sales_quantity, sales.product_description
| Execute on: All Nodes
| +-- Outer -> STORAGE ACCESS for sales [Cost: 150, Rows: 155K (NO STATISTICS)] (PATH ID: 3)
| | Projection: store.storeSales_b0
| | Materialize: sales.store_key
| | Filter: (sales.sale_date = '2014-12-04'::date)
| | Execute on: All Nodes
| | Runtime Filter: (SIP1(HashJoin): sales.store_key)
| +-- Inner -> STORAGE ACCESS for stores [Cost: 35, Rows: 2] (PATH ID: 4)
| | Projection: store.store_dimension_DBD_10_rep_VMartDesign_node0001
| | Materialize: stores.store_name, stores.store_key
| | Filter: (stores.store_name = 'Store14')
| | Execute on: All Nodes
Custom directed queries
The VERBATIM hint is included in a custom directed query only if you explicitly include it in the annotated query that you write for that directed query. When the optimizer uses that directed query, it respects the VERBATIM hint and creates a query plan accordingly.
If you omit the VERBATIM hint when you create a custom directed query, the hint is not stored with the annotated query. When the optimizer uses that directed query, it applies its own plan development processing on the annotated query before it generates a query plan. This query plan might not be equivalent to the query plan that the optimizer would have generated for the Vertica version in which the directed query was created.
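For example, an annotated query that you write for a custom directed query might explicitly include the VERBATIM hint together with other hints. The following is a sketch only; the join and projection hints shown are illustrative:
SELECT /*+ syntactic_join,verbatim*/ sales.store_key, stores.store_name
FROM (store.store_sales sales JOIN /*+JType(H)*/ store.store_dimension stores
      ON sales.store_key = stores.store_key)
WHERE sales.sale_date = '2014-12-01' /*+IgnoreConst(1)*/;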
2.9 - Window clauses
When used with an analytic function, window clauses specify how to partition and sort function input, as well as how to frame input with respect to the current row.
When used with an analytic function, window clauses specify how to partition and sort function input, as well as how to frame input with respect to the current row. The PARTITION ROW and PARTITION LEFT JOIN window clauses instead support single-row partitions for single-phase transform functions, rather than analytic functions.
2.9.1 - Window partition clause
When specified, a window partition clause divides the rows of the function input based on user-provided expressions.
When specified, a window partition clause divides the rows of the function input based on user-provided expressions. If no expression is provided, the partition clause can improve query performance by using parallelism. If you do not specify a window partition clause, all input rows are treated as a single partition.
Window partitioning is similar to the GROUP BY clause. However, PARTITION BEST and PARTITION NODES may only be used with analytic functions and return only one result per input row, while PARTITION ROW and PARTITION LEFT JOIN can be used for single-phase transform functions and return multiple values per input row.
When used with analytic functions, results are computed per partition and start over again (reset) at the beginning of each subsequent partition.
Syntax
{ PARTITION BY expression[,...]
| PARTITION BEST
| PARTITION NODES
| PARTITION ROW
| PARTITION LEFT JOIN }
Arguments
PARTITION BY expression[,...]
- Expression on which to partition the input rows, where expression can be a column, constant, or an arbitrary expression formed on columns. Use PARTITION BY for functions that have specific partitioning requirements.
PARTITION BEST
- Use parallelism to improve performance for multi-threaded queries across multiple nodes. OVER(PARTITION BEST) provides the best performance on multi-threaded queries across multiple nodes.
  The following considerations apply to using PARTITION BEST:
  - Use PARTITION BEST for analytic functions that have no partitioning requirements and are thread safe—for example, a one-to-many transform.
  - Do not use PARTITION BEST on user-defined transform functions (UDTFs) that are not thread safe. Doing so can produce an error or incorrect results. If a UDTF is not thread safe, use PARTITION NODES.
PARTITION NODES
- Use parallelism to improve performance for single-threaded queries across multiple nodes. OVER(PARTITION NODES) provides the best performance on single-threaded queries across multiple nodes.
PARTITION ROW, PARTITION LEFT JOIN
- Use to feed input partitions of exactly one row. If used, any arbitrary expression may be used in the query target list alongside the UDTF. PARTITION LEFT JOIN returns a row of NULLs if an input row would otherwise produce no output. These clauses may not be used with analytic functions or multi-phase transform functions. Only one PARTITION ROW transform function is allowed in the target list at each level of the query.
Examples
See Window partitioning.
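As a minimal sketch (assuming a hypothetical employee table with deptno and sal columns), the following query ranks salaries separately within each department:
=> SELECT deptno, sal,
          RANK() OVER (PARTITION BY deptno ORDER BY sal DESC) AS sal_rank
   FROM employee;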
2.9.2 - Window order clause
Specifies how to sort rows that are supplied to an analytic function.
Specifies how to sort rows that are supplied to an analytic function. If the OVER clause also includes a window partition clause, rows are sorted within each partition.
The window order clause only specifies order within a window result set. The query can have its own ORDER BY clause outside the OVER clause. This has precedence over the window order clause and orders the final result set.
A window order clause also creates a default window frame if none is explicitly specified.
Syntax
ORDER BY { expression [ ASC | DESC [ NULLS { FIRST | LAST | AUTO } ] ]
}[,...]
Arguments
expression
- A column, constant, or arbitrary expression formed on columns on which to sort input rows.
ASC | DESC
- Sort order: ascending (default) or descending.
NULLS {FIRST | LAST | AUTO}
- How to position nulls. NULLS AUTO means to choose the positioning that is most efficient for this query. ASC defaults to NULLS LAST and DESC defaults to NULLS FIRST. If you omit all sort qualifiers, Vertica uses ASC NULLS LAST.
For more information, see NULL sort order and Runtime sorting of NULL values in analytic functions.
Examples
See Window ordering.
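For instance, a minimal sketch (again assuming a hypothetical employee table) that numbers rows from highest to lowest salary:
=> SELECT sal,
          ROW_NUMBER() OVER (ORDER BY sal DESC NULLS LAST) AS rn
   FROM employee;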
2.9.3 - Window frame clause
Specifies a window frame, which comprises a set of rows relative to the row that is currently being evaluated by an analytic function.
Specifies a window frame, which comprises a set of rows relative to the row that is currently being evaluated by an analytic function. After the function processes that row and its window, Vertica advances the current row and adjusts the window boundaries accordingly. If the OVER clause also specifies a partition, Vertica also checks that window boundaries do not cross partition boundaries. This process repeats until the function evaluates the last row of the last partition.
Syntax
{ ROWS | RANGE } { BETWEEN start-point AND end-point } | start-point
start-point | end-point:
{ UNBOUNDED {PRECEDING | FOLLOWING}
| CURRENT ROW
| constant-value {PRECEDING | FOLLOWING}}
Arguments
ROWS | RANGE
- Whether to interpret window frame dimensions as physical (ROWS) or logical (RANGE) offsets from the current row. See ROWS versus RANGE below for details.
BETWEEN start-point AND end-point
- First and last rows of the window, where start-point and end-point can be one of the following:
  - UNBOUNDED {PRECEDING | FOLLOWING}: The current partition's first (PRECEDING) or last (FOLLOWING) row.
  - CURRENT ROW: The current row or value.
  - constant-value {PRECEDING | FOLLOWING}: A constant value or expression that evaluates to a constant value. This value is interpreted as either a physical or logical offset from the current row, depending on whether you use ROWS or RANGE. See ROWS versus RANGE for other restrictions.
  start-point must resolve to a row or value that is less than or equal to end-point.
start-point
- If ROWS or RANGE specifies only a start point, Vertica uses the current row as the end point and creates the window frame accordingly. In this case, start-point must resolve to a row that is less than or equal to the current row.
Requirements
To specify a window frame, the OVER clause must also include a window order (ORDER BY) clause. If the OVER clause includes a window order clause but omits a window frame clause, the function creates a default window that extends from the first row in the current partition to the current row. This is equivalent to the following clause:
RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
ROWS versus RANGE
The window frame's offset from the current row can be physical or logical:
- ROWS (physical): the start and end points are relative to the current row. If either is a constant value, it must evaluate to a positive integer.
- RANGE (logical): the start and end points represent a logical offset, such as time. The range value must match the window order (ORDER BY) clause data type: NUMERIC, DATE/TIME, FLOAT or INTEGER.
When setting constant values for ROWS, the constant must evaluate to a positive INTEGER.
When setting constant values for RANGE, the following requirements apply:
- The constant must evaluate to a positive numeric value or INTERVAL literal.
- If the constant evaluates to a NUMERIC value, the ORDER BY column type must be a NUMERIC data type.
- If the constant evaluates to an INTERVAL DAY TO SECOND subtype, the ORDER BY column type must be one of the following: TIMESTAMP, TIME, DATE, or INTERVAL DAY TO SECOND.
- If the constant evaluates to an INTERVAL YEAR TO MONTH, the ORDER BY column type must be one of the following: TIMESTAMP, DATE, or INTERVAL YEAR TO MONTH.
- The window order clause can specify only one expression.
Examples
See Window framing.
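For instance, a minimal sketch (hypothetical employee table) that computes a moving average over the current row and the two preceding rows:
=> SELECT sal,
          AVG(sal) OVER (ORDER BY sal ROWS BETWEEN 2 PRECEDING AND CURRENT ROW) AS moving_avg
   FROM employee;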
2.9.4 - Window name clause
Defines a named window that specifies window partition and order clauses for an analytic function.
Defines a named window that specifies window partition and order clauses for an analytic function. This window is specified in the function's OVER
clause. Named windows can be useful when you write queries that invoke multiple analytic functions with similar OVER
clauses, such as functions that use the same partition (PARTITION BY
) clauses.
Syntax
WINDOW window-name AS ( window-partition-clause [window-order-clause] )
Arguments
WINDOW window-name
- A window name that is unique within the same query.
window-partition-clause [window-order-clause]
- Clauses to invoke when an OVER clause references this window. If the window definition omits a window order clause, the OVER clause can specify its own order clause.
Requirements
- A WINDOW clause cannot include a window frame clause.
- Each WINDOW clause within the same query must have a unique name.
- A WINDOW clause can reference another window that is already named. For example, the following query names window w1 before w2. Thus, the WINDOW clause that defines w2 can reference w1:
=> SELECT RANK() OVER(w1 ORDER BY sal DESC), RANK() OVER w2
FROM EMP WINDOW w1 AS (PARTITION BY deptno), w2 AS (w1 ORDER BY sal);
Examples
See Named windows.
See also
Analytic functions
3 - Data types
The following table summarizes the internal data types that Vertica supports.
The following table summarizes the internal data types that Vertica supports. It also shows the default placement of null values in projections. The Size column lists uncompressed bytes.
Data Type | Size / bytes | Description | NULL Sorting
Binary
BINARY | 1 to 65,000 | Fixed-length binary string | NULLS LAST
VARBINARY (synonyms: BYTEA, RAW) | 1 to 65,000 | Variable-length binary string | NULLS LAST
LONG VARBINARY | 1 to 32,000,000 | Long variable-length binary string | NULLS LAST
Boolean
BOOLEAN | 1 | True or False or NULL | NULLS LAST
Character / Long
CHAR | 1 to 65,000 | Fixed-length character string | NULLS LAST
VARCHAR | 1 to 65,000 | Variable-length character string | NULLS LAST
LONG VARCHAR | 1 to 32,000,000 | Long variable-length character string | NULLS LAST
Date/Time
DATE | 8 | A month, day, and year | NULLS FIRST
TIME | 8 | A time of day without timezone | NULLS FIRST
TIME WITH TIMEZONE | 8 | A time of day with timezone | NULLS FIRST
TIMESTAMP (synonyms: DATETIME, SMALLDATETIME) | 8 | A date and time without timezone | NULLS FIRST
TIMESTAMP WITH TIMEZONE | 8 | A date and time with timezone | NULLS FIRST
INTERVAL | 8 | The difference between two points in time | NULLS FIRST
INTERVAL DAY TO SECOND | 8 | An interval measured in days and seconds | NULLS FIRST
INTERVAL YEAR TO MONTH | 8 | An interval measured in years and months | NULLS FIRST
Approximate Numeric
DOUBLE PRECISION | 8 | Signed 64-bit IEEE floating point number, requiring 8 bytes of storage | NULLS LAST
FLOAT | 8 | Signed 64-bit IEEE floating point number, requiring 8 bytes of storage | NULLS LAST
FLOAT(n) | 8 | Signed 64-bit IEEE floating point number, requiring 8 bytes of storage | NULLS LAST
FLOAT8 | 8 | Signed 64-bit IEEE floating point number, requiring 8 bytes of storage | NULLS LAST
REAL | 8 | Signed 64-bit IEEE floating point number, requiring 8 bytes of storage | NULLS LAST
Exact Numeric
INTEGER | 8 | Signed 64-bit integer, requiring 8 bytes of storage | NULLS FIRST
INT | 8 | Signed 64-bit integer, requiring 8 bytes of storage | NULLS FIRST
BIGINT | 8 | Signed 64-bit integer, requiring 8 bytes of storage | NULLS FIRST
INT8 | 8 | Signed 64-bit integer, requiring 8 bytes of storage | NULLS FIRST
SMALLINT | 8 | Signed 64-bit integer, requiring 8 bytes of storage | NULLS FIRST
TINYINT | 8 | Signed 64-bit integer, requiring 8 bytes of storage | NULLS FIRST
DECIMAL | 8+ | 8 bytes for the first 18 digits of precision, plus 8 bytes for each additional 19 digits | NULLS FIRST
NUMERIC | 8+ | 8 bytes for the first 18 digits of precision, plus 8 bytes for each additional 19 digits | NULLS FIRST
NUMBER | 8+ | 8 bytes for the first 18 digits of precision, plus 8 bytes for each additional 19 digits | NULLS FIRST
MONEY | 8+ | 8 bytes for the first 18 digits of precision, plus 8 bytes for each additional 19 digits | NULLS FIRST
Spatial
GEOMETRY | 1 to 10,000,000 | Coordinates expressed as (x,y) pairs, defined in the Cartesian plane | NULLS LAST
GEOGRAPHY | 1 to 10,000,000 | Coordinates expressed in longitude/latitude angular values, measured in degrees | NULLS LAST
UUID
UUID | 16 | Stores universally unique identifiers (UUIDs) | NULLS FIRST
Complex
ARRAY | 1 to 32,000,000 | Collection of values of a primitive or complex type | Native array: same as the element type; non-native array: cannot be used to order projections
ROW | 1 to 32,000,000 | Structure of property-value pairs | Cannot be used to order projections
SET | 1 to 32,000,000 | Collection of unique values of a primitive type | Same as the primitive type
3.1 - Binary data types (BINARY and VARBINARY)
Store raw-byte data, such as IP addresses, up to 65000 bytes.
Store raw-byte data, such as IP addresses, up to 65000 bytes. The BINARY and BINARY VARYING (VARBINARY) data types are collectively referred to as binary string types and the values of binary string types are referred to as binary strings. A binary string is a sequence of octets or bytes.
BYTEA and RAW are synonyms for VARBINARY.
Syntax
BINARY ( length )
{ VARBINARY | BINARY VARYING | BYTEA | RAW } ( max-length )
Arguments
length, max-length
- The length of the string or column width, in bytes (octets).
BINARY and VARBINARY data types
BINARY and VARBINARY data types have the following attributes:
-
BINARY: A fixed-width string of length
bytes, where the number of bytes is declared as an optional specifier to the type. If length
is omitted, the default is 1. Where necessary, values are right-extended to the full width of the column with the zero byte. For example:
=> SELECT TO_HEX('ab'::BINARY(4));
to_hex
----------
61620000
-
VARBINARY: A variable-width string up to a length of max-length
bytes, where the maximum number of bytes is declared as an optional specifier to the type. If max-length is omitted, the default is 80 bytes; the maximum length is 65000 bytes. VARBINARY values are not extended to the full width of the column. For example:
=> SELECT TO_HEX('ab'::VARBINARY(4));
to_hex
--------
6162
You can use several formats when working with binary values. The hexadecimal format is generally the most straightforward and is emphasized in Vertica documentation.
Binary values can also be represented in octal format by prefixing the value with a backslash '\'
.
Note
If you use vsql
, you must use the escape character (\
) when you insert another backslash on input; for example, input '\141'
as '\\141'
.
You can also input values represented by printable characters. For example, the hexadecimal value '0x61'
can also be represented by the symbol a
.
See Data load.
On input, strings are translated from hexadecimal representation to binary values using the HEX_TO_BINARY function, and from bitstring representation using the BITSTRING_TO_BINARY function. Both functions take a VARCHAR argument and return a VARBINARY value.
Like the input format, the output format is a hybrid of octal codes and printable ASCII characters. A byte in the range of printable ASCII characters (the range [0x20, 0x7e]
) is represented by the corresponding ASCII character, with the exception of the backslash ('\'
), which is escaped as '\\'
. All other byte values are represented by their corresponding octal values. For example, the bytes {97,92,98,99}, which in ASCII are {a,\,b,c}
, are translated to text as 'a\\bc'
.
Binary operators and functions
The binary operators &
, ~
, |
, and #
have special behavior for binary data types, as described in Bitwise operators.
The aggregate functions BIT_AND, BIT_OR, BIT_XOR, MAX, and MIN are supported for binary data types. BIT_AND, BIT_OR, and BIT_XOR are bit-wise operations that are applied to each non-null value in a group, while MAX and MIN are byte-wise comparisons of binary values.
Like their binary operator counterparts, if the values in a group vary in length, the aggregate functions treat the values as though they are all equal in length by extending shorter values with zero bytes to the full width of the column. For example, given a group containing the values 'ff', null, and 'f'
, a binary aggregate ignores the null value and treats the value 'f'
as 'f0'
. Also, like their binary operator counterparts, these aggregate functions operate on VARBINARY types explicitly and operate on BINARY types implicitly through casts. See Data type coercion operators (CAST).
Binary versus character data types
The BINARY and VARBINARY binary types are similar to the CHAR and VARCHAR character data types, respectively. They differ as follows:
-
Binary data types contain byte strings (a sequence of octets or bytes).
-
Character data types contain character strings (text).
-
The lengths of binary data types are measured in bytes, while character data types are measured in characters.
Examples
The following example shows HEX_TO_BINARY and TO_HEX usage.
Table t
and its projection are created with binary columns:
=> CREATE TABLE t (c BINARY(1));
=> CREATE PROJECTION t_p (c) AS SELECT c FROM t;
Insert minimum byte and maximum byte values:
=> INSERT INTO t values(HEX_TO_BINARY('0x00'));
=> INSERT INTO t values(HEX_TO_BINARY('0xFF'));
Binary values can then be formatted in hex on output using the TO_HEX function:
=> SELECT TO_HEX(c) FROM t;
to_hex
--------
00
ff
(2 rows)
The BIT_AND, BIT_OR, and BIT_XOR functions are most useful when operating on a group of values. The example that follows creates a sample table t with a single column c of VARBINARY data:
=> CREATE TABLE t ( c VARBINARY(2) );
=> INSERT INTO t values(HEX_TO_BINARY('0xFF00'));
=> INSERT INTO t values(HEX_TO_BINARY('0xFFFF'));
=> INSERT INTO t values(HEX_TO_BINARY('0xF00F'));
Query table t
to see column c
output:
=> SELECT TO_HEX(c) FROM t;
TO_HEX
--------
ff00
ffff
f00f
(3 rows)
Now issue the bitwise AND operation. Because these are aggregate functions, an implicit GROUP BY operation is performed on results using (ff00&(ffff)&f00f)
:
=> SELECT TO_HEX(BIT_AND(c)) FROM t;
TO_HEX
--------
f000
(1 row)
Issue the bitwise OR operation on (ff00|(ffff)|f00f)
:
=> SELECT TO_HEX(BIT_OR(c)) FROM t;
TO_HEX
--------
ffff
(1 row)
Issue the bitwise XOR operation on (ff00#(ffff)#f00f)
:
=> SELECT TO_HEX(BIT_XOR(c)) FROM t;
TO_HEX
--------
f0f0
(1 row)
3.2 - Boolean data type
Vertica provides the standard SQL type BOOLEAN, which has two states: true and false.
Vertica provides the standard SQL type BOOLEAN, which has two states: true and false. The third state in SQL boolean logic is unknown, which is represented by the NULL value.
Syntax
BOOLEAN
Parameters
Valid literal data values for input are:
TRUE | 't' | 'true' | 'y' | 'yes' | '1' | 1
FALSE | 'f' | 'false' | 'n' | 'no' | '0' | 0
Notes
- Do not confuse the BOOLEAN data type with logical operators or with Boolean expressions.
- The keywords TRUE and FALSE are preferred and are SQL-compliant.
- A Boolean value of NULL appears last (largest) in ascending order.
- All other values must be enclosed in single quotes.
- Boolean values are output using the letters t and f.
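For example, a minimal sketch using several of the accepted input literals (the table name test_bool is illustrative only):
=> CREATE TABLE test_bool (b BOOLEAN);
=> INSERT INTO test_bool VALUES ('yes');
=> INSERT INTO test_bool VALUES (FALSE);
=> SELECT b FROM test_bool;
The query output shows the stored values as the letters t and f.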
See also
3.3 - Character data types (CHAR and VARCHAR)
Stores strings of letters, numbers, and symbols.
Stores strings of letters, numbers, and symbols. The CHARACTER (CHAR) and CHARACTER VARYING (VARCHAR) data types are collectively referred to as character string types, and the values of character string types are known as character strings.
Character data can be stored as fixed-length or variable-length strings. Fixed-length strings are right-extended with spaces on output; variable-length strings are not extended.
String literals in SQL statements must be enclosed in single quotes.
Syntax
{ CHAR | CHARACTER } [ (octet-length) ]
{ VARCHAR | CHARACTER VARYING } [ (octet-length) ]
Arguments
octet-length
- Length of the string or column width, declared in bytes (octets). This argument is optional.
CHAR versus VARCHAR data types
The following differences apply to CHAR and VARCHAR data:
- CHAR is conceptually a fixed-length, blank-padded string. Trailing blanks (spaces) are removed on input and are restored on output. The default length is 1, and the maximum length is 65000 octets (bytes).
- VARCHAR is a variable-length character data type. The default length is 80, and the maximum length is 65000 octets. For string values longer than 65000 octets, use long data types. Values can include trailing spaces.
Normally, you use VARCHAR for all string data. Use CHAR when you need fixed-width string output. For example, you can use CHAR columns for data to be transferred to a legacy system that requires fixed-width strings.
Setting maximum length
When you define character columns, specify the maximum size of any string to be stored in a column. For example, to store strings up to 24 octets in length, use one of the following definitions:
CHAR(24) --- fixed-length
VARCHAR(24) --- variable-length
The maximum length parameter for VARCHAR and CHAR data types refers to the number of octets that can be stored in that field, not the number of characters (Unicode code points). When using multibyte UTF-8 characters, the fields must be sized to accommodate from 1 to 4 octets per character, depending on the data. If the data loaded into a VARCHAR or CHAR column exceeds the specified maximum size for that column, data is truncated on UTF-8 character boundaries to fit within the specified size. See COPY
.
Note
Remember to include the extra octets required for multibyte characters in the column-width declaration, keeping in mind the 65000 octet column-width limit.
Due to compression in Vertica, the cost of overestimating the length of these fields is incurred primarily at load time and during sorts.
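As a minimal sketch of the octets-versus-characters distinction (assuming the LENGTH and OCTET_LENGTH functions, and a string that contains a two-byte UTF-8 character):
=> SELECT LENGTH('voilà'::VARCHAR(10)), OCTET_LENGTH('voilà'::VARCHAR(10));
LENGTH should report 5 characters, while OCTET_LENGTH should report 6 octets, because the character à occupies two bytes in UTF-8.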
NULL versus NUL
NULL and NUL differ as follows:
-
NUL represents a character whose ASCII/Unicode code is 0, sometimes qualified "ASCII NUL".
-
NULL means no value, and is true of a field (column) or constant, not of a character.
CHAR, LONG VARCHAR, and VARCHAR string data types accept ASCII NUL values.
In ascending sorts, NULL appears last (largest).
For additional information about NULL ordering, see NULL sort order.
The following example casts the input string containing NUL values to VARCHAR:
=> SELECT 'vert\0ica'::CHARACTER VARYING AS VARCHAR;
VARCHAR
---------
vert\0ica
(1 row)
The result contains 9 characters:
=> SELECT LENGTH('vert\0ica'::CHARACTER VARYING);
length
--------
9
(1 row)
If you use an extended string literal, the length is 8 characters:
=> SELECT E'vert\0ica'::CHARACTER VARYING AS VARCHAR;
VARCHAR
---------
vertica
(1 row)
=> SELECT LENGTH(E'vert\0ica'::CHARACTER VARYING);
LENGTH
--------
8
(1 row)
3.4 - Date/time data types
Vertica supports the full set of SQL date and time data types.
Vertica supports the full set of SQL date and time data types.
The following rules apply to all date/time data types:
-
All have a size of 8 bytes.
-
A date/time value of NULL is smallest relative to all other date/time values.
-
Vertica uses Julian dates for all date/time calculations, which can correctly predict and calculate any date from 4713 BC to far into the future, based on the assumption that the average length of the year is 365.2425 days.
-
All the date/time data types accept the special literal value NOW
to specify the current date and time. For example:
=> SELECT TIMESTAMP 'NOW';
?column?
---------------------------
2020-09-23 08:23:50.42325
(1 row)
-
By default, Vertica rounds with a maximum precision of six decimal places. You can substitute an integer between 0 and 6 for p
to specify your preferred level of precision.
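For example, a typed literal can set the precision directly (a minimal sketch; the fractional seconds in the result are rounded to two digits):
=> SELECT TIMESTAMP(2) '2020-09-23 08:23:50.42325';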
The following table lists specific attributes of date/time data types:
Name | Description | Low Value | High Value | Resolution
DATE | Dates only (no time of day) | ~ 25e+15 BC | ~ 25e+15 AD | 1 day
TIME [(p)] | Time of day only (no date) | 00:00:00.00 | 23:59:60.999999 | 1 μs
TIMETZ [(p)] | Time of day only, with time zone | 00:00:00.00+14 | 23:59:59.999999-14 | 1 μs
TIMESTAMP [(p)] | Both date and time, without time zone | 290279-12-22 19:59:05.224194 BC | 294277-01-09 04:00:54.775806 AD | 1 μs
TIMESTAMPTZ [(p)] | Both date and time, with time zone | 290279-12-22 19:59:05.224194 BC UTC | 294277-01-09 04:00:54.775806 AD UTC | 1 μs
INTERVAL DAY TO SECOND [(p)] | Time intervals | -106751991 days 04:00:54.775807 | +106751991 days 04:00:54.775807 | 1 μs
INTERVAL YEAR TO MONTH | Time intervals | ~ -768e15 yrs | ~ 768e15 yrs | 1 month
Vertica recognizes the time zone names in the files in /opt/vertica/share/timezonesets as date/time input values; these files also define the default list of strings accepted in the AT TIME ZONE zone parameter. The names are not necessarily used for date/time output—output is driven by the official time zone abbreviations associated with the currently selected time zone parameter setting.
3.4.1 - DATE
Consists of a month, day, and year.
Consists of a month, day, and year. The following limits apply:
- Low value: ~ 25e+15 BC
- High value: ~ 25e+15 AD
- Resolution: 1 day
See SET DATESTYLE for information about ordering.
Note
'0000-00-00' is not valid. If you try to insert that value into a DATE or TIMESTAMP field, an error occurs. If you copy '0000-00-00' into a DATE or TIMESTAMP field, Vertica converts the value to 0001-01-01 00:00:00 BC.
Syntax
DATE
Examples
Example | Description
January 8, 1999 | Unambiguous in any datestyle input mode
1999-01-08 | ISO 8601; January 8 in any mode (recommended format)
1/8/1999 | January 8 in MDY mode; August 1 in DMY mode
1/18/1999 | January 18 in MDY mode; rejected in other modes
01/02/03 | January 2, 2003 in MDY mode; February 1, 2003 in DMY mode; February 3, 2001 in YMD mode
1999-Jan-08 | January 8 in any mode
Jan-08-1999 | January 8 in any mode
08-Jan-1999 | January 8 in any mode
99-Jan-08 | January 8 in YMD mode, else error
08-Jan-99 | January 8, except error in YMD mode
Jan-08-99 | January 8, except error in YMD mode
19990108 | ISO 8601; January 8, 1999 in any mode
990108 | ISO 8601; January 8, 1999 in any mode
1999.008 | Year and day of year
J2451187 | Julian day
January 8, 99 BC | Year 99 before the Common Era
3.4.2 - DATETIME
DATETIME is an alias for TIMESTAMP.
DATETIME is an alias for TIMESTAMP/TIMESTAMPTZ.
3.4.3 - INTERVAL
Measures the difference between two points in time.
Measures the difference between two points in time. Intervals can be positive or negative. The INTERVAL
data type is SQL:2008 compliant, and supports interval qualifiers that are divided into two major subtypes:
-
Year-month: Span of years and months
-
Day-time: Span of days, hours, minutes, seconds, and fractional seconds
Intervals are represented internally as some number of microseconds and printed as up to 60 seconds, 60 minutes, 24 hours, 30 days, 12 months, and as many years as necessary. You can control the output format of interval units with SET INTERVALSTYLE and SET DATESTYLE.
Syntax
INTERVAL 'interval-literal' [ interval-qualifier ] [ (p) ]
Parameters
interval-literal
- A character string that expresses an interval, conforming to this format:
  [-] { quantity subtype-unit }[...] [ AGO ]
  For details, see Interval literal.
interval-qualifier
- Optionally specifies how to interpret and format an interval literal for output, and, optionally, sets precision. If omitted, the default is DAY TO SECOND(6). For details, see Interval qualifier.
p
- Specifies precision of the seconds field, where p is an integer between 0 and 6. For details, see Specifying interval precision. Default: 6
Limits
Name | Low Value | High Value | Resolution
INTERVAL DAY TO SECOND [(p)] | -106751991 days 04:00:54.775807 | +106751991 days 04:00:54.775807 | 1 microsecond
INTERVAL YEAR TO MONTH | ~ -768e15 yrs | ~ 768e15 yrs | 1 month
3.4.3.1 - Setting interval unit display
SET INTERVALSTYLE and SET DATESTYLE control the output format of interval units.
SET INTERVALSTYLE and SET DATESTYLE control the output format of interval units.
Important
DATESTYLE settings supersede INTERVALSTYLE. If DATESTYLE is set to SQL, interval unit display always conforms to the SQL:2008 standard, which omits interval units. If DATESTYLE is set to ISO, you can use SET INTERVALSTYLE to omit or display interval units, as described below.
Omitting interval units
To omit interval units from the output, set INTERVALSTYLE
to PLAIN
. This is the default setting, which conforms with the SQL:2008 standard:
=> SET INTERVALSTYLE TO PLAIN;
SET
=> SELECT INTERVAL '3 2';
?column?
----------
3 02:00
When INTERVALSTYLE
is set to PLAIN
, units are omitted from the output, even if the query specifies input units:
=> SELECT INTERVAL '3 days 2 hours';
?column?
----------
3 02:00
If DATESTYLE
is set to SQL
, Vertica conforms with SQL:2008 standard and always omits interval units from output:
=> SET DATESTYLE TO SQL;
SET
=> SET INTERVALSTYLE TO UNITS;
SET
=> SELECT INTERVAL '3 2';
?column?
----------
3 02:00
Displaying interval units
To enable display of interval units, DATESTYLE
must be set to ISO. You can then display interval units by setting INTERVALSTYLE
to UNITS
:
=> SET DATESTYLE TO ISO;
SET
=> SET INTERVALSTYLE TO UNITS;
SET
=> SELECT INTERVAL '3 2';
?column?
----------------
3 days 2 hours
Checking INTERVALSTYLE and DATESTYLE settings
Use
SHOW
statements to check INTERVALSTYLE
and DATESTYLE
settings:
=> SHOW INTERVALSTYLE;
name | setting
---------------+---------
intervalstyle | units
=> SHOW DATESTYLE;
name | setting
-----------+----------
datestyle | ISO, MDY
3.4.3.2 - Specifying interval input
Interval values are expressed through interval literals.
Interval values are expressed through interval literals. An interval literal is composed of one or more interval fields, where each field represents a span of days and time, or years and months, as follows:
[-] { quantity subtype-unit }[...] [AGO]
Using subtype units
Subtype units are optional for day-time intervals; they must be specified for year-month intervals.
For example, the first statement below implicitly specifies days and time; the second statement explicitly identifies day and time units. Both statements return the same result:
=> SET INTERVALSTYLE TO UNITS;
=> SELECT INTERVAL '1 12:59:10:05';
?column?
--------------------
1 day 12:59:10.005
(1 row)
=> SELECT INTERVAL '1 day 12 hours 59 min 10 sec 5 milliseconds';
?column?
--------------------
1 day 12:59:10.005
(1 row)
The following two statements add 28 days and 4 weeks to the current date, respectively. The intervals in both cases are equal and the statements return the same result. However, in the first statement, the interval literal omits the subtype (implicitly days
); in the second statement, the interval literal must include the subtype unit weeks
:
=> SELECT CURRENT_DATE;
?column?
------------
2016-08-15
(1 row)
=> SELECT CURRENT_DATE + INTERVAL '28';
?column?
---------------------
2016-09-12 00:00:00
(1 row)
=> SELECT CURRENT_DATE + INTERVAL '4 weeks';
?column?
---------------------
2016-09-12 00:00:00
(1 row)
An interval literal can include day-time and year-month fields. For example, the following statement adds an interval of 4 years, 4 weeks, 4 days and 14 hours to the current date. The years and weeks fields must include subtype units; the days and hours fields omit them:
=> SELECT CURRENT_DATE + INTERVAL '4 years 4 weeks 4 14';
?column?
---------------------
2020-09-15 14:00:00
(1 row)
Omitting subtype units
You can specify quantities of days, hours, minutes, and seconds without specifying units. Vertica recognizes colons in interval literals as part of the timestamp:
=> SELECT INTERVAL '1 4 5 6';
?column?
------------
1 day 04:05:06
=> SELECT INTERVAL '1 4:5:6';
?column?
------------
1 day 04:05:06
=> SELECT INTERVAL '1 day 4 hour 5 min 6 sec';
?column?
------------
1 day 04:05:06
If Vertica cannot determine the units, it applies the quantity to any missing units based on the interval qualifier. In the next two examples, Vertica uses the default interval qualifier (DAY TO SECOND(6)
) and assigns the trailing 1
to days, since it has already processed hours, minutes, and seconds in the output:
=> SELECT INTERVAL '4:5:6 1';
?column?
------------
1 day 04:05:06
=> SELECT INTERVAL '1 4:5:6';
?column?
------------
1 day 04:05:06
In the next two examples, Vertica recognizes 4:5
as hours:minutes
. The remaining values in the interval literal are assigned to the missing units: 1
is assigned to days and 2
is assigned to seconds.
=> SELECT INTERVAL '4:5 1 2';
?column?
------------
1 day 04:05:02
=> SELECT INTERVAL '1 4:5 2';
?column?
------------
1 day 04:05:02
Specifying the interval qualifier can change how Vertica interprets 4:5
:
=> SELECT INTERVAL '4:5' MINUTE TO SECOND;
?column?
------------
00:04:05
3.4.3.3 - Controlling interval format
Interval qualifiers specify a range of options that Vertica uses to interpret and format an interval literal.
Interval qualifiers specify a range of options that Vertica uses to interpret and format an interval literal. The interval qualifier can also specify precision. Each interval qualifier is composed of one or two units:
unit[p] [ TO unit[p] ]
where:
- unit specifies a day-time or year-month subtype.
- p specifies precision, an integer between 0 and 6. In general, precision only applies to SECOND units. The default precision for SECOND is 6. For details, see Specifying interval precision.
If an interval omits an interval qualifier, Vertica uses the default DAY TO SECOND(6)
.
Interval qualifier categories
Interval qualifiers belong to one of the following categories:
-
Year-month: Span of years and months
-
Day-time: Span of days, hours, minutes, seconds, and fractional seconds
Note
All examples below assume that
INTERVALSTYLE is set to plain.
Year-Month
Vertica supports two year-month subtypes: YEAR
and MONTH
.
In the following example, YEAR TO MONTH
qualifies the interval literal 1 2
to indicate a span of 1 year and two months:
=> SELECT interval '1 2' YEAR TO MONTH;
?column?
----------
1-2
(1 row)
If you omit the qualifier, Vertica uses the default interval qualifier DAY TO SECOND
and returns a different result:
=> SELECT interval '1 2';
?column?
----------
1 02:00
(1 row)
The following example uses the interval qualifier YEAR
. In this case, Vertica extracts only the year from the interval literal 1y 10m
:
=> SELECT INTERVAL '1y 10m' YEAR;
?column?
----------
1
(1 row)
In the next example, the interval qualifier MONTH
converts the same interval literal to months:
=> SELECT INTERVAL '1y 10m' MONTH;
?column?
----------
22
(1 row)
Day-time
Vertica supports four day-time subtypes: DAY
, HOUR
, MINUTE
, and SECOND
.
In the following example, the interval qualifier DAY TO SECOND(4)
qualifies the interval literal 1h 3m 6s 5msecs 57us
. The qualifier also sets precision on seconds to 4:
=> SELECT INTERVAL '1h 3m 6s 5msecs 57us' DAY TO SECOND(4);
?column?
---------------
01:03:06.0051
(1 row)
If no interval qualifier is specified, Vertica uses the default subtype DAY TO SECOND(6)
, regardless of how you specify the interval literal. For example, as an extension to SQL:2008, both of the following commands return 910 days:
=> SELECT INTERVAL '2-6';
?column?
-----------------
910
=> SELECT INTERVAL '2 years 6 months';
?column?
-----------------
910
An interval qualifier can extract other values from the input parameters. For example, the following command extracts the HOUR
value from the interval literal 3 days 2 hours
:
=> SELECT INTERVAL '3 days 2 hours' HOUR;
?column?
----------
74
The primary day/time (DAY TO SECOND
) and year/month (YEAR TO MONTH
) subtype ranges can be restricted to more specific range of types by an interval qualifier. For example, HOUR TO MINUTE
is a limited form of day/time interval, which can be used to express time zone offsets.
=> SELECT INTERVAL '1 3' HOUR to MINUTE;
?column?
---------------
01:03
hh:mm:ss
and hh:mm
formats are used only when at least two of the fields specified in the interval qualifier are non-zero and there are no more than 23 hours or 59 minutes:
=> SELECT INTERVAL '2 days 12 hours 15 mins' DAY TO MINUTE;
?column?
--------------
2 12:15
=> SELECT INTERVAL '15 mins 20 sec' MINUTE TO SECOND;
?column?
----------
15:20
=> SELECT INTERVAL '1 hour 15 mins 20 sec' MINUTE TO SECOND;
?column?
-----------------
75:20
3.4.3.4 - Specifying interval precision
In general, interval precision only applies to seconds.
In general, interval precision only applies to seconds. If no precision is explicitly specified, Vertica rounds precision to a maximum of six decimal places. For example:
=> SELECT INTERVAL '2 hours 4 minutes 3.709384766 seconds' DAY TO SECOND;
?column?
-----------------
02:04:03.709385
(1 row)
Vertica lets you specify interval precision in two ways: with the INTERVAL keyword, or with the SECOND unit of the interval qualifier. For example, the following statements use both methods to set precision, and return identical results:
=> SELECT INTERVAL(4) '2 hours 4 minutes 3.709384766 seconds' DAY TO SECOND;
?column?
---------------
02:04:03.7094
(1 row)
=> SELECT INTERVAL '2 hours 4 minutes 3.709384766 seconds' DAY TO SECOND(4);
?column?
---------------
02:04:03.7094
(1 row)
If the same statement specifies precision more than once, Vertica uses the lesser precision. For example, the following statement specifies precision twice: the INTERVAL
keyword specifies precision of 1, while the interval qualifier SECOND
specifies precision of 2. Vertica uses the lesser precision of 1:
=> SELECT INTERVAL(1) '1.2467' SECOND(2);
?column?
----------
1.2 secs
Setting precision on interval table columns
If you create a table with an interval column, the following restrictions apply to the column definition:
-
You can set precision on the INTERVAL
keyword only if you omit specifying an interval qualifier. If you try to set precision on the INTERVAL
keyword and include an interval qualifier, Vertica returns an error.
-
You can set precision only on the last unit of an interval qualifier. For example:
CREATE TABLE public.testint2
(
i INTERVAL HOUR TO SECOND(3)
);
If you specify precision on another unit, Vertica discards it when it saves the table definition.
3.4.3.5 - Fractional seconds in interval units
Vertica supports intervals in milliseconds (hh:mm:ss:ms), where 01:02:03:25 represents 1 hour, 2 minutes, 3 seconds, and 025 milliseconds.
Vertica supports intervals in milliseconds (hh:mm:ss:ms), where 01:02:03:25
represents 1 hour, 2 minutes, 3 seconds, and 025 milliseconds. Milliseconds are converted to fractional seconds as in the following example, which returns 1 day, 2 hours, 3 minutes, 4 seconds, and 25.5 milliseconds:
=> SELECT INTERVAL '1 02:03:04:25.5';
?column?
------------
1 day 02:03:04.0255
Vertica also allows fractional minutes, which are converted into seconds:
=> SELECT INTERVAL '10.5 minutes';
?column?
------------
00:10:30
=> select interval '10.659 minutes';
?column?
-------------
00:10:39.54
=> select interval '10.3333333333333 minutes';
?column?
----------
00:10:20
Considerations
-
An INTERVAL
can include only the subset of units that you need; however, year/month intervals represent calendar years and months with no fixed number of days, so year/month interval values cannot include days, hours, or minutes. When year/month values are specified for day/time intervals, the intervals extension assumes 30 days per month and 365 days per year. Since the length of a given month or year varies, day/time intervals are never output as months or years, only as days, hours, minutes, and so on.
-
Day/time and year/month intervals are logically independent and cannot be combined with or compared to each other. In the following example, an interval-literal that contains DAYS
cannot be combined with the YEAR TO MONTH
type:
=> SELECT INTERVAL '1 2 3' YEAR TO MONTH;
ERROR 3679: Invalid input syntax for interval year to month: "1 2 3"
-
Vertica accepts intervals up to 2^63 – 1 microseconds or months (about 18 digits).
-
INTERVAL YEAR TO MONTH
can be used in an analytic RANGE window when the ORDER BY
column type is TIMESTAMP/TIMESTAMP WITH TIMEZONE
, or DATE
. TIME/TIME WITH TIMEZONE is not supported.
-
You can use INTERVAL DAY TO SECOND
when the ORDER BY
column type is TIMESTAMP/TIMESTAMP WITH TIMEZONE
, DATE
, and TIME/TIME WITH TIMEZONE
.
Examples
Examples in this section assume that INTERVALSTYLE is set to PLAIN, so results omit subtype units. Interval values that omit an interval qualifier use the default, DAY TO SECOND(6).
=> SELECT INTERVAL '00:2500:00';
?column?
----------
1 17:40
(1 row)
=> SELECT INTERVAL '2500' MINUTE TO SECOND;
?column?
----------
2500
(1 row)
=> SELECT INTERVAL '2500' MINUTE;
?column?
----------
2500
(1 row)
=> SELECT INTERVAL '28 days 3 hours' HOUR TO SECOND;
?column?
----------
675:00
(1 row)
=> SELECT INTERVAL(3) '28 days 3 hours';
?column?
----------
28 03:00
(1 row)
=> SELECT INTERVAL(3) '28 days 3 hours 1.234567';
?column?
-----------------
28 03:01:14.074
(1 row)
=> SELECT INTERVAL(3) '28 days 3 hours 1.234567 sec';
?column?
-----------------
28 03:00:01.235
(1 row)
=> SELECT INTERVAL(3) '28 days 3.3 hours' HOUR TO SECOND;
?column?
----------
675:18
(1 row)
=> SELECT INTERVAL(3) '28 days 3.35 hours' HOUR TO SECOND;
?column?
----------
675:21
(1 row)
=> SELECT INTERVAL(3) '28 days 3.37 hours' HOUR TO SECOND;
?column?
-----------
675:22:12
(1 row)
=> SELECT INTERVAL '1.234567 days' HOUR TO SECOND;
?column?
---------------
29:37:46.5888
(1 row)
=> SELECT INTERVAL '1.23456789 days' HOUR TO SECOND;
?column?
-----------------
29:37:46.665696
(1 row)
=> SELECT INTERVAL(3) '1.23456789 days' HOUR TO SECOND;
?column?
--------------
29:37:46.666
(1 row)
=> SELECT INTERVAL(3) '1.23456789 days' HOUR TO SECOND(2);
?column?
-------------
29:37:46.67
(1 row)
=> SELECT INTERVAL(3) '01:00:01.234567' as "one hour+";
one hour+
--------------
01:00:01.235
(1 row)
=> SELECT INTERVAL(3) '01:00:01.234567' = INTERVAL(3) '01:00:01.234567';
?column?
----------
t
(1 row)
=> SELECT INTERVAL(3) '01:00:01.234567' = INTERVAL '01:00:01.234567';
?column?
----------
f
(1 row)
=> SELECT INTERVAL(3) '01:00:01.234567' = INTERVAL '01:00:01.234567'
HOUR TO SECOND(3);
?column?
----------
t
(1 row)
=> SELECT INTERVAL(3) '01:00:01.234567' = INTERVAL '01:00:01.234567'
MINUTE TO SECOND(3);
?column?
----------
t
(1 row)
=> SELECT INTERVAL '255 1.1111' MINUTE TO SECOND(3);
?column?
------------
255:01.111
(1 row)
=> SELECT INTERVAL '@ - 5 ago';
?column?
----------
5
(1 row)
=> SELECT INTERVAL '@ - 5 minutes ago';
?column?
----------
00:05
(1 row)
=> SELECT INTERVAL '@ 5 minutes ago';
?column?
----------
-00:05
(1 row)
=> SELECT INTERVAL '@ ago -5 minutes';
?column?
----------
00:05
(1 row)
=> SELECT DATE_PART('month', INTERVAL '2-3' YEAR TO MONTH);
DATE_PART
-----------
3
(1 row)
=> SELECT FLOOR((TIMESTAMP '2005-01-17 10:00'
- TIMESTAMP '2005-01-01')
/ INTERVAL '7');
FLOOR
-------
2
(1 row)
3.4.3.6 - Processing signed intervals
In the SQL:2008 standard, a minus sign before an interval-literal or as the first character of the interval-literal negates the entire literal, not just the first component.
In the SQL:2008 standard, a minus sign before an interval-literal or as the first character of the interval-literal negates the entire literal, not just the first component. In Vertica, a leading minus sign negates the entire interval, not just the first component. The following commands both return the same value:
=> SELECT INTERVAL '-1 month - 1 second';
?column?
----------
-29 days 23:59:59
=> SELECT INTERVAL -'1 month - 1 second';
?column?
----------
-29 days 23:59:59
Use one of the following commands instead to return the intended result:
=> SELECT INTERVAL -'1 month 1 second';
?column?
----------
-30 days 1 sec
=> SELECT INTERVAL -'30 00:00:01';
?column?
----------
-30 days 1 sec
Two negatives together return a positive:
=> SELECT INTERVAL -'-1 month - 1 second';
?column?
----------
29 days 23:59:59
=> SELECT INTERVAL -'-1 month 1 second';
?column?
----------
30 days 1 sec
You can use the year-month syntax with no spaces. Vertica allows the input of negative months but requires two negatives when paired with years.
=> SELECT INTERVAL '3-3' YEAR TO MONTH;
?column?
----------
3 years 3 months
=> SELECT INTERVAL '3--3' YEAR TO MONTH;
?column?
----------
2 years 9 months
When the interval-literal looks like a year/month type, but the type is day/second, or vice versa, Vertica reads the interval-literal from left to right, where number-number is years-months, and number <space> <signed number> is whatever the units specify. Vertica processes the following command as (–) 1 year 1 month = (–) 365 + 30 = –395 days:
=> SELECT INTERVAL '-1-1' DAY TO HOUR;
?column?
----------
-395 days
If you insert a space in the interval-literal, Vertica processes it based on the subtype DAY TO HOUR
: (–) 1 day – 1 hour = (–) 24 – 1 = –23 hours:
=> SELECT INTERVAL '-1 -1' DAY TO HOUR;
?column?
----------
-23 hours
Two negatives together returns a positive, so Vertica processes the following command as (–) 1 year – 1 month = (–) 365 – 30 = –335 days:
=> SELECT INTERVAL '-1--1' DAY TO HOUR;
?column?
----------
-335 days
If you omit the value after the hyphen, Vertica assumes 0 months and processes the following command as 1 year 0 months - 1 day = 365 + 0 - 1 = 364 days:
=> SELECT INTERVAL '1- -1' DAY TO HOUR;
?column?
----------
364 days
3.4.3.7 - Casting with intervals
You can use CAST to convert strings to intervals, and vice versa.
You can use CAST
to convert strings to intervals, and vice versa.
String to interval
You cast a string to an interval as follows:
CAST( [ INTERVAL[(p)] ] [-] interval-literal AS INTERVAL[(p)] interval-qualifier )
For example:
=> SELECT CAST('3700 sec' AS INTERVAL);
?column?
----------
01:01:40
You can cast intervals within day-time or the year-month subtypes but not between them:
=> SELECT CAST(INTERVAL '4440' MINUTE as INTERVAL);
?column?
----------
3 days 2 hours
=> SELECT CAST(INTERVAL -'01:15' as INTERVAL MINUTE);
?column?
----------
-75 mins
Interval to string
You cast an interval to a string as follows:
CAST( (SELECT interval ) AS VARCHAR[(n)] )
For example:
=> SELECT CONCAT(
'Tomorrow at this time: ',
CAST((SELECT INTERVAL '24 hours') + CURRENT_TIMESTAMP(0) AS VARCHAR));
CONCAT
-----------------------------------------------
Tomorrow at this time: 2016-08-17 08:41:23-04
(1 row)
3.4.3.8 - Operations with intervals
If you divide an interval by an interval, you get a FLOAT.
If you divide an interval by an interval, you get a FLOAT
:
=> SELECT INTERVAL '28 days 3 hours' HOUR(4) / INTERVAL '27 days 3 hours' HOUR(4);
?column?
------------
1.036866359447
An INTERVAL
divided by FLOAT
returns an INTERVAL
:
=> SELECT INTERVAL '3' MINUTE / 1.5;
?column?
------------
2 mins
INTERVAL MODULO
(remainder) INTERVAL
returns an INTERVAL
:
=> SELECT INTERVAL '28 days 3 hours' HOUR % INTERVAL '27 days 3 hours' HOUR;
?column?
------------
24 hours
If you add INTERVAL
and TIME
, the result is TIME
, modulo 24 hours:
=> SELECT INTERVAL '1' HOUR + TIME '1:30';
?column?
------------
02:30:00
3.4.4 - SMALLDATETIME
SMALLDATETIME is an alias for TIMESTAMP.
SMALLDATETIME is an alias for TIMESTAMP/TIMESTAMPTZ.
3.4.5 - TIME/TIMETZ
Stores the specified time of day.
Stores the specified time of day. TIMETZ
is the same as TIME WITH TIME ZONE
: both data types store the UTC offset of the specified time.
Syntax
TIME [ (p) ] [ { WITHOUT | WITH } TIME ZONE ] 'input-string' [ AT TIME ZONE zone ]
Parameters
p
- Optional precision value that specifies the number of fractional digits retained in the seconds field, an integer value between 0 and 6. If you omit specifying precision, Vertica returns up to 6 fractional digits.
WITHOUT TIME ZONE
- Ignore any time zone in the input string and use a value without a time zone (default).
WITH TIME ZONE
- Convert the time to UTC. If the input string includes a time zone, use its UTC offset for the conversion. If the input string omits a time zone, Vertica uses the UTC offset of the time zone that is configured for your system.
input-string
- See Input String below.
AT TIME ZONE zone
- See TIME AT TIME ZONE and TIMESTAMP AT TIME ZONE.
TIMETZ vs. TIMESTAMPTZ
TIMETZ
and
TIMESTAMPTZ
are not parallel SQL constructs. TIMESTAMPTZ
records a time and date in GMT, converting from the specified TIME ZONE. TIMETZ
records the specified time and the specified time zone, in minutes, from GMT.
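As a minimal sketch of the difference (only the statement is shown, because the TIMESTAMPTZ output depends on the session time zone):
=> SELECT TIMETZ '04:05:06-07', TIMESTAMPTZ '2021-01-01 04:05:06-07';
The TIMETZ value retains the -07 offset it was given, while the TIMESTAMPTZ value is converted to GMT and displayed in the session time zone.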
Limits
Name | Low Value | High Value | Resolution
TIME [ p ] | 00:00:00.00 | 23:59:60.999999 | 1 µs
TIME [ p ] WITH TIME ZONE | 00:00:00.00+14 | 23:59:59.999999-14 | 1 µs
A TIME
input string can be set to any of the formats shown below:
Example | Description
04:05:06.789 | ISO 8601
04:05:06 | ISO 8601
04:05 | ISO 8601
040506 | ISO 8601
04:05 AM | Same as 04:05; AM does not affect value
04:05 PM | Same as 16:05
04:05:06.789-8 | ISO 8601
04:05:06-08:00 | ISO 8601
04:05-08:00 | ISO 8601
040506-08 | ISO 8601
04:05:06 PST | Time zone specified by name
Data type coercion
You can cast a TIME or TIMETZ value to a TIMESTAMP. This returns the local date and time as follows:
=> SELECT (TIME '3:01am')::TIMESTAMP;
?column?
---------------------
2012-08-30 03:01:00
(1 row)
=> SELECT (TIMETZ '3:01am')::TIMESTAMP;
?column?
---------------------
2012-08-22 03:01:00
(1 row)
Casting the same TIME or TIMETZ value to a TIMESTAMPTZ
returns the local date and time, appended with the UTC offset—in this example, -05
:
=> SELECT (TIME '3:01am')::TIMESTAMPTZ;
?column?
------------------------
2016-12-08 03:01:00-05
(1 row)
3.4.6 - TIME AT TIME ZONE
Converts the specified TIME to the time in another time zone.
Converts the specified TIME
to the time in another time zone.
Syntax
TIME [WITH TIME ZONE] 'input-string' AT TIME ZONE 'zone'
Parameters
WITH TIME ZONE
- Converts the input string to UTC, using the UTC offset for the specified time zone. If the input string omits a time zone, Vertica uses the UTC offset of the time zone that is configured for your system, and converts the input string accordingly.
zone
- Specifies the time zone to use in the conversion, either as a literal or as an interval that specifies a UTC offset. For details, see Specifying Time Zones below.
Note
Vertica treats literals TIME ZONE
and TIMEZONE
as synonyms.
Specifying time zones
You can specify time zones in two ways:
-
A string literal such as America/Chicago
or PST
-
An interval that specifies a UTC offset—for example, INTERVAL '-08:00'
It is generally good practice to specify time zones with literals that indicate a geographic location. Vertica makes the necessary seasonal adjustments, and thereby avoids inconsistent results. For example, the following two queries are issued when daylight time is in effect. Because the local UTC offset during daylight time is -04
, both queries return the same results:
=> SELECT CURRENT_TIME(0) "EDT";
EDT
-------------
12:34:35-04
(1 row)
=> SELECT CURRENT_TIME(0) AT TIME ZONE 'America/Denver' "Mountain Time";
Mountain Time
---------------
10:34:35-06
(1 row)
=> SELECT CURRENT_TIME(0) AT TIME ZONE INTERVAL '-06:00' "Mountain Time";
Mountain Time
---------------
10:34:35-06
(1 row)
If you use the UTC offset in a similar query when standard time is in effect, you must adjust the UTC offset accordingly—for Denver time, to -07—otherwise, Vertica returns a different (and erroneous) result:
=> SELECT CURRENT_TIME(0) "EST";
EST
-------------
14:18:22-05
(1 row)
=> SELECT CURRENT_TIME(0) AT TIME ZONE INTERVAL '-06:00' "Mountain Time";
Mountain Time
---------------
13:18:22-06
(1 row)
You can show and set the session's time zone with
SHOW TIMEZONE
and
SET TIME ZONE
, respectively:
=> SHOW TIMEZONE;
name | setting
----------+------------------
timezone | America/New_York
(1 row)
=> SELECT CURRENT_TIME(0) "Eastern Daylight Time";
Eastern Daylight Time
-----------------------
12:18:24-04
(1 row)
=> SET TIMEZONE 'America/Los_Angeles';
SET
=> SELECT CURRENT_TIME(0) "Pacific Daylight Time";
Pacific Daylight Time
-----------------------
09:18:24-07
(1 row)
Time zone literals
To view the default list of valid literals, see the files in the following directory:
/opt/vertica/share/timezonesets
For example:
$ cat Antarctica.txt
...
# src/timezone/tznames/Antarctica.txt
#
AWST 28800 # Australian Western Standard Time
# (Antarctica/Casey)
# (Australia/Perth)
...
NZST 43200 # New Zealand Standard Time
# (Antarctica/McMurdo)
# (Pacific/Auckland)
ROTT -10800 # Rothera Time
# (Antarctica/Rothera)
SYOT 10800 # Syowa Time
# (Antarctica/Syowa)
VOST 21600 # Vostok time
# (Antarctica/Vostok)
Examples
The following example assumes that local time is EST (Eastern Standard Time). The query converts the specified time to MST (Mountain Standard Time):
=> SELECT CURRENT_TIME(0);
timezone
-------------
10:10:56-05
(1 row)
=> SELECT TIME '10:10:56' AT TIME ZONE 'America/Denver' "Denver Time";
Denver Time
-------------
08:10:56-07
(1 row)
The next example adds a time zone literal to the input string—in this case, Europe/Vilnius
—and converts the time to MST:
=> SELECT TIME '09:56:13 Europe/Vilnius' AT TIME ZONE 'America/Denver' "Denver Time";
Denver Time
-------------
00:56:13-07
(1 row)
3.4.7 - TIMESTAMP/TIMESTAMPTZ
Stores the specified date and time.
Stores the specified date and time. TIMESTAMPTZ
is the same as TIMESTAMP WITH TIME ZONE
: both data types store the UTC offset of the specified time.
TIMESTAMP
is an alias for DATETIME
and SMALLDATETIME
.
Syntax
TIMESTAMP [ (p) ] [ { WITHOUT | WITH } TIME ZONE ] 'input-string' [AT TIME ZONE zone ]
TIMESTAMPTZ [ (p) ] 'input-string' [ AT TIME ZONE zone ]
Parameters
p
- Optional precision value that specifies the number of fractional digits retained in the seconds field, an integer value between 0 and 6. If you omit specifying precision, Vertica returns up to 6 fractional digits.
WITHOUT TIME ZONE
WITH TIME ZONE
Specifies whether to include a time zone with the stored value:
-
WITHOUT TIME ZONE
(default): Specifies that input-string does not include a time zone. If the input string contains a time zone, Vertica ignores this qualifier. Instead, it conforms to WITH TIME ZONE behavior.
-
WITH TIME ZONE
: Specifies to convert input-string
to UTC, using the UTC offset for the specified time zone. If the input string omits a time zone, Vertica uses the UTC offset of the time zone that is configured for your system.
input-string
- See Input String below.
-
AT TIME ZONE zone
- See TIMESTAMP AT TIME ZONE.
Limits
In the following table, values are rounded. See Date/time data types for more detail.
Name | Low Value | High Value | Resolution
TIMESTAMP [ (p) ] [ WITHOUT TIME ZONE ] | 290279 BC | 294277 AD | 1 µs
TIMESTAMP [ (p) ] WITH TIME ZONE | 290279 BC | 294277 AD | 1 µs
The date/time input string concatenates a date and a time. The input string can include a time zone, specified as a literal such as America/Chicago
, or as a UTC offset.
The following list represents typical date/time input variations:
Note
0000-00-00
is invalid input. If you try to insert that value into a DATE or TIMESTAMP field, an error occurs. If you copy 0000-00-00
into a DATE or TIMESTAMP field, Vertica converts the value to 0001-01-01 00:00:00 BC
.
The input string can also specify the calendar era, either AD
(default) or BC
. If you omit the calendar era, Vertica assumes the current calendar era (AD
). The calendar era typically follows the time zone; however, the input string can include it in various locations. For example, the following queries return the same results:
=> SELECT TIMESTAMP WITH TIME ZONE 'March 1, 44 12:00 CET BC ' "Caesar's Time of Death EST";
Caesar's Time of Death EST
----------------------------
0044-03-01 06:00:00-05 BC
(1 row)
=> SELECT TIMESTAMP WITH TIME ZONE 'March 1, 44 12:00 BC CET' "Caesar's Time of Death EST";
Caesar's Time of Death EST
----------------------------
0044-03-01 06:00:00-05 BC
(1 row)
Examples
=> SELECT (TIMESTAMP '2014-01-17 10:00' - TIMESTAMP '2014-01-01');
?column?
----------
16 10:00
(1 row)
=> SELECT (TIMESTAMP '2014-01-17 10:00' - TIMESTAMP '2014-01-01') / 7;
?column?
-------------------
2 08:17:08.571429
(1 row)
=> SELECT TIMESTAMP '2009-05-29 15:21:00.456789'-TIMESTAMP '2009-05-28';
?column?
-------------------
1 15:21:00.456789
(1 row)
=> SELECT (TIMESTAMP '2009-05-29 15:21:00.456789'-TIMESTAMP '2009-05-28')(3);
?column?
----------------
1 15:21:00.457
(1 row)
=> SELECT '2017-03-18 07:00'::TIMESTAMPTZ(0) + INTERVAL '1.5 day';
?column?
------------------------
2017-03-19 19:00:00-04
(1 row)
=> SELECT (TIMESTAMP '2014-01-17 10:00' - TIMESTAMP '2014-01-01') day;
?column?
----------
16
(1 row)
=> SELECT cast((TIMESTAMP '2014-01-17 10:00' - TIMESTAMP '2014-01-01')
day as integer) / 7;
?column?
----------------------
2.285714285714285714
(1 row)
=> SELECT floor((TIMESTAMP '2014-01-17 10:00' - TIMESTAMP '2014-01-01')
/ interval '7');
floor
-------
2
(1 row)
=> SELECT (TIMESTAMP '2009-05-29 15:21:00.456789'-TIMESTAMP '2009-05-28') second;
?column?
---------------
141660.456789
(1 row)
=> SELECT (TIMESTAMP '2012-05-29 15:21:00.456789'-TIMESTAMP '2009-01-01') year;
?column?
----------
3
(1 row)
=> SELECT (TIMESTAMP '2012-05-29 15:21:00.456789'-TIMESTAMP '2009-01-01') month;
?column?
----------
40
(1 row)
=> SELECT (TIMESTAMP '2012-05-29 15:21:00.456789'-TIMESTAMP '2009-01-01')
year to month;
?column?
----------
3-4
(1 row)
=> SELECT (TIMESTAMP '2012-05-29 15:21:00.456789'-TIMESTAMP '2009-01-01')
second(3);
?column?
---------------
107536860.457
(1 row)
=> SELECT (TIMESTAMP '2012-05-29 15:21:00.456789'-TIMESTAMP '2009-01-01') minute;
?column?
----------
1792281
(1 row)
=> SELECT (TIMESTAMP '2012-05-29 15:21:00.456789'-TIMESTAMP '2009-01-01')
minute to second(3);
?column?
----------------
1792281:00.457
(1 row)
=> SELECT TIMESTAMP 'infinity';
?column?
----------
infinity
(1 row)
3.4.8 - TIMESTAMP AT TIME ZONE
Converts the specified TIMESTAMP or TIMESTAMPTZ (TIMESTAMP WITH TIMEZONE) to another time zone.
Converts the specified TIMESTAMP
or TIMESTAMPTZ
(TIMESTAMP WITH TIMEZONE
) to another time zone. Vertica executes AT TIME ZONE
differently, depending on whether the date input is a TIMESTAMP
or TIMESTAMPTZ
. See TIMESTAMP versus TIMESTAMPTZ Behavior below.
Syntax
timestamp-clause AT TIME ZONE 'zone'
Parameters
timestamp-clause
- Specifies the timestamp to convert, either TIMESTAMP or TIMESTAMPTZ. For details, see TIMESTAMP/TIMESTAMPTZ.
AT TIME ZONE
zone
- Specifies the time zone to use in the timestamp conversion, where
zone
is a literal or interval that specifies a UTC offset:
For details, see Specifying Time Zones below.
Note
Vertica treats literals TIME ZONE
and TIMEZONE
as synonyms.
TIMESTAMP versus TIMESTAMPTZ behavior
How Vertica interprets AT TIME ZONE
depends on whether the date input is a TIMESTAMP
or TIMESTAMPTZ
:
TIMESTAMP
If the input string specifies no time zone, Vertica performs two actions:
-
Converts the input string to the time zone of the AT TIME ZONE argument.
-
Returns the time for the current session's time zone.
If the input string includes a time zone, Vertica implicitly casts it to a TIMESTAMPTZ and converts it accordingly (see TIMESTAMPTZ below).
For example, the following statement specifies a TIMESTAMP with no time zone. Vertica executes the statement as follows:
-
Converts the input string to PDT (Pacific Daylight Time).
-
Returns that time in the local time zone, which is three hours later:
=> SHOW TIMEZONE;
name | setting
----------+------------------
timezone | America/New_York
(1 row)
=> SELECT TIMESTAMP '2017-3-14 5:30' AT TIME ZONE 'PDT';
timezone
------------------------
2017-03-14 08:30:00-04
(1 row)
TIMESTAMPTZ
Vertica converts the input string to the time zone of the AT TIME ZONE argument and returns that time.
For example, the following statement specifies a TIMESTAMPTZ data type. The input string omits any time zone expression, so Vertica assumes the input string to be in local time zone (America/New_York ) and returns the time of the AT TIME ZONE argument:
=> SHOW TIMEZONE;
name | setting
----------+------------------
timezone | America/New_York
(1 row)
=> SELECT TIMESTAMP WITH TIME ZONE '2001-02-16 20:38:40' AT TIME ZONE 'America/Denver';
timezone
---------------------
2001-02-16 18:38:40
(1 row)
The input string in the next statement explicitly specifies a time zone, so Vertica coerces the TIMESTAMP to a TIMESTAMPTZ and returns the time of the AT TIME ZONE argument:
=> SELECT TIMESTAMP '2001-02-16 20:38:40 America/Mexico_City' AT TIME ZONE 'Asia/Tokyo';
timezone
---------------------
2001-02-17 11:38:40
(1 row)
Specifying time zones
You can specify time zones in two ways:
-
A string literal such as America/Chicago
or PST
-
An interval that specifies a UTC offset—for example, INTERVAL '-08:00'
It is generally good practice to specify time zones with literals that indicate a geographic location. Vertica makes the necessary seasonal adjustments, and thereby avoids inconsistent results. For example, the following two queries are issued when daylight time is in effect. Because the local UTC offset during daylight time is -04
, both queries return the same results:
=> SELECT TIMESTAMPTZ '2017-03-16 09:56:13' AT TIME ZONE 'America/Denver' "Denver Time";
Denver Time
---------------------
2017-03-16 07:56:13
(1 row)
=> SELECT TIMESTAMPTZ '2017-03-16 09:56:13' AT TIME ZONE INTERVAL '-06:00' "Denver Time";
Denver Time
---------------------
2017-03-16 07:56:13
(1 row)
If you use the UTC offset in a similar query when standard time is in effect, you must adjust the UTC offset accordingly—for Denver time, to -07—otherwise, Vertica returns a different (and erroneous) result:
=> SELECT TIMESTAMPTZ '2017-01-16 09:56:13' AT TIME ZONE 'America/Denver' "Denver Time";
Denver Time
---------------------
2017-01-16 07:56:13
(1 row)
=> SELECT TIMESTAMPTZ '2017-01-16 09:56:13' AT TIME ZONE INTERVAL '-06:00' "Denver Time";
Denver Time
---------------------
2017-01-16 08:56:13
(1 row)
You can show and set the session's time zone with
SHOW TIMEZONE
and
SET TIME ZONE
, respectively:
=> SHOW TIMEZONE;
name | setting
----------+------------------
timezone | America/New_York
(1 row)
=> SELECT CURRENT_TIMESTAMP(0) "Eastern Daylight Time";
Eastern Daylight Time
------------------------
2017-03-20 12:18:24-04
(1 row)
=> SET TIMEZONE 'America/Los_Angeles';
SET
=> SELECT CURRENT_TIMESTAMP(0) "Pacific Daylight Time";
Pacific Daylight Time
------------------------
2017-03-20 09:18:24-07
(1 row)
Time zone literals
To view the default list of valid literals, see the files in the following directory:
/opt/vertica/share/timezonesets
For example:
$ cat Antarctica.txt
...
# src/timezone/tznames/Antarctica.txt
#
AWST 28800 # Australian Western Standard Time
# (Antarctica/Casey)
# (Australia/Perth)
...
NZST 43200 # New Zealand Standard Time
# (Antarctica/McMurdo)
# (Pacific/Auckland)
ROTT -10800 # Rothera Time
# (Antarctica/Rothera)
SYOT 10800 # Syowa Time
# (Antarctica/Syowa)
VOST 21600 # Vostok time
# (Antarctica/Vostok)
3.5 - Long data types
Store data up to 32000000 octets.
Store data up to 32000000 octets. Vertica supports two long data types:
-
LONG VARBINARY
: Variable-length raw-byte data, such as spatial data. LONG VARBINARY
values are not extended to the full width of the column.
-
LONG VARCHAR
: Variable-length strings, such as log files and unstructured data. LONG VARCHAR
values are not extended to the full width of the column.
Use LONG
data types only when you need to store data greater than the maximum size of VARBINARY
and VARCHAR
data types (65 KB). Long data can include unstructured data, online comments or posts, or small log files.
Flex tables have a default LONG VARBINARY __raw__
column, with a NOT NULL
constraint. For more information, see Flex tables.
Syntax
LONG VARBINARY [(max-length)]
LONG VARCHAR [(octet-length)]
Parameters
max-length
- Length of the byte string or column width, declared in bytes (octets), up to 32000000.
Default: 1 MB
octet-length
- Length of the string or column width, declared in bytes (octets), up to 32000000.
Default: 1 MB
For optimal performance of LONG
data types, Vertica recommends that you:
-
Use the LONG data types as storage-only containers; Vertica supports operations on the content of LONG data types, but does not support all of the operations that VARCHAR and VARBINARY support.
-
Use VARBINARY
and VARCHAR
data types, instead of their LONG
counterparts, whenever possible. VARBINARY
and VARCHAR
data types are more flexible and have a wider range of operations.
-
Do not sort, segment, or partition projections on LONG
data type columns.
-
Do not add constraints, such as a primary key, to any LONG VARBINARY
or LONG VARCHAR
columns.
-
Do not join or aggregate any LONG
data type columns.
Examples
The following example creates a table user_comments
with a LONG VARCHAR
column and inserts data into it:
=> CREATE TABLE user_comments
(id INTEGER,
username VARCHAR(200),
time_posted TIMESTAMP,
comment_text LONG VARCHAR(200000));
=> INSERT INTO user_comments VALUES
(1,
'User1',
TIMESTAMP '2013-06-25 12:47:32.62',
'The weather tomorrow will be cold and rainy and then
on the day after, the sun will come and the temperature
will rise dramatically.');
3.6 - Numeric data types
Numeric data types are numbers stored in database columns.
Numeric data types are numbers stored in database columns. These data types are typically grouped by:
-
Exact numeric types, values where the precision and scale need to be preserved. The exact numeric types are INTEGER
, BIGINT
, DECIMAL
, NUMERIC
, NUMBER
, and MONEY
.
-
Approximate numeric types, values where the precision needs to be preserved and the scale can be floating. The approximate numeric types are DOUBLE PRECISION
, FLOAT
, and REAL
.
Implicit casts from INTEGER
, FLOAT
, and NUMERIC
to VARCHAR
are not supported. If you need that functionality, write an explicit cast using one of the following forms:
CAST(numeric-expression AS data-type)
numeric-expression::data-type
For example, you can cast a float to an integer as follows:
=> SELECT(FLOAT '123.5')::INT;
?column?
----------
124
(1 row)
String-to-numeric data type conversions accept formats of quoted constants for scientific notation, binary scaling, hexadecimal, and combinations of numeric-type literals:
-
Scientific notation:
=> SELECT FLOAT '1e10';
?column?
-------------
10000000000
(1 row)
-
BINARY
scaling:
=> SELECT NUMERIC '1p10';
?column?
----------
1024
(1 row)
-
hexadecimal:
=> SELECT NUMERIC '0x0abc';
?column?
----------
2748
(1 row)
3.6.1 - DOUBLE PRECISION (FLOAT)
Vertica supports the numeric data type DOUBLE PRECISION, which is the IEEE-754 8-byte floating point type, along with most of the usual floating point operations.
Vertica supports the numeric data type DOUBLE PRECISION
, which is the IEEE-754 8-byte floating point type, along with most of the usual floating point operations.
Syntax
[ DOUBLE PRECISION | FLOAT | FLOAT(n) | FLOAT8 | REAL ]
Parameters
Note
On a machine whose floating-point arithmetic does not follow IEEE-754, these values probably do not work as expected.
Double precision is an inexact, variable-precision numeric type. In other words, some values cannot be represented exactly and are stored as approximations. Thus, input and output operations involving double precision might show slight discrepancies.
-
All of the DOUBLE PRECISION
data types are synonyms for 64-bit IEEE FLOAT.
-
The n in FLOAT(n)
must be between 1 and 53, inclusive, but a 53-bit fraction is always used. See the IEEE-754 standard for details.
-
For exact numeric storage and calculations (money for example), use NUMERIC
.
-
Floating point calculations depend on the behavior of the underlying processor, operating system, and compiler.
-
Comparing two floating-point values for equality might not work as expected.
-
While Vertica treats decimal values as FLOAT
internally, if a column is defined as FLOAT
then you cannot read decimal values from ORC and Parquet files. In those formats, FLOAT
and DECIMAL
are different types.
Values
COPY
accepts floating-point data in the following format:
-
Optional leading white space
-
An optional plus ("+") or minus sign ("-")
-
A decimal number, a hexadecimal number, an infinity, a NAN, or a null value
Decimal Number
A decimal number consists of a non-empty sequence of decimal digits possibly containing a radix character (decimal point "."), optionally followed by a decimal exponent. A decimal exponent consists of an "E" or "e", followed by an optional plus or minus sign, followed by a non-empty sequence of decimal digits, and indicates multiplication by a power of 10.
Hexadecimal Number
A hexadecimal number consists of a "0x" or "0X" followed by a non-empty sequence of hexadecimal digits possibly containing a radix character, optionally followed by a binary exponent. A binary exponent consists of a "P" or "p", followed by an optional plus or minus sign, followed by a non-empty sequence of decimal digits, and indicates multiplication by a power of 2. At least one of radix character and binary exponent must be present.
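For example, the following sketch shows a decimal number with a decimal exponent and a hexadecimal number with a binary exponent. The literal forms follow the rules above; the exact output formatting is illustrative:
=> SELECT FLOAT '1.25e2' AS decimal_exponent, FLOAT '0x1.8p3' AS hex_binary_exponent;
 decimal_exponent | hex_binary_exponent
------------------+---------------------
              125 |                  12
(1 row)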
Infinity
An infinity is either INF
or INFINITY
, disregarding case.
NaN (Not A Number)
A NaN is NAN
(disregarding case) optionally followed by a sequence of characters enclosed in parentheses. The character string specifies the value of NAN in an implementation-dependent manner. (The Vertica internal representation of NAN is 0xfff8000000000000LL on x86 machines.)
When writing infinity or NAN values as constants in a SQL statement, enclose them in single quotes. For example:
=> UPDATE table SET x = 'Infinity'
Note
Vertica follows the IEEE definition of NaNs (IEEE 754). The SQL standards do not specify how floating point works in detail.
IEEE defines NaNs as a set of floating point values where each one is not equal to anything, even to itself. A NaN is not greater than and at the same time not less than anything, even itself. In other words, comparisons always return false whenever a NaN is involved.
However, for the purpose of sorting data, NaN values must be placed somewhere in the result. The value generated when 'NaN' appears in the context of a floating-point number matches the NaN value generated by the hardware. For example, Intel hardware generates (0xfff8000000000000LL), which is technically a Negative, Quiet, Non-signaling NaN.
Vertica uses a different NaN value to represent floating-point NULL (0x7ffffffffffffffeLL). This is a Positive, Quiet, Non-signaling NaN and is reserved by Vertica.
A NaN example follows.
=> SELECT CBRT('Nan'); -- cube root
CBRT
------
NaN
(1 row)
=> SELECT 'Nan' > 1.0;
?column?
----------
f
(1 row)
Null Value
The load file format of a null value is user defined, as described in the COPY
command. The Vertica internal representation of a null value is 0x7fffffffffffffffLL. The interactive format is controlled by the vsql printing option null. For example:
\pset null '(null)'
The default option is not to print anything.
Rules
To search for NaN column values, use the following predicate:
... WHERE column != column
This is necessary because WHERE
column = 'Nan'
cannot be true by definition.
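For example, assuming a hypothetical table measurements with a FLOAT column reading that contains one NaN row, the predicate returns only that row (output illustrative):
=> SELECT reading FROM measurements WHERE reading != reading;
 reading
---------
 NaN
(1 row)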
Sort order (ascending)
-
NaN
-
-Inf
-
numbers
-
+Inf
-
NULL
Notes
-
NULL
appears last (largest) in ascending order.
-
All overflows in floats generate +/-infinity or NaN, per the IEEE floating point standard.
3.6.2 - INTEGER
A signed 8-byte (64-bit) data type.
A signed 8-byte (64-bit) data type.
Syntax
[ INTEGER | INT | BIGINT | INT8 | SMALLINT | TINYINT ]
Parameters
INT
, INTEGER
, INT8
, SMALLINT
, TINYINT
, and BIGINT
are all synonyms for the same signed 64-bit integer data type. Automatic compression techniques are used to conserve disk space in cases where the full 64 bits are not required.
Notes
-
The range of values is –2^63+1 to 2^63-1.
-
2^63 = 9,223,372,036,854,775,808 (19 digits).
-
The value –2^63 is reserved to represent NULL.
-
NULL
appears first (smallest) in ascending order.
-
Vertica does not have an explicit 4-byte (32-bit integer) or smaller types. Vertica's encoding and compression automatically eliminate the storage overhead of values that fit in less than 64 bits.
Restrictions
-
The JDBC type INTEGER is 4 bytes and is not supported by Vertica. Use BIGINT
instead.
-
Vertica does not support the SQL/JDBC types NUMERIC
, SMALLINT
, or TINYINT
.
-
Vertica does not check for overflow (positive or negative) except in the aggregate function SUM
()
. If you encounter overflow when using SUM
, use SUM_FLOAT
()
, which converts to floating point.
See also
Data Type Coercion Chart
3.6.3 - NUMERIC
Numeric data types store fixed-point numeric data.
Numeric data types store fixed-point numeric data. For example, a value of $123.45 can be stored in a NUMERIC(5,2)
field. Note that the first number, the precision, specifies the total number of digits.
Syntax
numeric-type [ ( precision[, scale] ) ]
Parameters
numeric-type
- One of the following:
-
NUMERIC
-
DECIMAL
-
NUMBER
-
MONEY
precision
- An unsigned integer that specifies the total number of significant digits that the data type stores, where
precision
is ≤ 1024. If omitted, the default precision depends on numeric type that you specify. If you assign a value that exceeds precision
, Vertica returns an error.
If a data type's precision is ≤ 18, performance is equivalent to an INTEGER data type, regardless of scale. When possible, Vertica recommends using a precision ≤ 18.
scale
- An unsigned integer that specifies the maximum number of digits to the right of the decimal point to store.
scale
must be ≤ precision
. If omitted, the default scale depends on the numeric type that you specify. If you assign a value with more decimal digits than scale, the value is rounded to scale digits.
When using ALTER to modify the data type of a numeric column, scale
cannot be changed.
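For example, casting a value that has more decimal digits than the declared scale rounds it to scale digits (a minimal sketch; output illustrative):
=> SELECT 123.456789::NUMERIC(8,2) AS rounded;
 rounded
---------
  123.46
(1 row)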
Default precision and scale
NUMERIC
, DECIMAL
, NUMBER
, and MONEY
differ in their default precision and scale values:
Type | Precision | Scale
NUMERIC | 37 | 15
DECIMAL | 37 | 15
NUMBER | 38 | 0
MONEY | 18 | 4
Supported encoding
Vertica supports the following encoding for numeric data types:
-
Precision ≤ 18: AUTO
, BLOCK_DICT
, BLOCKDICT_COMP
, COMMONDELTA_COMP
, DELTAVAL
, GCDDELTA
, and RLE
-
Precision > 18: AUTO
, BLOCK_DICT
, BLOCKDICT_COMP
, RLE
For details, see Encoding types.
Numeric versus integer and floating data types
Numeric data types are exact data types that store values of a specified precision and scale, expressed with a number of digits before and after a decimal point. This contrasts with the Vertica integer and floating data types:
-
DOUBLE PRECISION (FLOAT) supports ~15 digits, variable exponent, and represents numeric values approximately. It can be less precise than NUMERIC data types.
-
INTEGER supports ~18 digits, whole numbers only.
The NUMERIC data type is preferred for non-integer constants, because it is always exact. For example:
=> SELECT 1.1 + 2.2 = 3.3;
?column?
----------
t
(1 row)
=> SELECT 1.1::float + 2.2::float = 3.3::float;
?column?
----------
f
(1 row)
Numeric operations
Supported numeric operations include the following:
-
Basic math: +
, -
, *
, /
-
Aggregation: SUM, MIN, MAX, COUNT
-
Comparison: <
, <=
, =
, <=>
, <>
, >
, >=
NUMERIC divide operates directly on numeric values, without converting to floating point. The result has at least 18 decimal places and is rounded.
NUMERIC mod (including %) operates directly on numeric values, without converting to floating point. The result has the same scale as the numerator and never needs rounding.
Some complex operations used with numeric data types result in an implicit cast to FLOAT. When using SQRT, STDDEV, transcendental functions such as LOG, and TO_CHAR/TO_NUMBER formatting, the result is always FLOAT.
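For example, the following sketch illustrates these rules: modulo keeps the scale of the numerator, while division carries at least 18 decimal places (output illustrative):
=> SELECT 10.5 % 3 AS numeric_mod, 1.0 / 3 AS numeric_div;
 numeric_mod |     numeric_div
-------------+----------------------
         1.5 | 0.333333333333333333
(1 row)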
Examples
The following series of commands creates a table that contains a numeric data type and then performs some mathematical operations on the data:
=> CREATE TABLE num1 (id INTEGER, amount NUMERIC(8,2));
Insert some values into the table:
=> INSERT INTO num1 VALUES (1, 123456.78);
Query the table:
=> SELECT * FROM num1;
id | amount
------+-----------
1 | 123456.78
(1 row)
The following example returns the NUMERIC column, amount, from table num1:
=> SELECT amount FROM num1;
amount
-----------
123456.78
(1 row)
The following syntax adds one (1) to the amount:
=> SELECT amount+1 AS 'amount' FROM num1;
amount
-----------
123457.78
(1 row)
The following syntax multiplies the amount column by 2:
=> SELECT amount*2 AS 'amount' FROM num1;
amount
-----------
246913.56
(1 row)
The following syntax returns a negative number for the amount column:
=> SELECT -amount FROM num1;
?column?
------------
-123456.78
(1 row)
The following syntax returns the absolute value of the amount argument:
=> SELECT ABS(amount) FROM num1;
ABS
-----------
123456.78
(1 row)
The following syntax casts the NUMERIC amount as a FLOAT data type:
=> SELECT amount::float FROM num1;
amount
-----------
123456.78
(1 row)
See also
Mathematical functions
3.6.4 - Numeric data type overflow
Vertica does not check for overflow (positive or negative) except in the aggregate function SUM().
Vertica does not check for overflow (positive or negative) except in the aggregate function SUM
()
. If you encounter overflow when using SUM
, use SUM_FLOAT
()
which converts to floating point.
For a detailed discussion of how Vertica handles overflow when you use the functions SUM, SUM_FLOAT, and AVG with numeric data types, see Numeric data type overflow with SUM, SUM_FLOAT, and AVG. The discussion includes directives for turning off silent numeric overflow and setting precision for numeric data types.
Dividing by zero returns an error:
=> SELECT 0/0;
ERROR 3117: Division by zero
=> SELECT 0.0/0;
ERROR 3117: Division by zero
=> SELECT 0 // 0;
ERROR 3117: Division by zero
=> SELECT 200.0/0;
ERROR 3117: Division by zero
=> SELECT 116.43 // 0;
ERROR 3117: Division by zero
Dividing zero as a FLOAT by zero returns NaN:
=> SELECT 0.0::float/0;
?column?
----------
NaN
=> SELECT 0.0::float//0;
?column?
----------
NaN
Dividing a non-zero FLOAT by zero returns Infinity:
=> SELECT 2.0::float/0;
?column?
----------
Infinity
=> SELECT 200.0::float//0;
?column?
----------
Infinity
Add, subtract, and multiply operations ignore overflow. Sum and average operations use 128-bit arithmetic internally. SUM
()
reports an error if the final result overflows, suggesting the use of SUM_FLOAT
(INT)
, which converts the 128-bit sum to a FLOAT
. For example:
=> CREATE TEMP TABLE t (i INT);
=> INSERT INTO t VALUES (1<<62);
=> INSERT INTO t VALUES (1<<62);
=> INSERT INTO t VALUES (1<<62);
=> INSERT INTO t VALUES (1<<62);
=> INSERT INTO t VALUES (1<<62);
=> SELECT SUM(i) FROM t;
ERROR: sum() overflowed
HINT: try sum_float() instead
=> SELECT SUM_FLOAT(i) FROM t;
SUM_FLOAT
---------------------
2.30584300921369e+19
(1 row)
3.6.5 - Numeric data type overflow with SUM, SUM_FLOAT, and AVG
When you use the SUM, SUM_FLOAT, and AVG functions (aggregate and analytic) to query a numeric column, overflow can occur.
When you use the SUM, SUM_FLOAT, and AVG functions (aggregate and analytic) to query a numeric column, overflow can occur. How Vertica responds to that overflow depends on the settings of two configuration parameters:
- AllowNumericOverflow (Boolean, default 1) allows numeric overflow. Vertica does not implicitly extend precision of numeric data types.
- NumericSumExtraPrecisionDigits (integer, default 6) determines whether to return an overflow error if a result exceeds the specified precision. This parameter is ignored if AllowNumericOverflow is set to 1 (true).
Vertica also allows numeric overflow when you use SUM or SUM_FLOAT to query pre-aggregated data. See Impact on Pre-Aggregated Data Projections below.
Default overflow handling
With numeric columns, Vertica internally works with multiples of 18 digits. If specified precision is less than 18—for example, x(12,0)
—Vertica allows overflow up to and including the first multiple of 18. In some situations, if you sum a column, you can exceed the number of digits Vertica internally reserves for the result. In this case, Vertica allows silent overflow.
Turning off silent numeric overflow
You can turn off silent numeric overflow by setting AllowNumericOverflow to 0. In this case, Vertica checks the value of configuration parameter NumericSumExtraPrecisionDigits. By default, this parameter is set to 6, which means that Vertica internally adds extra digit places beyond a column's DDL-specified precision. Adding extra precision digits enables Vertica to consistently return results that overflow the column's precision. However, crossing into the second multiple of 18 internally can adversely impact performance.
For example, if AllowNumericOverflow is set to 0:
- Column
x
is defined as x(12,0)
and NumericSumExtraPrecisionDigits is set to 6: Vertica internally stays within the first multiple of 18 digits and no additional performance impact occurs (a
).
- Column
x
is defined as x(2,0)
and NumericSumExtraPrecisionDigits is set to 20: Vertica internally crosses a threshold into the second multiple of 18. In this case, performance is significantly affected (2a
). Performance beyond the second multiple of 18 continues to be 2a
.
Tip
Vertica recommends that you turn off silent numeric overflow and set the parameter NumericSumExtraPrecisionDigits if you expect query results to exceed the precision specified in the DDL of numeric columns. Be aware of the following considerations:
- If you turn off AllowNumericOverflow and exceed the number of extra precision digits set by NumericSumExtraPrecisionDigits, Vertica returns an error.
- Be careful to set NumericSumExtraPrecisionDigits only as high as necessary to return the SUM of numeric columns.
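For example, one way to apply these settings at the database level is with the SET_CONFIG_PARAMETER meta-function. This is a sketch that disables silent overflow and keeps the default of 6 extra precision digits; outputs are omitted:
=> SELECT SET_CONFIG_PARAMETER('AllowNumericOverflow', 0);
=> SELECT SET_CONFIG_PARAMETER('NumericSumExtraPrecisionDigits', 6);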
Impact on pre-aggregated data projections
Vertica supports silent numeric overflow for queries that use SUM or SUM_FLOAT on projections with pre-aggregated data such as live aggregate or Top-K projections. To turn off silent numeric overflow for these queries:
-
Set AllowNumericOverflow to 0.
-
Set NumericSumExtraPrecisionDigits to the desired number of implicit digits. Alternatively, use the default setting of 6.
-
Drop and re-create the affected projections.
If you turn off silent numeric overflow, be aware that overflow can sometimes cause rollback or errors:
-
Overflow occurs during load operations, such as COPY, MERGE, or INSERT:
Vertica aggregates data before loading the projection with data. If overflow occurs while data is aggregated, Vertica rolls back the load operation.
-
Overflow occurs after load, while Vertica sums existing data.
Vertica computes the sum of existing data separately from the computation that it does during data load. If the projection selects a column with SUM or SUM_FLOAT and overflow occurs, Vertica produces an error message. This response resembles the way Vertica produces an error for a query that uses SUM or SUM_FLOAT.
-
Overflow occurs during mergeout.
Vertica logs a message during mergeout if overflow occurs while Vertica computes a final sum during the mergeout operation. If an error occurs, Vertica marks the projection as out of date and disqualifies it from further mergeout operations.
3.7 - Spatial data types
The maximum amount of spatial data that a GEOMETRY or GEOGRAPHY column can store, up to 10 MB.
Vertica supports two spatial data types. These data types store two- and three-dimensional spatial objects in a table column:
-
GEOMETRY
-
GEOGRAPHY
The maximum size of a GEOMETRY
or GEOGRAPHY
data type is 10,000,000 bytes (10 MB). You cannot use either data type as a table's primary key.
Syntax
GEOMETRY [ (length) ]
GEOGRAPHY [ (length) ]
Parameters
length
- The maximum amount of spatial data that a
GEOMETRY
or GEOGRAPHY
column can store, up to 10 MB.
Default: 1 MB
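For example, a table definition that uses both types might look like the following sketch; the table and column names are hypothetical:
=> CREATE TABLE spatial_demo (
     gid INT,
     boundary GEOMETRY(8192),
     location GEOGRAPHY
   );
CREATE TABLE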
3.8 - UUID data type
Stores universally unique identifiers (UUIDs).
Stores universally unique identifiers (UUIDs). UUIDs are 16-byte (128-bit) numbers used to uniquely identify records. To generate UUIDs, Vertica provides the function
UUID_GENERATE
, which returns UUIDs based on high-quality randomness from /dev/urandom
.
Syntax
UUID
UUIDs support input of case-insensitive string literal formats, as specified by RFC 4122. In general, a UUID is written as a sequence of hexadecimal digits, in several groups optionally separated by hyphens, for a total of 32 digits representing 128 bits.
The following input formats are valid:
6bbf0744-74b4-46b9-bb05-53905d4538e7
{6bbf0744-74b4-46b9-bb05-53905d4538e7}
6BBF074474B446B9BB0553905D4538E7
6BBf-0744-74B4-46B9-BB05-5390-5D45-38E7
On output, Vertica always uses the following format:
xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
For example, the following table defines column cust_id
as a UUID:
=> CREATE TABLE public.Customers
(
cust_id uuid,
lname varchar(36),
fname varchar(24)
);
The following input for cust_id
uses several valid formats:
=> COPY Customers FROM STDIN;
Enter data to be copied followed by a newline.
End with a backslash and a period on a line by itself.
>> {cede66b7-3d29-4da6-b700-871fc0ac57be}|Kearney|Thomas
>> 34462732ed5649838f3be735b0c32d50|Pham|Duc
>> 9fb0-1de0-1d63-4d09-9415-90e0-b4e9-3b9a|Steinberg|Jeremy
>> \.
On querying this table, Vertica formats all cust_id
data in the same way:
=> SELECT cust_id, fname, lname FROM Customers;
cust_id | fname | lname
--------------------------------------+--------+-----------
9fb01de0-1d63-4d09-9415-90e0b4e93b9a | Jeremy | Steinberg
34462732-ed56-4983-8f3b-e735b0c32d50 | Duc | Pham
cede66b7-3d29-4da6-b700-871fc0ac57be | Thomas | Kearney
(3 rows)
Generating UUIDs
You can use the Vertica function
UUID_GENERATE
to automatically generate UUIDs that uniquely identify table records. For example:
=> INSERT INTO Customers SELECT UUID_GENERATE(),'Rostova','Natasha';
OUTPUT
--------
1
(1 row)
=> COMMIT;
COMMIT
=> SELECT cust_id, fname, lname FROM Customers;
cust_id | fname | lname
--------------------------------------+---------+-----------
9fb01de0-1d63-4d09-9415-90e0b4e93b9a | Jeremy | Steinberg
34462732-ed56-4983-8f3b-e735b0c32d50 | Duc | Pham
cede66b7-3d29-4da6-b700-871fc0ac57be | Thomas | Kearney
9aad6757-fe1b-473a-a109-b89b7b358c69 | Natasha | Rostova
(4 rows)
The following string is reserved as NULL for UUID columns:
00000000-0000-0000-0000-000000000000
Vertica always renders NULL as blank.
The following COPY
statements insert NULL values into the UUID column, explicitly and implicitly:
=> COPY Customers FROM STDIN NULL AS 'null';
Enter data to be copied followed by a newline.
End with a backslash and a period on a line by itself.
>> null|Doe|Jane
>> 00000000-0000-0000-0000-000000000000|Man|Nowhere
>> \.
=> COPY Customers FROM STDIN;
>> |Doe|John
>> \.
In all cases, Vertica renders NULL as blank:
=> SELECT cust_id, fname, lname FROM Customers WHERE cust_id IS NULL;
cust_id | fname | lname
---------+---------+-------
| Nowhere | Man
| Jane | Doe
| John | Doe
(3 rows)
Usage restrictions
UUID data types only support relational operators and functions that are also supported by CHAR and VARCHAR data types—for example,
MIN
,
MAX
, and
COUNT
. UUID data types do not support mathematical operators or functions, such as
SUM
and
AVG
.
3.9 - Data type coercion
Vertica supports two types of data type casting: implicit and explicit.
Vertica supports two types of data type casting: implicit and explicit.
Implicit casting
The ANSI SQL-92 standard supports implicit casting among similar data types:
-
Number types
-
CHAR, VARCHAR, LONG VARCHAR
-
BINARY, VARBINARY, LONG VARBINARY
Vertica supports two types of nonstandard implicit casts of scalar types:
-
From CHAR to FLOAT, to match the one from VARCHAR to FLOAT. The following example converts the CHAR '3'
to a FLOAT so it can add the number 4.33 to the FLOAT result of the second expression:
=> SELECT '3'::CHAR + 4.33::FLOAT;
?column?
----------
7.33
(1 row)
-
- Between DATE and TIMESTAMP. The following example casts a DATE to a TIMESTAMP and calculates the time 6 hours, 6 minutes, and 6 seconds back from 12:00 AM:
=> SELECT DATE('now') - INTERVAL '6:6:6';
?column?
---------------------
2013-07-30 17:53:54
(1 row)
When there is no ambiguity about the data type of an expression value, it is implicitly coerced to match the expected data type. In the following statement, the quoted string constant '2' is implicitly coerced into an INTEGER value so that it can be the operand of an arithmetic operator (addition):
=> SELECT 2 + '2';
?column?
----------
4
(1 row)
A concatenate operation explicitly takes arguments of any data type. In the following example, the concatenate operation implicitly coerces the arithmetic expression 2 + 2
and the INTEGER constant 2
to VARCHAR values so that they can be concatenated.
=> SELECT 2 + 2 || 2;
?column?
----------
42
(1 row)
Another example is to first get today's date:
=> SELECT DATE 'now';
?column?
------------
2013-07-31
(1 row)
The following command converts DATE to a TIMESTAMP and adds a day and a half to the results by using INTERVAL:
=> SELECT DATE 'now' + INTERVAL '1 12:00:00';
?column?
---------------------
2013-07-31 12:00:00
(1 row)
Most implicit casts stay within their relational family and go in one direction, from less detailed to more detailed. For example:
-
DATE to TIMESTAMP/TZ
-
INTEGER to NUMERIC to FLOAT
-
CHAR to FLOAT
-
CHAR to VARCHAR
-
CHAR and/or VARCHAR to FLOAT
-
CHAR to LONG VARCHAR
-
VARCHAR to LONG VARCHAR
-
BINARY to VARBINARY
-
BINARY to LONG VARBINARY
-
VARBINARY to LONG VARBINARY
More specifically, data type coercion works in this manner in Vertica:
Conversion | Notes
INT8 > FLOAT8 | Implicit, can lose significance
FLOAT8 > INT8 | Explicit, rounds
VARCHAR <-> CHAR | Implicit, adjusts trailing spaces
VARBINARY <-> BINARY | Implicit, adjusts trailing NULs
VARCHAR > LONG VARCHAR | Implicit, adjusts trailing spaces
VARBINARY > LONG VARBINARY | Implicit, adjusts trailing NULs
No other types cast to or from LONG VARBINARY, VARBINARY, or BINARY. In the following list, <any> means one of these types: INT8, FLOAT8, DATE, TIME, TIMETZ, TIMESTAMP, TIMESTAMPTZ, INTERVAL:
-
<any> -> VARCHAR—implicit
-
VARCHAR -> <any>—explicit, except that VARCHAR->FLOAT is implicit
-
<any> <-> CHAR—explicit
-
DATE -> TIMESTAMP/TZ—implicit
-
TIMESTAMP/TZ -> DATE—explicit, loses time-of-day
-
TIME -> TIMETZ—implicit, adds local timezone
-
TIMETZ -> TIME—explicit, loses timezone
-
TIME -> INTERVAL—implicit, day to second with days=0
-
INTERVAL -> TIME—explicit, truncates non-time parts
-
TIMESTAMP <-> TIMESTAMPTZ—implicit, adjusts to local timezone
-
TIMESTAMP/TZ -> TIME—explicit, truncates non-time parts
-
TIMESTAMPTZ -> TIMETZ—explicit
-
VARBINARY -> LONG VARBINARY—implicit
-
LONG VARBINARY -> VARBINARY—explicit
-
VARCHAR -> LONG VARCHAR—implicit
-
LONG VARCHAR -> VARCHAR—explicit
Important
Implicit casts from INTEGER, FLOAT, and NUMERIC to VARCHAR are not supported. If you need that functionality, write an explicit cast:
CAST(x AS data-type-name)
or
x::data-type-name
The following example casts a FLOAT to an INTEGER:
=> SELECT(FLOAT '123.5')::INT;
?column?
----------
124
(1 row)
String-to-numeric data type conversions accept formats of quoted constants for scientific notation, binary scaling, hexadecimal, and combinations of numeric-type literals:
-
Scientific notation:
=> SELECT FLOAT '1e10';
?column?
-------------
10000000000
(1 row)
-
BINARY
scaling:
=> SELECT NUMERIC '1p10';
?column?
----------
1024
(1 row)
-
hexadecimal:
=> SELECT NUMERIC '0x0abc';
?column?
----------
2748
(1 row)
Complex types
Collections (arrays and sets) can be cast implicitly and explicitly. Casting a collection casts each element of the collection. You can, for example, cast an ARRAY[VARCHAR] to an ARRAY[INT] or a SET[DATE] to SET[TIMESTAMPTZ]. You can cast between arrays and sets.
When casting to a bounded native array, inputs that are too long are truncated. When casting to a non-native array (an array containing complex data types, including other arrays), if the new bounds are too small for the data, the cast fails.
Rows (structs) can be cast implicitly and explicitly. Casting a ROW casts each field value. You can specify new field names in the cast or specify only the field types to use the existing field names.
Casting can increase the storage needed for a column. For example, if you cast an array of INT to an array of VARCHAR(50), each element takes more space and thus the array takes more space. If the difference is extreme or the array has many elements, this could mean that the array no longer fits within the space allotted for the column. In this case the operation reports an error and fails.
Examples
The following example casts three strings as NUMERICs:
=> SELECT NUMERIC '12.3e3', '12.3p10'::NUMERIC, CAST('0x12.3p-10e3' AS NUMERIC);
?column? | ?column? | ?column?
----------+----------+-------------------
12300 | 12595.2 | 17.76123046875000
(1 row)
This example casts a VARBINARY string into a LONG VARBINARY data type:
=> SELECT B'101111000'::LONG VARBINARY;
?column?
----------
\001x
(1 row)
The following example concatenates a CHAR with a LONG VARCHAR, resulting in a LONG VARCHAR:
=> \set s ''''`cat longfile.txt`''''
=> SELECT length ('a' || :s ::LONG VARCHAR);
length
----------
65002
(1 row)
The following example casts a combination of NUMERIC and INTEGER data into a NUMERIC result:
=> SELECT (18. + 3./16)/1024*1000;
?column?
-----------------------------------------
17.761230468750000000000000000000000000
(1 row)
Note
In SQL expressions, pure numbers between (–2^63–1) and (2^63–1) are INTEGERs. Numbers with decimal points are NUMERIC.
3.10 - Data type coercion chart
The following table defines all possible type conversions that Vertica supports.
Conversion types
The following table defines all possible type conversions that Vertica supports. The data types in the first column of the table are the inputs to convert, and the remaining columns indicate the result for the different conversion types.
Source Data Type | Implicit | Explicit | Assignment | Assignment without numeric meaning | Conversion without explicit casting
BOOLEAN | | | INTEGER, LONG VARCHAR, VARCHAR, CHAR | |
INTEGER | BOOLEAN, NUMERIC, FLOAT | | INTERVAL DAY/SECOND, INTERVAL YEAR/MONTH | LONG VARCHAR, VARCHAR, CHAR |
NUMERIC | FLOAT | | INTEGER | LONG VARCHAR, VARCHAR, CHAR | NUMERIC
FLOAT | | | INTEGER, NUMERIC | LONG VARCHAR, VARCHAR, CHAR |
LONG VARCHAR | FLOAT, CHAR | BOOLEAN, INTEGER, NUMERIC, VARCHAR, TIMESTAMP, TIMESTAMPTZ, DATE, TIME, TIMETZ, INTERVAL DAY/SECOND, INTERVAL YEAR/MONTH, LONG VARBINARY | | | LONG VARCHAR
VARCHAR | CHAR, FLOAT, LONG VARCHAR | BOOLEAN, INTEGER, NUMERIC, TIMESTAMP, TIMESTAMPTZ, DATE, TIME, TIMETZ, UUID, BINARY, VARBINARY, INTERVAL DAY/SECOND, INTERVAL YEAR/MONTH | | | VARCHAR
CHAR | FLOAT, LONG VARCHAR, VARCHAR | BOOLEAN, INTEGER, NUMERIC, TIMESTAMP, TIMESTAMPTZ, DATE, TIME, TIMETZ, UUID (CHAR length ≥ 36), BINARY, VARBINARY, INTERVAL DAY/SECOND, INTERVAL YEAR/MONTH | | | CHAR
TIMESTAMP | TIMESTAMPTZ | | LONG CHAR, VARCHAR, CHAR, DATE, TIME | | TIMESTAMP
TIMESTAMPTZ | TIMESTAMP | | LONG CHAR, VARCHAR, CHAR, DATE, TIME, TIMETZ | | TIMESTAMPTZ
DATE | TIMESTAMP | | LONG CHAR, VARCHAR, CHAR, TIMESTAMPTZ | |
TIME | TIMETZ | TIMESTAMP, TIMESTAMPTZ, INTERVAL DAY/SECOND | LONG CHAR, VARCHAR, CHAR | | TIME
TIMETZ | | TIMESTAMP, TIMESTAMPTZ | LONG CHAR, VARCHAR, CHAR, TIME | | TIMETZ
INTERVAL DAY/SECOND | | TIME | INTEGER, LONG CHAR, VARCHAR, CHAR | | INTERVAL DAY/SECOND
INTERVAL YEAR/MONTH | | | INTEGER, LONG CHAR, VARCHAR, CHAR | | INTERVAL YEAR/MONTH
LONG VARBINARY | | VARBINARY | | | LONG VARBINARY
VARBINARY | LONG VARBINARY, BINARY | | | | VARBINARY
BINARY | VARBINARY | | | | BINARY
UUID | | CHAR(36), VARCHAR | | | UUID
Implicit and explicit conversion
Vertica supports data type conversion of values without explicit casting, such as NUMERIC(10,6) -> NUMERIC(18,4). Implicit data type conversion occurs automatically when converting values of different, but compatible, types to the target column's data type. For example, when adding values (INTEGER + NUMERIC), the result is implicitly cast to a NUMERIC type to accommodate the prominent type in the statement. Depending on the input data types, different precision and scale can occur.
An explicit type conversion must occur when the source data cannot be cast implicitly to the target column's data type.
Assignment conversion
In data assignment conversion, coercion implicitly occurs when values are assigned to database columns in an INSERT
or UPDATE...SET
statement. For example, in a statement that includes INSERT...VALUES('2.5')
, where the target column data type is NUMERIC(18,5)
, a cast from VARCHAR
to the column data type is inferred.
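For example, the following sketch relies on assignment conversion to load a quoted string into a NUMERIC column. The table and column names are hypothetical, and the output is illustrative:
=> CREATE TABLE prices (amount NUMERIC(18,5));
CREATE TABLE
=> INSERT INTO prices VALUES ('2.5');
 OUTPUT
--------
      1
(1 row)
=> SELECT amount FROM prices;
 amount
---------
 2.50000
(1 row)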
In an assignment without numeric meaning, the value is subject to CHAR/VARCHAR/LONG VARCHAR comparisons.
3.11 - Complex types
Complex types such as structures (also known as rows), arrays, and maps are composed of primitive types and sometimes other complex types.
Complex types such as structures (also known as rows), arrays, and maps are composed of primitive types and sometimes other complex types. Complex types can be used in the following ways:
-
Arrays and rows (in any combination) can be used as column data types in both native and external tables.
-
Sets of primitive element types can be used as column data types in native and external tables.
-
Arrays and rows, but not combinations of them, can be created as literals, for example to use in query expressions.
The MAP type is a legacy type. To represent maps, use ARRAY[ROW].
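For example, a map of string keys to string values can be modeled as an array of two-field rows, as in the following sketch (table and field names are hypothetical):
=> CREATE TABLE http_requests (
     url VARCHAR,
     headers ARRAY[ROW(name VARCHAR, value VARCHAR)]
   );
CREATE TABLE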
If a flex table has a real column that uses a complex type, the values from that column are not included in the __raw__
column. For more information, see Loading Data into Flex Table Real Columns.
3.11.1 - ARRAY
Represents array data.
Represents array data. There are two types of arrays in Vertica:
-
Native array: a one-dimensional array of a primitive type. Native arrays are tracked in the TYPES system table and used in native tables.
-
Non-native array: all other supported arrays, including arrays that contain other arrays (multi-dimensional arrays) or structs (ROWs). Non-native arrays have some usage restrictions. Non-native arrays are tracked in the COMPLEX_TYPES system table.
Both types of arrays operate in the same way, but they have different OIDs.
Arrays can be bounded, meaning they specify a maximum element count, or unbounded. Unbounded arrays have a maximum binary size, which can be set explicitly or defaulted. See Limits on Element Count and Collection Size.
Selected parsers support using COPY to load arrays. See the documentation of individual parsers for more information.
Syntax
In column definitions:
ARRAY[data_type, max_elements] |
ARRAY[data_type](max_size) |
ARRAY[data_type]
In literals:
ARRAY[value[, ...] ]
Restrictions
-
Native arrays support only data of primitive types, for example, int, UUID, and so on.
-
Array dimensionality is enforced. A column cannot contain arrays of varying dimensions. For example, a column that contains a three-dimensional array can only contain other three-dimensional arrays; it cannot simultaneously include a one-dimensional array. However, the arrays in a column can vary in size, where one array can contain four elements while another contains ten.
-
Array bounds, if specified, are enforced for all operations that load or alter data. Unbounded arrays may have as many elements as will fit in the allotted binary size.
-
An array has a maximum binary size. If this size is not set when the array is defined, a default value is used.
-
Arrays do not support LONG types (like LONG VARBINARY or LONG VARCHAR) or user-defined types (like Geometry).
Syntax for column definition
Arrays used in column definitions can be either bounded or unbounded. Bounded arrays must specify a maximum number of elements. Unbounded arrays can specify a maximum binary size (in bytes) for the array, or the value of DefaultArrayBinarySize is used. You can specify a bound or a binary size but not both. For more information about these values, see Limits on Element Count and Collection Size.
Bounded array
- Syntax: ARRAY[data_type, max_elements], for example ARRAY[VARCHAR(50),100]
- Semantics: Can contain no more than max_elements elements. Attempting to add more is an error. Has a binary size of the size of the data type multiplied by the maximum number of elements (possibly rounded up).
Unbounded array with maximum binary size
- Syntax: ARRAY[data_type](max_size), for example ARRAY[VARCHAR(50)](32000)
- Semantics: Can contain as many elements as fit in max_size. Ignores the value of DefaultArrayBinarySize.
Unbounded array with default binary size
- Syntax: ARRAY[data_type], for example ARRAY[VARCHAR(50)]
- Semantics: Can contain as many elements as fit in the default binary size. Equivalent to ARRAY[data_type](DefaultArrayBinarySize).
The following example defines a table for customers using an unbounded array:
=> CREATE TABLE customers (id INT, name VARCHAR, email ARRAY[VARCHAR(50)]);
The following example uses a bounded array for customer email addresses and an unbounded array for order history:
=> CREATE TABLE customers (id INT, name VARCHAR, email ARRAY[VARCHAR(50),5], orders ARRAY[INT]);
The following example uses an array that has ROW elements:
=> CREATE TABLE orders(
orderid INT,
accountid INT,
shipments ARRAY[
ROW(
shipid INT,
address ROW(
street VARCHAR,
city VARCHAR,
zip INT
),
shipdate DATE
)
]
);
To declare a multi-dimensional array, use nesting. For example, ARRAY[ARRAY[int]] specifies a two-dimensional array.
Syntax for direct construction (literals)
Use the ARRAY keyword to construct an array value. The following example creates an array of integer values.
=> SELECT ARRAY[1,2,3];
array
-------
[1,2,3]
(1 row)
You can nest an array inside another array, as in the following example.
=> SELECT ARRAY[ARRAY[1],ARRAY[2]];
array
-----------
[[1],[2]]
(1 row)
If an array of arrays contains no null elements and no function calls, you can abbreviate the syntax:
=> SELECT ARRAY[[1,2],[3,4]];
array
---------------
[[1,2],[3,4]]
(1 row)
---not valid:
=> SELECT ARRAY[[1,2],null,[3,4]];
ERROR 4856: Syntax error at or near "null" at character 20
LINE 1: SELECT ARRAY[[1,2],null,[3,4]];
^
Array literals can contain elements of all scalar types, ROW, and ARRAY. ROW elements must all have the same set of fields:
=> SELECT ARRAY[ROW(1,2),ROW(1,3)];
array
-----------------------------------
[{"f0":1,"f1":2},{"f0":1,"f1":3}]
(1 row)
=> SELECT ARRAY[ROW(1,2),ROW(1,3,'abc')];
ERROR 3429: For 'ARRAY', types ROW(int,int) and ROW(int,int,unknown) are inconsistent
Because the elements are known at the time you directly construct an array, these arrays are implicitly bounded.
You can use ARRAY literals in comparisons, as in the following example:
=> SELECT id.name, id.num, GPA FROM students
WHERE major = ARRAY[ROW('Science','Physics')];
name | num | GPA
-------+-----+-----
bob | 121 | 3.3
carol | 123 | 3.4
(2 rows)
Queries of array columns return JSON format, with the values shown in comma-separated lists in brackets. The following example shows a query that includes array columns.
=> SELECT cust_custkey,cust_custstaddress,cust_custcity,cust_custstate from cust;
cust_custkey | cust_custstaddress | cust_custcity | cust_custstate
-------------+------- ----------------------------------------------+---------------------------------------------+----------------
342176 | ["668 SW New Lane","518 Main Ave","7040 Campfire Dr"] | ["Winchester","New Hyde Park","Massapequa"] | ["VA","NY","NY"]
342799 | ["2400 Hearst Avenue","3 Cypress Street"] | ["Berkeley","San Antonio"] | ["CA","TX"]
342845 | ["336 Boylston Street","180 Clarkhill Rd"] | ["Boston","Amherst"] | ["MA","MA"]
342321 | ["95 Fawn Drive"] | ["Allen Park"] | ["MI"]
342989 | ["5 Thompson St"] | ["Massillon"] | ["OH"]
(5 rows)
Note that JSON format escapes some characters that would not be escaped in native VARCHARs. For example, if you insert "c:\users\data"
into an array, the JSON output for that value is "c:\\users\\data"
.
Element access
Arrays are 0-indexed. The first element's ordinal position is 0, second is 1, and so on.
You can access (dereference) elements from an array by index:
=> SELECT (ARRAY['a','b','c','d','e'])[1];
array
-------
b
(1 row)
To specify a range, use the format start:end. The end of the range is non-inclusive.
=> SELECT(ARRAY['a','b','c','d','e','f','g'])[1:4];
array
---------
["b","c","d"]
(1 row)
To dereference an element from a multi-dimensional array, put each index in brackets:
=> SELECT(ARRAY[ARRAY[1,2],ARRAY[3,4]])[0][0];
array
-------
1
(1 row)
Out-of-bound index references return NULL.
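For example, referencing an index past the end of the array returns NULL rather than an error (output illustrative):
=> SELECT (ARRAY['a','b','c'])[10];
 array
-------

(1 row)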
Limits on element count and collection size
When declaring a collection type for a table column, you can limit either the number of elements or the total binary size of the collection. During query processing, Vertica always reserves the maximum memory needed for the column, based on either the element count or the binary size. If this size is much larger than your data actually requires, setting one of these limits can improve query performance by reducing the amount of memory that must be reserved for the column.
You can change the bounds of a collection, including changing between bounded and unbounded collections, by casting. See Casting.
A bounded collection specifies a maximum element count. A value in a bounded collection column may contain fewer elements, but it may not contain more. Any attempt to insert more elements into a bounded collection than the declared maximum is an error. A bounded collection has a binary size that is the product of the data-type size and the maximum number of elements, possibly rounded up.
An unbounded collection specifies a binary size in bytes, explicitly or implicitly. It may contain as many elements as can fit in that binary size.
If a nested array specifies bounds for all dimensions, Vertica sets a single bound that is the product of the bounds. In the following example, the inner and outer arrays each have a bound of 10, but only a total element count of 100 is enforced.
ARRAY[ARRAY[INT,10],10]
If a nested array specifies a bound for only the outer collection, it is treated as the total bound. The previous example is equivalent to the following:
ARRAY[ARRAY[INT],100]
You must either specify bounds for all nested collections or specify a bound only for the outer one. For any other distribution of bounds, Vertica treats the collection as unbounded.
Instead of specifying a bound, you can specify a maximum binary size for an unbounded collection. The binary size acts as an absolute limit, regardless of how many elements the collection contains. Collections that do not specify a maximum binary size use the value of DefaultArrayBinarySize. This size is set at the time the collection is defined and is not affected by later changes to the value of DefaultArrayBinarySize.
You cannot set a maximum binary size for a bounded collection, only an unbounded one.
You can change the bounds or the binary size of an array column using ALTER TABLE as in the following example:
=> ALTER TABLE cust ALTER COLUMN orders SET DATA TYPE ARRAY[INTEGER](100);
If the change reduces the size of the collection and would result in data loss, the change fails.
Comparisons
All collections support equality (=
), inequality (<>
), and null-safe equality (<=>
). 1D collections also support comparison operators (<
, <=
, >
, >=
) between collections of the same type (arrays or sets). Comparisons follow these rules:
-
A null collection is ordered last.
-
Non-null collections are compared element by element, using the ordering rules of the element's data type. The relative order of the first pair of non-equal elements determines the order of the two collections.
-
If all elements in both collections are equal up to the length of the shorter collection, the shorter collection is ordered before the longer one.
-
If all elements in both collections are equal and the collections are of equal length, the collections are equal.
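The following queries illustrate these rules (a minimal sketch): the first comparison is decided by the first pair of non-equal elements (3 < 4), and the second by the shorter collection ordering first:
=> SELECT ARRAY[1,2,3] < ARRAY[1,2,4], ARRAY[1,2] < ARRAY[1,2,0];
 ?column? | ?column?
----------+----------
 t        | t
(1 row)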
Null-handling
Null semantics for collections are consistent with normal columns in most regards. See NULL sort order for more information on null-handling.
The null-safe equality operator (<=>) behaves differently from equality (=) when the collection is null rather than empty. Comparing a collection to NULL with the equality operator (=) returns null:
=> SELECT ARRAY[1,3] = NULL;
?column?
----------
(1 row)
=> SELECT ARRAY[1,3] <=> NULL;
?column?
----------
f
(1 row)
In the following example, the grants column in the table is null for employee 99:
=> SELECT grants = NULL FROM employees WHERE id=99;
?column?
----------
(1 row)
=> SELECT grants <=> NULL FROM employees WHERE id=99;
?column?
----------
t
(1 row)
Empty collections are not null and behave as expected:
=> SELECT ARRAY[]::ARRAY[INT] = ARRAY[]::ARRAY[INT];
?column?
----------
t
(1 row)
Collections are compared element by element. If a comparison depends on a null element, the result is unknown (null), not false. For example, ARRAY[1,2,null]=ARRAY[1,2,null] and ARRAY[1,2,null]=ARRAY[1,2,3] both return null, but ARRAY[1,2,null]=ARRAY[1,4,null] returns false because the second elements do not match.
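For example (a minimal sketch of the comparisons just described; the empty first column indicates a null result):
=> SELECT ARRAY[1,2,null] = ARRAY[1,2,3], ARRAY[1,2,null] = ARRAY[1,4,null];
 ?column? | ?column?
----------+----------
          | f
(1 row)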
Casting
Casting an array casts each element of the array. You can therefore cast between data types following the same rules as for casts of scalar values.
You can cast both literal arrays and array columns explicitly:
=> SELECT ARRAY['1','2','3']::ARRAY[INT];
array
---------
[1,2,3]
(1 row)
You can change the bound of an array or set by casting. When casting to a bounded native array, inputs that are too long are truncated. When casting to a non-native array (an array containing complex data types including other arrays), if the new bounds are too small for the data the cast fails:
=> SELECT ARRAY[1,2,3]::ARRAY[VARCHAR,2];
array
-----------
["1","2"]
(1 row)
=> SELECT ARRAY[ARRAY[1,2,3],ARRAY[4,5,6]]::ARRAY[ARRAY[VARCHAR,2],2];
ERROR 9227: Output array isn't big enough
DETAIL: Type limit is 4 elements, but value has 6 elements
If you cast to a bounded multi-dimensional array, you must specify the bounds at all levels:
=> SELECT ARRAY[ARRAY[1,2,3],ARRAY[4,5,6]]::ARRAY[ARRAY[VARCHAR,5],10];
array
-------------------------------
[["1","2","3"],["4","5","6"]]
(1 row)
=> SELECT ARRAY[ARRAY[1,2,3],ARRAY[4,5,6]]::ARRAY[ARRAY[VARCHAR,2]];
WARNING 9753: Collection type bound will not be used
DETAIL: A bound was provided for an inner dimension, but not for an outer dimension
array
-------------------------------
[["1","2","3"],["4","5","6"]]
(1 row)
Assignment casts and implicit casts work the same way as for scalars:
=> CREATE TABLE transactions (tid INT, prod_ids ARRAY[VARCHAR,100], quantities ARRAY[INT,100]);
CREATE TABLE
=> INSERT INTO transactions VALUES (12345, ARRAY['p1265', 'p4515'], ARRAY[15,2]);
OUTPUT
--------
1
(1 row)
=> CREATE TABLE txreport (prod_ids ARRAY[VARCHAR(12),100], quants ARRAY[VARCHAR(32),100]);
CREATE TABLE
=> INSERT INTO txreport SELECT prod_ids, quantities FROM transactions;
OUTPUT
--------
1
(1 row)
=> SELECT * FROM txreport;
prod_ids | quants
-------------------+------------
["p1265","p4515"] | ["15","2"]
(1 row)
You can perform explicit casts, but not implicit casts, between the ARRAY and SET types (native arrays only). If the collection is unbounded and the data type does not change, the binary size is preserved. For example, if you cast an ARRAY[INT] to a SET[INT], the set has the same binary size as the array.
If you cast from one element type to another, the resulting collection uses the default binary size. If this would cause the data not to fit, the cast fails.
You cannot cast from an array to an array with a different dimensionality, for example from a two-dimensional array to a one-dimensional array.
Functions and operators
See Collection functions for a comprehensive list of functions that can be used to manipulate arrays and sets.
Collections can be used in the following ways:
Collections cannot be used in the following ways:
-
As part of an IN or NOT IN expression.
-
As partition columns when creating tables.
-
With ANALYZE_STATISTICS or TopK projections.
-
Non-native arrays only: ORDER BY, PARTITION BY, DEFAULT, SET USING, or constraints.
3.11.2 - MAP
Represents map data in external tables in the Parquet, ORC, and Avro formats only.
Represents map data in external tables in the Parquet, ORC, and Avro formats only. A MAP must use only primitive types and may not contain other complex types. You can use the MAP type in a table definition to consume columns in the data, but you cannot query those columns.
A superior alternative to MAP is ARRAY[ROW]. An array of rows can use all supported complex types and can be queried. This is the representation that INFER_TABLE_DDL suggests. For Avro data, the ROW must have fields named key and value.
Within a single table you must define all map columns using the same approach, MAP or ARRAY[ROW].
Syntax
In column definitions:
MAP<key,value>
In a column definition in an external table, a MAP consists of a key-value pair, specified as types. The table in the following example defines a map of product IDs to names.
=> CREATE EXTERNAL TABLE store (storeID INT, inventory MAP<INT,VARCHAR(100)>)
AS COPY FROM '...' PARQUET;
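As a sketch of the ARRAY[ROW] alternative described above, the same inventory column could be declared as an array of key/value rows (the field names key and value are assumed to match the map representation in the source data):
=> CREATE EXTERNAL TABLE store (storeID INT,
       inventory ARRAY[ROW(key INT, value VARCHAR(100))])
   AS COPY FROM '...' PARQUET;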
3.11.3 - ROW
Represents structured data (structs).
Represents structured data (structs). A ROW can contain fields of any primitive or complex type supported by Vertica.
Syntax
In column definitions:
ROW([field] type[, ...])
If the field name is omitted, Vertica generates names starting with "f0".
In literals:
ROW(value [AS field] [, ...]) [AS name(field[, ...])]
Syntax for column definition
In a column definition, a ROW consists of one or more comma-separated pairs of field names and types. In the following example, the Parquet data file contains a struct for the address, which is read as a ROW in an external table:
=> CREATE EXTERNAL TABLE customers (name VARCHAR,
address ROW(street VARCHAR, city VARCHAR, zipcode INT))
AS COPY FROM '...' PARQUET;
ROWs can be nested; a field can have a type of ROW:
=> CREATE TABLE employees(
employeeID INT,
personal ROW(
name VARCHAR,
address ROW(street VARCHAR, city VARCHAR, zipcode INT),
taxID INT),
department VARCHAR);
ROWs can contain arrays:
=> CREATE TABLE customers(
name VARCHAR,
contact ROW(
street VARCHAR,
city VARCHAR,
zipcode INT,
email ARRAY[VARCHAR]
),
accountid INT );
When loading data, the primitive types in the table definition must match those in the data. The ROW structure must also match; a ROW must contain all and only the fields in the struct in the data.
Restrictions on ROW columns
ROW columns have several restrictions:
- Maximum nesting depth is 100.
- Vertica tables support up to 9800 columns and fields. The ROW itself is not counted, only its fields.
- ROW columns cannot use any constraints (such as NOT NULL) or defaults.
- ROW fields cannot be auto_increment or setof.
- ROW definition must include at least one field.
- Row is a reserved keyword within a ROW definition, but is permitted as the name of a table or column.
- Tables containing ROW columns cannot also contain IDENTITY, default, SET USING, or named sequence columns.
Syntax for direct construction (literals)
In a literal, such as a value in a comparison operation, a ROW consists of one or more values. If you omit field names in the ROW expression, Vertica generates them automatically. If you do not coerce types, Vertica infers the types from the data values.
=> SELECT ROW('Amy',2,false);
row
--------------------------------------------
{"f0":"Amy","f1":2,"f2":false}
(1 row)
You can use an AS clause to name the ROW and its fields:
=> SELECT ROW('Amy',2,false) AS student(name, id, current);
student
--------------------------------------------
{"name":"Amy","id":2,"current":false}
(1 row)
You can also name individual fields using AS. This query produces the same output as the previous one:
=> SELECT ROW('Amy' AS name, 2 AS id, false AS current) AS student;
You do not need to name all fields.
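For example, naming only the first field (a sketch; the unnamed fields are assumed to receive generated positional names as described above):
=> SELECT ROW('Amy' AS name, 2, false) AS student;
             student
----------------------------------
 {"name":"Amy","f1":2,"f2":false}
(1 row)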
In an array of ROW elements, if you use AS to name fields and the names differ among the elements, Vertica uses the right-most names for all elements:
=> SELECT ARRAY[ROW('Amy' AS name, 2 AS id),ROW('Fred' AS first_name, 4 AS id)];
array
------------------------------------------------------------
[{"first_name":"Amy","id":2},{"first_name":"Fred","id":4}]
(1 row)
You can coerce types explicitly:
=> SELECT ROW('Amy',2.5::int,false::varchar);
row
------------------------------------------
{"f0":"Amy","f1":3,"f2":"f"}
(1 row)
Escape single quotes in literal inputs using single quotes, as in the following example:
=> SELECT ROW('Howard''s house',2,false);
row
---------------------------------------------------
{"f0":"Howard's house","f1":2,"f2":false}
(1 row)
You can use fields of all scalar types, ROW, and ARRAY, as in the following example:
=> SELECT id.name, major, GPA FROM students
WHERE id = ROW('alice',119, ARRAY['alice@example.com','ap16@cs.example.edu']);
name | major | GPA
-------+------------------------------------+-----
alice | [{"school":"Science","dept":"CS"}] | 3.8
(1 row)
ROW values are output in JSON format as in the following example.
=> CREATE EXTERNAL TABLE customers (name VARCHAR,
address ROW(street VARCHAR, city VARCHAR, zipcode INT))
AS COPY FROM '...' PARQUET;
=> SELECT address FROM customers WHERE address.city ='Pasadena';
address
--------------------------------------------------------------------
{"street":"100 Main St Apt 4B","city":"Pasadena","zipcode":91001}
{"street":"100 Main St Apt 4A","city":"Pasadena","zipcode":91001}
{"street":"23 Fifth Ave Apt 8C","city":"Pasadena","zipcode":91001}
{"street":"15 Raymond Dr","city":"Pasadena","zipcode":91003}
(4 rows)
The following table specifies the mappings from Vertica data types to JSON data types.
Vertica Type | JSON Type
Integer | Integer
Float | Numeric
Numeric | Numeric
Boolean | Boolean
All others | String
Comparisons
ROW supports equality (=), inequality (<>), and null-safe equality (<=>) between inputs that have the same set of fields. ROWs that contain only primitive types, including nested ROWs of primitive types, also support comparison operators (<, <=, >, >=).
Two ROWs are equal if and only if all fields are equal. Vertica compares fields in order until an inequality is found or all fields have been compared. The evaluation of the first non-equal field determines which ROW is greater:
=> SELECT ROW(1, 'joe') > ROW(2, 'bob');
?column?
----------
f
(1 row)
Comparisons between ROWs with different schemas fail:
=> SELECT ROW(1, 'joe') > ROW(2, 'bob', 123);
ERROR 5162: Unequal number of entries in row expressions
If the result of a comparison depends on a null field, the result is null:
=> select row(1, null, 3) = row(1, 2, 3);
?column?
----------
(1 row)
Null-handling
If a struct exists but a field value is null, Vertica assigns NULL as its value in the ROW. A struct where all fields are null is treated as a ROW with null fields. If the struct itself is null, Vertica reads the ROW as NULL.
Casting
Casting a ROW casts each field. You can therefore cast between data types following the same rules as for casts of scalar values.
The following example casts the contact ROW in the customers table, changing the zipcode field from INT to VARCHAR and adding a bound to the array:
=> SELECT contact::ROW(VARCHAR,VARCHAR,VARCHAR,ARRAY[VARCHAR,20]) FROM customers;
contact
--------------------------------------------------------------------------------
-----------------------------------------
{"street":"911 San Marcos St","city":"Austin","zipcode":"73344","email":["missy@mit.edu","mcooper@cern.gov"]}
{"street":"100 Main St Apt 4B","city":"Pasadena","zipcode":"91001","email":["shelly@meemaw.name","cooper@caltech.edu"]}
{"street":"100 Main St Apt 4A","city":"Pasadena","zipcode":"91001","email":["hofstadter@caltech.edu"]}
{"street":"23 Fifth Ave Apt 8C","city":"Pasadena","zipcode":"91001","email":[]}
{"street":null,"city":"Pasadena","zipcode":"91001","email":["raj@available.com"]}
(6 rows)
You can specify new field names to change them in the output:
=> SELECT contact::ROW(str VARCHAR, city VARCHAR, zip VARCHAR, email ARRAY[VARCHAR,20]) FROM customers;
contact
--------------------------------------------------------------------------------
----------------------------------
{"str":"911 San Marcos St","city":"Austin","zip":"73344","email":["missy@mit.edu","mcooper@cern.gov"]}
{"str":"100 Main St Apt 4B","city":"Pasadena","zip":"91001","email":["shelly@meemaw.name","cooper@caltech.edu"]}
{"str":"100 Main St Apt 4A","city":"Pasadena","zip":"91001","email":["hofstadter@caltech.edu"]}
{"str":"23 Fifth Ave Apt 8C","city":"Pasadena","zip":"91001","email":[]}
{"str":null,"city":"Pasadena","zip":"91001","email":["raj@available.com"]}
(6 rows)
Supported operators and predicates
ROW values may be used in queries in the following ways:
-
INNER and OUTER JOIN
-
Comparisons, IN, BETWEEN (non-nullable filters only)
-
IS NULL, IS NOT NULL
-
CASE
-
GROUP BY, ORDER BY
-
SELECT DISTINCT
-
Arguments to user-defined scalar, transform, and analytic functions
The following operators and predicates are not supported for ROW values:
-
Math operators
-
Type coercion of whole rows (coercion of field values is supported)
-
BITWISE, LIKE
-
MLA (ROLLUP, CUBE, GROUPING SETS)
-
Aggregate functions including MAX, MIN, and SUM
-
Set operators including UNION, UNION ALL, MINUS, and INTERSECT
COUNT is not supported for ROWs returned from user-defined scalar functions, but is supported for ROW columns and literals.
In comparison operations (including implicit comparisons like ORDER BY), a ROW literal is treated as the sequence of its field values. For example, the following two statements are equivalent:
GROUP BY ROW(zipcode, city)
GROUP BY zipcode, city
Using rows in views and subqueries
You can use ROW columns to construct views and in subqueries. Consider employee and customer tables with the following definitions:
=> CREATE EXTERNAL TABLE customers(name VARCHAR,
address ROW(street VARCHAR, city VARCHAR, zipcode INT), accountID INT)
AS COPY FROM '...' PARQUET;
=> CREATE EXTERNAL TABLE employees(employeeID INT,
personal ROW(name VARCHAR,
address ROW(street VARCHAR, city VARCHAR, zipcode INT),
taxID INT), department VARCHAR)
AS COPY FROM '...' PARQUET;
The following example creates a view and queries it.
=> CREATE VIEW neighbors (num_neighbors, area(city, zipcode))
AS SELECT count(*), ROW(address.city, address.zipcode)
FROM customers GROUP BY address.city, address.zipcode;
CREATE VIEW
=> SELECT employees.personal.name, neighbors.area FROM neighbors, employees
   WHERE employees.personal.address.zipcode=neighbors.area.zipcode AND neighbors.num_neighbors > 1;
name | area
--------------------+-------------------------------------
Sheldon Cooper | {"city":"Pasadena","zipcode":91001}
Leonard Hofstadter | {"city":"Pasadena","zipcode":91001}
(2 rows)
3.11.4 - SET
Represents a collection of unordered, unique elements.
Represents a collection of unordered, unique elements. Sets may contain only primitive types. In sets, unlike in arrays, element position is not meaningful.
Sets do not support LONG types (like LONG VARBINARY or LONG VARCHAR) or user-defined types (like Geometry).
If you populate a set from an array, Vertica sorts the values and removes duplicate elements. If you do not care about element position and plan to run queries that check for the presence of specific elements (find, contains), using a set could improve query performance.
Sets can be bounded, meaning they specify a maximum element count, or unbounded. Unbounded sets have a maximum binary size, which can be set explicitly or defaulted. See Limits on Element Count and Collection Size.
Syntax
In column definitions:
SET[data_type, max_elements] |
SET[data_type](max_size) |
SET[data_type]
In literals:
SET[value[, ...] ]
Restrictions
-
Sets support only data of primitive (scalar) types.
-
Bounds, if specified, are enforced for all operations that load or alter data. Unbounded sets may have as many elements as will fit in the allotted binary size.
-
A set has a maximum binary size. If this size is not set when the set is defined, a default value is used.
Syntax for column definition
Sets used in column definitions can be either bounded or unbounded. Bounded sets must specify a maximum number of elements. Unbounded sets can specify a maximum binary size for the set, or the value of DefaultArrayBinarySize is used. You can specify a bound or a binary size but not both. For more information about these values, see Limits on Element Count and Collection Size.
Type | Syntax | Semantics
Bounded set | SET[data_type, max_elements] (example: SET[VARCHAR(50),100]) | Can contain no more than max_elements elements. Attempting to add more is an error. Has a binary size of the data-type size multiplied by the maximum number of elements (possibly rounded up).
Unbounded set with maximum size | SET[data_type](max_size) (example: SET[VARCHAR(50)](32000)) | Can contain as many elements as fit in max_size. Ignores the value of DefaultArrayBinarySize.
Unbounded set | SET[data_type] (example: SET[VARCHAR(50)]) | Can contain as many elements as fit in the default binary size. Equivalent to SET[data_type](DefaultArrayBinarySize).
The following example defines a table with an unbounded set column.
=> CREATE TABLE users
(
user_id INTEGER,
display_name VARCHAR,
email_addrs SET[VARCHAR]
);
When you load array data into a column defined as a set, the array data is automatically converted to a set.
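A minimal sketch using the users table above (the values are hypothetical); the duplicate address is removed and the elements are sorted when the array is converted to a set:
=> INSERT INTO users VALUES
     (1, 'alice', ARRAY['alice@example.com','a.jones@example.org','alice@example.com']);
 OUTPUT
--------
      1
(1 row)

=> SELECT email_addrs FROM users WHERE user_id = 1;
                 email_addrs
---------------------------------------------
 ["a.jones@example.org","alice@example.com"]
(1 row)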
Syntax for direct construction (literals)
Use the SET keyword to construct a set value. Literal set values are contained in brackets. For example, to create a set of INT, you would do the following:
=> SELECT SET[1,2,3];
set
-------
[1,2,3]
(1 row)
You can explicitly convert an array to a set by casting, as in the following example:
=> SELECT ARRAY[1, 5, 2, 6, 3, 0, 6, 4]::SET[INT];
set
-----------------
[0,1,2,3,4,5,6]
(1 row)
Notice that duplicate elements have been removed and the elements have been sorted.
Because the elements are known at the time you directly construct a set, these sets are implicitly bounded.
Sets are shown in a JSON-like format, with comma-separated elements contained in brackets (like arrays). In the following example, the email_addrs column is a set.
=> SELECT custkey,email_addrs FROM customers LIMIT 4;
custkey | email_addrs
---------+------------------------------------------------------------------------
342176 | ["joe.smith@example.com"]
342799 | ["bob@example,com","robert.jones@example.com"]
342845 | ["br92@cs.example.edu"]
342321 | ["789123@example-isp.com","sjohnson@eng.example.com","sara@johnson.example.name"]
Limits on element count and collection size
When declaring a collection type for a table column, you can limit either the number of elements or the total binary size of the collection. During query processing, Vertica always reserves the maximum memory needed for the column, based on either the element count or the binary size. If this size is much larger than your data actually requires, setting one of these limits can improve query performance by reducing the amount of memory that must be reserved for the column.
You can change the bounds of a collection, including changing between bounded and unbounded collections, by casting. See Casting.
A bounded collection specifies a maximum element count. A value in a bounded collection column may contain fewer elements, but it may not contain more. Any attempt to insert more elements into a bounded collection than the declared maximum is an error. A bounded collection has a binary size that is the product of the data-type size and the maximum number of elements, possibly rounded up.
An unbounded collection specifies a binary size in bytes, explicitly or implicitly. It may contain as many elements as can fit in that binary size.
Instead of specifying a bound, you can specify a maximum binary size for an unbounded collection. The binary size acts as an absolute limit, regardless of how many elements the collection contains. Collections that do not specify a maximum binary size use the value of DefaultArrayBinarySize. This size is set at the time the collection is defined and is not affected by later changes to the value of DefaultArrayBinarySize.
You cannot set a maximum binary size for a bounded collection, only an unbounded one.
Comparisons
All collections support equality (=), inequality (<>), and null-safe equality (<=>). 1D collections also support comparison operators (<, <=, >, >=) between collections of the same type (arrays or sets). Comparisons follow these rules:
-
A null collection is ordered last.
-
Non-null collections are compared element by element, using the ordering rules of the element's data type. The relative order of the first pair of non-equal elements determines the order of the two collections.
-
If all elements in both collections are equal up to the length of the shorter collection, the shorter collection is ordered before the longer one.
-
If all elements in both collections are equal and the collections are of equal length, the collections are equal.
Null handling
Null semantics for collections are consistent with normal columns in most regards. See NULL sort order for more information on null-handling.
The null-safe equality operator (<=>) behaves differently from equality (=) when the collection is null rather than empty. Comparing a collection to NULL with the equality operator (=) returns null:
=> SELECT ARRAY[1,3] = NULL;
?column?
----------
(1 row)
=> SELECT ARRAY[1,3] <=> NULL;
?column?
----------
f
(1 row)
In the following example, the grants column in the table is null for employee 99:
=> SELECT grants = NULL FROM employees WHERE id=99;
?column?
----------
(1 row)
=> SELECT grants <=> NULL FROM employees WHERE id=99;
?column?
----------
t
(1 row)
Empty collections are not null and behave as expected:
=> SELECT ARRAY[]::ARRAY[INT] = ARRAY[]::ARRAY[INT];
?column?
----------
t
(1 row)
Collections are compared element by element. If a comparison depends on a null element, the result is unknown (null), not false. For example, ARRAY[1,2,null]=ARRAY[1,2,null] and ARRAY[1,2,null]=ARRAY[1,2,3] both return null, but ARRAY[1,2,null]=ARRAY[1,4,null] returns false because the second elements do not match.
Casting
Casting a set casts each element of the set. You can therefore cast between data types following the same rules as for casts of scalar values.
You can cast both literal sets and set columns explicitly:
=> SELECT SET['1','2','3']::SET[INT];
set
---------
[1,2,3]
(1 row)
=> CREATE TABLE transactions (tid INT, prod_ids SET[VARCHAR], quantities SET[VARCHAR(32)]);
=> INSERT INTO transactions VALUES (12345, SET['p1265', 'p4515'], SET['15','2']);
=> SELECT quantities :: SET[INT] FROM transactions;
quantities
------------
[15,2]
(1 row)
Assignment casts and implicit casts work the same way as for scalars.
You can perform explicit casts, but not implicit casts, between ARRAY and SET types. If the collection is unbounded and the data type does not change, the binary size is preserved. For example, if you cast an ARRAY[INT] to a SET[INT], the set has the same binary size as the array.
When casting an array to a set, Vertica first casts each element and then sorts the set and removes duplicates. If two source values are cast to the same target value, one of them will be removed. For example, if you cast an array of FLOAT to a set of INT, two values in the array might be rounded to the same integer and then be treated as duplicates. This also happens if the array contains more than one value that is cast to NULL.
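For example, casting a numeric array to SET[INT] can collapse distinct source values into a single element after rounding (a minimal sketch; rounding follows the usual numeric-to-integer cast rules):
=> SELECT ARRAY[1.4, 1.6, 2.1]::SET[INT];
  set
-------
 [1,2]
(1 row)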
If you cast from one element type to another, the resulting collection uses the default binary size. If this would cause the data not to fit, the cast fails.
Functions and operators
See Collection functions for a comprehensive list of functions that can be used to manipulate arrays and sets.
Collections can be used in the following ways:
Collections cannot be used in the following ways:
-
As part of an IN or NOT IN expression.
-
As partition columns when creating tables.
-
With ANALYZE_STATISTICS or TopK projections.
-
Non-native arrays only: ORDER BY, PARTITION BY, DEFAULT, SET USING, or constraints.
3.12 - Data type mappings between Vertica and Oracle
Oracle uses proprietary data types for all main data types, such as VARCHAR, INTEGER, FLOAT, DATE.
Oracle uses proprietary data types for all main data types, such as VARCHAR, INTEGER, FLOAT, DATE. Before migrating a database from Oracle to Vertica, first convert the schema to minimize errors and time spent fixing erroneous data issues.
The following table compares the behavior of Oracle data types to Vertica data types.
Oracle |
Vertica |
Notes |
NUMBER
(no explicit precision)
|
INTEGER |
In Oracle, the NUMBER data type with no explicit precision stores each number N as an integer M, together with a scale S. The scale can range from -84 to 127, while the precision of M is limited to 38 digits. Thus:
N = M * 10^S
When precision is specified, precision/scale applies to all entries in the column. If omitted, the scale defaults to 0.
For the common case—Oracle NUMBER with no explicit precision used to store only integer values—the Vertica INTEGER data type is the most appropriate and the fastest equivalent data type. However, INTEGER is limited to a little less than 19 digits, with a scale of 0: [-9223372036854775807, +9223372036854775807] .
|
NUMERIC |
If an Oracle column contains integer values outside of the range [-9223372036854775807, +9223372036854775807] , then use the Vertica data type NUMERIC(p,0) where p is the maximum number of digits required to represent values of the source data.
If the data is exact with fractional places—for example dollar amounts—Vertica recommends NUMERIC(p,s) where p is the precision (total number of digits) and s is the maximum scale (number of decimal places).
Vertica conforms to standard SQL, which requires that p ≥ s and s ≥ 0. Vertica's NUMERIC data type is most effective for p=18, and increasingly expensive for p=37, 58, 67, etc., where p ≤ 1024.
Tip
Vertica recommends against using the data type NUMERIC(38,s) as a default "failsafe" mapping to guarantee no loss of precision. NUMERIC(18,s) is better, and INTEGER or FLOAT better yet, if one of these data types will do the job.
|
FLOAT |
Even though no explicit scale is specified for an Oracle NUMBER column, Oracle allows non-integer values, each with its own scale. If the data stored in the column is approximate, Vertica recommends using the Vertica data type FLOAT, which is standard IEEE floating point, like ORACLE BINARY_DOUBLE. |
NUMBER(P,0)
P ≤ 18
|
INTEGER |
For Oracle NUMBER data types with 0 scale and a precision less than or equal to 18, use the Vertica INTEGER data type. |
NUMBER(P,0)
P > 18
|
NUMERIC(p,0) |
In the rare case where an Oracle column specifies precision greater than 18, use the Vertica data type NUMERIC(p, 0), where p = P. |
NUMBER(P,S)
All cases other than above
|
NUMERIC(p,s) |
- When P ≥ S and S ≥ 0, use p = P and s = S, unless the data allows reducing P or using FLOAT as discussed above.
- If S > P, use p = S, s = S.
- If S < 0, use p = P – S, s = 0.
|
FLOAT |
NUMERIC(P,S) |
|
Rarely used in Oracle; see notes for Oracle NUMBER. |
DECIMAL(P,S) |
|
Synonym for Oracle NUMERIC. |
BINARY_FLOAT |
FLOAT |
Same as FLOAT(53) or DOUBLE PRECISION |
BINARY_DOUBLE |
FLOAT |
Same as FLOAT(53) or DOUBLE PRECISION |
RAW |
VARBINARY |
Maximum sizes compared:
|
LONG RAW |
LONG VARBINARY |
Maximum sizes compared:
Caution
Be careful to avoid truncation when migrating Oracle LONG RAW data to Vertica.
|
CHAR(n) |
CHAR(n) |
Maximum sizes compared:
|
NCHAR(n) |
CHAR(n*3) |
Vertica supports national characters with CHAR(n) as a variable-length, UTF-8-encoded UNICODE character string. UTF-8 represents ASCII in 1 byte, most European characters in 2 bytes, and most Asian and Middle Eastern characters in 3 bytes. |
VARCHAR2(n) |
VARCHAR(n) |
Maximum sizes compared:
Important
The Oracle VARCHAR2 and Vertica VARCHAR data types are semantically different:
- VARCHAR exhibits standard SQL behavior
- VARCHAR2 is inconsistent with standard SQL behavior in that it treats an empty string as a NULL value and uses non-padded comparison if one operand is VARCHAR2.
|
NVARCHAR2(n) |
VARCHAR(n*3) |
See notes for NCHAR. |
DATE |
TIMESTAMP |
Oracle’s DATE is different from the SQL standard DATE data type implemented by Vertica. Oracle’s DATE includes the time (no fractional seconds), while Vertica DATE data types include only date as per the SQL standard. |
DATE |
TIMESTAMP |
TIMESTAMP |
TIMESTAMP defaults to six places—that is, to microseconds. |
TIMESTAMP WITH TIME ZONE |
TIMESTAMP WITH TIME ZONE |
TIME ZONE defaults to the currently SET or system time zone. |
INTERVAL YEAR TO MONTH |
INTERVAL YEAR TO MONTH |
As per the SQL standard, you can qualify Vertica INTERVAL data types with the YEAR TO MONTH subtype. |
INTERVAL DAY TO SECOND |
INTERVAL DAY TO SECOND |
The default subtype for Vertica INTERVAL data types is DAY TO SECOND. |
CLOB |
LONG VARCHAR |
You can store a CLOB (character large object) or BLOB (binary large object) value in a table or in an external location. The maximum size of a CLOB or BLOB is 128 TB.
You can store Vertica LONG data types only in LONG VARCHAR and LONG VARBINARY columns. The maximum size of LONG data types is 32M bytes.
|
BLOB |
LONG VARBINARY |
LONG |
LONG VARCHAR |
Oracle recommends using CLOB and BLOB data types instead of LONG and LONG RAW data types.
An Oracle table can contain only one LONG column. The maximum size of a LONG or LONG RAW data type is 2 GB.
|
LONG RAW |
LONG VARBINARY |
4 - Configuration parameters
Vertica supports a wide variety of configuration parameters that affect many facets of database behavior. These parameters can be set with the appropriate ALTER statements at one or more levels, listed here in descending order of precedence:
-
User (ALTER USER)
-
Session (ALTER SESSION)
-
Node (ALTER NODE)
-
Database (ALTER DATABASE)
Not all parameters can be set at all levels. Consult the documentation of individual parameters for restrictions.
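For example, a database-wide setting can be changed with ALTER DATABASE, as in the following sketch (LockTimeout is used here only for illustration; confirm the allowed levels for any parameter before setting it):
=> ALTER DATABASE DEFAULT SET LockTimeout = 600;
ALTER DATABASE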
You can query the CONFIGURATION_PARAMETERS system table to obtain the current settings for all user-accessible parameters. For example, the following query returns settings for partitioning parameters: their current and default values, which levels they can be set at, and whether changes require a database restart to take effect:
=> SELECT parameter_name, current_value, default_value, allowed_levels, change_requires_restart
FROM configuration_parameters WHERE parameter_name ILIKE '%partitioncount%';
parameter_name | current_value | default_value | allowed_levels | change_requires_restart
----------------------+---------------+---------------+----------------+-------------------------
MaxPartitionCount | 1024 | 1024 | NODE, DATABASE | f
ActivePartitionCount | 1 | 1 | NODE, DATABASE | f
(2 rows)
4.1 - General parameters
The following parameters configure basic database operations.
The following parameters configure basic database operations. Query the CONFIGURATION_PARAMETERS system table to determine what levels (node, session, user, database) are valid for a given parameter.
- ApplyEventsDuringSALCheck
- Boolean, specifies whether Vertica uses catalog events to filter out dropped corrupt partitions during node startup. Dropping corrupt partitions can speed node recovery.
When disabled (0), Vertica reports corrupt partitions, but takes no action. Leaving corrupt partitions in place can reset the current projection checkpoint epoch to the epoch before the corruption occurred.
This parameter has no effect on unpartitioned tables.
Default: 0
- ApportionedFileMinimumPortionSizeKB
- Specifies the minimum portion size (in kilobytes) for use with apportioned file loads. Vertica apportions a file load across multiple nodes only if:
See also EnableApportionLoad and EnableApportionedFileLoad.
Default: 1024
- BlockedSocketGracePeriod
- Sets how long a session socket remains blocked while awaiting client input or output for a given query. See Handling session socket blocking.
Default: None (Socket blocking can continue indefinitely.)
- CatalogCheckpointPercent
- Specifies the threshold at which a checkpoint is created for the database catalog.
By default, this parameter is set to 50 (percent), so when transaction logs reach 50% of the size of the last checkpoint, Vertica adds a checkpoint. Each checkpoint demarcates all changes to the catalog since the last checkpoint.
Default: 50 (percent)
- ClusterSequenceCacheMode
- Boolean, specifies whether the initiator node requests cache for other nodes in a cluster, and then sends cache to other nodes along with the execution plan, one of the following.
See Distributing sequences.
Default: 1 (enabled)
- CompressCatalogOnDisk
- Whether to compress the size of the catalog on disk, one of the following:
This parameter is most effective if the catalog disk partition is small (<50 GB) and the metadata is large (hundreds of tables, partitions, or nodes).
Default: 1
- CompressNetworkData
- Boolean, specifies whether to compress all data sent over the internal network when enabled (set to 1). This compression speeds up network traffic at the expense of added CPU load. If the network is throttling database performance, enable compression to correct the issue.
Default: 0
- CopyFaultTolerantExpressions
- Boolean, indicates whether to report record rejections during transformations and proceed (true) or abort COPY operations if a transformation fails.
Default: 0 (false)
- CopyFromVerticaWithIdentity
- Allows COPY FROM VERTICA and EXPORT TO VERTICA to load values into IDENTITY columns. The destination IDENTITY column is not incremented automatically. To disable the default behavior, set this parameter to 0 (zero).
Default: 1
- DatabaseHeartbeatInterval
- Determines the interval (in seconds) at which each node performs a health check and communicates a heartbeat. If a node does not receive a message within five times the specified interval, the node is evicted from the cluster. Setting the interval to 0 disables the feature.
See Automatic eviction of unhealthy nodes.
Default: 120
- DataLoaderDefaultRetryLimit
- Default number of times a data loader retries failed loads. Changing this value changes the retry limit for existing data loaders that do not specify another limit.
Default: 3
- DefaultArrayBinarySize
- The maximum binary size, in bytes, for an unbounded collection, if a maximum size is not specified at creation time.
Default: 65000
- DefaultResourcePoolForUsers
- Resource pool that is assigned to the profile of a new user, whether created in Vertica or LDAP. This pool is also assigned to users when their assigned resource pool is dropped.
You can set DefaultResourcePoolForUsers only to a global resource pool; attempts to set it to a subcluster resource pool return an error.
Note
If an LDAP user is merged with an existing Vertica user, the resource pool setting on the existing user remains unchanged.
For details, see User resource allocation.
Default: GENERAL
- DefaultTempTableLocal
- Boolean, specifies whether CREATE TEMPORARY TABLE creates a local or global temporary table, one of the following:
For details, see Creating temporary tables.
Default: 0
- DivideZeroByZeroThrowsError
- Boolean, specifies whether to return an error when a zero-divided-by-zero operation is requested:
-
0: Return 0.
-
1: Return an error.
Default: 1
- EnableApportionedChunkingInDefaultLoadParser
- Boolean, specifies whether to enable the built-in parser for delimited files to take advantage of both apportioned load and cooperative parse for potentially better performance.
Default: 1 (enable)
- EnableApportionedFileLoad
- Boolean, specifies whether to enable automatic apportioning across nodes of file loads using COPY FROM VERTICA. Vertica attempts to apportion the load if:
-
This parameter and EnableApportionLoad are both enabled.
-
The parser supports apportioning.
-
The load is divisible into portion sizes of at least the value of ApportionedFileMinimumPortionSizeKB.
Setting this parameter does not guarantee that loads will be apportioned, but disabling it guarantees that they will not be.
See Distributing a load.
Default: 1 (enable)
- EnableApportionLoad
- Boolean, specifies whether to enable automatic apportioning across nodes of data loads using COPY...WITH SOURCE. Vertica attempts to apportion the load if:
Setting this parameter does not guarantee that loads will be apportioned, but disabling it guarantees that they will not be.
For details, see Distributing a load.
Default: 1 (enable)
- EnableBetterFlexTypeGuessing
- Boolean, specifies whether to enable more accurate type guessing when assigning data types to non-string keys in a flex table __raw__ column with COMPUTE_FLEXTABLE_KEYS or COMPUTE_FLEXTABLE_KEYS_AND_BUILD_VIEW. If this parameter is disabled (0), Vertica uses a limited set of Vertica data type assignments.
Default: 1 (enable)
- EnableCooperativeParse
- Boolean, specifies whether to implement multi-threaded parsing capabilities on a node. You can use this parameter for both delimited and fixed-width loads.
Default: 1 (enable)
- EnableForceOuter
- Boolean, specifies whether Vertica uses a table's force_outer value to implement a join. For more information, see Controlling join inputs.
Default: 0 (forced join inputs disabled)
- EnableMetadataMemoryTracking
- Boolean, specifies whether to enable Vertica to track memory used by database metadata in the METADATA resource pool.
Default: 1 (enable)
- EnableResourcePoolCPUAffinity
- Boolean, specifies whether Vertica aligns queries to the resource pool of the processing CPU. When disabled (0), queries run on any CPU, regardless of the
CPU_AFFINITY_SET
of the resource pool.
Default: 1
- EnableStrictTimeCasts
- Specifies whether all cast failures result in an error.
Default: 0 (disable)
- EnableUniquenessOptimization
- Boolean, specifies whether to enable query optimization that is based on guaranteed uniqueness of column values. Columns that can be guaranteed to include unique values include:
Default: 1 (enable)
- EnableWithClauseMaterialization
- Superseded by WithClauseMaterialization.
- ExternalTablesExceptionsLimit
- Determines the maximum number of COPY exceptions and rejections allowed when a SELECT statement references an external table. Set to -1 to remove any exceptions limit. See Querying external tables.
Default: 100
- FailoverToStandbyAfter
- Specifies the length of time that an active standby node waits before taking the place of a failed node.
This parameter is set to an interval literal.
Default: None
- FencedUDxMemoryLimitMB
- Sets the maximum amount of memory, in megabytes (MB), that a fenced-mode UDF can use. If a UDF attempts to allocate more memory than this limit, that attempt triggers an exception. For more information, see Fenced and unfenced modes.
Default: -1 (no limit)
- FlexTableDataTypeGuessMultiplier
- Specifies a multiplier that the COMPUTE_FLEXTABLE_KEYS and COMPUTE_FLEXTABLE_KEYS_AND_BUILD_VIEW functions use when assigning a data type and column width for the flex keys table. Both functions assign each key a data type, and multiply the longest key value by this factor to estimate column width. This value is not used to calculate the width of any real columns in a flex table.
Default: 2.0
- FlexTableRawSize
- Specifies the default column width for the __raw__ column of new flex tables, a value between 1 and 32000000, inclusive.
Default: 130000
- ForceUDxFencedMode
- When enabled (1), forces all UDxs that support fenced mode to run in fenced mode even if their definitions specify NOT FENCED. File exporters cannot be run in fenced mode.
Default: 0
- HTTPServerPortOffset
- Controls the offset for the HTTPS port. The default HTTPS port is 8443, the sum of the client port (5433) and default HTTPServerPortOffset (3010).
Caution
This parameter should not be changed without support guidance.
Default: 3010
- HTTPSlowRequestThreshold
- Threshold in milliseconds for logging HTTP requests in UDFS_STATISTICS. Requests that take longer than this value are logged; faster ones are not.
Default: 500
- IcebergPathMapping
- For external tables using Iceberg data, a map of URI prefixes used by Iceberg to prefixes accessible to Vertica. The value is a JSON object:
'{"source-prefix":"new-prefix"[,...]}'
Specify prefixes only (up through port number), not complete paths.
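A sketch of setting this parameter at the database level (the prefixes shown are hypothetical placeholders):
=> ALTER DATABASE DEFAULT SET IcebergPathMapping =
     '{"s3://iceberg-internal":"s3://analytics-bucket"}';
ALTER DATABASE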
- IdleTimeoutInitializingConnectionsMs
- The length of time (in milliseconds) that the server waits before timing out, during each step in connection initialization. After connection initialization, the session is created.
Default: 10000
- JavaBinaryForUDx
- Sets the full path to the Java executable that Vertica uses to run Java UDxs. See Installing Java on Vertica hosts.
- JoinDefaultTupleFormat
- Specifies how to size VARCHAR column data when joining tables on those columns, and buffers accordingly, one of the following:
-
fixed: Use join column metadata to size column data to a fixed length, and buffer accordingly.
-
variable: Use the actual length of join column data, so buffer size varies for each join.
Default: fixed
- KeepAliveIdleTime
- Length (in seconds) of the idle period before the first TCP keepalive probe is sent to ensure that the client is still connected. If set to 0, Vertica uses the kernel's tcp_keepalive_time parameter setting.
Default: 0
- KeepAliveProbeCount
- Number of consecutive keepalive probes that must go unacknowledged by the client before the client connection is considered lost and closed. If set to 0, Vertica uses the kernel's tcp_keepalive_probes parameter setting.
Default: 0
- KeepAliveProbeInterval
- Time interval (in seconds) between keepalive probes. If set to 0, Vertica uses the kernel's tcp_keepalive_intvl parameter setting.
Default: 0
- LockTimeout
- Specifies in seconds how long a table waits to acquire a lock.
Default: 300
- LoadSourceStatisticsLimit
- Specifies the maximum number of sources per load operation that are profiled in the LOAD_SOURCES system table. Set it to 0 to disable profiling.
Default: 256
- MaxBundleableROSSizeKB
- Specifies the minimum size, in kilobytes, of an independent ROS file. Vertica bundles storage container ROS files below this size into a single file. Bundling improves the performance of any file-intensive operations, including backups, restores, and mergeouts.
If you set this parameter to a value of 0, Vertica bundles .fdb and .pidx files without bundling other storage container files.
Default: 1024
- MaxClientSessions
- Determines the maximum number of client sessions that can run on a single node of the database. The default value allows for five additional administrative logins. These logins prevent DBAs from being locked out of the system if non-dbadmin users reach the login limit.
Tip
Setting this parameter to 0 prevents new client sessions from being opened while you are shutting down the database. Restore the parameter to its original setting after you restart the database. For details, see
Managing Sessions.
Default: 50 user logins and 5 additional administrative logins
- ObjectStoreGlobStrategy
- For partitioned external tables in object stores, the strategy to use for expanding globs before pruning partitions:
-
Flat: COPY fetches a list of all full object names with a given prefix, which can incur latency if partitions are numerous or deeply nested.
-
Hierarchical: COPY fetches object names one partition layer at a time, allowing earlier pruning but possibly requiring more calls to the object store when queries are not selective or there are not many partition directory levels.
For details, see Partitions on Object Stores.
Default: Flat
- ObjStoreUploadParallelism
- Integer (maximum 1000), size of the thread pool used when writing files to S3. Using more threads parallelizes uploads, but uses more memory. Each node has a single thread pool used for all file writes.
This configuration parameter can be set only at the database level.
Default: 0 (disabled)
- ParquetMetadataCacheSizeMB
- Size of the cache used for metadata when reading Parquet data. The cache uses local TEMP storage.
Default: 4096
- PatternMatchingUseJit
- Boolean, specifies whether to enable just-in-time compilation (to machine code) of regular expression pattern matching functions used in queries. Enabling this parameter can usually improve pattern matching performance on large tables. The Perl Compatible Regular Expressions (PCRE) pattern-match library evaluates regular expressions. Restart the database for this parameter to take effect.
See also Regular expression functions.
Default: 1 (enable)
- PatternMatchStackAllocator
- Boolean, specifies whether to override the stack memory allocator for the pattern-match library. The Perl Compatible Regular Expressions (PCRE) pattern-match library evaluates regular expressions. Restart the database for this parameter to take effect.
See also Regular expression functions.
Default: 1 (enable override)
- TerraceRoutingFactor
- Specifies whether to enable or disable terrace routing on any Enterprise Mode large cluster that implements rack-based fault groups.
- To enable, set as follows:
where:
- numRackNodes: Number of nodes in a rack
- numRacks: Number of racks in the cluster
- To disable, set to a value so large that terrace routing cannot be enabled for the largest clusters—for example, 1000.
For details, see Terrace routing.
Default: 2
- TransactionIsolationLevel
- Changes the isolation level for the database. After modification, Vertica uses the new transaction level for every new session. Existing sessions and their transactions continue to use the original isolation level.
See also Change transaction isolation levels.
Default: READ COMMITTED
- TransactionMode
- Whether transactions are in read/write or read-only modes. Read/write is the default. Existing sessions and their transactions continue to use the original transaction mode.
Default: READ WRITE
- UDFSSlowRequestThreshold
- Threshold in milliseconds for logging events in UDFS_EVENTS. Requests that take longer than this value are logged; faster ones are not.
Default: 1000
- UDxFencedBlockTimeout
- Number of seconds to wait for output before aborting a UDx running in Fenced and unfenced modes. If the server aborts a UDx for this reason, it produces an error message similar to "ERROR 3399: Failure in UDx RPC call: timed out in receiving a UDx message". If you see this error frequently, you can increase this limit. UDxs running in fenced mode do not run in the server process, so increasing this value does not impede server performance.
Default: 60
- UseLocalTzForParquetTimestampConversion
- Boolean, specifies whether to do timezone conversion when reading Parquet files. Hive version 1.2.1 introduced an option to localize timezones when writing Parquet files. Previously it wrote them in UTC and Vertica adjusted the value when reading the files.
Set to 0 if Hive already adjusted the timezones.
Default: 1 (enable conversion)
- UseServerIdentityOverUserIdentity
- Boolean, specifies whether to ignore user-supplied credentials for non-Linux file systems and always use a USER storage location to govern access to data. See Creating a Storage Location for USER Access.
Default: 0 (disable)
- WithClauseMaterialization
- Boolean, specifies whether to enable materialization of WITH clause results. When materialization is enabled (1), Vertica evaluates each WITH clause once and stores results in a temporary table.
For WITH queries with complex types, temp relations are disabled.
Default: 0 (disable)
- WithClauseRecursionLimit
- Specifies the maximum number of times a WITH RECURSIVE clause iterates over the content of its own result set before it exits. For details, see WITH clause recursion.
Important
Be careful to set WithClauseRecursionLimit only as high as needed to traverse the deepest hierarchies. Vertica sets no limit on this parameter; however, a high value can incur considerable overhead that adversely affects performance and exhausts system resources.
If a high recursion count is required, then consider enabling materialization. For details, see WITH RECURSIVE Materialization.
Default: 8
4.2 - Azure parameters
Use the following parameters to configure reading from Azure blob storage.
Use the following parameters to configure reading from Azure blob storage. For more information about reading data from Azure, see Azure Blob Storage object store.
For external tables using highly partitioned data in an object store, see the ObjectStoreGlobStrategy configuration parameter and Partitions on Object Stores.
Query the CONFIGURATION_PARAMETERS system table to determine what levels (node, session, user, database) are valid for a given parameter.
- AzureStorageCredentials
- Collection of JSON objects, each of which specifies connection credentials for one endpoint. This parameter takes precedence over Azure managed identities.
The collection must contain at least one object and may contain more. Each object must specify at least one of accountName or blobEndpoint, and at least one of accountKey or sharedAccessSignature.
- accountName: If not specified, uses the label of blobEndpoint.
- blobEndpoint: Host name with optional port (host:port). If not specified, uses account.blob.core.windows.net.
- accountKey: Access key for the account or endpoint.
- sharedAccessSignature: Access token for finer-grained access control, if being used by the Azure endpoint.
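A minimal sketch of supplying credentials for one endpoint (the account name and key are placeholders; confirm which levels this parameter accepts in CONFIGURATION_PARAMETERS):
=> ALTER SESSION SET AzureStorageCredentials =
     '[{"accountName": "myaccount", "accountKey": "<access-key>"}]';
ALTER SESSION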
- AzureStorageEndpointConfig
- Collection of JSON objects, each of which specifies configuration elements for one endpoint. Each object must specify at least one of accountName or blobEndpoint.
- accountName: If not specified, uses the label of blobEndpoint.
- blobEndpoint: Host name with optional port (host:port). If not specified, uses account.blob.core.windows.net.
- protocol: HTTPS (default) or HTTP.
- isMultiAccountEndpoint: true if the endpoint supports multiple accounts, false otherwise (default is false). To use multiple-account access, you must include the account name in the URI. If a URI path contains an account, this value is assumed to be true unless explicitly set to false.
4.3 - Constraints parameters
The following configuration parameters control how Vertica evaluates and enforces constraints.
The following configuration parameters control how Vertica evaluates and enforces constraints. All parameters are set at the database level through ALTER DATABASE.
Three of these parameters—EnableNewCheckConstraintsByDefault, EnableNewPrimaryKeysByDefault, and EnableNewUniqueKeysByDefault—can be used to enforce CHECK, PRIMARY KEY, and UNIQUE constraints, respectively. For details, see Constraint enforcement.
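For example, to enforce new PRIMARY KEY constraints by default (a sketch using the database-level syntax described above):
=> ALTER DATABASE DEFAULT SET EnableNewPrimaryKeysByDefault = 1;
ALTER DATABASE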
- EnableNewCheckConstraintsByDefault
- Boolean parameter, set to 0 or 1:
- EnableNewPrimaryKeysByDefault
- Boolean parameter, set to 0 or 1:
-
0 (default): Disable enforcement of new PRIMARY KEY constraints except where the table DDL explicitly enables them.
-
1: Enforce new PRIMARY KEY constraints except where the table DDL explicitly disables them.
Note
Vertica recommends enforcing constraints PRIMARY KEY and UNIQUE together.
- EnableNewUniqueKeysByDefault
- Boolean parameter, set to 0 or 1:
- MaxConstraintChecksPerQuery
- Sets the maximum number of constraints that ANALYZE_CONSTRAINTS can handle with a single query:
-
-1 (default): No maximum set; ANALYZE_CONSTRAINTS uses a single query to evaluate all constraints within the specified scope.
-
Integer > 0: The maximum number of constraints per query. If the number of constraints to evaluate exceeds this value, ANALYZE_CONSTRAINTS handles it with multiple queries.
For details, see Distributing Constraint Analysis.
4.4 - Database Designer parameters
The following table describes the parameters for configuring the Vertica Database Designer.
The following table describes the parameters for configuring the Vertica Database Designer.
DBDCorrelationSampleRowCount
- Minimum number of table rows at which Database Designer discovers and records correlated columns.
Default: 4000
DBDLogInternalDesignProcess
- Enables or disables Database Designer logging.
Default: 0 (False)
DBDUseOnlyDesignerResourcePool
- Enables use of the DBD pool by the Vertica Database Designer.
When set to false, design processing is mostly contained by the user's resource pool, but might spill over into some system resource pools for less-intensive tasks.
Default: 0 (False)
4.5 - Eon Mode parameters
The following parameters configure how the database operates when running in Eon Mode.
The following parameters configure how the database operates when running in Eon Mode. Query the CONFIGURATION_PARAMETERS system table to determine what levels (node, session, user, database) are valid for a given parameter.
- BackgroundDepotWarming
- Specifies background depot warming behavior:
-
1: The depot loads objects while it is warming, and continues to do so in the background after the node becomes active and starts executing queries.
-
0: Node activation is deferred until the depot fetches and loads all queued objects
For details, see Depot Warming.
Default: 1
- CatalogSyncInterval
- Specifies in minutes how often the transaction log sync service syncs metadata to communal storage. If you change this setting, Vertica restarts the interval count.
Default: 5
- DelayForDeletes
- Specifies in hours how long to wait before deleting a file from communal storage. Vertica first deletes a file from the depot. After the specified time interval, the delete also occurs in communal storage.
Default: 0. Deletes the file from communal storage as soon as it is not in use by shard subscribers.
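For example, to delay communal storage deletes by two hours (a sketch assuming a database-level setting; check CONFIGURATION_PARAMETERS for the allowed levels):
=> ALTER DATABASE DEFAULT SET DelayForDeletes = 2;
ALTER DATABASE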
- DepotOperationsForQuery
- Specifies behavior when the depot does not contain queried file data, one of the following:
-
ALL (default): Fetch file data from communal storage; if necessary, displace existing files by evicting them from the depot.
-
FETCHES: Fetch file data from communal storage only if space is available; otherwise, read the queried data directly from communal storage.
-
NONE: Do not fetch file data to the depot; read the queried data directly from communal storage.
You can also specify query-level behavior with the hint DEPOT_FETCH.
- ECSMode
- String parameter that sets the strategy Vertica uses when dividing the data in a shard among subscribing nodes during an ECS-enabled query, one of the following:
-
AUTO: Optimizer automatically determines the strategy to use.
-
COMPUTE_OPTIMIZED: Force use of the compute-optimized strategy.
-
IO_OPTIMIZED: Force use of the I/O-optimized strategy.
For details, see Manually choosing an ECS strategy.
Default: AUTO
- ElasticKSafety
- Boolean parameter that controls whether Vertica adjusts shard subscriptions due to the loss of a primary node:
- 1: When a primary node is lost, Vertica subscribes other primary nodes to the down node's shard subscriptions. This action helps reduce the chances of the database going into read-only mode due to loss of shard coverage.
- 0: Vertica does not change shard subscriptions in reaction to the loss of a primary node.
Default: 1
For details, see Maintaining Shard Coverage.
- EnableDepotWarmingFromPeers
- Boolean parameter, specifies whether Vertica warms a node depot while the node is starting up and not ready to process queries:
For details, see Depot Warming.
Default: 0
- FileDeletionServiceInterval
- Specifies in seconds the interval between each execution of the reaper cleaner service task.
Default: 60 seconds
- MaxDepotSizePercent
- An integer value that specifies the maximum size of the depot as a percentage of disk size.
Default: 80
- PreFetchPinnedObjectsToDepotAtStartup
- If enabled (set to 1), a warming depot fetches objects that are pinned on its subcluster. For details, see Depot Warming.
Default: 0
- ReaperCleanUpTimeoutAtShutdown
- Specifies in seconds how long Vertica waits for the reaper to delete files from communal storage before shutting down. If set to a negative value, Vertica shuts down without waiting for the reaper.
Note
The reaper is a service task that deletes disk files.
Default: 300
- StorageMergeMaxTempCacheMB
- The size of temp space allocated per query to the StorageMerge operator for caching the data of S3 storage containers.
Note
The actual temp space that is allocated is the lesser of:
For details, see Local caching of storage containers.
- UseCommunalStorageForBatchDepotWarming
- Boolean parameter, specifies where a node retrieves data when warming its depot:
Default: 1
Important
This parameter is for internal use only. Do not change it unless directed to do so by Vertica support.
- UseDepotForReads
- Boolean parameter, specifies whether Vertica accesses the depot to answer queries, or accesses only communal storage:
- 1: Vertica first searches the depot for the queried data; if not there, Vertica fetches the data from communal storage for this and future queries.
- 0: Vertica bypasses the depot and always obtains queried data from communal storage.
Note
Enable depot reads to improve query performance and support K-safety.
Default: 1
- UseDepotForWrites
- Boolean parameter, specifies whether Vertica writes loaded data to the depot and then uploads files to communal storage:
- 1: Write loaded data to the depot, upload files to communal storage.
- 0: Bypass the depot and always write directly to communal storage.
Default: 1
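For example, to bypass the depot and write loaded data directly to communal storage (an illustrative database-level statement):
=> ALTER DATABASE DEFAULT SET UseDepotForWrites = 0;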
- UsePeerToPeerDataTransfer
- Boolean parameter, specifies whether Vertica pushes loaded data to other shard subscribers.
Note
Setting to 1 helps improve performance when a node is down.
Default: 0
Important
This parameter is for internal use only. Do not change it unless directed to do so by Vertica support.
4.6 - Epoch management parameters
The following table describes the epoch management parameters for configuring Vertica.
Query the
CONFIGURATION_PARAMETERS system table to determine what levels (node, session, user, database) are valid for a given parameter.
- AdvanceAHMInterval
- Determines how frequently (in seconds) Vertica checks the history retention status.
AdvanceAHMInterval cannot be set to a value that is less than the EpochMapInterval.
Default: 180 (seconds)
- AHMBackupManagement
- Blocks the advancement of the Ancient History Mark (AHM). When this parameter is enabled, the AHM epoch cannot be later than the epoch of your latest full backup. If you advance the AHM to purge and delete data, do not enable this parameter.
Caution
Do not enable this parameter before taking full backups, as it would prevent the AHM from advancing.
Default: 0
- EpochMapInterval
- Determines the granularity of mapping between epochs and time available to historical queries. When a historical query AT TIME T is issued, Vertica maps it to an epoch within a granularity of EpochMapInterval seconds. It similarly affects the time reported for Last Good Epoch during failure recovery. Note that it does not affect the internal precision of epochs themselves.
Tip
Decreasing this interval increases the number of epochs saved on disk. Therefore, consider reducing the HistoryRetentionTime parameter to limit the number of history epochs that Vertica retains.
Default: 180 (seconds)
- HistoryRetentionTime
- Determines how long deleted data is saved (in seconds) as an historical reference. When the specified time since the deletion has passed, you can purge the data. Use the -1 setting if you prefer to use HistoryRetentionEpochs to determine which deleted data can be purged.
Note
The default setting of 0 effectively prevents the use of the Administration tools 'Roll Back Database to Last Good Epoch' option because the AHM remains close to the current epoch and a rollback is not permitted to an epoch prior to the AHM.
Tip
If you rely on the Roll Back option to remove recently loaded data, consider setting a day-wide window. For example:
ALTER DATABASE DEFAULT SET HistoryRetentionTime = 86400;
Default: 0 (data is saved only when nodes are down)
- HistoryRetentionEpochs
- Specifies the number of historical epochs to save, and therefore, the amount of deleted data.
Unless you have a reason to limit the number of epochs, Vertica recommends that you specify the time over which deleted data is saved.
If you specify both History parameters, HistoryRetentionTime takes precedence. Setting both parameters to -1 preserves all historical data.
See Setting a purge policy.
Default: -1 (disabled)
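For example, to retain deleted data by epoch count rather than by time, you might use statements like the following (illustrative values; see Setting a purge policy):
=> ALTER DATABASE DEFAULT SET HistoryRetentionTime = -1;
=> ALTER DATABASE DEFAULT SET HistoryRetentionEpochs = 500;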
4.7 - Google Cloud Storage parameters
Use the following parameters to configure reading from Google Cloud Storage (GCS) using COPY FROM. For more information about reading data from GCS, see Specifying where to load data from.
For external tables using highly partitioned data in an object store, see the ObjectStoreGlobStrategy configuration parameter and Partitions on Object Stores.
Query the
CONFIGURATION_PARAMETERS system table to determine what levels (node, session, user, database) are valid for a given parameter.
- GCSAuth
- An ID and secret key to authenticate to GCS. You can set parameters globally and for the current session with ALTER DATABASE...SET PARAMETER and ALTER SESSION...SET PARAMETER, respectively. For extra security, do not store credentials in the database; instead, set them for the current session with ALTER SESSION. For example:
=> ALTER SESSION SET GCSAuth='ID:secret';
If you use a shared credential, set it in the database with ALTER DATABASE.
- GCSEnableHttps
- Specifies whether to use the HTTPS protocol when connecting to GCS. This parameter can be set only at the database level with ALTER DATABASE...SET PARAMETER.
Default: 1 (enabled)
- GCSEndpoint
- The connection endpoint address.
Default: storage.googleapis.com
4.8 - Hadoop parameters
The following table describes general parameters for configuring integration with Apache Hadoop. See Apache Hadoop integration for more information.
Query the
CONFIGURATION_PARAMETERS system table to determine what levels (node, session, user, database) are valid for a given parameter.
- EnableHDFSBlockInfoCache
- Boolean, whether to distribute block location metadata collected during planning on the initiator to all database nodes for execution. Distributing this metadata reduces name node accesses, and thus load, but can degrade database performance somewhat in deployments where the name node isn't contended. This performance effect is because the data must be serialized and distributed. Enable distribution if protecting the name node is more important than query performance; usually this applies to large HDFS clusters where name node contention is already an issue.
Default: 0 (disabled)
- HadoopConfDir
- Directory path containing the XML configuration files copied from Hadoop. The same path must be valid on every Vertica node. You can use the VERIFY_HADOOP_CONF_DIR meta-function to test that the value is set correctly. Setting this parameter is required to read data from HDFS.
For all Vertica users, the files are accessed by the Linux user under which the Vertica server process runs.
When you set this parameter, previously-cached configuration information is flushed.
You can set this parameter at the session level. Doing so overrides the database value; it does not append to it. For example:
=> ALTER SESSION SET HadoopConfDir='/test/conf:/hadoop/hcat/conf';
To append, get the current value and include it on the new path after your additions. Setting this parameter at the session level does not change how the files are accessed.
Default: obtained from environment if possible
- HadoopFSAuthentication
- How (or whether) to use Kerberos authentication with HDFS. By default, if KerberosKeytabFile is set, Vertica uses that credential for both Vertica and HDFS. Usually this is the desired behavior. However, if you are using a Kerberized Vertica cluster with a non-Kerberized HDFS cluster, set this parameter to "none" to indicate that Vertica should not use the Vertica Kerberos credential to access HDFS.
Default: "keytab" if KerberosKeytabFile is set, otherwise "none"
- HadoopFSBlockSizeBytes
- Block size to write to HDFS. Larger files are divided into blocks of this size.
Default: 64MB
- HadoopFSNNOperationRetryTimeout
- Number of seconds a metadata operation (such as list directory) waits for a response before failing. Accepts float values for millisecond precision.
Default: 6 seconds
- HadoopFSReadRetryTimeout
- Number of seconds a read operation waits before failing. Accepts float values for millisecond precision. If you are confident that your file system will fail more quickly, you can improve performance by lowering this value.
Default: 180 seconds
- HadoopFSReplication
- Number of replicas HDFS makes. This is independent of the replication that Vertica does to provide K-safety. Do not change this setting unless directed otherwise by Vertica support.
Default: 3
- HadoopFSRetryWaitInterval
- Initial number of seconds to wait before retrying read, write, and metadata operations. Accepts float values for millisecond precision. The retry interval increases exponentially with every retry.
Default: 3 seconds
- HadoopFSTokenRefreshFrequency
- How often, in seconds, to refresh the Hadoop tokens used to hold Kerberos tickets (see Token expiration).
Default: 0 (refresh when token expires)
- HadoopFSWriteRetryTimeout
- Number of seconds a write operation waits before failing. Accepts float values for millisecond precision. If you are confident that your file system will fail more quickly, you can improve performance by lowering this value.
Default: 180 seconds
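For example, if you are confident that your file system fails quickly, the following illustrative statements lower the read and write retry timeouts to 30 seconds each (verify valid setting levels in CONFIGURATION_PARAMETERS):
=> ALTER DATABASE DEFAULT SET HadoopFSReadRetryTimeout = 30;
=> ALTER DATABASE DEFAULT SET HadoopFSWriteRetryTimeout = 30;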
- HadoopImpersonationConfig
- Session parameter specifying the delegation token or Hadoop user for HDFS access. See HadoopImpersonationConfig format for information about the value of this parameter and Proxy users and delegation tokens for more general context.
- HDFSUseWebHDFS
- Boolean, whether URLs in the hdfs scheme use WebHDFS instead of LibHDFS++ to access data.
Deprecated
This parameter is deprecated because LibHDFS++ is deprecated. In the future, Vertica will use WebHDFS for all hdfs URLs.
Default: 1 (enabled)
- WebhdfsClientCertConf
- mTLS configurations for accessing one or more WebHDFS servers. The value is a JSON string; each member identifies a server by nameservice or authority (these two properties are mutually exclusive) and names the certificate to use (certName).
For example:
=> ALTER SESSION SET WebhdfsClientCertConf =
'[{"authority" : "my.authority.com:50070", "certName" : "myCert"},
{"nameservice" : "prod", "certName" : "prodCert"}]';
HCatalog Connector parameters
The following table describes the parameters for configuring the HCatalog Connector. See Using the HCatalog Connector for more information.
Note
You can override HCatalog configuration parameters when you create an HCatalog schema with
CREATE HCATALOG SCHEMA.
- EnableHCatImpersonation
- Boolean, whether the HCatalog Connector uses (impersonates) the current Vertica user when accessing Hive. If impersonation is enabled, the HCatalog Connector uses the Kerberos credentials of the logged-in Vertica user to access Hive data. Disable impersonation if you are using an authorization service to manage access without also granting users access to the underlying files. For more information, see Configuring security.
Default: 1 (enabled)
- HCatalogConnectorUseHiveServer2
- Boolean, whether Vertica internally uses HiveServer2 instead of WebHCat to get metadata from Hive.
Default: 1 (enabled)
- HCatalogConnectorUseLibHDFSPP
- Boolean, whether the HCatalog Connector should use the hdfs scheme instead of webhdfs.
Deprecated
Vertica uses the hdfs scheme by default. To use webhdfs, set the HDFSUseWebHDFS parameter.
Default: 1 (enabled)
- HCatConnectionTimeout
- The number of seconds the HCatalog Connector waits for a successful connection to the HiveServer2 (or WebHCat) server before returning a timeout error.
Default: 0 (Wait indefinitely)
- HCatSlowTransferLimit
- Lowest transfer speed (in bytes per second) that the HCatalog Connector allows when retrieving data from the HiveServer2 (or WebHCat) server. In some cases, the data transfer rate from the server to Vertica is below this threshold. In such cases, after the number of seconds specified in the HCatSlowTransferTime parameter pass, the HCatalog Connector cancels the query and closes the connection.
Default: 65536
- HCatSlowTransferTime
- Number of seconds the HCatalog Connector waits before testing whether the data transfer from the server is too slow. See the HCatSlowTransferLimit parameter.
Default: 60
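For example, to fail HiveServer2 connection attempts after 30 seconds instead of waiting indefinitely, and to wait 90 seconds before testing for slow transfers (illustrative values):
=> ALTER DATABASE DEFAULT SET HCatConnectionTimeout = 30;
=> ALTER DATABASE DEFAULT SET HCatSlowTransferTime = 90;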
4.9 - Internationalization parameters
The following table describes internationalization parameters for configuring Vertica.
Query the
CONFIGURATION_PARAMETERS system table to determine what levels (node, session, user, database) are valid for a given parameter.
- DefaultIntervalStyle
- Sets the default interval style to use. If set to 0 (default), the interval is in PLAIN style (the SQL standard), no interval units on output. If set to 1, the interval is in UNITS on output. This parameter does not take effect until the database is restarted.
Default: 0
- DefaultSessionLocale
- Sets the default session startup locale for the database. This parameter does not take effect until the database is restarted.
Default: en_US@collation=binary
- EscapeStringWarning
- Issues a warning when backslashes are used in a string literal. This can help locate backslashes that are being treated as escape characters so they can be fixed to follow the SQL standard-conforming string syntax instead.
Default: 1
- StandardConformingStrings
- Determines whether character string literals treat backslashes (\) as string literals or escape characters. When set to -1, backslashes are treated as string literals; when set to 0, backslashes are treated as escape characters.
Tip
To treat backslashes as escape characters, use the extended string syntax (E'...').
Default: -1
4.10 - Kafka user-defined session parameters
Set Vertica user-defined session parameters to configure Kafka SSL when not using a scheduler, using ALTER SESSION SET UDPARAMETER. The kafka-prefixed parameters configure SSL authentication for Kafka. For details, see TLS/SSL encryption with Kafka.
Query the
CONFIGURATION_PARAMETERS system table to determine what levels (node, session, user, database) are valid for a given parameter.
- kafka_SSL_CA
- The contents of the certificate authority certificate. For example:
=> ALTER SESSION SET UDPARAMETER kafka_SSL_CA='MIIBOQIBAAJBAIOL';
Default: none
- kafka_SSL_Certificate
- The contents of the SSL certificate. For example:
=> ALTER SESSION SET UDPARAMETER kafka_SSL_Certificate='XrM07O4dV/nJ5g';
This parameter is optional when the Kafka server's parameter ssl.client.auth is set to none or requested.
Default: none
- kafka_SSL_PrivateKey_secret
- The private key used to encrypt the session. Vertica does not log this information. For example:
=> ALTER SESSION SET UDPARAMETER kafka_SSL_PrivateKey_secret='A60iThKtezaCk7F';
This parameter is optional when the Kafka server's parameter ssl.client.auth is set to none or requested.
Default: none
- kafka_SSL_PrivateKeyPassword_secret
- The password used to create the private key. Vertica does not log this information.
For example:
=> ALTER SESSION SET UDPARAMETER kafka_SSL_PrivateKeyPassword_secret='secret';
This parameter is optional when the Kafka server's parameter ssl.client.auth is set to none or requested.
Default: none
- kafka_Enable_SSL
- Enables SSL authentication for Vertica-Kafka integration. For example:
=> ALTER SESSION SET UDPARAMETER kafka_Enable_SSL=1;
Default: 0
- MaxSessionUDParameterSize
- Sets the maximum length of a value in a user-defined session parameter. For example:
=> ALTER SESSION SET MaxSessionUDParameterSize = 2000;
Default: 1000
User-defined session parameters
4.11 - Kerberos parameters
The following parameters let you configure the Vertica principal for Kerberos authentication and specify the location of the Kerberos keytab file.
Query the
CONFIGURATION_PARAMETERS system table to determine what levels (node, session, user, database) are valid for a given parameter.
- KerberosEnableKeytabPermissionCheck
- Whether the Vertica server verifies permissions on the keytab file. By default, the Vertica server verifies these permissions.
In a hybrid Kubernetes cluster, set this parameter to 0 so that there is no permissions check. Otherwise, Kerberos authentication fails because the keytab file is stored in a Secret, and the VerticaDB operator cannot verify permissions on a Secret.
Default: 1
- KerberosHostname
- Instance or host name portion of the Vertica Kerberos principal. For example:
vertica/host@EXAMPLE.COM
If you omit the optional KerberosHostname parameter, Vertica uses the return value from the function gethostname(). Assuming each cluster node has a different host name, those nodes will each have a different principal, which you must manage in that node's keytab file.
- KerberosKeytabFile
- Location of the keytab file that contains credentials for the Vertica Kerberos principal. By default, this file is located in /etc. For example:
KerberosKeytabFile=/etc/krb5.keytab
Note
- The principal must take the form KerberosServiceName/KerberosHostName@KerberosRealm
- The keytab file must be readable by the file owner who is running the process (typically the Linux dbadmin user assigned file permissions 0600).
- KerberosRealm
- Realm portion of the Vertica Kerberos principal. A realm is the authentication administrative domain and is usually formed in uppercase letters. For example:
vertica/host@EXAMPLE.COM
- KerberosServiceName
- Service name portion of the Vertica Kerberos principal. By default, this parameter is vertica. For example:
vertica/host@EXAMPLE.COM
Default: vertica
- KerberosTicketDuration
- Lifetime of the ticket retrieved from performing a kinit. The default is 0 (zero), which disables this parameter.
If you omit setting this parameter, the lifetime is determined by the default Kerberos configuration.
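For example, the following illustrative statements define the parts of the principal vertica/host.example.com@EXAMPLE.COM; the host and realm values are placeholders:
=> ALTER DATABASE DEFAULT SET KerberosServiceName = 'vertica';
=> ALTER DATABASE DEFAULT SET KerberosHostname = 'host.example.com';
=> ALTER DATABASE DEFAULT SET KerberosRealm = 'EXAMPLE.COM';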
4.12 - Machine learning parameters
You use machine learning parameters to configure various aspects of machine learning functionality in Vertica.
- MaxModelSizeKB
- Sets the maximum size of models that can be imported. The sum of the size of files specified in the metadata.json file determines the model size. The unit of this parameter is KBytes. The native Vertica model (category=VERTICA_MODELS) is exempted from this limit. If you can export the model from Vertica, and the model is not altered while outside Vertica, you can import it into Vertica again.
The MaxModelSizeKB parameter can be set only by a superuser and only at the database level. It is visible only to a superuser. Its default value is 4GB, and its valid range is between 1KB and 64GB (inclusive).
Examples:
To set this parameter to 3KB:
=> ALTER DATABASE DEFAULT SET MaxModelSizeKB = 3;
To set this parameter to 64GB (the maximum allowed):
=> ALTER DATABASE DEFAULT SET MaxModelSizeKB = 67108864;
To reset this parameter to the default value:
=> ALTER DATABASE DEFAULT CLEAR MaxModelSizeKB;
Default: 4GB
4.13 - Memory management parameters
The following table describes parameters for managing Vertica memory usage.
Caution
Modify these parameters only under guidance from Vertica Support.
Query the
CONFIGURATION_PARAMETERS system table to determine what levels (node, session, user, database) are valid for a given parameter.
- MemoryPollerIntervalSec
- Specifies in seconds how often the Vertica memory poller checks whether Vertica memory usage is below the thresholds of several configuration parameters (see below):
- MemoryPollerMallocBloatThreshold
- MemoryPollerReportThreshold
- MemoryPollerTrimThreshold
Important
To disable polling of all thresholds, set this parameter to 0. Doing so effectively disables automatic memory usage reporting and trimming.
Default: 2
- MemoryPollerMallocBloatThreshold
- Threshold of glibc memory bloat.
The memory poller calls glibc function malloc_info() to obtain the amount of free memory in malloc. It then compares MemoryPollerMallocBloatThreshold (by default, set to 0.3) with the following expression:
free-memory-in-malloc / RSS
If this expression evaluates to a value higher than MemoryPollerMallocBloatThreshold, the memory poller calls glibc function malloc_trim(). This function reclaims free memory from malloc and returns it to the operating system. Details on calls to malloc_trim() are written to system table MEMORY_EVENTS.
To disable polling of this threshold, set the parameter to 0.
Default: 0.3
- MemoryPollerReportThreshold
- Threshold of memory usage that determines whether the Vertica memory poller writes a report.
The memory poller compares MemoryPollerReportThreshold with the following expression:
RSS / available-memory
When this expression evaluates to a value higher than MemoryPollerReportThreshold (by default, set to 0.93), the memory poller writes a report to MemoryReport.log, in the Vertica working directory. This report includes information about Vertica memory pools, how much memory is consumed by individual queries and sessions, and so on. The memory poller also logs the report as an event in system table MEMORY_EVENTS, where it sets EVENT_TYPE to MEMORY_REPORT.
To disable polling of this threshold, set the parameter to 0.
Default: 0.93
- MemoryPollerTrimThreshold
- Threshold for the memory poller to start checking whether to trim glibc-allocated memory.
The memory poller compares MemoryPollerTrimThreshold (by default, set to 0.83) with the following expression:
RSS / available-memory
If this expression evaluates to a value higher than MemoryPollerTrimThreshold, then the memory poller starts checking the next threshold, set in MemoryPollerMallocBloatThreshold, for glibc memory bloat.
To disable polling of this threshold, set the parameter to 0. Doing so also disables polling of MemoryPollerMallocBloatThreshold.
Default: 0.83
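For example, to write memory reports at a lower usage level, or to disable the memory poller entirely, you might use statements like the following (illustrative values; modify these parameters only under guidance from Vertica Support):
=> ALTER DATABASE DEFAULT SET MemoryPollerReportThreshold = 0.85;
=> ALTER DATABASE DEFAULT SET MemoryPollerIntervalSec = 0;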
4.14 - Monitoring parameters
The following parameters control options for monitoring the Vertica database.
Query the
CONFIGURATION_PARAMETERS system table to determine what levels (node, session, user, database) are valid for a given parameter.
- EnableDataCollector
- Enables the Data Collector, which is the Workload Analyzer's internal diagnostics utility. Affects all sessions on all nodes. To disable, set to 0.
Default: 1 (enabled)
- EnableLogRotate
- Enables and disables the timed LogRotate service.
Default: 1 (enabled)
- LogRotateInterval
- Interval literal, how often the LogRotate service runs.
Default: 8h (runs every 8 hours)
- LogRotateMaxAge
- Interval literal, the maximum age of rotated logs (.gz) to keep. Rotated logs with an age greater than this value are removed. For details, see Rotating log files.
Default: 7 days
- LogRotateMaxSize
- String, the maximum size of a log (.log) file. The string has format integer{K|M|G|T}, where K is kibibytes, M is mebibytes, G is gibibytes, and T is tebibytes. Logs larger than this value are rotated. For details, see Rotating log files.
Default: 100M
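For example, to rotate logs every 12 hours, keep rotated logs for 30 days, and rotate any log file that exceeds 500 mebibytes (illustrative values):
=> ALTER DATABASE DEFAULT SET LogRotateInterval = '12h';
=> ALTER DATABASE DEFAULT SET LogRotateMaxAge = '30 days';
=> ALTER DATABASE DEFAULT SET LogRotateMaxSize = '500M';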
- SnmpTrapDestinationsList
- Defines where Vertica sends traps for SNMP. See Configuring reporting for SNMP. For example:
=> ALTER DATABASE DEFAULT SET SnmpTrapDestinationsList = 'localhost 162 public';
Default: none
- SnmpTrapsEnabled
- Enables event trapping for SNMP. See Configuring reporting for SNMP.
Default: 0
- SnmpTrapEvents
- Defines which events Vertica traps through SNMP. See Configuring reporting for SNMP. For example:
=> ALTER DATABASE DEFAULT SET SnmpTrapEvents = 'Low Disk Space, Recovery Failure';
Default: Low Disk Space, Read Only File System, Loss of K Safety, Current Fault Tolerance at Critical Level, Too Many ROS Containers, Node State Change, Recovery Failure, Stale Checkpoint, and CRC Mismatch.
- SyslogEnabled
- Enables event trapping for syslog. See Configuring reporting for syslog.
Default: 0 (disabled)
- SyslogEvents
- Defines events that generate a syslog entry. See Configuring reporting for syslog. For example:
=> ALTER DATABASE DEFAULT SET SyslogEvents = 'Low Disk Space, Recovery Failure';
Default: none
- SyslogFacility
- Defines which SyslogFacility Vertica uses. See Configuring reporting for syslog.
Default: user
4.15 - Numeric precision parameters
The following configuration parameters let you configure numeric precision for numeric data types. For more about using these parameters, see Numeric data type overflow with SUM, SUM_FLOAT, and AVG.
Query the
CONFIGURATION_PARAMETERS system table to determine what levels (node, session, user, database) are valid for a given parameter.
- AllowNumericOverflow
- Boolean, set to one of the following:
- 1 (true): Allows silent numeric overflow. Vertica does not implicitly extend precision of numeric data types. Vertica ignores the value of NumericSumExtraPrecisionDigits.
- 0 (false): Vertica produces an overflow error if a result exceeds the precision set by NumericSumExtraPrecisionDigits.
Default: 1 (true)
- NumericSumExtraPrecisionDigits
- An integer between 0 and 20, inclusive. Vertica produces an overflow error if a result exceeds the specified precision. This parameter setting only applies if AllowNumericOverflow is set to 0 (false).
Default: 6 (places beyond the DDL-specified precision)
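For example, to raise an error on overflow instead of allowing it silently, and to allow 10 extra digits of precision for sums (illustrative values):
=> ALTER DATABASE DEFAULT SET AllowNumericOverflow = 0;
=> ALTER DATABASE DEFAULT SET NumericSumExtraPrecisionDigits = 10;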
4.16 - Profiling parameters
The following table describes the profiling parameters for configuring Vertica. See Profiling database performance for more information on profiling queries.
Query the
CONFIGURATION_PARAMETERS system table to determine what levels (node, session, user, database) are valid for a given parameter.
- GlobalEEProfiling
- Enables profiling for query execution runs in all sessions on all nodes.
Default: 0
- GlobalQueryProfiling
- Enables query profiling for all sessions on all nodes.
Default: 0
- GlobalSessionProfiling
- Enables session profiling for all sessions on all nodes.
Default: 0
- SaveDCEEProfileThresholdUS
- Sets in microseconds the query duration threshold for saving profiling information to system tables QUERY_CONSUMPTION and EXECUTION_ENGINE_PROFILES. You can set this parameter to a maximum value of 2147483647 (2^31 - 1, or ~35.79 minutes).
Default: 60000000 (60 seconds)
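For example, to enable execution-engine profiling for all sessions and save profiles only for queries that run longer than two minutes (illustrative values):
=> ALTER DATABASE DEFAULT SET GlobalEEProfiling = 1;
=> ALTER DATABASE DEFAULT SET SaveDCEEProfileThresholdUS = 120000000;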
4.17 - Projection parameters
The following configuration parameters help you manage projections.
Query the
CONFIGURATION_PARAMETERS system table to determine what levels (node, session, user, database) are valid for a given parameter.
- AnalyzeRowCountInterval
- Specifies how often Vertica checks the number of projection rows and whether the threshold set by ARCCommitPercentage has been crossed.
For more information, see Collecting database statistics.
Default: 86400 seconds (24 hours)
- ARCCommitPercentage
- Sets the threshold percentage of difference between the last-recorded aggregate projection row count and current row count for a given table. When the difference exceeds this threshold, Vertica updates the catalog with the current row count.
Default: 3 (percent)
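For example, to check projection row counts every hour and update the catalog when counts drift by more than 5 percent (illustrative values):
=> ALTER DATABASE DEFAULT SET AnalyzeRowCountInterval = 3600;
=> ALTER DATABASE DEFAULT SET ARCCommitPercentage = 5;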
- ContainersPerProjectionLimit
- Specifies how many ROS containers Vertica creates per projection before ROS pushback occurs.
Caution
Increasing this parameter's value can cause serious degradation of database performance. Vertica strongly recommends that you not modify this parameter without first consulting with Customer Support professionals.
Default: 1024
- MaxAutoSegColumns
- Specifies the number of columns (0–1024) to use in an auto-projection's hash segmentation clause. Set to 0 to use all columns.
Default: 8
- MaxAutoSortColumns
- Specifies the number of columns (0–1024) to use in an auto-projection's sort expression. Set to 0 to use all columns.
Default: 8
- RebalanceQueryStorageContainers
- By default, prior to performing a rebalance, Vertica performs a system table query to compute the size of all projections involved in the rebalance task. This query enables Vertica to optimize the rebalance to most efficiently utilize available disk space. This query can, however, significantly increase the time required to perform the rebalance.
By disabling the system table query, you can reduce the time required to perform the rebalance. If your nodes are low on disk space, disabling the query increases the chance that a node runs out of disk space. In that situation, the rebalance fails.
Default: 1 (enable)
- RewriteQueryForLargeDim
- If enabled (1), Vertica rewrites a SET USING or DEFAULT USING query during a REFRESH_COLUMNS operation by reversing the inner and outer join between the target and source tables. Doing so can optimize refresh performance in cases where the source data is in a table that is larger than the target table.
Important
Enable this parameter only if the SET USING source data is in a table that is larger than the target table. If the source data is in a table smaller than the target table, then enabling RewriteQueryForLargeDim can adversely affect refresh performance.
Default: 0
- SegmentAutoProjection
- Determines whether auto-projections are segmented if the table definition omits a segmentation clause. You can set this parameter at database and session scopes.
Default: 1 (create segmented auto projections)
4.18 - S3 parameters
Use the following parameters to configure reading from S3 file systems and on-premises storage with S3-compatible APIs. For more information about reading data from S3, see S3 Object Store.
For external tables using highly partitioned data in an object store, see the ObjectStoreGlobStrategy configuration parameter and Partitions on Object Stores.
Query the CONFIGURATION_PARAMETERS system table to determine what levels (node, session, user, database) are valid for a given parameter.
- AWSAuth
- ID and secret key for authentication. For extra security, do not store credentials in the database; use ALTER SESSION...SET PARAMETER to set this value for the current session only. If you use a shared credential, you can set it in the database with ALTER DATABASE...SET PARAMETER. For example:
=> ALTER SESSION SET AWSAuth='ID:secret';
In AWS, these arguments are named AccessKeyID and SecretAccessKey.
To use admintools create_db
or revive_db
for Eon Mode on-premises, create a configuration file called auth_params.conf
with these settings:
AWSAuth = key:secret
AWSEndpoint = IP:port
- AWSCAFile
File name of the TLS server certificate bundle to use. If set, this parameter overrides the Vertica default CA bundle path specified in the SystemCABundlePath parameter.
=> ALTER DATABASE DEFAULT SET AWSCAFile = '/etc/ssl/ca-bundle.pem';
Default: system-dependent
- AWSCAPath
Path Vertica uses to look up TLS server certificate bundles. If set, this parameter overrides the Vertica default CA bundle path specified in the SystemCABundlePath parameter.
=> ALTER DATABASE DEFAULT SET AWSCAPath = '/etc/ssl/';
Default: system-dependent
- AWSEnableHttps
Boolean, specifies whether to use the HTTPS protocol when connecting to S3. This parameter can be set only at the database level with ALTER DATABASE. If you choose not to use TLS, this parameter must be set to 0.
Default: 1 (enabled)
- AWSEndpoint
- Endpoint to use when interpreting S3 URLs, set as follows.
Important
Do not include http(s):// for AWS endpoints.
- AWS: hostname_or_ip:port_number.
- AWS with a FIPS-compliant S3 Endpoint: S3_hostname, and enable virtual addressing:
AWSEndpoint = s3-fips.dualstack.us-east-1.amazonaws.com
S3EnableVirtualAddressing = 1
- On-premises/Pure: IP address of the Pure Storage server. If using admintools create_db or revive_db, create configuration file auth_params.conf and include these settings:
awsauth = key:secret
awsendpoint = IP:port
- When AWSEndpoint is not set, the default behavior is to use virtual-hosted request URLs.
Default: s3.amazonaws.com
- AWSLogLevel
- Log level, one of the following:
- OFF
- FATAL
- ERROR
- WARN
- INFO
- DEBUG
- TRACE
Default: ERROR
- AWSRegion
- AWS region containing the S3 bucket from which to read files. This parameter can only be configured with one region at a time. If you need to access buckets in multiple regions, change the parameter each time you change regions.
If you do not set the correct region, you might experience a delay before queries fail because Vertica retries several times before giving up.
Default: us-east-1
- AWSSessionToken
- Temporary security token generated by running the get-session-token command, which generates temporary credentials you can use to configure multi-factor authentication.
Set this parameter in a user session using ALTER SESSION. You can set this parameter at the database level, but be aware that session tokens are temporary. When the token expires, any attempt to access AWS fails.
Note
If you use session tokens at the session level, you must set all parameters at the session level, even if some of them are set at the database level. Use ALTER SESSION to set session parameters.
- AWSStreamingConnectionPercentage
- Controls the number of connections to the communal storage that Vertica uses for streaming reads. In a cloud environment, this setting helps prevent streaming data from communal storage using up all available file handles. It leaves some file handles available for other communal storage operations.
Due to the low latency of on-premises object stores, this option is unnecessary for an Eon Mode database that uses on-premises communal storage. In this case, disable the parameter by setting it to 0.
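For example, an Eon Mode database that uses on-premises communal storage might disable this behavior (an illustrative statement):
=> ALTER DATABASE DEFAULT SET AWSStreamingConnectionPercentage = 0;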
- S3BucketConfig
- A JSON array of objects specifying per-bucket configuration overrides. Each property other than the bucket name has a corresponding configuration parameter (shown in parentheses). If both the database-level parameter and its equivalent in S3BucketConfig are set, the value in S3BucketConfig takes precedence.
Properties:
- bucket: Name of the bucket
- region (AWSRegion): Name of the region
- protocol (AWSEnableHttps): Connection protocol, either http or https
- endpoint (AWSEndpoint): Endpoint URL or IP address
- enableVirtualAddressing (S3EnableVirtualAddressing): Whether to rewrite the S3 URL to use a virtual hosted path
- requesterPays (S3RequesterPays): Whether requester (instead of bucket owner) pays the cost of accessing data on the bucket
- serverSideEncryption (S3ServerSideEncryption): Encryption algorithm if using SSE-S3 or SSE-KMS, one of AES256, aws:kms, or an empty string
- sseCustomerAlgorithm (S3SseCustomerAlgorithm): Encryption algorithm if using SSE-C; must be AES256
- sseCustomerKey (S3SseCustomerKey): Key if using SSE-C encryption, either 32-character plaintext or 44-character base64-encoded
- sseKmsKeyId (S3SseKmsKeyId): Key ID if using SSE-KMS encryption
- proxy (S3Proxy): HTTP(S) proxy string
The configuration properties for a given bucket might differ based on its type. For example, the following S3BucketConfig is for an AWS bucket AWSBucket and a Pure Storage bucket PureStorageBucket. AWSBucket doesn't specify an endpoint, so Vertica uses the value of AWSEndpoint, which defaults to s3.amazonaws.com:
ALTER DATABASE DEFAULT SET S3BucketConfig=
'[
{
"bucket": "AWSBucket",
"region": "us-east-2",
"protocol": "https",
"requesterPays": true
},
{
"bucket": "PureStorageBucket",
"endpoint": "pure.mycorp.net:1234",
"protocol": "http",
"enableVirtualAddressing": false
}
]';
- S3BucketCredentials
- Contains credentials for accessing an S3 bucket. Each property in S3BucketCredentials has an equivalent parameter (shown in parentheses). When set, S3BucketCredentials takes precedence over both AWSAuth and AWSSessionToken.
Providing credentials for more than one bucket authenticates to them simultaneously, allowing you to perform cross-endpoint joins, export from one bucket to another, etc.
Properties:
- bucket: Name of the bucket
- accessKey: Access key for the bucket (the ID in AWSAuth)
- secretAccessKey: Secret access key for the bucket (the secret in AWSAuth)
- sessionToken: Session token, only used when S3BucketCredentials is set at the session level (AWSSessionToken)
For example, the following S3BucketCredentials is for an AWS bucket AWSBucket and a Pure Storage bucket PureStorageBucket and sets all possible properties:
ALTER SESSION SET S3BucketCredentials='
[
{
"bucket": "AWSBucket",
"accessKey": "<AK0>",
"secretAccessKey": "<SAK0>",
"sessionToken": "1234567890"
},
{
"bucket": "PureStorageBucket",
"accessKey": "<AK1>",
"secretAccessKey": "<SAK1>"
}
]';
This parameter is only visible to the superuser. Users can set this parameter at the session level with ALTER SESSION.
- S3EnableVirtualAddressing
- Boolean, specifies whether to rewrite S3 URLs to use virtual-hosted paths. For example, if you use AWS, the S3 URLs change to bucketname.s3.amazonaws.com instead of s3.amazonaws.com/bucketname. This configuration setting takes effect only when you have specified a value for AWSEndpoint.
If you set AWSEndpoint to a FIPS-compliant S3 Endpoint, you must enable S3EnableVirtualAddressing in auth_params.conf:
AWSEndpoint = s3-fips.dualstack.us-east-1.amazonaws.com
S3EnableVirtualAddressing = 1
The value of this parameter does not affect how you specify S3 paths.
Default: 0 (disabled)
Note
As of September 30, 2020, AWS requires virtual address paths for newly created buckets.
- S3Proxy
- HTTP(S) proxy settings, if needed, a string in the following format:
http[s]://[user:password@]host[:port]
Default: "" (empty string)
- S3RequesterPays
- Boolean, specifies whether requester (instead of bucket owner) pays the cost of accessing data on the bucket. When true, the bucket owner is only responsible for paying the cost of storing the data, rather than all costs associated with the bucket; must be set in order to access S3 buckets configured as Requester Pays buckets. By setting this property to true, you are accepting the charges for accessing data. If not specified, the default value is false.
- S3ServerSideEncryption
- String, encryption algorithm to use when reading or writing to S3. The value depends on which type of encryption at rest is configured for S3:
- AES256: Use for SSE-S3 encryption
- aws:kms: Use for SSE-KMS encryption
- Empty string (""): No encryption
SSE-C encryption does not use this parameter. Instead, see S3SseCustomerAlgorithm.
For details on using SSE parameters, see S3 object store.
Default: "" (no encryption)
- S3SseCustomerAlgorithm
- String, the encryption algorithm to use when reading or writing to S3 using SSE-C encryption. The only supported values are AES256 and "".
For SSE-S3 and SSE-KMS, instead use S3ServerSideEncryption.
Default: "" (no encryption)
- S3SseCustomerKey
- If using SSE-C encryption, the client key for S3 access.
- S3SseKmsKeyId
- If using SSE-KMS encryption, the key identifier (not the key) to pass to the Key Management Server. Vertica must have permission to use the key, which is managed through KMS.
4.19 - SNS parameters
The following parameters configure Amazon Simple Notification Service (SNS) notifiers. These parameters can only be set at the database level and some, if unset, fall back to their equivalent S3 parameters.
Notifiers must be disabled and then reenabled for these parameters to take effect:
=> ALTER NOTIFIER sns_notifier DISABLE;
=> ALTER NOTIFIER sns_notifier ENABLE;
- SNSAuth (falls back to AWSAuth)
- ID and secret key for authentication, equivalent to the AccessKeyID and SecretAccessKey in AWS. For example:
=> ALTER DATABASE DEFAULT SET SNSAuth='myID:mySecret';
Default: "" (empty string)
- SNSCAFile (falls back to AWSCAFile)
File name of the TLS server certificate bundle to use. If set, this parameter overrides the Vertica default CA bundle path specified in the SystemCABundlePath parameter.
=> ALTER DATABASE DEFAULT SET SNSCAFile = '/etc/ssl/ca-bundle.pem';
Default: "" (empty string)
- SNSCAPath (falls back to AWSCAPath)
Path Vertica uses to look up TLS server certificate bundles. If set, this parameter overrides the Vertica default CA bundle path specified in the SystemCABundlePath parameter.
=> ALTER DATABASE DEFAULT SET SNSCAPath = '/etc/ssl/';
Default: "" (empty string)
- SNSEnableHttps
Boolean, specifies whether to use the HTTPS protocol when connecting to SNS. This parameter can be set only at the database level with ALTER DATABASE. If you choose not to use TLS, this parameter must be set to 0.
Default: 1 (enabled)
- SNSEndpoint
- URL of the SNS API endpoint. If this parameter is set to an empty string and the region is specified (either by SNSRegion or its fallback to AWSRegion), Vertica automatically infers the appropriate endpoint.
If you use FIPS, you must manually specify a FIPS-compliant endpoint.
Default: "" (empty string)
- SNSRegion (falls back to AWSRegion)
- AWS region for the SNSEndpoint. This parameter can only be configured with one region at a time.
Default: "" (empty string)
4.20 - SQS parameters
The following parameters configure access to Amazon Simple Queue Service (SQS) queues. SQS is used with CREATE DATA LOADER to automatically load data from a queue into Vertica. These parameters can be set at the database or session level and some, if unset, fall back to their equivalent S3 parameters.
- SQSAuth (falls back to AWSAuth)
- ID and secret key for authentication, equivalent to the AccessKeyID and SecretAccessKey in AWS. For example:
=> ALTER DATABASE DEFAULT SET SQSAuth='myID:mySecret';
Default: "" (empty string)
- SQSQueueCredentials
- Per-queue credentials for accessing SQS queues, a JSON array. When set, these values take precedence over SQSAuth.
Properties:
- accessKey: Access key for the queue (the ID in AWSAuth)
- secretAccessKey: Secret access key for the queue (the secret in AWSAuth)
- sessionToken: AWS session token, only used when SQSQueueCredentials is set at the session level
The following example sets credentials for two queues:
=> ALTER SESSION SET SQSQueueCredentials='[
{"queue":"queue0","accessKey":"<AK0>","secretAccessKey":"<SAK0>","sessionToken":"1234567890"},
{"queue":"queue1","accessKey":"foo","secretAccessKey":"bar"}]';
Default: "" (empty string)
- SQSSessionToken
- Temporary, session-level SQS access token. Set this parameter in a user session using ALTER SESSION.
If you set a session token at the session level, you must also set SQSAuth at the session level, even if it is already set at the database level.
In general, SQSSessionToken should be set only at the session level. While you can set this parameter at the database level, session tokens are ephemeral, so when the database-level token expires, you must set it again either at the session or database level.
4.21 - Security parameters
Use these client authentication configuration parameters and general security parameters to configure TLS.
Query the
CONFIGURATION_PARAMETERS system table to determine what levels (node, session, user, database) are valid for a given parameter.
Database parameters
- DataSSLParams
- This parameter has been deprecated. Use the data_channel TLS Configuration instead.
Enables encryption using SSL on the data channel. The value of this parameter is a comma-separated list of the following:
- An SSL certificate (chainable)
- The corresponding SSL private key
- The SSL CA (Certificate Authority) certificate.
You should set EncryptSpreadComm before setting this parameter.
In the following example, the SSL Certificate contains two certificates, where the certificate for the non-root CA verifies the certificate for the cluster. This is called an SSL Certificate Chain.
=> ALTER DATABASE DEFAULT SET PARAMETER DataSSLParams =
'----BEGIN CERTIFICATE-----<certificate for Cluster>-----END CERTIFICATE-----
-----BEGIN CERTIFICATE-----<certificate for non-root CA>-----END CERTIFICATE-----,
-----BEGIN RSA PRIVATE KEY-----<private key for Cluster>-----END RSA PRIVATE KEY-----,
-----BEGIN CERTIFICATE-----<certificate for public CA>-----END CERTIFICATE-----';
- DefaultIdleSessionTimeout
- Indicates a default session timeout value for all users where
IDLESESSIONTIMEOUT
is not set. For example:
=> ALTER DATABASE DEFAULT SET defaultidlesessiontimeout = '300 secs';
- DHParams
-
Deprecated
This parameter has been deprecated and will be removed in a future release.
String, a Diffie-Hellman group of at least 2048 bits in the form:
-----BEGIN DH PARAMETERS-----...-----END DH PARAMETERS-----
You can generate your own or use the pre-calculated Modular Exponential (MODP) Diffie-Hellman groups specified in RFC 3526.
Changes to this parameter do not take effect until you restart the database.
Default: RFC 3526 2048-bit MODP Group 14:
-----BEGIN DH PARAMETERS-----MIIBCAKCAQEA///////////
JD9qiIWjCNMTGYouA3BzRKQJOCIpnzHQCC76mOxObIlFKCHmONAT
d75UZs806QxswKwpt8l8UN0/hNW1tUcJF5IW1dmJefsb0TELppjf
tawv/XLb0Brft7jhr+1qJn6WunyQRfEsf5kkoZlHs5Fs9wgB8uKF
jvwWY2kg2HFXTmmkWP6j9JM9fg2VdI9yjrZYcYvNWIIVSu57VKQd
wlpZtZww1Tkq8mATxdGwIyhghfDKQXkYuNs474553LBgOhgObJ4O
i7Aeij7XFXfBvTFLJ3ivL9pVYFxg5lUl86pVq5RXSJhiY+gUQFXK
OWoqsqmj//////////wIBAg==-----END DH PARAMETERS-----
- DisableInheritedPrivileges
- Boolean, whether to disable privilege inheritance at the database level (enabled by default).
- DoUserSpecificFilteringInSysTables
- Boolean, specifies whether a non-superuser can view details of another user.
Default: 0
- EnableAllRolesOnLogin
- Boolean, specifies whether to automatically enable all roles granted to a user on login:
- 0: Do not automatically enable roles
- 1: Automatically enable roles. With this setting, users do not need to run SET ROLE.
Default: 0 (disable)
- EnabledCipherSuites
- Specifies which SSL cipher suites to use for secure client-server communication. Changes to this parameter apply only to new connections.
Default: Vertica uses the Microsoft Schannel default cipher suites. For more information, see the Schannel documentation.
- EnableConnectCredentialForwarding
- When using CONNECT TO VERTICA, whether to use the password of the current user to authenticate to a target database.
Default: 0 (disable)
- EnableOAuth2JITCleanup
- If enabled, users created by just-in-time OAuth provisioning are automatically dropped if the user does not log in after the number of days specified by OAuth2UserExpiredInterval.
To view provisioned users, see USERS.
Default: 0 (disable)
- EncryptSpreadComm
- Enables Spread encryption on the control channel, set to one of the following strings:
- vertica: Specifies that Vertica generates the Spread encryption key for the database cluster.
- aws-kms|key-name, where key-name is a named key in the AWS Key Management Service (KMS). On database restart, Vertica fetches the named key from the KMS instead of generating its own.
If the parameter is empty, Spread communication is unencrypted. In general, you should enable this parameter before modifying other security parameters.
Enabling this parameter requires database restart.
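For example, to have Vertica generate the Spread encryption key itself (an illustrative statement; the change takes effect only after a database restart):
=> ALTER DATABASE DEFAULT SET EncryptSpreadComm = 'vertica';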
- GlobalHeirUsername
- A string that specifies which user inherits objects after their owners are dropped. This setting ensures preservation of data that would otherwise be lost.
Set this parameter to one of the following string values:
- Empty string: Objects of dropped users are removed from the database.
- username: Reassigns objects of dropped users to username. If username does not exist, Vertica creates that user and sets GlobalHeirUsername to it.
- <auto>: Reassigns objects of dropped LDAP or just-in-time-provisioned users to the dbadmin user. The brackets (< and >) are required for this option.
For more information about usage, see Examples.
Default: <auto>
- ImportExportTLSMode
- When using CONNECT TO VERTICA to connect to another Vertica cluster for import or export, specifies the degree of stringency for using TLS. Possible values are:
- PREFER: Try TLS but fall back to plaintext if TLS fails.
- REQUIRE: Use TLS and fail if the server does not support TLS.
- VERIFY_CA: Require TLS (as with REQUIRE), and also validate the other server's certificate using the CA specified by the "server" TLS Configuration's CA certificates (in this case, "ca_cert" and "ica_cert"):
=> SELECT name, certificate, ca_certificate, mode FROM tls_configurations WHERE name = 'server';
name | certificate | ca_certificate | mode
--------+------------------+--------------------+-----------
server | server_cert | ca_cert,ica_cert | VERIFY_CA
(1 row)
- VERIFY_FULL: Require TLS and validate the certificate (as with VERIFY_CA), and also validate the server certificate's hostname.
- REQUIRE_FORCE, VERIFY_CA_FORCE, and VERIFY_FULL_FORCE: Same behavior as REQUIRE, VERIFY_CA, and VERIFY_FULL, respectively, and cannot be overridden by CONNECT TO VERTICA.
Default: PREFER
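For example, to require TLS with CA validation for all import and export connections (an illustrative statement):
=> ALTER DATABASE DEFAULT SET ImportExportTLSMode = 'VERIFY_CA';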
- InternodeTLSConfig
- The TLS Configuration to use for internode encryption.
For example:
=> ALTER DATABASE DEFAULT SET InternodeTLSConfig = my_tls_config;
Default: data_channel
- LDAPAuthTLSConfig
- The TLS Configuration to use for TLS with LDAP authentication.
For example:
=> ALTER DATABASE DEFAULT SET LDAPAuthTLSConfig = my_tls_config;
Default: ldapauth
- LDAPLinkTLSConfig
- The TLS Configuration to use for TLS for the LDAP Link service.
For example:
=> ALTER DATABASE DEFAULT SET LDAPLinkTLSConfig = my_tls_config;
Default: ldaplink
- OAuth2JITClient
- Deprecated
This parameter is deprecated. Use the OAuth2JITRolesClaimName parameter instead.
The client/application name that contains Vertica roles in the Keycloak identity provider. The Vertica roles under resource_access.OAuth2JITClient.roles are automatically granted to and set as default roles for users created by just-in-time provisioning. Roles that do not exist in Vertica are ignored.
For details, see Just-in-time user provisioning.
Default: vertica
- OAuth2JITRolesClaimName
- Specifies an IdP client with one or more roles that correspond to a Vertica database role. During just-in-time user provisioning, each client role that matches an existing Vertica role is automatically assigned to the user. Roles that do not exist in Vertica are ignored.
For authentication records that use JWT validation, Vertica retrieves the client roles from the access token.
To retrieve client roles for authentication records that use IdP validation, Vertica sends a request to either the introspect_url or the userinfo_url set in the authentication record. This parameter requires the full path to the roles claim in that request. For example, the full path to the roles claim in a Keycloak request uses the following format:
resource_access.clientName.roles
For details, see Just-in-time user provisioning.
Default: resource_access.vertica.roles
- OAuth2JITForbiddenRoles
- Vertica database roles that you do not want Vertica to automatically assign to users during just-in-time provisioning. This parameter accepts one or more roles as a comma-separated list.
- OAuth2JITGroupsClaimName
- The name of the IdP group whose name or role mappings correspond to Vertica database roles. During just-in-time user provisioning, each group name or group role mapping that matches an existing Vertica role is automatically assigned to the user. Roles that do not exist in Vertica are ignored.
For authentication records that use JWT validation, Vertica retrieves the JIT-provisioned user's groups from the access token.
To retrieve the JIT-provisioned user's groups for authentication records that use IdP validation, Vertica sends a request to either the introspect_url or the userinfo_url set in the authentication record.
Default: groups
- OAuth2UserExpiredInterval
- If EnableOAuth2JITCleanup is enabled, users created by just-in-time OAuth provisioning are automatically dropped after not logging in for the number of days specified by OAuth2UserExpiredInterval. The number of days the user has not logged in is calculated relative to the LAST_LOGIN_TIME column in the USERS system table.
Note
The LAST_LOGIN_TIME as recorded by the USERS system table is not persistent; if the database is restarted, the LAST_LOGIN_TIME for users created by just-in-time user provisioning is set to the database start time (this appears as an empty value in LAST_LOGIN_TIME).
You can view the database start time by querying the DATABASES system table:
=> SELECT database_name, start_time FROM databases;
database_name | start_time
---------------+-------------------------------
VMart | 2023-02-06 14:26:50.630054-05
(1 row)
Default: 14
- PasswordLockTimeUnit
- The time units for which an account is locked by PASSWORD_LOCK_TIME after FAILED_LOGIN_ATTEMPTS, one of the following:
- 'd': days (default)
- 'h': hours
- 'm': minutes
- 's': seconds
For example, to configure the default profile to lock user accounts for 30 minutes after three unsuccessful login attempts:
=> ALTER DATABASE DEFAULT SET PasswordLockTimeUnit = 'm';
=> ALTER PROFILE DEFAULT LIMIT PASSWORD_LOCK_TIME 30;
- RequireFIPS
- Boolean, specifies whether the FIPS mode is enabled.
On startup, Vertica automatically sets this parameter from the contents of the file crypto.fips_enabled. You cannot modify this parameter.
For details, see FIPS compliance for the Vertica server.
Default: 0
- SecurityAlgorithm
- Sets the algorithm for the function that hash authentication uses, one of the following:
For example:
=> ALTER DATABASE DEFAULT SET SecurityAlgorithm = 'SHA512';
Default: SHA512
- ServerTLSConfig
- The TLS Configuration to use for client-server TLS.
For example:
=> ALTER DATABASE DEFAULT SET ServerTLSConfig = my_tls_config;
Default: server
- SystemCABundlePath
- The absolute path to a certificate bundle of trusted CAs. This CA bundle is used when establishing TLS connections to external services such as AWS or Azure through their respective SDKs and libcurl. The CA bundle file must be in the same location on all nodes.
If this parameter is empty, Vertica searches the "standard" paths for the CA bundles, which differs between distributions:
- Red Hat-based: /etc/pki/tls/certs/ca-bundle.crt
- Debian-based: /etc/ssl/certs/ca-certificates.crt
- SUSE: /var/lib/ca-certificates/ca-bundle.pem
Example:
=> ALTER DATABASE DEFAULT SET SystemCABundlePath = 'path/to/ca_bundle.pem';
Default: Empty
TLS parameters
To set your Vertica database's TLSMode, private key, server certificate, and CA certificate(s), see TLS configurations. In versions prior to 11.0.0, these parameters were known as EnableSSL, SSLPrivateKey, SSLCertificate, and SSLCA, respectively.
Examples
Set the database parameter GlobalHeirUsername
:
=> \du
List of users
User name | Is Superuser
-----------+--------------
Joe | f
SuzyQ | f
dbadmin | t
(3 rows)
=> ALTER DATABASE DEFAULT SET PARAMETER GlobalHeirUsername='SuzyQ';
ALTER DATABASE
=> \c - Joe
You are now connected as user "Joe".
=> CREATE TABLE t1 (a int);
CREATE TABLE
=> \c
You are now connected as user "dbadmin".
=> \dt t1
List of tables
Schema | Name | Kind | Owner | Comment
--------+------+-------+-------+---------
public | t1 | table | Joe |
(1 row)
=> DROP USER Joe;
NOTICE 4927: The Table t1 depends on User Joe
ROLLBACK 3128: DROP failed due to dependencies
DETAIL: Cannot drop User Joe because other objects depend on it
HINT: Use DROP ... CASCADE to drop the dependent objects too
=> DROP USER Joe CASCADE;
DROP USER
=> \dt t1
List of tables
Schema | Name | Kind | Owner | Comment
--------+------+-------+-------+---------
public | t1 | table | SuzyQ |
(1 row)
4.22 - Stored procedure parameters
The following parameters configure the behavior of stored procedures and triggers.
- EnableNestedStoredProcedures
- Boolean, whether to enable nested stored procedures.
Default: 0
- EnableStoredProcedureScheduler
- Boolean, whether to enable the scheduler. For details, see Scheduled execution.
Default: 1
- PLvSQLCoerceNull
- Boolean, whether to allow NULL-to-false type coercion to improve compatibility with PLpgSQL. For details, see PL/pgSQL to PL/vSQL migration guide.
Default: 0
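For example, to allow stored procedures to call other stored procedures and to enable NULL-to-false coercion for migrated PL/pgSQL code (illustrative statements):
=> ALTER DATABASE DEFAULT SET EnableNestedStoredProcedures = 1;
=> ALTER DATABASE DEFAULT SET PLvSQLCoerceNull = 1;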
4.23 - Text search parameters
You can configure Vertica for text search using the following parameter.
- TextIndexMaxTokenLength
- Controls the maximum size of a token in a text index.
For example:
ALTER DATABASE database_name SET PARAMETER TextIndexMaxTokenLength=760;
If the parameter is set to a value greater than 65000 characters, then the tokenizer truncates the token at 65000 characters.
Caution
Avoid setting this parameter near its maximum value, 65000. Doing so can result in a significant decrease in performance. For optimal performance, set this parameter to the maximum token value of your tokenizer.
Default: 128 (characters)
4.24 - Tuple mover parameters
These parameters control how the Tuple Mover operates.
These parameters control how the Tuple Mover operates.
Query the CONFIGURATION_PARAMETERS system table to determine what levels (node, session, user, database) are valid for a given parameter.
- ActivePartitionCount
- Sets the number of active partitions. The active partitions are those most recently created. For example:
=> ALTER DATABASE DEFAULT SET ActivePartitionCount = 2;
For information about how the Tuple Mover treats active and inactive partitions during a mergeout operation, see Partition mergeout.
Default: 1
- CancelTMTimeout
- When partition, copy table, and rebalance operations encounter a conflict with an internal Tuple Mover job, those operations attempt to cancel the conflicting Tuple Mover job. This parameter specifies the amount of time, in seconds, that the blocked operation waits for the Tuple Mover cancellation to take effect. If the operation is unable to cancel the Tuple Mover job within the limit specified by this parameter, the operation displays an error and rolls back.
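For example, to give blocked operations ten minutes to cancel a conflicting Tuple Mover job (an illustrative value):
=> ALTER DATABASE DEFAULT SET CancelTMTimeout = 600;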
Default: 300
- EnableTMOnRecoveringNode
- Boolean, specifies whether Tuple Mover performs mergeout activities on nodes with a node state of RECOVERING. Enabling Tuple Mover reduces the number of ROS containers generated during recovery. Having fewer than 1024 ROS containers per projection allows Vertica to maintain optimal recovery performance.
Default: 1 (enabled)
- MaxMrgOutROSSizeMB
- Specifies in MB the maximum size of ROS containers that are candidates for mergeout operations. The Tuple Mover avoids merging ROS containers that are larger than this setting.
Note
After a rebalance operation, the Tuple Mover groups ROS containers in batches that are smaller than MaxMrgOutROSSizeMB. ROS containers that are larger than MaxMrgOutROSSizeMB are merged individually.
Default: -1 (no maximum limit)
- MergeOutInterval
- Specifies in seconds how frequently the Tuple Mover checks the mergeout request queue for pending requests:
-
If the queue contains mergeout requests, the Tuple Mover does nothing and goes back to sleep.
-
If the queue is empty, the Tuple Mover performs its other scheduled work and then goes back to sleep.
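For example, to have the Tuple Mover check the mergeout request queue every five minutes (an illustrative value):
=> ALTER DATABASE DEFAULT SET MergeOutInterval = 300;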
Default: 600
- PurgeMergeoutPercent
- Specifies as a percentage the threshold of deleted records in a ROS container that invokes an automatic mergeout operation to purge those records. Vertica counts only 'aged-out' delete vectors, that is, delete vectors that are as old as or older than the ancient history mark (AHM) epoch.
This threshold applies to all ROS containers for non-partitioned tables. It also applies to ROS containers of all inactive partitions. In both cases, aged-out delete vectors are permanently purged from the ROS container.
Note
This configuration parameter only applies to automatic mergeout operations. It does not apply to manual mergeout operations that are invoked by calling the meta-functions DO_TM_TASK('mergeout') and PURGE.
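For example, to trigger an automatic purging mergeout when aged-out delete vectors reach 10 percent of a ROS container's records (an illustrative value):
=> ALTER DATABASE DEFAULT SET PurgeMergeoutPercent = 10;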
Default: 20 (percent)
5 - File systems and object stores
Vertica supports access to several file systems and object stores in addition to the Linux file system.
Vertica supports access to several file systems and object stores in addition to the Linux file system. The reference pages in this section provide information on URI syntax, configuration parameters, and authentication.
Vertica accesses the file systems in this section in one of two ways:
-
If user-provided credentials are present, Vertica uses them to access the storage. Note that on HDFS, user credentials are always present because Vertica accesses HDFS using the Vertica user identity.
-
If user-provided credentials are not present, or if the UseServerIdentityOverUserIdentity configuration parameter is set, Vertica checks for a configured USER storage location. When access is managed through USER storage locations, Vertica uses the server credential to access the file system. For more information about USER storage locations, see CREATE LOCATION.
Not all file systems are supported in all contexts. See the documentation of specific features for the file systems those features support.
5.1 - Azure Blob Storage object store
Azure has several interfaces for accessing data.
Azure has several interfaces for accessing data. Vertica reads and always writes Block Blobs in Azure Storage. Vertica can read external data created using ADLS Gen2, and data that Vertica exports can be read using ADLS Gen2.
One of the following:
-
azb://account/container/path
-
azb://[account@]host[:port]/container/path
In the first version, a URI like 'azb://myaccount/mycontainer/path' treats the first token after the '//' as the account name. In the second version, you can specify account and must specify host explicitly.
The following rules apply to the second form:
- If account is not specified, the first label of the host is used. For example, if the URI is 'azb://myaccount.blob.core.windows.net/mycontainer/my/object', then 'myaccount' is used for account.
- If account is not specified and host has a single label and no port, the endpoint is host.blob.core.windows.net. Otherwise, the endpoint is the host and port specified in the URI.
The protocol (HTTP or HTTPS) is specified in the AzureStorageEndpointConfig configuration parameter.
Authentication
If you are using Azure managed identities, no further configuration in Vertica is needed. If your Azure storage uses multiple managed identities, you must tag the one to be used. Vertica looks for an Azure tag with a key of VerticaManagedIdentityClientId, the value of which must be the client_id attribute of the managed identity to be used. If you update the Azure tag, call AZURE_TOKEN_CACHE_CLEAR.
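For example, after updating the tag you might clear the cached token so that the change takes effect; AZURE_TOKEN_CACHE_CLEAR is a meta-function, so it is called in a bare SELECT:
=> SELECT AZURE_TOKEN_CACHE_CLEAR();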
If you are not using managed identities, use the AzureStorageCredentials configuration parameter to provide credentials to Azure. If loading data, you can set the parameter at the session level. If using Eon Mode communal storage on Azure, you must set this configuration parameter at the database level.
In Azure you must also grant access to the containers for the identities used from Vertica.
Configuration parameters
The following database configuration parameters apply to the Azure blob file system. You can set parameters at different levels with the appropriate ALTER statement, such as ALTER SESSION...SET PARAMETER. Query the CONFIGURATION_PARAMETERS system table to determine what levels (node, session, user, database) are valid for a given parameter.
For external tables using highly partitioned data in an object store, see the ObjectStoreGlobStrategy configuration parameter and Partitions on Object Stores.
- AzureStorageCredentials
- Collection of JSON objects, each of which specifies connection credentials for one endpoint. This parameter takes precedence over Azure managed identities.
The collection must contain at least one object and may contain more. Each object must specify at least one of accountName or blobEndpoint, and at least one of accountKey or sharedAccessSignature.
accountName: If not specified, uses the label of blobEndpoint.
blobEndpoint: Host name with optional port (host:port). If not specified, uses account.blob.core.windows.net.
accountKey: Access key for the account or endpoint.
sharedAccessSignature: Access token for finer-grained access control, if being used by the Azure endpoint.
- AzureStorageEndpointConfig
- Collection of JSON objects, each of which specifies configuration elements for one endpoint. Each object must specify at least one of accountName or blobEndpoint.
accountName: If not specified, uses the label of blobEndpoint.
blobEndpoint: Host name with optional port (host:port). If not specified, uses account.blob.core.windows.net.
protocol: HTTPS (default) or HTTP.
isMultiAccountEndpoint: true if the endpoint supports multiple accounts, false otherwise (default is false). To use multiple-account access, you must include the account name in the URI. If a URI path contains an account, this value is assumed to be true unless explicitly set to false.
Examples
The following examples use these values for the configuration parameters. AzureStorageCredentials contains sensitive information and is set at the session level in this example.
=> ALTER SESSION SET AzureStorageCredentials =
'[{"accountName": "myaccount", "accountKey": "REAL_KEY"},
{"accountName": "myaccount", "blobEndpoint": "localhost:8080", "accountKey": "TEST_KEY"}]';
=> ALTER DATABASE default SET AzureStorageEndpointConfig =
'[{"accountName": "myaccount", "blobEndpoint": "localhost:8080", "protocol": "http"}]';
The following example creates an external table using data from Azure. The URI specifies an account name of "myaccount".
=> CREATE EXTERNAL TABLE users (id INT, name VARCHAR(20))
AS COPY FROM 'azb://myaccount/mycontainer/my/object/*';
Vertica uses AzureStorageEndpointConfig and the account name to produce the following location for the files:
https://myaccount.blob.core.windows.net/mycontainer/my/object/*
Data is accessed using the REAL_KEY credential.
If the URI in the COPY statement is instead azb://myaccount.blob.core.windows.net/mycontainer/my/object, then the resulting location is https://myaccount.blob.core.windows.net/mycontainer/my/object, again using the REAL_KEY credential.
However, if the URI in the COPY statement is azb://myaccount@localhost:8080/mycontainer/my/object, then the host and port specify a different endpoint: http://localhost:8080/myaccount/mycontainer/my/object. This endpoint is configured to use a different credential, TEST_KEY.
5.2 - Google Cloud Storage (GCS) object store
File system using the Google Cloud Storage platform.
File system using the Google Cloud Storage platform.
gs://bucket/path
Authentication
To access data in Google Cloud Storage (GCS) you must first do the following tasks:
-
Create a default project, obtain a developer key, and enable S3 interoperability mode as described in the GCS documentation.
-
Set the GCSAuth configuration parameter as in the following example.
=> ALTER SESSION SET GCSAuth='id:secret';
Configuration parameters
The following database configuration parameters apply to the GCS file system. You can set parameters at different levels with the appropriate ALTER statement, such as ALTER SESSION...SET PARAMETER. Query the CONFIGURATION_PARAMETERS system table to determine what levels (node, session, user, database) are valid for a given parameter. For information about all parameters related to GCS, see Google Cloud Storage parameters.
For external tables using highly partitioned data in an object store, see the ObjectStoreGlobStrategy configuration parameter and Partitions on Object Stores.
- GCSAuth
- An ID and secret key to authenticate to GCS. For extra security, do not store credentials in the database; instead, use ALTER SESSION...SET PARAMETER to set this value for the current session only.
- GCSEnableHttps
- Boolean, whether to use the HTTPS protocol when connecting to GCS, can be set only at the database level with ALTER DATABASE...SET PARAMETER.
Default: 1 (enabled)
- GCSEndpoint
- The connection endpoint address.
Default: storage.googleapis.com
Examples
The following example loads data from GCS:
=> ALTER SESSION SET GCSAuth='my_id:my_secret_key';
=> COPY t FROM 'gs://DataLake/clicks.parquet' PARQUET;
5.3 - HDFS file system
HDFS is the Hadoop Distributed File System.
HDFS is the Hadoop Distributed File System. You can use the webhdfs and swebhdfs schemes to access data through the WebHDFS service. Vertica also supports the hdfs scheme, which by default uses WebHDFS. To have hdfs URIs use the deprecated LibHDFS++ package, set the HDFSUseWebHDFS configuration parameter to 0 (disabled).
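For example, to have hdfs URIs use LibHDFS++ database-wide (shown for illustration only, since LibHDFS++ support is deprecated):
=> ALTER DATABASE DEFAULT SET HDFSUseWebHDFS = 0;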
If you specify a webhdfs URI but the Hadoop HTTP policy (dfs.http.policy) is set to HTTPS_ONLY, Vertica automatically uses swebhdfs instead.
If you use LibHDFS++, the WebHDFS service must still be available because Vertica falls back to WebHDFS for operations not supported by LibHDFS++.
Deprecated
Support for LibHDFS++ is deprecated. In the future, HDFSUseWebHDFS will be enabled in all cases and hdfs URIs will be equivalent to webhdfs URIs.
URIs in the webhdfs, swebhdfs, and hdfs schemes all have two formats, depending on whether you specify a name service or the host and port of a name node:
-
[[s]web]hdfs://[nameservice]/path
-
[[s]web]hdfs://namenode-host:port/path
Characters may be URL-encoded (%NN where NN is a two-digit hexadecimal number) but are not required to be, except that the '%' character must be encoded.
To use the default name service specified in the HDFS configuration files, omit nameservice. Use this shorthand only for reading external data, not for creating a storage location.
Always specify a name service or host explicitly when using Vertica with more than one HDFS cluster. The name service or host name must be globally unique. Using [web]hdfs:/// could produce unexpected results because Vertica uses the first value of fs.defaultFS that it finds.
Authentication
Vertica can use Kerberos authentication with Cloudera or Hortonworks HDFS clusters. See Accessing kerberized HDFS data.
For loading and exporting data, Vertica can access HDFS clusters protected by mTLS through the swebhdfs scheme. You must create a certificate and key and set the WebhdfsClientCertConf configuration parameter.
You can use CREATE KEY and CREATE CERTIFICATE to create temporary, session-scoped values if you specify the TEMPORARY keyword. Temporary keys and certificates are stored in memory, not on disk.
The WebhdfsClientCertConf configuration parameter holds client credentials for one or more HDFS clusters. The value is a JSON string listing name services or authorities and their corresponding keys. You can set the configuration parameter at the session or database level. Setting the parameter at the database level has additional requirements.
The following example shows how to use mTLS. The key and certificate values themselves are not shown, just the beginning and end markers:
=> CREATE TEMPORARY KEY client_key TYPE 'RSA'
AS '-----BEGIN PRIVATE KEY-----...-----END PRIVATE KEY-----';
=> CREATE TEMPORARY CERTIFICATE client_cert
AS '-----BEGIN CERTIFICATE-----...-----END CERTIFICATE-----' key client_key;
=> ALTER SESSION SET WebhdfsClientCertConf =
'[{"authority": "my.hdfs.namenode1:50088", "certName": "client_cert"}]';
=> COPY people FROM 'swebhdfs://my.hdfs.namenode1:50088/path/to/file/1.txt';
Rows Loaded
-------------
1
(1 row)
To configure access to more than one HDFS cluster, define the keys and certificates and then include one object per cluster in the value of WebhdfsClientCertConf:
=> ALTER SESSION SET WebhdfsClientCertConf =
'[{"authority" : "my.authority.com:50070", "certName" : "myCert"},
{"nameservice" : "prod", "certName" : "prodCert"}]';
Configuration parameters
The following database configuration parameters apply to the HDFS file system. You can set parameters at different levels with the appropriate ALTER statement, such as ALTER SESSION...SET PARAMETER. Query the CONFIGURATION_PARAMETERS system table to determine what levels (node, session, user, database) are valid for a given parameter. For information about all parameters related to Hadoop, see Hadoop parameters.
- EnableHDFSBlockInfoCache
- Boolean, whether to distribute block location metadata collected during planning on the initiator to all database nodes for execution, reducing name node contention. Disabled by default.
- HadoopConfDir
- Directory path containing the XML configuration files copied from Hadoop. The same path must be valid on every Vertica node. The files are accessed by the Linux user under which the Vertica server process runs.
- HadoopImpersonationConfig
- Session parameter specifying the delegation token or Hadoop user for HDFS access. See HadoopImpersonationConfig format for information about the value of this parameter and Proxy users and delegation tokens for more general context.
- HDFSUseWebHDFS
- Boolean. If true (the default), URIs in the hdfs scheme are treated as if they were in the webhdfs scheme. If false, Vertica uses LibHDFS++ where possible, though some operations can still use WebHDFS if not supported by LibHDFS++.
- WebhdfsClientCertConf
- mTLS configurations for accessing one or more WebHDFS servers, a JSON string. Each object must specify either a nameservice or authority field and a certName field. See Authentication.
Configuration files
The path specified in HadoopConfDir must include a directory containing the files listed in the following table. Vertica reads these files at database start time. If you do not set a value, Vertica looks for the files in /etc/hadoop/conf.
If a property is not defined, Vertica uses the defaults shown in the table. If no default is specified for a property, the configuration files must specify a value.
File | Properties | Default
core-site.xml | fs.defaultFS | none
core-site.xml | (for doAs users:) hadoop.proxyuser.username.users | none
core-site.xml | (for doAs users:) hadoop.proxyuser.username.hosts | none
hdfs-site.xml | dfs.client.failover.max.attempts | 15
hdfs-site.xml | dfs.client.failover.sleep.base.millis | 500
hdfs-site.xml | dfs.client.failover.sleep.max.millis | 15000
hdfs-site.xml | (For HA NN:) dfs.nameservices | none
hdfs-site.xml | (WebHDFS:) dfs.namenode.http-address or dfs.namenode.https-address | none
hdfs-site.xml | (WebHDFS:) dfs.datanode.http.address or dfs.datanode.https.address | none
hdfs-site.xml | (WebHDFS:) dfs.http.policy | HTTP_ONLY
If using High Availability (HA) Name Nodes, the individual name nodes must also be defined in hdfs-site.xml.
Note
If you are using Eon Mode with communal storage on HDFS and you set dfs.encrypt.data.transfer, you must use the swebhdfs scheme for communal storage.
To verify that Vertica can find configuration files in HadoopConfDir, use the VERIFY_HADOOP_CONF_DIR function.
To test access through the hdfs scheme, use the HDFS_CLUSTER_CONFIG_CHECK function.
For more information about testing your configuration, see Verifying HDFS configuration.
To reread the configuration files, use the CLEAR_HDFS_CACHES function.
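For example, a quick configuration check from vsql might call these meta-functions in bare SELECT statements:
=> SELECT VERIFY_HADOOP_CONF_DIR();
=> SELECT HDFS_CLUSTER_CONFIG_CHECK();
=> SELECT CLEAR_HDFS_CACHES();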
Name nodes and name services
You can access HDFS data using the default name node by not specifying a name node or name service:
=> COPY users FROM 'webhdfs:///data/users.csv';
Vertica uses the fs.defaultFS Hadoop configuration parameter to find the name node. (It then uses that name node to locate the data.) You can instead specify a host and port explicitly using the following format:
webhdfs://nn-host:nn-port/
The specified host is the name node, not an individual data node. If you are using High Availability (HA) Name Nodes, you should not use an explicit host because high availability is provided through name services instead.
If the HDFS cluster uses High Availability Name Nodes or defines name services, use the name service instead of the host and port, in the format webhdfs://nameservice/. The name service you specify must be defined in hdfs-site.xml.
The following example shows how you can use a name service, hadoopNS:
=> CREATE EXTERNAL TABLE users (id INT, name VARCHAR(20))
AS COPY FROM 'webhdfs://hadoopNS/data/users.csv';
If you are using Vertica to access data from more than one HDFS cluster, always use explicit name services or hosts in the URL. Using the /// shorthand could produce unexpected results because Vertica uses the first value of fs.defaultFS that it finds. To access multiple HDFS clusters, you must use host and service names that are globally unique. See Configuring HDFS access for more information.
5.4 - S3 object store
File systems using the S3 protocol, including AWS, Pure Storage, and MinIO.
File systems using the S3 protocol, including AWS, Pure Storage, and MinIO.
s3://bucket/path
For AWS, specify the region using the AWSRegion configuration parameter, not the URI. If the region is incorrect, you might experience a delay before the load fails because Vertica retries several times before giving up. The default region is us-east-1.
Authentication
For AWS:
-
To access S3 you must create an IAM role and grant that role permission to access your S3 resources.
-
By default, bucket access is restricted to the communal storage bucket. Use an AWS access key to load data from non-communal storage buckets.
-
Either set the AWSAuth configuration parameter to provide credentials or create a USER storage location for the S3 path (see CREATE LOCATION) and grant users access.
-
You can use AWS STS temporary session tokens to load data. Because they are session tokens, do not use them for access to storage locations.
-
You can configure S3 buckets individually with the per-bucket parameters S3BucketConfig and S3BucketCredentials. For details, see Per-bucket S3 configurations.
Configuration parameters
The following database configuration parameters apply to the S3 file system. You can set parameters at different levels with the appropriate ALTER statement, such as ALTER SESSION...SET PARAMETER. Query the CONFIGURATION_PARAMETERS system table to determine what levels (node, session, user, database) are valid for a given parameter.
You can configure individual buckets using the S3BucketConfig and S3BucketCredentials parameters instead of the global parameters.
For external tables using highly partitioned data in an object store, see the ObjectStoreGlobStrategy configuration parameter and Partitions on Object Stores.
The following descriptions are summaries. For details about all parameters specific to S3, see S3 parameters.
- AWSAuth
- An ID and secret key for authentication. AWS calls these AccessKeyID and SecretAccessKey. For extra security, do not store credentials in the database; use ALTER SESSION...SET PARAMETER to set this value for the current session only.
- AWSCAFile
- The file name of the TLS server certificate bundle to use. You must set a value when installing a CA certificate on a SUSE Linux Enterprise Server.
- AWSCAPath
- The path Vertica uses to look up TLS server certificates. You must set a value when installing a CA certificate on a SUSE Linux Enterprise Server.
- AWSEnableHttps
- Boolean, whether to use the HTTPS protocol when connecting to S3. Can be set only at the database level. You can set the protocol for individual buckets using S3BucketConfig.
Default: 1 (enabled)
- AWSEndpoint
- String, the endpoint host for all S3 URLs, set as follows:
-
AWS: hostname_or_IP:port. Do not include the scheme (http(s)).
-
AWS with a FIPS-compliant S3 Endpoint: Hostname of a FIPS-compliant S3 endpoint. You must also enable S3EnableVirtualAddressing.
-
On-premises/Pure: IP address of the Pure Storage server.
If not set, Vertica uses virtual-hosted request URLs.
Default: 's3.amazonaws.com'
- AWSLogLevel
- The log level, one of: OFF, FATAL, ERROR, WARN, INFO, DEBUG, or TRACE.
Default: ERROR
- AWSRegion
- The AWS region containing the S3 bucket from which to read files. This parameter can only be configured with one region at a time. Failure to set the correct region can lead to a delay before queries fail.
Default: 'us-east-1'
- AWSSessionToken
- A temporary security token generated by running the get-session-token command, used to configure multi-factor authentication.
Note
If you use session tokens at the session level, you must set all parameters at the session level, even if some of them are set at the database level.
- AWSStreamingConnectionPercentage
- In Eon Mode, the number of connections to the communal storage to use for streaming reads. In a cloud environment, this setting helps prevent streaming data from using up all available file handles. This setting is unnecessary when using on-premises object stores because of their lower latency.
- S3BucketConfig
- A JSON array of objects specifying per-bucket configuration overrides. Each property other than the bucket name has a corresponding configuration parameter (shown in parentheses). If both the database-level parameter and its equivalent in S3BucketConfig are set, the value in S3BucketConfig takes precedence.
Properties:
- bucket: Bucket name
- region (AWSRegion)
- protocol (AWSEnableHttps): Connection protocol, one of http or https
- endpoint (AWSEndpoint)
- enableVirtualAddressing (S3EnableVirtualAddressing): Boolean, whether to rewrite the S3 URL to use a virtual hosted path
- requesterPays (S3RequesterPays)
- serverSideEncryption (S3ServerSideEncryption)
- sseCustomerAlgorithm (S3SseCustomerAlgorithm)
- sseCustomerKey (S3SseCustomerKey)
- sseKmsKeyId (S3SseKmsKeyId)
- proxy (S3Proxy)
- S3BucketCredentials
- A JSON object specifying per-bucket credentials. Each property other than the bucket name has a corresponding configuration parameter. If both the database-level parameter and its equivalent in S3BucketCredentials are set, the value in S3BucketCredentials takes precedence.
Properties:
- bucket: Bucket name
- accessKey: Access key for the bucket (the ID in AWSAuth)
- secretAccessKey: Secret access key for the bucket (the secret in AWSAuth)
- sessionToken: Session token, only used when S3BucketCredentials is set at the session level (AWSSessionToken)
This parameter is only visible to superusers. Users can set this parameter at the session level with ALTER SESSION.
- S3EnableVirtualAddressing
- Boolean, whether to rewrite S3 URLs to use virtual-hosted paths (disabled by default). This configuration setting takes effect only when you have specified a value for AWSEndpoint.
If you set AWSEndpoint to a FIPS-compliant S3 endpoint, you must enable S3EnableVirtualAddressing.
The value of this parameter does not affect how you specify S3 paths.
- S3Proxy
- HTTP(S) proxy settings, if needed, a string in the following format: http[s]://[user:password@]host[:port].
- S3RequesterPays
- Boolean, whether requester (instead of bucket owner) pays the cost of accessing data on the bucket.
- S3ServerSideEncryption
- String, encryption algorithm to use when reading or writing to S3. Supported values are AES256 (for SSE-S3), aws:kms (for SSE-KMS), and an empty string (for no encryption). See Server-Side Encryption.
Default: "" (no encryption)
- S3SseCustomerAlgorithm
- String, the encryption algorithm to use when reading or writing to S3 using SSE-C encryption. The only supported values are AES256 and "". For SSE-S3 and SSE-KMS, instead use S3ServerSideEncryption.
Default: "" (no encryption)
- S3SseCustomerKey
- If using SSE-C encryption, the client key for S3 access.
- S3SseKmsKeyId
- If using SSE-KMS encryption, the key identifier (not the key) to pass to the Key Management Service. Vertica must have permission to use the key, which is managed through KMS.
By default, Vertica performs writes using a single thread, but a single write usually includes multiple files or parts of files. For writes to S3, you can use a larger thread pool to perform writes in parallel. This thread pool is used for all file writes to S3, including file exports and writes to communal storage.
The size of the thread pool is controlled by the ObjStoreUploadParallelism configuration parameter. Each node has a single thread pool used for all file writes. In general, one or two threads per concurrent writer produces good results.
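For example, you might allow four upload threads per node (an illustrative value; as noted above, one or two threads per concurrent writer generally produces good results):
=> ALTER DATABASE DEFAULT SET ObjStoreUploadParallelism = 4;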
Server-side encryption
By default, Vertica reads and writes S3 data that is not encrypted. If the S3 bucket uses server-side encryption (SSE), you can configure Vertica to access it. S3 supports three types of server-side encryption: SSE-S3, SSE-KMS, and SSE-C.
Vertica must also have read or write permissions (depending on the operation) on the bucket.
SSE-S3
With SSE-S3, the S3 service manages encryption keys. Reads do not require additional configuration. To write to S3, the client (Vertica, in this case) must specify only the encryption algorithm.
If the S3 bucket is configured with the default encryption settings, Vertica can read and write data to it with no further changes. If the bucket does not use the default encryption settings, set the S3ServerSideEncryption configuration parameter or the serverSideEncryption field in S3BucketConfig to AES256.
SSE-KMS
With SSE-KMS, encryption keys are managed by the Key Management Service (KMS). The client must supply a KMS key identifier (not the actual key) when writing data. For all operations, the client must have permission to use the KMS key. These permissions are managed in KMS, not in Vertica.
To use SSE-KMS:
-
Set the S3ServerSideEncryption configuration parameter or the serverSideEncryption field in S3BucketConfig to aws:kms.
-
Set the S3SseKmsKeyId configuration parameter or the sseKmsKeyId field in S3BucketConfig to the key ID.
SSE-C
With SSE-C, the client manages encryption keys and provides them to S3 for each operation.
To use SSE-C:
-
Set the S3SseCustomerAlgorithm configuration parameter or the sseCustomerAlgorithm field in S3BucketConfig to AES256.
-
Set the S3SseCustomerKey configuration parameter or the sseCustomerKey field in S3BucketConfig to the access key. The value can be either a 32-character plaintext key or a 44-character base64-encoded key.
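For example, a minimal SSE-C setup might set the algorithm at the database level and supply the key only for the current session (the key value below is a placeholder):
=> ALTER DATABASE DEFAULT SET S3SseCustomerAlgorithm = 'AES256';
=> ALTER SESSION SET S3SseCustomerKey = '<32-character-plaintext-key>';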
HTTP(S) logging (AWS)
When a request to AWS fails or is retried, or Vertica detects a performance degradation, Vertica logs the event in the UDFS_EVENTS system table. The event type for these records is HttpRequestAttempt, and the description is a JSON object with more details. For example:
=> SELECT filesystem, event, description FROM UDFS_EVENTS;
filesystem | event | description
------------+---------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------
S3 | HttpRequestAttempt | {"AttemptStartTimeGMT":"2024-01-18 16:45:52.663","RetryCount":0,"HttpMethod":"GET","URIString":"s3://mybucket/s3_2.dat","AmzRequestId":"79104EXAMPLEB723","AmzId2":"IOWQ4fDEXAMPLEQM+ey7N9WgVhSnQ6JEXAMPLEZb7hSQDASK+Jd1vEXAMPLEa3Km","HttpStatusCode":404,"HttpRequestLatency":2,"SdkExceptionMessage":"No response body."}
S3 | HttpRequestAttempt | {"AttemptStartTimeGMT":"2024-01-18 16:46:02.663","RetryCount":1,"HttpMethod":"GET","URIString":"s3://mybucket/s3_2.dat","AmzRequestId":"79104EXAMPLEB791","AmzId2":"JPXQ4fDEXAMPLEQM+ey7N9WgVhSnQ6JEXAMPLEZb7hSQDASK+Jd1vEXAMPLEa4Ln","HttpStatusCode":206,"HttpRequestLatency":1}
(2 rows)
To extract fields from this description, use the MAPJSONEXTRACTOR function:
=> SELECT filesystem, event,
(MapJSONExtractor(description))['RetryCount'] as RetryCount,
(MapJSONExtractor(description))['HttpStatusCode'] as HttpStatusCode
FROM UDFS_EVENTS;
filesystem | event | RetryCount | HttpStatusCode
------------+---------------------+------------+-----------------
S3 | HttpRequestAttempt | 0 | 404
S3 | HttpRequestAttempt | 1 | 206
(2 rows)
For details on AWS-specific fields such as AmzRequestId, see the AWS documentation.
The UDFS_STATISTICS system table records the accumulated duration for all HTTP requests in the TOTAL_REQUEST_DURATION_MS column. This table also provides computed values for average duration and throughput.
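For example, you can query the column named above directly (the table also includes other computed columns not shown here):
=> SELECT total_request_duration_ms FROM UDFS_STATISTICS;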
Examples
The following example sets a database-wide AWS region and credentials:
=> ALTER DATABASE DEFAULT SET AWSRegion='us-west-1';
=> ALTER DATABASE DEFAULT SET AWSAuth = 'myaccesskeyid123456:mysecretaccesskey123456789012345678901234';
The following example loads data from S3. You can use a glob if all files in the glob can be loaded together. In the following example, the datalake bucket contains only ORC files.
=> COPY t FROM 's3://datalake/*' ORC;
You can specify a list of comma-separated S3 buckets as in the following example. All buckets must be in the same region. To load from more than one region, use separate COPY statements and change the value of AWSRegion between calls.
=> COPY t FROM 's3://AWS_Data_1/sales.parquet', 's3://AWS_Data_2/sales.parquet' PARQUET;
The following example creates a user storage location and a role, so that users without their own S3 credentials can read data from S3 using the server credential.
--- set database-level credential (once):
=> ALTER DATABASE DEFAULT SET AWSAuth = 'myaccesskeyid123456:mysecretaccesskey123456789012345678901234';
=> CREATE LOCATION 's3://datalake' SHARED USAGE 'USER' LABEL 's3user';
=> CREATE ROLE ExtUsers;
--- Assign users to this role using GRANT (Role).
=> GRANT READ ON LOCATION 's3://datalake' TO ExtUsers;
The configuration properties for a given bucket may differ based on its type. The following S3BucketConfig setting is for an AWS bucket (AWSBucket) and a Pure Storage bucket (PureStorageBucket). AWSBucket doesn't specify an endpoint, so Vertica uses the AWSEndpoint configuration parameter, which defaults to s3.amazonaws.com:
=> ALTER DATABASE DEFAULT SET S3BucketConfig=
'[
{
"bucket": "AWSBucket",
"region": "us-east-2",
"protocol": "https",
"requesterPays": true,
"serverSideEncryption": "aes256"
},
{
"bucket": "PureStorageBucket",
"endpoint": "pure.mycorp.net:1234",
"protocol": "http",
"enableVirtualAddressing": false
}
]';
The following example sets S3BucketCredentials for these two buckets:
=> ALTER SESSION SET S3BucketCredentials='
[
{
"bucket": "AWSBucket",
"accessKey": "<AK0>",
"secretAccessKey": "<SAK0>",
"sessionToken": "1234567890"
},
{
"bucket": "PureStorageBucket",
"accessKey": "<AK1>",
"secretAccessKey": "<SAK1>"
}
]';
The following example sets an STS temporary session token. Vertica uses the session token to access S3 with the specified credentials and bypasses checking for a USER storage location.
$ aws sts get-session-token
{
"Credentials": {
"AccessKeyId": "ASIAJZQNDVS727EHDHOQ",
"SecretAccessKey": "F+xnpkHbst6UPorlLGj/ilJhO5J2n3Yo7Mp4vYvd",
"SessionToken": "FQoDYXdzEKv//////////wEaDMWKxakEkCyuDH0UjyKsAe6/3REgW5VbWtpuYyVvSnEK1jzGPHi/jPOPNT7Kd+ftSnD3qdaQ7j28SUW9YYbD50lcXikz/HPlusPuX9sAJJb7w5oiwdg+ZasIS/+ejFgCzLeNE3kDAzLxKKsunvwuo7EhTTyqmlLkLtIWu9zFykzrR+3Tl76X7EUMOaoL31HOYsVEL5d9I9KInF0gE12ZB1yN16MsQVxpSCavOFHQsj/05zbxOQ4o0erY1gU=",
"Expiration": "2018-07-18T05:56:33Z"
}
}
$ vsql
=> ALTER SESSION SET AWSAuth = 'ASIAJZQNDVS727EHDHOQ:F+xnpkHbst6UPorlLGj/ilJhO5J2n3Yo7Mp4vYvd';
=> ALTER SESSION SET AWSSessionToken = 'FQoDYXdzEKv//////////wEaDMWKxakEkCyuDH0UjyKsAe6/3REgW5VbWtpuYyVvSnEK1jzGPHi/jPOPNT7Kd+ftSnD3qdaQ7j28SUW9YYbD50lcXikz/HPlusPuX9sAJJb7w5oiwdg+ZasIS/+ejFgCzLeNE3kDAzLxKKsunvwuo7EhTTyqmlLkLtIWu9zFykzrR+3Tl76X7EUMOaoL31HOYsVEL5d9I9KInF0gE12ZB1yN16MsQVxpSCavOFHQsj/05zbxOQ4o0erY1gU=';
See also
Per-Bucket S3 Configurations
5.4.1 - Per-bucket S3 configurations
You can manage configurations and credentials for individual buckets with the S3BucketConfig and S3BucketCredentials configuration parameters.
You can manage configurations and credentials for individual buckets with the S3BucketConfig and S3BucketCredentials configuration parameters. These parameters each take a JSON object, whose respective properties behave like the related S3 configuration parameters.
For example, you can create a different configuration for each of your S3 buckets by setting S3BucketConfig at the database level with ALTER DATABASE. The following S3BucketConfig specifies several common bucket properties:
=> ALTER DATABASE DEFAULT SET S3BucketConfig='
[
{
"bucket": "exampleAWS",
"region": "us-east-2",
"protocol": "https",
"requesterPays": true
},
{
"bucket": "examplePureStorage",
"endpoint": "pure.mycorp.net:1234",
"protocol": "http",
"enableVirtualAddressing": false
}
]';
Users can then access a bucket by setting S3BucketCredentials at the session level with ALTER SESSION. The following S3BucketCredentials specifies all properties and authenticates to both exampleAWS and examplePureStorage simultaneously:
=> ALTER SESSION SET S3BucketCredentials='
[
{
"bucket": "exampleAWS",
"accessKey": "<AK0>",
"secretAccessKey": "<SAK0>",
"sessionToken": "1234567890"
},
{
"bucket": "examplePureStorage",
"accessKey": "<AK1>",
"secretAccessKey": "<SAK1>",
}
]';
Recommended usage
The recommended usage is as follows:
-
Define in your S3 storage system one set of credentials per principal, per storage system.
-
It is often most convenient to set S3BucketConfig once at the database level and have users authenticate by setting S3BucketCredentials at the session level.
-
To access buckets outside those configured at the database level, set both S3BucketConfig and S3BucketCredentials at the session level.
If you cannot define credentials for your S3 storage, you can set S3BucketCredentials or AWSAuth at the database level with ALTER DATABASE, but this comes with certain drawbacks:
-
Storing credentials statically in another location (in this case, in the Vertica catalog) always incurs additional risk.
-
This increases overhead for the dbadmin, who needs to create user storage locations and grant access to each user or role.
-
Users share one set of credentials, increasing the potential impact if the credentials are compromised.
Note
If you set AWSEndpoint to a non-Amazon S3 storage system like Pure Storage or MinIO and you want to configure S3BucketConfig for real Amazon S3, the following requirements apply:
-
If your real Amazon S3 region is not us-east-1 (the default), you must specify the region.
-
Set endpoint to an empty string ("").
=> ALTER DATABASE DEFAULT SET S3BucketConfig='
[
{
"bucket": "additionalAWSBucket",
"region": "us-east-2",
"endpoint": ""
}
]';
Precedence of per-bucket and standard parameters
Vertica uses the following rules to determine the effective set of properties for an S3 connection:
-
If set, S3BucketCredentials takes priority over its standard parameters. S3BucketCredentials is checked first at the session level and then at the database level.
-
The level/source of the S3 credential parameters determines the source of the S3 configuration parameters:
-
If credentials come from the session level, then the configuration can come from either the session or database level (with the session level taking priority).
-
If your credentials come from the database level, then the configuration can only come from the database level.
-
If S3BucketConfig is set, it takes priority over its standard parameters. If an S3BucketConfig property isn't specified, Vertica falls back to the missing property's equivalent parameter. For example, if S3BucketConfig specifies every property except protocol, Vertica falls back to the standard parameter AWSEnableHttps.
Examples
Multiple buckets
This example configures a real Amazon S3 bucket AWSBucket and a Pure Storage bucket PureStorageBucket with S3BucketConfig.
AWSBucket does not specify an endpoint or protocol, so Vertica falls back to AWSEndpoint (defaults to s3.amazonaws.com) and AWSEnableHttps (defaults to 1).
In this example environment, access to the PureStorageBucket is over a secure network, so HTTPS is disabled:
=> ALTER DATABASE DEFAULT SET S3BucketConfig='
[
{
"bucket": "AWSBucket",
"region": "us-east-2"
},
{
"bucket": "PureStorageBucket",
"endpoint": "pure.mycorp.net:1234",
"protocol": "http",
"enableVirtualAddressing": false
}
]';
Bob can then set S3BucketCredentials at the session level to authenticate to AWSBucket:
=> ALTER SESSION SET S3BucketCredentials='
[
{
"bucket": "AWSBucket",
"accessKey": "<AK0>",
"secretAccessKey": "<SAK0>",
"sessionToken": "1234567890"
}
]';
Similarly, Alice can authenticate to PureStorageBucket:
=> ALTER SESSION SET S3BucketCredentials='
[
{
"bucket": "PureStorageBucket",
"accessKey": "<AK1>",
"secretAccessKey": "<SAK1>"
}
]';
Charlie provides credentials for both AWSBucket and PureStorageBucket and authenticates to them simultaneously. This allows him to perform cross-endpoint joins, export from one bucket to another, etc.
=> ALTER SESSION SET S3BucketCredentials='
[
{
"bucket": "AWSBucket",
"accessKey": "<AK0>",
"secretAccessKey": "<SAK0>",
"sessionToken": "1234567890"
},
{
"bucket": "PureStorageBucket",
"accessKey": "<AK1>",
"secretAccessKey": "<SAK1>"
}
]';
S3 server-side encryption
S3 has three types of server-side encryption: SSE-S3, SSE-KMS, and SSE-C. The following example configures access using SSE-KMS:
=> ALTER DATABASE DEFAULT SET S3BucketConfig='
[
{
"bucket": "AWSBucket",
"region": "us-east-2",
"serverSideEncryption": "aws:kms",
"sseKmsKeyId": "1234abcd-12ab-34cd-56ef-1234567890ab"
}
]';
For more information, see Server-Side Encryption.
Non-Amazon S3 storage with AWSEndpoint and S3BucketConfig
If AWSEndpoint is set to a non-Amazon S3 bucket like Pure Storage or MinIO and you want to configure S3BucketConfig for a real Amazon S3 bucket, the following requirements apply:
-
If your real Amazon S3 region is not us-east-1 (the default), you must specify the region.
-
Set endpoint to an empty string ("").
In this example, AWSEndpoint is set to a Pure Storage bucket.
=> ALTER DATABASE DEFAULT SET AWSEndpoint='pure.mycorp.net:1234';
To configure S3BucketConfig for a real Amazon S3 bucket realAmazonS3Bucket in region "us-east-2":
=> ALTER DATABASE DEFAULT SET S3BucketConfig='
[
{
"bucket": "realAmazonS3Bucket",
"region": "us-east-2",
"endpoint": ""
}
]';
6 - Functions
Functions return information from the database.
Functions return information from the database. This section describes functions that Vertica supports. Except for meta-functions, you can use a function anywhere an expression is allowed.
Meta-functions usually access the internal state of Vertica. They can be used in a top-level SELECT statement only, and the statement cannot contain other clauses such as FROM or WHERE. Meta-functions are labeled on their reference pages.
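For example, a meta-function such as CLEAR_HDFS_CACHES (listed below) is invoked in a bare SELECT with no FROM or WHERE clause:
=> SELECT CLEAR_HDFS_CACHES();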
The Behavior Type section on each reference page categorizes the function's return behavior as one or more of the following:
- Immutable (invariant): When run with a given set of arguments, immutable functions always produce the same result, regardless of environment or session settings such as locale.
- Stable: When run with a given set of arguments, stable functions produce the same result within a single query or scan operation. However, a stable function can produce different results when issued under different environments or at different times, such as change of locale and time zone—for example, SYSDATE.
- Volatile: Regardless of their arguments or environment, volatile functions can return a different result with each invocation—for example, UUID_GENERATE.
List of all functions
The following list contains all Vertica SQL functions.
-
ABS
- Returns the absolute value of the argument. [Mathematical functions]
-
ACOS
- Returns a DOUBLE PRECISION value representing the trigonometric inverse cosine of the argument. [Mathematical functions]
-
ACOSH
- Returns a DOUBLE PRECISION value that represents the inverse (arc) hyperbolic cosine of the function argument. [Mathematical functions]
-
ACTIVE_SCHEDULER_NODE
- Returns the active scheduler node. [Stored procedure functions]
-
ADD_MONTHS
- Adds the specified number of months to a date and returns the sum as a DATE. [Date/time functions]
-
ADVANCE_EPOCH
- Manually closes the current epoch and begins a new epoch. [Epoch functions]
-
AGE_IN_MONTHS
- Returns the difference in months between two dates, expressed as an integer. [Date/time functions]
-
AGE_IN_YEARS
- Returns the difference in years between two dates, expressed as an integer. [Date/time functions]
-
ALTER_LOCATION_LABEL
- Adds a label to a storage location, or changes or removes an existing label. [Storage functions]
-
ALTER_LOCATION_SIZE
- Resizes the depot on one node, all nodes in a subcluster, or all nodes in the database. [Eon Mode functions]
-
ALTER_LOCATION_USE
- Alters the type of data that a storage location holds. [Storage functions]
-
ANALYZE_CONSTRAINTS
- Analyzes and reports on constraint violations within the specified scope. [Table functions]
-
ANALYZE_CORRELATIONS
- This function is deprecated and will be removed in a future release. [Table functions]
-
ANALYZE_EXTERNAL_ROW_COUNT
- Calculates the exact number of rows in an external table. [Statistics management functions]
-
ANALYZE_STATISTICS
- Collects and aggregates data samples and storage information from all nodes that store projections associated with the specified table. [Statistics management functions]
-
ANALYZE_STATISTICS_PARTITION
- Collects and aggregates data samples and storage information for a range of partitions in the specified table. [Statistics management functions]
-
ANALYZE_WORKLOAD
- Runs Workload Analyzer, a utility that analyzes system information held in system tables. [Workload management functions]
-
APPLY_AVG
- Returns the average of all elements in a collection with numeric values. [Collection functions]
-
APPLY_BISECTING_KMEANS
- Applies a trained bisecting k-means model to an input relation, and assigns each new data point to the closest matching cluster in the trained model. [Transformation functions]
-
APPLY_COUNT (ARRAY_COUNT)
- Returns the total number of non-null elements in a collection. [Collection functions]
-
APPLY_COUNT_ELEMENTS (ARRAY_LENGTH)
- Returns the total number of elements in a collection, including NULLs. [Collection functions]
-
APPLY_IFOREST
- Applies an isolation forest (iForest) model to an input relation. [Transformation functions]
-
APPLY_INVERSE_PCA
- Inverts the APPLY_PCA-generated transform back to the original coordinate system. [Transformation functions]
-
APPLY_INVERSE_SVD
- Transforms the data back to the original domain. [Transformation functions]
-
APPLY_KMEANS
- Assigns each row of an input relation to a cluster center from an existing k-means model. [Transformation functions]
-
APPLY_KPROTOTYPES
- Assigns each row of an input relation to a cluster center from an existing k-prototypes model. [Transformation functions]
-
APPLY_MAX
- Returns the largest non-null element in a collection. [Collection functions]
-
APPLY_MIN
- Returns the smallest non-null element in a collection. [Collection functions]
-
APPLY_NORMALIZE
- A UDTF function that applies the normalization parameters saved in a model to a set of specified input columns. [Transformation functions]
-
APPLY_ONE_HOT_ENCODER
- A user-defined transform function (UDTF) that loads the one hot encoder model and writes out a table that contains the encoded columns. [Transformation functions]
-
APPLY_PCA
- Transforms the data using a PCA model. [Transformation functions]
-
APPLY_SUM
- Computes the sum of all elements in a collection of numeric values (INTEGER, FLOAT, NUMERIC, or INTERVAL). [Collection functions]
-
APPLY_SVD
- Transforms the data using an SVD model. [Transformation functions]
-
APPROXIMATE_COUNT_DISTINCT
- Returns the number of distinct non-NULL values in a data set. [Aggregate functions]
-
APPROXIMATE_COUNT_DISTINCT_OF_SYNOPSIS
- Calculates the number of distinct non-NULL values from the synopsis objects created by APPROXIMATE_COUNT_DISTINCT_SYNOPSIS. [Aggregate functions]
-
APPROXIMATE_COUNT_DISTINCT_SYNOPSIS
- Summarizes the information of distinct non-NULL values and materializes the result set in a VARBINARY or LONG VARBINARY synopsis object. [Aggregate functions]
-
APPROXIMATE_COUNT_DISTINCT_SYNOPSIS_MERGE
- Aggregates multiple synopses into one new synopsis. [Aggregate functions]
-
APPROXIMATE_MEDIAN [aggregate]
- Computes the approximate median of an expression over a group of rows. [Aggregate functions]
-
APPROXIMATE_PERCENTILE [aggregate]
- Computes the approximate percentile of an expression over a group of rows. [Aggregate functions]
-
APPROXIMATE_QUANTILES
- Computes an array of weighted, approximate percentiles of a column within some user-specified error. [Aggregate functions]
-
ARGMAX [analytic]
- This function is patterned after the mathematical function argmax(f(x)), which returns the value of x that maximizes f(x). [Analytic functions]
-
ARGMAX_AGG
- Takes two arguments target and arg, where both are columns or column expressions in the queried dataset. [Aggregate functions]
-
ARGMIN [analytic]
- This function is patterned after the mathematical function argmin(f(x)), which returns the value of x that minimizes f(x). [Analytic functions]
-
ARGMIN_AGG
- Takes two arguments target and arg, where both are columns or column expressions in the queried dataset. [Aggregate functions]
-
ARIMA
- Creates and trains an autoregressive integrated moving average (ARIMA) model from a time series with consistent timesteps. [Machine learning algorithms]
-
ARRAY_CAT
- Concatenates two arrays of the same element type and dimensionality. [Collection functions]
-
ARRAY_CONTAINS
- Returns true if the specified element is found in the array and false if not. [Collection functions]
-
ARRAY_DIMS
- Returns the dimensionality of the input array. [Collection functions]
-
ARRAY_FIND
- Returns the ordinal position of a specified element in an array, or -1 if not found. [Collection functions]
-
ASCII
- Converts the first character of a VARCHAR datatype to an INTEGER. [String functions]
-
ASIN
- Returns a DOUBLE PRECISION value representing the trigonometric inverse sine of the argument. [Mathematical functions]
-
ASINH
- Returns a DOUBLE PRECISION value that represents the inverse (arc) hyperbolic sine of the function argument. [Mathematical functions]
-
ATAN
- Returns a DOUBLE PRECISION value representing the trigonometric inverse tangent of the argument. [Mathematical functions]
-
ATAN2
- Returns a DOUBLE PRECISION value representing the trigonometric inverse tangent of the arithmetic dividend of the arguments. [Mathematical functions]
-
ATANH
- Returns a DOUBLE PRECISION value that represents the inverse hyperbolic tangent of the function argument. [Mathematical functions]
-
AUDIT
- Returns the raw data size (in bytes) of a database, schema, or table as it is counted in an audit of the database size. [License functions]
-
AUDIT_FLEX
- Returns the estimated ROS size of __raw__ columns, equivalent to the export size of the flex data in the audited objects. [License functions]
-
AUDIT_LICENSE_SIZE
- Triggers an immediate audit of the database size to determine if it is in compliance with the raw data storage allowance included in your Vertica licenses. [License functions]
-
AUDIT_LICENSE_TERM
- Triggers an immediate audit to determine if the Vertica license has expired. [License functions]
-
AUTOREGRESSOR
- Creates an autoregressive (AR) model from a stationary time series with consistent timesteps that can then be used for prediction via PREDICT_AR. [Machine learning algorithms]
-
AVG [aggregate]
- Computes the average (arithmetic mean) of an expression over a group of rows. [Aggregate functions]
-
AVG [analytic]
- Computes an average of an expression in a group within a window. [Analytic functions]
-
AZURE_TOKEN_CACHE_CLEAR
- Clears the cached access token for Azure. [Cloud functions]
-
BACKGROUND_DEPOT_WARMING
- Vertica version 10.0.0 removes support for foreground depot warming. [Eon Mode functions]
-
BALANCE
- Returns a view with an equal distribution of the input data based on the response_column. [Data preparation]
-
BISECTING_KMEANS
- Executes the bisecting k-means algorithm on an input relation. [Machine learning algorithms]
-
BIT_AND
- Takes the bitwise AND of all non-null input values. [Aggregate functions]
-
BIT_LENGTH
- Returns the length of the string expression in bits (bytes * 8) as an INTEGER. [String functions]
-
BIT_OR
- Takes the bitwise OR of all non-null input values. [Aggregate functions]
-
BIT_XOR
- Takes the bitwise XOR of all non-null input values. [Aggregate functions]
-
BITCOUNT
- Returns the number of one-bits (sometimes referred to as set-bits) in the given VARBINARY value. [String functions]
-
BITSTRING_TO_BINARY
- Translates the given VARCHAR bitstring representation into a VARBINARY value. [String functions]
-
BOOL_AND [aggregate]
- Processes Boolean values and returns a Boolean value result. [Aggregate functions]
-
BOOL_AND [analytic]
- Returns the Boolean value of an expression within a window. [Analytic functions]
-
BOOL_OR [aggregate]
- Processes Boolean values and returns a Boolean value result. [Aggregate functions]
-
BOOL_OR [analytic]
- Returns the Boolean value of an expression within a window. [Analytic functions]
-
BOOL_XOR [aggregate]
- Processes Boolean values and returns a Boolean value result. [Aggregate functions]
-
BOOL_XOR [analytic]
- Returns the Boolean value of an expression within a window. [Analytic functions]
-
BTRIM
- Removes the longest string consisting only of specified characters from the start and end of a string. [String functions]
-
BUILD_FLEXTABLE_VIEW
- Creates, or re-creates, a view for a default or user-defined keys table, ignoring any empty keys. [Flex data functions]
-
CALENDAR_HIERARCHY_DAY
- Groups DATE partition keys into a hierarchy of years, months, and days. [Partition functions]
-
CANCEL_DEPOT_WARMING
- Cancels depot warming on a node. [Eon Mode functions]
-
CANCEL_DRAIN_SUBCLUSTER
- Cancels the draining of a subcluster or subclusters. [Eon Mode functions]
-
CANCEL_REBALANCE_CLUSTER
- Stops any rebalance task that is currently in progress or is waiting to execute. [Cluster functions]
-
CANCEL_REFRESH
- Cancels refresh-related internal operations initiated by START_REFRESH and REFRESH. [Session functions]
-
CBRT
- Returns the cube root of the argument. [Mathematical functions]
-
CEILING
- Rounds the returned value up to the next whole number. [Mathematical functions]
-
CHANGE_CURRENT_STATEMENT_RUNTIME_PRIORITY
- Changes the run-time priority of an active query. [Workload management functions]
-
CHANGE_MODEL_STATUS
- Changes the status of a registered model. [Model management]
-
CHANGE_RUNTIME_PRIORITY
- Changes the run-time priority of a query that is actively running. [Workload management functions]
-
CHARACTER_LENGTH
- The CHARACTER_LENGTH() function returns the length of a string. [String functions]
-
CHI_SQUARED
- Computes the conditional chi-Square independence test on two categorical variables to find the likelihood that the two variables are independent. [Data preparation]
-
CHR
- Converts the first character of an INTEGER datatype to a VARCHAR. [String functions]
-
CLEAN_COMMUNAL_STORAGE
- Marks for deletion invalid data in communal storage, often data that leaked due to an event where Vertica cleanup mechanisms failed. [Eon Mode functions]
-
CLEAR_CACHES
- Clears the Vertica internal cache files. [Storage functions]
-
CLEAR_DATA_COLLECTOR
- Clears all memory and disk records from Data Collector tables and logs, and resets collection statistics in system table DATA_COLLECTOR. [Data Collector functions]
-
CLEAR_DATA_DEPOT
- Deletes the specified depot data. [Eon Mode functions]
-
CLEAR_DEPOT_ANTI_PIN_POLICY_PARTITION
- Removes an anti-pinning policy from the specified partition. [Eon Mode functions]
-
CLEAR_DEPOT_ANTI_PIN_POLICY_PROJECTION
- Removes an anti-pinning policy from the specified projection. [Eon Mode functions]
-
CLEAR_DEPOT_ANTI_PIN_POLICY_TABLE
- Removes an anti-pinning policy from the specified table. [Eon Mode functions]
-
CLEAR_DEPOT_PIN_POLICY_PARTITION
- Clears a depot pinning policy from the specified table or projection partitions. [Eon Mode functions]
-
CLEAR_DEPOT_PIN_POLICY_PROJECTION
- Clears a depot pinning policy from the specified projection. [Eon Mode functions]
-
CLEAR_DEPOT_PIN_POLICY_TABLE
- Clears a depot pinning policy from the specified table. [Eon Mode functions]
-
CLEAR_DIRECTED_QUERY_USAGE
- Resets the counter in the DIRECTED_QUERY_STATUS table. [Directed queries functions]
-
CLEAR_FETCH_QUEUE
- Removes all entries or entries for a specific transaction from the queue of fetch requests of data from the communal storage. [Eon Mode functions]
-
CLEAR_HDFS_CACHES
- Clears the configuration information copied from HDFS and any cached connections. [Hadoop functions]
-
CLEAR_OBJECT_STORAGE_POLICY
- Removes a user-defined storage policy from the specified database, schema or table. [Storage functions]
-
CLEAR_PROFILING
- Clears from memory data for the specified profiling type. [Profiling functions]
-
CLEAR_PROJECTION_REFRESHES
- Clears information projection refresh history from system table PROJECTION_REFRESHES. [Projection functions]
-
CLEAR_RESOURCE_REJECTIONS
- Clears the content of the RESOURCE_REJECTIONS and DISK_RESOURCE_REJECTIONS system tables. [Database functions]
-
CLOCK_TIMESTAMP
- Returns a value of type TIMESTAMP WITH TIMEZONE that represents the current system-clock time. [Date/time functions]
-
CLOSE_ALL_RESULTSETS
- Closes all result set sessions within Multiple Active Result Sets (MARS) and frees the MARS storage for other result sets. [Client connection functions]
-
CLOSE_ALL_SESSIONS
- Closes all external sessions except the one that issues this function. [Session functions]
-
CLOSE_RESULTSET
- Closes a specific result set within Multiple Active Result Sets (MARS) and frees the MARS storage for other result sets. [Client connection functions]
-
CLOSE_SESSION
- Interrupts the specified external session, rolls back the current transaction if any, and closes the socket. [Session functions]
-
CLOSE_USER_SESSIONS
- Stops the session for a user, rolls back any transaction currently running, and closes the connection. [Session functions]
-
COALESCE
- Returns the value of the first non-null expression in the list. [NULL-handling functions]
-
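For illustration, a minimal COALESCE sketch:
    SELECT COALESCE(NULL, NULL, 'fallback');  -- returns 'fallback', the first non-null argument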
COLLATION
- Applies a collation to two or more strings. [String functions]
-
COMPACT_STORAGE
- Bundles existing data (.fdb) and index (.pidx) files into the .gt file format. [Database functions]
-
COMPUTE_FLEXTABLE_KEYS
- Computes the virtual columns (keys and values) from flex table VMap data. [Flex data functions]
-
COMPUTE_FLEXTABLE_KEYS_AND_BUILD_VIEW
- Combines the functionality of BUILD_FLEXTABLE_VIEW and COMPUTE_FLEXTABLE_KEYS to compute virtual columns (keys) from the VMap data of a flex table and construct a view. [Flex data functions]
-
CONCAT
- Concatenates two strings and returns a varchar data type. [String functions]
-
CONDITIONAL_CHANGE_EVENT [analytic]
- Assigns an event window number to each row, starting from 0, and increments by 1 when the result of evaluating the argument expression on the current row differs from that on the previous row. [Analytic functions]
-
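An illustrative sketch of CONDITIONAL_CHANGE_EVENT; the events table and its ts and status columns are hypothetical:
    SELECT ts, status,
           CONDITIONAL_CHANGE_EVENT(status) OVER (ORDER BY ts) AS change_count
    FROM events;  -- the counter increments each time status differs from the previous row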
CONDITIONAL_TRUE_EVENT [analytic]
- Assigns an event window number to each row, starting from 0, and increments the number by 1 when the result of the boolean argument expression evaluates to true. [Analytic functions]
-
CONFUSION_MATRIX
- Computes the confusion matrix of a table with observed and predicted values of a response variable. [Model evaluation]
-
CONTAINS
- Returns true if the specified element is found in the collection and false if not. [Collection functions]
-
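For illustration, a minimal CONTAINS sketch, assuming the (collection, element) argument order:
    SELECT CONTAINS(ARRAY[1,2,3], 2);  -- returns true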
COPY_PARTITIONS_TO_TABLE
- Copies partitions from one table to another. [Partition functions]
-
COPY_TABLE
- Copies one table to another. [Table functions]
-
CORR
- Returns the DOUBLE PRECISION coefficient of correlation of a set of expression pairs, as per the Pearson correlation coefficient. [Aggregate functions]
-
CORR_MATRIX
- Takes an input relation with numeric columns, and calculates the Pearson Correlation Coefficient between each pair of its input columns. [Data preparation]
-
COS
- Returns a DOUBLE PRECISION value that represents the trigonometric cosine of the passed parameter. [Mathematical functions]
-
COSH
- Returns a DOUBLE PRECISION value that represents the hyperbolic cosine of the passed parameter. [Mathematical functions]
-
COT
- Returns a DOUBLE PRECISION value representing the trigonometric cotangent of the argument. [Mathematical functions]
-
COUNT [aggregate]
- Returns as a BIGINT the number of rows in each group where the expression is not NULL. [Aggregate functions]
-
COUNT [analytic]
- Counts occurrences within a group within a window. [Analytic functions]
-
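A sketch of the analytic form of COUNT; the sales table and its region and amount columns are hypothetical:
    SELECT region, amount,
           COUNT(*) OVER (PARTITION BY region) AS rows_in_region
    FROM sales;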
COVAR_POP
- Returns the population covariance for a set of expression pairs. [Aggregate functions]
-
COVAR_SAMP
- Returns the sample covariance for a set of expression pairs. [Aggregate functions]
-
CROSS_VALIDATE
- Performs k-fold cross validation on a learning algorithm using an input relation, and grid search for hyperparameters. [Model evaluation]
-
CUME_DIST [analytic]
- Calculates the cumulative distribution, or relative rank, of the current row with regard to other rows in the same partition within a window. [Analytic functions]
-
CURRENT_DATABASE
- Returns the name of the current database, equivalent to DBNAME. [System information functions]
-
CURRENT_DATE
- Returns the date (date-type value) on which the current transaction started. [Date/time functions]
-
CURRENT_LOAD_SOURCE
- When called within the scope of a COPY statement, returns the file name or path part used for the load. [System information functions]
-
CURRENT_SCHEMA
- Returns the name of the current schema. [System information functions]
-
CURRENT_SESSION
- Returns the ID of the current client session. [System information functions]
-
CURRENT_TIME
- Returns a value of type TIME WITH TIMEZONE that represents the start of the current transaction. [Date/time functions]
-
CURRENT_TIMESTAMP
- Returns a value of type TIMESTAMP WITH TIMEZONE that represents the start of the current transaction. [Date/time functions]
-
CURRENT_TRANS_ID
- Returns the ID of the transaction currently in progress. [System information functions]
-
CURRENT_USER
- Returns a VARCHAR containing the name of the user who initiated the current database connection. [System information functions]
-
CURRVAL
- Returns the last value across all nodes that was set by NEXTVAL on this sequence in the current session. [Sequence functions]
-
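An illustrative CURRVAL sketch; my_seq is a hypothetical sequence name:
    CREATE SEQUENCE my_seq;
    SELECT NEXTVAL('my_seq');  -- returns 1, the first sequence value
    SELECT CURRVAL('my_seq');  -- returns 1, the last value NEXTVAL returned in this session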
DATA_COLLECTOR_HELP
- Returns online usage instructions about the Data Collector, the V_MONITOR.DATA_COLLECTOR system table, and the Data Collector control functions. [Data Collector functions]
-
DATE
- Converts the input value to a DATE data type. [Date/time functions]
-
DATE_PART
- Extracts a sub-field such as year or hour from a date/time expression, equivalent to the SQL-standard function EXTRACT. [Date/time functions]
-
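For illustration, a minimal DATE_PART sketch:
    SELECT DATE_PART('hour', TIMESTAMP '2024-03-17 10:30:00');  -- returns 10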
DATE_TRUNC
- Truncates date and time values to the specified precision. [Date/time functions]
-
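For illustration, a minimal DATE_TRUNC sketch:
    SELECT DATE_TRUNC('month', TIMESTAMP '2024-03-17 10:30:00');  -- returns 2024-03-01 00:00:00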
DATEDIFF
- Returns the time span between two dates, in the intervals specified. [Date/time functions]
-
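For illustration, a minimal DATEDIFF sketch using the 'day' interval:
    SELECT DATEDIFF('day', DATE '2024-01-01', DATE '2024-01-15');  -- returns 14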
DAY
- Returns as an integer the day of the month from the input value. [Date/time functions]
-
DAYOFMONTH
- Returns the day of the month as an integer. [Date/time functions]
-
DAYOFWEEK
- Returns the day of the week as an integer, where Sunday is day 1. [Date/time functions]
-
DAYOFWEEK_ISO
- Returns the ISO 8601 day of the week as an integer, where Monday is day 1. [Date/time functions]
-
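For illustration, a sketch contrasting DAYOFWEEK and DAYOFWEEK_ISO on a Sunday:
    SELECT DAYOFWEEK(DATE '2024-03-17');      -- returns 1 (Sunday is day 1)
    SELECT DAYOFWEEK_ISO(DATE '2024-03-17');  -- returns 7 (ISO numbering, where Monday is day 1)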
DAYOFYEAR
- Returns the day of the year as an integer, where January 1 is day 1. [Date/time functions]
-
DAYS
- Returns the integer value of the specified date, where 1 AD is 1. [Date/time functions]
-
DBNAME (function)
- Returns the name of the current database, equivalent to CURRENT_DATABASE. [