CREATE statements let you create new database objects such as tables and users.
CREATE statements
- 1: CREATE ACCESS POLICY
- 2: CREATE AUTHENTICATION
- 3: CREATE CA BUNDLE
- 4: CREATE CERTIFICATE
- 5: CREATE DIRECTED QUERY
- 6: CREATE EXTERNAL TABLE AS COPY
- 7: CREATE FAULT GROUP
- 8: CREATE FLEXIBLE EXTERNAL TABLE AS COPY
- 9: CREATE FLEXIBLE TABLE
- 10: CREATE FUNCTION statements
- 10.1: CREATE AGGREGATE FUNCTION
- 10.2: CREATE ANALYTIC FUNCTION
- 10.3: CREATE FILTER
- 10.4: CREATE FUNCTION (scalar)
- 10.5: CREATE FUNCTION (SQL)
- 10.6: CREATE PARSER
- 10.7: CREATE SOURCE
- 10.8: CREATE TRANSFORM FUNCTION
- 11: CREATE HCATALOG SCHEMA
- 12: CREATE KEY
- 13: CREATE LIBRARY
- 14: CREATE LOAD BALANCE GROUP
- 15: CREATE LOCAL TEMPORARY VIEW
- 16: CREATE LOCATION
- 17: CREATE NETWORK ADDRESS
- 18: CREATE NETWORK INTERFACE
- 19: CREATE NOTIFIER
- 20: CREATE PROCEDURE (external)
- 21: CREATE PROCEDURE (stored)
- 22: CREATE PROFILE
- 23: CREATE PROJECTION
- 23.1: Encoding types
- 23.2: GROUPED clause
- 23.3: Hash segmentation clause
- 23.4: Unsegmented clause
- 24: CREATE RESOURCE POOL
- 24.1: Built-in pools
- 24.2: Built-in resource pools configuration
- 25: CREATE ROLE
- 26: CREATE ROUTING RULE
- 27: CREATE SCHEMA
- 28: CREATE SEQUENCE
- 29: CREATE SUBNET
- 30: CREATE TABLE
- 30.1: Column-constraint
- 30.2: Column-definition
- 30.3: Column-name-list
- 30.4: Partition clause
- 30.5: Table-constraint
- 31: CREATE TEMPORARY TABLE
- 32: CREATE TEXT INDEX
- 33: CREATE USER
- 34: CREATE VIEW
1 - CREATE ACCESS POLICY
Creates an access policy that filters user and role access to table data. You can create access policies for table rows and columns. Vertica applies the access policy filters with each query and returns only the data that the current user or role is permitted to see.
You cannot set access policies on columns of complex data types other than native arrays. If the table contains complex-type columns, you can still set row access policies and column access policies on other columns.
Syntax
CREATE ACCESS POLICY ON [[database.]schema.]table
{ FOR COLUMN column | FOR ROWS WHERE } expression [GRANT TRUSTED] { ENABLE | DISABLE }
Parameters
[database.]schema
Database and schema. The default schema is public. If you specify a database, it must be the current database.
table
- The table with the target column or rows.
FOR COLUMN column
- The column on which to apply this access policy. The column can be a native array, but other complex types are not supported. (See Complex types.)
FOR ROWS WHERE
- The rows on which to apply this access policy.
expression
- A SQL expression that specifies conditions for accessing row or column data:
  - Row access policies limit access to specific rows in a table, as specified by the policy's WHERE expression. Only rows that satisfy this expression are fetched from the table. For details and sample usage, see Creating row access policies.
  - Column access policies limit access to specific table columns. The access policy expression can also specify how to render column data to specific users and roles. For details and sample usage, see Creating column access policies.
GRANT TRUSTED
Specifies that GRANT statements take precedence over the access policy in determining whether users can perform DML operations on the target table. If omitted, users can only modify table data if the access policy allows them to see the stored data in its original, unaltered state. For more information, see Access policies and DML operations.
Important
GRANT TRUSTED only affects DML operations and does not enable users to see data that the access policy would otherwise mask. Specifying this option may allow users with certain grants to update data that they cannot see.
ENABLE | DISABLE
- Whether to enable the access policy. You can enable and disable existing access policies with ALTER ACCESS POLICY.
Privileges
Non-superuser: Ownership of the table
Restrictions
The following limitations apply to access policies:
- A column can have only one access policy.
- Column access policies cannot be set on columns of complex types other than native arrays.
- Column access policies cannot be set for materialized columns on flex tables. While it is possible to set an access policy for the __raw__ column, doing so restricts access to the whole table.
- Row access policies are invalid on temporary tables and tables with aggregate projections.
- Access policy expressions cannot contain:
  - Subqueries
  - Aggregate functions
  - Analytic functions
  - User-defined transform functions (UDTFs)
- If the query optimizer cannot replace a deterministic expression that involves only constants with their computed values, it blocks all DML operations such as INSERT.
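For illustration, the following sketch creates and enables a column access policy on a hypothetical customers table; the table, column, and role names are examples only:
=> CREATE ACCESS POLICY ON public.customers
   FOR COLUMN ssn
   CASE WHEN ENABLED_ROLE('hr_admin') THEN ssn ELSE SUBSTR(ssn, 8, 4) END
   ENABLE;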
2 - CREATE AUTHENTICATION
Creates and enables an authentication method associated with users or roles. Vertica enables the authentication method automatically.
Syntax
CREATE AUTHENTICATION auth-method-name METHOD 'auth-type' access-method
Parameters
Name | Description |
---|---|
auth-method-name |
Name of the authentication method to create, where auth-method-name conforms to conventions described in Identifiers. |
auth-type |
The authentication method type to use. For details, see Supported Client Authentication Methods. |
access-method |
The access method the client uses to connect. |
Privileges
DBADMIN
Examples
See Creating authentication records.
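For illustration, a minimal sketch that creates an LDAP authentication method for clients connecting from a private subnet; the method name and address range are examples only:
=> CREATE AUTHENTICATION v_ldap METHOD 'ldap' HOST '10.0.0.0/8';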
3 - CREATE CA BUNDLE
Creates a certificate authority (CA) bundle. These contain root CA certificates.
Syntax
CREATE CA BUNDLE name [CERTIFICATES ca_cert[, ca_cert[, ...]]]
Parameters
name
- The name of the CA bundle.
ca_cert
- The name of the CA certificate. If no certificates are specified, the bundle will be empty.
Privileges
Ownership of the CA certificates in the CA bundle.
Examples
See Managing CA bundles.
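For illustration, a sketch that bundles two previously created CA certificates; the certificate names are examples only:
=> CREATE CA BUNDLE ca_bundle CERTIFICATES root_ca, intermediate_ca;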
4 - CREATE CERTIFICATE
Creates or imports a certificate, Certificate Authority (CA), or intermediate CA. These certificates can be used with ALTER TLS CONFIGURATION to set up client-server TLS, LDAPLink TLS, LDAPAuth TLS, and internode TLS.
CREATE CERTIFICATE generates x509v3 certificates.
Syntax
CREATE [TEMP[ORARY]] [CA] CERTIFICATE certificate_name
{AS cert [KEY key_name]
| SUBJECT subject
[ SIGNED BY ca_cert ]
[ VALID FOR days ]
[ EXTENSIONS ext = val[,...] ]
[ KEY private_key ]}
Parameters
TEMPORARY
- Create with session scope. The key is stored in memory and is valid only for the current session.
CA
- Designates the certificate as a CA or intermediate certificate. If omitted, the operation creates a normal certificate.
certificate_name
- The name of the certificate.
AS cert
- The imported certificate (string).
This parameter should include the entire chain of certificates, excluding the CA certificate.
KEY key_name
- The name of the key.
This parameter only needs to be set for client/server certificates and CA certificates that you intend to sign other certificates with in Vertica. If your imported CA certificate will only be used for validating other certificates, you do not need to specify a key.
SUBJECT subject
- The entity to issue the certificate to (string).
SIGNED BY ca_cert
- The name of the CA that signed the certificate.
When adding a CA certificate, this parameter is optional. Specifying it will create an intermediate CA that cannot be used to sign other CA certificates.
When creating a certificate, this parameter is required.
VALID FOR days
- The number of days that the certificate is valid.
EXTENSIONS ext=val
- Strings specifying certificate extensions. For a full list of extensions, see the OpenSSL documentation.
KEY private_key
- The name of the certificate's private key.
When importing a certificate, this parameter is required.
Privileges
Superuser
Default extensions
CREATE CERTIFICATE generates x509v3 certificates and includes several extensions by default. These differ based on the type of certificate you create:
CA Certificate:
- 'basicConstraints' = 'critical, CA:true'
- 'keyUsage' = 'critical, digitalSignature, keyCertSign'
- 'nsComment' = 'Vertica generated [CA] certificate'
- 'subjectKeyIdentifier' = 'hash'
Certificate:
- 'basicConstraints' = 'CA:false'
- 'keyUsage' = 'critical, digitalSignature, keyEncipherment'
Examples
See Generating TLS certificates and keys.
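For illustration, a sketch that creates a self-signed CA certificate and a server certificate signed by it; the names and subject strings are placeholders, and each named KEY must already exist (see CREATE KEY):
=> CREATE CA CERTIFICATE root_ca
   SUBJECT '/C=US/ST=MA/L=Cambridge/O=Example/CN=Example Root CA'
   VALID FOR 3650
   KEY root_key;
=> CREATE CERTIFICATE server_cert
   SUBJECT '/C=US/ST=MA/L=Cambridge/O=Example/CN=*.example.com'
   SIGNED BY root_ca
   VALID FOR 365
   KEY server_key;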
5 - CREATE DIRECTED QUERY
Saves an association between an input query and a query that is annotated with optimizer hints.
Syntax
Optimizer-generated
CREATE DIRECTED QUERY OPT[IMIZER] directedqueryID [COMMENT 'comments'] input-query
User-defined (custom)
CREATE DIRECTED QUERY CUSTOM directedqueryID [COMMENT 'comments'] annotated-query
Parameters
OPT[IMIZER]
- Directs the query optimizer to generate an annotated query from input-query, and associate both in the new directed query.
CUSTOM
- Specifies to associate annotated-query with the query previously specified by SAVE QUERY.
directedqueryID
- A unique identifier for the directed query, a string that conforms to conventions described in Identifiers.
COMMENT 'comments'
- Comments about the directed query, up to 128 characters. Comments can be useful for future reference—for example, to explain why a given directed query was created.
If you omit this argument, Vertica inserts one of the following comments:
  - Optimizer-generated directed query
  - Custom directed query
input-query
- The input query to associate with an optimizer-generated directed query. The input query supports only one optimizer hint, :v (alias IGNORECONST).
annotated-query
- A query with embedded optimizer hints to associate with the input query most recently saved with SAVE QUERY.
Privileges
Superuser
Description
CREATE DIRECTED QUERY associates an input query with a query annotated with optimizer hints. It stores the association under a unique identifier. CREATE DIRECTED QUERY has two variants:
-
CREATE DIRECTED QUERY OPTIMIZER directs the query optimizer to generate annotated SQL from the specified input query. The annotated query contains hints that the optimizer can use to recreate its current query plan for that input query.
-
CREATE DIRECTED QUERY CUSTOM specifies an annotated query supplied by the user. Vertica associates the annotated query with the input query specified by the last SAVE QUERY statement.
In both cases, Vertica associates the annotated query and input query, and registers their association in the system table DIRECTED_QUERIES under query_name.
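For illustration, a hypothetical optimizer-generated directed query; the identifier and the employee_dimension table are examples only:
=> CREATE DIRECTED QUERY OPTIMIZER findBostonEmployees
   SELECT employee_first_name, employee_last_name
   FROM employee_dimension
   WHERE employee_city = 'Boston';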
Caution
Vertica associates a saved query and directed query without checking whether the two are compatible. Be careful to sequence SAVE QUERY and CREATE DIRECTED QUERY CUSTOM so the saved and directed queries are correctly matched.
See also
Creating directed queries
6 - CREATE EXTERNAL TABLE AS COPY
CREATE EXTERNAL TABLE AS COPY creates a table definition for data external to your Vertica database. This statement is a combination of the CREATE TABLE and COPY statements, supporting a subset of each statement's parameters.
Canceling a CREATE EXTERNAL TABLE AS COPY statement can cause unpredictable results. If you need to make a change, allow the statement to complete, drop the table, and then retry.
You can use ALTER TABLE to change the data types of columns instead of dropping and recreating the table.
You can use CREATE EXTERNAL TABLE AS COPY with any types except types from the Place package.
Note
Vertica does not create superprojections for external tables, since external tables are not stored in the database.
Syntax
CREATE EXTERNAL TABLE [ IF NOT EXISTS ] [[database.]schema.]table-name
( column-definition[,...] )
[{INCLUDE | EXCLUDE} [SCHEMA] PRIVILEGES]
AS COPY
[ ( { column-as-expression | column }
[ DELIMITER [ AS ] 'char' ]
[ ENCLOSED [ BY ] 'char' ]
[ ENFORCELENGTH ]
[ ESCAPE [ AS ] 'char' | NO ESCAPE ]
[ FILLER datatype ]
[ FORMAT 'format' ]
[ NULL [ AS ] 'string' ]
[ TRIM 'byte' ]
[,...] ) ]
[ COLUMN OPTION ( column
[ DELIMITER [ AS ] 'char' ]
[ ENCLOSED [ BY ] 'char' ]
[ ENFORCELENGTH ]
[ ESCAPE [ AS ] 'char' | NO ESCAPE ]
[ FORMAT 'format' ]
[ NULL [ AS ] 'string' ]
[ TRIM 'byte' ]
[,...] ) ]
FROM {
{ 'path-to-data' [ ON { nodename | (nodeset) | ANY NODE } ] [ input-format ] }[,...]
| [ WITH ] SOURCE source( [arg=value[,...] ] )
}
[ NATIVE
| FIXEDWIDTH COLSIZES {( integer )[,...]}
| NATIVE VARCHAR
| ORC
| PARQUET
]
[ ABORT ON ERROR ]
[ DELIMITER [ AS ] 'char' ]
[ ENCLOSED BY 'char' [ AND 'char' ] ]
[ ENFORCELENGTH ]
[ ERROR TOLERANCE ]
[ ESCAPE AS 'char' | NO ESCAPE ]
[ EXCEPTIONS 'path' [ ON nodename ] [,...] ]
[ [ WITH ] FILTER filter( [ arg=value[,...] ] ) ]
[ NULL [ AS ] 'string' ]
[ [ WITH ] PARSER parser([arg=value [,...] ]) ]
[ RECORD TERMINATOR 'string' ]
[ REJECTED DATA 'path' [ ON nodename ] [,...] ]
[ REJECTMAX integer ]
[ SKIP integer ]
[ SKIP BYTES integer ]
[ TRAILING NULLCOLS ]
[ TRIM 'byte' ]
Parameters
For all supported parameters, see the CREATE TABLE and COPY statements. For information on using this statement with UDLs, see User-defined load (UDL).
For additional guidance on using COPY parameters, see Specifying where to load data from.
Privileges
Superuser, or non-superuser with the following privileges:
-
READ privileges on the USER-accessible storage location, see GRANT (storage location)
-
Full access (including SELECT) to an external table that the user has privileges to create
ORC and Parquet data
When using the ORC and Parquet formats, Vertica supports some additional options in the COPY
statement and data structures for columns. See ORC (parser) and PARQUET (parser).
If ORC or Parquet data is partitioned, Vertica expects Hive-style partitioning. If you see unexpected results when reading data, verify that globs in your file paths correctly align with the partition structure. See Troubleshooting external tables.
Examples
The following example defines an external table for delimited data stored in HDFS:
=> CREATE EXTERNAL TABLE sales (itemID INT, date DATE, price FLOAT)
AS COPY FROM 'hdfs:///dat/ext1.csv' DELIMITER ',';
The following example uses data in the ORC format that is stored in S3. The data has two partition columns, represented as directories in the file structure. For more information about partitions, see Using partition columns.
=> CREATE EXTERNAL TABLE transactions (id int, name varchar(50), created date, region varchar(50))
AS COPY FROM 's3://datalake/sales/*/*/*'
ORC(hive_partition_cols='created,region');
The following example shows how you can read from all Parquet files in a local directory, with no partitions and no globs:
=> CREATE EXTERNAL TABLE sales (itemID INT, date DATE, price FLOAT)
AS COPY FROM '/data/sales/*.parquet' PARQUET;
The following example creates an external table using partitioned data in the ORC format. The table includes four columns. Two columns, "id" and "name", are in the data files. The other two, "created" and "region", are partition columns. For more about partition columns, see Using partition columns.
=> CREATE EXTERNAL TABLE t (id int, name varchar(50), created date, region varchar(50))
AS COPY FROM 'hdfs:///path/*/*/*'
ORC(hive_partition_cols='created,region');
The following example creates an external table from data in Google Cloud Storage:
=> CREATE EXTERNAL TABLE sales (itemID INT, date DATE, price FLOAT)
AS COPY FROM 'gs://data/sales/*.csv';
The following example creates an external table for data containing arrays:
=> CREATE EXTERNAL TABLE cust (cust_custkey int, cust_custname varchar(50), cust_custstaddress ARRAY[varchar(100)],
cust_custaddressln2 ARRAY[varchar(100)], cust_custcity ARRAY[varchar(50)], cust_custstate ARRAY[char(2)], cust_custzip ARRAY[int],
cust_email varchar(50), cust_phone varchar(30))
AS COPY FROM ' ' PARQUET;
The following examples create external tables from data in the local file system:
=> CREATE EXTERNAL TABLE ext1 (x integer) AS COPY FROM '/tmp/ext1.dat' DELIMITER ',';
=> CREATE EXTERNAL TABLE ext2 (x integer) AS COPY FROM '/tmp/ext2.dat.bz2' BZIP DELIMITER ',';
=> CREATE EXTERNAL TABLE ext3 (x integer, y integer) AS COPY (x as '5', y) FROM '/tmp/ext3.dat.bz2' BZIP DELIMITER ',';
To allow users without superuser access to use external tables with data on the local file system, S3, or GCS, create a location for 'user' usage and grant access to it. This example shows granting access to a user named Bob to any external table whose data is located under /tmp (including in subdirectories to any depth):
=> CREATE LOCATION '/tmp' ALL NODES USAGE 'user';
=> GRANT ALL ON LOCATION '/tmp' to Bob;
The following example shows CREATE EXTERNAL TABLE using a user-defined source:
=> CREATE SOURCE curl AS LANGUAGE 'C++' NAME 'CurlSourceFactory' LIBRARY curllib;
=> CREATE EXTERNAL TABLE curl_table1 as COPY SOURCE CurlSourceFactory;
See also
Creating external tables
7 - CREATE FAULT GROUP
Enterprise Mode only
Creates a fault group, which can contain the following:
-
One or more nodes
-
One or more child fault groups
-
One or more nodes and one or more child fault groups
CREATE FAULT GROUP creates an empty fault group. Use ALTER FAULT GROUP to add nodes or other fault groups to an existing fault group.
Syntax
CREATE FAULT GROUP name
Parameters
name
- The name of the fault group to create, unique among all fault groups, where
name
conforms to conventions described in Identifiers.
Privileges
Superuser
Examples
The following command creates a fault group called parent0:
=> CREATE FAULT GROUP parent0;
CREATE FAULT GROUP
Use ALTER FAULT GROUP to add nodes or other fault groups to the parent0
fault group.
8 - CREATE FLEXIBLE EXTERNAL TABLE AS COPY
CREATE FLEXIBLE EXTERNAL TABLE AS COPY creates a flexible external table. This statement combines the CREATE FLEXIBLE TABLE and COPY statements, supporting a subset of each statement's parameters.
You can also use user-defined load functions (UDLs) to create external flex tables. For details about creating and using flex tables, see Using Flex Tables.
Note
Vertica does not create a superprojection for an external table when you create it.
For details about creating and using flex tables, see Creating flex tables in Using Flex Tables.
Caution
Canceling a CREATE FLEX EXTERNAL TABLE AS COPY statement can cause unpredictable results. Vertica recommends that you allow the statement to finish, then use DROP TABLE after the table exists.
Syntax
CREATE FLEX[IBLE] EXTERNAL TABLE [ IF NOT EXISTS ] [[database.]schema.]table-name
( [ column-definition[,...] ] )
[ INCLUDE | EXCLUDE [SCHEMA] PRIVILEGES ]
AS COPY [ ( { column-as-expression | column } [ FILLER datatype ] ) ]
FROM {
'path-to-data' [ ON nodename | ON ANY NODE | ON (nodeset) ] input-format [,...]
| [ WITH ] UDL-clause[...]
}
[ ABORT ON ERROR ]
[ DELIMITER [ AS ] 'char' ]
[ ENCLOSED [ BY ] 'char' ]
[ ENFORCELENGTH ]
[ ESCAPE [ AS ] 'char' | NO ESCAPE ]
[ EXCEPTIONS 'path' [ ON nodename ] [,...] ]
[ NULL [ AS ] 'string' ]
[ RECORD TERMINATOR 'string' ]
[ REJECTED DATA 'path' [ ON nodename ][,...] ]
[ REJECTMAX integer ]
[ SKIP integer ]
[ SKIP BYTES integer ]
[ TRAILING NULLCOLS ]
[ TRIM 'byte' ]
Parameters
For parameter descriptions, see CREATE TABLE and COPY parameters.
Note
CREATE FLEXIBLE EXTERNAL TABLE AS COPY supports only a subset of CREATE TABLE and COPY parameters.
Privileges
Superuser, or non-superuser with the following privileges:
-
READ privileges on the USER-accessible storage location, see GRANT (storage location)
-
Full access (including SELECT) to an external table that the user has privileges to create
Examples
To create an external flex table:
=> CREATE flex external table mountains() AS COPY FROM 'home/release/KData/kmm_ountains.json' PARSER fjsonparser();
CREATE TABLE
As with other flex tables, creating an external flex table produces two regular tables: the named table and its associated _keys
table. The keys table is not an external table:
=> \dt mountains
List of tables
Schema | Name | Kind | Owner | Comment
--------+-----------+-------+---------+---------
public | mountains | table | release |
(1 row)
You can use the helper function, COMPUTE_FLEXTABLE_KEYS_AND_BUILD_VIEW, to compute keys and create a view for the external table:
=> SELECT compute_flextable_keys_and_build_view ('appLog');
compute_flextable_keys_and_build_view
--------------------------------------------------------------------------------------------------
Please see public.appLog_keys for updated keys
The view public.appLog_view is ready for querying
(1 row)
Check the keys from the _keys
table for the results of running the helper application:
=> SELECT * FROM appLog_keys;
key_name | frequency | data_type_guess
----------------------------------------------------------+-----------+------------------
contributors | 8 | varchar(20)
coordinates | 8 | varchar(20)
created_at | 8 | varchar(60)
entities.hashtags | 8 | long varbinary(186)
.
.
.
retweeted_status.user.time_zone | 1 | varchar(20)
retweeted_status.user.url | 1 | varchar(68)
retweeted_status.user.utc_offset | 1 | varchar(20)
retweeted_status.user.verified | 1 | varchar(20)
(125 rows)
You can query the view:
=> SELECT "user.lang" FROM appLog_view;
user.lang
-----------
it
en
es
en
en
es
tr
en
(12 rows)
9 - CREATE FLEXIBLE TABLE
Creates a flexible (flex) table in the logical schema.
When you create a flex table, Vertica automatically creates two dependent objects:
- A keys table named flex-table-name_keys
- A view named flex-table-name_view
The flex table requires the keys table and view. Neither of these objects can exist independently of the flex table.
Syntax
Create with column definitions
CREATE [[ scope ] TEMP[ORARY]] FLEX[IBLE] TABLE [ IF NOT EXISTS ] [[database.]schema.]table-name
( [ column-definition[,...] [, table-constraint ][,...] ] )
[ ORDER BY column[,...] ]
[ segmentation-spec ]
[ KSAFE [k-num] ]
[ partition-clause]
[ {INCLUDE | EXCLUDE} [SCHEMA] PRIVILEGES ]
Create from another table
CREATE FLEX[IBLE] TABLE [[database.]schema.] table-name
[ ( column-name-list ) ]
[ {INCLUDE | EXCLUDE} [SCHEMA] PRIVILEGES ]
AS query [ ENCODED BY column-ref-list ]
Parameters
For general parameter descriptions, see CREATE TABLE; for parameters specific to temporary flex tables, see CREATE TEMPORARY TABLE and Creating flex tables.
You cannot partition a flex table on any virtual column (key).
Privileges
Non-superuser: CREATE privilege on table schema
Default columns
The CREATE statement can omit specifying any column definitions. CREATE FLEXIBLE TABLE always creates two columns automatically:
__raw__
- LONG VARBINARY type column to store unstructured data that you load. By default, this column has a NOT NULL constraint.
__identity__
- IDENTITY column that is used for segmentation and sorting when no other column is defined.
Default projections
Vertica automatically creates superprojections for both the flex table and keys tables when you create them.
If you create a flex table with one or more of the ORDER BY, ENCODED BY, SEGMENTED BY, or KSAFE clauses, the clause information is used to create projections. If no clauses are in use, Vertica uses the following defaults:
Table | Sort order | Encoding | Segmentation | K-safety |
---|---|---|---|---|
Flexible table | ORDER BY *.__identity__ | none | SEGMENTED BY hash *.__identity__ ALL NODES OFFSET 0 | 1 |
Keys table | ORDER BY *._keys_frequency | none | UNSEGMENTED ALL NODES | 1 |
Note
When you build a view for a flex table (see BUILD_FLEXTABLE_VIEW), the view is ordered by frequency (descending) and key_name.
Examples
The following example creates a flex table named darkdata1 without specifying any column information. Vertica creates a default superprojection and buddy projection as part of creating the table:
=> CREATE FLEXIBLE TABLE darkdata1();
CREATE TABLE
=> \dj darkdata1*
List of projections
Schema | Name | Owner | Node | Comment
--------+----------------------+---------+------------------+---------
public | darkdata1_b0 | dbadmin | |
public | darkdata1_b1 | dbadmin | |
public | darkdata1_keys_super | dbadmin | v_vmart_node0001 |
public | darkdata1_keys_super | dbadmin | v_vmart_node0002 |
public | darkdata1_keys_super | dbadmin | v_vmart_node0003 |
(5 rows)
=> SELECT export_objects('','darkdata1_b0');
CREATE PROJECTION public.darkdata1_b0 /*+basename(darkdata1),createtype(P)*/
(
__identity__,
__raw__
)
AS
SELECT darkdata1.__identity__,
darkdata1.__raw__
FROM public.darkdata1
ORDER BY darkdata1.__identity__
SEGMENTED BY hash(darkdata1.__identity__) ALL NODES OFFSET 0;
SELECT MARK_DESIGN_KSAFE(1);
(1 row)
=> select export_objects('','darkdata1_keys_super');
CREATE PROJECTION public.darkdata1_keys_super /*+basename(darkdata1_keys),createtype(P)*/
(
key_name,
frequency,
data_type_guess
)
AS
SELECT darkdata1_keys.key_name,
darkdata1_keys.frequency,
darkdata1_keys.data_type_guess
FROM public.darkdata1_keys
ORDER BY darkdata1_keys.frequency
UNSEGMENTED ALL NODES;
SELECT MARK_DESIGN_KSAFE(1);
(1 row)
The following example creates a table called darkdata1 with one column definition (date_col). The statement specifies the PARTITION BY clause to partition the data by year. Vertica creates a default superprojection and buddy projections as part of creating the table:
=> CREATE FLEX TABLE darkdata1 (date_col date NOT NULL) partition by
extract('year' from date_col);
CREATE TABLE
10 - CREATE FUNCTION statements
Vertica provides CREATE statements for each type of user-defined extension. Each CREATE statement adds a user-defined function to the Vertica catalog:
CREATE statement | Extension |
---|---|
CREATE FUNCTION (scalar) | User-defined scalar functions (UDSFs) |
CREATE AGGREGATE FUNCTION | User-defined aggregate functions (UDAFs) |
CREATE ANALYTIC FUNCTION | User-defined analytic functions (UDAnFs) |
CREATE TRANSFORM FUNCTION | User-defined transform functions (UDTFs) |
CREATE statements for user-defined load: | |
• CREATE SOURCE | Load source functions |
• CREATE FILTER | Load filter functions |
• CREATE PARSER | Load parser functions |
Vertica also provides CREATE FUNCTION (SQL), which stores SQL expressions as functions that you can invoke in a query.
10.1 - CREATE AGGREGATE FUNCTION
Adds a user-defined aggregate function (UDAF) to the catalog. The library containing the function must have been previously added using CREATE LIBRARY.
CREATE AGGREGATE FUNCTION automatically determines the function parameters and return value from data supplied by the factory class. Vertica supports overloading aggregate functions. When you call the SQL function, Vertica passes the input table to the function to process.
User-defined aggregate functions run in unfenced mode only.
Syntax
CREATE [ OR REPLACE ] AGGREGATE FUNCTION [ IF NOT EXISTS ]
[[database.]schema.]function AS
[ LANGUAGE 'language' ]
NAME 'factory'
LIBRARY library
[ NOT FENCED ];
Arguments
OR REPLACE
If a function with the same name and arguments exists, replace it. You can use this to change between fenced and unfenced modes, for example. If you do not use this directive and the function already exists, the CREATE statement returns with a rollback error.
OR REPLACE and IF NOT EXISTS are mutually exclusive.
IF NOT EXISTS
If a function with the same name and arguments exists, return without creating the function.
OR REPLACE and IF NOT EXISTS are mutually exclusive.
[database.]schema
Database and schema. The default schema is public. If you specify a database, it must be the current database.
function
Name of the function to create. This is the name used in SQL invocations of the function. It does not need to match the name of the factory, but it is less confusing if they are the same or similar.
The function name must conform to the restrictions on Identifiers.
LANGUAGE 'language'
- The language used to develop this function, currently C++ only (the default).
NAME 'factory'
- Name of the factory class that generates the function instance.
LIBRARY library
- Name of the shared library that contains the function. This library must have already been loaded by CREATE LIBRARY.
NOT FENCED
- Indicates that the function runs in unfenced mode. Aggregate functions cannot be run in fenced mode.
Privileges
Non-superuser:
-
CREATE privilege on the function's schema
-
USAGE privilege on the function's library
Examples
The following example demonstrates loading a library named AggregateFunctions and then defining functions named ag_avg and ag_cat. The functions are mapped to the AverageFactory and ConcatenateFactory classes in the library:
=> CREATE LIBRARY AggregateFunctions AS '/opt/vertica/sdk/examples/build/AggregateFunctions.so';
CREATE LIBRARY
=> CREATE AGGREGATE FUNCTION ag_avg AS LANGUAGE 'C++' NAME 'AverageFactory'
library AggregateFunctions;
CREATE AGGREGATE FUNCTION
=> CREATE AGGREGATE FUNCTION ag_cat AS LANGUAGE 'C++' NAME 'ConcatenateFactory'
library AggregateFunctions;
CREATE AGGREGATE FUNCTION
=> \x
Expanded display is on.
=> SELECT * FROM user_functions;
-[ RECORD 1 ]----------+------------------------------------------------------------------
schema_name | public
function_name | ag_avg
procedure_type | User Defined Aggregate
function_return_type | Numeric
function_argument_type | Numeric
function_definition | Class 'AverageFactory' in Library 'public.AggregateFunctions'
volatility |
is_strict | f
is_fenced | f
comment |
-[ RECORD 2 ]----------+------------------------------------------------------------------
schema_name | public
function_name | ag_cat
procedure_type | User Defined Aggregate
function_return_type | Varchar
function_argument_type | Varchar
function_definition | Class 'ConcatenateFactory' in Library 'public.AggregateFunctions'
volatility |
is_strict | f
is_fenced | f
comment |
10.2 - CREATE ANALYTIC FUNCTION
Adds a user-defined analytic function (UDAnF) to the catalog. The library containing the function must have been previously added using CREATE LIBRARY.
CREATE ANALYTIC FUNCTION automatically determines the function parameters and return value from data supplied by the factory class. Vertica supports overloading analytic functions. When you call the SQL function, Vertica passes the input table to the function in the library to process.
Syntax
CREATE [ OR REPLACE ] ANALYTIC FUNCTION [ IF NOT EXISTS ]
[[database.]schema.]function AS
[ LANGUAGE 'language' ]
NAME 'factory'
LIBRARY library
[ FENCED | NOT FENCED ]
Arguments
OR REPLACE
If a function with the same name and arguments exists, replace it. You can use this to change between fenced and unfenced modes, for example. If you do not use this directive and the function already exists, the CREATE statement returns with a rollback error.
OR REPLACE and IF NOT EXISTS are mutually exclusive.
IF NOT EXISTS
If a function with the same name and arguments exists, return without creating the function.
OR REPLACE and IF NOT EXISTS are mutually exclusive.
[database.]schema
Database and schema. The default schema is public. If you specify a database, it must be the current database.
function
Name of the function to create. This is the name used in SQL invocations of the function. It does not need to match the name of the factory, but it is less confusing if they are the same or similar.
The function name must conform to the restrictions on Identifiers.
LANGUAGE 'language'
- Language used to develop this function, one of the following:
  - C++ (default)
  - Java
NAME 'factory'
- Name of the factory class that generates the function instance.
LIBRARY library
- Name of the library that contains the function. This library must already be loaded by CREATE LIBRARY.
FENCED | NOT FENCED
- Enables or disables fenced mode for this function.
Default: FENCED
Privileges
Non-superuser:
-
CREATE privilege on the function's schema
-
USAGE privilege on the function's library
Examples
This example creates an analytic function named an_rank based on the factory class named RankFactory in the AnalyticFunctions library:
=> CREATE ANALYTIC FUNCTION an_rank AS LANGUAGE 'C++'
NAME 'RankFactory' LIBRARY AnalyticFunctions;
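Once created, a UDAnF is invoked with an OVER clause like a built-in analytic function. A hypothetical call, assuming a table scores(player, score):
=> SELECT player, an_rank() OVER (ORDER BY score DESC) FROM scores;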
See also
Analytic functions (UDAnFs)
10.3 - CREATE FILTER
Adds a user-defined load filter function to the catalog. The library containing the filter function must have been previously added using CREATE LIBRARY.
CREATE FILTER automatically determines the function parameters and return value from data supplied by the factory class. Vertica supports overloading load filter functions. When you call the SQL function, Vertica passes the input table to the function in the library to process.
Important
Installing an untrusted UDL function can compromise the security of the server. UDxs can contain arbitrary code. In particular, user-defined source functions can read data from any arbitrary location. It is up to the developer of the function to enforce proper security limitations. Superusers must not grant access to UDxs to untrusted users.
Syntax
CREATE [ OR REPLACE ] FILTER [ IF NOT EXISTS ]
[[database.]schema.]function AS
[ LANGUAGE 'language' ]
NAME 'factory' LIBRARY library
[ FENCED | NOT FENCED ]
Arguments
OR REPLACE
If a function with the same name and arguments exists, replace it. You can use this to change between fenced and unfenced modes, for example. If you do not use this directive and the function already exists, the CREATE statement returns with a rollback error.
OR REPLACE and IF NOT EXISTS are mutually exclusive.
IF NOT EXISTS
If a function with the same name and arguments exists, return without creating the function.
OR REPLACE and IF NOT EXISTS are mutually exclusive.
[database.]schema
Database and schema. The default schema is public. If you specify a database, it must be the current database.
function
Name of the function to create. This is the name used in SQL invocations of the function. It does not need to match the name of the factory, but it is less confusing if they are the same or similar.
The function name must conform to the restrictions on Identifiers.
LANGUAGE 'language'
- The language used to develop this function, one of the following:
  - C++ (default)
  - Java
  - Python
NAME 'factory'
- Name of the factory class that generates the function instance. This is the same name used by the RegisterFactory class.
LIBRARY library
- Name of the C++ library shared object file, Python file, or Java Jar file. This library must already have been loaded by CREATE LIBRARY.
FENCED | NOT FENCED
- Enables or disables fenced mode for this function.
Default: FENCED
Privileges
Superuser
Examples
The following example demonstrates loading a library named iConverterLib, then defining a filter function named Iconverter that is mapped to the IconverterFactory factory class in the library:
=> CREATE LIBRARY iConverterLib as '/opt/vertica/sdk/examples/build/IconverterLib.so';
CREATE LIBRARY
=> CREATE FILTER Iconverter AS LANGUAGE 'C++' NAME 'IconverterFactory' LIBRARY IconverterLib;
CREATE FILTER FUNCTION
=> \x
Expanded display is on.
=> SELECT * FROM user_functions;
-[ RECORD 1 ]----------+--------------------
schema_name | public
function_name | Iconverter
procedure_type | User Defined Filter
function_return_type |
function_argument_type |
function_definition |
volatility |
is_strict | f
is_fenced | f
comment |
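After registration, the filter can be applied in a COPY statement's FILTER clause. A sketch, assuming a table t with data to convert:
=> COPY t FROM STDIN WITH FILTER Iconverter();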
10.4 - CREATE FUNCTION (scalar)
Adds a user-defined scalar function (UDSF) to the catalog. The library containing the function must have been previously added using CREATE LIBRARY.
A UDSF takes in a single row of data and returns a single value. These functions can be used anywhere a native Vertica function or statement can be used, except in CREATE TABLE with its PARTITION BY or any segmentation clause.
CREATE FUNCTION automatically determines the function parameters and return value from data supplied by the factory class. Vertica supports overloading UDxs. When you call the function, Vertica passes the parameters to the function in the library to process.
Syntax
CREATE [ OR REPLACE ] FUNCTION [ IF NOT EXISTS ]
[[database.]schema.]function AS
[ LANGUAGE 'language' ]
NAME 'factory'
LIBRARY library
[ FENCED | NOT FENCED ]
Arguments
OR REPLACE
If a function with the same name and arguments exists, replace it. You can use this to change between fenced and unfenced modes, for example. If you do not use this directive and the function already exists, the CREATE statement returns with a rollback error.
OR REPLACE and IF NOT EXISTS are mutually exclusive.
IF NOT EXISTS
If a function with the same name and arguments exists, return without creating the function.
OR REPLACE and IF NOT EXISTS are mutually exclusive.
[database.]schema
Database and schema. The default schema is public. If you specify a database, it must be the current database.
function
Name of the function to create. This is the name used in SQL invocations of the function. It does not need to match the name of the factory, but it is less confusing if they are the same or similar.
The function name must conform to the restrictions on Identifiers.
LANGUAGE 'language'
- Language used to develop this function, one of the following:
  - C++ (default)
  - Python
  - Java
  - R
NAME 'factory'
- Name of the factory class that generates the function instance.
LIBRARY library
- Name of the C++ shared object file, Python file, Java Jar file, or R functions file. This library must already have been loaded by CREATE LIBRARY.
FENCED | NOT FENCED
- Enables or disables fenced mode for this function. Functions written in Java and R always run in fenced mode.
Default: FENCED
Privileges
-
CREATE privilege on the function's schema
-
USAGE privilege on the function's library
Examples
The following example loads a library named ScalarFunctions and then defines a function named Add2Ints that is mapped to the Add2IntsFactory factory class in the library:
=> CREATE LIBRARY ScalarFunctions AS '/opt/vertica/sdk/examples/build/ScalarFunctions.so';
CREATE LIBRARY
=> CREATE FUNCTION Add2Ints AS LANGUAGE 'C++' NAME 'Add2IntsFactory' LIBRARY ScalarFunctions;
CREATE FUNCTION
=> \x
Expanded display is on.
=> SELECT * FROM USER_FUNCTIONS;
-[ RECORD 1 ]----------+----------------------------------------------------
schema_name | public
function_name | Add2Ints
procedure_type | User Defined Function
function_return_type | Integer
function_argument_type | Integer, Integer
function_definition | Class 'Add2IntsFactory' in Library 'public.ScalarFunctions'
volatility | volatile
is_strict | f
is_fenced | t
comment |
=> \x
Expanded display is off.
=> -- Try a simple call to the function
=> SELECT Add2Ints(23,19);
Add2Ints
----------
42
(1 row)
The following example uses a scalar function that returns a ROW:
=> CREATE FUNCTION div_with_rem AS LANGUAGE 'C++' NAME 'DivFactory' LIBRARY ScalarFunctions;
=> SELECT div_with_rem(18,5);
div_with_rem
------------------------------
{"quotient":3,"remainder":3}
(1 row)
See also
Developing user-defined extensions (UDxs)
10.5 - CREATE FUNCTION (SQL)
Stores SQL expressions as functions for use in queries. User-defined SQL functions are useful for executing complex queries and combining Vertica built-in functions. You simply call the function in a given query. If multiple SQL functions with the same name and argument types are in the search path, Vertica calls the first match that it finds.
SQL functions are flattened in all cases, including DDL.
Syntax
CREATE [ OR REPLACE ] FUNCTION [ IF NOT EXISTS ]
[[database.]schema.]function( [ arg_list ] )
RETURN return_type
AS
BEGIN
RETURN expression;
END;
Arguments
OR REPLACE
- If a function of the same name and arguments exists, replace it. If you only change the function arguments, Vertica ignores this option and maintains both functions under the same name.
OR REPLACE and IF NOT EXISTS are mutually exclusive.
IF NOT EXISTS
If a function with the same name and arguments exists, return without creating the function.
OR REPLACE and IF NOT EXISTS are mutually exclusive.
[database.]schema
Database and schema. The default schema is public. If you specify a database, it must be the current database.
function
- SQL function to create, where function conforms to conventions described in Identifiers.
arg_list
- A comma-delimited list of argument names and their data types, specified in this format:
argname argtype[,...]
where:
  - argname is the name of an argument passed to function.
  - argtype is argname's data type.
return_type
- The data type that this function returns.
RETURN expression
- The SQL function body, where expression can contain built-in functions, operators, and argument names specified in the CREATE FUNCTION statement. A semicolon at the end of the expression is required.
Note
CREATE FUNCTION allows only one RETURN expression. Return expressions do not support the following:
- FROM, WHERE, GROUP BY, ORDER BY, and LIMIT clauses
- Aggregation, analytics, and meta-functions
Privileges
Non-superuser:
-
CREATE privilege on the function's schema
-
USAGE privilege on the function's library
Strictness and volatility
Vertica infers the strictness and volatility (stable, immutable, or volatile) of a SQL function from its definition. Vertica then determines the correctness of usage, such as where an immutable function is expected but a volatile function is provided.
SQL functions and views
You can create views on queries that use SQL functions and then query the views. When you create a view, a SQL function replaces a call to the user-defined function with the function body in the view definition. Therefore, if you replace the body of the user-defined function, you should also re-create the view.
Examples
See Creating user-defined SQL functions.
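For illustration, a minimal SQL function that mirrors the built-in ZEROIFNULL behavior:
=> CREATE FUNCTION myzeroifnull(x INT) RETURN INT
   AS BEGIN
     RETURN (CASE WHEN (x IS NOT NULL) THEN x ELSE 0 END);
   END;
CREATE FUNCTION
=> SELECT myzeroifnull(NULL);
 myzeroifnull
--------------
            0
(1 row)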
10.6 - CREATE PARSER
Adds a user-defined load parser function to the catalog. The library containing the function must have been previously added using CREATE LIBRARY.
CREATE PARSER automatically determines the function parameters and return value from data supplied by the factory class. Vertica supports overloading load parser functions. When you call the SQL function, Vertica passes the input table to the function in the library to process.
Important
Installing an untrusted UDL function can compromise the security of the server. UDxs can contain arbitrary code. In particular, user-defined source functions can read data from any arbitrary location. It is up to the developer of the function to enforce proper security limitations. Superusers must not grant access to UDxs to untrusted users.
Syntax
CREATE [ OR REPLACE ] PARSER [ IF NOT EXISTS ]
[[database.]schema.]function AS
[ LANGUAGE 'language' ]
NAME 'factory'
LIBRARY library
[ FENCED | NOT FENCED ]
Arguments
OR REPLACE
If a function with the same name and arguments exists, replace it. You can use this to change between fenced and unfenced modes, for example. If you do not use this directive and the function already exists, the CREATE statement returns with a rollback error.
OR REPLACE and IF NOT EXISTS are mutually exclusive.
IF NOT EXISTS
If a function with the same name and arguments exists, return without creating the function.
OR REPLACE and IF NOT EXISTS are mutually exclusive.
[database.]schema
Database and schema. The default schema is public. If you specify a database, it must be the current database.
function
Name of the function to create. This is the name used in SQL invocations of the function. It does not need to match the name of the factory, but it is less confusing if they are the same or similar.
The function name must conform to the restrictions on Identifiers.
LANGUAGE 'language'
- The language used to develop this function, one of the following:
  - C++ (default)
  - Java
  - Python
NAME 'factory'
- Name of the factory class that generates the function instance. This is the same name used by the RegisterFactory class.
LIBRARY library
- Name of the C++ library shared object file, Python file, or Java Jar file. This library must already have been loaded by CREATE LIBRARY.
FENCED | NOT FENCED
- Enables or disables fenced mode for this function.
Default: FENCED
Privileges
Superuser
Examples
The following example demonstrates loading a library named BasicIntegerParserLib, then defining a parser function named BasicIntegerParser that is mapped to the BasicIntegerParserFactory factory class in the library:
=> CREATE LIBRARY BasicIntegerParserLib as '/opt/vertica/sdk/examples/build/BasicIntegerParser.so';
CREATE LIBRARY
=> CREATE PARSER BasicIntegerParser AS LANGUAGE 'C++' NAME 'BasicIntegerParserFactory' LIBRARY BasicIntegerParserLib;
CREATE PARSER FUNCTION
=> \x
Expanded display is on.
=> SELECT * FROM user_functions;
-[ RECORD 1 ]----------+--------------------
schema_name | public
function_name | BasicIntegerParser
procedure_type | User Defined Parser
function_return_type |
function_argument_type |
function_definition |
volatility |
is_strict | f
is_fenced | f
comment |
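After registration, the parser can be used in a COPY statement's PARSER clause. A sketch, assuming a table of integers:
=> COPY t FROM STDIN WITH PARSER BasicIntegerParser();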
10.7 - CREATE SOURCE
Adds a user-defined load source function to the catalog. The library containing the function must have been previously added using CREATE LIBRARY.
CREATE SOURCE automatically determines the function parameters and return value from data supplied by the factory class. Vertica supports overloading load source functions. When you call the SQL function, Vertica passes the input table to the function in the library to process.
Important
Installing an untrusted UDL function can compromise the security of the server. UDxs can contain arbitrary code. In particular, user-defined source functions can read data from any arbitrary location. It is up to the developer of the function to enforce proper security limitations. Superusers must not grant access to UDxs to untrusted users.
Syntax
CREATE [ OR REPLACE ] SOURCE [ IF NOT EXISTS ]
[[database.]schema.]function AS
[ LANGUAGE 'language' ]
NAME 'factory'
LIBRARY library
[ FENCED | NOT FENCED ]
Arguments
OR REPLACE
If a function with the same name and arguments exists, replace it. You can use this to change between fenced and unfenced modes, for example. If you do not use this directive and the function already exists, the CREATE statement returns with a rollback error.
OR REPLACE and IF NOT EXISTS are mutually exclusive.
IF NOT EXISTS
If a function with the same name and arguments exists, return without creating the function.
OR REPLACE and IF NOT EXISTS are mutually exclusive.
[database.]schema
Database and schema. The default schema is public. If you specify a database, it must be the current database.
function
Name of the function to create. This is the name used in SQL invocations of the function. It does not need to match the name of the factory, but it is less confusing if they are the same or similar.
The function name must conform to the restrictions on Identifiers.
LANGUAGE 'language'
- Language used to develop this function, one of the following:
  - C++ (default)
  - Java
NAME 'factory'
- Name of the factory class that generates the function instance. This is the same name used by the RegisterFactory class.
LIBRARY library
- Name of the C++ library shared object file or Java Jar file. This library must already have been loaded by CREATE LIBRARY.
FENCED | NOT FENCED
- Enables or disables fenced mode for this function.
Default: FENCED
Privileges
Superuser
Examples
The following example demonstrates loading a library named curllib, then defining a source function named curl that is mapped to the CurlSourceFactory factory class in the library:
=> CREATE LIBRARY curllib as '/opt/vertica/sdk/examples/build/cURLLib.so';
CREATE LIBRARY
=> CREATE SOURCE curl AS LANGUAGE 'C++' NAME 'CurlSourceFactory' LIBRARY curllib;
CREATE SOURCE
=> \x
Expanded display is on.
=> SELECT * FROM user_functions;
-[ RECORD 1 ]----------+--------------------
schema_name | public
function_name | curl
procedure_type | User Defined Source
function_return_type |
function_argument_type |
function_definition |
volatility |
is_strict | f
is_fenced | f
comment |
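After registration, the source can supply data to COPY. A sketch; the url parameter name is an assumption and depends on how the factory class defines its arguments:
=> COPY webdata SOURCE curl(url='http://example.com/data.txt');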
10.8 - CREATE TRANSFORM FUNCTION
Adds a user-defined transform function (UDTF) to the catalog. The library containing the function must have been previously added using CREATE LIBRARY.
CREATE TRANSFORM FUNCTION automatically determines the function parameters and return value from data supplied by the factory class. Vertica supports overloading transform functions. When you call the SQL function, Vertica passes the input table to the transform function in the library to process.
Syntax
CREATE [ OR REPLACE ] TRANSFORM FUNCTION [ IF NOT EXISTS ]
[[database.]schema.]function AS
[ LANGUAGE 'language' ]
NAME 'factory'
LIBRARY library
[ FENCED | NOT FENCED ]
Arguments
OR REPLACE
If a function with the same name and arguments exists, replace it. You can use this to change between fenced and unfenced modes, for example. If you do not use this directive and the function already exists, the CREATE statement returns with a rollback error.
OR REPLACE and IF NOT EXISTS are mutually exclusive.
IF NOT EXISTS
If a function with the same name and arguments exists, return without creating the function.
OR REPLACE and IF NOT EXISTS are mutually exclusive.
[database.]schema
Database and schema. The default schema is public. If you specify a database, it must be the current database.
function
Name of the function to create. This is the name used in SQL invocations of the function. It does not need to match the name of the factory, but it is less confusing if they are the same or similar.
The function name must conform to the restrictions on Identifiers.
LANGUAGE 'language'
- The language used to develop this function, one of the following:
  - C++ (default)
  - Java
  - R
  - Python
NAME 'factory'
- Name of the factory class that generates the function instance.
LIBRARY library
- Name of the C++ shared object file, Python file, Java Jar file, or R functions file. This library must already have been loaded by CREATE LIBRARY.
FENCED | NOT FENCED
- Enables or disables fenced mode for this function. Functions written in Java and R always run in fenced mode.
Default: FENCED
Privileges
Non-superuser:
-
CREATE privilege on the function's schema
-
USAGE privilege on the function's library
Restrictions
A query that includes a UDTF cannot:
- Include statements other than the SELECT statement that calls this UDTF and a PARTITION BY expression
- Call an analytic function
- Call another UDTF
- Include one of the following clauses:
Examples
The following example loads a library named TransformFunctions and then defines a function named tokenize that is mapped to the TokenFactory factory class in the library:
=> CREATE LIBRARY TransformFunctions AS
'/home/dbadmin/TransformFunctions.so';
CREATE LIBRARY
=> CREATE TRANSFORM FUNCTION tokenize
AS LANGUAGE 'C++' NAME 'TokenFactory' LIBRARY TransformFunctions;
CREATE TRANSFORM FUNCTION
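A UDTF is invoked in a SELECT with an OVER clause. A hypothetical call, assuming a table documents(url, description):
=> SELECT url, tokenize(description) OVER (PARTITION BY url) FROM documents;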
11 - CREATE HCATALOG SCHEMA
Define a schema for data stored in a Hive data warehouse using the HCatalog Connector. For more information, see Using the HCatalog Connector.
Most of the optional parameters are read out of Hadoop configuration files if available. If you copied the Hadoop configuration files as described in Configuring Vertica for HCatalog, you can omit most parameters. By default this statement uses the values specified in those configuration files. If the configuration files are complete, the following is a valid statement:
=> CREATE HCATALOG SCHEMA hcat;
If a value is not specified in the configuration files and a default is shown in the parameter list, then that default value is used.
Some parameters apply only if you are using HiveServer2 (the default). Others apply only if you are using WebHCat, a legacy Hadoop service. When using HiveServer2, use HIVESERVER2_HOSTNAME to specify the server host. When using WebHCat, use WEBSERVICE_HOSTNAME to specify the server host.
If you need to use WebHCat you must also set the HCatalogConnectorUseHiveServer2 configuration parameter to 0. See Apache Hadoop parameters.
After creating the schema, you can change many (but not all) parameters using ALTER HCATALOG SCHEMA.
Syntax
CREATE HCATALOG SCHEMA [IF NOT EXISTS] schemaName
[AUTHORIZATION user-id]
[WITH [param=value [,...] ] ]
Arguments
Argument | Description |
---|---|
[IF NOT EXISTS] |
If given, the statement exits without an error when the schema named in schemaName already exists. |
schemaName |
The name of the schema to create in the Vertica catalog. The tables in the Hive database will be available through this schema. |
AUTHORIZATION user-id |
The name of a Vertica account to own the schema being created. This parameter is ignored if Kerberos authentication is being used; in that case the current vsql user is used. |
Parameters
Parameter | Description |
---|---|
HOSTNAME |
The hostname, IP address, or URI of the database server that stores the Hive data warehouse's metastore information. If you specify this parameter and do not also specify PORT, this value must be a URI. If the Hive metastore supports High Availability, you can specify a comma-separated list of URIs for this value. If this value is not specified, hive-site.xml must be available. |
PORT |
The port number on which the metastore database is running. If you specify this parameter, you must also specify HOSTNAME and it must be a name or IP address (not a URI). |
HIVESERVER2_HOSTNAME |
The hostname or IP address of the HiveServer2 service. This parameter is optional if in hive-site.xml you set hive.server2.thrift.bind.host to a valid host or enable dynamic service discovery with hive.server2.support.dynamic.service.discovery. This parameter is ignored if you are using WebHCat. |
WEBSERVICE_HOSTNAME |
The hostname or IP address of the WebHCat service, if using WebHCat instead of HiveServer2. If this value is not specified, webhcat-site.xml must be available. |
WEBSERVICE_PORT |
The port number on which the WebHCat service is running, if using WebHCat instead of HiveServer2. If this value is not specified, webhcat-site.xml must be available. |
WEBHDFS_ADDRESS |
The host and port ("host:port") for the WebHDFS service. This parameter is used only for reading ORC and Parquet files. If this value is not set, hdfs-site.xml must be available to read these file types through the HCatalog Connector. |
HCATALOG_SCHEMA |
The name of the Hive schema or database that the Vertica schema is being mapped to. The default is schemaName . |
CUSTOM_PARTITIONS |
Whether the Hive schema uses custom partition locations ('YES' or 'NO'). If the schema uses custom partition locations, then Vertica queries Hive to get those locations when executing queries. These additional Hive queries can be expensive, so use this parameter only if you need to. The default is 'NO' (disabled). For more information, see Using Partitioned Data. |
HCATALOG_USER |
The username of the HCatalog user to use when making calls to the HiveServer2 or WebHCat server. The default is the current database user. |
HCATALOG_CONNECTION_TIMEOUT |
The number of seconds the HCatalog Connector waits for a successful connection to the HiveServer or WebHCat server. A value of 0 means wait indefinitely. |
HCATALOG_SLOW_TRANSFER_LIMIT |
The lowest data transfer rate (in bytes per second) from the HiveServer2 or WebHCat server that the HCatalog Connector accepts. See HCATALOG_SLOW_TRANSFER_TIME for details. |
HCATALOG_SLOW_TRANSFER_TIME |
The number of seconds the HCatalog Connector waits before enforcing the data transfer rate lower limit. After this time has passed, the HCatalog Connector tests whether the data transfer rate is at least as fast as the value set in HCATALOG_SLOW_TRANSFER_LIMIT. If it is not, then the HCatalog Connector breaks the connection and terminates the query. |
SSL_CONFIG |
The path of the Hadoop ssl-client.xml configuration file. This parameter is required if you are using HiveServer2 and it uses SSL wire encryption. This parameter is ignored if you are using WebHCat. |
The default values for HCATALOG_CONNECTION_TIMEOUT, HCATALOG_SLOW_TRANSFER_LIMIT, and HCATALOG_SLOW_TRANSFER_TIME are set by the database configuration parameters HCatConnectionTimeout, HCatSlowTransferLimit, and HCatSlowTransferTime. See Apache Hadoop parameters for more information.
Configuration files
The HCatalog Connector uses the following values from the Hadoop configuration files if you do not override them when creating the schema.
File | Properties |
---|---|
hive-site.xml |
hive.server2.thrift.bind.host (used for HIVESERVER2_HOSTNAME) |
hive.server2.thrift.port | |
hive.server2.transport.mode | |
hive.server2.authentication | |
hive.server2.authentication.kerberos.principal | |
hive.server2.support.dynamic.service.discovery | |
hive.zookeeper.quorum (used as HIVESERVER2_HOSTNAME if dynamic service discovery is enabled) | |
hive.zookeeper.client.port | |
hive.server2.zookeeper.namespace | |
hive.metastore.uris (used for HOSTNAME and PORT) | |
ssl-client.xml |
ssl.client.truststore.location |
ssl.client.truststore.password |
Privileges
The user must be a superuser or be granted all permissions on the database to use this statement.
The user also requires access to Hive data in one of the following ways:
- Have USAGE permissions on hcatalog_schema, if Hive does not use an authorization service (Sentry or Ranger) to manage access.
- Have permission through an authorization service, if Hive uses one to manage access. In this case, you must either set EnableHCatImpersonation to 0, to access data as the Vertica principal, or grant users access to the HDFS data. For Sentry, you can use ACL synchronization to manage HDFS access.
- Be the dbadmin user, with or without an authorization service.
Examples
The following example shows how to use CREATE HCATALOG SCHEMA to define a new schema for tables stored in a Hive database and then query the system tables that contain information about those tables:
=> CREATE HCATALOG SCHEMA hcat WITH HOSTNAME='hcathost' PORT=9083
HCATALOG_SCHEMA='default' HIVESERVER2_HOSTNAME='hs.example.com'
SSL_CONFIG='/etc/hadoop/conf/ssl-client.xml' HCATALOG_USER='admin';
CREATE SCHEMA
=> \x
Expanded display is on.
=> SELECT * FROM v_catalog.hcatalog_schemata;
-[ RECORD 1 ]----------------+-------------------------------------------
schema_id | 45035996273748224
schema_name | hcat
schema_owner_id | 45035996273704962
schema_owner | admin
create_time | 2017-12-05 14:43:03.353404-05
hostname | hcathost
port | -1
hiveserver2_hostname | hs.example.com
webservice_hostname |
webservice_port | 50111
webhdfs_address | hs.example.com:50070
hcatalog_schema_name | default
ssl_config | /etc/hadoop/conf/ssl-client.xml
hcatalog_user_name | admin
hcatalog_connection_timeout | -1
hcatalog_slow_transfer_limit | -1
hcatalog_slow_transfer_time | -1
custom_partitions | f
=> SELECT * FROM v_catalog.hcatalog_table_list;
-[ RECORD 1 ]------+------------------
table_schema_id | 45035996273748224
table_schema | hcat
hcatalog_schema | default
table_name | nation
hcatalog_user_name | admin
-[ RECORD 2 ]------+------------------
table_schema_id | 45035996273748224
table_schema | hcat
hcatalog_schema | default
table_name | raw
hcatalog_user_name | admin
-[ RECORD 3 ]------+------------------
table_schema_id | 45035996273748224
table_schema | hcat
hcatalog_schema | default
table_name | raw_rcfile
hcatalog_user_name | admin
-[ RECORD 4 ]------+------------------
table_schema_id | 45035996273748224
table_schema | hcat
hcatalog_schema | default
table_name | raw_sequence
hcatalog_user_name | admin
The following example shows how to specify more than one metastore host.
=> CREATE HCATALOG SCHEMA hcat
WITH HOSTNAME='thrift://node1.example.com:9083,thrift://node2.example.com:9083';
The following example shows how to include custom partition locations:
=> CREATE HCATALOG SCHEMA hcat WITH HCATALOG_SCHEMA='default'
HIVESERVER2_HOSTNAME='hs.example.com'
CUSTOM_PARTITIONS='yes';
12 - CREATE KEY
Creates a private key.
Syntax
CREATE [TEMP[ORARY]] KEY name
{ 'AES' [ PASSWORD 'password' ] | 'RSA' }
{LENGTH length | AS key_text}
Parameters
TEMPORARY
- Create with session scope. The key is stored in memory and is valid only for the current session.
name
- The name of the key.
password
- Password for the key.
length
- Size of the key in bits. Example: 2048
key_text
- The contents of the key to import. Example:
-----BEGIN RSA PRIVATE KEY-----...ABCD1234...-----END RSA PRIVATE KEY-----
Privileges
Superuser
Examples
See Generating TLS certificates and keys.
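For instance, a minimal sketch that instantiates the syntax shown above (the key name is illustrative):
=> CREATE KEY my_rsa_key 'RSA' LENGTH 2048;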
See also
13 - CREATE LIBRARY
Loads a library containing user-defined extensions (UDxs) into the Vertica catalog. Vertica automatically distributes copies of the library file and supporting libraries to all cluster nodes.
Because libraries are added to the database catalog, they persist across database restarts.
After loading a library in the catalog, you can use statements such as CREATE FUNCTION to define the extensions contained in the library. See Developing user-defined extensions (UDxs) for details.
Syntax
CREATE [OR REPLACE] LIBRARY
[[database.]schema.]name
AS 'path'
[ DEPENDS 'depends-path' ]
[ LANGUAGE 'language' ]
Arguments
OR REPLACE
- If a library with the same name exists, replace it. UDxs defined in the catalog that reference the updated library automatically start using the new library file.
If you do not use this directive and the library already exists, the CREATE statement returns with an error.
[database.]schema
- Database and schema. The default schema is public. If you specify a database, it must be the current database.
name
- Name of the library to create. This is the name used when creating functions in the library (see Creating UDx Functions). While not required, it is good practice to match the file name.
AS
path
- Path of the library to load, either an absolute path on the initiator node file system or a URI for another supported file system or object store.
DEPENDS 'depends-path'
- Files or libraries on which this library depends, one or more files or directories on the initiator node file system or other supported file systems or object stores. For a directory, end the path entry with a slash (/), optionally followed by a wildcard (*). To specify more than one file, separate entries with colons (:).
If any path entry contains colons, such as a URI, place brackets around the entire DEPENDS path and use double quotes for the individual path elements, as in the following example:
DEPENDS '["s3://mybucket/gson-2.3.1.jar"]'
To specify libraries with multiple directory levels, see Multi-level Library Dependencies.
DEPENDS has no effect for libraries written in R. R packages must be installed locally on each node, including external dependencies.
Important
The performance of CREATE LIBRARY can degrade in Eon Mode, in proportion to the number and depth of dependencies specified by the DEPENDS clause.
If a Java library depends on native libraries (SO files), use DEPENDS to specify the path and call System.loadLibrary() in your UDx to load the native libraries from that path.
LANGUAGE 'language'
- The programming language of the functions in the library, one of:
- C++ (default)
- Python
- Java
- R
Privileges
Superuser, or UDXDEVELOPER and CREATE on the schema. Non-superusers must explicitly enable the UDXDEVELOPER role, as in the following example:
=> SET ROLE UDXDEVELOPER;
SET
-- Not required, but you can confirm the role as follows:
=> SHOW ENABLED ROLES;
name | setting
---------------+--------------
enabled roles | udxdeveloper
(1 row)
=> CREATE LIBRARY MyLib AS '/home/dbadmin/my_lib.so';
CREATE LIBRARY
-- Create functions...
-- UDXDEVELOPER also grants DROP (replace):
=> CREATE OR REPLACE LIBRARY MyLib AS '/home/dbadmin/my_lib.so';
Requirements
- Vertica makes its own copies of the library files. Later modification or deletion of the original files specified in the statement does not affect the library defined in the catalog. To update the library, use ALTER LIBRARY.
- Loading a library does not guarantee that it functions correctly. CREATE LIBRARY performs some basic checks on the library file to verify it is compatible with Vertica. The statement fails if it detects that the library was not correctly compiled or it finds other basic incompatibilities. However, CREATE LIBRARY cannot detect many other issues in shared libraries.
Multi-level library dependencies
If a DEPENDS clause specifies a library with multiple directory levels, Vertica follows the library path to include all subdirectories of that library. For example, the following CREATE LIBRARY statement enables the UDx library mylib
to import all Python packages and modules that it finds in subdirectories of site-packages
:
=> CREATE LIBRARY mylib AS '/path/to/python_udx' DEPENDS '/path/to/python/site-packages' LANGUAGE 'Python';
Important
DEPENDS can specify Java library dependencies that are up to 100 levels deep.
Examples
Load a library in the home directory of the dbadmin account:
=> CREATE LIBRARY MyFunctions AS '/home/dbadmin/my_functions.so';
Load a library located in the directory where you started vsql
:
=> \set libfile '\''`pwd`'/MyOtherFunctions.so\'';
=> CREATE LIBRARY MyOtherFunctions AS :libfile;
Load a library from the cloud:
=> CREATE LIBRARY SomeFunctions AS 'S3://mybucket/extensions.so';
Load a library that depends on multiple JAR files in the same directory:
=> CREATE LIBRARY DeleteVowelsLib AS '/home/dbadmin/JavaLib.jar'
DEPENDS '/home/dbadmin/mylibs/*' LANGUAGE 'Java';
Load a library with multiple explicit dependencies:
=> CREATE LIBRARY mylib AS '/path/to/java_udx'
DEPENDS '/path/to/jars/this.jar:/path/to/jars/that.jar' LANGUAGE 'Java';
Load a library with dependencies in the cloud:
=> CREATE LIBRARY s3lib AS 's3://mybucket/UdlLib.jar'
DEPENDS '["s3://mybucket/gson-2.3.1.jar"]' LANGUAGE 'Java';
14 - CREATE LOAD BALANCE GROUP
Creates a group of network addresses that can be targeted by a load balancing routing rule. You create a group either using a list of network addresses, or basing it on one or more fault groups or subclusters.
Note
You cannot add multiple network addresses for one node to the same load balancing group.
Syntax
CREATE LOAD BALANCE GROUP group_name WITH {
ADDRESS address[,...]
| FAULT GROUP fault_group[,...] FILTER 'IP_range'
| SUBCLUSTER subcluster[,...] FILTER 'IP_range'
}
[ POLICY 'policy_setting' ]
Parameters
group_name
- Name of the group to create. You use this name later when defining load balancing rules.
address [,...]
- Comma-delimited list of network addresses you created earlier.
fault_group [,...]
- Comma-delimited list of fault groups to use as the basis of the load balance group.
Note
Before you create your load balance group from a fault group, you must create network addresses on the nodes you want in your load balance group. Load balance groups only work with the network addresses you define on nodes, rather than IP addresses. See CREATE NETWORK ADDRESS.
subcluster [,...]
- Comma-delimited list of subclusters to use as the basis of the load balance group.
Note
As with fault groups, you must create network addresses on the nodes in the subcluster you want to be part of the load balance group.
IP_range
- Range of IP addresses in CIDR notation to include in the load balance group from the fault groups or subclusters. This range can be either IPv4 or IPv6. Only nodes that have a network address with an IP address that falls within this range are added to the load balancing group.
policy_setting
- Determines how the initially-contacted node chooses a target from the group, one of the following:
- ROUNDROBIN (default): Rotates among the available members of the load balancing group. The initially-contacted node keeps track of which node it chose last time, and chooses the next one in the cluster.
Note
Each node in the cluster maintains its own round-robin pointer that indicates which node it should pick next for each load-balancing group. Therefore, if clients connect to different initial nodes, they may be redirected to the same node.
- RANDOM: Chooses an available node from the group randomly.
- NONE: Disables load balancing.
Privileges
Superuser
Examples
The following statement demonstrates creating a load balance group that contains several network addresses:
=> CREATE NETWORK ADDRESS addr01 ON v_vmart_node0001 WITH '10.20.110.21';
CREATE NETWORK ADDRESS
=> CREATE NETWORK ADDRESS addr02 ON v_vmart_node0002 WITH '10.20.110.22';
CREATE NETWORK ADDRESS
=> CREATE NETWORK ADDRESS addr03 on v_vmart_node0003 WITH '10.20.110.23';
CREATE NETWORK ADDRESS
=> CREATE NETWORK ADDRESS addr04 on v_vmart_node0004 WITH '10.20.110.24';
CREATE NETWORK ADDRESS
=> CREATE LOAD BALANCE GROUP group_1 WITH ADDRESS addr01, addr02;
CREATE LOAD BALANCE GROUP
=> CREATE LOAD BALANCE GROUP group_2 WITH ADDRESS addr03, addr04;
CREATE LOAD BALANCE GROUP
=> SELECT * FROM LOAD_BALANCE_GROUPS;
name | policy | filter | type | object_name
------------+------------+-----------------+-----------------------+-------------
group_1 | ROUNDROBIN | | Network Address Group | addr01
group_1 | ROUNDROBIN | | Network Address Group | addr02
group_2 | ROUNDROBIN | | Network Address Group | addr03
group_2 | ROUNDROBIN | | Network Address Group | addr04
(4 rows)
This example demonstrates creating a load balancing group using a fault group:
=> CREATE FAULT GROUP fault_1;
CREATE FAULT GROUP
=> ALTER FAULT GROUP fault_1 ADD NODE v_vmart_node0001;
ALTER FAULT GROUP
=> ALTER FAULT GROUP fault_1 ADD NODE v_vmart_node0002;
ALTER FAULT GROUP
=> ALTER FAULT GROUP fault_1 ADD NODE v_vmart_node0003;
ALTER FAULT GROUP
=> ALTER FAULT GROUP fault_1 ADD NODE v_vmart_node0004;
ALTER FAULT GROUP
=> SELECT node_name,node_address,node_address_family,export_address
FROM v_catalog.nodes;
node_name | node_address | node_address_family | export_address
------------------+--------------+---------------------+----------------
v_vmart_node0001 | 10.20.110.21 | ipv4 | 10.20.110.21
v_vmart_node0002 | 10.20.110.22 | ipv4 | 10.20.110.22
v_vmart_node0003 | 10.20.110.23 | ipv4 | 10.20.110.23
v_vmart_node0004 | 10.20.110.24 | ipv4 | 10.20.110.24
(4 rows)
=> CREATE LOAD BALANCE GROUP group_all WITH FAULT GROUP fault_1 FILTER
'0.0.0.0/0';
CREATE LOAD BALANCE GROUP
=> CREATE LOAD BALANCE GROUP group_some WITH FAULT GROUP fault_1 FILTER
'10.20.110.21/30';
CREATE LOAD BALANCE GROUP
=> SELECT * FROM LOAD_BALANCE_GROUPS;
name | policy | filter | type | object_name
----------------+------------+-----------------+-----------------------+-------------
group_all | ROUNDROBIN | 0.0.0.0/0 | Fault Group | fault_1
group_some | ROUNDROBIN | 10.20.110.21/30 | Fault Group | fault_1
(2 rows)
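A subcluster-based group follows the same pattern. The following sketch assumes an Eon Mode subcluster named analytics whose nodes already have network addresses defined:
=> CREATE LOAD BALANCE GROUP group_analytics WITH SUBCLUSTER analytics
   FILTER '0.0.0.0/0' POLICY 'ROUNDROBIN';
CREATE LOAD BALANCE GROUP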
See also
15 - CREATE LOCAL TEMPORARY VIEW
Creates or replaces a local temporary view. Views are read only, so they do not support insert, update, delete, or copy operations. Local temporary views are session-scoped, so they are visible only to their creator in the current session. Vertica drops the view when the session ends.
Note
Vertica does not support global temporary views.
Syntax
CREATE [OR REPLACE] LOCAL TEMP[ORARY] VIEW view [ (column[,...] ) ] AS query
Parameters
OR REPLACE
- Specifies to overwrite the existing view view. If you omit this option and view already exists, CREATE VIEW returns an error.
view
- Identifies the view to create, where view conforms to conventions described in Identifiers. It must also be unique among all names of sequences, tables, projections, views, and models within the same schema.
column [,...]
- List of up to 9800 names to use as view column names. Vertica maps view column names to query columns according to the order of their respective lists. By default, the view uses column names as they are specified in the query.
AS query
- A SELECT statement that the temporary view executes. The SELECT statement can reference tables, temporary tables, and other views.
Privileges
See Creating views.
Examples
The following CREATE LOCAL TEMPORARY VIEW
statement creates the temporary view myview
. This view sums all individual incomes of customers listed in the store.store_sales_fact
table, and groups results by state:
=> CREATE LOCAL TEMP VIEW myview AS
SELECT SUM(annual_income), customer_state FROM public.customer_dimension
WHERE customer_key IN (SELECT customer_key FROM store.store_sales_fact)
GROUP BY customer_state
ORDER BY customer_state ASC;
The following example uses the temporary view myview
with a WHERE
clause that limits the results to combined salaries greater than $2 billion:
=> SELECT * FROM myview WHERE SUM > 2000000000;
SUM | customer_state
-------------+----------------
2723441590 | AZ
29253817091 | CA
4907216137 | CO
3769455689 | CT
3330524215 | FL
4581840709 | IL
3310667307 | IN
2793284639 | MA
5225333668 | MI
2128169759 | NV
2806150503 | PA
2832710696 | TN
14215397659 | TX
2642551509 | UT
(14 rows)
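To rename the view columns, supply the optional column list. The following sketch reuses the query above; the column names are illustrative:
=> CREATE OR REPLACE LOCAL TEMP VIEW myview (total_income, state) AS
    SELECT SUM(annual_income), customer_state FROM public.customer_dimension
    WHERE customer_key IN (SELECT customer_key FROM store.store_sales_fact)
    GROUP BY customer_state;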
See also
16 - CREATE LOCATION
Creates a storage location where Vertica can store data. After you create the location, you create storage policies that assign the storage location to the database objects that will store data in the location.
Caution
While no technical issue prevents you from using CREATE LOCATION to add one or more Network File System (NFS) storage locations, Vertica does not support NFS data or catalog storage except for MapR mount points. You will be unable to run queries against any other NFS data. When creating locations on MapR file systems, you must specify ALL NODES SHARED.
If you use HDFS storage locations, the HDFS data must be available when you start Vertica. Your HDFS cluster must be operational, and the ROS files must be present. If you moved data files, or they are corrupted, or your HDFS cluster is not responsive, Vertica cannot start.
Syntax
CREATE LOCATION 'path'
[NODE 'node' | ALL NODES]
[SHARED]
[USAGE 'usage']
[LABEL 'label']
[LIMIT 'size']
Arguments
path
- Where to store this location's data. The type of file system on which the location is based determines the path format:
- Linux: Absolute path to the directory where Vertica can write the storage location's data.
- Shared file systems: See the URL specifications in HDFS file system, S3 object store, Google Cloud Storage (GCS) object store, and Azure Blob Storage object store. HDFS storage locations have additional requirements.
ALL NODES | NODE 'node'
- The node or nodes on which the storage location is defined, one of the following:
- ALL NODES (default): Create the storage location on each node. If SHARED is also specified, create the storage location once for use by all nodes.
- NODE 'node': Create the storage location on a single node, where node is the name of the node in the NODES system table. You cannot use this option with SHARED.
SHARED
- Indicates the location set by path is shared (used by all nodes) rather than local to each node. You cannot specify individual nodes with SHARED; you must use ALL NODES.
Most remote file systems such as HDFS and S3 are shared. For these file systems, the path argument represents a single location in the remote file system where all nodes store data. If using a remote file system, you must specify SHARED, even for one-node clusters.
Note
If path is set to S3 communal storage, SHARED is always implied and can be omitted.
Deprecated
SHARED DATA and SHARED DATA,TEMP storage locations are deprecated.
USAGE 'usage'
- The type of data the storage location can hold, where usage is one of the following:
- DATA,TEMP (default): The storage location can store persistent and temporary DML-generated data, and data for temporary tables.
- TEMP: A path-specified location to store DML-generated temporary data. If path is set to S3, then this location is used only when the RemoteStorageForTemp configuration parameter is set to 1, and TEMP must be qualified with ALL NODES SHARED. For details, see S3 Storage of Temporary Data.
- DATA: The storage location can only store persistent data.
- USER: Users with READ and WRITE privileges can access data and external tables of this storage location.
- DEPOT: The storage location is used in Eon Mode to store the depot. Only create DEPOT storage locations on local Linux file systems.
Vertica allows a single DEPOT storage location per node. If you want to move your depot to a different location (on a different file system, for example), you must first drop the old depot storage location, then create the new location.
LABEL 'label'
- A label for the storage location. You use this name later when assigning the storage location to data objects.
Important
You must supply a label for depot storage locations.
LIMIT 'size'
- Valid only if the storage location usage type is set to DEPOT. Specifies the maximum amount of disk space that the depot can allocate from the storage location's file system.
You can specify size in two ways:
- integer%: Percentage of storage location disk size.
- integer{K|M|G|T}: Amount of storage location disk size in kilobytes, megabytes, gigabytes, or terabytes.
Important
The depot size cannot exceed 80 percent of the file system disk space where the depot is stored. If you specify a value that is too large, Vertica issues a warning and automatically changes the value to 80 percent of the file system size.
If you do not specify a limit, it is set to 60 percent.
Privileges
Superuser
File system access
The Vertica process must have read and write permissions to the location where data is to be stored. Each file system has its own requirements:
File system | Requirements |
---|---|
Linux | Database superuser account (usually named dbadmin) must have full read and write access to the directory in the path argument. |
HDFS without Kerberos | A Hadoop user whose username matches the Vertica database administrator username (usually dbadmin) must have read and write access to the HDFS directory specified in the path argument. The UseServerIdentityOverUserIdentity configuration parameter must be set to true in the user session; otherwise Vertica tries to use the identity associated with the logged-in user. |
HDFS with Kerberos | A Hadoop user whose username matches the principal in the keytab file on each Vertica node must have read and write access to the HDFS directory stored in the path argument. This is not the same as the database administrator username. The UseServerIdentityOverUserIdentity configuration parameter must be set to true in the user session; otherwise Vertica tries to use the Kerberos principal associated with the logged-in user. |
Object stores (S3, GCS, Azure) | Database-level credentials must be specified and provide full read and write access to the location in the path argument. If session-level credentials are specified they are used, directly overriding the use of the storage location. |
Examples
Create a storage location in the local Linux file system for temporary data storage:
=> CREATE LOCATION '/home/dbadmin/testloc' USAGE 'TEMP' LABEL 'tempfiles';
Create a storage location on HDFS. The HDFS cluster does not use Kerberos:
=> CREATE LOCATION 'hdfs://hadoopNS/vertica/colddata' ALL NODES SHARED
USAGE 'data' LABEL 'coldstorage';
Create the same storage location, but on a Hadoop cluster that uses Kerberos. Note the output that reports the principal being used:
=> CREATE LOCATION 'hdfs://hadoopNS/vertica/colddata' ALL NODES SHARED
USAGE 'data' LABEL 'coldstorage';
NOTICE 0: Performing HDFS operations using kerberos principal [vertica/hadoop.example.com]
CREATE LOCATION
Create a location for user data, grant access to it, and use it to create an external table:
=> CREATE LOCATION '/tmp' ALL NODES USAGE 'user';
CREATE LOCATION
=> GRANT ALL ON LOCATION '/tmp' to Bob;
GRANT PRIVILEGE
=> CREATE EXTERNAL TABLE ext1 (x integer) AS COPY FROM '/tmp/data/ext1.dat' DELIMITER ',';
CREATE TABLE
Create a user storage location on S3 and a role, so that users without their own S3 credentials can read data from S3 using the server credential:
--- set database-level credential (once):
=> ALTER DATABASE DEFAULT SET AWSAuth = 'myaccesskeyid123456:mysecretaccesskey123456789012345678901234';
=> CREATE LOCATION 's3://datalake' SHARED USAGE 'USER' LABEL 's3user';
=> CREATE ROLE ExtUsers;
--- Assign users to this role using GRANT (Role).
=> GRANT READ ON LOCATION 's3://datalake' TO ExtUsers;
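In Eon Mode, a depot storage location combines the DEPOT usage type with the required label and an optional limit. A sketch, with an illustrative path:
=> CREATE LOCATION '/home/dbadmin/depot' ALL NODES USAGE 'DEPOT'
   LABEL 'depot_location' LIMIT '60%';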
See also
17 - CREATE NETWORK ADDRESS
Creates a network address that can be used as part of a connection load balancing policy. A network address creates a name in the Vertica catalog for an IP address and port number associated with a node. Nodes can have multiple network addresses, up to one for each IP address they have on the network.
Syntax
CREATE NETWORK ADDRESS name ON node WITH 'ip-address' [PORT port-number] [ENABLED | DISABLED]
Parameters
name
- The name of the new network address. Use this name when creating connection load balancing groups.
node
- The name of the node on which to create the network address. This should be the name of the node as it appears in the node_name column of the system table NODES.
ip-address
- The IPv4 or IPv6 address on the node to associate with the network address.
Note
Vertica does not verify that the IP address you supply in this parameter is actually associated with the specified node. Be sure that the IP address actually belongs to the node; otherwise, your load balancing policy is liable to send a client connection to the wrong node, or to a non-Vertica host. Vertica rejects IP addresses that are invalid for a node. For example, it checks whether the IP address falls in the loopback address range of 127.0.0.0/8. If it finds that the IP address is invalid, CREATE NETWORK ADDRESS returns an error.
PORT port-number
- Sets the port number for the network address. You must supply a network address when altering the port number.
ENABLED | DISABLED
- Enables or disables the network address.
Privileges
Superuser
Examples
Create three network addresses, one for each node in a three-node cluster:
=> SELECT node_name,export_address from v_catalog.nodes;
node_name | export_address
---------------------+----------------
v_vmart_br_node0001 | 10.20.100.62
v_vmart_br_node0002 | 10.20.100.63
v_vmart_br_node0003 | 10.20.100.64
(3 rows)
=> CREATE NETWORK ADDRESS node01 ON v_vmart_br_node0001 WITH '10.20.100.62';
CREATE NETWORK ADDRESS
=> CREATE NETWORK ADDRESS node02 ON v_vmart_br_node0002 WITH '10.20.100.63';
CREATE NETWORK ADDRESS
=> CREATE NETWORK ADDRESS node03 ON v_vmart_br_node0003 WITH '10.20.100.64';
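To create an address on a non-default port, include the optional PORT and ENABLED clauses. A sketch with illustrative values:
=> CREATE NETWORK ADDRESS node01_alt ON v_vmart_br_node0001 WITH '10.20.100.62' PORT 5434 ENABLED;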
See also
18 - CREATE NETWORK INTERFACE
Identifies a network interface to which a node belongs.
Use this statement when you want to configure import/export operations from individual nodes to other Vertica clusters. By default, when you install Vertica, it creates interfaces for all connected networks. You would only need CREATE NETWORK INTERFACE in situations where the network topology has changed since you installed Vertica.
Note
Do not confuse this statement with CREATE NETWORK ADDRESS, which is used to identify network addresses for connection load balancing (see Connection load balancing policies).
Syntax
CREATE NETWORK INTERFACE network-interface-name ON node-name [WITH] 'node-IP-address' [PORT port-number] [ENABLED | DISABLED]
Parameters
network-interface-name
- The name you assign to the network interface, where network-interface-name conforms to conventions described in Identifiers.
node-name
- The name of the node.
node-IP-address
- The node's IP address, either a public or private IP address. For more information, see Using Public and Private IP Networks.
PORT port-number
- Sets the port number for the network interface. You must supply a network interface when altering the port number.
ENABLED | DISABLED
- Enables or disables the network interface.
Privileges
Superuser
Examples
Create a network interface:
=> CREATE NETWORK INTERFACE mynetwork ON v_vmart_node0001 WITH '123.4.5.6' PORT 456 ENABLED;
19 - CREATE NOTIFIER
Creates a push-based notifier to send event notifications and messages out of Vertica.
Syntax
CREATE NOTIFIER [ IF NOT EXISTS ] notifier-name ACTION 'notifier-type'
[ ENABLE | DISABLE ]
[ MAXPAYLOAD 'integer{K|M}' ]
MAXMEMORYSIZE 'integer{K|M|G|T}'
[ TLSMODE 'tls-mode' ]
[ CA BUNDLE bundle-name [ CERTIFICATE certificate-name ] ]
[ IDENTIFIED BY 'uuid' ]
[ [NO] CHECK COMMITTED ]
[ PARAMETERS 'adapter-params' ]
Parameters
IF NOT EXISTS
If an object with the same name exists, do not create it and proceed. If you omit this option and the object exists, Vertica generates a ROLLBACK error message. In both cases, the object is not created if it already exists.
The IF NOT EXISTS clause is useful for SQL scripts where you want to create an object if it does not already exist.
For related information, see ON_ERROR_STOP.
notifier-name
- This notifier's unique identifier.
ACTION 'notifier-type'
- String, the type of notifier, one of the following:
- A URL that identifies one or more target Kafka servers, in the following format:
kafka://kafka-server-ip-address:port-number
To enable failover when a Kafka server is unavailable, specify additional hosts in a comma-delimited list. For example:
kafka://192.0.2.0:9092,192.0.2.1:9092,192.0.2.2:9092
- syslog: Notifications are sent to syslog. To use notifiers of this type, you must set the SyslogEnabled parameter:
=> ALTER DATABASE DEFAULT SET SyslogEnabled = 1;
Events monitored by this notifier type are not logged to MONITORING_EVENTS or vertica.log.
ENABLE | DISABLE
- Specifies whether to enable or disable the notifier.
Default: ENABLE
MAXPAYLOAD 'integer{K|M}'
- The maximum size of the message, up to 10^9 bytes, specified in kilobytes or megabytes.
The following restrictions apply:
- MAXPAYLOAD cannot be greater than MAXMEMORYSIZE.
- If you configure syslog to send messages to a remote destination, ensure that MaxMessageSize (in /etc/rsyslog for rsyslog) is greater than or equal to MAXPAYLOAD.
Defaults:
- Kafka: 1M
- syslog: 1M
MAXMEMORYSIZE 'integer{K|M|G|T}'
- The maximum size of the internal notifier, up to 2 TB, specified in kilobytes, megabytes, gigabytes, or terabytes. MAXMEMORYSIZE must be greater than MAXPAYLOAD.
If the size of the message queue exceeds MAXMEMORYSIZE, the notifier drops excess messages.
TLSMODE 'tls-mode'
- Specifies the type of connection between the notifier and an endpoint, one of the following:
- disable (default): Plaintext connection.
- verify-ca: Encrypted connection, and the server's certificate is verified as being signed by a trusted CA.
If the notifier ACTION is 'syslog', this parameter has no effect; you must configure syslog for TLS to enable encryption for these Vertica syslog notifiers.
CA BUNDLE bundle-name
- Specifies a CA bundle. The certificates inside the bundle are used to validate the Kafka server's certificate if the TLSMODE requires it.
If a CA bundle is specified for a notifier that currently uses disable, which doesn't validate the Kafka server's certificate, the bundle goes unused when connecting to the Kafka server. This behavior persists unless the TLSMODE is changed to one that validates server certificates.
Changes to the contents of the CA bundle take effect either after the notifier is disabled and re-enabled or after the database restarts. However, changes to which CA bundle the notifier uses take effect immediately.
If the notifier ACTION is 'syslog', this parameter has no effect; you must configure syslog for TLS to enable encryption for these Vertica syslog notifiers.
CERTIFICATE certificate-name
- Specifies a client certificate for validation by the endpoint.
If the notifier ACTION is 'syslog', this parameter has no effect; you must configure syslog for TLS to enable encryption for these Vertica syslog notifiers.
IDENTIFIED BY 'uuid'
- Specifies the notifier's unique identifier. If set, all the messages published by this notifier have this attribute.
[NO] CHECK COMMITTED
- Specifies to wait for delivery confirmation before sending the next message in the queue.
Some messaging systems, like syslog, do not support delivery confirmation.
PARAMETERS 'adapter-params'
- Specifies one or more optional adapter parameters that are passed as a string to the adapter. Adapter parameters apply only to the adapter associated with the notifier.
For Kafka notifiers, refer to Kafka and Vertica configuration settings.
For syslog notifiers, specify the severity of the event with eventSeverity=severity, where severity is one of the following:
- 0: Emergency
- 1: Alert
- 2: Critical
- 3: Error
- 4: Warning
- 5: Notice
- 6: Informational
- 7: Debug
Most syslog implementations, by default, do not log events with a severity level of 7. You must configure syslog to record these types of events.
Privileges
Superuser
Encrypted notifiers for SASL_SSL Kafka configurations
Follow this procedure to create or alter notifiers for Kafka endpoints that use SASL_SSL. Note that you must repeat this procedure whenever you change the TLSMODE, certificates, or CA bundle for a given notifier.
-
Use CREATE or ALTER to disable the notifier while setting the TLSMODE, certificate, and CA bundle.
=> ALTER NOTIFIER encrypted_notifier DISABLE TLSMODE 'verify-ca' CA BUNDLE ca_bundle2;
-
ALTER the notifier and set the proper rdkafka adapter parameters for SASL_SSL.
=> ALTER NOTIFIER encrypted_notifier PARAMETERS 'sasl.username=user;sasl.password=password;sasl.mechanism=PLAIN;security.protocol=SASL_SSL';
-
Enable the notifier.
=> ALTER NOTIFIER encrypted_notifier ENABLE;
Examples
Kafka notifiers
Create a Kafka notifier:
=> CREATE NOTIFIER my_dc_notifier
ACTION 'kafka://172.16.20.10:9092'
MAXMEMORYSIZE '1G'
IDENTIFIED BY 'f8b0278a-3282-4e1a-9c86-e0f3f042a971'
NO CHECK COMMITTED;
Create a notifier with an adapter-specific parameter:
=> CREATE NOTIFIER my_notifier
ACTION 'kafka://127.0.0.1:9092'
MAXMEMORYSIZE '10M'
PARAMETERS 'queue.buffering.max.ms=1000';
Create a notifier that uses an encrypted connection and verifies the Kafka server's certificate with the provided CA bundle:
=> CREATE NOTIFIER encrypted_notifier
ACTION 'kafka://127.0.0.1:9092'
MAXMEMORYSIZE '10M'
TLSMODE 'verify-ca'
CA BUNDLE ca_bundle;
Syslog notifiers
The following example creates a notifier that writes a message to syslog when the Data collector (DC) component LoginFailures
updates:
-
Enable syslog notifiers for the current database:
=> ALTER DATABASE DEFAULT SET SyslogEnabled = 1;
-
Create and enable a syslog notifier
v_syslog_notifier
:=> CREATE NOTIFIER v_syslog_notifier ACTION 'syslog' ENABLE MAXMEMORYSIZE '10M' IDENTIFIED BY 'f8b0278a-3282-4e1a-9c86-e0f3f042a971' PARAMETERS 'eventSeverity = 5';
-
Configure the syslog notifier
v_syslog_notifier
for updates to theLoginFailures
DC component with SET_DATA_COLLECTOR_NOTIFY_POLICY:=> SELECT SET_DATA_COLLECTOR_NOTIFY_POLICY('LoginFailures','v_syslog_notifier', 'Login failed!', true);
This notifier writes the following message to syslog (default location:
/var/log/messages
) when a user fails to authenticate as the userBob
:Apr 25 16:04:58 vertica_host_01 vertica: Event Posted: Event Code:21 Event Id:0 Event Severity: Notice [5] PostedTimestamp: 2022-04-25 16:04:58.083063 ExpirationTimestamp: 2022-04-25 16:04:58.083063 EventCodeDescription: Notifier ProblemDescription: (Login failed!) { "_db":"VMart", "_schema":"v_internal", "_table":"dc_login_failures", "_uuid":"f8b0278a-3282-4e1a-9c86-e0f3f042a971", "authentication_method":"Reject", "client_authentication_name":"default: Reject", "client_hostname":"::1", "client_label":"", "client_os_user_name":"dbadmin", "client_pid":523418, "client_version":"", "database_name":"dbadmin", "effective_protocol":"3.8", "node_name":"v_vmart_node0001", "reason":"REJECT", "requested_protocol":"3.8", "ssl_client_fingerprint":"", "ssl_client_subject":"", "time":"2022-04-25 16:04:58.082568-05", "user_name":"Bob" }#012 DatabaseName: VMart Hostname: vertica_host_01
For details on syslog notifiers, see Configuring reporting for syslog.
See also
20 - CREATE PROCEDURE (external)
Enterprise Mode only
Adds an external procedure to Vertica. See External procedures for more information.
Syntax
CREATE PROCEDURE [ IF NOT EXISTS ]
[[database.]schema.]procedure( [ argument-list ] )
AS executable
LANGUAGE 'EXTERNAL'
USER OS-user
Parameters
IF NOT EXISTS
If an object with the same name exists, do not create it and proceed. If you omit this option and the object exists, Vertica generates a ROLLBACK error message. In both cases, the object is not created if it already exists.
The IF NOT EXISTS clause is useful for SQL scripts where you want to create an object if it does not already exist.
For related information, see ON_ERROR_STOP.
This option cannot be used with OR REPLACE.
[database.]schema
- Database and schema. The default schema is public. If you specify a database, it must be the current database.
procedure
- Specifies a name for the external procedure, where procedure conforms to conventions described in Identifiers.
argument-list
- A comma-delimited list of procedure arguments, where each argument is specified as follows:
[ argname ] argtype
- argname optionally provides a descriptive name for this argument.
- argtype must be one of the following data types supported by Vertica:
- BIGINT
- BOOLEAN
- DECIMAL
- DOUBLE PRECISION
- FLOAT
- FLOAT8
- INT
- INT8
- INTEGER
- MONEY
- NUMBER
- NUMERIC
- REAL
- SMALLINT
- TINYINT
- VARCHAR
executable
- The name of the executable program in the procedures directory, a string.
OS-user
- The owner of the file, a string. The owner:
- Cannot be root
- Must have execute privileges on executable
Privileges
Superuser
System security
-
The procedure file must be owned by the database administrator (OS account) or by a user in the same group as the administrator. The procedure file must also have the set UID attribute enabled, and allow read and execute permission for the group.
-
External procedures that you create with CREATE PROCEDURE (external) are always run with Linux dbadmin privileges. If a dbadmin or pseudosuperuser grants a non-dbadmin permission to run a procedure using GRANT (procedure), be aware that the non-dbadmin user runs the procedure with full Linux dbadmin privileges.
Examples
The following example shows how to create a procedure named helloplanet
for the procedure file helloplanet.sh
. This file accepts one VARCHAR argument.
Create the file:
#!/bin/bash
echo "hello planet argument: $1" >> /tmp/myprocedure.log
Create the procedure with the following SQL:
=> CREATE PROCEDURE helloplanet(arg1 varchar) AS 'helloplanet.sh' LANGUAGE 'external' USER 'dbadmin';
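Once the file is installed in the procedures directory, you would typically invoke the procedure like a function call; a sketch:
=> SELECT helloplanet('earth');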
See also
21 - CREATE PROCEDURE (stored)
Creates a stored procedure.
Syntax
CREATE [ OR REPLACE ] PROCEDURE [ IF NOT EXISTS ]
[[database.]schema.]procedure( [ parameter-list ] )
[ LANGUAGE 'language-name' ]
[ SECURITY { DEFINER | INVOKER } ]
AS $$ source $$;
Parameters
OR REPLACE
- If a procedure with the same name already exists, replace it. Users and roles with privileges on the original procedure retain these privileges on the new procedure.
This option cannot be used with IF NOT EXISTS.
IF NOT EXISTS
If an object with the same name exists, do not create it and proceed. If you omit this option and the object exists, Vertica generates a ROLLBACK error message. In both cases, the object is not created if it already exists.
The IF NOT EXISTS clause is useful for SQL scripts where you want to create an object if it does not already exist.
For related information, see ON_ERROR_STOP.
This option cannot be used with OR REPLACE.
[database.]schema
- Database and schema. The default schema is public. If you specify a database, it must be the current database.
procedure
- The name of the stored procedure, where procedure conforms to conventions described in Identifiers.
parameter-list
- A comma-delimited list of formal parameters, each specified as follows:
[ parameter-mode ] parameter-name parameter-type
- parameter-name: The name of the parameter.
- parameter-type: Any SQL data type, with the following exceptions:
- DECIMAL
- NUMERIC
- NUMBER
- MONEY
- UUID
- GEOGRAPHY
- GEOMETRY
- Complex types
language-name
- Specifies the language of the procedure source, one of the following (both options refer to PLvSQL; PLpgSQL is included to maintain compatibility with existing scripts):
- PLvSQL
- PLpgSQL
Default: PLvSQL
SECURITY { DEFINER | INVOKER }
- Determines whose privileges to use when the procedure is called, executing it as if the user were one of the following:
- DEFINER: User who defined the procedure
- INVOKER: User who called the procedure
A procedure with SECURITY DEFINER effectively executes the procedure as that user, so changes to the database appear to be performed by the procedure's definer rather than its caller.
Caution
Improper use of SECURITY DEFINER can lead to the confused deputy problem and introduce vulnerabilities into your system, such as SQL injection.
For more information, see Executing stored procedures.
source
- The procedure source code. For details, see Scope and structure.
Privileges
Non-superuser: CREATE on the procedure's schema
Examples
For more complex examples, see Stored procedures: use cases and examples
This procedure prints its arguments:
=> CREATE PROCEDURE raiseXY(IN x INT, y VARCHAR) LANGUAGE PLvSQL AS $$
BEGIN
RAISE NOTICE 'x = %', x;
RAISE NOTICE 'y = %', y;
-- some processing statements
END;
$$;
CALL raiseXY(3, 'some string');
NOTICE 2005: x = 3
NOTICE 2005: y = some string
For more information on RAISE NOTICE, see Errors and diagnostics.
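As a further sketch, the following procedure runs with its definer's privileges; the audit_log table is hypothetical:
=> CREATE PROCEDURE log_action(IN msg VARCHAR) LANGUAGE PLvSQL SECURITY DEFINER AS $$
BEGIN
    -- audit_log is a hypothetical table owned by the procedure's definer
    PERFORM INSERT INTO audit_log SELECT NOW(), msg;
END;
$$;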
See also
22 - CREATE PROFILE
Creates a profile that controls password requirements for users.
Syntax
CREATE PROFILE profile-name LIMIT [
PASSWORD_LIFE_TIME setting
PASSWORD_MIN_LIFE_TIME setting
PASSWORD_GRACE_TIME setting
FAILED_LOGIN_ATTEMPTS setting
PASSWORD_LOCK_TIME setting
PASSWORD_REUSE_MAX setting
PASSWORD_REUSE_TIME setting
PASSWORD_MAX_LENGTH setting
PASSWORD_MIN_LENGTH setting
PASSWORD_MIN_LETTERS setting
PASSWORD_MIN_UPPERCASE_LETTERS setting
PASSWORD_MIN_LOWERCASE_LETTERS setting
PASSWORD_MIN_DIGITS setting
PASSWORD_MIN_SYMBOLS setting
PASSWORD_MIN_CHAR_CHANGE setting ]
Parameters
Note
All parameters that are not explicitly set in a new profile are set to default, and inherit their settings from the default profile.
Name | Description |
---|---|
name |
The name of the profile to create, where name conforms to conventions described in Identifiers. To modify the default profile, set name to default. |
PASSWORD_LIFE_TIME |
Set to an integer value, one of the following:
After your password's lifetime and grace period expire, you must change your password on your next login, if you have not done so already. |
PASSWORD_MIN_LIFE_TIME |
Set to an integer value, one of the following:
|
PASSWORD_GRACE_TIME |
Set to an integer value, one of the following:
|
FAILED_LOGIN_ATTEMPTS |
Set to an integer value, one of the following:
|
PASSWORD_LOCK_TIME |
|
PASSWORD_REUSE_MAX |
Set to an integer value, one of the following:
|
PASSWORD_REUSE_TIME |
Set to an integer value, one of the following:
|
PASSWORD_MAX_LENGTH |
The maximum number of characters allowed in a password, one of the following:
|
PASSWORD_MIN_LENGTH |
The minimum number of characters required in a password, one of the following:
|
PASSWORD_MIN_LETTERS |
Minimum number of letters (a-z and A-Z) that must be in a password, one of the following:
|
PASSWORD_MIN_UPPERCASE_LETTERS |
Minimum number of uppercase letters (A-Z) that must be in a password, one of the following:
|
PASSWORD_MIN_LOWERCASE_LETTERS |
Minimum number of lowercase letters (a-z) that must be in a password, one of the following:
|
PASSWORD_MIN_DIGITS |
Minimum number of digits (0-9) that must be in a password, one of the following:
|
PASSWORD_MIN_SYMBOLS |
Minimum number of symbols—printable non-letter and non-digit characters such as $, #, @—that must be in a password, one of the following:
|
PASSWORD_MIN_CHAR_CHANGE |
Minimum number of characters that must be different from the previous password:
|
Privileges
Superuser
Profile settings and client authentication
The following profile settings affect client authentication methods, such as LDAP or GSS:
-
FAILED_LOGIN_ATTEMPTS
-
PASSWORD_LOCK_TIME
All other profile settings are used only by Vertica to manage its passwords.
Examples
=> CREATE PROFILE sample_profile LIMIT PASSWORD_MAX_LENGTH 20;
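Multiple limits can be combined in a single statement. A sketch with illustrative settings:
=> CREATE PROFILE strict_profile LIMIT FAILED_LOGIN_ATTEMPTS 3 PASSWORD_LOCK_TIME 1
   PASSWORD_MIN_LENGTH 12 PASSWORD_MIN_SYMBOLS 1;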
See also
23 - CREATE PROJECTION
Creates metadata for a projection in the Vertica catalog. Vertica supports four types of projections:
-
Standard projection: Stores collection of table data in a format that optimizes execution of certain queries on that table.
-
Live aggregate projection: Stores the grouped results of queries that invoke aggregate functions (such as SUM) on table columns.
-
Top-K projection: Stores the top k rows from partitions of selected rows.
-
UDTF projection: Stores newly-loaded data after it is transformed and/or aggregated by user-defined transformation functions (UDTFs).
Complex data types have additional restrictions when used within a projection:
-
Each projection must include at least one column that is a primitive type or native array.
-
An AS SELECT clause can use a complex-type column, but any other expression must be of a scalar type or native array.
-
The ORDER BY, PARTITION BY, and GROUP BY clauses cannot use complex types.
-
If a projection does not include an ORDER BY or segmentation clause, Vertica uses only the primitive columns from the select list to order or segment data.
-
Projection columns cannot be complex types returned from functions such as ARRAY_CAT.
-
TopK and UDTF projections do not support complex types.
23.1 - Encoding types
Vertica supports various encoding and compression types, specified by the following ENCODING
parameter arguments:
Note
Vertica supports the following encoding for numeric data types:
- Precision ≤ 18: AUTO, BLOCK_DICT, BLOCKDICT_COMP, COMMONDELTA_COMP, DELTAVAL, GCDDELTA, and RLE
- Precision > 18: AUTO, BLOCK_DICT, BLOCKDICT_COMP, RLE
You can set encoding types on a projection column when you create the projection. You can also change the encoding of one or more projection columns for a given table with ALTER TABLE...ALTER COLUMN.
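For example, the following sketch changes a column's encoding on an existing projection; the table, column, and projection names are illustrative:
=> ALTER TABLE trades ALTER COLUMN stock ENCODING RLE PROJECTIONS (trades_p1);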
AUTO (default)
AUTO encoding is ideal for sorted, many-valued columns such as primary keys. It is also suitable for general purpose applications for which no other encoding or compression scheme is applicable. Therefore, it serves as the default if no encoding/compression is specified.
Column data type | Default encoding type |
---|---|
BINARY/VARBINARY BOOLEAN CHAR/VARCHAR FLOAT |
Lempel-Ziv-Oberhumer-based (LZO) compression |
DATE/TIME/TIMESTAMP INTEGER INTERVAL |
Compression scheme based on the delta between consecutive column values. |
The CPU requirements for this type are relatively small. In the worst case, data might expand by eight percent (8%) for LZO and twenty percent (20%) for integer data.
BLOCK_DICT
For each block of storage, Vertica compiles distinct column values into a dictionary and then stores the dictionary and a list of indexes to represent the data block.
BLOCK_DICT is ideal for few-valued, unsorted columns where saving space is more important than encoding speed. Certain kinds of data, such as stock prices, are typically few-valued within a localized area after the data is sorted, such as by stock symbol and timestamp, and are good candidates for BLOCK_DICT. By contrast, long CHAR/VARCHAR columns are not good candidates for BLOCK_DICT encoding.
CHAR and VARCHAR columns that contain 0x00 or 0xFF characters should not be encoded with BLOCK_DICT. Also, BINARY/VARBINARY columns do not support BLOCK_DICT encoding.
BLOCK_DICT encoding requires significantly higher CPU usage than default encoding schemes. The maximum data expansion is eight percent (8%).
BLOCKDICT_COMP
This encoding type is similar to BLOCK_DICT except dictionary indexes are entropy coded. This encoding type requires significantly more CPU time to encode and decode and has a poorer worst-case performance. However, if the distribution of values is extremely skewed, using BLOCKDICT_COMP encoding can lead to space savings.
BZIP_COMP
BZIP_COMP encoding uses the bzip2 compression algorithm on the block contents. See bzip web site for more information. This algorithm results in higher compression than the automatic LZO and gzip encoding; however, it requires more CPU time to compress. This algorithm is best used on large string columns such as VARCHAR, VARBINARY, CHAR, and BINARY. Choose this encoding type when you are willing to trade slower load speeds for higher data compression.
COMMONDELTA_COMP
This compression scheme builds a dictionary of all deltas in the block and then stores indexes into the delta dictionary using entropy coding.
This scheme is ideal for sorted FLOAT and INTEGER-based (DATE/TIME/TIMESTAMP/INTERVAL) data columns with predictable sequences and only occasional sequence breaks, such as timestamps recorded at periodic intervals or primary keys. For example, the following sequence compresses well: 300, 600, 900, 1200, 1500, 600, 1200, 1800, 2400. The following sequence does not compress well: 1, 3, 6, 10, 15, 21, 28, 36, 45, 55.
If delta distribution is excellent, columns can be stored in less than one bit per row. However, this scheme is very CPU intensive. If you use this scheme on data with arbitrary deltas, it can cause significant data expansion.
DELTARANGE_COMP
This compression scheme is primarily used for floating-point data; it stores each value as a delta from the previous one.
This scheme is ideal for many-valued FLOAT columns that are sorted or confined to a range. Do not use this scheme for unsorted columns that contain NULL values, as the storage cost for representing a NULL value is high. This scheme has a high cost for both compression and decompression.
To determine if DELTARANGE_COMP is suitable for a particular set of data, compare it to other schemes. Be sure to use the same sort order as the projection, and select sample data that will be stored consecutively in the database.
DELTAVAL
For INTEGER and DATE/TIME/TIMESTAMP/INTERVAL columns, data is recorded as a difference from the smallest value in the data block. This encoding has no effect on other data types.
DELTAVAL is best used for many-valued, unsorted integer or integer-based columns. CPU requirements for this encoding type are minimal, and data never expands.
GCDDELTA
For INTEGER and DATE/TIME/TIMESTAMP/INTERVAL columns, and NUMERIC columns with 18 or fewer digits, data is recorded as the difference from the smallest value in the data block divided by the greatest common divisor (GCD) of all entries in the block. This encoding has no effect on other data types.
ENCODING GCDDELTA is best used for many-valued, unsorted, integer columns or integer-based columns, when the values are a multiple of a common factor. For example, timestamps are stored internally in microseconds, so data that is only precise to the millisecond are all multiples of 1000. The CPU requirements for decoding GCDDELTA encoding are minimal, and the data never expands, but GCDDELTA may take more encoding time than DELTAVAL.
GZIP_COMP
This encoding type uses the gzip compression algorithm. See gzip web site for more information. This algorithm results in better compression than the automatic LZO compression, but lower compression than BZIP_COMP. It requires more CPU time to compress than LZO but less CPU time than BZIP_COMP. This algorithm is best used on large string columns such as VARCHAR, VARBINARY, CHAR, and BINARY. Use this encoding when you want a better compression than LZO, but at less CPU time than bzip2.
RLE
RLE (run length encoding) replaces sequences (runs) of identical values with a single pair that contains the value and number of occurrences. Therefore, it is best used for low cardinality columns that are present in the ORDER BY clause of a projection.
The Vertica execution engine processes RLE encoding run-by-run and the Vertica optimizer gives it preference. Use it only when run length is large, such as when low-cardinality columns are sorted.
Zstandard compression
Vertica supports three ZSTD compression types:
-
ZSTD_COMP
provides high compression ratios. This encoding type has a higher compression than gzip. Use this when you want a better compression than gzip. For general use cases, use this or theZSTD_FAST_COMP
encoding type. -
ZSTD_FAST_COMP
uses the fastest compression level that the zstd library provides. It is the fastest encoding type of the zstd library, but takes up more space than the other two encoding types. For general use cases, use this or theZSTD_COMP
encoding type. -
ZSTD_HIGH_COMP
offers the best compression in the zstd library. It is slower than the other two encoding types. Use this type when you need the best compression, with slower CPU time.
23.2 - GROUPED clause
Enterprise Mode only
Groups two or more columns into a single disk file. This minimizes file I/O for workloads that:
-
Read a large percentage of the columns in a table.
-
Perform single row look-ups.
-
Query against many small columns.
-
Frequently update data in these columns.
If you have data that is always accessed together and is not used in predicates, you can increase query performance by grouping these columns. Once grouped, queries can no longer retrieve from disk all records for an individual column independently of the other columns within the group.
Note
RLE encoding is reduced when an RLE column is grouped with one or more non-RLE columns.
When grouping columns you can:
-
Group some of the columns:
(a, GROUPED(b, c), d)
-
Group all of the columns:
(GROUPED(a, b, c, d))
-
Create multiple groupings in the same projection:
(GROUPED(a, b), GROUPED(c, d))
Note
Vertica performs dynamic column grouping. For example, to provide better read and write efficiency for small loads, Vertica ignores any projection-defined column grouping (or lack thereof) and groups all columns together by default.
Grouping correlated columns
The following example shows how to group highly correlated columns bid
and ask
. The stock
column is stored separately.
=> CREATE TABLE trades (stock CHAR(5), bid INT, ask INT);
=> CREATE PROJECTION tradeproj (stock ENCODING RLE,
GROUPED(bid ENCODING DELTAVAL, ask))
AS (SELECT * FROM trades) KSAFE 1;
The following example shows how to create a projection that uses expressions in the column definition. The projection contains two integer columns a
and b
, and a third column product_value
that stores the product of a
and b
:
=> CREATE TABLE values (a INT, b INT);
=> CREATE PROJECTION product (a, b, product_value) AS
SELECT a, b, a*b FROM values ORDER BY a KSAFE;
23.3 - Hash segmentation clause
Specifies how to segment projection data for distribution across all cluster nodes. You can specify segmentation for a table and a projection. If a table definition specifies segmentation, Vertica uses it for that table's auto-projections.
It is strongly recommended that you use Vertica's built-in
HASH
function, which distributes data evenly across the cluster, and facilitates optimal query execution.
Syntax
SEGMENTED BY expression ALL NODES [ OFFSET offset ]
Parameters
SEGMENTED BY
expression
- A general SQL expression. Hash segmentation is the preferred method of segmentation. Vertica recommends using its built-in
HASH
function, whose arguments resolve to table columns. If you use an expression other thanHASH
, Vertica issues a warning.The segmentation expression should specify columns with a large number of unique data values and acceptable skew in their data distribution. In general, primary key columns that meet these criteria are good candidates for hash segmentation.
For details, see Expression Requirements below.
ALL NODES
- Automatically distributes data evenly across all nodes when the projection is created. Node ordering is fixed.
OFFSET
offset
- A zero-based offset that indicates on which node to start segmentation distribution.
This option is not valid for
CREATE TABLE
andCREATE TEMPORARY TABLE
.Important
If you create a projection for a table with the
OFFSET
option, be sure to create enough copies of each projection segment to satisfy system K-safety; otherwise, Vertica regards the projection as unsafe and cannot use it to query the table.You can ensure K-safety compliance when you create projections by combining
OFFSET
andKSAFE
options in theCREATE PROJECTION
statement. On executing this statement, Vertica automatically creates the necessary number of projection copies.
Expression requirements
A segmentation expression must specify table columns as they are defined in the source table. Projection column names are not supported.
The following restrictions apply to segmentation expressions:
-
All leaf expressions must be constants or column references to a column in the
CREATE PROJECTION
'sSELECT
list. -
The expression must return the same value over the life of the database.
-
Aggregate functions are not allowed.
-
The expression must return non-negative
INTEGER
values in the range0 <= x < 2
, and values are generally distributed uniformly over that range.
Note
If the expression produces a value outside the expected range—for example, a negative value—no error occurs, and the row is added to the projection's first segment.
Examples
The following CREATE PROJECTION
statement creates projection public.employee_dimension_super
. It specifies to include all columns in table public.employee_dimension
. The hash segmentation clause invokes the Vertica HASH
function to segment projection data on the column employee_key
; it also includes the ALL NODES
clause, which specifies to distribute projection data evenly across all nodes in the cluster:
=> CREATE PROJECTION public.employee_dimension_super
AS SELECT * FROM public.employee_dimension
ORDER BY employee_key
SEGMENTED BY hash(employee_key) ALL NODES;
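As noted for the OFFSET parameter above, combine OFFSET with KSAFE so the projection remains safe. A sketch based on the same table:
=> CREATE PROJECTION public.employee_dimension_offset
   AS SELECT * FROM public.employee_dimension
   ORDER BY employee_key
   SEGMENTED BY hash(employee_key) ALL NODES OFFSET 1 KSAFE 1;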
23.4 - Unsegmented clause
Specifies to distribute identical copies of table or projection data on all nodes across the cluster. Use this clause to facilitate distributed query execution on tables and projections that are too small to benefit from segmentation.
Vertica uses the same name to identify all instances of an unsegmented projection. For more information about projection name conventions, see Projection naming.
Syntax
UNSEGMENTED ALL NODES
Examples
This example creates an unsegmented projection for table store.store_dimension
:
=> CREATE PROJECTION store.store_dimension_proj (storekey, name, city, state)
AS SELECT store_key, store_name, store_city, store_state
FROM store.store_dimension
UNSEGMENTED ALL NODES;
CREATE PROJECTION
=> SELECT anchor_table_name anchor_table, projection_name, node_name
FROM PROJECTIONS WHERE projection_basename='store_dimension_proj';
anchor_table | projection_name | node_name
-----------------+----------------------+------------------
store_dimension | store_dimension_proj | v_vmart_node0001
store_dimension | store_dimension_proj | v_vmart_node0002
store_dimension | store_dimension_proj | v_vmart_node0003
(3 rows)
24 - CREATE RESOURCE POOL
Creates a custom resource pool and sets one or more resource pool parameters.
Syntax
CREATE RESOURCE POOL pool-name [ FOR subcluster-spec ] [ parameter-name setting ]...
Parameters
pool-name
- The name of the resource pool. Built-in pool names cannot be used for user-defined pools.
Note
If you specify a resource pool name with uppercase letters, Vertica converts them to lowercase letters.
FOR subcluster-spec
- Eon Mode only, specifies to associate this resource pool with a subcluster, where
subcluster-spec
is one of the following:SUBCLUSTER
subcluster-name
: Creates the resource pool for an existing subcluster. You cannot be connected to this subcluster; otherwise, Vertica returns an error.
: Creates the resource pool for the subcluster that you are connected to.
If omitted, the resource pool is created globally. Attempts to create a global resource pool with the same name as a subcluster-specific resource pool return an error.
parameter-name
- The parameter to set, listed below.
setting
The value to set on
parameter-name
. To reset this parameter to its default value, specify DEFAULT.
Note
Default values specified in this table pertain only to user-defined resource pools. For built-in pool default values, see Built-in resource pools configuration, or query system table RESOURCE_POOL_DEFAULTS.
CASCADE TO
Specifies a secondary resource pool for executing queries that exceed the
RUNTIMECAP
setting of their assigned resource pool:
CASCADE TO secondary-pool
CPUAFFINITYMODE
Specifies whether the resource pool has exclusive or shared use of the CPUs specified in
CPUAFFINITYSET
:
CPUAFFINITYMODE { SHARED | EXCLUSIVE | ANY }
-
SHARED
: Queries that run in this pool share its CPUAFFINITYSET
CPUs with other Vertica resource pools. -
EXCLUSIVE
: Dedicates CPUAFFINITYSET
CPUs to this resource pool only, and excludes other Vertica resource pools. If CPUAFFINITYSET
is set as a percentage, then that percentage of CPU resources available to Vertica is assigned solely for this resource pool. -
ANY
(default): Queries in this resource pool can run on any CPU, invalid if CPUAFFINITYSET
designates CPU resources.
Important
CPUAFFINITYMODE
and CPUAFFINITYSET
must be set together in the same statement.
CPUAFFINITYSET
Specifies which CPUs are available to this resource pool. All cluster nodes must have the same number of CPUs. The CPU resources assigned to this set are unavailable to general resource pools.
CPUAFFINITYSET { 'cpu-index[,...]' | 'cpu-indexi-cpu-indexn' | 'integer%' | NONE }
-
cpu-index
[,...]
: Dedicates one or more comma-delimited CPUs to this pool. -
cpu-indexi-cpu-indexn
: Dedicates a range of contiguous CPU indexes to this pool -
integer
%
: Percentage of all available CPUs to use for this pool. Vertica rounds this percentage down to include whole CPU units. -
NONE
(default): No affinity set is assigned to this resource pool. The queries associated with this pool are executed on any CPU.
Important
CPUAFFINITYSET
and CPUAFFINITYMODE
must be set together in the same statement.
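For example, a brief sketch (pool name hypothetical) that dedicates two CPUs exclusively to one pool:
=> CREATE RESOURCE POOL cpu_pool -- hypothetical pool name
   CPUAFFINITYSET '0,1' CPUAFFINITYMODE EXCLUSIVE;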
EXECUTIONPARALLELISM
Limits the number of threads used to process any single query issued in this resource pool.
EXECUTIONPARALLELISM { limit | AUTO }
-
limit
: An integer value between 1 and the number of cores. Setting this parameter to a reduced value increases throughput of short queries issued in the pool, especially if the queries are executed concurrently. -
AUTO
or 0
(default): Vertica calculates the setting from the number of cores, available memory, and amount of data in the system. Unless memory is limited, or the amount of data is very small, Vertica sets this parameter to the number of cores on the node.
-
MAXCONCURRENCY
Sets the maximum number of concurrent execution slots available to the resource pool, across the cluster:
MAXCONCURRENCY { integer | NONE }
NONE
(default) specifies an unlimited number of concurrent execution slots.
MAXMEMORYSIZE
The maximum size per node the resource pool can grow by borrowing memory from the
GENERAL
pool:
MAXMEMORYSIZE { 'integer%' | 'integer{K|M|G|T}' | NONE }
-
integer
%
: Percentage of total memory -
integer
{K|M|G|T}
: Amount of memory in kilobytes, megabytes, gigabytes, or terabytes -
NONE
(default): Unlimited, pool can borrow any amount of available memory from the GENERAL
pool.
-
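For example, a sketch of a pool (name illustrative) that reserves 1G per node and can borrow up to a total of 20% of memory:
=> CREATE RESOURCE POOL report_pool -- illustrative pool name
   MEMORYSIZE '1G' MAXMEMORYSIZE '20%';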
MAXQUERYMEMORYSIZE
The maximum amount of memory that this pool can allocate at runtime to process a query. If the query requires more memory than this setting, Vertica stops execution and returns an error.
Set this parameter as follows:
MAXQUERYMEMORYSIZE { 'integer%' | 'integer{K|M|G|T}' | NONE }
MEMORYSIZE
The amount of total memory available to the Vertica resource manager that is allocated to this pool per node:
MEMORYSIZE { 'integer%' | 'integer{K|M|G|T}' }
-
integer
%
: Percentage of total memory -
integer
{K|M|G|T}
: Amount of memory in kilobytes, megabytes, gigabytes, or terabytes
Default: 0%. No memory allocated, the resource pool borrows memory from the
GENERAL
pool.
PLANNEDCONCURRENCY
Specifies the preferred number of queries to execute concurrently in the resource pool. This setting applies to the entire cluster:
PLANNEDCONCURRENCY { num-queries | AUTO }
-
num-queries
: Integer value ≥ 1, specifies the preferred number of concurrently executing queries. When possible, query resource budgets are limited to allow this level of concurrent execution. -
AUTO
(default): Value is calculated automatically at query runtime. Vertica sets this parameter to the lower of these two calculations, but never less than 4:-
Number of logical cores
-
Memory divided by 2GB
-
For clusters where the number of logical cores differs on different nodes,
AUTO
can apply differently on each node. Distributed queries run with the minimal effective planned concurrency. Single-node queries run with the planned concurrency of the initiator.
Tip
Change this parameter only after evaluating performance over a period of time.
PRIORITY
Specifies priority of queries in this pool when they compete for resources in the
GENERAL
pool:
PRIORITY { integer | HOLD }
-
integer
: A negative or positive integer value, where higher numbers denote higher priority. -
HOLD
: Sets priority to -999
. Queries in this pool are queued until QUEUETIMEOUT
is reached.
Default: 0
-
QUEUETIMEOUT
Specifies how long a request can wait for pool resources before it is rejected:
QUEUETIMEOUT { integer | NONE }
-
integer
: Maximum wait time in seconds -
NONE
: No maximum wait time, request can be queued indefinitely.
Default: 300 seconds
-
RUNTIMECAP
Prevents runaway queries by setting the maximum time a query in the pool can execute. If a query exceeds this setting, it tries to cascade to a secondary pool:
RUNTIMECAP { 'interval' | NONE }
-
interval
: Maximum wait time expressed in the following format: -
NONE
(default): No time limit on queries running in this pool. To specify a value in days, provide an integer value. To provide a value less than one day, provide the interval in the format
hours:minutes:seconds
. For example, a value of 1:30:00
equals 90 minutes. If the user or session also has a
RUNTIMECAP
, the shorter limit applies.
-
RUNTIMEPRIORITY
Determines how the resource manager should prioritize dedication of run-time resources (CPU, I/O bandwidth) to queries already running in this resource pool:
RUNTIMEPRIORITY { HIGH | MEDIUM | LOW }
Default:
MEDIUM
RUNTIMEPRIORITYTHRESHOLD
Specifies in seconds a time limit in which a query must finish before the resource manager assigns to it the resource pool's
RUNTIMEPRIORITY
. All queries begin running at a HIGH
priority. When a query's duration exceeds this threshold, it is assigned the RUNTIMEPRIORITY
of the resource pool.
RUNTIMEPRIORITYTHRESHOLD seconds
Default:
2
SINGLEINITIATOR
By default, set to false for backward compatibility. Do not change this setting.
Privileges
Superuser
Examples
This example shows how to create a resource pool with MEMORYSIZE
of 1800 MB.
=> CREATE RESOURCE POOL ceo_pool MEMORYSIZE '1800M' PRIORITY 10;
CREATE RESOURCE POOL
Assuming the CEO report user already exists, associate this user with the preceding resource pool using the ALTER USER
statement.
=> GRANT USAGE ON RESOURCE POOL ceo_pool to ceo_user;
GRANT PRIVILEGE
=> ALTER USER ceo_user RESOURCE POOL ceo_pool;
ALTER USER
Issue the following command to confirm that the ceo_user is associated with the ceo_pool:
=> SELECT * FROM users WHERE user_name ='ceo_user';
-[ RECORD 1 ]-----+--------------------------------------------------
user_id | 45035996273733402
user_name | ceo_user
is_super_user | f
profile_name | default
is_locked | f
lock_time |
resource_pool | ceo_pool
memory_cap_kb | unlimited
temp_space_cap_kb | unlimited
run_time_cap | unlimited
all_roles |
default_roles |
search_path | "$user", public, v_catalog, v_monitor, v_internal
This example shows how to create and designate secondary resource pools.
=> CREATE RESOURCE POOL rp3 RUNTIMECAP '5 minutes';
=> CREATE RESOURCE POOL rp2 RUNTIMECAP '3 minutes' CASCADE TO rp3;
=> CREATE RESOURCE POOL rp1 RUNTIMECAP '1 minute' CASCADE TO rp2;
=> SET SESSION RESOURCE_POOL = rp1;
This Eon Mode example confirms the current subcluster name, then creates a resource pool for the current subcluster:
=> SELECT CURRENT_SUBCLUSTER_NAME();
CURRENT_SUBCLUSTER_NAME
-------------------------
analytics_1
(1 row)
=> CREATE RESOURCE POOL dashboard FOR SUBCLUSTER analytics_1;
CREATE RESOURCE POOL
See also
24.1 - Built-in pools
Vertica is preconfigured with built-in pools for various system tasks:
For details on resource pool settings, see ALTER RESOURCE POOL.
GENERAL
Catch-all pool used to answer requests that have no specific resource pool associated with them. Any memory left over after memory has been allocated to all other pools is automatically allocated to the GENERAL pool. The MEMORYSIZE parameter of the GENERAL pool is undefined (variable); however, the GENERAL pool must be at least 1GB in size and cannot be smaller than 25% of the memory in the system.
The MAXMEMORYSIZE parameter of the GENERAL pool has special meaning; when set as a % value it represents the percent of total physical RAM on the machine that the Resource manager can use for queries. By default, it is set to 95%. MAXMEMORYSIZE governs the total amount of RAM that the Resource Manager can use for queries, regardless of whether it is set to a percent or to a specific value (for example, '10GB').
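For example, a sketch that lowers this cap with ALTER RESOURCE POOL (the percentage is illustrative):
=> ALTER RESOURCE POOL general MAXMEMORYSIZE '90%'; -- illustrative value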
User-defined pools can borrow memory from the GENERAL pool to satisfy requests that need extra memory until the MAXMEMORYSIZE parameter of that pool is reached. If the pool is configured to have MEMORYSIZE equal to MAXMEMORYSIZE, it cannot borrow any memory from the GENERAL pool. When multiple pools request memory from the GENERAL pool, they are granted access to general pool memory according to their priority setting. In this manner, the GENERAL pool provides some elasticity to account for point-in-time deviations from normal usage of individual resource pools.
Vertica recommends reducing the GENERAL pool MAXMEMORYSIZE if your catalog uses over 5 percent of overall memory. You can calculate what percentage of GENERAL pool memory the catalog uses as follows:
=> WITH memory_use_metadata AS (SELECT node_name, memory_size_kb FROM resource_pool_status WHERE pool_name='metadata'),
memory_use_general AS (SELECT node_name, memory_size_kb FROM resource_pool_status WHERE pool_name='general')
SELECT m.node_name, ((m.memory_size_kb/g.memory_size_kb) * 100)::NUMERIC(4,2) pct_catalog_usage
FROM memory_use_metadata m JOIN memory_use_general g ON m.node_name = g.node_name;
node_name | pct_catalog_usage
------------------+-------------------
v_vmart_node0001 | 0.41
v_vmart_node0002 | 0.37
v_vmart_node0003 | 0.36
(3 rows)
BLOBDATA
Controls resource usage for in-memory blobs. In-memory blobs are objects used by a number of machine learning SQL functions. You should adjust this pool if you plan on processing large machine learning workloads. For information about tuning the pool, see Tuning for machine learning.
If a query using the BLOBDATA pool exceeds its query planning budget, then it spills to disk. For more information about tuning your query budget, see Query budgeting.
DBD
Controls resource usage for Database Designer processing. Use of this pool is enabled by configuration parameter DBDUseOnlyDesignerResourcePool, by default set to false.
By default, QUEUETIMEOUT is set to 0 for this pool. When resources are under pressure, this setting causes the DBD to time out immediately, and not be queued to run later. Database Designer then requests the user to run the designer later, when resources are more available.
Important
Do not change QUEUETIMEOUT or any DBD resource pool parameters.
JVM
Controls Java Virtual Machine resources used by Java User Defined Extensions. When a Java UDx starts the JVM, it draws resources from those specified in the JVM resource pool. Vertica does not reserve memory in advance for the JVM pool. When needed, the pool can expand to 10% of physical memory or 2 GB of memory, whichever is smaller. If you are buffering large amounts of data, you may need to increase the size of the JVM resource pool.
You can adjust the size of your JVM resource pool by changing its configuration settings. Unlike other resource pools, the JVM resource pool does not release resources until a session is closed.
METADATA
Tracks memory allocated for catalog data and storage data structures. This pool increases in size as Vertica metadata consumes additional resources. Memory assigned to the METADATA pool is subtracted from the GENERAL pool, enabling the Vertica resource manager to make more effective use of available resources. If the METADATA resource pool reaches 75% of the GENERAL pool, Vertica stops updating METADATA memory size and displays a warning message in vertica.log
. You can enable or disable the METADATA pool with configuration parameter EnableMetadataMemoryTracking.
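One possible sketch of disabling the parameter, assuming a release where ALTER DATABASE can set configuration parameters:
=> ALTER DATABASE DEFAULT SET PARAMETER EnableMetadataMemoryTracking = 0; -- 0 disables, 1 enables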
If you created a "dummy" or "swap" resource pool to protect resources for use by your operating system, you can replace that pool with the METADATA pool.
Users cannot change the parameters of the METADATA resource pool.
RECOVERY
Used by queries issued when recovering another node of the database. The MAXCONCURRENCY parameter is used to determine how many concurrent recovery threads to use. You can use the PLANNEDCONCURRENCY parameter (by default, set to twice the MAXCONCURRENCY
) to tune how to apportion memory to recovery queries.
See Tuning for recovery.
REFRESH
Used by queries issued by
PROJECTION_REFRESHES
operations. Refresh does not currently use multiple concurrent threads; thus, changes to the MAXCONCURRENCY values have no effect.
See Scenario: Tuning for Refresh.
SYSQUERY
Runs queries against all system monitoring and catalog tables. The SYSQUERY pool reserves resources for system table queries so that they are never blocked by contention for available resources.
TM
The Tuple Mover (TM) pool. You can set the MAXCONCURRENCY parameter for the TM pool to allow concurrent TM operations.
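For example, a sketch that raises TM concurrency (the value is illustrative):
=> ALTER RESOURCE POOL tm MAXCONCURRENCY 5; -- illustrative value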
24.2 - Built-in resource pools configuration
To view the current and default configuration for built-in resource pools, query the system tables RESOURCE_POOLS and RESOURCE_POOL_DEFAULTS, respectively. The sections below provide this information, and also indicate which built-in pool parameters can be modified with ALTER RESOURCE POOL:
GENERAL
Parameter | Settings |
---|---|
MEMORYSIZE | Empty / cannot be set |
MAXMEMORYSIZE | The maximum memory to use for all resource pools, one of the following: MAXMEMORYSIZE { 'integer%' | 'integer{K|M|G|T}' }. Caution: Setting this parameter to 100% generates a warning of potential swapping. For example, if your node has 64GB of memory, setting MAXMEMORYSIZE to 50% allocates half of available memory. Thus, the maximum amount of memory available to all resource pools is 32GB. Default: 95% |
MAXQUERYMEMORYSIZE | The maximum amount of memory allocated by this pool to process any query: MAXQUERYMEMORYSIZE { 'integer%' | 'integer{K|M|G|T}' } |
EXECUTIONPARALLELISM | Default: AUTO |
PRIORITY | Default: 0 |
RUNTIMEPRIORITY | Default: Medium |
RUNTIMEPRIORITYTHRESHOLD | Default: 2 |
QUEUETIMEOUT | Default: 00:05 (minutes) |
RUNTIMECAP | Prevents runaway queries by setting the maximum time a query in the pool can execute. If a query exceeds this setting, it tries to cascade to a secondary pool. |
PLANNEDCONCURRENCY | The number of concurrent queries you expect to run against the resource pool, an integer ≥ 4. If set to AUTO (default), Vertica automatically sets PLANNEDCONCURRENCY at query runtime, choosing the lower of these two values: the number of logical cores, or memory divided by 2GB. Default: AUTO |
MAXCONCURRENCY | Default: Empty. Caution: Must be set ≥ 1, otherwise Vertica generates a warning that system queries might be unable to execute. |
SINGLEINITIATOR | Default: False. Important: Included for backwards compatibility. Do not change. |
CPUAFFINITYSET | Default: Empty |
CPUAFFINITYMODE | Default: ANY |
CASCADETO | Default: Empty |
BLOBDATA
Parameter | Default Setting |
---|---|
MEMORYSIZE | 0% |
MAXMEMORYSIZE | 10% |
MAXQUERYMEMORYSIZE | Empty / cannot be set |
EXECUTIONPARALLELISM | |
PRIORITY | |
RUNTIMEPRIORITY | |
RUNTIMEPRIORITYTHRESHOLD | |
QUEUETIMEOUT | |
RUNTIMECAP | NONE |
PLANNEDCONCURRENCY | AUTO |
MAXCONCURRENCY | Empty / cannot be set |
SINGLEINITIATOR | |
CPUAFFINITYSET | |
CPUAFFINITYMODE | ANY / cannot be set |
CASCADETO | Empty / cannot be set |
DBD
Parameter | Default Setting |
---|---|
MEMORYSIZE | 0% |
MAXMEMORYSIZE | Unlimited |
MAXQUERYMEMORYSIZE | Empty / cannot be set |
EXECUTIONPARALLELISM | AUTO |
PRIORITY | 0 |
RUNTIMEPRIORITY | MEDIUM |
RUNTIMEPRIORITYTHRESHOLD | 0 |
QUEUETIMEOUT | 0 |
RUNTIMECAP | NONE |
PLANNEDCONCURRENCY | AUTO |
MAXCONCURRENCY | NONE |
SINGLEINITIATOR | True. Important: Included for backwards compatibility. Do not change. |
CPUAFFINITYSET | Empty / cannot be set |
CPUAFFINITYMODE | ANY / cannot be set |
CASCADETO | Empty / cannot be set |
JVM
Parameter | Default Setting |
---|---|
MEMORYSIZE | 0% |
MAXMEMORYSIZE | 10% of memory or 2 GB, whichever is smaller |
MAXQUERYMEMORYSIZE | Empty / cannot be set |
EXECUTIONPARALLELISM | AUTO |
PRIORITY | 0 |
RUNTIMEPRIORITY | MEDIUM |
RUNTIMEPRIORITYTHRESHOLD | 2 |
QUEUETIMEOUT | 00:05 (minutes) |
RUNTIMECAP | NONE |
PLANNEDCONCURRENCY | AUTO |
MAXCONCURRENCY | Empty / cannot be set |
SINGLEINITIATOR | FALSE. Important: Included for backwards compatibility. Do not change. |
CPUAFFINITYSET | Empty / cannot be set |
CPUAFFINITYMODE | ANY / cannot be set |
CASCADETO | Empty / cannot be set |
METADATA
Parameter | Default Setting |
---|---|
MEMORYSIZE | 0% |
MAXMEMORYSIZE | Unlimited |
MAXQUERYMEMORYSIZE | Empty / cannot be set |
EXECUTIONPARALLELISM | AUTO |
PRIORITY | 108 |
RUNTIMEPRIORITY | HIGH |
RUNTIMEPRIORITYTHRESHOLD | 0 |
QUEUETIMEOUT | 0 |
RUNTIMECAP | NONE |
PLANNEDCONCURRENCY | AUTO |
MAXCONCURRENCY | 0 |
SINGLEINITIATOR | FALSE. Important: Included for backwards compatibility. Do not change. |
CPUAFFINITYSET | Empty / cannot be set |
CPUAFFINITYMODE | ANY / cannot be set |
CASCADETO | Empty / cannot be set |
RECOVERY
Parameter | Default Setting |
---|---|
MEMORYSIZE | 0% |
MAXMEMORYSIZE | The maximum size per node the resource pool can grow by borrowing memory from the GENERAL pool: MAXMEMORYSIZE { 'integer%' | 'integer{K|M|G|T}' | NONE }. Caution: Setting must resolve to ≥ 25%. Otherwise, Vertica generates a warning that system queries might be unable to execute. |
MAXQUERYMEMORYSIZE | Empty / cannot be set |
EXECUTIONPARALLELISM | AUTO |
PRIORITY | One of the following: Caution: Change these settings only under guidance from Vertica technical support. |
RUNTIMEPRIORITY | MEDIUM |
RUNTIMEPRIORITYTHRESHOLD | 60 |
QUEUETIMEOUT | 00:05 (minutes) |
RUNTIMECAP | NONE |
PLANNEDCONCURRENCY | AUTO |
MAXCONCURRENCY | By default, derived from the number of cores; given a system with four cores, MAXCONCURRENCY has a default setting of 3. Note: 0 or NONE (unlimited) are invalid settings. |
SINGLEINITIATOR | True. Important: Included for backwards compatibility. Do not change. |
CPUAFFINITYSET | Empty / cannot be set |
CPUAFFINITYMODE | ANY / cannot be set |
CASCADETO | Empty / cannot be set |
REFRESH
Parameter | Default Setting |
---|---|
MEMORYSIZE | 0% |
MAXMEMORYSIZE | NONE (unlimited) |
MAXQUERYMEMORYSIZE | Empty / cannot be set |
EXECUTIONPARALLELISM | AUTO |
PRIORITY | -10 |
RUNTIMEPRIORITY | MEDIUM |
RUNTIMEPRIORITYTHRESHOLD | 60 |
QUEUETIMEOUT | 00:05 (minutes) |
RUNTIMECAP | NONE (unlimited) |
PLANNEDCONCURRENCY | AUTO (4) |
MAXCONCURRENCY | 3. This parameter must be set ≥ 1. |
SINGLEINITIATOR | True. Important: Included for backwards compatibility. Do not change. |
CPUAFFINITYSET | Empty / cannot be set |
CPUAFFINITYMODE | ANY / cannot be set |
CASCADETO | Empty / cannot be set |
SYSQUERY
Parameter | Default Setting |
---|---|
MEMORYSIZE | 1G. Caution: Setting must resolve to ≥ 20M, otherwise Vertica generates a warning that system queries might be unable to execute, and diagnosing problems might be difficult. |
MAXMEMORYSIZE | Empty (unlimited) |
MAXQUERYMEMORYSIZE | Empty / cannot be set |
EXECUTIONPARALLELISM | AUTO |
PRIORITY | 110 |
RUNTIMEPRIORITY | HIGH |
RUNTIMEPRIORITYTHRESHOLD | 0 |
QUEUETIMEOUT | 00:05 (minutes) |
RUNTIMECAP | NONE |
PLANNEDCONCURRENCY | AUTO |
MAXCONCURRENCY | Empty. Caution: Must be set ≥ 1, otherwise Vertica generates a warning that system queries might be unable to execute. |
SINGLEINITIATOR | False. Important: Included for backwards compatibility. Do not change. |
CPUAFFINITYSET | Empty / cannot be set |
CPUAFFINITYMODE | ANY / cannot be set |
CASCADETO | Empty / cannot be set |
TM
Parameter | Default Setting |
---|---|
MEMORYSIZE | 5% (of the GENERAL pool's MAXMEMORYSIZE setting) + 2GB. Important: You can estimate the optimal amount of RAM for the TM resource pool as follows: |
MAXMEMORYSIZE | Unlimited |
MAXQUERYMEMORYSIZE | Empty / cannot be set |
EXECUTIONPARALLELISM | AUTO |
PRIORITY | 105 |
RUNTIMEPRIORITY | MEDIUM |
RUNTIMEPRIORITYTHRESHOLD | 60 |
QUEUETIMEOUT | 00:05 (minutes) |
RUNTIMECAP | NONE |
PLANNEDCONCURRENCY | 7 |
MAXCONCURRENCY | Sets across all nodes the maximum number of concurrent execution slots available to the TM pool. In databases created in Vertica releases ≥ 9.3, the default value is 7. In databases created in earlier versions, the default is 3. This setting specifies the maximum number of merges that can occur simultaneously on multiple threads. Note: 0 or NONE (unlimited) are invalid settings. |
SINGLEINITIATOR | True. Important: Included for backwards compatibility. Do not change. |
CPUAFFINITYSET | Empty / cannot be set |
CPUAFFINITYMODE | ANY / cannot be set |
CASCADETO | Empty / cannot be set |
25 - CREATE ROLE
Creates a role. After creating a role, use GRANT statements to specify role permissions.
Syntax
CREATE ROLE role
Parameters
role
- The name for the new role, where
role
conforms to conventions described in Identifiers.
Privileges
Superuser
Examples
This example shows how to create an empty role named roleA.
=> CREATE ROLE roleA;
CREATE ROLE
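After creating the role, you typically grant it privileges and then grant the role to users. A sketch with hypothetical object and user names:
=> GRANT USAGE ON SCHEMA public TO roleA;
GRANT PRIVILEGE
=> GRANT roleA TO Bob; -- Bob is a hypothetical user
GRANT ROLE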
See also
26 - CREATE ROUTING RULE
Creates a load balancing routing rule that directs incoming client connections from an IP address range to a group of Vertica nodes. This group of Vertica nodes is defined by a load balance group. Once you create a routing rule, any client connection originating from the rule's IP address range is redirected to one of the nodes in the load balance group if the client opts into load balancing.
Syntax
CREATE ROUTING RULE rule_name ROUTE 'address_range' TO group_name
Arguments
rule_name
- A name for the routing rule.
address_range
- An IPv4 or IPv6 address range in CIDR format. Sets the address range of client connections that this rule applies to.
group_name
- The name of the load balance group to handle the client connections from the address range. You create this group using the CREATE LOAD BALANCE GROUP statement.
Privileges
Superuser.
Examples
The following example creates a routing rule that routes all client connections from 192.168.1.0 to 192.168.1.255 to a load balance group named internal_clients:
=> CREATE ROUTING RULE internal_clients ROUTE '192.168.1.0/24' TO internal_clients;
CREATE ROUTING RULE
See also
27 - CREATE SCHEMA
Defines a schema.
Syntax
CREATE SCHEMA [ IF NOT EXISTS ] [database.]schema
[ AUTHORIZATION username]
[ DEFAULT { INCLUDE | EXCLUDE } [ SCHEMA ] PRIVILEGES ]
Parameters
IF NOT EXISTS
If an object with the same name exists, do not create it and proceed. If you omit this option and the object exists, Vertica generates a ROLLBACK error message. In both cases, the object is not created if it already exists.
The
IF NOT EXISTS
clause is useful for SQL scripts where you want to create an object if it does not already exist. For related information, see ON_ERROR_STOP.
[
database
.]
schema
- Identifies the schema to create, where
schema
conforms to conventions described in Identifiers. The following naming requirements also apply:-
The name must be unique among all other schema names in the database.
-
It must comply with keyword restrictions.
-
It cannot begin with
v_
; this prefix is reserved for Vertica system tables. -
If you specify a database, it must be the current database.
-
AUTHORIZATION
username
- Valid only for superusers, assigns ownership of the schema to another user. By default, the user who creates a schema is also assigned ownership.
After you create a schema, you can reassign ownership to another user with
ALTER SCHEMA
.
DEFAULT {INCLUDE | EXCLUDE} [SCHEMA] PRIVILEGES
Specifies whether to enable or disable default inheritance of privileges for new tables in the specified schema:
-
EXCLUDE SCHEMA PRIVILEGES
(default): Disables inheritance of schema privileges. -
INCLUDE SCHEMA PRIVILEGES
: Specifies to grant tables in the specified schema the same privileges granted to that schema. This option has no effect on existing tables in the schema.
If you omit
INCLUDE PRIVILEGES
, you must explicitly grant schema privileges on the desired tables. For more information, see Enabling schema inheritance.
-
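For example, a sketch that combines ownership assignment and privilege inheritance (schema and user names hypothetical):
=> CREATE SCHEMA finance AUTHORIZATION Joan DEFAULT INCLUDE SCHEMA PRIVILEGES; -- hypothetical names
CREATE SCHEMA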
Privileges
-
Superuser
Supported sub-statements
CREATE SCHEMA
can include one or more sub-statements—for example, to create tables or projections within the new schema. Supported sub-statements include:
Note
The CREATE SCHEMA
statement and all sub-statements are treated as a single transaction. If any statement fails, Vertica rolls back the entire transaction. The owner of the new schema is assigned ownership of all objects that are created within this transaction.
For example, the following CREATE SCHEMA
statement also grants privileges on the new schema, and creates a table and view of that table:
=> \c - Joan
You are now connected as user "Joan".
=> CREATE SCHEMA s1
GRANT USAGE, CREATE ON SCHEMA s1 TO public
CREATE TABLE s1.t1 (a varchar)
CREATE VIEW s1.t1v AS SELECT * FROM s1.t1;
CREATE SCHEMA
=> \dtv s1.*
List of tables
Schema | Name | Kind | Owner | Comment
--------+------+-------+-------+---------
s1 | t1 | table | Joan |
s1 | t1v | view | Joan |
(2 rows)
Examples
Create schema s1
:
=> CREATE SCHEMA s1;
Create schema s2
if it does not already exist:
=> CREATE SCHEMA IF NOT EXISTS s2;
If the schema already exists, Vertica returns a rollback message:
=> CREATE SCHEMA IF NOT EXISTS s2;
NOTICE 4214: Object "s2" already exists; nothing was done
Create table t1
in schema s1
, then grant users Fred
and Aniket
access to all existing tables and all privileges on table t1
:
=> CREATE TABLE s1.t1 (c INT);
CREATE TABLE
=> GRANT USAGE ON SCHEMA s1 TO Fred, Aniket;
GRANT PRIVILEGE
=> GRANT ALL PRIVILEGES ON TABLE s1.t1 TO Fred, Aniket;
GRANT PRIVILEGE
Enable inheritance on new schema s3
so all tables created in it automatically inherit its privileges. In this case, new table s3.t2
inherits USAGE, CREATE, and SELECT privileges, which are automatically granted to all database users:
=> CREATE SCHEMA s3 DEFAULT INCLUDE SCHEMA PRIVILEGES;
CREATE SCHEMA
=> GRANT USAGE, CREATE, SELECT, INSERT ON SCHEMA S3 TO PUBLIC;
GRANT PRIVILEGE
=> CREATE TABLE s3.t2(i int);
WARNING 6978: Table "t2" will include privileges from schema "s3"
CREATE TABLE
See also
28 - CREATE SEQUENCE
Defines a new named sequence number generator object. Like AUTO_INCREMENT and IDENTITY sequences, named sequences let you set the default values of primary key columns. Sequences guarantee uniqueness, and avoid constraint enforcement problems and overhead.
For more information about sequence types and their usage, see Sequences.
Syntax
CREATE SEQUENCE [ IF NOT EXISTS ] [[database.]schema.]sequence
[ INCREMENT [ BY ] integer ]
[ MINVALUE integer | NO MINVALUE ]
[ MAXVALUE integer | NO MAXVALUE ]
[ START [ WITH ] integer ]
[ CACHE integer | NO CACHE ]
[ CYCLE | NO CYCLE ]
Parameters
IF NOT EXISTS
If an object with the same name exists, do not create it and proceed. If you omit this option and the object exists, Vertica generates a ROLLBACK error message. In both cases, the object is not created if it already exists.
The
IF NOT EXISTS
clause is useful for SQL scripts where you want to create an object if it does not already exist. For related information, see ON_ERROR_STOP.
[
database
.]
schema
Database and schema. The default schema is
public
. If you specify a database, it must be the current database.
sequence
- Identifies the sequence to create, where
sequence
conforms to conventions described in Identifiers. It must also be unique among all names of sequences, tables, projections, views, and models within the same schema.
INCREMENT [BY]
integer
A positive or negative integer that specifies how much to increment or decrement the sequence on each call to NEXTVAL, by default set to 1.
Note
Setting this parameter to integer guarantees that column values always increment by at least integer. However, column values can sometimes increment by more than integer unless you also set the NO CACHE parameter.
MINVALUE
integer
| NO MINVALUE
- Determines the minimum value a sequence can generate. If you omit this clause or specify NO MINVALUE (the default), default values are used: 1 and -2^63-1 for ascending and descending sequences, respectively.
MAXVALUE
integer
| NO MAXVALUE
- Determines the maximum value for the sequence. If you omit this clause or specify NO MAXVALUE (the default), default values are used: 2^63-1 and -1 for ascending and descending sequences, respectively.
START [WITH]
integer
- Sets the sequence start value to
integer
. The next call to NEXTVAL returns integer
. If you omit this clause, the sequence start value is set to MINVALUE
for ascending sequences, and MAXVALUE
for descending sequences.
CACHE
integer
| NO CACHE
- Specifies whether to cache unique sequence numbers on each node for faster access.
CACHE
takes an integer argument as follows:-
>1 specifies how many unique numbers each node caches per session.
Caution
If sequence caching is set to a low number, nodes are liable to request a new set of cache values more frequently. While it supplies a new cache, Vertica must lock the catalog. Until Vertica releases the lock, other database activities such as table inserts are blocked, which can adversely affect overall performance. -
0 or 1 specifies to disable caching (equivalent to
NO CACHE
).
If you omit this clause, the sequence cache is set to 250,000.
For details on named sequence caching, see Distributing named sequences.
-
CYCLE | NO CYCLE
- Specifies whether the sequence can wrap when its minimum or maximum values are reached:
-
CYCLE
: The sequence wraps as follows:-
When an incrementing sequence reaches its upper limit, it is reset to its minimum value.
-
When a decrementing sequence reaches its lower limit, it is reset to its maximum value.
-
-
NO CYCLE
(default): Calls to NEXTVAL return an error after the sequence reaches its maximum or minimum value.
-
Privileges
Non-superusers: CREATE privilege on the schema
Examples
See Creating and using named sequences.
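For instance, a minimal sketch (sequence name hypothetical):
=> CREATE SEQUENCE public.my_seq START WITH 100 INCREMENT BY 10; -- hypothetical name
CREATE SEQUENCE
=> SELECT NEXTVAL('public.my_seq');
 nextval
---------
     100
(1 row)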
See also
29 - CREATE SUBNET
Identifies the subnet to which the nodes of a Vertica database belong. Use this statement to configure import/export from a database to other Vertica clusters.
Syntax
CREATE SUBNET subnet-name WITH 'subnet-prefix'
Parameters
subnet-name
- A name you assign to the subnet, where
subnet-name
conforms to conventions described in Identifiers.
subnet-prefix
- The subnet prefix in either a dotted-quad number format for IPv4 addresses, or four colon-delimited four-digit hexadecimal numbers for IPv6 addresses. Refer to system table
NETWORK_INTERFACES
to get the prefix of all available IP networks.
You can then configure the database to use the subnet for import/export. For details, see Identify the database or nodes used for import/export.
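For example, assuming a release that supports the ALTER DATABASE...EXPORT ON clause, a sketch that designates the subnet created below as the export subnet:
=> ALTER DATABASE DEFAULT EXPORT ON mySubnet;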
Privileges
Superuser
Examples
=> CREATE SUBNET mySubnet WITH '123.4.5.6';
=> CREATE SUBNET mysubnet WITH 'fd9b:1fcc:1dc4:78d3::';
30 - CREATE TABLE
Creates a table in the logical schema.
Syntax
Create with column definitions:
CREATE TABLE [ IF NOT EXISTS ] [[database.]schema.]table
( column-definition[,...] [, table-constraint ][,...] )
[ ORDER BY column[,...] ]
[ segmentation-spec ]
[ KSAFE [k-num] ]
[ partition-clause]
[ {INCLUDE | EXCLUDE} [SCHEMA] PRIVILEGES ]
Create from another table:
CREATE TABLE [ IF NOT EXISTS ] [[database.]schema.]table { AS-clause | LIKE-clause }
AS-clause
[ ( column-name-list ) ]
[ {INCLUDE | EXCLUDE} [SCHEMA] PRIVILEGES ]
AS [ /*+ LABEL */ ] [ AT epoch ] query [ ENCODED BY column-ref-list ] [ segmentation-spec ]
LIKE-clause
LIKE [[database.]schema.]existing-table
[ {INCLUDING | EXCLUDING} PROJECTIONS ]
[ {INCLUDE | EXCLUDE} [SCHEMA] PRIVILEGES ]
Parameters
IF NOT EXISTS
If an object with the same name exists, do not create it and proceed. If you omit this option and the object exists, Vertica generates a ROLLBACK error message. In both cases, the object is not created if it already exists.
The
IF NOT EXISTS
clause is useful for SQL scripts where you want to create an object if it does not already exist. For related information, see ON_ERROR_STOP.
[
database
.]
schema
Database and schema. The default schema is
public
. If you specify a database, it must be the current database.
table
- Name of the table to create, which must be unique among names of all sequences, tables, projections, views, and models within the schema.
- column-definition
- Column name, data type, and optional constraints. A table can have up to 9800 columns. At least one column in the table must be of a scalar type or native array.
- table-constraint
- Table-level constraint, as opposed to column constraints.
ORDER BY
column
[,...]
Invalid for external tables, specifies columns from the
SELECT
list on which to sort the superprojection that is automatically created for this table. The ORDER BY
clause cannot include qualifiers ASC
or DESC
. Vertica always stores projection data in ascending sort order. If you omit the
ORDER BY
clause, Vertica uses the SELECT
list order as the projection sort order.segmentation-spec
Invalid for external tables, specifies how to distribute data for auto-projections of this table. Supply one of the following clauses:
-
hash-segmentation-clause: Specifies to segment data evenly and distribute across cluster nodes. Vertica recommends segmenting large tables.
-
unsegmented-clause: Specifies to create an unsegmented projection.
If this clause is omitted, Vertica generates auto-projections with default hash segmentation.
-
KSAFE [
k-num
]
Invalid for external tables, specifies K-safety of auto-projections created for this table, where
k-num
must be equal to or greater than system K-safety. If you omit this option, the projection uses the system K-safety level.
- partition-clause
- Invalid for external tables, logically divides table data storage through a PARTITION BY clause:
PARTITION BY partition-expression [ GROUP BY group-expression ] [ ACTIVEPARTITIONCOUNT integer ]
- column-name-list
Valid only when creating a table from a query (
AS
query
), defines column names that map to the query output. If you omit this list, Vertica uses the query output column names. The names in column-name-list
and queried columns must be the same in number. For example:
CREATE TABLE customer_occupations (name, profession) AS SELECT customer_name, occupation FROM customer_dimension;
This clause and the
ENCODED BY
clause are mutually exclusive. Column name lists are invalid for external tables.
{INCLUDE | EXCLUDE}
[SCHEMA] PRIVILEGES
Default inheritance of schema privileges for this table:
-
INCLUDE PRIVILEGES specifies that the table inherits privileges that are set on its schema. This is the default behavior if privileges inheritance is enabled for the schema.
-
EXCLUDE PRIVILEGES disables inheritance of privileges from the schema.
For details, see Inherited privileges.
-
AS
query
Creates and loads a table from the results of a query, specified as follows:
AS [ /*+ LABEL */ ] [ AT epoch ] query
The query cannot include complex type columns.
ENCODED BY
column-ref-list
A comma-delimited list of columns from the source table, where each column is qualified by one or both of the following encoding options:
-
ACCESSRANK
integer
: Overrides the default access rank for a column, useful for prioritizing access to a column. See Prioritizing column access speed. -
ENCODING
encoding-type
: Specifies the type of encoding to use on the column. The default encoding type is AUTO
.
This option and
column-name-list
are mutually exclusive. This option is invalid for external tables.
LIKE
existing-table
- Creates the table by replicating an existing table. You can qualify the LIKE clause with one of the following options:
-
EXCLUDING PROJECTIONS (default): Do not copy projections from the source table.
-
INCLUDING PROJECTIONS: Copy current projections from the source table for the new table.
-
{INCLUDE|EXCLUDE} [SCHEMA] PRIVILEGES: See description above.
-
Privileges
Non-superuser:
-
CREATE privileges on the table schema
-
If creating a table that includes a named sequence:
-
SELECT privilege on sequence object
-
USAGE privilege on sequence schema
-
-
If creating a table with the LIKE clause, source table owner
Restrictions for complex types
Complex types used in native tables have some restrictions, in addition to the restrictions for individual types listed on their reference pages:
-
A native table must have at least one column that is a primitive type or a native array (one-dimensional array of a primitive type). If a flex table has real columns, it must also have at least one column satisfying this restriction.
-
Complex type columns cannot be used in ORDER BY or PARTITION BY clauses nor as FILLER columns.
-
Complex type columns cannot have constraints.
-
Expressions returning complex types cannot be used as projection columns, and projections cannot be segmented or ordered by columns of complex types.
-
Tables with columns of complex types cannot use DEFAULT and SET USING.
Examples
The following example creates a table in the public schema:
CREATE TABLE public.Premium_Customer
(
ID IDENTITY ,
lname varchar(25),
fname varchar(25),
store_membership_card int
);
The following example uses LIKE to create a new table from this one:
=> CREATE TABLE All_Customers LIKE Premium_Customer;
CREATE TABLE
The following example selects columns from one table to use in a new table, using an AS clause:
=> CREATE TABLE cust_basic_profile AS SELECT
customer_key, customer_gender, customer_age, marital_status, annual_income, occupation
FROM customer_dimension WHERE customer_age>18 AND customer_gender !='';
CREATE TABLE
=> SELECT customer_age, annual_income, occupation FROM cust_basic_profile
WHERE customer_age > 23 ORDER BY customer_age;
customer_age | annual_income | occupation
--------------+---------------+--------------------
24 | 469210 | Hairdresser
24 | 140833 | Butler
24 | 558867 | Lumberjack
24 | 529117 | Mechanic
24 | 322062 | Acrobat
24 | 213734 | Writer
...
The following example creates a table using array columns:
=> CREATE TABLE orders(
orderkey INT,
custkey INT,
prodkey ARRAY[VARCHAR(10)],
orderprices ARRAY[DECIMAL(12,2)],
orderdate DATE
);
The following example uses a ROW complex type:
=> CREATE TABLE inventory
(store INT, products ROW(name VARCHAR, code VARCHAR));
See also
30.1 - Column-constraint
Adds a constraint to a column's metadata. For details, see Constraints.
Syntax
[ { AUTO_INCREMENT | IDENTITY } [ (args) ] ]
[ CONSTRAINT constraint-name ] {
[ CHECK (expression) [ ENABLED | DISABLED ] ]
[ [ DEFAULT expression ] [ SET USING expression ] | DEFAULT USING expression ]
[ NULL | NOT NULL ]
[ { PRIMARY KEY [ ENABLED | DISABLED ] | REFERENCES table [( column )] } ]
[ UNIQUE [ ENABLED | DISABLED ] ]
}
Parameters
Note
You can specify enforcement of several constraints by qualifying them with the keywords ENABLED
or DISABLED
. See Enforcing Constraints below.
AUTO_INCREMENT | IDENTITY
- Creates a table column whose values are automatically generated and managed by the database. You cannot change or load values in this column. You can set this constraint on only one table column.
AUTO_INCREMENT
andIDENTITY
are synonyms. For details on this constraint and optional arguments, see AUTO_INCREMENT and IDENTITY sequences.These options are invalid for temporary tables.
CONSTRAINT
constraint-name
- Assigns a name to the constraint, valid for the following constraints:
-
PRIMARY KEY
-
REFERENCES
(foreign key) -
CHECK
-
UNIQUE
If you omit assigning a name to these constraints, Vertica assigns its own name. For details, see Naming constraints.
Vertica recommends that you name all constraints.
-
CHECK (
expression
)
- Adds check condition
expression
, which returns a Boolean value.
DEFAULT
- Specifies this column's default value:
DEFAULT default-expr
Vertica evaluates the
DEFAULT
expression and sets the column on load operations, if the operation omits a value for the column. For details about valid expressions, see Defining column values.
SET USING
- Specifies to set values in this column from the specified expression:
SET USING using-expr
Vertica evaluates the
SET USING
expression and refreshes column values only when the function REFRESH_COLUMNS
is invoked. For details about valid expressions, see Defining column values.
DEFAULT USING
- Defines the column with
DEFAULT
and SET USING
constraints, specifying the same expression for both. DEFAULT USING
columns support the same expressions as SET USING
columns, and are subject to the same restrictions.
NULL | NOT NULL
- Specifies whether the column can contain null values:
-
NULL
: Allows null values in the column. If you set this constraint on a primary key column, Vertica ignores it and sets it to NOT NULL
. -
NOT NULL
: Specifies that the column must be set to a value during insert and update operations. If the column has no default value and no value is provided, INSERT
or UPDATE
returns an error.
If you omit this constraint, the default is
NULL
for all columns except primary key columns, which Vertica always sets to NOT NULL
. External tables: If you specify
NOT NULL
and the column contains null values, queries are liable to return errors or generate unexpected behavior. Specify NOT NULL
for an external table column only if you are sure that the column does not contain nulls. -
PRIMARY KEY
- Identifies this column as the table's primary key.
REFERENCES
- Identifies this column as a foreign key:
REFERENCES table [column]
where
column
is the primary key in table
. If you omit column
, Vertica references the primary key in table
.
UNIQUE
- Requires column data to be unique with respect to all table rows.
Privileges
Table owner or user WITH GRANT OPTION is grantor.
-
REFERENCES privilege on table to create foreign key constraints that reference this table
-
USAGE privilege on schema that contains the table
Enforcing constraints
The following constraints can be qualified with the keyword ENABLED
or DISABLED
:
-
PRIMARY KEY
-
UNIQUE
-
CHECK
If you omit ENABLED
or DISABLED
, Vertica determines whether to enable the constraint automatically by checking the appropriate configuration parameter:
-
EnableNewPrimaryKeysByDefault
-
EnableNewUniqueKeysByDefault
-
EnableNewCheckConstraintsByDefault
For details, see Constraint enforcement.
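For example, a brief sketch that names and explicitly enables column constraints (table and constraint names hypothetical):
=> CREATE TABLE public.orders2 ( -- hypothetical table
     order_id INT CONSTRAINT pk_orders2 PRIMARY KEY ENABLED,
     quantity INT CONSTRAINT chk_qty CHECK (quantity > 0) ENABLED
   );
CREATE TABLE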
30.2 - Column-definition
Specifies the name, data type, and constraints to be applied to a column.
Syntax
column-name data-type
[ column-constraint ][...]
[ ENCODING encoding-type ]
[ ACCESSRANK integer ]
Parameters
column-name
- The name of a column to be created or added.
data-type
- A Vertica-supported data type.
Tip
When specifying the maximum column width in a CREATE TABLE statement, use the width in bytes (octets) for any of the string types. Each UTF-8 character might require four bytes, but European languages generally require a little over one byte per character, while East Asian languages generally require a little under three bytes per character.
- column-constraint
- A constraint type that Vertica supports—for example, NOT NULL or UNIQUE. For general information, see Constraints.
ENCODING
encoding-type
The column encoding type, by default set to AUTO.
-
ACCESSRANK integer
Overrides the default access rank for a column. Use this parameter to increase or decrease the speed at which Vertica accesses a column. For more information, see Overriding Default Column Ranking.
Examples
The following example creates a table named Employee_Dimension
and its associated superprojection in the public
schema. The Employee_key
column is designated as a primary key, and RLE encoding is specified for the Employee_gender
column definition:
=> CREATE TABLE public.Employee_Dimension (
Employee_key integer PRIMARY KEY NOT NULL,
Employee_gender varchar(8) ENCODING RLE,
Courtesy_title varchar(8),
Employee_first_name varchar(64),
Employee_middle_initial varchar(8),
Employee_last_name varchar(64)
);
30.3 - Column-name-list
Used to rename columns when creating a table or temporary table from a query; also used to specify the column's encoding type and access rank.
Syntax
column-name-list
[ ENCODING encoding-type ]
[ ACCESSRANK integer ]
[ GROUPED ( column-reference[,...] ) ]
Parameters
column-name
- Specifies the new name for the column.
ENCODING
encoding-type
- Specifies the type of encoding to use on the column. The default encoding type is
AUTO
.
ACCESSRANK
integer
- Overrides the default access rank for a column, useful for prioritizing access to a column. See Prioritizing column access speed.
GROUPED
- Groups two or more columns. For detailed information, see GROUPED clause.
Requirements
-
A column in the list cannot specify the column's data type or any constraint. These are derived from the queried table.
-
If the query output has expressions other than simple columns (for example, constants or functions), then an alias must be specified for that expression, or the column name list must include all queried columns.
-
CREATE TABLE can specify encoding types and access ranks in the column name list or the query's ENCODED BY clause, but not in both. For example, the following CREATE TABLE statement sets encoding and access rank on two columns in the column name list:
=> CREATE TABLE promo1 (state ENCODING RLE ACCESSRANK 1, zip ENCODING RLE,...) AS SELECT * FROM customer_dimension ORDER BY customer_state;
The next statement specifies the same encoding and access rank in the query's ENCODED BY clause.
=> CREATE TABLE promo2 AS SELECT * FROM customer_dimension ORDER BY customer_state ENCODED BY customer_state ENCODING RLE ACCESSRANK 1, customer_zip ENCODING RLE;
30.4 - Partition clause
Specifies partitioning of table data, through a PARTITION BY clause in the table definition:
PARTITION BY partition-expression [ GROUP BY group-expression ] [ active-partition-count-expr ]
PARTITION BY
partition-expression
- For each table row, resolves to a partition key that is derived from one or more table columns.
Caution
Avoid partitioning tables on LONG VARBINARY and LONG VARCHAR columns. Doing so can adversely impact performance. GROUP BY
group-expression
- For each table row, resolves to a partition group key that is derived from the partition key. Vertica uses group keys to merge partitions into separate partition groups. GROUP BY must use the same expression as PARTITION BY. For example:
...PARTITION BY (i+j) GROUP BY (CASE WHEN (i+j) < 5 THEN 1 WHEN (i+j) < 10 THEN 2 ELSE 3 END);
For details on partitioning table data by groups, see Partition grouping and Hierarchical partitioning. A hierarchical grouping sketch follows this parameter list.
active-partition-count-expr
- Specifies how many partitions are active for this table, specified as follows:
-
In partition clause of CREATE TABLE:
ACTIVEPARTITIONCOUNT integer
-
In partition clause of ALTER TABLE:
SET ACTIVEPARTITIONCOUNT integer
This setting supersedes configuration parameter ActivePartitionCount. For details on usage, see Active and inactive partitions.
-
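The following sketch shows the hierarchical grouping mentioned above, applying the CALENDAR_HIERARCHY_DAY function to the store_orders table defined in the examples below; treat it as illustrative:
=> ALTER TABLE public.store_orders
   PARTITION BY order_date::DATE
   GROUP BY CALENDAR_HIERARCHY_DAY(order_date::DATE, 2, 2) REORGANIZE;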
Partitioning requirements and restrictions
PARTITION BY expressions can specify leaf expressions, functions, and operators. The following requirements and restrictions apply:
- All table projections must include all columns referenced in the expression; otherwise, Vertica cannot resolve the expression.
- The expression can reference multiple columns, but it must resolve to a single non-null value for each row.
Note
You can avoid null-related errors with the function ZEROIFNULL. This function can check a PARTITION BY expression for null values and evaluate them to 0. For example:
CREATE TABLE t1 (a int, b int) PARTITION BY (ZEROIFNULL(a));
CREATE TABLE
- All leaf expressions must be constants or table columns.
- All other expressions must be functions and operators. The following restrictions apply to functions:
  - They must be immutable—that is, they return the same value regardless of time and locale and other session- or environment-specific conditions.
  - They cannot be aggregate functions.
  - They cannot be Vertica meta-functions.
- The expression cannot include queries.
- The expression cannot include user-defined data types such as Geometry.
GROUP BY expressions do not support modulo (%) operations.
Examples
The following statements create the store_orders
table and load data into it. The CREATE TABLE statement includes a simple partition clause that specifies to partition data by year:
=> CREATE TABLE public.store_orders
(
order_no int,
order_date timestamp NOT NULL,
shipper varchar(20),
ship_date date
)
UNSEGMENTED ALL NODES
PARTITION BY YEAR(order_date);
CREATE TABLE
=> COPY store_orders FROM '/home/dbadmin/export_store_orders_data.txt';
41834
As COPY loads the new table data into ROS storage, the Tuple Mover executes the table's partition clause by dividing orders for each year into separate partitions, and consolidating these partitions in ROS containers.
In this case, the Tuple Mover creates four partition keys for the loaded data—2017, 2016, 2015, and 2014—and divides the data into separate ROS containers accordingly:
=> SELECT dump_table_partition_keys('store_orders');
... Partition keys on node v_vmart_node0001
Projection 'store_orders_super'
Storage [ROS container]
No of partition keys: 1
Partition keys: 2017
Storage [ROS container]
No of partition keys: 1
Partition keys: 2016
Storage [ROS container]
No of partition keys: 1
Partition keys: 2015
Storage [ROS container]
No of partition keys: 1
Partition keys: 2014
Partition keys on node v_vmart_node0002
Projection 'store_orders_super'
Storage [ROS container]
No of partition keys: 1
Partition keys: 2017
...
(1 row)
As new data is loaded into store_orders
, the Tuple Mover merges it into the appropriate partitions, creating partition keys as needed for new years.
See also
Partitioning tables
30.5 - Table-constraint
Adds a constraint to table metadata. You can specify table constraints with
CREATE TABLE
, or add a constraint to an existing table with
ALTER TABLE
. For details, see Setting constraints.
Note
Adding a constraint to a table that is referenced in a view does not affect the view.
Syntax
[ CONSTRAINT constraint-name ]
{
... PRIMARY KEY (column[,... ]) [ ENABLED | DISABLED ]
... | FOREIGN KEY (column[,... ] ) REFERENCES table [ (column[,...]) ]
... | UNIQUE (column[,...]) [ ENABLED | DISABLED ]
... | CHECK (expression) [ ENABLED | DISABLED ]
}
Parameters
CONSTRAINT
constraint-name
- Assigns a name to the constraint. Vertica recommends that you name all constraints.
PRIMARY KEY
- Defines one or more
NOT NULL
columns as the primary key as follows:PRIMARY KEY (column[,...]) [ ENABLED | DISABLED]
You can qualify this constraint with the keyword
ENABLED
or DISABLED
. See Enforcing Constraints below. If you do not name a primary key constraint, Vertica assigns the name
C_PRIMARY
. FOREIGN KEY
- Adds a referential integrity constraint defining one or more columns as foreign keys as follows:
FOREIGN KEY (column[,... ]) REFERENCES table [(column[,... ])]
If you omit
column
, Vertica references the primary key in table
. If you do not name a foreign key constraint, Vertica assigns the name
C_FOREIGN
.Important
Adding a foreign key constraint requires the following privileges (in addition to privileges also required by ALTER TABLE):
-
REFERENCES on the referenced table
-
USAGE on the schema of the referenced table
-
UNIQUE
- Specifies that the data in a column or group of columns is unique with respect to all table rows, as follows:
UNIQUE (column[,...]) [ENABLED | DISABLED]
You can qualify this constraint with the keyword
ENABLED
or DISABLED
. See Enforcing Constraints below. If you do not name a unique constraint, Vertica assigns the name
C_UNIQUE
. CHECK
- Specifies a check condition as an expression that returns a Boolean value, as follows:
CHECK (expression) [ENABLED | DISABLED]
You can qualify this constraint with the keyword
ENABLED
or DISABLED
. See Enforcing Constraints below. If you do not name a check constraint, Vertica assigns the name
C_CHECK
.
Privileges
Non-superusers: table owner, or the following privileges:
-
USAGE on schema
-
ALTER on table
-
SELECT on table to enable or disable constraint enforcement
Enforcing constraints
A table can specify whether Vertica automatically enforces a primary key, unique key or check constraint with the keyword ENABLED
or DISABLED
. If you omit ENABLED
or DISABLED
, Vertica determines whether to enable the constraint automatically by checking the appropriate configuration parameter:
-
EnableNewPrimaryKeysByDefault
-
EnableNewUniqueKeysByDefault
-
EnableNewCheckConstraintsByDefault
For details, see Constraint enforcement.
Examples
The following example creates a table (t01
) with a primary key constraint.
CREATE TABLE t01 (id int CONSTRAINT sampleconstraint PRIMARY KEY);
CREATE TABLE
This example creates the same table without the constraint, and then adds the constraint with ALTER TABLE ADD CONSTRAINT:
CREATE TABLE t01 (id int);
CREATE TABLE
ALTER TABLE t01 ADD CONSTRAINT sampleconstraint PRIMARY KEY(id);
WARNING 2623: Column "id" definition changed to NOT NULL
ALTER TABLE
The following example creates a table (addapk
) with two columns, adds a third column to the table, and then adds a primary key constraint on the third column.
=> CREATE TABLE addapk (col1 INT, col2 INT);
CREATE TABLE
=> ALTER TABLE addapk ADD COLUMN col3 INT;
ALTER TABLE
=> ALTER TABLE addapk ADD CONSTRAINT col3constraint PRIMARY KEY (col3) ENABLED;
WARNING 2623: Column "col3" definition changed to NOT NULL
ALTER TABLE
Using the sample table addapk
, check that the primary key constraint is enabled (is_enabled
is t
).
=> SELECT constraint_name, column_name, constraint_type, is_enabled FROM PRIMARY_KEYS WHERE table_name IN ('addapk');
constraint_name | column_name | constraint_type | is_enabled
-----------------+-------------+-----------------+------------
col3constraint | col3 | p | t
(1 row)
This example disables the constraint using ALTER TABLE ALTER CONSTRAINT
.
=> ALTER TABLE addapk ALTER CONSTRAINT col3constraint DISABLED;
Check that the primary key is now disabled (is_enabled
is f
).
=> SELECT constraint_name, column_name, constraint_type, is_enabled FROM PRIMARY_KEYS WHERE table_name IN ('addapk');
constraint_name | column_name | constraint_type | is_enabled
-----------------+-------------+-----------------+------------
col3constraint | col3 | p | f
(1 row)
For a general discussion of constraints, see Constraints. For additional examples of creating and naming constraints, see Naming constraints.
31 - CREATE TEMPORARY TABLE
Creates a table whose data persists only during the current session. Temporary table data is not visible to other sessions.
Syntax
Create with column definitions:
CREATE [ scope ] TEMP[ORARY] TABLE [ IF NOT EXISTS ] [[database.]schema.]table-name
( column-definition[,...] )
[ table-constraint ]
[ ON COMMIT { DELETE | PRESERVE } ROWS ]
[ NO PROJECTION ]
[ ORDER BY table-column[,...] ]
[ segmentation-spec ]
[ KSAFE [k-num] ]
[ {INCLUDE | EXCLUDE} [SCHEMA] PRIVILEGES ]
Create from another table:
CREATE TEMP[ORARY] TABLE [ IF NOT EXISTS ] [[database.]schema.]table-name
[ ( column-name-list ) ]
[ ON COMMIT { DELETE | PRESERVE } ROWS ]
AS [ /*+ LABEL */ ] [ AT epoch ] query [ ENCODED BY column-ref-list ]
Parameters
scope
- Specifies visibility of the table definition:
-
GLOBAL
: The table definition is visible to all sessions, and persists until you explicitly drop the table.
-
LOCAL
: The table definition is visible only to the session in which it is created, and is dropped when the session ends.
If no scope is specified, Vertica uses the default that is set by configuration parameter DefaultTempTableLocal.
Regardless of this setting, retention of temporary table data is set by the keywords ON COMMIT DELETE and ON COMMIT PRESERVE (see below).
For more information, see Creating temporary tables.
-
IF NOT EXISTS
If an object with the same name exists, Vertica does not create the object and proceeds without error. If you omit this option and the object exists, Vertica generates a ROLLBACK error message. In both cases, the object is not created if it already exists.
The IF NOT EXISTS clause is useful for SQL scripts where you want to create an object only if it does not already exist.
For related information, see ON_ERROR_STOP.
[
database
.]
schema
Database and schema. The default schema is public. If you specify a database, it must be the current database. If you do not specify a schema, the table is created in the default schema.
table-name
- Identifies the table to create, where table-name conforms to conventions described in Identifiers. It must also be unique among all names of sequences, tables, projections, views, and models within the same schema.
column-definition
- Defines a table column. A table can have up to 9800 columns.
table-constraint
- Adds a constraint to table metadata.
ON COMMIT
- Specifies whether data is transaction- or session-scoped:
ON COMMIT {PRESERVE | DELETE} ROWS
-
DELETE
(default) marks the temporary table for transaction-scoped data. Vertica removes all table data after each commit. -
PRESERVE
marks the temporary table for session-scoped data, which is preserved beyond the lifetime of a single transaction. Vertica removes all table data when the session ends.
-
NO PROJECTION
- Prevents Vertica from creating auto-projections for this table. A superprojection is created only when data is explicitly loaded into this table.
NO PROJECTION is invalid with the following clauses:
-
ORDER BY
-
KSAFE
-
Any segmentation clause (hash-segmentation-clause or unsegmented-clause).
-
{INCLUDE | EXCLUDE} [SCHEMA] PRIVILEGES
Default inheritance of schema privileges for this table:
-
INCLUDE PRIVILEGES specifies that the table inherits privileges that are set on its schema. This is the default behavior if privileges inheritance is enabled for the schema.
-
EXCLUDE PRIVILEGES disables inheritance of privileges from the schema.
For details, see Inherited privileges.
-
ORDER BY
table-column
[,...]
Invalid for external tables, specifies columns from the SELECT list on which to sort the superprojection that is automatically created for this table. The ORDER BY clause cannot include the qualifiers ASC or DESC. Vertica always stores projection data in ascending sort order.
If you omit the ORDER BY clause, Vertica uses the SELECT list order as the projection sort order.
segmentation-spec
Invalid for external tables, specifies how to distribute data for auto-projections of this table. Supply one of the following clauses:
-
hash-segmentation-clause: Specifies to segment data evenly and distribute across cluster nodes. Vertica recommends segmenting large tables.
-
unsegmented-clause: Specifies to create an unsegmented projection.
If this clause is omitted, Vertica generates auto-projections with default hash segmentation.
-
KSAFE [
k-num
]
Invalid for external tables, specifies K-safety of auto-projections created for this table, where k-num must be equal to or greater than system K-safety. If you omit this option, the projection uses the system K-safety level.
Eon Mode: K-safety of temporary tables is always set to 0, regardless of system K-safety. If a CREATE TEMPORARY TABLE statement sets k-num greater than 0, Vertica returns a warning.
column-name-list
Valid only when creating a table from a query (AS query), defines column names that map to the query output. If you omit this list, Vertica uses the query output column names. The names in column-name-list and the queried columns must be the same in number.
For example:
CREATE TABLE customer_occupations (name, profession) AS SELECT customer_name, occupation FROM customer_dimension;
This clause and the ENCODED BY clause are mutually exclusive. Column name lists are invalid for external tables.
AS query
Creates and loads a table from the results of a query, specified as follows:
AS [ /*+ LABEL */ ] [ AT epoch ] query
The query cannot include complex type columns.
ENCODED BY
column-ref-list
A comma-delimited list of columns from the source table, where each column is qualified by one or both of the following encoding options:
-
ACCESSRANK integer
: Overrides the default access rank for a column, useful for prioritizing access to a column. See Prioritizing column access speed.
-
ENCODING encoding-type
: Specifies the type of encoding to use on the column. The default encoding type is AUTO.
This option and column-name-list are mutually exclusive. This option is invalid for external tables.
Privileges
The following privileges are required:
-
CREATE
privileges on the table schema -
If creating a temporary table that includes a named sequence:
-
SELECT
privilege on sequence object -
USAGE
privilege on sequence schema
Restrictions
-
Queries on temporary tables are subject to the same restrictions on SQL support as persistent tables.
-
You cannot add projections to non-empty, global temporary tables (ON COMMIT PRESERVE ROWS). Make sure that projections exist before you load data. See Auto-projections.
-
While you can add projections to temporary tables that are defined with ON COMMIT DELETE ROWS, be aware that you might lose all data.
-
Mergeout operations cannot be used on session-scoped temporary data.
-
In general, session-scoped temporary table data is not visible using system (virtual) tables.
-
Temporary tables do not recover. If a node fails, queries that use the temporary table also fail. Restart the session and populate the temporary table.
Examples
See Creating temporary tables.
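For instance, the following sketch (the tmp_sales and sales tables are hypothetical) creates a session-scoped temporary table and populates it from an existing table:
=> CREATE LOCAL TEMP TABLE tmp_sales (sale_id INT, amount NUMERIC(10,2))
   ON COMMIT PRESERVE ROWS; -- session-scoped: data survives individual commits
CREATE TABLE
=> INSERT INTO tmp_sales SELECT sale_id, amount FROM sales WHERE amount > 1000;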
32 - CREATE TEXT INDEX
Creates a text index used to perform text searches. If data within a table is partitioned, then an extra column appears in the text index, showing the partition.
Syntax
CREATE TEXT INDEX [[database.]schema.]txtindex-name
ON [schema.]source-table (unique-id, text-field [, column-name,...])
[STEMMER {stemmer-name(stemmer-input-data-type)| NONE}]
[TOKENIZER tokenizer-name(tokenizer-input-data-type)];
Parameters
[
database
.]
schema
Database and schema. The default schema is public. If you specify a database, it must be the current database. If you do not specify a schema, the text index is created in the default schema.
txtindex-name
- The text index name.
source-table
- The source table to index.
unique-id
- The name of the column in the source table that contains a unique identifier. Any data type is permissible. The column must be the primary key in the source table.
text-field
- The name of the column in the source table that contains the text field. Valid data types are:
-
CHAR
-
VARCHAR
-
LONG VARCHAR
-
VARBINARY
-
LONG VARBINARY
Nulls are allowed.
column-name
- The name of a column or columns to be included as additional columns.
stemmer-name
- The name of the stemmer.
stemmer-input-data-type
- The input data type of the stemmer-name function.
tokenizer-name
- Specifies the name of the tokenizer.
tokenizer-input-data-type
- The input data type of the tokenizer-name function. It can accept any number of arguments. If a Vertica tokenizer is used, this parameter can be omitted.
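For example, a sketch that disables stemming when creating an index (t_log is the sample table from the examples below):
=> CREATE TEXT INDEX t_log_nostem ON t_log (id, text) STEMMER NONE; -- no stemmer applied to tokens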
Privileges
The index automatically inherits the query permissions of its parent table. The table owner and users with the DBADMIN role can create and modify the index.
Important
Do not alter the contents or definitions of the text index. If the contents or definitions are altered, the results will no longer match the source table.
Requirements
-
The source table must have a column with a unique identifier set as the primary key.
-
The source table must have an associated projection, and must be both sorted and segmented by the primary key.
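For instance, a minimal sketch of a source table that meets these requirements (the column sizes are hypothetical):
=> CREATE TABLE t_log (id INT PRIMARY KEY, text VARCHAR(2000), day_of_week VARCHAR(12))
   ORDER BY id SEGMENTED BY HASH(id) ALL NODES; -- sorted and segmented by the primary key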
Examples
The following example shows how to create a text index with an additional unindexed column on the table t_log using the CREATE TEXT INDEX statement:
=> CREATE TEXT INDEX t_log_index ON t_log (id, text, day_of_week);
CREATE INDEX
=> SELECT * FROM t_log_index;
token | doc_id | day_of_week
-----------------------+--------+-------------
'catalog | 1 | Monday
'dbadmin' | 2 | Monday
2014-06-04 | 1 | Monday
2014-06-04 | 2 | Monday
2014-06-04 | 3 | Monday
2014-06-04 | 4 | Monday
2014-06-04 | 5 | Monday
2014-06-04 | 6 | Monday
2014-06-04 | 7 | Monday
2014-06-04 | 8 | Monday
45035996273704966 | 3 | Tuesday
45035996273704968 | 4 | Tuesday
<INFO> | 1 | Tuesday
<INFO> | 6 | Tuesday
<INFO> | 7 | Tuesday
<INFO> | 8 | Tuesday
<WARNING> | 2 | Tuesday
<WARNING> | 3 | Tuesday
<WARNING> | 4 | Tuesday
<WARNING> | 5 | Tuesday
...
(97 rows)
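One way to use the index (a sketch; the token pattern depends on your data) is to join doc_id back to the source table:
=> SELECT * FROM t_log WHERE id IN
   (SELECT doc_id FROM t_log_index WHERE token ILIKE '%WARNING%'); -- fetch source rows whose text matched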
The following example shows a text index, tpart_index, created from a partitioned source table:
=> SELECT * FROM tpart_index;
token | doc_id | partition
------------------------+--------+-----------
0 | 4 | 2014
0 | 5 | 2014
11:00:49.568 | 4 | 2014
11:00:49.568 | 5 | 2014
11:00:49.569 | 6 | 2014
<INFO> | 6 | 2014
<WARNING> | 4 | 2014
<WARNING> | 5 | 2014
Database | 6 | 2014
Execute: | 6 | 2014
Object | 4 | 2014
Object | 5 | 2014
[Catalog] | 4 | 2014
[Catalog] | 5 | 2014
'catalog | 1 | 2013
'dbadmin' | 2 | 2013
0 | 3 | 2013
11:00:49.568 | 1 | 2013
11:00:49.568 | 2 | 2013
11:00:49.568 | 3 | 2013
11:00:49.570 | 7 | 2013
11:00:49.571 | 8 | 2013
45035996273704966 | 3 | 2013
...
(89 rows)
33 - CREATE USER
Adds a name to the list of authorized database users.
Note
New users lack default access to schema PUBLIC. Be sure to grant new users USAGE privileges on the PUBLIC schema.
Syntax
CREATE USER user-name [ account-parameter value[,...] ]
Parameters
user-name
- Name of the new user, where user-name conforms to conventions described in Identifiers.
account-parameter value
- One or more user account parameter settings (see below).
User account parameters
Specify one or more user account parameters as a comma-delimited list:
account-parameter setting[,...]
Parameter | Settings
---|---
ACCOUNT | Locks or unlocks user access to the database: ACCOUNT LOCK or ACCOUNT UNLOCK. Tip: To automate account locking, set a maximum number of failed login attempts with CREATE PROFILE.
GRACEPERIOD | Specifies how long a user query can block on any session socket: NONE (no grace period) or an interval such as '5 minutes'. For details, see Handling session socket blocking.
IDENTIFIED BY | Sets the user's password: IDENTIFIED BY 'password'. Important: If you omit this parameter, this user can access the database with no password. For details, see Password guidelines and Creating a database name and password.
IDLESESSIONTIMEOUT | The length of time the system waits before disconnecting an idle session: NONE or an interval. For details, see Managing client connections.
MAXCONNECTIONS | Sets the maximum number of connections the user can have to the server: NONE (no limit) or an integer, scoped to the database or to each node. For details, see Managing client connections.
MEMORYCAP | Sets how much memory can be allocated to user requests: NONE (no limit) or a size, for example '2G'.
PASSWORD EXPIRE | Forces immediate expiration of the user's password. The user must change the password on the next login.
PROFILE | Assigns a profile that controls password requirements for this user: a profile name or DEFAULT. If you omit this parameter, the user is assigned the default profile.
RESOURCE POOL | Assigns a default resource pool to this user. The user must also be granted privileges to this pool, unless privileges to the pool are set to PUBLIC.
RUNTIMECAP | Sets how long this user's queries can execute: NONE (no limit) or an interval. A query's runtime limit can be set at three levels: the user's runtime limit, the user's resource pool, and the session setting. For more information, see Setting a runtime limit for queries.
SEARCH_PATH | Specifies the user's default search path, which tells Vertica which schemas to search for unqualified references to tables and UDFs: DEFAULT or a comma-delimited list of schemas. For details, see Setting search paths.
TEMPSPACECAP | Sets how much temporary file storage is available for user requests: NONE (no limit) or a size.
Privileges
Superuser
User name best practices
Vertica database user names are logically separate from user names of the operating system in which the server runs. If all the users of a particular server also have accounts on the server's machine, it makes sense to assign database user names that match their operating system user names. However, a server that accepts remote connections might have many database users with no local operating system account. In this case, there is no need to connect database and system user names.
Examples
=> CREATE USER Fred IDENTIFIED BY 'Mxyzptlk';
=> GRANT USAGE ON SCHEMA PUBLIC to Fred;
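A sketch that sets several account parameters at once (the profile name restricted_profile is hypothetical and must already exist):
=> CREATE USER Sam IDENTIFIED BY 'S@mPa55', RUNTIMECAP '10 minutes', PROFILE restricted_profile;
=> GRANT USAGE ON SCHEMA PUBLIC to Sam;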
34 - CREATE VIEW
Defines a view. Views are read only, so they do not support insert, update, delete, or copy operations.
Syntax
CREATE [ OR REPLACE ] VIEW [[database.]schema.]view [ (column[,...]) ]
[ {INCLUDE|EXCLUDE} [SCHEMA] PRIVILEGES ] AS query
Parameters
OR REPLACE
- Specifies to overwrite the existing view view-name. If you omit this option and view-name already exists, CREATE VIEW returns an error.
Any grants assigned to the view before you execute a CREATE OR REPLACE remain on the updated view. See GRANT (view).
[
database
.]
schema
Database and schema. The default schema is
public
. If you specify a database, it must be the current database.view
- Identifies the view to create, where view conforms to conventions described in Identifiers. It must also be unique among all names of sequences, tables, projections, views, and models within the same schema.
column[,...]
- List of up to 9800 names to use as view column names. Vertica maps view column names to query columns according to the order of their respective lists. By default, the view uses column names as they are specified in the query.
query
- A SELECT statement that the view executes. The SELECT statement can reference tables, temporary tables, and other views.
{INCLUDE | EXCLUDE} [SCHEMA] PRIVILEGES
- Specifies whether this view inherits schema privileges:
-
INCLUDE PRIVILEGES
specifies that the view inherits privileges that are set on its schema. This is the default behavior if privileges inheritance is enabled for the schema. -
EXCLUDE PRIVILEGES
disables inheritance of privileges from the schema.
For details, see Inherited privileges.
Privileges
See Creating views.
Examples
The following example shows how to create a view that contains data from multiple tables.
=> CREATE VIEW temp_t0 AS SELECT * from t0_p1 UNION ALL
SELECT * from t0_p2 UNION ALL
SELECT * from t0_p3 UNION ALL
SELECT * from t0_p4 UNION ALL
SELECT * from t0_p5;
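To redefine the view later while keeping its grants, a sketch using OR REPLACE (assuming the same source tables):
=> CREATE OR REPLACE VIEW temp_t0 AS SELECT * from t0_p1 UNION ALL
   SELECT * from t0_p2;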