CREATE statements let you create new database objects such as tables and users.
CREATE statements
- 1: CREATE ACCESS POLICY
- 2: CREATE AUTHENTICATION
- 3: CREATE CA BUNDLE
- 4: CREATE CERTIFICATE
- 5: CREATE DIRECTED QUERY
- 6: CREATE EXTERNAL TABLE AS COPY
- 7: CREATE FAULT GROUP
- 8: CREATE FLEXIBLE EXTERNAL TABLE AS COPY
- 9: CREATE FLEXIBLE TABLE
- 10: CREATE FUNCTION statements
- 10.1: CREATE AGGREGATE FUNCTION
- 10.2: CREATE ANALYTIC FUNCTION
- 10.3: CREATE FILTER
- 10.4: CREATE FUNCTION (scalar)
- 10.5: CREATE FUNCTION (SQL)
- 10.6: CREATE PARSER
- 10.7: CREATE SOURCE
- 10.8: CREATE TRANSFORM FUNCTION
- 11: CREATE HCATALOG SCHEMA
- 12: CREATE KEY
- 13: CREATE LIBRARY
- 14: CREATE LOAD BALANCE GROUP
- 15: CREATE LOCAL TEMPORARY VIEW
- 16: CREATE LOCATION
- 17: CREATE NETWORK ADDRESS
- 18: CREATE NETWORK INTERFACE
- 19: CREATE NOTIFIER
- 20: CREATE PROCEDURE (external)
- 21: CREATE PROCEDURE (stored)
- 22: CREATE PROFILE
- 23: CREATE PROJECTION
- 23.1: Encoding types
- 23.2: GROUPED clause
- 23.3: Hash segmentation clause
- 23.4: Unsegmented clause
- 24: CREATE RESOURCE POOL
- 24.1: Built-in pools
- 24.2: Built-in resource pools configuration
- 25: CREATE ROLE
- 26: CREATE ROUTING RULE
- 27: CREATE SCHEMA
- 28: CREATE SEQUENCE
- 29: CREATE SUBNET
- 30: CREATE TABLE
- 30.1: Column-constraint
- 30.2: Column-definition
- 30.3: Column-name-list
- 30.4: Partition clause
- 30.5: Table-constraint
- 31: CREATE TEMPORARY TABLE
- 32: CREATE TEXT INDEX
- 33: CREATE USER
- 34: CREATE VIEW
1 - CREATE ACCESS POLICY
Creates an access policy that filters user and role access to table data. You can create access policies for table rows and columns. Vertica applies the access policy filters with each query and returns only the data that the current user or role is permitted to see.
You cannot set access policies on columns of complex data types other than native arrays. If the table contains complex-type columns, you can still set row access policies and column access policies on other columns.
Syntax
CREATE ACCESS POLICY ON [[database.]schema.]table
{ FOR COLUMN column | FOR ROWS WHERE } expression [GRANT TRUSTED] { ENABLE | DISABLE }
Parameters
[database.]schema
Database and schema. The default schema is public. If you specify a database, it must be the current database.
table
- The table with the target column or rows.
FOR COLUMN column
- The column on which to apply this access policy. The column can be a native array, but other complex types are not supported. (See Complex types.)
FOR ROWS WHERE
- The rows on which to apply this access policy.
expression
- A SQL expression that specifies conditions for accessing row or column data:
  - Row access policies limit access to specific rows in a table, as specified by the policy's WHERE expression. Only rows that satisfy this expression are fetched from the table. For details and sample usage, see Creating row access policies.
  - Column access policies limit access to specific table columns. The access policy expression can also specify how to render column data to specific users and roles. For details and sample usage, see Creating column access policies.
GRANT TRUSTED
Specifies that GRANT statements take precedence over the access policy in determining whether users can perform DML operations on the target table. If omitted, users can only modify table data if the access policy allows them to see the stored data in its original, unaltered state. For more information, see Access policies and DML operations.
Important
GRANT TRUSTED only affects DML operations and does not enable users to see data that the access policy would otherwise mask. Specifying this option may allow users with certain grants to update data that they cannot see.
ENABLE | DISABLE
- Whether to enable the access policy. You can enable and disable existing access policies with ALTER ACCESS POLICY.
Privileges
Non-superuser: Ownership of the table
Restrictions
The following limitations apply to access policies:
- A column can have only one access policy.
- Column access policies cannot be set on columns of complex types other than native arrays.
- Column access policies cannot be set for materialized columns on flex tables. While it is possible to set an access policy for the __raw__ column, doing so restricts access to the whole table.
- Row access policies are invalid on temporary tables and tables with aggregate projections.
- Access policy expressions cannot contain:
  - Subqueries
  - Aggregate functions
  - Analytic functions
  - User-defined transform functions (UDTFs)
- If the query optimizer cannot replace a deterministic expression that involves only constants with their computed values, it blocks all DML operations such as INSERT.
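For illustration, the following sketch creates and enables a column access policy on a hypothetical customers table; the table, column, and role names are examples only:
=> CREATE ACCESS POLICY ON public.customers
   FOR COLUMN ssn
   CASE WHEN ENABLED_ROLE('hr_admin') THEN ssn ELSE SUBSTR(ssn, 8, 4) END
   ENABLE;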
2 - CREATE AUTHENTICATION
Creates and enables an authentication method associated with users or roles. Vertica enables the authentication method automatically.
Syntax
CREATE AUTHENTICATION auth-method-name METHOD 'auth-type' access-method
Parameters
Name | Description |
---|---|
auth-method-name |
Name of the authentication method to create, where auth-method-name conforms to conventions described in Identifiers. |
auth-type |
The authentication method type to use. For details, see Supported Client Authentication Methods. |
access-method |
The access method the client uses to connect. |
Privileges
DBADMIN
Examples
See Creating authentication records.
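For illustration, a minimal sketch that creates an LDAP authentication method for clients connecting from a private subnet; the method name and address range are examples only:
=> CREATE AUTHENTICATION v_ldap METHOD 'ldap' HOST '10.0.0.0/8';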
3 - CREATE CA BUNDLE
Creates a certificate authority (CA) bundle. These contain root CA certificates.
Syntax
CREATE CA BUNDLE name [CERTIFICATES ca_cert[, ca_cert[, ...]]]
Parameters
name
- The name of the CA bundle.
ca_cert
- The name of the CA certificate. If no certificates are specified, the bundle will be empty.
Privileges
Ownership of the CA certificates in the CA bundle.
Examples
See Managing CA bundles.
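For illustration, a sketch that bundles two previously created CA certificates; the certificate names are examples only:
=> CREATE CA BUNDLE ca_bundle CERTIFICATES root_ca, intermediate_ca;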
4 - CREATE CERTIFICATE
Creates or imports a certificate, Certificate Authority (CA), or intermediate CA. These certificates can be used with ALTER TLS CONFIGURATION to set up client-server TLS, LDAPLink TLS, LDAPAuth TLS, and internode TLS.
CREATE CERTIFICATE generates x509v3 certificates.
Syntax
CREATE [TEMP[ORARY]] [CA] CERTIFICATE certificate_name
{AS cert [KEY key_name]
| SUBJECT subject
[ SIGNED BY ca_cert ]
[ VALID FOR days ]
[ EXTENSIONS ext = val[,...] ]
[ KEY private_key ]}
Parameters
TEMPORARY
- Create with session scope. The key is stored in memory and is valid only for the current session.
CA
- Designates the certificate as a CA or intermediate certificate. If omitted, the operation creates a normal certificate.
certificate_name
- The name of the certificate.
AS cert
- The imported certificate (string).
This parameter should include the entire chain of certificates, excluding the CA certificate.
KEY key_name
- The name of the key.
This parameter only needs to be set for client/server certificates and CA certificates that you intend to sign other certificates with in Vertica. If your imported CA certificate will only be used for validating other certificates, you do not need to specify a key.
SUBJECT subject
- The entity to issue the certificate to (string).
SIGNED BY ca_cert
- The name of the CA that signed the certificate.
When adding a CA certificate, this parameter is optional. Specifying it will create an intermediate CA that cannot be used to sign other CA certificates.
When creating a certificate, this parameter is required.
VALID FOR days
- The number of days that the certificate is valid.
EXTENSIONS ext=val
- Strings specifying certificate extensions. For a full list of extensions, see the OpenSSL documentation.
KEY private_key
- The name of the certificate's private key.
When importing a certificate, this parameter is required.
Privileges
Superuser
Default extensions
CREATE CERTIFICATE generates x509v3 certificates and includes several extensions by default. These differ based on the type of certificate you create:
CA Certificate:
- 'basicConstraints' = 'critical, CA:true'
- 'keyUsage' = 'critical, digitalSignature, keyCertSign'
- 'nsComment' = 'Vertica generated [CA] certificate'
- 'subjectKeyIdentifier' = 'hash'
Certificate:
- 'basicConstraints' = 'CA:false'
- 'keyUsage' = 'critical, digitalSignature, keyEncipherment'
Examples
See Generating TLS certificates and keys.
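For illustration, a sketch that creates a self-signed CA certificate and a server certificate signed by it; the names and subject strings are placeholders, and each named KEY must already exist (see CREATE KEY):
=> CREATE CA CERTIFICATE root_ca
   SUBJECT '/C=US/ST=MA/L=Cambridge/O=Example/CN=Example Root CA'
   VALID FOR 3650
   KEY root_key;
=> CREATE CERTIFICATE server_cert
   SUBJECT '/C=US/ST=MA/L=Cambridge/O=Example/CN=*.example.com'
   SIGNED BY root_ca
   VALID FOR 365
   KEY server_key;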
5 - CREATE DIRECTED QUERY
Saves an association between an input query and a query that is annotated with optimizer hints.
Syntax
Optimizer-generated
CREATE DIRECTED QUERY OPT[IMIZER] directedqueryID [COMMENT 'comments'] input-query
User-defined (custom)
CREATE DIRECTED QUERY CUSTOM directedqueryID [COMMENT 'comments'] annotated-query
Parameters
OPT[IMIZER]
- Directs the query optimizer to generate an annotated query from input-query, and associate both in the new directed query.
CUSTOM
- Specifies to associate annotated-query with the query previously specified by SAVE QUERY.
directedqueryID
- A unique identifier for the directed query, a string that conforms to conventions described in Identifiers.
COMMENT 'comments'
- Comments about the directed query, up to 128 characters. Comments can be useful for future reference—for example, to explain why a given directed query was created.
If you omit this argument, Vertica inserts one of the following comments:
  - Optimizer-generated directed query
  - Custom directed query
input-query
- The input query to associate with an optimizer-generated directed query. The input query supports only one optimizer hint, :v (alias IGNORECONST).
annotated-query
- A query with embedded optimizer hints to associate with the input query most recently saved with SAVE QUERY.
Privileges
Superuser
Description
CREATE DIRECTED QUERY associates an input query with a query annotated with optimizer hints. It stores the association under a unique identifier. CREATE DIRECTED QUERY has two variants:
-
CREATE DIRECTED QUERY OPTIMIZER directs the query optimizer to generate annotated SQL from the specified input query. The annotated query contains hints that the optimizer can use to recreate its current query plan for that input query.
-
CREATE DIRECTED QUERY CUSTOM specifies an annotated query supplied by the user. Vertica associates the annotated query with the input query specified by the last SAVE QUERY statement.
In both cases, Vertica associates the annotated query and input query, and registers their association in the system table DIRECTED_QUERIES under query_name.
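For illustration, a hypothetical optimizer-generated directed query; the identifier and the employee_dimension table are examples only:
=> CREATE DIRECTED QUERY OPTIMIZER findBostonEmployees
   SELECT employee_first_name, employee_last_name
   FROM employee_dimension
   WHERE employee_city = 'Boston';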
Caution
Vertica associates a saved query and directed query without checking whether the two are compatible. Be careful to sequence SAVE QUERY and CREATE DIRECTED QUERY CUSTOM so the saved and directed queries are correctly matched.
See also
Creating directed queries
6 - CREATE EXTERNAL TABLE AS COPY
CREATE EXTERNAL TABLE AS COPY creates a table definition for data external to your Vertica database. This statement is a combination of the CREATE TABLE and COPY statements, supporting a subset of each statement's parameters.
Canceling a CREATE EXTERNAL TABLE AS COPY statement can cause unpredictable results. If you need to make a change, allow the statement to complete, drop the table, and then retry.
You can use ALTER TABLE to change the data types of columns instead of dropping and recreating the table.
You can use CREATE EXTERNAL TABLE AS COPY with any types except types from the Place package.
Note
Vertica does not create superprojections for external tables, since external tables are not stored in the database.
Syntax
CREATE EXTERNAL TABLE [ IF NOT EXISTS ] [[database.]schema.]table-name
( column-definition[,...] )
[{INCLUDE | EXCLUDE} [SCHEMA] PRIVILEGES]
AS COPY
[ ( { column-as-expression | column }
[ DELIMITER [ AS ] 'char' ]
[ ENCLOSED [ BY ] 'char' ]
[ ENFORCELENGTH ]
[ ESCAPE [ AS ] 'char' | NO ESCAPE ]
[ FILLER datatype ]
[ FORMAT 'format' ]
[ NULL [ AS ] 'string' ]
[ TRIM 'byte' ]
[,...] ) ]
[ COLUMN OPTION ( column
[ DELIMITER [ AS ] 'char' ]
[ ENCLOSED [ BY ] 'char' ]
[ ENFORCELENGTH ]
[ ESCAPE [ AS ] 'char' | NO ESCAPE ]
[ FORMAT 'format' ]
[ NULL [ AS ] 'string' ]
[ TRIM 'byte' ]
[,...] ) ]
FROM {
{ 'path-to-data' [ ON { nodename | (nodeset) | ANY NODE } ] [ input-format ] }[,...]
| [ WITH ] SOURCE source( [arg=value[,...] ] )
}
[ NATIVE
| FIXEDWIDTH COLSIZES {( integer )[,...]}
| NATIVE VARCHAR
| ORC
| PARQUET
]
[ ABORT ON ERROR ]
[ DELIMITER [ AS ] 'char' ]
[ ENCLOSED BY 'char' [ AND 'char' ] ]
[ ENFORCELENGTH ]
[ ERROR TOLERANCE ]
[ ESCAPE AS 'char' | NO ESCAPE ]
[ EXCEPTIONS 'path' [ ON nodename ] [,...] ]
[ [ WITH ] FILTER filter( [ arg=value[,...] ] ) ]
[ NULL [ AS ] 'string' ]
[ [ WITH ] PARSER parser([arg=value [,...] ]) ]
[ RECORD TERMINATOR 'string' ]
[ REJECTED DATA 'path' [ ON nodename ] [,...] ]
[ REJECTMAX integer ]
[ SKIP integer ]
[ SKIP BYTES integer ]
[ TRAILING NULLCOLS ]
[ TRIM 'byte' ]
Parameters
For all supported parameters, see the CREATE TABLE and COPY statements. For information on using this statement with UDLs, see User-defined load (UDL).
For additional guidance on using COPY parameters, see Specifying where to load data from.
Privileges
Superuser, or non-superuser with the following privileges:
-
READ privileges on the USER-accessible storage location, see GRANT (storage location)
-
Full access (including SELECT) to an external table that the user has privileges to create
ORC and Parquet data
When using the ORC and Parquet formats, Vertica supports some additional options in the COPY
statement and data structures for columns. See ORC (parser) and PARQUET (parser).
If ORC or Parquet data is partitioned, Vertica expects Hive-style partitioning. If you see unexpected results when reading data, verify that globs in your file paths correctly align with the partition structure. See Troubleshooting external tables.
Examples
The following example defines an external table for delimited data stored in HDFS:
=> CREATE EXTERNAL TABLE sales (itemID INT, date DATE, price FLOAT)
AS COPY FROM 'hdfs:///dat/ext1.csv' DELIMITER ',';
The following example uses data in the ORC format that is stored in S3. The data has two partition columns, represented as directories in the file structure. For more information about partitions, see Using partition columns.
=> CREATE EXTERNAL TABLE transactions (id int, name varchar(50), created date, region varchar(50))
AS COPY FROM 's3://datalake/sales/*/*/*'
ORC(hive_partition_cols='created,region');
The following example shows how you can read from all Parquet files in a local directory, with no partitions and no globs:
=> CREATE EXTERNAL TABLE sales (itemID INT, date DATE, price FLOAT)
AS COPY FROM '/data/sales/*.parquet' PARQUET;
The following example creates an external table using partitioned data in the ORC format. The table includes four columns. Two columns, "id" and "name", are in the data files. The other two, "created" and "region", are partition columns. For more about partition columns, see Using partition columns.
=> CREATE EXTERNAL TABLE t (id int, name varchar(50), created date, region varchar(50))
AS COPY FROM 'hdfs:///path/*/*/*'
ORC(hive_partition_cols='created,region');
The following example creates an external table from data in Google Cloud Storage:
=> CREATE EXTERNAL TABLE sales (itemID INT, date DATE, price FLOAT)
AS COPY FROM 'gs://data/sales/*.csv';
The following example creates an external table for data containing arrays:
=> CREATE EXTERNAL TABLE cust (cust_custkey int, cust_custname varchar(50), cust_custstaddress ARRAY[varchar(100)],
cust_custaddressln2 ARRAY[varchar(100)], cust_custcity ARRAY[varchar(50)], cust_custstate ARRAY[char(2)], cust_custzip ARRAY[int],
cust_email varchar(50), cust_phone varchar(30))
AS COPY FROM ' ' PARQUET;
The following examples create external tables from data in the local file system:
=> CREATE EXTERNAL TABLE ext1 (x integer) AS COPY FROM '/tmp/ext1.dat' DELIMITER ',';
=> CREATE EXTERNAL TABLE ext2 (x integer) AS COPY FROM '/tmp/ext2.dat.bz2' BZIP DELIMITER ',';
=> CREATE EXTERNAL TABLE ext3 (x integer, y integer) AS COPY (x as '5', y) FROM '/tmp/ext3.dat.bz2' BZIP DELIMITER ',';
To allow users without superuser access to use external tables with data on the local file system, S3, or GCS, create a location for 'user' usage and grant access to it. This example shows granting access to a user named Bob to any external table whose data is located under /tmp (including in subdirectories to any depth):
=> CREATE LOCATION '/tmp' ALL NODES USAGE 'user';
=> GRANT ALL ON LOCATION '/tmp' to Bob;
The following example shows CREATE EXTERNAL TABLE using a user-defined source:
=> CREATE SOURCE curl AS LANGUAGE 'C++' NAME 'CurlSourceFactory' LIBRARY curllib;
=> CREATE EXTERNAL TABLE curl_table1 as COPY SOURCE CurlSourceFactory;
See also
Creating external tables
7 - CREATE FAULT GROUP
Enterprise Mode only
Creates a fault group, which can contain the following:
-
One or more nodes
-
One or more child fault groups
-
One or more nodes and one or more child fault groups
CREATE FAULT GROUP creates an empty fault group. Use ALTER FAULT GROUP to add nodes or other fault groups to an existing fault group.
Syntax
CREATE FAULT GROUP name
Parameters
name
- The name of the fault group to create, unique among all fault groups, where
name
conforms to conventions described in Identifiers.
Privileges
Superuser
Examples
The following command creates a fault group called parent0:
=> CREATE FAULT GROUP parent0;
CREATE FAULT GROUP
Use ALTER FAULT GROUP to add nodes or other fault groups to the parent0
fault group.
8 - CREATE FLEXIBLE EXTERNAL TABLE AS COPY
CREATE FLEXIBLE EXTERNAL TABLE AS COPY creates a flexible external table. This statement combines the CREATE FLEXIBLE TABLE and COPY statements, supporting a subset of each statement's parameters.
You can also use user-defined load functions (UDLs) to create external flex tables. For details about creating and using flex tables, see Using Flex Tables.
Note
Vertica does not create a superprojection for an external table when you create it.
For details about creating and using flex tables, see Creating flex tables in Using Flex Tables.
Caution
Canceling a CREATE FLEX EXTERNAL TABLE AS COPY statement can cause unpredictable results. Vertica recommends that you allow the statement to finish, then use DROP TABLE after the table exists.
Syntax
CREATE FLEX[IBLE] EXTERNAL TABLE [ IF NOT EXISTS ] [[database.]schema.]table-name
( [ column-definition[,...] ] )
[ INCLUDE | EXCLUDE [SCHEMA] PRIVILEGES ]
AS COPY [ ( { column-as-expression | column } [ FILLER datatype ] ) ]
FROM {
'path-to-data' [ ON nodename | ON ANY NODE | ON (nodeset) ] input-format [,...]
| [ WITH ] UDL-clause[...]
}
[ ABORT ON ERROR ]
[ DELIMITER [ AS ] 'char' ]
[ ENCLOSED [ BY ] 'char' ]
[ ENFORCELENGTH ]
[ ESCAPE [ AS ] 'char' | NO ESCAPE ]
[ EXCEPTIONS 'path' [ ON nodename ] [,...] ]
[ NULL [ AS ] 'string' ]
[ RECORD TERMINATOR 'string' ]
[ REJECTED DATA 'path' [ ON nodename ][,...] ]
[ REJECTMAX integer ]
[ SKIP integer ]
[ SKIP BYTES integer ]
[ TRAILING NULLCOLS ]
[ TRIM 'byte' ]
Parameters
For parameter descriptions, see CREATE TABLE and COPY parameters.
Note
CREATE FLEXIBLE EXTERNAL TABLE AS COPY supports only a subset of CREATE TABLE and COPY parameters.
Privileges
Superuser, or non-superuser with the following privileges:
-
READ privileges on the USER-accessible storage location, see GRANT (storage location)
-
Full access (including SELECT) to an external table that the user has privileges to create
Examples
To create an external flex table:
=> CREATE flex external table mountains() AS COPY FROM 'home/release/KData/kmm_ountains.json' PARSER fjsonparser();
CREATE TABLE
As with other flex tables, creating an external flex table produces two regular tables: the named table and its associated _keys
table. The keys table is not an external table:
=> \dt mountains
List of tables
Schema | Name | Kind | Owner | Comment
--------+-----------+-------+---------+---------
public | mountains | table | release |
(1 row)
You can use the helper function, COMPUTE_FLEXTABLE_KEYS_AND_BUILD_VIEW, to compute keys and create a view for the external table:
=> SELECT compute_flextable_keys_and_build_view ('appLog');
compute_flextable_keys_and_build_view
--------------------------------------------------------------------------------------------------
Please see public.appLog_keys for updated keys
The view public.appLog_view is ready for querying
(1 row)
Check the keys from the _keys
table for the results of running the helper application:
=> SELECT * FROM appLog_keys;
key_name | frequency | data_type_guess
----------------------------------------------------------+-----------+------------------
contributors | 8 | varchar(20)
coordinates | 8 | varchar(20)
created_at | 8 | varchar(60)
entities.hashtags | 8 | long varbinary(186)
.
.
.
retweeted_status.user.time_zone | 1 | varchar(20)
retweeted_status.user.url | 1 | varchar(68)
retweeted_status.user.utc_offset | 1 | varchar(20)
retweeted_status.user.verified | 1 | varchar(20)
(125 rows)
You can query the view:
=> SELECT "user.lang" FROM appLog_view;
user.lang
-----------
it
en
es
en
en
es
tr
en
(12 rows)
9 - CREATE FLEXIBLE TABLE
Creates a flexible (flex) table in the logical schema.
When you create a flex table, Vertica automatically creates two dependent objects:
- A keys table named flex-table-name_keys
- A view named flex-table-name_view
The flex table requires the keys table and view. Neither of these objects can exist independently of the flex table.
Syntax
Create with column definitions
CREATE [[ scope ] TEMP[ORARY]] FLEX[IBLE] TABLE [ IF NOT EXISTS ] [[database.]schema.]table-name
( [ column-definition[,...] [, table-constraint ][,...] ] )
[ ORDER BY column[,...] ]
[ segmentation-spec ]
[ KSAFE [k-num] ]
[ partition-clause]
[ {INCLUDE | EXCLUDE} [SCHEMA] PRIVILEGES ]
Create from another table
CREATE FLEX[IBLE] TABLE [[database.]schema.] table-name
[ ( column-name-list ) ]
[ {INCLUDE | EXCLUDE} [SCHEMA] PRIVILEGES ]
AS query [ ENCODED BY column-ref-list ]
Parameters
For general parameter descriptions, see CREATE TABLE; for parameters specific to temporary flex tables, see CREATE TEMPORARY TABLE and Creating flex tables.
You cannot partition a flex table on any virtual column (key).
Privileges
Non-superuser: CREATE privilege on table schema
Default columns
The CREATE statement can omit specifying any column definitions. CREATE FLEXIBLE TABLE always creates two columns automatically:
__raw__
- LONG VARBINARY type column to store unstructured data that you load. By default, this column has a NOT NULL constraint.
__identity__
- IDENTITY column that is used for segmentation and sorting when no other column is defined.
Default projections
Vertica automatically creates superprojections for both the flex table and keys tables when you create them.
If you create a flex table with one or more of the ORDER BY, ENCODED BY, SEGMENTED BY, or KSAFE clauses, the clause information is used to create projections. If no clauses are in use, Vertica uses the following defaults:
Table | Sort order | Encoding | Segmentation | K-safety |
---|---|---|---|---|
Flexible table | ORDER BY *.__identity__ | none | SEGMENTED BY hash *.__identity__ ALL NODES OFFSET 0 | 1 |
Keys table | ORDER BY *._keys_frequency | none | UNSEGMENTED ALL NODES | 1 |
Note
When you build a view for a flex table (see BUILD_FLEXTABLE_VIEW), the view is ordered by frequency (descending) and key_name.
Examples
The following example creates a flex table named darkdata1 without specifying any column information. Vertica creates a default superprojection and buddy projection as part of creating the table:
=> CREATE FLEXIBLE TABLE darkdata1();
CREATE TABLE
=> \dj darkdata1*
List of projections
Schema | Name | Owner | Node | Comment
--------+----------------------+---------+------------------+---------
public | darkdata1_b0 | dbadmin | |
public | darkdata1_b1 | dbadmin | |
public | darkdata1_keys_super | dbadmin | v_vmart_node0001 |
public | darkdata1_keys_super | dbadmin | v_vmart_node0002 |
public | darkdata1_keys_super | dbadmin | v_vmart_node0003 |
(5 rows)
=> SELECT export_objects('','darkdata1_b0');
CREATE PROJECTION public.darkdata1_b0 /*+basename(darkdata1),createtype(P)*/
(
__identity__,
__raw__
)
AS
SELECT darkdata1.__identity__,
darkdata1.__raw__
FROM public.darkdata1
ORDER BY darkdata1.__identity__
SEGMENTED BY hash(darkdata1.__identity__) ALL NODES OFFSET 0;
SELECT MARK_DESIGN_KSAFE(1);
(1 row)
=> select export_objects('','darkdata1_keys_super');
CREATE PROJECTION public.darkdata1_keys_super /*+basename(darkdata1_keys),createtype(P)*/
(
key_name,
frequency,
data_type_guess
)
AS
SELECT darkdata1_keys.key_name,
darkdata1_keys.frequency,
darkdata1_keys.data_type_guess
FROM public.darkdata1_keys
ORDER BY darkdata1_keys.frequency
UNSEGMENTED ALL NODES;
SELECT MARK_DESIGN_KSAFE(1);
(1 row)
The following example creates a table called darkdata1 with one column definition (date_col). The statement specifies the PARTITION BY clause to partition the data by year. Vertica creates a default superprojection and buddy projections as part of creating the table:
=> CREATE FLEX TABLE darkdata1 (date_col date NOT NULL) partition by
extract('year' from date_col);
CREATE TABLE
10 - CREATE FUNCTION statements
Vertica provides CREATE statements for each type of user-defined extension. Each CREATE statement adds a user-defined function to the Vertica catalog:
CREATE statement | Extension |
---|---|
CREATE FUNCTION (scalar) | User-defined scalar functions (UDSFs) |
CREATE AGGREGATE FUNCTION | User-defined aggregate functions (UDAFs) |
CREATE ANALYTIC FUNCTION | User-defined analytic functions (UDAnFs) |
CREATE TRANSFORM FUNCTION | User-defined transform functions (UDTFs) |
CREATE statements for user-defined load: | |
• CREATE SOURCE | Load source functions |
• CREATE FILTER | Load filter functions |
• CREATE PARSER | Load parser functions |
Vertica also provides CREATE FUNCTION (SQL), which stores SQL expressions as functions that you can invoke in a query.
10.1 - CREATE AGGREGATE FUNCTION
Adds a user-defined aggregate function (UDAF) to the catalog. The library containing the function must have been previously added using CREATE LIBRARY.
CREATE AGGREGATE FUNCTION automatically determines the function parameters and return value from data supplied by the factory class. Vertica supports overloading aggregate functions. When you call the SQL function, Vertica passes the input table to the function to process.
User-defined aggregate functions run in unfenced mode only.
Syntax
CREATE [ OR REPLACE ] AGGREGATE FUNCTION [ IF NOT EXISTS ]
[[database.]schema.]function AS
[ LANGUAGE 'language' ]
NAME 'factory'
LIBRARY library
[ NOT FENCED ];
Arguments
OR REPLACE
If a function with the same name and arguments exists, replace it. You can use this to change between fenced and unfenced modes, for example. If you do not use this directive and the function already exists, the CREATE statement returns with a rollback error.
OR REPLACE and IF NOT EXISTS are mutually exclusive.
IF NOT EXISTS
If a function with the same name and arguments exists, return without creating the function.
OR REPLACE and IF NOT EXISTS are mutually exclusive.
[database.]schema
Database and schema. The default schema is public. If you specify a database, it must be the current database.
function
Name of the function to create. This is the name used in SQL invocations of the function. It does not need to match the name of the factory, but it is less confusing if they are the same or similar.
The function name must conform to the restrictions on Identifiers.
LANGUAGE 'language'
- The language used to develop this function, currently C++ only (the default).
NAME 'factory'
- Name of the factory class that generates the function instance.
LIBRARY library
- Name of the shared library that contains the function. This library must have already been loaded by CREATE LIBRARY.
NOT FENCED
- Indicates that the function runs in unfenced mode. Aggregate functions cannot be run in fenced mode.
Privileges
Non-superuser:
-
CREATE privilege on the function's schema
-
USAGE privilege on the function's library
Examples
The following example demonstrates loading a library named AggregateFunctions and then defining functions named ag_avg and ag_cat. The functions are mapped to the AverageFactory and ConcatenateFactory classes in the library:
=> CREATE LIBRARY AggregateFunctions AS '/opt/vertica/sdk/examples/build/AggregateFunctions.so';
CREATE LIBRARY
=> CREATE AGGREGATE FUNCTION ag_avg AS LANGUAGE 'C++' NAME 'AverageFactory'
library AggregateFunctions;
CREATE AGGREGATE FUNCTION
=> CREATE AGGREGATE FUNCTION ag_cat AS LANGUAGE 'C++' NAME 'ConcatenateFactory'
library AggregateFunctions;
CREATE AGGREGATE FUNCTION
=> \x
Expanded display is on.
=> SELECT * FROM user_functions;
-[ RECORD 1 ]----------+------------------------------------------------------------------
schema_name | public
function_name | ag_avg
procedure_type | User Defined Aggregate
function_return_type | Numeric
function_argument_type | Numeric
function_definition | Class 'AverageFactory' in Library 'public.AggregateFunctions'
volatility |
is_strict | f
is_fenced | f
comment |
-[ RECORD 2 ]----------+------------------------------------------------------------------
schema_name | public
function_name | ag_cat
procedure_type | User Defined Aggregate
function_return_type | Varchar
function_argument_type | Varchar
function_definition | Class 'ConcatenateFactory' in Library 'public.AggregateFunctions'
volatility |
is_strict | f
is_fenced | f
comment |
10.2 - CREATE ANALYTIC FUNCTION
Adds a user-defined analytic function (UDAnF) to the catalog. The library containing the function must have been previously added using CREATE LIBRARY.
CREATE ANALYTIC FUNCTION automatically determines the function parameters and return value from data supplied by the factory class. Vertica supports overloading analytic functions. When you call the SQL function, Vertica passes the input table to the function in the library to process.
Syntax
CREATE [ OR REPLACE ] ANALYTIC FUNCTION [ IF NOT EXISTS ]
[[database.]schema.]function AS
[ LANGUAGE 'language' ]
NAME 'factory'
LIBRARY library
[ FENCED | NOT FENCED ]
Arguments
OR REPLACE
If a function with the same name and arguments exists, replace it. You can use this to change between fenced and unfenced modes, for example. If you do not use this directive and the function already exists, the CREATE statement returns with a rollback error.
OR REPLACE and IF NOT EXISTS are mutually exclusive.
IF NOT EXISTS
If a function with the same name and arguments exists, return without creating the function.
OR REPLACE and IF NOT EXISTS are mutually exclusive.
[database.]schema
Database and schema. The default schema is public. If you specify a database, it must be the current database.
function
Name of the function to create. This is the name used in SQL invocations of the function. It does not need to match the name of the factory, but it is less confusing if they are the same or similar.
The function name must conform to the restrictions on Identifiers.
LANGUAGE 'language'
- Language used to develop this function, one of the following:
  - C++ (default)
  - Java
NAME 'factory'
- Name of the factory class that generates the function instance.
LIBRARY library
- Name of the library that contains the function. This library must already be loaded by CREATE LIBRARY.
FENCED | NOT FENCED
- Enables or disables fenced mode for this function.
Default: FENCED
Privileges
Non-superuser:
-
CREATE privilege on the function's schema
-
USAGE privilege on the function's library
Examples
This example creates an analytic function named an_rank based on the factory class named RankFactory in the AnalyticFunctions library:
=> CREATE ANALYTIC FUNCTION an_rank AS LANGUAGE 'C++'
NAME 'RankFactory' LIBRARY AnalyticFunctions;
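Once created, a UDAnF is invoked with an OVER clause like a built-in analytic function. A hypothetical call, assuming a table scores(player, score):
=> SELECT player, an_rank() OVER (ORDER BY score DESC) FROM scores;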
See also
Analytic functions (UDAnFs)
10.3 - CREATE FILTER
Adds a user-defined load filter function to the catalog. The library containing the filter function must have been previously added using CREATE LIBRARY.
CREATE FILTER automatically determines the function parameters and return value from data supplied by the factory class. Vertica supports overloading load filter functions. When you call the SQL function, Vertica passes the input table to the function in the library to process.
Important
Installing an untrusted UDL function can compromise the security of the server. UDxs can contain arbitrary code. In particular, user-defined source functions can read data from any arbitrary location. It is up to the developer of the function to enforce proper security limitations. Superusers must not grant access to UDxs to untrusted users.
Syntax
CREATE [ OR REPLACE ] FILTER [ IF NOT EXISTS ]
[[database.]schema.]function AS
[ LANGUAGE 'language' ]
NAME 'factory' LIBRARY library
[ FENCED | NOT FENCED ]
Arguments
OR REPLACE
If a function with the same name and arguments exists, replace it. You can use this to change between fenced and unfenced modes, for example. If you do not use this directive and the function already exists, the CREATE statement returns with a rollback error.
OR REPLACE and IF NOT EXISTS are mutually exclusive.
IF NOT EXISTS
If a function with the same name and arguments exists, return without creating the function.
OR REPLACE and IF NOT EXISTS are mutually exclusive.
[database.]schema
Database and schema. The default schema is public. If you specify a database, it must be the current database.
function
Name of the function to create. This is the name used in SQL invocations of the function. It does not need to match the name of the factory, but it is less confusing if they are the same or similar.
The function name must conform to the restrictions on Identifiers.
LANGUAGE 'language'
- The language used to develop this function, one of the following:
  - C++ (default)
  - Java
  - Python
NAME 'factory'
- Name of the factory class that generates the function instance. This is the same name used by the RegisterFactory class.
LIBRARY library
- Name of the C++ library shared object file, Python file, or Java Jar file. This library must already have been loaded by CREATE LIBRARY.
FENCED | NOT FENCED
- Enables or disables fenced mode for this function.
Default: FENCED
Privileges
Superuser
Examples
The following example demonstrates loading a library named iConverterLib, then defining a filter function named Iconverter that is mapped to the IconverterFactory factory class in the library:
=> CREATE LIBRARY iConverterLib as '/opt/vertica/sdk/examples/build/IconverterLib.so';
CREATE LIBRARY
=> CREATE FILTER Iconverter AS LANGUAGE 'C++' NAME 'IconverterFactory' LIBRARY IconverterLib;
CREATE FILTER FUNCTION
=> \x
Expanded display is on.
=> SELECT * FROM user_functions;
-[ RECORD 1 ]----------+--------------------
schema_name | public
function_name | Iconverter
procedure_type | User Defined Filter
function_return_type |
function_argument_type |
function_definition |
volatility |
is_strict | f
is_fenced | f
comment |
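After registration, the filter can be applied in a COPY statement's FILTER clause. A sketch, assuming a table t with data to convert:
=> COPY t FROM STDIN WITH FILTER Iconverter();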
10.4 - CREATE FUNCTION (scalar)
Adds a user-defined scalar function (UDSF) to the catalog. The library containing the function must have been previously added using CREATE LIBRARY.
A UDSF takes in a single row of data and returns a single value. These functions can be used anywhere a native Vertica function or statement can be used, except in CREATE TABLE with its PARTITION BY or any segmentation clause.
CREATE FUNCTION automatically determines the function parameters and return value from data supplied by the factory class. Vertica supports overloading UDxs. When you call the function, Vertica passes the parameters to the function in the library to process.
Syntax
CREATE [ OR REPLACE ] FUNCTION [ IF NOT EXISTS ]
[[database.]schema.]function AS
[ LANGUAGE 'language' ]
NAME 'factory'
LIBRARY library
[ FENCED | NOT FENCED ]
Arguments
OR REPLACE
If a function with the same name and arguments exists, replace it. You can use this to change between fenced and unfenced modes, for example. If you do not use this directive and the function already exists, the CREATE statement returns with a rollback error.
OR REPLACE and IF NOT EXISTS are mutually exclusive.
IF NOT EXISTS
If a function with the same name and arguments exists, return without creating the function.
OR REPLACE and IF NOT EXISTS are mutually exclusive.
[database.]schema
Database and schema. The default schema is public. If you specify a database, it must be the current database.
function
Name of the function to create. This is the name used in SQL invocations of the function. It does not need to match the name of the factory, but it is less confusing if they are the same or similar.
The function name must conform to the restrictions on Identifiers.
LANGUAGE 'language'
- Language used to develop this function, one of the following:
  - C++ (default)
  - Python
  - Java
  - R
NAME 'factory'
- Name of the factory class that generates the function instance.
LIBRARY library
- Name of the C++ shared object file, Python file, Java Jar file, or R functions file. This library must already have been loaded by CREATE LIBRARY.
FENCED | NOT FENCED
- Enables or disables fenced mode for this function. Functions written in Java and R always run in fenced mode.
Default: FENCED
Privileges
-
CREATE privilege on the function's schema
-
USAGE privilege on the function's library
Examples
The following example loads a library named ScalarFunctions and then defines a function named Add2Ints that is mapped to the Add2IntsFactory factory class in the library:
=> CREATE LIBRARY ScalarFunctions AS '/opt/vertica/sdk/examples/build/ScalarFunctions.so';
CREATE LIBRARY
=> CREATE FUNCTION Add2Ints AS LANGUAGE 'C++' NAME 'Add2IntsFactory' LIBRARY ScalarFunctions;
CREATE FUNCTION
=> \x
Expanded display is on.
=> SELECT * FROM USER_FUNCTIONS;
-[ RECORD 1 ]----------+----------------------------------------------------
schema_name | public
function_name | Add2Ints
procedure_type | User Defined Function
function_return_type | Integer
function_argument_type | Integer, Integer
function_definition | Class 'Add2IntsFactory' in Library 'public.ScalarFunctions'
volatility | volatile
is_strict | f
is_fenced | t
comment |
=> \x
Expanded display is off.
=> -- Try a simple call to the function
=> SELECT Add2Ints(23,19);
Add2Ints
----------
42
(1 row)
The following example uses a scalar function that returns a ROW:
=> CREATE FUNCTION div_with_rem AS LANGUAGE 'C++' NAME 'DivFactory' LIBRARY ScalarFunctions;
=> SELECT div_with_rem(18,5);
div_with_rem
------------------------------
{"quotient":3,"remainder":3}
(1 row)
See also
Developing user-defined extensions (UDxs)
10.5 - CREATE FUNCTION (SQL)
Stores SQL expressions as functions for use in queries. User-defined SQL functions are useful for executing complex queries and combining Vertica built-in functions. You simply call the function in a given query. If multiple SQL functions with the same name and argument types are in the search path, Vertica calls the first match that it finds.
SQL functions are flattened in all cases, including DDL.
Syntax
CREATE [ OR REPLACE ] FUNCTION [ IF NOT EXISTS ]
[[database.]schema.]function( [ arg_list ] )
RETURN return_type
AS
BEGIN
RETURN expression;
END;
Arguments
OR REPLACE
- If a function of the same name and arguments exists, replace it. If you only change the function arguments, Vertica ignores this option and maintains both functions under the same name.
OR REPLACE and IF NOT EXISTS are mutually exclusive.
IF NOT EXISTS
If a function with the same name and arguments exists, return without creating the function.
OR REPLACE and IF NOT EXISTS are mutually exclusive.
[database.]schema
Database and schema. The default schema is public. If you specify a database, it must be the current database.
function
- SQL function to create, where function conforms to conventions described in Identifiers.
arg_list
- A comma-delimited list of argument names and their data types, specified in this format:
argname argtype[,...]
where:
  - argname is the name of an argument passed to function.
  - argtype is argname's data type.
return_type
- The data type that this function returns.
RETURN expression
- The SQL function body, where expression can contain built-in functions, operators, and argument names specified in the CREATE FUNCTION statement. A semicolon at the end of the expression is required.
Note
CREATE FUNCTION allows only one RETURN expression. Return expressions do not support the following:
- FROM, WHERE, GROUP BY, ORDER BY, and LIMIT clauses
- Aggregation, analytics, and meta-functions
Privileges
Non-superuser:
-
CREATE privilege on the function's schema
-
USAGE privilege on the function's library
Strictness and volatility
Vertica infers the strictness and volatility (stable, immutable, or volatile) of a SQL function from its definition. Vertica then determines the correctness of usage, such as where an immutable function is expected but a volatile function is provided.
SQL functions and views
You can create views on queries that use SQL functions and then query the views. When you create a view, a SQL function replaces a call to the user-defined function with the function body in the view definition. Therefore, if you replace the body of the user-defined function, you should also re-create the view.
Examples
See Creating user-defined SQL functions.
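For illustration, a minimal SQL function that mirrors the built-in ZEROIFNULL behavior:
=> CREATE FUNCTION myzeroifnull(x INT) RETURN INT
   AS BEGIN
     RETURN (CASE WHEN (x IS NOT NULL) THEN x ELSE 0 END);
   END;
CREATE FUNCTION
=> SELECT myzeroifnull(NULL);
 myzeroifnull
--------------
            0
(1 row)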
10.6 - CREATE PARSER
Adds a user-defined load parser function to the catalog. The library containing the function must have been previously added using CREATE LIBRARY.
CREATE PARSER automatically determines the function parameters and return value from data supplied by the factory class. Vertica supports overloading load parser functions. When you call the SQL function, Vertica passes the input table to the function in the library to process.
Important
Installing an untrusted UDL function can compromise the security of the server. UDxs can contain arbitrary code. In particular, user-defined source functions can read data from any arbitrary location. It is up to the developer of the function to enforce proper security limitations. Superusers must not grant access to UDxs to untrusted users.
Syntax
CREATE [ OR REPLACE ] PARSER [ IF NOT EXISTS ]
[[database.]schema.]function AS
[ LANGUAGE 'language' ]
NAME 'factory'
LIBRARY library
[ FENCED | NOT FENCED ]
Arguments
OR REPLACE
If a function with the same name and arguments exists, replace it. You can use this to change between fenced and unfenced modes, for example. If you do not use this directive and the function already exists, the CREATE statement returns with a rollback error.
OR REPLACE and IF NOT EXISTS are mutually exclusive.
IF NOT EXISTS
If a function with the same name and arguments exists, return without creating the function.
OR REPLACE and IF NOT EXISTS are mutually exclusive.
[database.]schema
Database and schema. The default schema is public. If you specify a database, it must be the current database.
function
Name of the function to create. This is the name used in SQL invocations of the function. It does not need to match the name of the factory, but it is less confusing if they are the same or similar.
The function name must conform to the restrictions on Identifiers.
LANGUAGE 'language'
- The language used to develop this function, one of the following:
  - C++ (default)
  - Java
  - Python
NAME 'factory'
- Name of the factory class that generates the function instance. This is the same name used by the RegisterFactory class.
LIBRARY library
- Name of the C++ library shared object file, Python file, or Java Jar file. This library must already have been loaded by CREATE LIBRARY.
FENCED | NOT FENCED
- Enables or disables fenced mode for this function.
Default: FENCED
Privileges
Superuser
Examples
The following example demonstrates loading a library named BasicIntegerParserLib, then defining a parser function named BasicIntegerParser that is mapped to the BasicIntegerParserFactory factory class in the library:
=> CREATE LIBRARY BasicIntegerParserLib as '/opt/vertica/sdk/examples/build/BasicIntegerParser.so';
CREATE LIBRARY
=> CREATE PARSER BasicIntegerParser AS LANGUAGE 'C++' NAME 'BasicIntegerParserFactory' LIBRARY BasicIntegerParserLib;
CREATE PARSER FUNCTION
=> \x
Expanded display is on.
=> SELECT * FROM user_functions;
-[ RECORD 1 ]----------+--------------------
schema_name | public
function_name | BasicIntegerParser
procedure_type | User Defined Parser
function_return_type |
function_argument_type |
function_definition |
volatility |
is_strict | f
is_fenced | f
comment |
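After registration, the parser can be used in a COPY statement's PARSER clause. A sketch, assuming a table of integers:
=> COPY t FROM STDIN WITH PARSER BasicIntegerParser();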
10.7 - CREATE SOURCE
Adds a user-defined load source function to the catalog. The library containing the function must have been previously added using CREATE LIBRARY.
CREATE SOURCE automatically determines the function parameters and return value from data supplied by the factory class. Vertica supports overloading load source functions. When you call the SQL function, Vertica passes the input table to the function in the library to process.
Important
Installing an untrusted UDL function can compromise the security of the server. UDxs can contain arbitrary code. In particular, user-defined source functions can read data from any arbitrary location. It is up to the developer of the function to enforce proper security limitations. Superusers must not grant access to UDxs to untrusted users.
Syntax
CREATE [ OR REPLACE ] SOURCE [ IF NOT EXISTS ]
[[database.]schema.]function AS
[ LANGUAGE 'language' ]
NAME 'factory'
LIBRARY library
[ FENCED | NOT FENCED ]
Arguments
OR REPLACE
If a function with the same name and arguments exists, replace it. You can use this to change between fenced and unfenced modes, for example. If you do not use this directive and the function already exists, the CREATE statement returns with a rollback error.
OR REPLACE and IF NOT EXISTS are mutually exclusive.
IF NOT EXISTS
If a function with the same name and arguments exists, return without creating the function.
OR REPLACE and IF NOT EXISTS are mutually exclusive.
[database.]schema
Database and schema. The default schema is public. If you specify a database, it must be the current database.
function
Name of the function to create. This is the name used in SQL invocations of the function. It does not need to match the name of the factory, but it is less confusing if they are the same or similar.
The function name must conform to the restrictions on Identifiers.
LANGUAGE 'language'
- Language used to develop this function, one of the following:
  - C++ (default)
  - Java
NAME 'factory'
- Name of the factory class that generates the function instance. This is the same name used by the RegisterFactory class.
LIBRARY library
- Name of the C++ library shared object file or Java Jar file. This library must already have been loaded by CREATE LIBRARY.
FENCED | NOT FENCED
- Enables or disables fenced mode for this function.
Default: FENCED
Privileges
Superuser
Examples
The following example demonstrates loading a library named curllib, then defining a source function named curl that is mapped to the CurlSourceFactory factory class in the library:
=> CREATE LIBRARY curllib as '/opt/vertica/sdk/examples/build/cURLLib.so';
CREATE LIBRARY
=> CREATE SOURCE curl AS LANGUAGE 'C++' NAME 'CurlSourceFactory' LIBRARY curllib;
CREATE SOURCE
=> \x
Expanded display is on.
=> SELECT * FROM user_functions;
-[ RECORD 1 ]----------+--------------------
schema_name | public
function_name | curl
procedure_type | User Defined Source
function_return_type |
function_argument_type |
function_definition |
volatility |
is_strict | f
is_fenced | f
comment |
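After registration, the source can supply data to COPY. A sketch; the url parameter name is an assumption and depends on how the factory class defines its arguments:
=> COPY webdata SOURCE curl(url='http://example.com/data.txt');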
10.8 - CREATE TRANSFORM FUNCTION
Adds a user-defined transform function (UDTF) to the catalog. The library containing the function must have been previously added using CREATE LIBRARY.
CREATE TRANSFORM FUNCTION automatically determines the function parameters and return value from data supplied by the factory class. Vertica supports overloading transform functions. When you call the SQL function, Vertica passes the input table to the transform function in the library to process.
Syntax
CREATE [ OR REPLACE ] TRANSFORM FUNCTION [ IF NOT EXISTS ]
[[database.]schema.]function AS
[ LANGUAGE 'language' ]
NAME 'factory'
LIBRARY library
[ FENCED | NOT FENCED ]
Arguments
OR REPLACE
If a function with the same name and arguments exists, replace it. You can use this to change between fenced and unfenced modes, for example. If you do not use this directive and the function already exists, the CREATE statement returns with a rollback error.
OR REPLACE and IF NOT EXISTS are mutually exclusive.
IF NOT EXISTS
If a function with the same name and arguments exists, return without creating the function.
OR REPLACE and IF NOT EXISTS are mutually exclusive.
[database.]schema
Database and schema. The default schema is public. If you specify a database, it must be the current database.
function
Name of the function to create. This is the name used in SQL invocations of the function. It does not need to match the name of the factory, but it is less confusing if they are the same or similar.
The function name must conform to the restrictions on Identifiers.
LANGUAGE 'language'
- The language used to develop this function, one of the following:
  - C++ (default)
  - Java
  - R
  - Python
NAME 'factory'
- Name of the factory class that generates the function instance.
LIBRARY library
- Name of the C++ shared object file, Python file, Java Jar file, or R functions file. This library must already have been loaded by CREATE LIBRARY.
FENCED | NOT FENCED
- Enables or disables fenced mode for this function. Functions written in Java and R always run in fenced mode.
Default: FENCED
Privileges
Non-superuser:
-
CREATE privilege on the function's schema
-
USAGE privilege on the function's library
Restrictions
A query that includes a UDTF cannot:
- Include statements other than the SELECT statement that calls this UDTF and a PARTITION BY expression
- Call an analytic function
- Call another UDTF
- Include one of the following clauses:
Examples
The following example loads a library named TransformFunctions and then defines a function named tokenize that is mapped to the TokenFactory factory class in the library:
=> CREATE LIBRARY TransformFunctions AS
'/home/dbadmin/TransformFunctions.so';
CREATE LIBRARY
=> CREATE TRANSFORM FUNCTION tokenize
AS LANGUAGE 'C++' NAME 'TokenFactory' LIBRARY TransformFunctions;
CREATE TRANSFORM FUNCTION
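A UDTF is invoked in a SELECT with an OVER clause. A hypothetical call, assuming a table documents(url, description):
=> SELECT url, tokenize(description) OVER (PARTITION BY url) FROM documents;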
11 - CREATE HCATALOG SCHEMA
Define a schema for data stored in a Hive data warehouse using the HCatalog Connector. For more information, see Using the HCatalog Connector.
Most of the optional parameters are read out of Hadoop configuration files if available. If you copied the Hadoop configuration files as described in Configuring Vertica for HCatalog, you can omit most parameters. By default this statement uses the values specified in those configuration files. If the configuration files are complete, the following is a valid statement:
=> CREATE HCATALOG SCHEMA hcat;
If a value is not specified in the configuration files and a default is shown in the parameter list, then that default value is used.
Some parameters apply only if you are using HiveServer2 (the default). Others apply only if you are using WebHCat, a legacy Hadoop service. When using HiveServer2, use HIVESERVER2_HOSTNAME to specify the server host. When using WebHCat, use WEBSERVICE_HOSTNAME to specify the server host.
If you need to use WebHCat you must also set the HCatalogConnectorUseHiveServer2 configuration parameter to 0. See Apache Hadoop parameters.
After creating the schema, you can change many (but not all) parameters using ALTER HCATALOG SCHEMA.
Syntax
CREATE HCATALOG SCHEMA [IF NOT EXISTS] schemaName
[AUTHORIZATION user-id]
[WITH [param=value [,...] ] ]
Arguments
Argument | Description |
---|---|
[IF NOT EXISTS] |
If given, the statement exits without an error when the schema named in schemaName already exists. |
schemaName |
The name of the schema to create in the Vertica catalog. The tables in the Hive database will be available through this schema. |
AUTHORIZATION user-id |
The name of a Vertica account to own the schema being created. This parameter is ignored if Kerberos authentication is being used; in that case the current vsql user is used. |
Parameters
Parameter | Description |
---|---|
HOSTNAME |
The hostname, IP address, or URI of the database server that stores the Hive data warehouse's metastore information. If you specify this parameter and do not also specify PORT, this value must be a URI. If the Hive metastore supports High Availability, you can specify a comma-separated list of URIs for this value. If this value is not specified, hive-site.xml must be available. |
PORT |
The port number on which the metastore database is running. If you specify this parameter, you must also specify HOSTNAME and it must be a name or IP address (not a URI). |
HIVESERVER2_HOSTNAME |
The hostname or IP address of the HiveServer2 service. This parameter is optional if in hive-site.xml you set hive.server2.thrift.bind.host to a valid host or enable dynamic service discovery with hive.server2.support.dynamic.service.discovery. This parameter is ignored if you are using WebHCat. |
WEBSERVICE_HOSTNAME |
The hostname or IP address of the WebHCat service, if using WebHCat instead of HiveServer2. If this value is not specified, webhcat-site.xml must be available. |
WEBSERVICE_PORT |
The port number on which the WebHCat service is running, if using WebHCat instead of HiveServer2. If this value is not specified, webhcat-site.xml must be available. |
WEBHDFS_ADDRESS |
The host and port ("host:port") for the WebHDFS service. This parameter is used only for reading ORC and Parquet files. If this value is not set, hdfs-site.xml must be available to read these file types through the HCatalog Connector. |
HCATALOG_SCHEMA |
The name of the Hive schema or database that the Vertica schema is being mapped to. The default is schemaName . |
CUSTOM_PARTITIONS |
Whether the Hive schema uses custom partition locations ('YES' or 'NO'). If the schema uses custom partition locations, then Vertica queries Hive to get those locations when executing queries. These additional Hive queries can be expensive, so use this parameter only if you need to. The default is 'NO' (disabled). For more information, see Using Partitioned Data. |
HCATALOG_USER |
The username of the HCatalog user to use when making calls to the HiveServer2 or WebHCat server. The default is the current database user. |
HCATALOG_CONNECTION_TIMEOUT |
The number of seconds the HCatalog Connector waits for a successful connection to the HiveServer or WebHCat server. A value of 0 means wait indefinitely. |
HCATALOG_SLOW_TRANSFER_LIMIT |
The lowest data transfer rate (in bytes per second) from the HiveServer2 or WebHCat server that the HCatalog Connector accepts. See HCATALOG_SLOW_TRANSFER_TIME for details. |
HCATALOG_SLOW_TRANSFER_TIME |
The number of seconds the HCatalog Connector waits before enforcing the data transfer rate lower limit. After this time has passed, the HCatalog Connector tests whether the data transfer rate is at least as fast as the value set in HCATALOG_SLOW_TRANSFER_LIMIT. If it is not, then the HCatalog Connector breaks the connection and terminates the query. |
SSL_CONFIG |
The path of the Hadoop ssl-client.xml configuration file. This parameter is required if you are using HiveServer2 and it uses SSL wire encryption. This parameter is ignored if you are using WebHCat. |
The default values for HCATALOG_CONNECTION_TIMEOUT, HCATALOG_SLOW_TRANSFER_LIMIT, and HCATALOG_SLOW_TRANSFER_TIME are set by the database configuration parameters HCatConnectionTimeout, HCatSlowTransferLimit, and HCatSlowTransferTime. See Apache Hadoop parameters for more information.
Configuration files
The HCatalog Connector uses the following values from the Hadoop configuration files if you do not override them when creating the schema.
File | Properties |
---|---|
hive-site.xml |
hive.server2.thrift.bind.host (used for HIVESERVER2_HOSTNAME) |
hive.server2.thrift.port | |
hive.server2.transport.mode | |
hive.server2.authentication | |
hive.server2.authentication.kerberos.principal | |
hive.server2.support.dynamic.service.discovery | |
hive.zookeeper.quorum (used as HIVESERVER2_HOSTNAME if dynamic service discovery is enabled) | |
hive.zookeeper.client.port | |
hive.server2.zookeeper.namespace | |
hive.metastore.uris (used for HOSTNAME and PORT) | |
ssl-client.xml |
ssl.client.truststore.location |
ssl.client.truststore.password |
Privileges
The user must be a superuser or be granted all permissions on the database to use this statement.
The user also requires access to Hive data in one of the following ways:
- Have USAGE permissions on hcatalog_schema, if Hive does not use an authorization service (Sentry or Ranger) to manage access.
- Have permission through an authorization service, if Hive uses one to manage access. In this case, you must either set EnableHCatImpersonation to 0, to access data as the Vertica principal, or grant users access to the HDFS data. For Sentry, you can use ACL synchronization to manage HDFS access.
- Be the dbadmin user, with or without an authorization service.
Examples
The following example shows how to use CREATE HCATALOG SCHEMA to define a new schema for tables stored in a Hive database and then query the system tables that contain information about those tables:
=> CREATE HCATALOG SCHEMA hcat WITH HOSTNAME='hcathost' PORT=9083
HCATALOG_SCHEMA='default' HIVESERVER2_HOSTNAME='hs.example.com'
SSL_CONFIG='/etc/hadoop/conf/ssl-client.xml' HCATALOG_USER='admin';
CREATE SCHEMA
=> \x
Expanded display is on.
=> SELECT * FROM v_catalog.hcatalog_schemata;
-[ RECORD 1 ]----------------+-------------------------------------------
schema_id | 45035996273748224
schema_name | hcat
schema_owner_id | 45035996273704962
schema_owner | admin
create_time | 2017-12-05 14:43:03.353404-05
hostname | hcathost
port | -1
hiveserver2_hostname | hs.example.com
webservice_hostname |
webservice_port | 50111
webhdfs_address | hs.example.com:50070
hcatalog_schema_name | default
ssl_config | /etc/hadoop/conf/ssl-client.xml
hcatalog_user_name | admin
hcatalog_connection_timeout | -1
hcatalog_slow_transfer_limit | -1
hcatalog_slow_transfer_time | -1
custom_partitions | f
=> SELECT * FROM v_catalog.hcatalog_table_list;
-[ RECORD 1 ]------+------------------
table_schema_id | 45035996273748224
table_schema | hcat
hcatalog_schema | default
table_name | nation
hcatalog_user_name | admin
-[ RECORD 2 ]------+------------------
table_schema_id | 45035996273748224
table_schema | hcat
hcatalog_schema | default
table_name | raw
hcatalog_user_name | admin
-[ RECORD 3 ]------+------------------
table_schema_id | 45035996273748224
table_schema | hcat
hcatalog_schema | default
table_name | raw_rcfile
hcatalog_user_name | admin
-[ RECORD 4 ]------+------------------
table_schema_id | 45035996273748224
table_schema | hcat
hcatalog_schema | default
table_name | raw_sequence
hcatalog_user_name | admin
The following example shows how to specify more than one metastore host.
=> CREATE HCATALOG SCHEMA hcat
WITH HOSTNAME='thrift://node1.example.com:9083,thrift://node2.example.com:9083';
The following example shows how to include custom partition locations:
=> CREATE HCATALOG SCHEMA hcat WITH HCATALOG_SCHEMA='default'
HIVESERVER2_HOSTNAME='hs.example.com'
CUSTOM_PARTITIONS='yes';
12 - CREATE KEY
Creates a private key.
Syntax
CREATE [TEMP[ORARY]] KEY name
{ 'AES' [ PASSWORD 'password' ] | 'RSA' }
{LENGTH length | AS key_text}
Parameters
TEMPORARY
- Create with session scope. The key is stored in memory and is valid only for the current session.
name
- The name of the key.
password
- Password for the key.
length
- Size of the key in bits. Example: 2048
key_text
- The contents of the key to import. Example:
-----BEGIN RSA PRIVATE KEY-----...ABCD1234...-----END RSA PRIVATE KEY-----
Privileges
Superuser
Examples
See Generating TLS certificates and keys.
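For instance, a minimal sketch that instantiates the syntax shown above (the key name is illustrative):
=> CREATE KEY my_rsa_key 'RSA' LENGTH 2048;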
See also
13 - CREATE LIBRARY
Loads a library containing user-defined extensions (UDxs) into the Vertica catalog. Vertica automatically distributes copies of the library file and supporting libraries to all cluster nodes.
Because libraries are added to the database catalog, they persist across database restarts.
After loading a library in the catalog, you can use statements such as CREATE FUNCTION to define the extensions contained in the library. See Developing user-defined extensions (UDxs) for details.
Syntax
CREATE [OR REPLACE] LIBRARY
[[database.]schema.]name
AS 'path'
[ DEPENDS 'depends-path' ]
[ LANGUAGE 'language' ]
Arguments
OR REPLACE
- If a library with the same name exists, replace it. UDxs defined in the catalog that reference the updated library automatically start using the new library file.
If you do not use this directive and the library already exists, the CREATE statement returns with an error.
[database.]schema
- Database and schema. The default schema is public. If you specify a database, it must be the current database.
name
- Name of the library to create. This is the name used when creating functions in the library (see Creating UDx Functions). While not required, it is good practice to match the file name.
AS
path
- Path of the library to load, either an absolute path on the initiator node file system or a URI for another supported file system or object store.
DEPENDS 'depends-path'
- Files or libraries on which this library depends, one or more files or directories on the initiator node file system or other supported file systems or object stores. For a directory, end the path entry with a slash (/), optionally followed by a wildcard (*). To specify more than one file, separate entries with colons (:).
If any path entry contains colons, such as a URI, place brackets around the entire DEPENDS path and use double quotes for the individual path elements, as in the following example:
DEPENDS '["s3://mybucket/gson-2.3.1.jar"]'
To specify libraries with multiple directory levels, see Multi-level Library Dependencies.
DEPENDS has no effect for libraries written in R. R packages must be installed locally on each node, including external dependencies.
Important
The performance of CREATE LIBRARY can degrade in Eon Mode, in proportion to the number and depth of dependencies specified by the DEPENDS clause.
If a Java library depends on native libraries (SO files), use DEPENDS to specify the path and call System.loadLibrary() in your UDx to load the native libraries from that path.
LANGUAGE 'language'
- The programming language of the functions in the library, one of:
- C++ (default)
- Python
- Java
- R
Privileges
Superuser, or UDXDEVELOPER and CREATE on the schema. Non-superusers must explicitly enable the UDXDEVELOPER role, as in the following example:
=> SET ROLE UDXDEVELOPER;
SET
-- Not required, but you can confirm the role as follows:
=> SHOW ENABLED ROLES;
name | setting
---------------+--------------
enabled roles | udxdeveloper
(1 row)
=> CREATE LIBRARY MyLib AS '/home/dbadmin/my_lib.so';
CREATE LIBRARY
-- Create functions...
-- UDXDEVELOPER also grants DROP (replace):
=> CREATE OR REPLACE LIBRARY MyLib AS '/home/dbadmin/my_lib.so';
Requirements
- Vertica makes its own copies of the library files. Later modification or deletion of the original files specified in the statement does not affect the library defined in the catalog. To update the library, use ALTER LIBRARY.
- Loading a library does not guarantee that it functions correctly. CREATE LIBRARY performs some basic checks on the library file to verify it is compatible with Vertica. The statement fails if it detects that the library was not correctly compiled or it finds other basic incompatibilities. However, CREATE LIBRARY cannot detect many other issues in shared libraries.
Multi-level library dependencies
If a DEPENDS clause specifies a library with multiple directory levels, Vertica follows the library path to include all subdirectories of that library. For example, the following CREATE LIBRARY statement enables the UDx library mylib
to import all Python packages and modules that it finds in subdirectories of site-packages
:
=> CREATE LIBRARY mylib AS '/path/to/python_udx' DEPENDS '/path/to/python/site-packages' LANGUAGE 'Python';
Important
DEPENDS can specify Java library dependencies that are up to 100 levels deep.
Examples
Load a library in the home directory of the dbadmin account:
=> CREATE LIBRARY MyFunctions AS '/home/dbadmin/my_functions.so';
Load a library located in the directory where you started vsql
:
=> \set libfile '\''`pwd`'/MyOtherFunctions.so\'';
=> CREATE LIBRARY MyOtherFunctions AS :libfile;
Load a library from the cloud:
=> CREATE LIBRARY SomeFunctions AS 'S3://mybucket/extensions.so';
Load a library that depends on multiple JAR files in the same directory:
=> CREATE LIBRARY DeleteVowelsLib AS '/home/dbadmin/JavaLib.jar'
DEPENDS '/home/dbadmin/mylibs/*' LANGUAGE 'Java';
Load a library with multiple explicit dependencies:
=> CREATE LIBRARY mylib AS '/path/to/java_udx'
DEPENDS '/path/to/jars/this.jar:/path/to/jars/that.jar' LANGUAGE 'Java';
Load a library with dependencies in the cloud:
=> CREATE LIBRARY s3lib AS 's3://mybucket/UdlLib.jar'
DEPENDS '["s3://mybucket/gson-2.3.1.jar"]' LANGUAGE 'Java';
14 - CREATE LOAD BALANCE GROUP
Creates a group of network addresses that can be targeted by a load balancing routing rule. You create a group either using a list of network addresses, or basing it on one or more fault groups or subclusters.
Note
You cannot add multiple network addresses for one node to the same load balancing group.
Syntax
CREATE LOAD BALANCE GROUP group_name WITH {
ADDRESS address[,...]
| FAULT GROUP fault_group[,...] FILTER 'IP_range'
| SUBCLUSTER subcluster[,...] FILTER 'IP_range'
}
[ POLICY 'policy_setting' ]
Parameters
group_name
- Name of the group to create. You use this name later when defining load balancing rules.
address [,...]
- Comma-delimited list of network addresses you created earlier.
fault_group [,...]
- Comma-delimited list of fault groups to use as the basis of the load balance group.
Note
Before you create your load balance group from a fault group, you must create network addresses on the nodes you want in your load balance group. Load balance groups only work with the network addresses you define on nodes, rather than IP addresses. See CREATE NETWORK ADDRESS.
subcluster [,...]
- Comma-delimited list of subclusters to use as the basis of the load balance group.
Note
As with fault groups, you must create network addresses on the nodes in the subcluster you want to be part of the load balance group.
IP_range
- Range of IP addresses in CIDR notation to include in the load balance group from the fault groups or subclusters. This range can be either IPv4 or IPv6. Only nodes that have a network address with an IP address that falls within this range are added to the load balancing group.
policy_setting
- Determines how the initially-contacted node chooses a target from the group, one of the following:
- ROUNDROBIN (default): Rotates among the available members of the load balancing group. The initially-contacted node keeps track of which node it chose last time, and chooses the next one in the cluster.
Note
Each node in the cluster maintains its own round-robin pointer that indicates which node it should pick next for each load-balancing group. Therefore, if clients connect to different initial nodes, they may be redirected to the same node.
- RANDOM: Chooses an available node from the group randomly.
- NONE: Disables load balancing.
Privileges
Superuser
Examples
The following statement demonstrates creating a load balance group that contains several network addresses:
=> CREATE NETWORK ADDRESS addr01 ON v_vmart_node0001 WITH '10.20.110.21';
CREATE NETWORK ADDRESS
=> CREATE NETWORK ADDRESS addr02 ON v_vmart_node0002 WITH '10.20.110.22';
CREATE NETWORK ADDRESS
=> CREATE NETWORK ADDRESS addr03 on v_vmart_node0003 WITH '10.20.110.23';
CREATE NETWORK ADDRESS
=> CREATE NETWORK ADDRESS addr04 on v_vmart_node0004 WITH '10.20.110.24';
CREATE NETWORK ADDRESS
=> CREATE LOAD BALANCE GROUP group_1 WITH ADDRESS addr01, addr02;
CREATE LOAD BALANCE GROUP
=> CREATE LOAD BALANCE GROUP group_2 WITH ADDRESS addr03, addr04;
CREATE LOAD BALANCE GROUP
=> SELECT * FROM LOAD_BALANCE_GROUPS;
name | policy | filter | type | object_name
------------+------------+-----------------+-----------------------+-------------
group_1 | ROUNDROBIN | | Network Address Group | addr01
group_1 | ROUNDROBIN | | Network Address Group | addr02
group_2 | ROUNDROBIN | | Network Address Group | addr03
group_2 | ROUNDROBIN | | Network Address Group | addr04
(4 rows)
This example demonstrates creating a load balancing group using a fault group:
=> CREATE FAULT GROUP fault_1;
CREATE FAULT GROUP
=> ALTER FAULT GROUP fault_1 ADD NODE v_vmart_node0001;
ALTER FAULT GROUP
=> ALTER FAULT GROUP fault_1 ADD NODE v_vmart_node0002;
ALTER FAULT GROUP
=> ALTER FAULT GROUP fault_1 ADD NODE v_vmart_node0003;
ALTER FAULT GROUP
=> ALTER FAULT GROUP fault_1 ADD NODE v_vmart_node0004;
ALTER FAULT GROUP
=> SELECT node_name,node_address,node_address_family,export_address
FROM v_catalog.nodes;
node_name | node_address | node_address_family | export_address
------------------+--------------+---------------------+----------------
v_vmart_node0001 | 10.20.110.21 | ipv4 | 10.20.110.21
v_vmart_node0002 | 10.20.110.22 | ipv4 | 10.20.110.22
v_vmart_node0003 | 10.20.110.23 | ipv4 | 10.20.110.23
v_vmart_node0004 | 10.20.110.24 | ipv4 | 10.20.110.24
(4 rows)
=> CREATE LOAD BALANCE GROUP group_all WITH FAULT GROUP fault_1 FILTER
'0.0.0.0/0';
CREATE LOAD BALANCE GROUP
=> CREATE LOAD BALANCE GROUP group_some WITH FAULT GROUP fault_1 FILTER
'10.20.110.21/30';
CREATE LOAD BALANCE GROUP
=> SELECT * FROM LOAD_BALANCE_GROUPS;
name | policy | filter | type | object_name
----------------+------------+-----------------+-----------------------+-------------
group_all | ROUNDROBIN | 0.0.0.0/0 | Fault Group | fault_1
group_some | ROUNDROBIN | 10.20.110.21/30 | Fault Group | fault_1
(2 rows)
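A subcluster-based group follows the same pattern. The following sketch assumes an Eon Mode subcluster named analytics whose nodes already have network addresses defined:
=> CREATE LOAD BALANCE GROUP group_analytics WITH SUBCLUSTER analytics
   FILTER '0.0.0.0/0' POLICY 'ROUNDROBIN';
CREATE LOAD BALANCE GROUP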
See also
15 - CREATE LOCAL TEMPORARY VIEW
Creates or replaces a local temporary view. Views are read only, so they do not support insert, update, delete, or copy operations. Local temporary views are session-scoped, so they are visible only to their creator in the current session. Vertica drops the view when the session ends.
Note
Vertica does not support global temporary views.
Syntax
CREATE [OR REPLACE] LOCAL TEMP[ORARY] VIEW view [ (column[,...] ) ] AS query
Parameters
OR REPLACE
- Specifies to overwrite the existing view view. If you omit this option and view already exists, CREATE VIEW returns an error.
view
- Identifies the view to create, where view conforms to conventions described in Identifiers. It must also be unique among all names of sequences, tables, projections, views, and models within the same schema.
column [,...]
- List of up to 9800 names to use as view column names. Vertica maps view column names to query columns according to the order of their respective lists. By default, the view uses column names as they are specified in the query.
AS query
- A SELECT statement that the temporary view executes. The SELECT statement can reference tables, temporary tables, and other views.
Privileges
See Creating views.
Examples
The following CREATE LOCAL TEMPORARY VIEW
statement creates the temporary view myview
. This view sums all individual incomes of customers listed in the store.store_sales_fact
table, and groups results by state:
=> CREATE LOCAL TEMP VIEW myview AS
SELECT SUM(annual_income), customer_state FROM public.customer_dimension
WHERE customer_key IN (SELECT customer_key FROM store.store_sales_fact)
GROUP BY customer_state
ORDER BY customer_state ASC;
The following example uses the temporary view myview
with a WHERE
clause that limits the results to combined salaries greater than $2 billion:
=> SELECT * FROM myview WHERE SUM > 2000000000;
SUM | customer_state
-------------+----------------
2723441590 | AZ
29253817091 | CA
4907216137 | CO
3769455689 | CT
3330524215 | FL
4581840709 | IL
3310667307 | IN
2793284639 | MA
5225333668 | MI
2128169759 | NV
2806150503 | PA
2832710696 | TN
14215397659 | TX
2642551509 | UT
(14 rows)
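To rename the view columns, supply the optional column list. The following sketch reuses the query above; the column names are illustrative:
=> CREATE OR REPLACE LOCAL TEMP VIEW myview (total_income, state) AS
    SELECT SUM(annual_income), customer_state FROM public.customer_dimension
    WHERE customer_key IN (SELECT customer_key FROM store.store_sales_fact)
    GROUP BY customer_state;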
See also
16 - CREATE LOCATION
Creates a storage location where Vertica can store data. After you create the location, you create storage policies that assign the storage location to the database objects that will store data in the location.
Caution
While no technical issue prevents you from using CREATE LOCATION to add one or more Network File System (NFS) storage locations, Vertica does not support NFS data or catalog storage except for MapR mount points. You will be unable to run queries against any other NFS data. When creating locations on MapR file systems, you must specify ALL NODES SHARED.
If you use HDFS storage locations, the HDFS data must be available when you start Vertica. Your HDFS cluster must be operational, and the ROS files must be present. If you moved data files, or they are corrupted, or your HDFS cluster is not responsive, Vertica cannot start.
Syntax
CREATE LOCATION 'path'
[NODE 'node' | ALL NODES]
[SHARED]
[USAGE 'usage']
[LABEL 'label']
[LIMIT 'size']
Arguments
path
- Where to store this location's data. The type of file system on which the location is based determines the path format:
- Linux: Absolute path to the directory where Vertica can write the storage location's data.
- Shared file systems: See the URL specifications in HDFS file system, S3 object store, Google Cloud Storage (GCS) object store, and Azure Blob Storage object store. HDFS storage locations have additional requirements.
ALL NODES | NODE 'node'
- The node or nodes on which the storage location is defined, one of the following:
- ALL NODES (default): Create the storage location on each node. If SHARED is also specified, create the storage location once for use by all nodes.
- NODE 'node': Create the storage location on a single node, where node is the name of the node in the NODES system table. You cannot use this option with SHARED.
SHARED
- Indicates the location set by path is shared (used by all nodes) rather than local to each node. You cannot specify individual nodes with SHARED; you must use ALL NODES.
Most remote file systems such as HDFS and S3 are shared. For these file systems, the path argument represents a single location in the remote file system where all nodes store data. If using a remote file system, you must specify SHARED, even for one-node clusters.
Note
If path is set to S3 communal storage, SHARED is always implied and can be omitted.
Deprecated
SHARED DATA and SHARED DATA,TEMP storage locations are deprecated.
USAGE 'usage'
- The type of data the storage location can hold, where usage is one of the following:
- DATA,TEMP (default): The storage location can store persistent and temporary DML-generated data, and data for temporary tables.
- TEMP: A path-specified location to store DML-generated temporary data. If path is set to S3, then this location is used only when the RemoteStorageForTemp configuration parameter is set to 1, and TEMP must be qualified with ALL NODES SHARED. For details, see S3 Storage of Temporary Data.
- DATA: The storage location can only store persistent data.
- USER: Users with READ and WRITE privileges can access data and external tables of this storage location.
- DEPOT: The storage location is used in Eon Mode to store the depot. Only create DEPOT storage locations on local Linux file systems.
Vertica allows a single DEPOT storage location per node. If you want to move your depot to a different location (on a different file system, for example), you must first drop the old depot storage location, then create the new location.
LABEL 'label'
- A label for the storage location. You use this name later when assigning the storage location to data objects.
Important
You must supply a label for depot storage locations.
LIMIT 'size'
- Valid only if the storage location usage type is set to DEPOT. Specifies the maximum amount of disk space that the depot can allocate from the storage location's file system.
You can specify size in two ways:
- integer%: Percentage of storage location disk size.
- integer{K|M|G|T}: Amount of storage location disk size in kilobytes, megabytes, gigabytes, or terabytes.
Important
The depot size cannot exceed 80 percent of the file system disk space where the depot is stored. If you specify a value that is too large, Vertica issues a warning and automatically changes the value to 80 percent of the file system size.
If you do not specify a limit, it is set to 60 percent.
Privileges
Superuser
File system access
The Vertica process must have read and write permissions to the location where data is to be stored. Each file system has its own requirements:
File system | Requirements |
---|---|
Linux | Database superuser account (usually named dbadmin) must have full read and write access to the directory in the path argument. |
HDFS without Kerberos | A Hadoop user whose username matches the Vertica database administrator username (usually dbadmin) must have read and write access to the HDFS directory specified in the path argument. The UseServerIdentityOverUserIdentity configuration parameter must be set to true in the user session; otherwise Vertica tries to use the identity associated with the logged-in user. |
HDFS with Kerberos | A Hadoop user whose username matches the principal in the keytab file on each Vertica node must have read and write access to the HDFS directory stored in the path argument. This is not the same as the database administrator username. The UseServerIdentityOverUserIdentity configuration parameter must be set to true in the user session; otherwise Vertica tries to use the Kerberos principal associated with the logged-in user. |
Object stores (S3, GCS, Azure) | Database-level credentials must be specified and provide full read and write access to the location in the path argument. If session-level credentials are specified they are used, directly overriding the use of the storage location. |
Examples
Create a storage location in the local Linux file system for temporary data storage:
=> CREATE LOCATION '/home/dbadmin/testloc' USAGE 'TEMP' LABEL 'tempfiles';
Create a storage location on HDFS. The HDFS cluster does not use Kerberos:
=> CREATE LOCATION 'hdfs://hadoopNS/vertica/colddata' ALL NODES SHARED
USAGE 'data' LABEL 'coldstorage';
Create the same storage location, but on a Hadoop cluster that uses Kerberos. Note the output that reports the principal being used:
=> CREATE LOCATION 'hdfs://hadoopNS/vertica/colddata' ALL NODES SHARED
USAGE 'data' LABEL 'coldstorage';
NOTICE 0: Performing HDFS operations using kerberos principal [vertica/hadoop.example.com]
CREATE LOCATION
Create a location for user data, grant access to it, and use it to create an external table:
=> CREATE LOCATION '/tmp' ALL NODES USAGE 'user';
CREATE LOCATION
=> GRANT ALL ON LOCATION '/tmp' to Bob;
GRANT PRIVILEGE
=> CREATE EXTERNAL TABLE ext1 (x integer) AS COPY FROM '/tmp/data/ext1.dat' DELIMITER ',';
CREATE TABLE
Create a user storage location on S3 and a role, so that users without their own S3 credentials can read data from S3 using the server credential:
--- set database-level credential (once):
=> ALTER DATABASE DEFAULT SET AWSAuth = 'myaccesskeyid123456:mysecretaccesskey123456789012345678901234';
=> CREATE LOCATION 's3://datalake' SHARED USAGE 'USER' LABEL 's3user';
=> CREATE ROLE ExtUsers;
--- Assign users to this role using GRANT (Role).
=> GRANT READ ON LOCATION 's3://datalake' TO ExtUsers;
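In Eon Mode, a depot storage location combines the DEPOT usage type with the required label and an optional limit. A sketch, with an illustrative path:
=> CREATE LOCATION '/home/dbadmin/depot' ALL NODES USAGE 'DEPOT'
   LABEL 'depot_location' LIMIT '60%';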
See also
17 - CREATE NETWORK ADDRESS
Creates a network address that can be used as part of a connection load balancing policy. A network address creates a name in the Vertica catalog for an IP address and port number associated with a node. Nodes can have multiple network addresses, up to one for each IP address they have on the network.
Syntax
CREATE NETWORK ADDRESS name ON node WITH 'ip-address' [PORT port-number] [ENABLED | DISABLED]
Parameters
name
- The name of the new network address. Use this name when creating connection load balancing groups.
node
- The name of the node on which to create the network address. This should be the name of the node as it appears in the node_name column of the system table NODES.
ip-address
- The IPv4 or IPv6 address on the node to associate with the network address.
Note
Vertica does not verify that the IP address you supply in this parameter is actually associated with the specified node. Be sure that the IP address actually belongs to the node; otherwise, your load balancing policy is liable to send a client connection to the wrong node, or to a non-Vertica host. Vertica rejects IP addresses that are invalid for a node. For example, it checks whether the IP address falls in the loopback address range of 127.0.0.0/8. If it finds that the IP address is invalid, CREATE NETWORK ADDRESS returns an error.
PORT port-number
- Sets the port number for the network address. You must supply a network address when altering the port number.
ENABLED | DISABLED
- Enables or disables the network address.
Privileges
Superuser
Examples
Create three network addresses, one for each node in a three-node cluster:
=> SELECT node_name,export_address from v_catalog.nodes;
node_name | export_address
---------------------+----------------
v_vmart_br_node0001 | 10.20.100.62
v_vmart_br_node0002 | 10.20.100.63
v_vmart_br_node0003 | 10.20.100.64
(3 rows)
=> CREATE NETWORK ADDRESS node01 ON v_vmart_br_node0001 WITH '10.20.100.62';
CREATE NETWORK ADDRESS
=> CREATE NETWORK ADDRESS node02 ON v_vmart_br_node0002 WITH '10.20.100.63';
CREATE NETWORK ADDRESS
=> CREATE NETWORK ADDRESS node03 ON v_vmart_br_node0003 WITH '10.20.100.64';
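To create an address on a non-default port, include the optional PORT and ENABLED clauses. A sketch with illustrative values:
=> CREATE NETWORK ADDRESS node01_alt ON v_vmart_br_node0001 WITH '10.20.100.62' PORT 5434 ENABLED;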
See also
18 - CREATE NETWORK INTERFACE
Identifies a network interface to which a node belongs.
Use this statement when you want to configure import/export operations from individual nodes to other Vertica clusters. By default, when you install Vertica, it creates interfaces for all connected networks. You would only need CREATE NETWORK INTERFACE in situations where the network topology has changed since you installed Vertica.
Note
Do not confuse this statement with CREATE NETWORK ADDRESS, which is used to identify network addresses for connection load balancing (see Connection load balancing policies).
Syntax
CREATE NETWORK INTERFACE network-interface-name ON node-name [WITH] 'node-IP-address' [PORT port-number] [ENABLED | DISABLED]
Parameters
network-interface-name
- The name you assign to the network interface, where network-interface-name conforms to conventions described in Identifiers.
node-name
- The name of the node.
node-IP-address
- The node's IP address, either a public or private IP address. For more information, see Using Public and Private IP Networks.
PORT port-number
- Sets the port number for the network interface. You must supply a network interface when altering the port number.
ENABLED | DISABLED
- Enables or disables the network interface.
Privileges
Superuser
Examples
Create a network interface:
=> CREATE NETWORK INTERFACE mynetwork ON v_vmart_node0001 WITH '123.4.5.6' PORT 456 ENABLED;
19 - CREATE NOTIFIER
Creates a push-based notifier to send event notifications and messages out of Vertica.
Syntax
CREATE NOTIFIER [ IF NOT EXISTS ] notifier-name ACTION 'notifier-type'
[ ENABLE | DISABLE ]
[ MAXPAYLOAD 'integer{K|M}' ]
MAXMEMORYSIZE 'integer{K|M|G|T}'
[ TLSMODE 'tls-mode' ]
[ CA BUNDLE bundle-name [ CERTIFICATE certificate-name ] ]
[ IDENTIFIED BY 'uuid' ]
[ [NO] CHECK COMMITTED ]
[ PARAMETERS 'adapter-params' ]
Parameters
IF NOT EXISTS
If an object with the same name exists, do not create it and proceed. If you omit this option and the object exists, Vertica generates a ROLLBACK error message. In both cases, the object is not created if it already exists.
The IF NOT EXISTS clause is useful for SQL scripts where you want to create an object if it does not already exist.
For related information, see ON_ERROR_STOP.
notifier-name
- This notifier's unique identifier.
ACTION 'notifier-type'
- String, the type of notifier, one of the following:
- A URL that identifies one or more target Kafka servers, in the following format:
kafka://kafka-server-ip-address:port-number
To enable failover when a Kafka server is unavailable, specify additional hosts in a comma-delimited list. For example:
kafka://192.0.2.0:9092,192.0.2.1:9092,192.0.2.2:9092
- syslog: Notifications are sent to syslog. To use notifiers of this type, you must set the SyslogEnabled parameter:
=> ALTER DATABASE DEFAULT SET SyslogEnabled = 1;
Events monitored by this notifier type are not logged to MONITORING_EVENTS or vertica.log.
ENABLE | DISABLE
- Specifies whether to enable or disable the notifier.
Default: ENABLE
MAXPAYLOAD 'integer{K|M}'
- The maximum size of the message, up to 10^9 bytes, specified in kilobytes or megabytes.
The following restrictions apply:
- MAXPAYLOAD cannot be greater than MAXMEMORYSIZE.
- If you configure syslog to send messages to a remote destination, ensure that MaxMessageSize (in /etc/rsyslog for rsyslog) is greater than or equal to MAXPAYLOAD.
Defaults:
- Kafka: 1M
- syslog: 1M
MAXMEMORYSIZE 'integer{K|M|G|T}'
- The maximum size of the internal notifier, up to 2 TB, specified in kilobytes, megabytes, gigabytes, or terabytes. MAXMEMORYSIZE must be greater than MAXPAYLOAD.
If the size of the message queue exceeds MAXMEMORYSIZE, the notifier drops excess messages.
TLSMODE 'tls-mode'
- Specifies the type of connection between the notifier and an endpoint, one of the following:
- disable (default): Plaintext connection.
- verify-ca: Encrypted connection, and the server's certificate is verified as being signed by a trusted CA.
If the notifier ACTION is 'syslog', this parameter has no effect; you must configure syslog for TLS to enable encryption for these Vertica syslog notifiers.
CA BUNDLE bundle-name
- Specifies a CA bundle. The certificates inside the bundle are used to validate the Kafka server's certificate if the TLSMODE requires it.
If a CA bundle is specified for a notifier that currently uses disable, which doesn't validate the Kafka server's certificate, the bundle goes unused when connecting to the Kafka server. This behavior persists unless the TLSMODE is changed to one that validates server certificates.
Changes to the contents of the CA bundle take effect either after the notifier is disabled and re-enabled or after the database restarts. However, changes to which CA bundle the notifier uses take effect immediately.
If the notifier ACTION is 'syslog', this parameter has no effect; you must configure syslog for TLS to enable encryption for these Vertica syslog notifiers.
CERTIFICATE certificate-name
- Specifies a client certificate for validation by the endpoint.
If the notifier ACTION is 'syslog', this parameter has no effect; you must configure syslog for TLS to enable encryption for these Vertica syslog notifiers.
IDENTIFIED BY 'uuid'
- Specifies the notifier's unique identifier. If set, all the messages published by this notifier have this attribute.
[NO] CHECK COMMITTED
- Specifies to wait for delivery confirmation before sending the next message in the queue.
Some messaging systems, like syslog, do not support delivery confirmation.
PARAMETERS 'adapter-params'
- Specifies one or more optional adapter parameters that are passed as a string to the adapter. Adapter parameters apply only to the adapter associated with the notifier.
For Kafka notifiers, refer to Kafka and Vertica configuration settings.
For syslog notifiers, specify the severity of the event with eventSeverity=severity, where severity is one of the following:
- 0: Emergency
- 1: Alert
- 2: Critical
- 3: Error
- 4: Warning
- 5: Notice
- 6: Informational
- 7: Debug
Most syslog implementations, by default, do not log events with a severity level of 7. You must configure syslog to record these types of events.
Privileges
Superuser
Encrypted notifiers for SASL_SSL Kafka configurations
Follow this procedure to create or alter notifiers for Kafka endpoints that use SASL_SSL. Note that you must repeat this procedure whenever you change the TLSMODE, certificates, or CA bundle for a given notifier.
-
Use CREATE or ALTER to disable the notifier while setting the TLSMODE, certificate, and CA bundle.
=> ALTER NOTIFIER encrypted_notifier DISABLE TLSMODE 'verify-ca' CA BUNDLE ca_bundle2;
-
ALTER the notifier and set the proper rdkafka adapter parameters for SASL_SSL.
=> ALTER NOTIFIER encrypted_notifier PARAMETERS 'sasl.username=user;sasl.password=password;sasl.mechanism=PLAIN;security.protocol=SASL_SSL';
-
Enable the notifier.
=> ALTER NOTIFIER encrypted_notifier ENABLE;
Examples
Kafka notifiers
Create a Kafka notifier:
=> CREATE NOTIFIER my_dc_notifier
ACTION 'kafka://172.16.20.10:9092'
MAXMEMORYSIZE '1G'
IDENTIFIED BY 'f8b0278a-3282-4e1a-9c86-e0f3f042a971'
NO CHECK COMMITTED;
Create a notifier with an adapter-specific parameter:
=> CREATE NOTIFIER my_notifier
ACTION 'kafka://127.0.0.1:9092'
MAXMEMORYSIZE '10M'
PARAMETERS 'queue.buffering.max.ms=1000';
Create a notifier that uses an encrypted connection and verifies the Kafka server's certificate with the provided CA bundle:
=> CREATE NOTIFIER encrypted_notifier
ACTION 'kafka://127.0.0.1:9092'
MAXMEMORYSIZE '10M'
TLSMODE 'verify-ca'
CA BUNDLE ca_bundle;
Syslog notifiers
The following example creates a notifier that writes a message to syslog when the Data collector (DC) component LoginFailures
updates:
-
Enable syslog notifiers for the current database:
=> ALTER DATABASE DEFAULT SET SyslogEnabled = 1;
-
Create and enable a syslog notifier
v_syslog_notifier
:=> CREATE NOTIFIER v_syslog_notifier ACTION 'syslog' ENABLE MAXMEMORYSIZE '10M' IDENTIFIED BY 'f8b0278a-3282-4e1a-9c86-e0f3f042a971' PARAMETERS 'eventSeverity = 5';
-
Configure the syslog notifier
v_syslog_notifier
for updates to theLoginFailures
DC component with SET_DATA_COLLECTOR_NOTIFY_POLICY:=> SELECT SET_DATA_COLLECTOR_NOTIFY_POLICY('LoginFailures','v_syslog_notifier', 'Login failed!', true);
This notifier writes the following message to syslog (default location:
/var/log/messages
) when a user fails to authenticate as the userBob
:Apr 25 16:04:58 vertica_host_01 vertica: Event Posted: Event Code:21 Event Id:0 Event Severity: Notice [5] PostedTimestamp: 2022-04-25 16:04:58.083063 ExpirationTimestamp: 2022-04-25 16:04:58.083063 EventCodeDescription: Notifier ProblemDescription: (Login failed!) { "_db":"VMart", "_schema":"v_internal", "_table":"dc_login_failures", "_uuid":"f8b0278a-3282-4e1a-9c86-e0f3f042a971", "authentication_method":"Reject", "client_authentication_name":"default: Reject", "client_hostname":"::1", "client_label":"", "client_os_user_name":"dbadmin", "client_pid":523418, "client_version":"", "database_name":"dbadmin", "effective_protocol":"3.8", "node_name":"v_vmart_node0001", "reason":"REJECT", "requested_protocol":"3.8", "ssl_client_fingerprint":"", "ssl_client_subject":"", "time":"2022-04-25 16:04:58.082568-05", "user_name":"Bob" }#012 DatabaseName: VMart Hostname: vertica_host_01
For details on syslog notifiers, see Configuring reporting for syslog.
See also
20 - CREATE PROCEDURE (external)
Enterprise Mode only
Adds an external procedure to Vertica. See External procedures for more information.
Syntax
CREATE PROCEDURE [ IF NOT EXISTS ]
[[database.]schema.]procedure( [ argument-list ] )
AS executable
LANGUAGE 'EXTERNAL'
USER OS-user
Parameters
IF NOT EXISTS
If an object with the same name exists, do not create it and proceed. If you omit this option and the object exists, Vertica generates a ROLLBACK error message. In both cases, the object is not created if it already exists.
The IF NOT EXISTS clause is useful for SQL scripts where you want to create an object if it does not already exist.
For related information, see ON_ERROR_STOP.
This option cannot be used with OR REPLACE.
[database.]schema
- Database and schema. The default schema is public. If you specify a database, it must be the current database.
procedure
- Specifies a name for the external procedure, where procedure conforms to conventions described in Identifiers.
argument-list
- A comma-delimited list of procedure arguments, where each argument is specified as follows:
[ argname ] argtype
- argname optionally provides a descriptive name for this argument.
- argtype must be one of the following data types supported by Vertica:
- BIGINT
- BOOLEAN
- DECIMAL
- DOUBLE PRECISION
- FLOAT
- FLOAT8
- INT
- INT8
- INTEGER
- MONEY
- NUMBER
- NUMERIC
- REAL
- SMALLINT
- TINYINT
- VARCHAR
executable
- The name of the executable program in the procedures directory, a string.
OS-user
- The owner of the file, a string. The owner:
- Cannot be root
- Must have execute privileges on executable
Privileges
Superuser
System security
-
The procedure file must be owned by the database administrator (OS account) or by a user in the same group as the administrator. The procedure file must also have the set UID attribute enabled, and allow read and execute permission for the group.
-
External procedures that you create with CREATE PROCEDURE (external) are always run with Linux dbadmin privileges. If a dbadmin or pseudosuperuser grants a non-dbadmin permission to run a procedure using GRANT (procedure), be aware that the non-dbadmin user runs the procedure with full Linux dbadmin privileges.
Examples
The following example shows how to create a procedure named helloplanet
for the procedure file helloplanet.sh
. This file accepts one VARCHAR argument.
Create the file:
#!/bin/bash
echo "hello planet argument: $1" >> /tmp/myprocedure.log
Create the procedure with the following SQL:
=> CREATE PROCEDURE helloplanet(arg1 varchar) AS 'helloplanet.sh' LANGUAGE 'external' USER 'dbadmin';
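Once the file is installed in the procedures directory, you would typically invoke the procedure like a function call; a sketch:
=> SELECT helloplanet('earth');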
See also
21 - CREATE PROCEDURE (stored)
Creates a stored procedure.
Syntax
CREATE [ OR REPLACE ] PROCEDURE [ IF NOT EXISTS ]
[[database.]schema.]procedure( [ parameter-list ] )
[ LANGUAGE 'language-name' ]
[ SECURITY { DEFINER | INVOKER } ]
AS $$ source $$;
Parameters
OR REPLACE
- If a procedure with the same name already exists, replace it. Users and roles with privileges on the original procedure retain these privileges on the new procedure.
This option cannot be used with IF NOT EXISTS.
IF NOT EXISTS
If an object with the same name exists, do not create it and proceed. If you omit this option and the object exists, Vertica generates a ROLLBACK error message. In both cases, the object is not created if it already exists.
The IF NOT EXISTS clause is useful for SQL scripts where you want to create an object if it does not already exist.
For related information, see ON_ERROR_STOP.
This option cannot be used with OR REPLACE.
[database.]schema
- Database and schema. The default schema is public. If you specify a database, it must be the current database.
procedure
- The name of the stored procedure, where procedure conforms to conventions described in Identifiers.
parameter-list
- A comma-delimited list of formal parameters, each specified as follows:
[ parameter-mode ] parameter-name parameter-type
- parameter-name: The name of the parameter.
- parameter-type: Any SQL data type, with the following exceptions:
- DECIMAL
- NUMERIC
- NUMBER
- MONEY
- UUID
- GEOGRAPHY
- GEOMETRY
- Complex types
language-name
- Specifies the language of the procedure source, one of the following (both options refer to PLvSQL; PLpgSQL is included to maintain compatibility with existing scripts):
- PLvSQL
- PLpgSQL
Default: PLvSQL
SECURITY { DEFINER | INVOKER }
- Determines whose privileges to use when the procedure is called, executing it as if the user were one of the following:
- DEFINER: User who defined the procedure
- INVOKER: User who called the procedure
A procedure with SECURITY DEFINER effectively executes the procedure as that user, so changes to the database appear to be performed by the procedure's definer rather than its caller.
Caution
Improper use of SECURITY DEFINER can lead to the confused deputy problem and introduce vulnerabilities into your system, such as SQL injection.
For more information, see Executing stored procedures.
source
- The procedure source code. For details, see Scope and structure.
Privileges
Non-superuser: CREATE on the procedure's schema
Examples
For more complex examples, see Stored procedures: use cases and examples
This procedure prints its arguments:
=> CREATE PROCEDURE raiseXY(IN x INT, y VARCHAR) LANGUAGE PLvSQL AS $$
BEGIN
RAISE NOTICE 'x = %', x;
RAISE NOTICE 'y = %', y;
-- some processing statements
END;
$$;
CALL raiseXY(3, 'some string');
NOTICE 2005: x = 3
NOTICE 2005: y = some string
For more information on RAISE NOTICE, see Errors and diagnostics.
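As a further sketch, the following procedure runs with its definer's privileges; the audit_log table is hypothetical:
=> CREATE PROCEDURE log_action(IN msg VARCHAR) LANGUAGE PLvSQL SECURITY DEFINER AS $$
BEGIN
    -- audit_log is a hypothetical table owned by the procedure's definer
    PERFORM INSERT INTO audit_log SELECT NOW(), msg;
END;
$$;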
See also
22 - CREATE PROFILE
Creates a profile that controls password requirements for users.
Syntax
CREATE PROFILE profile-name LIMIT [
PASSWORD_LIFE_TIME setting
PASSWORD_MIN_LIFE_TIME setting
PASSWORD_GRACE_TIME setting
FAILED_LOGIN_ATTEMPTS setting
PASSWORD_LOCK_TIME setting
PASSWORD_REUSE_MAX setting
PASSWORD_REUSE_TIME setting
PASSWORD_MAX_LENGTH setting
PASSWORD_MIN_LENGTH setting
PASSWORD_MIN_LETTERS setting
PASSWORD_MIN_UPPERCASE_LETTERS setting
PASSWORD_MIN_LOWERCASE_LETTERS setting
PASSWORD_MIN_DIGITS setting
PASSWORD_MIN_SYMBOLS setting
PASSWORD_MIN_CHAR_CHANGE setting ]
Parameters
Note
All parameters that are not explicitly set in a new profile are set to default, and inherit their settings from the default profile.
Name | Description |
---|---|
name |
The name of the profile to create, where name conforms to conventions described in Identifiers. To modify the default profile, set name to default. |
PASSWORD_LIFE_TIME |
Set to an integer value, one of the following:
After your password's lifetime and grace period expire, you must change your password on your next login, if you have not done so already. |
PASSWORD_MIN_LIFE_TIME |
Set to an integer value, one of the following:
|
PASSWORD_GRACE_TIME |
Set to an integer value, one of the following:
|
FAILED_LOGIN_ATTEMPTS |
Set to an integer value, one of the following:
|
PASSWORD_LOCK_TIME |
|
PASSWORD_REUSE_MAX |
Set to an integer value, one of the following:
|
PASSWORD_REUSE_TIME |
Set to an integer value, one of the following:
|
PASSWORD_MAX_LENGTH |
The maximum number of characters allowed in a password, one of the following:
|
PASSWORD_MIN_LENGTH |
The minimum number of characters required in a password, one of the following:
|
PASSWORD_MIN_LETTERS |
Minimum number of letters (a-z and A-Z) that must be in a password, one of the following:
|
PASSWORD_MIN_UPPERCASE_LETTERS |
Minimum number of uppercase letters (A-Z) that must be in a password, one of the following:
|
PASSWORD_MIN_LOWERCASE_LETTERS |
Minimum number of lowercase letters (a-z) that must be in a password, one of the following:
|
PASSWORD_MIN_DIGITS |
Minimum number of digits (0-9) that must be in a password, one of the following:
|
PASSWORD_MIN_SYMBOLS |
Minimum number of symbols—printable non-letter and non-digit characters such as $, #, @—that must be in a password, one of the following:
|
PASSWORD_MIN_CHAR_CHANGE |
Minimum number of characters that must be different from the previous password:
|
Privileges
Superuser
Profile settings and client authentication
The following profile settings affect client authentication methods, such as LDAP or GSS:
-
FAILED_LOGIN_ATTEMPTS
-
PASSWORD_LOCK_TIME
All other profile settings are used only by Vertica to manage its passwords.
Examples
=> CREATE PROFILE sample_profile LIMIT PASSWORD_MAX_LENGTH 20;
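Multiple limits can be combined in a single statement. A sketch with illustrative settings:
=> CREATE PROFILE strict_profile LIMIT FAILED_LOGIN_ATTEMPTS 3 PASSWORD_LOCK_TIME 1
   PASSWORD_MIN_LENGTH 12 PASSWORD_MIN_SYMBOLS 1;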
See also
23 - CREATE PROJECTION
Creates metadata for a projection in the Vertica catalog. Vertica supports four types of projections:
-
Standard projection: Stores collection of table data in a format that optimizes execution of certain queries on that table.
-
Live aggregate projection: Stores the grouped results of queries that invoke aggregate functions (such as SUM) on table columns.
-
Top-K projection: Stores the top k rows from partitions of selected rows.
-
UDTF projection: Stores newly-loaded data after it is transformed and/or aggregated by user-defined transformation functions (UDTFs).
Complex data types have additional restrictions when used within a projection:
-
Each projection must include at least one column that is a primitive type or native array.
-
An AS SELECT clause can use a complex-type column, but any other expression must be of a scalar type or native array.
-
The ORDER BY, PARTITION BY, and GROUP BY clauses cannot use complex types.
-
If a projection does not include an ORDER BY or segmentation clause, Vertica uses only the primitive columns from the select list to order or segment data.
-
Projection columns cannot be complex types returned from functions such as ARRAY_CAT.
-
TopK and UDTF projections do not support complex types.
23.1 - Encoding types
Vertica supports various encoding and compression types, specified by the following ENCODING
parameter arguments:
Note
Vertica supports the following encoding for numeric data types:
- Precision ≤ 18: AUTO, BLOCK_DICT, BLOCKDICT_COMP, COMMONDELTA_COMP, DELTAVAL, GCDDELTA, and RLE
- Precision > 18: AUTO, BLOCK_DICT, BLOCKDICT_COMP, RLE
You can set encoding types on a projection column when you create the projection. You can also change the encoding of one or more projection columns for a given table with ALTER TABLE...ALTER COLUMN.
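For example, the following sketch changes a column's encoding on an existing projection; the table, column, and projection names are illustrative:
=> ALTER TABLE trades ALTER COLUMN stock ENCODING RLE PROJECTIONS (trades_p1);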
AUTO (default)
AUTO encoding is ideal for sorted, many-valued columns such as primary keys. It is also suitable for general purpose applications for which no other encoding or compression scheme is applicable. Therefore, it serves as the default if no encoding/compression is specified.
Column data type | Default encoding type |
---|---|
BINARY/VARBINARY BOOLEAN CHAR/VARCHAR FLOAT |
Lempel-Ziv-Oberhumer-based (LZO) compression |
DATE/TIME/TIMESTAMP INTEGER INTERVAL |
Compression scheme based on the delta between consecutive column values. |
The CPU requirements for this type are relatively small. In the worst case, data might expand by eight percent (8%) for LZO and twenty percent (20%) for integer data.
BLOCK_DICT
For each block of storage, Vertica compiles distinct column values into a dictionary and then stores the dictionary and a list of indexes to represent the data block.
BLOCK_DICT is ideal for few-valued, unsorted columns where saving space is more important than encoding speed. Certain kinds of data, such as stock prices, are typically few-valued within a localized area after the data is sorted, such as by stock symbol and timestamp, and are good candidates for BLOCK_DICT. By contrast, long CHAR/VARCHAR columns are not good candidates for BLOCK_DICT encoding.
CHAR and VARCHAR columns that contain 0x00 or 0xFF characters should not be encoded with BLOCK_DICT. Also, BINARY/VARBINARY columns do not support BLOCK_DICT encoding.
BLOCK_DICT encoding requires significantly higher CPU usage than default encoding schemes. The maximum data expansion is eight percent (8%).
BLOCKDICT_COMP
This encoding type is similar to BLOCK_DICT except dictionary indexes are entropy coded. This encoding type requires significantly more CPU time to encode and decode and has a poorer worst-case performance. However, if the distribution of values is extremely skewed, using BLOCKDICT_COMP encoding can lead to space savings.
BZIP_COMP
BZIP_COMP encoding uses the bzip2 compression algorithm on the block contents. See bzip web site for more information. This algorithm results in higher compression than the automatic LZO and gzip encoding; however, it requires more CPU time to compress. This algorithm is best used on large string columns such as VARCHAR, VARBINARY, CHAR, and BINARY. Choose this encoding type when you are willing to trade slower load speeds for higher data compression.
COMMONDELTA_COMP
This compression scheme builds a dictionary of all deltas in the block and then stores indexes into the delta dictionary using entropy coding.
This scheme is ideal for sorted FLOAT and INTEGER-based (DATE/TIME/TIMESTAMP/INTERVAL) data columns with predictable sequences and only occasional sequence breaks, such as timestamps recorded at periodic intervals or primary keys. For example, the following sequence compresses well: 300, 600, 900, 1200, 1500, 600, 1200, 1800, 2400. The following sequence does not compress well: 1, 3, 6, 10, 15, 21, 28, 36, 45, 55.
If delta distribution is excellent, columns can be stored in less than one bit per row. However, this scheme is very CPU intensive. If you use this scheme on data with arbitrary deltas, it can cause significant data expansion.
DELTARANGE_COMP
This compression scheme is primarily used for floating-point data; it stores each value as a delta from the previous one.
This scheme is ideal for many-valued FLOAT columns that are sorted or confined to a range. Do not use this scheme for unsorted columns that contain NULL values, as the storage cost for representing a NULL value is high. This scheme has a high cost for both compression and decompression.
To determine if DELTARANGE_COMP is suitable for a particular set of data, compare it to other schemes. Be sure to use the same sort order as the projection, and select sample data that will be stored consecutively in the database.
DELTAVAL
For INTEGER and DATE/TIME/TIMESTAMP/INTERVAL columns, data is recorded as a difference from the smallest value in the data block. This encoding has no effect on other data types.
DELTAVAL is best used for many-valued, unsorted integer or integer-based columns. CPU requirements for this encoding type are minimal, and data never expands.
GCDDELTA
For INTEGER and DATE/TIME/TIMESTAMP/INTERVAL columns, and NUMERIC columns with 18 or fewer digits, data is recorded as the difference from the smallest value in the data block divided by the greatest common divisor (GCD) of all entries in the block. This encoding has no effect on other data types.
ENCODING GCDDELTA is best used for many-valued, unsorted, integer columns or integer-based columns, when the values are a multiple of a common factor. For example, timestamps are stored internally in microseconds, so data that is only precise to the millisecond are all multiples of 1000. The CPU requirements for decoding GCDDELTA encoding are minimal, and the data never expands, but GCDDELTA may take more encoding time than DELTAVAL.
GZIP_COMP
This encoding type uses the gzip compression algorithm. See gzip web site for more information. This algorithm results in better compression than the automatic LZO compression, but lower compression than BZIP_COMP. It requires more CPU time to compress than LZO but less CPU time than BZIP_COMP. This algorithm is best used on large string columns such as VARCHAR, VARBINARY, CHAR, and BINARY. Use this encoding when you want a better compression than LZO, but at less CPU time than bzip2.
RLE
RLE (run length encoding) replaces sequences (runs) of identical values with a single pair that contains the value and number of occurrences. Therefore, it is best used for low cardinality columns that are present in the ORDER BY clause of a projection.
The Vertica execution engine processes RLE encoding run-by-run and the Vertica optimizer gives it preference. Use it only when run length is large, such as when low-cardinality columns are sorted.
Zstandard compression
Vertica supports three ZSTD compression types:
-
ZSTD_COMP
provides high compression ratios. This encoding type has a higher compression than gzip. Use this when you want a better compression than gzip. For general use cases, use this or theZSTD_FAST_COMP
encoding type. -
ZSTD_FAST_COMP
uses the fastest compression level that the zstd library provides. It is the fastest encoding type of the zstd library, but takes up more space than the other two encoding types. For general use cases, use this or theZSTD_COMP
encoding type. -
ZSTD_HIGH_COMP
offers the best compression in the zstd library. It is slower than the other two encoding types. Use this type when you need the best compression, with slower CPU time.
23.2 - GROUPED clause
Enterprise Mode only
Groups two or more columns into a single disk file. This minimizes file I/O for workloads that:
-
Read a large percentage of the columns in a table.
-
Perform single row look-ups.
-
Query against many small columns.
-
Frequently update data in these columns.
If you have data that is always accessed together and is not used in predicates, you can increase query performance by grouping these columns. Once grouped, queries can no longer retrieve from disk all records for an individual column independently of the other columns within the group.
Note
RLE encoding is reduced when an RLE column is grouped with one or more non-RLE columns.
When grouping columns you can:
-
Group some of the columns:
(a, GROUPED(b, c), d)
-
Group all of the columns:
(GROUPED(a, b, c, d))
-
Create multiple groupings in the same projection:
(GROUPED(a, b), GROUPED(c, d))
Note
Vertica performs dynamic column grouping. For example, to provide better read and write efficiency for small loads, Vertica ignores any projection-defined column grouping (or lack thereof) and groups all columns together by default.
Grouping correlated columns
The following example shows how to group highly correlated columns bid
and ask
. The stock
column is stored separately.
=> CREATE TABLE trades (stock CHAR(5), bid INT, ask INT);
=> CREATE PROJECTION tradeproj (stock ENCODING RLE,
GROUPED(bid ENCODING DELTAVAL, ask))
AS (SELECT * FROM trades) KSAFE 1;
The following example shows how to create a projection that uses expressions in the column definition. The projection contains two integer columns a
and b
, and a third column product_value
that stores the product of a
and b
:
=> CREATE TABLE values (a INT, b INT);
=> CREATE PROJECTION product (a, b, product_value) AS
SELECT a, b, a*b FROM values ORDER BY a KSAFE;
23.3 - Hash segmentation clause
Specifies how to segment projection data for distribution across all cluster nodes. You can specify segmentation for a table and a projection. If a table definition specifies segmentation, Vertica uses it for that table's auto-projections.
It is strongly recommended that you use Vertica's built-in
HASH
function, which distributes data evenly across the cluster, and facilitates optimal query execution.
Syntax
SEGMENTED BY expression ALL NODES [ OFFSET offset ]
Parameters
SEGMENTED BY
expression
- A general SQL expression. Hash segmentation is the preferred method of segmentation. Vertica recommends using its built-in
HASH
function, whose arguments resolve to table columns. If you use an expression other thanHASH
, Vertica issues a warning.The segmentation expression should specify columns with a large number of unique data values and acceptable skew in their data distribution. In general, primary key columns that meet these criteria are good candidates for hash segmentation.
For details, see Expression Requirements below.
ALL NODES
- Automatically distributes data evenly across all nodes when the projection is created. Node ordering is fixed.
OFFSET
offset
- A zero-based offset that indicates on which node to start segmentation distribution.
This option is not valid for
CREATE TABLE
andCREATE TEMPORARY TABLE
.Important
If you create a projection for a table with the
OFFSET
option, be sure to create enough copies of each projection segment to satisfy system K-safety; otherwise, Vertica regards the projection as unsafe and cannot use it to query the table.You can ensure K-safety compliance when you create projections by combining
OFFSET
andKSAFE
options in theCREATE PROJECTION
statement. On executing this statement, Vertica automatically creates the necessary number of projection copies.
Expression requirements
A segmentation expression must specify table columns as they are defined in the source table. Projection column names are not supported.
The following restrictions apply to segmentation expressions:
-
All leaf expressions must be constants or column references to a column in the
CREATE PROJECTION
'sSELECT
list. -
The expression must return the same value over the life of the database.
-
Aggregate functions are not allowed.
-
The expression must return non-negative
INTEGER
values in the range0 <= x < 2
, and values are generally distributed uniformly over that range.
Note
If the expression produces a value outside the expected range—for example, a negative value—no error occurs, and the row is added to the projection's first segment.
Examples
The following CREATE PROJECTION
statement creates projection public.employee_dimension_super
. It specifies to include all columns in table public.employee_dimension
. The hash segmentation clause invokes the Vertica HASH
function to segment projection data on the column employee_key
; it also includes the ALL NODES
clause, which specifies to distribute projection data evenly across all nodes in the cluster:
=> CREATE PROJECTION public.employee_dimension_super
AS SELECT * FROM public.employee_dimension
ORDER BY employee_key
SEGMENTED BY hash(employee_key) ALL NODES;
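As noted for the OFFSET parameter above, combine OFFSET with KSAFE so the projection remains safe. A sketch based on the same table:
=> CREATE PROJECTION public.employee_dimension_offset
   AS SELECT * FROM public.employee_dimension
   ORDER BY employee_key
   SEGMENTED BY hash(employee_key) ALL NODES OFFSET 1 KSAFE 1;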
23.4 - Unsegmented clause
Specifies to distribute identical copies of table or projection data on all nodes across the cluster. Use this clause to facilitate distributed query execution on tables and projections that are too small to benefit from segmentation.
Vertica uses the same name to identify all instances of an unsegmented projection. For more information about projection name conventions, see Projection naming.
Syntax
UNSEGMENTED ALL NODES
Examples
This example creates an unsegmented projection for table store.store_dimension
:
=> CREATE PROJECTION store.store_dimension_proj (storekey, name, city, state)
AS SELECT store_key, store_name, store_city, store_state
FROM store.store_dimension
UNSEGMENTED ALL NODES;
CREATE PROJECTION
=> SELECT anchor_table_name anchor_table, projection_name, node_name
FROM PROJECTIONS WHERE projection_basename='store_dimension_proj';
anchor_table | projection_name | node_name
-----------------+----------------------+------------------
store_dimension | store_dimension_proj | v_vmart_node0001
store_dimension | store_dimension_proj | v_vmart_node0002
store_dimension | store_dimension_proj | v_vmart_node0003
(3 rows)
24 - CREATE RESOURCE POOL
Creates a custom resource pool and sets one or more resource pool parameters.
Syntax
CREATE RESOURCE POOL pool-name [ FOR subcluster-spec ] [ parameter-name setting ]...
Parameters
pool-name
- The name of the resource pool. Built-in pool names cannot be used for user-defined pools.
Note
If you specify a resource pool name with uppercase letters, Vertica converts them to lowercase letters.
FOR subcluster-spec
- Eon Mode only, specifies to associate this resource pool with a subcluster, where
subcluster-spec
is one of the following:SUBCLUSTER
subcluster-name
: Creates the resource pool for an existing subcluster. You cannot be connected to this subcluster; otherwise, Vertica returns an error.
: Creates the resource pool for the subcluster that you are connected to.
If omitted, the resource pool is created globally. Attempts to create a global resource pool with the same name as a subcluster-specific resource pool return an error.
parameter-name
- The parameter to set, listed below.
setting
The value to set on
parameter-name
. To reset this parameter to its default value, specify DEFAULT.
Note
Default values specified in this table pertain only to user-defined resource pools. For built-in pool default values, see Built-in resource pools configuration, or query system table RESOURCE_POOL_DEFAULTS.
CASCADE TO
Specifies a secondary resource pool for executing queries that exceed the
RUNTIMECAP
setting of their assigned resource pool:
CASCADE TO secondary-pool
CPUAFFINITYMODE
Specifies whether the resource pool has exclusive or shared use of the CPUs specified in
CPUAFFINITYSET
:
CPUAFFINITYMODE { SHARED | EXCLUSIVE | ANY }
-
SHARED
: Queries that run in this pool share its CPUAFFINITYSET
CPUs with other Vertica resource pools. -
EXCLUSIVE
: Dedicates CPUAFFINITYSET
CPUs to this resource pool only, and excludes other Vertica resource pools. If CPUAFFINITYSET
is set as a percentage, then that percentage of CPU resources available to Vertica is assigned solely for this resource pool. -
ANY
(default): Queries in this resource pool can run on any CPU, invalid if CPUAFFINITYSET
designates CPU resources.
Important
CPUAFFINITYMODE
and CPUAFFINITYSET
must be set together in the same statement.
CPUAFFINITYSET
Specifies which CPUs are available to this resource pool. All cluster nodes must have the same number of CPUs. The CPU resources assigned to this set are unavailable to general resource pools.
CPUAFFINITYSET { 'cpu-index[,...]' | 'cpu-indexi-cpu-indexn' | 'integer%' | NONE }
-
cpu-index
[,...]
: Dedicates one or more comma-delimited CPUs to this pool. -
cpu-indexi-cpu-indexn
: Dedicates a range of contiguous CPU indexes to this pool -
integer
%
: Percentage of all available CPUs to use for this pool. Vertica rounds this percentage down to include whole CPU units. -
NONE
(default): No affinity set is assigned to this resource pool. The queries associated with this pool are executed on any CPU.
Important
CPUAFFINITYSET
and CPUAFFINITYMODE
must be set together in the same statement.
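For example, a brief sketch (pool name hypothetical) that dedicates two CPUs exclusively to one pool:
=> CREATE RESOURCE POOL cpu_pool -- hypothetical pool name
   CPUAFFINITYSET '0,1' CPUAFFINITYMODE EXCLUSIVE;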
EXECUTIONPARALLELISM
Limits the number of threads used to process any single query issued in this resource pool.
EXECUTIONPARALLELISM { limit | AUTO }
-
limit
: An integer value between 1 and the number of cores. Setting this parameter to a reduced value increases throughput of short queries issued in the pool, especially if the queries are executed concurrently. -
AUTO
or 0
(default): Vertica calculates the setting from the number of cores, available memory, and amount of data in the system. Unless memory is limited, or the amount of data is very small, Vertica sets this parameter to the number of cores on the node.
-
MAXCONCURRENCY
Sets the maximum number of concurrent execution slots available to the resource pool, across the cluster:
MAXCONCURRENCY { integer | NONE }
NONE
(default) specifies an unlimited number of concurrent execution slots.
MAXMEMORYSIZE
The maximum size per node the resource pool can grow by borrowing memory from the
GENERAL
pool:
MAXMEMORYSIZE { 'integer%' | 'integer{K|M|G|T}' | NONE }
-
integer
%
: Percentage of total memory -
integer
{K|M|G|T}
: Amount of memory in kilobytes, megabytes, gigabytes, or terabytes -
NONE
(default): Unlimited, pool can borrow any amount of available memory from the GENERAL
pool.
-
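For example, a sketch of a pool (name illustrative) that reserves 1G per node and can borrow up to a total of 20% of memory:
=> CREATE RESOURCE POOL report_pool -- illustrative pool name
   MEMORYSIZE '1G' MAXMEMORYSIZE '20%';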
MAXQUERYMEMORYSIZE
The maximum amount of memory that this pool can allocate at runtime to process a query. If the query requires more memory than this setting, Vertica stops execution and returns an error.
Set this parameter as follows:
MAXQUERYMEMORYSIZE { 'integer%' | 'integer{K|M|G|T}' | NONE }
MEMORYSIZE
The amount of total memory available to the Vertica resource manager that is allocated to this pool per node:
MEMORYSIZE { 'integer%' | 'integer{K|M|G|T}' }
-
integer
%
: Percentage of total memory -
integer
{K|M|G|T}
: Amount of memory in kilobytes, megabytes, gigabytes, or terabytes
Default: 0%. No memory allocated, the resource pool borrows memory from the
GENERAL
pool.
PLANNEDCONCURRENCY
Specifies the preferred number of queries to execute concurrently in the resource pool. This setting applies to the entire cluster:
PLANNEDCONCURRENCY { num-queries | AUTO }
-
num-queries
: Integer value ≥ 1, specifies the preferred number of concurrently executing queries. When possible, query resource budgets are limited to allow this level of concurrent execution. -
AUTO
(default): Value is calculated automatically at query runtime. Vertica sets this parameter to the lower of these two calculations, but never less than 4:-
Number of logical cores
-
Memory divided by 2GB
-
For clusters where the number of logical cores differs on different nodes,
AUTO
can apply differently on each node. Distributed queries run with the minimal effective planned concurrency. Single-node queries run with the planned concurrency of the initiator.
Tip
Change this parameter only after evaluating performance over a period of time.
PRIORITY
Specifies priority of queries in this pool when they compete for resources in the
GENERAL
pool:
PRIORITY { integer | HOLD }
-
integer
: A negative or positive integer value, where higher numbers denote higher priority. -
HOLD
: Sets priority to -999
. Queries in this pool are queued until QUEUETIMEOUT
is reached.
Default: 0
-
QUEUETIMEOUT
Specifies how long a request can wait for pool resources before it is rejected:
QUEUETIMEOUT { integer | NONE }
-
integer
: Maximum wait time in seconds -
NONE
: No maximum wait time, request can be queued indefinitely.
Default: 300 seconds
-
RUNTIMECAP
Prevents runaway queries by setting the maximum time a query in the pool can execute. If a query exceeds this setting, it tries to cascade to a secondary pool:
RUNTIMECAP { 'interval' | NONE }
-
interval
: Maximum wait time expressed in the following format: -
NONE
(default): No time limit on queries running in this pool. To specify a value in days, provide an integer value. To provide a value less than one day, provide the interval in the format
hours:minutes:seconds
. For example, a value of 1:30:00
equals 90 minutes. If the user or session also has a
RUNTIMECAP
, the shorter limit applies.
-
RUNTIMEPRIORITY
Determines how the resource manager should prioritize dedication of run-time resources (CPU, I/O bandwidth) to queries already running in this resource pool:
RUNTIMEPRIORITY { HIGH | MEDIUM | LOW }
Default:
MEDIUM
RUNTIMEPRIORITYTHRESHOLD
Specifies in seconds a time limit in which a query must finish before the resource manager assigns to it the resource pool's
RUNTIMEPRIORITY
. All queries begin running at a HIGH
priority. When a query's duration exceeds this threshold, it is assigned the RUNTIMEPRIORITY
of the resource pool.
RUNTIMEPRIORITYTHRESHOLD seconds
Default:
2
SINGLEINITIATOR
By default, set to false for backward compatibility. Do not change this setting.
Privileges
Superuser
Examples
This example shows how to create a resource pool with MEMORYSIZE
of 1800 MB.
=> CREATE RESOURCE POOL ceo_pool MEMORYSIZE '1800M' PRIORITY 10;
CREATE RESOURCE POOL
Assuming the CEO report user already exists, associate this user with the preceding resource pool using the ALTER USER
statement.
=> GRANT USAGE ON RESOURCE POOL ceo_pool to ceo_user;
GRANT PRIVILEGE
=> ALTER USER ceo_user RESOURCE POOL ceo_pool;
ALTER USER
Issue the following command to confirm that the ceo_user is associated with the ceo_pool:
=> SELECT * FROM users WHERE user_name ='ceo_user';
-[ RECORD 1 ]-----+--------------------------------------------------
user_id | 45035996273733402
user_name | ceo_user
is_super_user | f
profile_name | default
is_locked | f
lock_time |
resource_pool | ceo_pool
memory_cap_kb | unlimited
temp_space_cap_kb | unlimited
run_time_cap | unlimited
all_roles |
default_roles |
search_path | "$user", public, v_catalog, v_monitor, v_internal
This example shows how to create and designate secondary resource pools.
=> CREATE RESOURCE POOL rp3 RUNTIMECAP '5 minutes';
=> CREATE RESOURCE POOL rp2 RUNTIMECAP '3 minutes' CASCADE TO rp3;
=> CREATE RESOURCE POOL rp1 RUNTIMECAP '1 minute' CASCADE TO rp2;
=> SET SESSION RESOURCE_POOL = rp1;
This Eon Mode example confirms the current subcluster name, then creates a resource pool for the current subcluster:
=> SELECT CURRENT_SUBCLUSTER_NAME();
CURRENT_SUBCLUSTER_NAME
-------------------------
analytics_1
(1 row)
=> CREATE RESOURCE POOL dashboard FOR SUBCLUSTER analytics_1;
CREATE RESOURCE POOL
See also
24.1 - Built-in pools
Vertica is preconfigured with built-in pools for various system tasks:
For details on resource pool settings, see ALTER RESOURCE POOL.
GENERAL
Catch-all pool used to answer requests that have no specific resource pool associated with them. Any memory left over after memory has been allocated to all other pools is automatically allocated to the GENERAL pool. The MEMORYSIZE parameter of the GENERAL pool is undefined (variable); however, the GENERAL pool must be at least 1GB in size and cannot be smaller than 25% of the memory in the system.
The MAXMEMORYSIZE parameter of the GENERAL pool has special meaning; when set as a % value it represents the percent of total physical RAM on the machine that the Resource manager can use for queries. By default, it is set to 95%. MAXMEMORYSIZE governs the total amount of RAM that the Resource Manager can use for queries, regardless of whether it is set to a percent or to a specific value (for example, '10GB').
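For example, a sketch that lowers this cap with ALTER RESOURCE POOL (the percentage is illustrative):
=> ALTER RESOURCE POOL general MAXMEMORYSIZE '90%'; -- illustrative value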
User-defined pools can borrow memory from the GENERAL pool to satisfy requests that need extra memory until the MAXMEMORYSIZE parameter of that pool is reached. If the pool is configured to have MEMORYSIZE equal to MAXMEMORYSIZE, it cannot borrow any memory from the GENERAL pool. When multiple pools request memory from the GENERAL pool, they are granted access to general pool memory according to their priority setting. In this manner, the GENERAL pool provides some elasticity to account for point-in-time deviations from normal usage of individual resource pools.
Vertica recommends reducing the GENERAL pool MAXMEMORYSIZE if your catalog uses over 5 percent of overall memory. You can calculate what percentage of GENERAL pool memory the catalog uses as follows:
=> WITH memory_use_metadata AS (SELECT node_name, memory_size_kb FROM resource_pool_status WHERE pool_name='metadata'),
memory_use_general AS (SELECT node_name, memory_size_kb FROM resource_pool_status WHERE pool_name='general')
SELECT m.node_name, ((m.memory_size_kb/g.memory_size_kb) * 100)::NUMERIC(4,2) pct_catalog_usage
FROM memory_use_metadata m JOIN memory_use_general g ON m.node_name = g.node_name;
node_name | pct_catalog_usage
------------------+-------------------
v_vmart_node0001 | 0.41
v_vmart_node0002 | 0.37
v_vmart_node0003 | 0.36
(3 rows)
BLOBDATA
Controls resource usage for in-memory blobs. In-memory blobs are objects used by a number of machine learning SQL functions. You should adjust this pool if you plan on processing large machine learning workloads. For information about tuning the pool, see Tuning for machine learning.
If a query using the BLOBDATA pool exceeds its query planning budget, then it spills to disk. For more information about tuning your query budget, see Query budgeting.
DBD
Controls resource usage for Database Designer processing. Use of this pool is enabled by configuration parameter DBDUseOnlyDesignerResourcePool, by default set to false.
By default, QUEUETIMEOUT is set to 0 for this pool. When resources are under pressure, this setting causes the DBD to time out immediately, and not be queued to run later. Database Designer then requests the user to run the designer later, when resources are more available.
Important
Do not change QUEUETIMEOUT or any DBD resource pool parameters.
JVM
Controls Java Virtual Machine resources used by Java User Defined Extensions. When a Java UDx starts the JVM, it draws resources from those specified in the JVM resource pool. Vertica does not reserve memory in advance for the JVM pool. When needed, the pool can expand to 10% of physical memory or 2 GB of memory, whichever is smaller. If you are buffering large amounts of data, you may need to increase the size of the JVM resource pool.
You can adjust the size of your JVM resource pool by changing its configuration settings. Unlike other resource pools, the JVM resource pool does not release resources until a session is closed.
METADATA
Tracks memory allocated for catalog data and storage data structures. This pool increases in size as Vertica metadata consumes additional resources. Memory assigned to the METADATA pool is subtracted from the GENERAL pool, enabling the Vertica resource manager to make more effective use of available resources. If the METADATA resource pool reaches 75% of the GENERAL pool, Vertica stops updating METADATA memory size and displays a warning message in vertica.log
. You can enable or disable the METADATA pool with configuration parameter EnableMetadataMemoryTracking.
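One possible sketch of disabling the parameter, assuming a release where ALTER DATABASE can set configuration parameters:
=> ALTER DATABASE DEFAULT SET PARAMETER EnableMetadataMemoryTracking = 0; -- 0 disables, 1 enables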
If you created a "dummy" or "swap" resource pool to protect resources for use by your operating system, you can replace that pool with the METADATA pool.
Users cannot change the parameters of the METADATA resource pool.
RECOVERY
Used by queries issued when recovering another node of the database. The MAXCONCURRENCY parameter is used to determine how many concurrent recovery threads to use. You can use the PLANNEDCONCURRENCY parameter (by default, set to twice the MAXCONCURRENCY
) to tune how to apportion memory to recovery queries.
See Tuning for recovery.
REFRESH
Used by queries issued by
PROJECTION_REFRESHES
operations. Refresh does not currently use multiple concurrent threads; thus, changes to the MAXCONCURRENCY values have no effect.
See Scenario: Tuning for Refresh.
SYSQUERY
Runs queries against all system monitoring and catalog tables. The SYSQUERY pool reserves resources for system table queries so that they are never blocked by contention for available resources.
TM
The Tuple Mover (TM) pool. You can set the MAXCONCURRENCY parameter for the TM pool to allow concurrent TM operations.
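For example, a sketch that raises TM concurrency (the value is illustrative):
=> ALTER RESOURCE POOL tm MAXCONCURRENCY 5; -- illustrative value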
24.2 - Built-in resource pools configuration
To view the current and default configuration for built-in resource pools, query the system tables RESOURCE_POOLS and RESOURCE_POOL_DEFAULTS, respectively. The sections below provide this information, and also indicate which built-in pool parameters can be modified with ALTER RESOURCE POOL:
GENERAL
Parameter | Settings |
---|---|
MEMORYSIZE | Empty / cannot be set |
MAXMEMORYSIZE | The maximum memory to use for all resource pools, one of the following: MAXMEMORYSIZE { 'integer%' | 'integer{K|M|G|T}' }. Caution: Setting this parameter to 100% generates a warning of potential swapping. For example, if your node has 64GB of memory, setting MAXMEMORYSIZE to 50% allocates half of available memory. Thus, the maximum amount of memory available to all resource pools is 32GB. Default: 95% |
MAXQUERYMEMORYSIZE | The maximum amount of memory allocated by this pool to process any query: MAXQUERYMEMORYSIZE { 'integer%' | 'integer{K|M|G|T}' } |
EXECUTIONPARALLELISM | Default: AUTO |
PRIORITY | Default: 0 |
RUNTIMEPRIORITY | Default: Medium |
RUNTIMEPRIORITYTHRESHOLD | Default: 2 |
QUEUETIMEOUT | Default: 00:05 (minutes) |
RUNTIMECAP | Prevents runaway queries by setting the maximum time a query in the pool can execute. If a query exceeds this setting, it tries to cascade to a secondary pool. |
PLANNEDCONCURRENCY | The number of concurrent queries you expect to run against the resource pool, an integer ≥ 4. If set to AUTO (default), Vertica automatically sets PLANNEDCONCURRENCY at query runtime, choosing the lower of these two values: the number of logical cores, or memory divided by 2GB. Default: AUTO |
MAXCONCURRENCY | Default: Empty. Caution: Must be set ≥ 1, otherwise Vertica generates a warning that system queries might be unable to execute. |
SINGLEINITIATOR | Default: False. Important: Included for backwards compatibility. Do not change. |
CPUAFFINITYSET | Default: Empty |
CPUAFFINITYMODE | Default: ANY |
CASCADETO | Default: Empty |
BLOBDATA
Parameter | Default Setting |
---|---|
MEMORYSIZE | 0% |
MAXMEMORYSIZE | 10% |
MAXQUERYMEMORYSIZE | Empty / cannot be set |
EXECUTIONPARALLELISM | |
PRIORITY | |
RUNTIMEPRIORITY | |
RUNTIMEPRIORITYTHRESHOLD | |
QUEUETIMEOUT | |
RUNTIMECAP | NONE |
PLANNEDCONCURRENCY | AUTO |
MAXCONCURRENCY | Empty / cannot be set |
SINGLEINITIATOR | |
CPUAFFINITYSET | |
CPUAFFINITYMODE | ANY / cannot be set |
CASCADETO | Empty / cannot be set |
DBD
Parameter | Default Setting |
---|---|
MEMORYSIZE | 0% |
MAXMEMORYSIZE | Unlimited |
MAXQUERYMEMORYSIZE | Empty / cannot be set |
EXECUTIONPARALLELISM | AUTO |
PRIORITY | 0 |
RUNTIMEPRIORITY | MEDIUM |
RUNTIMEPRIORITYTHRESHOLD | 0 |
QUEUETIMEOUT | 0 |
RUNTIMECAP | NONE |
PLANNEDCONCURRENCY | AUTO |
MAXCONCURRENCY | NONE |
SINGLEINITIATOR | True. Important: Included for backwards compatibility. Do not change. |
CPUAFFINITYSET | Empty / cannot be set |
CPUAFFINITYMODE | ANY / cannot be set |
CASCADETO | Empty / cannot be set |
JVM
Parameter | Default Setting |
---|---|
MEMORYSIZE | 0% |
MAXMEMORYSIZE | 10% of memory or 2 GB, whichever is smaller |
MAXQUERYMEMORYSIZE | Empty / cannot be set |
EXECUTIONPARALLELISM | AUTO |
PRIORITY | 0 |
RUNTIMEPRIORITY | MEDIUM |
RUNTIMEPRIORITYTHRESHOLD | 2 |
QUEUETIMEOUT | 00:05 (minutes) |
RUNTIMECAP | NONE |
PLANNEDCONCURRENCY | AUTO |
MAXCONCURRENCY | Empty / cannot be set |
SINGLEINITIATOR | FALSE. Important: Included for backwards compatibility. Do not change. |
CPUAFFINITYSET | Empty / cannot be set |
CPUAFFINITYMODE | ANY / cannot be set |
CASCADETO | Empty / cannot be set |
METADATA
Parameter | Default Setting |
---|---|
MEMORYSIZE | 0% |
MAXMEMORYSIZE | Unlimited |
MAXQUERYMEMORYSIZE | Empty / cannot be set |
EXECUTIONPARALLELISM | AUTO |
PRIORITY | 108 |
RUNTIMEPRIORITY | HIGH |
RUNTIMEPRIORITYTHRESHOLD | 0 |
QUEUETIMEOUT | 0 |
RUNTIMECAP | NONE |
PLANNEDCONCURRENCY | AUTO |
MAXCONCURRENCY | 0 |
SINGLEINITIATOR | FALSE. Important: Included for backwards compatibility. Do not change. |
CPUAFFINITYSET | Empty / cannot be set |
CPUAFFINITYMODE | ANY / cannot be set |
CASCADETO | Empty / cannot be set |
RECOVERY
Parameter | Default Setting |
---|---|
MEMORYSIZE | 0% |
MAXMEMORYSIZE | The maximum size per node the resource pool can grow by borrowing memory from the GENERAL pool: MAXMEMORYSIZE { 'integer%' | 'integer{K|M|G|T}' | NONE }. Caution: Setting must resolve to ≥ 25%. Otherwise, Vertica generates a warning that system queries might be unable to execute. |
MAXQUERYMEMORYSIZE | Empty / cannot be set |
EXECUTIONPARALLELISM | AUTO |
PRIORITY | One of the following: Caution: Change these settings only under guidance from Vertica technical support. |
RUNTIMEPRIORITY | MEDIUM |
RUNTIMEPRIORITYTHRESHOLD | 60 |
QUEUETIMEOUT | 00:05 (minutes) |
RUNTIMECAP | NONE |
PLANNEDCONCURRENCY | AUTO |
MAXCONCURRENCY | By default, derived from the number of cores; given a system with four cores, MAXCONCURRENCY has a default setting of 3. Note: 0 or NONE (unlimited) are invalid settings. |
SINGLEINITIATOR | True. Important: Included for backwards compatibility. Do not change. |
CPUAFFINITYSET | Empty / cannot be set |
CPUAFFINITYMODE | ANY / cannot be set |
CASCADETO | Empty / cannot be set |
REFRESH
Parameter | Default Setting |
---|---|
MEMORYSIZE | 0% |
MAXMEMORYSIZE | NONE (unlimited) |
MAXQUERYMEMORYSIZE | Empty / cannot be set |
EXECUTIONPARALLELISM | AUTO |
PRIORITY | -10 |
RUNTIMEPRIORITY | MEDIUM |
RUNTIMEPRIORITYTHRESHOLD | 60 |
QUEUETIMEOUT | 00:05 (minutes) |
RUNTIMECAP | NONE (unlimited) |
PLANNEDCONCURRENCY | AUTO (4) |
MAXCONCURRENCY | 3. This parameter must be set ≥ 1. |
SINGLEINITIATOR | True. Important: Included for backwards compatibility. Do not change. |
CPUAFFINITYSET | Empty / cannot be set |
CPUAFFINITYMODE | ANY / cannot be set |
CASCADETO | Empty / cannot be set |
SYSQUERY
Parameter | Default Setting |
---|---|
MEMORYSIZE | 1G. Caution: Setting must resolve to ≥ 20M, otherwise Vertica generates a warning that system queries might be unable to execute, and diagnosing problems might be difficult. |
MAXMEMORYSIZE | Empty (unlimited) |
MAXQUERYMEMORYSIZE | Empty / cannot be set |
EXECUTIONPARALLELISM | AUTO |
PRIORITY | 110 |
RUNTIMEPRIORITY | HIGH |
RUNTIMEPRIORITYTHRESHOLD | 0 |
QUEUETIMEOUT | 00:05 (minutes) |
RUNTIMECAP | NONE |
PLANNEDCONCURRENCY | AUTO |
MAXCONCURRENCY | Empty. Caution: Must be set ≥ 1, otherwise Vertica generates a warning that system queries might be unable to execute. |
SINGLEINITIATOR | False. Important: Included for backwards compatibility. Do not change. |
CPUAFFINITYSET | Empty / cannot be set |
CPUAFFINITYMODE | ANY / cannot be set |
CASCADETO | Empty / cannot be set |
TM
Parameter | Default Setting |
---|---|
MEMORYSIZE | 5% (of the GENERAL pool's MAXMEMORYSIZE setting) + 2GB. Important: You can estimate the optimal amount of RAM for the TM resource pool as follows: |
MAXMEMORYSIZE | Unlimited |
MAXQUERYMEMORYSIZE | Empty / cannot be set |
EXECUTIONPARALLELISM | AUTO |
PRIORITY | 105 |
RUNTIMEPRIORITY | MEDIUM |
RUNTIMEPRIORITYTHRESHOLD | 60 |
QUEUETIMEOUT | 00:05 (minutes) |
RUNTIMECAP | NONE |
PLANNEDCONCURRENCY | 7 |
MAXCONCURRENCY | Sets across all nodes the maximum number of concurrent execution slots available to the TM pool. In databases created in Vertica releases ≥ 9.3, the default value is 7. In databases created in earlier versions, the default is 3. This setting specifies the maximum number of merges that can occur simultaneously on multiple threads. Note: 0 or NONE (unlimited) are invalid settings. |
SINGLEINITIATOR | True. Important: Included for backwards compatibility. Do not change. |
CPUAFFINITYSET | Empty / cannot be set |
CPUAFFINITYMODE | ANY / cannot be set |
CASCADETO | Empty / cannot be set |
25 - CREATE ROLE
Creates a role. After creating a role, use GRANT statements to specify role permissions.
Syntax
CREATE ROLE role
Parameters
role
- The name for the new role, where
role
conforms to conventions described in Identifiers.
Privileges
Superuser
Examples
This example shows how to create an empty role named roleA.
=> CREATE ROLE roleA;
CREATE ROLE
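After creating the role, you typically grant it privileges and then grant the role to users. A sketch with hypothetical object and user names:
=> GRANT USAGE ON SCHEMA public TO roleA;
GRANT PRIVILEGE
=> GRANT roleA TO Bob; -- Bob is a hypothetical user
GRANT ROLE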
See also
26 - CREATE ROUTING RULE
Creates a load balancing routing rule that directs incoming client connections from an IP address range to a group of Vertica nodes. This group of Vertica nodes is defined by a load balance group. Once you create a routing rule, any client connection originating from the rule's IP address range is redirected to one of the nodes in the load balance group if the client opts into load balancing.
Syntax
CREATE ROUTING RULE rule_name ROUTE 'address_range' TO group_name
Arguments
rule_name
- A name for the routing rule.
address_range
- An IPv4 or IPv6 address range in CIDR format. Sets the address range of client connections that this rule applies to.
group_name
- The name of the load balance group to handle the client connections from the address range. You create this group using the CREATE LOAD BALANCE GROUP statement.
Privileges
Superuser.
Examples
The following example creates a routing rule that routes all client connections from 192.168.1.0 to 192.168.1.255 to a load balance group named internal_clients:
=> CREATE ROUTING RULE internal_clients ROUTE '192.168.1.0/24' TO internal_clients;
CREATE ROUTING RULE
See also
27 - CREATE SCHEMA
Defines a schema.
Syntax
CREATE SCHEMA [ IF NOT EXISTS ] [database.]schema
[ AUTHORIZATION username]
[ DEFAULT { INCLUDE | EXCLUDE } [ SCHEMA ] PRIVILEGES ]
Parameters
IF NOT EXISTS
If an object with the same name exists, do not create it and proceed. If you omit this option and the object exists, Vertica generates a ROLLBACK error message. In both cases, the object is not created if it already exists.
The
IF NOT EXISTS
clause is useful for SQL scripts where you want to create an object if it does not already exist. For related information, see ON_ERROR_STOP.
[
database
.]
schema
- Identifies the schema to create, where
schema
conforms to conventions described in Identifiers. The following naming requirements also apply:-
The name must be unique among all other schema names in the database.
-
It must comply with keyword restrictions.
-
It cannot begin with
v_
; this prefix is reserved for Vertica system tables. -
If you specify a database, it must be the current database.
-
AUTHORIZATION
username
- Valid only for superusers, assigns ownership of the schema to another user. By default, the user who creates a schema is also assigned ownership.
After you create a schema, you can reassign ownership to another user with
ALTER SCHEMA
.
DEFAULT {INCLUDE | EXCLUDE} [SCHEMA] PRIVILEGES
Specifies whether to enable or disable default inheritance of privileges for new tables in the specified schema:
-
EXCLUDE SCHEMA PRIVILEGES
(default): Disables inheritance of schema privileges. -
INCLUDE SCHEMA PRIVILEGES
: Specifies to grant tables in the specified schema the same privileges granted to that schema. This option has no effect on existing tables in the schema.
If you omit
INCLUDE PRIVILEGES
, you must explicitly grant schema privileges on the desired tables. For more information, see Enabling schema inheritance.
-
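For example, a sketch that combines ownership assignment and privilege inheritance (schema and user names hypothetical):
=> CREATE SCHEMA finance AUTHORIZATION Joan DEFAULT INCLUDE SCHEMA PRIVILEGES; -- hypothetical names
CREATE SCHEMA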
Privileges
-
Superuser
Supported sub-statements
CREATE SCHEMA
can include one or more sub-statements—for example, to create tables or projections within the new schema. Supported sub-statements include:
Note
The CREATE SCHEMA
statement and all sub-statements are treated as a single transaction. If any statement fails, Vertica rolls back the entire transaction. The owner of the new schema is assigned ownership of all objects that are created within this transaction.
For example, the following CREATE SCHEMA
statement also grants privileges on the new schema, and creates a table and view of that table:
=> \c - Joan
You are now connected as user "Joan".
=> CREATE SCHEMA s1
GRANT USAGE, CREATE ON SCHEMA s1 TO public
CREATE TABLE s1.t1 (a varchar)
CREATE VIEW s1.t1v AS SELECT * FROM s1.t1;
CREATE SCHEMA
=> \dtv s1.*
List of tables
Schema | Name | Kind | Owner | Comment
--------+------+-------+-------+---------
s1 | t1 | table | Joan |
s1 | t1v | view | Joan |
(2 rows)
Examples
Create schema s1
:
=> CREATE SCHEMA s1;
Create schema s2
if it does not already exist:
=> CREATE SCHEMA IF NOT EXISTS s2;
If the schema already exists, Vertica returns a rollback message:
=> CREATE SCHEMA IF NOT EXISTS s2;
NOTICE 4214: Object "s2" already exists; nothing was done
Create table t1
in schema s1
, then grant users Fred
and Aniket
access to all existing tables and all privileges on table t1
:
=> CREATE TABLE s1.t1 (c INT);
CREATE TABLE
=> GRANT USAGE ON SCHEMA s1 TO Fred, Aniket;
GRANT PRIVILEGE
=> GRANT ALL PRIVILEGES ON TABLE s1.t1 TO Fred, Aniket;
GRANT PRIVILEGE
Enable inheritance on new schema s3
so all tables created in it automatically inherit its privileges. In this case, new table s3.t2
inherits USAGE, CREATE, and SELECT privileges, which are automatically granted to all database users:
=> CREATE SCHEMA s3 DEFAULT INCLUDE SCHEMA PRIVILEGES;
CREATE SCHEMA
=> GRANT USAGE, CREATE, SELECT, INSERT ON SCHEMA S3 TO PUBLIC;
GRANT PRIVILEGE
=> CREATE TABLE s3.t2(i int);
WARNING 6978: Table "t2" will include privileges from schema "s3"
CREATE TABLE
See also
28 - CREATE SEQUENCE
Defines a new named sequence number generator object. Like AUTO_INCREMENT and IDENTITY sequences, named sequences let you set the default values of primary key columns. Sequences guarantee uniqueness, and avoid constraint enforcement problems and overhead.
For more information about sequence types and their usage, see Sequences.
Syntax
CREATE SEQUENCE [ IF NOT EXISTS ] [[database.]schema.]sequence
[ INCREMENT [ BY ] integer ]
[ MINVALUE integer | NO MINVALUE ]
[ MAXVALUE integer | NO MAXVALUE ]
[ START [ WITH ] integer ]
[ CACHE integer | NO CACHE ]
[ CYCLE | NO CYCLE ]
Parameters
IF NOT EXISTS
If an object with the same name exists, do not create it and proceed. If you omit this option and the object exists, Vertica generates a ROLLBACK error message. In both cases, the object is not created if it already exists.
The
IF NOT EXISTS
clause is useful for SQL scripts where you want to create an object if it does not already exist. For related information, see ON_ERROR_STOP.
[
database
.]
schema
Database and schema. The default schema is
public
. If you specify a database, it must be the current database.
sequence
- Identifies the sequence to create, where
sequence
conforms to conventions described in Identifiers. It must also be unique among all names of sequences, tables, projections, views, and models within the same schema.
INCREMENT [BY]
integer
A positive or negative integer that specifies how much to increment or decrement the sequence on each call to NEXTVAL, by default set to 1.
Note
Setting this parameter to integer guarantees that column values always increment by at least integer. However, column values can sometimes increment by more than integer unless you also set the NO CACHE parameter.
MINVALUE
integer
| NO MINVALUE
- Determines the minimum value a sequence can generate. If you omit this clause or specify NO MINVALUE (the default), default values are used: 1 and -2^63-1 for ascending and descending sequences, respectively.
MAXVALUE
integer
| NO MAXVALUE
- Determines the maximum value for the sequence. If you omit this clause or specify NO MAXVALUE (the default), default values are used: 2^63-1 and -1 for ascending and descending sequences, respectively.
START [WITH]
integer
- Sets the sequence start value to
integer
. The next call to NEXTVAL returns integer
. If you omit this clause, the sequence start value is set to MINVALUE
for ascending sequences, and MAXVALUE
for descending sequences.
CACHE
integer
| NO CACHE
- Specifies whether to cache unique sequence numbers on each node for faster access.
CACHE
takes an integer argument as follows:-
>1 specifies how many unique numbers each node caches per session.
Caution
If sequence caching is set to a low number, nodes are liable to request a new set of cache values more frequently. While it supplies a new cache, Vertica must lock the catalog. Until Vertica releases the lock, other database activities such as table inserts are blocked, which can adversely affect overall performance. -
0 or 1 specifies to disable caching (equivalent to
NO CACHE
).
If you omit this clause, the sequence cache is set to 250,000.
For details on named sequence caching, see Distributing named sequences.
-
CYCLE | NO CYCLE
- Specifies whether the sequence can wrap when its minimum or maximum values are reached:
-
CYCLE
: The sequence wraps as follows:-
When an incrementing sequence reaches its upper limit, it is reset to its minimum value.
-
When a decrementing sequence reaches its lower limit, it is reset to its maximum value.
-
-
NO CYCLE
(default): Calls to NEXTVAL return an error after the sequence reaches its maximum or minimum value.
-
Privileges
Non-superusers: CREATE privilege on the schema
Examples
See Creating and using named sequences.
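For instance, a minimal sketch (sequence name hypothetical):
=> CREATE SEQUENCE public.my_seq START WITH 100 INCREMENT BY 10; -- hypothetical name
CREATE SEQUENCE
=> SELECT NEXTVAL('public.my_seq');
 nextval
---------
     100
(1 row)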
See also
29 - CREATE SUBNET
Identifies the subnet to which the nodes of a Vertica database belong. Use this statement to configure import/export from a database to other Vertica clusters.
Syntax
CREATE SUBNET subnet-name WITH 'subnet-prefix'
Parameters
subnet-name
- A name you assign to the subnet, where
subnet-name
conforms to conventions described in Identifiers.
subnet-prefix
- The subnet prefix in either a dotted-quad number format for IPv4 addresses, or four colon-delimited four-digit hexadecimal numbers for IPv6 addresses. Refer to system table
NETWORK_INTERFACES
to get the prefix of all available IP networks.
You can then configure the database to use the subnet for import/export. For details, see Identify the database or nodes used for import/export.
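For example, assuming a release that supports the ALTER DATABASE...EXPORT ON clause, a sketch that designates the subnet created below as the export subnet:
=> ALTER DATABASE DEFAULT EXPORT ON mySubnet;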
Privileges
Superuser
Examples
=> CREATE SUBNET mySubnet WITH '123.4.5.6';
=> CREATE SUBNET mysubnet WITH 'fd9b:1fcc:1dc4:78d3::';
30 - CREATE TABLE
Creates a table in the logical schema.
Syntax
Create with column definitions:
CREATE TABLE [ IF NOT EXISTS ] [[database.]schema.]table
( column-definition[,...] [, table-constraint ][,...] )
[ ORDER BY column[,...] ]
[ segmentation-spec ]
[ KSAFE [k-num] ]
[ partition-clause]
[ {INCLUDE | EXCLUDE} [SCHEMA] PRIVILEGES ]
Create from another table:
CREATE TABLE [ IF NOT EXISTS ] [[database.]schema.]table { AS-clause | LIKE-clause }
AS-clause
[ ( column-name-list ) ]
[ {INCLUDE | EXCLUDE} [SCHEMA] PRIVILEGES ]
AS [ /*+ LABEL */ ] [ AT epoch ] query [ ENCODED BY column-ref-list ] [ segmentation-spec ]
LIKE-clause
LIKE [[database.]schema.]existing-table
[ {INCLUDING | EXCLUDING} PROJECTIONS ]
[ {INCLUDE | EXCLUDE} [SCHEMA] PRIVILEGES ]
Parameters
IF NOT EXISTS
If an object with the same name exists, do not create it and proceed. If you omit this option and the object exists, Vertica generates a ROLLBACK error message. In both cases, the object is not created if it already exists.
The
IF NOT EXISTS
clause is useful for SQL scripts where you want to create an object if it does not already exist. For related information, see ON_ERROR_STOP.
[
database
.]
schema
Database and schema. The default schema is
public
. If you specify a database, it must be the current database.
table
- Name of the table to create, which must be unique among names of all sequences, tables, projections, views, and models within the schema.
- column-definition
- Column name, data type, and optional constraints. A table can have up to 9800 columns. At least one column in the table must be of a scalar type or native array.
- table-constraint
- Table-level constraint, as opposed to column constraints.
ORDER BY
column
[,...]
Invalid for external tables, specifies columns from the
SELECT
list on which to sort the superprojection that is automatically created for this table. The ORDER BY
clause cannot include qualifiers ASC
or DESC
. Vertica always stores projection data in ascending sort order. If you omit the
ORDER BY
clause, Vertica uses the SELECT
list order as the projection sort order.segmentation-spec
Invalid for external tables, specifies how to distribute data for auto-projections of this table. Supply one of the following clauses:
-
hash-segmentation-clause: Specifies to segment data evenly and distribute across cluster nodes. Vertica recommends segmenting large tables.
-
unsegmented-clause: Specifies to create an unsegmented projection.
If this clause is omitted, Vertica generates auto-projections with default hash segmentation.
-
KSAFE [
k-num
]
Invalid for external tables, specifies K-safety of auto-projections created for this table, where
k-num
must be equal to or greater than system K-safety. If you omit this option, the projection uses the system K-safety level.
- partition-clause
- Invalid for external tables, logically divides table data storage through a PARTITION BY clause:
PARTITION BY partition-expression [ GROUP BY group-expression ] [ ACTIVEPARTITIONCOUNT integer ]
- column-name-list
Valid only when creating a table from a query (
AS
query
), defines column names that map to the query output. If you omit this list, Vertica uses the query output column names. The names in column-name-list
and queried columns must be the same in number. For example:
CREATE TABLE customer_occupations (name, profession) AS SELECT customer_name, occupation FROM customer_dimension;
This clause and the
ENCODED BY
clause are mutually exclusive. Column name lists are invalid for external tables.
{INCLUDE | EXCLUDE}
[SCHEMA] PRIVILEGES
Default inheritance of schema privileges for this table:
-
INCLUDE PRIVILEGES specifies that the table inherits privileges that are set on its schema. This is the default behavior if privileges inheritance is enabled for the schema.
-
EXCLUDE PRIVILEGES disables inheritance of privileges from the schema.
For details, see Inherited privileges.
-
AS
query
Creates and loads a table from the results of a query, specified as follows:
AS [ /*+ LABEL */ ] [ AT epoch ] query
The query cannot include complex type columns.
ENCODED BY
column-ref-list
A comma-delimited list of columns from the source table, where each column is qualified by one or both of the following encoding options:
-
ACCESSRANK
integer
: Overrides the default access rank for a column, useful for prioritizing access to a column. See Prioritizing column access speed. -
ENCODING
encoding-type
: Specifies the type of encoding to use on the column. The default encoding type is AUTO
.
This option and
column-name-list
are mutually exclusive. This option is invalid for external tables.
LIKE
existing-table
- Creates the table by replicating an existing table. You can qualify the LIKE clause with one of the following options:
-
EXCLUDING PROJECTIONS (default): Do not copy projections from the source table.
-
INCLUDING PROJECTIONS: Copy current projections from the source table for the new table.
-
{INCLUDE|EXCLUDE} [SCHEMA] PRIVILEGES: See description above.
-
Privileges
Non-superuser:
-
CREATE privileges on the table schema
-
If creating a table that includes a named sequence:
-
SELECT privilege on sequence object
-
USAGE privilege on sequence schema
-
-
If creating a table with the LIKE clause, source table owner
Restrictions for complex types
Complex types used in native tables have some restrictions, in addition to the restrictions for individual types listed on their reference pages:
-
A native table must have at least one column that is a primitive type or a native array (one-dimensional array of a primitive type). If a flex table has real columns, it must also have at least one column satisfying this restriction.
-
Complex type columns cannot be used in ORDER BY or PARTITION BY clauses nor as FILLER columns.
-
Complex type columns cannot have constraints.
-
Expressions returning complex types cannot be used as projection columns, and projections cannot be segmented or ordered by columns of complex types.
-
Tables with columns of complex types cannot use DEFAULT and SET USING.
Examples
The following example creates a table in the public schema:
CREATE TABLE public.Premium_Customer
(
ID IDENTITY ,
lname varchar(25),
fname varchar(25),
store_membership_card int
);
The following example uses LIKE to create a new table from this one:
=> CREATE TABLE All_Customers LIKE Premium_Customer;
CREATE TABLE
The following example selects columns from one table to use in a new table, using an AS clause:
=> CREATE TABLE cust_basic_profile AS SELECT
customer_key, customer_gender, customer_age, marital_status, annual_income, occupation
FROM customer_dimension WHERE customer_age>18 AND customer_gender !='';
CREATE TABLE
=> SELECT customer_age, annual_income, occupation FROM cust_basic_profile
WHERE customer_age > 23 ORDER BY customer_age;
customer_age | annual_income | occupation
--------------+---------------+--------------------
24 | 469210 | Hairdresser
24 | 140833 | Butler
24 | 558867 | Lumberjack
24 | 529117 | Mechanic
24 | 322062 | Acrobat
24 | 213734 | Writer
...
The following example creates a table using array columns:
=> CREATE TABLE orders(
orderkey INT,
custkey INT,
prodkey ARRAY[VARCHAR(10)],
orderprices ARRAY[DECIMAL(12,2)],
orderdate DATE
);
The following example uses a ROW complex type:
=> CREATE TABLE inventory
(store INT, products ROW(name VARCHAR, code VARCHAR));
See also
30.1 - Column-constraint
Adds a constraint to a column's metadata. For details, see Constraints.
Syntax
[ { AUTO_INCREMENT | IDENTITY } [ (args) ] ]
[ CONSTRAINT constraint-name ] {
[ CHECK (expression) [ ENABLED | DISABLED ] ]
[ [ DEFAULT expression ] [ SET USING expression ] | DEFAULT USING expression ]
[ NULL | NOT NULL ]
[ { PRIMARY KEY [ ENABLED | DISABLED ] | REFERENCES table [( column )] } ]
[ UNIQUE [ ENABLED | DISABLED ] ]
}
Parameters
Note
You can specify enforcement of several constraints by qualifying them with the keywords ENABLED
or DISABLED
. See Enforcing Constraints below.
AUTO_INCREMENT | IDENTITY
- Creates a table column whose values are automatically generated and managed by the database. You cannot change or load values in this column. You can set this constraint on only one table column.
AUTO_INCREMENT
andIDENTITY
are synonyms. For details on this constraint and optional arguments, see AUTO_INCREMENT and IDENTITY sequences.These options are invalid for temporary tables.
CONSTRAINT
constraint-name
- Assigns a name to the constraint, valid for the following constraints:
-
PRIMARY KEY
-
REFERENCES
(foreign key) -
CHECK
-
UNIQUE
If you omit assigning a name to these constraints, Vertica assigns its own name. For details, see Naming constraints.
Vertica recommends that you name all constraints.
-
CHECK (
expression
)
- Adds check condition
expression
, which returns a Boolean value.
DEFAULT
- Specifies this column's default value:
DEFAULT default-expr
Vertica evaluates the
DEFAULT
expression and sets the column on load operations, if the operation omits a value for the column. For details about valid expressions, see Defining column values.
SET USING
- Specifies to set values in this column from the specified expression:
SET USING using-expr
Vertica evaluates the
SET USING
expression and refreshes column values only when the function REFRESH_COLUMNS
is invoked. For details about valid expressions, see Defining column values.
DEFAULT USING
- Defines the column with
DEFAULT
and SET USING
constraints, specifying the same expression for both. DEFAULT USING
columns support the same expressions as SET USING
columns, and are subject to the same restrictions.
NULL | NOT NULL
- Specifies whether the column can contain null values:
-
NULL
: Allows null values in the column. If you set this constraint on a primary key column, Vertica ignores it and sets it to NOT NULL
. -
NOT NULL
: Specifies that the column must be set to a value during insert and update operations. If the column has no default value and no value is provided, INSERT
or UPDATE
returns an error.
If you omit this constraint, the default is
NULL
for all columns except primary key columns, which Vertica always sets to NOT NULL
. External tables: If you specify
NOT NULL
and the column contains null values, queries are liable to return errors or generate unexpected behavior. Specify NOT NULL
for an external table column only if you are sure that the column does not contain nulls. -
PRIMARY KEY
- Identifies this column as the table's primary key.
REFERENCES
- Identifies this column as a foreign key:
REFERENCES table [column]
where
column
is the primary key in table
. If you omit column
, Vertica references the primary key in table
.
UNIQUE
- Requires column data to be unique with respect to all table rows.
Privileges
Table owner or user WITH GRANT OPTION is grantor.
-
REFERENCES privilege on table to create foreign key constraints that reference this table
-
USAGE privilege on schema that contains the table
Enforcing constraints
The following constraints can be qualified with the keyword ENABLED
or DISABLED
:
-
PRIMARY KEY
-
UNIQUE
-
CHECK
If you omit ENABLED
or DISABLED
, Vertica determines whether to enable the constraint automatically by checking the appropriate configuration parameter:
-
EnableNewPrimaryKeysByDefault
-
EnableNewUniqueKeysByDefault
-
EnableNewCheckConstraintsByDefault
For details, see Constraint enforcement.
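For example, a brief sketch that names and explicitly enables column constraints (table and constraint names hypothetical):
=> CREATE TABLE public.orders2 ( -- hypothetical table
     order_id INT CONSTRAINT pk_orders2 PRIMARY KEY ENABLED,
     quantity INT CONSTRAINT chk_qty CHECK (quantity > 0) ENABLED
   );
CREATE TABLE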
30.2 - Column-definition
Specifies the name, data type, and constraints to be applied to a column.
Syntax
column-name data-type
[ column-constraint ][...]
[ ENCODING encoding-type ]
[ ACCESSRANK integer ]
Parameters
column-name
- The name of a column to be created or added.
data-type
- A Vertica-supported data type.
Tip
When specifying the maximum column width in a CREATE TABLE statement, use the width in bytes (octets) for any of the string types. Each UTF-8 character might require four bytes, but European languages generally require a little over one byte per character, while East Asian languages generally require a little under three bytes per character.
- column-constraint
- A constraint type that Vertica supports—for example, NOT NULL or UNIQUE. For general information, see Constraints.
ENCODING
encoding-type
The column encoding type, by default set to AUTO.
-
ACCESSRANK integer
Overrides the default access rank for a column. Use this parameter to increase or decrease the speed at which Vertica accesses a column. For more information, see Overriding Default Column Ranking.
Examples
The following example creates a table named Employee_Dimension
and its associated superprojection in the public
schema. The Employee_key
column is designated as a primary key, and RLE encoding is specified for the Employee_gender
column definition:
=> CREATE TABLE public.Employee_Dimension (
Employee_key integer PRIMARY KEY NOT NULL,
Employee_gender varchar(8) ENCODING RLE,
Courtesy_title varchar(8),
Employee_first_name varchar(64),
Employee_middle_initial varchar(8),
Employee_last_name varchar(64)
);
30.3 - Column-name-list
Used to rename columns when creating a table or temporary table from a query; also used to specify the column's encoding type and access rank.
Syntax
column-name-list
[ ENCODING encoding-type ]
[ ACCESSRANK integer ]
[ GROUPED ( column-reference[,...] ) ]
Parameters
column-name
- Specifies the new name for the column.
ENCODING
encoding-type
- Specifies the type of encoding to use on the column. The default encoding type is
AUTO
.
ACCESSRANK
integer
- Overrides the default access rank for a column, useful for prioritizing access to a column. See Prioritizing column access speed.
GROUPED
- Groups two or more columns. For detailed information, see GROUPED clause.
Requirements
-
A column in the list cannot specify the column's data type or any constraint. These are derived from the queried table.
-
If the query output has expressions other than simple columns (for example, constants or functions), then an alias must be specified for that expression, or the column name list must include all queried columns.
-
CREATE TABLE can specify encoding types and access ranks in the column name list or the query's ENCODED BY clause, but not in both. For example, the following CREATE TABLE statement sets encoding and access rank on two columns in the column name list:
=> CREATE TABLE promo1 (state ENCODING RLE ACCESSRANK 1, zip ENCODING RLE,...) AS SELECT * FROM customer_dimension ORDER BY customer_state;
The next statement specifies the same encoding and access rank in the query's ENCODED BY clause.
=> CREATE TABLE promo2 AS SELECT * FROM customer_dimension ORDER BY customer_state ENCODED BY customer_state ENCODING RLE ACCESSRANK 1, customer_zip ENCODING RLE;
30.4 - Partition clause
Specifies partitioning of table data, through a PARTITION BY clause in the table definition:
PARTITION BY partition-expression [ GROUP BY group-expression ] [ active-partition-count-expr ]
PARTITION BY
partition-expression
- For each table row, resolves to a partition key that is derived from one or more table columns.
Caution
Avoid partitioning tables on LONG VARBINARY and LONG VARCHAR columns. Doing so can adversely impact performance. GROUP BY
group-expression
- For each table row, resolves to a partition group key that is derived from the partition key. Vertica uses group keys to merge partitions into separate partition groups. GROUP BY must use the same expression as PARTITION BY. For example:
...PARTITION BY (i+j) GROUP BY (CASE WHEN (i+j) < 5 THEN 1 WHEN (i+j) < 10 THEN 2 ELSE 3 END);
For details on partitioning table data by groups, see Partition grouping and Hierarchical partitioning. A hierarchical grouping sketch follows this parameter list.
active-partition-count-expr
- Specifies how many partitions are active for this table, specified as follows:
-
In partition clause of CREATE TABLE:
ACTIVEPARTITIONCOUNT integer
-
In partition clause of ALTER TABLE:
SET ACTIVEPARTITIONCOUNT integer
This setting supersedes configuration parameter ActivePartitionCount. For details on usage, see Active and inactive partitions.
-
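The following sketch shows the hierarchical grouping mentioned above, applying the CALENDAR_HIERARCHY_DAY function to the store_orders table defined in the examples below; treat it as illustrative:
=> ALTER TABLE public.store_orders
   PARTITION BY order_date::DATE
   GROUP BY CALENDAR_HIERARCHY_DAY(order_date::DATE, 2, 2) REORGANIZE;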
Partitioning requirements and restrictions
PARTITION BY expressions can specify leaf expressions, functions, and operators. The following requirements and restrictions apply:
- All table projections must include all columns referenced in the expression; otherwise, Vertica cannot resolve the expression.
- The expression can reference multiple columns, but it must resolve to a single non-null value for each row.
Note
You can avoid null-related errors with the function ZEROIFNULL. This function can check a PARTITION BY expression for null values and evaluate them to 0. For example:
CREATE TABLE t1 (a int, b int) PARTITION BY (ZEROIFNULL(a));
CREATE TABLE
- All leaf expressions must be constants or table columns.
- All other expressions must be functions and operators. The following restrictions apply to functions:
  - They must be immutable—that is, they return the same value regardless of time and locale and other session- or environment-specific conditions.
  - They cannot be aggregate functions.
  - They cannot be Vertica meta-functions.
- The expression cannot include queries.
- The expression cannot include user-defined data types such as Geometry.
GROUP BY expressions do not support modulo (%) operations.
Examples
The following statements create the store_orders
table and load data into it. The CREATE TABLE statement includes a simple partition clause that specifies to partition data by year:
=> CREATE TABLE public.store_orders
(
order_no int,
order_date timestamp NOT NULL,
shipper varchar(20),
ship_date date
)
UNSEGMENTED ALL NODES
PARTITION BY YEAR(order_date);
CREATE TABLE
=> COPY store_orders FROM '/home/dbadmin/export_store_orders_data.txt';
41834
As COPY loads the new table data into ROS storage, the Tuple Mover executes the table's partition clause by dividing orders for each year into separate partitions, and consolidating these partitions in ROS containers.
In this case, the Tuple Mover creates four partition keys for the loaded data—2017, 2016, 2015, and 2014—and divides the data into separate ROS containers accordingly:
=> SELECT dump_table_partition_keys('store_orders');
... Partition keys on node v_vmart_node0001
Projection 'store_orders_super'
Storage [ROS container]
No of partition keys: 1
Partition keys: 2017
Storage [ROS container]
No of partition keys: 1
Partition keys: 2016
Storage [ROS container]
No of partition keys: 1
Partition keys: 2015
Storage [ROS container]
No of partition keys: 1
Partition keys: 2014
Partition keys on node v_vmart_node0002
Projection 'store_orders_super'
Storage [ROS container]
No of partition keys: 1
Partition keys: 2017
...
(1 row)
As new data is loaded into store_orders
, the Tuple Mover merges it into the appropriate partitions, creating partition keys as needed for new years.
See also
Partitioning tables
30.5 - Table-constraint
Adds a constraint to table metadata. You can specify table constraints with
CREATE TABLE
, or add a constraint to an existing table with
ALTER TABLE
. For details, see Setting constraints.
Note
Adding a constraint to a table that is referenced in a view does not affect the view.
Syntax
[ CONSTRAINT constraint-name ]
{
... PRIMARY KEY (column[,... ]) [ ENABLED | DISABLED ]
... | FOREIGN KEY (column[,... ] ) REFERENCES table [ (column[,...]) ]
... | UNIQUE (column[,...]) [ ENABLED | DISABLED ]
... | CHECK (expression) [ ENABLED | DISABLED ]
}
Parameters
CONSTRAINT
constraint-name
- Assigns a name to the constraint. Vertica recommends that you name all constraints.
PRIMARY KEY
- Defines one or more
NOT NULL
columns as the primary key as follows:PRIMARY KEY (column[,...]) [ ENABLED | DISABLED]
You can qualify this constraint with the keyword
ENABLED
or DISABLED
. See Enforcing Constraints below. If you do not name a primary key constraint, Vertica assigns the name
C_PRIMARY
. FOREIGN KEY
- Adds a referential integrity constraint defining one or more columns as foreign keys as follows:
FOREIGN KEY (column[,... ]) REFERENCES table [(column[,... ])]
If you omit
column
, Vertica references the primary key in table
. If you do not name a foreign key constraint, Vertica assigns the name
C_FOREIGN
.Important
Adding a foreign key constraint requires the following privileges (in addition to privileges also required by ALTER TABLE):
-
REFERENCES on the referenced table
-
USAGE on the schema of the referenced table
-
UNIQUE
- Specifies that the data in a column or group of columns is unique with respect to all table rows, as follows:
UNIQUE (column[,...]) [ENABLED | DISABLED]
You can qualify this constraint with the keyword
ENABLED
or DISABLED
. See Enforcing Constraints below. If you do not name a unique constraint, Vertica assigns the name
C_UNIQUE
. CHECK
- Specifies a check condition as an expression that returns a Boolean value, as follows:
CHECK (expression) [ENABLED | DISABLED]
You can qualify this constraint with the keyword
ENABLED
or DISABLED
. See Enforcing Constraints below. If you do not name a check constraint, Vertica assigns the name
C_CHECK
.
Privileges
Non-superusers: table owner, or the following privileges:
-
USAGE on schema
-
ALTER on table
-
SELECT on table to enable or disable constraint enforcement
Enforcing constraints
A table can specify whether Vertica automatically enforces a primary key, unique key or check constraint with the keyword ENABLED
or DISABLED
. If you omit ENABLED
or DISABLED
, Vertica determines whether to enable the constraint automatically by checking the appropriate configuration parameter:
-
EnableNewPrimaryKeysByDefault
-
EnableNewUniqueKeysByDefault
-
EnableNewCheckConstraintsByDefault
For details, see Constraint enforcement.
Examples
The following example creates a table (t01
) with a primary key constraint.
CREATE TABLE t01 (id int CONSTRAINT sampleconstraint PRIMARY KEY);
CREATE TABLE
This example creates the same table without the constraint, and then adds the constraint with ALTER TABLE ADD CONSTRAINT:
CREATE TABLE t01 (id int);
CREATE TABLE
ALTER TABLE t01 ADD CONSTRAINT sampleconstraint PRIMARY KEY(id);
WARNING 2623: Column "id" definition changed to NOT NULL
ALTER TABLE
The following example creates a table (addapk
) with two columns, adds a third column to the table, and then adds a primary key constraint on the third column.
=> CREATE TABLE addapk (col1 INT, col2 INT);
CREATE TABLE
=> ALTER TABLE addapk ADD COLUMN col3 INT;
ALTER TABLE
=> ALTER TABLE addapk ADD CONSTRAINT col3constraint PRIMARY KEY (col3) ENABLED;
WARNING 2623: Column "col3" definition changed to NOT NULL
ALTER TABLE
Using the sample table addapk
, check that the primary key constraint is enabled (is_enabled
is t
).
=> SELECT constraint_name, column_name, constraint_type, is_enabled FROM PRIMARY_KEYS WHERE table_name IN ('addapk');
constraint_name | column_name | constraint_type | is_enabled
-----------------+-------------+-----------------+------------
col3constraint | col3 | p | t
(1 row)
This example disables the constraint using ALTER TABLE ALTER CONSTRAINT
.
=> ALTER TABLE addapk ALTER CONSTRAINT col3constraint DISABLED;
Check that the primary key is now disabled (is_enabled
is f
).
=> SELECT constraint_name, column_name, constraint_type, is_enabled FROM PRIMARY_KEYS WHERE table_name IN ('addapk');
constraint_name | column_name | constraint_type | is_enabled
-----------------+-------------+-----------------+------------
col3constraint | col3 | p | f
(1 row)
For a general discussion of constraints, see Constraints. For additional examples of creating and naming constraints, see Naming constraints.
31 - CREATE TEMPORARY TABLE
Creates a table whose data persists only during the current session. Temporary table data is not visible to other sessions.
Syntax
Create with column definitions:
CREATE [ scope ] TEMP[ORARY] TABLE [ IF NOT EXISTS ] [[database.]schema.]table-name
( column-definition[,...] )
[ table-constraint ]
[ ON COMMIT { DELETE | PRESERVE } ROWS ]
[ NO PROJECTION ]
[ ORDER BY table-column[,...] ]
[ segmentation-spec ]
[ KSAFE [k-num] ]
[ {INCLUDE | EXCLUDE} [SCHEMA] PRIVILEGES ]
Create from another table:
CREATE TEMP[ORARY] TABLE [ IF NOT EXISTS ] [[database.]schema.]table-name
[ ( column-name-list ) ]
[ ON COMMIT { DELETE | PRESERVE } ROWS ]
AS [ /*+ LABEL */ ] [ AT epoch ] query [ ENCODED BY column-ref-list ]
Parameters
scope
- Specifies visibility of the table definition:
-
GLOBAL
: The table definition is visible to all sessions, and persists until you explicitly drop the table.
-
LOCAL
: The table definition is visible only to the session in which it is created, and is dropped when the session ends.
If no scope is specified, Vertica uses the default that is set by configuration parameter DefaultTempTableLocal.
Regardless of this setting, retention of temporary table data is set by the keywords ON COMMIT DELETE and ON COMMIT PRESERVE (see below).
For more information, see Creating temporary tables.
-
IF NOT EXISTS
If an object with the same name exists, Vertica does not create the object and proceeds without error. If you omit this option and the object exists, Vertica generates a ROLLBACK error message. In both cases, the object is not created if it already exists.
The IF NOT EXISTS clause is useful for SQL scripts where you want to create an object only if it does not already exist.
For related information, see ON_ERROR_STOP.
[
database
.]
schema
Database and schema. The default schema is public. If you specify a database, it must be the current database. If you do not specify a schema, the table is created in the default schema.
table-name
- Identifies the table to create, where table-name conforms to conventions described in Identifiers. It must also be unique among all names of sequences, tables, projections, views, and models within the same schema.
column-definition
- Defines a table column. A table can have up to 9800 columns.
table-constraint
- Adds a constraint to table metadata.
ON COMMIT
- Specifies whether data is transaction- or session-scoped:
ON COMMIT {PRESERVE | DELETE} ROWS
-
DELETE
(default) marks the temporary table for transaction-scoped data. Vertica removes all table data after each commit. -
PRESERVE
marks the temporary table for session-scoped data, which is preserved beyond the lifetime of a single transaction. Vertica removes all table data when the session ends.
-
NO PROJECTION
- Prevents Vertica from creating auto-projections for this table. A superprojection is created only when data is explicitly loaded into this table.
NO PROJECTION is invalid with the following clauses:
-
ORDER BY
-
KSAFE
-
Any segmentation clause (hash-segmentation-clause or unsegmented-clause).
-
{INCLUDE | EXCLUDE} [SCHEMA] PRIVILEGES
Default inheritance of schema privileges for this table:
-
INCLUDE PRIVILEGES specifies that the table inherits privileges that are set on its schema. This is the default behavior if privileges inheritance is enabled for the schema.
-
EXCLUDE PRIVILEGES disables inheritance of privileges from the schema.
For details, see Inherited privileges.
-
ORDER BY
table-column
[,...]
Invalid for external tables, specifies columns from the SELECT list on which to sort the superprojection that is automatically created for this table. The ORDER BY clause cannot include the qualifiers ASC or DESC. Vertica always stores projection data in ascending sort order.
If you omit the ORDER BY clause, Vertica uses the SELECT list order as the projection sort order.
segmentation-spec
Invalid for external tables, specifies how to distribute data for auto-projections of this table. Supply one of the following clauses:
-
hash-segmentation-clause: Specifies to segment data evenly and distribute across cluster nodes. Vertica recommends segmenting large tables.
-
unsegmented-clause: Specifies to create an unsegmented projection.
If this clause is omitted, Vertica generates auto-projections with default hash segmentation.
-
KSAFE [
k-num
]
Invalid for external tables, specifies K-safety of auto-projections created for this table, where k-num must be equal to or greater than system K-safety. If you omit this option, the projection uses the system K-safety level.
Eon Mode: K-safety of temporary tables is always set to 0, regardless of system K-safety. If a CREATE TEMPORARY TABLE statement sets k-num greater than 0, Vertica returns a warning.
column-name-list
Valid only when creating a table from a query (AS query), defines column names that map to the query output. If you omit this list, Vertica uses the query output column names. The names in column-name-list and the queried columns must be the same in number.
For example:
CREATE TABLE customer_occupations (name, profession) AS SELECT customer_name, occupation FROM customer_dimension;
This clause and the ENCODED BY clause are mutually exclusive. Column name lists are invalid for external tables.
AS query
Creates and loads a table from the results of a query, specified as follows:
AS [ /*+ LABEL */ ] [ AT epoch ] query
The query cannot include complex type columns.
ENCODED BY
column-ref-list
A comma-delimited list of columns from the source table, where each column is qualified by one or both of the following encoding options:
-
ACCESSRANK integer
: Overrides the default access rank for a column, useful for prioritizing access to a column. See Prioritizing column access speed.
-
ENCODING encoding-type
: Specifies the type of encoding to use on the column. The default encoding type is AUTO.
This option and column-name-list are mutually exclusive. This option is invalid for external tables.
Privileges
The following privileges are required:
-
CREATE
privileges on the table schema -
If creating a temporary table that includes a named sequence:
-
SELECT
privilege on sequence object -
USAGE
privilege on sequence schema
Restrictions
-
Queries on temporary tables are subject to the same restrictions on SQL support as persistent tables.
-
You cannot add projections to non-empty, global temporary tables (ON COMMIT PRESERVE ROWS). Make sure that projections exist before you load data. See Auto-projections.
-
While you can add projections to temporary tables that are defined with ON COMMIT DELETE ROWS, be aware that you might lose all data.
-
Mergeout operations cannot be used on session-scoped temporary data.
-
In general, session-scoped temporary table data is not visible using system (virtual) tables.
-
Temporary tables do not recover. If a node fails, queries that use the temporary table also fail. Restart the session and populate the temporary table.
Examples
See Creating temporary tables.
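For instance, the following sketch (the tmp_sales and sales tables are hypothetical) creates a session-scoped temporary table and populates it from an existing table:
=> CREATE LOCAL TEMP TABLE tmp_sales (sale_id INT, amount NUMERIC(10,2))
   ON COMMIT PRESERVE ROWS; -- session-scoped: data survives individual commits
CREATE TABLE
=> INSERT INTO tmp_sales SELECT sale_id, amount FROM sales WHERE amount > 1000;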
32 - CREATE TEXT INDEX
Creates a text index used to perform text searches. If data within a table is partitioned, then an extra column appears in the text index, showing the partition.
Syntax
CREATE TEXT INDEX [[database.]schema.]txtindex-name
ON [schema.]source-table (unique-id, text-field [, column-name,...])
[STEMMER {stemmer-name(stemmer-input-data-type)| NONE}]
[TOKENIZER tokenizer-name(tokenizer-input-data-type)];
Parameters
[
database
.]
schema
Database and schema. The default schema is public. If you specify a database, it must be the current database. If you do not specify a schema, the text index is created in the default schema.
txtindex-name
- The text index name.
source-table
- The source table to index.
unique-id
- The name of the column in the source table that contains a unique identifier. Any data type is permissible. The column must be the primary key in the source table.
text-field
- The name of the column in the source table that contains the text field. Valid data types are:
-
CHAR
-
VARCHAR
-
LONG VARCHAR
-
VARBINARY
-
LONG VARBINARY
Nulls are allowed.
column-name
- The name of a column or columns to be included as additional columns.
stemmer-name
- The name of the stemmer.
stemmer-input-data-type
- The input data type of the stemmer-name function.
tokenizer-name
- Specifies the name of the tokenizer.
tokenizer-input-data-type
- The input data type of the tokenizer-name function. It can accept any number of arguments. If a Vertica tokenizer is used, this parameter can be omitted.
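For example, a sketch that disables stemming when creating an index (t_log is the sample table from the examples below):
=> CREATE TEXT INDEX t_log_nostem ON t_log (id, text) STEMMER NONE; -- no stemmer applied to tokens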
Privileges
The index automatically inherits the query permissions of its parent table. The table owner and users with the DBADMIN role can create and modify the index.
Important
Do not alter the contents or definitions of the text index. If the contents or definitions are altered, the results will no longer match the source table.
Requirements
-
The source table must have a column with a unique identifier set as the primary key.
-
The source table must have an associated projection, and must be both sorted and segmented by the primary key.
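For instance, a minimal sketch of a source table that meets these requirements (the column sizes are hypothetical):
=> CREATE TABLE t_log (id INT PRIMARY KEY, text VARCHAR(2000), day_of_week VARCHAR(12))
   ORDER BY id SEGMENTED BY HASH(id) ALL NODES; -- sorted and segmented by the primary key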
Examples
The following example shows how to create a text index with an additional unindexed column on the table t_log using the CREATE TEXT INDEX statement:
=> CREATE TEXT INDEX t_log_index ON t_log (id, text, day_of_week);
CREATE INDEX
=> SELECT * FROM t_log_index;
token | doc_id | day_of_week
-----------------------+--------+-------------
'catalog | 1 | Monday
'dbadmin' | 2 | Monday
2014-06-04 | 1 | Monday
2014-06-04 | 2 | Monday
2014-06-04 | 3 | Monday
2014-06-04 | 4 | Monday
2014-06-04 | 5 | Monday
2014-06-04 | 6 | Monday
2014-06-04 | 7 | Monday
2014-06-04 | 8 | Monday
45035996273704966 | 3 | Tuesday
45035996273704968 | 4 | Tuesday
<INFO> | 1 | Tuesday
<INFO> | 6 | Tuesday
<INFO> | 7 | Tuesday
<INFO> | 8 | Tuesday
<WARNING> | 2 | Tuesday
<WARNING> | 3 | Tuesday
<WARNING> | 4 | Tuesday
<WARNING> | 5 | Tuesday
...
(97 rows)
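One way to use the index (a sketch; the token pattern depends on your data) is to join doc_id back to the source table:
=> SELECT * FROM t_log WHERE id IN
   (SELECT doc_id FROM t_log_index WHERE token ILIKE '%WARNING%'); -- fetch source rows whose text matched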
The following example shows a text index, tpart_index, created from a partitioned source table:
=> SELECT * FROM tpart_index;
token | doc_id | partition
------------------------+--------+-----------
0 | 4 | 2014
0 | 5 | 2014
11:00:49.568 | 4 | 2014
11:00:49.568 | 5 | 2014
11:00:49.569 | 6 | 2014
<INFO> | 6 | 2014
<WARNING> | 4 | 2014
<WARNING> | 5 | 2014
Database | 6 | 2014
Execute: | 6 | 2014
Object | 4 | 2014
Object | 5 | 2014
[Catalog] | 4 | 2014
[Catalog] | 5 | 2014
'catalog | 1 | 2013
'dbadmin' | 2 | 2013
0 | 3 | 2013
11:00:49.568 | 1 | 2013
11:00:49.568 | 2 | 2013
11:00:49.568 | 3 | 2013
11:00:49.570 | 7 | 2013
11:00:49.571 | 8 | 2013
45035996273704966 | 3 | 2013
...
(89 rows)
33 - CREATE USER
Adds a name to the list of authorized database users.
Note
New users lack default access to schema PUBLIC. Be sure to grant new users USAGE privileges on the PUBLIC schema.
Syntax
CREATE USER user-name [ account-parameter value[,...] ]
Parameters
user-name
- Name of the new user, where user-name conforms to conventions described in Identifiers.
account-parameter value
- One or more user account parameter settings (see below).
User account parameters
Specify one or more user account parameters as a comma-delimited list:
account-parameter setting[,...]
Parameter | Settings
---|---
ACCOUNT | Locks or unlocks user access to the database: ACCOUNT LOCK or ACCOUNT UNLOCK. Tip: To automate account locking, set a maximum number of failed login attempts with CREATE PROFILE.
GRACEPERIOD | Specifies how long a user query can block on any session socket: NONE (no grace period) or an interval such as '5 minutes'. For details, see Handling session socket blocking.
IDENTIFIED BY | Sets the user's password: IDENTIFIED BY 'password'. Important: If you omit this parameter, this user can access the database with no password. For details, see Password guidelines and Creating a database name and password.
IDLESESSIONTIMEOUT | The length of time the system waits before disconnecting an idle session: NONE or an interval. For details, see Managing client connections.
MAXCONNECTIONS | Sets the maximum number of connections the user can have to the server: NONE (no limit) or an integer, scoped to the database or to each node. For details, see Managing client connections.
MEMORYCAP | Sets how much memory can be allocated to user requests: NONE (no limit) or a size, for example '2G'.
PASSWORD EXPIRE | Forces immediate expiration of the user's password. The user must change the password on the next login.
PROFILE | Assigns a profile that controls password requirements for this user: a profile name or DEFAULT. If you omit this parameter, the user is assigned the default profile.
RESOURCE POOL | Assigns a default resource pool to this user. The user must also be granted privileges to this pool, unless privileges to the pool are set to PUBLIC.
RUNTIMECAP | Sets how long this user's queries can execute: NONE (no limit) or an interval. A query's runtime limit can be set at three levels: the user's runtime limit, the user's resource pool, and the session setting. For more information, see Setting a runtime limit for queries.
SEARCH_PATH | Specifies the user's default search path, which tells Vertica which schemas to search for unqualified references to tables and UDFs: DEFAULT or a comma-delimited list of schemas. For details, see Setting search paths.
TEMPSPACECAP | Sets how much temporary file storage is available for user requests: NONE (no limit) or a size.
Privileges
Superuser
User name best practices
Vertica database user names are logically separate from user names of the operating system in which the server runs. If all the users of a particular server also have accounts on the server's machine, it makes sense to assign database user names that match their operating system user names. However, a server that accepts remote connections might have many database users with no local operating system account. In this case, there is no need to connect database and system user names.
Examples
=> CREATE USER Fred IDENTIFIED BY 'Mxyzptlk';
=> GRANT USAGE ON SCHEMA PUBLIC to Fred;
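A sketch that sets several account parameters at once (the profile name restricted_profile is hypothetical and must already exist):
=> CREATE USER Sam IDENTIFIED BY 'S@mPa55', RUNTIMECAP '10 minutes', PROFILE restricted_profile;
=> GRANT USAGE ON SCHEMA PUBLIC to Sam;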
34 - CREATE VIEW
Defines a view. Views are read only, so they do not support insert, update, delete, or copy operations.
Syntax
CREATE [ OR REPLACE ] VIEW [[database.]schema.]view [ (column[,...]) ]
[ {INCLUDE|EXCLUDE} [SCHEMA] PRIVILEGES ] AS query
Parameters
OR REPLACE
- Specifies to overwrite the existing view view-name. If you omit this option and view-name already exists, CREATE VIEW returns an error.
Any grants assigned to the view before you execute a CREATE OR REPLACE remain on the updated view. See GRANT (view).
[
database
.]
schema
Database and schema. The default schema is
public
. If you specify a database, it must be the current database.view
- Identifies the view to create, where view conforms to conventions described in Identifiers. It must also be unique among all names of sequences, tables, projections, views, and models within the same schema.
column[,...]
- List of up to 9800 names to use as view column names. Vertica maps view column names to query columns according to the order of their respective lists. By default, the view uses column names as they are specified in the query.
query
- A SELECT statement that the view executes. The SELECT statement can reference tables, temporary tables, and other views.
{INCLUDE | EXCLUDE} [SCHEMA] PRIVILEGES
- Specifies whether this view inherits schema privileges:
-
INCLUDE PRIVILEGES
specifies that the view inherits privileges that are set on its schema. This is the default behavior if privileges inheritance is enabled for the schema. -
EXCLUDE PRIVILEGES
disables inheritance of privileges from the schema.
For details, see Inherited privileges.
Privileges
See Creating views.
Examples
The following example shows how to create a view that contains data from multiple tables.
=> CREATE VIEW temp_t0 AS SELECT * from t0_p1 UNION ALL
SELECT * from t0_p2 UNION ALL
SELECT * from t0_p3 UNION ALL
SELECT * from t0_p4 UNION ALL
SELECT * from t0_p5;
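To redefine the view later while keeping its grants, a sketch using OR REPLACE (assuming the same source tables):
=> CREATE OR REPLACE VIEW temp_t0 AS SELECT * from t0_p1 UNION ALL
   SELECT * from t0_p2;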